[Help][Db query] difference when missing data

Johndodev · March 18, 2025, 11:45pm

Hello,
I’m trying to build a line chart showing something like “queries per second”, or “per x seconds”.
My counter is saved every 15 seconds.
Sometimes there is a hole in the data, for example no data for 30 minutes.
When that happens, the line chart shows a huge spike that I don’t know how to handle.
Screenshot 1 shows the counter data and screenshot 2 shows the difference. The flat stretch is the missing data; when data comes back, the first new point is at about 2000, then everything is normal again.
I don’t even know if anything is actually wrong, but could you check the query, please?

Anaisdg · March 20, 2025, 6:27pm

Hello @Johndodev,
Can you explain what’s happening in a little more detail?
I can replicate a similar spike, but the difference is still being calculated as expected. For example:

import "array"

data = array.from(rows: [
  {_time: time(v: "2024-03-20T00:00:00Z"), _field: "temperature", _value: 20.0},
  {_time: time(v: "2024-03-20T01:00:00Z"), _field: "temperature", _value: 21.0},
  {_time: time(v: "2024-03-20T02:00:00Z"), _field: "temperature", _value: 22.0},
  {_time: time(v: "2024-03-20T03:00:00Z"), _field: "temperature", _value: 23.0},

  // 4-hour gap here

  {_time: time(v: "2024-03-20T07:00:00Z"), _field: "temperature", _value: 27.0},
  {_time: time(v: "2024-03-20T08:00:00Z"), _field: "temperature", _value: 28.0},
  {_time: time(v: "2024-03-20T09:00:00Z"), _field: "temperature", _value: 29.0},
  {_time: time(v: "2024-03-20T10:00:00Z"), _field: "temperature", _value: 30.0},
  {_time: time(v: "2024-03-20T11:00:00Z"), _field: "temperature", _value: 31.0},
  {_time: time(v: "2024-03-20T12:00:00Z"), _field: "temperature", _value: 32.0}
])

data
  |> range(start: time(v: "2024-03-20T00:00:00Z"), stop: time(v: "2024-03-20T12:00:00Z"))
  |> yield(name: "before difference")
  |> difference(nonNegative: false, columns: ["_value"])  
  |> yield(name: "after difference")
  |> aggregateWindow(every: 2h, fn: mean, createEmpty: false)
  |> yield(name: "after agg")

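The "before difference" result is:

[screenshot: "before difference" table]

The "after difference" result is:

[screenshot: "after difference" table]

And the results after the aggregate window are:

[screenshot: "after agg" table]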

As we can see, we still get a spike, but it isn’t caused by the time gap itself. The difference between the two points on either side of the gap is simply larger, and that is reflected accurately in the subsequent calculations.

I recommend using multiple yield statements, like I did, to dig into the data and find the cause of your unexpected output, and then handling that special case with some additional Flux.
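
One possible way to handle that special case is sketched below: it drops any difference that spans more than one sampling interval. This is only a sketch under stated assumptions. The sample rows are hypothetical, and the hourly interval and the threshold of 1 are assumptions chosen to match the example above. For a counter saved every 15 seconds, you would use something like `unit: 1s` and a threshold slightly above 15.

```flux
import "array"

// Hypothetical sample: a value recorded hourly, with a 4-hour hole after 03:00.
data = array.from(rows: [
  {_time: 2024-03-20T00:00:00Z, _value: 20.0},
  {_time: 2024-03-20T01:00:00Z, _value: 21.0},
  {_time: 2024-03-20T02:00:00Z, _value: 22.0},
  {_time: 2024-03-20T03:00:00Z, _value: 23.0},
  {_time: 2024-03-20T07:00:00Z, _value: 27.0},
  {_time: 2024-03-20T08:00:00Z, _value: 28.0},
])

data
  // Add an "elapsed" column: whole hours since the previous row.
  |> elapsed(unit: 1h)
  // Difference between consecutive values, as in the query above.
  |> difference(nonNegative: false, columns: ["_value"])
  // Keep only differences computed between adjacent samples (assumed threshold).
  |> filter(fn: (r) => r.elapsed <= 1)
  |> yield(name: "difference without gaps")
```

With this filter, the point just after the hole is removed instead of showing up as a spike. Whether dropping it is the right choice depends on what the chart is meant to show.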

Johndodev · March 20, 2025, 8:58pm

Hello,

Yes, it is working as expected. I was pretty sure it was a logic problem on my side.

I dug further into the docs and tried some things with “fill” and “createEmpty”, then I found the derivative function, which is exactly what I needed! What I wanted was: “do not calculate the difference when the n-1 data point is null” (and do not fall back to the previous non-null value).
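
For anyone finding this later, here is a minimal sketch of the derivative approach. It is not the exact query from my dashboard. The sample rows are hypothetical and reuse the hourly data from the example above. As far as I understand it, derivative() divides each change by the time elapsed between the two points, so a jump that built up during a hole is spread over the length of that hole.

```flux
import "array"

// Hypothetical sample: hourly values with a 4-hour hole after 03:00.
data = array.from(rows: [
  {_time: 2024-03-20T00:00:00Z, _value: 20.0},
  {_time: 2024-03-20T01:00:00Z, _value: 21.0},
  {_time: 2024-03-20T02:00:00Z, _value: 22.0},
  {_time: 2024-03-20T03:00:00Z, _value: 23.0},
  {_time: 2024-03-20T07:00:00Z, _value: 27.0},
  {_time: 2024-03-20T08:00:00Z, _value: 28.0},
])

data
  // Rate of change per hour; nonNegative guards against counter resets.
  |> derivative(unit: 1h, nonNegative: true, columns: ["_value"])
  |> yield(name: "rate per hour")
```

Here the point after the hole works out to (27 - 23) / 4 = 1.0 per hour, the same as its neighbours, instead of the 4.0 that difference() returns. For a counter saved every 15 seconds, `unit: 15s` would give the change per 15-second interval.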

Thanks !
