Incorrect data when combining aggregateWindow() and fill(..., usePrevious: true)

Hello!

I am using InfluxDB 2 with Grafana to query and sum up readings from various power meters around our house. The basic query is this:

from(bucket: "Home")
  |> range(start: v.timeRangeStart, stop:v.timeRangeStop)
  |> filter(fn: (r) => r._measurement =~ /sonoff\.0\.Kueche_St_Links\.ENERGY_Power/
                   and r._field == "value"
                   and r.from =~ /mqtt|sonoff|influxdb/
            )
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: true, timeSrc: "_start")
  |> fill(column: "_value", usePrevious: true)

The results will be displayed in a stacked graph in Grafana 9.0.2. But my issue is also reproducable in InfluxDB’s web UI. The sensors return a measurement every ten seconds.

I use aggregateWindow() because plotting this data in Grafana would overflow the graph with too many datapoints. I also use fill() because the sensors supply a new value only if it is different from the previous one, i.e. if there is a constant energy usage over a long time just the first measurement is actually saved. Putting fill() before the aggregation does not help with this issue.

But for sensors which have only very brief readings, like in the screenshot below, this concept fails, and I don’t know how to improve it. You can see the above query with (blue) and without (red) aggregateWindow() and fill() plotted in the Web UI, and Influx for some reasons misses out on the fact that the device is being switched off, and reports constant enery consumption of >1.5kW, which is obviously incorrect.

How can I combine (a) aggregation, (b) gap filling, and (c) not missing significant changes in value, when querying this measurement?

image

@Anaisdg, do you have an idea maybe?

Hello @Jens,
Hmm it seems that fill and aggregateWindow are working as expected. I would expect the mean over the raw data (red) to be essentially one straight line (blue).
You could change the window period to be smaller to try and better capture some of those peaks.
You could also:

  • filter function to filter for data that’s greater than 0 and plot those points
  • and filter for data that’s less than 0 and aggregate on that

You could also:

  • write the on periods to a new bucket with a task and query just the on periods/reduced data/only capture those events

Do any of these solutions sound like they’re in line with what you’re looking for?

Jenno @anaidsg, thank you for your answer!

How can the mean() over the red line be a straight line? There are only a few very brief periods >0, and if I understand aggregateWindow() correctly it always takes a chunk of values, calculates the “function” (mean in this case) for these values, then returns a single value. So it should at least approximately - depending on window size - retrace the original line.

The problem is that my instruments omit repeated values (any, not just 0), and if this leaves no values at all inside such an aggregate window, aggregateWindow() returns “null”, and fill() with usePrevious then copies the previoues value, which is formally correct, but physically nonsense.

Am I using the wrong functions for this task?

If your instruments omit repeated values, and “usePrevious” does exactly what
it says, but as you say, this is “formally correct but physically nonsense”,
then what would you expect the (guessed) values to be instead?

If your instruments occasionally do not provide values, what would you want to
have substituted in their place?

Oh, and just in case you are thinking “I would like to have the missing values
interpolated between the last known value and the next known value”, bear in
mind that at the time at which the missing value needs to be substituted by
something else, the “next known value” is not yet known, so that calculation
is impossible. You can only use historical, not future, data.

Antony.

Hi @Jens -

I’m a bit confused about the desired output, but referring to your red/blue pix above, is it correct that:

Sonoff is reporting power-on at the 1st red spike, power-off at the 2nd, power-on at the 3rd and off again at the 4th? If so, then would you want to see a “square wave” graphic like this?

image

1 Like

Yes.
Actually, the red and blue lines should at least look identical unless zoomed in.
However, when on, the device reports a value every 10 seconds, because the consumption has some noise (+/- a few watts).
So currently it is not possible to reduce data like this using aggregateWindow() because this misses some spikes and will then produce large errors, and it is not possible to plot data like this without reduction, because sometimes this data also has far too many consecutive points to plot when the device is switched on for a longer time, and I cannot say beforehand which happens when.
One possible solution would be to fill all NULL values with zeros BEFORE using aggregateWindow() so that the mean aggregator would eventually reach zero after a short usage spike.

1 Like

@Jens
I’m sorry I’m confused it looks like your data is essentially square waves already. You’re looking to apply aggregate window to reduce the number of data points but also preserve the spikes?
Can you maybe filter for when data is higher? and then apply an aggregateWindow to create empty values and fill with whatever value you want?
Something like:

import "experimental/array"

array.from(rows: [{foo: "bar", baz: 21.2, _time: 2022-07-30T00:00:00Z}, {foo: "bar", baz: 23.8, _time: 2022-07-31T00:00:00Z}])
|> range(start: 2022-07-29T00:00:00Z, stop: 2022-08-02T00:00:00Z)
|> aggregateWindow(every: 1d, fn: first, column: "baz")
|> group()
|> fill(value: 0)

Yes, essentially.
I now solved it by separating my data sources into two groups, and not aggregateWindow()ing the group with the data sources which have these very short usage spikes.
This is a workaround which won’t work with data sources which have short usage spikes AND a lot of data points, but it works for me. But I would still expect a time series database like Influx to have a general solution for this eventually …

Thank you :slightly_smiling_face:

Hello @Jens,
I agree. Influx needs to provide various data compression algorithms to retain the shape of the data without concealing local minima and maxima.
Are there any specific compression algorithms that you would be interested in using?

Also what are you using InfluxDB for? Learning about users’ use cases makes my day.
Thanks!

Hello,

I think I am having the same issue. I am combining some sparse boolean data stored in InfluxDB with grafana timelines and have incorrect state displayed.

The data set contains door states (opened-1 or closed-0) that with measurement at the time of the change or every hour. I use the aggregate function because I sometime look at the data over a long timespan, to avoid having too many datapoints. The aggregate function uses max because it highlights the open states that I don’t want to miss. This issue arrises when I have the door open for a few seconds then shut (happens frequently). The fill(... usePrevious: true) then uses the aggregated data (open - the max), rather than the real data (closed) for the part where there is no measurement. Therefore the timeline graph following this event is wrong until the next stored data point, unless I zoom enough so that the aggregation window is only a few seconds long.

Here is the query:

from(bucket: "mybucket")
  |> range(start:  v.timeRangeStart, stop:v.timeRangeStop)
  |> filter(fn: (r) => r._measurement =~ /^.*Appt_${Appartement:regex}_.*Porte_CTS_OFB$/
      or r._measurement =~ [...]
    )
  |> aggregateWindow(every: $__interval, fn: max, createEmpty: true)
  |> fill(usePrevious: true)

The resulting graf over 24h ($__interval equal to 1 minute) is the following, you can see a 55 minute open time (under the tooltip) in red starting at 22:06:

But when zooming in to a shorter 5s $__interval used by the aggregate window, one can see that the door was in fact only open 5 seconds:

I would have expected to see a door open status for 1 minute in the 24h graphic (the duration of the aggregation window).

I also tried to inverse the fill and aggregateWindow functions but this does not help. I really have no idea on how to achieve this result. Is this something that I could achieve with other functions ? I’d love some help if you can.

(Other issue visible in the second picture: one thing I’d like to improve is the lack of data at the beginning of the timeline. I’d like to fetch 1 previous point up to 1 hour before, but this is another topic.)