Problematic default behaviour of window and aggregateWindow

rruusu · October 19, 2022, 10:32am

Hi,

The documentation for window() and aggregateWindow() is not very clear on the topic, but judging by the examples, and confirmed by trial, rows are selected into each window by the condition _start <= _time < _stop.

At the same time, the default for timeSrc is "_stop", so each aggregated row is stamped with the window's _stop time, yet a sample whose time stamp equals that value is left out of the window.

This combination results in problematic and quite unexpected behavior. The selection rule implies that each sample represents a value that relates to a period of time after its _time value, but the aggregate produces data in which each sample relates to a period of time before its _time value.

This results in some funky behavior in certain use cases:

If an aggregate happens to use an every parameter that already matches the sampling interval of the data, the result is a shift of one time step in the data, while a developer may expect no change at all.
When a time series is windowed, and then windowed again, the period of time actually covered by each window drifts away from the time stamps in the _time column. For example, if one has data in 30 s intervals and then takes the sum in 5 minute windows and then the mean of those in 1 hour windows, the data that actually gets put into the resulting 1 hour windows is from samples with minute parts from -5:00 to 54:30.

This doesn’t happen if one chooses the timeSrc: "_start" option, in which case the sample selection and the output timestamp match each other…
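
For anyone who wants to reproduce the one-step shift, here is a minimal sketch. The three sample rows and the time range are made up for illustration; array.from() and range() are only there to give a self-contained input with the _start and _stop columns that aggregateWindow() needs.

```flux
import "array"

// Hypothetical input: one sample every 5 minutes, so every: 5m
// matches the sampling interval of the data exactly.
data =
    array.from(
        rows: [
            {_time: 2022-10-19T10:00:00Z, _value: 1.0},
            {_time: 2022-10-19T10:05:00Z, _value: 2.0},
            {_time: 2022-10-19T10:10:00Z, _value: 3.0},
        ],
    )
        |> range(start: 2022-10-19T10:00:00Z, stop: 2022-10-19T10:15:00Z)

// Default timeSrc ("_stop"): each window holds exactly one sample,
// but the output row is stamped with the end of the window.
data
    |> aggregateWindow(every: 5m, fn: mean, createEmpty: false)
    |> yield(name: "default_stop")

// timeSrc: "_start": the output row is stamped with the start of the window,
// which is the time stamp of the sample that was selected into it.
data
    |> aggregateWindow(every: 5m, fn: mean, timeSrc: "_start", createEmpty: false)
    |> yield(name: "time_from_start")
```

Given the selection rule _start <= _time < _stop, the first result should show the value sampled at 10:00 stamped as 10:05 (and so on), while the second should keep every value at its original time stamp.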

Is there some hidden way to change the selection rule to _start < _time <= _stop? This would be necessary, for example, when processing data in which each sample is already an aggregate over some window into the past, as produced by many kinds of measurement hardware. The only way that I could think of is a horrible kludge that uses a rule _start <= _time - eps < _stop.

  |> aggregateWindow(every: 5m, offset: 1ms, fn: mean, createEmpty: false)
  |> timeShift(columns: ["_time"], duration: -1ms)
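
To make the effect of the kludge concrete, here is a small sketch on made-up data. The five 1-minute samples and the time range are hypothetical, and the trick assumes that no time stamp in the data has finer than 1 ms resolution.

```flux
import "array"

// Hypothetical trailing samples: each row describes the minute that ended at _time.
array.from(
    rows: [
        {_time: 2022-10-19T10:01:00Z, _value: 1.0},
        {_time: 2022-10-19T10:02:00Z, _value: 2.0},
        {_time: 2022-10-19T10:03:00Z, _value: 3.0},
        {_time: 2022-10-19T10:04:00Z, _value: 4.0},
        {_time: 2022-10-19T10:05:00Z, _value: 5.0},
    ],
)
    |> range(start: 2022-10-19T10:00:00Z, stop: 2022-10-19T10:06:00Z)
    // Window boundaries move to hh:mm:00.001, so the sample at 10:05:00
    // lands in the window [10:00:00.001, 10:05:00.001).
    |> aggregateWindow(every: 5m, offset: 1ms, fn: mean, createEmpty: false)
    // Move the output stamp from 10:05:00.001 back to 10:05:00.
    |> timeShift(columns: ["_time"], duration: -1ms)
```

If the window selection works as described above, this should give a single row at 10:05:00 that averages all five samples, which is the _start < _time <= _stop grouping I am after.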

Jay_Clifford · October 20, 2022, 9:54am

@Anaisdg do you have any thoughts on this one?
