Query to generate windowed aggregates with timestamp at window start and without partial data

I was looking for a way to have a query that would generate mean aggregated windowed data.
For example to get the last 15 minutes in 5 minutes windows, my first attempt was:

from(bucket: "bucket")
  |> range(start: -15m)
  |> aggregateWindow(every: 5m, fn: mean)

However this generates these timestamps:

2020-05-08T00:25:00Z
2020-05-08T00:30:00Z
2020-05-08T00:35:00Z
2020-05-08T00:37:59.820957265Z

the problems with this approach were:

  • the first window starts at 2020-05-08T00:22:59.820957265Z so the window ending with 2020-05-08T00:25:00Z isn’t complete and if you run the query again the value for that window might change because a different range of data is included
  • the timestamps still show the end of the window and the last one shows the current timestamp

To solve these problems you need to first use the timestamp truncated at a 5 minutes interval:

import "date"
import "experimental"

startTime = experimental.subDuration(
  d: 15m,
  from: now()
)
from(bucket: "bucket")
  |> range(start: date.truncate(t: startTime, unit: 5m))

this way instead of starting at 2020-05-08T00:22:59.820957265Z the first window will start at 2020-05-08T00:20:00Z and after 5 minutes will be at 2020-05-08T00:25:00Z and so on.

To use the timestamp at the beginning of the window instead use:

  |> aggregateWindow(every: 5m, fn: mean, timeSrc: "_start")

so the final query will be:

import "date"
import "experimental"
startTime = experimental.subDuration(
  d: 15m,
  from: now()
)
from(bucket: "bucket")
  |> range(start: date.truncate(t: startTime, unit: 5m))
  |> filter(fn: (r) =>
    r._measurement == "metric"
    and r.machine == "83"
    and (r._field == "temp_1")
  )
  |> aggregateWindow(every: 5m, fn: mean, timeSrc: "_start")
1 Like

Hello @alex88,
Welcome! Thanks for sharing your query. Solution above :).

This topic was automatically closed 60 minutes after the last reply. New replies are no longer allowed.