Get timestamps with fixed interval between start..stop regardless of data availability

I have data from sensors that is being collected every 10 seconds, and I have task that creates downsampled data every 5 minutes.

During some periods there is no data from sensors, but for reporting I need to have 5-minute intervals with null data for such cases

For example:

In downsampled bucket I have data with timestamps 2024-06-18T23:15:00.000Z, 2024-06-18T23:20:00.000Z and 2024-06-18T23:30:00.000Z, but do not have records with timestamps 2024-06-18T23:10:00.000Z and 2024-06-18T23:25:00.000Z

When I query data from 2024-06-18T23:10:00.000Z to 2024-06-18T23:30:00.000Z I would like to get the following result:

2024-06-18T23:10:00.000Z - NULL
2024-06-18T23:15:00.000Z - real value from DB
2024-06-18T23:20:00.000Z - real value from DB
2024-06-18T23:25:00.000Z - NULL
2024-06-18T23:30:00.000Z - real value from DB

I have an idea to check data availability in task and insert dummy records (i.e. timestamp with value == NULL), but can’t find out how to do that in Flux. Or may be it possible to solve without inserting such dummy records.

I am aware of availability of interpolate.linear(every: 5m), but it doesn’t fit for my case (I need NULLs, not linear interpolation).

Any ideas are very appreciated

@ebabeshko You can use aggregateWindow() with createEmpty set to true. If the data is already downsampled, just use last() as the fn. Set every to the same interval that the data is downsampled to. You’ll also want to set timeSrc to _start to use the left window boundary as the output timestamp.

// ...
    |> aggregateWindow(every: 5m, fn: last, createEmpty: true, timeSrc: "_start")

Here’s an example using the timestamps you provided and some filler values:

import "array"

array.from(
    rows: [
        {_time: 2024-06-18T23:15:00Z, _value: 2},
        {_time: 2024-06-18T23:20:00Z, _value: 3},
        {_time: 2024-06-18T23:30:00Z, _value: 5},
    ],
)
    |> range(start: 2024-06-18T23:10:00Z, stop: 2024-06-18T23:35:00Z)
    |> aggregateWindow(every: 5m, fn: last, createEmpty: true, timeSrc: "_start")

This returns:

_start* _stop* _time _value
2024-06-18T23:10:00Z 2024-06-18T23:35:00Z 2024-06-18T23:10:00Z
2024-06-18T23:10:00Z 2024-06-18T23:35:00Z 2024-06-18T23:15:00Z 2
2024-06-18T23:10:00Z 2024-06-18T23:35:00Z 2024-06-18T23:20:00Z 3
2024-06-18T23:10:00Z 2024-06-18T23:35:00Z 2024-06-18T23:25:00Z
2024-06-18T23:10:00Z 2024-06-18T23:35:00Z 2024-06-18T23:30:00Z 5
2 Likes