Having difficulties with 'window' related functions

gorkem · June 27, 2021, 6:12pm

Hello everyone,

I’ve been struggling with ‘windowing’ for the last 4-5 days and I just wasn’t able to solve the issue. So I decided to try my chances here…

I’ve seen a couple of topics related to my issue but they did not solve my problem.

What I’ve been trying to do is to windowing my query by ‘exact’ durations. window() and aggregateWindow() functions do not work the way I want.

With window()

t1 = data
    |> range(start: -1h, stop: now())
    |> window(every:15m)
    |> first()
    |> yield(name: "t1")

Result:

start	_stop	_time	_value
2021-06-27T16:40:22.023043671Z	2021-06-27T16:45:00Z	2021-06-27T16:40:56Z	v1
2021-06-27T16:45:00Z	2021-06-27T17:00:00Z	2021-06-27T16:45:56Z	v2
2021-06-27T17:00:00Z	2021-06-27T17:15:00Z	2021-06-27T17:00:55Z	v3
2021-06-27T17:15:00Z	2021-06-27T17:30:00Z	2021-06-27T17:15:54Z	v4
2021-06-27T17:30:00Z	2021-06-27T17:40:22.023043671Z	2021-06-27T17:30:55Z	v5

with aggregateWindow()

t1 = data
    |> range(start: -1h, stop: now())
    |> aggregateWindow(every:15m, fn: first)
    |> yield(name: "t1")

Result:

_start	_stop	_time	_value
2021-06-27T16:41:09.116567911Z	2021-06-27T17:41:09.116567911Z	2021-06-27T16:45:00Z	v1
2021-06-27T16:41:09.116567911Z	2021-06-27T17:41:09.116567911Z	2021-06-27T17:00:00Z	v2
2021-06-27T16:41:09.116567911Z	2021-06-27T17:41:09.116567911Z	2021-06-27T17:15:00Z	v3
2021-06-27T16:41:09.116567911Z	2021-06-27T17:41:09.116567911Z	2021-06-27T17:30:00Z	v4
2021-06-27T16:41:09.116567911Z	2021-06-27T17:41:09.116567911Z	2021-06-27T17:41:09.116567911Z	v5

now the first problem arises here, I’m trying to get 15minute windows out of 1hour. so technically what I want is 4 exact evenly divided time blocks… WITHOUT dropping the _time values…

so here is an example of what I’m trying to get:

start	_stop	_time	_value
2021-06-27T16:40:22.023043671Z	2021-06-27T16:55:22.023043671Z	T1	V1
2021-06-27T16:55:22.023043671Z	2021-06-27T17:10:22.023043671Z	T2	V2
2021-06-27T17:10:22.023043671Z	2021-06-27T17:25:22.023043671Z	T3	V3
2021-06-27T17:25:22.023043671Z	2021-06-27T17:40:22.023043671Z	T4	V4

I was able to get this result by querying each 15 minute blocks one by one…

x1 = data
    |> range(start: -60m, stop: -45m)
    |> first()
    |> yield(....)

x2 = data
    |> range(start: -45m, stop: -30m)
    |> first()
    |> yield(....)

x3 = data
    |> range(start: -30m, stop: -15m)
    |> first()
    |> yield(....)

x4 = data
    |> range(start: -15m, stop: -0m)
    |> first()
    |> yield(....)

as you can guess, this is going to be a huge cumbersome because I will be using other selectors too… like last() min() max() and mean()

have been using this for the last 1-2 days. works fine for 1h periods, but what if I wanted to query last 12 hours?

and I’m not worried about visualizing the data. I’m sending results to my application to process further. so it is fine if the result includes multiple tables

Would be great if anybody can help me out with this…

thanks!

russ_savage · June 28, 2021, 2:25pm

Hi! if i understand this correctly, I think all you need to do is align your time ranges to the top of the hour. For example, using the following:

import "date"
from(bucket: "telegraf")
  |> range(start: date.truncate(t: -1h, unit: 1h), stop: date.truncate(t: now(), unit: 1h))
  |> filter..
  |> aggregateWindow(every:15m, fn: first)

returns 4 rows, for 15, 30, 45, and 00 of the hour.

gorkem · June 28, 2021, 2:32pm

hello, thanks for the reply.

Yes that would work and I tried it actually but then I wouldnt be able to get the exactly last 1 hour data… For example, if the query runs at 17:14, then the the result would return records between 16:00 - 17:00 whereas I was expecting to see between 16:14 - 17:14

no?

russ_savage · June 28, 2021, 3:50pm

ah ok. i think the “offset” param of the window function is helpful here. here’s an example of calculating it for your query. Someone else might have a more “elegant” way:

import "date"

every = 15m
offset = duration(v: string(v: uint(v: now()) - uint(v:date.truncate(t: now(), unit: every))) + "ns")

from(bucket: "telegraf")
  |> range(start: -1h)
  |> filter...
  |> window(every: every, offset: offset)
  |> first()
  |> group()

gorkem · June 28, 2021, 5:30pm

I dont know what to say, man. you are a genius. thank you very much. this is exactly what I’ve been trying to do.

ughh I knew the ‘offset’ thing had something to do with this but couldn’t figure out how it worked exactly.

thank you again!

russ_savage · June 28, 2021, 5:42pm

great to hear! i mean, there’s nothing obvious about any of that. i’ve opened this issue in the docs to make it more obvious for the next person: add note about calculating "unaligned" windows in the window function · Issue #2787 · influxdata/docs-v2 · GitHub

Topic		Replies	Views
Puzzled time with aggregateWindow() function InfluxDB 2 influxdb , query , flux	1	177	June 5, 2024
The aggregateWindow function is unable to retrieve data at the time interval specified by the step value InfluxDB 2 influxdb , flux	6	730	March 23, 2022
Question about WIndow Period/aggregateWindow Dashboards influxdb	1	228	May 28, 2024
Influxdb window question InfluxDB 2 influxdb , flux	1	438	June 17, 2022
Does InfluxQL (3.0) support something like aggregateWindow? query	1	335	October 3, 2023

Having difficulties with 'window' related functions

Related topics