Hello,
tldr; Timestamp wraps after aggregateWindow is called
I have a flux query which looks like the following:
import "date"
from(bucket: "${bucket}")
|> range(start: date.truncate(t: v.timeRangeStart, unit: 1m), stop: date.truncate(t: v.timeRangeStop , unit: 1m))
|> filter(fn: (r) => r["_measurement"] == "${measurement}")
|> filter(fn: (r) => r["_field"] == "_if_rx_bytes")
|> filter(fn: (r) => r["host.name"] == "${hostname}")
|> aggregateWindow(every: 1m0s, fn: mean, createEmpty: false)
|> derivative(unit: 1s, nonNegative: true, columns: ["_value"], timeColumn: "_time")
I have taken this from Grafana, so please excuse the variables. It outputs a table as follows:
I have edited out the rest of the columns, just keeping the relevant ones for clarity:
Start: 2021-07-15T14:33:00Z
Stop: 2021-07-15T14:38:00Z
| table | _time | _value | thread |
|---|---|---|---|
| 0 | 2021-07-15T14:35:00Z | 0 | 0 |
| 0 | 2021-07-15T14:36:00Z | 0 | 0 |
| 0 | 2021-07-15T14:37:00Z | 0 | 0 |
| 0 | 2021-07-15T14:38:00Z | 0 | 0 |
| 1 | 2021-07-15T14:35:00Z | 0 | 1 |
| 1 | 2021-07-15T14:36:00Z | 0 | 1 |
| 1 | 2021-07-15T14:37:00Z | 0 | 1 |
| 1 | 2021-07-15T14:38:00Z | 0 | 1 |
| 2 | 2021-07-15T14:35:00Z | 0 | 2 |
| 2 | 2021-07-15T14:36:00Z | 0 | 2 |
| 2 | 2021-07-15T14:37:00Z | 0 | 2 |
| 2 | 2021-07-15T14:38:00Z | 0 | 2 |
| 3 | 2021-07-15T14:35:00Z | 0 | 3 |
| 3 | 2021-07-15T14:36:00Z | 0 | 3 |
| 3 | 2021-07-15T14:37:00Z | 0 | 3 |
| 3 | 2021-07-15T14:38:00Z | 0 | 3 |
| 4 | 2021-07-15T14:35:00Z | 0 | 4 |
| 4 | 2021-07-15T14:36:00Z | 0 | 4 |
| 4 | 2021-07-15T14:37:00Z | 0 | 4 |
| 4 | 2021-07-15T14:38:00Z | 0 | 4 |
| 5 | 2021-07-15T14:35:00Z | 60831945.6 | 5 |
| 5 | 2021-07-15T14:36:00Z | 56926255.1 | 5 |
| 5 | 2021-07-15T14:37:00Z | 57886311.1333333 | 5 |
| 5 | 2021-07-15T14:38:00Z | 60243873.6166667 | 5 |
| 6 | 2021-07-15T14:35:00Z | 52827733.2666667 | 6 |
| 6 | 2021-07-15T14:36:00Z | 53644551.2 | 6 |
| 6 | 2021-07-15T14:37:00Z | 50365765.05 | 6 |
| 6 | 2021-07-15T14:38:00Z | 58241522.5 | 6 |
| 7 | 2021-07-15T14:35:00Z | 49950747.0333333 | 7 |
| 7 | 2021-07-15T14:36:00Z | 50595559.8333333 | 7 |
| 7 | 2021-07-15T14:37:00Z | 53781031.5333333 | 7 |
| 7 | 2021-07-15T14:38:00Z | 54343340.75 | 7 |
| 8 | 2021-07-15T14:35:00Z | 55823160.8833333 | 8 |
| 8 | 2021-07-15T14:36:00Z | 55482989.05 | 8 |
| 8 | 2021-07-15T14:37:00Z | 51526161.5666667 | 8 |
| 8 | 2021-07-15T14:38:00Z | 51678041.1666667 | 8 |
Next I added one more line to the query, making it:
import "date"
from(bucket: "${bucket}")
|> range(start: date.truncate(t: 2021-07-15T14:33:49Z, unit: 1m), stop: date.truncate(t: 2021-07-15T14:38:49Z , unit: 1m))
|> filter(fn: (r) => r["_measurement"] == "${measurement}")
|> filter(fn: (r) => r["_field"] == "_if_rx_bytes")
|> filter(fn: (r) => r["host.name"] == "${hostname}")
|> aggregateWindow(every: 1m0s, fn: mean, createEmpty: false)
|> derivative(unit: 1s, nonNegative: true, columns: ["_value"], timeColumn: "_time")
|> drop(columns: ["thread"])
The output, after removing table which is all 0, and manually sorting by timestamp becomes:
| _time | _value | sum |
|---|---|---|
| 2021-07-15T14:35:00Z | 0 | |
| 2021-07-15T14:35:00Z | 0 | |
| 2021-07-15T14:35:00Z | 0 | |
| 2021-07-15T14:35:00Z | 0 | |
| 2021-07-15T14:35:00Z | 0 | |
| 2021-07-15T14:35:00Z | 60831945.6 | |
| 2021-07-15T14:35:00Z | 52827733.27 | |
| 2021-07-15T14:35:00Z | 49950747 | |
| 2021-07-15T14:35:00Z | 55823160.88 | 219433587 |
| 2021-07-15T14:36:00Z | 0 | |
| 2021-07-15T14:36:00Z | 0 | |
| 2021-07-15T14:36:00Z | 0 | |
| 2021-07-15T14:36:00Z | 0 | |
| 2021-07-15T14:36:00Z | 0 | |
| 2021-07-15T14:36:00Z | 56926255.1 | |
| 2021-07-15T14:36:00Z | 53644551.2 | |
| 2021-07-15T14:36:00Z | 50595559.83 | |
| 2021-07-15T14:36:00Z | 55482989.05 | 216649355 |
| 2021-07-15T14:37:00Z | 0 | |
| 2021-07-15T14:37:00Z | 0 | |
| 2021-07-15T14:37:00Z | 0 | |
| 2021-07-15T14:37:00Z | 0 | |
| 2021-07-15T14:37:00Z | 0 | |
| 2021-07-15T14:37:00Z | 57886311.13 | |
| 2021-07-15T14:37:00Z | 50365765.05 | |
| 2021-07-15T14:37:00Z | 53781031.53 | |
| 2021-07-15T14:37:00Z | 51526161.57 | 213559269 |
| 2021-07-15T14:38:00Z | 0 | |
| 2021-07-15T14:38:00Z | 0 | |
| 2021-07-15T14:38:00Z | 0 | |
| 2021-07-15T14:38:00Z | 0 | |
| 2021-07-15T14:38:00Z | 0 | |
| 2021-07-15T14:38:00Z | 60243873.62 | |
| 2021-07-15T14:38:00Z | 58241522.5 | |
| 2021-07-15T14:38:00Z | 54343340.75 | |
| 2021-07-15T14:38:00Z | 51678041.17 | 224506778 |
I have added the last column, to show the sum I calculated in a spread sheet for each time stamp.
Now finally I added one more aggregateWindow to sum all values belonging to same timestamp:
import "date"
from(bucket: "${bucket}")
|> range(start: date.truncate(t: 2021-07-15T14:33:49Z, unit: 1m), stop: date.truncate(t: 2021-07-15T14:38:49Z , unit: 1m))
|> filter(fn: (r) => r["_measurement"] == "${measurement}")
|> filter(fn: (r) => r["_field"] == "_if_rx_bytes")
|> filter(fn: (r) => r["host.name"] == "${hostname}")
|> aggregateWindow(every: 1m0s, fn: mean, createEmpty: false)
|> derivative(unit: 1s, nonNegative: true, columns: ["_value"], timeColumn: "_time")
|> drop(columns: ["thread"])
|> aggregateWindow(every: 1m0s, fn: sum, createEmpty: false)
and the output becomes:
| _value | _time |
|---|---|
| 224506778 | 2021-07-15T14:38:00Z |
| 219433587 | 2021-07-15T14:36:00Z |
| 216649355 | 2021-07-15T14:37:00Z |
| 213559269 | 2021-07-15T14:38:00Z |
At this point the output wraps. If you notice the sum of values of 14:35 in previous table is now the sum of 14:36, it goes on, but the sum of values of 14:38 appears in first row.
I am sure this happens because I fail to grasp some subtility of aggregateWindow. It would be great if some one points me towards the right direction.
Thanks
