Duplicate timestamp after aggregateWindow

Hello,

tldr; Timestamp wraps after aggregateWindow is called

I have a flux query which looks like the following:

import "date"
from(bucket: "${bucket}")
  |> range(start: date.truncate(t: v.timeRangeStart, unit: 1m), stop: date.truncate(t: v.timeRangeStop , unit: 1m))
  |> filter(fn: (r) => r["_measurement"] == "${measurement}")
  |> filter(fn: (r) => r["_field"] == "_if_rx_bytes")
  |> filter(fn: (r) => r["host.name"] == "${hostname}")
  |> aggregateWindow(every: 1m0s, fn: mean, createEmpty: false)
  |> derivative(unit: 1s, nonNegative: true, columns: ["_value"], timeColumn: "_time")

I have taken this from Grafana, so please excuse the variables. It outputs a table as follows:
I have edited out the rest of the columns, just keeping the relevant ones for clarity:

Start: 2021-07-15T14:33:00Z
Stop: 2021-07-15T14:38:00Z

table _time _value thread
0 2021-07-15T14:35:00Z 0 0
0 2021-07-15T14:36:00Z 0 0
0 2021-07-15T14:37:00Z 0 0
0 2021-07-15T14:38:00Z 0 0
1 2021-07-15T14:35:00Z 0 1
1 2021-07-15T14:36:00Z 0 1
1 2021-07-15T14:37:00Z 0 1
1 2021-07-15T14:38:00Z 0 1
2 2021-07-15T14:35:00Z 0 2
2 2021-07-15T14:36:00Z 0 2
2 2021-07-15T14:37:00Z 0 2
2 2021-07-15T14:38:00Z 0 2
3 2021-07-15T14:35:00Z 0 3
3 2021-07-15T14:36:00Z 0 3
3 2021-07-15T14:37:00Z 0 3
3 2021-07-15T14:38:00Z 0 3
4 2021-07-15T14:35:00Z 0 4
4 2021-07-15T14:36:00Z 0 4
4 2021-07-15T14:37:00Z 0 4
4 2021-07-15T14:38:00Z 0 4
5 2021-07-15T14:35:00Z 60831945.6 5
5 2021-07-15T14:36:00Z 56926255.1 5
5 2021-07-15T14:37:00Z 57886311.1333333 5
5 2021-07-15T14:38:00Z 60243873.6166667 5
6 2021-07-15T14:35:00Z 52827733.2666667 6
6 2021-07-15T14:36:00Z 53644551.2 6
6 2021-07-15T14:37:00Z 50365765.05 6
6 2021-07-15T14:38:00Z 58241522.5 6
7 2021-07-15T14:35:00Z 49950747.0333333 7
7 2021-07-15T14:36:00Z 50595559.8333333 7
7 2021-07-15T14:37:00Z 53781031.5333333 7
7 2021-07-15T14:38:00Z 54343340.75 7
8 2021-07-15T14:35:00Z 55823160.8833333 8
8 2021-07-15T14:36:00Z 55482989.05 8
8 2021-07-15T14:37:00Z 51526161.5666667 8
8 2021-07-15T14:38:00Z 51678041.1666667 8

Next I added one more line to the query, making it:

import "date"
from(bucket: "${bucket}")
  |> range(start: date.truncate(t: 2021-07-15T14:33:49Z, unit: 1m), stop: date.truncate(t: 2021-07-15T14:38:49Z , unit: 1m))
  |> filter(fn: (r) => r["_measurement"] == "${measurement}")
  |> filter(fn: (r) => r["_field"] == "_if_rx_bytes")
  |> filter(fn: (r) => r["host.name"] == "${hostname}")
  |> aggregateWindow(every: 1m0s, fn: mean, createEmpty: false)
  |> derivative(unit: 1s, nonNegative: true, columns: ["_value"], timeColumn: "_time")
  |> drop(columns: ["thread"])

The output, after removing table which is all 0, and manually sorting by timestamp becomes:

_time _value sum
2021-07-15T14:35:00Z 0
2021-07-15T14:35:00Z 0
2021-07-15T14:35:00Z 0
2021-07-15T14:35:00Z 0
2021-07-15T14:35:00Z 0
2021-07-15T14:35:00Z 60831945.6
2021-07-15T14:35:00Z 52827733.27
2021-07-15T14:35:00Z 49950747
2021-07-15T14:35:00Z 55823160.88 219433587
2021-07-15T14:36:00Z 0
2021-07-15T14:36:00Z 0
2021-07-15T14:36:00Z 0
2021-07-15T14:36:00Z 0
2021-07-15T14:36:00Z 0
2021-07-15T14:36:00Z 56926255.1
2021-07-15T14:36:00Z 53644551.2
2021-07-15T14:36:00Z 50595559.83
2021-07-15T14:36:00Z 55482989.05 216649355
2021-07-15T14:37:00Z 0
2021-07-15T14:37:00Z 0
2021-07-15T14:37:00Z 0
2021-07-15T14:37:00Z 0
2021-07-15T14:37:00Z 0
2021-07-15T14:37:00Z 57886311.13
2021-07-15T14:37:00Z 50365765.05
2021-07-15T14:37:00Z 53781031.53
2021-07-15T14:37:00Z 51526161.57 213559269
2021-07-15T14:38:00Z 0
2021-07-15T14:38:00Z 0
2021-07-15T14:38:00Z 0
2021-07-15T14:38:00Z 0
2021-07-15T14:38:00Z 0
2021-07-15T14:38:00Z 60243873.62
2021-07-15T14:38:00Z 58241522.5
2021-07-15T14:38:00Z 54343340.75
2021-07-15T14:38:00Z 51678041.17 224506778

I have added the last column, to show the sum I calculated in a spread sheet for each time stamp.

Now finally I added one more aggregateWindow to sum all values belonging to same timestamp:

import "date"
from(bucket: "${bucket}")
  |> range(start: date.truncate(t: 2021-07-15T14:33:49Z, unit: 1m), stop: date.truncate(t: 2021-07-15T14:38:49Z , unit: 1m))
  |> filter(fn: (r) => r["_measurement"] == "${measurement}")
  |> filter(fn: (r) => r["_field"] == "_if_rx_bytes")
  |> filter(fn: (r) => r["host.name"] == "${hostname}")
  |> aggregateWindow(every: 1m0s, fn: mean, createEmpty: false)
  |> derivative(unit: 1s, nonNegative: true, columns: ["_value"], timeColumn: "_time")
  |> drop(columns: ["thread"])
  |> aggregateWindow(every: 1m0s, fn: sum, createEmpty: false)

and the output becomes:

_value _time
224506778 2021-07-15T14:38:00Z
219433587 2021-07-15T14:36:00Z
216649355 2021-07-15T14:37:00Z
213559269 2021-07-15T14:38:00Z

At this point the output wraps. If you notice the sum of values of 14:35 in previous table is now the sum of 14:36, it goes on, but the sum of values of 14:38 appears in first row.

I am sure this happens because I fail to grasp some subtility of aggregateWindow. It would be great if some one points me towards the right direction.

Thanks

Can some one confirm if this problem is related to Issue #2425?

Hello @xuoguoto,
It certainly looks related. Can you please try replacing aggregateWindow with window() and mean() or window() and sum()?
Can you also please comment on the issue to help the flux team prioritize it?
Thank you.

I removed the first aggregateWindow and things started working.

That was a blind try and not sure how it helped though.