Timestamp changing when wideTo() writes data

Hello everyone! I see that there aren’t many hits here for wideTo (nor are there many in a global Internet search…), but hopefully someone out there will be able to guide me here.

I’m using InfluxDB 2 OSS (version 2.7.4). I’ve got a table called network-traffic which was derived from the Telegraf ‘net’ collector, but the bytes_recv and bytes_send columns have been transformed into differences instead of totals.

A simple query using the Query Builder produces a result like this:

table mean _measurement group string _field group string _value no group double _time no group dateTime:RFC3339 host group string interface group string
0 net bytes_recv 1132607 2023-12-31T17:29:10.000Z ns5 ens5
1 net bytes_sent 15443530 2023-12-31T17:29:10.000Z ns5 ens5

My goal is to produce a downsampled version of this table, with sums of the two fields for each hour. Using this query:

import "date"

from(bucket: "network-traffic")                                                                                                                                                                                                                                                                                                                        
    |> range(start: date.sub(from: date.truncate(t: now(), unit: 1h), d: 3h), stop: date.truncate(t: now(), unit: 1h))
    |> filter(fn: (r) => r["_measurement"] == "net")                                                                                                                                                                                                                                                                                                   
    |> filter(                                                                                                                                                                                                                                                                                                                                         
        fn: (r) => r["_field"] == "bytes_sent" or r["_field"] == "bytes_recv",                                                                                                                                                                                                                                                                         
    |> aggregateWindow(every: 1h, fn: sum, createEmpty: false)                                                                                                                                                                                                                                                                                         
    |> map(fn: (r) => ({r with hour_of_day: string(v: date.hour(t: r._time))}))                                                                                                                                                                                                                                                                        
    |> group(columns: ["_measurement", "host", "interface", "hour_of_day"])                                                                                                                                                                                                                                                                            
    |> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")

I get this output:

table _result _measurement group string _time no group dateTime:RFC3339 bytes_recv no group long bytes_sent no group long host group string hour_of_day group string interface group string
0 net 2023-12-31T15:00:00.000Z 915526 586714 casa23 15 primary
1 net 2023-12-31T16:00:00.000Z 2656375 1724388 casa23 16 primary
2 net 2023-12-31T17:00:00.000Z 2657586 1737768 casa23 17 primary

The important aspects here are:

  1. Data has been summarized by hour.
  2. The timestamp of each new row is the exact end of the hour that was summarized (aggregateWindow does this automatically).
  3. Only data for complete hours is summarized; the range start and stop values are coerced to ensure this happens.
  4. An hour_of_day column is added, and made part of the group key.

If I then use wideTo to send this data to a bucket, the data almost arrives correctly: the timestamps change, though, to become exactly one minute later than the source timestamps. This is consistent… if the timestamp shown in the Data Explorer for the query above is ‘2023-12-31T15:00:00.000Z’, the timestamp shown in the Data Explorer for a query from the destination bucket will be ‘2023-12-31T15:01:00.000Z’.

Since I’m planning to build dashboards based on this downsampled data, I really don’t want the timestamps to be changed. Is there something I can do to stop this from happening?

Blargh! This is not an actual problem, the data being stored is fine. The ‘default’ aggregation settings in the Data Explorer’s Query Builder are the cause of this problem, it’s just a display artifact.

1 Like