Hello everyone! I see that there aren’t many hits here for wideTo
(nor are there many in a global Internet search…), but hopefully someone out there will be able to guide me here.
I’m using InfluxDB 2 OSS (version 2.7.4). I’ve got a table called network-traffic
which was derived from the Telegraf ‘net’ collector, but the bytes_recv
and bytes_send
columns have been transformed into differences instead of totals.
A simple query using the Query Builder produces a result like this:
table mean | _measurement group string | _field group string | _value no group double | _time no group dateTime:RFC3339 | host group string | interface group string |
---|---|---|---|---|---|---|
0 | net | bytes_recv | 1132607 | 2023-12-31T17:29:10.000Z | ns5 | ens5 |
1 | net | bytes_sent | 15443530 | 2023-12-31T17:29:10.000Z | ns5 | ens5 |
My goal is to produce a downsampled version of this table, with sums of the two fields for each hour. Using this query:
import "date"
from(bucket: "network-traffic")
|> range(start: date.sub(from: date.truncate(t: now(), unit: 1h), d: 3h), stop: date.truncate(t: now(), unit: 1h))
|> filter(fn: (r) => r["_measurement"] == "net")
|> filter(
fn: (r) => r["_field"] == "bytes_sent" or r["_field"] == "bytes_recv",
)
|> aggregateWindow(every: 1h, fn: sum, createEmpty: false)
|> map(fn: (r) => ({r with hour_of_day: string(v: date.hour(t: r._time))}))
|> group(columns: ["_measurement", "host", "interface", "hour_of_day"])
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
I get this output:
table _result | _measurement group string | _time no group dateTime:RFC3339 | bytes_recv no group long | bytes_sent no group long | host group string | hour_of_day group string | interface group string |
---|---|---|---|---|---|---|---|
0 | net | 2023-12-31T15:00:00.000Z | 915526 | 586714 | casa23 | 15 | primary |
1 | net | 2023-12-31T16:00:00.000Z | 2656375 | 1724388 | casa23 | 16 | primary |
2 | net | 2023-12-31T17:00:00.000Z | 2657586 | 1737768 | casa23 | 17 | primary |
The important aspects here are:
- Data has been summarized by hour.
- The timestamp of each new row is the exact end of the hour that was summarized (
aggregateWindow
does this automatically). - Only data for complete hours is summarized; the
range
start and stop values are coerced to ensure this happens. - An
hour_of_day
column is added, and made part of the group key.
If I then use wideTo
to send this data to a bucket, the data almost arrives correctly: the timestamps change, though, to become exactly one minute later than the source timestamps. This is consistent… if the timestamp shown in the Data Explorer for the query above is ‘2023-12-31T15:00:00.000Z’, the timestamp shown in the Data Explorer for a query from the destination bucket will be ‘2023-12-31T15:01:00.000Z’.
Since I’m planning to build dashboards based on this downsampled data, I really don’t want the timestamps to be changed. Is there something I can do to stop this from happening?