I am collecting network data from my router via Telegraf and displaying it as monthly aggregates with this rather complicated query:
import "date"

from(bucket: bucket_name)
|> range(start: -12mo)
|> filter(fn: (r) => r["_measurement"] == "net")
|> filter(fn: (r) => r["_field"] == "bytes_recv" or r["_field"] == "bytes_sent")
|> filter(fn: (r) => r["host"] == "OPNsense")
|> filter(fn: (r) => r["interface"] == "vtnet0")
|> derivative(unit: 1s, nonNegative: true)
|> map(fn: (r) => ({ r with _value: float(v: r._value) * 8.0 }))
|> aggregateWindow(every: 1mo, fn: sum, createEmpty: false)
|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
|> map(fn: (r) => ({ r with month: string(v: date.sub(from: r["_time"], d: 1h)), count: r["_value"] }))
|> yield(name: "sum")
Obviously this is also rather slow, so after confirming it works I downsampled the data to a new bucket with the following task:
from(bucket: bucket_name)
|> range(start: -task.every)
|> filter(fn: (r) => r["_measurement"] == "net")
|> filter(fn: (r) => r["_field"] == "bytes_recv" or r["_field"] == "bytes_sent")
|> filter(fn: (r) => r["host"] == "OPNsense")
|> increase()
|> aggregateWindow(every: 1h, fn: sum, createEmpty: false)
|> to(bucket: "opnsense-net_1h")
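For completeness, the option block that sits above the task query is just the usual one, roughly like this (the name and interval here are placeholders, not my real config):

option task = {name: "opnsense-net-downsample", every: 1h} // placeholder name and schedule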
Then I made a query over the downsampled data like this:
from(bucket: "opnsense-net_1h")
|> range(start: -12mo)
|> filter(fn: (r) => r["_measurement"] == "net")
|> filter(fn: (r) => r["_field"] == "bytes_recv" or r["_field"] == "bytes_sent")
|> filter(fn: (r) => r["host"] == "OPNsense")
|> filter(fn: (r) => r["interface"] == "vtnet0")
|> derivative(unit: 1s, nonNegative: true)
|> map(fn: (r) => ({ r with _value: float(v: r._value) * 8.0 }))
|> aggregateWindow(every: 1mo, fn: sum, createEmpty: false)
|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
|> map(fn: (r) => ({ r with month: string(v: date.sub(from: r["_time"], d: 1h)), count: r["_value"] }))
|> yield(name: "sum")
However, I am getting substantially different results when querying the downsampled data compared to the original data: the downsampled values come out around 30-70% higher.
I suspect this is caused by resets of the cumulative counters within the downsampled period. I thought the increase() function would mitigate that, but apparently it does not do so very well.
Any suggestions on how to improve these queries so that I get more realistic output from the downsampled data?
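For instance, would it be more correct to have the task store only the net per-hour delta instead of summing the running total that increase() produces? Something along these lines is what I have in mind (an untested sketch, using the same buckets as above):

from(bucket: bucket_name)
|> range(start: -task.every)
|> filter(fn: (r) => r["_measurement"] == "net")
|> filter(fn: (r) => r["_field"] == "bytes_recv" or r["_field"] == "bytes_sent")
|> filter(fn: (r) => r["host"] == "OPNsense")
// per-sample delta; nonNegative avoids large negative values when a counter resets
|> difference(nonNegative: true)
|> aggregateWindow(every: 1h, fn: sum, createEmpty: false)
|> to(bucket: "opnsense-net_1h")

The monthly query would then presumably just sum these hourly deltas (and multiply by 8.0 for bits) instead of applying derivative() again, but I am not sure that is the right pattern.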