Efficient Flux query to sum measurements with the same timestamp

In our system we have around 10,000-50,000 meters (time series) per bucket, and each meter records 4 measurements per hour. Our users can select any subset (typically 100-200 meters, but potentially all of them) and create various graphs over periods of up to a 3-year time range. We have successfully used Flux queries to generate most graphs with good query speeds (especially with the more recent InfluxDB 2.0 updates). However, we are struggling with query speed for one type of graph, where all measurements sharing the same timestamp are summed. For just 10 meters, the query takes 7-8 s. To get acceptable speed, we have actually been forced to query the raw data from InfluxDB and perform the summing outside of it.

Are we doing it all wrong? Is there a faster way? Or are there any optimizations on the way for per-timestamp grouping/aggregation?

For three meters, the query looks something like this:

from(bucket: "COMPANY")
  |> range(start: 2020-06-01T00:00:00Z, stop: 2021-06-01T00:01:00Z)
  |> filter(fn: (r) => r["MeterNumber"] == "18" or r["MeterNumber"] == "19" or r["MeterNumber"] == "20")
  // regroup so that all rows sharing a timestamp end up in the same table, then sum each table
  |> group(columns: ["_time"], mode: "by")
  |> sum()
  // merge everything back into one table and downsample to monthly maxima
  |> group()
  |> aggregateWindow(every: 1mo, fn: max)

Hello @PeteHeine,
I might suggest trying a pivot instead of a group by time.

pivot(rowKey:["_time"], columnKey: ["MeterNumber"], valueColumn: "_value")

Then you can sum like this:

map(fn: (r) => ({ _time: r._time, sum: r["18"] + r["19"] + r["20"] }))
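
(Note that the pivoted columns have to be accessed with bracket syntax, because 18 is not a valid Flux identifier.) Put together with the original query, the whole pipeline could look roughly like this; a sketch, assuming the same bucket and tag names as above, with an extra group() added so that all meters land in one table before the pivot:

from(bucket: "COMPANY")
  |> range(start: 2020-06-01T00:00:00Z, stop: 2021-06-01T00:01:00Z)
  |> filter(fn: (r) => r["MeterNumber"] == "18" or r["MeterNumber"] == "19" or r["MeterNumber"] == "20")
  // merge all meters into a single table; keeping _start/_stop in the group key
  // preserves the range bounds that aggregateWindow needs later
  |> group(columns: ["_start", "_stop"])
  // one row per timestamp, one column per meter
  |> pivot(rowKey: ["_time"], columnKey: ["MeterNumber"], valueColumn: "_value")
  // per-timestamp sum across the pivoted meter columns
  |> map(fn: (r) => ({ r with _value: r["18"] + r["19"] + r["20"] }))
  |> aggregateWindow(every: 1mo, fn: max)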

Does that help the performance at all?
Also, this blog could be useful:

I am facing the same problem.

Here is my script:

import "date"

// from grafana v.defaultBucket
defaultBucket = "my_bucket"
// from grafana v.windowPeriod
reqPeriod = 5s
// from grafana template variable ${devices:regex}
devices = /some_measurement/

minPeriod = 5s
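// durations cannot be compared directly, so compare them as uint (nanoseconds)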
defaultPeriod = if uint(v: reqPeriod) < uint(v: minPeriod) then
  minPeriod
else
  reqPeriod

query_base = (bucket=defaultBucket, devices=/.*/, systems=/.*/, fields=/.*/, start=v.timeRangeStart, stop=v.timeRangeStop) =>
  from(bucket: bucket)
    |> range(start: start, stop: stop)
    |> filter(fn: (r) =>
      r._measurement =~ devices
      and r.system =~ systems
      and r._field =~ fields
    )

query_timeseries = (bucket=defaultBucket, devices=/.*/, systems=/.*/, fields=/.*/, start=v.timeRangeStart, stop=v.timeRangeStop, period=defaultPeriod) =>
  query_base(
    bucket: bucket,
    devices: devices,
    systems: systems,
    fields: fields,
    start: start,
    stop: stop
  )
  |> aggregateWindow(every: period, fn: mean)

qCommon = query_timeseries(devices: devices, systems: /(mppt|pd)/, fields: /(carOut|(usb|qcUsb|typec)\d+)Watts/)
  |> map(fn: (r) => ({
    _time: r._time,
    // Resulting field.
    // Since we already use map because of 'carOutWatts', set it here;
    // this avoids a second call to map or rename.
    _field: "DC",
    // This should be done prior to storing in the database; ignore it for this problem.
    _value: if r._field == "carOutWatts" then r._value / 10.0 else r._value
  }))

a = () => qCommon
  |> group(columns: ["_time", "_field"])
  |> sum()
  |> group(columns: ["_field"])

b = () => qCommon
  // Be sure to use:
  // - timeSrc: "_start" (to avoid skewing the results by 1s)
  // - createEmpty: false (to avoid creating empty values in our 5s original aggregate)
  |> aggregateWindow(every: 1s, fn: sum, timeSrc: "_start", createEmpty: false)

//a()
b()

Function b (using aggregateWindow) is about 2 to 3 times faster than a (group + sum + group).

The suggestion of using pivot prior to the calculation is not possible here, since I don’t know the exact fields to sum (because of the \d in the regexp, I may have 3 usb, 4 typec, or maybe 5, or… who knows?).

It was my understanding that aggregateWindow was something like:

tables
|> window(...)
|> sum(...)
|> group(...)

Which, at least in my mind, should actually be slower than

tables
|> group(...)
|> sum(...)
|> group(...)

After all, window has to compare the time of every row of every table in order to group them together.
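
For reference, the definition in the Flux standard library looks roughly like this (paraphrased from the universe package; the exact signature varies between Flux versions):

aggregateWindow = (every, fn, column="_value", timeSrc="_stop", timeDst="_time", createEmpty=true, tables=<-) =>
  tables
    // split each input table into one table per window
    |> window(every: every, createEmpty: createEmpty)
    // apply the aggregate to each window table
    |> fn(column: column)
    // re-create a _time column from the window bound
    |> duplicate(column: timeSrc, as: timeDst)
    // merge the windows back into a single table
    |> window(every: inf)

So the mental model above is close, except that the final regrouping is done with window(every: inf) rather than group().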

I therefore have three questions:

  • Is my assumption about the logic of aggregateWindow wrong?
  • If not, how can it be faster than the “group + sum + group” solution?
  • Isn’t there a better, faster way to do this? (While aggregateWindow is faster, it is still quite slow on a large window.)