In our system we have around 10,000-50,000 meters (time series) per bucket, and each meter records 4 measurements per hour. Our users can select any subset (typically 100-200 meters, but potentially all of them) and create various graphs over periods of up to 3 years. We have successfully used Flux queries to generate most graphs with good query speeds (especially with the more recent InfluxDB 2.0 updates). However, we are struggling with query speed for one type of graph, where all measurements sharing the same timestamp are summed across meters. For just 10 meters, the query takes 7-8 seconds. To improve speed, we have actually been forced to query raw data from InfluxDB and perform the aggregation outside of it.
Are we doing it all wrong? Is there a faster way? Or are there any optimizations on the way for per-timestamp grouping/aggregation?
For three meters, the query looks something like this:
```flux
from(bucket: "COMPANY")
  |> range(start: 2020-06-01T00:00:00.0000000Z, stop: 2021-06-01T00:01:00.0000000Z)
  |> filter(fn: (r) => r["MeterNumber"] == "18" or r["MeterNumber"] == "19" or r["MeterNumber"] == "20")
  |> group(columns: ["_time"], mode: "by")
  |> sum()
  |> group()
  |> aggregateWindow(every: 1mo, fn: max)
```
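One likely bottleneck is that `group(columns: ["_time"])` materializes a separate table per timestamp (over a 3-year range at 4 readings per hour, that is on the order of 100,000 tables). A possible workaround, assuming all meters report on the same aligned 15-minute grid, is to merge all series into one table and sum within fixed windows that match the measurement interval, so the per-window sum equals the per-timestamp sum. This is a sketch to benchmark, not a guaranteed fix, and it is only equivalent when timestamps are aligned:

```flux
from(bucket: "COMPANY")
  |> range(start: 2020-06-01T00:00:00.0000000Z, stop: 2021-06-01T00:01:00.0000000Z)
  |> filter(fn: (r) => r["MeterNumber"] == "18" or r["MeterNumber"] == "19" or r["MeterNumber"] == "20")
  |> group()                                // one table containing all selected meters
  |> aggregateWindow(every: 15m, fn: sum)   // per-window sum == per-timestamp sum if timestamps align
  |> aggregateWindow(every: 1mo, fn: max)   // monthly max of the summed values, as before
```

Keeping the `filter` as a chain of `==` comparisons (rather than, say, `contains()`) also matters here, since simple equality predicates can be pushed down to storage while more complex expressions may force evaluation in memory.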