I have a DB which is ingesting about 1000 data points per second running on a VM with 4 GB RAM. The system has very low load when ingesting data, and graphing the data without any aggregate functions is pretty snappy.
However, when I try to graph the aggregate sum of data, it is extremely slow and frequently runs out of memory.
The query is as follows:
|> range(start: -3h)
|> filter(fn: (r) => r._measurement == "iptraffic")
|> filter(fn: (r) => r._field == "out_bytes" or r._field == "in_bytes")
|> group(columns: ["_field"])
|> keep(columns: ["_time", "_value", "_field"])
|> aggregateWindow(every: 1m, fn: sum)
I found that adding the keep() function made a significant difference to execution speed, but performance is still poor: the query takes about 20 seconds.
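For reference, here is the full pipeline I'm running, with a couple of tweaks I've been experimenting with. The bucket name is a placeholder (my original paste omitted the from() line), and the createEmpty: false option is just a guess at reducing output rows, not a confirmed fix:

```flux
// Sketch of the query above; "mybucket" is a placeholder bucket name.
from(bucket: "mybucket")
  |> range(start: -3h)
  |> filter(fn: (r) => r._measurement == "iptraffic")
  |> filter(fn: (r) => r._field == "out_bytes" or r._field == "in_bytes")
  // Drop unused tag columns before grouping; this is the keep() that
  // already improved execution speed noticeably.
  |> keep(columns: ["_time", "_value", "_field"])
  // Merge all series into one table per field, then sum per 1m window.
  |> group(columns: ["_field"])
  // createEmpty: false skips emitting null rows for windows with no data.
  |> aggregateWindow(every: 1m, fn: sum, createEmpty: false)
```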
The DB is really pretty small, at around 140 MB. I have a hard time believing that InfluxDB needs more than 4 GB to produce aggregates for just 3 hours of data (i.e., about 2.8M data points).
Is anyone else seeing performance this poor with aggregate functions? I'm running InfluxDB 2.0 alpha 18.