Hello everyone,
I need some help in optimising a query. I need to join time-series data with data from PostgreSQL via the `sql` package. I managed to make the query work, but I have some performance issues.
The SQL data is paginated, so I read, for example, 50 devices per page. I need to left join that data with the last values for those 50 devices. The way I'm pulling Influx data now looks like this:
```flux
data = from(bucket: "my-bucket")
  |> range(start: -13h)
  |> filter(fn: (r) => r._measurement == "my-measurement" and (r._field == "temperature" or r._field == "battery"))
  |> last()
  |> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
  |> group(columns: ["device_address"])
  |> sort(columns: ["_time"], desc: true)
  |> limit(n: 1)
  |> group(columns: ["result"])
```
The problem is that this query fetches last values for around 50 000 different devices, so it gets really slow; `ReadWindowAggregate` takes 99% of the query time.
Obviously I would like to read only the devices I fetched through SQL, but I'm not sure whether that's possible in Flux. I tried `contains()`, but for some reason performance was still bad, even worse than without it. The `device_address` I would like to filter by is a tag.
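For reference, the `contains()` attempt looked roughly like this (the list and addresses below are placeholders; in the real query the list comes from the SQL page of 50 devices):

```flux
// Placeholder list; in practice this is built from the paginated SQL result
deviceAddresses = ["AA:BB:CC:01", "AA:BB:CC:02"]

from(bucket: "my-bucket")
    |> range(start: -13h)
    |> filter(fn: (r) => r._measurement == "my-measurement")
    // contains() is evaluated row by row and is not pushed down to storage,
    // which may explain why it was even slower than reading everything
    |> filter(fn: (r) => contains(value: r.device_address, set: deviceAddresses))
    |> filter(fn: (r) => r._field == "temperature" or r._field == "battery")
    |> last()
```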
I thought about using a regex instead, but building a pattern that big probably won't make it much faster.
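The regex variant I had in mind would look something like this (two placeholder addresses shown; the real pattern would have 50 alternatives):

```flux
from(bucket: "my-bucket")
    |> range(start: -13h)
    |> filter(fn: (r) => r._measurement == "my-measurement")
    // Anchored alternation over the paginated device addresses (placeholders)
    |> filter(fn: (r) => r.device_address =~ /^(AA:BB:CC:01|AA:BB:CC:02)$/)
    |> filter(fn: (r) => r._field == "temperature" or r._field == "battery")
    |> last()
```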
Any advice? The InfluxDB version I'm using locally is 2.5.1, and the Flux version is v0.188.1.