Hi there
The data structure, as it is currently in influxdb:
What I am trying to achieve:
I am trying to sum up the values of phase 1, phase 2, phase 3 to get a new value/column for total energy.
phase 1, 2 and 3 are values of the tag entity_id.
The problem:
The query is extremely slow. The bucket contains about 6 million entries per entity. It takes around 18 seconds of CPU time if performed from the console. I have tried the same in SQLite, where it takes around 3 seconds of CPU time.
I’m new to InfluxDB and it is my first time using flux. I suspect that there is a much better way to write this flux query. Could anyone give me a hint how to write a faster query? I’m using influxdb 2.1.1 on linux.
The query in flux:
from(bucket: "energy-data-test-05")
|> range(start: 2020-11-14T00:00:00Z, stop: 2021-11-14T12:00:00Z)
|> filter(fn: (r) => r["entity_id"] == "phase1" or r["entity_id"] == "phase2" or r["entity_id"] == "phase3")
|> group(columns: ["entity_id"])
|> pivot(rowKey:["_time"], columnKey: ["entity_id"], valueColumn: "_value")
|> map(fn: (r) => ({r with _value: r.phase1 + r.phase2 + r.phase3} ))
The query in SQL (in case my flux query is too convoluted to be understandable):
SELECT state_id, created, value, domain, measurement,
SUM(states.value) AS 'total energy (Wh)'
FROM states
WHERE (states.entity_id ="phase1" OR entity_id = "phase2" OR entity_id = "phase3")
AND (states.created BETWEEN '2020-11-14 00:00:00' AND '2021-11-14 12:00:00')
GROUP BY created