Flux query using pivot in InfluxDB 2.1 extremely slow - how to sum up values of 3 different tags and create 1 new column containing that sum?

electrick · December 10, 2021, 9:51am

Hi there
The data structure, as it is currently in influxdb:

What I am trying to achieve:
I am trying to sum up the values of phase 1, phase 2, phase 3 to get a new value/column for total energy.
phase 1, 2 and 3 are values of the tag entity_id.

The problem:
The query is extremely slow. The bucket contains about 6 million entries per entity. It takes around 18 seconds of CPU time if performed from the console. I have tried the same in SQLite, where it takes around 3 seconds of CPU time.
I’m new to InfluxDB and it is my first time using flux. I suspect that there is a much better way to write this flux query. Could anyone give me a hint how to write a faster query? I’m using influxdb 2.1.1 on linux.

The query in flux:

from(bucket: "energy-data-test-05")
  |> range(start: 2020-11-14T00:00:00Z, stop: 2021-11-14T12:00:00Z)
  |> filter(fn: (r) => r["entity_id"] == "phase1" or r["entity_id"] == "phase2" or r["entity_id"] == "phase3")
  |> group(columns: ["entity_id"])
  |> pivot(rowKey:["_time"], columnKey: ["entity_id"], valueColumn: "_value")
  |> map(fn: (r) => ({r with _value: r.phase1 + r.phase2 + r.phase3} ))

The query in SQL (in case my flux query is too convoluted to be understandable):

SELECT state_id, created, value, domain, measurement, 
SUM(states.value) AS 'total energy (Wh)'
FROM states
WHERE (states.entity_id ="phase1" OR entity_id = "phase2" OR entity_id = "phase3")
	 AND (states.created BETWEEN '2020-11-14 00:00:00' AND '2021-11-14 12:00:00')
GROUP BY created

ticchioned · February 15, 2022, 9:26pm

I have run into the same issue with the pivot function. This issue is NOT and issue on some lower versions of influx 2.x. I believe version 2.0.5 and 2.0.6 don’t have this issue. Might be worth checking out other versions to attempt to fix this.

Domenico_Briganti · May 4, 2022, 2:57pm

Same here. I have two Influxdb (2.0.9 and 2.1.1) and the same query, with the same data, on 2.1.1 take about 10x time to complete.

Topic		Replies	Views
Using Pivot in Flux on InfluxDB data - what am I doing wrong? Fluxlang influxdb , flux	6	1723	November 20, 2022
Help using flux sum()	2	2932	January 14, 2020
Optimizing Flux Query to Retrieve CPU MAX and Memory Usage Without Using Pivot InfluxDB 2	3	96	October 24, 2024
Efficient flux query to summarize measurements with same timestamp InfluxDB 2 influxdb , flux	2	2205	April 26, 2023
Optimize query with pivot Fluxlang	1	982	July 31, 2020

Flux query using pivot in InfluxDB 2.1 extremely slow - how to sum up values of 3 different tags and create 1 new column containing that sum?

Related topics