Using Python client, query on a specific bucket takes 75.x seconds

ctd · October 22, 2022, 6:39pm

Hi,

I am using the Python client to insert and get data from InfluxDB. I have noticed a pattern, and I do not know which configuration is wrong.

I have a bucket “X”. Every read from that bucket takes 75.x seconds. It doesn’t matter how many points are in the bucket; the query always takes this long.
I am not sure where the problem is. Is some configuration in InfluxDB wrong?

Anaisdg · October 23, 2022, 6:29pm

Hello @ctd,
Welcome!
What version of InfluxDB are you using and which client and version?
Can you also please share your python script?
Thank you!

ctd · October 23, 2022, 8:01pm

Hi,
My Flux query is this one:

from(bucket: "sensors")
    |> range(start: -2h)
    |> drop(columns: ["_start", "_stop"])
    |> group(columns: ["_measurement"], mode: "by")
    |> pivot(rowKey: ["_time", "type"], columnKey: ["_field"], valueColumn: "_value")

User-Agent: influxdb-client-python/1.33.0

<<< Content-Encoding: gzip
<<< Content-Type: text/csv; charset=utf-8
<<< Vary: Accept-Encoding
<<< X-Influxdb-Build: OSS
<<< X-Influxdb-Version: v2.4.0

Python:

from influxdb_client import InfluxDBClient
from influxdb_client.client.query_api import QueryOptions

# url, token, org and callback are defined elsewhere in the script.
with InfluxDBClient(url=url, token=token,
                    org=org,
                    timeout=300_000,  # client timeout, in milliseconds
                    debug=False,
                    enable_gzip=True) as client:
    # Enable the Flux profilers; profiler rows are passed to callback.
    querier = client.query_api(query_options=QueryOptions(profilers=["query", "operator"],
                                                          profiler_callback=callback))

    query_text = ('from(bucket: "sensors")'
                  ' |> range(start: -2h)'
                  ' |> drop(columns: ["_start", "_stop"])'
                  ' |> group(columns: ["_measurement"], mode:"by")'
                  ' |> pivot(rowKey: ["_time", "ticker", "wallet"], '
                  'columnKey: ["_field"], valueColumn: "_value")')
    query_data = querier.query_data_frame(query=query_text)
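
The `callback` passed as `profiler_callback` is not shown in the post. A minimal sketch of what such a callback could look like follows. It assumes the client calls it once per profiler record and that each record exposes its columns as a dict in `.values`, as `FluxRecord` does in influxdb-client-python. The class name `ProfilerCollector` and its `by_measurement` helper are hypothetical, and whether the callback fires for `query_data_frame` should be checked against the installed client version.

```python
from collections import defaultdict


class ProfilerCollector:
    """Hypothetical helper: collects the profiler rows handed to it."""

    def __init__(self):
        self.records = []

    def __call__(self, flux_record):
        # Assumption: flux_record.values is a dict of column name -> value.
        self.records.append(dict(flux_record.values))

    def by_measurement(self):
        """Group collected rows by their _measurement column.

        Rows without a _measurement column are grouped under "unknown".
        """
        grouped = defaultdict(list)
        for row in self.records:
            grouped[row.get("_measurement", "unknown")].append(row)
        return dict(grouped)
```

With `callback = ProfilerCollector()`, the rows gathered during a query can be inspected afterwards through `callback.records` to compare the duration the server reports with the wall-clock time seen by the script.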

After posting this, I noticed that the first time I execute the query it takes 75.133 s and returns around 300 rows in a pandas DataFrame. If I then execute the query a second time with the same client (two runs, one after another, with different ranges), this time with a range of 4mo, I get 16,000 rows and it takes only 2 seconds.
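
One way to check whether the delay follows the first call rather than the query itself is to time consecutive calls made through the same client. The helper below is only a sketch using the standard library. `time_calls` is a hypothetical name, and it measures wall-clock time around whatever callables it is given, without saying anything about the cause of the delay.

```python
import time


def time_calls(calls):
    """Run each (label, zero-argument callable) pair once, in order.

    Returns a list of (label, elapsed_seconds) tuples.
    """
    timings = []
    for label, fn in calls:
        start = time.perf_counter()
        fn()
        timings.append((label, time.perf_counter() - start))
    return timings
```

Inside the `with InfluxDBClient(...)` block from the script above, it could be used as `time_calls([("first", lambda: querier.query_data_frame(query=query_text)), ("second", lambda: querier.query_data_frame(query=query_text))])`. If only the first entry is slow, the time is probably not being spent on the query itself.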
