Using the Python client, every query on a specific bucket takes 75.x seconds

Hi,

I am using the Python client to write data to and read data from InfluxDB. I noticed a pattern, and I do not know which configuration is wrong.

I have a bucket “X”. Every read from that bucket takes 75.x seconds. It doesn’t matter how many points the bucket contains; the query always takes this long.
I am not sure where the problem is. Is some configuration in InfluxDB wrong?

Hello @ctd,
Welcome!
What version of InfluxDB are you using and which client and version?
Can you also please share your python script?
Thank you!

Hi,
My Flux query is this one:

from(bucket: "sensors")
|> range(start: -2h)
|> drop(columns: ["_start", "_stop"])
|> group(columns: ["_measurement"], mode: "by")
|> pivot(rowKey: ["_time", "type"], columnKey: ["_field"], valueColumn: "_value")

User-Agent: influxdb-client-python/1.33.0

<<< Content-Encoding: gzip
<<< Content-Type: text/csv; charset=utf-8
<<< Vary: Accept-Encoding
<<< X-Influxdb-Build: OSS
<<< X-Influxdb-Version: v2.4.0

Python:

from influxdb_client import InfluxDBClient
from influxdb_client.client.query_api import QueryOptions

# url, token, org and callback are defined elsewhere in the script.
with InfluxDBClient(url=url, token=token, org=org,
                    timeout=300_000,  # client timeout in milliseconds
                    debug=False,
                    enable_gzip=True) as client:
    # Enable the Flux profilers; their records are passed to callback.
    querier = client.query_api(query_options=QueryOptions(
        profilers=["query", "operator"],
        profiler_callback=callback))

    query_text = ('from(bucket: "sensors")'
                  ' |> range(start: -2h)'
                  ' |> drop(columns: ["_start", "_stop"])'
                  ' |> group(columns: ["_measurement"], mode:"by")'
                  ' |> pivot(rowKey: ["_time", "ticker", "wallet"], '
                  'columnKey: ["_field"], valueColumn: "_value")')
    query_data = querier.query_data_frame(query=query_text)
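
The snippet above passes a callback to the profilers without showing it. A minimal sketch of what such a callback might look like follows, assuming the client calls it once per profiler record with an object that exposes a `values` dictionary (as the client's README example does). The class name `ProfilerCollector` and the `FakeRecord` stand-in are hypothetical and only serve to make the sketch runnable without a server.

```python
class ProfilerCollector:
    """Hypothetical callback that stores every profiler record it receives."""

    def __init__(self):
        self.records = []

    def __call__(self, flux_record):
        # Assumption: each record exposes its columns as a dict in .values.
        self.records.append(flux_record.values)


class FakeRecord:
    """Stand-in for a profiler record, used only to exercise the callback."""

    def __init__(self, values):
        self.values = values


callback = ProfilerCollector()
# Illustrative column name, not taken from real profiler output.
callback(FakeRecord({"_measurement": "profiler/query"}))
print(len(callback.records))
```

With a real client, `callback` would be passed as `profiler_callback=callback`, and `callback.records` could be inspected after the query returns to see where the server spent its time.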

After adding this comment, I noticed that the first time I execute the query it takes 75.133 s and returns around 300 results in a pandas DataFrame. If I then execute the query a second time with the same client (two queries, one after another, with different ranges), this time with a range of 4mo, I get 16,000 entries and it takes only 2 seconds.
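
To compare the first and second call more systematically, each call can be timed on its own. The sketch below is one possible way to do that. The `timed` helper and the `fake_query` stand-in are hypothetical names introduced here; in the real script the call being timed would be `querier.query_data_frame`.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return its result together with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def fake_query(query: str) -> list:
    """Hypothetical stand-in for querier.query_data_frame(query=...)."""
    time.sleep(0.01)
    return [query]


# Time two consecutive calls, as in the two-query experiment described above.
first, first_s = timed(fake_query, "range(start: -2h)")
second, second_s = timed(fake_query, "range(start: -4mo)")
print(f"first call: {first_s:.3f} s, second call: {second_s:.3f} s")
```

If the first call on a fresh client is always the slow one regardless of the range, that would suggest the delay comes from connection setup or something outside the query itself rather than from the amount of data.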