I am using the Python client to insert data into and query data from InfluxDB, and I have noticed a pattern that I cannot explain.
I have a bucket "X". Every read from that bucket takes 75.x seconds, no matter how many points the bucket contains; it always takes this long to query data.
I am not sure where the problem is. Is some configuration in InfluxDB wrong?
Which version of InfluxDB are you using, and which client and client version?
Can you also please share your Python script?
My Flux query is this one:
|> range(start: -2h)
|> drop(columns: ["_start", "_stop"])
|> group(columns: ["_measurement"], mode: "by")
|> pivot(rowKey: ["_time", "type"], columnKey: ["_field"], valueColumn: "_value")
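For reference, Flux's pivot() reshapes the long-format output (one row per field value) into wide format, with one column per field. A minimal pandas stand-in of the same reshaping, using made-up data ("temp" and "hum" fields are hypothetical):

```python
import pandas as pd

# Long-format rows, shaped like Flux output before pivot (hypothetical data).
df = pd.DataFrame({
    "_time":  ["t1", "t1", "t2", "t2"],
    "type":   ["a",  "a",  "a",  "a"],
    "_field": ["temp", "hum", "temp", "hum"],
    "_value": [20.0, 55.0, 21.0, 54.0],
})

# Equivalent of pivot(rowKey: ["_time", "type"], columnKey: ["_field"],
# valueColumn: "_value"): one row per (_time, type), one column per field.
wide = (df.pivot_table(index=["_time", "type"], columns="_field", values="_value")
          .reset_index())
print(wide)
```

The result has one row per (_time, type) pair and a "temp" and a "hum" column instead of the _field/_value pair.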
<<< Content-Encoding: gzip
<<< Content-Type: text/csv; charset=utf-8
<<< Vary: Accept-Encoding
<<< X-Influxdb-Build: OSS
<<< X-Influxdb-Version: v2.4.0
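Those headers confirm the server is gzip-compressing the annotated-CSV response, which matches enable_gzip=True on the client. As a quick stand-in (no InfluxDB involved, and the payload below is made up), this is the compression round trip the client performs transparently:

```python
import gzip

# Hypothetical annotated-CSV payload like InfluxDB's /api/v2/query returns.
payload = b"_time,_measurement,_field,_value\n2023-01-01T00:00:00Z,sensors,temp,20.0\n"

compressed = gzip.compress(payload)     # what travels over the wire
restored = gzip.decompress(compressed)  # what the client hands to the parser
```

If decompression dominated the query time, the cost would scale with the result size, so it would not explain a flat 75 s for every read.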
from influxdb_client import InfluxDBClient
from influxdb_client.client.query_api import QueryOptions

with InfluxDBClient(url=url, token=token, enable_gzip=True) as client:
    querier = client.query_api(
        query_options=QueryOptions(profilers=["query", "operator"]))
    query_text = ('from(bucket: "sensors")'
                  ' |> range(start: -2h)'
                  ' |> drop(columns: ["_start", "_stop"])'
                  ' |> group(columns: ["_measurement"], mode: "by")'
                  ' |> pivot(rowKey: ["_time", "ticker", "wallet"],'
                  ' columnKey: ["_field"], valueColumn: "_value")')
    query_data = querier.query_data_frame(query=query_text)
Since posting this, I have noticed something: the first time I execute the query it takes 75.133 s and returns around 300 rows in a pandas DataFrame. If I then execute a second query on the same client (immediately afterwards, with a different range of 4mo), I get 16,000 rows and it takes only 2 seconds.
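That pattern (a slow first query, then fast queries even over a much larger range) looks consistent with a cold start rather than a per-query cost: the first read warms the storage engine's and operating system's caches, and later reads benefit regardless of the range. A toy stand-in, purely illustrative (the flag, delay, and return value are all made up):

```python
import time

_warmed = False  # stands in for cold storage-engine / OS page caches

def query(range_spec: str) -> int:
    """Hypothetical stand-in for an InfluxDB query: the first call pays a
    one-time warm-up cost; later calls, even over a different range,
    hit the already-warmed caches."""
    global _warmed
    if not _warmed:
        time.sleep(0.2)  # simulated cold-start cost
        _warmed = True
    return 300  # pretend row count

t0 = time.perf_counter()
query("-2h")
first = time.perf_counter() - t0

t0 = time.perf_counter()
query("-4mo")  # different range, still fast once warmed
second = time.perf_counter() - t0

print(f"first: {first:.3f}s, second: {second:.3f}s")
```

The "query" and "operator" profilers you enabled should show where the real 75 s goes; if it sits in the storage reads on the first run only, a cold cache is the likely explanation.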