I am using the Python client to write data to and read data from InfluxDB. I noticed a pattern and cannot tell which configuration is wrong.
I have a bucket “X”. Every read from that bucket takes 75.x seconds. No matter how many points the bucket holds, the query always takes this long.
I am not sure where the problem is. Is some InfluxDB configuration wrong?
from influxdb_client import InfluxDBClient
from influxdb_client.client.query_api import QueryOptions

# url, token, org, and callback are defined elsewhere.
with InfluxDBClient(url=url, token=token,
                    org=org,
                    timeout=300_000,  # milliseconds
                    debug=False,
                    enable_gzip=True) as client:
    querier = client.query_api(query_options=QueryOptions(profilers=["query", "operator"],
                                                          profiler_callback=callback))
    query_text = ('from(bucket: "sensors")'
                  ' |> range(start: -2h)'
                  ' |> drop(columns: ["_start", "_stop"])'
                  ' |> group(columns: ["_measurement"], mode:"by")'
                  ' |> pivot(rowKey: ["_time", "ticker", "wallet"], '
                  'columnKey: ["_field"], valueColumn: "_value")')
    query_data = querier.query_data_frame(query=query_text)
Update: after posting this, I noticed that the first execution of the query takes 75.133 s and returns around 300 rows in a pandas DataFrame. If I then run a second query on the same client, immediately after the first but with a range of 4mo, I get 16,000 entries and it takes only 2 seconds.
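To check whether the 75 seconds is a one-time cost of the first call rather than a cost of the query itself, it may help to time several consecutive calls on the same client. The following is only a minimal sketch: `build_query` and `time_calls` are hypothetical helper names I made up for illustration (they are not part of `influxdb_client`), and the Flux text simply mirrors the query above.

```python
import time


def build_query(range_start, bucket="sensors"):
    """Build the same Flux query as above for a given range start (e.g. "-2h")."""
    return (f'from(bucket: "{bucket}")'
            f' |> range(start: {range_start})'
            ' |> drop(columns: ["_start", "_stop"])'
            ' |> group(columns: ["_measurement"], mode:"by")'
            ' |> pivot(rowKey: ["_time", "ticker", "wallet"], '
            'columnKey: ["_field"], valueColumn: "_value")')


def time_calls(run_query, range_starts):
    """Run run_query(flux_text) once per range and record the elapsed time.

    run_query is any callable that takes the Flux text and returns a sized
    result (a DataFrame or a list). Returns (range_start, seconds, size) tuples.
    """
    timings = []
    for range_start in range_starts:
        started = time.perf_counter()
        result = run_query(build_query(range_start))
        elapsed = time.perf_counter() - started
        timings.append((range_start, elapsed, len(result)))
    return timings


# Assumed usage with the real client, inside the `with InfluxDBClient(...)` block:
#   for range_start, seconds, size in time_calls(
#           lambda q: querier.query_data_frame(query=q), ["-2h", "-2h", "-4mo"]):
#       print(range_start, round(seconds, 3), size)
```

If only the first call is slow and a repeated `-2h` call is fast, the delay is more likely in connection setup or a cold start than in the Flux query or the amount of data returned.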