InfluxDB 1.8.5 query takes long

Manuele · May 12, 2021, 2:02pm

Hello,
I am trying to integrate InfluxDB (OSS 1.8.5) to a Scada (Ignition) in order to more efficiently store and retrieve time series data. To connect the two, I am using the InfluxDB python module. The OS that I am using is Windows 10.

This is what I did:

Generated a database (called historian ) in InfluxDB.
Randomly generated a value between -5 and 5 (which is going to simulate my time series data).
Saved that point of data with the timestamp of “May 1st 2021 00:00:00”, in a measurement called tag_history . This point is also added a tag, called tagid , with the value 1 ( Note: Ignition also uses the concept of “tags”. tagid refers to the tag inside Ignition, not the tag stored in InfluxDB ).
Saved the same point (with the same timestamp) to a SQL Server for comparison.
Repeated step 3. and 4. increasing the timestamp by 5 minutes each time, until the date “June 1st 2021 00:00:00” is reached (basically generating 1 month worth of data, with increments of 5 minutes).

This comes out to a total number of 8928 points (12 variations per hour, for 31 days).

The problem is that, even with this relatively small amount of data, querying takes very long. In order to query the data I use the following code:

from influxdb import InfluxDBClient

# constants are defined in another file
connector = InfluxDBClient(host=HOST, port=PORT, username=ADMIN, password=PASSWORD)
query = "select * from \"historian\".\"autogen\".\"tag_history\""
results = connector.query(query)
return results.get_points()

The average time needed to run this code is around 6 seconds (average over 20 attempts), whereas the time taken to extract the same data from the SQL Server is around 1 second (average over 20 attempts). What am I doing wrong? Is there a way to increase the efficiency of the query? Or am I saving data in the wrong way to begin with?

Note: I also tried querying over a specific time range

[...]
query = """
    select *
    from "historian"."autogen"."tag_history"
    where time >= '2021-05-10T00:00:00Z' and time <= '2021-05-20T00:00:00Z'
"""
[...]

and adding a where clause on the tagid tag like this

[...]
query = "select * from \"historian\".\"autogen\".\"tag_history\" where \"tagid\" = '1'"
[...]

But querying on InfluxDB consistently requires a lot more time than querying on the SQL Server.

I do not know if this is the correct place to ask this question.

Thanks in advance!

Anaisdg · May 17, 2021, 4:58pm

Hello @Manuele,
I don’t know if this will help, but that’s the old client. Here’s the new one: GitHub - influxdata/influxdb-client-python: InfluxDB 2.0 python client

@bednar do you have thoughts on performance here?

Thank you.

bednar · May 18, 2021, 7:12am

I don’t have an exact measured performance change with v2 client, but there is an example how to query InfluxDB 1.8

github.com

influxdata/influxdb-client-python/blob/4792fd5a285817f99391b792150497814ce4b1e5/examples/influxdb_18_example.py#L25


      
          
          
    point = Point("mem").tag("host", "host1").field("used_percent", 25.43234543)
              print(point.to_line_protocol())
          
          
    write_api.write(bucket=bucket, record=point)
          
          
print('*** Query Points ***')
          
          
query_api = client.query_api()
          query = f'from(bucket: \"{bucket}\") |> range(start: -1h)'
          tables = query_api.query(query)
          for record in tables[0].records:
              print(f'#{record.get_time()} #{record.get_measurement()}: #{record.get_field()} #{record.get_value()}')

Topic		Replies	Views
influxDB python query is slow InfluxDB 1 influxdb	1	1851	August 14, 2022
Slower query time compared to SQL Server influxdb	4	3988	August 14, 2017
Data query through Python API is slow InfluxDB 2	1	873	June 28, 2023
Slower query time compared to PostgreSQL influxdb , influxql	1	2661	August 23, 2017
Using python client, query on a specific bucket take 75.x seconds InfluxDB 2 flux	2	677	October 23, 2022

InfluxDB 1.8.5 query takes long

Related topics