Hello,
I am trying to integrate InfluxDB (OSS 1.8.5) to a Scada (Ignition) in order to more efficiently store and retrieve time series data. To connect the two, I am using the InfluxDB python module. The OS that I am using is Windows 10.
This is what I did:
- Generated a database (called historian ) in InfluxDB.
- Randomly generated a value between -5 and 5 (which is going to simulate my time series data).
- Saved that point of data with the timestamp of “May 1st 2021 00:00:00”, in a measurement called tag_history . This point is also added a tag, called tagid , with the value 1 ( Note: Ignition also uses the concept of “tags”. tagid refers to the tag inside Ignition, not the tag stored in InfluxDB ).
- Saved the same point (with the same timestamp) to a SQL Server for comparison.
- Repeated step 3. and 4. increasing the timestamp by 5 minutes each time, until the date “June 1st 2021 00:00:00” is reached (basically generating 1 month worth of data, with increments of 5 minutes).
This comes out to a total number of 8928 points (12 variations per hour, for 31 days).
The problem is that, even with this relatively small amount of data, querying takes very long. In order to query the data I use the following code:
from influxdb import InfluxDBClient
# constants are defined in another file
connector = InfluxDBClient(host=HOST, port=PORT, username=ADMIN, password=PASSWORD)
query = "select * from \"historian\".\"autogen\".\"tag_history\""
results = connector.query(query)
return results.get_points()
The average time needed to run this code is around 6 seconds (average over 20 attempts), whereas the time taken to extract the same data from the SQL Server is around 1 second (average over 20 attempts). What am I doing wrong? Is there a way to increase the efficiency of the query? Or am I saving data in the wrong way to begin with?
Note: I also tried querying over a specific time range
[...]
query = """
select *
from "historian"."autogen"."tag_history"
where time >= '2021-05-10T00:00:00Z' and time <= '2021-05-20T00:00:00Z'
"""
[...]
and adding a where clause on the tagid tag like this
[...]
query = "select * from \"historian\".\"autogen\".\"tag_history\" where \"tagid\" = '1'"
[...]
But querying on InfluxDB consistently requires a lot more time than querying on the SQL Server.
I do not know if this is the correct place to ask this question.
Thanks in advance!