Speed up InfluxDB2 Query via golang client?

jos · June 23, 2020, 9:58am

I’m creating a simulation that uses influxDB (via the golang client) to store data. Writing the information is very quick, but the query from the client to the server takes longer than I am hoping for, since I am trying to update the simulation multiple times a second.

If I query more often than once per second, the responses stall and the returned data lags behind the requests. For example, if the client asks for information at a specific timestamp every 0.25 seconds, it takes about 3 seconds before the simulation begins sending back fresh data for that timestamp; everything before that uses older data with different timestamps.

Conversely, if the client asks for info once per second, it takes only 1 second to send back the updated data. I think this is a sign of a network bottleneck, but I’m not sure whether it can be avoided.

How can I speed up the query process and avoid network bottlenecks in order to query multiple times every second?

VlastaHajek · June 23, 2020, 1:59pm

Hi @jos ,
could you please share the queries you’ve mentioned? Some info about the data you are writing would also help, e.g. write frequency, types, cardinality.

jos · June 24, 2020, 12:47am

Sure.

An example of a query is below:

from(bucket: "MyBucket")
    |> range(start: -2d, stop: 2d)
    |> filter(fn: (r) => r["_measurement"] == "ExampleData")
    |> filter(fn: (r) => r["_field"] == "Var1" or r["_field"] == "Var2" or r["_field"] == "Var3" or r["_field"] == "Var4")
    |> pivot(
        rowKey: ["_time"],
        columnKey: ["_field"],
        valueColumn: "_value"
    )
    |> filter(fn: (r) => r["_time"] == Example Timestamp Value (in golang time format))
    |> filter(fn: (r) => r["name"] == "ObjectName")

I am using queryAPI.Query to make the query via golang.
A written point is structured as:

p := influxdb2.NewPointWithMeasurement("ObjectData").
		AddTag("name", object.Name). 
		AddField("var1", var1).
		AddField("var2", var2).
		...
		AddField("var8", var8). 
		SetTime(timestamp)

Vars 1-6 are float64, and the last two vars are int. I’m currently simulating a day’s worth of data at one-second intervals; it is written only at the beginning of the simulation, which takes about 5 seconds to complete.

Currently I am querying for data once per second. The way this works is:

1. I get a request for data from the client and unmarshal its JSON string. The request includes a timestamp.
2. I begin constructing a response struct.
3. I use the golang InfluxDB2 client to query the database for the data at that timestamp.
4. I use the data from the query to fill out the response struct.
5. The response struct is marshalled into JSON and sent to the client.
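
A minimal sketch of the flow above, using only the Go standard library. The Request and Response shapes, the handle function and the stubQuery stand-in are all hypothetical, since the thread does not show the real types; in the actual program the QueryFunc would wrap the call to queryAPI.Query.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Request and Response are hypothetical shapes for illustration only.
type Request struct {
	Timestamp time.Time `json:"timestamp"`
}

type Response struct {
	Timestamp time.Time          `json:"timestamp"`
	Values    map[string]float64 `json:"values"`
}

// QueryFunc stands in for the database lookup at a given timestamp.
type QueryFunc func(ts time.Time) (map[string]float64, error)

// handle unmarshals the request, runs the query for its timestamp,
// fills the response struct, and marshals it back to JSON.
func handle(raw []byte, query QueryFunc) ([]byte, error) {
	var req Request
	if err := json.Unmarshal(raw, &req); err != nil {
		return nil, fmt.Errorf("decode request: %w", err)
	}
	resp := Response{Timestamp: req.Timestamp}
	values, err := query(req.Timestamp)
	if err != nil {
		return nil, fmt.Errorf("query: %w", err)
	}
	resp.Values = values
	return json.Marshal(resp)
}

// stubQuery returns fixed data so the sketch runs without a database.
func stubQuery(_ time.Time) (map[string]float64, error) {
	return map[string]float64{"Var1": 1.5}, nil
}

func main() {
	out, err := handle([]byte(`{"timestamp":"2020-06-23T05:34:01Z"}`), stubQuery)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```

Passing the lookup in as a function keeps the JSON handling separate from the database call, so each step can be timed on its own when looking for the source of the lag.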

If the client requests info more often than once per second, the request timestamp from the client will stall. That is, if I request data for 5:34:01 am and the last set of data I asked for was for 5:34:00 am, it takes an additional 3 seconds before the request data actually switches from 5:34:00 to 5:34:01.

Do you require more info? Likewise, what can be done here to improve query response time? Would the use of QueryRaw be better suited for applications such as this?

VlastaHajek · June 25, 2020, 4:32pm

You wrote in the first post that if you query once per second, the query takes just a second to complete. Is it the same query as the one you showed above?

To speed up your query, I would recommend doing all filtering before pivot, if possible.
You could also ask for a specific point in time by setting the range to your timestamp:
start = your timestamp
stop = your timestamp + 1 nanosecond.
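
As a rough sketch of both suggestions, the query string could be built in Go as below. The helper names pointRange and buildPointQuery and the exampleTimestamp value are hypothetical; the bucket, measurement, field and tag names are taken from the example query earlier in the thread. The resulting string would then be passed to queryAPI.Query as before.

```go
package main

import (
	"fmt"
	"time"
)

// exampleTimestamp is an arbitrary illustrative value.
var exampleTimestamp = time.Date(2020, time.June, 23, 5, 34, 1, 0, time.UTC)

// pointRange returns RFC 3339 start and stop times that cover exactly one
// timestamp: range() includes start and excludes stop.
func pointRange(ts time.Time) (start, stop string) {
	ts = ts.UTC()
	return ts.Format(time.RFC3339Nano), ts.Add(time.Nanosecond).Format(time.RFC3339Nano)
}

// buildPointQuery narrows the range to a single timestamp and applies every
// filter before pivot. %q is assumed to be adequate quoting for simple names;
// it is not a general Flux escaping routine.
func buildPointQuery(bucket, measurement, name string, ts time.Time) string {
	start, stop := pointRange(ts)
	return fmt.Sprintf(`from(bucket: %q)
    |> range(start: %s, stop: %s)
    |> filter(fn: (r) => r["_measurement"] == %q)
    |> filter(fn: (r) => r["name"] == %q)
    |> filter(fn: (r) => r["_field"] == "Var1" or r["_field"] == "Var2" or r["_field"] == "Var3" or r["_field"] == "Var4")
    |> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")`,
		bucket, start, stop, measurement, name)
}

func main() {
	fmt.Println(buildPointQuery("MyBucket", "ExampleData", "ObjectName", exampleTimestamp))
}
```

With the range this narrow, the separate filter on _time is no longer needed, and pivot only has to process the rows for one object at one timestamp.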

jos · July 1, 2020, 12:21am

Hello,

Yes, it is that query above. However, from my own testing it looks like the Query takes about 250ms to actually complete, which seems slow for this kind of database.

I will try implementing the operations you mentioned here, but out of curiosity, is there a way to write data such that a pivot() is not required when querying?

EDIT: It looks like the timestamp start/stop suggestion helped speed it up greatly (from 250ms to 28ms, by looking one second before and after) but I’m guessing there are still performance gains to be obtained here.
