Large query performance

Ok, here is the problem we have. We are streaming data into InfluxDB, sampled at a 1s interval.
From time to time we need to query data over a large time range; 5 years is a good example. We do not need 1s precision, but we would like this query to be fast.

Example:
SELECT mean(value) AS value FROM "values" WHERE ( id=$F_0 ) AND time >= '2016-07-22T00:00:00-07:00' AND time < '2021-07-21T00:00:00-07:00' GROUP BY time(86400s), id fill(none);
A query like this takes > 20s.

We tried downsampling the data using continuous queries, but it's a major challenge to keep the downsampled data synchronized with the actual data. Our ingestion is not constant: we can have outages and then catch up, which leaves gaps in the downsampled data.

Is there a performant way to fill gaps in data produced by continuous queries?
Is there a performant way to query data over large periods of time at, for example, a 24h interval?

About filling gaps, it depends…
You can always fill gaps manually by running the CQ's query yourself for the "missing range".
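A minimal sketch of such a manual backfill, assuming the downsampled data lives in a measurement called "values_24h" (that name, and the gap's time range, are just placeholders for your setup):

```sql
-- Re-run the aggregation by hand over the known gap and write the
-- results into the downsampled measurement (overwrites matching points).
SELECT mean(value) AS value
INTO "values_24h"
FROM "values"
WHERE time >= '2021-07-01T00:00:00Z' AND time < '2021-07-04T00:00:00Z'
GROUP BY time(24h), id fill(none)
```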
But if you know how long it takes to recover from an outage, you can simply have the CQ reprocess a whole trailing window every time it runs. As an example, you could configure (sketched below):

  • AggrInterval - downsample data to 24h intervals
  • Frequency - run every 24h
  • TimeWindow - reprocess 3 whole days (72h) of data, overwriting previous results
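A sketch of that CQ using the RESAMPLE clause; the database and measurement names ("mydb", "values_24h") are assumptions:

```sql
-- EVERY 24h = Frequency, FOR 72h = TimeWindow, GROUP BY time(24h) = AggrInterval.
-- Reprocessing the last 72h each run papers over ingestion outages shorter
-- than the FOR window, since late-arriving points get re-aggregated.
CREATE CONTINUOUS QUERY "cq_values_24h" ON "mydb"
RESAMPLE EVERY 24h FOR 72h
BEGIN
  SELECT mean(value) AS value
  INTO "values_24h"
  FROM "values"
  GROUP BY time(24h), id
END
```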

See the CQ advanced syntax docs for details.

About performance: this case is the perfect example of where downsampling pays off; with this much raw data I doubt there is much else you can do to improve query performance. (I assume you are already selecting as little data as possible in the query.) The only other options I see are increasing resources, or maybe splitting the work into more than one query (one per year?), which should help with wall-clock time but will use more resources.
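Once the downsampled series exists, the 5-year query can hit it directly. A sketch, again assuming the measurement is called "values_24h": it scans roughly 1,800 daily points per id instead of ~160 million raw 1s points:

```sql
-- Same shape as the original query, but against the pre-aggregated data.
-- mean() over daily means is a close approximation here, since the 1s
-- buckets are equal-sized apart from gaps.
SELECT mean(value) AS value
FROM "values_24h"
WHERE ( id=$F_0 ) AND time >= '2016-07-22T00:00:00-07:00' AND time < '2021-07-21T00:00:00-07:00'
GROUP BY time(86400s), id fill(none);
```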
