InfluxQL: Using time in Subqueries slows performance

oleg_dev · March 23, 2018, 3:51pm

Hi All,

Maybe the problem is trivial, but it is reproducible and causes a lot of confusion given what the documentation shows.

InfluxDB v1.4.2

I was trying to improve query performance in my environment and ran into slow subquery expressions.
E.g.
SELECT COUNT(*) FROM (SELECT MEAN(aaa) FROM xxx WHERE bbb='abc' AND time > now()-10m AND ccc=true GROUP BY ddd) WHERE mean >= 40 AND mean <= 60

The ANALYZE output shows the following:
EXPRESSION: mean(aaa::float)
NUMBER OF SHARDS: 7
NUMBER OF SERIES: 2483
CACHED VALUES: 216
NUMBER OF FILES: 440
NUMBER OF BLOCKS: 466
SIZE OF BLOCKS: 13923

EXPLAIN ANALYZE:
└── select
    ├── execution_time: 1.118612ms
    ├── planning_time: 79.916545ms
    ├── total_time: 81.035157ms
…

I wondered why the whole shard group is scanned when the targeted time range is only 10 minutes.

I changed the query as follows, moving the time predicate outside the subquery:
SELECT COUNT(*) FROM (SELECT MEAN(aaa) FROM xxx WHERE bbb='abc' AND ccc=true GROUP BY ddd) WHERE mean >= 40 AND mean <= 60 AND time > now()-10m

Result:

ANALYZE:
EXPRESSION: mean(aaa::float)
NUMBER OF SHARDS: 1
NUMBER OF SERIES: 247
CACHED VALUES: 301
NUMBER OF FILES: 439
NUMBER OF BLOCKS: 439
SIZE OF BLOCKS: 12917

EXPLAIN ANALYZE:
└── select
    ├── execution_time: 573.621µs
    ├── planning_time: 23.936256ms
    ├── total_time: 24.509877ms
…
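
The two EXPLAIN ANALYZE outputs mix units (ms and µs), so it helps to normalize them before comparing. Below is a minimal Python sketch that does this with the values copied from the two runs above. The helper names `parse_duration_ms` and `ratios` are hypothetical, not part of any InfluxDB tooling.

```python
# Unit suffixes seen in EXPLAIN ANALYZE output, mapped to milliseconds.
# Two-character suffixes come first so "ms" is not mistaken for "s".
# Both the micro sign (U+00B5) and the Greek mu (U+03BC) are accepted.
UNITS_TO_MS = [
    ("ns", 1e-6),
    ("\u00b5s", 1e-3),
    ("\u03bcs", 1e-3),
    ("ms", 1.0),
    ("s", 1e3),
]


def parse_duration_ms(text):
    """Convert a duration string such as '573.621µs' to milliseconds."""
    text = text.strip()
    for suffix, factor in UNITS_TO_MS:
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * factor
    raise ValueError(f"unrecognized duration: {text!r}")


# Timings copied from the two runs in this post.
time_inside_subquery = {
    "execution_time": "1.118612ms",
    "planning_time": "79.916545ms",
    "total_time": "81.035157ms",
}
time_outside_subquery = {
    "execution_time": "573.621µs",
    "planning_time": "23.936256ms",
    "total_time": "24.509877ms",
}


def ratios(before, after):
    """Return before/after ratios for every timing key."""
    return {
        key: parse_duration_ms(before[key]) / parse_duration_ms(after[key])
        for key in before
    }


if __name__ == "__main__":
    for key, value in ratios(time_inside_subquery, time_outside_subquery).items():
        print(f"{key}: {value:.2f}x")
```

With these numbers the total time differs by a factor of roughly 3.3, and almost all of it is planning time rather than execution time.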

Previously I assumed that the subquery runs first and the outer query then runs on the collected data. However, the plan and the execution times suggest otherwise. The documentation puts the time predicate inside the subquery.
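
For anyone who wants to reproduce the comparison, here is a rough Python sketch that builds both query variants and the matching EXPLAIN ANALYZE request URLs for the InfluxDB 1.x `/query` HTTP endpoint. The base URL `http://localhost:8086` and the database name `mydb` are assumptions, the function names are hypothetical, and the predicate order inside the subquery's WHERE clause is slightly rearranged compared with the first query above.

```python
from urllib.parse import urlencode

# Predicates taken from the queries in this post.
INNER_FILTER = "bbb='abc' AND ccc=true"
MEAN_FILTER = "mean >= 40 AND mean <= 60"
TIME_FILTER = "time > now()-10m"


def build_query(time_inside):
    """Build the query with the time predicate inside or outside the subquery."""
    inner_where = INNER_FILTER
    outer_where = MEAN_FILTER
    if time_inside:
        inner_where += " AND " + TIME_FILTER
    else:
        outer_where += " AND " + TIME_FILTER
    return (
        "SELECT COUNT(*) FROM "
        f"(SELECT MEAN(aaa) FROM xxx WHERE {inner_where} GROUP BY ddd) "
        f"WHERE {outer_where}"
    )


def explain_url(query, base_url="http://localhost:8086", db="mydb"):
    """Return a /query URL that runs EXPLAIN ANALYZE on the given query.

    base_url and db are placeholders; adjust them to your environment.
    """
    params = urlencode({"db": db, "q": "EXPLAIN ANALYZE " + query})
    return f"{base_url}/query?{params}"


if __name__ == "__main__":
    for time_inside in (True, False):
        query = build_query(time_inside)
        print(query)
        print(explain_url(query))
```

Running both URLs against the same data and comparing NUMBER OF SHARDS and planning_time should show whether the behavior described here also occurs on other versions.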

Correct me if I'm wrong, but it looks like there is something wrong with the InfluxQL parser and semantic tree builder. I know these problems may already be solved in IFQL.

Regards,
