Maybe the problem is trivial, but it is reproducible and makes a lot of confusion with what we see in the documentation.
Was trying to improve some query performance on my environment and faced with subqueriy expressions slowness.
SELECT COUNT(*) FROM (SELECT mean(aaa) FROM xxx WHERE bbb=‘abc’ AND time > now()-10m AND ccc=true GROUP BY ddd) WHERE mean >= 40 AND mean <= 60
As a result of ANALYZE i see the following:
NUMBER OF SHARDS: 7
NUMBER OF SERIES: 2483
CACHED VALUES: 216
NUMBER OF FILES: 440
NUMBER OF BLOCKS: 466
SIZE OF BLOCKS: 13923
├── execution_time: 1.118612ms
├── planning_time: 79.916545ms
├── total_time: 81.035157ms
I thought why I see the whole shard group scan if only 10 minutes is the targeted data range.
Changed the query to the following, moving time predicate outside the subquery:
SELECT COUNT(*) FROM (SELECT MEAN(aaa) FROM xxx WHERE bbb=‘abc’ AND ccc=true GROUP BY ddd) WHERE mean >= 40 AND mean <= 60 AND time > now()-10m
NUMBER OF SHARDS: 1
NUMBER OF SERIES: 247
CACHED VALUES: 301
NUMBER OF FILES: 439
NUMBER OF BLOCKS: 439
SIZE OF BLOCKS: 12917
├── execution_time: 573.621µs
├── planning_time: 23.936256ms
├── total_time: 24.509877ms
Previously I assumed that subquery run first and external query run afterwards on the collected data. However the plan and execution time show opposite things. Documentation puts the time predicate in the subquery.
Correct me if I`m wrong, but looks like there is something wrong with the InfluxQL parser and semantic tree builder. I know in IFQL these problems may be already solved.