Hi community, I’ve got a weird issue with InfluxQL query plans… I’ve got a pretty simple query, the aim is to get the timestamp of the latest written point for each monitored server.
The basic query is the following: (let’s call it Query A)
SELECT LAST(value) AS value FROM sqlserver_performance WHERE time > now() - 2h GROUP BY sql_instance, host
When this statement is wrapped in a subquery, that does a plain SELECT (be it
SELECT * or an explicit list) the performance drops by a lot, here is Query B
SELECT time, sql_instance, host FROM ( SELECT LAST(value) AS value FROM sqlserver_performance WHERE time > now() - 2h GROUP BY sql_instance, host )
Here are the query plans (Result of
Query A (fast one)
This is extremely fast, almost immediate as I’d expect
QUERY PLAN EXPRESSION: last(value::float) NUMBER OF SHARDS: 1 NUMBER OF SERIES: 22355 CACHED VALUES: 206446 NUMBER OF FILES: 165035 NUMBER OF BLOCKS: 209233 SIZE OF BLOCKS: 16632247
Query B (troublesome one)
This one takes forever and consumes a huge amount of CPU, it goes over 57 shards, meaning it’s doing a full scan of the DB
QUERY PLAN EXPRESSION: last(value::float) NUMBER OF SHARDS: 57 NUMBER OF SERIES: 1272322 CACHED VALUES: 7755 NUMBER OF FILES: 77138 NUMBER OF BLOCKS: 121336 SIZE OF BLOCKS: 13830554
I know adding a plain SELECT over a subquery adds nothing useful, but I don’t see any reason that explains this behavior.
I’ve looked at the docs but found nothing useful about it, can someone explain this behavior?
I’m using InfluxDB v1.8.4