Hi community, I’ve got a weird issue with InfluxQL query plans… I’ve got a pretty simple query, the aim is to get the timestamp of the latest written point for each monitored server.
The basic query is the following: (let’s call it Query A)
SELECT
LAST(value) AS value
FROM sqlserver_performance
WHERE time > now() - 2h
GROUP BY sql_instance, host
When this statement is wrapped in a subquery, that does a plain SELECT (be it SELECT *
or an explicit list) the performance drops by a lot, here is Query B
SELECT
time, sql_instance, host
FROM (
SELECT LAST(value) AS value
FROM sqlserver_performance
WHERE time > now() - 2h
GROUP BY sql_instance, host
)
Here are the query plans (Result of ANALYZE
)
Query A (fast one)
This is extremely fast, almost immediate as I’d expect
QUERY PLAN
EXPRESSION: last(value::float)
NUMBER OF SHARDS: 1
NUMBER OF SERIES: 22355
CACHED VALUES: 206446
NUMBER OF FILES: 165035
NUMBER OF BLOCKS: 209233
SIZE OF BLOCKS: 16632247
Query B (troublesome one)
This one takes forever and consumes a huge amount of CPU, it goes over 57 shards, meaning it’s doing a full scan of the DB
QUERY PLAN
EXPRESSION: last(value::float)
NUMBER OF SHARDS: 57
NUMBER OF SERIES: 1272322
CACHED VALUES: 7755
NUMBER OF FILES: 77138
NUMBER OF BLOCKS: 121336
SIZE OF BLOCKS: 13830554
I know adding a plain SELECT over a subquery adds nothing useful, but I don’t see any reason that explains this behavior.
I’ve looked at the docs but found nothing useful about it, can someone explain this behavior?
I’m using InfluxDB v1.8.4