InfluxQL - Subquery generates awful query-plan

Hi community, I’ve got a weird issue with InfluxQL query plans… I’ve got a pretty simple query, the aim is to get the timestamp of the latest written point for each monitored server.

The basic query is the following: (let’s call it Query A)

SELECT 
  LAST(value) AS value
FROM sqlserver_performance
WHERE time > now() - 2h
GROUP BY sql_instance, host

When this statement is wrapped in a subquery, that does a plain SELECT (be it SELECT * or an explicit list) the performance drops by a lot, here is Query B

SELECT
  time, sql_instance, host
FROM ( 
  SELECT LAST(value) AS value
  FROM sqlserver_performance
  WHERE time > now() - 2h
  GROUP BY sql_instance, host
)

Here are the query plans (Result of ANALYZE)
Query A (fast one)
This is extremely fast, almost immediate as I’d expect

QUERY PLAN
EXPRESSION: last(value::float)
NUMBER OF SHARDS: 1
NUMBER OF SERIES: 22355
CACHED VALUES: 206446
NUMBER OF FILES: 165035
NUMBER OF BLOCKS: 209233
SIZE OF BLOCKS: 16632247

Query B (troublesome one)
This one takes forever and consumes a huge amount of CPU, it goes over 57 shards, meaning it’s doing a full scan of the DB

QUERY PLAN
EXPRESSION: last(value::float)
NUMBER OF SHARDS: 57
NUMBER OF SERIES: 1272322
CACHED VALUES: 7755
NUMBER OF FILES: 77138
NUMBER OF BLOCKS: 121336
SIZE OF BLOCKS: 13830554

I know adding a plain SELECT over a subquery adds nothing useful, but I don’t see any reason that explains this behavior.
I’ve looked at the docs but found nothing useful about it, can someone explain this behavior?

I’m using InfluxDB v1.8.4

Hello @Giovanni_Luisotto,
I’m not sure. I’ve asked I hope to find you an answer soon. Thank you for your patience.

Can you include a time clause?

SELECT
  time, sql_instance, host
FROM ( 
  SELECT LAST(value) AS value
  FROM sqlserver_performance
  WHERE time > now() - 2h
  GROUP BY sql_instance, host
)
WHERE time > now() - 2h

Sorry for the delay, it works… I can’t explain why to myself but it works

> EXPLAIN SELECT time, sql_instance, host FROM ( SELECT LAST(value) AS value FROM sqlserver_performance WHERE time > now() - 2h GROUP BY sql_instance, host) WHERE time > now() - 2h
QUERY PLAN
----------
EXPRESSION: last(value::float)
NUMBER OF SHARDS: 2
NUMBER OF SERIES: 44794
CACHED VALUES: 5734
NUMBER OF FILES: 134094
NUMBER OF BLOCKS: 134094
SIZE OF BLOCKS: 15087092
1 Like