Query too slow when group by small time interval

gala · January 24, 2021, 9:22am

Background:
Client gather partition lag every 10 seconds from many kafka partitions（>1000）,and write into influxdb with time and tag:topic / tag: partition / field: lag_size .

I need to draw lag size of the kafka topic time series

sql like this:

select
    mean(total_lag) as avg_lag,           ---  avg_lag time series
    max(total_lag) as max_lag,            ---  max_lag time series
    min(total_lag) as max_lag,             ---  min_lag time series
from
    (
        select
            sum(lag_size) as total_lag       --- add up all partition lag size
        from
            "kafka_lag"
        where
            time > now() -7d       -- last week :  7*24* 360 ( report num per hour)* 1000(partition num) ≈  60,000,000
            and topic="topic1"
        group by 
            time(10s)           --- report interval
    )
group by
    time(3600s)           --downsample

it cost more than one minute. how should I optimize this sql in such case

Anaisdg · January 25, 2021, 7:21pm

Hello @gala ,
Welcome!
Unfortunately, I’m not quite sure. I’m sharing your question with the InfluxDB team. I appreciate your patience.

Topic		Replies	Views
Summing multiple series influxdb	2	18804	May 11, 2018
groupBy time not working with stream influxdb , kapacitor	5	698	December 15, 2020
Double nested query yields unexpected results Dashboards influxdb , influxql	0	529	November 23, 2018
Group by time() throws error InfluxQL influxql , query	4	1973	March 16, 2022
Flux window or aggregatewindow versus v1 group by InfluxDB 2 time-series	3	645	December 20, 2022

Query too slow when group by small time interval

Related topics