Background:
Client gather partition lag every 10 seconds from many kafka partitions(>1000),and write into influxdb with time and tag:topic / tag: partition / field: lag_size .
I need to draw lag size of the kafka topic time series
sql like this:
select
mean(total_lag) as avg_lag, --- avg_lag time series
max(total_lag) as max_lag, --- max_lag time series
min(total_lag) as max_lag, --- min_lag time series
from
(
select
sum(lag_size) as total_lag --- add up all partition lag size
from
"kafka_lag"
where
time > now() -7d -- last week : 7*24* 360 ( report num per hour)* 1000(partition num) ≈ 60,000,000
and topic="topic1"
group by
time(10s) --- report interval
)
group by
time(3600s) --downsample
it cost more than one minute. how should I optimize this sql in such case