Nested query performance InfluxDB 1.8

I gather data from multiple agents into the same InfluxDB to provide an enterprise wide summary. The metrics I have are detailed hypervisor CPU usage gathered by a guest. For a total overview, I want to add up the per-CPU metrics and get a query like this:

select sum(usage) as usage from zvm_lpar_sytcup
where time > now() - 5m group by time(1m),seqnr,cputype

This runs fine and completes in 60 ms or so.
Now the challenge is that when two guests happen to run on the same hypervisor, we get the data from that hypervisor twice, and my sum() doubles the amount. So I came up with this:

select max(*) from
    (select sum(usage) as usage from zvm_lpar_sytcup
         where time > now() - 5m group by time(1m),seqnr,cputype,lpar)
    group by time(1m),seqnr,cputype

This gives the correct results, but the query takes 550 ms to run. I tried to add the where clause to the outer select as well, but that did not help much.
My Grafana dashboard gets the data for 3 or 6 hours, which takes 1.2 seconds with just a few systems feeding data. Now I can do a continuous query for that last one, which makes the dashboard quick again, but it does burn 1% of CPU non-stop (even when nobody is looking). I’m concerned about usage when we have a few dozen systems feeding data.

I now made two separate continuous queries; one for the inner select (that takes 60 ms) and one for the outer select (using the result from the first CQ) and that takes 2 ms. I suppose because I defined them in this order, that the inner runs before the outer, so it actually works… But since the various system clocks are not necessarily in sync, I run with ‘resample every 1m for 10m’ anyway, so would catch the data later.

But it seems we do have an issue with the nested query when that takes an order of magnitude longer.

Hello. @rvdheij,
I’m confused. Have you resolved your problem? Is there anything else I can help with? Thank you.

Hi @Anaisdg
I have bypassed the issues by breaking up that nested query. I would like to understand why it was so expensive or what I could have done to avoid that.

Hello @rvdheij,
It’s hard to know exactly without some more information, but I notice that your outer query isn’t bound with a time range, which could be the problem. Try adding a where clause to the outside query.
Thanks :slight_smile:

1 Like

This topic was automatically closed 60 minutes after the last reply. New replies are no longer allowed.