Select series whose aggregation greater than value

Hi,

I am currently using Grafana and InfluxDB 1.8. I have a measurement that records webservice call times, and there is a tag for the webservice URL, say url=/get/version. Plotting all series (60-70 different URLs) is hard to glean insights from, but it would be more beneficial to see all the slow queries only, defined by an aggregation, like show the raw series for all URLs whose 90th percentile was > 30,000 ms or maybe all URLs whose avg was > 15,000 ms. Once I do a subquery to get the aggregation like percentile, I am not sure how to then just SELECT all the raw series for a percentile > some value.

Can anyone help with that?

I can’t provide much without a query to start from, but you need to use filters on field values (yes it’s possible) and maybe even subqueries.

I have no idea about what’s the best approach for you or what you want to obtain exactly, but I create a filtered flat table by doing something like this (it will be slightly different if you need to plot the result in a chart)

SELECT * FROM
(
	SELECT
		 sum("_MyField_") AS "_MyAlias_"
	FROM {...}
	WHERE 
		$timeFilter
		AND {...}
	GROUP BY
		 time($__interval)
		,{...}
)
WHERE
	"_MyAlias_" > 0

Hi Giovanni,

Let me try and simplify hopefully it might help to understand what I am trying to achieve:

Like for something simple like CPU usage, say I have 20 hosts, but only two hosts are running hot at 100% all the time but occasionally settle at below 50%, but the 90th percentile computes to 95% cpu usage for those two hosts only. Is there a way to display only the two that have a percentile(90) > 95%?

This is just how to plot the CPU usage and group by host, which would show all 20 hosts and their usage, but 18 of them would be mostly idle, and 2 of them would be running at almost 100% usage all the time.

Flux:

from(bucket: "db/rp")
  |> range(start: v.timeRangeStart, stop:v.timeRangeStop)
  |> filter(fn: (r) =>
    r._measurement == "cpu" and
    r._field == "usage")
  |> group(columns: ["host"])

InfluxQL:

SELECT 
    mean("usage") as "usage"
FROM 
    "metrics_retention"."cpu" 
WHERE 
    $timeFilter 
GROUP BY 
    time($__interval), "host" fill(null)

What I want is something like a filter that says WHERE percentile(usage,90) > 95. So that for the two series (hosts) whose 90th percentile aggregation is greater than 95%, only display those two and ignore the rest of the hosts/series. It could even be something simpler like average/mean instead of percentile. Plot only the series whose computed average is greater than 50%, don’t plot the rest.

Then is something like this should work

SELECT
    mean("CpuUsage") as "CpuUsage"
FROM (
    SELECT 
        mean("usage") as "CpuUsage"
    FROM 
        "metrics_retention"."cpu" 
    WHERE 
        $timeFilter 
    GROUP BY 
        time($__interval), "host"
)
WHERE 
    $timeFilter
    AND "CpuUsage" > 50
GROUP BY 
    *
FILL(null)

notes:

  • the where filter is partially repeated, as it’s present in both inner and outer query. (not mandatory but might save you some performance issues… I recommend it)
  • "CpuUsage" > 50 (or 0.5 up to the actual number underneath) refers to the computation returned by the inner query (I’ve changed the aliases just to make it clearer)
  • The outer query repeats the aggregation made by the inner one, with the very same grouping, which also seems redundant as it changes nothing. This is because I’m not sure about how Grafana behaves in the case of a plain SELECT * FROM (__subquery__) ..., which might not be seen as a set of series… you can try both approaches to see how they work.
  • The same can be done using Flux, I just don’t know how to write it. (I don’t use Flux at all)

I have tried similar stuff but what happens is that you lose the rest of your data points for that series when you apply a filter that basically trims anything below a value of 50, but I would like to show those still.

See the picture below, top panel is the “slow queries” aka series where the max > 200ms. The bottom panel is all series regardless of min/max.

I lose all the data points below value 200 when I apply these filters. I would like to not plot series whose calculated average does not exceed 200. So for example, for the green series in the image, it successfully shows the data point above 200 but does not show the next sample 3 minutes later at around 100ms even though the average for non-0 values for the series is around 500.

So the query I’m looking for is, if the calculated average of the green series > 200 (which it is), then plot all data points for that series, even if a data point is below 200.

I simply think that can’t be achieved with InfluxQL + Grafana, as there is no way to join two separate datasets (which is what is required in this case).

You might be able to do that in Flux as it supports join, logically you need to:

  • Calculate avg CPU for each host, then apply the filter you need. This will be the list of hosts to consider. (you will need a distinct list)
  • Produce the usual series of the CPU by time and host
  • Inner Join the two datasets, so only the ones identified in the first query are kept

I suggest you look at the Flux docs about joins.
if you need further help I’d ask you to open a separate question under the flux category, in order to get help from people who actually know it. (I don’t)

I see, I thought may be the case as well. I really appreciate you taking the time to help, I will take a look at the joins in flux and see if I can come up with something. If not I will repost in flux. Thanks!