How does one do a TOP() query grouped by HOST when the field I’m querying is a counter style field and so only makes sense when aggregated by NON_NEGATIVE_DIFFERENCE(MEAN(“http_response.5xx”))?
I wish I could SELECT TOP(NON_NEGATIVE_DIFFERENCE(MEAN("http_response.5xx")), "host", 10) FROM "haproxy" WHERE... but that doesn’t work.
I’m struggling to do this, I think because my only interface to influx is via grafana and I don’t see how to mix an existing group by with a top query and non_negative_difference.
I have this query (using haproxy telegraf data) and I’d like a top 10 by host. I’m having a mental block I think from too many years of only postgresql.
SELECT NON_NEGATIVE_DIFFERENCE(MEDIAN(“http_response.5xx”)) FROM “haproxy” WHERE (“meta_datacenter” = ‘us-east-1’ AND “server” = ‘/run/reverseproxy/haproxy.sock’ AND “type”=‘frontend’ AND “proxy”=‘443_public_ssl_in’ ) AND $timeFilter GROUP BY time($Interval)
trying:
SELECT TOP( NON_NEGATIVE_DIFFERENCE(MEDIAN("http_response.5xx")), "host", 10) FROM "haproxy" WHERE ("meta_datacenter" = 'us-east-1' AND "server" = '/run/reverseproxy/haproxy.sock' AND "type"='frontend' AND "proxy"='443_public_ssl_in' ) AND $timeFilter GROUP BY host, time(30s)
Results: InfluxDB Error: expected first argument to be a field in top(), found non_negative_difference(median("http_response.5xx"))
You just need a subquery, something like the one below should do it
SELECT
TOP("diff", "host", 10)
FROM (
SELECT
NON_NEGATIVE_DIFFERENCE(MEDIAN("http_response.5xx")) as "diff"
, "host"
FROM "haproxy"
WHERE (
"meta_datacenter" = 'us-east-1'
AND "server" = '/run/reverseproxy/haproxy.sock'
AND "type"='frontend'
AND "proxy"='443_public_ssl_in'
) AND $timeFilter
GROUP BY
host
,time(30s)
)
I realized that TOP over time doesn’t make sense unless it is aggregated over time, so
SELECT TOP( "5xx", 5), "host" FROM (
SELECT (MEDIAN("http_response.5xx")) as "5xx"
FROM "reverseproxy"
WHERE ("sv"='FRONTEND' AND "proxy"='443_public_ssl_in' ) AND $timeFilter
GROUP BY "host" )
Does get me a list of top hosts, but I’d like to take that list of top hosts and then use them as part of a WHERE. In SQL I would use an IN clause SUBQUERY in the WHERE clause. How can I do this with influxdb?
SELECT NON_NEGATIVE_DIFFERENCE(MEDIAN("http_response.5xx"))
FROM "haproxy" WHERE ("sv"='FRONTEND' AND "proxy"='443_public_ssl_in'
AND "host" IN (SELECT "host" FROM (<ABOVEQUERY>))
) AND $timeFilter GROUP BY time($Interval),"host"