Chronograf - query over multiple hosts/cpu usage over time alert

Hi, so I am currently setting up the whole TICK stack at our company and so far I’m loving it.
One problem I am facing right now is creating Kapacitor rules through Chronograf that span multiple hosts and report the host’s name in the alert message.

For example, this is a query which I built using Chronograf:
SELECT "usage_idle" FROM "telegraf"."autogen"."cpu" WHERE time > now() - 15m AND ("host"='host1' OR "host"='host2' …)

This is the templated message:
{{.Time}} cpu usage is >80% over the last 1 minute on {{ index .Tags "host" }}

In the actual message, the `{{ index .Tags "host" }}` part is just blank. Would a query like that even raise an alert for each occurrence per host, or do I need to create a separate alert rule per host? And why is the host tag in the message empty?

Edit 1:
Ok, I fixed it. I had to add the “GROUP BY” clause. The query now is:
SELECT "usage_idle" FROM "telegraf"."autogen"."cpu" WHERE time > now() - 15m AND "cpu"='cpu-total' AND ("host"='host1' OR "host"='host2' …) GROUP BY "host"

Edit 2:
As it stands now, however, I am unable to say “when the usage is >80% over a minute, raise an alert”. Does anybody have an idea how to do that via Chronograf?

My current query: SELECT "usage_idle" FROM "telegraf"."autogen"."cpu" WHERE time > now() - 15m AND "cpu"='cpu-total' AND ("host"='host1' OR "host"='host2' …) GROUP BY "host"

And this is the rule but all I can specify is the change relative to the current state, instead of a threshold over a period of time:

@luca-moser I’m not quite sure whether that is supported in Chronograf, but if you access Kapacitor directly you will be able to do more advanced analytics that are not available through the UI. In cases like this I normally create a task in Chronograf that gets me most of the way, then use `kapacitor show <task_id>` to get the TICKscript, which you can then edit.
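As a rough sketch of what such a hand-edited script could look like for the "usage >80% over a minute" case: the database, retention policy, measurement, field, and message template are taken from the query and message earlier in this thread, while the 1m window, the `mean_idle` name, and the threshold of 20 (idle below 20% meaning usage above 80%) are assumptions for illustration. This is not the script Chronograf generates, and no alert handler is attached here.

```
// Sketch only: window size and threshold are illustrative assumptions.
stream
    |from()
        .database('telegraf')
        .retentionPolicy('autogen')
        .measurement('cpu')
        // Keep only the aggregate CPU series, as in the query above.
        .where(lambda: "cpu" == 'cpu-total')
        // Group by host so each host is evaluated separately
        // and the "host" tag is available in the message.
        .groupBy('host')
    |window()
        // Collect one minute of points, emitted once per minute.
        .period(1m)
        .every(1m)
    |mean('usage_idle')
        .as('mean_idle')
    |alert()
        // Mean idle below 20% corresponds to mean usage above 80%.
        .crit(lambda: "mean_idle" < 20.0)
        .message('{{.Time}} cpu usage is >80% over the last 1 minute on {{ index .Tags "host" }}')
```

Averaging over the window is one reading of "over a minute"; if every point in the window has to exceed the threshold, a different aggregate (for example `max` on `usage_idle`) would be the thing to compare instead.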

@jackzampolin thanks for the heads up. I will try to write the scripts directly as long as Chronograf isn’t ready yet.

I've written a TICKscript now, and it works when I remove the lambda expression that is supposed to filter the hosts. Do you have an idea how the lambda expression has to be written correctly? Changing `strContains("host", hostsFilter)` to `strContains('host', hostsFilter)` doesn't work either.
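For readers who land here with the same question, a hedged sketch of two ways a host filter can be written in a Kapacitor lambda expression. In lambdas, double quotes reference a field or tag and single quotes denote a string literal, so `'host'` is the literal text "host" rather than the tag. The value of `hostsFilter` below is hypothetical, and this is not necessarily the fix the poster ended up using.

```
// Hypothetical filter value; the real list of hosts is not shown in the thread.
var hostsFilter = 'host1,host2'

stream
    |from()
        .database('telegraf')
        .retentionPolicy('autogen')
        .measurement('cpu')
        .groupBy('host')
        // Option 1: compare the "host" tag against each literal explicitly.
        .where(lambda: "host" == 'host1' OR "host" == 'host2')
        // Option 2 (alternative, use instead of option 1):
        // strContains(s, substr) checks whether s contains substr,
        // so the filter string goes first and the tag second.
        // .where(lambda: strContains(hostsFilter, "host"))
```

Note that the `strContains` variant is a substring match, so a host whose name happens to be a fragment of the filter string would also pass; the explicit comparisons avoid that.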

I fixed it :)
