Use alert counts from TICK scripts


Hi all,

I have recently started using Kapacitor to replace our previous alerting tool. So far I am more than happy with it.

I have one question regarding alerts:
We are tracking a list of services, and I want to raise an alert if a service is offline. I also want to raise a second alert if the number of services currently in that alert state is above a certain threshold (i.e. more than THRESHOLD services are offline). So far, I haven't been able to find anything about how to use alert statistics from within a TICK script.

TICK script so far:

kapacitor show cpu_alert
ID: cpu_alert
Type: stream
Status: enabled
Executing: true
Created: 12 Nov 18 14:30 UTC
Modified: 13 Nov 18 10:30 UTC
LastEnabled: 13 Nov 18 10:30 UTC
Databases Retention Policies: ["telegraf"."autogen"]
dbrp "telegraf"."autogen"

var THRESHOLD = 10

var data = stream

var services = data
    |alert()
        .crit(lambda: int("running") < 2)

// missing something like:
//   |count('critical')
//   |alert()
//       .crit(lambda: "count" > THRESHOLD)

digraph cpu_alert {
graph [throughput="0.00 points/s"];

stream0 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
stream0 -> from1 [processed="0"];

from1 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
from1 -> groupby2 [processed="0"];

groupby2 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
groupby2 -> window3 [processed="0"];

window3 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
window3 -> max4 [processed="0"];

max4 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
max4 -> alert5 [processed="0"];

alert5 [alerts_inhibited="0" alerts_triggered="0" avg_exec_time_ns="0s" crits_triggered="0" errors="0" infos_triggered="0" oks_triggered="0" warns_triggered="0" working_cardinality="0" ];
}

Any help would be much appreciated.


You can write each alert back to InfluxDB (for example with an influxDBOut node chained after the alert node), and then define a second task that alerts on top of that measurement.
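
A rough sketch of that two-task approach, in case it helps. The measurement names 'services' and 'service_alerts', the 'service' tag, and the one-minute window are assumptions for illustration, not taken from your setup, and the from/groupBy/window/max nodes visible in your DOT output are left out because their arguments aren't shown. The first task tags each point with its alert level and writes it back to InfluxDB:

```
dbrp "telegraf"."autogen"

// Task 1 (sketch): per-service alert, written back to InfluxDB.
stream
    |from()
        // 'services' and the 'service' tag are assumed names
        .measurement('services')
        .groupBy('service')
    |alert()
        .crit(lambda: int("running") < 2)
        // store the alert level (OK, INFO, WARNING, CRITICAL) as a tag
        .levelTag('level')
    |influxDBOut()
        .database('telegraf')
        .retentionPolicy('autogen')
        // hypothetical measurement that holds the alert points
        .measurement('service_alerts')
```

The second task then reads that measurement and alerts when too many critical points arrive within a window:

```
dbrp "telegraf"."autogen"

var THRESHOLD = 10

// Task 2 (sketch): alert on the number of critical alert points.
stream
    |from()
        .measurement('service_alerts')
        // keep only points that task 1 tagged as critical
        .where(lambda: "level" == 'CRITICAL')
    |window()
        // assumed window; ideally matches the reporting interval
        .period(1m)
        .every(1m)
    // 'running' is the field carried over from the original points
    |count('running')
    |alert()
        .crit(lambda: "count" > THRESHOLD)
```

Note that this counts critical points, not distinct services, so a service that reports several times within the window is counted more than once. Whether that matters depends on how often your services report, so it is worth checking the numbers against a known outage before relying on the threshold.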