Customizing threshold levels with a tag

#1

We use Influx/Kapacitor on a common server shared by a variety of applications. Hardcoded alert thresholds in the TICKscripts (e.g. var crit = 80) mean one size fits all. We’d like to customize some thresholds on a per-app basis using tags, although if you have better suggestions, that would be OK too. Right now a sample TICKscript looks like:

    //parameters
    var warn = 40
    var unit = 30m
    var crit = 20
    var critHIGH = 10
    var critLOW = 30

    //Dataframe
    var data = stream
        |from()
            .measurement('cpu')
            .groupBy('host')
        |default()
            .tag('APP_CPU_CRITICAL_LEVEL', 'NORMAL')

    var alert = data
        |alert()
            .id('cpu-usage')
            .message('error message stuff')
            .info(lambda: "usage_idle" > warn)
            .warn(lambda: "usage_idle" < warn)
            .crit(lambda: "usage_idle" < crit AND '{{ index .Tags "APP_CPU_CRITICAL_LEVEL" }}' == 'NORMAL' )
            .crit(lambda: "usage_idle" < critHIGH AND '{{ index .Tags "APP_CPU_CRITICAL_LEVEL" }}' == 'HIGH' )
            .crit(lambda: "usage_idle" < critLOW AND '{{ index .Tags "APP_CPU_CRITICAL_LEVEL" }}' == 'LOW' )
            .stateChangesOnly(unit)

    //alert
    alert
        .sensu()
        .source('source stuff')

The default() node is working well, and I can see the tag in the log file. It’s also picking up any tags that I set in an application’s telegraf.conf.

But the AND '{{ index .Tags "APP_CPU_CRITICAL_LEVEL" }}' == 'xxxxxxx' condition is not working. Even after removing all but the NORMAL .crit() line, with that added condition on it, the alert does not trigger when I can see measurements that should trigger it.

I’ve tried it without the {{ index .Tags … }} and that didn’t help either.

The syntax of the conditional seems OK from what I can tell: it is simply a string comparison of a tag’s value against different strings.

But obviously I’m doing something wrong. Any advice would be appreciated.

Matt

#2

I actually did get this working by changing the crit section to:

    .crit(lambda: (("usage_idle" < crit     AND "APP_CPU_CRITICAL_LEVEL" == 'NORMAL' ) OR
                   ("usage_idle" < critHIGH AND "APP_CPU_CRITICAL_LEVEL" == 'HIGH' ) OR
                   ("usage_idle" < critLOW  AND "APP_CPU_CRITICAL_LEVEL" == 'LOW' )))
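
For anyone landing here later, a note on why this version behaves differently, as far as I can tell. In a Kapacitor lambda expression, a double-quoted name such as "APP_CPU_CRITICAL_LEVEL" refers to a field or tag on the point, while a single-quoted value is a plain string literal. The {{ index .Tags ... }} syntax is a template that gets rendered in properties like .id(), .message() and .details(), not inside lambdas, so the original condition was comparing the literal template text against 'NORMAL' and could never be true. It also appears that each alert level takes a single expression, so repeating .crit() replaces the earlier one rather than adding to it, which is why the three cases need to be combined with OR. Below is a sketch of the whole script with the fix folded in. It reuses the variables and tag name from the first post; the message text is only an illustration and is my own assumption, not something from the original script.

```
// Thresholds on usage_idle (lower idle means a busier CPU).
var warn = 40
var unit = 30m
var crit = 20
var critHIGH = 10
var critLOW = 30

stream
    |from()
        .measurement('cpu')
        .groupBy('host')
    // Points without the tag fall back to the NORMAL threshold.
    |default()
        .tag('APP_CPU_CRITICAL_LEVEL', 'NORMAL')
    |alert()
        .id('cpu-usage')
        // Templates are fine here; this message text is illustrative only.
        .message('{{ .ID }} is {{ .Level }} (level tag: {{ index .Tags "APP_CPU_CRITICAL_LEVEL" }})')
        .info(lambda: "usage_idle" > warn)
        .warn(lambda: "usage_idle" < warn)
        // One crit expression: double quotes read the tag, single quotes are literals.
        .crit(lambda: (("usage_idle" < crit     AND "APP_CPU_CRITICAL_LEVEL" == 'NORMAL') OR
                       ("usage_idle" < critHIGH AND "APP_CPU_CRITICAL_LEVEL" == 'HIGH') OR
                       ("usage_idle" < critLOW  AND "APP_CPU_CRITICAL_LEVEL" == 'LOW')))
        .stateChangesOnly(unit)
        .sensu()
```

An application then opts in to a different threshold by setting the APP_CPU_CRITICAL_LEVEL tag to HIGH or LOW in its telegraf.conf, as described in the first post; anything without the tag gets NORMAL from the default() node.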