Subtract value in Kapacitor message

kapacitor
#1

Hello, I’m configuring Kapacitor to send me alerts when a node reaches a certain percentage of CPU usage over 5 minutes, but I don’t know how to make the alert message show the percentage used.
I’m using usage_idle to trigger the alert, but I want to send the (100 - idle) value in the alert. How can I do that? My message is this:

{{.Level}}: CPU usage in {{ index .Tags "host" }} is: {{ index .Fields "value" | printf "%.1f" }}

And I want something like:

{{.Level}}: CPU usage in {{ index .Tags "host" }} is: {{ ( 100 - index .Fields "value" ) | printf "%.1f" }}

I’ve tried everything I can think of but without luck.

Thanks in advance.

#2

@Macfresno Can you include the entire TICKscript that you’re using? Also, is it safe to assume that you’re using Telegraf as your data source?

#3

@michael As I only need basic rules, I’m using Chronograf’s Kapacitor rule builder, with this configuration for the rule:

Select:
SELECT mean("usage_idle") AS "mean_usage_idle" FROM "tbh"."autogen"."cpu" WHERE time > now() - 15m GROUP BY time(5m), "host"

Send Alert when usage_idle is Less Than 20 (80 % usage)

Alert message:
{{.Level}}: Uso CPU en {{ index .Tags "host"}} es: {{ index .Fields "value" | printf "%.1f" }}

You are right, I’m using Telegraf as my data source.

#4

Do you have access to the Kapacitor instance where the task is running? If so, I have a couple of requests:

  1. Can you run kapacitor list tasks
  2. For each listed task run kapacitor show <task id>

and paste the results back here.

#5

ID                                                 Type      Status    Executing Databases and Retention Policies
chronograf-v1-098da2f8-41f8-4dcd-9b2d-2a9bfa0f0894 stream    enabled   true      ["tbh"."autogen"]
chronograf-v1-35030398-bdc4-48ec-8bc7-fb83e9fc22ae stream    enabled   true      ["tbh"."autogen"]
chronograf-v1-af1f2da6-9193-47f9-bd6b-93bcff9d176a stream    enabled   true      ["tbh"."autogen"]
chronograf-v1-e6b9c39f-e6e0-49a7-993d-2970309db583 stream    enabled   true      ["tbh"."autogen"]

And the task that I have problems with is this one:

ID: chronograf-v1-af1f2da6-9193-47f9-bd6b-93bcff9d176a
Error:
Template:
Type: stream
Status: enabled
Executing: true
Created: 15 Mar 17 20:59 CET
Modified: 15 Mar 17 23:52 CET
LastEnabled: 15 Mar 17 23:52 CET
Databases Retention Policies: ["tbh"."autogen"]
TICKscript:
var db = 'tbh'

var rp = 'autogen'

var measurement = 'cpu'

var groupBy = ['host']

var whereFilter = lambda: TRUE

var period = 5m

var every = 30s

var name = 'CPU Usage'

var idVar = name + ':{{.Group}}'

var message = '{{.Level}}: Uso CPU en {{ index .Tags "host"}} es: {{ index .Fields "value" | printf "%.1f" }}'

var idTag = 'alertID'

var levelTag = 'level'

var messageField = 'message'

var durationField = 'duration'

var outputDB = 'chronograf'

var outputRP = 'autogen'

var outputMeasurement = 'alerts'

var triggerType = 'threshold'

var crit = 20

var data = stream
    |from()
        .database(db)
        .retentionPolicy(rp)
        .measurement(measurement)
        .groupBy(groupBy)
        .where(whereFilter)
    |window()
        .period(period)
        .every(every)
        .align()
    |mean('usage_idle')
        .as('value')

var trigger = data
    |alert()
        .crit(lambda: "value" < crit)
        .stateChangesOnly()
        .message(message)
        .id(idVar)
        .idTag(idTag)
        .levelTag(levelTag)
        .messageField(messageField)
        .durationField(durationField)
        .telegram()

trigger
    |influxDBOut()
        .create()
        .database(outputDB)
        .retentionPolicy(outputRP)
        .measurement(outputMeasurement)
        .tag('alertName', name)
        .tag('triggerType', triggerType)

trigger
    |httpOut('output')

DOT:
digraph chronograf-v1-af1f2da6-9193-47f9-bd6b-93bcff9d176a {
graph [throughput="0.00 points/s"];

stream0 [avg_exec_time_ns="0s" ];
stream0 -> from1 [processed="180771"];

from1 [avg_exec_time_ns="13.035µs" ];
from1 -> window2 [processed="180771"];

window2 [avg_exec_time_ns="36.265µs" ];
window2 -> mean3 [processed="60254"];

mean3 [avg_exec_time_ns="514.622µs" ];
mean3 -> alert4 [processed="60254"];

alert4 [alerts_triggered="2" avg_exec_time_ns="41.064µs" crits_triggered="1" infos_triggered="0" oks_triggered="1" warns_triggered="0" ];
alert4 -> http_out6 [processed="2"];
alert4 -> influxdb_out5 [processed="2"];

http_out6 [avg_exec_time_ns="0s" ];

influxdb_out5 [avg_exec_time_ns="0s" points_written="2" write_errors="0" ];
}
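
One possible approach, sketched against the TICKscript above and not tested on this setup: as far as I know, the alert message template has no subtraction operator, so the (100 - idle) value has to be computed in the pipeline before the alert node and then referenced as a field. The sketch below adds an eval node after the mean. The field name 'used' is a hypothetical choice, and the database, measurement, window, and threshold values are copied from the task output above.

```
var message = '{{.Level}}: Uso CPU en {{ index .Tags "host" }} es: {{ index .Fields "used" | printf "%.1f" }}'

var crit = 20

var data = stream
    |from()
        .database('tbh')
        .retentionPolicy('autogen')
        .measurement('cpu')
        .groupBy('host')
    |window()
        .period(5m)
        .every(30s)
        .align()
    |mean('usage_idle')
        .as('value')
    // Assumed addition: derive the used percentage from the idle mean.
    // 100.0 is a float literal so both operands of the subtraction are floats.
    // keep() retains 'value' next to the new 'used' field, so the existing
    // threshold on 'value' continues to work unchanged.
    |eval(lambda: 100.0 - "value")
        .as('used')
        .keep('value', 'used')

data
    |alert()
        .crit(lambda: "value" < crit)
        .stateChangesOnly()
        .message(message)
        .telegram()
```

The alert still triggers on the idle value dropping below 20, and only the message changes to print the 'used' field. The id, tag, and output nodes from the original task are left out here for brevity and would stay as they are.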