How to debug Kapacitor Alert?

kixiro · February 27, 2018, 12:32pm

I created an alert through the web interface (Chronograf). The graph shows the value crossing the threshold, but the alert does not fire. How can I debug it?

Kapacitor 1.4.0 (git: HEAD fcce3ee9e6abcee5595fd61066bfc904edb1e113)

/ # kapacitor list tasks
ID                                                 Type      Status    Executing Databases and Retention Policies
chronograf-v1-708c588f-bf3b-4768-8577-d89ca31d3c1b stream    enabled   true      ["telegraf"."autogen"]
chronograf-v1-ad53ea27-4a65-4405-b37b-75e587f1ede0 stream    enabled   true      ["telegraf"."autogen"]
/ # kapacitor show chronograf-v1-708c588f-bf3b-4768-8577-d89ca31d3c1b
ID: chronograf-v1-708c588f-bf3b-4768-8577-d89ca31d3c1b
Error:
Template:
Type: stream
Status: enabled
Executing: true
Created: 22 Feb 18 14:04 UTC
Modified: 26 Feb 18 10:40 UTC
LastEnabled: 26 Feb 18 10:40 UTC
Databases Retention Policies: ["telegraf"."autogen"]
TICKscript:
var db = 'telegraf'
var rp = 'autogen'
var measurement = 'mem'
var groupBy = []
var whereFilter = lambda: ("host" == 'AutomatedTests')
var name = 'test'
var idVar = name + ':{{.Group}}'
var message = ' {{.ID}} {{.Name}} {{.TaskName}} {{.Group}} {{.Tags}} {{.Level}} {{ index .Fields "value" }} {{.Time}}'
var idTag = 'alertID'
var levelTag = 'level'
var messageField = 'message'
var durationField = 'duration'
var outputDB = 'chronograf'
var outputRP = 'autogen'
var outputMeasurement = 'alerts'
var triggerType = 'threshold'
var crit = 15000000000

var data = stream
    |from()
        .database(db)
        .retentionPolicy(rp)
        .measurement(measurement)
        .groupBy(groupBy)
        .where(whereFilter)
    |eval(lambda: "used")
        .as('value')

var trigger = data
    |alert()
        .crit(lambda: "value" > crit)
        .stateChangesOnly()
        .message(message)
        .id(idVar)
        .idTag(idTag)
        .levelTag(levelTag)
        .messageField(messageField)
        .durationField(durationField)
        .telegram()
        .chatId('216013926')
        .parseMode('mem > 15')

trigger
    |eval(lambda: float("value"))
        .as('value')
        .keep()
    |influxDBOut()
        .create()
        .database(outputDB)
        .retentionPolicy(outputRP)
        .measurement(outputMeasurement)
        .tag('alertName', name)
        .tag('triggerType', triggerType)

trigger
    |httpOut('output')

DOT:
digraph chronograf-v1-708c588f-bf3b-4768-8577-d89ca31d3c1b {
graph [throughput="0.00 points/s"];

stream0 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
stream0 -> from1 [processed="0"];

from1 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
from1 -> eval2 [processed="0"];

eval2 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
eval2 -> alert3 [processed="0"];

alert3 [alerts_triggered="0" avg_exec_time_ns="0s" crits_triggered="0" errors="0" infos_triggered="0" oks_triggered="0" warns_triggered="0" working_cardinality="0" ];
alert3 -> http_out6 [processed="0"];
alert3 -> eval4 [processed="0"];

http_out6 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];

eval4 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
eval4 -> influxdb_out5 [processed="0"];

influxdb_out5 [avg_exec_time_ns="0s" errors="0" points_written="0" working_cardinality="0" write_errors="0" ];
}
/ #

arturo_mondelo · February 27, 2018, 1:28pm

Did you check the Telegram config?

kixiro · February 27, 2018, 1:40pm

I have checked the config. Is it possible to force an alert?

arturo_mondelo · February 27, 2018, 2:39pm

kixiro:

DOT:
digraph chronograf-v1-708c588f-bf3b-4768-8577-d89ca31d3c1b {
graph [throughput="0.00 points/s"];

stream0 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
stream0 -> from1 [processed="0"];

from1 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
from1 -> eval2 [processed="0"];

eval2 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
eval2 -> alert3 [processed="0"];

alert3 [alerts_triggered="0" avg_exec_time_ns="0s" crits_triggered="0" errors="0" infos_triggered="0" oks_triggered="0" warns_triggered="0" working_cardinality="0" ];
alert3 -> http_out6 [processed="0"];
alert3 -> eval4 [processed="0"];

http_out6 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];

eval4 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
eval4 -> influxdb_out5 [processed="0"];

influxdb_out5 [avg_exec_time_ns="0s" errors="0" points_written="0" working_cardinality="0" write_errors="0" ];
}

It seems that the task is not evaluating any points. Should data be passing through? Can you force the value to go over the crit threshold?
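Reading the processed counters by eye gets tedious on larger tasks, so here is a minimal Python sketch that pulls the edge counters out of the DOT section of `kapacitor show` and reports the first edge that has passed no points. The function names and the sample text are made up for illustration, and the regex assumes the raw terminal output (straight quotes, `->` arrows), not text copied from a rendered forum page.

```python
import re

# Matches edge lines such as: from1 -> eval2 [processed="0"];
EDGE_RE = re.compile(r'^\s*(\w+)\s*->\s*(\w+)\s*\[processed="(\d+)"\];')


def edge_counts(dot_text):
    """Return (source, target, processed) tuples for every edge line found."""
    edges = []
    for line in dot_text.splitlines():
        match = EDGE_RE.match(line)
        if match:
            src, dst, processed = match.groups()
            edges.append((src, dst, int(processed)))
    return edges


def first_starved_edge(dot_text):
    """Return the first edge (in output order) with processed == 0, or None."""
    for src, dst, processed in edge_counts(dot_text):
        if processed == 0:
            return (src, dst)
    return None


# Hypothetical sample, shaped like the edge lines in the output above.
SAMPLE = '''
stream0 -> from1 [processed="0"];
from1 -> eval2 [processed="0"];
eval2 -> alert3 [processed="0"];
'''
```

For the sample, `first_starved_edge(SAMPLE)` returns `('stream0', 'from1')`. As far as I understand the counters, a zero on the very first edge suggests that no points are reaching the task at all for that database and retention policy, while a zero that only appears after `from1` would point at the measurement or `where` filter dropping everything.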

philb · February 27, 2018, 2:54pm

Try changing the ‘Group By’ variable to group by host:

var groupBy = ['host']

Heads up if you are just starting out with the TICK stack: to use any of your tags in an alert, you need to group by them as well. For example, if you wanted to include the host and information about the instance:

var groupBy = ['host', 'instance']

If you’re on Linux, you can use

sudo tail -f /path/to/kapacitor/logs | grep "Name of your alert"

You should be able to see an error in there. Does ‘AutomatedTests’ exist as a value of the host tag in the database?

falanger · March 27, 2019, 5:25am

Hi! I have a similar problem: I created an alert through the web interface and it doesn’t work.

var db = 'monitoring'
var rp = 'autogen'
var measurement = 'system'
var groupBy = ['host']
var whereFilter = lambda: ("prod" == 'true')
var period = 10s
var every = 30s
var name = 'System high load'
var idVar = name + '-{{.Group}}'
var message = 'System high load on  {{ index .Tags "host" }}'
var idTag = 'alertID'
var levelTag = 'level'
var messageField = 'message'
var durationField = 'duration'
var outputDB = 'chronograf'
var outputRP = 'autogen'
var outputMeasurement = 'alerts'
var triggerType = 'threshold'
var crit = 4

var data = stream
    |from()
        .database(db)
        .retentionPolicy(rp)
        .measurement(measurement)
        .groupBy(groupBy)
        .where(whereFilter)
    |window()
        .period(period)
        .every(every)
        .align()
    |mean('load5')
        .as('value')

var trigger = data
    |alert()
        .crit(lambda: "value" >= crit)
        .message(message)
        .id(idVar)
        .idTag(idTag)
        .levelTag(levelTag)
        .messageField(messageField)
        .durationField(durationField)
        .email()
        .telegram()
        .chatId('-1001217384444')
        .parseMode('Markdown')

trigger
    |eval(lambda: float("value"))
        .as('value')
        .keep()
    |influxDBOut()
        .create()
        .database(outputDB)
        .retentionPolicy(outputRP)
        .measurement(outputMeasurement)
        .tag('alertName', name)
        .tag('triggerType', triggerType)

trigger
    |httpOut('output')

I can’t see any corresponding records in Kapacitor’s logs, and the alert isn’t sent to either Telegram or email.

MarcV · April 8, 2019, 9:21am

Hi @falanger,

Is your problem solved?
Best regards

falanger · April 8, 2019, 10:41am

Hi! Yes, it is! The solution was very unexpected. For some reason, InfluxDB wasn’t able to make requests to Kapacitor and, consequently, alerts weren’t working. So if you have the same problem, try checking connectivity between InfluxDB and Kapacitor (in both directions).
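A quick way to check this is a plain TCP connection test, run once from each machine against the other. Below is a minimal Python sketch; the helper names are made up for illustration, and it only shows that the port accepts connections, not that subscriptions are actually set up correctly.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def report(endpoints, timeout=3.0):
    """Map each "host:port" string to whether it accepted a connection."""
    return {
        f"{host}:{port}": can_connect(host, port, timeout)
        for host, port in endpoints
    }
```

On the InfluxDB host you would call something like `report([("<kapacitor-host>", 9092)])`, and on the Kapacitor host `report([("<influxdb-host>", 8086)])`. The ports 9092 and 8086 are the defaults for Kapacitor and InfluxDB; substitute your own if the configuration changes them. The first direction matters because InfluxDB pushes the subscribed data to Kapacitor, so a firewall that only allows Kapacitor to reach InfluxDB is not enough.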

Nicolas_Solignac · November 12, 2019, 1:36pm

Hi to you all!

I had the same issue, and in my case the solution was a change in Kapacitor’s configuration.
I changed the “subscription-mode” setting from “cluster” to “server”, and Kapacitor was then able to communicate with InfluxDB (both running on localhost).

Hope it helps.
Cheers!!
