Deadman alert is the only one that works - Kapacitor

Hello!

I am having some problems with Kapacitor. I am testing how the alerts work, so I created a test database where a datapoint is written every 5 seconds. Every datapoint has the same values: tag=11 and temp=10.

The deadman alert works fine when I stop the script that loads the data, so the connection between Kapacitor and InfluxDB works.
I tried two other alerts: one that should trigger when the % change is <= 1%, and one that should trigger when temp is below a threshold of 15.

The TICKscript for the threshold alert is the following:

var db = 'Test_Kapacitor'
var rp = 'autogen'
var measurement = 'Temp'
// The value was lost in the original post; an empty list is assumed here.
var groupBy = []
var whereFilter = lambda: ("tag" == '11')
var period = 10s
var every = 30s
var name = 'Test'
var idVar = name
var message = 'Tag: {{ index .Tags "tag" }} has a temp below 15'
var idTag = 'alertID'
var levelTag = 'level'
var messageField = 'message'
var durationField = 'duration'
var outputDB = 'chronograf'
var outputRP = 'autogen'
var outputMeasurement = 'alerts'
var triggerType = 'threshold'
var details = 'Warning! Temp below 15'
var crit = 15
var data = stream
    |from()
        .database(db)
        .retentionPolicy(rp)
        .measurement(measurement)
        .groupBy(groupBy)
        .where(whereFilter)
    |window()
        .period(period)
        .every(every)
        .align()
    |mean('temp')
        .as('value')

var trigger = data
    |alert()
        .crit(lambda: "value" <= crit)
        .message(message)
        .id(idVar)
        .idTag(idTag)
        .levelTag(levelTag)
        .messageField(messageField)
        .durationField(durationField)
        .details(details)
        .log('/tmp/alert.log')

trigger
    |eval(lambda: float("value"))
        .as('value')
        .keep()
    |influxDBOut()
        .create()
        .database(outputDB)
        .retentionPolicy(outputRP)
        .measurement(outputMeasurement)
        .tag('alertName', name)
        .tag('triggerType', triggerType)

trigger
    |httpOut('output')

And this is the preview graph

I have already tried selecting only tag=11, grouping by tag, selecting no tag at all, and adding .tolerance(100ms) (for the relative alert), and I'm running out of ideas.
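
As a sanity check on the threshold logic itself, the window, mean and crit steps of the script can be replayed offline. This is only a rough Python sketch, not how Kapacitor evaluates the task internally: the function name simulate_threshold and the idealized timestamps (one point exactly every 5 seconds, all with temp=10) are assumptions made for illustration.

```python
from statistics import mean


def simulate_threshold(points, period, every, crit):
    """Replay window(period, every) -> mean -> crit check over (ts, temp) points.

    Returns a list of (emit_time, mean_value, level) tuples.
    """
    if not points:
        return []
    results = []
    last_ts = max(ts for ts, _ in points)
    emit = every
    while emit <= last_ts:
        # Keep only the points that fall in the last `period` seconds.
        window = [temp for ts, temp in points if emit - period <= ts < emit]
        if window:
            value = mean(window)
            level = "CRITICAL" if value <= crit else "OK"
            results.append((emit, value, level))
        emit += every
    return results


# One point every 5 s for two minutes, all with temp=10 (as in the test database).
points = [(ts, 10.0) for ts in range(0, 121, 5)]
results = simulate_threshold(points, period=10, every=30, crit=15)
for emit, value, level in results:
    print(f"t={emit:>3}s mean={value} level={level}")
```

With these inputs every window evaluates to CRITICAL, so the condition itself does not look like the reason the alert stays silent. That would point towards checking whether the points reach the task at all.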

Do you see any errors for the task?

Assuming you're running the stack on Ubuntu or something similar, you can check the task itself:

kapacitor show taskname (if it was created in Chronograf, the task name will be something like Chronograf-GUID)
sudo kapacitor logs | grep "taskname"

The first should give you a breakdown of each node in your script. There is an errors section on each entry. Check to see whether the script is failing while processing data.

The second command should show you what Kapacitor is doing each time the task runs, so if there are any errors, you will see what they are.

Hi there, sorry for the late reply, but I was working on something else first.

I am running Kapacitor, Telegraf, InfluxDB and Chronograf, each in its own Linux Docker container.

If I run kapacitor logs, the terminal just hangs with a blinking cursor, so I read the logs directly in /var/log/kapacitor/kapacitor.log. It seems that there is a TLS handshake error that keeps happening every second.


This seems strange, because I configured HTTPS with a self-signed certificate for Chronograf and InfluxDB, and they work fine.

Any other command, such as kapacitor list tasks, always fails with the "unable to parse authentication credentials" error, even though the credentials are correct. I tried passing them directly in the URL (-url = https://kapacitor:9092?u=redacted&p=redacted) and setting them through environment variables.
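
One thing that may be worth trying: as far as I know, the kapacitor CLI picks up credentials from the userinfo part of the URL (https://user:password@host:9092) rather than from u and p query parameters, and an unquoted & in a shell command also cuts the command short. Treat both points as assumptions to verify against your Kapacitor version. A small Python sketch for building such a URL with special characters escaped (the helper name kapacitor_url and the sample credentials are made up):

```python
from urllib.parse import quote


def kapacitor_url(host, port, username, password, scheme="https"):
    """Build a URL carrying credentials in the userinfo part.

    Username and password are percent-encoded so characters such as
    '@', ':' or '/' cannot break the URL structure.
    """
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"{scheme}://{user}:{pwd}@{host}:{port}"


# Placeholder credentials, not the real ones.
url = kapacitor_url("kapacitor", 9092, "admin", "p@ss:word/1")
print(url)
```

The resulting string would then be passed in quotes, for example as the value of the -url flag or of the KAPACITOR_URL environment variable.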

[UPDATE]

I disabled HTTPS but kept authentication, and now everything works, which confirms that there is an HTTPS misconfiguration somewhere, but I can't find where.

[UPDATE 2]

If I disable authentication in the kapacitor.conf file, I can use commands such as kapacitor logs or kapacitor list tasks in the container, but the TLS error persists.

The kapacitor.conf is the following:

hostname = "kapacitor"
data_dir = "/var/lib/kapacitor"
skip-config-overrides = false
default-retention-policy = ""

[alert]
persist-topics = true

[http]
bind-address = ":9092"
auth-enabled = true
log-enabled = true
write-tracing = false
pprof-enabled = false
https-enabled = true
https-certificate = "/etc/ssl/testing.pem"
https-private-key = "/etc/ssl/testing.pem"
shutdown-timeout = "10s"
shared-secret = ""

[[influxdb]]
enabled = true
name = "default"
default = true
urls = ["https://influxdb:8086"]
username = "redacted"
password = "redacted"
ssl-ca = ""
ssl-cert = "/etc/ssl/testing.pem"
ssl-key = "/etc/ssl/testing.pem"
insecure-skip-verify = true
timeout = "0s"
disable-subscriptions = false
subscription-protocol = "https"
subscription-mode = "cluster"
kapacitor-hostname = "kapacitor"
http-port = 0
udp-bind = ""
udp-buffer = 1000
udp-read-buffer = 0
startup-timeout = "5m0s"
subscriptions-sync-interval = "1m0s"
[influxdb.excluded-subscriptions]
_kapacitor = ["autogen"]

[logging]
file = "/var/log/kapacitor/kapacitor.log"
level = "DEBUG"
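
Since https-certificate and https-private-key (and likewise ssl-cert and ssl-key) point at the same file, that file has to hold both the certificate and the private key. A quick way to confirm this is to look for the PEM block headers. The sketch below is a hypothetical helper (pem_block_types is not part of any Kapacitor tooling) and only inspects the block labels; it does not validate the certificate or the key.

```python
import re

PEM_HEADER = re.compile(r"-----BEGIN ([A-Z0-9 ]+)-----")


def pem_block_types(pem_text):
    """Return the labels of all PEM blocks found in the text, in order."""
    return PEM_HEADER.findall(pem_text)


def has_cert_and_key(pem_text):
    """True if the text holds at least one certificate and one private key block."""
    labels = pem_block_types(pem_text)
    has_cert = any(label == "CERTIFICATE" for label in labels)
    has_key = any(label.endswith("PRIVATE KEY") for label in labels)
    return has_cert and has_key


# Dummy content standing in for /etc/ssl/testing.pem (bodies shortened, not real keys).
sample = """-----BEGIN CERTIFICATE-----
MIIB...
-----END CERTIFICATE-----
-----BEGIN PRIVATE KEY-----
MIIE...
-----END PRIVATE KEY-----
"""
print(pem_block_types(sample))
print(has_cert_and_key(sample))
```

To run it against the real file, read the file contents inside the Kapacitor container and pass them to has_cert_and_key.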

UPDATE:
I tried 4 different combinations:
HTTP, no auth, HTTP subscription = the only one that works
HTTPS with auth = unable to parse authentication credentials + TLS handshake error
HTTPS, no auth = TLS handshake error
HTTP with auth = unable to parse authentication credentials

The credentials are correct, since I can use them to set up the Kapacitor connection via Chronograf.