I’d like to set up a DNR (Data Not Received) alert, for which I use Alert Type: Deadman. I hope that’s the proper type for this purpose.
Essentially, I want to be alerted when no new data has arrived in the database in the last 10 minutes.
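To make the intent concrete, here is a minimal Python sketch of the condition I expect the rule to evaluate. It is only my understanding of what a deadman with threshold 0 and period 10m should mean, not how Kapacitor implements it; the function name `deadman_fired` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta


def deadman_fired(point_times, now, period=timedelta(minutes=10), threshold=0):
    """Return True when the number of points seen in (now - period, now]
    is at or below the threshold, i.e. the alert should fire."""
    window_start = now - period
    recent = sum(1 for t in point_times if window_start < t <= now)
    return recent <= threshold


# Hypothetical evaluation time, chosen only for illustration.
now = datetime(2017, 8, 8, 10, 41)

# A point arrived 3 minutes ago, so the alert should stay quiet.
print(deadman_fired([now - timedelta(minutes=3)], now))   # False

# The last point is 25 minutes old, so the alert should fire.
print(deadman_fired([now - timedelta(minutes=25)], now))  # True
```

With threshold 0 this reduces to "fire if nothing at all arrived in the last 10 minutes", which is the behaviour I am after.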
So I do everything as usual:
- Build a query in Select a Time Series (see the picture below)
- Select Alert Type: Deadman and …missing for: 10m
- Send Alert to log
- Enter a name for the Rule
- Click Save Rule
- Get the "Rule successfully updated!" box
- Verify the rule in Kapacitor (output posted below)
- Refresh the webpage in Chronograf (or navigate in the left pane: Alerting -> Alert Rules -> select the actual Rule)
- The Select a Time Series and Rule Conditions sections are empty!!
[martin@server ~]$ chronograf -v;service kapacitor version
2017/08/08 09:26:09 Chronograf 1.3.5.0 (git: 9e87035b7f9d8f1e78a77789d69ab4709a095f67)
Kapacitor 1.3.1 (git: master 3b5512f7276483326577907803167e4bb213c613)
[martin@server ~]$ kapacitor show chronograf-v1-e37216f4-b723-4c1a-9dda-8965fa4d2961
ID: chronograf-v1-e37216f4-b723-4c1a-9dda-8965fa4d2961
Error:
Template:
Type: stream
Status: enabled
Executing: true
Created: 08 Aug 17 07:31 UTC
Modified: 08 Aug 17 10:41 UTC
LastEnabled: 08 Aug 17 10:41 UTC
Databases Retention Policies: ["vUSP_CTS"."autogen"]
TICKscript:
var db = 'vUSP_CTS'
var rp = 'autogen'
var measurement = 'memUsage'
var groupBy = ['host']
var whereFilter = lambda: TRUE
var period = 10m
var name = 'DNR'
var idVar = name + ':{{.Group}}'
var message = '{{.Time}} DataNotReceived {{.Name}} '
var idTag = 'alertID'
var levelTag = 'level'
var messageField = 'message'
var durationField = 'duration'
var outputDB = 'chronograf'
var outputRP = 'autogen'
var outputMeasurement = 'alerts'
var triggerType = 'deadman'
var threshold = 0.0
var data = stream
    |from()
        .database(db)
        .retentionPolicy(rp)
        .measurement(measurement)
        .groupBy(groupBy)
        .where(whereFilter)
var trigger = data
    |deadman(threshold, period)
        .stateChangesOnly()
        .message(message)
        .id(idVar)
        .idTag(idTag)
        .levelTag(levelTag)
        .messageField(messageField)
        .durationField(durationField)
        .log('/tmp/DNR.log')
trigger
    |eval(lambda: "emitted")
        .as('value')
        .keep('value', messageField, durationField)
    |influxDBOut()
        .create()
        .database(outputDB)
        .retentionPolicy(outputRP)
        .measurement(outputMeasurement)
        .tag('alertName', name)
        .tag('triggerType', triggerType)
trigger
    |httpOut('output')
DOT:
digraph chronograf-v1-e37216f4-b723-4c1a-9dda-8965fa4d2961 {
graph [throughput="0.00 points/s"];
stream0 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
stream0 -> from1 [processed="0"];
from1 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
from1 -> noop3 [processed="0"];
noop3 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
stats2 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
stats2 -> derivative4 [processed="0"];
derivative4 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
derivative4 -> alert5 [processed="0"];
alert5 [alerts_triggered="0" avg_exec_time_ns="0s" crits_triggered="0" errors="0" infos_triggered="0" oks_triggered="0" warns_triggered="0" working_cardinality="0" ];
alert5 -> http_out8 [processed="0"];
alert5 -> eval6 [processed="0"];
http_out8 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
eval6 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
eval6 -> influxdb_out7 [processed="0"];
influxdb_out7 [avg_exec_time_ns="0s" errors="0" points_written="0" working_cardinality="0" write_errors="0" ];
}
Successfully saved: