Hi all,
I'm currently trying to create a TICKscript that alerts when a Linux service is down.
However, it does not alert within the specified time frame, and sometimes the message field is not filled in properly: the host in it comes back empty.
Could you please advise on the best method to monitor the status of a service and create alerts through Chronograf?
I'm trying a deadman alert like the one below:
var db = 'telegraf'
var rp = 'monitor'
var measurement = 'procstat_lookup'
var groupBy = ['host', 'result']
var whereFilter = lambda: ("host" == 'xxx' OR "host" == 'yyy' OR "host" == 'zzz') AND ("result" == 'lookup_error')
var period = 1m
var name = '[Testes] XXX'
var idVar = name + '-{{.Group}}'
var message = 'Service Off on {{ index .Tags "host" }}'
var idTag = 'alertID'
var levelTag = 'level'
var messageField = 'message'
var durationField = 'duration'
var outputDB = 'chronograf'
var outputRP = 'autogen'
var outputMeasurement = 'alerts'
var triggerType = 'deadman'
var threshold = 0.0
var data = stream
    |from()
        .database(db)
        .retentionPolicy(rp)
        .measurement(measurement)
        .groupBy(groupBy)
        .where(whereFilter)

var trigger = data
    |deadman(threshold, period)
        .message(message)
        .id(idVar)
        .idTag(idTag)
        .levelTag(levelTag)
        .messageField(messageField)
        .durationField(durationField)
        .email()
        .to('xxx@xxx')
        .telegram()
        .chatId('xxxxx')
        .parseMode('Markdown')

trigger
    |eval(lambda: "emitted")
        .as('value')
        .keep('value', messageField, durationField)
    |eval(lambda: float("value"))
        .as('value')
        .keep()
    |influxDBOut()
        .create()
        .database(outputDB)
        .retentionPolicy(outputRP)
        .measurement(outputMeasurement)
        .tag('alertName', name)
        .tag('triggerType', triggerType)

trigger
    |httpOut('output')