We use the Telegraf SNMP input plugin together with the Starlark processor plugin to check whether a process is running.
In telegraf.conf, the [[inputs.snmp]] section has this table definition to query process names from HOST-RESOURCES-MIB::hrSWRunName:
[[inputs.snmp.table]]
  name = "hrSWRunTable_SIM"
  inherit_tags = [ "hostname" ]
  index_as_tag = true
  [[inputs.snmp.table.field]]
    name = "hrSWRunName"
    oid = ".1.3.6.1.2.1.25.4.2.1.2" # HOST-RESOURCES-MIB::hrSWRunName
We use the Starlark processor to check a process's running state based on the SNMP process table, which has more than 100 entries under HOST-RESOURCES-MIB::hrSWRunName. In this case, we check only the crond process and ignore the others. The Starlark processor plugin is set up as below:
[[processors.starlark]]
  namepass = ["hrSWRunTable_SIM"]
  source = '''
def apply(metric):
    proc_name = metric.fields.get('hrSWRunName')
    if proc_name == "crond":
        metric.fields['crond'] = 2
        print(proc_name)
        return metric
    else:
        metric.fields['others'] = 0
        return metric
    return metric
'''
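To make the behaviour of this processor concrete, here is a minimal plain-Python sketch of the same logic run over one table walk. The Metric class, run_table() and has_crond_field() are hypothetical stand-ins written for illustration only; they are not part of Telegraf, and the process names are made up.

```python
# Plain-Python stand-in for the Starlark apply() logic above.
# Metric is a hypothetical minimal class, NOT Telegraf's real metric type.
class Metric:
    def __init__(self, name, fields):
        self.name = name
        self.fields = dict(fields)


def apply(metric):
    # Same branching as the Starlark script: only the crond row gets 'crond'.
    proc_name = metric.fields.get("hrSWRunName")
    if proc_name == "crond":
        metric.fields["crond"] = 2
    else:
        metric.fields["others"] = 0
    return metric


def run_table(process_names):
    """Run apply() over one simulated SNMP table walk, one metric per row."""
    return [
        apply(Metric("hrSWRunTable_SIM", {"hrSWRunName": name}))
        for name in process_names
    ]


def has_crond_field(metrics):
    """True if at least one emitted metric carries a 'crond' field."""
    return any("crond" in m.fields for m in metrics)


# Assumed example process lists, one with crond and one without.
with_crond = run_table(["systemd", "sshd", "crond"])
without_crond = run_table(["systemd", "sshd"])

print(has_crond_field(with_crond))     # True
print(has_crond_field(without_crond))  # False
```

The second case is the one that matters here: when crond is absent from the table, apply() is never called with a crond row, so none of the emitted metrics has a 'crond' field at all.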
So if crond is running, the Starlark script sets metric.fields['crond'] = 2; we can see the value in Chronograf Explore, and the TICKscript configured in Kapacitor is able to verify it. However, if crond is not running, no row in the process table matches, so the Starlark script never sets metric.fields['crond'] and the field does not exist. In that case Chronograf Explore reports no data for the crond state (which is accurate), and the TICKscript cannot verify crond because no point carries a "crond" field. The TICKscript is:
var data = stream
    |from()
        .database(db)
        .retentionPolicy(rp)
        .measurement(measurement)
        .groupBy(groupBy)
        .where(whereFilter)
    |eval(lambda: "crond")
        .as('value')

var trigger = data
    |alert()
        // .crit(lambda: "value" == 'null')
        .crit(lambda: "value" != 2)
        // .stateChangesOnly()
        .message(message)
        .id(idVar)
        .idTag(idTag)
        .levelTag(levelTag)
        .messageField(messageField)
        .durationField(durationField)
        .log('/var/log/SAM_SNMP_Proc_crond.log')

trigger
    |eval(lambda: float("value"))
        .as('value')
        .keep()
    |influxDBOut()
        .create()
        .database(outputDB)
        .retentionPolicy(outputRP)
        .measurement(outputMeasurement)
        .tag('alertName', name)
        .tag('triggerType', triggerType)

trigger
    |httpOut('output')
If we add metric.fields['crond'] = 0 right after def apply(metric) in the Starlark processor, the value 0 is applied to every process in the process table, and the TICKscript catches it and sends out an alert for each process. How can we define the crond state when crond is not running, or how should the TICKscript handle a "crond" field that is missing in that case? Thanks!