Hello, I have a Kapacitor template I use to alert on overall Nutanix storage. The alert triggers when the total capacity drops, which signifies a potential disk issue. That part works fine; however, the alert message is not written into the alerts database and I can't work out why. I have another, similar template (monitoring the current usage to trigger at < 20% and < 30%). That one works perfectly: the message is inserted with the alert data.
The message in each template is very similar (apart from the name of the field in the message), so I'm at a loss as to what is happening.
If I tail the Kapacitor logs and run kapacitor show taskname, I can see there are no errors. If I tail the InfluxDB logs, I don't see anything about the point being rejected. Also, if I add a .log() node to my template, I can see the message as it should be.
As a side note, all of my templates use the same message format. The only differences are the fields in the message and the wording describing what the alert is.
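For reference, a minimal sketch of the kind of logging check mentioned above, assuming a log node chained directly onto the `trigger` variable from the template below. The prefix string is hypothetical and only there to make the lines easy to find in the Kapacitor log:

```
// Debug-only branch: writes every point that leaves the alert chain to the Kapacitor log.
// 'storage-capacity-debug' is an illustrative prefix, not part of the real template.
trigger
    |log()
        .level('INFO')
        .prefix('storage-capacity-debug')
```

With this in place the logged point shows the message as expected, so the message exists at that stage of the pipeline.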
// Database settings
var db = 'input'
var rp = 'autogen'
var measurement = 'nutanix_cluster_storage_stats'
// Alerts database
var outputDB = 'output'
var outputRP = 'autogen'
var outputMeasurement = 'NutanixAlerts'
var groupBy = ['cluster_uuid','host','CustomerName','Location']
// Customer name. This must be entered in upper case, exactly as it appears in the database.
var customer string
var host string
var whereFilter = lambda: ("cluster_uuid" == host AND "CustomerName" == customer)
var priority string
var period = 4m
var every = 2m
var name = 'NutanixClusterStorageCapacity'
var idVar = name + '-{{index .Tags "CustomerName"}}-' + priority + '-' + host
var message = 'Priority: ' + priority + ', host: '+ host +', Capacity Usage: {{index .Fields "capacity_tb" | printf "%0.2f"}}TB at {{.Time}}'
var idTag = 'alertID'
var levelTag = 'level'
var messageField = 'message'
var durationField = 'duration'
var triggerType = 'threshold'
var source = 'Source'
var metric = 'Total Storage Capacity'
var unit = 'TB'
var module = 'Nutanix'
var mb = 1024.0 * 1024.0 * 1024.0 * 1024.0
// This is the total amount of disk space in your cluster. If the capacity is not equal to this value (for example, it has dropped below it), an alert is triggered.
// Value should be entered in TB. Example: A 10 TB threshold should be entered as 10.0
var crit lambda
var data = stream
    |from()
        .database(db)
        .retentionPolicy(rp)
        .measurement(measurement)
        .groupBy(groupBy)
        .where(whereFilter)
    |window()
        .period(period)
        .every(every)
        .align()
    // |eval(lambda: float("capacity_bytes"))
    //     .as('capacity_bytes')
    |eval(lambda: float("capacity_bytes") / mb,
        lambda: float("capacity_bytes") / mb,
        lambda: string(float("capacity_bytes") / mb), // Var1
        lambda: string(host), // Var2
        lambda: string(source), // Alert Source
        lambda: string(metric), // Metric that triggered the alert
        lambda: string(unit), // unit type %, MB, GB
        lambda: string(module), // Module - Nutanix
        lambda: string(priority) // Alert priority. not to be confused with alert level
    )
        .as('capacity_bytes','capacity_tb','Var1','Var2','Source','Metric','Unit','Module','Priority')
        .tags('Var1','Var2','Source','Metric','Unit','Module','Priority')
        .keep('capacity_bytes','capacity_tb')
var trigger = data
    |alert()
        .crit(lambda: float("capacity_tb") != crit)
        //.stateChangesOnly()
        .message(message)
        .id(idVar)
        .idTag(idTag)
        .levelTag(levelTag)
        .messageField(messageField)
        .durationField(durationField)
        .log('/tmp/' + name + '.log')
    |httpPost()
        .endpoint('myendpoint')
trigger
    |delete()
        .tag('Var1')
        .tag('Var2')
        .tag('Source')
        .tag('Metric')
        .tag('Unit')
        .tag('Module')
        .tag('Priority')
    |eval(lambda: float("capacity_bytes"),
        lambda: float("capacity_bytes") / mb)
        .as('capacity_bytes','capacity_tb')
        .keep('capacity_bytes','capacity_tb')
    |influxDBOut()
        .create()
        .database(outputDB)
        .retentionPolicy(outputRP)
        .measurement(outputMeasurement)
        .tag('alertName', name)
        .tag('Priority', priority)
        .tag('triggerType', triggerType)
trigger
    |httpOut('output')
InfluxDB 1.5.2
Kapacitor 1.5.1
Updating Influx is not an option currently.
It just seems odd that this one alone refuses to work as it should. I have a second template that alerts on the percentage of storage remaining. They are basically the same template but monitoring different fields. That one works!?
Any ideas?
Regards,
PhilB