I’ve created a simple Kapacitor task to watch disk usage data and alert on a certain threshold. The disk usage data is updated at a 10s interval. How does Kapacitor check the data and raise alerts? Does it also create an alert every 10s in this case? If so, is that configurable, or does it depend directly on the data collection interval? Kapacitor sends alerts via SMTP or some other service, and it looks like it’s sending a mail for every check.
Can I configure Kapacitor to send an alert only once per level change (OK to crit, crit to OK, warn to crit…), and maybe escalate the level and send again after some time (say after 30 minutes: ‘hey, this is still crit’)? Currently it spams far too many alerts.
Are alerts cheap? How should I monitor performance, or check whether there is a waiting queue?
In this case, Kapacitor is listening on the data stream that is written to InfluxDB, so if you write every 10s, Kapacitor will perform its checks every 10s.
In the alert node there is a property, stateChangesOnly, which sends a notification only when the status changes (the status can be OK, Info, Warning or Critical).
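To make the behaviour concrete, here is a minimal Python sketch of the idea. It is not Kapacitor's implementation: the function name, the (timestamp, level) tuples and the assumption that the alert starts in the OK state are mine. As far as I remember, stateChangesOnly also accepts an optional maximum interval (e.g. 30m) that re-sends an unchanged alert after that much time, which is what the max_interval argument models here; check the alert node docs before relying on it.

```python
from datetime import datetime, timedelta


def filter_state_changes(events, max_interval=None):
    """Yield only the events that should produce a notification.

    events: iterable of (timestamp, level) tuples in time order.
    max_interval: optional timedelta; a non-OK level that has not changed
    is re-sent once this much time has passed since the last notification.
    """
    last_level = "OK"   # assumption: the alert starts out in the OK state
    last_sent = None
    for ts, level in events:
        changed = level != last_level
        overdue = (
            max_interval is not None
            and last_sent is not None
            and level != "OK"               # only remind about ongoing problems
            and ts - last_sent >= max_interval
        )
        if changed or overdue:
            last_sent = ts
            yield ts, level
        last_level = level
```

With points arriving every 10s and a disk that stays critical for half an hour, this yields one event when the level first becomes critical, one reminder after max_interval, and one when it recovers, instead of one mail per point.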
You can also set how long a state must last before the status level is upgraded or downgraded, for example so that the status becomes critical only after it has been in the “warning” state for 10m.
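The escalation rule can be sketched like this in Python. Again this only illustrates the logic and is not how Kapacitor does it internally; the escalate function and the 10 minute hold are hypothetical names and values. In Kapacitor itself I believe the stateDuration node is the piece that tracks how long a condition has held, so that would be the place to start looking.

```python
from datetime import datetime, timedelta


def escalate(events, hold=timedelta(minutes=10)):
    """Promote WARNING to CRITICAL once it has lasted at least `hold`.

    events: list of (timestamp, level) tuples in time order.
    Returns a new list with the adjusted levels.
    """
    warning_since = None
    out = []
    for ts, level in events:
        if level == "WARNING":
            if warning_since is None:
                warning_since = ts          # the warning streak starts here
            if ts - warning_since >= hold:
                level = "CRITICAL"          # held long enough, escalate
        else:
            warning_since = None            # any other level resets the streak
        out.append((ts, level))
    return out
```

Combined with state-change-only notifications, this gives one mail when the warning starts and a second one only if it is still there 10 minutes later.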
Also, always apply an aggregation function (e.g. mean()) over a window of points; otherwise the check is performed on each individual point instead of on an aggregated value.
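A small Python sketch of why this matters, using made-up disk usage values and hypothetical thresholds of 80 (warning) and 90 (critical). Checked point by point, a single spike raises a critical alert; checked on the mean of the window, the same data stays OK.

```python
from statistics import mean


def level(value, warn=80, crit=90):
    """Map a usage value to an alert level (thresholds are illustrative)."""
    if value > crit:
        return "CRITICAL"
    if value > warn:
        return "WARNING"
    return "OK"


def window_means(values, size):
    """Mean of each consecutive, non-overlapping window of `size` points."""
    return [
        mean(values[i:i + size])
        for i in range(0, len(values) - size + 1, size)
    ]


# Six points at a 10s interval, i.e. one minute of data with a single spike.
usage = [70, 72, 95, 71, 70, 73]

per_point = [level(v) for v in usage]                     # checks every point
per_window = [level(m) for m in window_means(usage, 6)]   # checks the 1m mean
```

Here per_point contains a CRITICAL entry for the spike, while per_window holds a single OK, because the one-minute mean is about 75.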
This post provides a more practical and very complete example, with multiple alert statuses and thresholds.
Another nice, though somewhat more complex, example is in the Kapacitor docs. It’s well made, but since it uses statistical functions to process the data you may need to study it a bit to understand what it does and why. (It took me some time… and I’ve already forgotten what it does…)