We have a case where we are monitoring metrics and looking out for a specific value (-99) that indicates a communication failure in the device.
We would like to generate an event with level “warning” and send an e-mail when the value has been -99 for > 1 hour, and escalate this to level “critical” with another e-mail when the value has been -99 for >= 24 hours.
So, the desired behavior is:
anything else than -99 => normal
-99 > 1 hour => warning, email
-99 > 24 hours => critical, email
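
To make the intended logic concrete, here is a minimal Python sketch of the escalation rule. It is not tied to any monitoring product or its API; the names `severity`, `should_send_email`, and the constants are our own illustration, and it assumes the samples arrive as (timestamp, value) pairs sorted oldest first.

```python
from datetime import datetime, timedelta

# Assumed constants, taken from the rules above.
FAILURE_VALUE = -99
WARNING_AFTER = timedelta(hours=1)
CRITICAL_AFTER = timedelta(hours=24)


def severity(samples, now):
    """Return 'normal', 'warning' or 'critical' for a list of
    (timestamp, value) samples sorted oldest first."""
    failure_start = None
    for ts, value in samples:
        if value == FAILURE_VALUE:
            # Remember when the current unbroken run of -99 began.
            if failure_start is None:
                failure_start = ts
        else:
            # Any other value resets the run.
            failure_start = None

    if failure_start is None:
        return "normal"

    elapsed = now - failure_start
    if elapsed >= CRITICAL_AFTER:
        return "critical"
    if elapsed > WARNING_AFTER:
        return "warning"
    return "normal"


def should_send_email(previous, current):
    """Send one e-mail per escalation, not on every evaluation."""
    return current != previous and current in ("warning", "critical")
```

The key point is that the severity depends on how long the value has stayed at -99 without interruption, and that an e-mail is only sent when the level changes to warning or critical.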
We have searched through the documentation and the community and cannot find any information that helps us figure this out. It seems that this scenario should be a fairly common case, so there should be a solution out there. Or are we missing something?