I'm trying to create a dashboard for my switches. I add their IPs to the array in the config file, and that part works.
But when I try to send a test message through Opsgenie, it only arrives after about 2 minutes, and at random.
urls = ["192.168.0.21"]
count = 4
ping_interval = 1.0
timeout = 2.0
How can I correct the config? When a device goes down, I want to get a message about it.
I’ll recap here what you configured in your Grafana alert, as it took me quite some time to get my own to work properly. (Grafana docs here)
I hope it will make the situation clearer.
- Every minute the check is run (Evaluate every = 1m)
- If the result of the check is not fine for more than 1m then an alert will fire (For = 1m)
- The check itself is based on query “A”; the value used is the last one available in the last minute of data (Condition section)
- If the value is over 88 then the alert will trigger (it is up to you to choose the proper threshold for your data)
- The alert will also fire if there is no data, or if an error/timeout occurs
So far so good. Now it’s time for the not-so-obvious part:
- The missing/null series must appear inside the chart at least once, so you may want to extend your dashboard or query time range accordingly
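To make the evaluation settings above more concrete, here is a minimal Python sketch of how I understand the rule to behave. It is only a simulation, not Grafana code: the names `evaluate` and `run_rule` are made up for illustration, and I am assuming one sample per evaluation and that "For = 1m" with "Evaluate every = 1m" means the alert fires on the second consecutive failing check.

```python
from typing import List, Optional, Sequence

THRESHOLD = 88.0       # "value is over 88" from the Condition section
FOR_EVALUATIONS = 1    # assumption: For = 1m / Evaluate every = 1m -> 1 extra check


def evaluate(last_value: Optional[float]) -> bool:
    """Return True when a single check counts as failing."""
    if last_value is None:
        # No data (or an error/timeout) is treated like a failing check.
        return True
    return last_value > THRESHOLD


def run_rule(samples: Sequence[Optional[float]]) -> List[str]:
    """Feed one sample per evaluation and return the rule state after each."""
    states = []
    failing = 0  # number of consecutive failing checks
    for value in samples:
        if evaluate(value):
            failing += 1
            # First failure only marks the rule as pending; it becomes
            # alerting once it has kept failing for the "For" duration.
            states.append("alerting" if failing > FOR_EVALUATIONS else "pending")
        else:
            failing = 0
            states.append("ok")
    return states


print(run_rule([50, 90, 95, None, 10]))
```

If this model is right, a device that goes down just after a check would wait up to a minute for the first failing check and then one more minute before the notification goes out, which might explain a delay of around 2 minutes.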
As an example, let’s say I only have 10 minutes of data in my chart, with whatever metric for a series of hosts.
If a host does not send any data for 10 minutes, it won’t be present in the chart anymore. Therefore it won’t be considered in alert rules anymore (as it does not have any data at all).
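The same idea as a small Python sketch, in case it helps. This is not how Grafana is implemented; `visible_series` and the host names are hypothetical, and timestamps are plain seconds. It only shows that a host with no points inside the time range drops out of the result, so a rule that looks at each returned series never sees it.

```python
from typing import Dict, List


def visible_series(points: Dict[str, List[int]], now: int, window: int) -> List[str]:
    """Return the hosts that have at least one point in [now - window, now]."""
    start = now - window
    return sorted(
        host
        for host, timestamps in points.items()
        if any(start <= t <= now for t in timestamps)
    )


# Hypothetical data: switch-b stopped reporting right after t=0.
points = {
    "switch-a": [0, 60, 120],
    "switch-b": [0],
}

# With a 10-minute (600 s) range at t=700, switch-b has already vanished.
print(visible_series(points, now=700, window=600))
# With a wider 15-minute (900 s) range, it is still part of the result.
print(visible_series(points, now=700, window=900))
```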
In your case, I suggest you actually turn the host on, fetch some data (or write some fake data about it), and turn it off again. Now you will be able to monitor and check your alert rule.
Note that you can always test the rule with “Test Rule” at the bottom, it won’t send an email but will just evaluate it immediately.
Thanks a lot, Giovanni_Luisotto!