Notification when status changes from OK -> CRIT

Hi there, I’ve been playing around with the alerting in InfluxDB 2.0 beta. “When status” “is equal to” works like a charm, but I can’t get “When status” “changes from” OK -> CRIT to work. I’ve changed the check around to make the status go OK -> CRIT -> OK -> CRIT, but the notification rule doesn’t trigger, which is a shame.
The “when status is equal to” rule works, but it sends a message on every check, which in my case is every minute, so the state transition is really the way to go. Has anyone got this working?

Hello @aastroem,
Sorry in advance if this is a silly question, but are you hoping for a message when the alert goes from CRIT -> OK? Or is the problem that you’re not getting any notifications now?

Hello @Anaisdg!
Yes, that’s right. I’m getting alarms with “is equal to” but not when the status transitions from one level to another.

Hello @aastroem,
I would recommend checking out stateChanges() then.
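For anyone following along, the difference between an “is equal to” rule and a “changes from” rule can be sketched in a few lines of Python. This only illustrates the idea of state-change filtering; it is not the Flux implementation of stateChanges(), and the function name, record layout, and sample timestamps are made up for the example.

```python
from typing import List, Tuple

# Hypothetical record layout: (timestamp, level). Not InfluxDB's schema.
Status = Tuple[str, str]


def state_changes(statuses: List[Status], from_level: str, to_level: str) -> List[Status]:
    """Keep only records whose level is to_level and whose predecessor was from_level."""
    changes = []
    previous_level = None
    for timestamp, level in statuses:
        if previous_level == from_level and level == to_level:
            changes.append((timestamp, level))
        previous_level = level
    return changes


# Made-up status history for a single series, oldest first.
statuses = [
    ("12:00:10", "ok"),
    ("12:01:20", "crit"),
    ("12:02:30", "crit"),
    ("12:03:40", "ok"),
    ("12:04:50", "crit"),
]

# "Is equal to crit" matches every crit record; "changes from ok to crit"
# matches only the records that follow an ok record.
equal_to_crit = [s for s in statuses if s[1] == "crit"]
transitions = state_changes(statuses, "ok", "crit")
print(len(equal_to_crit), transitions)
```

Note that a transition can only be detected if the record before it is also in the input, so whatever window is scanned has to reach back at least one check further than the status you want to be notified about.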

@Anaisdg I’m using the Docker container running InfluxDB 2.0 beta, so I’m configuring the notification rule through the UI. My understanding is that Kapacitor is built into InfluxDB 2.0? Is there any way I can troubleshoot this to see what’s going on under the hood of the black box? I managed to get the rule to kick in by changing the check rule back and forth to force the status to change back and forth. I got quite inconsistent results, and I have no clue why it doesn’t work in normal operation. Is it a performance issue?

Hi again, I’ve been troubleshooting this for hours now and it’s not easy that’s for sure.
I’ve made sure that I’m only dealing with one series and I’ve tweaked the timings between:

  • Aggregation function window period = 1m (last)

  • Check is scheduled every 1m10s

  • I’ve verified the thresholds, i.e. count > 0.9 = CRIT, count < 1 = OK

  • The check, in combination with the thresholds, is producing status changes as expected at Xm+10s

  • I’ve played around with the notification rules and managed to get “when status is equal to” working, but I can’t get “when status changes from OK to CRIT” to work. I’ve tried different schedule times, i.e. 1m15s, 5m15s, 10m15s, because I figured that the notification rule might use an aggregate function / bucket to see whether there have been any state transitions during the time window since the last check. Guess what, it doesn’t work.
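To sanity-check the thresholds in the list above outside the UI, they can be reproduced in a short Python sketch. The function name and sample counts are made up, and giving the CRIT rule precedence where the two thresholds overlap (values between 0.9 and 1) is an assumption, not something InfluxDB is confirmed to do.

```python
def classify(count: float) -> str:
    """Map a windowed count to a level using the thresholds quoted above."""
    # count > 0.9 -> CRIT, count < 1 -> OK.
    # Assumption: CRIT wins where both thresholds match (0.9 < count < 1).
    return "CRIT" if count > 0.9 else "OK"


# Made-up per-window counts, one per check run.
counts = [0, 1, 0, 3]
levels = [classify(c) for c in counts]
print(levels)
```

With whole-number counts the overlap never comes into play: 0 is OK and anything from 1 upwards is CRIT.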

Can someone please point out what I am doing wrong here?

We’re seeing the same problem on our alerts as well with Influx 2.0.0 beta 8.

Opened issue #17809 on influxdata/influxdb (GitHub): “[2.0] InfluxDB notification rule doesn't send Slack message on status change”

It seems that the issue was fixed; however, I still see similar behavior:

with the above alert I don’t see any entries in the notification history, any idea why?

this is the notification rule config (I couldn’t put more than one image per post)

+1, we’re running version=2.0.0-beta.15 commit=f54848f443 build_date=2020-07-23T18:27:19Z and this problem still exists.

+1, unfortunately I can’t get the ‘changes from’ condition working either, using:

InfluxDB 2.0.0-beta.16 (git: 50964d732c) build_date: 2020-08-07T20:18:07Z

I can’t get this to work. Tried everything and yet no notifications are created.

Hello @WoLfulus,
What can’t you get to work? Can you be more specific please?

Seems like there’s a bug in v2. Notifications don’t work at all if I set the same interval in the check and the notification rule. No notifications get created, even if I set offsets.

I did manage to get it to work, but I had to set the rule interval to twice the interval of the checks, which sometimes duplicates the notifications.

Hi, Flux developer here. I haven’t been able to reproduce this issue exactly as described, but I am seeing some things I don’t expect and will be investigating this a bit more and triaging it. Thank you for reporting the issue!


I have confirmed that when the check interval and the notification interval are different, some duplicate notifications arrive. I understand this to be expected behaviour. The solution for me is to make the intervals the same.

I was not able to reproduce the above-mentioned problem whereby notifications do not arrive when the intervals are the same. I believe that an incorrect notification endpoint configuration, combined with old/current bugs, was causing this problem. With recent versions it should be possible to correct the notification endpoint configuration and see it take effect.

If that still doesn’t work for you, please make a change to both the notification rule and the check configuration and re-save them. Configurations not updating until they are re-saved is a current known issue with the alerting system.

The known issue with configurations not updating is captured here.
