The full story:
I want to alert on missing data (something like deadman), but with a timeout that depends on the maximum request count seen in the previous 10 minutes.
For example, if the max count over the previous 10 minutes is 1, then I am fine waiting 10 minutes before throwing an alert; if the max count is more than 40, then I want to throw an alert after 2 minutes without data, etc.
I am trying to do this with stats and derivative, but the warn line, which compares the "duration of emitted zeroes" against the "dynamic time threshold", throws an error:
err=“right reference value "previous_count.max_cnt" is missing value”
It looks similar to the question "Will stateDuration/stateCount work with dynamic threshold", but the solution given there doesn't work for me, and the "generated" metrics used to identify the gaps make it a bit trickier.
The relevant part of my code is here:
var data = stream
    |from()
        .measurement('requests')
        .where(lambda: ("hostname" =~ /a1/ AND "app" =~ /tek/))
        .groupBy('name', 'app')
    |window()
        .period(1m)
        .every(1m)
    |mean('request_time')
        .as('mean_request_time')

var count_past = stream
    |from()
        .measurement('requests')
        .where(lambda: ("hostname" =~ /a1/ AND "app" =~ /tek/))
        .groupBy('name', 'app')
    |window()
        .period(10m)
        .every(1m)
    |count('request_time')
        .as('count_request')
    |max('count_request')
        .as('max_cnt')
    |shift(10m)

var join_count = data
    |join(count_past)
        .as('current_data', 'previous_count')
        .tolerance(1m)

join_count
    |stats(1m)
        .align()
    |derivative('emitted')
        .unit(1m)
        .nonNegative()
    |stateDuration(lambda: ("emitted" < 1))
        .unit(1m)
        .as('warn_state_duration')
    |log()
    |alert()
        .stateChangesOnly()
        .message('{{ .Time }} Tags: {{ .Tags }} - Fields: {{ .Fields }}')
        .warn(lambda: "warn_state_duration" > "previous_count.max_cnt")
        .slack()
        .channel('kpalerts')
If I use a literal number instead of "previous_count.max_cnt", it seems to work OK.
A second, related question, about just one part of the above:
I tried to use this (as mentioned in the linked question):
|eval(lambda: int("current_data.emitted"), lambda: int("previous_count.max_cnt")
.as('cd_emitted', 'max_count_threshold')
but when I define the task it throws an error pointing at the second line. Can a single eval take two or more lambda expressions, or not?
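For reference, the Kapacitor EvalNode documentation shows a single eval taking several lambda expressions, with one name per expression in .as(). Below is a minimal sketch of that form, reusing the field names from the snippet above; whether those fields actually exist at that point in the pipeline is an assumption. The only difference from the snippet is the closing parenthesis on the eval( call, which may be what the parser objects to on the second line:

```
// join_count is the joined stream defined earlier in this post.
// Field names are copied from the snippet above and are assumed to exist.
join_count
    |eval(lambda: int("current_data.emitted"), lambda: int("previous_count.max_cnt"))
        // One name per lambda expression, in the same order.
        .as('cd_emitted', 'max_count_threshold')
```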
What am I doing wrong here?
Any ideas, please?