Will stateDuration/stateCount work with a dynamic threshold?

#1

The goal of this TICKscript is to assign a threshold based on the site's size rating. For example, if the site size is 1, then the threshold is -50. My batch also has a field called week2week change percentage (based on past data, via shift()) for each site. For example, if the change percentage is 100%, then 100% + threshold (-50) gives an alerting threshold of 50%.

I want to write an alert node that says: if the current week2week change percentage > 50%, then alert. But I also want to use stateCount or stateDuration to eliminate false positives.

I am not sure if my approach to this is correct.

var past = batch
  |query('''
     SELECT last(week2week_change), last(site_size)
     FROM "kapacitor"."default"."traffic_analysis"
  ''')
    .cluster(out_cluster)
    .groupBy(domain, locale)
    .period(2m)
    .every(1m)
    .offset(2m)
    .align()
  |shift(2m)

// From the analysis measurement, get the site size and apply a threshold based on size ranking
var size = past
  |eval(lambda: if("last_1" == 1, -50.0, if("last_1" == 2, -30.0, if("last_1" == 3, -25.0, if("last_1" == 4, -20.0, if("last_1" == 5, -17.0, if("last_1" == 6, -16.0, if("last_1" == 7, -14.0, if("last_1" == 8, -12.0, if("last_1" == 9, -10.0, -8.0))))))))))
    .as('percent')
  |last('percent')
    .as('threshold')

// Mean of past data
var past_mean = past
  |mean('last')
   .as('past_w2w_change')

// Get current data from analysis measurement
var current = batch
  |query('''
     SELECT week2week_change
     FROM "kapacitor"."default"."traffic_analysis"
  ''')
    .cluster(out_cluster)
    .groupBy(groups)
    .period(2m)
    .every(1m)
    .align()
  |mean('week2week_change')
    .as('curr_w2w_change')

// Join site size threshold and past mean data into a single point
var pastData = size
  |join(past_mean, current)
    .as('threshold', 'past_mean', 'current')
    .tolerance(tolerence)
  |eval(
      lambda: (float("past_mean.past_w2w_change") + float("threshold.threshold")),
      lambda: float("past_mean.past_w2w_change"),
      lambda: float("threshold.threshold"),
      lambda: float("current.curr_w2w_change")
    )
    .as('alert_threshold', 'past_w2w_change', 'size_threshold', 'current_w2w_change')

#2

Can I assign a field value to a variable in TICKscript?

#3

@Daniel_Li When you say you want to use stateCount/stateDuration to eliminate false positives, do you mean that you want to ensure that the alert_threshold has been crossed for a certain amount of time before alerting? Otherwise I am not quite sure what you are asking.

Slightly unrelated, but in your script you are querying the last values and then taking the mean of last. That is essentially the mean of the single last point, which really isn't a mean. Over what data did you intend to take the mean?
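
If the intent was to average every point in the shifted window rather than only the last one, one option is to select the raw field and let the mean node do the aggregation. The following is only a rough sketch under that assumption: it reuses the query settings from the original script, `past_raw` is a hypothetical variable name, and `site_size` would then need its own query or node.

```
// Sketch: select the raw points so that mean() has more than one value to average.
var past_raw = batch
  |query('''
     SELECT week2week_change
     FROM "kapacitor"."default"."traffic_analysis"
  ''')
    .cluster(out_cluster)
    .groupBy(domain, locale)
    .period(2m)
    .every(1m)
    .offset(2m)
    .align()
  |shift(2m)

// Mean over all points in each 2m batch, not just the last one.
var past_mean = past_raw
  |mean('week2week_change')
    .as('past_w2w_change')
```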

#4

After the join in my script, my point will contain these fields:
'alert_threshold', 'past_w2w_change', 'size_threshold', 'current_w2w_change'

I want my alert crit to be like: current_w2w_change > alert_threshold

And yes, I want to use stateDuration to check whether it has been greater than the alert threshold for 5 minutes. My concern is that the "alert_threshold" field value changes every minute, so I wonder how that is going to work. Will stateDuration look only at the first alert threshold value that it sees on the first query run, or will it pick up the changed value on each subsequent run?

For the last(), I was trying to convert it to a stream and get the last point.

#5

The stateDuration node evaluates its expression against the current value of each field on every point, so the comparison will update whenever the alert_threshold updates.

Something like this should work fine:

var pastData = size
  |join(past_mean, current)
    .as('threshold', 'past_mean', 'current')
    .tolerance(tolerence)
  |eval(
      lambda: (float("past_mean.past_w2w_change") + float("threshold.threshold")),
      lambda: float("past_mean.past_w2w_change"),
      lambda: float("threshold.threshold"),
      lambda: float("current.curr_w2w_change")
    )
    .as('alert_threshold', 'past_w2w_change', 'size_threshold', 'current_w2w_change')
  |stateDuration(lambda: "current_w2w_change" > "alert_threshold")
    .unit(1m)
  |alert()
    .crit(lambda: "state_duration" > 5)
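
Since the original question also mentions stateCount, here is a minimal sketch of the count-based variant. It assumes that `pastData` ends at the eval node, as in the first post, and that the default output field name `state_count` is kept. The limit of 5 consecutive points is an illustrative choice, not something settled in this thread.

```
// Sketch: count consecutive points where the current change exceeds the
// alert threshold carried on the same point. The expression is evaluated
// per point, so a changing alert_threshold is picked up each time.
// Assumption: alert after 5 consecutive matching points.
pastData
  |stateCount(lambda: "current_w2w_change" > "alert_threshold")
  |alert()
    .crit(lambda: "state_count" >= 5)
```

With the task running every minute, five consecutive matching points roughly corresponds to the five-minute window discussed above, but the count depends on how many points actually arrive, whereas stateDuration depends on elapsed time.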