I have an InfluxDB measurement that distinguishes success from failure using a tag (status=0 for success, non-zero status for failure). I am trying to write a TICKscript that generates an alert when the error percentage exceeds a certain value.
To find the error percentage (100 * error_count / total), I was able to calculate the total record count and the error count separately over a 2-minute window, as shown below. How do I create an alert that combines these two values? Or is there a better way to do this?

var data = stream
    |from()
        .measurement('MTP')
    |window()
        .period(2m)
        .every(1m)

var total = data
    |count('status')

var error = data
    |where(lambda: "status" != '0')
    |count('status')

var alert = ??????
    |alert()
        .id('{{ .TaskName }}')
        .crit(lambda: 100 * "error" / "total" > 0)
        .message('Error value: {{index .Fields "error"}}')
        .log('/tmp/total.log')

@zamrbo That looks like the right way to do it. Is the above alert working as expected? If not I would suggest looking into the Join() node to join the total and error streams before doing the alert.
@jackzampolin No, the alert is not working. I tried to use join as you suggested and found a similar example that (outer) joins two measurements to generate the alert.
Unfortunately, I have not been able to get my alert to work, and I am not sure where I am going wrong. My updated script using join:

var data = stream
    |from()
        .measurement('MTP')
    |window()
        .period(2m)
        .every(1m)

var total = data
    |count('status')

var error = data
    |where(lambda: "status" != '0')
    |count('status')

error
    |join(total)
        .fill(0)
        .as('errors', 'totals')
    |eval(lambda: "errors.error" / "totals.total")
        .as('value')
    |alert()
        .id('{{ .TaskName }}')
        .crit(lambda: "value" > 0)
        .message('Value: {{index .Fields "value"}}')
        .log('/tmp/total.log')

You are right, the values are accessed through errors.count and totals.count, not errors.error and totals.total. Thank you @jackzampolin!
It looks like the value field was rounded down to 0, presumably because both counts are integers. I had to multiply by 100 for the alert expression (value > 0) to become TRUE, and that triggers the alert.
Anyway, thanks @jackzampolin for solving the original issue. I am now able to generate an alert when the error rate exceeds a certain threshold. I will try to set up the example from the docs to better understand the rounding issue.
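For anyone hitting the same rounding problem, here is a minimal sketch of how the corrected script might look. It assumes the rounding comes from integer division: count() emits integers, and TICKscript lambda expressions do not convert types implicitly, so both counts are passed through float() before dividing. The field name error_pct and the 5.0 threshold are placeholders chosen for illustration; they do not come from this thread.

```
var data = stream
    |from()
        .measurement('MTP')
    |window()
        .period(2m)
        .every(1m)

var total = data
    |count('status')

var error = data
    |where(lambda: "status" != '0')
    |count('status')

error
    |join(total)
        .as('errors', 'totals')
        .fill(0)
    // Convert both integer counts to float so the ratio is not truncated.
    |eval(lambda: 100.0 * float("errors.count") / float("totals.count"))
        .as('error_pct')
    |alert()
        .id('{{ .TaskName }}')
        // 5.0 is a placeholder threshold (percent); adjust as needed.
        .crit(lambda: "error_pct" > 5.0)
        .message('Error percentage: {{ index .Fields "error_pct" }}')
        .log('/tmp/total.log')
```

With the float conversion in place, the threshold can be written directly as a percentage instead of relying on any non-zero result.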
I’m trying to do something similar: alert when a field named error exceeds a certain value, or when it exceeds a certain percentage. I’m using the script above as an example, but I’m getting a "cannot get properties of non pointer value" error on the line that has the |alert() method.
What am I doing wrong? Here’s my script:

var data = stream
    |from()
        .measurement('api_detail')

var total = data
    |count('tr-error')

var error = data
    |where(lambda: "tr-error" == 'true')
    |count('tr-error')

error
    |join(total)
        .fill(0)
        .as('errors', 'totals')
    |eval(lambda: "errors.count" / "totals.count")
        as('value')
    |alert()
        .message('message is {{index .Fields "error"}}')
        .crit(lambda: TRUE)
        .log('/tmp/alert.log')
    |httpOut('top10')
    |log()
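
A few things stand out when comparing this script with the earlier ones in the thread, though I cannot say for certain which of them causes the error. The as('value') line after |eval() is missing its leading dot. The message template reads a field named "error", while the eval node only produces "value". And unlike the earlier scripts, the stream is not windowed before the counts are taken. The sketch below applies those changes; the window sizes are placeholders copied from the earlier example, and the float() conversion is an assumption carried over from the rounding discussion above.

```
var data = stream
    |from()
        .measurement('api_detail')
    // Assumed window; these sizes are placeholders, not from the original script.
    |window()
        .period(2m)
        .every(1m)

var total = data
    |count('tr-error')

var error = data
    |where(lambda: "tr-error" == 'true')
    |count('tr-error')

error
    |join(total)
        .as('errors', 'totals')
        .fill(0)
    // float() avoids integer truncation of the ratio.
    |eval(lambda: float("errors.count") / float("totals.count"))
        // Property methods need a leading dot.
        .as('value')
    |alert()
        // Reference the field that eval actually emits.
        .message('Error ratio: {{ index .Fields "value" }}')
        .crit(lambda: TRUE)
        .log('/tmp/alert.log')
    |httpOut('top10')
    |log()
```

If the error persists after these changes, it may help to remove nodes one at a time, starting from the end of the pipeline, to see which one triggers it.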