Hello all,
I am monitoring a Windows estate and its services, and I am trying to set up a panel for Windows services. The scenario is:
Failover cluster
2x servers
10 services
Only one of the servers has the services running at any one time.
So I am trying to use conditional filters to monitor these.
Basically, the logic below is:
Service is running on server 1 and not on server 2 then ok
Service is running on server 2 and not on server 1 then ok
Service is not running on server 1 and not on server 2 then critical
from(bucket: "telegraf")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) =>
    if r["service_name"] == "SERVICE1" and r["host"] == "SERVER1" and r["state"] == "4" and r["host"] == "SERVER2" and r["state"] == "1"
    then "ok"
    else if r["service_name"] == "SERVICE1" and r["host"] == "SERVER1" and r["state"] == "1" and r["host"] == "SERVER2" and r["state"] == "4"
    then "ok"
    else if r["service_name"] == "SERVICE1" and r["host"] == "SERVER1" and r["state"] == "1" and r["host"] == "SERVER2" and r["state"] == "1"
    then "Critical"
    else "ok"
  )
  |> aggregateWindow(every: v.windowPeriod, fn: last, createEmpty: false)
  |> yield(name: "last")
Has anyone had any experience with this, or could anyone give me some pointers? I can't seem to get this right.
Any help would be amazing.
Hello @ThePeltonian,
So if state == 1, then that server is running?
I need a little bit more context around what your input data looks like to better help you.
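For example, the output of a small query like the one below would show which columns are tags and which are fields. This is only a sketch: the bucket name comes from your query, and the measurement name "win_services" is my assumption (it is the default for the Telegraf win_services input), so swap in whatever you actually have.

```flux
// Assumption: bucket "telegraf" (from the question) and measurement "win_services".
// Returns the first few raw rows of each series so the tag and field layout is visible.
from(bucket: "telegraf")
    |> range(start: -5m)
    |> filter(fn: (r) => r["_measurement"] == "win_services")
    |> limit(n: 5)
```

If state turns out to be a field rather than a tag, the comparisons would need to go through _field and _value instead of r["state"].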
My check might look something like:
import "influxdata/influxdb/monitor"
import "influxdata/influxdb/v1"
// A single row has only one host and one state, so the alternatives are combined with "or".
data = from(bucket: "system")
  |> range(start: -10s)
  |> filter(fn: (r) => r["_measurement"] == "mymeasurement")
  |> filter(fn: (r) => r["service_name"] == "SERVICE1")
  |> filter(fn: (r) => r["host"] == "SERVER1" or r["host"] == "SERVER2")
  |> filter(fn: (r) => r["state"] == "1" or r["state"] == "4")
option task = {name: "Cpu Check", every: 10s, offset: 5s}
check = {_check_id: "0783108a0eb5f000", _check_name: "Cpu Check", _type: "threshold", tags: {}}
// Going by your query, state "4" means running and "1" means stopped.
crit = (r) => r["SERVER1"] == "1" and r["SERVER2"] == "1"
ok = (r) => (r["SERVER1"] == "4" and r["SERVER2"] == "1") or (r["SERVER1"] == "1" and r["SERVER2"] == "4")
messageFn = (r) => "Check: ${r._check_name} is: ${r._level}"
// Pivot on host so that each row has a SERVER1 and a SERVER2 column holding that host's state.
data
  |> pivot(rowKey: ["_time"], columnKey: ["host"], valueColumn: "state")
  |> monitor.check(data: check, messageFn: messageFn, crit: crit, ok: ok)
But essentially what this is doing is:
data = from(bucket: "system")
  |> range(start: -10s)
  |> filter(fn: (r) => r["_measurement"] == "mymeasurement")
  |> filter(fn: (r) => r["service_name"] == "SERVICE1")
  |> filter(fn: (r) => r["host"] == "SERVER1" or r["host"] == "SERVER2")
  |> filter(fn: (r) => r["state"] == "1" or r["state"] == "4")
  // Pivot on host so that each row has a SERVER1 and a SERVER2 column holding that host's state.
  |> pivot(rowKey: ["_time"], columnKey: ["host"], valueColumn: "state")
  // Going by your query, state "4" means running and "1" means stopped.
  // A Flux conditional needs a final else branch, hence "unknown".
  |> map(fn: (r) => ({r with _level:
    if (r["SERVER1"] == "4" and r["SERVER2"] == "1") or (r["SERVER1"] == "1" and r["SERVER2"] == "4") then "ok"
    else if r["SERVER1"] == "1" and r["SERVER2"] == "1" then "crit"
    else "unknown"}))
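If you want to try the pivot-then-compare idea without touching your bucket, here is a minimal sketch that uses array.from() with made-up rows. The timestamps and rows are hypothetical sample data, and I am assuming state is a string where "4" means running and "1" means stopped, as in your query.

```flux
import "array"

// Hypothetical sample rows: one row per host for the same service and timestamp.
data = array.from(
    rows: [
        {_time: 2023-01-01T00:00:00Z, service_name: "SERVICE1", host: "SERVER1", state: "4"},
        {_time: 2023-01-01T00:00:00Z, service_name: "SERVICE1", host: "SERVER2", state: "1"},
        {_time: 2023-01-01T00:01:00Z, service_name: "SERVICE1", host: "SERVER1", state: "1"},
        {_time: 2023-01-01T00:01:00Z, service_name: "SERVICE1", host: "SERVER2", state: "1"},
    ],
)

data
    // Turn the two host rows for each timestamp into one row with SERVER1 and SERVER2 columns.
    |> pivot(rowKey: ["_time", "service_name"], columnKey: ["host"], valueColumn: "state")
    // Compare the two columns on the same row to derive a level.
    |> map(
        fn: (r) => ({r with _level:
            if (r["SERVER1"] == "4" and r["SERVER2"] == "1") or (r["SERVER1"] == "1" and r["SERVER2"] == "4") then
                "ok"
            else if r["SERVER1"] == "1" and r["SERVER2"] == "1" then
                "crit"
            else
                "unknown",
        }),
    )
```

With these rows, the 00:00 row (running on SERVER1 only) should come out as "ok" and the 00:01 row (stopped on both) as "crit". The comparison only works after the pivot, because before it no single row carries both hosts.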
Let me know if these work for you!