We manage many servers, and each server is supported by a different team. Each team wants a different threshold for its alerts. When I set up an alert, I could not find any place to target the right group, and we would rather not create a large number of separate alerts.
I'm thinking of running an Influx query that gets the average CPU usage over the last 15 minutes for all servers, and sending the result to another script. That script keeps its own thresholds and target support groups internally, and generates the alerts based on those thresholds.
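To make the idea concrete, here is a minimal sketch of what that second script might look like. It assumes the query result has already been turned into a mapping of host name to average CPU percentage. The host names, team names, thresholds, and the `Rule` / `build_alerts` names are all made up for illustration, and the actual delivery of the alert (email, pager, chat) is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    team: str         # support group that should receive the alert
    threshold: float  # average CPU percentage that triggers the alert


# Hypothetical per-server rules; in practice these might live in a config file.
RULES = {
    "web-01": Rule(team="web-team", threshold=80.0),
    "db-01": Rule(team="dba-team", threshold=60.0),
}

# Fallback for servers without an explicit rule (also an assumption).
DEFAULT_RULE = Rule(team="ops-team", threshold=90.0)


def build_alerts(avg_cpu):
    """Return (team, host, value) tuples for every host above its threshold.

    avg_cpu maps host name -> average CPU percentage over the last 15 minutes.
    """
    alerts = []
    for host, value in sorted(avg_cpu.items()):
        rule = RULES.get(host, DEFAULT_RULE)
        if value > rule.threshold:
            alerts.append((rule.team, host, value))
    return alerts


if __name__ == "__main__":
    # Sample input standing in for the query result.
    sample = {"web-01": 85.2, "db-01": 41.0, "cache-01": 95.5}
    for team, host, value in build_alerts(sample):
        print(f"[{team}] {host}: avg CPU {value:.1f}% over the last 15 minutes")
```

With this layout there is only one query and one script to maintain, and each team's threshold is just an entry in the rules table.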
My problem is that I can't find a query that returns just the average CPU for the last 15 minutes. The queries I have found return multiple windows instead of a single value per server.
Any suggestions would be appreciated.