I am looking for a way to monitor all parts of the TICK stack, preferably through some kind of health check, such as an HTTP endpoint I can ping to confirm that Telegraf, InfluxDB, Chronograf, and Kapacitor are all up.
I know some monitoring of Telegraf can be achieved with Kapacitor, but what is the best way to make sure Kapacitor itself is healthy?
In Chronograf, the API is served from /chronograf/v1. A call to that endpoint returns 200 and a listing of all the available API endpoints. You can see the full documentation for the Chronograf API at http://{{myinstance}}/docs
Telegraf does not open a port by default, so there is no standard health check for that piece. Setting up Kapacitor alerts on Telegraf's reporting is the right way to handle that.
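As a rough illustration of endpoint-based checks, here is a minimal Python sketch that polls one URL per service. The Chronograf path and its 200 response come from the description above. The InfluxDB 1.x `/ping` and Kapacitor `/kapacitor/v1/ping` endpoints (both expected to answer 204), the localhost hostnames, and the default ports 8086, 9092, and 8888 are assumptions about your setup. The function names are hypothetical.

```python
import urllib.error
import urllib.request

# Assumed default ports on localhost; adjust hosts, ports, and paths to your deployment.
ENDPOINTS = {
    "influxdb": ("http://localhost:8086/ping", {204}),
    "kapacitor": ("http://localhost:9092/kapacitor/v1/ping", {204}),
    "chronograf": ("http://localhost:8888/chronograf/v1", {200}),
}


def fetch_status(url, timeout=5):
    """Return the HTTP status code for a GET on url, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, just not with a success code.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return None


def check_stack(endpoints=ENDPOINTS, fetch=fetch_status):
    """Map each service name to True if its endpoint returned an expected status."""
    return {
        name: fetch(url) in expected
        for name, (url, expected) in endpoints.items()
    }


if __name__ == "__main__":
    for name, ok in check_stack().items():
        print(f"{name}: {'up' if ok else 'DOWN'}")
```

The `fetch` parameter is injectable so the check logic can be exercised without a live stack. Telegraf is deliberately absent here, since it exposes no port by default.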
I’d also like some way to know whether Telegraf is working. I can of course check that the process is alive, but I can’t find a way to see whether it’s actually sending stats. Getting a ‘last update time’ for dozens of servers from the InfluxDB server isn’t easy, so I’m not quite sure how to approach this.
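One possible way to get a per-host "last update time" is a single InfluxQL query grouped by the `host` tag, compared against a list of hosts you expect to report. The sketch below is only that, a sketch. It assumes InfluxDB 1.x on localhost:8086, Telegraf's default `telegraf` database, the `cpu` measurement with a `usage_idle` field, and a hypothetical host inventory. The function names are made up for illustration.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumptions: InfluxDB 1.x on localhost:8086 and Telegraf's cpu input enabled.
INFLUX_URL = "http://localhost:8086/query"
# Bound the scan to the last hour; hosts silent for longer simply do not appear.
QUERY = 'SELECT last("usage_idle") FROM "cpu" WHERE time > now() - 1h GROUP BY "host"'


def last_seen(response):
    """Extract {host: epoch seconds of last point} from a /query JSON response."""
    seen = {}
    for result in response.get("results", []):
        for series in result.get("series", []):
            host = series.get("tags", {}).get("host")
            time_idx = series["columns"].index("time")
            seen[host] = series["values"][0][time_idx]
    return seen


def stale_hosts(seen, expected_hosts, now, max_age=120):
    """Return hosts that never reported or whose last point is older than max_age seconds."""
    return sorted(
        h for h in expected_hosts
        if h not in seen or now - seen[h] > max_age
    )


def query_influx(db="telegraf"):
    """Run QUERY with epoch=s so timestamps come back as integer seconds."""
    params = urllib.parse.urlencode({"db": db, "q": QUERY, "epoch": "s"})
    with urllib.request.urlopen(f"{INFLUX_URL}?{params}", timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    hosts = ["web01", "web02"]  # hypothetical inventory of servers running Telegraf
    print(stale_hosts(last_seen(query_influx()), hosts, now=time.time()))
```

Comparing against an explicit inventory matters because a host that has never reported, or has been silent past the query window, is missing from the result rather than listed with an old timestamp.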