Health checking all parts of the stack

Hi,

I am looking for a way to monitor all parts of the TICK stack, preferably with some kind of health check, such as an HTTP endpoint I can ping to see whether Telegraf, InfluxDB, Chronograf, and Kapacitor are all up.

I know some monitoring of Telegraf can be done with Kapacitor, but what would be the best way to make sure Kapacitor itself is healthy?

Thanks

InfluxDB has an API endpoint called /ping. I'm not sure about the other apps.
Read more here:

@MattC For Kapacitor there is a /kapacitor/v1/ping endpoint that does the same thing.

In Chronograf the API is served from /chronograf/v1. A call to that endpoint returns 200 and a listing of all the API endpoints. You can see the full documentation for the Chronograf API at http://{{myinstance}}/docs

Telegraf does not open a port by default, so there is no standard health check for that piece. Setting up Kapacitor alerts that fire when Telegraf stops reporting is the right way to handle that.
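
To make the endpoints above concrete, here is a rough sketch of a script that pings all three services. The hostnames and ports (8086, 9092, 8888) are just the usual defaults and are assumptions on my part, and `is_healthy` / `check_stack` are made-up helper names, so adjust everything to your setup. It treats any 2xx response as healthy, since the ping endpoints typically answer 204 and Chronograf answers 200.

```python
import urllib.error
import urllib.request

# Assumed default ports on localhost; change these for your deployment.
ENDPOINTS = {
    "influxdb": "http://localhost:8086/ping",
    "kapacitor": "http://localhost:9092/kapacitor/v1/ping",
    "chronograf": "http://localhost:8888/chronograf/v1",
}


def is_healthy(url, timeout=3.0):
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Covers connection refused, timeouts, and 4xx/5xx responses
        # (HTTPError is a subclass of URLError).
        return False


def check_stack(endpoints=ENDPOINTS):
    """Check every endpoint and return a {service_name: bool} mapping."""
    return {name: is_healthy(url) for name, url in endpoints.items()}
```

Calling `check_stack()` from cron or from another monitoring tool gives you one up/down flag per service, which you can then turn into whatever alert you like.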

I’d also like some way to know if Telegraf is working. I can of course check if the process is alive, but I can’t find a way to see if it’s actually sending stats. Getting the ‘last update time’ for dozens of servers from the InfluxDB server isn’t easy, so I’m not quite sure how to approach this.
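
One possible approach to the ‘last update time’ problem, sketched under several assumptions: an InfluxDB 1.x /query endpoint, Telegraf writing to its default "telegraf" database, and the "system" input plugin enabled so that each host reports an "uptime" field with a "host" tag. The function names and the 120-second threshold are hypothetical. The idea is to ask for the last point per host in a single query and then flag hosts whose newest point is too old.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumed: Telegraf's default database and the "system" plugin's "uptime" field.
QUERY = 'SELECT last("uptime") FROM "system" WHERE time > now() - 1d GROUP BY "host"'


def fetch_last_seen(base_url="http://localhost:8086", db="telegraf", timeout=5.0):
    """Run the query against InfluxDB and return the decoded JSON response."""
    # epoch=s asks InfluxDB to return timestamps as Unix seconds.
    params = urllib.parse.urlencode({"db": db, "q": QUERY, "epoch": "s"})
    with urllib.request.urlopen(f"{base_url}/query?{params}", timeout=timeout) as resp:
        return json.load(resp)


def stale_hosts(response, max_age_s=120, now=None):
    """Return {host: age_in_seconds} for hosts whose last point is too old."""
    now = time.time() if now is None else now
    stale = {}
    for result in response.get("results", []):
        for series in result.get("series", []):
            host = series.get("tags", {}).get("host", "unknown")
            last_seen = series["values"][0][0]  # first column is the timestamp
            age = now - last_seen
            if age > max_age_s:
                stale[host] = age
    return stale
```

Something like `stale_hosts(fetch_last_seen())` would then list the hosts that have gone quiet. Note that a host with no points at all inside the one-day window will not show up in the response, so you would still need to compare against a list of hosts you expect to see.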