I have several Docker engine nodes with telegraf running natively on them, and one influxdb container running in Docker, which is configured as the output for telegraf.
The problem is: when the influxdb container is unavailable for a short time, telegraf does not try to reconnect. Telegraf's logging also stopped at that moment. As a result, telegraf's metrics and logs for the last 10 days are missing.
The only solution so far is to restart telegraf, after which it works fine again.
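Until there is a proper fix, the restart workaround could be automated. The following is only a minimal sketch of that idea, not something telegraf provides: it polls the `/ping` health endpoint of InfluxDB 1.x and restarts telegraf once influxdb becomes reachable again after an outage. The function names, the polling interval, and the restart command `systemctl restart telegraf` are assumptions (the latter presumes telegraf runs as a systemd service named `telegraf`).

```python
import http.client
import subprocess
import time
import urllib.error
import urllib.request


def influxdb_reachable(base_url, timeout=2.0):
    """Return True if GET <base_url>/ping answers with a 2xx status."""
    try:
        url = base_url.rstrip("/") + "/ping"
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, HTTP error status, malformed reply, ...
        return False


def should_restart(was_up, is_up):
    """Restart only on the transition from unreachable back to reachable."""
    return (not was_up) and is_up


def watch(base_url, interval=30.0,
          restart_cmd=("systemctl", "restart", "telegraf")):
    """Poll influxdb forever; restart telegraf after each recovered outage.

    restart_cmd is an assumption about how telegraf is managed on the node.
    """
    was_up = influxdb_reachable(base_url)
    while True:
        time.sleep(interval)
        is_up = influxdb_reachable(base_url)
        if should_restart(was_up, is_up):
            subprocess.run(restart_cmd, check=False)
        was_up = is_up
```

It would be started on each node with something like `watch("http://localhost:8086")`, where the URL is a placeholder for the actual influxdb address. Restarting only on the down-to-up transition avoids restarting telegraf repeatedly while influxdb is still unavailable.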
The problem seems similar to "Telegraf recover from - or detect - temporary failure" and occurs with this version:
telegraf --version
Telegraf v1.4.5 (git: release-1.4 8385206e6851a212e04b355e3bf0b95421ed0e69)
Is there a way to make telegraf reconnect after an influxdb timeout?
//edited: crosslink https://github.com/influxdata/telegraf/issues/2679#issuecomment-354802213