I’m using the inputs.mqtt_consumer plugin to retrieve data from a LoRaWAN network server (TTN). I recently noticed that no data was being written to my InfluxDB anymore.
I restarted telegraf and data were written again.
I looked into telegraf’s log-files and found error entries for all concerned mqtt_consumer instances, e.g.
E! [inputs.mqtt_consumer::ttn_consumer_ow] Error in plugin: connection lost: read tcp 172.19.0.3:42600->52.212.223.226:1883: read: connection reset by peer
E! [inputs.mqtt_consumer::ttn_consumer_ow] Error in plugin: network Error : read tcp 172.19.0.3:33402->63.34.215.128:1883: i/o timeout
I guess it can happen that the connection between the server running telegraf and the mqtt-server is temporarily interrupted. But why does that cause the telegraf input plugins to stop working?
Here are some more observations:
- telegraf is running in a docker container
- other input- and output plugins of the same telegraf instance continued working without issue
- As the example above shows, the IP address of the configured mqtt-server seems to have changed between the two errors. Maybe that’s just a normal process (e.g. something like load balancing?), and maybe that is what causes the telegraf issue? On the other hand, there are also log entries where both error messages (read: connection reset by peer / i/o timeout) show the same IP address.
- When I restarted the telegraf container, data immediately started to flow again from the mqtt-server to the InfluxDB.
Does anyone have advice on how to prevent telegraf from stopping plugins when a connection interruption occurs, or how to enable them to start again once the connection is re-established?
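In case it helps with the changing-IP observation: a minimal sketch of how one could check whether the broker hostname resolves to more than one address, which would point to DNS-based load balancing. This is only an assumption about how the check might be done. The function name resolve_ipv4 and the hostname mqtt.example.org are placeholders of mine, not the real broker name from my config.

```python
import socket


def resolve_ipv4(hostname, port=1883, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv4 addresses a hostname resolves to.

    The resolver argument exists only so the lookup can be swapped out
    (e.g. for testing without network access); by default the system
    resolver is used via socket.getaddrinfo.
    """
    infos = resolver(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4 the sockaddr is an (address, port) tuple.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Placeholder hostname: replace with the broker name from the telegraf config.
    print(resolve_ipv4("mqtt.example.org"))
```

If this prints several addresses, or different ones on repeated runs, the change of IP between the two log entries would just be normal DNS behaviour and probably not the cause by itself.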