We have a number of Windows systems running Telegraf that send data to a single remote InfluxDB instance every 10s. The problem is that our Telegraf instances sometimes stop connecting via UDP to send data. Restarting the Telegraf service seems to resolve the issue (until it randomly occurs again).
Logs show repeated:
2018-03-29T22:34:24Z E! Failed to connect to output influxdb, retrying in 15s, error was 'Error creating UDP Client [udp://172.16.22.30:8089]: Error dialing UDP address [172.16.22.30:8089]: dial udp 172.16.22.30:8089: connect: A socket operation was attempted to an unreachable network.'
I’m looking for any hints, clues, things to check into, to try and resolve or at least narrow down the cause of this issue.
Is it possible that the InfluxDB server has changed its IP address? If so, I think this has been resolved in Telegraf 1.6 and newer.
Nope. Our influx server IP is statically set. And restarting the telegraf service gets it functioning again pushing data to the expected IP.
I’m not sure, then. It sounds like there isn’t a routing entry for the network InfluxDB is on. Any chance you are connecting through a proxy?
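One way to narrow this down might be a small probe running next to Telegraf on an affected machine. The "unreachable network" text is the operating system refusing to connect a UDP socket because it has no route to the destination, so the same check can be repeated outside Telegraf. The sketch below is not part of Telegraf; `probe_udp` and `watch` are hypothetical helper names, and the 10s interval simply mirrors the one mentioned in the first post. Note that connecting a UDP socket sends no packet, so an OK result only suggests a route exists, not that InfluxDB is receiving data.

```python
import socket
import time
from datetime import datetime, timezone
from typing import Optional


def probe_udp(host: str, port: int) -> Optional[str]:
    """Try to connect a UDP socket to host:port.

    Returns None on success, or the OS error text on failure.
    No datagram is sent; connect() only asks the OS to pick a route.
    """
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.connect((host, port))
        return None
    except OSError as exc:
        return str(exc)


def watch(host: str, port: int, interval_s: float = 10.0,
          iterations: Optional[int] = None) -> int:
    """Probe repeatedly and print one timestamped line per attempt.

    Runs forever when iterations is None. Returns the number of failures.
    """
    failures = 0
    done = 0
    while iterations is None or done < iterations:
        err = probe_udp(host, port)
        stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        if err is None:
            print(f"{stamp} OK route to {host}:{port}", flush=True)
        else:
            failures += 1
            print(f"{stamp} FAIL {host}:{port}: {err}", flush=True)
        done += 1
        time.sleep(interval_s)
    return failures


if __name__ == "__main__":
    # Address and port taken from the log line in the first post.
    watch("172.16.22.30", 8089)
```

If the probe starts failing at the same time as Telegraf, that would point at the machine's network state rather than at Telegraf itself, and comparing the output of `ipconfig` and `route print` before and during the failure may show what changed. If the probe keeps reporting OK while Telegraf keeps logging the error, that would be worth including in the new issue.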
No proxies. I may have been misleading in my initial post. By remote influx I only meant an influx instance running on a separate machine from those the telegraf instances are running on.
All machines are on the same LAN. The Telegraf instances are connected only via Wi-Fi.
Strange, I’m not sure what the problem is then. Could you check if it still occurs with Telegraf 1.6.4 and then open a new issue?
It may take a couple of weeks to collect the info as it’s an intermittent problem, but will do.