My InfluxDB database seemed to be working fine until I noticed that, for some reason, it had stopped receiving data.
I checked the server that sends the data and it is working normally.
Both InfluxDB and Telegraf are deployed to a Kubernetes cluster.
The logs of my InfluxDB pod look fine, but the logs of the Telegraf pod look like this:
2019-11-28T04:44:50Z E! [agent] Error writing to output [influxdb]: could not write any address
2019-11-28T04:45:00Z E! [outputs.influxdb] when writing to [http:// data-influxdb.tick:8086]: Post http:// data-influxdb.tick:8086/write?db=charts_ticks: dial tcp: lookup data-influxdb.tick on 10.100.0.10:53: no such host
2019-11-28T04:45:00Z E! [agent] Error writing to output [influxdb]: could not write any address
…
…
…
…
2019-11-28T10:27:31Z E! [outputs.influxdb] when writing to [http:// data-influxdb.tick:8086]: Post http:// data-influxdb.tick:8086/write?db=charts_ticks: dial tcp 10.100.199.34:8086: connect: connection refused
2019-11-28T10:27:31Z E! [agent] Error writing to output [influxdb]: could not write any address
2019-11-28T10:27:40Z E! [outputs.influxdb] when writing to [http:// data-influxdb.tick:8086]: 404 Not Found: database not found: “charts_ticks”
2019-11-28T10:27:40Z E! [agent] Error writing to output [influxdb]: could not write any address
2019-11-28T10:27:51Z E! [inputs.kafka_consumer]: Error in plugin: kafka: error while consuming TICK2/0: kafka: broker not connected
2019-11-28T10:31:24Z W! [outputs.influxdb]: when writing to [http:// data-influxdb.tick:8086]: received error partial write: points beyond retention policy dropped=1000
…
…
…
2019-11-28T11:44:40Z E! [outputs.influxdb] when writing to [http:// data-influxdb.tick:8086]: 500 Internal Server Error: engine is closed
2019-11-28T11:44:40Z E! [agent] Error writing to output [influxdb]: could not write any address
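The first two kinds of error in the log ("no such host" and "connection refused") happen at different stages: DNS lookup and TCP connect. Below is a minimal sketch for telling them apart, assuming Python is available in a pod in the same cluster. The function name check_endpoint and its return labels are made up for illustration; the host and port are the ones from the log.

```python
import socket


def check_endpoint(host, port, timeout=3.0):
    """Report at which stage a TCP connection to host:port fails.

    Returns one of the illustrative labels:
    "dns_failure", "connection_refused", "unreachable", "ok".
    """
    try:
        # Stage 1: name resolution (the "no such host" case in the log).
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns_failure"

    # Stage 2: TCP connect to the first resolved address.
    family, socktype, proto, _, sockaddr = infos[0]
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect(sockaddr)
        except ConnectionRefusedError:
            # Name resolved, but nothing is listening on that port.
            return "connection_refused"
        except OSError:
            # Timeouts and other network errors.
            return "unreachable"
    return "ok"
```

For example, check_endpoint("data-influxdb.tick", 8086) could be run from inside the Telegraf pod's namespace. It only tests reachability and says nothing about the database or retention errors.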
I should mention that the database name is correct: I tried writing to the database manually and it worked. The retention policies look like this:
name         duration  shardGroupDuration  replicaN  default
autogen      0s        168h0m0s            1         false
12_hours_rp  12h0m0s   1h0m0s              1         true
So it makes sense that data older than 12 hours is not inserted, but I don't understand why current data is not being inserted either, or what these error messages mean.
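To make the retention reasoning concrete, here is a minimal sketch of the check as I understand it: a point should only be dropped if its timestamp is older than "now" minus the duration of the default policy (12h for 12_hours_rp). The helper name within_retention and the sample timestamps are assumptions for illustration, not InfluxDB's actual implementation.

```python
from datetime import datetime, timedelta, timezone

# Duration of the default retention policy (12_hours_rp) from the table above.
RETENTION = timedelta(hours=12)


def within_retention(point_time, now, retention=RETENTION):
    """Return True if point_time is no older than now - retention."""
    return point_time >= now - retention


# Time of the "points beyond retention policy" warning in the log.
now = datetime(2019, 11, 28, 10, 31, 24, tzinfo=timezone.utc)

recent_point = now - timedelta(minutes=5)  # hypothetical current tick
old_point = now - timedelta(hours=13)      # hypothetical stale tick

print(within_retention(recent_point, now))  # True: should be accepted
print(within_retention(old_point, now))     # False: expected to be dropped
```

By this logic, points with current timestamps should be written. If they are dropped as well, one thing worth checking is the timestamps Telegraf actually sends.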