Telegraf logs errors while metrics are successfully written in InfluxDB

Telegraf log:

2019-11-17T17:00:10Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2019-11-17T17:00:15Z E! [outputs.influxdb] when writing to [http://influxdb:8086]: Post http://influxdb:8086/write?db=opench: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
2019-11-17T17:00:15Z D! [outputs.influxdb] Buffer fullness: 3 / 10000 metrics
2019-11-17T17:00:15Z E! [agent] Error writing to outputs.influxdb: could not write any address
2019-11-17T17:00:20Z D! [outputs.file] Wrote batch of 2 metrics in 138.445µs
2019-11-17T17:00:20Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2019-11-17T17:00:25Z E! [outputs.influxdb] when writing to [http://influxdb:8086]: Post http://influxdb:8086/write?db=opench: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
2019-11-17T17:00:25Z D! [outputs.influxdb] Buffer fullness: 3 / 10000 metrics
2019-11-17T17:00:25Z E! [agent] Error writing to outputs.influxdb: could not write any address
2019-11-17T17:00:30Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2019-11-17T17:00:35Z E! [outputs.influxdb] when writing to [http://influxdb:8086]: Post http://influxdb:8086/write?db=opench: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
...
2019-11-17T17:19:05Z D! [outputs.influxdb] Buffer fullness: 6 / 10000 metrics
2019-11-17T17:19:05Z E! [agent] Error writing to outputs.influxdb: could not write any address
2019-11-17T17:19:10Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2019-11-17T17:19:15Z E! [outputs.influxdb] when writing to [http://influxdb:8086]: Post http://influxdb:8086/write?db=opench: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
...

Telegraf and InfluxDB started by this docker-compose:

version: '3'

services:

  influxdb:
    image: influxdb:latest
    container_name: opench-influxdb
    restart: always
    volumes:
      - ./influxdb/influxdb.conf:/etc/influxdb/influxdb.conf:ro
      - opench-influxdb-volume:/var/lib/influxdb
    networks:
      - opench

  telegraf:
    image: telegraf:latest
    container_name: opench-telegraf
    restart: always
    links:
      - influxdb
    volumes:
      - ./telegraf/telegraf.conf:/etc/telegraf/telegraf.conf:ro
    networks:
      - opench

networks:
  opench:
    driver: bridge

volumes:
  opench-influxdb-volume:
    external: true

Metrics are written in InfluxDB - I can successfully query them, I can ping in both directions in both telegraf and influxdb. But telegraf timeouts when posting the metrics.

pi@raspberrypi:~/apps/opench/docker $ docker exec -it opench-telegraf /bin/bash
root@317aa174c835:/# telegraf --version
Telegraf 1.12.4 (git: HEAD 9a4f08e9)

root@317aa174c835:/# curl -sl -I http://influxdb:8086/ping
HTTP/1.1 204 No Content
Content-Type: application/json
Request-Id: dd4e9e42-096e-11ea-8346-0242ac140002
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.7.8
X-Request-Id: dd4e9e42-096e-11ea-8346-0242ac140002
Date: Sun, 17 Nov 2019 19:17:15 GMT

root@317aa174c835:/# ping influxdb
PING influxdb (172.20.0.2) 56(84) bytes of data.
64 bytes from opench-influxdb.docker_opench (172.20.0.2): icmp_seq=1 ttl=64 time=0.312 ms
64 bytes from opench-influxdb.docker_opench (172.20.0.2): icmp_seq=2 ttl=64 time=0.223 ms
64 bytes from opench-influxdb.docker_opench (172.20.0.2): icmp_seq=3 ttl=64 time=0.228 ms
64 bytes from opench-influxdb.docker_opench (172.20.0.2): icmp_seq=4 ttl=64 time=0.227 ms

Strange, what could it be?

Hello @Yuriy_Olkhovyy,
Thanks for your question. So far I’ve only found this issue related to yours. Are you using 1.7.5?
Alternatively, what is your telegraf config look like? Can you share it for debugging?
You might want to set timeout = "60s" or some time longer than the default or what you have.

Telegraf and InfluxDB versions:

root@244eceb833d5:/# telegraf --version
Telegraf 1.12.4 (git: HEAD 9a4f08e9)
    ...
root@3e97f8b185d1:/# influx --version
InfluxDB shell version: 1.7.8

Increased timeout to 60s in telegraf, error changed to the following (metrics still successfully written in influxdb):

2019-11-23T16:02:27Z E! [outputs.influxdb] when writing to [http://influxdb:8086]: 500 Internal Server Error: timeout
2019-11-23T16:02:27Z D! [outputs.influxdb] Buffer fullness: 3 / 10000 metrics
2019-11-23T16:02:27Z E! [agent] Error writing to outputs.influxdb: could not write any address
2019-11-23T16:03:09Z D! [outputs.influxdb] Wrote batch of 3 metrics in 7.00731664s
2019-11-23T16:03:09Z D! [outputs.influxdb] Buffer fullness: 0 / 10000 metrics

Writing a batch of 3 metrics in 7 seconds sounds a bit long. And the timestamps of the log entries differ by 32 seconds.

Attached telegraf.conf here

Hello @Yuriy_Olkhovyy,
I don’t think you successfully manage to share the telegraf config. Could you please try again? Wait–you’re successfully writing points despite the 500? That’s odd. And you can ping the containers? What do the influxdb logs look like?