Telegraf logs errors while metrics are successfully written in InfluxDB

Yuriy_Olkhovyy · November 17, 2019, 5:18pm

Telegraf log:

2019-11-17T17:00:10Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2019-11-17T17:00:15Z E! [outputs.influxdb] when writing to [http://influxdb:8086]: Post http://influxdb:8086/write?db=opench: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
2019-11-17T17:00:15Z D! [outputs.influxdb] Buffer fullness: 3 / 10000 metrics
2019-11-17T17:00:15Z E! [agent] Error writing to outputs.influxdb: could not write any address
2019-11-17T17:00:20Z D! [outputs.file] Wrote batch of 2 metrics in 138.445µs
2019-11-17T17:00:20Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2019-11-17T17:00:25Z E! [outputs.influxdb] when writing to [http://influxdb:8086]: Post http://influxdb:8086/write?db=opench: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
2019-11-17T17:00:25Z D! [outputs.influxdb] Buffer fullness: 3 / 10000 metrics
2019-11-17T17:00:25Z E! [agent] Error writing to outputs.influxdb: could not write any address
2019-11-17T17:00:30Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2019-11-17T17:00:35Z E! [outputs.influxdb] when writing to [http://influxdb:8086]: Post http://influxdb:8086/write?db=opench: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
...
2019-11-17T17:19:05Z D! [outputs.influxdb] Buffer fullness: 6 / 10000 metrics
2019-11-17T17:19:05Z E! [agent] Error writing to outputs.influxdb: could not write any address
2019-11-17T17:19:10Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2019-11-17T17:19:15Z E! [outputs.influxdb] when writing to [http://influxdb:8086]: Post http://influxdb:8086/write?db=opench: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
...

Telegraf and InfluxDB started by this docker-compose:

version: '3'

services:

  influxdb:
    image: influxdb:latest
    container_name: opench-influxdb
    restart: always
    volumes:
      - ./influxdb/influxdb.conf:/etc/influxdb/influxdb.conf:ro
      - opench-influxdb-volume:/var/lib/influxdb
    networks:
      - opench

  telegraf:
    image: telegraf:latest
    container_name: opench-telegraf
    restart: always
    links:
      - influxdb
    volumes:
      - ./telegraf/telegraf.conf:/etc/telegraf/telegraf.conf:ro
    networks:
      - opench

networks:
  opench:
    driver: bridge

volumes:
  opench-influxdb-volume:
    external: true

Metrics are written in InfluxDB - I can successfully query them, I can ping in both directions in both telegraf and influxdb. But telegraf timeouts when posting the metrics.

pi@raspberrypi:~/apps/opench/docker $ docker exec -it opench-telegraf /bin/bash
root@317aa174c835:/# telegraf --version
Telegraf 1.12.4 (git: HEAD 9a4f08e9)

root@317aa174c835:/# curl -sl -I http://influxdb:8086/ping
HTTP/1.1 204 No Content
Content-Type: application/json
Request-Id: dd4e9e42-096e-11ea-8346-0242ac140002
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.7.8
X-Request-Id: dd4e9e42-096e-11ea-8346-0242ac140002
Date: Sun, 17 Nov 2019 19:17:15 GMT

root@317aa174c835:/# ping influxdb
PING influxdb (172.20.0.2) 56(84) bytes of data.
64 bytes from opench-influxdb.docker_opench (172.20.0.2): icmp_seq=1 ttl=64 time=0.312 ms
64 bytes from opench-influxdb.docker_opench (172.20.0.2): icmp_seq=2 ttl=64 time=0.223 ms
64 bytes from opench-influxdb.docker_opench (172.20.0.2): icmp_seq=3 ttl=64 time=0.228 ms
64 bytes from opench-influxdb.docker_opench (172.20.0.2): icmp_seq=4 ttl=64 time=0.227 ms

Strange, what could it be?

Anaisdg · November 19, 2019, 4:28pm

Hello @Yuriy_Olkhovyy,
Thanks for your question. So far I’ve only found this issue related to yours. Are you using 1.7.5?
Alternatively, what is your telegraf config look like? Can you share it for debugging?
You might want to set timeout = "60s" or some time longer than the default or what you have.

Yuriy_Olkhovyy · November 23, 2019, 1:03pm

Telegraf and InfluxDB versions:

root@244eceb833d5:/# telegraf --version
Telegraf 1.12.4 (git: HEAD 9a4f08e9)
    ...
root@3e97f8b185d1:/# influx --version
InfluxDB shell version: 1.7.8

Increased timeout to 60s in telegraf, error changed to the following (metrics still successfully written in influxdb):

2019-11-23T16:02:27Z E! [outputs.influxdb] when writing to [http://influxdb:8086]: 500 Internal Server Error: timeout
2019-11-23T16:02:27Z D! [outputs.influxdb] Buffer fullness: 3 / 10000 metrics
2019-11-23T16:02:27Z E! [agent] Error writing to outputs.influxdb: could not write any address
2019-11-23T16:03:09Z D! [outputs.influxdb] Wrote batch of 3 metrics in 7.00731664s
2019-11-23T16:03:09Z D! [outputs.influxdb] Buffer fullness: 0 / 10000 metrics

Writing a batch of 3 metrics in 7 seconds sounds a bit long. And the timestamps of the log entries differ by 32 seconds.

Attached telegraf.conf here

Anaisdg · November 25, 2019, 10:40pm

Hello @Yuriy_Olkhovyy,
I don’t think you successfully manage to share the telegraf config. Could you please try again? Wait–you’re successfully writing points despite the 500? That’s odd. And you can ping the containers? What do the influxdb logs look like?

Yuriy_Olkhovyy · December 15, 2019, 4:09pm

telegraf.conf.tar.gz (46.8 KB)

Yuriy_Olkhovyy · December 15, 2019, 4:10pm

Sorry for delay, just uploaded my telegraf.conf

Anaisdg · December 16, 2019, 9:59pm

@Yuriy_Olkhovyy, Perhaps this is a storage system issue? Slow SD card tying up IO operations on the InfluxDB side? Can you look at the logs from the DB? Maybe the DB container is in a restart loop?

Anaisdg · December 16, 2019, 10:01pm

Thanks for the config. It looks good to me.

Yuriy_Olkhovyy · December 17, 2019, 3:46am

storage system issue

You are probably right - the setup is quite resource-limited. It’s a Raspberry PI 4 with 4GB RAM, whole system on an SD card of 4GB, and a 120GB HDD for the data. The important thing to mention that the SD card was about 70% full in the beginning, but at the end the card got 100% full (logs?) and the system stopped doing any work. It boots, containers with Influx stack start, but no work observed anymore. There is still plenty of space on the 120GB disk though.

I will replace the SD card with a bigger one and will let you know the result. This will take some time - I expect I’ll have time to do that only after Christmas.

Thanks a lot for your comments.

Topic		Replies	Views
Error writing to output Telegraf influxdb , telegraf	1	1299	July 10, 2019
Telegraf does not send metrics to influxdb v2 Telegraf influxdb , docker	2	114	October 9, 2024
Influxdb Error Please solution	10	21603	February 27, 2019
Explanation telegraf log Telegraf telegraf	1	2064	March 21, 2019
D! [outputs.influxdb_v2] Buffer fullness: 0 / 1 metrics Telegraf	11	3744	August 2, 2022

Telegraf logs errors while metrics are successfully written in InfluxDB

Related topics