InfluxDB Output error

I have Telegraf configured to read input from a Kafka topic and write output to InfluxDB. It had been working for months, but lately we have seen multiple issues with the agent, and a few blog posts suggested that a newer version fixed this issue. I was running Telegraf v1.3.2 and upgraded to 1.11.5, then started to see warnings like these:

Mar 12 02:06:39 ip-10-204-29-53 telegraf: 2020-03-12T02:06:39Z W! [outputs.influxdb] Metric buffer overflow; 34514 metrics have been dropped
Mar 12 02:06:40 ip-10-204-29-53 telegraf: 2020-03-12T02:06:40Z W! [outputs.influxdb] Metric buffer overflow; 7732 metrics have been dropped

telegraf config:

[agent]
metric_buffer_limit = 15000

[[outputs.influxdb]]
urls = ["http://monitoring.amgen.com:18086"]
database = "aggregator"
retention_policy = ""
write_consistency = "any"
timeout = "5s"

[[outputs.influxdb]]
urls = ["http://monitoring.amgen.com:28086"]
database = "aggregator"
retention_policy = ""
write_consistency = "any"
timeout = "5s"

[[inputs.kafka_consumer]]
brokers = ["bk1.monitoring.amgen.com:9092", "bk2.monitoring.amgen.com:9093"]
topics = ["telegraf"]
consumer_group = "telegraf_metrics_consumers"
offset = "oldest"
data_format = "influx"
max_message_len = 65536

[[inputs.kafka_consumer_legacy]]
topics = ["telegraf"]
zookeeper_peers = ["zk4.monitoring.devops.amgen.com:2181", "zk5.monitoring.devops.amgen.com:2181", "zk6.monitoring.devops.amgen.com:2181"]
zookeeper_chroot = ""
consumer_group = "telegraf_metrics_consumers"
offset = "newest"
data_format = "influx"
max_message_len = 6553600

I am getting the same error when trying to scrape metrics from Prometheus. Are there any updates on this, or any configuration changes that would resolve it?

Here is what I get as output:

2020-09-22T11:58:13Z W! [outputs.influxdb] Metric buffer overflow; 55040 metrics have been dropped
2020-09-22T11:58:14Z W! [outputs.influxdb] Metric buffer overflow; 16189 metrics have been dropped
2020-09-22T11:58:14Z W! [outputs.influxdb] Metric buffer overflow; 9665 metrics have been dropped
2020-09-22T11:58:15Z W! [agent] [“outputs.influxdb”] did not complete within its flush interval
2020-09-22T11:58:15Z W! [outputs.influxdb] Metric buffer overflow; 31218 metrics have been dropped
2020-09-22T11:58:15Z W! [outputs.influxdb] Metric buffer overflow; 24285 metrics have been dropped

Thanks,

-Sreeni

Hello! The metrics are stored in a fixed-size ring buffer. If the outputs aren't keeping up with the inputs, older metrics will be overwritten ("dropped") from the buffer. You can increase the buffer size with the metric_buffer_limit config option, at the cost of higher Telegraf memory usage.

The newer versions of Telegraf surface buffer overflow warnings better than earlier ones did.
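Something like this in the [agent] section of telegraf.conf would raise the buffer well above the default; the values below are only illustrative, so size them to how far your outputs fall behind:

[agent]
## Maximum number of unwritten metrics held per output before the oldest
## metrics are overwritten ("dropped"); larger values trade memory for headroom.
metric_buffer_limit = 100000
## Metrics are written in batches of at most this many points per request.
metric_batch_size = 5000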

@philjb Thanks for the reply. I tried increasing metric_buffer_limit from 30000 to 100000, but metrics are still being dropped. Someone suggested using the influxdb_v2 output plugin with the content_encoding = gzip option. We are running InfluxDB 1.7.6, and I am not sure whether the v2 plugin would be compatible. While looking through its options just to try it out, I also can't find where to specify the database name in the v2 plugin… any ideas? Can I use the same syntax as the outputs.influxdb (v1) plugin?

Thanks,

-Sreeni

The v2 output plugin is for InfluxDB 2. I suspect you are using a 1.x version. The v1 output plugin supports gzip already.

Using gzip will help push write requests out of Telegraf faster. If you are still running into problems, I would increase the buffer size again. It is also worth checking the network connection speed between Telegraf and InfluxDB.
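For example, gzip can be enabled on your existing v1 output with the content_encoding option, roughly like this (URL and database taken from the config posted above):

[[outputs.influxdb]]
urls = ["http://monitoring.amgen.com:18086"]
database = "aggregator"
retention_policy = ""
write_consistency = "any"
timeout = "5s"
## Compress the body of write requests; "gzip" or "identity" (no compression).
content_encoding = "gzip"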