TELEGRAF - metric_batch_size & metric_buffer_limit not working

Hi!
I’m using

Telegraf 1.31.2
CentOS Linux release 7.9.2009

With this relevant telegraf.conf

[agent]

  ## Telegraf will send metrics to outputs in batches of at most
  ## metric_batch_size metrics.
  ## This controls the size of writes that Telegraf sends to output plugins.
  metric_batch_size = 500

  ## Maximum number of unwritten metrics per output.  Increasing this value
  ## allows for longer periods of output downtime without dropping metrics at the
  ## cost of higher maximum memory usage.
  metric_buffer_limit = 2500

Running telegraf with:

/usr/bin/telegraf --config-directory /home/pdfraire/telegraf/prod --once --debug

2024-08-15T12:37:12Z I! Starting Telegraf 1.31.2 brought to you by InfluxData the makers of InfluxDB
2024-08-15T12:37:12Z I! Available plugins: 234 inputs, 9 aggregators, 32 processors, 26 parsers, 60 outputs, 6 secret-stores
2024-08-15T12:37:12Z I! Loaded inputs: ping snmp (247x)
2024-08-15T12:37:12Z I! Loaded aggregators:
2024-08-15T12:37:12Z I! Loaded processors: converter strings
2024-08-15T12:37:12Z I! Loaded secretstores:
2024-08-15T12:37:12Z I! Loaded outputs: influxdb sql
2024-08-15T12:37:12Z I! Tags enabled: host=arys
2024-08-15T12:37:12Z D! [agent] Initializing plugins
2024-08-15T12:37:28Z D! [agent] Connecting outputs
2024-08-15T12:37:28Z D! [agent] Attempting connection to [outputs.influxdb]
2024-08-15T12:37:28Z D! [agent] Successfully connected to outputs.influxdb
2024-08-15T12:37:28Z D! [agent] Attempting connection to [outputs.sql]
2024-08-15T12:37:28Z D! [agent] Successfully connected to outputs.sql
2024-08-15T12:37:28Z D! [agent] Starting service inputs
2024-08-15T12:37:38Z D! [inputs.ping] no packets received
2024-08-15T12:37:38Z D! [inputs.ping] no packets received
2024-08-15T12:38:17Z D! [outputs.influxdb]  Wrote batch of 1000 metrics in 103.087287ms
2024-08-15T12:38:17Z D! [outputs.influxdb]  Buffer fullness: 13 / 10000 metrics
2024-08-15T12:38:17Z D! [outputs.sql]  Buffer fullness: 1077 / 10000 metrics
2024-08-15T12:38:49Z D! [agent] Stopping service inputs
2024-08-15T12:38:49Z D! [agent] Input channel closed
2024-08-15T12:38:49Z D! [agent] Processor channel closed
2024-08-15T12:38:49Z D! [agent] Processor channel closed
2024-08-15T12:38:49Z I! [agent] Hang on, flushing any cached metrics before shutdown
2024-08-15T12:38:49Z D! [outputs.influxdb]  Wrote batch of 432 metrics in 38.213993ms
2024-08-15T12:38:49Z D! [outputs.influxdb]  Buffer fullness: 0 / 10000 metrics
2024-08-15T12:38:49Z D! [outputs.sql]  Buffer fullness: 1411 / 10000 metrics
2024-08-15T12:38:49Z I! [agent] Stopping running outputs
2024-08-15T12:38:49Z D! [agent] Stopped Successfully

It looks as if it is ignoring the config settings and using default values such as:
metric_batch_size = 1000
metric_buffer_limit = 10000

2024-08-15T12:38:17Z D! [outputs.influxdb] Wrote batch of 1000 metrics in 103.087287ms
2024-08-15T12:38:49Z D! [outputs.sql] Buffer fullness: 1411 / 10000 metrics

Is this an error, or is there something I’m not doing correctly?

Regards!

That is because you are using --once. This is a special mode that runs the inputs and outputs once, and not all agent settings apply.

Correction: this was wrong. metric_buffer_limit does apply with --once, so it is very likely that you are not loading the config file you think you are.

OK.

I'll run it without --once and check the results.

Thanks

Running without --once:
/usr/bin/telegraf --config-directory /home/pdfraire/telegraf/prod --debug

with the same telegraf.conf, shows the same values:

2024-08-15T13:05:39Z I! Starting Telegraf 1.31.2 brought to you by InfluxData the makers of InfluxDB
2024-08-15T13:05:39Z I! Available plugins: 234 inputs, 9 aggregators, 32 processors, 26 parsers, 60 outputs, 6 secret-stores
2024-08-15T13:05:39Z I! Loaded inputs: ping snmp (247x)
2024-08-15T13:05:39Z I! Loaded aggregators: derivative
2024-08-15T13:05:39Z I! Loaded processors: converter strings
2024-08-15T13:05:39Z I! Loaded secretstores:
2024-08-15T13:05:39Z I! Loaded outputs: influxdb sql
2024-08-15T13:05:39Z I! Tags enabled: host=arys
2024-08-15T13:05:39Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"arys", Flush Interval:5m0s
2024-08-15T13:05:39Z D! [agent] Initializing plugins
2024-08-15T13:06:01Z D! [agent] Connecting outputs
2024-08-15T13:06:01Z D! [agent] Attempting connection to [outputs.influxdb]
2024-08-15T13:06:01Z D! [agent] Successfully connected to outputs.influxdb
2024-08-15T13:06:01Z D! [agent] Attempting connection to [outputs.sql]
2024-08-15T13:06:01Z D! [agent] Successfully connected to outputs.sql
2024-08-15T13:06:01Z D! [agent] Starting service inputs
2024-08-15T13:11:07Z D! [outputs.sql]  Wrote batch of 569 metrics in 3.106515722s
2024-08-15T13:11:07Z D! [outputs.sql]  Buffer fullness: 0 / 10000 metrics
2024-08-15T13:11:11Z D! [outputs.influxdb]  Wrote batch of 590 metrics in 83.607264ms
2024-08-15T13:11:11Z D! [outputs.influxdb]  Buffer fullness: 0 / 10000 metrics
2024-08-15T13:16:12Z D! [outputs.sql]  Wrote batch of 569 metrics in 3.329626711s
2024-08-15T13:16:12Z D! [outputs.sql]  Buffer fullness: 0 / 10000 metrics
2024-08-15T13:16:19Z D! [outputs.influxdb]  Wrote batch of 590 metrics in 87.244725ms
2024-08-15T13:16:19Z D! [outputs.influxdb]  Buffer fullness: 0 / 10000 metrics
2024-08-15T13:21:21Z D! [outputs.sql]  Wrote batch of 857 metrics in 4.620559589s
2024-08-15T13:21:21Z D! [outputs.sql]  Buffer fullness: 0 / 10000 metrics
2024-08-15T13:21:25Z D! [outputs.influxdb]  Wrote batch of 878 metrics in 88.522289ms
2024-08-15T13:21:25Z D! [outputs.influxdb]  Buffer fullness: 0 / 10000 metrics
2024-08-15T13:22:38Z I! [agent] Hang on, flushing any cached metrics before shutdown
2024-08-15T13:22:38Z D! [outputs.influxdb]  Wrote batch of 288 metrics in 13.600981ms
2024-08-15T13:22:38Z D! [outputs.influxdb]  Buffer fullness: 0 / 10000 metrics
2024-08-15T13:22:39Z D! [outputs.sql]  Wrote batch of 288 metrics in 1.519625327s
2024-08-15T13:22:39Z D! [outputs.sql]  Buffer fullness: 0 / 10000 metrics

2024-08-15T13:21:25Z D! [outputs.influxdb] Wrote batch of 878 metrics in 88.522289ms
2024-08-15T13:21:25Z D! [outputs.influxdb] Buffer fullness: 0 / 10000 metrics

Regards

Are you sure your config file is in that config directory? There should be lines above this that say what config files were loaded:

2024-08-15T13:40:49Z I! Loading config: config.toml
2024-08-15T13:40:49Z I! Starting Telegraf 1.32.0-371b9887 brought to you by InfluxData the makers of InfluxDB
2024-08-15T13:40:49Z I! Available plugins: 234 inputs, 9 aggregators, 32 processors, 26 parsers, 62 outputs, 6 secret-stores
2024-08-15T13:40:49Z I! Loaded inputs: exec

Yes, I removed the lines regarding the loading of config files.

2024-08-15T13:05:39Z I! Loading config: /home/pdfraire/telegraf/prod/exprobe/exprobeIII/ARCCBEL_SC_LSW_PS15_170/INPUTS_UCD-SNMP-MIB-fileTable.conf
2024-08-15T13:05:39Z I! Loading config: /home/pdfraire/telegraf/prod/.../INPUTS_UCD-SNMP-MIB-lmFanSensorsTable.conf
2024-08-15T13:05:39Z I! Loading config: /home/pdfraire/telegraf/prod/.../INPUTS_UCD-SNMP-MIB-lmTempSensorsTable.conf
2024-08-15T13:05:39Z I! Loading config: /home/pdfraire/telegraf/prod/.../INPUTS_UCD-SNMP-MIB-lmVoltSensorsTable.conf
2024-08-15T13:05:39Z I! Loading config: /home/pdfraire/telegraf/prod/.../INPUTS_UCD-SNMP-MIB-prTable.conf
2024-08-15T13:05:39Z I! Loading config: /home/pdfraire/telegraf/prod/telegraf.conf

When asking for help, it is best to show the complete logs.

Do you have any other agent sections in any of those config files?

And if you load just that config file what happens?

Hi @jpowers !

Sorry for not sending everything; I thought it better to focus on what seemed relevant to me.

I have uploaded the full log here.

telegraf.txt (416.5 KB)

I have only one main telegraf.conf with the Telegraf configuration.

And many plugin config files, one for each plugin.

2024-08-15T13:05:39Z I! Loaded inputs: ping snmp (247x)
2024-08-15T13:05:39Z I! Loaded aggregators: derivative
2024-08-15T13:05:39Z I! Loaded processors: converter strings
2024-08-15T13:05:39Z I! Loaded secretstores:
2024-08-15T13:05:39Z I! Loaded outputs: influxdb sql

248 INPUT, 1 AGGREGATOR, 2 PROCESSORS, 2 OUTPUTS config files.

Wow, that’s a lot of files :slight_smile:

In general when someone says their settings are not applied it means either the file is not getting read in, or there is a second agent config setting somewhere else that is overriding things. You shouldn’t technically have two agent config handlers as it can lead to unexpected behavior.

I assume the agent settings are in the /home/pdfraire/telegraf/prod/telegraf.conf file? Can you try loading just these three files, so you have the agent settings, an input, and an output:

  • /home/pdfraire/telegraf/prod/telegraf.conf
  • /home/pdfraire/telegraf/prod/INPUTS_PING.conf
  • /home/pdfraire/telegraf/prod/OUTPUT-INFLUX.conf

Something like:

telegraf --debug --config /home/pdfraire/telegraf/prod/telegraf.conf --config /home/pdfraire/telegraf/prod/INPUTS_PING.conf --config /home/pdfraire/telegraf/prod/OUTPUT-INFLUX.conf

I thought I was following best practice recommendations :wink:
One config file for each monitored agent in order to avoid performance bottlenecks, and one for each measurement in order to keep order :slight_smile:

I copied the 3 files to my test directory.

Here you have the limited test:

2024-08-15T14:32:00Z I! Loading config: /home/pdfraire/telegraf/test/INPUTS_PING.conf
2024-08-15T14:32:00Z I! Loading config: /home/pdfraire/telegraf/test/OUTPUT-INFLUX.conf
2024-08-15T14:32:00Z I! Loading config: /home/pdfraire/telegraf/test/telegraf.conf
2024-08-15T14:32:00Z I! Starting Telegraf 1.31.2 brought to you by InfluxData the makers of InfluxDB
2024-08-15T14:32:00Z I! Available plugins: 234 inputs, 9 aggregators, 32 processors, 26 parsers, 60 outputs, 6 secret-stores
2024-08-15T14:32:00Z I! Loaded inputs: ping
2024-08-15T14:32:00Z I! Loaded aggregators:
2024-08-15T14:32:00Z I! Loaded processors:
2024-08-15T14:32:00Z I! Loaded secretstores:
2024-08-15T14:32:00Z I! Loaded outputs: influxdb
2024-08-15T14:32:00Z I! Tags enabled: host=arys
2024-08-15T14:32:00Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"arys", Flush Interval:5m0s
2024-08-15T14:32:00Z D! [agent] Initializing plugins
2024-08-15T14:32:00Z D! [agent] Connecting outputs
2024-08-15T14:32:00Z D! [agent] Attempting connection to [outputs.influxdb]
2024-08-15T14:32:00Z D! [agent] Successfully connected to outputs.influxdb
2024-08-15T14:32:00Z D! [agent] Starting service inputs
2024-08-15T14:35:10Z D! [inputs.ping] no packets received
2024-08-15T14:35:10Z D! [inputs.ping] no packets received
2024-08-15T14:37:05Z D! [outputs.influxdb]  Wrote batch of 21 metrics in 6.682514ms
2024-08-15T14:37:05Z D! [outputs.influxdb]  Buffer fullness: 0 / 10000 metrics
2024-08-15T14:40:14Z D! [inputs.ping] no packets received
2024-08-15T14:40:14Z D! [inputs.ping] no packets received
2024-08-15T14:42:15Z D! [outputs.influxdb]  Wrote batch of 21 metrics in 4.879443ms
2024-08-15T14:42:15Z D! [outputs.influxdb]  Buffer fullness: 0 / 10000 metrics
2024-08-15T14:45:14Z D! [inputs.ping] no packets received
2024-08-15T14:45:14Z D! [inputs.ping] no packets received
2024-08-15T14:47:23Z D! [outputs.influxdb]  Wrote batch of 21 metrics in 6.246848ms
2024-08-15T14:47:23Z D! [outputs.influxdb]  Buffer fullness: 0 / 10000 metrics

The metrics are few so the batch limit is not violated, but the buffer size looks like 10000 instead of 2500 as set in the config file.

You are - that’s just a lot :slight_smile:

the buffer size looks like 10000 instead of 2500 as set in the config file.

Yep, can you please share those three files, with any secrets or passwords removed?

Sure!

OUTPUT-INFLUX_conf.txt (4.0 KB)
INPUTS_PING_conf.txt (2.4 KB)
telegraf_conf.txt (5.4 KB)

This is an unfortunate consequence of reading the telegraf.conf file last. Because the outputs have already been read in, configured, and set up by then, they use the default value. Your telegraf.conf file is then read in last, and any new outputs would use its values, but the existing outputs do not.

You can tell that the file is read in because your flush interval is changed to the non-default 5 minutes (Flush Interval:5m0s). Because that setting is agent-wide rather than output-specific, the change takes effect.

The workaround would be to move the file up, out of prod, and specify it first, like the following:

telegraf --config /home/pdfraire/telegraf/telegraf.conf --config-directory /home/pdfraire/telegraf/prod

OK!!

Great discovery.
My second issue regarding the location of telegraf.conf :crazy_face:

I'll do what you are proposing.

Thanks!

It worked, telegraf.conf was loaded first and parameters are as they are set in it.

2024-08-15T17:09:47Z I! Loading config: /home/pdfraire/telegraf/conf/telegraf.conf
2024-08-15T17:09:47Z I! Loading config: /home/pdfraire/telegraf/test/INPUTS_PING.conf
2024-08-15T17:09:47Z I! Loading config: /home/pdfraire/telegraf/test/OUTPUT-INFLUX.conf
2024-08-15T17:09:47Z I! Starting Telegraf 1.31.2 brought to you by InfluxData the makers of InfluxDB
2024-08-15T17:09:47Z I! Available plugins: 234 inputs, 9 aggregators, 32 processors, 26 parsers, 60 outputs, 6 secret-stores
2024-08-15T17:09:47Z I! Loaded inputs: ping
2024-08-15T17:09:47Z I! Loaded aggregators:
2024-08-15T17:09:47Z I! Loaded processors:
2024-08-15T17:09:47Z I! Loaded secretstores:
2024-08-15T17:09:47Z I! Loaded outputs: influxdb
2024-08-15T17:09:47Z I! Tags enabled: host=arys
2024-08-15T17:09:47Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"arys", Flush Interval:5m0s
2024-08-15T17:09:47Z D! [agent] Initializing plugins
2024-08-15T17:09:47Z D! [agent] Connecting outputs
2024-08-15T17:09:47Z D! [agent] Attempting connection to [outputs.influxdb]
2024-08-15T17:09:47Z D! [agent] Successfully connected to outputs.influxdb
2024-08-15T17:09:47Z D! [agent] Starting service inputs
2024-08-15T17:10:14Z D! [inputs.ping] no packets received
2024-08-15T17:10:14Z D! [inputs.ping] no packets received
2024-08-15T17:14:48Z D! [outputs.influxdb]  Wrote batch of 21 metrics in 6.427322ms
2024-08-15T17:14:48Z D! [outputs.influxdb]  Buffer fullness: 0 / 2500 metrics
2024-08-15T17:15:10Z D! [inputs.ping] no packets received
2024-08-15T17:15:10Z D! [inputs.ping] no packets received
2024-08-15T17:16:17Z D! [outputs.influxdb]  Wrote batch of 21 metrics in 4.483191ms
2024-08-15T17:16:17Z D! [outputs.influxdb]  Buffer fullness: 0 / 2500 metrics

Thanks again!

@jpowers
One more question.

Is it possible to redefine these two parameters in an output conf file?

metric_batch_size
metric_buffer_limit

I am using InfluxDB for monitoring metrics (querying every 5 minutes) and MySQL for inventory (querying the devices just once a day, but with a lot more data), so I would rather have different values for each output.

Yes, you can set those options for each output; see telegraf/docs/CONFIGURATION.md in the influxdata/telegraf repository on GitHub.
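
As a minimal sketch of what per-output overrides might look like for the two outputs in this thread: the option names are the same metric_batch_size and metric_buffer_limit, placed inside each output's own section. The numbers below are placeholders chosen only for illustration, not recommendations, and the connection settings are elided.

```toml
# Hypothetical OUTPUT-INFLUX.conf fragment: small, frequent writes.
[[outputs.influxdb]]
  # ... existing connection settings stay as they are ...
  metric_batch_size = 500      # placeholder value
  metric_buffer_limit = 2500   # placeholder value

# Hypothetical SQL output fragment: larger daily inventory writes.
[[outputs.sql]]
  # ... existing driver and data source settings stay as they are ...
  metric_batch_size = 1000     # placeholder value
  metric_buffer_limit = 20000  # placeholder value
```

With settings like these, the "Buffer fullness: N / LIMIT" debug lines for each output should show that output's own limit, which is an easy way to check that the values were picked up.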

Sorry, I forgot to check that part of the docs.
