Error handling during collection

revanth · October 29, 2020, 3:15pm

I am using the snmp input plugin to poll 10+ devices/IPs. What is the behavior when one of the IPs in the list is not responding?

Does it impact metric collection for all the other IPs?
Is there a way to capture the fact that a device is down?

When I tested this, metrics were collected for all other IPs except the faulty one. Below is the error on the console:

2020-10-29T14:21:35Z E! [inputs.snmp] Error in plugin: agent xx.xx.xx.xx: performing get on field devicename: Request timeout (after 3 retries)
2020-10-29T14:22:15Z E! [inputs.snmp] Error in plugin: agent xx.xx.xx.xx: gathering table snmp_pilot: performing bulk walk for field ifName: Request timeout (after 3 retries)
2020-10-29T14:22:15Z D! [agent] Stopping service inputs
2020-10-29T14:22:15Z D! [agent] Input channel closed
2020-10-29T14:22:15Z D! [agent] Stopped Successfully
2020-10-29T14:22:15Z E! [telegraf] Error running agent: input plugins recorded 2 errors

How can I get this error into InfluxDB to build a dashboard around unreachable IPs? Or is there a more efficient approach to monitoring these errors?

revanth · November 2, 2020, 8:08pm

I was able to use the inputs.tail plugin to read the Telegraf logs, and then used processors.regex to extract the information I needed.
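For anyone trying the same tail-plus-regex approach, here is a rough sketch of what such a config might look like. It is not the exact config I used: the log file path, the measurement name telegraf_log, the result field name agent, and the regex itself are all assumptions that you would need to adapt. It reads each log line as a single string field and then copies the agent address from snmp error lines into a separate field.

```toml
# Sketch only: the log path, measurement name, and regex below are assumptions.

[agent]
  # Telegraf must write its log to a file so that inputs.tail can read it.
  logfile = "/var/log/telegraf/telegraf.log"

[[inputs.tail]]
  # Follow the same file the agent logs to (assumed path).
  files = ["/var/log/telegraf/telegraf.log"]
  from_beginning = false
  # Keep each log line whole as a string field named "value".
  data_format = "value"
  data_type = "string"
  # Hypothetical measurement name for the tailed log lines.
  name_override = "telegraf_log"

[[processors.regex]]
  # Only touch the metrics produced by the tail input above.
  namepass = ["telegraf_log"]

  [[processors.regex.fields]]
    key = "value"
    # Capture the agent address from lines such as:
    #   E! [inputs.snmp] Error in plugin: agent xx.xx.xx.xx: performing get ...
    pattern = '^.*\[inputs\.snmp\] Error in plugin: agent (\S+): .*$'
    replacement = "${1}"
    # Write the captured address to a new field instead of overwriting "value".
    result_key = "agent"
```

With something like this in place, a dashboard query could group telegraf_log rows by the agent field to list unreachable IPs. Log lines that are not snmp errors would still be stored with only the value field, so some extra filtering is probably needed.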
