Getting meaningful error data from Telegraf

michal · February 18, 2025, 2:16pm

Hey everyone!

I’m new here and to Telegraf too. I’m struggling to find out how to get meaningful error data from Telegraf to display in our SNMP dashboard.

For example, I’d like to get data about devices that are unreachable (SNMP input) because they are down. I’d like to know exactly which agent has issues and what those issues are.

I tried to get this info from the inputs.internal plugin, but all I can find is an error count in “internal_gather” that keeps increasing. It doesn’t say anything else; it is just a number that increments. I’d like to see at least which agent is failing, but ideally also the error message etc.

Can you think of any way to solve this, please?

Here is my (for now simple, 1 agent) config:

[agent]
  interval = "1m"

[[inputs.snmp]]
  path = ["/usr/share/snmp/mibs"]
  agents = ["snmp_simulator:161"]
  timeout = "5s"
  version = 2
  community = "public"

  [[inputs.snmp.field]]
    oid = "1.3.6.1.4.1.4096.10000.1.1"
    name = "utilLoad"

[[inputs.internal]]
  collect_memstats = true
  collect_gostats = false

[[outputs.postgresql]]
  connection="host=db port=5432 user=admin password=admin sslmode=disable dbname=db"

michal · February 19, 2025, 2:04pm

I got a reply from @Hipska in the community Slack.

He suggests setting the alias property under [[inputs.snmp]] to the device's hostname. If you use the Internal Input Plugin, this alias then appears in the internal_gather table, so you can check which hostname errored.
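For anyone finding this later, here is a minimal sketch of how I read that suggestion, based on my config above. The alias value is just my simulator's hostname, and I am assuming one [[inputs.snmp]] block per device, since the error counter in internal_gather is kept per plugin instance rather than per agent.

```toml
# One [[inputs.snmp]] block per device, so the alias maps to exactly one host.
[[inputs.snmp]]
  # alias is a generic Telegraf plugin option; here it carries the hostname.
  alias = "snmp_simulator"
  agents = ["snmp_simulator:161"]
  timeout = "5s"
  version = 2
  community = "public"

  [[inputs.snmp.field]]
    oid = "1.3.6.1.4.1.4096.10000.1.1"
    name = "utilLoad"

# The internal plugin reports per-plugin gather statistics,
# including the errors counter, tagged with the alias set above.
[[inputs.internal]]
  collect_memstats = true
```

With this in place, a rising errors value in internal_gather can be matched to a hostname through the alias. It still does not give the error message itself.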

Also, to detect failures, one can look at SNMP objects such as Uptime: if no data is available for a hostname after the last SNMP poll, most likely something is wrong with the device.
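A sketch of the uptime idea, assuming the device exposes the standard sysUpTime object (OID 1.3.6.1.2.1.1.3.0 from the MIB-2 system group). The field name "uptime" is my own choice.

```toml
[[inputs.snmp]]
  agents = ["snmp_simulator:161"]
  timeout = "5s"
  version = 2
  community = "public"

  # sysUpTime.0: time since the device's SNMP agent was last started.
  # If this field stops arriving for a host, the last poll most likely failed.
  [[inputs.snmp.field]]
    oid = "1.3.6.1.2.1.1.3.0"
    name = "uptime"
```

The dashboard can then flag any host whose most recent uptime value is older than the polling interval (1m in my case).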

He uses a combination of both, plus the Ping Input Plugin, to get even more information on the devices’ health.
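For the ping part, something like the following might work as a starting point. The host name is from my test setup and the count and timeout values are arbitrary; I have not checked what he uses himself.

```toml
[[inputs.ping]]
  # Hosts to ping; ideally the same devices that are polled over SNMP.
  urls = ["snmp_simulator"]
  # "native" sends ICMP from Telegraf itself instead of calling the ping binary.
  # It may need extra privileges (e.g. CAP_NET_RAW on Linux).
  method = "native"
  # Number of pings per interval and the per-ping timeout in seconds.
  count = 3
  timeout = 2.0
```

The resulting ping measurement includes fields such as percent_packet_loss and result_code, which help tell a device that is down apart from one whose SNMP agent is not answering.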

An image he provided:

Thank you Hipska once again!
