SNMP Data Collection Pipeline Questions

I’m looking at using Telegraf for SNMP data collection. Based on what I saw in the forum and other documentation, I’ve broken it down so that I have one host per conf file.

This is what I have so far, mostly for testing purposes.

[[inputs.snmp]]
  agents = [ "udp://testing-host:161" ]
  interval = "30s"

  ## SNMP version; can be 1, 2, or 3.
  version = 2

  ## SNMP community string.
  community = "public"

  ## Number of retries to attempt.
  retries = 5

  [inputs.snmp.tags]
    sourceId = "${SOURCE_ID}"
    type = "snmp"
    projectId = "1"
    collectionId = "1"

  [[inputs.snmp.table]]
    name = "interface"
    index_as_tag = true

    [[inputs.snmp.table.field]]
      oid = "IF-MIB::ifHCInOctets"
      # is_tag = true

    [[inputs.snmp.table.field]]
      oid = "IF-MIB::ifHCOutOctets"

Right now I have 160 hosts, but the number will go up once we go live with everything.

Sample output:

{
  "fields": { "ifHCInOctets": 8713798302, "ifHCOutOctets": 60086004480 },
  "name": "interface",
  "tags": {
    "collectionId": "1",
    "collector": "6d9533786514",
    "date": "Tue Sep 29 23:20:02 +0000 UTC 2020",
    "device": "testing-host",
    "oidIndex": "18001",
    "projectId": "1",
    "site": "elkhorn",
    "sourceId": "10",
    "type": "snmp"
  },
  "timestamp": 1601421602
}



Problem 1.

I need to convert the data format so that each data point becomes a single message. In this example I’m extracting ifHCInOctets and ifHCOutOctets, and I would like each of these to be a separate message that looks something like this:

{
  "floatValue": null,
  "integerValue": 8713798302,
  "stringValue": null,
  "bytesValue": null,
  "tags": {
    "type": "snmp",
    "device": "testing-host",
    "projectId": "1",
    "collectionId": "1",
    "oidIndex": "65.516.2",
    "oid": "IF-MIB::ifHCOutOctets",
    "sourceId": "10"
  }
}

In this case I basically want to take one event and create N events out of it. Is this something that can be done in Telegraf, or does it need some custom code or a plugin?

Problem 2.

This is a bit more generic, I think, but I need to extract part of the oidIndex of the value being sent. I can figure out the math, but I can’t find a way to pass the OID index of the value along at all.

Is there a way to get that working with existing plugins? Am I trying to do more with Telegraf than it was designed to do?

Problem 3.

To get all of this working correctly, is the recommended pattern to create my own plugin to do all this work (assuming the answer to the last set of questions in #2 isn’t a blatant no)?

If so, are there any recommended guides for writing plugins, or any community repos of unofficial plugins that didn’t make it into mainstream that I could look through?

Any help would be greatly appreciated.

Hi csgeek,

Please have a look at the output plugins.

Each SNMP walk or get is written out as a separate metric, though. For example, you can use the Prometheus plugin as the output and connect to http://:port/metrics to view the output, or write it to a file. This will then show you that each metric has its own line and timestamp.

We gather SNMP information from 2000+ devices across multiple platforms. We use multiple Docker containers because we wanted to minimize scraping time, but that only matters, of course, if you use Prometheus as an output, so it depends on your backend.

You can use processors to modify metric fields and values. The regex processor is especially handy for that.

Hope this answers your questions.


I ended up creating a configuration per device, with a separate config for each interval. It took a bit to write some code to auto-generate the correct config, but this is working great so far.
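For anyone wanting to do something similar, here is a rough Python sketch of that kind of generator. It is not the actual code: the function names and the file naming scheme are made up for illustration, and the options simply mirror the test config posted at the top of the thread.

```python
def render_config(host, interval, community="public"):
    """Render one [[inputs.snmp]] block for a single host and interval.

    The options mirror the test config from the first post; the tags and
    table sections are left out to keep the sketch short.
    """
    return (
        "[[inputs.snmp]]\n"
        f'  agents = [ "udp://{host}:161" ]\n'
        f'  interval = "{interval}"\n'
        "  version = 2\n"
        f'  community = "{community}"\n'
        "  retries = 5\n"
    )


def config_filename(host, interval):
    # Hypothetical naming scheme: one file per host and interval.
    return f"{host}_{interval}.conf"


def build_configs(hosts, intervals):
    """Return a mapping of file name -> config text for every host/interval pair."""
    return {
        config_filename(host, interval): render_config(host, interval)
        for host in hosts
        for interval in intervals
    }
```

Writing each entry of the returned mapping into a config directory, and pointing Telegraf at that directory, would then give one file per device and interval.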

Compared with a previous attempt using Logstash, Telegraf is performing incredibly well.

The format conversion options were limited. I looked at the Telegraf external plugins, but they were too limiting, so I’m using a GCP Cloud Function (i.e. the equivalent of an AWS Lambda) to convert to the right format.
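To illustrate the conversion, here is a minimal Python sketch that turns one Telegraf JSON metric into one message per field, in the shape asked for in Problem 1. It is not the actual cloud function: `split_metric` is a hypothetical helper, and it assumes the "oid" tag can simply carry the field name, since the MIB prefix (IF-MIB::) is not part of the JSON metric.

```python
def split_metric(metric):
    """Split one Telegraf JSON metric into one message per field."""
    messages = []
    for field_name, value in metric["fields"].items():
        message = {
            "floatValue": None,
            "integerValue": None,
            "stringValue": None,
            "bytesValue": None,
        }
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, int) and not isinstance(value, bool):
            message["integerValue"] = value
        elif isinstance(value, float):
            message["floatValue"] = value
        else:
            message["stringValue"] = str(value)

        # Copy the tags so the original metric is left untouched, then
        # record which field this message came from (assumption: the
        # field name stands in for the full OID name).
        tags = dict(metric.get("tags", {}))
        tags["oid"] = field_name
        message["tags"] = tags
        messages.append(message)
    return messages
```

Fed the sample output from the first post, this would return two messages, one for ifHCInOctets and one for ifHCOutOctets, each carrying the original tags including oidIndex.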

Thanks for the help.