Telegraf SNMP not writing data to Influx

telegraf

#1

Hi,

I'm new to Influx, Telegraf etc. and I’m having a few teething problems. I am running a QNAP NAS and I’d like to monitor its various stats through Grafana. The NAS has SNMP functionality and I’ve switched it on. It also has its own MIB file. Telegraf and InfluxDB are running together on a separate machine. I’ve got the basics running, but I’m now stuck because Telegraf doesn’t appear to be writing everything to Influx.

I have the following set up in the telegraf.conf file:

 [[inputs.snmp]]
   agents = [ "192.168.1.1:161" ]
#   ## Timeout for each SNMP query.
   timeout = "10s"
#   ## Number of retries to attempt within timeout.
   retries = 3
#   ## SNMP version, values can be 1, 2, or 3
   version = 2
#
#   ## SNMP community string.
   community = "public"
#
#   ## The GETBULK max-repetitions parameter
   max_repetitions = 10
#
#   ## SNMPv3 auth parameters
#   #sec_name = "myuser"
#   #auth_protocol = "md5"      # Values: "MD5", "SHA", ""
#   #auth_password = "pass"
#   #sec_level = "authNoPriv"   # Values: "noAuthNoPriv", "authNoPriv", "authPriv"
#   #context_name = ""
#   #priv_protocol = ""         # Values: "DES", "AES", ""
#   #priv_password = ""
#
#   ## measurement name
   name = "snmp"
#   [[inputs.snmp.field]]
#     name = "hostname"
#     oid = ".1.0.0.1.1"
#   [[inputs.snmp.field]]
#     name = "uptime"
#     oid = ".1.0.0.1.2"
#   [[inputs.snmp.field]]
#     name = "load"
#     oid = ".1.0.0.1.3"
#   [[inputs.snmp.field]]
#     oid = "HOST-RESOURCES-MIB::hrMemorySize"
#
#   [[inputs.snmp.table]]
#     ## measurement name
#     name = "NAS"
#     oid = ".1.3.6.1.4.1.24681.1.2.17"
#    # inherit_tags = [ "hostname" ]
#     [[inputs.snmp.table.field]]
#       name = "disk_name"
#       oid = ".1.3.6.1.4.1.24681.1.2.17"
#       is_tag = true
#     [[inputs.snmp.table.field]]
#       name = "nas_fields"
#       oid = ".1.3.6.1.4.1.24681.1.2.17"
#     [[inputs.snmp.table.field]]
#       name = "latency"
#       oid = ".1.0.0.0.1.2"
#
   [[inputs.snmp.table]]
     ## auto populate table's fields using the MIB
     oid = "HOST-RESOURCES-MIB:hrStorageTable"

   [[inputs.snmp.table]]
     ## auto populate table's fields using the MIB
     oid = ".1.3.6.1.4.1.24681.1.2.17"

Yes I know I have a lot of commented out stuff but this is because I’m still learning about this and don’t know if any of it will be useful later on.
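
For anyone skimming, here is a minimal sketch of what is actually active in the config above once the commented-out lines are dropped. Nothing here is new: the values are copied from the posted file, and only the indentation and the comments are added for readability.

```toml
[[inputs.snmp]]
  # The NAS being polled, on the standard SNMP port
  agents = [ "192.168.1.1:161" ]
  timeout = "10s"
  retries = 3
  version = 2
  community = "public"
  max_repetitions = 10

  # Measurement name for top-level fields (none are active here)
  name = "snmp"

  # Standard host-resources storage table, fields auto-populated from the MIB.
  # Written as posted; the more common spelling uses a double colon
  # (HOST-RESOURCES-MIB::hrStorageTable).
  [[inputs.snmp.table]]
    oid = "HOST-RESOURCES-MIB:hrStorageTable"

  # QNAP-specific table, given as a numeric OID. Telegraf still needs the
  # vendor MIB to be resolvable in order to work out the table's columns.
  [[inputs.snmp.table]]
    oid = ".1.3.6.1.4.1.24681.1.2.17"
```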

When I look at hrStorageTable, it is able to pull data from my NAS, although I believe this is a generic OID rather than one specific to my NAS?! I can then show this correctly in Grafana, so I know my Telegraf output is working. However, if I select the bottom OID, it brings back my systemVolumeTable and lets me select the identifier and the various other options (e.g. storage free or storage capacity), but it isn’t recording any data to Influx.

If I run snmpwalk against that OID, it brings back all the correct information. Why isn’t it writing to Influx?

I should add that I have the NAS.mib file saved in a $HOME directory that Telegraf is looking at, so I don’t think that is the problem.


#2

What does telegraf --input-filter snmp --test print?


#3

It prints:

> hrStorageTable,agent_host=192.168.1.1,host=NetworkMonitor,hrStorageIndex=10 hrStorageAllocationUnits=1024i,hrStorageDescr="Swap space",hrStorageSize=24542452i,hrStorageType=".1.3.6.1.2.1.25.2.1.3",hrStorageUsed=3669732i 1530918900000000000
> hrStorageTable,agent_host=192.168.1.1,host=NetworkMonitor,hrStorageIndex=36 hrStorageAllocationUnits=4096i,hrStorageDescr="/share/NFSv=4/Public",hrStorageSize=1102300398i,hrStorageType=".1.3.6.1.2.1.25.2.1.4",hrStorageUsed=1065382343i 1530918900000000000
> hrStorageTable,agent_host=192.168.1.1,host=NetworkMonitor,hrStorageIndex=39 hrStorageAllocationUnits=4096i,hrStorageDescr="/lib/modules/4.2.8/container-station",hrStorageSize=1102300398i,hrStorageType=".1.3.6.1.2.1.25.2.1.4",hrStorageUsed=1065382343i 1530918900000000000
> hrStorageTable,agent_host=192.168.1.1,host=NetworkMonitor,hrStorageIndex=1 hrStorageAllocationUnits=1024i,hrStorageDescr="Physical memory",hrStorageSize=3947496i,hrStorageType=".1.3.6.1.2.1.25.2.1.2",hrStorageUsed=3689156i 1530918900000000000
> hrStorageTable,agent_host=192.168.1.1,host=NetworkMonitor,hrStorageIndex=32 hrStorageAllocationUnits=4096i,hrStorageDescr="/sys/fs/cgroup/memory",hrStorageSize=0i,hrStorageType=".1.3.6.1.2.1.25.2.1.4",hrStorageUsed=0i 1530918900000000000
> hrStorageTable,agent_host=192.168.1.1,host=NetworkMonitor,hrStorageIndex=7 hrStorageAllocationUnits=1024i,hrStorageDescr="Cached memory",hrStorageSize=465284i,hrStorageType=".1.3.6.1.2.1.25.2.1.1",hrStorageUsed=465284i 1530918900000000000
> hrStorageTable,agent_host=192.168.1.1,host=NetworkMonitor,hrStorageIndex=37 hrStorageAllocationUnits=4096i,hrStorageDescr="/share/NFSv=4/VM",hrStorageSize=906746486i,hrStorageType=".1.3.6.1.2.1.25.2.1.4",hrStorageUsed=799159303i 1530918900000000000
> hrStorageTable,agent_host=192.168.1.1,host=NetworkMonitor,hrStorageIndex=33 hrStorageAllocationUnits=4096i,hrStorageDescr="/share/CACHEDEV1_DATA",hrStorageSize=1102300398i,hrStorageType=".1.3.6.1.2.1.25.2.1.4",hrStorageUsed=1065382343i 1530918900000000000
> hrStorageTable,agent_host=192.168.1.1,host=NetworkMonitor,hrStorageIndex=3 hrStorageAllocationUnits=1024i,hrStorageDescr="Virtual memory",hrStorageSize=28489948i,hrStorageType=".1.3.6.1.2.1.25.2.1.3",hrStorageUsed=7358888i 1530918900000000000
> hrStorageTable,agent_host=192.168.1.1,host=NetworkMonitor,hrStorageIndex=35 hrStorageAllocationUnits=4096i,hrStorageDescr="/mnt/ext",hrStorageSize=90876i,hrStorageType=".1.3.6.1.2.1.25.2.1.4",hrStorageUsed=86351i 1530918900000000000
> hrStorageTable,agent_host=192.168.1.1,host=NetworkMonitor,hrStorageIndex=6 hrStorageAllocationUnits=1024i,hrStorageDescr="Memory buffers",hrStorageSize=3947496i,hrStorageType=".1.3.6.1.2.1.25.2.1.1",hrStorageUsed=55116i 1530918900000000000
> hrStorageTable,agent_host=192.168.1.1,host=NetworkMonitor,hrStorageIndex=31 hrStorageAllocationUnits=4096i,hrStorageDescr="/mnt/HDA_ROOT",hrStorageSize=126325i,hrStorageType=".1.3.6.1.2.1.25.2.1.4",hrStorageUsed=30420i 1530918900000000000
> hrStorageTable,agent_host=192.168.1.1,host=NetworkMonitor,hrStorageIndex=8 hrStorageAllocationUnits=1024i,hrStorageDescr="Shared memory",hrStorageSize=0i,hrStorageType=".1.3.6.1.2.1.25.2.1.1" 1530918900000000000
> hrStorageTable,agent_host=192.168.1.1,host=NetworkMonitor,hrStorageIndex=38 hrStorageAllocationUnits=4096i,hrStorageDescr="/share/NFSv=4/Virtual Machines",hrStorageSize=906746486i,hrStorageType=".1.3.6.1.2.1.25.2.1.4",hrStorageUsed=799159303i 1530918900000000000
> hrStorageTable,agent_host=192.168.1.1,host=NetworkMonitor,hrStorageIndex=34 hrStorageAllocationUnits=4096i,hrStorageDescr="/share/CACHEDEV2_DATA",hrStorageSize=906746486i,hrStorageType=".1.3.6.1.2.1.25.2.1.4",hrStorageUsed=799159303i 1530918900000000000
> systemVolumeTable,agent_host=192.168.1.1,host=NetworkMonitor,sysVolumeIndex=1 sysVolumeDescr="[Volume ODIN_DataVol1, Pool 1]",sysVolumeFS="EXT4",sysVolumeFreeSize="140.32 GB",sysVolumeStatus="Unknown",sysVolumeTotalSize="4.11 TB" 1530918905000000000
> systemVolumeTable,agent_host=192.168.1.1,host=NetworkMonitor,sysVolumeIndex=2 sysVolumeDescr="[Volume ODIN_DataVol2, Pool 1]",sysVolumeFS="EXT4",sysVolumeFreeSize="409.90 GB",sysVolumeStatus="Ready",sysVolumeTotalSize="3.38 TB" 1530918905000000000

To me, this now looks like it is writing data. This is a very dumb question, but does Telegraf have to be running? If so, how do I get it running? If I run systemctl status telegraf I get the following:

● telegraf.service - The plugin-driven server agent for reporting metrics into InfluxDB
   Loaded: loaded (/lib/systemd/system/telegraf.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2018-07-05 20:16:16 BST; 1 day 4h ago
     Docs: https://github.com/influxdata/telegraf
  Process: 9336 ExecReload=/bin/kill -HUP $MAINPID (code=exited, status=0/SUCCESS)
 Main PID: 14950 (telegraf)
   CGroup: /system.slice/telegraf.service
           └─14950 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d

Jul 07 00:12:00 NetworkMonitor telegraf[14950]: 2018-07-06T23:12:00Z E! Error in plugin [inputs.snmp]: initializing table : getting table columns: exit status 1: Was that a table? SNMPv2-SMI::enterprises.24681.1.2.17
Jul 07 00:12:30 NetworkMonitor telegraf[14950]: 2018-07-06T23:12:30Z E! Error in plugin [inputs.snmp]: initializing table : getting table columns: exit status 1: Was that a table? SNMPv2-SMI::enterprises.24681.1.2.17
Jul 07 00:13:00 NetworkMonitor telegraf[14950]: 2018-07-06T23:13:00Z E! Error in plugin [inputs.snmp]: initializing table : getting table columns: exit status 1: Was that a table? SNMPv2-SMI::enterprises.24681.1.2.17
Jul 07 00:13:30 NetworkMonitor telegraf[14950]: 2018-07-06T23:13:30Z E! Error in plugin [inputs.snmp]: initializing table : getting table columns: exit status 1: Was that a table? SNMPv2-SMI::enterprises.24681.1.2.17
Jul 07 00:14:00 NetworkMonitor telegraf[14950]: 2018-07-06T23:14:00Z E! Error in plugin [inputs.snmp]: initializing table : getting table columns: exit status 1: Was that a table? SNMPv2-SMI::enterprises.24681.1.2.17
Jul 07 00:14:30 NetworkMonitor telegraf[14950]: 2018-07-06T23:14:30Z E! Error in plugin [inputs.snmp]: initializing table : getting table columns: exit status 1: Was that a table? SNMPv2-SMI::enterprises.24681.1.2.17
Jul 07 00:15:00 NetworkMonitor telegraf[14950]: 2018-07-06T23:15:00Z E! Error in plugin [inputs.snmp]: initializing table : getting table columns: exit status 1: Was that a table? SNMPv2-SMI::enterprises.24681.1.2.17
Jul 07 00:15:30 NetworkMonitor telegraf[14950]: 2018-07-06T23:15:30Z E! Error in plugin [inputs.snmp]: initializing table : getting table columns: exit status 1: Was that a table? SNMPv2-SMI::enterprises.24681.1.2.17
Jul 07 00:16:00 NetworkMonitor telegraf[14950]: 2018-07-06T23:16:00Z E! Error in plugin [inputs.snmp]: initializing table : getting table columns: exit status 1: Was that a table? SNMPv2-SMI::enterprises.24681.1.2.17
Jul 07 00:16:30 NetworkMonitor telegraf[14950]: 2018-07-06T23:16:30Z E! Error in plugin [inputs.snmp]: initializing table : getting table columns: exit status 1: Was that a table? SNMPv2-SMI::enterprises.24681.1.2.17

#4

That looks correct when you run telegraf --test. The service, on the other hand, appears to be running, but its log suggests the MIB isn’t set up right. You may just need to restart the service (service telegraf restart) if you added the MIB after it started; do this after any configuration change. Keep in mind that the Telegraf service runs as the telegraf user, which has a different $HOME directory than your regular user.


#5

Thanks Daniel, it looks as though the "Error in plugin" message has now gone; however, writing to Influx still isn’t quite working.

For example, I can query the data in Influx and note down the time of the last data point, wait 5 minutes, rerun the query and nothing has changed. If I then run telegraf from the CLI I get the following:

2018-07-07T08:04:18Z I! Starting Telegraf v1.7.0
2018-07-07T08:04:18Z I! Loaded inputs: inputs.cpu inputs.diskio inputs.mem inputs.processes inputs.system inputs.disk inputs.kernel inputs.swap inputs.snmp
2018-07-07T08:04:18Z I! Loaded aggregators:
2018-07-07T08:04:18Z I! Loaded processors:
2018-07-07T08:04:18Z I! Loaded outputs: influxdb
2018-07-07T08:04:18Z I! Tags enabled: host=NetworkMonitor
2018-07-07T08:04:18Z I! Agent Config: Interval:30s, Quiet:false, Hostname:"NetworkMonitor", Flush Interval:30s

I then query the data in influx and I’ll have an additional data point or two depending on how long that was running for.
I didn’t know if it was useful or correct but I tried service telegraf status and it came back with unknown service.
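
The "Agent Config" line in that startup output maps onto the [agent] section of telegraf.conf. Below is a minimal sketch, assuming the stock Telegraf 1.x option names. The 30s values and quiet = false are taken from the log above, while debug = true is an addition for troubleshooting (it is not something anyone in this thread set) that makes the service log more detail about what it is doing:

```toml
[agent]
  # Matches "Interval:30s" in the startup log
  interval = "30s"
  # Matches "Flush Interval:30s" in the startup log
  flush_interval = "30s"
  # Matches "Quiet:false" in the startup log
  quiet = false
  # Assumption: turned on only while troubleshooting, to get more verbose logging
  debug = true
```

As with any configuration change, the service needs a restart before this takes effect; the extra log lines then show up in the same place as the errors seen earlier with systemctl status telegraf.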


#6

Actually I lie, everything seems to be working now! Thanks @daniel