Hoping someone can assist me; I’m not sure if I’m going about this plugin in the right way.
I have a QNAP NAS that I poll over SNMPv3 with Telegraf, and this works as intended. I collect metrics by OID rather than through a MIB. I run Telegraf and InfluxDB from a docker-compose recipe, which also seems to work as intended, though I’m not sure that opening port 162/udp has been successful: it shows as open in the container and on the host, but Nmap does not report it. A few questions below.
Should I be using the same v3 user, auth and so on for snmp_trap as for snmp? My QNAP is configured to use v3 with those settings, which works with the snmp plugin.
Do I need to import my MIB to the stock location (/usr/share/snmp/mibs/my-mib.txt)? I have done this but I don’t believe it’s working as snmptranslate only returns the OID back to me. Also, on checking the logs (docker logs telegraf) I have the following output:
[inputs.snmp_trap] Listening on udp://:162
[inputs.snmp_trap] Error resolving V1 OID: not found
Any help on the standard setup would be greatly appreciated.
I was able to resolve the V1 OID error: I re-imported my MIB file under /usr/share/snmp/mibs and the error no longer appears in the logs.
I’ve also fixed the issue with UDP port 162, which does appear open now.
I still do not receive any traps in Telegraf, however. Is this possibly related to my use of v3? Again, I’m not sure whether the v3 settings apply here in the same way as in the standard SNMP plugin.
I don’t have a qnap device but I do have a few debugging ideas you can try. First idea is to try SNMP v2c. It doesn’t have authentication and encryption like v3 so it is simpler to configure. Even if you want to use v3 in the long run for increased security, it will probably be worth your time to try v2c temporarily because there are fewer things that can go wrong.
The second debugging idea is to try sending telegraf a trap using net-snmp. I’m not familiar with your qnap device, but it can be tricky with some devices to know for sure when traps are sent. Using net-snmp’s snmptrap command you know a trap is sent when you run the command.
Here is a test telegraf config and a corresponding snmptrap command for v2c. If telegraf receives the test trap you will know for sure that the container and telegraf are both set up correctly. In this example I have it using udp port 2000. You’ll want to change it to use port 162.
snmptrap -v 2c -c community udp:localhost:2000 "" .1.3.6.1.6.3.1.1.5.3.0 0 s "This is a test linkDown trap from v2c"
[[inputs.snmp_trap]]
service_address = "udp://:2000"
Once that works, try switching to v3. Here is a config and test command that uses v3 with authentication and privacy enabled:
snmptrap -v3 -e 00abcdefabcdef00 -n mycontextname -l authPriv -u mysecname -a SHA -A myauthpass -x AES -X myprivpass udp:localhost:2000 "" .1.3.6.1.6.3.1.1.5.3.0 0 s "This is a test linkDown trap from v3 authPriv"
[[inputs.snmp_trap]]
service_address = "udp://:2000"
sec_name = "mysecname"
auth_protocol = "SHA"
auth_password = "myauthpass"
sec_level = "authPriv"
priv_protocol = "AES"
priv_password = "myprivpass"
version = "3"
Thanks for the reply.
I’ve verified with tcpdump that my container is receiving traps. They are indeed coming through, though they look different depending on whether V2 or V3 is used. A V2 test trap is readable in the capture, while the traps from my QNAP (still V3) come through as more of a garbled mess, which I assume is down to the encryption.
How can I verify that telegraf explicitly is receiving the trap? Telegraf so far has only received 1 trap - a coldstart trap from localhost - that I can see passed through to influxdb.
If you’re seeing a garbled mess, I wonder if you are hitting this issue:
If so, the workaround is to make sure you have the right encryption settings. Telegraf should be able to tell if decryption didn’t work and log an error and hopefully this will be fixed soon in a future release.
There are a few ways to verify that telegraf is receiving the trap. You found one: checking if the trap data made it through to the output, influxdb in your case.
Another way is to use the file output. When I’m debugging issues like this I temporarily disable my normal outputs and add a file output for debugging and have it write to STDOUT. Then I run telegraf from the command line and watch it produce metrics live. I sometimes also reduce the flush interval agent setting to a second or two so there’s not a big delay before the metric is printed.
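The debugging setup described above might look roughly like the following. This is a minimal sketch, not a tested config for this particular container: it assumes the trap listener is on udp port 162, that the normal outputs have been commented out, and that a one-second flush interval is acceptable while debugging.

```toml
# Debug-only sketch: normal outputs (e.g. influxdb) are assumed to be disabled.
[agent]
  # Flush quickly so metrics are printed shortly after a trap arrives.
  flush_interval = "1s"

[[inputs.snmp_trap]]
  # Assumed listen address; adjust to match your setup.
  service_address = "udp://:162"

[[outputs.file]]
  # Write metrics to STDOUT instead of a file on disk.
  files = ["stdout"]
  data_format = "influx"
```

Running telegraf in the foreground with this config (for example with the --config option pointing at the file) should print one line of line protocol for each trap that the input accepts. Inside a container, the same lines should end up in docker logs telegraf.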
With tricky SNMP trap issues I have also resorted to saving packet captures with wireshark and comparing them with what telegraf produces. You can set up wireshark to save only packets on udp port 162, then run it on the same machine and at the same time as telegraf. If the packet capture shows traps that telegraf didn’t report, there’s a bug. I haven’t needed to do this since the trap input was being developed. I also haven’t seen other users report on the telegraf github repo that telegraf is missing traps.
OK, thanks for that.
The output I’m seeing from my tcpdump packet capture is:
12:37 IP 192.168.1.2 > docker.snmp-trap: F=r U="" E=_80_00_1f_88[…etc] C="" GetRequest(14)
I’m not sure what that is, as when I run traps from other devices (which also aren’t received) it looks entirely different, usually suggesting V2.
I ran a test trap as you showed and that came through more like what I would expect, though still nothing was output by Telegraf or received by InfluxDB. This leads me to think that my Telegraf is not configured correctly for traps. Is there something additional that needs to be installed? I already have SNMP, SNMPD etc.
My QNAP is set up for V3 auth (no priv) and it works as expected with the normal SNMP plugin for Telegraf. Notably, the supposed OID for the QNAP that suggests a trap returns no result from a walk, though I suppose this is normal behaviour due to the way traps work over agent polls.
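For a device that sends v3 traps with authentication but no privacy, the listener settings would presumably need to use the authNoPriv security level rather than authPriv. The sketch below adapts the earlier authPriv example on that assumption. The user name, password, and SHA protocol are placeholders, not values taken from the QNAP, and would have to match whatever the NAS is actually configured with.

```toml
[[inputs.snmp_trap]]
  service_address = "udp://:162"
  version = "3"
  # Authentication only, no encryption, so no priv_* settings are given.
  sec_level = "authNoPriv"
  # Placeholder credentials: replace with the v3 user configured on the device.
  sec_name = "mysecname"
  # Assumed protocol; use whichever one the device is set to.
  auth_protocol = "SHA"
  auth_password = "myauthpass"
```

With authNoPriv the trap payload is not encrypted, so a packet capture taken alongside telegraf should show the variable bindings in readable form, which makes it easier to compare with what telegraf outputs.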
So I ran the following:
I then created an SNMP Trap and saw the following:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1da485d]
I then see a number of lines regarding goroutine 13.
I suppose that explains the problem!