First time using the inputs.exec plugin. Any help would be great.
When I test with telegraf, I get this output:
[root@netmon scripts]# telegraf -test -input-filter=exec -debug
2017/04/25 11:47:26 I! Using config file: /etc/telegraf/telegraf.conf
- Plugin: inputs.exec, Collection 1
traceroute,target_loc=dc1,target_ip=xxx.xxx.xxx,hop_num=1,hop_host=10.25.0.1,host=netmon resp_time=2.316 1493135246000000000
traceroute,target_ip=xxx.xxx.xxx,hop_num=2,hop_host=192.168.1.20,host=netmon,target_loc=dc1 resp_time=1.021 1493135246000000000
This output is expected and the same as the script output, with the exception of “host=netmon” which telegraf is adding, that’s not a problem. I can manually INSERT the output into influxdb with no issues.
However, when I start telegraf with systemctl, I get this error in my logs:
2017-04-25T16:01:00Z E! ERROR in input [inputs.exec]: Errors encountered: [metric parsing error, reason: [buffer too short], buffer: , index: [0]]
It never updates the traceroute measurement. The only entries are my manual ones.
I’ve exhausted the extent of my knowledge. Can anyone shed some light on this and help me out?
Cheers
Karl
@Karl_Raffelsieper Someone had a similar problem recently. Can you take a peek at this issue?
@jackzampolin Thanks for the suggestion, however, I’m not sure it’s apropos. The plugin is enabled and it returns valid data in “telegraf -test” mode, unless my script needs to provide the “host=xxxxx” info. For example, my script returns
traceroute,target_loc=dc1,target_ip=xxx.xxx.xxx,hop_num=1,hop_host=10.25.0.1 resp_time=2.101 1493141569019738258
Here is the same line when running in test mode:
[root@netmon scripts]# telegraf -test -input-filter=exec -debug
2017/04/25 13:34:08 I! Using config file: /etc/telegraf/telegraf.conf
- Plugin: inputs.exec, Collection 1
traceroute,target_loc=dc1,target_ip=xxx.xxx.xxx,hop_num=1,hop_host=10.25.0.1,host=netmon resp_time=3 1493141648000000000
Notice the host=netmon has been added to the output.
The other link, to #2448, appears to be a bug about no data being returned. I checked git and that appears to have been resolved some time ago.
So I know the plugin has been enabled and data is being returned, so I’m still not sure why telegraf is throwing the error.
Any other suggestions?
Tks
This sounds like an issue. Can you open an issue on telegraf and include the script if possible?
Solved
Unfortunately there were a couple of issues. The error message is a bit cryptic, and my script was not returning a non-zero exit code on failure. However, even if it had, the result may not have been meaningful. Lastly, there was a permission issue.
When telegraf is started by systemctl it runs as user telegraf (you cannot just su to telegraf, it is a no-shell user). The script called by the plugin was globally executable, but inside the script traceroute was called, which requires root privilege. Of course the script would fail with a meaningful “permission denied” error, but that was not returned to the plugin, so the script produced no output and we get the “buffer too short” message instead.
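For anyone hitting the same thing, here is a rough sketch of how the script could fail loudly instead of handing the plugin an empty buffer. The helper name run_or_fail and the variable names are made up for illustration, and how telegraf reports a non-zero exit depends on the version, so treat this as a starting point rather than a fix for the cryptic message itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of telegraf): run a command, pass its stdout
# through on success, and fail with a message on stderr if the command
# exits non-zero or prints nothing at all.
run_or_fail() {
    status=0
    output=$("$@") || status=$?
    if [ "$status" -ne 0 ]; then
        echo "run_or_fail: '$*' exited with status $status" >&2
        return "$status"
    fi
    if [ -z "$output" ]; then
        echo "run_or_fail: '$*' produced no output" >&2
        return 1
    fi
    printf '%s\n' "$output"
}

# Example use inside the collection script (target is an assumed variable):
# raw=$(run_or_fail traceroute -n "$target") || exit 1
# ...then convert "$raw" to line protocol as before.
```

With this in place, running the script as the telegraf user shows the failure on stderr and gives a non-zero exit code, rather than silently printing nothing.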
Testing with
sudo -u telegraf /<path_to_script>.sh
will reveal whether telegraf can execute the script end to end. If the script needs root or other user privileges, you will need to configure /etc/sudoers for the telegraf user. You will also need to ensure NOPASSWD has been configured for telegraf in the sudoers file. Keep the NOPASSWD list of commands limited. It’s always a security risk in production if you open things up too wide.
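As an illustration, a narrowly scoped sudoers entry might look something like the fragment below. The file name and the traceroute path are assumptions (check the real path with "which traceroute"), and the requiretty line is only relevant on systems whose default sudoers enables requiretty. The script would then call traceroute through sudo.

```
# /etc/sudoers.d/telegraf  (assumed file name; edit with: visudo -f /etc/sudoers.d/telegraf)
# Let the telegraf user run only traceroute as root, without a password.
# /usr/bin/traceroute is an assumed path; adjust it for your system.
telegraf ALL=(root) NOPASSWD: /usr/bin/traceroute

# Only needed if the system's sudoers sets "Defaults requiretty",
# since telegraf runs without a terminal.
Defaults:telegraf !requiretty
```

Listing the single binary rather than ALL keeps the passwordless access limited to what the script actually needs.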
Cheers
@Karl_Raffelsieper Thank you for posting that here! Glad you were able to get it solved.