Telegraf inputs.exec - json data never received by InfluxDB

I had this working previously on CentOS 7 (I’m unsure which Telegraf or InfluxDB versions). It ran successfully for years (I have the data to prove it!). Recently we migrated everything to Ubuntu 20.04.3 LTS (focal) with Telegraf 1.21.2 and InfluxDB 1.8.10. I’ve been reinstalling our Telegraf inputs and the majority have been no trouble, except this one. The problem is that all of my troubleshooting and debugging shows a functional input configuration that generates nice, happy data. Yet when I start Telegraf as a service, expecting it to process this input without issue, it never writes its data! Here is the input configuration, followed by the steps I’ve taken:

[[inputs.exec]]
    interval = "10m"
    timeout = "900s"
    commands = ["/usr/local/bin/speedtest-wrapper"]
    data_format = "json"
    name_suffix = "_speedtest"
    json_string_fields = ["server_name","server_url","server_host"]

Here are the permissions on the wrapper:

# ls -ld /usr/local/bin/speedtest-wrapper
-rwxr-xr-x 1 root root 168 Jan 21 11:39 /usr/local/bin/speedtest-wrapper*

The contents of the script:

cat /usr/local/bin/speedtest-wrapper
#!/bin/bash

echo `date` >>/tmp/telegraf_date
# sleeptime=$(($RANDOM%1200))
# sleep $sleeptime
/bin/speedtest-cli --json
exit 0
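
Because the wrapper always ends with exit 0 and does nothing with stderr, a failure of speedtest-cli never reaches Telegraf as a non-zero exit status. A minimal debugging sketch of the wrapper logic follows. The log path and the run_and_log helper are hypothetical names chosen for illustration, not part of the original script:

```shell
#!/bin/bash
# Debugging sketch. LOG and run_and_log are hypothetical names (assumptions),
# not part of the original wrapper.
LOG="${LOG:-/tmp/speedtest-wrapper.log}"

run_and_log() {
    local status
    # stdout passes through to the caller (Telegraf) untouched;
    # stderr is appended to the log file.
    "$@" 2>>"$LOG"
    status=$?
    # Record who ran the command, with which environment, and how it ended.
    echo "$(date) user=$(id -un) HOME=${HOME:-unset} PATH=${PATH:-unset} exit=$status" >>"$LOG"
    return "$status"
}

# In the wrapper, the last two lines would then become:
#   run_and_log /bin/speedtest-cli --json
#   exit $?
```

Comparing the log entries from a sudo -u telegraf run with those from a systemd-started run would show whether the command fails, or sees a different environment, under the service.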

Here’s an execution of the wrapper as user telegraf:

# sudo -u telegraf time /usr/local/bin/speedtest-wrapper

{"download": 23777119.54694799, "upload": 14309538.638977839, "ping": 1800000.0, "server": {"url": "http://us.bgp.nkeo.top:8080/speedtest/upload.php", "lat": "99.982", "lon": "-102.363", "name": "San Jose, CA", "country": "United States", "cc": "US", "sponsor": "Neko Neko Cloud", "id": "46047", "host": "us.bgp.nkeo.top:8080", "d": 70.43544359346778, "latency": 1800000.0}, "timestamp": "2022-01-22T01:14:36.827956Z", "bytes_sent": 20226048, "bytes_received": 29808112, "share": null, "client": {"ip": "1.1.199.118", "lat": "37.7562", "lon": "-122.4866", "isp": "asdf.net, LLC", "isprating": "3.7", "rating": "0", "ispdlavg": "0", "ispulavg": "0", "loggedin": "0", "country": "US"}}

real    3m9.538s
user    0m1.202s
sys     0m1.402s

Here is a --test run of the defined input as user telegraf:

# sudo -u telegraf telegraf --config /etc/telegraf/telegraf.d/exec.conf --config /etc/telegraf/telegraf.conf --test
2022-01-22T01:30:31Z I! Starting Telegraf 1.21.2
> exec_speedtest,host=proxy01 bytes_received=24176112,bytes_sent=17063936,download=18691198.516819958,ping=1800000,server_d=49.56358932433334,server_host="speedtest.baynic.net:8080",server_latency=1800000,server_name="Fremont, CA",server_url="http://speedtest.baynic.net:8080/speedtest/upload.php",upload=12089092.321201608 1642811874000000000

Now without --test:

sudo -u telegraf telegraf --config /etc/telegraf/telegraf.d/exec.conf --config /etc/telegraf/telegraf.conf
2022-01-22T01:34:22Z I! Starting Telegraf 1.21.2

I can see the execution of speedtest-cli at the defined interval…

cat /tmp/telegraf_date
Fri 21 Jan 2022 17:40:02 PST
ps auwx | grep speed
telegraf  982220  1.0  1.1  29548 23776 pts/2    S+   17:40   0:00 /usr/bin/python3 /bin/speedtest-cli --json

A few minutes later I can see data has successfully been written:

telegraf.log:
2022-01-22T01:43:14Z D! [outputs.influxdb] Wrote batch of 1 metrics in 11.641873ms

influxdb.log: 
Jan 21 17:43:14 influx001 influxd-systemd-start.sh[1959236]: [httpd] 192.168.20.62 - influx_telegraf [21/Jan/2022:17:43:14 -0800] "POST /write?db=telegraf HTTP/1.1 " 204 0 "-" "Telegraf/1.21.2 Go/1.17.5" aa4fe635-7b24-11ec-a431-10c37b4d9415 9434

influx: 
> select * from exec_speedtest  ORDER BY time DESC
2022-01-22T01:43:12Z 28947952       19865600   22817584.22876692 proxy01 1800000 19.662641598786678 speedtest.open ...

At this point I assume everything is fine and I start telegraf normally with systemctl start telegraf. This is what the logs show:

2022-01-22T02:04:57Z I! Loaded inputs: exec
2022-01-22T02:04:57Z I! Loaded aggregators:
2022-01-22T02:04:57Z I! Loaded processors:
2022-01-22T02:04:57Z I! Loaded outputs: influxdb
2022-01-22T02:04:57Z I! Tags enabled: host=proxy01
2022-01-22T02:04:57Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"proxy01", Flush Interval:10s
2022-01-22T02:04:57Z D! [agent] Initializing plugins
2022-01-22T02:04:57Z D! [agent] Connecting outputs
2022-01-22T02:04:57Z D! [agent] Attempting connection to [outputs.influxdb]
2022-01-22T02:04:59Z W! [outputs.influxdb] When writing to [https://influxdb.fgh.net:8086]: database "telegraf" creation failed: 403 Forbidden
2022-01-22T02:04:59Z D! [agent] Successfully connected to outputs.influxdb
2022-01-22T02:04:59Z D! [agent] Starting service inputs
2022-01-22T02:05:09Z D! [outputs.influxdb] Buffer fullness: 0 / 10000 metrics
2022-01-22T02:05:19Z D! [outputs.influxdb] Buffer fullness: 0 / 10000 metrics
...
2022-01-22T02:31:49Z D! [outputs.influxdb] Buffer fullness: 0 / 10000 metrics

…and this just carries on indefinitely. I would expect something other than 0 / 10000 metrics roughly every 10-15 minutes, but no data is ever written and nothing in the logs indicates a problem.

Please help me determine what is going on here. This input could definitely use more messaging when debug is enabled!
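
One difference between the manual runs and the service run is the environment: a process started by systemd does not inherit the variables of an interactive shell. A rough way to approximate that from the command line is to run the wrapper with an emptied environment. This is only a sketch. The run_minimal_env name and the PATH value are assumptions, and it does not reproduce the exact environment the telegraf unit provides:

```shell
#!/bin/bash
# Sketch: run a command with an emptied environment and a minimal PATH.
# run_minimal_env and the PATH value are assumptions for illustration only.
run_minimal_env() {
    env -i PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin "$@"
}

# Example:
#   run_minimal_env /usr/local/bin/speedtest-wrapper; echo "exit status: $?"
```

If the wrapper prints no JSON here but does under a normal shell, an environment dependency would be a plausible place to look next.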

Hi,

The only message that stands out to me is the 403 Forbidden warning when Telegraf tries to create the "telegraf" database. Does it look like the credentials and/or tokens are not getting accepted?

You are using two configuration files:

/etc/telegraf/telegraf.conf
/etc/telegraf/telegraf.d/exec.conf

By default, the first file has the influxdb output uncommented, and that output requires some configuration.
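
To see which input and output sections are actually active across both files, the uncommented plugin headers can be listed with grep. A small sketch, where list_plugins is a hypothetical helper name:

```shell
#!/bin/bash
# Sketch: print every uncommented [[inputs.*]] / [[outputs.*]] header,
# prefixed with its file name and line number.
# list_plugins is a hypothetical helper name (an assumption).
list_plugins() {
    grep -HnE '^[[:space:]]*\[\[(inputs|outputs)\.' "$@"
}

# Example:
#   list_plugins /etc/telegraf/telegraf.conf /etc/telegraf/telegraf.d/exec.conf
```

Commented-out sections (lines starting with #) are skipped, so the output shows only what Telegraf would load from those files.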