[[inputs.exec]] not being executed

Hi,

I’m trying to get [[inputs.exec]] to execute my command.

telegraf.conf

[agent]
interval = "2s"
round_interval = true
debug = true
quiet = false

[[inputs.exec]]
commands = [ "/disktemp.sh" ]
name_suffix = "_disktemp"
timeout = "5s"
interval = "15s"
data_format = "json"
tag_keys = ["disk", "model", "serial", "capacity"]

[[processors.regex]]
[[processors.regex.tags]]
key = "host"
replacement = "nas"
pattern = "telegraf2"

[[outputs.influxdb_v2]]
urls = ["http://influxdb2:8086"]
token = ""
organization = "my-org"
bucket = "test-bucket"

test run:

$ telegraf -test -input-filter=exec -debug
2022-10-22T15:26:59Z I! Using config file: /etc/telegraf/telegraf.conf
2022-10-22T15:26:59Z I! Starting Telegraf 1.24.2
2022-10-22T15:26:59Z I! Available plugins: 222 inputs, 9 aggregators, 26 processors, 20 parsers, 57 outputs
2022-10-22T15:26:59Z I! Loaded inputs: exec
2022-10-22T15:26:59Z I! Loaded aggregators:
2022-10-22T15:26:59Z I! Loaded processors: regex
2022-10-22T15:26:59Z W! Outputs are not used in testing mode!
2022-10-22T15:26:59Z I! Tags enabled: host=telegraf2
2022-10-22T15:26:59Z D! [agent] Initializing plugins
2022-10-22T15:26:59Z D! [agent] Starting service inputs
2022-10-22T15:27:00Z D! [agent] Stopping service inputs
2022-10-22T15:27:00Z D! [agent] Input channel closed
2022-10-22T15:27:00Z D! [agent] Processor channel closed
2022-10-22T15:27:00Z D! [agent] Stopped Successfully

exec_disktemp,capacity=10000831348736,disk=/dev/sda,host=nas,model=WDC\ WD101EFAX-68LDBN0,serial=xzy temperature=43 1666452420000000000
exec_disktemp,capacity=10000831348736,disk=/dev/sdb,host=nas,model=WDC\ WD101EFAX-68LDBN0,serial=xzy temperature=45 1666452420000000000
exec_disktemp,capacity=10000831348736,disk=/dev/sdc,host=nas,model=WDC\ WD101EFAX-68LDBN0,serial=xzy temperature=45 1666452420000000000
exec_disktemp,capacity=10000831348736,disk=/dev/sdd,host=nas,model=WDC\ WD101EFAX-68LDBN0,serial=xzy temperature=43 1666452420000000000
exec_disktemp,capacity=10000831348736,disk=/dev/sde,host=nas,model=WDC\ WD101EFAX-68LDBN0,serial=xzy temperature=44 1666452420000000000

But it does not execute when I run it in non-test mode:

2022-10-22T15:41:21Z I! Using config file: /etc/telegraf/telegraf.conf
2022-10-22T15:41:21Z I! Starting Telegraf 1.24.2
2022-10-22T15:41:21Z I! Available plugins: 222 inputs, 9 aggregators, 26 processors, 20 parsers, 57 outputs
2022-10-22T15:41:21Z I! Loaded inputs: exec
2022-10-22T15:41:21Z I! Loaded aggregators:
2022-10-22T15:41:21Z I! Loaded processors: regex
2022-10-22T15:41:21Z I! Loaded outputs: influxdb_v2
2022-10-22T15:41:21Z I! Tags enabled: host=telegraf2
2022-10-22T15:41:21Z I! [agent] Config: Interval:2s, Quiet:false, Hostname:"telegraf2", Flush Interval:10s
2022-10-22T15:41:21Z D! [agent] Initializing plugins
2022-10-22T15:41:21Z D! [agent] Connecting outputs
2022-10-22T15:41:21Z D! [agent] Attempting connection to [outputs.influxdb_v2]
2022-10-22T15:41:21Z D! [agent] Successfully connected to outputs.influxdb_v2
2022-10-22T15:41:21Z D! [agent] Starting service inputs
2022-10-22T15:41:23Z D! [agent] Stopping service inputs
2022-10-22T15:41:23Z D! [agent] Input channel closed
2022-10-22T15:41:23Z D! [agent] Processor channel closed
2022-10-22T15:41:23Z I! [agent] Hang on, flushing any cached metrics before shutdown
2022-10-22T15:41:23Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics
2022-10-22T15:41:23Z I! [agent] Stopping running outputs
2022-10-22T15:41:23Z D! [agent] Stopped Successfully
2022-10-22T15:41:24Z I! Using config file: /etc/telegraf/telegraf.conf
2022-10-22T15:41:24Z I! Starting Telegraf 1.24.2
2022-10-22T15:41:24Z I! Available plugins: 222 inputs, 9 aggregators, 26 processors, 20 parsers, 57 outputs
2022-10-22T15:41:24Z I! Loaded inputs: exec
2022-10-22T15:41:24Z I! Loaded aggregators:
2022-10-22T15:41:24Z I! Loaded processors: regex
2022-10-22T15:41:24Z I! Loaded outputs: influxdb_v2
2022-10-22T15:41:24Z I! Tags enabled: host=telegraf2
2022-10-22T15:41:24Z I! [agent] Config: Interval:2s, Quiet:false, Hostname:"telegraf2", Flush Interval:10s
2022-10-22T15:41:24Z D! [agent] Initializing plugins
2022-10-22T15:41:24Z D! [agent] Connecting outputs
2022-10-22T15:41:24Z D! [agent] Attempting connection to [outputs.influxdb_v2]
2022-10-22T15:41:24Z D! [agent] Successfully connected to outputs.influxdb_v2
2022-10-22T15:41:24Z D! [agent] Starting service inputs
2022-10-22T15:41:34Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics
2022-10-22T15:41:44Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics
2022-10-22T15:41:54Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics

I am running telegraf in docker, hence the hostname being telegraf2.

But for the life of me, I can’t figure out why it does not execute the inputs.exec plugin.

Any hints would be welcome.

Hello @casperghst42,
I’m not sure. Let me ask the telegraf team.

Hi @casperghst42. I’m guessing when you ran telegraf in test mode it wasn’t in a docker container. By default, Docker doesn’t expose the host’s block devices in containers. For example:

reim@debian:~$ ls /dev/nvme*
/dev/nvme0  /dev/nvme0n1  /dev/nvme0n1p1  /dev/nvme0n1p2  /dev/nvme0n1p3  /dev/nvme0n1p4  /dev/nvme0n1p5  /dev/nvme0n1p6
reim@debian:~$ docker run -it --rm bitnami/minideb /bin/bash
root@79f53c3db0d4:/# ls /dev/nvme*
ls: cannot access '/dev/nvme*': No such file or directory

(My disk devices are named differently than yours but that doesn’t matter)

I’m guessing that telegraf’s exec plugin is running your script but the script fails and doesn’t produce any metrics because the devices don’t exist in the container. I think you need to look into configuring docker to expose the block devices you want to access in your script.
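One way to test this guess from inside the container is to check whether the device nodes the script expects are actually present. The sketch below is only an illustration; `check_devices` is a hypothetical helper name, not something Telegraf or Docker provides.

```shell
# Hypothetical helper: report which of the given paths exist as block devices.
# Returns non-zero if at least one path is missing or is not a block device.
check_devices() {
  local dev missing=0
  for dev in "$@"; do
    if [ -b "$dev" ]; then
      echo "ok: $dev"
    else
      echo "missing: $dev"
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_devices /dev/sd?` inside the container (for example through `docker exec`) lists each path. If the glob matches nothing, the shell passes the literal pattern through and it is reported as missing.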

Hi Reimda,

Thank you.

I ran telegraf in testing mode within the docker container.

docker-compose up -d telegraf2
docker exec -it telegraf2 /bin/bash
telegraf …

But I’ll have a look to see whether my shell script is trying to fetch data from devices that return an error.

This is the script I’m using (found it on reddit, and modified it a “bit”):

#!/usr/bin/env bash

# Runs smartctl to report current temperature of all disks.

JSON="["

DISKS=$(ls /dev/sd?)

for i in ${DISKS[@]} ; do
  # Get temperature from smartctl (requires root).
  TEMP=$(smartctl -l scttemp $i | grep '^Current Temperature:' | awk '{print $3}')
  MODEL=$(smartctl -i $i | grep "^Device Model:" | awk '{printf "%s %s", $3, $4}')
  CAPACITY=$(smartctl -i $i | grep "^User Capacity:" | awk '{print $3}' | sed 's/,//g' )
  SERIAL=$(smartctl -i $i | grep "^Serial Number:" | awk '{printf $3}')

  if [ ${TEMP:-0} -gt 0 ]
  then
    JSON=$(echo "${JSON}{")
    JSON=$(echo "${JSON}\"disk\":\"${i}\",")
    JSON=$(echo "${JSON}\"model\":\"${MODEL}\",")
    JSON=$(echo "${JSON}\"serial\":\"${SERIAL}\",")
    JSON=$(echo "${JSON}\"capacity\":\"${CAPACITY}\",")
    JSON=$(echo "${JSON}\"temperature\":${TEMP}")
    JSON=$(echo "${JSON}},")
  fi

done

# Remove trailing "," on last field.
JSON=$(echo ${JSON} | sed 's/,$//')

echo -e "${JSON}]"

This is the output it produces:

exec_disktemp,capacity=10000831348736,disk=/dev/sda,host=telegraf2,model=WDC\ WD101EFAX-68LDBN0,serial=VCHBDYWP temperature=39 1666710815000000000
exec_disktemp,capacity=10000831348736,disk=/dev/sdb,host=telegraf2,model=WDC\ WD101EFAX-68LDBN0,serial=VCGV4BVP temperature=39 1666710815000000000
exec_disktemp,capacity=10000831348736,disk=/dev/sdc,host=telegraf2,model=WDC\ WD101EFAX-68LDBN0,serial=VCH2JSEP temperature=40 1666710815000000000
exec_disktemp,capacity=10000831348736,disk=/dev/sdd,host=telegraf2,model=WDC\ WD101EFAX-68LDBN0,serial=VCGV54ZP temperature=39 1666710815000000000
exec_disktemp,capacity=10000831348736,disk=/dev/sde,host=telegraf2,model=WDC\ WD101EFAX-68LDBN0,serial=VCHHL2PP temperature=40 1666710815000000000

There are no errors, and it only checks /dev/sd?.

I tried to include the M.2 device, and my server crashed.
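The JSON assembly in the script above (append objects, then strip the trailing comma) can also be sketched with `printf` and a small joiner, which avoids the `sed` step. This is only an illustrative variant under the same field names; `json_entry` and `emit_json_array` are hypothetical helper names, and the sketch does not escape quotes or backslashes inside values.

```shell
# Hypothetical helper: print one JSON object for a disk.
# Arguments: disk, model, serial, capacity, temperature (numeric).
json_entry() {
  printf '{"disk":"%s","model":"%s","serial":"%s","capacity":"%s","temperature":%s}\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# Hypothetical helper: read one JSON object per line on stdin
# and print them as a single JSON array.
emit_json_array() {
  local obj first=1
  printf '['
  while IFS= read -r obj; do
    [ -n "$obj" ] || continue          # skip empty lines
    if [ "$first" -eq 1 ]; then
      first=0
    else
      printf ','                       # separator between objects only
    fi
    printf '%s' "$obj"
  done
  printf ']\n'
}
```

In the loop of the original script, each disk with a valid temperature would call `json_entry`, and the loop output would be piped into `emit_json_array`. With no disks the result is an empty array, `[]`.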

This is the docker configuration. Notice the HOST_PROC and HOST_MOUNT_PREFIX variables, which allow the use of ipmi and smartctl. If you do not get this right, smartctl will simply throw an error when you try to use it.

    telegraf2:
        container_name: telegraf2
        hostname: telegraf2
        image: casperghst42/telegraf
        restart: unless-stopped
        user: root
        privileged: true
        networks:
            automation:
        volumes:
            - '/etc/localtime:/etc/localtime:ro'
            - '/etc/timezone:/etc/timezone:ro'
            - './data/telegraf2/config/telegraf.conf:/etc/telegraf/telegraf.conf:rw'
        environment:
            - HOST_PROC=/hostfs/proc
            - HOST_MOUNT_PREFIX=/hostfs
        depends_on:
            - influxdb2
        deploy:
            resources:
                limits:
                    memory: 128M

This is the Dockerfile:

FROM telegraf:latest
MAINTAINER casperghst42

RUN apt-get update && \
    apt-get install -yq \
    ipmitool smartmontools && \
# Cleanup
    apt-get clean && \
    rm -rf \
	/tmp/* \
	/var/lib/apt/lists/* \
	/var/tmp/*

COPY --chmod=755 disktemp.sh /disktemp.sh
CMD ["telegraf"]

I spent some more time on it, and it could be that there is a problem with the json consumer for exec.
I used this (telegraf/telegraf.conf at master · influxdata/telegraf · GitHub) as an example and changed my configuration to:

[[inputs.exec]]
name_suffix = "_disktemp"
timeout = "5s"
interval = "15s"
data_format = "influx"
commands = [
"echo 'deal,computer_name=hosta message=\"stuff\" 1530654676316265790'",
]

That works, whereas this:

[[inputs.exec]]
name_suffix = "_disktemp"
timeout = "5s"
interval = "15s"
data_format = "json"
tag_keys = ["host"]
commands = [
"echo '{\"host\":\"telegraf2\", \"value\":\"testvalue\" }'",
]

Does not.
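One thing worth noting about this particular test: Telegraf's JSON parser ignores string values unless they are listed in `tag_keys` or `json_string_fields`. Here `host` becomes a tag and `value` is a string, so no field is left, and the test may produce no metric even when exec itself runs fine. The sketch below writes a variant config that should leave at least one field; `write_exec_test_config` is a hypothetical helper name, and the extra `temperature` key is added purely for illustration.

```shell
# Hypothetical helper: write a minimal exec test config to the given path.
# The payload carries a numeric field, and the string field is listed in
# json_string_fields so the JSON parser keeps it.
write_exec_test_config() {
  cat > "$1" <<'EOF'
[[inputs.exec]]
  commands = ["echo '{\"host\":\"telegraf2\", \"value\":\"testvalue\", \"temperature\":43}'"]
  timeout = "5s"
  data_format = "json"
  tag_keys = ["host"]
  # Without this line the string "value" is ignored by the JSON parser.
  json_string_fields = ["value"]
EOF
}
```

After `write_exec_test_config /tmp/exec-test.conf`, the file can be tried with `telegraf --config /tmp/exec-test.conf --test` (the path is just an example).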

Found the problem, I missed this in the documentation:

This may be related to the Telegraf service running as a different user. The official packages run Telegraf as the telegraf user and group on Linux systems.

As @reimda pointed out, it was a permissions issue. I fixed it by using my own entrypoint.sh.
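For anyone hitting the same thing, a rough way to spot this kind of user mismatch is to run the command under both users and compare what comes back. The sketch below is an assumption-laden illustration, not the fix used above; `report_command` is a hypothetical helper name.

```shell
# Hypothetical helper: run a command, discard stderr, and print its exit
# status together with the number of characters it wrote to stdout.
report_command() {
  local out status
  # Capture the status without tripping "set -e" on failure.
  out=$("$@" 2>/dev/null) && status=0 || status=$?
  echo "exit=$status bytes=${#out}"
}
```

As root inside the container, `report_command /disktemp.sh` can be compared with `report_command su -s /bin/sh telegraf -c /disktemp.sh` (assuming `su` is available in the image and the service user is named `telegraf`). A much smaller byte count from the second call, for example just an empty `[]` array, would suggest the script runs but smartctl cannot read the disks as that user.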