Issue monitoring host procstat from a docker instance

Hello all,

I’m hoping someone can help with an issue I’m having with telegraf 1.20.4 in a docker container only picking up a single process.

I am already running 3 other instances of Telegraf in my homelab (pushing to an InfluxDB); 2 linux (ubuntu server) and 1 Windows server. All of these are working as expected and capturing all processes.

I have an Alpine Linux server running now for Docker. I have followed the instructions from the Docker page and the GitHub page, and everything else seems to be capturing correctly. I did have a permissions issue with the docker input, but found the following blog post explaining why and providing a fix (Docker: Run Telegraf as non-root | InfluxData).

The relevant parts of my telegraf.conf are as follows (note: this is pretty much a direct copy from my other 2 linux servers which are functioning correctly):

[[inputs.procstat]]
    exe = "."
    fieldpass = ["cpu_time_system", "cpu_time_user", "cpu_usage", "memory_*", "num_threads", "*pid"]

[[outputs.influxdb]]
    urls = ["http://x.x.x.x:8086"]
    database = "linuxstats"
    [outputs.influxdb.tagdrop]
        influxdb_database = ["*"]

This is my docker run command:

docker run -d \
    --name telegraf \
    --hostname alpinelinux \
    --user telegraf:$(stat -c '%g' /var/run/docker.sock) \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /opt/dockerconfigs/telegraf/telegraf.conf:/etc/telegraf/telegraf.conf:ro \
    -v /:/hostfs:ro \
    -e HOST_ETC=/hostfs/etc \
    -e HOST_PROC=/hostfs/proc \
    -e HOST_SYS=/hostfs/sys \
    -e HOST_VAR=/hostfs/var \
    -e HOST_RUN=/hostfs/run \
    -e HOST_MOUNT_PREFIX=/hostfs \
    telegraf

With this configuration/run command, I only see pid 1 “init” in the procstat measurement for this server.
I have been investigating and have tried a few other things:

  • setting pid_finder = "native" in telegraf.conf. This changes the result, but now only the telegraf process is shown in the procstat measurement.
  • changing the docker run command to "--user telegraf:root" to add the telegraf user to the host root group (assuming that there was a permissions issue somewhere relating to the earlier blog post); no change.
  • adding "--group-add root" to the docker command (assuming a permissions issue again); no change.

The telegraf logs are not showing any errors and now I’m out of ideas.

I should also point out that I have installed telegraf directly onto the same Alpine linux server (i.e. not in a docker container) and it works just fine as expected.

Can someone please let me know what I’m doing wrong or what I have missed? I’ve been scouring the internet, but no one else seems to be reporting this problem (that I can find).

Many Thanks
Luke

Hello @Luke,
Hmm that’s odd. I’m sharing your question with the Telegraf team. Thank you.

Thank you; much appreciated.

I’ve recently rebuilt my lab to run almost entirely out of docker and now have 3 ubuntu servers each running the telegraf container. I’m not capturing procstat at all anymore (since everything is in docker and the docker plugin works well) but I’ll try enabling it as a test and see if the issue persists.
In the meantime, if the Telegraf team have any ideas I’d be interested to know what the issue was (and if it was my own stupidity, which is a possibility :slight_smile: ).

Hi,

By default the procstat plugin will use pgrep. If you log in to the container and run pgrep . you will see two items come back: one for the pgrep process itself and another for pid 1, which is the one Telegraf is reporting. As you have discovered, you will need to use the pid_finder = "native" option.
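
If you want to see the difference for yourself, here is a rough sketch (not an official Telegraf diagnostic) that counts the process entries under a given proc directory. The function name count_pids is made up for illustration, and /hostfs/proc assumes the -v /:/hostfs:ro mount from the run command in this thread.

```shell
# Hypothetical helper: count the numeric (per-process) directories
# under a proc root. Defaults to /proc when no argument is given.
count_pids() {
    root="${1:-/proc}"
    n=0
    for d in "$root"/[0-9]*; do
        # Skip the unexpanded glob pattern and anything that is not a directory
        if [ -d "$d" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}
```

Paste it into a shell inside the container (for example via docker exec -it telegraf sh) and compare count_pids /proc with count_pids /hostfs/proc. If the second number is much larger, the host's processes are visible under /hostfs/proc, and pgrep is simply reading the container's own /proc instead.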

Now I have to admit I have a huge lack of understanding of all the procstat options, but this is what I did get working to show me a lot more stats:

[[inputs.procstat]]
    pattern = ".*"
    pid_finder = "native"

[[outputs.file]]

with this docker run:

docker run -d --name telegraf --hostname alpinelinux \
    --user telegraf:$(stat -c '%g' /var/run/docker.sock) \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /home/powersj/telegraf/config.toml:/etc/telegraf/telegraf.conf:ro \
    -v /:/hostfs:ro \
    -e HOST_ETC=/hostfs/etc \
    -e HOST_PROC=/hostfs/proc \
    -e HOST_SYS=/hostfs/sys \
    -e HOST_VAR=/hostfs/var \
    -e HOST_RUN=/hostfs/run \
    -e HOST_MOUNT_PREFIX=/hostfs \
    telegraf

Does something like that work?

Hi!
Any news on this? I have the same problem. No data is shown in my database except for the telegraf process itself. I have tried your suggestion @jpowers, with no success.

This worked for me, thanks! It’s hard to track down this information, and I suggest it be added to the procstat plugin docs. Usually when we use procstat, we are not interested in just the telegraf process, but all the processes on the system.

@sarke it would be really cool if you submitted a PR against the docs to make this clearer. It would be very much appreciated…