Issue monitoring host procstat from a docker instance

Luke · November 21, 2021, 12:14pm

Hello all,

I’m hoping someone can help with an issue I’m having with telegraf 1.20.4 in a docker container only picking up a single process.

I am already running 3 other instances of Telegraf in my homelab (pushing to an InfluxDB); 2 linux (ubuntu server) and 1 Windows server. All of these are working as expected and capturing all processes.

I have an Alpine Linux server running now for docker. I have followed the instructions from the docker page and the GitHub page, and everything else seems to be capturing correctly. I did have a permissions issue with the docker input, but found the following blog post explaining why and providing a fix (Docker: Run Telegraf as non-root | InfluxData).

The relevant parts of my telegraf.conf are as follows (note: this is pretty much a direct copy from my other 2 linux servers which are functioning correctly):

[[inputs.procstat]]
    exe = "."
    fieldpass = ["cpu_time_system", "cpu_time_user", "cpu_usage", "memory_*", "num_threads", "*pid"]

[[outputs.influxdb]]
    urls = ["http://x.x.x.x:8086"]
    database = "linuxstats"
    [outputs.influxdb.tagdrop]
        influxdb_database = ["*"]

This is my docker run command:

docker run -d \
    --name telegraf \
    --hostname alpinelinux \
    --user telegraf:$(stat -c '%g' /var/run/docker.sock) \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /opt/dockerconfigs/telegraf/telegraf.conf:/etc/telegraf/telegraf.conf:ro \
    -v /:/hostfs:ro \
    -e HOST_ETC=/hostfs/etc \
    -e HOST_PROC=/hostfs/proc \
    -e HOST_SYS=/hostfs/sys \
    -e HOST_VAR=/hostfs/var \
    -e HOST_RUN=/hostfs/run \
    -e HOST_MOUNT_PREFIX=/hostfs \
    telegraf

With this configuration/run command, I only see pid 1 “init” in the procstat measurement for this server.
I have been investigating and have tried a few other things:

Setting pid_finder = "native" in telegraf.conf. This changes the result, but now only the telegraf process is shown in the procstat measurement.
Changing the docker run command to "--user telegraf:root" to add the telegraf user to the host root group (assuming that there was a permissions issue somewhere relating to the earlier blog post); no change.
Adding "--group-add root" to the docker command (assuming a permissions issue again); no change.

The telegraf logs are not showing any errors and now I’m out of ideas.

I should also point out that I have installed telegraf directly onto the same Alpine linux server (i.e. not in a docker container) and it works just fine as expected.

Can someone please let me know what I’m doing wrong or what I have missed? I’ve been scouring the internet but no-one else seems to be reporting this problem (that I can find).

Many Thanks
Luke

Anaisdg · December 13, 2021, 7:11pm

Hello @Luke,
Hmm that’s odd. I’m sharing your question with the Telegraf team. Thank you.

Luke · December 17, 2021, 12:04pm

Thank you; much appreciated.

I’ve recently rebuilt my lab to run almost entirely out of docker and now have 3 ubuntu servers each running the telegraf container. I’m not capturing procstat at all anymore (since everything is in docker and the docker plugin works well) but I’ll try enabling it as a test and see if the issue persists.
In the meantime, if the Telegraf team have any ideas I’d be interested to know what the issue was (and if it was my own stupidity, which is a possibility).

jpowers · December 23, 2021, 7:07pm

Hi,

By default the procstat plugin uses pgrep. If you log in to the container and run pgrep . you will see two items come back: one for pgrep itself and another for pid 1, which is the one Telegraf is reporting. As you have discovered, you will need to use the pid_finder = "native" option.
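
A rough way to see why the two finders can disagree is to compare the PIDs visible through the container's own /proc with those visible through the host mount. The sketch below is only an illustration: list_pids is a hypothetical helper, and it rests on the assumptions that pgrep inside the container only sees the container's /proc, and that the native finder reads the path given by HOST_PROC (here /hostfs/proc, as in the docker run command above).

```shell
#!/bin/sh
# list_pids PROC_ROOT: print every purely numeric directory name under
# PROC_ROOT, i.e. the PIDs visible through that particular proc mount.
# Hypothetical helper for illustration; not part of Telegraf.
list_pids() {
    proc_root="${1:-/proc}"
    for entry in "$proc_root"/[0-9]*; do
        # Skip the unexpanded glob and anything that is not a directory.
        [ -d "$entry" ] || continue
        name="${entry##*/}"
        case "$name" in
            *[!0-9]*) continue ;;   # not purely numeric, so not a PID
        esac
        printf '%s\n' "$name"
    done
}
```

Run inside the container, list_pids /proc | wc -l should be a very small number, while list_pids /hostfs/proc | wc -l should be close to the host's process count if the host filesystem is mounted as shown.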

Now I have to admit I have a huge lack of understanding of all the procstat options, but this is what I did get working to show me a lot more stats:

[[inputs.procstat]]
    pattern = ".*"
    pid_finder = "native"

[[outputs.file]]

with this docker run:

docker run -d --name telegraf --hostname alpinelinux \
    --user telegraf:$(stat -c '%g' /var/run/docker.sock) \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /home/powersj/telegraf/config.toml:/etc/telegraf/telegraf.conf:ro \
    -v /:/hostfs:ro \
    -e HOST_ETC=/hostfs/etc \
    -e HOST_PROC=/hostfs/proc \
    -e HOST_SYS=/hostfs/sys \
    -e HOST_VAR=/hostfs/var \
    -e HOST_RUN=/hostfs/run \
    -e HOST_MOUNT_PREFIX=/hostfs \
    telegraf

Does something like that work?
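
One way to check the result is to count how many distinct PIDs appear in the procstat lines that [[outputs.file]] writes out. This is a minimal sketch, assuming the output is line protocol on stdout and that pid is emitted as an integer field (pid=1234i) rather than as a tag; count_procstat_pids is a hypothetical helper name.

```shell
#!/bin/sh
# count_procstat_pids: read Telegraf line protocol on stdin and print the
# number of distinct pid field values found on procstat lines.
# Assumes pid is an integer field (pid=1234i), i.e. pid_tag is not enabled.
count_procstat_pids() {
    grep '^procstat,' \
        | grep -o '[ ,]pid=[0-9][0-9]*i' \
        | cut -c2- \
        | sort -u \
        | wc -l \
        | tr -d ' '
}
```

For example, docker logs telegraf 2>/dev/null | count_procstat_pids should print a number well above 1 if host processes are being collected, assuming the container's stdout carries the metrics.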

gstrommer · May 10, 2023, 10:24am

Hi!
Any news on this? I have the same problem. No data is shown in my database except for the telegraf process itself. I have tried your suggestion @jpowers, with no success.

sarke · July 7, 2023, 2:16am

This worked for me, thanks! It’s hard to track down this information, and I suggest it be added to the procstat plugin docs. Usually when we use procstat, we are not interested in just the telegraf process, but all the processes on the system.

srebhan · July 12, 2023, 8:36am

@sarke it would be really cool if you submit a PR against the docs to make this more clear. Would be very much appreciated…
