Hello all,
I’m hoping someone can help with an issue I’m having with Telegraf 1.20.4 in a Docker container, which is only picking up a single process.
I am already running 3 other instances of Telegraf in my homelab (pushing to an InfluxDB): 2 Linux (Ubuntu Server) and 1 Windows server. All of these are working as expected and capturing all processes.
I now have an Alpine Linux server running Docker. I followed the instructions from the Docker page and the GitHub page, and everything else seems to be captured correctly. I did have a permissions issue with the docker input, but found a blog post explaining why and providing a fix (Docker: Run Telegraf as non-root | InfluxData).
The relevant parts of my telegraf.conf are as follows (note: this is pretty much a direct copy from my other 2 linux servers which are functioning correctly):
[[inputs.procstat]]
  exe = "."
  fieldpass = ["cpu_time_system", "cpu_time_user", "cpu_usage", "memory_*", "num_threads", "*pid"]
[[outputs.influxdb]]
  urls = ["http://x.x.x.x:8086"]
  database = "linuxstats"
  [outputs.influxdb.tagdrop]
    influxdb_database = ["*"]
This is my docker run command:
docker run -d \
  --name telegraf \
  --hostname alpinelinux \
  --user telegraf:$(stat -c '%g' /var/run/docker.sock) \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /opt/dockerconfigs/telegraf/telegraf.conf:/etc/telegraf/telegraf.conf:ro \
  -v /:/hostfs:ro \
  -e HOST_ETC=/hostfs/etc \
  -e HOST_PROC=/hostfs/proc \
  -e HOST_SYS=/hostfs/sys \
  -e HOST_VAR=/hostfs/var \
  -e HOST_RUN=/hostfs/run \
  -e HOST_MOUNT_PREFIX=/hostfs \
  telegraf
With this configuration and run command, I only see PID 1 ("init") in the procstat measurement for this server.
I have been investigating and have tried a few other things:
- setting pid_finder = "native" in telegraf.conf. This changes the result, but now only the telegraf process is shown in the procstat measurement.
- changing the docker run command to "--user telegraf:root" to add the telegraf user to the host root group (assuming that there was a permissions issue somewhere relating to the earlier blog post); no change.
- adding "--group-add root" to the docker command (assuming a permissions issue again); no change.
The telegraf logs are not showing any errors and now I’m out of ideas.
I should also point out that I have installed telegraf directly onto the same Alpine linux server (i.e. not in a docker container) and it works just fine as expected.
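One check that might help narrow this down is comparing how many process directories are visible under the container's own /proc versus the mounted /hostfs/proc. The following is only a minimal sketch, assuming a POSIX shell is available inside the container; count_pids is a hypothetical helper name, not part of Telegraf or Docker.

```shell
# Hypothetical helper: count the numeric (per-process) directories
# under a proc-style root. Defaults to /proc when no argument is given.
count_pids() {
    root="${1:-/proc}"
    n=0
    for d in "$root"/[0-9]*; do
        # Each running process appears as a directory named after its PID.
        # If the glob matches nothing it stays literal and fails this test.
        [ -d "$d" ] && n=$((n + 1))
    done
    echo "$n"
}
```

After opening a shell in the container (for example with docker exec -it telegraf sh) and pasting the function, running count_pids /proc and count_pids /hostfs/proc would show whether the host's processes are visible through the mount at all, or only through one of the two paths.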
Can someone please let me know what I’m doing wrong or what I have missed? I’ve been scouring the internet, but no-one else seems to be reporting this problem (that I can find).
Many Thanks
Luke