Telegraf on Kubernetes in GCP fails to start with "su-exec: telegraf: Operation not permitted" error

I’ve installed Telegraf 1.25.2 using the telegraf 1.8.26 helm chart (helm/influxdata) in my Google Cloud Platform (GCP) Kubernetes cluster, but startup fails with a “su-exec: telegraf: Operation not permitted” error.

The only warning I saw during installation was about Autopilot setting default resource requests for the pod because none were specified. It didn’t seem related to the su-exec error.

Here is the command I ran to deploy the chart and its output:

$ helm upgrade --install telegraf influxdata/telegraf
Release "telegraf" does not exist. Installing it now.
W0308 13:42:40.493507   98179 warnings.go:70] Autopilot set default resource requests for Deployment metrics/telegraf, as resource requests were not specified. See http://g.co/gke/autopilot-defaults
NAME: telegraf
LAST DEPLOYED: Wed Mar  8 13:42:36 2023
NAMESPACE: metrics
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
To open a shell session in the container running Telegraf run the following:

  kubectl exec -i -t --namespace metrics $(kubectl get pods --namespace metrics -l app.kubernetes.io/name=telegraf -o jsonpath='{.items[0].metadata.name}') /bin/sh

To view the logs for a Telegraf pod, run the following:

  kubectl logs -f --namespace metrics $(kubectl get pods --namespace metrics -l app.kubernetes.io/name=telegraf -o jsonpath='{ .items[0].metadata.name }')

The kubectl commands afterwards fail because no pod is successfully started:

$ kubectl exec -i -t --namespace metrics $(kubectl get pods --namespace metrics -l app.kubernetes.io/name=telegraf -o jsonpath='{.items[0].metadata.name}') /bin/sh
kubectl exec POD COMMAND is DEPRECATED and will be removed in a future version. Use kubectl exec POD -- COMMAND instead.
error: unable to upgrade connection: container not found ("telegraf")

$ kubectl logs -f --namespace metrics $(kubectl get pods --namespace metrics -l app.kubernetes.io/name=telegraf -o jsonpath='{ .items[0].metadata.name }')
su-exec: telegraf: Operation not permitted

When googling for su-exec errors I only found Docker issues, and they didn’t seem related to Kubernetes.

Has anyone seen this error in k8s before?

The error comes from our Alpine entrypoint.sh, which calls su-exec when the container starts as root. Are you certain you can run containers as root?
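
To make the branch concrete, here is a simplified sketch of how a root-dropping entrypoint typically decides what to do. It is not the actual Telegraf script: the function name launch_mode is made up for illustration, and the non-root branch is an assumption about how such scripts usually behave. The point is that the su-exec path is only taken for UID 0, and su-exec then has to change user and group IDs, which a restricted cluster may refuse.

```shell
#!/bin/sh
# Simplified sketch of a root-dropping entrypoint (NOT the real entrypoint.sh).
# launch_mode is a hypothetical helper that prints which path would be taken
# for a given numeric UID.
launch_mode() {
    uid="$1"
    if [ "$uid" -eq 0 ]; then
        # Started as root: hand off to the telegraf user via su-exec.
        # This is the step that fails with "Operation not permitted"
        # when the container is not allowed to switch users.
        echo "su-exec telegraf"
    else
        # Assumption: when already non-root, the command is run directly
        # and su-exec is never invoked.
        echo "exec directly"
    fi
}

# Show which path the current user would take.
launch_mode "$(id -u)"
```

If the container is started as a non-root user in the first place, the su-exec branch is skipped entirely.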

Interesting question. I’m a Kubernetes newbie and followed the install instructions on the helm chart page above. Are there instructions for modifying the chart or deployment so it doesn’t run as root?

For anyone else running in GCP, I had to add this to my Telegraf values.yaml to get k8s to run the container as the “telegraf” user (uid=999, gid=999):

containerSecurityContext:
  runAsUser: 999
  runAsGroup: 999
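
A rough sketch of applying that override, in case it helps. The file name telegraf-values.yaml and the helper names write_values, deploy, and check_pods are my own placeholders; the release name, chart, namespace, and label selector are taken from the helm output earlier in this thread, so adjust them if yours differ.

```shell
#!/bin/sh
# write_values: write the security-context override to the given file.
write_values() {
    cat > "$1" <<'EOF'
containerSecurityContext:
  runAsUser: 999
  runAsGroup: 999
EOF
}

# deploy: install or upgrade the release using the override file.
# Release "telegraf", chart "influxdata/telegraf", and namespace "metrics"
# come from the output above; they are assumptions for any other setup.
deploy() {
    helm upgrade --install telegraf influxdata/telegraf \
        --namespace metrics -f "$1"
}

# check_pods: list the Telegraf pods to see whether they reach Running.
check_pods() {
    kubectl get pods --namespace metrics -l app.kubernetes.io/name=telegraf
}

# Example usage (needs helm and kubectl pointed at your cluster):
#   write_values telegraf-values.yaml && deploy telegraf-values.yaml && check_pods
```

If you already have a values file, adding the three YAML lines to it and passing it with -f should have the same effect.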