Running Telegraf as a service on Linux

Hello InfluxCommunity,

I am trying to get Telegraf to run as a service on my Ubuntu server. It runs fine when I launch the binary directly from its path, but it fails when I start it as a service.

Telegraf works when it is run from this path:
/home/user/telegraf/usr/bin/telegraf -config /home/user/telegraf/etc/telegraf/telegraf.conf -config-directory /home/user/telegraf/etc/telegraf/telegraf.d &

I tried editing the ExecStart path in the service file to match the command above, but it still fails to run as a service.

Any ideas?

Details of the telegraf service unit file

user@ubuntu-20:~$ cat /etc/systemd/system/telegraf.service
[Unit]
Description=The plugin-driven server agent for reporting metrics into InfluxDB
Documentation=https://github.com/influxdata/telegraf
After=network.target

[Service]
EnvironmentFile=-/etc/default/telegraf
User=telegraf
ExecStart=/usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d $TELEGRAF_OPTS
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
RestartForceExitStatus=SIGPIPE
KillMode=control-group

[Install]
WantedBy=multi-user.target

systemctl status telegraf error:

user@ubuntu-20:~$ sudo systemctl status telegraf
● telegraf.service - The plugin-driven server agent for reporting metrics into InfluxDB
     Loaded: loaded (/etc/systemd/system/telegraf.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Wed 2021-12-01 20:41:50 EST; 5min ago
       Docs: https://github.com/influxdata/telegraf
    Process: 256986 ExecStart=/usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d $TELEGRAF_OPTS (code=exited, status=217/USER)
   Main PID: 256986 (code=exited, status=217/USER)

Dec 01 20:41:50 ubuntu-20 systemd[1]: telegraf.service: Scheduled restart job, restart counter is at 5.
Dec 01 20:41:50 ubuntu-20 systemd[1]: Stopped The plugin-driven server agent for reporting metrics into InfluxDB.
Dec 01 20:41:50 ubuntu-20 systemd[1]: telegraf.service: Start request repeated too quickly.
Dec 01 20:41:50 ubuntu-20 systemd[1]: telegraf.service: Failed with result 'exit-code'.
Dec 01 20:41:50 ubuntu-20 systemd[1]: Failed to start The plugin-driven server agent for reporting metrics into InfluxDB.
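
The status=217/USER shown above is the exit code systemd reports when it cannot set up the credentials for the account named in User=. A minimal sketch for checking whether that account exists, assuming the unit's User=telegraf; the helper name user_exists is made up for illustration:

```shell
# Hypothetical helper: succeed if a local account with this name exists.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# The unit above sets User=telegraf; substitute whatever your unit names.
if user_exists telegraf; then
    echo "telegraf account exists"
else
    echo "telegraf account is missing; systemd cannot switch to it"
fi
```

If the account turns out to be missing, either create it or point User= at an account that does exist and can read the configuration files.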

Is there anything enlightening in the systemd logs with journalctl?

Hi Franky1,

Is this the output you want to see? Not sure if I am using journalctl correctly.

user@ubuntu-20:~$ journalctl -f -u telegraf.service
Hint: You are currently not seeing messages from other users and the system.
      Users in groups 'adm', 'systemd-journal' can see all messages.
      Pass -q to turn off this notice.
-- Logs begin at Wed 2020-10-14 11:54:18 EDT. --
2021-12-06T14:12:48Z E! Output [wavefront] unexpected type: string, with value: 92 days, 20:30, for: system.uptime.format
2021-12-06T14:12:48Z E! Output [wavefront] unexpected type: string, with value: 92 days, 20:35, for: system.uptime.format
2021-12-06T14:12:48Z E! Output [wavefront] unexpected type: string, with value: 92 days, 20:40, for: system.uptime.format
2021-12-06T14:12:48Z E! Output [wavefront] unexpected type: string, with value: 92 days, 20:45, for: system.uptime.format
2021-12-06T14:12:48Z E! Output [wavefront] unexpected type: string, with value: 92 days, 20:50, for: system.uptime.format
2021-12-06T14:12:48Z E! Output [wavefront] unexpected type: string, with value: 92 days, 20:55, for: system.uptime.format
2021-12-06T14:12:48Z E! Output [wavefront] unexpected type: string, with value: 92 days, 21:00, for: system.uptime.format
2021-12-06T14:12:48Z E! Output [wavefront] unexpected type: string, with value: 92 days, 21:05, for: system.uptime.format
2021-12-06T14:12:48Z E! Output [wavefront] unexpected type: string, with value: 92 days, 21:10, for: system.uptime.format
2021-12-06T14:12:48Z E! Output [wavefront] unexpected type: string, with value: 92 days, 21:15, for: system.uptime.format
2021-12-06T14:12:48Z E! Output [wavefront] unexpected type: string, with value: 92 days, 21:20, for: system.uptime.format
2021-12-06T14:12:48Z E! Output [wavefront] unexpected type: string, with value: 92 days, 21:25, for: system.uptime.format
2021-12-06T14:12:48Z E! Output [wavefront] unexpected type: string, with value: 92 days, 21:30, for: system.uptime.format
2021-12-06T14:12:48Z E! Output [wavefront] unexpected type: string, with value: 92 days, 21:35, for: system.uptime.format
2021-12-06T14:12:48Z E! Output [wavefront] unexpected type: string, with value: 92 days, 21:40, for: system.uptime.format
2021-12-06T14:12:48Z E! Output [wavefront] unexpected type: string, with value: 92 days, 21:45, for: system.uptime.format
2021-12-06T14:12:48Z E! Output [wavefront] unexpected type: string, with value: 92 days, 21:50, for: system.uptime.format
2021-12-06T14:12:48Z E! Output [wavefront] unexpected type: string, with value: 92 days, 21:55, for: system.uptime.format
2021-12-06T14:12:48Z E! Output [wavefront] unexpected type: string, with value: 92 days, 22:00, for: system.uptime.format
2021-12-06T14:12:48Z E! [agent] Error writing to outputs.http: when writing to [https://10.229.42.57/arc/default/metric] received status code: 400
2021-12-06T14:13:02Z E! Output [wavefront] unexpected type: string, with value: 93 days,  5:00, for: system.uptime.format

There is a difference between:

/home/user/telegraf/usr/bin/telegraf
-config /home/user/telegraf/etc/telegraf/telegraf.conf
-config-directory /home/user/telegraf/etc/telegraf/telegraf.d

and

ExecStart=
/usr/bin/telegraf
-config /etc/telegraf/telegraf.conf
-config-directory /etc/telegraf/telegraf.d
$TELEGRAF_OPTS

I strongly suspect that the two configuration files have different content (the
first one is probably correct, since you say this works, the second one is not,
since it produces the data format error message).
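
One quick way to test this suspicion is to compare the two files directly. This is only a sketch: the helper name compare_configs is invented for illustration, and the paths are the ones quoted in the two command lines above.

```shell
# Hypothetical helper: compare two Telegraf config files and report the result.
compare_configs() {
    a=$1
    b=$2
    if [ ! -r "$a" ] || [ ! -r "$b" ]; then
        echo "cannot read one of: $a $b"
        return 2
    fi
    if cmp -s "$a" "$b"; then
        echo "identical"
    else
        echo "different"
        # Show the differing lines; diff exits 1 when files differ, which is expected here.
        diff -u "$a" "$b" || true
    fi
}

# Paths taken from the two command lines quoted above.
compare_configs /home/user/telegraf/etc/telegraf/telegraf.conf /etc/telegraf/telegraf.conf || true
```

The same comparison can be repeated for any files under the two telegraf.d directories.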

Antony.

Hi Antony,

I tried editing the telegraf.service unit file to use the same configuration files that work from the original path, and I am still getting the error message.

Is there something wrong with my telegraf.service unit file?

[Unit]
Description=The plugin-driven server agent for reporting metrics into InfluxDB
Documentation=https://github.com/influxdata/telegraf
After=network.target

[Service]
EnvironmentFile=-/etc/default/telegraf
User=telegraf
ExecStart=/home/user/telegraf/usr/bin/telegraf -config /home/user/telegraf/etc/telegraf/telegraf.conf -config-directory /home/user/telegraf/etc/telegraf/tele>
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
RestartForceExitStatus=SIGPIPE
KillMode=control-group

[Install]
WantedBy=multi-user.target

Thank you Antony, for pointing me in the right direction.

I’m able to get it to run as a service now, but service telegraf status shows that the service has failed, while journalctl -fu telegraf shows that the telegraf service is running.

user@dtc-ubuntu-20:~$ service telegraf status
● telegraf.service - The plugin-driven server agent for reporting metrics into InfluxDB
     Loaded: loaded (/etc/systemd/system/telegraf.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Mon 2021-12-06 16:16:49 EST; 4min 25s ago
       Docs: https://github.com/influxdata/telegraf
    Process: 1148 ExecStart=/home/user/telegraf/usr/bin/telegraf -config /home/user/telegraf/etc/telegraf/telegraf.conf -config-directory /home/user/telegraf/etc/telegraf/telegraf.d & $TELEGRAF_OPTS (code=exited, status=2)
   Main PID: 1148 (code=exited, status=2)
user@dtc-ubuntu-20:~$ cat /etc/systemd/system/telegraf.service
[Unit]
Description=The plugin-driven server agent for reporting metrics into InfluxDB
Documentation=https://github.com/influxdata/telegraf
After=network.target

My EnvironmentFile and User settings were incorrect in the original telegraf.service unit file. The edited telegraf.service unit file is below.

[Service]
EnvironmentFile=-/home/user/telegraf/usr/bin/telegraf
User=user
ExecStart=/home/user/telegraf/usr/bin/telegraf -config /home/user/telegraf/etc/telegraf/telegraf.conf -config-directory /home/user/telegraf/etc/telegraf/telegraf.d & $TELEGRAF_OPTS
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
RestartForceExitStatus=SIGPIPE
KillMode=control-group

[Install]
WantedBy=multi-user.target
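
A note on the ExecStart line above: systemd does not run ExecStart through a shell, so the & is not a background operator there and is handed to telegraf as a literal argument. That may be why status still reports a failure with status=2, although this is a guess from the output rather than something confirmed in the thread. Separately, EnvironmentFile= expects a text file of VAR=VALUE lines rather than the telegraf binary. A minimal sketch for spotting a stray & in a command line; the helper name has_standalone_ampersand is made up for illustration:

```shell
# Hypothetical helper: succeed if a command line contains a standalone '&' token.
has_standalone_ampersand() {
    printf '%s\n' "$1" | grep -Eq '(^|[[:space:]])&([[:space:]]|$)'
}

# ExecStart value from the unit above (single-quoted so the shell leaves $TELEGRAF_OPTS alone).
cmdline='/home/user/telegraf/usr/bin/telegraf -config /home/user/telegraf/etc/telegraf/telegraf.conf -config-directory /home/user/telegraf/etc/telegraf/telegraf.d & $TELEGRAF_OPTS'

if has_standalone_ampersand "$cmdline"; then
    echo "ExecStart contains '&'; systemd does not treat it as a background operator"
else
    echo "no standalone '&' found"
fi
```

Removing the & from ExecStart and then running sudo systemctl daemon-reload followed by sudo systemctl restart telegraf would be the next thing to try; systemd keeps the process in the background on its own.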