Telegraf goes down

Why is it that, after configuring Telegraf on a host and displaying its metrics in Grafana, I always find that Telegraf can no longer collect metrics, so each time I have to log in to the host and restart it? I’m doing this every day!
A solution, please!

  1. Which version of Debian have you installed telegraf on?

  2. Which version of telegraf did you install?

  3. How did you install it (as a package, or did you compile from source), and
    where did you get it from (package repository or location of the source code)?

  4. What command/s are you using to restart telegraf?

  5. Does it stop at the same time each day?

  6. Is there anything shown in the telegraf log file to indicate why it is
    stopping?

  7. If there isn’t anything in the log file, have you tried increasing the
    verbosity to see whether it becomes more useful?

Regards,

Antony.

1- Linux debian 3.16.0-4-amd64 ~
2- Telegraf_1.14.1-1
3- Using wget https://dl.influxdata.com/telegraf/releases/telegraf-1.14.1_amd64.deb, then dpkg
4- telegraf restart
5- I’m not sure of that, but I don’t think so
6- No, just: Error writing to outputs.influxdb
7- How can I do that?

The message “Error writing to outputs.influxdb” tells me that telegraf itself
is running, but is unable to write to InfluxDB (which may then later cause
telegraf to stop).

Have you checked whether InfluxDB is running continuously, and has enough disk
space to store what telegraf is sending it?
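
As a rough way to run both checks from the machine in question, here is a minimal Python sketch. It assumes InfluxDB listens on port 8086 (the port in your Telegraf output URL) and that its data lives under /var/lib/influxdb, which is the usual location for a packaged 1.x install but may differ on your server. The function names are my own, not part of any InfluxDB tooling.

```python
import shutil
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and name-resolution failures.
        return False


def free_gib(path):
    """Return the free space, in GiB, on the filesystem that contains path."""
    return shutil.disk_usage(path).free / 2**30
```

For example, `port_is_open("localhost", 8086)` run on the Telegraf host shows whether anything is accepting connections there, and `free_gib("/var/lib/influxdb")` run on the InfluxDB server shows how much room is left (the path is an assumption; adjust it to your data directory).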

Antony.

No, I didn’t, because I really don’t know how!

I have other hosts that send metrics to the same database; if that were the problem, Telegraf would go down on all of them!

Well, at least try logging more information from telegraf so you can see what
happens before it dies.

Edit your telegraf.conf and find the line “# debug = false”.

Change it to, or add underneath it (then you can easily delete it later when
you no longer need it) “debug = true”.

Then restart telegraf, wait for it to die, and look at the end of the logfile
to see what happened just before it died.

Antony.

This is what the log file shows:
[outputs.influxdb] When writing to [http://localhost:8086]: Post [http://localhost:8086]: Post http://localhost:8086/write?db=telegraf: dial tcp [::1]:8086: connect: connection refused

There you go, then - the problem is not telegraf, but InfluxDB. It is
preventing telegraf from writing to it.

What do you find in /var/log/influxdb/influxdb.log ?

Antony.

I don’t have a log file for InfluxDB:

tail: cannot open '/var/log/influxdb/influxdb.log' for reading: No such file or directory

Have you deliberately disabled InfluxDB logging on this machine?

I’m using an almost untouched configuration of InfluxDB on Debian 9.12 and it
generates a log file for me by default.

Antony.

No, I didn’t!
My InfluxDB server is running CentOS 7.

Sorry, but that makes no sense to me.

You told us that you were running telegraf on Debian 3.16.0-4-amd64, and the
error message in the log file tells us it’s trying to write to localhost,
therefore you can’t be running InfluxDB on CentOS.

Please clarify?

Antony.

I have Telegraf running on 10 hosts, and I see this problem on 4 of them: 2 on Debian, 1 on Ubuntu and 1 on CentOS 6. But the InfluxDB server is running on CentOS 7.

In that case you need to investigate the error message you posted earlier:

[outputs.influxdb] When writing to [http://localhost:8086]: Post
[http://localhost:8086]: Post http://localhost:8086/write?db=telegraf: dial
tcp [::1]:8086: connect: connection refused

and find out why telegraf is trying to write to localhost instead of your
remote InfluxDB server.
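
One possible cause, which is an assumption on my part and not something confirmed in this thread: more than one [[outputs.influxdb]] block may be active across your config files, and a block with no uncommented urls line may fall back to a local default. The Python sketch below lists the URLs set by each active block in a config text. The function name is hypothetical and the parser is deliberately simple (it only handles a urls array written on a single line).

```python
import re

# A plugin table header such as [[outputs.influxdb]].
HEADER = re.compile(r'^\s*\[\[(?P<name>[\w.]+)\]\]')
# An uncommented single-line array such as: urls = ["http://host:8086"]
URLS = re.compile(r'^\s*urls\s*=\s*\[(?P<list>[^\]]*)\]')


def influxdb_output_urls(config_text):
    """Return one list of URLs per active [[outputs.influxdb]] block.

    An empty list means the block does not set urls itself.
    """
    blocks = []
    current = None  # URL list of the outputs.influxdb block being read
    for line in config_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip commented-out lines
        header = HEADER.match(line)
        if header:
            if header.group("name") == "outputs.influxdb":
                current = []
                blocks.append(current)
            else:
                current = None  # a different plugin's block starts here
            continue
        found = URLS.match(line)
        if found and current is not None:
            current.extend(re.findall(r'"([^"]+)"', found.group("list")))
    return blocks
```

Feed it the contents of /etc/telegraf/telegraf.conf and of each file in /etc/telegraf/telegraf.d/ (the usual locations for a package install). If any returned list is empty, or you get more blocks than you expected, that block is worth a closer look.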

Antony.

That’s weird, because urls is even set to an OVH server.

Try these commands, so that Telegraf is started now and automatically at every boot:

systemctl enable --now telegraf
systemctl status telegraf

Let me know if that resolves your issue.

No, not yet!
Now I’m getting this error:
Post http://localhost:8086/write?db=telegraf: net/http: request canceled (Client.Timeout exceeded while awaiting headers)

Downgrade the version: install 1.12 first and check. There is a version issue.