Use Telegraf to get metrics from Raspberry Pi

Greetings. I have my InfluxDB v2.3 running on my LAN at 192.168.7.11. Woo hoo!

I have a Raspberry Pi on the LAN at 192.168.7.44 and I installed Telegraf on the Pi so that I can monitor the Pi’s metrics in Influx. There are many posts about this, but I thought this one seemed the best:

I followed the instructions and I got nothing! Let me recap the basics and maybe something will jump out for you readers as to something I missed…

  1. Installed Telegraf using sudo apt-get update && sudo apt-get install telegraf; everything went smoothly and I ended up with version 1.24.1
  2. telegraf.conf file is located in /etc/telegraf
  3. the only things I changed in telegraf.conf were:
# # Configuration for sending metrics to InfluxDB 2.0
# [[outputs.influxdb_v2]]
#   ## The URLs of the InfluxDB cluster nodes.
#   ##
#   ## Multiple URLs can be specified for a single cluster, only ONE of the
#   ## urls will be written to each interval.
#   ##   ex: urls = ["https://us-west-2-1.aws.cloud2.influxdata.com"]
urls = ["http://192.168.7.11:8086"]
#
#   ## Token for authentication.
token = "---my token here, but kept from prying eyes---"
#
#   ## Organization is the name of the organization you wish to write to.
organization = "h1-org"
#
#   ## Destination bucket to write into.
bucket = "rasp-pi"
#
#   ## The value of this tag will be used to determine the bucket.  If this
#   ## tag is not set the 'bucket' option is used as the default.
  4. I entered the command systemctl start telegraf and it asked me for the password of the Raspberry Pi, which I entered successfully, and it said ==== AUTHENTICATION COMPLETE ===
  5. I got nothing on the Influx dashboard (bucket: rasp-pi), so I typed sudo service telegraf start into the terminal window and still got nothing in Influx.

Since I have been actively populating other buckets in Influx for many months now and have loads of data that I can view, query, etc., I can definitely say Influx is set up fine. The problem seems to be that Telegraf is not sending out the data. What did I miss?

> Greetings. I have my InfluxDB v2.3 running on my LAN at 192.168.7.11. Woo hoo!

Have you confirmed (e.g. with “netstat -lptn”) that it is listening on that IP
address, port 8086 (and not perhaps only listening on localhost 127.0.0.1)?

>   3. the only things I changed in telegraf.conf were:
> # # Configuration for sending metrics to InfluxDB 2.0
> # [[outputs.influxdb_v2]]

Have you genuinely left that line commented out (by the # at the start)?

If so, I suggest deleting the #, try again, and report back here 🙂

Antony.
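
A quick way to check this kind of thing is to print only the lines of the config that Telegraf will actually read, i.e. everything that is not blank or commented out. The sketch below assumes the config lives at /etc/telegraf/telegraf.conf as stated above; the function names active_config_lines and active_sections are made up for illustration.

```shell
# Print only the active (uncommented, non-blank) lines of a Telegraf config.
# Defaults to the path mentioned in this thread; pass another path as $1.
active_config_lines() {
  conf="${1:-/etc/telegraf/telegraf.conf}"
  grep -Ev '^[[:space:]]*(#|$)' "$conf"
}

# List just the section headers that are enabled, e.g. [agent] or
# [[outputs.influxdb_v2]].
active_sections() {
  active_config_lines "$1" | grep -E '^[[:space:]]*\[\[?[A-Za-z0-9_.]+\]?\]'
}
```

Running active_sections with no argument on the Pi should list [[outputs.influxdb_v2]] once its header has been uncommented. If [[outputs.influxdb]] shows up as well, the older v1 output is enabled too.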

Similar to the above, I would check your logs from the Telegraf service to get a better idea of what might be erroring out. It could be a config file mistake or something else.

journalctl --follow --unit=telegraf

OK, I missed removing the # as @Pooh pointed out, but even with that line now uncommented, I still get no data.

I do not have command line access to the Influx machine (192.168.7.11) until tomorrow, so cannot check that it is listening on that IP address.

However, do I need to start Telegraf on the above machine? EDIT: I tried running this command on the Pi, and it says “Error getting HTTP config”, so clearly it is intended for the Influx machine. I will do that on Saturday.

journalctl --follow --unit=telegraf gives the following:

Sep 23 21:39:13 raspberrypi telegraf[18983]: 2022-09-24T01:39:13Z E! [agent] Error writing to outputs.influxdb: could not write any address
Sep 23 21:39:23 raspberrypi telegraf[18983]: 2022-09-24T01:39:23Z W! [outputs.influxdb] Metric buffer overflow; 18 metrics have been dropped
Sep 23 21:39:23 raspberrypi telegraf[18983]: 2022-09-24T01:39:23Z E! [outputs.influxdb] When writing to [http://localhost:8086]: failed doing req: Post "http://localhost:8086/write?db=telegraf": dial tcp [::1]:8086: connect: connection refused

Is the problem with the Pi not being able to send out the metrics, or with the Influx machine not being able to accept them?

EDIT: Answering my own question, the message returned above seems clear enough:

> dial tcp [::1]:8086: connect: connection refused

@Pooh

Here is the output of netstat -lptn:

Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      -
tcp        0      0 127.0.0.1:631           0.0.0.0:*               LISTEN      -
tcp        0      0 127.0.0.1:41943         0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:1880            0.0.0.0:*               LISTEN      7112/node-red
tcp        0      0 0.0.0.0:1883            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:55261           0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:46159           0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      -
tcp        0      0 127.0.0.1:5939          0.0.0.0:*               LISTEN      -
tcp6       0      0 :::22                   :::*                    LISTEN      -
tcp6       0      0 :::8086                 :::*                    LISTEN      -
tcp6       0      0 ::1:631                 :::*                    LISTEN      -
tcp6       0      0 :::3000                 :::*                    LISTEN      -
tcp6       0      0 :::35805                :::*                    LISTEN      -
tcp6       0      0 :::111                  :::*                    LISTEN      -
tcp6       0      0 :::80                   :::*                    LISTEN      -
tcp6       0      0 :::52627                :::*                    LISTEN      -
tcp6       0      0 :::1716                 :::*                    LISTEN      -

Given the above, do I need to change anything?
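
For reading listings like this one, a small filter can report how a given port is bound. This is only a sketch: listen_scope is a hypothetical helper name, and it assumes the column layout that netstat -lptn printed above.

```shell
# Read `netstat -lptn` output on stdin and report how a port is bound.
# Prints one of: all-interfaces, specific-address, loopback-only, not-listening.
listen_scope() {
  port="$1"
  awk -v port="$port" '
    $6 == "LISTEN" {
      n = split($4, parts, ":")          # the port is the text after the last colon
      if (parts[n] != port) next
      addr = substr($4, 1, length($4) - length(parts[n]) - 1)
      if (addr == "0.0.0.0" || addr == "::") wildcard = 1
      else if (addr == "127.0.0.1" || addr == "::1") loopback = 1
      else other = 1
    }
    END {
      if (wildcard) print "all-interfaces"
      else if (other) print "specific-address"
      else if (loopback) print "loopback-only"
      else print "not-listening"
    }'
}
```

Usage would be something like sudo netstat -lptn | listen_scope 8086. For the listing above it prints all-interfaces, because of the :::8086 line. On Linux an IPv6 wildcard listener usually accepts IPv4 connections as well (unless net.ipv6.bindv6only is set), so this output suggests nothing needs to change on the listening side.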

From your original post it sounded like telegraf and influxdb are on different machines. Per the above error message your telegraf is trying to write to InfluxDB on the same local machine, given the IP address is “localhost”. Is that what you intended?

It sounded like you had influxdb running on 192.168.7.11, so your Telegraf config file should have that IP address as well.
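
One detail worth pulling out of the log: each error names the plugin that produced it and the URL it tried to reach. A rough filter like the one below lists those pairs from journal output. failed_write_targets is a hypothetical name, and the pattern assumes the "When writing to [...]" wording seen in the log lines posted above.

```shell
# Read Telegraf journal lines on stdin and print "plugin url" once for each
# distinct output plugin / destination pair that reported a failed write.
failed_write_targets() {
  sed -n 's/.*\[\(outputs\.[A-Za-z0-9_]*\)] When writing to \[\([^]]*\)].*/\1 \2/p' | sort -u
}
```

For example: journalctl --unit=telegraf --no-pager | failed_write_targets. For the log lines posted above this gives outputs.influxdb http://localhost:8086. The plugin named there is outputs.influxdb rather than outputs.influxdb_v2, and /write?db=telegraf is the InfluxDB 1.x write path, which suggests the errors come from a separate v1 output section and not from the influxdb_v2 section that was edited.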

Thanks @jpowers

Let me back up and you can clarify what I should do.

I have a Raspberry Pi at 192.168.7.44. I installed Telegraf on the Pi so that I can monitor the Pi’s metrics in Influx. My telegraf.conf file on the Pi is:

# # Configuration for sending metrics to InfluxDB 2.0
[[outputs.influxdb_v2]]
#   ## The URLs of the InfluxDB cluster nodes.
#   ##
#   ## Multiple URLs can be specified for a single cluster, only ONE of the
#   ## urls will be written to each interval.
#   ##   ex: urls = ["https://us-west-2-1.aws.cloud2.influxdata.com"]
urls = ["http://192.168.7.11:8086"]
#
#   ## Token for authentication.
token = "---my token here, but kept from prying eyes---"
#
#   ## Organization is the name of the organization you wish to write to.
organization = "h1-org"
#
#   ## Destination bucket to write into.
bucket = "rasp-pi"

I have InfluxDB running for many months at 192.168.7.11.

Question: Do I need to have Telegraf installed on the 7.11 machine? I presumed that I did, so I installed that over the weekend.

Question 2: Assuming the answer to #1 above is YES, then do I need to modify the telegraf.conf file on the 7.11 machine? What do I need to modify?

Sorry for such stupid questions, but honestly there are no really good step-by-step tutorials for my situation. They seem to gloss over key steps that I have apparently missed.

These aren’t stupid 🙂

> Question: Do I need to have Telegraf installed on the 7.11 machine? I presumed that I did, so I installed that over the weekend.

No. It is very common to run Telegraf remotely; in fact, I’d say most people have Telegraf running on different machines and then push the data to a single InfluxDB system.

> urls = ["http://192.168.7.11:8086"]

This is what I don’t understand: your config has 192.168.7.11 as the system for influxdb, but the error message clearly said it was trying to send to localhost.

Can you search the config file for localhost and see if anything is commented out?

If that does not find anything, what would really help is the full Telegraf log after a fresh restart of the service.

Thanks!

OK, so after going through my telegraf.conf file from the Pi in depth, I found no references to localhost:8086 that were uncommented. However, I did find that the outputs.influxdb line was NOT commented out, so I fixed that.

Before:
image

After:
image

However, journalctl --follow --unit=telegraf still gives the same output as before:

Sep 26 10:48:57 raspberrypi telegraf[18983]: 2022-09-26T14:48:57Z E! [agent] Error writing to outputs.influxdb: could not write any address
Sep 26 10:49:07 raspberrypi telegraf[18983]: 2022-09-26T14:49:07Z W! [outputs.influxdb] Metric buffer overflow; 18 metrics have been dropped
Sep 26 10:49:07 raspberrypi telegraf[18983]: 2022-09-26T14:49:07Z E! [outputs.influxdb] When writing to [http://localhost:8086]: failed doing req: Post "http://localhost:8086/write?db=telegraf": dial tcp [::1]:8086: connect: connection refused

Not sure if this matters, but when I do ls of /etc/telegraf, I get this:

telegraf.conf  telegraf.conf.sample  telegraf.d

Given that the message from the journalctl command is still the same (connection refused), do I need to do anything on my InfluxDB config file to allow Telegraf to post the data there?

Finally, how do I do a fresh restart of the service?

> However, I did find that the outputs.influxdb line was NOT commented out, so I fixed that.

To be certain, did you also comment out all the config options below it, if any were uncommented in the first place?

> Finally, how do I do a fresh restart of the service?

Run the following to restart the service:

sudo systemctl restart telegraf

You need to do this after any change to the config.
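
As a rough sketch of the full cycle (restart, confirm the service is up, then read what it logged on startup), something like the following could be used. The run and restart_and_show_log names are hypothetical, and the DRY_RUN switch is only there so the steps can be printed without touching the service.

```shell
# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Restart Telegraf, check that it is active, and show the last minute of its log.
restart_and_show_log() {
  run sudo systemctl restart telegraf &&
  run systemctl is-active telegraf &&
  run journalctl --unit=telegraf --since "-1min" --no-pager
}
```

Calling restart_and_show_log after editing the config should show the startup lines, where Telegraf normally reports which inputs and outputs it loaded. If an output you thought was disabled still appears there, the config change has not taken effect.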

@jpowers

We have success! Thank you. As this is my first working instance of Telegraf, I can totally see myself going gangbusters using this on every device possible.

I actually had to reboot the Pi for other reasons, and when it came back it started Telegraf automatically, I believe because I had configured it to run on boot.

Below is the dashboard in Influx. All looks great except for the areas circled in yellow. If I scour the telegraf.conf file, should I be able to find these in the inputs section?

Thanks again to both you and @pooh for helping me along.

Awesome! Glad to hear.

> All looks great except for the areas circled in yellow. If I scour the telegraf.conf file, should I be able to find these in the inputs section?

I’m not totally familiar with that dashboard, but I assume it is data coming from the temp plugin. It might be that your Raspberry Pi model doesn’t expose those metrics.

For the network data, I would look at what interfaces you have (e.g. ip a) and then ensure the telegraf config is set up with those interfaces per the readme. Remember to restart telegraf after a config change.

edit: I did see some other config on the dashboard readme around joining the video group:

sudo usermod -a -G video telegraf
sudo -u telegraf vcgencmd measure_temp

That might get you GPU temps.

CPU & GPU temps were pretty straightforward to get. For those who are curious…

Add this to your telegraf.conf file:

[[inputs.file]]
  files = ["/sys/class/thermal/thermal_zone0/temp"]
  name_override = "cpu_temperature"
  data_format = "value"
  data_type = "integer"

[[inputs.exec]]
  commands = [ "/usr/bin/vcgencmd measure_temp" ]
  name_override = "gpu_temperature"
  data_format = "grok"
  grok_patterns = ["%{NUMBER:value:float}"]

and make these adjustments to the Flux queries:

from(bucket: v.bucket)
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "cpu_temperature")
  |> filter(fn: (r) => r["host"] == v.linux_host)
  |> filter(fn: (r) => r["_field"] == "value")
  |> last()
  |> map(fn: (r) => ({r with _value: float(v: r._value) / 1000.0}))
  |> yield(name: "last")

and

from(bucket: v.bucket)
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["host"] == v.linux_host)
  |> filter(fn: (r) => r["_measurement"] == "gpu_temperature")
  |> filter(fn: (r) => r["_field"] == "value")
  |> last()
  |> yield(name: "last")
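
To sanity-check the raw readings on the Pi before they reach Influx, a couple of small helpers can do the same conversions by hand. This is a sketch with made-up function names. It relies on the sysfs thermal zone file reporting millidegrees Celsius (which is why the CPU query divides by 1000) and on vcgencmd measure_temp printing text of the form temp=47.2'C.

```shell
# Convert a sysfs thermal reading (millidegrees Celsius) to degrees Celsius,
# rounded to one decimal place.
millideg_to_c() {
  awk -v m="$1" 'BEGIN { printf "%.1f\n", m / 1000 }'
}

# Extract the number from vcgencmd output such as: temp=47.2'C
parse_vcgencmd_temp() {
  printf '%s\n' "$1" | sed -n "s/^temp=\([0-9.]*\)'C\$/\1/p"
}
```

On the Pi, millideg_to_c "$(cat /sys/class/thermal/thermal_zone0/temp)" and parse_vcgencmd_temp "$(vcgencmd measure_temp)" should print values close to what the two dashboard cells show.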