Messages lost between Telegraf and InfluxDB

Hello,

I have a small project at home to collect data from MQTT (solar information) and send it to InfluxDB.
I expect Telegraf to buffer this data while the InfluxDB server (my laptop) is offline, then deliver it once the server is back.

When everything is up and running there is no issue: Telegraf sends the messages, InfluxDB receives them, and my Grafana dashboard works well.

But as soon as InfluxDB goes down and comes back up, I lose all messages sent in the meantime.

My Telegraf config is not complex, and I don’t see how I can do better, nor have I found any documentation that would help me.

Any link / help would be welcome.

Regards
Pierre

[[outputs.influxdb_v2]]
  metric_buffer_limit = 100000
  urls = ["http://$INFLUX_SERVER:8086"]
  token = "$INFLUX_TOKEN"
  organization = "$GRAFANA_ORG_NAME"
  bucket = "$INFLUX_BUCKET"
  metric_batch_size = 200

@pmithrandir The metric_buffer_limit is the setting you’d use to buffer unsuccessful writes, and you do have it in your config, but it’s not a plugin option, it’s a Telegraf agent option (the same goes for metric_batch_size). Try this:

[agent]
  metric_buffer_limit = 100000
  metric_batch_size = 200

[[outputs.influxdb_v2]]
  urls = ["http://$INFLUX_SERVER:8086"]
  token = "$INFLUX_TOKEN"
  organization = "$GRAFANA_ORG_NAME"
  bucket = "$INFLUX_BUCKET"

Now just make sure that you won’t accumulate more than 100000 metrics while InfluxDB is offline; if you do, you will lose any data beyond that limit. The side effect of a high limit is increased resource usage by the Telegraf process, so make sure whatever hardware you’re using to run Telegraf has the memory to support the buffer.
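To size the buffer, a quick back-of-the-envelope calculation helps (just a sketch; the 88-metrics-per-minute rate below is an example, substitute your own):

```python
# Rough estimate of how long a metric buffer covers an InfluxDB outage.
def outage_coverage_seconds(buffer_limit: int, metrics_per_second: float) -> float:
    """Seconds of downtime the buffer can absorb before Telegraf drops metrics."""
    return buffer_limit / metrics_per_second

# Example: ~88 metrics per minute against a 100000-metric buffer.
rate = 88 / 60  # metrics per second
print(outage_coverage_seconds(100_000, rate) / 3600)  # ≈ 18.9 hours of coverage
```

If the coverage comes out shorter than your longest expected outage, raise the limit (within available memory) or lower the collection rate.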

Hello,

Thank you for the answer.
I followed this doc, and I thought these parameters could be set on the output plugin: telegraf/docs/CONFIGURATION.md at master · influxdata/telegraf · GitHub

It’s still somehow not working great.

I got some metrics, but most just disappeared.
I’m going to reduce the reporting rate of my solar panel DTU from every 5 seconds to every 30 seconds to see if that helps.

Is there any way to know how many messages pass through Telegraf? Any monitoring tool?

I have a lot of difficulty finding my way around the docs… sorry.

Pierre

To monitor Telegraf itself, enable the internal input plugin, which reports Telegraf’s own metrics.
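As an illustration of what this gives you (measurement and field names below are the internal plugin’s defaults; double-check against your Telegraf version), the plugin emits an `internal_write` measurement per output with fields such as `buffer_size`, `buffer_limit`, `metrics_written`, and `metrics_dropped`, which you could query in Flux:

```flux
from(bucket: "telegraf")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "internal_write")
  |> filter(fn: (r) => r._field == "metrics_dropped" or r._field == "buffer_size")
```

A rising `metrics_dropped` count while InfluxDB is down tells you the buffer overflowed.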

Hello,

I managed to enable the internal input plugin.

On Grafana, I ended up using this dashboard: GitHub - RobertoChiosa/grafana-dashboard-telegraf: A public dashboard for Telegraf metrics monitoring using Grafana
(just the bucket name to update)

Here is my telegraf.conf content, if someone wants an example:

# Read metrics from MQTT topic(s)
[[inputs.mqtt_consumer]]
  ## Broker URLs for the MQTT server or cluster.  To connect to multiple
  ## clusters or standalone servers, use a separate plugin instance.
  ##   example: servers = ["tcp://localhost:1883"]
  ##            servers = ["ssl://localhost:1883"]
  ##            servers = ["ws://localhost:1883"]
  servers = ["tcp://$MQTT_SERVER:1883"]
  username = "$MQTT_USER"
  password = "$MQTT_PW"
  
  qos = 1
  max_undelivered_messages = 100000

  ## Persistent session disables clearing of the client session on connection.
  ## In order for this option to work you must also set client_id to identify
  ## the client.  To receive messages that arrived while the client is offline,
  ## also set the qos option to 1 or 2 and don't forget to also set the QoS when
  ## publishing.
  persistent_session = true

  ## If unset, a random client ID will be generated.
  client_id = "telegraf_nas"

  ## Topics that will be subscribed to.
  topics = [
    "solar/+/status/+",
    "solar/+/0/+",
    "solar/+/1/+",
    "solar/+/2/+",
    "solar/+/3/+",
    "solar/+/4/+"
  ]

  data_format = "value"
  data_type = "float"
  tagexclude = ["host", "topic"]

[agent]
  metric_batch_size = 2000
  # around 88 metrics per minute
  metric_buffer_limit = 3000000
  interval = "30s"
  flush_interval = "60s"

[[inputs.internal]]
  collect_memstats = true

[[inputs.mqtt_consumer.topic_parsing]]
  topic = "solar/+/+/+"
  tags = "_/serial/channel/field"

[[processors.pivot]]
  tag_key = "field"
  value_key = "value"

# store it in influx
[[outputs.influxdb_v2]]
  urls = ["http://$INFLUX_SERVER:8086"]
  token = "$INFLUX_TOKEN"
  organization = "$GRAFANA_ORG_NAME"
  bucket = "$INFLUX_BUCKET"
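For anyone reading along, here is roughly what the topic parsing plus pivot does to a metric (the serial number and value below are made up for illustration). With data_format = "value", each MQTT payload arrives as a single field named value, and the pivot processor promotes the field tag into the field name:

```
# before pivot (tags from topic_parsing, single field from data_format = "value")
mqtt_consumer,serial=1161234,channel=0,field=power value=123.4

# after pivot (tag_key = "field", value_key = "value")
mqtt_consumer,serial=1161234,channel=0 power=123.4
```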

Thank you for your help!

Pierre