Some data not written by Telegraf into InfluxDB after Telegraf is restarted

Hello all,

I am running an experiment on Telegraf reliability. The goal is to prove that all messages sent from the broker to Telegraf still exist and stay buffered if the Telegraf service is restarted.

I have set qos = 2, max_undelivered_messages = 100, persistent_session = true, and client_id = "telegraf_1" in [[inputs.mqtt_consumer]]. According to the debug output, the mqtt_consumer input does receive the messages, but some data is lost when writing to InfluxDB. For example, the messages buffered by the broker are 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, but Telegraf only writes 21, 25, 29, 30 to InfluxDB after the Telegraf service is restarted.

Does anybody know how to solve this?

Thank you all

It is possible that this is because the client that published the messages didn’t have qos = 2. If you are test publishing with mosquitto_pub make sure you have set -q 2.

Hi @daniel, thank you for your answer. All publishers are already set to -q 2, so the MQTT consumer still receives every message, and the InfluxDB output plugin handles all of the data (I checked the debug messages and also printed every ingested point to a file). Is it possible that InfluxDB loses some data because of the high ingestion rate from Telegraf? If so, do you know how to solve it?

thank you

I think I wrote the wrong title; it should actually be “InfluxDB loses data during high-rate writes from Telegraf”. Does anyone know how to solve this? Thank you all.

I am also curious about one thing. If InfluxDB goes down, Telegraf can write to InfluxDB at a high rate once it is back online, without losing any data. On the other hand, if Telegraf goes down, it can capture all messages from the broker once it is back online and write them to InfluxDB at a high rate, but some of the data is lost in InfluxDB. Is there any solution for my problem?

thank you

I’m reminded now that unlike some of the other messaging systems, MQTT doesn’t allow for completely durable message handling. This is because a message is marked sent as soon as it is delivered to Telegraf, and there is no mechanism to “ack” the message after it has been processed.

This means that up to max_undelivered_messages messages could in theory be lost each time you restart Telegraf, but only if InfluxDB is down at the time or if Telegraf is forcefully closed.

In the case that InfluxDB is up and you restart Telegraf cleanly, Telegraf should pass all messages it has read on to InfluxDB without losing any messages.

I’m testing with a simple publish loop on one of my systems:

for x in {0..1000}; do mosquitto_pub -t 'telegraf/foo' -q 2 -m "foo value=$x"; sleep 1; done

While this is running I can send SIGHUP to Telegraf, and so far all messages are accounted for in InfluxDB.
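To check whether every message really is accounted for, it can help to compare the values that reached InfluxDB against the range that was published. Here is a minimal sketch of such a check. The helper name find_missing is hypothetical, and it assumes you have already exported the stored values as plain integers, one per line (for example from a query on the value field); how you export them is up to you.

```shell
# find_missing FIRST LAST
# Hypothetical helper: reads the observed integer values on stdin (one per
# line) and prints every value in FIRST..LAST that was not observed.
find_missing() {
  first=$1
  last=$2
  seen=$(sort -n -u)   # observed values, de-duplicated
  x=$first
  while [ "$x" -le "$last" ]; do
    # -x matches whole lines only, so 2 does not match 21
    if ! printf '%s\n' "$seen" | grep -qx -- "$x"; then
      echo "$x"
    fi
    x=$((x + 1))
  done
}

# Example using the values reported in the original question:
# published 21..30, but only 21, 25, 29, 30 were written.
printf '%s\n' 21 25 29 30 | find_missing 21 30
# prints 22 23 24 26 27 28, one per line
```

With the publish loop above you would call it as find_missing 0 1000 on the exported values. Empty output means nothing was dropped.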