Slow down Telegraf ingestion

I am running Telegraf and InfluxDB as Docker containers, with the Kafka consumer as the Telegraf input plugin and InfluxDB as the output plugin. Kafka has many topics, and data arrives on them every 5 minutes. My configuration is as follows:

metric_batch_size = 16000
metric_buffer_limit = 300000
offset = "oldest"
max_undelivered_messages = 50000
input_interval = "1s"
output_interval = "1s"

PS: I have set the interval to “1s” separately in both the input and the output plugin, which is why I wrote them above as input_interval and output_interval.

Because the data retention period in Kafka is 1 week, metrics get dropped whenever I restart Telegraf. Is there any way to slow down the ingestion rate of metrics?

Thanks

The Kafka Consumer input plugin is an event-based plugin. This means it does not run on an interval; instead, it opens a connection to Kafka and listens for the events Kafka sends to it.

One option you are currently setting is the offset. By using “oldest”, you are going to be replaying older data each time you connect. That may or may not be what you want.

Neither of these are valid settings as far as I am aware:

input_interval = "1s"
output_interval = "1s"

Hi, I have set an interval on both plugins, which is why I wrote them here as input_interval and output_interval.

Suppose I want to re-ingest data starting from some date in between, for example the data for the past 2 days. Is there any way to do that? And if there is a configuration for it, how can I limit the ingestion rate of the incoming metrics, since the metrics for the past 2 days will all arrive at once?

Thanks

I want to ensure you understand that an interval does not apply to the kafka consumer. Those settings are not valid in any case.

Per the readme, there are currently only two settings for offset, “oldest” and “newest”.
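For reference, here is a minimal sketch of where these settings usually live in telegraf.conf, using the values from the original post. The broker address, topic name, consumer group, data format, InfluxDB URL, and database name are placeholders I made up for illustration, not values from this thread. Note that the Kafka consumer block has no interval, and the output flush cadence is set with the agent's flush_interval.

```toml
[agent]
  ## Collection interval for polling inputs (Telegraf's default).
  ## The event-driven kafka_consumer does not use it.
  interval = "10s"
  ## How often outputs flush buffered metrics.
  flush_interval = "1s"
  metric_batch_size = 16000
  metric_buffer_limit = 300000

[[inputs.kafka_consumer]]
  brokers = ["kafka:9092"]                       # placeholder
  topics = ["example_topic"]                     # placeholder
  consumer_group = "telegraf_metrics_consumers"  # placeholder
  ## Initial offset position; "oldest" or "newest".
  offset = "oldest"
  max_undelivered_messages = 50000
  data_format = "influx"                         # assumption

[[outputs.influxdb]]
  urls = ["http://influxdb:8086"]                # placeholder
  database = "telegraf"                          # placeholder
```

Since offset only accepts those two values, this sketch has no setting for starting from a specific date.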

OK, got it. So is there any way to slow down the incoming metrics when my offset is “oldest”? Metrics are getting dropped in Telegraf, and I am not sure whether the dropped metrics are also present in InfluxDB. Is there a solution for this, or a calculation relating the Telegraf buffer limit to the Kafka max_undelivered_messages setting, so that we can pick optimal values based on the number of incoming metrics?

Thanks

I am not aware of a mechanism to slow down the initial load of lots of metrics with Telegraf alone.

If you want to try to load everything from the oldest, then you will want to tune Telegraf’s metric buffer and batch size parameters. This is not an exact science and requires you to understand how many metrics you are collecting.

If you have your logs, you can see how many metrics were collected and how many were dropped during the initial read. You could increase these values for the initial load and then reduce them again once your data has been moved over.
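One rough way to reason about the sizing, offered as a sketch rather than an official formula: as I read the plugin README, the consumer only reads new messages while fewer than max_undelivered_messages are still waiting to be written by the output. The metrics in flight from this input are then bounded by roughly max_undelivered_messages times the number of metrics per message, and keeping that product below metric_buffer_limit should keep the buffer from overflowing. The figure of 10 metrics per message below is hypothetical, so check your own payloads.

```toml
## Fragment only; brokers, topics, and outputs are omitted.
[agent]
  flush_interval = "1s"
  metric_batch_size = 16000
  ## Keep this above: max_undelivered_messages x metrics per message
  ## (1600 x 10 = 16000 in this sketch).
  metric_buffer_limit = 300000

[[inputs.kafka_consumer]]
  offset = "oldest"
  ## Assumption: about 10 metrics per Kafka message (hypothetical).
  ## metric_batch_size / metrics per message = 16000 / 10 = 1600,
  ## so roughly one full batch of messages is in flight at a time.
  max_undelivered_messages = 1600
```

Under the same hypothetical 10 metrics per message, the original value of 50000 would allow up to 500000 metrics in flight against a buffer of 300000, which would be consistent with the drops described. Treat the numbers as a starting point and compare them against the collected and dropped counts in your logs.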