Metric buffer overflow

I've got a CSV file of 1.7 million records and I want to read it and insert it into an InfluxDB bucket via Telegraf.

This is my Telegraf configuration:

```toml
# Configuration for telegraf agent
[agent]
  ## Default data collection interval for all inputs
  interval = "30s"
  ## Rounds collection interval to 'interval'
  ## ie, if interval="10s" then always collect on :00, :10, :20, etc.
  round_interval = true

  ## Telegraf will send metrics to outputs in batches of at most
  ## metric_batch_size metrics.
  ## This controls the size of writes that Telegraf sends to output plugins.
  metric_batch_size = 100

  ## Maximum number of unwritten metrics per output. Increasing this value
  ## allows for longer periods of output downtime without dropping metrics at the
  ## cost of higher maximum memory usage.
  metric_buffer_limit = 150000

  ## Collection jitter is used to jitter the collection by a random amount.
  ## Each plugin will sleep for a random time within jitter before collecting.
  ## This can be used to avoid many plugins querying things like sysfs at the
  ## same time, which can have a measurable effect on the system.
  collection_jitter = "0s"

  ## Default flushing interval for all outputs. Maximum flush_interval will be
  ## flush_interval + flush_jitter
  flush_interval = "10s"

  ## Jitter the flush interval by a random amount. This is primarily to avoid
  ## large write spikes for users running a large number of telegraf instances.
  ## ie, a jitter of 5s and interval 10s means flushes will happen every 10-15s
  flush_jitter = "0s"
```
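
The CSV is read with Telegraf's file input plugin and the csv parser, along these lines (the path, column names, and timestamp format below are illustrative, not my exact settings):

```toml
[[inputs.file]]
  ## CSV file to read; note that Telegraf re-parses the whole file on
  ## every collection interval. Path and parser options are placeholders.
  files = ["/path/to/data.csv"]
  data_format = "csv"
  ## The first row of the file contains the column names.
  csv_header_row_count = 1
  ## Column holding the timestamp and its layout (Go reference time).
  csv_timestamp_column = "time"
  csv_timestamp_format = "2006-01-02T15:04:05Z07:00"
```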

However, when I execute Telegraf, it logs metric buffer overflow warnings. Some of the data is inserted into the bucket, but the rest is discarded.

How should I change my buffer configuration so that all the data is ingested?

Thanks

Is this a one-time import or something you expect to do repeatedly?

It is a one-time import; afterwards, the idea is to do it repeatedly via Flux.

I do not think Telegraf is a great match for this. It is a much better fit if you are continuously getting new files or want to collect metrics periodically from something.

If you were using InfluxDB v2, I would suggest using the web interface to import the CSV file.

However, with InfluxDB v1, to get going and import a big file, I would use one of these two options:

  • Write a short script in my language of choice (e.g. Python or Go) to read the CSV and send the data to InfluxDB; see the sketch after this list
  • Use something like export csv-to-influx to do the import for me
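
For the first option, a minimal sketch in Python, assuming InfluxDB v1 and the influxdb client package; the host, database, measurement name, and CSV columns (timestamp, sensor, value) are placeholders to adapt:

```python
import csv

from influxdb import InfluxDBClient

# Connection details are placeholders; adjust to your setup.
client = InfluxDBClient(host="localhost", port=8086, database="mydb")

batch = []
with open("data.csv", newline="") as f:
    for row in csv.DictReader(f):
        batch.append({
            "measurement": "readings",          # hypothetical measurement name
            "tags": {"sensor": row["sensor"]},  # hypothetical tag column
            "time": row["timestamp"],           # hypothetical RFC3339 timestamp column
            "fields": {"value": float(row["value"])},
        })
        # Flush in chunks so 1.7 million points are never all held in memory.
        if len(batch) >= 5000:
            client.write_points(batch)
            batch = []

if batch:
    client.write_points(batch)
```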

We will reconsider using Telegraf for this one-time import. We are thinking of using the "influx write" command and adding the header types to the CSV files.
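
Something along these lines, assuming the InfluxDB v2 influx CLI; the bucket, org, file name, and the #datatype annotation are placeholders that have to match the real column layout:

```sh
# The --header flag prepends the type annotation, so the CSV file itself
# does not have to be edited; bucket, org, and file are placeholders.
influx write \
  --bucket my-bucket \
  --org my-org \
  --format csv \
  --file data.csv \
  --header "#datatype measurement,tag,double,dateTime:RFC3339"
```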