Best way to deal with "large" data sets

David_Cohen · May 30, 2017, 4:23pm

I have a streaming app that generates statsd-like statistics. Every couple of minutes it will probably generate a data set of around 50,000 rows. It's essentially a bunch of tags and a couple of values with a timestamp, so it's easy to convert to influx format.

My question is, what's the best way to get this into influx? I figure my options are:

1. Just send unbatched web requests across the network (probably too slow)
2. Send batches of 5000 across the network
3. Send UDP messages to telegraf (1 per row) and let telegraf deal with batching (can it keep up?)
4. Copy everything to a file and use the -import command for influx

Is there a best practice for this?

jackzampolin · May 30, 2017, 5:13pm

@David_Cohen options 2 and 3 are your best bet. We advise batches of 5k-10k field values. Telegraf can definitely keep up; I would suggest using the socket_listener input if you go that route.
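
For illustration, here is a rough Python sketch of the batching idea, not a tested production client. It converts rows to line protocol, groups them into batches, and pushes each batch to Telegraf over TCP. The helper names (`to_line`, `batches`, `send_to_telegraf`) are made up for this example. The host and port are assumptions too: the sketch presumes a socket_listener input configured with something like service_address = "tcp://:8094" and data_format = "influx". It also assumes all field values are plain finite numbers. With a couple of fields per row, 5000 rows per batch works out to roughly 10k field values, so lower `size` if your rows carry more fields.

```python
import socket
from typing import Dict, Iterable, Iterator, List


def _escape(text: str, chars: str) -> str:
    # Line protocol uses a backslash to escape special characters.
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def to_line(measurement: str, tags: Dict[str, str],
            fields: Dict[str, float], ts_ns: int) -> str:
    """Format one row as a line protocol string (numeric fields only)."""
    # Measurement names escape commas and spaces; tag and field keys
    # (and tag values) escape commas, equals signs and spaces.
    name = _escape(measurement, ", ")
    tag_part = "".join(
        f",{_escape(k, ',= ')}={_escape(v, ',= ')}"
        for k, v in sorted(tags.items())  # sorted tag keys are recommended
    )
    field_part = ",".join(
        f"{_escape(k, ',= ')}={float(v)}" for k, v in fields.items()
    )
    # Timestamp is in nanoseconds, the default write precision.
    return f"{name}{tag_part} {field_part} {ts_ns}"


def batches(lines: Iterable[str], size: int = 5000) -> Iterator[List[str]]:
    """Yield lists of at most `size` lines."""
    batch: List[str] = []
    for line in lines:
        batch.append(line)
        if len(batch) >= size:
            yield batch
            batch = []
    if batch:  # flush the final partial batch
        yield batch


def send_to_telegraf(batch: List[str], host: str = "127.0.0.1",
                     port: int = 8094) -> None:
    """Send one batch to a Telegraf socket_listener over TCP.

    host and port are assumptions; match them to your Telegraf config.
    """
    payload = ("\n".join(batch) + "\n").encode("utf-8")
    with socket.create_connection((host, port)) as conn:
        conn.sendall(payload)
```

Usage would be along the lines of `for b in batches(lines): send_to_telegraf(b)`. The same batches could instead be joined with newlines and POSTed to the InfluxDB write endpoint if you would rather skip Telegraf.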
