I’ve seen posts that recommend changing some tuning parameters for limiting cache size for high (historical) load jobs.
I’m running some tests that are having issues with memory usage and limited throughput.
I notice when I load a large batch of 24M rows (2-3 fields/row) I can’t get above around 150K rows/second.
This is on an 8-core/16-thread Ryzen 3.8 GHz processor with 32 GB of memory and an NVMe SSD (>1 GB/sec throughput). There are moments of full CPU utilization, but it seems to go in cycles; I’d say the CPU is maxed 50% of the time or less. Memory slowly grows to between 24 and 32 GB, depending on the number of load clients. Once the load completes on the client side (2-4 minutes for 2-8 clients), Influx continues processing (indexing?) data for another few minutes. Eventually, memory goes back down. Sometimes the box runs out of memory, or if I run a basic query right after the load, it causes an OOM error in the logs and crashes Influx.
There are ~6M unique series. I had most of these under a single measurement, but after reading that this can cause performance issues, I spread them out over 600 measurements, which didn’t help much.
Also note that I’m not specifying a timestamp on the input; it appears Influx picks a value at the start of a file load.
I’m using the curl POST method to load the files. I’ve tried 100K and 10K rows/file. Didn’t seem to make much difference. I’ve also tried both index models.
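For reference, the load pattern described above looks roughly like the following, except that this version attaches an explicit timestamp to each row. It is a minimal sketch, assuming the InfluxDB 1.x `/write` HTTP endpoint; the database name `loadtest`, the host and port, and the helper names are placeholders, and the formatter assumes float-only fields and names that need no escaping.

```python
import urllib.request
from itertools import islice


def to_line(measurement, tags, fields, ts):
    """Format one point as line protocol with an explicit timestamp.

    Assumes float-only fields and no characters that need escaping.
    Tags are sorted by key, which the line protocol docs recommend.
    """
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={float(v)}" for k, v in fields.items())
    return f"{measurement}{tag_part} {field_part} {ts}"


def batches(lines, size):
    """Yield newline-joined request bodies of at most `size` lines each."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield "\n".join(chunk).encode("utf-8")


def write_batch(body, url="http://localhost:8086/write?db=loadtest&precision=s"):
    """POST one batch; the URL is a placeholder for a local 1.x instance."""
    req = urllib.request.Request(url, data=body, method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.status  # InfluxDB 1.x answers 204 on a successful write
```

Each body from `batches(...)` corresponds to one of the files sent with curl, so `size` plays the role of the 10K or 100K rows/file setting.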
How do I set up the measurements/series to get 1M rows/sec throughput?
Is there a document that would help identify tuning parameters for historical loads or high sustained throughput?
Daily processing needs to be able to push at minimum 1.3B rows into Influx at 60K/second sustained, or ideally, 14B rows at 655K/second sustained. I’ll need to cluster in production for availability, which would cut my throughput in half or more (according to the docs).
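As a quick sanity check on those targets, the arithmetic can be sketched as below. The function name is made up for illustration. Both targets work out to roughly a six-hour load window, which is an inference from the numbers and not something stated above.

```python
def load_window_hours(rows, rows_per_second):
    """Hours needed to load `rows` at a sustained `rows_per_second`."""
    return rows / rows_per_second / 3600


minimum = load_window_hours(1.3e9, 60_000)   # minimum daily volume and rate
ideal = load_window_hours(14e9, 655_000)     # ideal daily volume and rate
print(f"{minimum:.1f} h, {ideal:.1f} h")     # prints "6.0 h, 5.9 h"
```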
Side question: when I run the stats query against the Influx internal database, the number of series appears correct, but when I run influx_inspect against the TSM files, its estimates are wildly different (much higher and wrong?). What is influx_inspect actually reporting? Or is that an indicator that I’m doing something wrong in my loads?
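One way to get an independent number to compare both reports against is to count unique series keys directly in the input files. This is a rough sketch, assuming a series is identified by measurement plus tag set (as in the InfluxDB 1.x docs) and that the files contain no escaped commas or spaces; the function names are made up for illustration.

```python
def series_key(line):
    """Return a canonical series key: measurement plus sorted tag set."""
    # Everything before the first space is the measurement and its tags.
    key = line.split(" ", 1)[0]
    measurement, *tags = key.split(",")
    return ",".join([measurement] + sorted(tags))


def count_series(lines):
    """Count distinct series keys, skipping blank and comment lines."""
    return len({
        series_key(line)
        for line in lines
        if line.strip() and not line.startswith("#")
    })
```

For example, `count_series(open("batch.txt"))` would give the expected series count for a single load file, where `batch.txt` is a placeholder name.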