Hello, I have 16 years of data to backfill into InfluxDB v2
- The data consists of around 20,000 separate data streams.
- Each data stream has a timestamp and value
- Each data stream is its own measurement, with a single value field for the data.
I initially put something together with Python to select from the data source and insert to InfluxDB.
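For context, the script was roughly along these lines (a minimal sketch, not the exact original: the measurement names, batch size, and helper functions here are my assumptions). It turns each row into a line-protocol point and groups points into batches, where each batch would become one write request to `/api/v2/write`:

```python
from typing import Iterable, Iterator

def to_line_protocol(measurement: str, value: float, ts_ns: int) -> str:
    # One line-protocol point: <measurement> value=<float> <ns timestamp>
    return f"{measurement} value={value} {ts_ns}"

def batches(rows: Iterable[tuple[str, float, int]],
            size: int = 5000) -> Iterator[str]:
    # Rows should arrive sorted by timestamp ascending; each yielded
    # string is the body of one write request to /api/v2/write.
    buf: list[str] = []
    for measurement, value, ts_ns in rows:
        buf.append(to_line_protocol(measurement, value, ts_ns))
        if len(buf) >= size:
            yield "\n".join(buf)
            buf = []
    if buf:
        yield "\n".join(buf)
```

Keeping batches to a few thousand points and writing them in timestamp order means Influx appends to the current shard instead of reopening old ones.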
This was going fine until I stopped the process after backfilling 1 year of data; after restarting Influx, CPU and memory usage would sit at 100%.
- 8 Cores
- 16GB Memory
I saw somewhere that setting the "Shard group duration" to 52 weeks might help; I was also backfilling in descending order, which was not ideal.
So I have started again with a 52-week shard group duration, backfilling in ascending order.
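For anyone trying the same thing, the shard group duration can be set when creating the bucket with the `influx` CLI (the bucket and org names below are placeholders):

```shell
# Create the backfill bucket with a 52-week shard group duration,
# so 16 years of history maps to ~16 shards instead of hundreds.
# --retention 0 means the data is kept forever.
influx bucket create \
  --name backfill \
  --org my-org \
  --retention 0 \
  --shard-group-duration 52w
```

With the default 7-day shard group duration, 16 years of data would create over 800 shards, each with its own TSM files and index, which is presumably part of the memory pressure.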
Would anyone have any ideas about the CPU and memory issues after attempting this?
I was getting this error in the InfluxDB logs:
fatal error: out of memory allocating heap arena metadata