Upgrade from InfluxDB 1.8 to 2.3 very slow

Hi there,

We are currently trying to upgrade a very large (~11 TB) InfluxDB instance from version 1.8 to 2.3. With a smaller database (~2.2 TB) we had already observed very poor performance during the upgrade process, and we tracked the bottleneck down to reads from the disk holding the v1.8 data. We first tried the automatic upgrade command (influxd upgrade), which took about 19 hours. The influxd process read only about 17 MB/s and did about 130 IOPS, yet according to tools like iostat, dstat and iotop it was 100% bottlenecked by the disk. This suggests that influxd is not upgrading the data in parallel but in a largely serialized fashion: it reads 128 KB chunks and waits for each read to complete before issuing the next one. Disk benchmarks with fio showed much higher performance (1200 MB/s, 20,000 IOPS).

We are running on an Azure VM (64 vCores, 256 GB RAM) with 32 TB premium high-performance SSDs, and we use separate disks for the v1.8 and v2.3 data.

Using the manual upgrade process we observed the exact same poor performance during the ‘influx_inspect export’ step.
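For reference, the export we ran was essentially of this form (the paths and database name are placeholders here, not our real setup):

    influx_inspect export \
      -datadir /var/lib/influxdb/data \
      -waldir /var/lib/influxdb/wal \
      -database mydb \
      -compress \
      -out /mnt/export/mydb.lp.gz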

Is there any way we could reduce the time the upgrade process takes for such a large database? We are trying to minimize the downtime of our system. Extrapolating from the performance numbers, we expect the upgrade from 1.8 to 2.3 to take at least 4 days, which is not acceptable. Is it even recommended to keep that much data in a single InfluxDB instance, or should we try to split it up?

Could you post the steps for the upgrade, if you have them?

I’ve never performed the migration from 1.x to 2.x, and I was not aware of the “serial” processing, but I think you can work around it by using the “manual” procedure and running several processes at once.

You can parallelize the export yourself by writing a small program/script (with PowerShell or whatever) that starts multiple processes, each exporting a slice of the data with the export command.
The same should be true for the import (in case that’s serial too).

You just need to decide how to “split the data” for each export, and you have just three parameters to use: DB, RP and time (as a from-to date range). That means you can export one DB/RP and one month’s worth of data in each iteration, for example as in the sketch below.
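A minimal sketch of that idea as a bash script; the database name, retention policy, paths and month list are placeholders you would adapt, I haven’t benchmarked it myself, and it assumes GNU date for the end-of-month calculation:

    #!/usr/bin/env bash
    # Sketch: export one month per process, running the exports in parallel.
    # DB, RP, paths and the month list are placeholders -- adapt to your setup.
    DB=mydb
    RP=autogen
    DATADIR=/var/lib/influxdb/data
    WALDIR=/var/lib/influxdb/wal
    OUTDIR=/mnt/export

    MONTHS="2022-01 2022-02 2022-03 2022-04"

    for m in $MONTHS; do
      start="${m}-01T00:00:00Z"
      # first day of the following month as the exclusive end (GNU date)
      end=$(date -u -d "${m}-01 +1 month" +%Y-%m-01T00:00:00Z)
      influx_inspect export \
        -datadir "$DATADIR" -waldir "$WALDIR" \
        -database "$DB" -retention "$RP" \
        -start "$start" -end "$end" \
        -compress \
        -out "$OUTDIR/${DB}_${RP}_${m}.lp.gz" &   # run in the background
    done
    wait   # block until all exports have finished

You would probably want to cap how many exports run at once (e.g. one batch per quarter) rather than launching everything in one go, so the processes don’t just fight each other for the source disk.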

Will it work properly? Will it perform well?
I can’t tell, but I’d be interested to know how it behaves.

Other options to reduce downtime (if feasible):

  • Populate both systems with incoming data; this allows you to “switch” with no downtime later.
  • Once both systems are receiving data, export and import the most recent historical data into the new one (as far back as your requirements demand); see the import sketch after this list.
  • Do the switch by pointing whatever reads the data at the new DB (before doing it, make sure you have the “needed data”, whatever that means: maybe the last 1-2-6 months? I can’t tell).
  • Keep importing all the other historical data in the background until everything has been migrated.
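For the import into 2.x, something along these lines should work; the bucket, org, token and file pattern are placeholders, and it assumes the export files contain only line protocol (e.g. exported with -lponly, if your 1.8 build has that flag), since influx write does not understand the 1.x DDL header:

    # Sketch: import the exported, gzipped line-protocol files into the 2.x instance.
    # Bucket, org, token and file pattern are placeholders.
    for f in /mnt/export/*.lp.gz; do
      influx write \
        --host http://localhost:8086 \
        --org my-org \
        --bucket my-bucket \
        --token "$INFLUX_TOKEN" \
        --compression gzip \
        --file "$f" &
    done
    wait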

Hi Giovanni,

Thanks for the useful hints! We are currently trying the approach of splitting the data into monthly or quarterly chunks, and we are already seeing big performance improvements. We will keep you updated on this topic.

Kind regards,
Moritz