OSS InfluxDB 2.5+ crashes after restoring data from backup created with v2.2

Hello,

I’m in the process of migrating InfluxDB from one server to another, and I would like to upgrade from release 2.2 to the current one. I migrate the data by creating a full backup of the running database and restoring it on the new server.
The restore finishes without errors, and the database functions as expected until it is restarted. Releases 2.2, 2.3, and 2.4 keep running after the restart without problems, but versions 2.5 and later crash on startup. The databases run in Docker containers installed from the official images, and the data and config directories are wiped clean before each install.
Here are the last few lines from the failed v2.5 startup log:

influxdb_1  | ts=2023-05-03T21:48:47.083689Z lvl=info msg="failed loading changes (end)" log_id=0h_uGrA0000 service=storage-engine engine=tsm1 op_name="field indices" op_event=end op_elapsed=0.033ms
influxdb_1  | ts=2023-05-03T21:48:47.083976Z lvl=info msg="Opened file" log_id=0h_uGrA0000 service=storage-engine engine=tsm1 service=filestore path=/var/lib/influxdb2/engine/data/f8dc2d3faa5d359f/autogen/61/000000030-000000002.tsm id=0 duration=0.133ms
influxdb_1  | ts=2023-05-03T21:48:47.084196Z lvl=info msg="Opened shard" log_id=0h_uGrA0000 service=storage-engine service=store op_name=tsdb_open index_version=tsi1 path=/var/lib/influxdb2/engine/data/f8dc2d3faa5d359f/autogen/61 duration=5.641ms
influxdb_1  | ts=2023-05-03T21:48:47.095799Z lvl=info msg="index opened with 8 partitions" log_id=0h_uGrA0000 service=storage-engine index=tsi
influxdb_1  | ts=2023-05-03T21:48:47.096556Z lvl=info msg="failed loading changes (start)" log_id=0h_uGrA0000 service=storage-engine engine=tsm1 op_name="field indices" op_event=start
influxdb_1  | ts=2023-05-03T21:48:47.096902Z lvl=info msg="failed loading changes (end)" log_id=0h_uGrA0000 service=storage-engine engine=tsm1 op_name="field indices" op_event=end op_elapsed=0.348ms
influxdb_1  | ts=2023-05-03T21:48:47.098752Z lvl=info msg="Opened file" log_id=0h_uGrA0000 service=storage-engine engine=tsm1 service=filestore path=/var/lib/influxdb2/engine/data/f8dc2d3faa5d359f/autogen/67/000000030-000000002.tsm id=0 duration=0.095ms
influxdb_1  | ts=2023-05-03T21:48:47.099196Z lvl=info msg="Opened shard" log_id=0h_uGrA0000 service=storage-engine service=store op_name=tsdb_open index_version=tsi1 path=/var/lib/influxdb2/engine/data/f8dc2d3faa5d359f/autogen/67 duration=16.056ms
influxdb_1  | ts=2023-05-03T21:48:47.095849Z lvl=info msg="index opened with 8 partitions" log_id=0h_uGrA0000 service=storage-engine index=tsi
influxdb_1  | ts=2023-05-03T21:48:47.100457Z lvl=info msg="failed loading changes (start)" log_id=0h_uGrA0000 service=storage-engine engine=tsm1 op_name="field indices" op_event=start
influxdb_1  | ts=2023-05-03T21:48:47.100699Z lvl=info msg="failed loading changes (end)" log_id=0h_uGrA0000 service=storage-engine engine=tsm1 op_name="field indices" op_event=end op_elapsed=0.244ms
influxdb_1  | ts=2023-05-03T21:48:47.101947Z lvl=info msg="Opened file" log_id=0h_uGrA0000 service=storage-engine engine=tsm1 service=filestore path=/var/lib/influxdb2/engine/data/f8dc2d3faa5d359f/autogen/92/000000080-000000002.tsm id=0 duration=0.078ms
influxdb_1  | ts=2023-05-03T21:48:47.102329Z lvl=info msg="Opened shard" log_id=0h_uGrA0000 service=storage-engine service=store op_name=tsdb_open index_version=tsi1 path=/var/lib/influxdb2/engine/data/f8dc2d3faa5d359f/autogen/92 duration=18.070ms
influxdb_1  | ts=2023-05-03T21:48:47.112353Z lvl=info msg="Open store (end)" log_id=0h_uGrA0000 service=storage-engine service=store op_name=tsdb_open op_event=end op_elapsed=2114.619ms
influxdb_1  | ts=2023-05-03T21:48:47.112699Z lvl=info msg="Starting retention policy enforcement service" log_id=0h_uGrA0000 service=retention check_interval=30m
influxdb_1  | ts=2023-05-03T21:48:47.112971Z lvl=info msg="Starting precreation service" log_id=0h_uGrA0000 service=shard-precreation check_interval=10m advance_period=30m
influxdb_1  | ts=2023-05-03T21:48:47.115854Z lvl=info msg="Starting query controller" log_id=0h_uGrA0000 service=storage-reads concurrency_quota=1024 initial_memory_bytes_quota_per_query=9223372036854775807 memory_bytes_quota_per_query=9223372036854775807 max_memory_bytes=0 queue_size=1024
influxdb_1  | ts=2023-05-03T21:48:47.122720Z lvl=error msg="err finding runs:" log_id=0h_uGrA0000 service=task-executor error="bucket \"_tasks\" not found"
influxdb_1  | ts=2023-05-03T21:48:47.123037Z lvl=fatal msg="could not load existing scheduled runs" log_id=0h_uGrA0000 error="bucket \"_tasks\" not found"
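
Most of the lines above are routine `lvl=info` messages; the two that matter are the `lvl=error` and `lvl=fatal` entries at the end. A minimal sketch for pulling only those out of a saved log, assuming the log was written to a file first (the function name `filter_log_problems` and the file name `influxdb.log` are hypothetical, and the `lvl=` field is the one seen in the output above):

```shell
#!/bin/sh
# Print only error- and fatal-level lines from a saved InfluxDB log.
# Assumes every entry carries a "lvl=<level>" field, as in the log above.
filter_log_problems() {
    grep -E 'lvl=(error|fatal)' "$1"
}

# Usage (hypothetical file name):
#   filter_log_problems influxdb.log
```

For the log above this would leave just the two `bucket "_tasks" not found` lines.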

Restoring the data to v2.4 and then upgrading to 2.5 by following the instructions for Linux (replacing the binary with the new version) results in an identical crash.

I use the following commands to create and restore backup files:
Backup:

influx backup --compression none "$dump_dir_name" -t "$INFLUXDB_BACKUP_TOKEN"

Restore:

influx restore "$backup_file" --full
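
For anyone reproducing this, the two commands can be wrapped in small shell functions. This is only a sketch: the function names are hypothetical, the flags are exactly the ones quoted above, and the existence check before the restore is an added assumption rather than part of the original procedure.

```shell
#!/bin/sh
# Hypothetical wrappers around the backup and restore commands quoted above.
# INFLUXDB_BACKUP_TOKEN must be set in the environment before calling backup_influx.

backup_influx() {
    dump_dir_name=$1
    influx backup --compression none "$dump_dir_name" -t "$INFLUXDB_BACKUP_TOKEN"
}

restore_influx() {
    backup_file=$1
    # Added safety check (assumption): refuse to restore from a path that does not exist.
    if [ ! -e "$backup_file" ]; then
        echo "backup path '$backup_file' does not exist" >&2
        return 1
    fi
    influx restore "$backup_file" --full
}

# Usage (hypothetical path):
#   backup_influx /backups/influx-2.2     # on the old server
#   restore_influx /backups/influx-2.2    # on the new server
```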

Hello @piranha32,
I’m not sure. I’m asking around.
Thanks for your patience.

Thank you. As a new user I was not allowed to attach the full log, but if it would be helpful, I can share it via Dropbox or post it on Pastebin.

Hey @piranha32,

Did you happen to delete an org that had tasks associated with it before taking the backup? There’s a known problem with InfluxDB where tasks linked to a deleted organization are not automatically removed.

If that’s what happened in your case, you might find this GitHub issue helpful: "InfluxDB doesn't start after delete org with tasks" (issue #24067 in influxdata/influxdb).

Hi @felipevallejo!

I created and deleted orgs, but I don’t recall creating any tasks. Unless tasks were created automatically, there shouldn’t be any. I’ll try to dig into tasks on the server, to see if there are any.
Thanks for the link!
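
A rough sketch of how the task check could be done with the `influx` CLI on the old (still running) server. It assumes the CLI is already configured with a token that can read all orgs, and that `influx org list --hide-headers` prints the org ID in the first column followed by the org name; the function name is hypothetical. Tasks left behind by an org that has already been deleted may not show up under any remaining org, so an empty result here would not rule out the issue linked above.

```shell
#!/bin/sh
# List the tasks of every org the configured token can see.
# Assumption: the first column of `influx org list` output is the org ID,
# and the remainder of the line is the org name.
list_tasks_per_org() {
    influx org list --hide-headers | while read -r org_id org_name; do
        # Skip blank lines in the CLI output
        [ -n "$org_id" ] || continue
        echo "== org $org_name ($org_id) =="
        # </dev/null keeps the inner command from consuming the loop's input
        influx task list --org-id "$org_id" </dev/null
    done
}

# Usage:
#   list_tasks_per_org
```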