Data "loss" after migrating from v1.8 to v2.0.3

Hi there,

We just migrated from v1.8 to v2.0.3 (OSS), following the instructions in the manual: Upgrade from InfluxDB 1.x to InfluxDB 2.0 | InfluxDB OSS 2.0 Documentation.
According to upgrade.log everything went fine and InfluxDB is ready to serve. The Web UI works as expected, but for some reason we are not able to query “old” data, i.e. data that was imported from v1.8. When we submit a query in the Web UI with a time range that covers the old data, the result is simply empty. But there has to be data, and the queries were correct and worked just fine before (although we used InfluxQL then, not Flux).
Querying “new” data, i.e. data written after the migration to v2.0, works without any problems.
How could this happen? Are there any steps to resolve this?

Thank you in advance.
Kind regards.
Alexander

Hi @abt, a few things to check first:

  1. If you run the upgrade process with --verbose you should see “Copying data” logs with a “target” file path. If you look in those directories, do you see TSM / WAL files?
  2. All of the target data paths should be nested within a single folder named “data”. If you look at the startup logs for the influxd process, you should see a “Using data dir” log pointing to a path. Does the path match the data directory logged during upgrade?
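If it helps, the two checks above can be scripted. This is only a rough sketch: it assumes the usual shard layout of `<data dir>/<database or bucket>/<retention policy>/<shard id>/` with `.tsm` files inside each shard and, where a TSI exists, an `index` subfolder next to them. The function names are made up for illustration, and the path you pass in should be the one from your own “Using data dir” log line.

```python
from pathlib import Path


def summarize_shards(data_dir):
    """Return one summary dict per shard directory found under data_dir.

    Assumed layout (not confirmed in this thread):
    <data_dir>/<database-or-bucket>/<retention-policy>/<shard-id>/
    """
    root = Path(data_dir)
    summaries = []
    for shard in sorted(root.glob("*/*/*")):
        # Shard folders have numeric names; skip helper folders such as
        # _series, whose partitions are also numbered.
        if not shard.is_dir() or not shard.name.isdigit():
            continue
        if shard.parent.name.startswith("_"):
            continue
        index_dir = shard / "index"
        summaries.append({
            "shard": str(shard.relative_to(root)),
            "tsm_files": len(list(shard.glob("*.tsm"))),
            "has_index": index_dir.is_dir() and any(index_dir.iterdir()),
        })
    return summaries


def shards_missing_index(data_dir):
    """List shards that contain TSM data but no non-empty index folder."""
    return [
        s["shard"]
        for s in summarize_shards(data_dir)
        if s["tsm_files"] > 0 and not s["has_index"]
    ]
```

Calling `shards_missing_index(...)` on the data directory should return an empty list if every shard with TSM files also has an index folder; any shard it lists would be worth a closer look.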

Hi @dan-moran,
thank you for your reply.
I can answer both questions with “yes”, although we are not using the default path that InfluxDB creates.
We assume that the TSI index somehow isn’t created for the “old” data after the import, since the index files are missing in the target path. But that is just an assumption.
Kind regards,
Alexander

Hi @abt, I suspect your guess is correct. influxd upgrade doesn’t generate TSIs, under the expectation that the influxd server would regenerate them on startup. We found that this re-generation logic got lost in the push for GA, and an inmem index was instead injected into shards missing TSI files. It’s possible you’re hitting a bug in the inmem logic.

We’ve fixed the problem (and fully removed the inmem index) for our upcoming 2.0.4 release, which we hope to push out at the end of this week / early next. You could try upgrading again using one of our nightly builds to verify that the problem will be fixed:

You could also try using influx_inspect buildtsi from the 1.x line to regenerate your indices (docs here). The storage engine and disk layout are the same between 1.x and 2.x, so the tool will hopefully Just Work when pointed at your data directory.
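As a rough sketch of that second option, here is a small wrapper around the 1.x tool. It uses the `-datadir` and `-waldir` flags of `influx_inspect buildtsi`; the helper names and the placeholder paths are made up for illustration, so substitute the data and WAL directories from your own setup. The 1.x docs recommend stopping the server first and running the tool as the user that owns the data files, and I would do the same here.

```python
import shlex
import subprocess


def buildtsi_command(data_dir, wal_dir, binary="influx_inspect"):
    """Build the argument list for a 1.x `influx_inspect buildtsi` run."""
    return [binary, "buildtsi", "-datadir", str(data_dir), "-waldir", str(wal_dir)]


def run_buildtsi(data_dir, wal_dir, dry_run=True):
    """Return the command as a shell string; execute it unless dry_run is set.

    Stop influxd before running this for real, since the tool works
    directly on the files on disk.
    """
    cmd = buildtsi_command(data_dir, wal_dir)
    if not dry_run:
        # check=True raises CalledProcessError if the tool exits non-zero.
        subprocess.run(cmd, check=True)
    return shlex.join(cmd)


if __name__ == "__main__":
    # Placeholder paths: replace with your actual engine directories.
    print(run_buildtsi("/path/to/engine/data", "/path/to/engine/wal"))
```

With `dry_run=True` (the default) nothing is executed, so you can review the exact command before pointing it at real data.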

Hi @dan-moran,
thank you for providing the nightly build. I would really like to install it, but can it be installed on top of v2.0.3 without any problems?
Meanwhile we did a very ugly workaround: we exported the old data in line protocol format and imported it into v2.0.3 (hence manually triggering the index creation). This seems to work, as long as InfluxDB doesn’t crash because it runs out of memory. We have had this out-of-memory problem, with the OS killing InfluxDB, before, and we were not able to explain it. When querying a longer time period (e.g. 4 days) you can literally watch the RAM (and virtual memory) usage grow until it is full and the OS stops influxd because it is no longer responding. But the amount of data we expect back is a very small fraction of the RAM that InfluxDB uses (about 2 MB of data vs. more than 10 GB of RAM). This might be a separate issue and worth another forum topic, but it is really a pain… Any ideas on this?