Best practices for duplicating and restoring data from InfluxDB OSS for later analysis

Hi Everyone,

I have a question regarding best practices for backing up InfluxDB data on an edge computer running InfluxDB OSS.

To provide some background, we have an edge computer running InfluxDB OSS (on Ubuntu) at a remote site with limited Internet connectivity that is capturing data from BLE beacons. Due to the amount of data we will be capturing on site and the storage limits of the edge computer's internal SSD, we will need to set up the bucket in InfluxDB to delete data older than 7 days, and use a NAS connected to the edge computer to back up the data incrementally so that we can store all data for the duration of the project (estimated to be up to 10 TB over a six-week period).

Can anyone please advise on best practices for backing up the data to the NAS such that it can be restored for later analysis? Looking at this guide on the InfluxDB file system layout, our plan was to incrementally back up everything in the Data, WAL, and Metastore directories to the NAS. Will this be sufficient for us to copy these onto another machine running InfluxDB OSS later on and do analysis on the full six weeks of data? Is there anything else we need to do or be mindful of?

Many thanks,

Michael.

In general, you should use the backup/restore commands rather than copying the raw data directories.
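A rough sketch of what that could look like for InfluxDB OSS 2.x, assuming the NAS is mounted at `/mnt/nas` and using hypothetical bucket names and an `INFLUX_TOKEN` environment variable (adjust paths, names, and auth to your setup):

```shell
#!/bin/sh
# On the edge computer: back up one bucket to the NAS mount.
# Run this periodically (e.g. via cron) before the 7-day retention
# policy deletes the older data.
influx backup /mnt/nas/influx-backups/week1 \
  --bucket ble-beacons \
  --token "$INFLUX_TOKEN"

# Later, on the analysis machine: restore the backup under a new
# bucket name so successive restores do not collide.
influx restore /mnt/nas/influx-backups/week1 \
  --bucket ble-beacons \
  --new-bucket ble-beacons-week1 \
  --token "$INFLUX_TOKEN"
```

Using `--new-bucket` on restore is what lets you bring in each weekly backup side by side on the analysis machine.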

However, it is not possible to simply restore new data into an existing bucket. Instead, you could restore each backup into a separate bucket and then use a Flux query to merge the data from those buckets into a single target bucket.
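The merge step could be sketched with a Flux query run through the CLI, assuming hypothetical bucket and org names; you would repeat it once per restored weekly bucket:

```shell
# Copy everything from one restored bucket into a single analysis
# bucket. range(start: 0) means "from the Unix epoch", i.e. all data.
influx query --org my-org --token "$INFLUX_TOKEN" '
from(bucket: "ble-beacons-week1")
  |> range(start: 0)
  |> to(bucket: "ble-beacons-all")
'
```

Make sure the target bucket exists (with an infinite retention period) before running the query, and be aware that writing 10 TB this way will take a while, so you may want to merge in smaller time ranges.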