Consistency of atomic filesystem snapshot

the20login · December 7, 2017, 2:02pm

Does the storage engine guarantee correctness and consistency for a live (without shutting down InfluxDB) atomic filesystem snapshot, such as one taken with BTRFS?
In other words, if I take a snapshot, will it contain all written data (writes for which a response was already received)? And will it be fully correct, so that I could just start InfluxDB on it without any additional preparation?

For example, PostgreSQL documents such guarantees: PostgreSQL: Documentation: 9.6: File System Level Backup

noahcrowley · December 7, 2017, 4:46pm

According to the storage engine documentation:

When a write comes in the new points are serialized, compressed using Snappy, and written to a [Write-Ahead-Log] file. The file is fsync’d and the data is added to an in-memory index before a success is returned.

We try to do things in a sane order so that we can recover from sudden failures, and while it is neither tested nor supported, data should not be lost in your scenario and backups using filesystem snapshots should generally be OK. The approach might come with some cost in startup time when restoring the system, since the WAL files will have to be loaded into memory, but you will have to decide whether that is an acceptable trade-off.
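To make the quoted write path concrete, here is a minimal Python sketch of the general pattern (append, fsync, update the in-memory index, then acknowledge). It is not InfluxDB's actual code: `TinyWAL`, `replay`, and the length-prefixed entry format are hypothetical, and `zlib` stands in for Snappy only because it ships with Python.

```python
import os
import struct
import zlib


class TinyWAL:
    """Toy write-ahead log: append, fsync, then update an in-memory index."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # in-memory index: key -> list of (timestamp, value)
        self._fh = open(path, "ab")

    def write(self, points):
        # Serialize and compress the batch (zlib is a stand-in for Snappy).
        raw = "\n".join(f"{k} {ts} {v}" for k, ts, v in points).encode()
        payload = zlib.compress(raw)
        # Length-prefix each entry so a reader can find entry boundaries.
        self._fh.write(struct.pack(">I", len(payload)) + payload)
        self._fh.flush()
        os.fsync(self._fh.fileno())  # on disk before we acknowledge
        for k, ts, v in points:
            self.index.setdefault(k, []).append((ts, v))
        return True  # success is reported only after the fsync

    def close(self):
        self._fh.close()


def replay(path):
    """Rebuild the index from the log, e.g. when starting on a snapshot."""
    index = {}
    with open(path, "rb") as fh:
        while True:
            header = fh.read(4)
            if len(header) < 4:
                break  # end of log, or a torn header
            (size,) = struct.unpack(">I", header)
            payload = fh.read(size)
            if len(payload) < size:
                break  # torn entry: it was never acknowledged, so drop it
            for line in zlib.decompress(payload).decode().split("\n"):
                k, ts, v = line.split(" ")
                index.setdefault(k, []).append((int(ts), float(v)))
    return index
```

Under these assumptions, an atomic snapshot taken at any moment contains every entry whose write had already returned success, plus at most one partially written entry at the tail, which `replay` skips. The replay loop is also where the extra startup time mentioned above would come from.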

the20login · December 7, 2017, 5:36pm

Thanks for the answer.
Unfortunately, although most of my points arrive in time order, I sometimes get data from the past, so I can’t use the built-in backup with the ‘since’ option to implement an incremental backup.
I was thinking about checking shard checksums at the file level and then using the built-in backup with the ‘shard’ option to save only the changed shards. But doing so on a live system is not a good idea, so I started thinking about snapshots.
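A rough Python sketch of the shard-checksum idea, meant to be run against a mounted snapshot rather than the live data directory. The function names are hypothetical, and it assumes each shard is an immediate subdirectory of `data_dir`; adjust for your actual layout.

```python
import hashlib
import os


def shard_checksum(shard_dir):
    """Hash every file under one shard directory, in a stable order."""
    digest = hashlib.sha256()
    for root, dirs, files in os.walk(shard_dir):
        dirs.sort()  # fix traversal order so the digest is reproducible
        for name in sorted(files):
            path = os.path.join(root, name)
            # Include the relative path so renames also change the digest.
            digest.update(os.path.relpath(path, shard_dir).encode())
            with open(path, "rb") as fh:
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
    return digest.hexdigest()


def changed_shards(data_dir, previous):
    """Return (changed shard names, new checksums) for shard dirs in data_dir.

    `previous` is the checksum dict saved by the last backup run.
    """
    current = {
        name: shard_checksum(os.path.join(data_dir, name))
        for name in sorted(os.listdir(data_dir))
        if os.path.isdir(os.path.join(data_dir, name))
    }
    changed = [n for n, value in current.items() if previous.get(n) != value]
    return changed, current
```

The returned list would then drive which shards get passed to the backup with the ‘shard’ option, and the new checksum dict would be stored for the next run.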

And, to fully answer the initial question: is there any estimate of the possible data loss? Is it based on time or on record count?

the20login · December 7, 2017, 6:04pm

So, no way to predict.

Ok, thanks.

noahcrowley · December 7, 2017, 8:15pm

@the20login, I’ve updated my response with some additional information from the documentation. Please take a look. You should not experience data loss for requests that have received a success response.

the20login · December 7, 2017, 8:37pm

Awesome, thanks.
This will help me greatly.
