Does the storage engine guarantee correctness and consistency for a live (without InfluxDB shutdown) atomic filesystem snapshot (such as a BTRFS snapshot)?
In other words, if I take a snapshot, will it contain all written data (writes for which a response has already been received)? And will it be fully correct, so that I can just start InfluxDB on it without any additional preparation?
For example, PostgreSQL provides such guarantees; see "File System Level Backup" in the PostgreSQL 9.6 documentation.
According to the storage engine documentation:
When a write comes in the new points are serialized, compressed using Snappy, and written to a [Write-Ahead-Log] file. The file is fsync’d and the data is added to an in-memory index before a success is returned.
We try to do things in a sane order so that we can recover from sudden failures, and while it is neither tested nor supported, data should not be lost in your scenario and backups using filesystem snapshots should generally be OK. The approach might come with some cost in startup time when restoring the system, since the WAL files will have to be loaded into memory, but you will have to decide whether that is an acceptable trade-off.
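To illustrate why acknowledged writes should survive a snapshot, here is a minimal sketch of the ordering described in the documentation quote (append to the log, fsync, update the in-memory index, then report success). It is a toy model, not InfluxDB's actual code: the `TinyWAL` class, the `replay` function, and the length-prefixed record format are hypothetical, and zlib stands in for Snappy because Snappy is not in the Python standard library.

```python
import os
import zlib


class TinyWAL:
    """Toy write-ahead log (hypothetical, not InfluxDB's implementation)."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # in-memory index: key -> latest value
        self._fh = open(path, "ab")

    def write(self, key, value):
        # Serialize and compress the point (zlib is a stand-in for Snappy).
        payload = zlib.compress(f"{key}={value}".encode())
        # Length prefix lets replay find record boundaries.
        self._fh.write(len(payload).to_bytes(4, "big") + payload)
        self._fh.flush()
        os.fsync(self._fh.fileno())  # data is on disk before we acknowledge
        self.index[key] = value      # only then update the in-memory index
        return True                  # success is returned last

    def close(self):
        self._fh.close()


def replay(path):
    """Rebuild the index from a WAL file, as a start on a snapshot would."""
    index = {}
    with open(path, "rb") as fh:
        while True:
            header = fh.read(4)
            if len(header) < 4:
                break  # clean end of file, or a torn length prefix
            size = int.from_bytes(header, "big")
            payload = fh.read(size)
            if len(payload) < size:
                break  # torn final record: it was never acknowledged
            key, _, value = zlib.decompress(payload).decode().partition("=")
            index[key] = value
    return index
```

Because success is returned only after the fsync, any record a client saw acknowledged is already fully in the file at the moment an atomic snapshot is taken. A record cut off by the snapshot can only be one that was still in flight. Replaying the log on startup is also where the extra startup time mentioned above comes from.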
Thanks for the answer.
Unfortunately, although most of the points arrive in time order, I sometimes get data from the past, so I can’t use the built-in backup with the ‘since’ option to implement an incremental backup.
I was thinking about comparing shard checksums at the file level and then using the built-in backup with the ‘shard’ option to save only the changed shards. But doing that on a live system is not a good idea, which is why I started thinking about snapshots.
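A rough sketch of that shard-checksum idea, meant to run against a mounted snapshot rather than the live data directory. It assumes the data directory is laid out as `<database>/<retention_policy>/<shard_id>/`; please verify that against your installation. The function names `shard_digest`, `snapshot_manifest`, and `changed_shards` are made up for this example.

```python
import hashlib
import os


def _subdirs(path):
    """Sorted names of the immediate subdirectories of path."""
    return sorted(
        name for name in os.listdir(path)
        if os.path.isdir(os.path.join(path, name))
    )


def shard_digest(shard_dir):
    """SHA-256 over the relative path and content of every file in a shard."""
    h = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(shard_dir):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            h.update(os.path.relpath(full, shard_dir).encode())
            with open(full, "rb") as fh:
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    h.update(chunk)
    return h.hexdigest()


def snapshot_manifest(data_root):
    """Map 'db/rp/shard_id' -> digest (assumed <db>/<rp>/<shard_id> layout)."""
    manifest = {}
    for db in _subdirs(data_root):
        for rp in _subdirs(os.path.join(data_root, db)):
            for shard in _subdirs(os.path.join(data_root, db, rp)):
                path = os.path.join(data_root, db, rp, shard)
                manifest[f"{db}/{rp}/{shard}"] = shard_digest(path)
    return manifest


def changed_shards(previous, current):
    """Shards that are new or whose digest differs from the previous run."""
    return sorted(k for k, v in current.items() if previous.get(k) != v)
```

The idea would be to store the manifest from each run (for example as JSON) and back up only the shards that `changed_shards` reports. Hashing every file reads the whole snapshot, so comparing file sizes and modification times first may be a cheaper pre-filter.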
And, to fully answer the initial question, is there any estimate of the possible data loss? Is it bounded by time or by record count?
@the20login, I’ve updated my response with some additional information from the documentation. Please take a look. You should not experience data loss for requests that have received a success response.
Awesome, thanks.
This will help me greatly.