What are the best practices for replacing measurement data in a bucket via a nightly job? Some considerations:
- A little downtime is acceptable
- I'm deleting data because each night we might discover that some previous measurements were invalid, and our latest dataset is more accurate
Some options:
- Delete the bucket => recreate bucket => upload data
- Use the delete API to delete all the measurement data via a predicate statement => upload new data
- Assuming the bucket name is “db” => create a new bucket each night named “new_db” => upload data to “new_db” => delete the “db” bucket => rename “new_db” to “db”
- Each night, create a new bucket named “db_YYYY_MM_DD” with a 1d retention policy. When querying for the data, just query the most recent bucket name
- Do the same as the previous option, but with a dated measurement name inside a single bucket instead of a dated bucket
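
To make the dated-bucket option more concrete, here is a minimal sketch of the naming and "pick the most recent" logic in plain Python. It does not talk to the database at all: the `PREFIX` constant and the helper names `bucket_name_for` and `latest_bucket` are hypothetical, and it assumes the list of bucket names has already been fetched by whatever client is in use.

```python
from datetime import date, datetime

# Hypothetical prefix matching the "db_YYYY_MM_DD" naming scheme above.
PREFIX = "db_"


def bucket_name_for(day: date) -> str:
    """Build the dated bucket name for a given day, e.g. db_2024_01_05."""
    return f"{PREFIX}{day:%Y_%m_%d}"


def latest_bucket(names):
    """Return the most recent dated bucket name, or None if there is none.

    Names that do not follow the db_YYYY_MM_DD pattern are ignored.
    """
    dated = []
    for name in names:
        if not name.startswith(PREFIX):
            continue
        try:
            day = datetime.strptime(name[len(PREFIX):], "%Y_%m_%d").date()
        except ValueError:
            continue  # prefix matched, but the suffix is not a valid date
        dated.append((day, name))
    return max(dated)[1] if dated else None
```

The nightly job would write to `bucket_name_for(date.today())`, and readers would call `latest_bucket(...)` on the current list of bucket names before querying. Because the year comes first and the fields are zero-padded, these names also sort correctly as plain strings; parsing the date is only there to skip names that merely share the prefix.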