Delete data from DB with autogen retention policy

bcutter · May 9, 2024, 9:49pm

Thank you so much for your previous post and for this one as well. Based on the previous one alone I already managed to achieve what I was looking for: removing series from the database. No clue how I came up with `DELETE FROM` instead of the (successful) `DROP SERIES`.

I (partly auto-)generated (and carefully reviewed twice) a .txt file with one `DROP SERIES` statement per line and fed it to the CLI with `influx -precision rfc3339 -database 'homeassistant' -username <uname> -password <pwd> < "/share/deletion_list.txt"`. It contains ~600 lines, and while the CPU stays calm, it actually stresses the disk quite a lot.
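For anyone wanting to reproduce the file generation, here is a minimal Python sketch of how such a deletion list could be built. It is not the exact script I used: the helper names, the sample measurements and entity IDs, and the assumption that each series is identified by a measurement plus an `entity_id` tag are all illustrative, and the escaping is deliberately simple.

```python
def quote_identifier(name: str) -> str:
    """Double-quote an InfluxQL identifier, escaping backslashes and double quotes."""
    return '"' + name.replace("\\", "\\\\").replace('"', '\\"') + '"'


def quote_string(value: str) -> str:
    """Single-quote an InfluxQL string literal, escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def drop_series_statement(measurement: str, tag_key: str, tag_value: str) -> str:
    """Build one DROP SERIES statement restricted to a single tag value."""
    return (
        f"DROP SERIES FROM {quote_identifier(measurement)} "
        f"WHERE {quote_identifier(tag_key)} = {quote_string(tag_value)}"
    )


def build_deletion_list(targets) -> str:
    """Return one statement per line for (measurement, entity_id) pairs.

    Assumption: the tag that identifies a series is called "entity_id".
    """
    lines = [drop_series_statement(m, "entity_id", e) for m, e in targets]
    return "\n".join(lines) + "\n"


# Hypothetical targets, for illustration only.
targets = [
    ("°C", "bedroom_temperature"),
    ("%", "bedroom_humidity"),
]

deletion_list = build_deletion_list(targets)
print(deletion_list, end="")
```

The resulting text can be saved to a file, reviewed by hand, and then piped into `influx` as shown above.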

While the bulk deletion is running I can watch the progress indirectly by using
SELECT numSeries FROM "_internal"."monitor"."database" WHERE "database"='homeassistant' ORDER BY time DESC LIMIT 1
which returns `numSeries`. That number is decreasing really slowly, which matches everything you said: hot and cold data, and it takes some time to find and purge. So your post really helped me (better) understand how it works under the hood.

One last question while I wait for the bulk deletion to complete. It might be a little off-topic, but it is of HUGE interest to me (only asking here as you seem to be very skilled):

Are you aware of a `SELECT` statement (or any other possibility) to get a list of series (and also measurements) with the “most data” (in terms of records or storage used), starting with the largest and sorted in descending order?

In classic SQL I would do this; unfortunately I’m not skilled enough to translate this syntax into something that works for InfluxDB:

SELECT entity_id, COUNT(*) as count FROM states GROUP BY entity_id ORDER BY count DESC LIMIT 100

This way I’d like to identify which series are (actually or very likely, based on the storage data) consuming the most disk space. The actual problem: the database grew really, REALLY big and I need to sort things out. With that list I could also decide which of those “top scorers” actually need to be stored in InfluxDB at all.
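To make the goal concrete, here is a minimal Python sketch of the ranking I am after, done client-side on a plain list of entity IDs (one entry per stored record). The function name `top_entities` and the sample IDs are made up for illustration, and it says nothing about how to get those rows out of InfluxDB in the first place.

```python
from collections import Counter


def top_entities(entity_ids, limit=100):
    """Count records per entity and return (entity_id, count) pairs, largest first.

    Mirrors GROUP BY entity_id ORDER BY count DESC LIMIT <limit> from the SQL above.
    """
    return Counter(entity_ids).most_common(limit)


# Hypothetical sample: one entry per stored record.
rows = [
    "sensor.power",
    "sensor.temp",
    "sensor.power",
    "sensor.power",
    "sensor.temp",
    "sensor.door",
]

for entity_id, count in top_entities(rows, limit=2):
    print(entity_id, count)
```

With the sample above this lists `sensor.power` (3 records) ahead of `sensor.temp` (2 records).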
