Downsampling existing data

I have been running InfluxDB+Telegraf to monitor servers for a month or two, with only the default system input plugins, no custom metrics yet. I didn't add any retention policies or continuous queries, so my InfluxDB database has grown to 3.5 GB by now. Clearly it's useless to store CPU usage at 5-second resolution going back two months!

I want to add downsampling now, but it's not totally clear to me how it works. If I understand correctly: retention policies make data expire after some duration X, and continuous queries aggregate new data as it arrives, storing the aggregated result into a different RP that expires after duration Y (or never). The downsampling guide also says:

> We perform the following steps before writing the data to the database food_data. We do this before inserting any data because CQs only run against recent data […]

So it seems like if I add a CQ now, it won’t aggregate my existing data, and if I change the retention duration, it will immediately wipe my old data.
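
If I've got the syntax right, the CQ I'm thinking of would look something like this (the "downsampled" RP name, the 5m interval, and the field names are placeholders based on Telegraf's cpu measurement):

```sql
CREATE CONTINUOUS QUERY "cq_cpu_5m" ON "telegraf"
BEGIN
  SELECT mean("usage_user") AS "usage_user", mean("usage_system") AS "usage_system"
  INTO "telegraf"."downsampled"."cpu"
  FROM "cpu"
  GROUP BY time(5m), *
END
```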

How do I aggregate the data I already have into a new RP, before changing the duration of the default RP?

@nicolas17 You can run raw SELECT ... INTO queries, without the CQ wrapper, to accomplish this. I just did this on one of our instances today!
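
Something along these lines; this assumes your database is called "telegraf" with the default "autogen" RP and Telegraf's standard "cpu" measurement, so adjust names, durations, and fields to whatever you actually have:

```sql
-- 1. Create the RP that will hold the downsampled data (kept forever here).
CREATE RETENTION POLICY "downsampled" ON "telegraf" DURATION INF REPLICATION 1

-- 2. Backfill: run the CQ's inner SELECT by hand against the historical data.
--    Note that GROUP BY time() wants an explicit time range in the WHERE clause.
SELECT mean("usage_user") AS "usage_user", mean("usage_system") AS "usage_system"
INTO "telegraf"."downsampled"."cpu"
FROM "telegraf"."autogen"."cpu"
WHERE time > now() - 90d AND time <= now()
GROUP BY time(5m), *

-- 3. Only once the backfill has succeeded, shorten the default RP so the raw
--    5-second data starts expiring.
ALTER RETENTION POLICY "autogen" ON "telegraf" DURATION 14d
```

The `GROUP BY time(5m), *` is the important bit: the `*` carries all tags (host, cpu, …) into the new RP instead of collapsing the series together. If you have a lot of history, run the INTO query in smaller time chunks (say, a week at a time) rather than in one shot, so it doesn't eat too much memory.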

That works great, thanks. Now I just need to fight my inner data packrat and decide how much data I really need to keep!


Well, with compression getting it down to ~2 bytes per float or integer value, you shouldn't have to throw out too much! I generally just use CQs to optimize queries that are taking too long to return, and otherwise try to keep as much data as possible.
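
As a back-of-envelope check (assuming that ~2 bytes/value figure): one field sampled every 5 seconds is 17,280 points per day, or about 520,000 per 30-day month, which compresses to roughly 1 MB per field per month. So 3.5 GB works out to a few thousand field-months, which the default system plugins (per-CPU, per-disk, per-interface series) multiplied across your servers can easily account for, but each individual series is cheap to keep.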