Downsampling data in-situ

rdp · June 16, 2018, 5:56pm

My client’s data arrives from IoT devices every ten seconds. We would like to set up a downsampling and retention policy that downsamples the data to X-second intervals after Y minutes and deletes it after a year. (I’m checking on X and Y, but that’s probably not relevant.)

I’ve read the Downsampling and Data Retention guide and I understand how to retain the data for a year. But downsampling leaves me with two questions:

1. For downsampling we need to thin the existing measurement rather than creating a new measurement (…because we don’t control the client that subscribes to the database). What’s the best way to do that?
2. The Downsampling and Data Retention guide explains why it’s necessary to set up a Continuous Query before creating the database. In our case, we need to apply the thinning post-facto once before a CQ can take over the job. How is that done?
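
To make the requirement concrete, here is a minimal Python sketch of the thinning rule, independent of InfluxDB. It is only an illustration under assumptions: `thin`, `max_age` (standing in for Y) and `interval_s` (standing in for X) are hypothetical names, and "keep the first point in each interval" is just one possible way to thin.

```python
from datetime import datetime, timedelta, timezone


def thin(points, now, max_age, interval_s):
    """Keep every point newer than max_age; for older points keep only
    the first point that falls into each interval_s-second bucket."""
    cutoff = now - max_age
    seen_buckets = set()
    kept = []
    for ts, value in sorted(points, key=lambda p: p[0]):
        if ts >= cutoff:
            # Recent data stays at full resolution.
            kept.append((ts, value))
            continue
        bucket = int(ts.timestamp()) // interval_s
        if bucket not in seen_buckets:
            seen_buckets.add(bucket)
            kept.append((ts, value))
    return kept


# Hypothetical data: one reading every 10 seconds for the last 10 minutes.
now = datetime(2018, 6, 16, 18, 0, 0, tzinfo=timezone.utc)
points = [(now - timedelta(minutes=10) + timedelta(seconds=10 * i), float(i))
          for i in range(60)]

# Assumed values: Y = 5 minutes, X = 60 seconds.
kept = thin(points, now, max_age=timedelta(minutes=5), interval_s=60)
print(len(points), "->", len(kept))
```

The last five minutes stay at ten-second resolution, while each older minute is reduced to a single point.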

valentinbora · June 19, 2018, 5:54pm

I’ve looked into this quite a bit and finally decided to do it with a Kapacitor script that I replay on historical data via the CLI, such as:

kapacitor replay-live batch -task data_rollup_1m -start 2018-01-01T00:00:00Z -stop 2018-05-28T03:00:00Z -rec-time

The data_rollup_1m TICK script queries and aggregates data in batches of 1m and writes it out to another retention policy and measurement.
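
As a rough illustration of what a 1m rollup computes, here is a plain Python sketch. It is not the TICK script itself: `rollup_mean` is a hypothetical name, and the mean is an assumed aggregate, since the script's actual aggregation function is not shown here.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def rollup_mean(points, window_s=60):
    """Group (timestamp, value) points into window_s-second windows and
    return one (window_start, mean) pair per window, in time order."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Truncate the epoch timestamp to the start of its window.
        start = int(ts.timestamp()) // window_s * window_s
        buckets[start].append(value)
    return [
        (datetime.fromtimestamp(start, tz=timezone.utc), sum(vals) / len(vals))
        for start, vals in sorted(buckets.items())
    ]


# Hypothetical input: 12 readings at 10-second spacing (two full minutes).
base = datetime(2018, 1, 1, tzinfo=timezone.utc)
samples = [(base + timedelta(seconds=10 * i), float(i)) for i in range(12)]

rolled = rollup_mean(samples)
for window_start, mean_value in rolled:
    print(window_start.isoformat(), mean_value)
```

Each output row replaces the six ten-second readings that fall within that minute.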

CQs, as far as I understand, are meant to deal with data streaming into InfluxDB in near real time, i.e., as you said, with future data coming in.
