Down sampling existing data

kyled · March 5, 2019, 10:14pm

I am fairly new to InfluxDB and I am trying to down sample data that I already have. I have read up on using a continuous query, but I am curious if it is possible to do something like "SELECT temperature FROM generators GROUP BY time(5s) where I can then see the value for “temperature” every 5 seconds (an interval of 5 seconds). I have looked all over the internet and have found nothing quite helpful. Thanks to all of those who reply in advance.

MarcV · March 5, 2019, 10:40pm

Hi , you will need a function in your continuous query ,
I am sure you will find in th link here what you are looking for

Best regards ,
click here for Continuous queries

CullenShane · March 6, 2019, 5:12pm

I haven’t done this, but I’ve been trying to sort out this same thing for a while.

This is for a production environment where it’s import there’s no data gaps:

I think that you setup a CQ so that everything new is being down sampled. Save now() for later.
Use The Into Clause to backfill the existing data. I think that you should be able to basically reuse the CQ that you created earlier, just with the WHERE clause specified.

Good luck!

nmrfiq · July 26, 2019, 9:15am

Here’s a way that you can downsample existing data. As usual, InfluxDB makes it super awkward and it’s not really documented.

Let’s say you have a device measuring some floating-point value at random times. Call it “float”. With each value, it also stores a count. Call it “int”. Thus “float” could be the price of an item, and “int” could be the quantity purchased in a single transaction.

Now you want to summarize purchase activity from database “old_db” to database “new_db”. This involves downsampling the data, in this case, to hourly (“1h”). You could create hourly bars of open (FIRST(“float”)), high (MAX(“float”)), low (MIN(“float”)), and close (LAST(“float”)) values of the item price, along with the total quantity purchased (SUM(“int”)). It would look something like this:

SELECT FIRST(“float”),MAX(“float”),MIN(“float”),LAST(“float”),SUM(“int”) INTO “new_db”.“new_retention_policy”.“new_measurement” FROM “old_db”.“old_retention_policy”.“old_measurement” GROUP BY instrument,time(1h)

This works, with some adjustments depending on your particular measurement and instrument names. (It might not work in all cases, but it’s good enough for a test case.)

Unfortunately, InfluxDB decides to rename the new tags to “first”, “max”, “min”, “last”, and “sum”, respectively (which are just the function names) instead of something related to “price” and “quantity”, for example. This is really annoying if you want to refer to the downsampled data in some other way. I can’t find any way to rename them, so it’s yet another interfacing hassle you need to deal with.

Oh, and be super careful if you’re doing this in Chronograf. It has this dangerous way of automatically rerunning the previous query every time you return to the “Explore” tab. Maybe that can be disabled somewhere, but it’s an accident waiting to happen if your previous query involved writing data.

Anyway good luck with this.

Topic		Replies	Views
Continuous query on the full database influxdb , query	7	3337	April 10, 2019
Downsample and Downsampling a Downsample	2	982	April 5, 2017
Downsample data using INTO clause Store influxdb , query	2	547	January 24, 2021
Downsampling batch-imported-data automatically influxdb	0	929	February 14, 2018
Downsampling existing data influxdb	3	7524	June 8, 2017

Down sampling existing data

Related topics