How to configure different retention policies for different measurements

Hi,

Is there any way to configure different retention policies for different measurements?

My Scenario:

I have created a database with a retention policy for 2 weeks and set it as default. I wanted to create continuous queries on the database and store it in measurements with different retention policies.

If my database title is “test” and 2 weeks data is stored in a measurement called “options”

Continuous queries:
CREATE CONTINUOUS QUERY “cq_2_weeks” ON “test” BEGIN SELECT sum(bytes) as “byte_sum”,sum(“duration”) as “duration_sum” INTO “two_weeks”.“aggregates_two_weeks” FROM “options” GROUP BY dist,time(14d) END

CREATE CONTINUOUS QUERY “cq_12_hours” ON “test” BEGIN SELECT sum(bytes) as “byte_sum”,sum(“duration”) as “duration_sum” INTO “12_hours”.“aggregates_12_hours” FROM “options” GROUP BY dist,time(12h) END

  • When I run “select * from aggregates_two_weeks” I see data populating

  • When I run “select * from aggregates_12_hours” the measurement is empty and returns nothing. I tried using the fully qualified name “test”.“12_hours”.“aggregates_12_hours” but that returned empty too.
    I haven’t created the measurements manually and I believe influx would create measurements and schema depending on the stored data
    Should I change my approach to get it working?

Thanks!

Hi @naveenrt23 welcome ,

Have you created the second retention policy ?

what is the output of

  Influx -database test -execute “show retention policies”

Hey thanks!

I tried “show retention policies” in influx and it displayed the below

name      duration     shardGroupDuration  replicaN   default
----      --------     -----------------   --------   -------
autogen        0s           168h0m0s         1            false
two_days     48h0m0s        24h0m0s          1            false
one_day      24h0m0s        1h0m0s           1             false
one_month    720h0m0s       24h0m0s          1             false
12_hours     12h0m0s        1h0m0s           1             false
two_weeks    336h0m0s       24h0m0s          1              true

from the documentation

continuous queries

'CQs execute at the same interval as the 'cq_query' ’s 'GROUP BY time()' interval, and they run at the start of InfluxDB’s preset time boundaries. If the 'GROUP BY time()' interval is one hour, the CQ executes at the start of every hour.'

That means you will only see data after 00:00 or after 12:00 whatever comes first ,
What is your local time and at what time did you create the continuous queries ?

Sorry, I didn’t get what you mean by that. Please correct me if I’m wrong but I am assuming that the CQ’s run at the time created and the next time it runs at the Group By time() interval period.
The CQ’S were created 24hrs back and my timezone is EST.

I was thinking the same when I wrote my answer , can you try influx -database “test”.”12_hours”

And then select * from “aggregates_12_hours” limit 5

Are there errors in journalctl -u Influxdb ?

correction: influx -database “test”.“12_hours” does not work ,
should be :
influx
use “test”.“12_hours” ( or use test.12_hours )

I tried this and it displayed
ERR: database not found: test.12_hours
When i tried

select * from “test”.“12_hours”.“aggregates_12_hours”

I got an empty response.

I don’t see any errors in my docker logs as of now

I have tested with group by dist,(1m)
and that works ,
With (10) minutes it works after 6 minutes because i created the cq at 08:34
With group by dist,time(30m) it works also ( at 09:00 i see records coming in ) so the continuous query is not executed immediately

Shouldn’t the CQ’s execute on the data that is already present in the database? I am pretty sure I have data at least for 3 days in the database, so I was assuming that should be enough for the rollups.

Anyway, I will test the RP’s with less granularity so that I can verify within a couple of hours.

Also, how did you know it worked after 6 min. Did you have to manually check if the measurements are created periodically?

Yes indeed ,

cq’s execute on the data that is already present in the database ,
on data between now() and now() - 12h ( the group by time value )
I saw records coming in in the cq destination measurement ( in your example aggregates_12_hours )
i checked that with a select .

If the group by is 10m and you create the cq at 9:11 for example the first execution will be at 09:20 , and the next at 09:30 , 40 ,50 ,00,10, 20. … ( so in that case it works after 9 minutes :slight_smile: )
00 , 10 , 20 , 30 , 40 , 50 are also called InfluxDB’s preset time boundaries

there is also the continuous query advanced syntax

Does the query execute only on data between now() and now()-group by time(). Let’s say I have 4 days of data and my group by time value is 12h, would the query only execute on data starting only on last 12 hours and wouldn’t execute on the previous 3.5 days data?
I did find this while reading the official documentation for influx

When the CQ executes, it runs a single query for the time range between now() and now() minus the GROUP BY time() interval. If the GROUP BY time() interval is one hour and the current time is 17:00, the query’s time range is between 16:00 and 16:59.999999999.

Also, I tried for a less granular group by times and looks like the measurements are getting created.