Retention policy not working as expected

Hi,
First, let me describe the very simple basic setup we’d like to achieve:

  • we’d like to push data into a database “test”, one data point every second
  • we’d like to keep this detailed data for one hour
  • we’d like to keep one-minute averages for 3 hours
  • we’d like to keep 15-minute averages for 6 hours
    Is this possible? How can we achieve it? So far we’ve been unable to: either InfluxDB doesn’t insert the data at all, or not in the way we expect.
    Of course the 1, 3, and 6 hours and the 1 s, 1 min, and 15 min intervals are just examples, but aging data in such a way is a common requirement and we are not able to get this result.

We tried with the latest InfluxDB on RHEL 7.
I’ve attached a small shell script that demonstrates the problem.

@Farkas_Levente You are describing downsampling. InfluxDB gives you really good retention handling, but that number of steps seems like a little overkill. How many samples are you ingesting every second?
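
For reference, the usual pattern is one retention policy per resolution plus one continuous query per rollup, with each CQ naming its target policy explicitly. A minimal sketch (the measurement name cqrptest and field name value are placeholders; the queries are single-quoted so the inner double quotes reach influx instead of being stripped by the shell):

influx -execute 'CREATE DATABASE test'
# raw one-second data lands in the DEFAULT policy and expires after one hour
influx -execute 'CREATE RETENTION POLICY "one_hour" ON "test" DURATION 1h REPLICATION 1 DEFAULT'
influx -execute 'CREATE RETENTION POLICY "three_hour" ON "test" DURATION 3h REPLICATION 1'
influx -execute 'CREATE RETENTION POLICY "six_hour" ON "test" DURATION 6h REPLICATION 1'
# the INTO clause routes each rollup into the longer-lived policy;
# without it the results would land in the DEFAULT policy and expire with the raw data
influx -execute 'CREATE CONTINUOUS QUERY "cq_avg_1m" ON "test" BEGIN SELECT mean("value") AS "value_mean" INTO "test"."three_hour"."cqrptest_1m" FROM "test"."one_hour"."cqrptest" GROUP BY time(1m), "unit" END'
influx -execute 'CREATE CONTINUOUS QUERY "cq_avg_15m" ON "test" BEGIN SELECT mean("value") AS "value_mean" INTO "test"."six_hour"."cqrptest_15m" FROM "test"."one_hour"."cqrptest" GROUP BY time(15m), "unit" END'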

The above was just an example to demonstrate the problem. In general:

  • we need high-precision data for a short period of time, e.g. 1 month,
  • but less detailed data (1-minute min, max, average) for 3 months,
  • and only coarse data (5-minute min, max, average) for at least one year,

but of course we wouldn’t like to keep the high-precision data for that long. Roughly like the sketch below.
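
A rough sketch of how that maps onto retention policies and continuous queries; the database name sensors, measurement name reading, and field name value are placeholders:

influx -execute 'CREATE DATABASE sensors'
influx -execute 'CREATE RETENTION POLICY "one_month" ON "sensors" DURATION 30d REPLICATION 1 DEFAULT'
influx -execute 'CREATE RETENTION POLICY "three_months" ON "sensors" DURATION 90d REPLICATION 1'
influx -execute 'CREATE RETENTION POLICY "one_year" ON "sensors" DURATION 365d REPLICATION 1'
# min/max/mean rollups, keeping all tags via GROUP BY *
influx -execute 'CREATE CONTINUOUS QUERY "cq_1m" ON "sensors" BEGIN SELECT min("value"), max("value"), mean("value") INTO "sensors"."three_months"."reading_1m" FROM "sensors"."one_month"."reading" GROUP BY time(1m), * END'
influx -execute 'CREATE CONTINUOUS QUERY "cq_5m" ON "sensors" BEGIN SELECT min("value"), max("value"), mean("value") INTO "sensors"."one_year"."reading_5m" FROM "sensors"."one_month"."reading" GROUP BY time(5m), * END'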

@Farkas_Levente The database does get excellent compression so you can pretty easily get away with higher granularity than you expect. In your scenario above you could probably do without the last level of downsampling (1min → 5min). Also I’m not seeing the script you mention above. Would you mind posting it in a gist and linking it here so I can help?

We run it on a Raspberry Pi with an SD card, where storage is very expensive, i.e. we don’t have enough space to store all of the high-resolution data. Here is the script:

influx -execute "DROP DATABASE test"
influx -execute "CREATE DATABASE test"
influx -execute "CREATE RETENTION POLICY one_hour ON test DURATION 1h REPLICATION 1 DEFAULT"
influx -execute "CREATE RETENTION POLICY three_hour ON test DURATION 3h REPLICATION 1"
influx -execute "CREATE RETENTION POLICY six_hour ON test DURATION 6h REPLICATION 1"

#WORKING
#influx -database test -execute "CREATE CONTINUOUS QUERY cq_avg_1m ON test BEGIN SELECT mean(value) AS value_mean INTO cqrptest_1m FROM cqrptest GROUP BY time(1m), unit END"
#influx -database test -execute "CREATE CONTINUOUS QUERY cq_avg_15m ON test BEGIN SELECT mean(value) AS value_mean INTO cqrptest_15m FROM cqrptest GROUP BY time(15m), unit END"

#WORKING
# note: the inner double quotes must be escaped, otherwise the shell
# strips them before influx sees the query
#influx -database test -execute "CREATE CONTINUOUS QUERY cq_avg_1m ON test BEGIN SELECT mean(value) AS value_mean INTO test.\"one_hour\".cqrptest_1m FROM test.\"one_hour\".cqrptest GROUP BY time(1m), unit END"
#influx -database test -execute "CREATE CONTINUOUS QUERY cq_avg_15m ON test BEGIN SELECT mean(value) AS value_mean INTO test.\"one_hour\".cqrptest_15m FROM test.\"one_hour\".cqrptest GROUP BY time(15m), unit END"

#NOT WORKING
influx -database test -execute "CREATE CONTINUOUS QUERY cq_avg_1m ON test BEGIN SELECT mean(value) AS value_mean INTO test.\"three_hour\".cqrptest_1m FROM test.\"one_hour\".cqrptest GROUP BY time(1m), unit END"

#NOT WORKING
#influx -database test -execute "CREATE CONTINUOUS QUERY cq_avg_15m ON test BEGIN SELECT mean(value) AS value_mean INTO test.\"six_hour\".cqrptest_15m FROM test.\"one_hour\".cqrptest GROUP BY time(15m), unit END"


# write one random data point per second until interrupted
NODATA=0
while true; do
#       [ "$NODATA" -ge 10 ] && echo && break
        curl -i -XPOST 'http://localhost:8086/write?db=test' --data-binary "cqrptest,unit=1 value=$RANDOM" &>/dev/null
        echo -n "."
        sleep 1
        ((NODATA++))
done

echo "DONE"

@Farkas_Levente Can you describe the issue you are seeing in detail? Also I understand that you have limited space on the Pi. Even with that I think you could likely eliminate that last level of downsampling. How many data points are you writing per day?
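
Also, one thing worth checking: a plain SELECT only reads the DEFAULT retention policy, so points a CQ writes into three_hour or six_hour won’t show up unless you name the policy explicitly, e.g.:

influx -database test -execute 'SELECT * FROM "three_hour"."cqrptest_1m" LIMIT 5'
influx -database test -execute 'SELECT * FROM "six_hour"."cqrptest_15m" LIMIT 5'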

10-20 samples per second, from about 20 different data series.

@Farkas_Levente Data compresses down to ~2 bytes per point. At 400 points a second you would be able to store about a month’s worth of data (1B values) in ~2GB of storage. Downsampling after that to the 5 minute rollups would mean your total footprint never goes above 3GB. Does that work for you?
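
(The arithmetic behind that estimate: roughly 20 series × 20 samples/s ≈ 400 points/s; 400 × 86,400 s/day × 30 days ≈ 1.04 billion points; at ~2 bytes per point that is ≈ 2.1 GB per month of raw data.)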

Hmm, that’d be good. We’ll test it. Thanks.