Retention policy not working as expected

Hi,
First, let me describe the very simple basic setup we’d like to achieve:

  • we’d like to push data into a database “test”, one data point every second
  • we’d like to keep this detailed data for one hour
  • we’d like to keep one-minute averages for 3 hours
  • we’d like to keep 15-minute averages for 6 hours
    Is this possible? How can we achieve it? So far we’ve been unable to: either InfluxDB doesn’t insert the data at all, or not in the way we expect.
    Of course the 1, 3, and 6 hours and the 1 s, 1 min, and 15 min intervals are just examples, but aging data in such a way is a common requirement and we are not able to get this result.

We tried with the latest InfluxDB on RHEL 7.
I’ve attached a small shell script that demonstrates the problem.

@Farkas_Levente You are describing downsampling. InfluxDB gives you really good retention handling, but that number of steps seems like a little overkill. How many samples are you ingesting every second?
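
For reference, the usual pattern is one retention policy per resolution plus one continuous query per rollup, with each CQ naming its target policy explicitly. A minimal sketch (the measurement name cqrptest and field name value are placeholders; the queries are single-quoted so the inner double quotes reach influx instead of being stripped by the shell):

influx -execute 'CREATE DATABASE test'
# raw one-second data lands in the DEFAULT policy and expires after one hour
influx -execute 'CREATE RETENTION POLICY "one_hour" ON "test" DURATION 1h REPLICATION 1 DEFAULT'
influx -execute 'CREATE RETENTION POLICY "three_hour" ON "test" DURATION 3h REPLICATION 1'
influx -execute 'CREATE RETENTION POLICY "six_hour" ON "test" DURATION 6h REPLICATION 1'
# the INTO clause routes each rollup into the longer-lived policy;
# without it the results would land in the DEFAULT policy and expire with the raw data
influx -execute 'CREATE CONTINUOUS QUERY "cq_avg_1m" ON "test" BEGIN SELECT mean("value") AS "value_mean" INTO "test"."three_hour"."cqrptest_1m" FROM "test"."one_hour"."cqrptest" GROUP BY time(1m), "unit" END'
influx -execute 'CREATE CONTINUOUS QUERY "cq_avg_15m" ON "test" BEGIN SELECT mean("value") AS "value_mean" INTO "test"."six_hour"."cqrptest_15m" FROM "test"."one_hour"."cqrptest" GROUP BY time(15m), "unit" END'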

The above was just an example to demonstrate the problem. In general:

  • we need high-precision data for a short period of time, e.g. 1 month,
  • but less detailed data (1-minute min, max, average) for 3 months,
  • and only coarse data (5-minute min, max, average) for at least one year,

but of course we wouldn’t like to keep the high-precision data for that long. Roughly like the sketch below.
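
A rough sketch of how that maps onto retention policies and continuous queries; the database name sensors, measurement name reading, and field name value are placeholders:

influx -execute 'CREATE DATABASE sensors'
influx -execute 'CREATE RETENTION POLICY "one_month" ON "sensors" DURATION 30d REPLICATION 1 DEFAULT'
influx -execute 'CREATE RETENTION POLICY "three_months" ON "sensors" DURATION 90d REPLICATION 1'
influx -execute 'CREATE RETENTION POLICY "one_year" ON "sensors" DURATION 365d REPLICATION 1'
# min/max/mean rollups, keeping all tags via GROUP BY *
influx -execute 'CREATE CONTINUOUS QUERY "cq_1m" ON "sensors" BEGIN SELECT min("value"), max("value"), mean("value") INTO "sensors"."three_months"."reading_1m" FROM "sensors"."one_month"."reading" GROUP BY time(1m), * END'
influx -execute 'CREATE CONTINUOUS QUERY "cq_5m" ON "sensors" BEGIN SELECT min("value"), max("value"), mean("value") INTO "sensors"."one_year"."reading_5m" FROM "sensors"."one_month"."reading" GROUP BY time(5m), * END'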

@Farkas_Levente The database does get excellent compression so you can pretty easily get away with higher granularity than you expect. In your scenario above you could probably do without the last level of downsampling (1min → 5min). Also I’m not seeing the script you mention above. Would you mind posting it in a gist and linking it here so I can help?

We run it on a Raspberry Pi with an SD card, where storage is very expensive, i.e. we don’t have enough space to store all of the high-resolution data. Here is the script:

influx -execute "DROP DATABASE test"
influx -execute "CREATE DATABASE test"
influx -execute "CREATE RETENTION POLICY one_hour ON test DURATION 1h REPLICATION 1 DEFAULT"
influx -execute "CREATE RETENTION POLICY three_hour ON test DURATION 3h REPLICATION 1"
influx -execute "CREATE RETENTION POLICY six_hour ON test DURATION 6h REPLICATION 1"

#WORKING
#influx -database test -execute "CREATE CONTINUOUS QUERY cq_avg_1m ON test BEGIN SELECT mean(value) AS value_mean INTO cqrptest_1m FROM cqrptest GROUP BY time(1m), unit END"
#influx -database test -execute "CREATE CONTINUOUS QUERY cq_avg_15m ON test BEGIN SELECT mean(value) AS value_mean INTO cqrptest_15m FROM cqrptest GROUP BY time(15m), unit END"

#WORKING
# note: the inner double quotes must be escaped, otherwise the shell
# strips them before influx sees the query
#influx -database test -execute "CREATE CONTINUOUS QUERY cq_avg_1m ON test BEGIN SELECT mean(value) AS value_mean INTO test.\"one_hour\".cqrptest_1m FROM test.\"one_hour\".cqrptest GROUP BY time(1m), unit END"
#influx -database test -execute "CREATE CONTINUOUS QUERY cq_avg_15m ON test BEGIN SELECT mean(value) AS value_mean INTO test.\"one_hour\".cqrptest_15m FROM test.\"one_hour\".cqrptest GROUP BY time(15m), unit END"

#NOT WORKING
influx -database test -execute "CREATE CONTINUOUS QUERY cq_avg_1m ON test BEGIN SELECT mean(value) AS value_mean INTO test.\"three_hour\".cqrptest_1m FROM test.\"one_hour\".cqrptest GROUP BY time(1m), unit END"

#NOT WORKING
#influx -database test -execute "CREATE CONTINUOUS QUERY cq_avg_15m ON test BEGIN SELECT mean(value) AS value_mean INTO test.\"six_hour\".cqrptest_15m FROM test.\"one_hour\".cqrptest GROUP BY time(15m), unit END"


# write one random data point per second until interrupted
NODATA=0
while true; do
#       [ "$NODATA" -ge 10 ] && echo && break
        curl -i -XPOST 'http://localhost:8086/write?db=test' --data-binary "cqrptest,unit=1 value=$RANDOM" &>/dev/null
        echo -n "."
        sleep 1
        ((NODATA++))
done

echo "DONE"

@Farkas_Levente Can you describe the issue you are seeing in detail? Also I understand that you have limited space on the Pi. Even with that I think you could likely eliminate that last level of downsampling. How many data points are you writing per day?
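
Also, one thing worth checking: a plain SELECT only reads the DEFAULT retention policy, so points a CQ writes into three_hour or six_hour won’t show up unless you name the policy explicitly, e.g.:

influx -database test -execute 'SELECT * FROM "three_hour"."cqrptest_1m" LIMIT 5'
influx -database test -execute 'SELECT * FROM "six_hour"."cqrptest_15m" LIMIT 5'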

10-20 samples per second, from about 20 different data series.

@Farkas_Levente Data compresses down to ~2 bytes per point. At 400 points a second you would be able to store about a month’s worth of data (1B values) in ~2GB of storage. Downsampling after that to the 5 minute rollups would mean your total footprint never goes above 3GB. Does that work for you?
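
(The arithmetic behind that estimate: roughly 20 series × 20 samples/s ≈ 400 points/s; 400 × 86,400 s/day × 30 days ≈ 1.04 billion points; at ~2 bytes per point that is ≈ 2.1 GB per month of raw data.)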

Hmm, that’d be good. We’ll test it. Thanks.