Simplifying InfluxDB: Retention Policy Best Practices

Originally published at: Simplifying InfluxDB: Retention Policy Best Practices | InfluxData

Retention policies can be tricky at the best of times, but when you’re dealing with time series data, setting up an appropriate retention policy to automatically expire (delete) unnecessary data can save you a lot of time in the long run. This post walks through some general guidelines for creating the best retention policy for your use case with InfluxDB.

Wait...What’s a Retention Policy?

[Image: Data doesn’t remain useful forever.]

Before we start talking about best practices around retention policies, it’s important to understand just what they are. Although the name is fairly self-explanatory, the documentation defines an InfluxDB retention policy as:

The part of InfluxDB’s data structure that describes for how long InfluxDB keeps data (duration), how many copies of those data are stored in the cluster (replication factor), and the time range covered by shard groups (shard group duration). RPs are unique per database and along with the measurement and tag set define a series.

When you create a database, InfluxDB automatically creates a retention policy called autogen with an infinite duration, a replication factor set to one, and a shard group duration set to seven days.
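To make those three settings concrete, here is a minimal Python sketch that models a retention policy and renders it as an InfluxQL 1.x `CREATE RETENTION POLICY` statement. The `RetentionPolicy` class, the `thirty_days` policy, and the `telemetry` database are hypothetical names chosen for illustration. The sketch only builds the statement text and does not talk to a server.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RetentionPolicy:
    """Hypothetical model of the settings a retention policy carries."""
    name: str
    database: str
    duration: str = "INF"                  # how long data is kept; INF means forever
    replication: int = 1                   # copies of the data stored in the cluster
    shard_duration: Optional[str] = None   # time range covered by each shard group
    default: bool = False                  # make this the database's default policy

    def to_influxql(self) -> str:
        """Render an InfluxQL 1.x CREATE RETENTION POLICY statement."""
        parts = [
            f'CREATE RETENTION POLICY "{self.name}" ON "{self.database}"',
            f"DURATION {self.duration}",
            f"REPLICATION {self.replication}",
        ]
        if self.shard_duration is not None:
            parts.append(f"SHARD DURATION {self.shard_duration}")
        if self.default:
            parts.append("DEFAULT")
        return " ".join(parts)


# Hypothetical policy: keep 30 days of data in one-day shard groups.
policy = RetentionPolicy("thirty_days", "telemetry", duration="30d",
                         shard_duration="1d", default=True)
print(policy.to_influxql())
```

The printed statement could then be pasted into the `influx` shell or sent through whichever client you already use.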

So in a nutshell, a retention policy dictates how long data will be kept and, if you’re using InfluxDB Enterprise, how many copies of that data to store. Because time series data tends to pile up quickly, you’ll want to discard or downsample data from InfluxDB once it’s no longer as useful.

General Guidelines

There are a few key things to consider when you’re setting up your database’s retention policy. First and foremost, you’ll need to consider how long your use case requires you to retain the data. Do you need it for a week? A month? A year? This decision determines the duration you set on your retention policy and isn’t really negotiable.

But wait, you’re not done yet. Another integral part of setting up a retention policy is choosing the shard group duration for all data that will follow this policy. This is where things get tricky. Because shards are the core physical part of the database, tuning the shard group duration well can noticeably improve performance, so it’s important to get it right.

Setting the duration on the higher side results in larger collections of data within each shard, which can cause problems at query time. For example, if you query a time window shorter than the shard group’s time span, the database may need to decode longer blocks of data just to read a subset of the shard’s time range, and that takes more effort and time.

On the other hand, if you set the shard group duration on the shorter side, the result is a greater number of shard groups. Due to Time Series Indexing, each shard carries some extra overhead in the form of its index and metadata, so having thousands of shards with little data in each is inefficient.
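As a rough illustration of that trade-off, the following Python sketch estimates how many shard groups are needed to cover a retention period for a few candidate shard group durations. The helper `shard_group_count` is a hypothetical name, and the figure is an approximation that ignores replication and the exact moment expired shard groups are dropped.

```python
import math
from datetime import timedelta


def shard_group_count(retention: timedelta, shard_group_duration: timedelta) -> int:
    """Approximate number of shard groups needed to cover the retention period."""
    if shard_group_duration <= timedelta(0):
        raise ValueError("shard group duration must be positive")
    # Dividing two timedeltas yields a float ratio; round up to whole groups.
    return math.ceil(retention / shard_group_duration)


one_year = timedelta(days=365)
for label, span in [("1h", timedelta(hours=1)),
                    ("1d", timedelta(days=1)),
                    ("7d", timedelta(days=7))]:
    count = shard_group_count(one_year, span)
    print(f"shard group duration {label}: about {count} shard groups")
```

For a one-year retention period, one-hour shard groups come to roughly 8,760 groups, while seven-day shard groups come to 53.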

[Image: It can sometimes be difficult to determine the right setting for your shard group duration.]

My recommendation is to be like Goldilocks and try them all out until you hit the perfect spot!

Okay, all joking aside—we at InfluxData recommend setting the shard group duration as follows:

  • The shard group duration should be twice your longest typical query's time range—yep, that means you’ll need to think about what kinds of queries you’ll be running on InfluxDB.
  • The shard group duration should be set so that each shard group ends up with at least 100,000 points per group—you want more data per shard, but not too much data.
  • The shard group duration should be set so that each shard group has at least 1,000 points per series.
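The three guidelines above can be combined into a back-of-the-envelope calculation. The Python sketch below returns the shortest shard group duration that satisfies all of them, under the simplifying assumption that every series writes exactly one point per fixed interval. The function name and its parameters (`longest_query`, `series_count`, `write_interval`) are hypothetical and only serve the illustration.

```python
from datetime import timedelta

MIN_POINTS_PER_GROUP = 100_000   # guideline: points per shard group
MIN_POINTS_PER_SERIES = 1_000    # guideline: points per series in each shard group


def minimum_shard_group_duration(longest_query: timedelta,
                                 series_count: int,
                                 write_interval: timedelta) -> timedelta:
    """Shortest duration meeting all three guidelines.

    Assumes each of the series_count series writes one point per write_interval.
    """
    if series_count <= 0:
        raise ValueError("series_count must be positive")
    # Guideline 1: twice the longest typical query's time range.
    by_query = 2 * longest_query
    # Guideline 2: points per group = series_count * (duration / write_interval).
    by_group_points = write_interval * MIN_POINTS_PER_GROUP / series_count
    # Guideline 3: each series contributes duration / write_interval points.
    by_series_points = write_interval * MIN_POINTS_PER_SERIES
    return max(by_query, by_group_points, by_series_points)


# Hypothetical workload: 50 series, one point every 10 seconds, 6-hour queries.
print(minimum_shard_group_duration(timedelta(hours=6), 50, timedelta(seconds=10)))
```

Treat the result as a lower bound to round up to a convenient value rather than an exact setting, since real workloads rarely write at a perfectly even rate.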

Summary

If you’re new to using InfluxDB, setting up your database schema and retention policies can sometimes feel like a daunting task, especially in more exceptional cases such as working with very large clusters (InfluxDB Enterprise) or with very long or very short retention periods. You’ll definitely want to spend some time tweaking retention duration and shard group duration until you find the right fit. After all, it took Goldilocks three tries, right? Once you find the setting that’s just right, tweet us @InfluxDB and @mschae16 and tell us all about it!

Dear mschae16,

Two of the recommendations in the post depend on the number of points per shard group and per series. Unfortunately, I don’t know of a straightforward way to calculate or query those values. Could you please give me a hint?

I already tried to play around with the _internal database. However, it is unclear what the fields, such as writePointsOk, mean.

Regards,
Giang