How to set a size limit for a database created in InfluxDB?

influxdb
#1

Hi,

Thanks for providing the great time series database InfluxDB and the TICK stack.

Here are my questions:

  1. How can I set a size limit for a database created in InfluxDB? I have gone through the retention policy documentation, and I am able to specify the duration and the number of replicas, but there is no size setting per database.
  2. If I set a duration, does it apply to the database as a whole or to each data point stored in a measurement? To be precise: after the duration has passed, will the complete database and all of its data be deleted?

My requirement is to set a database size and, when it reaches that threshold, delete the oldest data.

#2

My use case is to avoid exhausting the system's storage as the database grows; that is, the database should not take up all the free space on the hard disk. There has to be some limit beyond which it does not allow writes. When it is full, some database management should happen, such as deleting the oldest month of data.

Sending every point with a retention policy will help me, I guess. Let me play with retention policies. I also do not really understand continuous queries; if possible, could you provide an easy example? That would help me a lot.

I’ve queried SHOW STATS and seen the tsm_fileStore diskBytes for the different databases. Is that the real physical size of the database as stored in the InfluxDB directory on the server? Also, are there any options to back up the database and automatically clear the points from it whenever the retention period is over?

Thanks in advance.

#3

@bgshankar There is some excellent documentation on these topics:
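
To make the retention policy and continuous query ideas a bit more concrete, here is a minimal sketch in Python (not taken from the linked documentation) that only builds the InfluxQL statements as strings. The names `mydb`, `one_month`, `one_year`, `cpu`, `cpu_30m` and `usage` are hypothetical placeholders, and the helper functions are my own. As far as I understand it, a retention policy drops only the data older than its duration; the database itself is not deleted.

```python
def create_retention_policy(name, database, duration, replication=1, default=False):
    """Build a CREATE RETENTION POLICY statement (InfluxDB 1.x InfluxQL)."""
    stmt = (
        f'CREATE RETENTION POLICY "{name}" ON "{database}" '
        f"DURATION {duration} REPLICATION {replication}"
    )
    if default:
        stmt += " DEFAULT"
    return stmt


def create_continuous_query(name, database, target_rp, target_measurement,
                            source_measurement, field, interval):
    """Build a continuous query that downsamples one field with mean()."""
    return (
        f'CREATE CONTINUOUS QUERY "{name}" ON "{database}" BEGIN '
        f'SELECT mean("{field}") '
        f'INTO "{database}"."{target_rp}"."{target_measurement}" '
        f'FROM "{source_measurement}" GROUP BY time({interval}) END'
    )


# Hypothetical setup: keep raw points for 30 days, keep 30-minute means for a year.
raw_rp = create_retention_policy("one_month", "mydb", "30d", default=True)
long_rp = create_retention_policy("one_year", "mydb", "52w")
cq = create_continuous_query("cq_30m", "mydb", "one_year", "cpu_30m",
                             "cpu", "usage", "30m")

for statement in (raw_rp, long_rp, cq):
    print(statement)
```

The printed statements could then be pasted into the `influx` shell. The idea is that raw points expire after 30 days while the continuous query keeps a much smaller downsampled copy in the longer retention policy.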

#4

I read all of those links, and none of them answers the original question. I have the same need - I want to limit the absolute size of the InfluxDB files on disk. I have a virtual machine with limited disk space, and I want to allow InfluxDB to use up to 10 GB of it. Is there any way to do that? It looks like retention policies only apply to removing data based on time, but what I need is to remove oldest data based on storage size. If it is not built into InfluxDB, can it be accomplished with an external script?

#5

Hi @asthomas,
You cannot configure limits based on size.
You could, however, set your retention policy duration to 1 day and check the size of your database after a few days.
That gives you the storage requirement for 1 day, assuming the input volume is the same every day.
From there you can increase the duration after calculating how many days fit in the 10 GB.
Is that an acceptable method?
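
The arithmetic behind that estimate could look roughly like the Python sketch below. The measured figure of 350 MiB per day is a made-up number, and the `safety_margin` parameter is my own addition (an assumption, to leave some headroom on the disk), not something InfluxDB provides.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def days_that_fit(bytes_per_day, disk_budget_bytes, safety_margin=0.8):
    """Return how many whole days of data fit in the disk budget.

    safety_margin is the fraction of the budget we allow ourselves to fill
    (an assumption for headroom, not an InfluxDB setting).
    """
    if bytes_per_day <= 0:
        raise ValueError("bytes_per_day must be positive")
    usable_bytes = int(disk_budget_bytes * safety_margin)
    return usable_bytes // bytes_per_day


# Hypothetical measurement: the database grew by 350 MiB with a 1-day retention.
measured_one_day = 350 * MIB
retention_days = days_that_fit(measured_one_day, 10 * GIB)
print(f"Suggested retention policy duration: {retention_days}d")
```

This only holds if the daily input volume is roughly constant, which is the same caveat as above.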

#6

Wow, thanks for the quick answer. I thought that might be the case. My application is unpredictable. It could accumulate 10 GB of data in a day, or in a year. That makes storage planning impossible with retention policies based on time.

Is there a way to do this with a script? For example, I could measure the size on disk with “du”, but then how would I reduce the data? Could I, for example, query InfluxDB for the oldest timestamp, divide the time span by 2, and then issue a command to delete all data older than that time? I know that is pretty crude, since time is not a valid measure of data volume in this case, but it would be better than exhausting the disk. Does InfluxDB have a call to delete any data older than a specific date?

#7

In InfluxDB you can do things like this:

delete from mymeasurement where time < now() - 2d

delete from mymeasurement where time < now() - 2h

delete from mymeasurement where time < '2019-03-01'

and at the database level:

delete where time < '2019-03-01'

Note the whitespace on both sides of the minus sign, which InfluxQL requires before a duration. Whether the data is actually removed from disk depends on your shard duration.

see also here for the delete options
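
For the external script idea, a rough Python sketch under several assumptions could look like this: the data directory path, the database name `mydb`, the 10 GB limit and the helper names are all hypothetical, and the oldest and newest timestamps are assumed to have been obtained separately. The statement is sent to the InfluxDB 1.x HTTP `/query` endpoint. It follows the crude "delete everything older than the midpoint" approach suggested earlier, so it is a starting point rather than a finished tool.

```python
import os
import urllib.parse
import urllib.request
from datetime import datetime, timezone


def dir_size_bytes(path):
    """Sum the sizes of all files under path (similar in spirit to du)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # file vanished between listing and stat (e.g. compaction)
    return total


def midpoint(oldest, newest):
    """Return the instant halfway between two timezone-aware datetimes."""
    return oldest + (newest - oldest) / 2


def build_delete(cutoff):
    """Build a database-level DELETE for all points older than cutoff."""
    stamp = cutoff.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"DELETE WHERE time < '{stamp}'"


def prune_statement(data_dir, limit_bytes, oldest, newest):
    """Return a DELETE statement if data_dir exceeds limit_bytes, else None."""
    if dir_size_bytes(data_dir) <= limit_bytes:
        return None
    return build_delete(midpoint(oldest, newest))


def run_statement(base_url, database, statement):
    """POST an InfluxQL statement to the 1.x /query endpoint."""
    url = f"{base_url}/query?" + urllib.parse.urlencode({"db": database})
    body = urllib.parse.urlencode({"q": statement}).encode()
    with urllib.request.urlopen(url, data=body) as response:
        return response.read().decode()


if __name__ == "__main__":
    # All values below are assumptions for illustration only.
    statement = prune_statement(
        "/var/lib/influxdb/data/mydb",
        10 * 1000 ** 3,
        oldest=datetime(2019, 1, 1, tzinfo=timezone.utc),
        newest=datetime.now(timezone.utc),
    )
    if statement is not None:
        print(run_statement("http://localhost:8086", "mydb", statement))
```

Run from cron, something like this would check the size periodically. Because of the shard duration caveat above, the size on disk may not shrink immediately after the delete, so the script should not be run in a tight loop.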

Hope this helps solve your problem.

#8

Thank you. That is very helpful. I’ll experiment a bit and see what I can come up with.