We are currently using InfluxDB 2.7.1 to collect time series data in our application. We recently had an issue where InfluxDB
would not start for several weeks, until the retention policies kicked in and deleted data. On each startup, InfluxDB would consume increasing amounts of
RAM until it crashed with an OOM exception.
The current InfluxDB instance is deployed with Docker Compose on a server with shared resources. We need to limit InfluxDB’s RAM consumption to
keep the server stable. The server has 32GB of RAM, but InfluxDB is limited to 10GB through Docker. During normal operation we see spikes in RAM
and CPU usage that correlate with queries made by other applications, but these have not caused any problems.
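To put numbers on those spikes relative to the 10GB limit, something like the following could be run inside the container. This is only a minimal sketch: it assumes the cgroup v1 memory files, which is what we would expect on a 3.10 kernel but have not verified for our deployment, and the function name `read_memory_usage` is made up for illustration.

```python
from pathlib import Path

# Assumption: cgroup v1 layout as seen from inside the container.
# On cgroup v2 hosts the file names and location are different.
CGROUP_V1_MEMORY_DIR = Path("/sys/fs/cgroup/memory")


def read_memory_usage(cgroup_dir=CGROUP_V1_MEMORY_DIR):
    """Return (usage_bytes, limit_bytes, usage_fraction) for one memory cgroup."""
    cgroup_dir = Path(cgroup_dir)
    # Both files contain a single integer number of bytes.
    usage = int((cgroup_dir / "memory.usage_in_bytes").read_text().strip())
    limit = int((cgroup_dir / "memory.limit_in_bytes").read_text().strip())
    return usage, limit, usage / limit
```

Calling `read_memory_usage()` every few seconds and logging the result next to the query times of the other applications should show how close each spike gets to the limit.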
After a scheduled restart at the end of November, the DB would use the full 10GB of allocated RAM as soon as one of the other applications connected
to it, and crash with an OOM exception after ~30 seconds. The DB could not be used at all. Waiting longer between starting the DB
and connecting to it did not change the behavior. Just booting the DB, without any application connecting, did not trigger the problem. Now, in January, the configured data retention has removed
most of the data, and starting and using the DB works again. We currently have 3 buckets whose retention policies are 30d, 90d and autogen.
Most of the data is in the 30d bucket.
So the current hypothesis is that the large amount of data on disk led to the crash. As the data is now growing again, we are afraid we will run into the same problem
again at the end of the month.
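To test this hypothesis, we could record the on-disk size per bucket over time and compare it with the size at which the crashes started. A minimal sketch, assuming the data lives under `/var/lib/influxdb2/engine/data` with one subdirectory per bucket ID (the usual layout of the Docker image as far as we know, not verified for our setup); the helper name `size_per_subdir` is made up for illustration.

```python
import os
from pathlib import Path


def size_per_subdir(root):
    """Map each immediate subdirectory of root to the total bytes of files below it."""
    sizes = {}
    for entry in Path(root).iterdir():
        if not entry.is_dir():
            continue
        total = 0
        for dirpath, _dirnames, filenames in os.walk(entry):
            for name in filenames:
                try:
                    total += os.path.getsize(os.path.join(dirpath, name))
                except OSError:
                    # A file can disappear while we walk, e.g. during a compaction.
                    pass
        sizes[entry.name] = total
    return sizes
```

Running `size_per_subdir("/var/lib/influxdb2/engine/data")` once a day (the path is the assumption mentioned above) would give a growth curve per bucket.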
InfluxDB OSS v2.7.1 in docker
OS Red Hat Enterprise Linux Server release 7.9 (Maipo)
Linux Kernel 3.10.0-1160.105.1.el7.x86_64
Server RAM 32GB
InfluxDB RAM limit set through Docker: 10GB
Buckets and retention policy:
3 relevant buckets with 3 retention policies: 30d, 90d and autogen
Storage size on disk: around 20GB (current InfluxDB data storage)
Total disk size (shared with the application): 500GB
Available disk size: 165GB
Any help is highly appreciated.
Thank you. Ping @Anaisdg