Crash: cannot allocate memory

This started happening 2-3 weeks ago, and now I need to reboot the server 10-15 times before it works.

Virtual server
InfluxDB 2.4 OSS
Debian 11.5
Release: 5.10.0-18-amd64 version: 5.10.140-1
RAM 30GB
Disk is not full
~40 databases, 45GB total disk usage

Upgraded from 1.8 → 2.1 in February 2022

SERVICE STATUS

● influxdb.service - InfluxDB is an open-source, distributed, time series database
     Loaded: loaded (/lib/systemd/system/influxdb.service; enabled; vendor preset: enabled)
     Active: active (running) since Mon 2022-10-10 10:09:16 CEST; 19min ago
       Docs: https://docs.influxdata.com/influxdb/
    Process: 598 ExecStart=/usr/lib/influxdb/scripts/influxd-systemd-start.sh (code=exited, status=0/SUCCESS)
   Main PID: 606 (influxd)
      Tasks: 33 (limit: 36075)
     Memory: 11.3G
        CPU: 12min 30.066s
     CGroup: /system.slice/influxdb.service
             └─606 /usr/bin/influxd

Oct 10 10:08:36 oemc44 influxd-systemd-start.sh[598]: InfluxDB API at http://localhost:8086/ready unavailable after 2 attempts...
Oct 10 10:08:41 oemc44 influxd-systemd-start.sh[598]: InfluxDB API at http://localhost:8086/ready unavailable after 3 attempts...
Oct 10 10:08:46 oemc44 influxd-systemd-start.sh[598]: InfluxDB API at http://localhost:8086/ready unavailable after 4 attempts...
Oct 10 10:08:51 oemc44 influxd-systemd-start.sh[598]: InfluxDB API at http://localhost:8086/ready unavailable after 5 attempts...
Oct 10 10:08:56 oemc44 influxd-systemd-start.sh[598]: InfluxDB API at http://localhost:8086/ready unavailable after 6 attempts...
Oct 10 10:09:01 oemc44 influxd-systemd-start.sh[598]: InfluxDB API at http://localhost:8086/ready unavailable after 7 attempts...
Oct 10 10:09:06 oemc44 influxd-systemd-start.sh[598]: InfluxDB API at http://localhost:8086/ready unavailable after 8 attempts...
Oct 10 10:09:11 oemc44 influxd-systemd-start.sh[598]: InfluxDB API at http://localhost:8086/ready unavailable after 9 attempts...
Oct 10 10:09:16 oemc44 influxd-systemd-start.sh[598]: InfluxDB started
Oct 10 10:09:16 oemc44 systemd[1]: Started InfluxDB is an open-source, distributed, time series database.

/lib/systemd/system/influxdb.service

[Unit]
Description=InfluxDB is an open-source, distributed, time series database
Documentation=https://docs.influxdata.com/influxdb/
After=network-online.target

[Service]
User=influxdb
Group=influxdb
LimitNOFILE=100000
EnvironmentFile=-/etc/default/influxdb2
ExecStart=/usr/lib/influxdb/scripts/influxd-systemd-start.sh
KillMode=control-group
Restart=on-failure
Type=forking
PIDFile=/var/lib/influxdb/influxd.pid

[Install]
WantedBy=multi-user.target
Alias=influxd.service

SYSLOG

Oct 10 10:00:12 oemc44 systemd[1]: influxdb.service: start operation timed out. Terminating.
Oct 10 10:00:12 oemc44 systemd[1]: influxdb.service: Failed with result 'timeout'.
Oct 10 10:00:12 oemc44 systemd[1]: Failed to start InfluxDB is an open-source, distributed, time series database.
Oct 10 10:00:12 oemc44 systemd[1]: influxdb.service: Consumed 1min 34.144s CPU time.
Oct 10 10:00:12 oemc44 systemd[1]: influxdb.service: Scheduled restart job, restart counter is at 2.
Oct 10 10:00:12 oemc44 systemd[1]: Stopped InfluxDB is an open-source, distributed, time series database.
Oct 10 10:00:12 oemc44 systemd[1]: influxdb.service: Consumed 1min 34.144s CPU time.
Oct 10 10:00:12 oemc44 systemd[1]: Starting InfluxDB is an open-source, distributed, time series database...
Oct 10 10:00:12 oemc44 influxd-systemd-start.sh[1327]: Command "print-config" is deprecated, use the influx-cli command server-config to display the configuration values from the running server
Oct 10 10:00:12 oemc44 influxd-systemd-start.sh[1347]: Command "print-config" is deprecated, use the influx-cli command server-config to display the configuration values from the running server
Oct 10 10:00:12 oemc44 influxd-systemd-start.sh[1359]: Command "print-config" is deprecated, use the influx-cli command server-config to display the configuration values from the running server
Oct 10 10:00:12 oemc44 influxd-systemd-start.sh[1325]: InfluxDB API at http://localhost:8086/ready unavailable after 1 attempts...
Oct 10 10:00:17 oemc44 influxd-systemd-start.sh[1325]: InfluxDB API at http://localhost:8086/ready unavailable after 2 attempts...
Oct 10 10:00:22 oemc44 influxd-systemd-start.sh[1325]: InfluxDB API at http://localhost:8086/ready unavailable after 3 attempts...
Oct 10 10:00:27 oemc44 influxd-systemd-start.sh[1325]: InfluxDB API at http://localhost:8086/ready unavailable after 4 attempts...
Oct 10 10:00:28 oemc44 influxd-systemd-start.sh[1326]: ts=2022-10-10T08:00:28.557162Z lvl=error msg="Failed to open shard" log_id=0dSENxzG000 service=storage-engine service=store op_name=tsdb_open db_shard_id=8121 error="[shard 8121] cannot allocate memory"
Oct 10 10:00:28 oemc44 influxd-systemd-start.sh[1326]: ts=2022-10-10T08:00:28.557584Z lvl=error msg="Failed to open shard" log_id=0dSENxzG000 service=storage-engine service=store op_name=tsdb_open db_shard_id=5805 error="[shard 5805] cannot allocate memory"
Oct 10 10:00:28 oemc44 influxd-systemd-start.sh[1326]: ts=2022-10-10T08:00:28.557482Z lvl=error msg="Failed to open shard" log_id=0dSENxzG000 service=storage-engine service=store op_name=tsdb_open db_shard_id=8183 error="[shard 8183] cannot allocate memory"
Oct 10 10:00:28 oemc44 influxd-systemd-start.sh[1326]: ts=2022-10-10T08:00:28.558669Z lvl=error msg="Failed to open shard" log_id=0dSENxzG000 service=storage-engine service=store op_name=tsdb_open db_shard_id=5847 error="[shard 5847] cannot allocate memory"
Oct 10 10:00:28 oemc44 influxd-systemd-start.sh[1326]: ts=2022-10-10T08:00:28.559386Z lvl=error msg="Failed to open shard" log_id=0dSENxzG000 service=storage-engine service=store op_name=tsdb_open db_shard_id=4827 error="[shard 4827] cannot allocate memory"
Oct 10 10:00:28 oemc44 influxd-systemd-start.sh[1326]: ts=2022-10-10T08:00:28.559477Z lvl=error msg="Failed to open shard" log_id=0dSENxzG000 service=storage-engine service=store op_name=tsdb_open db_shard_id=3613 error="[shard 3613] cannot allocate memory"

Can anyone help me?

Hello @flopp,
I’ve asked for help. I’m not sure. Thank you for your patience.

You can add
TimeoutStartSec=<some number of seconds (or infinity)>
inside the [Service] block to give it more time to come up. I would start with that and see if things are healthy once influx is given the proper time to boot.
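
For reference, here is a minimal sketch of that setting as a systemd drop-in, so the packaged unit file stays untouched. The drop-in path and the value of 300 seconds are assumptions for illustration, not tested recommendations for this server; pick a value that matches how long your instance actually needs to open its shards.

```ini
# /etc/systemd/system/influxdb.service.d/override.conf
# (assumed location; "systemctl edit influxdb" creates a drop-in like this)
[Service]
# Assumed value: allow up to 300 seconds for the start script to see
# http://localhost:8086/ready respond before systemd gives up.
# "infinity" disables the start timeout entirely.
TimeoutStartSec=300
```

After saving the file, run `systemctl daemon-reload` and then `systemctl restart influxdb` so systemd picks up the change. Note that this only addresses the "start operation timed out" message in the syslog; it may not help with the "cannot allocate memory" shard errors.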

Thanks for your reply.
It has now happened again and I cannot start it.
I noticed that the service reached 8.0GB and then crashed.
Any idea how to solve this?

It happened again. I tested with version 2.5 and it didn't help.

Is there a maximum RAM usage?

Is there a limit on how many buckets we can use?
Is there a limit on how many shards each bucket can have?

We use V1 mapping for all buckets.