InfluxDB 2 Docker - High Memory Loop

Hi,

I’m running an InfluxDB (2.6) container with a limit on its available memory.

I’ve encountered this a few times: one of the users runs some kind of query (I don’t know the precise conditions) and InfluxDB goes into a loop where it consumes all the available memory and reads heavily from disk (memory is fully used and BLOCK I/O shows a very high number).

When I look at the InfluxDB container logs, I see the following pattern:

{"log":"ts=2023-06-21T08:08:33.912120Z lvl=info msg=\"loading changes (start)\" log_id=0iZFof5l000 service=storage-engine engine=tsm1 op_name=\"field indices\" op_event=start\n","stream":"stdout","time":"2023-06-21T08:08:33.912170363Z"}
{"log":"ts=2023-06-21T08:08:33.912150Z lvl=info msg=\"loading changes (end)\" log_id=0iZFof5l000 service=storage-engine engine=tsm1 op_name=\"field indices\" op_event=end op_elapsed=0.033ms\n","stream":"stdout","time":"2023-06-21T08:08:33.912221505Z"}
{"log":"ts=2023-06-21T08:08:33.913646Z lvl=info msg=\"Opened file\" log_id=0iZFof5l000 service=storage-engine engine=tsm1 service=filestore path=/var/lib/influxdb2/engine/data/844ea0a92a6d5f0e/autogen/256/000000021-000000002.tsm id=0 duration=5.007ms\n","stream":"stdout","time":"2023-06-21T08:08:33.913729618Z"}
{"log":"ts=2023-06-21T08:08:33.913812Z lvl=info msg=\"Opened shard\" log_id=0iZFof5l000 service=storage-engine service=store op_name=tsdb_open index_version=tsi1 path=/var/lib/influxdb2/engine/data/844ea0a92a6d5f0e/autogen/256 duration=26.220ms\n","stream":"stdout","time":"2023-06-21T08:08:33.913877392Z"}
{"log":"ts=2023-06-21T08:08:33.916032Z lvl=info msg=\"Opened file\" log_id=0iZFof5l000 service=storage-engine engine=tsm1 service=filestore path=/var/lib/influxdb2/engine/data/8963475e496bb947/autogen/259/000000012-000000002.tsm id=0 duration=3.714ms\n","stream":"stdout","time":"2023-06-21T08:08:33.916094505Z"}
{"log":"ts=2023-06-21T08:08:33.917749Z lvl=info msg=\"Opened shard\" log_id=0iZFof5l000 service=storage-engine service=store op_name=tsdb_open index_version=tsi1 path=/var/lib/influxdb2/engine/data/8963475e496bb947/autogen/259 duration=16.911ms\n","stream":"stdout","time":"2023-06-21T08:08:33.917818234Z"}
{"log":"ts=2023-06-21T08:08:33.920227Z lvl=info msg=\"index opened with 8 partitions\" log_id=0iZFof5l000 service=storage-engine index=tsi\n","stream":"stdout","time":"2023-06-21T08:08:33.920276544Z"}

This continues even after restarting the container or the machine.

The only way I’ve found to get out of this loop is to stop the container, increase or remove the memory limit, and let it run for a while. It then settles back down to the lower memory usage it usually has.

What is happening?
How can this be avoided?
How can I know what query caused it?

Thanks in advance,
David

Hello @DavidHy,
You can try using the Flux profiler to see which query is taking up resources.
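For example (a minimal sketch; the bucket, range, and filter below are placeholders to be replaced with the suspect query), importing the Flux profiler package and enabling its profilers makes the query return extra tables with per-query and per-operator statistics, including execution time and allocated memory:

import "profiler"

// Attach per-query and per-operator statistics to the query results
option profiler.enabledProfilers = ["query", "operator"]

// Placeholder query: substitute the query you want to inspect
from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "example-measurement")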

Operator control over memory is one of the problems that 3.0 is looking to address, because issues like these are quite common.
I encourage you to look into it and learn more.
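
In the meantime, on 2.x you can also cap what the query engine itself is allowed to use, below the container limit, so a runaway query should fail with an allocation error instead of starving the whole engine. A rough sketch, assuming the container is started with a plain docker run (the container name, volume, port, and all limit values below are only illustrative):

# Hard-limit the container, then cap the query engine below that limit
# so a single heavy query is rejected rather than exhausting the host.
# All values are examples; tune them for your workload.
docker run -d --name influxdb \
  --memory=2g \
  -e INFLUXD_QUERY_CONCURRENCY=10 \
  -e INFLUXD_QUERY_MEMORY_BYTES=52428800 \
  -e INFLUXD_QUERY_MAX_MEMORY_BYTES=1073741824 \
  -v influxdb2:/var/lib/influxdb2 \
  -p 8086:8086 \
  influxdb:2.6

Here INFLUXD_QUERY_MEMORY_BYTES caps a single query (about 50 MiB in this example) and INFLUXD_QUERY_MAX_MEMORY_BYTES caps all concurrent queries together (about 1 GiB), both well below the 2 GiB container limit.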

Hi,

It seems that I am having quite a similar problem.

I’m running InfluxDB as a container on a VM.
A few days ago, the container stopped working, even though nothing had been changed inside the VM.

When I restart it, it gets stuck in a loop, with the following sequence appearing in the log file over and over again (some numbers keep increasing):

ts=2023-07-02T20:24:50.798262Z lvl=info msg="index opened with 8 partitions" log_id=0in3phwG000 service=storage-engine index=tsi
ts=2023-07-02T20:24:50.810737Z lvl=info msg="loading changes (start)" log_id=0in3phwG000 service=storage-engine engine=tsm1 op_name="field indices" op_event=start
ts=2023-07-02T20:24:51.964682Z lvl=info msg="loading changes (end)" log_id=0in3phwG000 service=storage-engine engine=tsm1 op_name="field indices" op_event=end op_elapsed=1153.248ms
ts=2023-07-02T20:24:52.025038Z lvl=info msg="Opened file" log_id=0in3phwG000 service=storage-engine engine=tsm1 service=filestore path=/var/lib/influxdb2/engine/data/64e176f47427e18f/autogen/2222/000000001-000000001.tsm id=0 duration=51.257ms
ts=2023-07-02T20:24:52.029135Z lvl=info msg="Opened shard" log_id=0in3phwG000 service=storage-engine service=store op_name=tsdb_open index_version=tsi1 path=/var/lib/influxdb2/engine/data/64e176f47427e18f/autogen/2222 duration=3112.264ms
ts=2023-07-02T20:24:53.317031Z lvl=info msg="index opened with 8 partitions" log_id=0in3phwG000 service=storage-engine index=tsi
ts=2023-07-02T20:24:53.515744Z lvl=info msg="loading changes (start)" log_id=0in3phwG000 service=storage-engine engine=tsm1 op_name="field indices" op_event=start
ts=2023-07-02T20:24:54.725015Z lvl=info msg="loading changes (end)" log_id=0in3phwG000 service=storage-engine engine=tsm1 op_name="field indices" op_event=end op_elapsed=1209.240ms
ts=2023-07-02T20:24:54.896777Z lvl=info msg="Opened file" log_id=0in3phwG000 service=storage-engine engine=tsm1 service=filestore path=/var/lib/influxdb2/engine/data/64e176f47427e18f/autogen/2223/000000001-000000001.tsm id=0 duration=139.177ms
ts=2023-07-02T20:24:54.898495Z lvl=info msg="Opened shard" log_id=0in3phwG000 service=storage-engine service=store op_name=tsdb_open index_version=tsi1 path=/var/lib/influxdb2/engine/data/64e176f47427e18f/autogen/2223 duration=4660.574ms
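
From what I can tell, these lines are the normal shard-opening sequence that runs at startup, so I suspect the process keeps getting killed partway through loading and is then restarted. One way to check whether Docker is OOM-killing the container (assuming a plain Docker setup; the container name is a placeholder) would be:

# Show whether the last exit was an OOM kill, the exit code,
# and how many times Docker has restarted the container.
docker inspect --format 'oom={{.State.OOMKilled}} exit={{.State.ExitCode}} restarts={{.RestartCount}}' influxdb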

Any help or hint is highly appreciated!

Thanks