Ever-increasing RAM usage with low series cardinality

bongo · October 2, 2017, 4:24pm

Hi,

I’m currently testing InfluxDB 1.3.5 for storing a small number (~30-300) of very long integer series. Worst case for one device: 86400 * 365 * 12 [sec/day * days/year * 12 years] = 378,432,000 points.

For example, the total number of points for 320 devices would be: 86400 * 365 * 12 * 320 [sec/day * days/year * 12 years * 320 devices] = 121,098,240,000.
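For anyone who wants to re-check the arithmetic, here is a minimal Python sketch. It assumes 365-day years and one sample per second per device, as in the figures above; the function names are purely illustrative.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365   # assumption: leap days ignored, as in the post
YEARS = 12


def points_per_device(years: int = YEARS) -> int:
    """Points for one device sampled once per second."""
    return SECONDS_PER_DAY * DAYS_PER_YEAR * years


def total_points(devices: int, years: int = YEARS) -> int:
    """Points for all devices together."""
    return points_per_device(years) * devices


if __name__ == "__main__":
    print(points_per_device())  # one device, 12 years
    print(total_points(320))    # 320 devices, 12 years
```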

The series cardinality is low: it equals the number of devices. I’m using second-precision timestamps (that mode is enabled when I commit to InfluxDB via the PHP API).
Yes, I really need to keep all the samples, so downsampling is not an option.

I’m inserting the samples as point-arrays of size 86400 per request sorted from oldest to newest. The behaviour is similar (OOM in both cases) for inmem and tsi1 indexing modes.
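To make the write pattern concrete, here is a rough Python sketch of how one such batch could be rendered as InfluxDB line protocol. The measurement name `sensor`, the tag key `device` and the field key `value` are made up for illustration and are not taken from my actual feeder; the timestamps are plain epoch seconds, which only makes sense if the write is sent with second precision.

```python
from typing import Iterable, List, Tuple


def to_line_protocol(
    device: str,
    samples: Iterable[Tuple[int, int]],
    measurement: str = "sensor",  # hypothetical measurement name
) -> List[str]:
    """Render (epoch_seconds, value) pairs as line-protocol lines.

    Samples are sorted oldest to newest. The trailing "i" marks the
    field value as an integer.
    """
    ordered = sorted(samples)  # tuples sort by timestamp first
    return [
        f"{measurement},device={device} value={value}i {ts}"
        for ts, value in ordered
    ]


if __name__ == "__main__":
    batch = to_line_protocol("dev1", [(1506902400 + i, i) for i in range(3)])
    print("\n".join(batch))
```

A full day for one device would simply be 86400 such lines in a single request body.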

Despite all that, I’m not able to insert this number of points into the database without crashing it with an out-of-memory error. The host VM has 8GiB of RAM and 4GiB of swap, both of which fill up completely. I cannot find anything in the documentation saying that this setup is problematic, or any notice that it should result in high RAM usage at all…

Does anyone have a hint on what could be wrong here?

Thanks and all the best!
b-

bongo · October 6, 2017, 10:49pm

~~I found out what the issue most likely was:~~

I had a bug in my feeder that caused timestamps not to be updated, so lots of points with distinct values were written over and over again to the same timestamp/tag combination.

If you experience something similar, try double-checking each step in the pipeline for a time-related error.
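A quick way to check a batch for this kind of bug before sending it is to count how often each series/timestamp key occurs. This is only a sketch: it assumes each point can be reduced to a `(device, timestamp)` pair, and the function name is made up.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Key = Tuple[str, int]  # (device tag value, epoch-seconds timestamp)


def find_overwrites(points: Iterable[Key]) -> Dict[Key, int]:
    """Return every (device, timestamp) key that occurs more than once.

    Points that share a series and a timestamp overwrite each other,
    so any key reported here means samples are being lost.
    """
    counts = Counter(points)
    return {key: n for key, n in counts.items() if n > 1}


if __name__ == "__main__":
    batch = [("dev1", 100), ("dev1", 100), ("dev1", 101), ("dev2", 100)]
    print(find_overwrites(batch))
```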

Unfortunately, this was not the issue: RAM usage still keeps rising when importing more points than before.

bongo · October 17, 2017, 8:09pm

So far this worked best for me and brought InfluxDB down to a moderate memory usage of ~3GiB:

- Lower `cache-snapshot-write-cold-duration` to 10s during backfilling.
- Create the default retention policy with a long shard duration, e.g. `create database sensors with duration INF shard duration 5200w name longterm`

This essentially limits the number of shard groups created to one (assuming a date range shorter than 100 years). InfluxDB then manages the TSM files within the group itself, so you should not lose performance; it’s designed to work this way.
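To get a feel for the difference, here is a rough Python estimate of how many shard groups a 12-year backfill touches. It assumes 365-day years and that, as far as I understand the 1.x docs, an infinite retention policy otherwise defaults to 7-day shard groups. It also ignores boundary alignment, which can add one more group.

```python
def shard_groups(span_years: int, shard_duration_weeks: int) -> int:
    """Approximate number of shard groups covering the given time span."""
    span_days = span_years * 365           # assumption: no leap days
    duration_days = shard_duration_weeks * 7
    # Ceiling division in pure integer arithmetic.
    return -(-span_days // duration_days)


if __name__ == "__main__":
    print(shard_groups(12, 1))     # assumed default: 1-week shard groups
    print(shard_groups(12, 5200))  # shard duration 5200w
```

With one-week shard groups the 12 years are split across several hundred groups; with `5200w` everything ends up in a single one.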
