RAM is shooting up because of high data pump to InfluxDB

aksshm · October 27, 2020, 10:39am

Hi,

We are running InfluxDB in a k8s environment and have been pushing some 50k data to InfluxDB continuously for 3 days. We can see the host RAM shooting up and the system getting slow.

As InfluxDB stores data on disk as well as in RAM for fast processing, is there any configuration in InfluxDB that limits how much data is stored in RAM?

Giovanni_Luisotto · October 27, 2020, 11:12am

I had a similar issue: the RAM was saturating because the disk was too slow to write all the data.
As far as I know, there are no conf settings to set a limit on RAM usage, probably because that would mean actually dropping data… which will happen anyway once the RAM fills up (as InfluxDB will refuse new data). My suggestion is to have a look at your disks and ensure they are able to sustain the read/write load.

The only way I found to solve my problem (since I could not have better disks, and they were way below the minimum suggested IOPS) was to limit the gathered data, meaning that I avoided gathering some heavy data from the less used systems.

You can have a look at the hardware sizing suggestion here:

aksshm · October 27, 2020, 11:39am

Hi, Thanks for your reply.

If you are familiar with InfluxDB internals: InfluxDB supports TSI for high cardinality, and we can change InfluxDB to use TSI, right?

Isn’t that the right solution?

Giovanni_Luisotto · October 27, 2020, 11:59am

I’m not that knowledgeable about InfluxDB internals, but you have two options:

TSM - Time-Structured Merge Tree
TSI - Time Series Index

If your problem is caused by high cardinality, then TSI should help. I can’t really tell whether it will solve your issue; if possible, try switching to it and see how it goes. Some people report that it consumes even more RAM than TSM, but I think that depends on the specific situation.

For reference, I’m using TSI on all my InfluxDB instances, and I had problems with it only when the disk was too slow to keep up with the data fetched.

To get a more complete opinion on this one, we might try to summon @Anaisdg and see if she has more info to add about the two indexes.

aksshm · October 27, 2020, 1:01pm

Do InfluxDB 1.7 and later versions use TSI by default?
How can we check whether InfluxDB is running with TSM or TSI?

Giovanni_Luisotto · October 27, 2020, 1:13pm

By default it uses TSM; you can check it in the InfluxDB configuration file.

aksshm · October 27, 2020, 1:20pm

Below is my config file, but I couldn’t see any info regarding that:

reporting-disabled = false
bind-address = ":8088"

[meta]
dir = "/var/lib/influxdb/meta"
retention-autocreate = true
logging-enabled = true

[data]
dir = "/var/lib/influxdb/data"
wal-dir = "/var/lib/influxdb/wal"
query-log-enabled = true
cache-max-memory-size = 1073741824
cache-snapshot-memory-size = 26214400
cache-snapshot-write-cold-duration = "10m0s"
compact-full-write-cold-duration = "4h0m0s"
max-series-per-database = 1000000
max-values-per-tag = 100000
trace-logging-enabled = false

[coordinator]
write-timeout = "10s"
max-concurrent-queries = 0
query-timeout = "0s"
log-queries-after = "0s"
max-select-point = 0
max-select-series = 0
max-select-buckets = 0

[retention]
enabled = true
check-interval = "30m0s"

[shard-precreation]
enabled = true
check-interval = "10m0s"
advance-period = "30m0s"

[admin]
enabled = false
bind-address = ":8083"
https-enabled = false
https-certificate = "/etc/ssl/influxdb.pem"

[monitor]
store-enabled = true
store-database = "_internal"
store-interval = "10s"

[subscriber]
enabled = true
http-timeout = "30s"
insecure-skip-verify = false
ca-certs = ""
write-concurrency = 40
write-buffer-size = 1000

[http]
enabled = true
bind-address = ":8086"
auth-enabled = false
log-enabled = true
write-tracing = false
pprof-enabled = true
https-enabled = false
https-certificate = "/etc/ssl/influxdb.pem"
https-private-key = ""
max-row-limit = 10000
max-connection-limit = 0
shared-secret = "beetlejuicebeetlejuicebeetlejuice"
realm = "InfluxDB"
unix-socket-enabled = false
bind-socket = "/var/run/influxdb.sock"

# TODO: allow multiple graphite listeners

[[graphite]]
enabled = false
bind-address = ":2003"
database = "graphite"
retention-policy = "autogen"
protocol = "tcp"
batch-size = 5000
batch-pending = 10
batch-timeout = "1s"
consistency-level = "one"
separator = "."
udp-read-buffer = 0

# TODO: allow multiple collectd listeners with templates

[[collectd]]
enabled = false
bind-address = ":25826"
database = "collectd"
retention-policy = "autogen"
batch-size = 5000
batch-pending = 10
batch-timeout = "10s"
read-buffer = 0
typesdb = "/usr/share/collectd/types.db"
security-level = "none"
auth-file = "/etc/collectd/auth_file"

# TODO: allow multiple opentsdb listeners with templates

[[opentsdb]]
enabled = false
bind-address = ":4242"
database = "opentsdb"
retention-policy = "autogen"
consistency-level = "one"
tls-enabled = false
certificate = "/etc/ssl/influxdb.pem"
batch-size = 1000
batch-pending = 5
batch-timeout = "1s"
log-point-errors = true

# TODO: allow multiple udp listeners with templates

[[udp]]
enabled = false
bind-address = ":8089"
database = "udp"
retention-policy = "autogen"
batch-size = 5000
batch-pending = 10
read-buffer = 0
batch-timeout = "1s"
precision = "ns"

[continuous_queries]
log-enabled = true
enabled = true
run-interval = "1s"

[logging]
format = "auto"
level = "info"
supress-logo = false

Giovanni_Luisotto · October 27, 2020, 1:30pm

If it’s not there, then it’s using the default value "inmem" (the in-memory index, which is what I called TSM above).

It should be under

[data]
  index-version = "inmem"

Also, as far as I remember, the conversion is not automatic.
See this post (the part about index conversion) and maybe have a look around.
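
The check described above (look for index-version under [data], otherwise assume the default) can be sketched in a few lines of Python. This is only a rough illustration: the function name `index_version` and the sample text are made up for this example, the parsing is a naive line scan rather than a real TOML parser, and the "inmem" fallback simply follows the default mentioned in this thread.

```python
def index_version(config_text: str) -> str:
    """Return the [data] index-version from an influxdb.conf, or "inmem" if unset."""
    section = ""
    for raw in config_text.splitlines():
        # Naive comment stripping: assumes no '#' inside quoted values.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if line.startswith("["):
            # Handles both [data] and [[graphite]] style headers.
            section = line.strip("[]").strip()
            continue
        if section == "data" and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            if key == "index-version":
                return value.strip('"')
    # Assumption taken from this thread: a missing key means the default.
    return "inmem"


# Hypothetical config fragment, shaped like the [data] section posted above.
sample = """
[data]
  dir = "/var/lib/influxdb/data"
  wal-dir = "/var/lib/influxdb/wal"
"""
print(index_version(sample))
print(index_version(sample + '  index-version = "tsi1"\n'))
```

With the config posted earlier in the thread, which has no index-version key, a check like this would fall through to the default.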

aksshm · October 29, 2020, 9:34am

Hi,
After some analysis, we can see memory increasing by almost 1 GB a day.
We have a retention policy of 7d1h.

The analysis is as follows:

A WAL file is created for each shard group, and a segment file is created (in each shard directory) when there is a write to InfluxDB. The WAL file gets appended to and grows; when a segment hits the max segment size (10 MB), it is closed and a new segment is opened.
A TSM file should be created based on cache-snapshot-memory-size, which is 26214400 (about 26 MB), but we see a TSM file being created when the WAL directory (_tmp files) is only 15 MB. Why does that happen?
And when TSM files are created and moved to disk, why does the in-memory size keep increasing (the WAL directory does not hold much data)?
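
One possible reading of the early TSM file, offered as a sketch rather than a confirmed answer: as far as I understand it, the cache is snapshotted either when it grows past cache-snapshot-memory-size or when the shard has received no writes for cache-snapshot-write-cold-duration ("10m0s" in the config above), whichever comes first. The WAL segments on disk are also, as far as I know, compressed, so the WAL directory size may not be directly comparable to the in-memory threshold. The `CacheSettings` class and `should_snapshot` function below are hypothetical names for a simplified model of that decision, not InfluxDB code.

```python
from dataclasses import dataclass


@dataclass
class CacheSettings:
    # Values copied from the [data] section of the config posted above.
    snapshot_memory_size: int = 26214400   # cache-snapshot-memory-size, in bytes
    write_cold_seconds: float = 600.0      # cache-snapshot-write-cold-duration = "10m0s"


def should_snapshot(cache_bytes: int, seconds_since_last_write: float,
                    settings: CacheSettings = CacheSettings()) -> bool:
    """Simplified model: snapshot when the cache is big enough OR the shard went cold."""
    if cache_bytes == 0:
        return False  # nothing to write out
    if cache_bytes > settings.snapshot_memory_size:
        return True   # size threshold reached
    # Below the size threshold, only an idle ("cold") shard triggers a snapshot.
    return seconds_since_last_write > settings.write_cold_seconds


MIB = 1024 * 1024
print(26214400 / MIB)                   # threshold expressed in MiB
print(should_snapshot(15 * MIB, 30))    # small cache, recent writes
print(should_snapshot(15 * MIB, 601))   # small cache, no writes for over 10 minutes
print(should_snapshot(26 * MIB, 1))     # cache above the size threshold
```

Under this model a 15 MB cache would still be flushed to a TSM file if writes to that shard pause for more than ten minutes, which might match what is being observed.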

eturcott · December 5, 2020, 3:17pm

Hi,

Another reason your memory keeps increasing may be that you have defined as a tag something that is always changing, like a process ID or the network interfaces from VMware. List your tags; you might find that the number of values is increasing every day. Tags are loaded in memory.
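
To illustrate the point about changing tags, here is a small sketch that counts distinct series (measurement plus tag set) in a handful of line-protocol points. The sample points, the `pid` tag, and the function names are invented for this example, and the parsing assumes no escaped commas or spaces in measurement or tag names.

```python
from collections import defaultdict


def series_key(line: str) -> str:
    """Return measurement plus sorted tag set for one line-protocol point."""
    head = line.split(" ", 1)[0]            # "measurement,tag=value,..." (no escapes assumed)
    measurement, *tags = head.split(",")
    return ",".join([measurement, *sorted(tags)])


def series_per_measurement(lines):
    """Count distinct series keys per measurement."""
    seen = defaultdict(set)
    for line in lines:
        key = series_key(line)
        seen[key.split(",", 1)[0]].add(key)
    return {measurement: len(keys) for measurement, keys in seen.items()}


# Hypothetical points: "pid" changes on every restart, "host" does not.
points = [
    "procstat,host=node1,pid=101 cpu=0.5 1603800000000000000",
    "procstat,host=node1,pid=102 cpu=0.4 1603800010000000000",
    "procstat,host=node1,pid=103 cpu=0.6 1603800020000000000",
    "cpu,host=node1 usage=12.0 1603800000000000000",
    "cpu,host=node1 usage=13.5 1603800010000000000",
]
print(series_per_measurement(points))
```

Every new pid value creates a new series even though the host is the same, while the cpu measurement stays at a single series. On a live 1.x instance, the InfluxQL statement SHOW SERIES CARDINALITY should give the real number to watch over several days.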

Good luck
