High memory usage in TSI mode

We are running a stress test importing a certain amount of history records from scratch using InfluxDB v1.8.3 OSS. The goal of the test is to discover the parameters and hardware spec needed to get the best out of InfluxDB. By switching to TSI mode, we expected memory usage to decrease. Unfortunately, we're pretty much stuck in the same situation: memory usage is still high and the InfluxDB service goes up and down periodically. We've tried some approaches mentioned in other threads, but they haven't helped us much in finding a way out.

We gained slightly better memory usage by increasing the shard group duration (4w). But that won't help much once we multiply the amount of records we're going to import; it will still exceed the memory limit (64 GB). It seems that even though we've switched to TSI, RAM usage is not dropping as sharply as we expected, for some reason. Our schema design will create 3000+ databases and 10000+ points per database. The time span of the data will be a year (we're aiming to hold 3+ years of data). We'd like to find out if there's any piece we're missing here, and why TSI mode drains more memory than inmem mode.

We started looking into tweaking parameters, lowering max-index-log-file-size to 64k and tripling vm.max_map_count to 2048000, hoping these tune-ups would alleviate the memory pressure. Does anyone have a formula for estimating those numbers?
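For anyone reproducing this: the index setting lives in the [data] section of influxdb.conf. A minimal sketch with the values from this post (these are what we tried, not tuned recommendations):

```toml
# influxdb.conf sketch — values from this thread, not recommendations
[data]
  index-version = "tsi1"            # disk-based TSI index instead of inmem
  max-index-log-file-size = "64k"   # compact index log files to TSI sooner
```

The kernel-side change was applied separately, e.g. `sysctl -w vm.max_map_count=2048000`, since TSI memory-maps index and TSM files.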

Other than that, we found that with more workers inserting data, InfluxDB consumes much more memory. Sending compressed data in batch mode is a bit worse than uncompressed. With a custom RP set, the data processing rate also drops. Our InfluxQL creates the database each time points are inserted. Could that be the cause, with RP creation getting triggered every time? Our InfluxDB node is on Ubuntu 20.04 with 8 cores, 64 GB RAM, and a gp2 SSD. Any comments on these issues are welcome. Thanks.
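On the per-insert CREATE DATABASE suspicion: one client-side mitigation is to issue the create statement only once per database name instead of on every write. A minimal sketch of that guard (the `create_fn` callable is a stand-in for whatever client call you use, e.g. influxdb-python's `create_database`; names here are hypothetical):

```python
# Sketch: issue CREATE DATABASE at most once per database name,
# instead of on every batch insert.
_created = set()

def ensure_database(name, create_fn):
    """Call create_fn(name) only the first time this name is seen."""
    if name not in _created:
        create_fn(name)
        _created.add(name)

# Demonstration with a stand-in for the client call:
calls = []
for _ in range(3):
    ensure_database("metrics_2021", calls.append)
print(len(calls))  # 1 — the create call ran only once
```

In InfluxDB 1.x, `CREATE DATABASE` on an existing database is a no-op server-side, but skipping the round trip entirely removes it from the hot write path.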


We had very similar issues. The version of Go we were using doesn't release memory, which caused us a ton of issues.

Try adding this to /etc/default/influxdb:

GODEBUG=madvdontneed=1

This will tell Go to release memory, and you should see it decrease over time. (A service restart is required for this to take effect.)

Hello! Have you tried this option?
“GODEBUG=madvdontneed=1” in “/etc/default/influxdb”?
Have you solved the problem with high memory usage?
I have a similar problem.

Thanks to @Esity.

That really helped us decrease memory usage during our data import process. We applied a few more tweaks alongside the custom RP, e.g., max-index-log-file-size, max-concurrent-compactions, and cache-snapshot-memory-size, so that we could reach the expected memory-usage balance in our environment.
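For reference, all three of those knobs live in the [data] section of influxdb.conf. A sketch with placeholder values (ours depend on workload, so treat these as illustrative only):

```toml
# influxdb.conf sketch — placeholder values, tune for your workload
[data]
  max-index-log-file-size = "64k"     # flush index log files to TSI sooner
  max-concurrent-compactions = 2      # cap compaction parallelism
  cache-snapshot-memory-size = "25m"  # snapshot the write cache earlier
```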

We are trying to deploy InfluxDB in Kubernetes rather than standalone. Since the InfluxDB container image is based on Alpine, exposing the variable in /etc/default/influxdb doesn't work there. I know this might be out of scope for this forum, but we'd appreciate any comments on it.
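In Kubernetes the environment variable can be set directly on the container spec instead of via /etc/default/influxdb. A sketch, assuming a plain Deployment (the names and labels are placeholders):

```yaml
# Sketch: pass the Go runtime flag through the pod spec
apiVersion: apps/v1
kind: Deployment
metadata:
  name: influxdb
spec:
  selector:
    matchLabels:
      app: influxdb
  template:
    metadata:
      labels:
        app: influxdb
    spec:
      containers:
        - name: influxdb
          image: influxdb:1.8.3
          env:
            - name: GODEBUG        # same flag as the env file above
              value: madvdontneed=1
```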

We also found an interesting behavior: restarting the InfluxDB service helps release the memory (but we're not in favor of applying that because of the downtime).

I tried setting “GODEBUG=madvdontneed=1”, and also different combinations of values for the following settings:
query-concurrency
query-memory-bytes
query-queue-size
storage-cache-max-memory-size
storage-cache-snapshot-memory-size
query-initial-memory-bytes
query-max-memory-bytes

I cannot get memory usage to fall under 400 MB, even with a very small data set.
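If those are influxd 2.x options, they can be collected in a config.yml (or set as the matching INFLUXD_* environment variables). A sketch with illustrative values only, not recommendations:

```yaml
# config.yml sketch — illustrative values only
query-concurrency: 10
query-queue-size: 10
query-memory-bytes: 52428800               # per-query cap, in bytes
storage-cache-max-memory-size: 536870912   # 512 MiB cache ceiling
storage-cache-snapshot-memory-size: 26214400
```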

Any clue will be welcome!