InfluxDB TSI Index - How is memory used?

In short, I’d like to know how InfluxDB is using the allocated RAM when using TSI indexes, and possibly even monitor it to understand how much of the memory is used to do what.

My observation is that, without any limit, InfluxDB will allocate all the available memory. After setting OS-level limits on the memory it can allocate, I was afraid of getting an out-of-memory (OOM) error, but it kept running without issues. So the question became: “how much memory is actually needed?”

After looking at docs and old posts I know that memory is used for the following:

  • Running the engine itself (Go runtime)
  • WAL caching
  • TSI Index Files
  • ???

Here I’ve found the following:

The general behavior to expect is that InfluxDB will use as much memory as is available to maintain an in-memory index and fall back to disk for anything else.
{…} after TSI, when the memory limit gets hit, InfluxDB starts referring to those indices. Also, it loads the WAL in-mem, indices are paged in as required.

I’d also like to identify the index files that are memory-mapped (IndexFile) at OS level.

Will InfluxDB allocate all the available memory or just what it needs to work properly?
Is it possible to estimate or monitor how much memory is needed due to the data (index files)?
Is there something else I’ve left out that consumes memory?

Thanks

Index file == .tsi file.

I recommend pointing the user toward pprof: InfluxDB runtime | InfluxDB OSS 2.3 Documentation. Note that some “memory usage” of InfluxDB will be in the form of reserved memory (measurable by heap profiles or the RSS size in top), and some is in the form of the Linux page cache backed by mmap’d files (mostly TSM (data) and TSI (series index) files). If you have limited memory headroom, query performance will suffer since there is less room for the page cache, but the page cache never causes an OOM, since it can always be dropped and re-read from disk.
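To see which index files are memory-mapped at the OS level, one option on Linux is to read /proc/<pid>/smaps for the influxd process and add up the resident size (Rss) per kind of mapping. The following is only a rough sketch: the function names and the four categories are made up for illustration, and it assumes that index files end in .tsi and data files end in .tsm.

```python
import re
from collections import defaultdict

# A mapping header in smaps looks like:
#   7f12a0000000-7f12a0100000 r--s 00000000 08:01 1234  /path/to/file
HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s")


def classify(path):
    """Bucket a mapping by its backing file (hypothetical categories)."""
    if path.endswith(".tsi"):
        return "tsi"
    if path.endswith(".tsm"):
        return "tsm"
    if path == "" or path.startswith("["):
        # No backing file, or a pseudo-mapping such as [heap] or [stack].
        return "anonymous"
    return "other_files"


def rss_by_category(smaps_text):
    """Sum the Rss lines (in kB) of an smaps dump per category."""
    totals = defaultdict(int)
    category = None
    for line in smaps_text.splitlines():
        if HEADER.match(line):
            fields = line.split(None, 5)
            path = fields[5].strip() if len(fields) > 5 else ""
            category = classify(path)
        elif line.startswith("Rss:") and category is not None:
            totals[category] += int(line.split()[1])
    return dict(totals)


def rss_for_pid(pid):
    """Read /proc/<pid>/smaps (Linux only; needs permission to read it)."""
    with open(f"/proc/{pid}/smaps") as f:
        return rss_by_category(f.read())
```

Calling rss_for_pid with the PID of influxd should return something like a dict of kB totals keyed by "tsi", "tsm", "anonymous" and "other_files". The "tsi" and "tsm" numbers are file-backed pages that the kernel can drop under pressure, while "anonymous" is roughly the memory the process itself holds (Go heap, stacks, and so on).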

Is it possible you’re wanting operational monitoring, like with telegraf?

I’m aware of the self-monitoring and Telegraf monitoring for InfluxDB; I was just trying to understand how the memory is used.
My objective is to estimate memory needs and how much gets used by the different “parts” (engine, data, etc.).
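For the “engine” side of that split, the Go runtime's own counters might be a starting point. The sketch below assumes an InfluxDB 2.x instance that serves Prometheus-format metrics at http://localhost:8086/metrics and that the standard go_memstats_* gauges are present there; the URL, the choice of metrics, and the function names are assumptions for illustration and are not confirmed anywhere in this thread.

```python
import urllib.request

# Standard Go runtime gauges from the Prometheus Go client (assumed to be exposed).
WANTED = (
    "go_memstats_heap_inuse_bytes",
    "go_memstats_heap_idle_bytes",
    "go_memstats_sys_bytes",
)


def parse_go_memstats(metrics_text, wanted=WANTED):
    """Pick selected gauges out of a Prometheus text exposition."""
    values = {}
    for line in metrics_text.splitlines():
        if line.startswith("#") or not line.strip():
            continue  # skip HELP/TYPE comments and blank lines
        parts = line.split()
        if len(parts) >= 2 and parts[0] in wanted:
            values[parts[0]] = float(parts[1])
    return values


def fetch_go_memstats(url="http://localhost:8086/metrics"):
    """Fetch and parse the metrics endpoint (the URL is an assumption)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return parse_go_memstats(resp.read().decode("utf-8"))
```

These numbers only cover memory managed by the Go runtime. They say nothing about the page cache used by the mmap’d TSM and TSI files, so they would have to be read next to an OS-level view of the process.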

Thanks for the info, I’ll look into that.

@Giovanni_Luisotto
Oops! Sorry, I should have known you were aware already.
Hmmm, I don’t think there is a way to estimate this. I’ll ask around to make sure.