Does InfluxDB use memory when reading data?

Hi, I'm confused about the data flow in InfluxDB.

  1. Does InfluxDB not use memory when reading data?

The documentation describes how the storage engine handles a write (InfluxDB storage engine | InfluxDB OSS 2.1 Documentation):

  1. The write request is appended to the end of the WAL file.
  2. Data is written to disk using fsync().
  3. The in-memory cache is updated.
  4. When data is successfully written to disk, a response confirms the write request was successful.
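The four steps above can be sketched as a toy model. This is only an illustration, not InfluxDB's actual code: `ToyEngine`, the text-based WAL format, and the dict used as the cache are all made up for the example.

```python
import os
import tempfile


class ToyEngine:
    """Toy model of the four write steps; not InfluxDB's real implementation."""

    def __init__(self, wal_path):
        self.wal = open(wal_path, "ab")   # append-only log on disk
        self.cache = {}                   # in-memory copy of recent points

    def write(self, series, timestamp, value):
        line = f"{series} {timestamp} {value}\n".encode()
        self.wal.write(line)              # 1. append to the end of the WAL
        self.wal.flush()
        os.fsync(self.wal.fileno())       # 2. force the bytes to disk
        # 3. update the in-memory cache so the point can be queried right away
        self.cache.setdefault(series, []).append((timestamp, value))
        return True                       # 4. acknowledge the write


def demo():
    with tempfile.TemporaryDirectory() as d:
        wal_path = os.path.join(d, "toy.wal")
        engine = ToyEngine(wal_path)
        engine.write("cpu,host=a", 1, 0.5)
        engine.write("cpu,host=a", 2, 0.7)
        engine.wal.close()
        with open(wal_path, "rb") as f:
            wal_lines = f.read().decode().splitlines()
        return wal_lines, engine.cache
```

Note that after `write` returns, the point exists both on disk (in the WAL) and in memory (in the cache); nothing is removed from memory at this stage.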

But another document says:
“After fields are stored safely in TSM files, the WAL is truncated and the cache is cleared.”
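As far as I can tell, that sentence describes a later, separate step (a cache snapshot that is triggered by size or idle-time thresholds), not something that happens on every write. A minimal sketch of that step, assuming a hypothetical `snapshot` helper and using JSON as a stand-in for the real TSM format:

```python
import json
import os
import tempfile


def snapshot(cache, wal_path, tsm_path):
    """Persist the cache to a TSM-like file, then truncate the WAL and clear the cache."""
    with open(tsm_path, "w") as f:
        json.dump(cache, f)               # JSON stands in for the real TSM format
        f.flush()
        os.fsync(f.fileno())              # the data must be safe on disk first
    open(wal_path, "w").close()           # truncate the WAL
    cache.clear()                         # release the in-memory copy


def demo():
    with tempfile.TemporaryDirectory() as d:
        wal_path = os.path.join(d, "toy.wal")
        tsm_path = os.path.join(d, "toy.tsm")
        with open(wal_path, "w") as f:
            f.write("cpu 1 0.5\n")
        cache = {"cpu": [[1, 0.5]]}
        before = dict(cache)              # remember what was cached before the snapshot
        snapshot(cache, wal_path, tsm_path)
        with open(tsm_path) as f:
            on_disk = json.load(f)
        return before, cache, os.path.getsize(wal_path), on_disk
```

Until `snapshot` runs, recent points stay in the cache; afterwards the same points are available from the file instead.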

As far as I understand, when I send a write request, the data is saved to disk and then removed from memory.

  2. Where is a shard's data located?

As mentioned above, data is written to disk as it is processed, yet some blogs say that shard groups are kept in memory and only written to disk after the shard group duration.

Which is correct?


I’m confused by your question as I don’t see the inconsistency in the information provided. What is stated is what happens. The write path is separate from the read path. So, the documentation you are referencing talks about the storage engine and how that functions to support the write path. But, you also have a question about the query engine and how it utilizes memory to return results.

The query engine does, of course, use memory too. It just depends on the kinds of queries you are executing, how much data you have, the volumes of queries you run, etc.

The shard data is located within the engine directory location specified within the config.
Have a look within the directory specified.
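If it helps, here is a small sketch for looking inside that directory. `list_shard_files` is a hypothetical helper, and the demo builds a fake layout (the directory and file names are invented) so it can run anywhere.

```python
import os
import tempfile


def list_shard_files(engine_dir):
    """Return (relative path, size in bytes) for every .tsm and .wal file found."""
    found = []
    for root, _dirs, files in os.walk(engine_dir):
        for name in files:
            if name.endswith((".tsm", ".wal")):
                full = os.path.join(root, name)
                found.append((os.path.relpath(full, engine_dir), os.path.getsize(full)))
    return sorted(found)


def demo():
    # Fake layout for illustration only; real directory names will differ.
    with tempfile.TemporaryDirectory() as d:
        shard = os.path.join(d, "data", "bucket-id", "autogen", "1")
        os.makedirs(shard)
        with open(os.path.join(shard, "000000001-000000001.tsm"), "wb") as f:
            f.write(b"x" * 10)
        wal = os.path.join(d, "wal", "bucket-id", "autogen", "1")
        os.makedirs(wal)
        with open(os.path.join(wal, "_00001.wal"), "wb") as f:
            f.write(b"y" * 4)
        return list_shard_files(d)
```

Pointing `list_shard_files` at your own engine path (whatever your config specifies) should show the on-disk TSM and WAL files for each shard.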

There are two additional blogs if you want to go deeper:

Still, this answer does not clarify one point for me about the read path.
Assuming I've understood InfluxDB 2 internals correctly, also based on the referenced sources, the TSI index is on disk, as are the TSM files, obviously.

So, when we run a query, the index is consulted (memory-mapped, I suppose)… but what happens then? How is RAM involved in reading the actual data with the help of the index?

What also surprised me during some test sessions is that, while the database was being hit by multiple queries, main memory consumption on the host machine stayed as stable as during an idle period, as if RAM were not affected by query processing at all.
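One way to picture the read path, assuming the data files are memory-mapped: an index lookup yields an offset and a length, and only that slice is copied into the process. The file format, the `index` dict, and `read_block` below are invented for the sketch and are not InfluxDB's real structures.

```python
import mmap
import os
import tempfile


def read_block(path, offset, length):
    """Read `length` bytes at `offset` through a read-only memory map."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Slicing copies only the requested bytes into the process;
            # the rest of the file stays in the OS page cache or on disk.
            return bytes(m[offset:offset + length])


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "toy.tsm")
        blocks = {"cpu": b"0.5,0.7", "mem": b"1024,2048"}
        index = {}                         # series key -> (offset, length)
        with open(path, "wb") as f:
            for key, payload in blocks.items():
                index[key] = (f.tell(), len(payload))
                f.write(payload)
        offset, length = index["mem"]      # the index lookup says where to read
        return index, read_block(path, offset, length)
```

With a memory map, the pages that get touched are held by the operating system's page cache rather than allocated on the process heap, which may be part of why a memory graph looks flat under query load, depending on which metric is being watched. Decoding blocks and buffering results still uses ordinary process memory.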