How does InfluxDB work with files (tsm, db, wal)?

1-Would you please clarify this part of your sentence: “The files in the WAL (Write-Ahead-Log) are appended to as new writes and deletes arrive”?

Within the WAL dir, there is a dir per shard. When writes are received, they are written to disk in the WAL. If you look in a shard's WAL dir, you would see files such as _00001.wal, _00002.wal, etc. These are WAL segments. Each time a write comes in, it is appended to the current segment, which is the file with the largest number (_00002.wal in this example). When a segment hits the max segment size (10MB), it is closed and a new segment is opened.
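
As a rough illustration of the append-and-roll-over behaviour described above, here is a minimal Python sketch. The `ToyWAL` class and `demo` function are hypothetical names made up for this example, and the entries are raw bytes rather than InfluxDB's real WAL entry format; only the `_0000N.wal` naming and the 10MB limit come from the description above.

```python
import os
import tempfile

MAX_SEGMENT_SIZE = 10 * 1024 * 1024  # 10MB, the limit mentioned above


class ToyWAL:
    """Toy model of one shard's WAL dir; not InfluxDB's real on-disk format."""

    def __init__(self, shard_dir, max_segment_size=MAX_SEGMENT_SIZE):
        self.shard_dir = shard_dir
        self.max_segment_size = max_segment_size
        os.makedirs(shard_dir, exist_ok=True)
        # Resume at the highest-numbered segment, or start at _00001.wal.
        ids = [int(name[1:-4]) for name in os.listdir(shard_dir)
               if name.startswith("_") and name.endswith(".wal")]
        self.segment_id = max(ids, default=1)

    def current_path(self):
        return os.path.join(self.shard_dir, "_%05d.wal" % self.segment_id)

    def append(self, entry):
        path = self.current_path()
        size = os.path.getsize(path) if os.path.exists(path) else 0
        # A full segment is left closed and the next number is opened.
        if size >= self.max_segment_size:
            self.segment_id += 1
            path = self.current_path()
        # Writes are only ever appended to the current segment.
        with open(path, "ab") as f:
            f.write(entry)
        return path


def demo():
    # Tiny segment size (an assumption for the demo) so the rollover is visible.
    with tempfile.TemporaryDirectory() as shard_dir:
        wal = ToyWAL(shard_dir, max_segment_size=16)
        for _ in range(3):
            wal.append(b"0123456789")
        return sorted(os.listdir(shard_dir))
```

Calling `demo()` writes three 10-byte entries; the first segment passes the 16-byte limit after the second write, so the third entry lands in a new segment.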

2-What is the relation between the WAL and TSM? Does Influx write to both of them? When does it write to each of them?

The WAL is where incoming writes hit initially. As I mentioned before, this is a “write optimized” file structure that allows writes to be appended to the file. These writes are also maintained in an in-memory cache to support querying. When a snapshot compaction occurs, the values in the cache are written to a new TSM file and the associated WAL segments are removed.
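
The write path above can be sketched roughly as follows. This is a toy in-memory model, not the real engine: `ToyShard` and its methods are hypothetical names, WAL segments are plain lists, and a "TSM file" is just an immutable tuple.

```python
class ToyShard:
    """Toy model of WAL + cache + TSM files for a single shard."""

    def __init__(self):
        self.wal_segments = [[]]  # each segment is a list of appended entries
        self.cache = {}           # series key -> list of (timestamp, value)
        self.tsm_files = []       # immutable snapshots, oldest first

    def write(self, key, timestamp, value):
        # Every write is appended to the WAL and also kept in the cache.
        self.wal_segments[-1].append((key, timestamp, value))
        self.cache.setdefault(key, []).append((timestamp, value))

    def snapshot(self):
        # Snapshot compaction: cache contents become a new TSM file,
        # then the cache is cleared and the associated WAL segments removed.
        tsm = tuple((key, tuple(sorted(points)))
                    for key, points in sorted(self.cache.items()))
        self.tsm_files.append(tsm)
        self.cache = {}
        self.wal_segments = [[]]

    def query(self, key):
        # Queries see both the TSM files and the not-yet-snapshotted cache.
        points = []
        for tsm in self.tsm_files:
            for tsm_key, tsm_points in tsm:
                if tsm_key == key:
                    points.extend(tsm_points)
        points.extend(self.cache.get(key, []))
        return sorted(points)
```

In this model a point is queryable immediately after `write`, and it stays queryable after `snapshot` even though the WAL entries for it are gone.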

TSM files are continually compacted into larger and more dense files. Once they are written, they are immutable and never updated. Compactions combine multiple TSM files into new ones.
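
A minimal sketch of what "combine multiple TSM files into new ones" means, assuming each toy TSM file is a dict of series key to a tuple of (timestamp, value) points. The `compact` function is a hypothetical name, and the rule that a newer file wins when two files hold the same timestamp is an assumption of this sketch.

```python
def compact(tsm_files):
    """Merge several immutable toy TSM files (oldest first) into one new file.

    The input files are never modified; a brand new file is returned.
    """
    merged = {}
    for tsm in tsm_files:
        for key, points in tsm.items():
            # Assumption: for a duplicate timestamp, the newer file's value wins.
            merged.setdefault(key, {}).update(dict(points))
    return {key: tuple(sorted(by_ts.items()))
            for key, by_ts in sorted(merged.items())}
```

For example, compacting `{"cpu": ((1, 0.4), (2, 0.5))}` with a newer `{"cpu": ((2, 0.9), (3, 0.6))}` produces a single file holding all three timestamps for `cpu`, and both input files are left untouched.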

3-Is it possible to configure InfluxDB to use only the WAL or only TSM for special scenarios that require faster reads or queries?

No. If there are no writes coming in, nothing will be written to the WAL and only TSM files will be used.

4-As you mentioned, TSM files are split when they reach 2 GB, but I saw many TSM files with smaller sizes, such as 40 MB or 30 MB!

Yes, if an individual TSM file reaches 2GB in size, we split it. Not all TSM files are 2GB and some may never reach that size. Compactions combine small, less dense TSM files into larger, denser files.
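
To show why files of 30MB or 40MB are normal, here is a small sketch of size-capped output. The `split_by_size` function is a hypothetical name, and starting a new file just before the limit would be exceeded is an assumption of this sketch, not a statement of how the engine decides; only the 2GB figure comes from the answer above.

```python
MAX_TSM_FILE_SIZE = 2 * 1024 ** 3  # 2GB, the limit mentioned above


def split_by_size(block_sizes, max_file_size=MAX_TSM_FILE_SIZE):
    """Group block sizes (in bytes) into output files capped at max_file_size."""
    files, current, current_size = [], [], 0
    for size in block_sizes:
        # Start a new output file when adding this block would pass the cap.
        if current and current_size + size > max_file_size:
            files.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        files.append(current)
    return files
```

With the default cap, `split_by_size([40 * 1024**2, 30 * 1024**2])` returns a single 70MB file: the limit only matters once the data is large enough to reach it.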

5-Is there any technical document that describes this clearly, or do I need enterprise support to access more documents?

See the docs I linked to earlier. There are also various docs (some outdated) and comments in the code.

6-What are the tombstone files? Are they temp files?

Tombstones record deleted series keys/time ranges within TSM files. Since TSM files are immutable, we write a tombstone file for anything in that TSM file that is deleted. The next time that file is compacted, the deleted keys/time ranges are removed when writing the new TSM file.
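
The tombstone handling can be sketched like this. It is a toy model: a TSM file is a dict of series key to (timestamp, value) points, a tombstone is a hypothetical `(key, min_time, max_time)` tuple with an inclusive range, and `compact_with_tombstones` is a name made up for the example.

```python
def compact_with_tombstones(tsm, tombstones):
    """Write a new toy TSM file, dropping every point covered by a tombstone.

    tsm:        dict of series key -> tuple of (timestamp, value)
    tombstones: list of (series key, min_time, max_time), inclusive range
    """
    def deleted(key, timestamp):
        return any(t_key == key and t_min <= timestamp <= t_max
                   for t_key, t_min, t_max in tombstones)

    new_tsm = {}
    for key, points in tsm.items():
        kept = tuple(p for p in points if not deleted(key, p[0]))
        if kept:  # a series with no surviving points disappears entirely
            new_tsm[key] = kept
    return new_tsm
```

The original `tsm` dict is not modified, which mirrors the immutability described above: the deleted data only goes away in the new file that the compaction writes.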
