What determines when these InfluxDB /wal/ directory contents are moved to /data/?

I have an InfluxDB database set up, and as far as I understand the data is first written into the Write-Ahead Logs (WAL), which is write-optimised but space-inefficient, and then after a certain time these files are compacted and moved to Time-Structured Merge Tree (TSM) files, which are read-optimised and space-efficient.

On my system, the WAL files are stored in the /wal/ directory, located at /var/lib/influxdb/wal/, and the /data/ directory is located at /var/lib/influxdb/data/.

I am monitoring the size of these two directories on disk using Telegraf/Grafana, and a screenshot of their size over a period of around 10 hours is shown below. The right y-axis corresponds to the /data/ directory, and the left y-axis is /wal/ :
[Image: Grafana graph of the /wal/ and /data/ directory sizes over about 10 hours]

As expected, the /wal/ directory increases in size as the data comes into the database, and then after some time the data is compacted and we see a step change in the /data/ directory size as the information is moved there.

What I don’t understand is what determines when this happens. Reading off the graph, I can see that the time between compactions is 6 hours and 7 minutes. The size of the /wal/ directory immediately before the compaction was 40.9 MB. Neither of these is a nice, round number.

Taking a look at my influxdb.conf file, both the cache-snapshot-memory-size and cache-snapshot-write-cold-duration variables are commented out:
[Image: screenshot of the relevant section of influxdb.conf]

I was under the impression that these values set how often the compaction happens? Since they are commented out, what is determining the time now - are there defaults somewhere?

Thanks

Does anyone know where the default times for compaction are set?

hi @teeeeee -

The TSM configuration file you shared is the right one. The values that are commented out are the defaults (25M and 10m). The WAL cache snapshot memory size is calculated based on the in-memory size of the cache, which may not match its on-disk size.

The cold-duration setting triggers a snapshot if there have been no writes for at least 10 minutes (the default). Otherwise, the snapshot is triggered by the memory-size threshold.
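
To make the two triggers concrete, here is a minimal Python sketch of the decision as described in this reply. It is a simplified model, not InfluxDB's actual implementation: the helper name `should_snapshot` is hypothetical, and reading "25M" as 25 MiB is an assumption.

```python
from datetime import timedelta

# Defaults quoted in this reply; treating "25M" as 25 MiB is an assumption.
CACHE_SNAPSHOT_MEMORY_SIZE = 25 * 1024 * 1024
CACHE_SNAPSHOT_WRITE_COLD_DURATION = timedelta(minutes=10)


def should_snapshot(cache_bytes, time_since_last_write,
                    size_limit=CACHE_SNAPSHOT_MEMORY_SIZE,
                    cold_duration=CACHE_SNAPSHOT_WRITE_COLD_DURATION):
    """Hypothetical helper: return which trigger would fire, or None."""
    # Size trigger: based on the in-memory cache size, not the WAL size on disk.
    if cache_bytes >= size_limit:
        return "size"
    # Cold trigger: only relevant when writes have stopped for long enough.
    if cache_bytes > 0 and time_since_last_write >= cold_duration:
        return "cold"
    return None


# Steady writes, cache still small: nothing happens yet.
print(should_snapshot(10 * 1024 * 1024, timedelta(seconds=5)))   # None
# Steady writes, cache over the limit: the size trigger fires.
print(should_snapshot(26 * 1024 * 1024, timedelta(seconds=5)))   # size
# Writes stopped for 11 minutes: the cold trigger fires.
print(should_snapshot(10 * 1024 * 1024, timedelta(minutes=11)))  # cold
```

With data arriving continuously, the cold trigger would never fire in this model, so the interval between snapshots is simply however long the in-memory cache takes to reach the size threshold. That would be consistent with an interval that is not a round number and with an on-disk /wal/ size that differs from 25M.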

Please let us know if you are experiencing a problem or if you have further questions.

Do I also need to put the wal-dir and data directories on different disks to improve performance when using influxdb2? The influxdb2 configuration file only provides a parameter for changing the engine-path. Also, what factors determine how much disk space the wal-dir needs?