How to estimate the storage capacity for wal and data directories?

I’m testing InfluxDB 2.x. I can see the wal and data directories in /var/lib/influxdb/engine.

Do I need to point the wal and data directories to different disks to improve performance when using InfluxDB 2.x, as with v1.x? The configuration file of influxdb2 only provides parameters for modifying the engine-path.

Also, what factors influence the storage size of the wal? How can I estimate the disk capacity needed for the wal and data directories?

I can’t find any docs about this.

Thanks for your help!

Hello @tohlzhu,
I believe these hardware guidelines roughly apply to 2.x as well:

Something here might be useful (although I didn’t find anything, still worth knowing about):

Factors that influence the storage size of the wal:

  • high series cardinality
  • high ingest rate
  • lack of downsampling or automatically expiring old data
  • storing logs
  • excessively long measurement or tag names (this is a rare situation).
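
To see how these factors play out on a real instance, one option is to measure the two directories directly and watch them over time. This is only a minimal sketch: the default path is the /var/lib/influxdb/engine location mentioned in the question, and the helper names `dir_size_bytes` and `engine_usage` are made up for illustration, not part of any InfluxDB tooling.

```python
import os

def dir_size_bytes(path):
    """Sum the sizes of all regular files below path (0 if path is missing)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                # Files can disappear mid-walk while the engine is running.
                pass
    return total

def engine_usage(engine_path="/var/lib/influxdb/engine"):
    """Return the on-disk size in bytes of the wal and data subdirectories."""
    return {
        name: dir_size_bytes(os.path.join(engine_path, name))
        for name in ("wal", "data")
    }

if __name__ == "__main__":
    for name, size in engine_usage().items():
        print(f"{name}: {size / 1024**2:.1f} MiB")
```

Running this periodically (for example from cron) while under a realistic write load should show whether the wal stays roughly flat while data grows.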

There’s this tool in 1.8:

Perhaps we should create a feature request to make this available in 2.x.

I’ll also ask around. Thank you for your patience.

Thanks for replying, this is helpful!

I’ve read the “Hardware sizing guidelines > Bytes and compression” section of the v1.8 docs.
Can I assume that the size of the data directory depends mainly on the number of points, while the size of the wal directory depends only on the write rate? In other words, is the wal directory independent of the volume of data in the data directory, and is “old” data in the wal directory deleted automatically?
So, for an instance with 4 vCPUs and 32 GB of memory, I could give the wal a small fixed-size disk, for example 32 GB, and give the data directory a 4 TB disk. Is this reasonable?
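
For a rough idea of whether a 4 TB data disk is in the right range, a back-of-the-envelope calculation like the one below may help. It is a sketch under stated assumptions: the default of about 3 bytes per non-string field value is the figure I recall from the v1.8 “Bytes and compression” section, and the write rate, field count, and retention period in the example are made-up inputs. It ignores string fields, index files, and compaction overhead, so treat the result as a lower bound and verify it against measurements.

```python
SECONDS_PER_DAY = 86_400

def estimate_data_bytes(points_per_sec, fields_per_point, retention_days,
                        bytes_per_value=3.0):
    """Rough size of the data directory for a steady write load.

    bytes_per_value is an ASSUMED average compressed size per
    non-string field value; measure your own data to refine it.
    """
    values_written = (points_per_sec * fields_per_point
                      * retention_days * SECONDS_PER_DAY)
    return values_written * bytes_per_value

if __name__ == "__main__":
    # Hypothetical workload: 10,000 points/s, 5 fields each, kept for 1 year.
    size = estimate_data_bytes(10_000, 5, 365)
    print(f"estimated data directory size: {size / 1e12:.2f} TB")
```

The calculation only covers the data directory; nothing in it predicts the wal size, which is better observed directly under load.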

The docs also suggest storing wal and data on separate storage devices for heavy write loads. Although I’m not facing that scenario, I’d still like to know how to set separate wal and data paths on v2.x. Do you have any suggestions?

Hello @tohlzhu,
Yes, if your series cardinality isn’t increasing, then you can.
If it is, then your WAL will grow as well (as I understand it).

Hmm, I’m not sure how to separate the wal and data paths on v2.x. Let me ask around.

@tohlzhu jk the documentation for wal and data path and how to change it is here:

Blessings to the docs team <3
Thanks @scott and team

Hi Anaisdg, I’ve read the “InfluxDB file system layout” page. It says to “use the engine-path configuration option” to change the engine directory, which is the parent directory of wal and data. That is why I said “The configuration file of influxdb2 only provides parameters for modifying the engine-path.” The wal and data directories are still on the same disk and are not separated.

I suspect that v2.x cannot separate the wal directory and the data directory through configuration. Is the documentation incomplete, or is this simply not supported?
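
For anyone who wants to confirm where the two directories actually live, a quick check is to compare the device IDs the operating system reports for each path. This is a small sketch using only the Python standard library; `same_device` is a hypothetical helper name, and the paths are the defaults mentioned earlier in this thread.

```python
import os

def same_device(path_a, path_b):
    """True if both paths resolve to the same filesystem device.

    os.stat follows symlinks, so a path that is a symlink to another
    mount is reported by the device it points to.
    """
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev

if __name__ == "__main__":
    engine = "/var/lib/influxdb/engine"
    wal = os.path.join(engine, "wal")
    data = os.path.join(engine, "data")
    print("wal and data on same device:", same_device(wal, data))
```

With a default install this would be expected to print True, since both directories sit under the same engine-path.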

@tohlzhu InfluxDB 2.x doesn’t currently expose separate configuration options for data and wal paths, but you’re welcome to submit a feature request.

@scott Thanks, anyway, that’s a solution.