Hi. I have a few questions regarding running InfluxDB with an SD card as the primary data store. I originally tried to post these questions to the recent blog post at the following link (a couple of times, in fact) but they never showed up. Maybe they are stuck in moderation, who knows.
Just as a preface, SD cards and eMMC typically have a limited number of erase/write cycles per erase block. Depending on the type of card, this number could be as high as 100k or as low as 3k. However, by default, InfluxDB writes to the WAL with every single insertion, which would be a big hit to the card. What are the “knobs” I have to try to reduce IO load on the SD card? I have listed all the ones I know of below, plus ones I’m curious about.
Use case: multiple sensors (say, up to 20) generating points with multiple fields (say, up to 4) at fairly high rate (say, up to 10 Hz). Data is written often but queried rarely.
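To put rough numbers on that use case, here is a minimal sketch that just multiplies out the upper bounds given above. The variable names are my own, and it says nothing about bytes on disk, since that depends on line protocol size and compression.

```python
# Rough write-load estimate, using the upper bounds from the use case above.
SENSORS = 20            # up to 20 sensors
FIELDS_PER_POINT = 4    # up to 4 fields per point
RATE_HZ = 10            # up to 10 points per second per sensor
SECONDS_PER_DAY = 24 * 60 * 60

points_per_second = SENSORS * RATE_HZ
field_values_per_second = points_per_second * FIELDS_PER_POINT
points_per_day = points_per_second * SECONDS_PER_DAY
field_values_per_day = field_values_per_second * SECONDS_PER_DAY

print(f"points per second:       {points_per_second}")
print(f"field values per second: {field_values_per_second}")
print(f"points per day:          {points_per_day}")
print(f"field values per day:    {field_values_per_day}")
```

So the worst case is about 200 points (800 field values) per second, or roughly 17 million points per day.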
- Choose retention policies carefully (together with continuous queries for downsampling) to limit the total amount of data stored on the SD card, giving the SD card’s wear leveling algorithms lots of free space to use for wear balancing. In the same vein, storing the data in a way that allows a high compression ratio is also useful, for example storing values as integers.
- Use a high-endurance industrial SD card, for instance one from Swissbit, to give the SD card a long lifetime.
- Change influxdb.conf to set wal-fsync-delay from 0s to a higher value. This will limit how often WAL writes are fsynced to the SD card. The risk is that a power loss could lose any data not yet fsynced. ext4 by default commits its journal every 5 seconds, so this setting should be chosen with the file system's mount options in mind.
- Change influxdb.conf to set the wal-dir to a location in an in-memory file system. This will use 10 MB of memory per WAL file. On power failure, even more data will be lost. There’s already an in-memory cache, so this seems like a sort of silly way to circumvent the WAL. In this case, the cache-snapshot-memory-size could be reduced to a small number so that the cache is written to disk as TSM more frequently.
- Put the WAL directory on a device other than the SD card; something that can endure many writes, such as NVRAM. The problem here is that the size of each WAL file is 10 MB, and getting large amounts of NVRAM is not cheap and sometimes not possible. How many WAL files will exist at once? Is it possible to reduce their size from 10 MB and also predict the total space needed for the WAL directory?
- Choose a file system for the SD card that fits the IO patterns of InfluxDB well. Mount it with the noatime option to prevent lots of file system metadata updates. Which file system is best? ext3, perhaps with journaling disabled?
- Anything else?
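For the wal-fsync-delay item above, this is how I picture the effect, as a back-of-the-envelope sketch. It rests on two assumptions of mine: with a delay of 0 every write request gets its own fsync, and with a positive delay there is at most one fsync per delay interval. The function name is made up, and an fsync count is not the same thing as an erase cycle count, so treat the result only as a relative comparison.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def max_fsyncs_per_day(write_requests_per_second, fsync_delay_seconds):
    """Upper bound on WAL fsync calls per day (hypothetical helper).

    Assumption: delay == 0 means one fsync per write request;
    delay > 0 means at most one fsync per delay interval.
    """
    requests_per_day = write_requests_per_second * SECONDS_PER_DAY
    if fsync_delay_seconds <= 0:
        return math.ceil(requests_per_day)
    intervals_per_day = SECONDS_PER_DAY / fsync_delay_seconds
    # There can never be more fsyncs than write requests.
    return math.ceil(min(requests_per_day, intervals_per_day))


# Worst case from the use case: 200 points/s, each sent as its own request.
for delay in (0, 1, 5):
    print(f"wal-fsync-delay={delay}s -> at most "
          f"{max_fsyncs_per_day(200, delay)} fsyncs/day")
```

If those assumptions hold, even a delay of 1s would cut the worst-case fsync count by a factor of 200 for this workload.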
Thanks very much.