InfluxDB not cleaning up wal files, causing high CPU/memory usage and service refusal

influxdb
#1

I am using InfluxDB to store daily series for a number of fields. As I populate data, the service goes down after a while. Further investigation shows that the wal directory grows to several hundred MB, which somehow causes high CPU usage and heavy disk reads, followed by high memory usage. Eventually the service refuses any further requests, and even the whole VM goes down.

Restarting the service doesn’t help, because it seems to process all files in the wal directory on start-up and then goes down the same route as above.

The only workaround so far is to stop the service and delete the wal directory before starting the service again. I know deleting the wal directory loses some data, so it is not a real solution…

I have left all settings in influxdb.conf at their defaults.

My VM spec is as follows:
CPU: 2 cores Azure vCPU
Memory: 8GB
Disk: 16GB

Questions:

  1. Is InfluxDB not suitable for handling low frequency (daily) data?
  2. What am I doing wrong that causes this issue?
  3. Is there a good solution?
#2

Which version of InfluxDB are you running?

#3

Hey @xjohnwu,

InfluxDB is perfect for low-frequency writes; you’ll just need to configure a few things differently.

Two settings are important for low-frequency writes:

  • cache-snapshot-write-cold-duration
  • cache-snapshot-memory-size

cache-snapshot-write-cold-duration forces the WAL -> TSM write once the shard has received no writes or deletes for the specified time.
cache-snapshot-memory-size forces the WAL -> TSM write once the in-memory cache reaches the specified size.
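
For reference, here is a rough sketch of where these two settings live in influxdb.conf on a 1.x install. I'm assuming they sit under the [data] section and that the values shown are the 1.x defaults (which is what you'd be running with an untouched config). Treat the numbers as a starting point to experiment with, not a recommendation for your workload.

```toml
# influxdb.conf (InfluxDB 1.x); sketch only, values assumed to be the defaults
[data]
  # Snapshot the cache to a TSM file once it reaches this size.
  cache-snapshot-memory-size = "25m"

  # Snapshot the cache to a TSM file if the shard has received
  # no writes or deletes for this long.
  cache-snapshot-write-cold-duration = "10m"
```

If you configure InfluxDB through environment variables instead, the equivalents should be INFLUXDB_DATA_CACHE_SNAPSHOT_MEMORY_SIZE and INFLUXDB_DATA_CACHE_SNAPSHOT_WRITE_COLD_DURATION. Restart the service after changing either one.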

You can read more about this here

#4

Sorry, I was on holiday so didn’t check this very often.
I was running InfluxDB 1.7.4.

Found another related issue here: InfluxDB 1.7.4 fails after 9 months without issues
Not sure if it’s something in 1.7.4 that caused this?

#5

Sorry, I was on holiday… Will try to change these two options and update here. Thanks.