So a couple of weeks ago, I upgraded some InfluxDB servers from 1.7.9 to 1.8.0, and a strange disk space usage leak started. (I subsequently upgraded from 1.8.0 to 1.8.2, but it did not make any difference.)
The pattern of metrics ingestion has not changed, nor have my CQs, but as soon as I upgraded to 1.8.x the disk space usage started growing by somewhere in the range of 2.0 to 3.5 gigabytes PER HOUR.
Note that this growth is happening on the partition where meta, data and wal are located. The logs, the config, and the rest of the OS (CentOS 7) are all on different partitions and are not affected.
Now here is the really strange part: if I restart the influxdb service, the disk usage suddenly drops back to normal levels. No data is lost, which is why I am describing this rapid disk space growth as a “leak”.
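
One check that might narrow this down: space that comes back on restart is often held by files that were deleted while the process still had them open. This is only a guess on my part, not something I have confirmed for InfluxDB. Below is a rough sketch of how that could be checked on Linux by reading /proc. The function names are my own, and it has to run as root or as the user that owns the process.

```python
import os

# Linux appends this marker to the symlink target of an fd whose file was unlinked.
DELETED_SUFFIX = " (deleted)"


def find_deleted_open_files(pid, proc_root="/proc"):
    """Return (fd, path, size_bytes) for each deleted file the process still holds open.

    size_bytes is None when the size cannot be read.
    """
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    results = []
    for fd in os.listdir(fd_dir):
        link = os.path.join(fd_dir, fd)
        try:
            target = os.readlink(link)
        except OSError:
            continue  # fd was closed between listdir and readlink
        if not target.endswith(DELETED_SUFFIX):
            continue
        try:
            # stat follows the /proc link to the still-open inode
            size = os.stat(link).st_size
        except OSError:
            size = None
        results.append((fd, target[: -len(DELETED_SUFFIX)], size))
    return results


def total_held_bytes(entries):
    """Sum the sizes of the entries whose size is known."""
    return sum(size for _, _, size in entries if size is not None)
```

Calling find_deleted_open_files with the PID of the influxd process and passing the result to total_held_bytes would show how much space is pinned this way. If that total grows at roughly the same 2.0 to 3.5 gigabytes per hour, it would point at file handles not being released. If it stays near zero, the cause is somewhere else.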
Has anyone else seen behavior like this? Does anyone have any suggestions for debugging this issue?