I have a server running around 200 InfluxDB instances. It uses roughly 25% CPU during the day and has enough disk I/O capacity to handle the load, except at midnight (in the server timezone), when system CPU and disk I/O spike sharply. The logs show that TSM compaction is triggered on all 200 instances at midnight.
First, I'd like to know why this happens exactly at midnight. 100 of the databases have a retention policy of 3 years with a shard group duration of 7 days. The other 100 have a retention policy of 2 days with a shard group duration of 1 day.
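To make the shard boundaries concrete, here is a minimal sketch of where I would expect shard groups to start for both durations. It assumes shard group start times are obtained by truncating the write timestamp to a multiple of the shard group duration in UTC, counted from year 1 as Go's `time.Truncate` does. I have not verified that assumption against the source of my version, and the function name `shard_group_start` is mine, not InfluxDB's.

```python
from datetime import datetime, timedelta, timezone

# Assumption: boundaries are multiples of the duration counted from
# 0001-01-01T00:00:00Z (a Monday), mirroring Go's time.Truncate.
GO_ZERO = datetime(1, 1, 1, tzinfo=timezone.utc)


def shard_group_start(ts: datetime, duration: timedelta) -> datetime:
    """Return the assumed start of the shard group containing ts (hypothetical helper)."""
    elapsed = ts - GO_ZERO
    return ts - (elapsed % duration)


now = datetime(2024, 5, 15, 13, 37, tzinfo=timezone.utc)  # a Wednesday
daily = shard_group_start(now, timedelta(days=1))
weekly = shard_group_start(now, timedelta(days=7))
print("1d shard group starts:", daily.isoformat())
print("7d shard group starts:", weekly.isoformat())
```

If that assumption holds, both the 1-day and the 7-day shard groups roll over at 00:00 UTC, which would only line up with my server's midnight if the server clock is set to UTC.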
The logs show that TSM compaction with strategy=level occurs several times a day, and its frequency seems to follow the load on the database. Compaction with strategy=full, however, always occurs at midnight. Is that hardcoded in InfluxDB?
And is there a way to spread the compaction over the day?