Frequent logging for 'Cache snapshot'

After many years of running InfluxDB 1.8 without any issues, we’re just getting around to tuning it and learning more about its inner workings.

One of the first issues to come up is logging frequency. The following logs show up about every 30 seconds. If these are routine and expected, I’d like to know how to mute them. I realize there are some tuning parameters named cache-snapshot-*, but it isn’t clear to me how changing them might impact my current configuration.

My config is very bland and included below.

    Jan  5 07:51:30 influx001 influxd-systemd-start.sh[2816629]: ts=2024-01-05T15:51:30.552588Z lvl=info msg="Cache snapshot (start)" log_id=0mXwlFz0000 engine=tsm1 trace_id=0mY_~ez0000 op_name=tsm1_cache_snapshot op_event=start
    Jan  5 07:51:30 influx001 influxd-systemd-start.sh[2816629]: ts=2024-01-05T15:51:30.552588Z lvl=info msg="Cache snapshot (start)" log_id=0mXwlFz0000 engine=tsm1 trace_id=0mY_~ez0000 op_name=tsm1_cache_snapshot op_event=start
    Jan  5 07:51:31 influx001 influxd-systemd-start.sh[2816629]: ts=2024-01-05T15:51:31.455532Z lvl=info msg="Snapshot for path written" log_id=0mXwlFz0000 engine=tsm1 trace_id=0mY_~ez0000 op_name=tsm1_cache_snapshot path=/var/lib/influxdb/data/telegraf/autogen/4103 duration=902.958ms
    Jan  5 07:51:31 influx001 influxd-systemd-start.sh[2816629]: ts=2024-01-05T15:51:31.455532Z lvl=info msg="Snapshot for path written" log_id=0mXwlFz0000 engine=tsm1 trace_id=0mY_~ez0000 op_name=tsm1_cache_snapshot path=/var/lib/influxdb/data/telegraf/autogen/4103 duration=902.958ms
    Jan  5 07:51:31 influx001 influxd-systemd-start.sh[2816629]: ts=2024-01-05T15:51:31.455560Z lvl=info msg="Cache snapshot (end)" log_id=0mXwlFz0000 engine=tsm1 trace_id=0mY_~ez0000 op_name=tsm1_cache_snapshot op_event=end op_elapsed=902.984ms
    Jan  5 07:51:31 influx001 influxd-systemd-start.sh[2816629]: ts=2024-01-05T15:51:31.455560Z lvl=info msg="Cache snapshot (end)" log_id=0mXwlFz0000 engine=tsm1 trace_id=0mY_~ez0000 op_name=tsm1_cache_snapshot op_event=end op_elapsed=902.984ms

    [meta]
      dir = "/var/lib/influxdb/meta"
    [data]
      dir = "/var/lib/influxdb/data"
      wal-dir = "/var/lib/influxdb/wal"
      query-log-enabled = false
      series-id-set-cache-size = 100
    [coordinator]
    [retention]
    [shard-precreation]
    [monitor]
    [http]
      auth-enabled = true
      https-enabled = true
      https-certificate = "/etc/letsencrypt/live/somewhere.io/fullchain.pem"
      https-private-key = "/etc/letsencrypt/live/somewhere.io/privkey.pem"
      log-enabled = false
      shared-secret = "foobar"
    [logging]
    [subscriber]
    [[graphite]]
    [[collectd]]
    [[opentsdb]]
    [[udp]]
    [continuous_queries]
      log-enabled = false
    [tls]

These messages are routine info-level output: one set is logged each time the TSM engine snapshots a shard's in-memory cache to disk, so their frequency follows the frequency of cache snapshots. In InfluxDB 1.x, snapshots are controlled by two parameters in the [data] section of the configuration file:

  1. cache-snapshot-write-cold-duration: The idle time after which InfluxDB snapshots the cache and writes it to a new TSM file. If a shard receives no writes or deletes for this long, its cache is flushed to disk. The default is 10m.
  2. cache-snapshot-memory-size: The cache size at which InfluxDB snapshots the cache and writes it to a TSM file, freeing the memory. If a shard's cache grows beyond this size, a snapshot is triggered. The default is 25m.

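To reduce the frequency of these log messages, you can raise these thresholds. Your config does not override either one, so the defaults apply. Since your snapshots occur about every 30 seconds, far more often than the 10m cold duration, the size threshold is most likely what triggers them, and increasing cache-snapshot-memory-size is the change more likely to help. A larger value means fewer snapshots, and consequently fewer log messages, at the cost of more data held in memory between flushes. Keep it below cache-max-memory-size (1g by default).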
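As a rough sketch, the changes could look like this in your existing config. The 100m value is only an illustrative assumption, not a recommendation, so size it against your write rate and available memory. The [logging] level setting is a separate option: setting it to "warn" should hide these messages entirely, but it also hides every other info-level message, so treat it as a trade-off rather than a targeted mute.

    [data]
      dir = "/var/lib/influxdb/data"
      wal-dir = "/var/lib/influxdb/wal"
      query-log-enabled = false
      series-id-set-cache-size = 100
      # Assumed example value: raise the snapshot threshold above the 25m
      # default so the cache is flushed to TSM files less often.
      cache-snapshot-memory-size = "100m"
      # Shown at its default; it only matters for shards that stop receiving writes.
      cache-snapshot-write-cold-duration = "10m"
    [logging]
      # Optional: log only warnings and errors, which suppresses the
      # info-level "Cache snapshot" lines along with all other info output.
      level = "warn"

influxd needs a restart to pick up the new settings. Afterwards, the gap between "Cache snapshot (start)" lines in the journal shows whether the change had the intended effect.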

Here’s the documentation: