InfluxDB 1.8 Occasionally really long cache snapshots

marclallen · July 19, 2023, 2:28pm

So… we’re using InfluxDB 1.8. In general, it runs great, but we’ve run into one minor issue since our write load increased.

A few times an hour, we started receiving log messages about write timeouts. Even during periods with no CQs and no queries at all! The only thing we’ve noticed is that when these happen, they occur in close proximity to a snapshot cache process that takes over 10 seconds.
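To make the "close proximity" observation concrete, here is a minimal sketch of how timeout timestamps could be matched against snapshot timestamps. It assumes both lists have already been pulled out of the log as ISO 8601 strings with a trailing `Z`; the function names and the 15-second window are hypothetical choices, not anything InfluxDB provides.

```python
from datetime import datetime, timedelta


def parse_ts(ts):
    """Parse an ISO 8601 timestamp with a trailing 'Z' (assumed log format)."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def timeouts_near_snapshots(timeout_ts, snapshot_ts, window_seconds=15.0):
    """Return the timeout timestamps that fall within the window of any snapshot."""
    window = timedelta(seconds=window_seconds)
    snapshots = [parse_ts(s) for s in snapshot_ts]
    matched = []
    for raw in timeout_ts:
        when = parse_ts(raw)
        # A timeout counts as "near" if any snapshot finished within the window.
        if any(abs(when - snap) <= window for snap in snapshots):
            matched.append(raw)
    return matched
```

If most timeouts land in the returned list and few fall outside it, that supports the idea that the slow snapshots and the timeouts are related.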

The vast majority of snapshots take less than a second. But, occasionally, one takes 10+ seconds.
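For anyone who wants to look for the same pattern, here is a rough sketch of pulling the slow snapshots out of the log. It assumes logfmt-style lines with a `ts=` field, a `msg="Snapshot for path written"` message, and a `duration=` field holding a Go-style duration such as `412.100ms`; check those against your own log before relying on it. The function names and the 10-second threshold are hypothetical.

```python
import re

# Assumed line shape (verify against your own log):
# ts=<timestamp> lvl=info msg="Snapshot for path written" ... duration=412.100ms
SNAPSHOT_RE = re.compile(r'msg="Snapshot for path written".*?\bduration=(\S+)')
TS_RE = re.compile(r'\bts=(\S+)')
# One number-plus-unit component of a Go-style duration, e.g. "2.5s" or "850ms".
PART_RE = re.compile(r'(\d+(?:\.\d+)?)(ns|us|µs|ms|s|m|h)')
UNIT_SECONDS = {
    "ns": 1e-9, "us": 1e-6, "µs": 1e-6, "ms": 1e-3,
    "s": 1.0, "m": 60.0, "h": 3600.0,
}


def parse_go_duration(text):
    """Convert a Go-style duration such as '1m2.5s' or '850ms' to seconds."""
    parts = PART_RE.findall(text)
    if not parts:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(value) * UNIT_SECONDS[unit] for value, unit in parts)


def find_slow_snapshots(lines, threshold_seconds=10.0):
    """Return (timestamp, seconds) for snapshots at or above the threshold."""
    slow = []
    for line in lines:
        match = SNAPSHOT_RE.search(line)
        if not match:
            continue  # not a snapshot-completion line
        seconds = parse_go_duration(match.group(1))
        if seconds >= threshold_seconds:
            ts = TS_RE.search(line)
            slow.append((ts.group(1) if ts else None, seconds))
    return slow
```

Feeding it the lines of the log file (for example `find_slow_snapshots(open(path))`, where `path` is wherever your log lives) gives a list of the outliers with their timestamps.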

We’re using the default configuration for almost everything (the exceptions are turning off tag limits and disabling internal monitoring).

We have no error logs about running out of cache memory, or really any errors at all. The log is filled with nothing but writes, snapshots and TSM compactions.

We’re running InfluxDB on an AWS Fargate instance with an EFS disk that has plenty of additional capacity. We’re able to drive InfluxDB much harder when needed, and we can’t seem to find any correlation that explains why the snapshots occasionally take so long.

Any ideas?
