InfluxDB becomes unresponsive while unsuccessfully trying to compact files

I run an InfluxDB instance with two moderately used databases (telegraf and openhab). A few days ago the system stopped storing new data, and the syslog is full of messages like these:

Feb 24 18:07:18 z3-4 influxd[21002]: ts=2019-02-24T17:07:18.748914Z lvl=info msg="TSM compaction (start)" log_id=0DnOoANW000 engine=tsm1 tsm1_level=2 tsm1_strategy=level trace_id=0Dp7SGN0000 op_name=tsm1_compact_group op_event=start
Feb 24 18:07:18 z3-4 influxd[21002]: ts=2019-02-24T17:07:18.749059Z lvl=info msg="Beginning compaction" log_id=0DnOoANW000 engine=tsm1 tsm1_level=2 tsm1_strategy=level trace_id=0Dp7SGN0000 op_name=tsm1_compact_group tsm1_files_n=4
Feb 24 18:07:18 z3-4 influxd[21002]: ts=2019-02-24T17:07:18.749500Z lvl=info msg="Compacting file" log_id=0DnOoANW000 engine=tsm1 tsm1_level=2 tsm1_strategy=level trace_id=0Dp7SGN0000 op_name=tsm1_compact_group tsm1_index=0 tsm1_file=/var/lib/influxdb/data/telegraf/autogen/138/000009503-000000002.tsm
Feb 24 18:07:18 z3-4 influxd[21002]: ts=2019-02-24T17:07:18.749692Z lvl=info msg="Compacting file" log_id=0DnOoANW000 engine=tsm1 tsm1_level=2 tsm1_strategy=level trace_id=0Dp7SGN0000 op_name=tsm1_compact_group tsm1_index=1 tsm1_file=/var/lib/influxdb/data/telegraf/autogen/138/000009511-000000002.tsm
Feb 24 18:07:18 z3-4 influxd[21002]: ts=2019-02-24T17:07:18.749865Z lvl=info msg="Compacting file" log_id=0DnOoANW000 engine=tsm1 tsm1_level=2 tsm1_strategy=level trace_id=0Dp7SGN0000 op_name=tsm1_compact_group tsm1_index=2 tsm1_file=/var/lib/influxdb/data/telegraf/autogen/138/000009519-000000002.tsm
Feb 24 18:07:18 z3-4 influxd[21002]: ts=2019-02-24T17:07:18.749990Z lvl=info msg="Compacting file" log_id=0DnOoANW000 engine=tsm1 tsm1_level=2 tsm1_strategy=level trace_id=0Dp7SGN0000 op_name=tsm1_compact_group tsm1_index=3 tsm1_file=/var/lib/influxdb/data/telegraf/autogen/138/000009528-000000002.tsm
Feb 24 18:07:18 z3-4 influxd[21002]: ts=2019-02-24T17:07:18.750609Z lvl=info msg="Aborted compaction" log_id=0DnOoANW000 engine=tsm1 tsm1_level=2 tsm1_strategy=level trace_id=0Dp7SGN0000 op_name=tsm1_compact_group error="compaction in progress: open /var/lib/influxdb/data/telegraf/autogen/138/000009528-000000003.tsm.tmp: file exists"
Feb 24 18:07:18 z3-4 influxd[21002]: ts=2019-02-24T17:07:18.765604Z lvl=info msg="TSM compaction (end)" log_id=0DnOoANW000 engine=tsm1 tsm1_strategy=full tsm1_optimize=false trace_id=0Dp7SCWl000 op_name=tsm1_compact_group op_event=end op_elapsed=1001.598ms
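
For anyone wading through a syslog full of these entries, here is a minimal sketch that pulls the distinct temporary file paths out of the "Aborted compaction" lines. It only assumes the message format shown above; the function name aborted_tmp_files and the regular expression are my own and not part of InfluxDB.

```python
import re

# Assumption: the error text has the same shape as in the log excerpt above.
TMP_PATH = re.compile(
    r'error="compaction in progress: open (?P<path>\S+\.tsm\.tmp): file exists"'
)


def aborted_tmp_files(lines):
    """Return the distinct .tsm.tmp paths named in aborted-compaction lines,
    in the order they first appear."""
    seen = []
    for line in lines:
        # Skip everything that is not an aborted compaction.
        if 'msg="Aborted compaction"' not in line:
            continue
        match = TMP_PATH.search(line)
        if match and match.group("path") not in seen:
            seen.append(match.group("path"))
    return seen
```

Feeding it the lines of the syslog file (for example via open(...) on a copy of the log) should give a short list of the files the compactor keeps tripping over, which makes it easier to see whether one shard or many are affected.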

I tried several options to get InfluxDB working again:

  • stop and restart the DB
  • manually remove the offending *.tmp files
  • update the package to the latest version available on Debian

All to no avail. Any hint is appreciated.
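
To check which temporary files are actually left on disk before removing anything by hand, a small sketch like the following can help. It only lists files and deletes nothing. The data directory /var/lib/influxdb/data is taken from the log paths above; the helper name find_tmp_files is hypothetical, and the script must run as a user that can read that directory.

```python
from pathlib import Path


def find_tmp_files(data_dir):
    """Return the sorted paths of all *.tsm.tmp files below data_dir."""
    root = Path(data_dir)
    return sorted(str(p) for p in root.rglob("*.tsm.tmp") if p.is_file())


if __name__ == "__main__":
    # Assumption: default data directory, as seen in the log paths above.
    for path in find_tmp_files("/var/lib/influxdb/data"):
        print(path)
```

Running it once with influxd stopped and again after a restart shows whether the same files reappear, which is what the log suggests.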


Hi there,

Can anyone comment on this report, please?
I’m having the same problem running InfluxDB 1.7.6.

All of a sudden I am getting tons of error messages like this. Stopping the server, removing the .tmp files, and restarting does not help: the .tmp files are recreated immediately after the restart.

Can you please provide the logs?