I run an InfluxDB instance with two moderately used databases (telegraf and openhab). A few days ago the system stopped storing new data, and the syslog is full of messages like these:
Feb 24 18:07:18 z3-4 influxd[21002]: ts=2019-02-24T17:07:18.748914Z lvl=info msg="TSM compaction (start)" log_id=0DnOoANW000 engine=tsm1 tsm1_level=2 tsm1_strategy=level trace_id=0Dp7SGN0000 op_name=tsm1_compact_group op_event=start
Feb 24 18:07:18 z3-4 influxd[21002]: ts=2019-02-24T17:07:18.749059Z lvl=info msg="Beginning compaction" log_id=0DnOoANW000 engine=tsm1 tsm1_level=2 tsm1_strategy=level trace_id=0Dp7SGN0000 op_name=tsm1_compact_group tsm1_files_n=4
Feb 24 18:07:18 z3-4 influxd[21002]: ts=2019-02-24T17:07:18.749500Z lvl=info msg="Compacting file" log_id=0DnOoANW000 engine=tsm1 tsm1_level=2 tsm1_strategy=level trace_id=0Dp7SGN0000 op_name=tsm1_compact_group tsm1_index=0 tsm1_file=/var/lib/influxdb/data/telegraf/autogen/138/000009503-000000002.tsm
Feb 24 18:07:18 z3-4 influxd[21002]: ts=2019-02-24T17:07:18.749692Z lvl=info msg="Compacting file" log_id=0DnOoANW000 engine=tsm1 tsm1_level=2 tsm1_strategy=level trace_id=0Dp7SGN0000 op_name=tsm1_compact_group tsm1_index=1 tsm1_file=/var/lib/influxdb/data/telegraf/autogen/138/000009511-000000002.tsm
Feb 24 18:07:18 z3-4 influxd[21002]: ts=2019-02-24T17:07:18.749865Z lvl=info msg="Compacting file" log_id=0DnOoANW000 engine=tsm1 tsm1_level=2 tsm1_strategy=level trace_id=0Dp7SGN0000 op_name=tsm1_compact_group tsm1_index=2 tsm1_file=/var/lib/influxdb/data/telegraf/autogen/138/000009519-000000002.tsm
Feb 24 18:07:18 z3-4 influxd[21002]: ts=2019-02-24T17:07:18.749990Z lvl=info msg="Compacting file" log_id=0DnOoANW000 engine=tsm1 tsm1_level=2 tsm1_strategy=level trace_id=0Dp7SGN0000 op_name=tsm1_compact_group tsm1_index=3 tsm1_file=/var/lib/influxdb/data/telegraf/autogen/138/000009528-000000002.tsm
Feb 24 18:07:18 z3-4 influxd[21002]: ts=2019-02-24T17:07:18.750609Z lvl=info msg="Aborted compaction" log_id=0DnOoANW000 engine=tsm1 tsm1_level=2 tsm1_strategy=level trace_id=0Dp7SGN0000 op_name=tsm1_compact_group error="compaction in progress: open /var/lib/influxdb/data/telegraf/autogen/138/000009528-000000003.tsm.tmp: file exists"
Feb 24 18:07:18 z3-4 influxd[21002]: ts=2019-02-24T17:07:18.765604Z lvl=info msg="TSM compaction (end)" log_id=0DnOoANW000 engine=tsm1 tsm1_strategy=full tsm1_optimize=false trace_id=0Dp7SCWl000 op_name=tsm1_compact_group op_event=end op_elapsed=1001.598ms
I tried several options to get InfluxDB working again:
- stop and restart the DB
- manually remove the offending *.tmp files
- update the package to the latest version available on Debian
None of this helped. Any hints are appreciated.
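
For reference, the leftover temporary files mentioned above can be listed with a short script like the one below. This is only a minimal sketch: the helper name `find_tmp_files` is made up for illustration, and the data directory is assumed to be `/var/lib/influxdb/data`, as it appears in the log paths. It only lists files and deletes nothing.

```python
from pathlib import Path


def find_tmp_files(data_dir):
    """Return all leftover compaction temp files (*.tsm.tmp) under data_dir, sorted."""
    return sorted(Path(data_dir).rglob("*.tsm.tmp"))


if __name__ == "__main__":
    # Assumption: data directory taken from the paths in the log excerpt above.
    for path in find_tmp_files("/var/lib/influxdb/data"):
        print(path)
```

Running it while influxd is stopped avoids listing temp files that belong to a compaction that is genuinely in progress.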