Influxdb.service: Service hold-off time over, scheduling restart

Hi,

I have a problem with the InfluxDB service (version=1.7.9 branch=1.7 commit=23bc63d43a8dc05f53afa46e3526ebb5578f3d88). A few minutes after starting, the service restarts without any errors. The log looks like this:

Nov 26 11:45:40 ECO-SLDC systemd[1]: influxdb.service: Main process exited, code=killed, status=9/KILL
Nov 26 11:45:40 ECO-SLDC systemd[1]: influxdb.service: Unit entered failed state.
Nov 26 11:45:40 ECO-SLDC systemd[1]: influxdb.service: Failed with result 'signal'.
Nov 26 11:45:41 ECO-SLDC systemd[1]: influxdb.service: Service hold-off time over, scheduling restart.
Nov 26 11:45:41 ECO-SLDC systemd[1]: Stopped InfluxDB is an open-source, distributed, time series database.
Nov 26 11:45:41 ECO-SLDC systemd[1]: Started InfluxDB is an open-source, distributed, time series database.
Nov 26 11:45:57 ECO-SLDC influxd[13357]: ts=2019-11-26T10:45:57.252382Z lvl=info msg="index opened with 8 partitions" log_id=0JLpvQDl000 index=tsi

I’d be grateful for any help.

Hi @Vinnic,

There are a few things to consider here. First, the message “Service hold-off time over, scheduling restart.” indicates that systemd has waited to restart the process for a period known as the “hold-off time”, and is now restarting the process.

The message “Main process exited, code=killed, status=9/KILL” indicates the reason for the application failing; in this case, it was sent a “kill” signal from elsewhere in the system. This signal can be sent manually, or it can be sent from another mechanism such as the Out-of-Memory (OOM) Killer, which is part of the Linux kernel.

Are there any messages in your system logs that indicate why the process is being sent a “kill” signal?
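
For example, something along these lines (standard Linux tools; the exact messages vary by distribution) should surface OOM-killer activity if that is the cause:

# kernel ring buffer - the OOM killer logs here when it terminates a process
dmesg -T | grep -i -E "out of memory|killed process"

# on systemd hosts the kernel log is also available through journalctl
journalctl -k | grep -i oom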


Thanks for the reply :slight_smile:

The output of dmesg -T | grep Out is:

[Wed Nov 27 00:42:04 2019] Out of memory in UB 4227: OOM killed process 7583 (influxd) score 0 vm:35778656kB, rss:2723912kB, swap:0kB
[Wed Nov 27 00:43:53 2019] Out of memory in UB 4227: OOM killed process 7595 (influxd) score 0 vm:35779388kB, rss:2719388kB, swap:0kB
[Wed Nov 27 00:46:25 2019] Out of memory in UB 4227: OOM killed process 7626 (influxd) score 0 vm:37329112kB, rss:2715688kB, swap:0kB
[Wed Nov 27 00:46:55 2019] Out of memory in UB 4227: OOM killed process 7684 (influxd) score 0 vm:35877676kB, rss:2717184kB, swap:0kB
[Wed Nov 27 00:48:29 2019] Out of memory in UB 4227: OOM killed process 7704 (influxd) score 0 vm:35714356kB, rss:2727696kB, swap:0kB
[Wed Nov 27 00:50:54 2019] Out of memory in UB 4227: OOM killed process 7732 (influxd) score 0 vm:35886628kB, rss:2718696kB, swap:0kB
[Wed Nov 27 00:53:24 2019] Out of memory in UB 4227: OOM killed process 7802 (influxd) score 0 vm:37098900kB, rss:2720124kB, swap:0kB
[Wed Nov 27 00:55:56 2019] Out of memory in UB 4227: OOM killed process 7839 (influxd) score 0 vm:35766440kB, rss:2717400kB, swap:0kB
[Wed Nov 27 00:57:09 2019] Out of memory in UB 4227: OOM killed process 7899 (influxd) score 0 vm:35753140kB, rss:2715324kB, swap:0kB

The database worked without problems for 12 months. It is now quite big: the data directory is approximately 80 GB.
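
For context, two quick checks I can run (standard Linux tools) to compare what influxd uses against what the machine actually offers:

# memory the host/container reports as available
free -h

# resident and virtual memory of the running influxd process
ps -o pid,rss,vsz,cmd -C influxd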

I tried switching the engine to TSI, following Upgrade to InfluxDB 1.8.x | InfluxDB OSS v1 Documentation (a rough shell sketch of the steps follows the quote):

To enable TSI in InfluxDB 1.7.x, complete the following steps:

a. If using the InfluxDB configuration file, find the [data] section, uncomment index-version = "inmem", and change the value to tsi1.

b. If using environment variables, set INFLUXDB_DATA_INDEX_VERSION to tsi1.

c. Delete shard index directories (by default, located at /<shard_ID>/index).

d. Convert TSM-based shards to TSI-based shards by running the influx_inspect buildtsi command.
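
A rough shell sketch of those steps (paths and unit name assume a default package install, so adjust as needed):

# stop InfluxDB before touching the shard indexes
sudo systemctl stop influxdb

# step a: in /etc/influxdb/influxdb.conf, [data] section, set:
#   index-version = "tsi1"

# step c: remove the existing per-shard index directories
sudo find /var/lib/influxdb/data -type d -name index -prune -exec rm -rf {} +

# step d: rebuild TSI indexes as the influxdb user, then start the service again
sudo -u influxdb influx_inspect buildtsi -datadir /var/lib/influxdb/data -waldir /var/lib/influxdb/wal
sudo systemctl start influxdb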

but after this operation the log looks like this:

2019-11-27T06:46:53.763618Z     info    index opened with 8 partitions  {"log_id": "0JMuZ8Yl000", "index": "tsi"}
2019-11-27T06:46:53.766147Z     info    Reading file    {"log_id": "0JMuZ8Yl000", "engine": "tsm1", "service": "cacheloader", "path": "/var/lib/influxdb/wal/liczniki/autogen/418/_01945.wal", "size": 455749}
2019-11-27T06:46:53.768512Z     info    Reading file    {"log_id": "0JMuZ8Yl000", "engine": "tsm1", "service": "cacheloader", "path": "/var/lib/influxdb/wal/liczniki/autogen/416/_02339.wal", "size": 133}
2019-11-27T06:46:53.779122Z     info    Reading file    {"log_id": "0JMuZ8Yl000", "engine": "tsm1", "service": "cacheloader", "path": "/var/lib/influxdb/wal/liczniki/autogen/416/_02341.wal", "size": 267}

I’m not sure whether everything is working correctly, because the log shows both "index": "tsi" and "engine": "tsm1".

Hi @Vinnic,

I believe everything is working fine. tsm1 is the on-disk storage format; it can be used with either the inmem index, which is what you have been using so far, or tsi, which is the disk-based index. We’re working to clarify some of the language in the docs to avoid confusion.
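
If you want to double-check which index is actually in use, two rough checks (default paths assumed; point influxd at your config with -config if it lives elsewhere):

# effective setting from the configuration influxd loads
influxd config | grep index-version

# a TSI shard keeps its index on disk; with inmem this directory would not exist
ls /var/lib/influxdb/data/<database>/<retention_policy>/<shard_id>/index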

Thanks @noahcrowley

Changing the index to TSI solved the problem :grinning:
