Influxdb.service: Service hold-off time over, scheduling restart

Hi,

I have a problem with the InfluxDB service (version=1.7.9 branch=1.7 commit=23bc63d43a8dc05f53afa46e3526ebb5578f3d88). A few minutes after starting, the service restarts without any errors. The log looks like this:

Nov 26 11:45:40 ECO-SLDC systemd[1]: influxdb.service: Main process exited, code=killed, status=9/KILL
Nov 26 11:45:40 ECO-SLDC systemd[1]: influxdb.service: Unit entered failed state.
Nov 26 11:45:40 ECO-SLDC systemd[1]: influxdb.service: Failed with result 'signal'.
Nov 26 11:45:41 ECO-SLDC systemd[1]: influxdb.service: Service hold-off time over, scheduling restart.
Nov 26 11:45:41 ECO-SLDC systemd[1]: Stopped InfluxDB is an open-source, distributed, time series database.
Nov 26 11:45:41 ECO-SLDC systemd[1]: Started InfluxDB is an open-source, distributed, time series database.
Nov 26 11:45:57 ECO-SLDC influxd[13357]: ts=2019-11-26T10:45:57.252382Z lvl=info msg="index opened with 8 partitions" log_id=0JLpvQDl000 index=tsi

I’d be grateful for any help.

Hi @Vinnic,

There are a few things to consider here. First, the message “Service hold-off time over, scheduling restart.” indicates that systemd has waited to restart the process for a period known as the “hold-off time”, and is now restarting the process.

The message “Main process exited, code=killed, status=9/KILL” indicates the reason for the application failing; in this case, it was sent a “kill” signal from elsewhere in the system. This signal can be sent manually, or it can be sent from another mechanism such as the Out-of-Memory (OOM) Killer, which is part of the Linux kernel.

Are there any messages in your system logs that indicate why the process is being sent a “kill” signal?
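
For example, something along these lines (standard Linux tools; the exact messages vary by distribution) should surface OOM-killer activity if that is the cause:

# kernel ring buffer - the OOM killer logs here when it terminates a process
dmesg -T | grep -i -E "out of memory|killed process"

# on systemd hosts the kernel log is also available through journalctl
journalctl -k | grep -i oom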


Thanks for the reply :slight_smile:

The output of dmesg -T | grep Out is:

[Wed Nov 27 00:42:04 2019] Out of memory in UB 4227: OOM killed process 7583 (influxd) score 0 vm:35778656kB, rss:2723912kB, swap:0kB
[Wed Nov 27 00:43:53 2019] Out of memory in UB 4227: OOM killed process 7595 (influxd) score 0 vm:35779388kB, rss:2719388kB, swap:0kB
[Wed Nov 27 00:46:25 2019] Out of memory in UB 4227: OOM killed process 7626 (influxd) score 0 vm:37329112kB, rss:2715688kB, swap:0kB
[Wed Nov 27 00:46:55 2019] Out of memory in UB 4227: OOM killed process 7684 (influxd) score 0 vm:35877676kB, rss:2717184kB, swap:0kB
[Wed Nov 27 00:48:29 2019] Out of memory in UB 4227: OOM killed process 7704 (influxd) score 0 vm:35714356kB, rss:2727696kB, swap:0kB
[Wed Nov 27 00:50:54 2019] Out of memory in UB 4227: OOM killed process 7732 (influxd) score 0 vm:35886628kB, rss:2718696kB, swap:0kB
[Wed Nov 27 00:53:24 2019] Out of memory in UB 4227: OOM killed process 7802 (influxd) score 0 vm:37098900kB, rss:2720124kB, swap:0kB
[Wed Nov 27 00:55:56 2019] Out of memory in UB 4227: OOM killed process 7839 (influxd) score 0 vm:35766440kB, rss:2717400kB, swap:0kB
[Wed Nov 27 00:57:09 2019] Out of memory in UB 4227: OOM killed process 7899 (influxd) score 0 vm:35753140kB, rss:2715324kB, swap:0kB

The database worked without problems for 12 months. It is now quite big: the data directory is approximately 80 GB.
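
For context, two quick checks I can run (standard Linux tools) to compare what influxd uses against what the machine actually offers:

# memory the host/container reports as available
free -h

# resident and virtual memory of the running influxd process
ps -o pid,rss,vsz,cmd -C influxd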

I tried switching the engine to TSI, following Upgrade to InfluxDB 1.8.x | InfluxDB OSS v1 Documentation (a rough shell sketch of the steps follows the quote):

To enable TSI in InfluxDB 1.7.x, complete the following steps:

a. If using the InfluxDB configuration file, find the [data] section, uncomment index-version = "inmem", and change the value to tsi1.

b. If using environment variables, set INFLUXDB_DATA_INDEX_VERSION to tsi1.

c. Delete shard index directories (by default, located at /<shard_ID>/index).

d. Convert TSM-based shards to TSI-based shards by running the influx_inspect buildtsi command.
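
A rough shell sketch of those steps (paths and unit name assume a default package install, so adjust as needed):

# stop InfluxDB before touching the shard indexes
sudo systemctl stop influxdb

# step a: in /etc/influxdb/influxdb.conf, [data] section, set:
#   index-version = "tsi1"

# step c: remove the existing per-shard index directories
sudo find /var/lib/influxdb/data -type d -name index -prune -exec rm -rf {} +

# step d: rebuild TSI indexes as the influxdb user, then start the service again
sudo -u influxdb influx_inspect buildtsi -datadir /var/lib/influxdb/data -waldir /var/lib/influxdb/wal
sudo systemctl start influxdb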

but after this operation the log looks like this:

2019-11-27T06:46:53.763618Z     info    index opened with 8 partitions  {"log_id": "0JMuZ8Yl000", "index": "tsi"}
2019-11-27T06:46:53.766147Z     info    Reading file    {"log_id": "0JMuZ8Yl000", "engine": "tsm1", "service": "cacheloader", "path": "/var/lib/influxdb/wal/liczniki/autogen/418/_01945.wal", "size": 455749}
2019-11-27T06:46:53.768512Z     info    Reading file    {"log_id": "0JMuZ8Yl000", "engine": "tsm1", "service": "cacheloader", "path": "/var/lib/influxdb/wal/liczniki/autogen/416/_02339.wal", "size": 133}
2019-11-27T06:46:53.779122Z     info    Reading file    {"log_id": "0JMuZ8Yl000", "engine": "tsm1", "service": "cacheloader", "path": "/var/lib/influxdb/wal/liczniki/autogen/416/_02341.wal", "size": 267}

I’m not sure whether everything is working correctly, because the log shows both "index": "tsi" and "engine": "tsm1".

Hi @Vinnic,

I believe everything is working fine. tsm1 is the on-disk storage format; it can be used with either the inmem index, which is what you have been using so far, or tsi, which is the disk-based index. We’re working to clarify some of the language in the docs to avoid confusion.
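
If you want to double-check which index is actually in use, two rough checks (default paths assumed; point influxd at your config with -config if it lives elsewhere):

# effective setting from the configuration influxd loads
influxd config | grep index-version

# a TSI shard keeps its index on disk; with inmem this directory would not exist
ls /var/lib/influxdb/data/<database>/<retention_policy>/<shard_id>/index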

Thanks @noahcrowley

Changing the index to TSI solved the problem :grinning:
