Running Python script as systemd service, connecting to InfluxDB results in ConnectionRefusedError/ConnectionError

I am running a shell script as Ubuntu’s systemd service to start at boot-time. The script internally executes a Python script (python_simulator.py) that connects to InfluxDB (through Python’s influxdb package).

The Python script fails to start at boot-time, and the logs show a ‘ConnectionError’ while connecting to InfluxDB. My guess was that the influxdb service has not started by the time the Python service activates at boot time. So I added an ordering dependency to the service by setting “After” and “Wants” to “influxdb.service”, which activates the Python service a few seconds after the influxdb service. But I still get the same connection error.

The systemd service (myservice.service) looks like:

[Unit]
Description= Python startup service.
After=influxdb.service
Wants=influxdb.service

[Service]
Type=forking
ExecStart=/bin/bash /home/test_user/Deploy/start.sh
ExecStop=/bin/bash /home/test_user/Deploy/stop.sh

[Install]
WantedBy=multi-user.target

The log file of the Python script (python_simulator.py):

Traceback (most recent call last):
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/urllib3/connection.py", line 159, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw)
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/urllib3/util/connection.py", line 80, in create_connection
    raise err
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/urllib3/util/connection.py", line 70, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/urllib3/connectionpool.py", line 354, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python3.6/http/client.py", line 1239, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.6/http/client.py", line 1285, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.6/http/client.py", line 1234, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.6/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/usr/lib/python3.6/http/client.py", line 964, in send
    self.connect()
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/urllib3/connection.py", line 181, in connect
    conn = self._new_conn()
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/urllib3/connection.py", line 168, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f817d91b400>: Failed to establish a new connection: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/urllib3/connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/urllib3/util/retry.py", line 399, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=8086): Max retries exceeded with url: /query?q=SHOW+DATABASES (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f817d91b400>: Failed to establish a new connection: [$

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "python_simulator.py", line 467, in <module>
    main(host=args.host, port=args.port)
  File "python_simulator.py", line 312, in main
    for db_dict in client.get_list_database():
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/influxdb/client.py", line 570, in get_list_database
    return list(self.query("SHOW DATABASES").get_points())
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/influxdb/client.py", line 416, in query
    expected_response_code=expected_response_code
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/influxdb/client.py", line 267, in request
    timeout=self._timeout
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/requests/adapters.py", line 516, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8086): Max retries exceeded with url: /query?q=SHOW+DATABASES (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f817d91b400>: Failed to establish a new connection$

Lastly, the blocking tree of daemons shows that myservice.service executes after influxdb.service:

myservice.service +6.872s
└─influxdb.service @11.344s
  └─network-online.target @11.337s
    └─NetworkManager-wait-online.service @4.706s +6.630s
      └─NetworkManager.service @3.940s +674ms
        └─dbus.service @3.914s
          └─basic.target @3.728s
            └─sockets.target @3.728s
              └─snapd.socket @3.722s +5ms
                └─sysinit.target @3.712s
                  └─apparmor.service @3.276s +435ms
                    └─local-fs.target @3.266s
                      └─run-user-1000.mount @49.841s
                        └─swap.target @3.160s
                          └─dev-disk-by\x2duuid-16e1b46a\x2d79fc\x2d4965\x2d9932\x2d8f589e9e7057.swap @3.132s +23ms
                            └─dev-disk-by\x2duuid-16e1b46a\x2d79fc\x2d4965\x2d9932\x2d8f589e9e7057.device @3.130s

I am not sure why I am still not able to execute the script (python_simulator.py) with influxdb in place. Are there any other dependencies? Are any changes needed in myservice.service? Any help will be appreciated.

Edit 1:

Could the real cause be the ConnectionRefusedError rather than the ConnectionError, that is, nothing is listening on port 8086 yet when the script tries to connect to InfluxDB? If so, how can I express that in the dependency order?
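
One way to test that hypothesis is to check whether anything accepts TCP connections on the port at the moment the script starts. The following is only a minimal sketch using the standard library; the function name `port_is_open` is made up for illustration and is not part of the influxdb package.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises ConnectionRefusedError (errno 111 on Linux)
        # when no process is listening on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # ConnectionRefusedError and socket timeouts are both OSError subclasses.
        return False
```

Logging `port_is_open("localhost", 8086)` at the top of the script would show whether the listener is already up when the service starts.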

Hi @deepaksurana18, welcome,
Maybe a few seconds is not long enough. Can you check in the InfluxDB log file how long it takes to start up?
I would let Python wait a bit more :slight_smile:

Hope this helps,

Hi @MarcV, thanks!

InfluxDB service starts before myservice.service, but never ends. In this case, it continuously executes the continuous queries that were part of the python_simulator.py. This is interesting, as the .py ended in error before it could instantiate CQs.

● influxdb.service - InfluxDB is an open-source, distributed, time series database
   Loaded: loaded (/lib/systemd/system/influxdb.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2019-04-24 11:32:06 EDT; 1h 29min ago
     Docs: https://docs.influxdata.com/influxdb/
 Main PID: 1744 (influxd)
    Tasks: 12 (limit: 4352)
   CGroup: /system.slice/influxdb.service
           └─1744 /usr/bin/influxd -config /etc/influxdb/influxdb.conf

Apr 24 13:01:00 CR-1 influxd[1744]: ts=2019-04-24T17:01:00.253259Z lvl=info msg="Finished continuous query" log_id=0E~~Sw50000 service=continuous_querier trace_id=0F04XyrG000 op_name=continuous_querier_execute name=cq_test1_instance=DB1 written=0 s
Apr 24 13:01:00 CR-1 influxd[1744]: ts=2019-04-24T17:01:00.254899Z lvl=info msg="Continuous query execution (start)" log_id=0E~~Sw50000 service=continuous_querier trace_id=0F04XysW000 op_name=continuous_querier_execute op_event=start
Apr 24 13:01:00 CR-1 influxd[1744]: ts=2019-04-24T17:01:00.254928Z lvl=info msg="Executing continuous query" log_id=0E~~Sw50000 service=continuous_querier trace_id=0F04XysW000 op_name=continuous_querier_execute name=cq_test2_instance=DB1 start=2019-
Apr 24 13:01:20 CR-1 influxd[1744]: [httpd] 127.0.0.1 - - [24/Apr/2019:13:01:20 -0400] "POST /write?db=telegraf HTTP/1.1" 204 0 "-" "Telegraf/1.9.5" 94ea3668-66b2-11e9-8215-00224d1f73dc 159885
Apr 24 13:01:30 CR-1 influxd[1744]: [httpd] 127.0.0.1 - - [24/Apr/2019:13:01:30 -0400] "POST /write?db=telegraf HTTP/1.1" 204 0 "-" "Telegraf/1.9.5" 9ae08686-66b2-11e9-8216-00224d1f73dc 149187

Q1) How could I make myservice.service wait for a longer non-deterministic time?

Moreover, please check the edited question.

Hi, do you mean that Python creates the continuous queries every time the server reboots?
That is not really necessary, because continuous queries, once created, survive a reboot or a restart of the service.
That explains why the continuous queries still execute after the Python script has failed.
Or is there more in the Python script?

If necessary, you could add a sleep in the Python script or check InfluxDB availability in the script …
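
A rough sketch of the "check availability in the script" idea: retry the first InfluxDB call until it stops raising a connection error. The helper name `retry_until_ready` and its parameters are hypothetical, and the default of catching `OSError` is an assumption that should be narrowed to the exception your client actually raises.

```python
import time


def retry_until_ready(action, exceptions=(OSError,), attempts=30, delay=2.0,
                      sleep=time.sleep):
    """Call action() until it succeeds or attempts run out; return its result."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except exceptions:
            if attempt == attempts:
                raise  # give up and surface the last error
            sleep(delay)  # wait before the next try
```

In the script from the traceback this could wrap the first call, for example `retry_until_ready(client.get_list_database, exceptions=(requests.exceptions.ConnectionError,))`, assuming `requests` is imported there.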

That explains the execution of the CQs even when the Python script fails. The Python script creates the CQs (dropping older ones, if any) and then starts dumping dummy data into InfluxDB.

I don’t want to customize the script to check for InfluxDB availability because, apart from running this script at boot time, I also start a Flask application, which ends in the same error when it connects to InfluxDB.

I think the issue could be that the HTTP listener is not up by the time I use InfluxDB’s Python HTTP service. (https://github.com/influxdata/influxdb/issues/6068)

Hi, you can delay the completion of the influxd service start while InfluxDB is starting up …

I have done a little test …

I inserted the following line in /usr/lib/systemd/system/influxdb.service

   ExecStartPost=/usr/bin/sleep 60

This is my complete file …

[Unit]
Description=InfluxDB is an open-source, distributed, time series database
Documentation=https://docs.influxdata.com/influxdb/
After=network-online.target

[Service]
User=influxdb
Group=influxdb
LimitNOFILE=65536
EnvironmentFile=-/etc/default/influxdb
ExecStart=/usr/bin/influxd -config /etc/influxdb/influxdb.conf $INFLUXD_OPTS
ExecStartPost=/usr/bin/sleep 60
KillMode=control-group
Restart=on-failure

[Install]
WantedBy=multi-user.target
Alias=influxd.service

Then systemctl restart gives a warning:

$ systemctl restart influxd
Warning: influxd.service changed on disk. Run 'systemctl daemon-reload' to reload units.

so I executed

systemctl daemon-reload

and then

systemctl restart influxd

This will start the influxd service and then sleep for 60 seconds (maybe it should be longer).

From the man pages (see ExecStartPost):

ExecStart=
Commands with their arguments that are executed when this service is started. The value is split into zero or more
ExecStart= is specified, then the service must have RemainAfterExit=yes set.
For each of the specified commands, the first argument must be an absolute path to an executable. Optionally, if
this file name is prefixed with “@”, the second token will be passed as “argv[0]” to the executed process, followed
file. If one of the commands fails (and is not prefixed with “-”), other lines are not executed, and the unit is
ExecStartPre=, ExecStartPost=
Additional commands that are executed before or after the command in ExecStart=, respectively. Syntax is the same
as for ExecStart=, except that multiple command lines are allowed and the commands are executed one after the
If any of those commands (not prefixed with “-”) fail, the rest are not executed and the unit is considered failed.
Note that ExecStartPre= may not be used to start long-running processes. All processes forked off by processes
invoked via ExecStartPre= will be killed before the next service process is run.

It seems that this way we get some time between influxdb.service and myservice.service, so that InfluxDB can start up completely and port 8086 is listening.

Meanwhile, I was trying a similar thing but on the myservice.service side. Changing it in my own service will avoid making changes to any system services (in this case, influxdb.service). I added the below line in /lib/systemd/system/myservice.service:

ExecStartPre=/bin/sh -c 'while ! curl -sf http://localhost:8086/ping; do sleep 1; done'

Since the issue was that HTTP port 8086 was not yet available when using InfluxDB’s HTTP service, the above line waits until there is a successful response on that port. Once that happens, the Python script should be able to use InfluxDB.

My final service looks like:

[Unit]
Description=EDGE INTELLIGENCE startup service.
After=network-online.target influxdb.service
Requires=influxdb.service
Wants=influxdb.service

[Service]
Type=forking
ExecStart=/bin/bash /home/test_user/Deploy/EI_start.sh
ExecStop=/bin/bash /home/test_user/Deploy/EI_stop.sh
ExecStartPre=/bin/sh -c 'while ! curl -sf http://localhost:8086/ping; do sleep 1; done'
Restart=always

[Install]
WantedBy=multi-user.target

Requires/Wants: Configures dependency on influxdb.service and will start it if not running.
ExecStartPre: Will wait until the InfluxDB port 8086 is up and listening.
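
The `curl` loop above has no deadline of its own (it runs until systemd's start timeout stops it). As a hedged sketch only, the same polling can be written in Python with an explicit deadline; the function name `wait_for_http` and its parameters are made up for illustration, and the only assumption about InfluxDB is that `/ping` answers with a 2xx status once it is ready, as the `curl -sf` check implies.

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url, deadline_s=60.0, interval_s=1.0, request_timeout_s=2.0):
    """Poll url until it answers with a 2xx status; return False on timeout."""
    end = time.monotonic() + deadline_s
    while True:
        try:
            with urllib.request.urlopen(url, timeout=request_timeout_s) as resp:
                if 200 <= resp.status < 300:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # nothing listening yet, or an HTTP error: try again
        if time.monotonic() >= end:
            return False
        time.sleep(interval_s)
```

A small wrapper script could call `wait_for_http("http://localhost:8086/ping")` and exit non-zero when it returns False, so that ExecStartPre fails visibly instead of waiting indefinitely.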

That is even better and more dynamic, have fun :slight_smile: