Running Python script as systemd service, connecting to InfluxDB results in ConnectionRefusedError/ConnectionError

I am running a shell script as Ubuntu’s systemd service to start at boot-time. The script internally executes a Python script (python_simulator.py) that connects to InfluxDB (through Python’s influxdb package).

The Python script fails to start at boot time, and the logs show a ‘ConnectionError’ while connecting to InfluxDB. My interpretation was that the influxdb service may not have started by the time the Python service activates at boot. So I added an ordering dependency to the service by setting “After” and “Wants” to “influxdb.service”, which activates the Python service a few seconds after the influxdb service. However, I still get the same connection error.

The systemd service (myservice.service) looks like:

[Unit]
Description= Python startup service.
After=influxdb.service
Wants=influxdb.service

[Service]
Type=forking
ExecStart=/bin/bash /home/test_user/Deploy/start.sh
ExecStop=/bin/bash /home/test_user/Deploy/stop.sh

[Install]
WantedBy=multi-user.target

The log file of the Python script (python_simulator.py):

Traceback (most recent call last):
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/urllib3/connection.py", line 159, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw)
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/urllib3/util/connection.py", line 80, in create_connection
    raise err
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/urllib3/util/connection.py", line 70, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/urllib3/connectionpool.py", line 354, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python3.6/http/client.py", line 1239, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.6/http/client.py", line 1285, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.6/http/client.py", line 1234, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.6/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/usr/lib/python3.6/http/client.py", line 964, in send
    self.connect()
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/urllib3/connection.py", line 181, in connect
    conn = self._new_conn()
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/urllib3/connection.py", line 168, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f817d91b400>: Failed to establish a new connection: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/urllib3/connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/urllib3/util/retry.py", line 399, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=8086): Max retries exceeded with url: /query?q=SHOW+DATABASES (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f817d91b400>: Failed to establish a new connection: [$

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "python_simulator.py", line 467, in <module>
    main(host=args.host, port=args.port)
  File "python_simulator.py", line 312, in main
    for db_dict in client.get_list_database():
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/influxdb/client.py", line 570, in get_list_database
    return list(self.query("SHOW DATABASES").get_points())
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/influxdb/client.py", line 416, in query
    expected_response_code=expected_response_code
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/influxdb/client.py", line 267, in request
    timeout=self._timeout
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/home/test_user/Deploy/py_venv/lib/python3.6/site-packages/requests/adapters.py", line 516, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8086): Max retries exceeded with url: /query?q=SHOW+DATABASES (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f817d91b400>: Failed to establish a new connection$

Lastly, the blocking tree of daemons shows that myservice.service starts after influxdb.service:

myservice.service +6.872s
└─influxdb.service @11.344s
  └─network-online.target @11.337s
    └─NetworkManager-wait-online.service @4.706s +6.630s
      └─NetworkManager.service @3.940s +674ms
        └─dbus.service @3.914s
          └─basic.target @3.728s
            └─sockets.target @3.728s
              └─snapd.socket @3.722s +5ms
                └─sysinit.target @3.712s
                  └─apparmor.service @3.276s +435ms
                    └─local-fs.target @3.266s
                      └─run-user-1000.mount @49.841s
                        └─swap.target @3.160s
                          └─dev-disk-by\x2duuid-16e1b46a\x2d79fc\x2d4965\x2d9932\x2d8f589e9e7057.swap @3.132s +23ms
                            └─dev-disk-by\x2duuid-16e1b46a\x2d79fc\x2d4965\x2d9932\x2d8f589e9e7057.device @3.130s

I am not sure why I am still not able to execute the script (python_simulator.py) with influxdb in place. Are there any other dependencies? Are any changes needed in myservice.service? Any help will be appreciated.

Edit 1:

Could the underlying cause be the ConnectionRefusedError rather than the ConnectionError, because nothing is listening on port 8086 yet when the script tries to connect to InfluxDB? If so, how can I express that in the dependency order?
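One way to test that hypothesis is to try a plain TCP connection to the port and see whether it is refused. The snippet below is only a minimal sketch: `port_is_listening` is a hypothetical helper name, not part of the influxdb package, and the host, port and timeout values are assumptions taken from the traceback above.

```python
import socket


def port_is_listening(host: str = "localhost", port: int = 8086,
                      timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection performs the same connect() call that fails
        # in the traceback with [Errno 111] Connection refused.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # ConnectionRefusedError and socket timeouts are OSError subclasses.
        return False
```

If this returns False at the moment the service starts and True a little later, the process is up but its HTTP listener is not ready yet.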

Hi @deepaksurana18, welcome.
Maybe a few seconds is not long enough. Can you check in the InfluxDB log file how long it takes to start up?
I would let Python wait a bit longer :)

Hope this helps.

Hi @MarcV, thanks!

The InfluxDB service starts before myservice.service and keeps running. It continuously executes the continuous queries (CQs) that python_simulator.py is meant to create. This is interesting, because the script ended in error before it could create the CQs.

● influxdb.service - InfluxDB is an open-source, distributed, time series database
   Loaded: loaded (/lib/systemd/system/influxdb.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2019-04-24 11:32:06 EDT; 1h 29min ago
     Docs: https://docs.influxdata.com/influxdb/
 Main PID: 1744 (influxd)
    Tasks: 12 (limit: 4352)
   CGroup: /system.slice/influxdb.service
           └─1744 /usr/bin/influxd -config /etc/influxdb/influxdb.conf

Apr 24 13:01:00 CR-1 influxd[1744]: ts=2019-04-24T17:01:00.253259Z lvl=info msg="Finished continuous query" log_id=0E~~Sw50000 service=continuous_querier trace_id=0F04XyrG000 op_name=continuous_querier_execute name=cq_test1_instance=DB1 written=0 s
Apr 24 13:01:00 CR-1 influxd[1744]: ts=2019-04-24T17:01:00.254899Z lvl=info msg="Continuous query execution (start)" log_id=0E~~Sw50000 service=continuous_querier trace_id=0F04XysW000 op_name=continuous_querier_execute op_event=start
Apr 24 13:01:00 CR-1 influxd[1744]: ts=2019-04-24T17:01:00.254928Z lvl=info msg="Executing continuous query" log_id=0E~~Sw50000 service=continuous_querier trace_id=0F04XysW000 op_name=continuous_querier_execute name=cq_test2_instance=DB1 start=2019-
Apr 24 13:01:20 CR-1 influxd[1744]: [httpd] 127.0.0.1 - - [24/Apr/2019:13:01:20 -0400] "POST /write?db=telegraf HTTP/1.1" 204 0 "-" "Telegraf/1.9.5" 94ea3668-66b2-11e9-8215-00224d1f73dc 159885
Apr 24 13:01:30 CR-1 influxd[1744]: [httpd] 127.0.0.1 - - [24/Apr/2019:13:01:30 -0400] "POST /write?db=telegraf HTTP/1.1" 204 0 "-" "Telegraf/1.9.5" 9ae08686-66b2-11e9-8216-00224d1f73dc 149187

Q1) How could I make myservice.service wait for a longer non-deterministic time?

Moreover, please check the edited question.

Hi, do you mean that Python creates the continuous queries every time the server reboots?
That is not really necessary, because continuous queries, once created, survive a reboot or a restart of the service.
That explains why the continuous queries still execute after the Python script fails.
Or is there more in the Python script?

If necessary, you could add a sleep in the Python script, or check InfluxDB availability in the script.
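As a rough illustration of the "check availability in the script" option, a bounded polling loop could look like the sketch below. `wait_until` is a hypothetical helper, and the default timeout and interval are arbitrary assumptions; the `check` argument can be any function that returns True once InfluxDB is reachable.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], timeout: float = 60.0,
               interval: float = 1.0) -> bool:
    """Poll check() until it returns True or timeout seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            # Give up so the caller can log an error and exit cleanly.
            return False
        time.sleep(interval)
```

Compared with a fixed sleep, this returns as soon as the check succeeds and still fails in a predictable way if InfluxDB never comes up.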

That explains why the CQs execute even if the Python script fails. The Python script creates the CQs (dropping older ones, if any) and then starts writing dummy data into InfluxDB.

I don’t want to customize the script to check for InfluxDB availability because, apart from running this script at boot time, I also start a Flask application that ends in the same error when it connects to InfluxDB.

I think the issue could be that the HTTP listener is not up by the time I call InfluxDB’s HTTP API from Python. See InfluxDB GitHub issue #6068, “systemd service should "wait" until influxdb has finished starting up” (influxdata/influxdb).

Hi, you can delay the influxd service while InfluxDB is starting up.

I have done a little test …

I have inserted the following line in /usr/lib/systemd/system/influxdb.service:

   ExecStartPost=/usr/bin/sleep 60

This is my complete file …

[Unit]
Description=InfluxDB is an open-source, distributed, time series database
Documentation=https://docs.influxdata.com/influxdb/
After=network-online.target

[Service]
User=influxdb
Group=influxdb
LimitNOFILE=65536
EnvironmentFile=-/etc/default/influxdb
ExecStart=/usr/bin/influxd -config /etc/influxdb/influxdb.conf $INFLUXD_OPTS
ExecStartPost=/usr/bin/sleep 60
KillMode=control-group
Restart=on-failure

[Install]
WantedBy=multi-user.target
Alias=influxd.service

Then systemctl restart gives a warning:

$ systemctl restart influxd
Warning: influxd.service changed on disk. Run 'systemctl daemon-reload' to reload units.

so I executed

systemctl daemon-reload

and then

systemctl restart influxd

This will start the influxd service and then sleep for 60 seconds (maybe it should be longer).

From the man pages (see ExecStartPost; the excerpt is truncated where marked [...]):

ExecStart=
    Commands with their arguments that are executed when this service is started. The value is split into zero or more [...]
    [...] ExecStart= is specified, then the service must have RemainAfterExit=yes set.
    For each of the specified commands, the first argument must be an absolute path to an executable. Optionally, if
    this file name is prefixed with “@”, the second token will be passed as “argv[0]” to the executed process, followed [...]
    [...] file. If one of the commands fails (and is not prefixed with “-”), other lines are not executed, and the unit is [...]
ExecStartPre=, ExecStartPost=
    Additional commands that are executed before or after the command in ExecStart=, respectively. Syntax is the same
    as for ExecStart=, except that multiple command lines are allowed and the commands are executed one after the [...]
    If any of those commands (not prefixed with “-”) fail, the rest are not executed and the unit is considered failed.
    Note that ExecStartPre= may not be used to start long-running processes. All processes forked off by processes
    invoked via ExecStartPre= will be killed before the next service process is run.

It seems that this way we get some time between influxdb.service and myservice.service, so that InfluxDB can start up completely and port 8086 is up.

Meanwhile, I was trying a similar thing on the myservice.service side. Changing my own service avoids modifying any system service (in this case, influxdb.service). I added the line below to /lib/systemd/system/myservice.service:

ExecStartPre=/bin/sh -c 'while ! curl -sf http://localhost:8086/ping; do sleep 1; done'

Since the issue was that HTTP port 8086 was not yet available when the script used InfluxDB’s HTTP service, the line above waits until a response comes back on that port. Once that happens, the Python script should be able to use InfluxDB.

My final service looks like:

[Unit]
Description=EDGE INTELLIGENCE startup service.
After=network-online.target influxdb.service
Requires=influxdb.service
Wants=influxdb.service

[Service]
Type=forking
ExecStart=/bin/bash /home/test_user/Deploy/EI_start.sh
ExecStop=/bin/bash /home/test_user/Deploy/EI_stop.sh
ExecStartPre=/bin/sh -c 'while ! curl -sf http://localhost:8086/ping; do sleep 1; done'
Restart=always

[Install]
WantedBy=multi-user.target

Requires/Wants: Configures dependency on influxdb.service and will start it if not running.
ExecStartPre: Will wait until the InfluxDB port 8086 is up and listening.
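For the Flask application or any other Python client, the readiness test done by the curl command in ExecStartPre can be approximated in Python. This is a sketch only: `ping_ok` is a hypothetical helper, the default URL is the one from the unit file above, and bypassing any configured HTTP proxy is an assumption made because the target is a local service.

```python
import http.client
import urllib.error
import urllib.request


def ping_ok(url: str = "http://localhost:8086/ping",
            timeout: float = 2.0) -> bool:
    """Return True if a GET on url answers with a 2xx status."""
    # Assumption: InfluxDB is local, so ignore proxy environment variables.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Covers connection refused, timeouts and HTTP error statuses
        # (urllib raises HTTPError, a URLError subclass, for 4xx/5xx).
        return False
```

Calling this once per second until it returns True roughly mirrors the `while ! curl -sf ...; do sleep 1; done` loop, with the wait living in the client instead of the unit file.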

That is even better and more dynamic. Have fun :)