influxd not opening ports when run as influxdb user

Hello,

I have an issue with my InfluxDB setup; details below.

System info:

InfluxDB shell version: 1.5.2
CentOS Linux release 7.5.1804 (Core)
SELinux disabled

Steps to reproduce:

Starting the service normally as the influxdb user (systemctl start influxdb) doesn’t open the ports specified by the bind-address parameter.

When it is started as root with /usr/bin/influxd -config /etc/influxdb/influxdb.conf, the ports are open and listening.

Modifying the /usr/lib/systemd/system/influxdb.service file to run the service as root also works.

Grafana is also running on this server and can open port 3000 as grafana user without any issue.

Additional info:

I tried changing the port or the IP in bind-address (to 127.0.0.1, 0.0.0.0, or the actual private IP) without success.

I tried adding the influxdb user to the sys and wheel groups, as suggested in some discussions, to allow it to open ports, without success.

I tried running setcap 'cap_net_bind_service=+ep' /usr/bin/influxd and changing the port in bind-address to 886, without success.

At the moment, the only solution I have found is modifying the systemd unit file to run as root, but that seems insecure.

Thank you in advance for your help,

Is there a reason why you’re using 1.5.2? The latest release is 1.6.1.

Can you provide more details about how you installed InfluxDB? I wasn’t able to reproduce your issue with the latest release.

I installed CentOS 7.5.1804 in a VirtualBox VM and followed the installation instructions for Red Hat & CentOS in the documentation.

$ sudo systemctl start influxdb
$ systemctl status influxdb
● influxdb.service - InfluxDB is an open-source, distributed, time series database
   Loaded: loaded (/usr/lib/systemd/system/influxdb.service; enabled; vendor preset: disabled)
   Active: active (running) since Sun 2018-08-26 12:56:23 EDT; 1s ago
     Docs: https://docs.influxdata.com/influxdb/
 Main PID: 10952 (influxd)
   CGroup: /system.slice/influxdb.service
           └─10952 /usr/bin/influxd -config /etc/influxdb/influxdb.conf

Aug 26 12:56:23 localhost.localdomain influxd[10952]: ts=2018-08-26T16:56:23.448403Z lvl=info msg="Starting precreation service" log_id=0A9nUd1W000 service=shard-...period=30m
Aug 26 12:56:23 localhost.localdomain influxd[10952]: ts=2018-08-26T16:56:23.448415Z lvl=info msg="Starting snapshot service" log_id=0A9nUd1W000 service=snapshot
Aug 26 12:56:23 localhost.localdomain influxd[10952]: ts=2018-08-26T16:56:23.448420Z lvl=info msg="Starting continuous query service" log_id=0A9nUd1W000 service=c...us_querier
Aug 26 12:56:23 localhost.localdomain influxd[10952]: ts=2018-08-26T16:56:23.448428Z lvl=info msg="Starting HTTP service" log_id=0A9nUd1W000 service=httpd authentication=false
Aug 26 12:56:23 localhost.localdomain influxd[10952]: ts=2018-08-26T16:56:23.448432Z lvl=info msg="opened HTTP access log" log_id=0A9nUd1W000 service=httpd path=stderr
Aug 26 12:56:23 localhost.localdomain influxd[10952]: ts=2018-08-26T16:56:23.448526Z lvl=info msg="Listening on HTTP" log_id=0A9nUd1W000 service=httpd addr=[::]:8...ttps=false
Aug 26 12:56:23 localhost.localdomain influxd[10952]: ts=2018-08-26T16:56:23.448538Z lvl=info msg="Starting retention policy enforcement service" log_id=0A9nUd1W0...terval=30m
Aug 26 12:56:23 localhost.localdomain influxd[10952]: ts=2018-08-26T16:56:23.451540Z lvl=info msg="Listening for signals" log_id=0A9nUd1W000
Aug 26 12:56:23 localhost.localdomain influxd[10952]: ts=2018-08-26T16:56:23.451884Z lvl=info msg="Storing statistics" log_id=0A9nUd1W000 service=monitor db_insta...terval=10s
Aug 26 12:56:23 localhost.localdomain influxd[10952]: ts=2018-08-26T16:56:23.452022Z lvl=info msg="Sending usage statistics to usage.influxdata.com" log_id=0A9nUd1W000
Hint: Some lines were ellipsized, use -l to show in full.

I verified that it was running as the influxdb user:

$ ps aux | grep influ[x]
influxdb 10952  0.2  0.8 386364 16364 ?        Ssl  12:56   0:00 /usr/bin/influxd -config /etc/influxdb/influxdb.conf

and also verified that ports were being opened:

$ ss -lntu
Netid State      Recv-Q Send-Q                                        Local Address:Port                                                       Peer Address:Port              
udp   UNCONN     0      0                                                 127.0.0.1:323                                                                   *:*                  
udp   UNCONN     0      0                                                         *:68                                                                    *:*                  
udp   UNCONN     0      0                                                       ::1:323                                                                  :::*                  
tcp   LISTEN     0      128                                                       *:22                                                                    *:*                  
tcp   LISTEN     0      128                                               127.0.0.1:8088                                                                  *:*                  
tcp   LISTEN     0      100                                               127.0.0.1:25                                                                    *:*                  
tcp   LISTEN     0      128                                                      :::8086                                                                 :::*                  
tcp   LISTEN     0      128                                                      :::22                                                                   :::*                  
tcp   LISTEN     0      100                                                     ::1:25                                                                   :::*                  

I was also able to connect to InfluxDB using the CLI on localhost:

$ influx
Connected to http://localhost:8086 version 1.6.1
InfluxDB shell version: 1.6.1
> 

I was not, however, able to connect to InfluxDB from outside of the virtual machine:

$ influx -host 10.0.2.10 -port 8086
Failed to connect to http://10.0.2.10:8086: Get http://10.0.2.10:8086/ping: dial tcp 10.0.2.10:8086: connect: connection refused
Please check your connection settings and ensure 'influxd' is running.

This is because CentOS comes with FirewallD enabled by default. I added the InfluxDB port to the public zone:

$ sudo firewall-cmd --zone=public --add-port=8086/tcp
success

and was able to connect from an external machine:

$ influx -host 10.0.2.10 -port 8086
Connected to http://10.0.2.10:8086 version 1.6.1
InfluxDB shell version: v1.6.1
> 

DigitalOcean has a great post on FirewallD for CentOS 7 which goes into more detail on the daemon, including how to set up a FirewallD service and permanently add it to a zone.
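
Note that the firewall-cmd call shown earlier only changes the running configuration, so the rule is lost on a reload or reboot. A minimal sketch of making it persistent follows; the function name open_influx_port and the RUN variable are made up for illustration, and by default the commands are only printed rather than executed:

```shell
#!/usr/bin/env bash
# Sketch: make the InfluxDB port rule survive firewalld reloads and reboots.
# RUN defaults to "echo", so the commands are only printed (dry run).
# Set RUN=sudo to actually apply them.
open_influx_port() {
    local port="${1:-8086}"      # InfluxDB HTTP port used in this thread
    local zone="${2:-public}"    # zone used in the example above
    local run="${RUN:-echo}"

    # --permanent writes the rule to the saved configuration only...
    $run firewall-cmd --permanent --zone="$zone" --add-port="${port}/tcp"
    # ...so a reload is needed to bring it into the running configuration.
    $run firewall-cmd --reload
}

open_influx_port 8086 public
```

Running it as `RUN=sudo open_influx_port 8086 public` would apply the rule instead of printing it.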

If you’re going to expose your InfluxDB instance to the public, you should enable authentication and HTTPS.
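
In InfluxDB 1.x those two settings are the auth-enabled and https-enabled keys in the [http] section of influxdb.conf. As a rough way to see what a config file currently has, something like the following could be used. It is only a sketch: the function names are made up, and it does not check which section a key sits in:

```shell
#!/usr/bin/env bash
# Sketch: report whether an InfluxDB 1.x config file turns on auth and HTTPS.
# Commented-out lines (such as the packaged defaults) count as "off".
# Caveat: keys are matched anywhere in the file, not only under [http].
setting_enabled() {
    local key="$1" conf="$2"
    grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*true" "$conf"
}

check_http_security() {
    local conf="${1:-/etc/influxdb/influxdb.conf}"
    local key
    for key in auth-enabled https-enabled; do
        if setting_enabled "$key" "$conf"; then
            echo "$key: on"
        else
            echo "$key: off"
        fi
    done
}
```

Calling `check_http_security` with no argument reads /etc/influxdb/influxdb.conf, the path used in the unit file above.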

Hello,

Thank you for your answer noahcrowley.

I was on 1.5.2 when I tested it; I only wrote the post some time later, sorry.

It seems that updating to 1.6.1 did solve the issue: I just set the systemd unit file back to User=influxdb, and the port stayed open after a restart.

I’d like to mention, though, that firewalld and iptables are disabled in this case because it is an internal server that would only be accessed from outside through a VPN or a proxy.

Thank you very much for the answer,

Hello,

Turns out it still fails on master1:

[exp@vl2690icingam1 global]$ influx --version
InfluxDB shell version: 1.6.1
[exp@vl2690icingam1 global]$ ps aux | grep influx
exp 2889 0.0 0.0 112704 972 pts/1 S+ 17:24 0:00 grep --color=auto influx
influxdb 31232 0.0 0.1 68844 7444 ? Ssl 17:21 0:00 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
[exp@vl2690icingam1 global]$ ss -lntu | grep 8086
[exp@vl2690icingam1 global]$

Whereas if I change the unit file’s “User” key to root:

[exp@vl2690icingam1 global]$ sudo vim /usr/lib/systemd/system/influxdb.service
[sudo] password for exp:
[exp@vl2690icingam1 global]$ sudo systemctl daemon-reload
[exp@vl2690icingam1 global]$ sudo pcs resource restart InfluxDB
InfluxDB successfully restarted
[exp@vl2690icingam1 global]$ ps aux | grep influx
root 6426 17.6 8.8 7419700 344752 ? Ssl 17:26 0:06 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
exp 7101 0.0 0.0 112704 972 pts/1 R+ 17:26 0:00 grep --color=auto influx
[exp@vl2690icingam1 global]$ ss -lntu | grep 8086
tcp LISTEN 0 128 :::8086 :::*
[exp@vl2690icingam1 global]$

So I’m back to having the issue even on 1.6.1, though now it seems to work on master2:

[exp@vl2693icingam2 ~]$ ss -lntu | grep 8086
tcp LISTEN 0 128 :::8086 :::*
[exp@vl2693icingam2 ~]$ ps aux | grep influxdb
influxdb 14609 27.0 16.2 9611640 628480 ? Ssl 17:28 0:06 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
exp 16096 0.0 0.0 112708 972 pts/0 S+ 17:29 0:00 grep --color=auto influxdb
[exp@vl2693icingam2 ~]$

Are you configuring these servers manually, or using a configuration management tool like Chef or Ansible?

As I mentioned before, I was unable to reproduce this issue. The fact that it is happening on one of your servers but not the other seems to indicate that the problem is caused by some unique and unusual state introduced by manual configuration, which can be extremely difficult to diagnose.

Can you provide a set of steps to take to reproduce this issue, beginning with a clean install of CentOS?

I agree with you: the fact that it now only affects one host indicates that it must be a host-specific issue.

What’s strange is that master2 now works without any apparent intervention besides updates, and master1 had those too.

And yes, it would probably work if I did a fresh install of InfluxDB, but that won’t solve my issue.

Do you have any idea how I could debug this?

Is the data the same on both machines? How long are you waiting after the application starts up before you check the port status? Can you share the InfluxDB logs?
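
To take timing out of the picture, one option is to poll for the port for a while before concluding it never opens, and only then pull the logs. The sketch below assumes the default HTTP port 8086 and a unit named influxdb; the function names are invented for illustration:

```shell
#!/usr/bin/env bash
# Sketch: decide from `ss -lnt` (or `ss -lntu`) output on stdin whether a
# TCP port is in LISTEN state.
port_is_listening() {
    local port="$1"
    awk -v p=":${port}" '
        {
            # LISTEN is field 1 with `ss -lnt` and field 2 with `ss -lntu`
            # (which adds a Netid column); the local address is 3 fields later.
            for (i = 1; i <= 2; i++) {
                if ($i == "LISTEN") {
                    addr = $(i + 3)
                    if (substr(addr, length(addr) - length(p) + 1) == p) found = 1
                }
            }
        }
        END { exit !found }
    '
}

# Poll once per second, up to $2 times (default 30), until the port shows up.
wait_for_port() {
    local port="$1" tries="${2:-30}"
    while [ "$tries" -gt 0 ]; do
        ss -lnt | port_is_listening "$port" && return 0
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}

# Example usage (not run here):
#   sudo systemctl restart influxdb
#   wait_for_port 8086 60 || sudo journalctl -u influxdb --since "10 minutes ago" --no-pager
```

If the port never appears within the wait, the journalctl output from that window is the part of the log worth sharing.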

Debugging this issue really feels like a losing scenario, especially since it can’t be reproduced. Investigating configuration management might be a better use of your time, as it would give you a whole new set of capabilities and assurances to use in the future. If you had automation in place, you could have brought up a new instance and migrated the data.