Telegraf Kafka output SSL questions

#1

Howdy! I’m new to Telegraf and have a couple questions around configuring the Kafka output correctly so that data is encrypted over the network.

Our Kafka brokers are configured using a TLS cert from a trusted CA so the broker port that Telegraf will be connecting on should enable secure / encrypted traffic.

Is the main purpose of the SSL section in the Kafka output config client cert authentication?

## Optional SSL Config
# ssl_ca = "/etc/telegraf/ca.pem"
# ssl_cert = "/etc/telegraf/cert.pem"
# ssl_key = "/etc/telegraf/key.pem"
## Use SSL but skip chain & host verification
# insecure_skip_verify = true

We don’t want to use and manage certs for client auth but we must have SSL enabled for network data encryption.

The only way I can get Telegraf to connect to our secure Kafka port is by setting:

insecure_skip_verify = true

while leaving the rest of the SSL config options commented out. If I set ‘insecure_skip_verify’ to false it again fails to connect with the following errors in the log:

2017-12-21T03:07:55Z D! Attempting connection to output: kafka
2017-12-21T03:07:56Z E! Failed to connect to output kafka, retrying in 15s, error was 'kafka: client has run out of available brokers to talk to (Is your cluster reachable?)' 
2017-12-21T03:08:12Z E! kafka: client has run out of available brokers to talk to (Is your cluster reachable?)

Can someone explain why this is necessary? What exactly does the ‘insecure_skip_verify’ option do? I haven’t been able to find an explanation in any docs. If set to true, does that mean the cert isn’t being checked against a trusted CA? Our cert does match the domain names used for our Kafka brokers.

I do see this in the code:

Thanks for any help and clarification!


#2

Most of the settings are for client cert authentication, in particular the ssl_cert and ssl_key options. As a workaround, the ssl_ca option could be pointed to the correct root certificate to trigger the output to switch into TLS mode.

Could you open a new issue on the Telegraf GitHub page for enabling TLS without setting this manually?

The insecure_skip_verify option does indeed mean that the certificate is not being verified, so even though the data is being encrypted you would not notice a man in the middle. However, you should be able to connect to it for testing purposes, so there may still be a problem we need to debug.
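
To make the difference concrete, here is a rough analogy using Python's standard ssl module. It is not Telegraf's code (Telegraf is written in Go), and the helper name make_context and its parameters are made up for illustration. The assumption is that insecure_skip_verify behaves like the usual "skip verification" switch in TLS libraries: the connection is still encrypted, but neither the certificate chain nor the hostname is checked.

```python
import ssl


def make_context(ca_file=None, skip_verify=False):
    """Build a client TLS context loosely mirroring the ssl_ca and
    insecure_skip_verify options (hypothetical helper, not Telegraf code)."""
    # ca_file=None falls back to the system default trust store.
    ctx = ssl.create_default_context(cafile=ca_file)
    if skip_verify:
        # Order matters: hostname checking must be disabled before
        # certificate verification can be turned off.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


strict = make_context()
lax = make_context(skip_verify=True)

# strict: chain and hostname are both checked.
print(strict.verify_mode == ssl.CERT_REQUIRED, strict.check_hostname)
# lax: traffic is still encrypted, but any certificate is accepted.
print(lax.verify_mode == ssl.CERT_NONE, lax.check_hostname)
```

In the lax case a man in the middle could present any certificate and the client would accept it, which is why the option is only suitable for testing.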


#3

@daniel sorry I didn’t respond sooner. I did a test where I set the Kafka output SSL settings as follows:

## Optional SSL Config
ssl_ca = "/etc/pki/tls/cert.pem"
# ssl_cert = "/etc/telegraf/cert.pem"
# ssl_key = "/etc/telegraf/key.pem"
## Use SSL but skip chain & host verification
# insecure_skip_verify = true

What I am hoping this will do is enable SSL/TLS encrypted traffic using the CA-issued certificate on our Kafka brokers, and verify that cert by pointing the “ssl_ca” setting to the openssl tls-ca-bundle.pem:

[ec2-user@ip-10-240-39-5 telegraf]$ openssl version -d
OPENSSLDIR: "/etc/pki/tls"
[ec2-user@ip-10-240-39-5 telegraf]$ ls -lha /etc/pki/tls/
total 12K
drwxr-xr-x  5 root root  81 Dec 13 05:16 .
drwxr-xr-x 10 root root 116 Jan 22 13:58 ..
lrwxrwxrwx  1 root root  49 Dec 13 05:16 cert.pem -> /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
drwxr-xr-x  2 root root 117 Dec 13 05:16 certs
drwxr-xr-x  2 root root  74 Dec 13 05:16 misc
-rw-r--r--  1 root root 11K Nov 28 18:45 openssl.cnf
drwxr-xr-x  2 root root   6 Nov 28 18:49 private

Can you confirm whether this will do what I am hoping / expecting? When I start Telegraf I don’t get any errors, and I can confirm it is sending messages to Kafka, so I am assuming SSL/TLS encryption is enabled and that the cert is being verified. Can you recommend a way I can verify this?

Thanks!


#4

Yes, I believe this should work as you are expecting. If you want to verify you could replace your good ssl_ca cert with one that did not sign your Kafka certificate, and verify that there is a failure.
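
A similar check can be done outside Telegraf. The sketch below, again in Python and only as an illustration, attempts a TLS handshake against a broker port using a given CA file and reports whether verification succeeded. The function name check_broker_tls, the host name, and the port in the usage comment are placeholders, and this only exercises the TLS layer, not the Kafka protocol itself.

```python
import socket
import ssl


def check_broker_tls(host, port, ca_file=None, timeout=5.0):
    """Try a TLS handshake and report whether the server cert verified.

    Returns (ok, detail). ca_file=None uses the system default trust store.
    Hypothetical helper for illustration only.
    """
    ctx = ssl.create_default_context(cafile=ca_file)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                return True, "verified, negotiated " + str(tls.version())
    except ssl.SSLCertVerificationError as err:
        # Chain or hostname check failed: the CA file did not vouch for the cert.
        return False, "certificate verification failed: " + str(err)
    except OSError as err:
        # Covers refused connections, timeouts, and other handshake errors.
        return False, "connection or handshake failed: " + str(err)


# Example usage (placeholder host, port, and paths):
#   check_broker_tls("kafka.example.com", 9093, "/etc/pki/tls/cert.pem")
#   check_broker_tls("kafka.example.com", 9093, "/path/to/unrelated-ca.pem")
```

If the first call returns ok as True and the second returns a verification failure, that is reasonable evidence the broker cert is being validated against the intended CA bundle.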


#5

@daniel awesome!
Thanks for the input. I did test and verify everything is working as expected.
To solve this in the automation I developed, which installs and configures Telegraf on all our instances, I run the above openssl command to find where the OPENSSLDIR is on the filesystem and then point the Kafka output ssl_ca config to the cert.pem in that directory.
All my variables are defined as ENV variables in the /etc/default/telegraf file. I set the path to the CA cert there as well. Then in my config files I just reference those ENV variables.
Works like a charm. Would be pretty cool and probably easier for most Telegraf users if a boolean flag was added to just specify whether to enable SSL and have the internal telegraf kafka output code look up the openssl directory and point to the CA cert itself.
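
As a rough idea of what such a lookup could look like, here is a small Python sketch that asks the ssl module for its default CA bundle location. This is only an approximation of the `openssl version -d` step: the OpenSSL build that Python links against may use a different directory than the openssl CLI on the same host, and the function name find_default_ca_bundle is made up for illustration.

```python
import os
import ssl


def find_default_ca_bundle():
    """Return the path of the default CA bundle file, or None if not found.

    Hypothetical helper; relies on the OpenSSL build linked into Python,
    which may differ from the system openssl CLI.
    """
    paths = ssl.get_default_verify_paths()
    # cafile is only set when the file exists; openssl_cafile is the
    # compiled-in default path, so it is checked explicitly.
    for candidate in (paths.cafile, paths.openssl_cafile):
        if candidate and os.path.isfile(candidate):
            return candidate
    return None


bundle = find_default_ca_bundle()
print(bundle or "no default CA bundle file found")
```

The printed path could then be written into /etc/default/telegraf as the value of whatever variable the config references for ssl_ca.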

Thanks for the help and input!