Apache Kafka Cluster - TLS encryption on HDInsight
Published Mar 20 2024 04:22 PM
Microsoft

This blog walks through setting up Transport Layer Security (TLS) encryption for an HDInsight Apache Kafka cluster, covering traffic between the Apache Kafka brokers and the Apache Zookeeper nodes.

 

Prerequisites

1. Create Apache Kafka Cluster 

2. SSH Access to the cluster

 

This blog uses a self-signed certificate; the process remains the same for certificates issued by a Certificate Authority (CA).

 

TLS Setup for Apache Kafka Brokers

 

You can generate a separate TLS certificate for each broker (by domain name, IP address, or node name), a single Subject Alternative Name (SAN) certificate, or a wildcard certificate. The choice between individual, wildcard, and SAN certificates depends on how your cluster behaves.

 

If you plan to scale out the HDInsight Apache Kafka cluster, a wildcard or SAN certificate is less hassle: you do not need to generate a new certificate for each additional node at scale-out time (for a SAN certificate, provided the names of the future nodes are already listed in it).

 

 

Broker DNS names in an HDInsight Apache Kafka cluster follow the same pattern within a given cluster, wnX-<cluster name>.<unique id>.bx.internal.cloudapp.net, so you can generate either a wildcard or a SAN certificate for the cluster without worrying about scale-out.
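
Because the node names differ only in their prefix, the SAN list does not have to be typed by hand. The following is a minimal sketch, not part of the original procedure: the function name `build_san` and the values `mycluster` and `abc123` are hypothetical, and the node labels you pass in should be checked against the actual host names of your cluster.

```shell
# Sketch: build a SAN list from node labels instead of typing each DNS name.
# "mycluster" and "abc123" are made-up placeholders for <cluster name> and <unique id>.
build_san() {
  cluster=$1
  unique_id=$2
  shift 2
  san=""
  for node in "$@"; do
    # Append "DNS:<node>-<cluster>.<unique id>.bx.internal.cloudapp.net",
    # adding a comma only between entries.
    san="${san:+$san,}DNS:${node}-${cluster}.${unique_id}.bx.internal.cloudapp.net"
  done
  printf '%s\n' "$san"
}

# Extra worker labels (for example wn3 wn4) can be listed ahead of a scale-out.
build_san mycluster abc123 wn0 wn1 wn2 zk0 zk1 zk2
```

The printed value can be used wherever a SAN list is required, for example as `-ext "SAN=$(build_san ...)"` with keytool or as `subjectAltName=$(build_san ...)` in an openssl extension file.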

Before we jump into the certificate generation process, let's go over a few points:

  1. Certificates can be generated outside the cluster nodes (for example, in a separate build pipeline) and then made available to the cluster (head node, brokers, and Zookeeper nodes), or they can be generated on the head node. This blog assumes you are generating a self-signed SAN certificate on the head node.
  2. Self-signed vs. CA certificates: self-signed certificates are public key certificates that are not issued by a certificate authority (CA). They are easy to create and cost nothing, but they do not provide any trust value. A CA-issued certificate is recommended for production workloads.

Steps to generate a SAN certificate for the Kafka and Zookeeper nodes:

These steps assume a self-signed certificate generated on the cluster head node.

 

1. SSH to the head node, or work from your local development environment.

2. Create a new directory and set the NAME variable used by the later commands

 

# Create a new directory 'ssl' and change into it
mkdir ssl
cd ssl
NAME=kafka

 

3. Create the ca-cert and ca-key files for the self-signed certificate process.

 

openssl req -new -newkey rsa:4096 -days 365 -x509 -subj "/C=US/ST=WA/L=Redmond/O=MS/OU=DATA/CN=Root-CA" -keyout ca-key -out ca-cert -nodes

 

4. Add the CA's public certificate to the truststore

 

keytool -keystore $NAME.server.truststore.jks -storepass confidential -import -alias ca-root -file ca-cert -noprompt

 

5. Create a keystore and populate it with a new private key and certificate. If you use autoscale, you can include the DNS names of additional worker nodes in the certificate ahead of time.

 

keytool -keystore $NAME.server.keystore.jks -storepass confidential -alias $NAME -validity 365 -genkey -keypass confidential -dname "CN=$NAME,OU=DATA,O=MS,L=Redmond,ST=WA,C=US" -ext "SAN=DNS:wn0-<cluster name>.<unique id>.bx.internal.cloudapp.net,DNS:wn1-<cluster name>.<unique id>.bx.internal.cloudapp.net,DNS:wn2-<cluster name>.<unique id>.bx.internal.cloudapp.net,DNS:zk0-<cluster name>.<unique id>.bx.internal.cloudapp.net,DNS:zk1-<cluster name>.<unique id>.bx.internal.cloudapp.net,DNS:zk2-<cluster name>.<unique id>.bx.internal.cloudapp.net" -keyalg RSA -keysize 4096 -storetype pkcs12

 

6. Create a certificate signing request (CSR).

 

keytool -keystore $NAME.server.keystore.jks -storepass confidential -alias $NAME -certreq -file  $NAME-cert-file -keyalg RSA -keysize 4096

 

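7. Sign the certificate signing request (CSR) using the CA private key and root CA certificate created in step 3. The signed certificate is written to $NAME.jks (despite the extension, this file is a PEM-encoded certificate, not a Java keystore).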

 

openssl x509 -req -CA ca-cert -CAkey ca-key -in  $NAME-cert-file -out $NAME.jks -days 365 -CAcreateserial -extensions SAN -extfile <(printf "\n[SAN]\nsubjectAltName=DNS:wn0-<cluster name>.<unique id>.bx.internal.cloudapp.net,DNS:wn1-<cluster name>.<unique id>.bx.internal.cloudapp.net,DNS:wn2-<cluster name>.<unique id>.bx.internal.cloudapp.net,DNS:zk0-<cluster name>.<unique id>.bx.internal.cloudapp.net,DNS:zk1-<cluster name>.<unique id>.bx.internal.cloudapp.net,DNS:zk2-<cluster name>.<unique id>.bx.internal.cloudapp.net")

 

8. Add the CA's public certificate to the keystore.

 

keytool -keystore $NAME.server.keystore.jks -storepass confidential -import -alias ca-root -file ca-cert -noprompt

 

9. Add the signed certificate (from step 7) to the keystore

 

keytool -keystore $NAME.server.keystore.jks -storepass confidential -import -alias $NAME -file $NAME.jks
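
Before distributing the stores, it can be worth confirming that the signed certificate really carries the expected SAN entries. The helper below is a hedged sketch rather than part of the original steps: `san_entries` is a hypothetical name, and it assumes the usual `openssl x509 -text` layout in which the SAN values appear on the line after the "Subject Alternative Name" heading.

```shell
# Sketch: print one SAN entry per line from `openssl x509 -text` output on stdin.
san_entries() {
  awk '/Subject Alternative Name/ {
         if ((getline line) > 0) {
           gsub(/^[ \t]+/, "", line)   # drop the leading indentation
           gsub(/, */, "\n", line)     # split the comma-separated entries
           print line
         }
       }'
}

# Intended use on the head node, in the ssl directory (not executed here):
#   openssl x509 -in "$NAME.jks" -noout -text | san_entries
#   keytool -list -v -keystore "$NAME.server.keystore.jks" -storepass confidential
```

The `keytool -list -v` output uses a different layout, so `san_entries` applies only to the openssl output; the keytool listing is useful for checking that both the `ca-root` and `$NAME` aliases are present.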

 

SCP truststore and keystore to worker and zookeeper nodes 

You can use sshpass or any other automation script to copy the truststore and keystore to each Kafka and Zookeeper node. They must be copied to the same location on every node, for example `/home/sshuser/ssl`. With autoscale, you can use a persisted script action to copy the truststore and keystore to new nodes.
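
One possible shape for such a copy script is sketched below. It is an illustration under assumptions: the helper names `run` and `copy_stores` are hypothetical, the SSH user is assumed to be `sshuser` (inferred from the example path), and key-based SSH access from the head node is assumed. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```shell
# Sketch: copy the keystore and truststore to a list of nodes.
# DRY_RUN=1 (the default here) prints each command instead of running it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

copy_stores() {
  dest_dir=$1
  shift
  for host in "$@"; do
    # Create the target directory, then copy both stores into it.
    run ssh "sshuser@$host" mkdir -p "$dest_dir"
    run scp kafka.server.keystore.jks kafka.server.truststore.jks "sshuser@$host:$dest_dir/"
  done
}
```

A call would pass the destination directory followed by the fully qualified names of the worker and Zookeeper nodes, for example `copy_stores /home/sshuser/ssl <node FQDN> <node FQDN>`.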

Update Zookeeper configuration to use TLS and restart zookeepers

Modify the Zookeeper-related configuration using Ambari. To complete the configuration change, follow these steps:

  1. Sign in to the Azure portal and select your Azure HDInsight Apache Kafka cluster.

  2. Go to the Ambari UI by clicking Ambari home under Cluster dashboards.

  3. Under Zookeeper -> Advanced zookeeper-env, add the following lines before `{% if security_enabled %}`:
    export SERVER_JVMFLAGS=" $SERVER_JVMFLAGS -Dzookeeper.serverCnxnFactory=org.apache.zookeeper.server.NettyServerCnxnFactory -Dzookeeper.ssl.keyStore.location=/home/sshuser/ssl/kafka.server.keystore.jks -Dzookeeper.ssl.keyStore.password=confidential -Dzookeeper.ssl.trustStore.location=/home/sshuser/ssl/kafka.server.truststore.jks -Dzookeeper.ssl.trustStore.password=confidential" 
    
    export CLIENT_JVMFLAGS="$CLIENT_JVMFLAGS -Dzookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty -Dzookeeper.client.secure=true -Dzookeeper.ssl.keyStore.location=/home/sshuser/ssl/kafka.server.keystore.jks -Dzookeeper.ssl.keyStore.password=confidential -Dzookeeper.ssl.trustStore.location=/home/sshuser/ssl/kafka.server.truststore.jks -Dzookeeper.ssl.trustStore.password=confidential"
  4. Under Custom zoo.cfg, add a new property "secureClientPort" with the value "2281".
  5. Select "Restart All Affected" to restart the Zookeeper nodes.

Update Kafka configuration to use TLS and restart brokers

You have now set up each Kafka broker with a keystore and truststore, and imported the correct certificates. Next, modify related Kafka configuration properties using Ambari and then restart the Kafka brokers. To complete the configuration modification, do the following steps:

  1. Sign in to the Azure portal and select your Azure HDInsight Apache Kafka cluster.

  2. Go to the Ambari UI by clicking Ambari home under Cluster dashboards.

  3. Under Kafka -> Kafka Broker:

    1. Set the listeners property to PLAINTEXT://localhost:9092,SSL://localhost:9093
    2. Update zookeeper.connect to use port 2281 for the Zookeeper nodes

  4. Under Advanced kafka-broker:
    1. Set the security.inter.broker.protocol property to SSL
    2. Set ssl.keystore.location and ssl.truststore.location to the full paths of your keystore and truststore.
    3. Set ssl.keystore.password and ssl.truststore.password to the passwords set for the keystore and truststore. In this example, confidential
    4. Set ssl.key.password to the password set for the private key in the keystore. In this example, confidential
  5. Under Advanced kafka-env -> kafka-env template, add the following lines at the end:

 

export ZK_CLIENT_JVMFLAGS="-Dzookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty -Dzookeeper.client.secure=true -Dzookeeper.ssl.keyStore.location=/home/sshuser/ssl/kafka.server.keystore.jks -Dzookeeper.ssl.keyStore.password=confidential -Dzookeeper.ssl.trustStore.location=/home/sshuser/ssl/kafka.server.truststore.jks -Dzookeeper.ssl.trustStore.password=confidential"
export EXTRA_ARGS="$EXTRA_ARGS $ZK_CLIENT_JVMFLAGS"

  6. Restart all Kafka brokers.
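
For review before entering values in Ambari, the broker properties from steps 3 and 4 can be printed in one place. This is only a reference sketch: `kafka_tls_settings` is a hypothetical helper, the default directory and password are the example values used in this blog, and the key password is assumed to match the store password as in the keytool step. The values are still applied through Ambari, and zookeeper.connect is left out because its host names are cluster specific.

```shell
# Sketch: print the TLS-related broker properties for review.
# Arguments (both optional): directory holding the stores, store/key password.
kafka_tls_settings() {
  ssl_dir=${1:-/home/sshuser/ssl}
  store_pass=${2:-confidential}
  cat <<EOF
listeners=PLAINTEXT://localhost:9092,SSL://localhost:9093
security.inter.broker.protocol=SSL
ssl.keystore.location=$ssl_dir/kafka.server.keystore.jks
ssl.keystore.password=$store_pass
ssl.truststore.location=$ssl_dir/kafka.server.truststore.jks
ssl.truststore.password=$store_pass
ssl.key.password=$store_pass
EOF
}

kafka_tls_settings
```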

 

 

Validate TLS Setup

Zookeeper Connection

1. SSH to a Zookeeper node from the head node

2. Connect to Zookeeper using the Zookeeper CLI:

 

/usr/hdp/current/zookeeper-client/bin/zkCli.sh -server zk3-<cluster name>.<unique id>.bx.internal.cloudapp.net:2281
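
As an additional, optional check, the certificate presented on a secure port can be inspected with `openssl s_client`. The sketch below is an assumption-laden illustration: `tls_check` is a hypothetical helper, it verifies the certificate chain against the given CA file but does not check host names, and a server that requires a client certificate may still reject the connection after presenting its own certificate, so treat the output as a hint rather than proof.

```shell
# Sketch: show the subject, issuer, and verification result of the certificate
# served on <host>:<port>, verified against the given CA certificate file.
tls_check() {
  if [ "$#" -ne 3 ]; then
    echo "usage: tls_check <host> <port> <ca-cert file>" >&2
    return 2
  fi
  # </dev/null closes stdin so s_client exits once the handshake is done.
  openssl s_client -connect "$1:$2" -CAfile "$3" -verify_return_error </dev/null 2>&1 |
    grep -E 'Verify return code|subject=|issuer='
}
```

For example, `tls_check <Zookeeper node FQDN> 2281 ca-cert` targets the Zookeeper secure port, and the same call with port 9093 targets a Kafka broker's SSL listener.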

 

 

Test Kafka Connection

1. Copy the ca-cert file to the client machine (for example, the passive head node hn1)

2. Import the CA certificate to the truststore.

 

keytool -keystore kafka.client.truststore.jks -alias CARoot -import -file ca-cert -storepass "MyClientPassword123" -keypass "MyClientPassword123" -noprompt

 

3. Create the file client-ssl-auth.properties on the client machine. It should contain the following lines:

 

security.protocol=SSL
ssl.truststore.location=/home/sshuser/ssl/kafka.client.truststore.jks
ssl.truststore.password=MyClientPassword123

 

4. Create Kafka Topic

 

/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --bootstrap-server <Kafka Broker FQDN>:9093 --command-config client-ssl-auth.properties --create --topic topic1 --partitions 2 --replication-factor 2

 

5. Run Kafka producer

 

/usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list <Kafka Broker FQDN>:9093 --topic topic1 --producer.config client-ssl-auth.properties

 

6. Run Kafka consumer

 

/usr/hdp/current/kafka-broker/bin/kafka-console-consumer.sh --bootstrap-server <Kafka Broker FQDN>:9093 --topic topic1 --consumer.config client-ssl-auth.properties --from-beginning

 

References

 

1. Self-signed certificate - Wikipedia

2. Apache Kafka TLS encryption & authentication - Azure HDInsight | Microsoft Learn

Last update: Mar 19 2024 07:24 AM