This article discusses a common scenario in which on-premises servers fail to connect to AKS (Azure Kubernetes Service) with the error: Encrypted Alert. This error normally means that the TLS handshake could not be established. A common reason is that the NGINX-based package commonly used on AKS restricts the cipher suites it will negotiate.
Symptom
Connections from an on-premises Windows Server 2012 R2 machine to AKS fail with the error message:
Troubleshooting
1. Capture a network trace. You will find an Encrypted Alert during the TLS handshake:
2. Note that the handshake fails immediately after the Client Hello. This means the TLS handshake could not be established, most often because the server and the client have no cipher suite in common.
3. We then confirm that the client is an on-premises server running Windows Server 2012 R2, which supports only a small set of the cipher suites that AKS accepts. On the AKS side, NGINX normally handles the incoming connection; it is a commonly used package in AKS projects.
4. Given the information above, the goal is to give the client and the server a common cipher suite. The following approaches can help:
- Upgrade Windows Server to 2016 or newer. This is the recommended way to avoid the issue: you get more supported ciphers, and those ciphers are more secure.
- Use a TLS library other than Schannel, such as OpenSSL, on the client side. Modern browsers such as Chrome and Firefox bundle their own TLS libraries (BoringSSL, a fork of OpenSSL, and NSS, respectively) rather than relying on Schannel, so they offer a wider range of cipher suites at the browser level. However, a 2012 R2 server has IE installed by default, and IE uses Schannel, which can negotiate only the cipher suites supported at the system level. This change can be tricky to apply, and it is harder to roll out if you operate a large number of servers.
- Modify the settings on the AKS side to allow some older cipher suites to be negotiated in the TLS handshake. This article focuses mainly on this option.
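The reasoning behind "no common cipher suite" can be sketched in a few lines of Python. This is a simplified model, not the actual Schannel or NGINX logic: it assumes the server walks its own preference list and picks the first suite the client also offers. The function name `negotiate` and the three lists are illustrative assumptions, not the real default lists of either side.

```python
from typing import Optional, Sequence


def negotiate(client_offer: Sequence[str], server_enabled: Sequence[str]) -> Optional[str]:
    """Pick the first server-preferred suite that the client also offers.

    Returns None when the two lists do not overlap, which is the situation
    that ends the handshake right after the Client Hello.
    """
    offered = set(client_offer)
    for suite in server_enabled:
        if suite in offered:
            return suite
    return None


# Illustrative subsets only (assumptions, not authoritative default lists).
LEGACY_CLIENT = [
    "TLS_DHE_RSA_WITH_AES_128_GCM_SHA256",
    "TLS_RSA_WITH_AES_128_CBC_SHA",
]
SERVER_WITHOUT_DHE = [
    "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256",
    "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
]
# Same server once a DHE suite has been enabled.
SERVER_WITH_DHE = SERVER_WITHOUT_DHE + ["TLS_DHE_RSA_WITH_AES_128_GCM_SHA256"]

if __name__ == "__main__":
    print(negotiate(LEGACY_CLIENT, SERVER_WITHOUT_DHE))  # no overlap
    print(negotiate(LEGACY_CLIENT, SERVER_WITH_DHE))     # overlap exists
```

In this model, the third option in the list corresponds to moving the server from `SERVER_WITHOUT_DHE` to `SERVER_WITH_DHE`, while the first two options change the client's list instead.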
So, how do we do that?
1. First, we need to figure out which cipher suites both the client and the server support. The following docs can help:
Eventually, we chose to add this cipher suite: TLS_DHE_RSA_WITH_AES_128_GCM_SHA256. It is one of the strongest cipher suites supported by Windows Server 2012 R2.
2. The cipher name shows that this is a DHE-based cipher (DHE is a key exchange algorithm used in the TLS handshake; see https://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_key_exchange), and DHE-based ciphers are not enabled by default at the NGINX level.
3. Check this doc, and you will see the explanation:
"DHE-based cyphers will not be available until DH parameter is configured" (see "Custom DH parameters for perfect forward secrecy").
Now we have a clear action plan to move forward.
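One practical detail: the suite above is written in its IANA form, while the `ssl-ciphers` value in the ConfigMap later in this article uses OpenSSL names. The lookup below is a small illustrative sketch for translating between the two. It covers only the eight suites that appear in this article's ConfigMap, and the helper names `to_openssl_name` and `is_dhe` are made up for this example.

```python
# IANA name -> OpenSSL name, limited to the suites used in this article.
IANA_TO_OPENSSL = {
    "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256": "ECDHE-ECDSA-AES128-GCM-SHA256",
    "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256": "ECDHE-RSA-AES128-GCM-SHA256",
    "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384": "ECDHE-ECDSA-AES256-GCM-SHA384",
    "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384": "ECDHE-RSA-AES256-GCM-SHA384",
    "TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256": "ECDHE-ECDSA-CHACHA20-POLY1305",
    "TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256": "ECDHE-RSA-CHACHA20-POLY1305",
    "TLS_DHE_RSA_WITH_AES_128_GCM_SHA256": "DHE-RSA-AES128-GCM-SHA256",
    "TLS_DHE_RSA_WITH_AES_256_GCM_SHA384": "DHE-RSA-AES256-GCM-SHA384",
}


def to_openssl_name(iana_name: str) -> str:
    """Translate an IANA suite name; fail loudly for names not in the table."""
    try:
        return IANA_TO_OPENSSL[iana_name]
    except KeyError:
        raise ValueError(f"no mapping recorded for {iana_name!r}") from None


def is_dhe(openssl_name: str) -> bool:
    """True for finite-field DHE suites, which need custom DH parameters."""
    return openssl_name.startswith("DHE-")
```

A plain table is used instead of string manipulation because the naming rules are irregular; for example, the CHACHA20 suites drop the trailing hash name in their OpenSSL form.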
Action Plan
1. Follow the post linked below to generate custom DH parameters, which are required for DHE-based ciphers:
2. Add another Secret on the AKS side. I recommend creating a new YAML file with a name like ssl-dh-param.yaml. An example:
# First, get the dhparam: openssl dhparam 4096 | base64
# replace (token generated by openssl) with your dhparam
# replace (your namespace) with your namespace
# replace (your configmap) with your configmap
apiVersion: v1
data:
  # Doc details: https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/configmap/#ssl-ciphers
  # This is configured on the NGINX side.
  dhparam.pem: "(token generated by openssl)"
kind: Secret
type: Opaque
metadata:
  name: lb-dhparam
  namespace: (your namespace)
  labels:
    app.kubernetes.io/name: (your configmap)
    app.kubernetes.io/part-of: (your configmap)
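If you would rather generate this manifest than paste the value by hand, the Python sketch below builds an equivalent Secret from the PEM text produced by `openssl dhparam`. It is only an illustration: `build_dhparam_secret` is a hypothetical helper, the labels are omitted for brevity, and the PEM is base64-encoded onto a single line so that line-wrapped `base64` output cannot break the YAML value.

```python
import base64

PEM_HEADER = b"-----BEGIN DH PARAMETERS-----"


def build_dhparam_secret(pem: bytes, namespace: str, name: str = "lb-dhparam") -> str:
    """Render a Secret manifest that stores the DH parameters as dhparam.pem."""
    if not pem.lstrip().startswith(PEM_HEADER):
        raise ValueError("input does not look like 'openssl dhparam' PEM output")
    # b64encode never inserts line breaks, so the value stays on one line.
    encoded = base64.b64encode(pem).decode("ascii")
    return (
        "apiVersion: v1\n"
        "kind: Secret\n"
        "type: Opaque\n"
        "metadata:\n"
        f"  name: {name}\n"
        f"  namespace: {namespace}\n"
        "data:\n"
        f'  dhparam.pem: "{encoded}"\n'
    )
```

A possible use, assuming the parameters were saved to a local file named dhparam.pem, is to read that file in binary mode, pass its contents together with your namespace, and write the returned text to ssl-dh-param.yaml.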
Also modify the SSL cipher order in your ConfigMap. For example:
# cat configmap.yaml
# replace (your namespace) with your namespace
# replace (your configmap) with your configmap
apiVersion: v1
data:
  ssl-dh-param: "(your namespace)/lb-dhparam"
  ssl-ciphers: "ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384" # Put ECDHE-ECDSA-AES128-GCM-SHA256 in 1st order to prioritize.
kind: ConfigMap
metadata:
  name: (your configmap)
  namespace: (your namespace)
  labels:
    app.kubernetes.io/name: (your configmap)
    app.kubernetes.io/part-of: (your configmap)
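Because the two keys in this ConfigMap depend on each other, a quick consistency check before applying it can save a debugging round trip. The Python sketch below takes the `data` section as a plain dictionary and flags the two mistakes that would leave DHE ciphers unavailable. The function name `check_tls_settings` and the messages are assumptions for illustration, not part of any Kubernetes or NGINX tooling.

```python
from typing import Dict, List


def check_tls_settings(data: Dict[str, str]) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems: List[str] = []
    ciphers = [c for c in data.get("ssl-ciphers", "").split(":") if c]
    # "ECDHE-..." names do not start with "DHE-", so only finite-field DHE matches.
    dhe = [c for c in ciphers if c.startswith("DHE-")]
    dh_ref = data.get("ssl-dh-param", "")

    if dhe and not dh_ref:
        problems.append("DHE ciphers listed but ssl-dh-param is not set: " + ", ".join(dhe))
    if dh_ref:
        parts = dh_ref.split("/")
        if len(parts) != 2 or not all(parts):
            problems.append("ssl-dh-param should look like '<namespace>/<secret name>'")
    return problems
```

The check is intentionally narrow: it does not contact the cluster or confirm that the referenced Secret exists.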
3. Apply the YAML files, then use the command below to check whether the DH parameters are configured correctly:
kubectl exec (Pod Name) -- cat /etc/nginx/nginx.conf | grep ssl_dhparam
If the output shows an ssl_dhparam line with a value, the configuration has been applied.
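If you save the nginx.conf text to a file or variable, the same check can be made slightly stricter than a plain grep by ignoring commented-out lines. The following Python sketch is one way to do that. `find_directive` is a hypothetical helper, and the sample text, including the file path in it, is invented for the example rather than copied from a real ingress controller.

```python
import re
from typing import List


def find_directive(conf_text: str, name: str) -> List[str]:
    """Return the arguments of each active `name ...;` directive.

    Lines where the directive is preceded by '#' are not matched.
    """
    pattern = re.compile(r"^\s*" + re.escape(name) + r"\s+([^;]+);", re.MULTILINE)
    return [value.strip() for value in pattern.findall(conf_text)]


# Invented sample input; the path is a placeholder, not the real location.
SAMPLE_CONF = """
http {
    # ssl_dhparam /old/path.pem;
    ssl_ciphers 'ECDHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES128-GCM-SHA256';
    ssl_dhparam /hypothetical/path/dhparam.pem;
}
"""

if __name__ == "__main__":
    print(find_directive(SAMPLE_CONF, "ssl_dhparam"))
```

An empty list for `ssl_dhparam` would suggest that the ConfigMap change has not reached the running NGINX configuration yet.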
4. Then go back and use IE to test whether the connection now works. In our test, we confirmed that a Windows Server 2012 R2 VM can connect to AKS successfully.
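If no 2012 R2 machine is at hand, a rough approximation of the legacy client can be made from any machine with Python by restricting a TLS client to TLS 1.2 and the single DHE suite chosen earlier. This is a sketch under assumptions: it uses Python's standard `ssl` module (backed by the local OpenSSL build, not Schannel), so it only shows whether the server will negotiate that suite. The function names are made up for the example.

```python
import socket
import ssl
from typing import Optional, Tuple

LEGACY_CIPHER = "DHE-RSA-AES128-GCM-SHA256"


def build_legacy_like_context(cipher: str = LEGACY_CIPHER) -> ssl.SSLContext:
    """Client context limited to TLS 1.2 and a single cipher suite."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.maximum_version = ssl.TLSVersion.TLSv1_2
    # Raises ssl.SSLError if the local OpenSSL build does not offer the suite.
    ctx.set_ciphers(cipher)
    return ctx


def probe(host: str, port: int = 443, timeout: float = 10.0) -> Optional[Tuple[str, str, int]]:
    """Handshake with host and return (cipher name, protocol, secret bits).

    An ssl.SSLError here means the handshake failed, for example because
    the server still does not accept the DHE suite.
    """
    ctx = build_legacy_like_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.cipher()
```

Calling `probe` with the hostname of your own ingress should return a tuple whose first element is the DHE suite once the Secret and ConfigMap are in effect. The default context also verifies the server certificate, so an untrusted or mismatched certificate raises an error even when the cipher negotiation itself would succeed.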
Notes
1. Note that this issue occurs on VMs running versions earlier than Windows Server 2016, and the impact is not limited to AKS. Any Azure service can be affected if it no longer supports the older ciphers.
2. Cloud services have removed some previously supported ciphers because they are less secure than newer ones. The solution above may therefore reduce security at the cipher level, and upgrading the server version remains the recommended way to avoid this issue.
Please feel free to leave a comment below if you have any further questions; I would be glad to help. Thank you.