Automating Big Data Clusters (BDC) deployment with Azure Kubernetes Service (AKS) private cluster
Published Sep 03 2020 09:00 PM
Microsoft

One of the biggest challenges businesses face is how to integrate data from many disparate sources and turn that data into actionable insights. Big Data Clusters (BDC) is the right choice for Big Data Analytics solutions.

As a cloud-native, platform-agnostic, open data platform for analytics at any scale, orchestrated by Kubernetes, BDC works on Azure Kubernetes Service (AKS), a fully managed Kubernetes service on the Microsoft Azure cloud platform.

For security-critical customers who need a private environment, deploying BDC on an AKS private cluster is a good way to restrict the use of public IP addresses. Furthermore, you can use user-defined routes (UDR) to restrict egress traffic. You can do both with the automation scripts available in the SQL Server samples GitHub repo: private-aks.

 

Deploy AKS private cluster with automation scripts

Go to the GitHub repo to deploy an AKS private cluster from a Linux client or from WSL/WSL2. There are two bash scripts you can use to deploy an AKS private cluster:

Use deploy-private-aks.sh to provision a private AKS cluster with a private endpoint. To limit the use of public addresses as well as egress traffic, use deploy-private-aks-udr.sh to deploy the AKS private cluster and limit egress traffic with user-defined routes (UDR).
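The repository scripts remain the reference for what is actually provisioned. As a rough sketch of the difference between the two variants, the helper below only prints an `az aks create` command and does not run it. The function name `print_aks_create` and the resource names are hypothetical; the flags shown are standard Azure CLI options, but the exact set the scripts use may differ.

```bash
# Print (do not run) an `az aks create` command for a private cluster.
#   $1 = resource group, $2 = cluster name
#   $3 = "udr" to send egress through user-defined routes (optional)
#   $4 = resource ID of a subnet that already has a route table (needed with "udr")
print_aks_create() (
    cmd="az aks create --resource-group $1 --name $2"
    cmd="$cmd --load-balancer-sku standard --enable-private-cluster"
    if [ "${3:-}" = "udr" ]; then
        # userDefinedRouting expects a bring-your-own subnet with a route table.
        cmd="$cmd --outbound-type userDefinedRouting --vnet-subnet-id ${4:-}"
    fi
    printf '%s\n' "$cmd"
)

# Placeholder names, for illustration only.
print_aks_create my-resource-group my-private-aks
```

Passing `udr` and a subnet ID as the third and fourth arguments prints the variant that restricts egress.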

 

Here we take the more common case, where you deploy BDC on an AKS private cluster. After downloading the script to your client environment, run it with the following commands:

 

chmod +x deploy-private-aks.sh
sudo ./deploy-private-aks.sh
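Before running the script, it can help to confirm that the tools it depends on are installed. The following is a minimal sketch, assuming the script calls the Azure CLI (`az`); the `missing_tools` helper and the tool list are illustrative and not part of the repository.

```bash
# Print the name of every tool in the argument list that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed tool list; adjust it to what the script actually calls.
required="az"

# $required is left unquoted on purpose so each tool becomes its own argument.
missing="$(missing_tools $required)"
if [ -n "$missing" ]; then
    echo "Missing required tools: $missing" >&2
else
    echo "All required tools found."
fi
```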

 

Enter your Azure subscription ID, the resource group name, and the Azure region where you want to deploy your resources:

MelonyQ_0-1599148650093.png
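If you wrap the script in your own automation, a quick sanity check on these inputs can catch typos before any resources are created. This is a sketch only: the helper names are hypothetical, the subscription ID check just tests for the GUID shape, and the resource group check covers the ASCII subset of the Azure naming rules (1 to 90 characters, not ending in a period).

```bash
# Succeed when $1 looks like a GUID (8-4-4-4-12 hexadecimal digits).
is_guid() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}

# Succeed when $1 is a plausible resource group name (ASCII subset only).
is_valid_resource_group() {
    printf '%s\n' "$1" |
        grep -Eq '^[A-Za-z0-9_().-]{0,89}[A-Za-z0-9_()-]$'
}

# Placeholder values, for illustration only.
subscription_id="00000000-0000-0000-0000-000000000000"
resource_group="my-bdc-rg"

if is_guid "$subscription_id" && is_valid_resource_group "$resource_group"; then
    echo "Inputs look well-formed."
else
    echo "Check the subscription ID and resource group name." >&2
fi
```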

 

The deployment takes a few minutes. After it completes, you can find the deployed resources in the Azure portal.

Access to AKS private cluster

After you deploy a private AKS cluster, you need a VM from which to connect to it. There are multiple ways to manage your AKS private cluster, and you can find those at this link. Here we use the easiest option: provision a management VM that has all the required SQL Server 2019 big data tools installed and resides in the same VNet as your AKS private cluster, then connect to that VM to access the private AKS cluster, as follows:

MelonyQ_1-1599148650104.png
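From the management VM, the cluster's API server should be reachable at a private address inside the VNet. As a small sketch of how you might confirm that, the hypothetical helper below checks whether an IPv4 address falls in one of the RFC 1918 private ranges. It assumes the address is already well-formed; the commands in the comments are standard Azure CLI and kubectl calls with placeholder names.

```bash
# On the management VM, fetch credentials and confirm the nodes are reachable:
#   az aks get-credentials --resource-group <resource-group> --name <cluster-name>
#   kubectl get nodes

# Succeed when $1 is in an RFC 1918 private IPv4 range.
# Assumes $1 is already a well-formed dotted-quad address.
is_private_ipv4() {
    case "$1" in
        10.*.*.*) return 0 ;;
        192.168.*.*) return 0 ;;
        172.1[6-9].*.* | 172.2[0-9].*.* | 172.3[01].*.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Placeholder address, for illustration only.
if is_private_ipv4 "10.240.0.4"; then
    echo "10.240.0.4 is a private address"
fi
```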

 

Deploy BDC with AKS private cluster with automation script

You can download the script deploy-bdc.sh to deploy BDC without a public endpoint:

 

chmod +x deploy-bdc.sh
sudo ./deploy-bdc.sh

 

 

The script requires you to set the BDC admin username and password, and then it kicks off the BDC deployment:

MelonyQ_2-1599148650114.png
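For unattended runs, `azdata` can read the admin credentials from the `AZDATA_USERNAME` and `AZDATA_PASSWORD` environment variables. Whether deploy-bdc.sh honors values that are already set is an assumption you should verify against the script. The helper name `set_bdc_credentials` is hypothetical, and the length check is only a basic guard; SQL Server applies its own password complexity policy.

```bash
# Validate and export the BDC admin credentials for azdata.
#   $1 = admin username, $2 = admin password
set_bdc_credentials() {
    if [ -z "${1:-}" ] || [ -z "${2:-}" ]; then
        echo "Username and password must both be non-empty." >&2
        return 1
    fi
    if [ "${#2}" -lt 8 ]; then
        echo "Password must be at least 8 characters." >&2
        return 1
    fi
    export AZDATA_USERNAME="$1"
    export AZDATA_PASSWORD="$2"
}

# Interactive use in bash (illustrative prompt flow):
#   read -r -p "BDC admin username: " user
#   read -r -s -p "BDC admin password: " pass; echo
#   set_bdc_credentials "$user" "$pass"
```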

 

At the end of the deployment, the script lists all the BDC endpoints:

MelonyQ_3-1599148650124.png

 

Connect to BDC in AKS private cluster

Make sure all components of your BDC cluster show a healthy status:

azdata bdc status show

If all goes well, you’ll get this output:

MelonyQ_4-1599148650131.png
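Components can take a while to become healthy after deployment, so in automation you may want to poll rather than check once. The sketch below is a generic retry helper (the name `retry_until_ok` is hypothetical). The commented usage relies on `kubectl wait` as a rough proxy for cluster health and assumes the BDC namespace is `mssql-cluster`; adjust both to your deployment.

```bash
# Run a command repeatedly until it succeeds or the attempts run out.
#   $1 = maximum attempts, $2 = seconds between attempts, rest = command
retry_until_ok() (
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            exit 0
        fi
        # Sleep only if another attempt is coming.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    exit 1
)

# Illustrative usage (namespace is an assumption):
#   retry_until_ok 30 20 kubectl wait --for=condition=Ready pods --all \
#       -n mssql-cluster --timeout=10s
```

Once the helper returns success, `azdata bdc status show` gives the authoritative per-component view.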

 

You can use the SQL Server master instance endpoint of the cluster to connect to BDC with SQL Server Management Studio or Azure Data Studio, as shown here:

MelonyQ_5-1599148650142.png
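SQL Server client tools expect the server name as `host,port` (a comma, not a colon). As a small sketch, the hypothetical helper below builds that string from the endpoint's IP address and port. The default of 31433 is an assumption based on the usual BDC master endpoint port; use the value shown in your own endpoint list.

```bash
# Build a "host,port" server name for SSMS, Azure Data Studio, or sqlcmd.
#   $1 = endpoint IP or host name, $2 = port (assumed default: 31433)
sql_server_address() {
    printf '%s,%s\n' "$1" "${2:-31433}"
}

# Placeholder address, for illustration only.
sql_server_address 10.0.0.10

# The same value works on the command line, for example:
#   sqlcmd -S "$(sql_server_address 10.0.0.10)" -U <admin-username>
```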

 

 

Wrap up

 

As we saw at the start of this article, businesses are looking for a secure, portable way to create value from multiple sources of data. Running SQL Server Big Data Clusters (BDC) in an Azure Kubernetes Service (AKS) private cluster gives them exactly that. You've seen two variations of the scripts available in our repository, so you can pick the one that fits your network environment and security requirements. Before deploying in your environment, you can also customize the scripts for your specific requirements, such as IP address ranges or flags that add or remove AKS features when the cluster is created.
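If you do change the IP address ranges, a typo in a CIDR block usually surfaces only partway through the deployment. The sketch below is a hypothetical helper, not part of the repository scripts, that checks an IPv4 CIDR string for basic validity before you paste it into a script.

```bash
# Succeed when $1 is a syntactically valid IPv4 CIDR block, e.g. 10.0.0.0/8.
is_ipv4_cidr() (
    # Shape check: four dot-separated numbers, a slash, and a prefix of 0-32.
    printf '%s\n' "$1" |
        grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}/([0-9]|[12][0-9]|3[0-2])$' || exit 1
    # Range check: every octet must be at most 255.
    addr=${1%/*}
    IFS=.
    for octet in $addr; do
        [ "$octet" -le 255 ] || exit 1
    done
    exit 0
)

# Placeholder range, for illustration only.
if is_ipv4_cidr "10.0.0.0/8"; then
    echo "10.0.0.0/8 is a valid CIDR block"
fi
```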

 

 

 

 

Last update: Sep 03 2020 09:42 AM