SQL Server Big Data Clusters (BDC) is a new capability brought to market as part of the SQL Server 2019 release. BDC extends SQL Server’s analytical capabilities beyond in-database processing of transactional and analytical workloads by uniting the SQL engine with Apache Spark and Apache Hadoop to create a single, secure, and unified data platform. BDC runs exclusively on Linux containers orchestrated by Kubernetes, and can be deployed on multiple cloud providers or on-premises.
The latest cumulative update (CU8) for SQL Server 2019 BDC includes several fixes and optimizations, and adds two main capabilities for SQL Server BDC:
- Encryption at Rest experiences for SQL Server and HDFS
- Oracle Proxy Authentication support for Data Virtualization
We've also published a number of documentation articles to help you plan and deploy more advanced Active Directory enabled BDC scenarios on Azure Kubernetes Service (AKS).
This announcement blog highlights some of the major improvements, provides additional context to better understand the design behind these capabilities, and points you to relevant resources to learn more and get started.
Encryption at Rest on SQL Server Big Data Clusters
Starting with SQL Server Big Data Clusters CU8, a comprehensive encryption at rest feature set is available to provide application-level encryption for all data stored in the platform. This application-level encryption capability allows administrators to securely encrypt data on both the SQL Server and HDFS services of BDC to comply with enterprise-grade requirements. If your organization needs to enable encryption at rest, read all concepts and guidelines in our new documentation set for the feature.
Oracle Proxy Authentication support
As a modern data platform, BDC has long supported Oracle as one of its Data Virtualization sources. To support more enterprise-grade, compliance-driven scenarios, we've enabled Oracle Proxy Authentication support to provide fine-grained access control. A proxy user connects to the Oracle database using its own credentials and impersonates another user in the database. To learn how to enable this feature in your Oracle connection scenarios, read the updated documentation article here.
A proxy user can be configured to have limited access compared to the user being impersonated. For example, a proxy user can be allowed to connect using only a specific database role of the user being impersonated. The identity of the user connecting to the Oracle database through a proxy user is preserved in the connection, even if multiple users are connecting using proxy authentication. This enables Oracle to enforce access control and to audit actions taken on behalf of the actual user.
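The relationship between the proxy user, the impersonated user, and the restricted roles can be sketched with a small toy model. This is only an illustration of the concept and is not BDC or Oracle driver code: the class names, the function names, and the accounts `bdc_proxy` and `alice` are all hypothetical. The one Oracle-specific detail it borrows is the bracketed `proxy[target]` user name form that Oracle clients accept for proxy connections.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProxyGrant:
    """Hypothetical record: proxy_user may connect on behalf of target_user."""
    proxy_user: str
    target_user: str
    # Roles of the target user that the proxy may use; empty means all of them.
    allowed_roles: frozenset = frozenset()


@dataclass(frozen=True)
class ProxySession:
    authenticated_as: str  # account whose credentials were presented
    effective_user: str    # impersonated user, kept for access control and auditing
    roles: frozenset


def proxy_connect_name(proxy_user: str, target_user: str) -> str:
    """Format the bracketed 'proxy[target]' user name for a proxy connection."""
    return f"{proxy_user}[{target_user}]"


def open_proxy_session(grant, proxy_user, target_user, target_roles):
    """Return a session for target_user, limited to the roles the grant allows."""
    if (grant.proxy_user, grant.target_user) != (proxy_user, target_user):
        raise PermissionError(f"{proxy_user} may not connect as {target_user}")
    roles = frozenset(target_roles)
    if grant.allowed_roles:
        roles = roles & grant.allowed_roles
    return ProxySession(proxy_user, target_user, roles)


# Assumed example: alice holds two roles, but the proxy may only use one of them.
grant = ProxyGrant("bdc_proxy", "alice", frozenset({"sales_reader"}))
session = open_proxy_session(grant, "bdc_proxy", "alice", {"sales_reader", "hr_admin"})

print(proxy_connect_name("bdc_proxy", "alice"))       # bdc_proxy[alice]
print(session.effective_user, sorted(session.roles))  # alice ['sales_reader']
```

In this sketch the session records both the authenticating account and the effective user, which mirrors how the impersonated identity stays visible for auditing while the proxy's access remains narrower than that user's full set of roles.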
Azure Data Studio Operational Notebooks for SQL Server BDC CU8
Along with Cumulative Update 8, we are also releasing an updated set of Operational Notebooks for Azure Data Studio. This update contains a new section of notebooks for managing Encryption at Rest, adds endpoint certificate rotation, and includes many fixes.
To get the latest version of the notebooks, either install the latest Azure Data Studio or use the "Add Remote Jupyter Book" feature as highlighted below. Using the Command Palette (Shift + Cmd + P), find and execute the "Jupyter Books: Add Remote Jupyter Book" command.
Fill out the information as displayed in the screenshot below, then click the "Add" button at the bottom of the pane.
Active Directory enabled SQL Server BDC on Azure Kubernetes Service (AKS)
SQL Server Big Data Clusters have always supported an Active Directory (AD) enabled deployment mode for Identity and Access Management (IAM). Yet IAM on Azure Kubernetes Service (AKS) has been challenging because industry-standard protocols such as OAuth 2.0 and OpenID Connect, which are widely supported by the Microsoft identity platform, are not supported on SQL Server.
We are pleased to announce that you can now deploy a big data cluster (BDC) in AD mode on Azure Kubernetes Service (AKS). To learn how to plan Active Directory integration for BDC on AKS and how to deploy SQL Server Big Data Clusters in AD mode on Azure Kubernetes Service (AKS) step by step, follow the links to our documentation page.
Private BDC cluster with further egress traffic restriction using user-defined routes (UDRs) on Azure Kubernetes Service (AKS)
To complement the above platform security enhancements regarding deployment of BDC in Active Directory mode on AKS, we are pleased to announce that we have added support for deploying BDC in an Azure Kubernetes Service (AKS) private cluster, for both AD and non-AD clusters. This ensures that network traffic between the API server and the node pools remains on the private network only. In an AKS private cluster, the control plane or API server has internal IP addresses. You can find how to deploy BDC in an Azure Kubernetes Service (AKS) private cluster step by step here.
Leveraging this configuration helps customers restrict the use of public IP addresses in enterprise networking environments. Furthermore, to restrict the additional hops required for egress traffic, we also provide guidance on restricting egress traffic of Big Data Clusters (BDC) in an Azure Kubernetes Service (AKS) private cluster; please refer to this article on our documentation page. To learn more about how to manage an Azure Kubernetes Service (AKS) private cluster with SQL Server Big Data Clusters (BDC) deployed in Azure, see here.
SQL Server BDC team hears your feedback
If you would like to help make BDC an even better analytics platform, please share any recommendations or report issues through our feedback page. The SQL Server engineering team thoroughly reviews the reported suggestions. They are valuable input that we consider when planning and prioritizing the next set of improvements. We are committed to ensuring that SQL Server enhancements are based on customer experiences, so we build robust solutions that meet real production requirements in terms of functionality, security, scalability, and performance.
Ready to learn more?
With SQL Server 2019 CU8 updates, BDC continues to simplify the security, deployment, and management of your key data workloads.
Check out the SQL Server CU8 release notes for BDC to learn more about all the improvements available with the latest update. For a technical deep-dive on Big Data Clusters, read the documentation and visit our GitHub repository.
To get started using Encryption at Rest on BDC, follow the instructions on our documentation page.