If you are planning to move to a hybrid cloud or are already using one, then like many enterprises you are dealing with the complexity of running applications and data across a diverse infrastructure: on-premises data centers, hosting environments, branch offices, and multi-cloud environments where two or more clouds are in use. Azure Arc enables you to extend the Azure control plane to on-premises datacenters, edge locations, and any public cloud so that you can manage these resources and the infrastructure that runs on them at scale. Azure Arc-enabled data services make it easier to run Azure data services such as Azure SQL Managed Instance (SQL MI) and PostgreSQL, with other Arc-enabled data services to follow, on this Arc-enabled hybrid infrastructure. They give you centralized visibility, security and governance across your resources and locations while you run your databases without having to change developer tools.
In July 2021, Microsoft released Azure Arc-enabled data services with production support for indirectly connected mode. This gave you isolation and control over the data that was uploaded to Azure. But in this mode the Azure portal remains a read-only view. You can see the inventory of SQL managed instances and Postgres Hyperscale server groups that you have deployed and the details about them, but you cannot act on them in the Azure portal.
At Ignite in November 2021, Microsoft announced that directly connected mode for Azure Arc-enabled data services is generally available. Now, you can use all the supported Azure services such as updates, Azure Monitor, Azure Cost Analytics and more with your Arc-enabled data services. In this blog, we will explore the directly connected mode in greater depth to understand its benefits, requirements, inner workings and the important considerations when deploying it in your cloud. Directly connected mode is one of the most important and most awaited features of Azure Arc-enabled data services, and it will unlock many solutions for modernizing your data centers, your data services and your stateful applications.
There are important benefits to using Arc-enabled data services, such as SQL MI and PostgreSQL, when these services run outside Azure but are directly connected to it, in three broad deployments.
In these deployments, you need the flexibility to run machine learning models and/or advanced analytics closer to the users or to where the data resides. There may be data compliance and data sovereignty requirements that prevent you from moving your databases to the cloud, yet you still need to observe and manage their life cycle centrally. Because billing and inventory data flows continuously to Azure, Arc gives you an active inventory of all the data services you manage across all clouds. You can operate on all the data services using ARM APIs and familiar Azure tools like Azure CLI, Azure portal or Azure Data Studio, which makes it simpler to integrate them into your existing toolset. Azure services that you already use with your public cloud based services, like Azure Monitor and Azure role-based access control, with many more to come, can now be used with on-premises Arc-enabled data services.
Figure 1. Azure resource group with all resources across hybrid cloud in Azure portal
You may be wondering how features differ between the two modes, directly connected and indirectly connected. Without direct connectivity, you get a central, read-only inventory in the Azure portal, but you cannot fully benefit from Azure services that draw on the capacity, scalability and elasticity of Azure.
Many of the core capabilities, such as provisioning/deprovisioning, scale up/down, point-in-time restore, cloud-based billing, built-in monitoring and built-in logging, are available regardless of the mode in which the Azure Arc data controller is deployed. A directly connected Azure Arc data controller enables the core capabilities automatically and, by nature of its Azure integration, additional capabilities such as:
Directly connected mode can be configured from the Azure portal for an Arc-enabled data services deployment on any cloud. As with indirectly connected mode, directly connected mode requires a CNCF-conformant Kubernetes cluster to run data services, whether it is hosted on physical, virtual or public cloud infrastructure. Direct connectivity is to the Azure cloud, so you will need an Azure subscription. You then install the required tools: Azure CLI and the extensions k8s-extension, connectedk8s, k8s-configuration and customlocation. Next, you connect the host cluster to Azure by deploying Azure Arc-enabled Kubernetes. Azure Arc creates a secure connection from the Kubernetes cluster to Azure using one of a range of options such as SSL, VPN or ExpressRoute. With the prerequisites in place, you can configure directly connected mode in a few steps using Azure CLI, Azure portal, or ARM APIs.
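The tool preparation described above can be sketched as a small Python helper that only assembles the Azure CLI commands and does not run them. This is a minimal sketch, not an official procedure: the function name and the cluster and resource group values are hypothetical placeholders, and `customlocation` is the Azure CLI extension name for custom locations.

```python
from typing import List

# Azure CLI extensions named in the prerequisites above.
REQUIRED_EXTENSIONS = [
    "k8s-extension",
    "connectedk8s",
    "k8s-configuration",
    "customlocation",
]


def build_prerequisite_commands(cluster_name: str, resource_group: str) -> List[List[str]]:
    """Return the az commands, as argument lists, that prepare a cluster.

    Nothing is executed here; the caller decides whether to run the
    commands (for example with subprocess.run) or just review them.
    """
    # One "az extension add" per required extension.
    commands = [["az", "extension", "add", "--name", ext] for ext in REQUIRED_EXTENSIONS]
    # Connect the host cluster to Azure through Azure Arc-enabled Kubernetes.
    commands.append(
        [
            "az", "connectedk8s", "connect",
            "--name", cluster_name,
            "--resource-group", resource_group,
        ]
    )
    return commands


if __name__ == "__main__":
    # "my-arc-cluster" and "my-arc-rg" are placeholder names for illustration.
    for cmd in build_prerequisite_commands("my-arc-cluster", "my-arc-rg"):
        print(" ".join(cmd))
```

Running the script prints the commands for review. Creating the custom location and the data controller would follow these steps and is left to the product documentation.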
Putting this all together, the Azure Arc functional architecture in directly connected mode is shown in Figure 2 below.
Figure 2. Azure Arc architecture of directly connected cluster
Azure Arc-enabled services can run on any Azure Arc-enabled infrastructure. In this case, the infrastructure required to run data services is Kubernetes. Kubernetes requires underlying hardware that provides compute, network and persistent storage. At this time, we recommend using local storage for some functions like automatic backups. On the Azure side, Azure Arc-enabled data services have Azure Resource Manager (ARM) resource providers that integrate with ARM APIs.
Figure 3. Azure Arc-enabled data services architecture
Figure 3 shows the Azure Arc-enabled data services architecture by detailing the components in the target Arc-enabled Kubernetes cluster. Here, the data controller interfaces with both the ARM APIs and the Kubernetes API server to deliver all management services such as lifecycle management, backup/restore, point-in-time restore and any others supported by the underlying DB engines running the SQL MI and Postgres instances. This allows you to use not only Azure tools but also native Kubernetes tools.
Using the Arc-enabled Kubernetes communication channel, the data controller sends billing, log, usage and metrics data to Azure when configured to do so. The same secure channel is used to bring down policy, actions and role-based information. The data controller also connects to the Microsoft registry to pull down images and updates as scheduled.
The SQL managed instances and Postgres instances are hosted inside the cluster in the custom location. This directly connected architecture enables continuous updates, monitoring and usage tracking, which allows you to use Azure services like Azure Monitor. The architecture is built to be extensible so that more Azure cloud capabilities, such as other data services and related services, can be brought to the edge in future, enabling newer scenarios that can accelerate your data modernization.
We may all agree on the benefits of a directly connected hybrid data cloud spanning public cloud, on-premises and the edge. But it may raise concerns about a free flow of data that could violate data compliance, data sovereignty and data governance requirements. Azure Arc has been built with enterprise scale in mind. Though directly connected mode connects all the host clusters to Azure, the data stored in the data services, such as the SQL MI and Postgres databases, remains in the cluster at its location. Only metadata about the data services, such as SQL MI instances, is pushed to Azure. The usage information is essential for billing and needs to be uploaded, but uploads of logs and other metrics can be disabled. If you have data sovereignty constraints, you can also configure uploads to be constrained to an Azure region of your choice. For a detailed list of the data sent to Azure, please refer to the Azure Arc-enabled data services data collection and reporting policy.
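The upload rule described above can be illustrated with a short Python model: usage data always goes to Azure because billing depends on it, while logs and metrics are optional. This is only a sketch of the rule as stated in this post. The class and field names are hypothetical and do not correspond to an actual Azure Arc API or configuration schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class UploadSettings:
    """Hypothetical model of which telemetry categories leave the cluster."""

    upload_logs: bool
    upload_metrics: bool

    def categories_sent_to_azure(self) -> List[str]:
        # Usage information is required for billing, so it is always sent.
        sent = ["usage"]
        # Logs and metrics are optional and can each be switched off.
        if self.upload_logs:
            sent.append("logs")
        if self.upload_metrics:
            sent.append("metrics")
        # Database contents are never part of this list: they stay in the cluster.
        return sent


if __name__ == "__main__":
    restrictive = UploadSettings(upload_logs=False, upload_metrics=False)
    print(restrictive.categories_sent_to_azure())
```

With both optional uploads disabled, only the usage category remains, which matches the minimum that directly connected mode needs for billing.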
All connections are over secure transports. The port configuration required to set up your firewall follows Azure best practices. For more information, please refer to the product documentation on ports, encryption and proxy support.
If you are adopting more of the cloud, whether with the intent of retiring your data centers, modernizing them or developing new cloud-native applications, directly connected mode is the preferred and more common way of deploying your data services. This approach allows you to start adopting cloud-native DevOps practices. If you are a developer, you can start developing stateful applications that meet data latency guarantees while using cloud-native designs and practices. If you are a database administrator, you can deliver a consistent database-as-a-service experience across all your cloud deployments. Azure Arc is designed to meet the exacting data compliance, security and governance requirements of a global-scale, enterprise-ready hybrid multi-cloud. Using the additional services in Azure, you will benefit from the new cloud-based billing models and be able to put cost controls in place to bring down your total cost of ownership.
To see it in action, Travis Wright, Principal Engineering Manager for Azure Arc-enabled Data services and Thomas Maurer have published this informative video.
For more information, you can begin with the Jumpstart for directly connected mode. The latest release with directly connected mode will be published by the end of Ignite 2021. Watch this space for updates: Azure Arc-enabled data services - Release notes - Azure Arc | Microsoft Docs. Happy cloud connecting!