Directly connected mode for Azure Arc-enabled data services is generally available.
Published Nov 02 2021 08:00 AM
Microsoft

If you are planning to move to a hybrid cloud, or are already using one, you are, like many enterprises, dealing with the complexity of running applications and data across a diverse infrastructure: on-premises data centers, hosting environments, branch offices, and multi-cloud environments in which two or more clouds are in use. Azure Arc lets you extend the Azure control plane to on-premises datacenters, edge locations, and any public cloud so that you can manage these resources, and the infrastructure that runs on them, at scale. Azure Arc-enabled data services make it easier to run Azure data services such as Azure SQL Managed Instance (SQL MI) and PostgreSQL, with other Arc-enabled data services to follow, on this Arc-enabled hybrid infrastructure. They give you centralized visibility, security and governance across your resources and locations while allowing you to run your databases without changing developer tools.

In July 2021, Microsoft released Azure Arc-enabled data services with production support for indirectly connected mode. This gave you isolation and control over the data that was uploaded to Azure. But in this mode the Azure portal remains a read-only view. You can see the inventory of SQL managed instances and Postgres Hyperscale server groups that you have deployed and the details about them, but you cannot act on them in the Azure portal.

At Ignite in November 2021, Microsoft announced that directly connected mode for Azure Arc-enabled data services is generally available. You can now use supported Azure services such as updates, Azure Monitor, Azure Cost Analytics and more with your Arc-enabled data services. In this blog, we explore directly connected mode in greater depth: its benefits, its requirements, its inner workings, and important considerations when deploying it in your cloud. Directly connected mode is one of the most important and most awaited features of Azure Arc-enabled data services, and it will unlock many solutions for modernizing your data centers, your data services and your stateful applications.

 A connected hybrid cloud for data services

There are important benefits to using Arc-enabled data services, such as SQL MI and PostgreSQL, when these services run outside Azure but are directly connected to it. This applies to three broad deployment scenarios:

  • Edge site locations with internet connectivity to Azure
  • Corporate data centers that allow data services to connect between the data center and their Azure region
  • Deployments hosted in public clouds outside Azure, such as Amazon Web Services or Google Cloud

In these deployments, you need the flexibility to run machine learning models and/or advanced analytics closer to the users or to where the data resides. Data compliance and data sovereignty requirements may prevent you from moving your databases to the cloud, yet you still need to observe them and manage their life cycle centrally. Because billing and inventory data flows continuously to Azure, Arc gives you an active inventory of all the data services you manage across all clouds. You can operate on all the data services using ARM APIs and familiar Azure tools such as Azure CLI, the Azure portal or Azure Data Studio, which simplifies integrating them into your existing toolset. Azure services that you already use with your public cloud-based services, such as Azure Monitor and Azure role-based access control, with many more to come, can now be used with on-premises Arc-enabled data services.

 


Figure 1. Azure resource group with all resources across hybrid cloud in Azure portal

To be directly or indirectly connected

You may be wondering how features differ between the two modes, directly connected and indirectly connected. Without direct connectivity, you can inventory your data services centrally, but only in a read-only view, and you cannot fully benefit from Azure services that draw on the capacity, scalability and elasticity of Azure.

Many of the core capabilities, such as provisioning/deprovisioning, scale up/down, point-in-time restore, cloud-based billing, built-in monitoring and built-in logging, are available regardless of the mode in which the Azure Arc data controller is deployed. A directly connected Azure Arc data controller provides these core capabilities and, by nature of its Azure integration, additional ones such as:

  • provision/deprovision directly from Azure portal
  • automatically upload your metrics to Azure Monitor
  • automatically upload your logs to Azure Log Analytics
  • Azure role-based access control
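
One way to picture the relationship between the two modes is as set arithmetic: directly connected mode offers everything indirectly connected mode does, plus the Azure-integrated extras listed above. The Python sketch below only illustrates that relationship. The capability names are informal labels taken from this post and are assumptions for illustration, not identifiers from any Azure SDK or API.

```python
# Illustrative model only: capability names are informal labels from this post,
# not identifiers from any Azure SDK or API.
CORE_CAPABILITIES = frozenset({
    "provision/deprovision",
    "scale up/down",
    "point-in-time restore",
    "cloud-based billing",
    "built-in monitoring",
    "built-in logging",
})

DIRECT_ONLY_CAPABILITIES = frozenset({
    "provision/deprovision from Azure portal",
    "automatic metrics upload to Azure Monitor",
    "automatic log upload to Azure Log Analytics",
    "Azure role-based access control",
})


def capabilities(connectivity_mode: str) -> frozenset:
    """Return the capability labels available in the given connectivity mode."""
    if connectivity_mode == "indirect":
        return CORE_CAPABILITIES
    if connectivity_mode == "direct":
        return CORE_CAPABILITIES | DIRECT_ONLY_CAPABILITIES
    raise ValueError(f"unknown connectivity mode: {connectivity_mode!r}")
```

Calling `capabilities("direct") - capabilities("indirect")` returns exactly the four extras in the list above.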

 

Pre-requisites for deploying directly-connected data services

Directly connected mode is straightforward to configure from the Azure portal for an Arc-enabled data services deployment on any cloud. Like indirectly connected mode, directly connected mode requires a CNCF-conformant Kubernetes cluster to run the data services, whether it is hosted on physical, virtual or public cloud infrastructure. Direct connectivity is to the Azure cloud, so you need an Azure subscription. You then install the required tools: Azure CLI and its required extensions k8s-extension, connectedk8s, k8s-configuration and customlocation. You must be able to connect the host cluster to Azure by deploying Azure Arc-enabled Kubernetes, which creates a secure connection from the Kubernetes cluster to Azure over a range of options such as SSL, VPN or ExpressRoute. With the prerequisites in place, you can configure directly connected mode in a few steps using Azure CLI, the Azure portal, or ARM APIs, as described below.
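
As a minimal sketch of the tooling prerequisite, the Python snippet below builds one `az extension add --name <extension>` command per required extension and prints it rather than running it. It assumes Azure CLI is already installed. Note that the custom location extension is named `customlocation` (one word) on the command line. The helper name `extension_install_commands` is hypothetical.

```python
import shlex

# Azure CLI extensions named in the prerequisites above.
REQUIRED_EXTENSIONS = (
    "k8s-extension",
    "connectedk8s",
    "k8s-configuration",
    "customlocation",
)


def extension_install_commands(extensions=REQUIRED_EXTENSIONS):
    """Build one `az extension add` argv list per extension; nothing is executed."""
    return [["az", "extension", "add", "--name", name] for name in extensions]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before being run by hand.
    for argv in extension_install_commands():
        print(shlex.join(argv))
```

Returning argument lists instead of shell strings keeps the sketch safe to pass to `subprocess.run` later without any quoting concerns.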

How to set up directly-connected mode

  1. First, you set up a connected cluster using the Azure CLI connectedk8s command provided by the connectedk8s extension. This deploys the core components of Azure Arc-enabled Kubernetes. Cluster connect allows access to the cluster from anywhere for interactive development and debugging, without opening any inbound port on the firewall. It also lets hosted agents/runners of Azure Pipelines, GitHub Actions, or any other hosted CI/CD service deploy applications to the cluster. Portal browse lets you view Kubernetes cluster resources such as nodes, workloads, and data services centrally from the Azure portal. This step also deploys the RPaaS K8s bridge, which enables data services to integrate and deliver their functionality on top of Arc-enabled K8s or AKS clusters. The bridge comes with built-in support for managing Kubernetes-based resources, including the SQL MI and Postgres instances, and extends the management capabilities of the Azure control plane to resources deployed in extended locations outside Azure.
  2. Then you configure a custom location on the host cluster. It provides a deployment target for creating SQL MI and PostgreSQL instances on Azure Arc-enabled Kubernetes clusters from Azure, while abstracting the underlying infrastructure details from end users such as application development teams and database administrators. The IT infrastructure administrator can create a custom location (with a friendly name and tags) that maps to a namespace of a Kubernetes cluster, and give a database administrator the appropriate permissions to deploy SQL managed instances to this namespace on the cluster.
  3. With the target custom location configured, you then install the data services cluster extension. Cluster extensions provide an at-scale mechanism to deploy, update and manage the Kubernetes components required to run Arc-enabled data services. The cluster extension required for Arc-enabled data services is aptly called the Arc data services extension.
  4. The next step is to deploy the Azure Arc data controller in directly connected mode in the custom location. In Kubernetes terminology, the data controller is a Kubernetes controller object that extends the Kubernetes APIs to manage the lifecycle of the data services. It is the heart of Arc-enabled data services and uses the underlying Arc-enabled Kubernetes for all aspects of managing that lifecycle: provisioning/deprovisioning, monitoring/logging, and communication with Azure, both to bring down commands and policy and to send inventory, usage, metrics and billing information to Azure. Pre-set profiles are available for the most popular Arc-validated Kubernetes distributions to simplify the deployment configuration for the data services. The deployment command uses the bootstrapper extension to instantiate a data controller in the target custom location.
  5. Finally, you are ready to spin up an Azure Arc-enabled SQL Managed Instance in this cluster. The command issued in Azure to create a SQL MI is passed down to the cluster over the cluster connect channel. The RPaaS Kubernetes bridge translates it into actions for the data controller, which creates the Azure Arc-enabled SQL MI. Once the SQL MI is created in the Arc infrastructure, it is automatically projected into Azure through the RPaaS K8s bridge, allowing you to inventory, configure and operate all your Arc-enabled data services. The metrics for the SQL MI can optionally be sent to Azure Monitor, giving you the same metrics that are available for your Azure SQL cloud instances. Logs are periodically uploaded to an Azure Log Analytics workspace. The usage data is used to provide cost analytics for the data services.
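
The five steps can be sketched as a sequence of Azure CLI commands. The Python snippet below only assembles and prints the commands; it does not run them. Several things here are assumptions rather than part of the original walkthrough: every resource name in `ArcTarget` is a hypothetical placeholder, the profile name is just one example of a pre-set profile, optional flags are omitted, and the flags shown should be checked against `az <command> --help` for your CLI version. The `az arcdata` and `az sql mi-arc` commands come from the `arcdata` CLI extension, which would need to be installed as well. Because `az customlocation create` expects the IDs of the cluster extensions it exposes, the sketch issues the extension command (step 3) before the custom location command (step 2).

```python
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class ArcTarget:
    """Hypothetical placeholder names; substitute values from your environment."""
    resource_group: str = "my-rg"
    location: str = "eastus"
    cluster: str = "my-cluster"
    namespace: str = "arc"
    extension: str = "arc-data-services"
    custom_location: str = "my-custom-location"
    data_controller: str = "arc-dc"
    profile: str = "azure-arc-aks-premium-storage"  # example pre-set profile
    sql_mi: str = "sqlmi-01"


def direct_mode_commands(t, cluster_id, extension_id):
    """Return (step label, argv) pairs in an executable order; nothing is run.

    cluster_id and extension_id are the ARM resource IDs of the connected
    cluster and of the data services extension, looked up once those exist.
    """
    rg = ["--resource-group", t.resource_group]
    return [
        ("step 1: connect cluster",
         ["az", "connectedk8s", "connect", "--name", t.cluster, *rg,
          "--location", t.location]),
        ("step 1: enable features",
         ["az", "connectedk8s", "enable-features", "--name", t.cluster, *rg,
          "--features", "cluster-connect", "custom-locations"]),
        ("step 3: data services extension",
         ["az", "k8s-extension", "create", "--name", t.extension,
          "--cluster-name", t.cluster, *rg,
          "--cluster-type", "connectedClusters",
          "--extension-type", "microsoft.arcdataservices",
          "--scope", "cluster", "--release-namespace", t.namespace]),
        ("step 2: custom location",
         ["az", "customlocation", "create", "--name", t.custom_location, *rg,
          "--namespace", t.namespace, "--host-resource-id", cluster_id,
          "--cluster-extension-ids", extension_id]),
        ("step 4: data controller",
         ["az", "arcdata", "dc", "create", "--name", t.data_controller, *rg,
          "--location", t.location, "--connectivity-mode", "direct",
          "--profile-name", t.profile,
          "--custom-location", t.custom_location]),
        ("step 5: SQL managed instance",
         ["az", "sql", "mi-arc", "create", "--name", t.sql_mi, *rg,
          "--location", t.location, "--custom-location", t.custom_location]),
    ]


if __name__ == "__main__":
    ids = ("<connected-cluster-resource-id>", "<extension-resource-id>")
    for label, argv in direct_mode_commands(ArcTarget(), *ids):
        print(f"# {label}\n{shlex.join(argv)}")
```

Keeping the commands as data makes it easy to review the plan, or to substitute the Azure portal or ARM APIs for any individual step.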

 

Architecture of the Arc-enabled data service cluster

Putting this all together, the Azure Arc functional architecture in directly connected mode is shown in Figure 2 below.

 

 


Figure 2. Azure Arc architecture of directly connected cluster

Azure Arc-enabled services can run on any Azure Arc-enabled infrastructure. In this case, the infrastructure required to run data services is Kubernetes, which in turn requires underlying hardware for compute, network and persistent storage. At this time, we recommend using local storage for some functions, such as automatic backups. On the Azure side, Azure Arc-enabled data services have Azure Resource Manager (ARM) resource providers for data services that integrate with the ARM APIs.

 

 


Figure 3. Azure Arc-enabled data services architecture

Figure 3 shows the Azure Arc-enabled data services architecture by detailing the components in the target Arc-enabled Kubernetes cluster. Here, the data controller interfaces through the ARM APIs as well as the Kubernetes API server to deliver all management services, such as lifecycle management, backup/restore, point-in-time restore and any others supported by the underlying database engines running the SQL MI and Postgres instances. This allows you to use not only Azure tools but also native Kubernetes tools.

Using the Arc-enabled Kubernetes communication channel, the data controller sends billing, log, usage and metrics data to Azure when configured to do so. The same secure channel is used to bring down policy, actions and role-based information. It also connects with the Microsoft registry to pull down images and updates as scheduled.

The SQL managed instances and Postgres instances are hosted inside the cluster in the custom location. This directly connected architecture enables continuous updates, monitoring and usage tracking, thus allowing use of Azure services like Azure Monitor. The architecture is built to be extensible, so that in the future it can bring more Azure cloud capabilities, such as other data services and related services, to the edge, enabling new scenarios that can accelerate your data modernization.

Data considerations in directly connected mode

We may all agree on the benefits of a directly connected hybrid data cloud spanning public cloud, on-premises and the edge. But it may raise concerns that a free flow of data could violate data compliance, data sovereignty and data governance requirements. Azure Arc has been built with enterprise scale in mind. Although directly connected mode connects all the host clusters to Azure, the data stored in the data services, such as the SQL MI and Postgres databases, remains in the cluster at its location. Only metadata about the data services, such as SQL MI instances, is pushed to Azure. Usage information is essential for billing and must be uploaded, but uploads of logs and metrics can be disabled. If you have data sovereignty constraints, you can also constrain the uploaded data to an Azure region of your choice. For a detailed list of the data sent to Azure, please refer to Azure Arc-enabled data services data collection and reporting policy.
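
The upload rules in this section can be summarized in a few lines of Python. This is a simplified illustration of the paragraph above, not an exhaustive list of what is collected. The category labels and the function name `data_sent_to_azure` are hypothetical. The two switches loosely correspond to the `--auto-upload-metrics` and `--auto-upload-logs` options of `az arcdata dc create`.

```python
def data_sent_to_azure(upload_metrics: bool = True, upload_logs: bool = True) -> list:
    """List the data categories that leave the cluster for the given settings."""
    # Metadata about the data services and usage (billing) data are always sent.
    categories = ["data service metadata", "usage (billing)"]
    if upload_metrics:
        categories.append("metrics")
    if upload_logs:
        categories.append("logs")
    # Database contents are deliberately absent: they remain in the cluster.
    return categories
```

With both uploads disabled, only the metadata and the usage data needed for billing are sent.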

All connections are over secure transports. The port configuration required to set up your firewall follows Azure best practices. For more information, please refer to the product documentation on ports, encryption and proxy support.

Get connected to the cloud

If you are adopting more of the cloud, whether with the intent of retiring your data centers, modernizing them or developing new cloud-native applications, directly connected mode is the preferred and more common way to deploy your data services. This approach allows you to start adopting cloud-native DevOps practices. If you are a developer, you can start developing stateful applications that meet data latency guarantees while using cloud-native designs and practices. If you are a database administrator, you can deliver a consistent database-as-a-service experience across all your cloud deployments. Azure Arc is designed to meet the exacting data compliance, security and governance requirements of a global-scale, enterprise-ready hybrid multi-cloud. Using the additional services in Azure, you will benefit from the new cloud-based billing models and be able to put cost controls in place to bring down your total cost of ownership.

To see it in action, Travis Wright, Principal Engineering Manager for Azure Arc-enabled data services, and Thomas Maurer have published this informative video.

 

 

For more information, you can begin with the Jumpstart for directly connected mode. The latest release with directly connected mode will be published by the end of Ignite 2021. Watch this space for updates: Azure Arc-enabled data services - Release notes - Azure Arc | Microsoft Docs. Happy cloud connecting!

Last update: Nov 08 2021 11:09 AM