Guide for Architecting Azure-Databricks: Design to Deployment
Authors: Chris Walk (cwalk), Dan Johnson (danjohn1234), Eduardo dos Santos (eduardomdossantos), Ted Kim (tekim), Eric Kwashie (ekwashie), Chris Haynes (Chris_Haynes), Tayo Akigbogun (takigbogun) and Rafia Aqil (Rafia_Aqil)
Peer reviewed by: Mohamed Sharaf (mohamedsharaf)

Note: This article does not cover the Serverless Workspace option, which is currently in Public Preview. We plan to update this article once Serverless Workspaces are Generally Available. Also, while Terraform is the recommended method for production deployments due to its automation and repeatability, for simplicity this article demonstrates deployment through the Azure portal.

Introduction video to Databricks: what is databricks | introduction - databricks for dummies

DESIGN: Architecting a Secure Azure Databricks Environment

Step 1: Plan Workspace, Subscription Organization, Analytics Architecture and Compute

Planning your Azure Databricks environment can follow various arrangements depending on your organization's structure, governance model, and workload requirements. The following guidance outlines key considerations to help you design a well-architected foundation.

1.1 Align Workspaces with Business Units

A recommended best practice is to align each Azure Databricks workspace with a specific business unit. This approach, often referred to as the "Business Unit Subscription" design pattern, offers several operational and governance advantages.

Streamlined access control: Each unit manages its own workspace, simplifying permissions and reducing cross-team access risks. For example, Sales can securely access only its own data and notebooks.

Cost transparency: Mapping workspaces to business units enables accurate cost attribution and supports internal chargeback models. Each workspace can be tagged to a cost center for visibility and accountability. Even within a shared workspace, costs can be tracked using system tables, which provide detailed usage metrics and resource consumption insights (see the example query below).

Challenges to keep in mind: While per-BU workspaces have high impact, be mindful of workspace sprawl. If every small team spins up its own workspace, you might end up with dozens or hundreds of workspaces, which introduces management overhead. Databricks recommends a reasonable upper limit (on Azure, roughly 20-50 workspaces per account/subscription) because managing "collaboration, access, and security across hundreds of workspaces can become extremely difficult, even with good automation" [1]. Each workspace needs governance (user provisioning, monitoring, compliance checks), so there is a balance to strike.

1.2 Workspace Alignment and Shared Metastore Strategy

As you align workspaces with business units, it's essential to understand how Unity Catalog and the metastore fit into your architecture. Unity Catalog is Databricks' unified governance layer that centralizes access control, auditing, and data lineage across workspaces. Each Unity Catalog deployment is backed by a metastore, which acts as the central metadata repository for tables, views, volumes, and other data assets. In Azure Databricks, you can have one metastore per region, and all workspaces within that region share it. This enables consistent governance and simplifies data sharing across teams. If your organization spans multiple regions, you'll need to plan for cross-region sharing, which Unity Catalog supports through Delta Sharing.
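Because all workspaces in a region attach to the same metastore, Unity Catalog system tables give you a single place to attribute consumption back to workspaces and cost-center tags, supporting the chargeback model described above. A minimal sketch to run in a Databricks notebook, assuming system tables (system.billing.usage) are enabled for your account; the column and tag names follow the published schema but may differ by release, so adjust as needed:

```python
# Summarize the last 30 days of DBU usage by workspace, SKU, and an assumed
# "cost_center" custom tag (illustrative tag name; use your own tagging standard).
usage_by_team = spark.sql("""
    SELECT
        workspace_id,
        sku_name,
        custom_tags['cost_center'] AS cost_center,
        SUM(usage_quantity)        AS dbus_consumed
    FROM system.billing.usage
    WHERE usage_date >= date_sub(current_date(), 30)
    GROUP BY workspace_id, sku_name, custom_tags['cost_center']
    ORDER BY dbus_consumed DESC
""")
display(usage_by_team)
```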
By aligning workspaces with business units and connecting them to a shared metastore, you ensure that governance policies are enforced uniformly while still allowing each team to manage its own data assets securely and independently.

1.3 Distribute Workspaces Across Subscriptions

When scaling Azure Databricks, consider not just the number of workspaces, but also how to distribute them across Azure subscriptions. Using multiple Azure subscriptions can serve both organizational needs and technical requirements.

Environment segmentation (dev/test/prod): A common pattern is to put production workspaces in a separate Azure subscription from development or test workspaces, which provides an extra layer of isolation. Microsoft strongly recommends separating production and development workspaces into separate subscriptions. This way, you can apply stricter Azure policies or network rules to the production subscription and keep the development subscription more open for experimentation without risking production resources.

Honor Azure resource limits: Azure subscriptions come with capacity limits, and Azure Databricks workspaces have their own limits (since it is a multi-tenant PaaS). If you put all workspaces in one subscription, or all teams in one workspace, you might hit those limits. Most enterprises naturally end up with multiple subscriptions as they grow; planning this early avoids later migration headaches. If you currently have everything in one subscription, evaluate usage and consider splitting off heavy or production workloads into a new subscription to adhere to best practices.

1.4 Consider Completing an Azure Landing Zone Assessment

When evaluating and planning your next deployment, it's essential to ensure that your current landing zone aligns with Microsoft best practices. This helps establish a robust Databricks architecture and minimizes the risk of avoidable issues. Additionally, customers who are early in their cloud journey can benefit from cloud assessments, such as an Azure Landing Zone Review and a review of the "Prepare for Cloud Adoption" documentation, to build a strong foundation.

1.5 Planning Your Azure Databricks Workspace Architecture

Your workspace architecture should reflect the operational model of your organization and support the workloads you intend to run, from exploratory notebooks to production-grade ETL pipelines. To support your planning, Microsoft provides several reference architectures that illustrate well-architected patterns for Databricks deployments. These solution ideas can serve as starting points for designing maintainable environments: Simplified Architecture: Modern Data Platform Architecture; ETL-Intensive Workload Reference Architecture: Building ETL Intensive Architecture; End-to-End Analytics Architecture: Create a Modern Analytics Architecture.

1.6 Planning for the "Right" Compute

Choosing the right compute setup in Azure Databricks is crucial for optimizing performance and controlling costs, as billing is based on Databricks Units (DBUs) using a per-second pricing model.

Classic compute: You can fine-tune your own compute by enabling auto-termination and autoscaling, using Photon acceleration, leveraging spot instances, selecting the right VM type and node count for your workload, and choosing SSDs for performance or HDDs for archival storage. Classic compute is preferred by mature internal teams and developers who need advanced control over clusters, such as custom VM selection, tuning, and specialized configurations.
Serverless compute: Alternatively, managed serverless compute simplifies operations with built-in optimizations. It removes infrastructure management and offers instant scaling without cluster warm-up, making it ideal for agility and simplicity.

Step 2: Plan the "Right" CIDR Range (Classic Compute)

Note: You can skip this step if you plan to use serverless compute for all your resources, as CIDR range planning is not required in serverless deployments.

When planning CIDR ranges for your Azure Databricks workspace, it's important to ensure your virtual network has enough IP address capacity to support cluster scaling. Why this matters: if you choose a small VNet address space and your analytics workloads grow, you might hit a ceiling where you simply cannot launch more clusters or scale out because there are no free IPs in the subnet. The subnet sizes, and by extension the VNet CIDR, determine how many cluster nodes you can run concurrently. Databricks recommends using a CIDR block between /16 and /24 for the VNet, and up to /26 for the two required subnets: the container subnet and the host subnet. Microsoft's documentation provides a reference table of address space sizes. If your current workspace's VNet lacks sufficient IP space for active cluster nodes, you can request a CIDR range update through your Azure Databricks account team, as noted in the Microsoft documentation.

2.1 Considerations for CIDR Range

Workload type and concurrency: Consider what kinds of workloads will run (ETL pipelines, machine learning notebooks, BI dashboards, etc.) and how many jobs or clusters may need to run in parallel. High concurrency (e.g., multiple ETL jobs or many interactive clusters) means more nodes running at the same time, requiring a larger pool of IP addresses.

Data volume (historical vs. incremental): Are you doing a one-time historical data load or only processing new incremental data? A large backfill of terabytes of data may require spinning up a very large cluster (hundreds of nodes) to process it in a reasonable time, while ongoing smaller loads might get by with fewer nodes. Estimate how much data needs processing.

Transformation complexity: The complexity of data transformations or machine learning workloads matters. Heavy transformations (joins, aggregations on big data) or complex model training can benefit from more workers. If your use cases include these, you may need larger clusters (more nodes) to meet performance SLAs, which in turn demands more IP addresses available in the subnet.

Data sources and integration: Consider how your Databricks environment will connect to data. If you have multiple data sources or sinks (e.g., ingesting from many event hubs, databases, or IoT streams), you might design multiple dedicated clusters or workflows, potentially all active at once. Also, if using separate job clusters per job (Databricks Jobs), multiple clusters might launch concurrently. All these scenarios increase the concurrent node count.

2.2 Configuring a Dedicated Network (VNet) per Workspace with Egress Control

By default, Azure Databricks deploys its classic compute resources into a Microsoft-managed virtual network (VNet) within your Azure subscription. While this simplifies setup, it limits control over network configuration. For enhanced security and flexibility, it's recommended to use VNet injection, which allows you to deploy the compute plane into your own customer-managed VNet. A quick capacity check for candidate subnet sizes is sketched below.
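Before settling on subnet sizes, it helps to estimate how many concurrent cluster nodes a given host/container subnet pair can support. This rough sketch assumes each node consumes one IP address in the host subnet and one in the container subnet, and that Azure reserves five addresses per subnet; validate both assumptions against the current Microsoft documentation:

```python
import ipaddress

AZURE_RESERVED_IPS = 5  # assumed: Azure reserves 5 addresses in every subnet

def max_cluster_nodes(host_cidr: str, container_cidr: str) -> int:
    """Estimate max concurrent Databricks nodes for a host/container subnet pair.

    Each node is assumed to use one IP in each subnet, so the smaller subnet
    is the limiting factor.
    """
    usable = []
    for cidr in (host_cidr, container_cidr):
        net = ipaddress.ip_network(cidr)
        usable.append(net.num_addresses - AZURE_RESERVED_IPS)
    return min(usable)

# Example: the /25 subnets used later in this guide (10.28.0.0/23 VNet)
print(max_cluster_nodes("10.28.0.0/25", "10.28.0.128/25"))  # -> 123 nodes
print(max_cluster_nodes("10.28.0.0/26", "10.28.0.64/26"))   # -> 59 nodes
```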
Deploying the compute plane into your own VNet enables secure integration with other Azure services using service endpoints or private endpoints, supports user-defined routes for accessing on-premises data sources, allows traffic inspection via network virtual appliances or firewalls, and provides the ability to configure custom DNS and enforce egress restrictions through network security group (NSG) rules. Within this VNet (which must reside in the same region and subscription as the Azure Databricks workspace), two subnets are required for Azure Databricks: a container subnet (referred to as the private subnet) and a host subnet (referred to as the public subnet). To implement front-end Private Link, back-end Private Link, or both, your workspace VNet needs a third subnet that will contain the private endpoint (the Private Link subnet). It is also recommended to deploy an Azure Firewall for egress control.

Step 3: Plan Network Architecture for Securing Azure Databricks

3.1 Secure Cluster Connectivity

Secure Cluster Connectivity, also known as No Public IP (NPIP), is a foundational security feature for Azure Databricks deployments. When enabled, it ensures that compute resources within the customer-managed virtual network (VNet) do not have public IP addresses, and no inbound ports are exposed. Instead, each cluster initiates a secure outbound connection to the Databricks control plane over port 443 (HTTPS), through a dedicated relay. This tunnel is used exclusively for administrative tasks, separate from the web application and REST API traffic, significantly reducing the attack surface. For the most secure deployment, Microsoft and Databricks strongly recommend enabling Secure Cluster Connectivity, especially in environments with strict compliance or regulatory requirements. When Secure Cluster Connectivity is enabled, both workspace subnets become private, as cluster nodes don't have public IP addresses.

3.2 Egress with VNet Injection (NVA)

For Databricks data plane traffic, assign a user-defined route (UDR) to the workspace subnets with a next hop type of Network Virtual Appliance (NVA); this could be an Azure Firewall, NAT Gateway, or another routing device. For control plane traffic, Databricks recommends using Azure service tags, which are logical groupings of IP addresses for Azure services, routed with a next hop type of Internet. This is important because Azure IP ranges can change frequently as new resources are provisioned, and manually maintaining IP lists is not practical. Using service tags ensures that your routing rules automatically stay up to date.

3.3 Front-End Connectivity with Azure Private Link (Standard Deployment)

To further enhance security, Azure Databricks supports Private Link for front-end connections. In a standard deployment, Private Link enables users to access the Databricks web application, REST API, and JDBC/ODBC endpoints over a private VNet interface, bypassing the public internet. For organizations with no public internet access from user networks, a browser authentication private endpoint is required. This endpoint supports SSO login callbacks from Microsoft Entra ID and is shared across all workspaces in a region that use the same private DNS zone. It is typically hosted in a transit VNet that bridges on-premises networks and Azure.

Note: There are two deployment types: standard and simplified. To compare these deployment types, see Choose standard or simplified deployment.
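To make the egress pattern from section 3.2 concrete, here is a hedged sketch using the azure-mgmt-network SDK: a default route to a firewall NVA plus the AzureDatabricks service tag routed to the internet for control plane traffic. The subscription, resource group, region, and firewall IP are placeholders, and in production this is typically expressed in Terraform alongside the rest of the workspace:

```python
# Sketch (not production code): create a route table implementing the pattern
# from section 3.2, then associate it with the Databricks host/container subnets.
# Assumes azure-identity and azure-mgmt-network are installed.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import Route, RouteTable

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

route_table = client.route_tables.begin_create_or_update(
    resource_group_name="rg-databricks-prod",        # placeholder
    route_table_name="rt-adb-egress",
    parameters=RouteTable(
        location="eastus2",                           # placeholder region
        routes=[
            # Default egress goes through the firewall / NVA for inspection.
            Route(
                name="default-via-firewall",
                address_prefix="0.0.0.0/0",
                next_hop_type="VirtualAppliance",
                next_hop_ip_address="10.28.2.4",      # firewall private IP (placeholder)
            ),
            # Control plane traffic stays on the AzureDatabricks service tag,
            # so the rule keeps itself current as Azure IP ranges change.
            Route(
                name="adb-control-plane",
                address_prefix="AzureDatabricks",
                next_hop_type="Internet",
            ),
        ],
    ),
).result()

print(f"Created {route_table.id}; associate it with the workspace subnets next.")
```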
3.4 Serverless Compute Networking

Azure Databricks offers serverless compute options that simplify infrastructure management and accelerate workload execution. These resources run in a Databricks-managed serverless compute plane, isolated from the public internet and connected to the control plane via the Microsoft backbone network. To secure outbound traffic from serverless workloads, administrators can configure serverless egress control using network policies that restrict connections by location, FQDN, or Azure resource type. Additionally, Network Connectivity Configurations (NCCs) allow centralized management of private endpoints and firewall rules. NCCs can be attached to multiple workspaces and are essential for enabling secure access to Azure services such as Data Lake Storage from serverless SQL warehouses.

DEPLOYMENT: Step-by-Step Implementation Using the Azure Portal

Step 1: Create an Azure Resource Group

For each new workspace, create a dedicated resource group to contain the Databricks workspace resource and associated resources. Ensure that all resources (workspace, subnets, and so on) are deployed in the same region and resource group to optimize data movement performance and enhance security.

Step 2: Deploy a Workspace-Specific Virtual Network (VNet)

From your resource group, create a virtual network. Under the Security section, enable Azure Firewall. Deploying an Azure Firewall is recommended for egress control, ensuring that outbound traffic from your Databricks environment is securely managed.
Define address spaces for your virtual network (review Step 2 from the Design section). As documented, you could create a VNet with these values:
IP range: First remove the default IP range, and then add IP range 10.28.0.0/23.
Create subnet public-subnet with range 10.28.0.0/25.
Create subnet private-subnet with range 10.28.0.128/25.
Create subnet private-link with range 10.28.1.0/27.
Please note: your IP values can differ depending on your IPAM and available scopes.
Review + Create your virtual network.

Step 3: Deploy the Azure Databricks Workspace

Now that networking is in place, create the Databricks workspace. Below are the settings your organization should review during workspace creation:
In the Azure portal, search for Azure Databricks and click Create.
Choose the Subscription, Resource Group, and Region, select the Premium tier, enter a Managed Resource Group name, and click Next. The managed resource group is created after your Databricks workspace is deployed and contains infrastructure resources for the workspace (e.g., VNets, DBFS).
Required: Enable Secure Cluster Connectivity (No Public IP for clusters) to ensure that Databricks clusters are deployed without public IP addresses (review Section 3.1).
Required: Enable the option to deploy into your own virtual network (VNet injection), also known as "Bring Your Own VNet" (review Section 3.2). Select the virtual network created in Step 2 and enter the private and public subnet names.
Enable or disable "Deploy NAT Gateway" according to your workspace requirements.
Disable "Allow Public Network Access".
Select "No Azure Databricks Rules" for Required NSG Rules.
Select "Click on add to create a private endpoint"; this opens a panel for private endpoint setup. Click Add and enter the Private Link subnet details created in Step 2. Also ensure that Private DNS zone integration is set to "Yes" and that a new Private DNS zone is created, indicated by (New) privatelink.azuredatabricks.net, unless an existing DNS zone for this purpose already exists.
(Optional) Under the Encryption tab, enable Infrastructure Encryption if you have a requirement such as FIPS 140-2 or another strict regulatory standard (e.g., HIPAA). It comes at a cost, since data takes additional time to encrypt and decrypt; by default your data is already encrypted.
(Optional) Enable the compliance security profile, for example for HIPAA.
(Optional) Enable automatic cluster updates, for example on the first Sunday of every month.
Review + Create the workspace and wait for it to deploy.

Step 4: Create a Private Endpoint to Support SSO for Web Browser Access

Note: This step is required when front-end Private Link is enabled and client networks cannot access the public internet.

After creating your Azure Databricks workspace, if you try to launch it without the proper Private Link configuration, you will see an error page. This happens because the workspace is configured to block public network access, and the necessary private endpoints (including the browser_authentication endpoint for SSO) are not yet in place.

Create the Web-Auth Workspace
Deploy a "dummy" WEB_AUTH_DO_NOT_DELETE_<region> workspace in the same region as your production workspace. Its purpose is to host the browser_authentication private endpoint (one is required per region). Lock the workspace (Delete lock) to prevent accidental removal.
Follow Step 2 to create a virtual network (VNet).
Follow Step 3 to create a VNet-injected "dummy" workspace.

Create the Browser Authentication Private Endpoint
In the Azure portal, open the dummy Databricks workspace and go to Networking, Private endpoint connections, + Private endpoint.
Resource step: Target sub-resource: browser_authentication.
Virtual Network step: VNet: the transit/hub VNet (the central network for Private Link); Subnet: the private endpoint subnet in that VNet (not the Databricks host subnets).
DNS step: Integrate with Private DNS zone: Yes; Zone: privatelink.azuredatabricks.net; ensure the DNS zone is linked to the transit VNet.
After creation: A-records for *.pl-auth.azuredatabricks.net are created automatically in the DNS zone.

Workspace Connectivity Testing
If you have VPN or ExpressRoute connectivity, Bastion is not required. However, for the purposes of this article we test workspace connectivity through Bastion. If you don't have private connectivity and need to test from inside the VNet, Azure Bastion is a convenient option.

Step 5: Create a Storage Account

From your resource group, click Create and select Storage account. On the configuration page:
Select the preferred storage type: Azure Blob Storage or Azure Data Lake Storage Gen2.
Choose Performance and Redundancy options based on your business requirements, then click Next.
Under the Advanced tab: enable Hierarchical namespace under Data Lake Storage Gen2. This is critical for directory and file-level operations and Access Control Lists (ACLs).
Under the Networking tab: set Public Network Access to Disabled.
Complete the creation process and then create container(s) inside the storage account.

Step 6: Create Private Endpoints for the Workspace Storage Account

Prerequisite: You need to create two private endpoints from the VNet used for VNet injection to your workspace storage account, one for each of the following target sub-resources: dfs and blob.
Navigate to your storage account, go to Networking, Private endpoint connections, and click + Private endpoint.
In the Create Private Endpoint wizard:
Resource tab: Select your storage account and set Target sub-resource to dfs for the first endpoint.
Virtual Network tab: Choose the VNet you used for VNet injection and select the appropriate subnet.
Complete the creation process. The private endpoint will be auto-approved and visible under Private endpoint connections. Repeat the process for the second private endpoint, this time setting Target sub-resource to blob.

Step 7: Link Storage and Databricks Workspace: Create an Access Connector

In your resource group, create an Access Connector for Azure Databricks. No additional configuration is required during creation.
Assign a role to the Access Connector: Navigate to your storage account, Access Control (IAM), Add role assignment. Select Role: Storage Blob Data Contributor and Assign access to: Managed Identity. Under Members, click Select members, find and select your newly created Access Connector for Azure Databricks, and save the role assignment.
Copy the Resource ID: Go to the Access Connector Overview page and copy the Resource ID for later use in the Databricks configuration.

Step 8: Link Storage and Databricks Workspace: Create an External Location in Unity Catalog

In your Databricks workspace, go to Unity Catalog, External Data, and select the "Create external location" button.
Configure the external location: Select ADLS as the storage type and enter the ADLS storage URL in the following format, replacing <container_name> and <storage_account_name> with your values: abfss://<container_name>@<storage_account_name>.dfs.core.windows.net/
Provide the Access Connector: Select "Create new storage credential" in the Storage credential field and paste the Resource ID of the Access Connector for Azure Databricks (copied in Step 7) into the Access Connector ID field.
Validate the connection: Click Submit. You should see a "Successful" message confirming the connection. You can now create catalogs and link your secure storage.

Step 9: Configure Serverless Compute Networking

If your organization plans to use serverless SQL warehouses or serverless jobs compute, you must configure serverless networking.
Add a Network Connectivity Configuration (NCC): Go to the Databricks account console (https://accounts.azuredatabricks.net/), navigate to Cloud resources, click Add Network Connectivity Configuration, fill in the required fields, and create a new NCC.
Associate the NCC with your workspace: In the account console, go to Workspaces, select your workspace, click Update Workspace, and select the NCC you just created from the Network Connectivity Configuration dropdown.
Add a private endpoint rule: In Cloud resources, select your NCC, select Private Endpoint Rules, and click Add Private Endpoint Rule. Provide the Resource ID of your storage account (found on the storage account's "JSON View", top right) and the Azure sub-resource types dfs and blob.
Approve the pending connection: Go to your storage account, Networking, Private endpoint connections. You will see a pending connection from Databricks. Approve it, and the connection status in your account console will show as ESTABLISHED.

Step 10: Test Your Workspace

Launch a small test cluster and verify the following:
It can start (which means it can talk to the control plane).
It can read from and write to the storage account; a code sketch to confirm this follows below. If you are not using a Unity Catalog external location, set Spark properties to configure Azure credentials for access to Azure storage.
Check that the private DNS records have been created.
(Optional) If on-premises data is needed, try connecting to an on-premises database (using the ExpressRoute path): Connect your Azure Databricks workspace to your on-premises network - Azure Databricks | Microsoft Learn.
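The storage check can be a simple round trip through the external location created in Step 8. A minimal sketch to run in a notebook on the test cluster; the container and storage account names are placeholders, and the commented Spark property shows the alternative credential approach mentioned above:

```python
# Smoke test: write a small Delta table to ADLS through the private endpoint
# and read it back. <container> and <storageaccount> are placeholders for the
# resources created in Step 5.
path = "abfss://<container>@<storageaccount>.dfs.core.windows.net/healthcheck/ping"

# If the path is NOT governed by a Unity Catalog external location, configure
# credentials explicitly instead, e.g. with an account key from a secret scope:
# spark.conf.set(
#     "fs.azure.account.key.<storageaccount>.dfs.core.windows.net",
#     dbutils.secrets.get(scope="<scope>", key="<storage-key>"),
# )

df = spark.range(100).withColumnRenamed("id", "value")  # tiny test dataset
df.write.format("delta").mode("overwrite").save(path)   # write to storage
print(spark.read.format("delta").load(path).count())    # expect 100 back
```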
Step 11: Account Console, Planning Workspace Access Controls and Getting Started

Once your Azure Databricks workspace is deployed, it's essential to configure access controls and begin onboarding users with the right permissions. From your account console (https://accounts.azuredatabricks.net/) you can centrally manage your environment: add users and groups, enable preview features, and view or configure all your workspaces. Azure Databricks supports fine-grained access management through Unity Catalog, cluster policies, and workspace-level roles. Start by defining who needs access to what (notebooks, tables, jobs, or clusters) and apply least-privilege principles to minimize risk.

DBFS limitation: DBFS is created automatically upon Databricks workspace creation and can be found in your managed resource group. Databricks cannot secure DBFS. If there is a business need to avoid DBFS, you can disable DBFS access by following the instructions here: Disable access to DBFS root and mounts in your existing Azure Databricks workspace.

Use Unity Catalog to manage data access across catalogs, schemas, and tables, and consider implementing cluster policies to standardize compute configurations across teams. To help your teams get started, Microsoft provides a range of tutorials and best practice guides: Best practice articles - Azure Databricks | Microsoft Learn.

Step 12: Planning Data Migration

As you prepare to move data into your Azure Databricks environment, it's important to assess your migration strategy early. This includes identifying source systems, estimating data volumes, and determining the appropriate ingestion methods, whether batch, streaming, or hybrid. For organizations with complex migration needs or legacy systems, Microsoft offers specialized support through its internal Azure Cloud Accelerated Factory program. Reach out to your Microsoft account team to explore nomination for Azure Cloud Accelerated Factory, which provides hands-on guidance, tooling, and best practices to accelerate and streamline your data migration journey.

Summary

Regular maintenance and governance are as important as the initial design. Continuously review the environment and update configurations as needed to address evolving requirements and threats. For example, tag all resources (workspaces, VNets, clusters, etc.) with clear identifiers (workspace name, environment, department) to track costs and ownership effectively. Additionally, enforce least privilege across the platform: ensure that only necessary users are given admin privileges, and use cluster-level access control to restrict who can create or start clusters. By following the above steps, an organization will have an Azure Databricks architecture that is securely isolated, well-governed, and scalable.

References:
[1] 5 Best Practices for Databricks Workspaces (AzureDatabricksBestPractices/toc.md at master · Azure ... - GitHub)
Deploy a workspace using the Azure Portal

Additional Links:
Quick Introduction to Databricks: what is databricks | introduction - databricks for dummies
Connect Purview with Azure Databricks: Integrating Microsoft Purview with Azure Databricks
Secure Databricks Delta Share between Workspaces: Secure Databricks Delta Share for Serverless Compute
Azure-Databricks Cost Optimization Guide: Databricks Cost Optimization: A Practical Guide
Integrate Azure Databricks with Microsoft Fabric: Integrating Azure Databricks with Microsoft Fabric
Databricks Solution Accelerators for Data & AI
Azure updates

Appendix

3.5 Understanding Data Transfer (ExpressRoute vs. Public Internet)

For data transfers, your organization must decide whether to use ExpressRoute or internet egress. Several considerations can help you determine your choice.

3.5.1 Connectivity Model
• ExpressRoute: Provides a private, dedicated connection between your on-premises infrastructure and Microsoft Azure. It bypasses the public internet entirely and connects through a network service provider.
• Internet Egress: Refers to outbound data traffic from Azure to the public internet. This is the default path for most Azure services unless configured otherwise.

3.6 Planning for User-Defined Routes (UDRs)

When working with Databricks deployments, especially VNet-injected workspaces, setting up user-defined routes (UDRs) is a best practice that helps manage and secure network traffic more effectively. By using UDRs, teams can steer traffic between Databricks components and external services in a controlled way, which not only strengthens security but also supports compliance efforts.

3.6.1 UDRs and Hub-and-Spoke Topology
If your Databricks workspace is deployed into your own virtual network (VNet), you'll need to configure standard user-defined routes (UDRs) to manage traffic flow. In a typical hub-and-spoke architecture, UDRs are used to route all traffic from the spoke VNets to the hub VNet.

3.6.2 Hub and Spoke with a Virtual WAN Hub
If your Databricks workspace is deployed into your own virtual network (VNet) and is peered to a Virtual WAN (VWAN) hub as the primary connectivity hub into Azure, a user-defined route (UDR) is not required, provided that a private traffic routing policy or internet traffic routing policy is configured in the VWAN hub.

3.6.3 Use of NVAs and Service Tags
For Databricks data plane traffic, assign a user-defined route (UDR) to the workspace subnets with a next hop type of Network Virtual Appliance (NVA); this could be an Azure Firewall, NAT Gateway, or another routing device. For control plane traffic, Databricks recommends using Azure service tags, which are logical groupings of IP addresses for Azure services, routed with a next hop type of Internet. This is important because Azure IP ranges can change frequently as new resources are provisioned, and manually maintaining IP lists is not practical. Using service tags ensures that your routing rules automatically stay up to date.

3.6.4 Default Outbound Access Retirement (Non-Serverless Compute)
Microsoft is retiring default outbound internet access for new deployments starting September 30, 2025. Going forward, outbound connectivity will require an explicit configuration using an NVA, NAT Gateway, Load Balancer, or Public IP address.
Also, note that using a public IP address in the deployment is discouraged for security purposes; it is recommended to deploy the workspace in a Secure Cluster Connectivity configuration.

Announcing seamless integration of Apache Kafka with Azure Cosmos DB in Azure Native Confluent
Integrate Azure Cosmos DB with Kafka applications

Confluent announced general availability (GA) of the fully managed V2 Kafka connector for Azure Cosmos DB, enabling users to seamlessly integrate their Azure Cosmos DB containers with Kafka-powered event streaming applications without worrying about provisioning, scaling, or managing the connector infrastructure. The Confluent Cosmos DB v2 connector offers significant advantages compared to the v1 connector in terms of higher throughput, enhanced security and observability, and increased reliability.

Seamless Integration with Azure Native Confluent

We are excited to announce a new capability in the Azure Native Confluent service that enables users to create and configure Confluent-managed Cosmos DB Kafka connectors (v2) for Azure Cosmos DB containers through a direct, seamless experience in the Azure portal. Users can also provision and manage environments, Kafka clusters, and Kafka topics from within the Azure Native Confluent service, creating a holistic end-to-end experience for integrating with Azure Cosmos DB. This eliminates the need for users to switch between the Azure and Confluent Cloud portals.

Key Highlights
Bi-directional support: Create source connectors to stream data from Cosmos DB to Kafka topics, or sink connectors to move data from Kafka into Cosmos DB.
Secure authentication: Users can authenticate to the Kafka cluster using service accounts, enabling least-privilege access controls to provision the connectors, aligned with Confluent's recommended security guidelines.

Create a Confluent Cosmos DB (v2) Kafka Connector from the Azure portal

The following steps summarize how to provision the connector from the Azure Native Confluent service:
Navigate to the native Confluent resource in Azure.
Navigate to Connectors -> Create new connector. You can also create an environment, cluster, and topic from within the Azure portal.
Choose the desired connector type: Source to stream data from Azure Cosmos DB, or Sink to move data into Azure Cosmos DB.
Select "Azure Cosmos DB V2" as the connector plugin.
Enter the connector name, then select the required Kafka topics, Azure Cosmos DB account, and database.
Select Service Account authentication and provide a name for the service account. When the connector is created, this creates a new service account on Confluent Cloud. Optionally, you can select user-account based authentication by provisioning an API key on Confluent Cloud.
Complete the required connector configuration. Enter the topic-to-container mapping in the form "topic1#container1,topic2#container2,...".
Review the configuration summary and click Create. Your connector will appear in the list with real-time status indicators.
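Once the sink connector is running, any standard Kafka client can feed it; with the mapping above, events produced to topic1 land in container1. A minimal sketch using the confluent-kafka Python package (the bootstrap endpoint, API key/secret, and topic name are placeholders, and it assumes the connector is configured for JSON values):

```python
import json
import uuid

from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "<bootstrap-endpoint>:9092",  # Confluent Cloud cluster (placeholder)
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "PLAIN",
    "sasl.username": "<cluster-api-key>",              # placeholder credentials
    "sasl.password": "<cluster-api-secret>",
})

def on_delivery(err, msg):
    # Report per-message delivery results from the broker.
    if err is not None:
        print(f"Delivery failed: {err}")
    else:
        print(f"Delivered to {msg.topic()} [partition {msg.partition()}]")

event = {"id": str(uuid.uuid4()), "deviceId": "sensor-42", "temperature": 21.7}
producer.produce("topic1", key=event["id"], value=json.dumps(event), on_delivery=on_delivery)
producer.flush()  # the sink connector then writes the document into container1
```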
Other Resources
Try out the Azure Native Confluent service right away! Every new sign-up gets a free $1000 credit.
To learn more, check out the Microsoft Docs.
If you would like to give us feedback on this feature or the overall product, or have any suggestions for us to work on, please drop your suggestions in the comments.

Live AMA: Demystifying Azure pricing (AM session)

⏱️ This live AMA is on January 22nd, 2026 at 9:00 AM PT. This same session is also scheduled at 5:00 PM PT on January 22nd.

SESSION DETAILS
This session breaks down the complexity of Azure pricing to help you make informed decisions. We'll cover how to estimate costs accurately using tools like the Azure Pricing Calculator, explore strategic pricing offers such as Reservations, Savings Plans, and Azure Hybrid Benefit, and share best practices for optimizing workloads for cost efficiency. Whether you're planning migrations or managing ongoing cloud spend, you'll learn actionable strategies to forecast, control, and reduce costs without compromising performance. This session includes a live chat-based AMA. Submit your questions below as a comment to be answered by the product team!

Automating Microsoft Sentinel: A blog series on enabling Smart Security
This entry guides readers through building custom Playbooks in Microsoft Sentinel, highlighting best practices for trigger selection, managed identities, and integrating built-in tools and external APIs. It offers practical steps and insights to help security teams automate incident response and streamline operations within Sentinel.

Live AMA: Demystifying Azure pricing (PM session)
⏱️ This live AMA is on January 22nd, 2026 at 5:00 PM PT. This same session is also scheduled at 9:00 AM PT on January 22nd.

SESSION DETAILS
This session breaks down the complexity of Azure pricing to help you make informed decisions. We'll cover how to estimate costs accurately using tools like the Azure Pricing Calculator, explore strategic pricing offers such as Reservations, Savings Plans, and Azure Hybrid Benefit, and share best practices for optimizing workloads for cost efficiency. Whether you're planning migrations or managing ongoing cloud spend, you'll learn actionable strategies to forecast, control, and reduce costs without compromising performance. This session includes a live chat-based AMA. Submit your questions below as a comment to be answered by the product team!

Moving an app from AWS to Azure
Moving an application from AWS to Azure is more than a simple lift-and-shift. Successful replication requires selecting the right Azure services to support scalability, security, and long-term growth. This article outlines the key migration considerations software development teams must address to successfully replicate AWS-based applications on Azure.

Walk through how to:
Map core AWS services to their Azure equivalents
Design Azure-ready architectures with the right identity and security foundations
Avoid common pitfalls that slow migrations and increase rework

Discover the key Azure resources that help teams:
Move faster with confidence
Reduce rework during development and deployment
Set a strong foundation for long-term success on the Azure platform

Read the full blog here: Replicating your AWS application to Azure: key resources for software development companies | Microsoft Community Hub

Explore paths for database migration into Azure
As organizations accelerate their multicloud strategies, choosing the right approach to database migration is essential for ensuring performance, scalability, and long-term flexibility. Microsoft's latest article, "Expanding the Multicloud Advantage: Database Migration Paths into Azure," provides practical guidance for navigating migration options, evaluating architectural considerations, and maximizing the benefits of Azure's data platform services. If your team is looking to broaden your customer base and enhance your app's exposure by bringing your AWS-based solution to Azure and listing it on Microsoft Marketplace, this resource offers valuable insights to support informed, strategic decision-making.

Read the full article: Expanding the multicloud advantage: Database migration paths into Azure | Microsoft Community Hub

Expanding the multicloud advantage: Database migration paths into Azure
Broaden your customer base and enhance your app's exposure by bringing your AWS-based solution to Azure and listing it on Microsoft Marketplace. This guide walks you through how Azure database services compare to those on AWS, spotlighting differences in architecture, scalability, and feature sets, so you can make confident choices when replicating your app's data layer to Azure. This post is part of a series on replicating apps from AWS to Azure. View all posts in this series.

For software development companies looking to expand or replicate their marketplace offerings from AWS to Microsoft Azure, one of the most critical steps is selecting the right Azure database services. While both AWS and Azure provide robust managed database options, their architecture, service availability, and design approaches vary. To deliver reliable performance, scale globally, and meet operational requirements, it's essential to understand how Azure databases work, and how they compare to AWS, before you replicate your app.

AWS to Azure database mapping

When replicating your app from AWS to Azure, start by mapping your existing database services to the closest Azure equivalents. Both clouds offer relational, NoSQL, and analytics databases, but they differ in architecture, features, and integration points. Choosing the right Azure service helps keep your app performant, secure, and manageable, and aligns with Azure Marketplace requirements for an Azure-native deployment. The mapping below lists each AWS service, its Azure equivalent, and recommended use cases and key differences:

Amazon RDS (MySQL/PostgreSQL) -> Azure Database for MySQL / PostgreSQL: Fully managed relational DB with built-in HA, scaling, and security; suited to building generative AI apps.
Amazon RDS (SQL Server) -> Azure SQL Database or Azure SQL Managed Instance: Use Azure SQL Database for modern apps; choose Managed Instance for near 100% compatibility with on-prem SQL Server.
SQL Server on EC2 -> SQL Server on Azure VMs: Best for lift-and-shift scenarios requiring full OS-level control.
Amazon RDS (Oracle) -> Oracle Database@Azure: Managed Oracle workloads with Azure integration.
Amazon Aurora (PostgreSQL/MySQL) -> Azure Database for PostgreSQL (Flexible Server) or Azure Database for MySQL: Similar managed experience for large workloads; consider Azure HorizonDB (public preview), built on PostgreSQL to compete with Aurora and AlloyDB. Learn more.
Amazon DynamoDB -> Azure Cosmos DB (NoSQL API): Global distribution, multi-model support, and guaranteed SLAs for latency and throughput.
Amazon Keyspaces (Cassandra) -> Azure Managed Instance for Apache Cassandra: Managed Cassandra with elastic scaling and Azure-native security.
Cassandra on EC2 -> Azure Managed Instance for Apache Cassandra: Same as above; ideal for lift-and-shift Cassandra clusters.
Amazon DocumentDB, MongoDB Atlas, or MongoDB on EC2 -> Azure DocumentDB: Drop-in compatibility for MongoDB workloads with global replication and vCore-based pricing.
Amazon Redshift -> Azure Synapse Analytics: Enterprise analytics with integrated data lake and Power BI connectivity.
Amazon ElastiCache (Redis) -> Azure Cache for Redis: Low-latency caching with clustering and persistence options.

Match your use case

After mapping AWS services to Azure equivalents, the next step is selecting the right service for your workload. Start by considering the data model (relational, document, key-value), then factor in performance, consistency, and global reach.

Building AI apps: Generative AI, vector search, advanced analytics.
Relational workloads: Use Azure SQL Database, Azure SQL Managed Instance, or Azure Database for MySQL/PostgreSQL for transactional apps; enable zone redundancy for HA. Review schema compatibility, stored procedures, triggers, and extensions. Inventory all databases, tables, indexes, users, and dependencies before migration, and document any required refactoring for Azure.
NoSQL workloads: Choose Azure Cosmos DB for globally distributed apps; select the API (NoSQL, MongoDB, Cassandra) that matches your existing schema. Validate data model mapping and test migration in a sandbox environment to ensure data integrity and application connectivity.
Analytics: For large-scale queries and BI integration, Azure Synapse Analytics offers MPP architecture and tight integration with Azure Data Lake. Inventory all analytics assets, ETL pipelines, and dependencies. Plan the migration using Azure Data Factory or Synapse pipelines, then test performance benchmarks and optimize query plans post-migration.
Caching: Azure Cache for Redis accelerates app performance with in-memory data and clustering. Update application connection strings and drivers to use Azure endpoints, implement retry logic and connection pooling for reliability, and validate cache warm-up and failover strategies.
Hybrid scenarios: Combine Cosmos DB with Synapse Link (for Synapse as the target) or Fabric Mirroring (for Fabric as the target) for real-time analytics without ETL overhead. Assess network isolation, security, and compliance requirements; deploy private endpoints and configure RBAC as needed; document integration points and monitor hybrid data flows.

Factor in security and compliance

Encryption: Confirm default encryption meets compliance requirements; enable customer-managed keys (CMK) if needed. Enable Transparent Data Encryption (TDE) and review encryption for backups and in-transit data.
Access control: Apply Azure RBAC and database-level roles for granular permissions. Audit user roles and permissions regularly to ensure least privilege.
Network isolation: Use private endpoints within a virtual network to keep traffic off the public internet. Configure network security groups (NSGs) and firewalls for additional protection.
Identity integration: Prefer managed identities for secure access to databases. Integrate with Azure Active Directory for centralized identity management.
Compliance checks: Verify certifications such as GDPR, HIPAA, or industry-specific standards. Use Azure Policy and Compliance Manager to automate compliance validation.
Audit logging and threat detection: Enable audit logging and advanced threat detection with Microsoft Defender for all database services, and review logs and alerts regularly.

Optimize for cost

Compute tiers: Choose General Purpose for balanced workloads and Business Critical for low latency and high IOPS. Review workload sizing and adjust tiers as needed for cost efficiency.
Autoscaling: Enable autoscale for Cosmos DB and flexible servers to avoid overprovisioning. Monitor scaling events and set thresholds to control spend.
Reserved capacity: Commit to one to three years for predictable workloads to unlock discounts. Evaluate usage patterns before committing to reservations.
Serverless: Use serverless compute for workloads with completely ad hoc usage and low frequency of access. This eliminates the need for pre-provisioned resources and reduces costs for unpredictable workloads.
Monitoring: Use Azure Cost Management and query performance insights to optimize spend. Set up budget alerts and analyze cost trends monthly.
Include basic resource monitoring to detect adverse usage patterns early.
Storage and backup costs: Review storage costs and backup retention policies, and configure lifecycle management for backups and archives.

Data migration from AWS to Azure

Migrating your data from AWS to Azure is a key step in replicating your app's database layer for Azure Marketplace. The goal is a one-time transfer: after migration, your app runs fully on Azure.

Azure Database Migration Service (DMS): Automates migration from RDS, Aurora, or on-premises sources to Azure SQL Database, Azure SQL Managed Instance, Azure Database for MySQL/PostgreSQL, and SQL Server on Azure VMs (for MySQL, PostgreSQL, and SQL Server). Supports online and offline migrations; run pre-migration assessments and schema validation.
Azure Data Factory: Orchestrates data movement from DynamoDB, Redshift, or S3 to Azure Cosmos DB or Synapse. Use mapping data flows for transformations and data cleansing.
MongoDB migrations: Use the online migration utility designed for medium to large-scale migrations to Azure DocumentDB. Ensure schema compatibility and validate performance benchmarks before cutover.
Cassandra migrations: Use a Cassandra hybrid cluster or dual-write proxy for Azure Managed Instance for Apache Cassandra. Validate schema compatibility and test the migration in a sandbox environment.
Offline transfers: For very large datasets, use Azure Data Box for secure physical migration. Plan logistics and security for device handling.
Migration best practices: Schedule the migration during a maintenance window, validate data integrity post-migration, and perform cutover only after successful data validation and verification.

Final readiness before marketplace listing

Validate performance: Benchmark with real data and confirm the chosen SKUs deliver the required throughput and latency. Test application functionality under expected load and validate query performance for all critical scenarios.
Lock down security: Ensure RBAC roles, private endpoints, and encryption meet compliance requirements. Review audit logs, enable threat detection, and verify access controls for all database and storage resources.
Control costs: Verify that autoscaling, reserved capacity, and cost alerts are active. Review storage and backup policies, and set up budget alerts for ongoing cost control.
Enable monitoring: Set up dashboards for query performance, latency, and capacity. Configure alerts for failures, anomalies, and capacity thresholds. Monitor with Azure Monitor and Log Analytics for real-time operational insights.
Documentation and support: Update migration runbooks, operational guides, troubleshooting documentation, and escalation contacts for post-migration support.

Key Resources
SaaS Workloads - Microsoft Azure Well-Architected Framework | Microsoft Learn
Metered billing for SaaS offers in Partner Center
Create plans for a SaaS offer in Microsoft Marketplace
Get over $126K USD in benefits and technical consultations to help you replicate and publish your app with ISV Success
Maximize your momentum with step-by-step guidance to publish and grow your app with App Advisor

Blank screen in "new" AI Foundry and unable to go back to "classic" view
Hi, I toggled on the "new" AI foundry experience. But it just gives me a blank page. I cannot disable it and go back to the old view though. When I go to https://ai.azure.com/ it takes me to the "nextgen" page (e.g. https://ai.azure.com/nextgen/?wsid=%2Fsubscriptions...). Removing "nextgen" from the URL just redirects me back to the "nextgen" version. This also happens to my colleague, and we can't use the Foundry. Can you help?

Unleashing New Business Opportunities for Microsoft Partners with PostgreSQL & MySQL on Azure
The latest innovations announced at Microsoft Ignite 2025 for PostgreSQL and MySQL running on Azure are more than just technical upgrades; they're a launchpad for new business growth, deeper customer engagement, and accelerated digital transformation. Here's how these advancements can help you deliver greater value and unlock new opportunities for your clients.

1. Introducing Azure HorizonDB: Built for Performance and AI Workloads

We're excited to unveil Azure HorizonDB in private preview, a new, fully managed PostgreSQL service engineered for businesses and developers alike. HorizonDB is designed for ultra-low latency, high read scale, and built-in AI capabilities, offering seamless scaling up to 192 virtual cores and 128 TB of storage. Deep integration with developer tools, including GitHub Copilot, delivers performance, resilience, and simplicity at any scale. With HorizonDB, teams can:
Build AI apps at scale using advanced DiskANN vector indexing, pre-provisioned AI models, semantic search, and unified support for both relational and graph data.
Accelerate app development with built-in extensions, including the PostgreSQL extension for Visual Studio Code integrated with GitHub Copilot. Copilot in VS Code is context-aware for PostgreSQL and enables one-click performance debugging.
Unlock data insights through deep integrations with Microsoft Fabric and Microsoft Foundry.
Expect reliability with enterprise-ready features from day one, including Entra ID integration, Private Link networking, and Azure Defender for Cloud.

Business Opportunity: Position your practice as an early adopter and expert in next-generation database solutions by introducing customers to Azure HorizonDB. Use this conversation to offer migration, modernization, and AI-powered application development services leveraging Azure Database for PostgreSQL, with future migrations to HorizonDB. Help clients build resilient, high-performance, and intelligent data platforms, driving new revenue streams and deeper customer engagement.

2. Modernize Data Infrastructure with Limitless Scale and Performance

Azure's new Elastic Clusters in Azure Database for PostgreSQL enable organizations to scale their databases horizontally across multiple nodes, supporting virtually unlimited throughput and storage. This means you can help clients build and grow multi-tenant SaaS applications and large-scale analytics solutions without the complexity of manual sharding or the limitations of legacy infrastructure. Azure's managed service automates shard management, tenant isolation, and cross-node query coordination, freeing up your teams to focus on innovation instead of administration.

Business Opportunity: Position your practice as the go-to partner for scalable, future-proof data platforms. Offer migration services, architecture consulting, and managed solutions that leverage Azure's unique scale-out capabilities.

3. Accelerate Innovation with AI-Ready Databases

Azure is leading the way in AI integration for open-source databases. With the PostgreSQL extension for Visual Studio Code and native Microsoft Foundry support, developers can build smarter apps and AI agents leveraging advanced AI capabilities directly in the database. Features like natural language querying, vector search, and seamless Copilot integration mean your clients can unlock new insights and automate processes faster than ever.
Business Opportunity: Expand your offerings to include AI-powered analytics, intelligent agent development, and custom Copilot solutions. Help organizations harness their data for real-time decision-making and enhanced customer experiences.

4. Simplify and Accelerate Migrations from Legacy Systems

The new AI-assisted Oracle to PostgreSQL migration tool dramatically reduces the effort and risk of moving off expensive, proprietary databases. Integrated into the PostgreSQL extension for VS Code, it automates schema and code conversion, provides inline AI explanations, and ensures secure, context-aware migrations.

Business Opportunity: Lead migration projects that deliver rapid ROI. Offer assessment, planning, and execution services to help clients escape legacy costs and embrace open-source flexibility on Azure.

5. Enable Seamless Analytics and Real-Time Insights

With support for Parquet in the Azure storage extension for PostgreSQL and Fabric zero-ETL mirroring for Azure Database for MySQL and Azure Database for PostgreSQL, Azure is bridging operational databases and analytics platforms.

Business Opportunity: Build solutions that unify data estates, streamline analytics workflows, and deliver actionable intelligence. Position your team as experts in data integration and real-time analytics.

6. Drive Industry-Specific Transformation

Ignite 2025 showcases real-world success stories from industries like healthcare (Apollo Hospitals), automotive (GM), and finance (Nasdaq), demonstrating how Azure's open-source databases power resilient, scalable, and AI-driven solutions.

Business Opportunity: Use these case studies to inspire clients in regulated or complex sectors. Offer tailored solutions that meet strict compliance, security, and performance requirements.

Why Partners Win with Azure's Latest Innovations

Faster time-to-value: Help clients adopt the latest tech with minimal downtime and risk.
Expanded service portfolio: From migration to AI, analytics to managed services, the new capabilities open doors to new revenue streams.
Trusted platform: Azure's enterprise-grade security, compliance, and high availability mean you can deliver solutions with confidence.

Ready to help your customers achieve more? Dive deeper into the Ignite 2025 announcements and start building the next generation of intelligent, scalable, and AI-powered solutions on Microsoft Azure. Learn more here: https://ignite.microsoft.com/en-US/home