Blog Post

Microsoft Security Community Blog
8 MIN READ

Step-by-Step Guide: Integrating Microsoft Purview with Azure Databricks and Microsoft Fabric

Rafia_Aqil's avatar
Rafia_Aqil
Icon for Microsoft rankMicrosoft
Oct 07, 2025

Co-Authored By: aryananmol​, laurenkirkwood​ and mmanley​ 

 

This article provides practical guidance on setup, cost considerations, and integration steps for Azure Databricks and Microsoft Fabric to help organizations plan for building a strong data governance framework. It outlines how Microsoft Purview can unify governance efforts across cloud platforms, enabling consistent policy enforcement, metadata management, and lineage tracking. The content is tailored for architects and data leaders seeking to execute governance in scalable, hybrid environments. Note: this article focuses mainly on Data Governance features for Microsoft Purview 

Why Microsoft Purview 

Microsoft Purview enables organizations to discover, catalog, and manage data across environments with clarity and control. Automated scanning and classification build a unified view of your data estate enriched with metadata, lineage, and sensitivity labels, and the Unified Catalog gives business-friendly search and governance constructs like domains, data products, glossary terms, and data quality. 

NoteMicrosoft Purview Unified Catalog is being rolled out globally, with availability across multiple Microsoft Entra tenant regions; this page lists supported regions, availability dates, and deployment plans for the Unified Catalog service: Unified Catalog Supported Regions.  

Understanding Data Governance Features Cost in Purview  

Under the classic model: Data Map (Classic), users pay for an “always-on” Data Map capacity and scanning compute. In the new model, those infrastructure costs are subsumed into the consumption meters – meaning there are no direct charges for metadata storage or scanning jobs when using the Unified Catalog (Enterprise tier). Essentially, Microsoft stopped billing separately for the underlying data map and scan vCore-hours once you opt into the new model or start fresh with it. You only incur charges when you govern assets or run data processing tasks. This makes costs more predictable and tied to governance value: you can scan as much as needed to populate the catalog without worrying about scan fees and then pay only for the assets you actively manage (“govern”) and any data quality processes you execute. In summary, Purview Enterprise’s pricing is usage-based and divided into two primary areas: (1) Governed Assets and (2) Data Processing (DGPUs). 

Plan for Governance  

Microsoft Purview’s data governance framework is built on two core components: Data Map and Unified Catalog.  

  1. The Data Map acts as the technical foundation, storing metadata about assets discovered through scans across your data estate. It inventories sources and organizes them into collections and domains for technical administration.  
  2. The Unified Catalog sits on top as the business-facing layer, leveraging the Data Map’s metadata to create a curated marketplace of data products, glossary terms, and governance domains for data consumers and stewards. 

Before onboarding sources, align Unified Catalog (business-facing) and Data Map (technical inventory) and define roles, domains, and collections so ownership and access boundaries are clear. Here is a documentation that covers roles and permissions in Purview: Permissions in the Microsoft Purview portal | Microsoft Learn.  

Figure 1: The imageabove helps understand therelationship between the primary data governance solutions, Unified Catalog and Data Map, and the permissions granted by the roles for each solution.

Considerations and Steps for Setting up Purview 

Steps for Setting up Purview: 

  • Step 1: Create a Purview Account. In the Azure Portal, use the search bar at the top to navigate to Microsoft Purview Accounts. Once there, click “Create”. This will take you to the following screen:Figure 2: Fill out Subscription, Resource Group, and Instance Details, including a name for your Microsoft Purview account and its location.
  • Step 2: Click Next: Configuration and follow the Wizard, completing the necessary fields, including information on Networking, Configurations, and Tags. Then click Review + Create to create your Purview account.  

Consideration: Private networking: Use Private Endpoints to secure Unified Catalog/Data Map access and scan traffic; follow the new platform private endpoints guidance in the Microsoft Purview portal or migrate classic endpoints. Once your Purview Account is created, you’ll want to set up and manage your organization’s governance strategy to ensure that your data is classified and managed according to the specific lifecycle guidelines you set.  

Note: Follow the steps in this guide to set up Microsoft Purview Data Lifecycle Management: Data retention policy, labeling, and records management.  

Data Map Best Practices  

Design your collections hierarchy to align with organizational strategy—such as by geography, business function, or data domain. Register each data source only once per Purview account to avoid conflicting access controls. If multiple teams consume the same source, register it at a parent collection and create scans under subcollections for visibility. 

Figure 3: The imageaboveillustrates a recommended approach for structuring your Purview DataMap.
Why Collection Structure Matters 

A well-structured Data Map strategy, including a clearly defined hierarchy of collections and domains, is critical because the Data Map serves as the metadata backbone for Microsoft Purview. It underpins the Unified Catalog, enabling consistent governance, role-based access control, and discoverability across the enterprise. Designing this hierarchy thoughtfully ensures scalability, simplifies permissions management, and provides a solid foundation for implementing enterprise-wide data governance. 

 

Purview Integration with Azure Databricks   

Databricks Workspace Structure 

In Azure Databricks, each region supports a single Unity Catalog metastore, which is shared across all workspaces within that region. This centralized architecture enables consistent data governance, simplifies access control, and facilitates seamless data sharing across teams. As an administrator, you can scan one workspace in the region using Microsoft Purview to discover and classify data managed by Unity Catalog, since the metastore governs all associated workspaces in a region. 

If your organization operates across multiple regions and utilizes cross-region data sharing, please review the consideration and workaround outlined below to ensure proper configuration and governance. Follow pre-requisite requirements here, before you register your workspace: Prerequisites to Connect and manage Azure Databricks Unity Catalog in Microsoft Purview. 

Steps to Register Databricks Workspace 

    • Step 1In the Microsoft Purview portal, navigate to the Data Map section from the left-hand menu. Select Data Sources. Click on Register to begin the process of adding your Databricks workspace. 
    • Step 5: For Scan trigger, choose whether to set up a schedule or run the scan once according to your business needs.  
    • Step 6: From the left pane, select Data Map and select your data source for your workspace. You can view a list of existing scans on that data source under Recent scans, or you can view all scans on the Scans tab. Review further options here: Manage and Review your Scans. You can review your scanned data sources, history and details here:  Navigate to scan run history for a given scan. 

 

Limitation: The “Azure Databricks Unity Catalog” data source in Microsoft Purview does not currently support connection via Managed Vnet. As a workaround, the product team recommends using the “Azure Databricks Unity Catalog” source in combination with a Self-hosted Integration Runtime (SHIR) to enable scanning and metadata ingestion. You can find setup guidance here:  

 Scoped scan support for Unity Catalog is expected to enter private preview soon. You can sign up here: https://aka.ms/dbxpreview. 

 

ConsiderationsIf you have delta-shared Databricks-to-Databricks workspaces, you may have duplication in your data assets if you are scanning both Workspaces. The workaround for this scenario is as you add tables/data assets to a Data Product for Governance in Microsoft Purview, you can identify the duplicated tables/data assets using their Fully Qualified Name (FQN). To make identification easier: 

  1. Look for the keyword “sharing” in the FQN, which indicates a Delta-Shared table.   
  1. You can also apply tags to these tables for quicker filtering and selection. 

 The screenshot highlights how the FQN appears in the interface, helping you confidently identify and manage your data assets. 

 

Purview Integration with Microsoft Fabric  

Understanding Fabric Integration: 

Connect Cross-Tenant: This refers to integrating Microsoft Fabric resources across different Microsoft Entra tenants. It enables organizations to share data, reports, and workloads securely between separate tenants, often used in multi-organization collaborations or partner ecosystems. Key considerations include authentication, data governance, and compliance with cross-tenant policies.  

Connect In-Same-Tenant: This involves connecting Fabric resources within the same Microsoft Entra tenant. It simplifies integration by leveraging shared identity and governance models, allowing seamless access to data, reports, and pipelines across different workspaces or departments under the same organizational umbrella. 

Requirements: 

    • An Azure account with an active subscription. Create an account for free. 
    • An active Microsoft Purview account. 
    • Authentication is supported via: Managed Identity. Delegated Authentication and Service Principal. 

Steps to Register Fabric Tenant 

    • Step 1: In the Microsoft Purview portal, navigate to the Data Map section from the left-hand menu. Select Data Sources. Click on Register to begin the process of adding your Fabric Tenant (which also includes PowerBI). 
    • Step 2: Add in Data Source Name, keep Tenant ID as default (auto-populated). Microsoft Fabric and Microsoft Purview should be in the same tenant. 
    • Step 3: Enter in Scan name, enable/disable scanning for personal workspaces. You will notice under Credentials automatically created identity for authenticating Purview account. 

Note: If your Purview is behind Private Network, follow the guidelines here: Connect to your Microsoft Fabric tenant in same tenant as Microsoft Purview.  

    • Step 5: You will need to create a group, add the Purviews' managed identity to the group and add the group under “Service Principals can access read-only admin APIs” section of your tenant settings inside Microsoft Fabric 
    • Step 6Test your connection and setup scope for your scan. Select the required workspaces, click continue and automate a scan trigger.  
    • Step 7From the left pane, select Data Map and select your data source for your workspace. You can view a list of existing scans on that data source under Recent scans, or you can view all scans on the Scans tab. Review further options here: Manage and Review your Scans. You can review your scanned data sources, history and details here:  Navigate to scan run history for a given scan. 

Why Customers Love Purview 

  • Kern County unified its approach to securing and governing data with Microsoft Purview, ensuring consistent compliance and streamlined data management across departments. 
  • EY accelerated secure AI development by leveraging the Microsoft Purview SDK, enabling robust data governance and privacy controls for advanced analytics and AI initiatives. 
  • Prince William County Public Schools created a more cyber-safe classroom environment with Microsoft Purview, protecting sensitive student information while supporting digital learning. 
  • FSA (Food Standards Agency) helps keep the UK food supply safe using Microsoft Purview Records Management, ensuring regulatory compliance and safeguarding critical data assets. 

Conclusion  

Purview’s Unified Catalog centralizes governance across Discovery, Catalog Management, and Health ManagementThe Governance features in Purview allow organizations to confidently answer critical questions: What data do we have? Where did it come from? Who is responsible for it? Is it secure and compliant? Can we trust its quality?  Microsoft Purview, when integrated with Azure Databricks and Microsoft Fabric, provides a unified approach to cataloging, classifying, and governing data across diverse environments. By leveraging Purview’s Unified Catalog, Data Map, and advanced governance features, organizations can achieve end-to-end visibility, enforce consistent policies, and improve data quality. You might ask, why does data quality matter? Well, in today’s world, data is the new gold. 

 

References  

 

 

Updated Oct 07, 2025
Version 2.0
No CommentsBe the first to comment