FastTrack for Azure

Exploring the Relationship Between Microsoft Fabric and Microsoft Purview: What You Need to Know

Eduardo_Noriega
Jul 01, 2024

 

Microsoft Purview is a data governance solution designed to help organizations discover, catalog, and manage their data assets. It provides a unified view of an organization's data landscape, regardless of where the data resides: on-premises, in the cloud, or in SaaS applications. Purview scans and catalogs metadata from various data sources, including databases, data lakes, and file systems, to create a comprehensive data map. Purview also includes connectors to non-Microsoft sources such as Oracle, Teradata, SAP, and Google BigQuery.

Microsoft Fabric is an end-to-end analytics and data platform designed for enterprises that require a unified solution encompassing data movement, processing, ingestion, transformation, real-time event routing, and report building.

While they are two distinct offerings from Microsoft, they work well together within the Microsoft ecosystem.

In this article, we look at how these two Microsoft solutions interact, what distinguishes them, and how they can be used together for data governance.

We begin by describing “Live View”, a Purview feature that allows users to explore Fabric items even when Fabric is neither registered as a data source nor scanned.

We demonstrate the steps to register and scan a Fabric tenant, providing examples of actions that can be performed on scanned Fabric data assets.

Additionally, we cover how to leverage the data governance capabilities within Microsoft Fabric, including an explanation of the Purview Hub integrated service within Fabric.

 

Live View of Fabric items in Microsoft Purview.

 

“Live View” is one of the simplest features you can use in Purview to govern your organization’s spectrum of data. It lets you find Fabric items from Purview and open them in Fabric, without having to scan Fabric as a data source in Purview.

Among other functionalities, in Purview Data Catalog you can use data search to get a live view of multiple data sources, including Microsoft Fabric items and workspaces.

Go to the new Microsoft Purview Portal: https://purview.microsoft.com

Select the Data Catalog solution, and then select Data Search.

 

 

Among the listed sources, you can see the option “Microsoft Fabric”. Selecting it shows the Fabric workspaces you have access to:

 

 

Within a selected workspace, you can list its items by selecting an item type:

 

 

By selecting a specific item, you can see the item’s details or view the item in Fabric.

 

 

However, you can use more advanced functionality for the governance of your Fabric items by registering and scanning Fabric as a data source. The scan feeds the core Microsoft Purview solution, known as the Data Map.

 

Microsoft Purview Data Map.

 

The Data Map is a platform as a service (PaaS) component of Purview that keeps an up-to-date map of assets and their metadata across your data estate.

First, you define your organization's Data Map by creating collections.

By using collections, you can manage and maintain data sources, scans, and assets in a hierarchy instead of a flat structure. Collections allow you to build a custom hierarchical model of your data landscape based on your organization's needs.

For future scalability, we recommend that you create a top-level collection for your organization below the root collection (Purview defines a root collection by default with the same name as your Microsoft Purview account name).

From the top-level collection, organize data sources, distribute assets, and run scans based on your business requirements, geographical distribution of data, and data management teams, departments, or business functions.
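
To make the recommended layout concrete, here is a minimal Python sketch that models a collection hierarchy as a simple tree. It is only an illustration of the structure described above, not a Purview API: the `Collection` class and the names `contoso-purview`, `Contoso`, and `Research` are hypothetical, while `Medicines` is the collection used in this article's example.

```python
from dataclasses import dataclass, field
from typing import Optional


# eq=False: compare nodes by identity, because parent and children
# reference each other and field-by-field comparison would recurse.
@dataclass(eq=False)
class Collection:
    """One node in a hypothetical collection hierarchy (not a Purview API)."""
    name: str
    parent: Optional["Collection"] = None
    children: list["Collection"] = field(default_factory=list)

    def add_child(self, name: str) -> "Collection":
        """Create a sub-collection directly below this collection."""
        child = Collection(name=name, parent=self)
        self.children.append(child)
        return child

    def path(self) -> str:
        """Return the full path from the root collection to this one."""
        parts = []
        node: Optional["Collection"] = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return " / ".join(reversed(parts))


# Root collection: created by default, named after the Purview account (assumed name).
root = Collection("contoso-purview")
# Top-level collection for the organization, directly below the root.
org = root.add_child("Contoso")
# Collections organized by business function below the top-level collection.
medicines = org.add_child("Medicines")
research = org.add_child("Research")

print(medicines.path())  # contoso-purview / Contoso / Medicines
```

Keeping a single top-level collection below the root means new departments or regions can later be added as siblings of `Medicines` without restructuring the root.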

To learn how to create collections, refer to How to manage domains and collections | Microsoft Learn.

Here is an example of a Data Map:

 

 

Using the Data Map, you can register a source for each collection, which later scans will populate. Simply click on the highlighted icon in the figure above to register a source associated with that collection.

In the above Data Map, a registered Fabric source is shown below the Collection named “Medicines”.

In Microsoft Purview, you can scan various types of data sources and monitor the scan status over time. Once a scan succeeds, it populates the data map and data catalog.

You can also move data assets from one collection to another, either manually or automatically through the scanning and ingestion features.

You can register various data sources, such as Azure SQL Database, Azure Data Lake Storage, and other supported data sources, to a single collection to feed data assets into that collection. However, a data source belongs to only one collection, and by design, you can't register a data source multiple times in a single Microsoft Purview account.
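
The registration rule above can be sketched as a small Python model. This is a toy illustration under our own assumptions, not how Purview is implemented: the `DataSourceRegistry` class and the source names `Fabric-tenant` and `AzureSQL-orders` are hypothetical.

```python
class DataSourceRegistry:
    """Toy model: each data source is registered once, in exactly one collection."""

    def __init__(self) -> None:
        # Maps data source name -> the single collection it belongs to.
        self._collection_of: dict[str, str] = {}

    def register(self, source: str, collection: str) -> None:
        """Register a source, rejecting a second registration of the same source."""
        if source in self._collection_of:
            raise ValueError(
                f"{source!r} is already registered in "
                f"{self._collection_of[source]!r}"
            )
        self._collection_of[source] = collection

    def sources_in(self, collection: str) -> list[str]:
        """List the sources registered to a collection, sorted by name."""
        return sorted(s for s, c in self._collection_of.items() if c == collection)


registry = DataSourceRegistry()
registry.register("Fabric-tenant", "Medicines")
registry.register("AzureSQL-orders", "Medicines")  # several sources, one collection

try:
    registry.register("Fabric-tenant", "Research")  # same source, second collection
except ValueError as err:
    rejected = str(err)
    print(rejected)
```

Several sources can feed one collection, but the second attempt to register `Fabric-tenant` is rejected, which mirrors the one-registration-per-account constraint.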

 

Register a Fabric tenant in Microsoft Purview.

 

 

Select the Data Map solution and then go to Data Sources.

You can register a data source by using the icon option in the Data Map, as shown before, or by using the Register option in the Data Sources submenu:

 

 

Press “Register” and select “Fabric (includes Power BI)” from the list of available data sources. The following screen appears:

 

 

After pressing “Register”, you can see Fabric registered as a source in the Map View or in the Table View.

 

 

Scan a Fabric tenant in Purview.

 

After the data source is registered, you are ready to scan it and feed the collections in your data map.

In the previous section of this post, we registered Fabric as a data source using the default tenant ID (by default, the system will find the Fabric tenant that exists in the same Microsoft Entra tenant).

In your Microsoft Entra tenant, create a security group and add the Microsoft Purview account's managed identity (MSI) as a member of this group. You can read further details in Connect to and manage a Power BI tenant (same tenant) | Microsoft Learn.

You can also connect to Fabric in a different tenant, along with other variants explained in Connect to and manage a Microsoft Fabric tenant (cross-tenant) | Microsoft Learn.

In your Fabric tenant, go to Settings and select Admin Portal.

You must be a Fabric administrator to see Tenant Settings in the Admin Portal.

Enable the following tenant settings, as explained in Admin API admin settings - Microsoft Fabric | Microsoft Learn

 

 

You must enable the three Admin API settings for the security group created previously:

 

 

Now return to Purview and, on the registered data source, select “New Scan”, either in the Map View or in the Table View of the Data Map.

 

 

You must give the scan a name and select one collection to serve as the destination of the scanning process.

 

 

One scan has only one target collection.

You can choose which domain to use, provided you have the appropriate permissions.

After pressing “Continue”, you can either schedule the scan or run it only once.

Pressing “Continue” again lets you Save and Run, and this action starts the scanning process.
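
The choices made in the scan wizard can be summarized in a short Python sketch. It only models the constraints described in the steps above (a name, one data source, exactly one target collection, and a trigger that is either one-time or scheduled). The `ScanDefinition` and `Trigger` names, and the example values, are hypothetical and do not correspond to a Purview API.

```python
from dataclasses import dataclass
from enum import Enum


class Trigger(Enum):
    """How the scan is started: a single run or a recurring schedule."""
    ONCE = "once"
    RECURRING = "recurring"


@dataclass(frozen=True)
class ScanDefinition:
    """Hypothetical summary of what the scan wizard asks for."""
    name: str
    data_source: str
    target_collection: str  # a single value: one scan has only one target collection
    trigger: Trigger = Trigger.ONCE

    def __post_init__(self) -> None:
        # Reject definitions that skip a mandatory wizard field.
        if not self.name.strip():
            raise ValueError("a scan needs a name")
        if not self.target_collection.strip():
            raise ValueError("a scan needs exactly one target collection")


scan = ScanDefinition(
    name="fabric-weekly",            # assumed scan name
    data_source="Fabric-tenant",     # assumed name of the registered source
    target_collection="Medicines",   # collection from the article's example
    trigger=Trigger.RECURRING,
)
print(scan.name, "->", scan.target_collection, f"({scan.trigger.value})")
```

Modeling `target_collection` as a single string rather than a list makes the one-scan, one-collection rule explicit in the type itself.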

After scanning, you will see the assets from Fabric in your previously created collection:

 

 

You can see the inventory of the scanned assets:

 

 

In the Data Catalog, selecting an asset lets you open it in Fabric, curate it, and view its data lineage.

The next two figures show an asset curated after scanning a Fabric data source into a previously created collection, and the data lineage of another asset.

 

 

 

In Overview, we can classify the asset using existing classifications, either system or custom. Both kinds of classification are managed in the Data Map, under Annotation Management.

 

 

For now, it’s not possible to scope your scan to specific subsets of data for Fabric items, nor to apply scan rule sets. For details, see Connect to and manage your Microsoft Fabric tenant | Microsoft Learn.

Now we will examine the interaction in reverse: How can we use Purview to improve data governance inside Fabric?

 

Implementing Data Governance in Microsoft Fabric with Purview Hub.

 

Fabric allows users to manage and govern their data estate using built-in features such as Domains, Endorsement, Data Lineage, various security management tools, and the application of Sensitivity Labels.

Additionally, users can take advantage of Purview Hub, which is part of the Purview ecosystem.

The Purview Hub is a centralized place in Fabric where you can manage and govern your data assets across different services, providing enhanced governance capabilities.

Purview Hub provides a view for Fabric administrators and another view for non-admin Fabric users, as explained at The Microsoft Purview hub in Microsoft Fabric - Microsoft Fabric | Microsoft Learn

Fabric administrators can see insights related to their organization’s entire Fabric data estate. They also see links to capabilities in the Microsoft Purview governance and compliance portals to help them further analyze and manage governance of their organization's Fabric data.

Other users only see insights related to their own Fabric content and links to capabilities in the Microsoft Purview governance portal.

In your Fabric tenant, go to Settings and select Microsoft Purview Hub.

 

 

You will see a screen like this:

 

 

You can go directly to Microsoft Purview by selecting “Get started with Microsoft Purview” or “Data Catalog”.

You can see a dashboard with the total number of workspaces and items in this Fabric tenant, along with several charts of your data items, grouped by workspace and by type.
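
The kind of aggregation behind such a dashboard can be sketched in a few lines of Python. The inventory rows below are invented sample data, and this is only an illustration of grouping items by workspace and by type, not how Purview Hub computes its insights.

```python
from collections import Counter

# Hypothetical inventory rows: (workspace, item type, item name).
inventory = [
    ("Sales", "Lakehouse", "SalesLakehouse"),
    ("Sales", "Report", "SalesReport"),
    ("Supply", "Warehouse", "SupplyWarehouse"),
    ("Supply", "Report", "StockReport"),
    ("Supply", "Report", "DeliveryReport"),
]

# Headline totals shown at the top of the dashboard.
total_workspaces = len({workspace for workspace, _, _ in inventory})
total_items = len(inventory)

# Breakdowns that would feed the charts.
items_by_workspace = Counter(workspace for workspace, _, _ in inventory)
items_by_type = Counter(item_type for _, item_type, _ in inventory)

print(total_workspaces, total_items)
print(dict(items_by_workspace))
print(dict(items_by_type))
```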

If you select “Open full Report”, this action automatically generates a Purview Hub Report with the pages: Overview, Sensitivity Report, Endorse, Inventory, Sensitivity Page and Items Page.

The next figure shows the Inventory Report.

 

 

Summary.

Organizations may choose to develop or identify the data governance tools and technologies right for their current and future needs.

Microsoft Purview provides a unified data governance solution to help manage and govern your on-premises, multi-cloud, and software as a service (SaaS) data. Easily create a holistic, up-to-date map of your data landscape with automated data discovery, sensitive data classification, and end-to-end data lineage. Enable data consumers to access valuable, trustworthy data management.

Solutions in Microsoft Fabric manage large amounts of data distributed across many source types, all of which need data governance. The integration between the Fabric and Purview platforms provides it.

Live View is a Purview feature that allows users to explore Fabric items even when Fabric is neither registered as a data source nor scanned.

By following the steps to register and scan a Fabric tenant, you can use Purview to manage all the data assets in your Fabric solutions.

On the other hand, we can take advantage of the data governance capabilities provided by Purview Hub within Microsoft Fabric.

Purview Hub is part of the broader Purview ecosystem, which provides many comprehensive data governance solutions.

 

Learn more:

Introduction to Microsoft Purview governance solutions | Microsoft Learn

How to manage data sources in the data map | Microsoft Learn

How to manage domains and collections | Microsoft Learn

Microsoft Purview collections architecture and best practices | Microsoft Learn

Connect to and manage your Microsoft Fabric tenant | Microsoft Learn

The Microsoft Purview hub in Microsoft Fabric - Microsoft Fabric | Microsoft Learn

Governance and compliance in Microsoft Fabric - Microsoft Fabric | Microsoft Learn

 

Updated Jun 27, 2024
Version 1.0
  • Christoph1250

    Great study, Eduardo_Noriega, congrats.

     

    May I ask a question about the Data Map, please?

     

    Imagine that on my MS Fabric tenant I manage 2 domains of workspaces (with SQL warehouses, DFGEN2 pipelines, and PBI semantic models inside):

     

    Sales ( 5 WS)

    Supply (15 WS)

     

    How can I fill 2 Data Map collections via 2 scans?

    Exp : Domain Fnac --> collection is sales

              Domain Fnac --> collection is supply

    Remark: the same data source (same tenant ID) is used during the 2 scans.

    1) By specifying the Fabric domain (of workspaces) we want to scan?

    or 2) By playing with workspace access during scans (my "sales" scan --> the credential is a particular service principal --> granted access to the 5 workspaces of the Fabric domain Sales)?

     

    I hope ;>) that solution 1) is possible.