Security, Compliance, and Identity Blog

Report manual data lineage with a few clicks in Microsoft Purview

ChandruS
Microsoft
Oct 18, 2022

Data lineage lets data citizens trust enterprise data for consumption and accelerates their data journey. It helps data consumers understand the end-to-end upstream and downstream journey of data: where it comes from, where it is going, and everything in between. Microsoft Purview automates lineage capture from several data systems and continues to expand coverage to others. However, many data systems that move data in an enterprise are not yet covered by Microsoft Purview connectors, which leaves gaps in data lineage collection.

 

With manual lineage, data owners can report data lineage themselves using the asset curation experience. Manual lineage bridges gaps in end-to-end lineage coverage where Microsoft Purview does not natively support automated lineage for a data system. It is a platform capability of Microsoft Purview with a simplified user experience for adding, editing, or removing lineage relationships manually.

 

Use cases

  1. A data engineer owns the “Customer” dataset in ADLS Gen2, which is scanned into the Purview Data Map. Every week, a cleanup activity runs on “Customer” using a script that reads data from the SQL table “Address”. Microsoft Purview lineage does not currently capture this SQL table dependency automatically. With manual lineage, the data engineer can document the lineage between the “Customer” and “Address” datasets in a few clicks.
  2. A data scientist uses an ad hoc script to load data from one table in a SQL database into another table for training ML models. While waiting for results and before the ML pipeline is fully automated, the data scientist wants to document the lineage between the two tables in SQL Server. Manual lineage lets the data scientist report this data movement in Microsoft Purview in a few clicks.

 

Search for your asset and add manual lineage

  1. On the asset details page, select Edit, then go to the Lineage tab of the current asset in the catalog.
  2. Select Add Lineage in the list panel to add a row where you can select an asset for manual lineage.
  3. Select the relationship type: to report upstream lineage, select “Consumes”; to report downstream lineage, select “Produces”.
  4. Select the Asset dropdown to find the asset in the suggested list, or select View more to search the catalog.
  5. Use the asset picker to select the asset and report it in the lineage canvas.
  6. Select the Save button to exit edit mode.
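
To make the direction of the two relationship types concrete, here is a minimal Python sketch of how “Consumes” and “Produces” could map to upstream and downstream edges. It is an illustration only: the `LineageGraph` class and its methods are hypothetical names invented for this example and are not part of any Microsoft Purview API or SDK. The asset names come from the first use case above.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

# Relationship type labels, as named in the steps above.
CONSUMES = "Consumes"  # the selected asset is upstream of the current asset
PRODUCES = "Produces"  # the selected asset is downstream of the current asset


@dataclass
class LineageGraph:
    """Hypothetical in-memory lineage model; this is not a Microsoft Purview API."""

    # Each edge is (source, target), meaning data flows from source to target.
    edges: Set[Tuple[str, str]] = field(default_factory=set)

    def _edge(self, current: str, relationship: str, selected: str) -> Tuple[str, str]:
        # Translate a relationship chosen on the current asset into a directed edge.
        if relationship == CONSUMES:
            return (selected, current)
        if relationship == PRODUCES:
            return (current, selected)
        raise ValueError(f"unknown relationship type: {relationship!r}")

    def add(self, current: str, relationship: str, selected: str) -> None:
        self.edges.add(self._edge(current, relationship, selected))

    def remove(self, current: str, relationship: str, selected: str) -> None:
        # discard() is a no-op if the edge was never recorded.
        self.edges.discard(self._edge(current, relationship, selected))

    def upstream(self, asset: str) -> List[str]:
        return sorted(src for src, dst in self.edges if dst == asset)

    def downstream(self, asset: str) -> List[str]:
        return sorted(dst for src, dst in self.edges if src == asset)


# Use case 1: "Customer" is cleaned up by a script that reads from "Address".
graph = LineageGraph()
graph.add("Customer", CONSUMES, "Address")
print(graph.upstream("Customer"))   # ['Address']
print(graph.downstream("Address"))  # ['Customer']
```

In this sketch, choosing “Consumes” on “Customer” and selecting “Address” records the same edge as choosing “Produces” on “Address” and selecting “Customer”, so either asset can be the starting point for reporting the relationship.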

 

Get Started

  • Quickly and easily create a Microsoft Purview account
  • Search for your asset and add manual lineage (the Data Curator role is required to add manual lineage)

Demo of Manual data lineage

 

Updated Oct 17, 2022
Version 1.0
  • anietova

    Nice feature! Waiting for PBI datasets support. As far as I know, it is currently not possible to link PBI datasets to Azure SQL assets, so we can't have full e2e lineage.