Data lineage lets data citizens trust enterprise data for consumption and accelerates their data journey. It helps data consumers understand the end-to-end upstream and downstream journey of data: where the data comes from, where it goes, and everything in between. Microsoft Purview captures data lineage automatically from several data systems and continues to expand coverage to many others. However, many enterprise systems that move data do not yet have automated Microsoft Purview connectors, which leaves gaps in data lineage collection.
With manual lineage, data owners can report data lineage themselves using the asset curation experiences. Manual lineage bridges gaps in end-to-end lineage coverage where automated lineage for a data system is not natively supported in Microsoft Purview. It is a platform capability of Microsoft Purview with a simplified user experience for adding, editing, or removing lineage relationships manually.
- A data engineer owns the “Customer” dataset in ADLS Gen2, which has been scanned into the Microsoft Purview Data Map. Every week, a script runs a cleanup activity on the “Customer” dataset using data read from the SQL table “Address”. Microsoft Purview lineage does not currently capture this SQL table dependency automatically. With manual lineage, the data engineer can document the lineage between the “Customer” and “Address” datasets in a few clicks.
- A data scientist uses an ad hoc script to load data from one table in a SQL database into another table for training ML models. While waiting for results and working toward a fully automated ML pipeline, the data scientist wants to document the lineage of data between the two tables in SQL Server. The data scientist can use manual lineage to report this data movement in a few clicks in Microsoft Purview.
Search for your asset and add manual lineage
- Select Edit on the asset details page and go to the Lineage tab of the current asset in the catalog.
- Select Add Lineage in the list panel to add a row for selecting the asset you want to report manual lineage for.
- Select the relationship type: to report upstream lineage, select “Consumes”; to report downstream lineage, select “Produces”.
- Select the Asset dropdown to find the asset in the suggested list, or select View more to search the catalog.
- Use the asset picker experience to select the asset and report it in the lineage canvas.
- Select the Save button to exit edit mode.
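To make the direction of the two relationship types concrete, here is a minimal Python sketch that models them as edges in a small in-memory graph. It is an illustration only, not a Microsoft Purview API: the `LineageGraph` class and its methods are hypothetical names, and the only details taken from the steps above are that “Consumes” reports upstream lineage and “Produces” reports downstream lineage relative to the current asset.

```python
from dataclasses import dataclass, field

# Relationship types as named in the steps above.
CONSUMES = "Consumes"  # the selected asset is upstream of the current asset
PRODUCES = "Produces"  # the selected asset is downstream of the current asset


@dataclass
class LineageGraph:
    """Hypothetical in-memory lineage store; each edge is (source, target)."""

    edges: set = field(default_factory=set)

    def add_manual_lineage(self, current, relationship, other):
        # Data flows from the upstream asset to the downstream asset.
        if relationship == CONSUMES:
            self.edges.add((other, current))
        elif relationship == PRODUCES:
            self.edges.add((current, other))
        else:
            raise ValueError(f"unknown relationship type: {relationship}")

    def remove_manual_lineage(self, current, relationship, other):
        # Mirror of add_manual_lineage; removing a missing edge is a no-op.
        edge = (other, current) if relationship == CONSUMES else (current, other)
        self.edges.discard(edge)

    def upstream(self, asset):
        return sorted(src for src, dst in self.edges if dst == asset)

    def downstream(self, asset):
        return sorted(dst for src, dst in self.edges if src == asset)


# The data engineer scenario: "Customer" is cleaned up using data from "Address".
graph = LineageGraph()
graph.add_manual_lineage("Customer", CONSUMES, "Address")
print(graph.upstream("Customer"))    # ['Address']
print(graph.downstream("Address"))   # ['Customer']
```

In this model, reporting “Consumes” from the “Customer” asset and reporting “Produces” from the “Address” asset would record the same edge, which is one way to think about the choice of relationship type in the Lineage tab.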
- Quickly and easily create a Microsoft Purview account
- Search for your asset and add manual lineage (Data curator role is required for adding manual lineage)
Demo of Manual data lineage