Get metadata-driven data ingestion pipelines on ADF within 10 minutes
By Microsoft
Published Jul 08 2021 08:29 PM

Now you can build large-scale data copy pipelines with a metadata-driven approach in the Copy Data tool within 10 minutes!

 


 

When you want to copy a huge number of objects (for example, thousands of tables) or load data from a large variety of sources into Azure, the appropriate approach is to list the names of the objects, along with the required copy behaviors, in a control table, and then use parameterized pipelines that read those entries from the control table and apply them to the copy jobs accordingly. This way, you can easily maintain the list of objects to be copied (for example, add or remove entries) just by updating the object names in the control table, instead of redeploying the pipelines. What's more, you will have a single place to check which objects are copied by which pipelines/triggers, with their defined copy behaviors.

 

The Copy Data tool in ADF eases the journey of building such metadata-driven data copy pipelines. After you go through an intuitive, wizard-based experience, the tool generates parameterized pipelines and SQL scripts for creating the external control tables. After you run the generated scripts to create the control table in your SQL database, your pipelines will read the metadata from the control table and apply it to the copy jobs automatically.
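The generated control table is a plain SQL table that the parameterized pipelines query at run time. As a rough sketch of the idea only (the column names below are illustrative; the actual script generated by the Copy Data tool defines its own schema):

```sql
-- Illustrative control table: one row per object to copy.
-- Column names are hypothetical, not the exact schema the tool emits.
CREATE TABLE dbo.CopyControlTable
(
    Id               INT IDENTITY(1,1) PRIMARY KEY,
    SourceObjectName NVARCHAR(255) NOT NULL,  -- e.g. a source table name
    SinkObjectName   NVARCHAR(255) NOT NULL,  -- e.g. the destination table
    CopyBehavior     NVARCHAR(MAX) NULL       -- per-object copy settings (JSON)
);

-- Adding or removing an object to copy is just a row change;
-- no pipeline redeployment is needed.
INSERT INTO dbo.CopyControlTable (SourceObjectName, SinkObjectName, CopyBehavior)
VALUES (N'Sales.Orders', N'staging.Orders', N'{"isFullLoad": true}');
```

The parameterized pipeline then looks up each row and feeds the values into the copy activity, which is what makes the object list maintainable without touching the pipeline definitions.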

 

You can get more details here.

6 Comments
Copper Contributor

This is great news. I have been a fan of metadata-driven ingestion for many years already.

I had a deeper look at this today. Please have a look at First look at metadata-driven copy task for Azure Data Factory | az Data Guy if you want to get a demo environment up and running quickly, and if you want to read about my first impressions of this feature.

Copper Contributor

Trying to use this.

How long should I wait after reviewing my selections until the scripts are generated? This is my third try, and each time it seems to get stuck after Review.
I have waited 30 minutes and the process is still "Saving..." (with the status of every option pending). Nine hours later, I am still looking at the same screen.


 

Copper Contributor

@ozhug For me the last step takes just a couple of seconds. I have not seen any issue there...

Copper Contributor

I have been doing metadata-driven ETL in ADF for some time using a product I've developed for this called ChillETL. ChillETL uses metadata stored in an Azure SQL database to not only copy tables but also perform incremental copies, Power BI refreshes, and stored procedure execution, all through parameterized pipelines. It also manages the scheduling of multiple processes, both sequentially and concurrently, based on dependencies. ChillETL is listed in the Azure Marketplace and was approved by Microsoft for its partner co-selling program. Here are some advantages of ChillETL vs. the feature described here:

• In ChillETL the metadata is maintained and updated via an Excel add-in rather than SQL scripts, which is a very big timesaver. Behind the scenes, the Excel add-in generates the SQL scripts.

• ChillETL handles sequencing of all types of processes (executing procs and queries, refreshing Power BI datasets, importing from REST APIs, etc.).

• ChillETL provides intelligent restart on failures, where only the process that failed and subsequent processes are executed, rather than re-running all the processes again.

• ChillETL provides logging and real-time reporting via Power BI.

• ChillETL provides email notifications and integration with Azure Logic Apps.

• The ChillETL Excel add-in generates SQL tables, procedures, and views, and updates Power BI models.

Copper Contributor

This is a great feature in ADF. I am currently looking to copy data from multiple SQL Servers to one destination, as my data exists on multiple servers. Is it possible to load data from multiple servers using the Copy Data tool in a parameterized pipeline?

Copper Contributor

In response to copying data from multiple servers with the Copy Data tool: the answer is yes, as you can parameterize the linked service (connection) and pass values to that linked service via the pipeline. I am currently doing this with my ChillETL tool to copy from SQL Servers in 40 different tenants.
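As a rough illustration of the parameterized-linked-service approach described above (the linked service name and parameter names here are hypothetical; check the ADF linked-service JSON reference for the exact properties your connector supports):

```json
{
  "name": "SqlServerByParam",
  "properties": {
    "type": "AzureSqlDatabase",
    "parameters": {
      "serverName": { "type": "String" },
      "databaseName": { "type": "String" }
    },
    "typeProperties": {
      "connectionString": "Server=tcp:@{linkedService().serverName},1433;Database=@{linkedService().databaseName};"
    }
  }
}
```

The pipeline (or a control-table row) supplies `serverName` and `databaseName` at run time, so one linked service can reach many servers.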

Version history
Last update: Jul 08 2021 10:00 PM