Batch Integration for Azure Digital Twins with Azure Data Factory
Published Apr 05 2022 12:18 PM

Imagine taking any complex environment and applying the power of technology to create awe-inspiring experiences and reach business heights that were previously unimaginable. The possibilities are endless: a retail store where the shopping experience is optimized in real time and shelves are always stocked. A supply chain that tracks and reduces carbon emissions. A process manufacturing line that adjusts for variations in natural ingredients and automatically detects and compensates for operational bottlenecks. A city plan that simulates various growth proposals to ensure each one makes the best use of energy sources. An energy company that gives its customers off-peak energy consumption recommendations based on neighborhood and household patterns. Azure Digital Twins allows you to build metaverse experiences by creating digital replicas of real-world people, places, and things, and bringing these replicas to life by augmenting them with real-time IoT data and data from line-of-business (LOB) systems.


In many cases, Azure Digital Twins are kept up to date using real-time input feeds from sources such as Azure IoT Hub and Azure Event Hubs. What we’ve found is that there is often also a need to include data from business systems, such as ERPs, in the twin model. This data is typically accessed through ETL processes at batch or micro-batch intervals. It is used to update twin properties and to update the twin graph by creating new twins or relationships. The goal of this article is to highlight a new pattern for achieving this using Azure Data Factory and Azure Batch. With these services, you will see how to connect to a source system and update your digital twins.


Solution Overview


This solution leverages common Azure data services to process the twin updates:




  1. Azure Data Factory uses either a copy activity or a mapping data flow to connect to the business system and copy the data to a temporary location.
  2. A mapping data flow handles any optional transformations and outputs a file for each twin that must be processed.
  3. A Custom Activity, called from the Data Factory pipeline, sends the list of files that need to be processed.
  4. Azure Batch creates a task for each file that runs custom code to interface with Azure Digital Twins.
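
As a rough illustration of step 4, the sketch below shows what the custom code run by each Batch task might look like in Python. It is not the reference implementation's code. It assumes each per-twin file is a JSON object whose `$dtId` key holds the twin ID and whose remaining keys are property values; that file layout and the function names `build_patch` and `update_twin_from_file` are hypothetical. The only service call used is `DigitalTwinsClient.update_digital_twin` from the `azure-digitaltwins-core` package, which accepts a JSON Patch document.

```python
from __future__ import annotations

import json
from typing import Any


def build_patch(record: dict[str, Any]) -> tuple[str, list[dict[str, Any]]]:
    """Split a per-twin record into (twin ID, JSON Patch operations).

    Assumption: the record carries the twin ID under "$dtId" and every
    other key that does not start with "$" is a property to set.
    """
    twin_id = record["$dtId"]
    patch = [
        # "add" sets the property whether or not it already has a value.
        {"op": "add", "path": f"/{name}", "value": value}
        for name, value in record.items()
        if not name.startswith("$")  # skip metadata keys such as $dtId
    ]
    return twin_id, patch


def update_twin_from_file(path: str, adt_url: str) -> None:
    """Apply one file's values to its twin.

    Requires the azure-identity and azure-digitaltwins-core packages.
    They are imported here so build_patch can be used without them.
    """
    from azure.identity import DefaultAzureCredential
    from azure.digitaltwins.core import DigitalTwinsClient

    with open(path, encoding="utf-8") as handle:
        twin_id, patch = build_patch(json.load(handle))

    client = DigitalTwinsClient(adt_url, DefaultAzureCredential())
    client.update_digital_twin(twin_id, patch)
```

Keeping the patch construction separate from the service call makes the transformation easy to test locally before it runs inside a Batch task.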


Reference Implementation


We created a reference implementation of this architecture on GitHub, with instructions on how to build it out. The general setup is as follows:

  1. Deploy the necessary Azure Resources.
  2. Deploy the Custom Activity code to Azure Storage.
  3. Create a sample Business System/ERP schema in Azure SQL.
  4. Create sample digital twin instances.
  5. Configure Azure Data Factory to point to your Azure resources.
  6. Run the sample pipeline!


Enhancing the Reference Implementation


  1. The reference implementation simulates a Business System/ERP using an Azure SQL database.  You could connect to an actual system using one of the many connectors available out of the box in Azure Data Factory.
  2. The Custom Activity sample code updates existing digital twin instances.  You could enhance this to create or delete twin instances and the relationships between them.
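
One possible shape for the second enhancement, sketched in Python under stated assumptions: the helper names, the relationship ID scheme, and the example model ID `dtmi:example:Pump;1` are all hypothetical and do not come from the reference implementation. The `client` argument is expected to be a `DigitalTwinsClient` from `azure-digitaltwins-core`, whose `upsert_digital_twin` and `upsert_relationship` methods create the item or replace it if it already exists.

```python
from __future__ import annotations

from typing import Any


def twin_payload(model_id: str, properties: dict[str, Any]) -> dict[str, Any]:
    """Build the body for a new twin: the model reference plus initial properties."""
    return {"$metadata": {"$model": model_id}, **properties}


def relationship_payload(source_id: str, target_id: str, name: str) -> dict[str, Any]:
    """Build the body for a relationship from source_id to target_id.

    Assumption: a deterministic ID derived from both ends, so that
    re-running the same batch overwrites the edge instead of duplicating it.
    """
    return {
        "$relationshipId": f"{source_id}-{name}-{target_id}",
        "$sourceId": source_id,
        "$relationshipName": name,
        "$targetId": target_id,
    }


def upsert_twin_with_edge(
    client: Any,
    twin_id: str,
    model_id: str,
    properties: dict[str, Any],
    parent_id: str,
    relationship_name: str,
) -> None:
    """Create or replace a twin, then link it from an existing parent twin."""
    client.upsert_digital_twin(twin_id, twin_payload(model_id, properties))
    edge = relationship_payload(parent_id, twin_id, relationship_name)
    client.upsert_relationship(parent_id, edge["$relationshipId"], edge)
```

For example, `upsert_twin_with_edge(client, "pump-02", "dtmi:example:Pump;1", {"status": "new"}, "line-1", "contains")` would create the twin `pump-02` and a `contains` relationship from `line-1` to it, provided the model and the relationship name exist in your own DTDL models.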



In this post, you learned how to use Azure Data Factory to get data from a source system and update your digital twins in Azure Digital Twins. With this solution, you now have a batch integration approach for interfacing with your digital twins.  See the full pattern and instructions for building it in your environment in our Internet of Things Architectures.

Let us know what you think by commenting below. 




Batch Integration with Azure Digital Twins

GitHub Sample

Last update: Apr 27 2022 05:47 AM