Db2 for z/OS (IBM mainframe) is a relational database management system (RDBMS) developed by IBM. The Db2 family also spans other platforms, such as Db2 for LUW (Linux, UNIX, and Windows) and Db2 for i. It has been widely used by enterprises, including many Fortune 500 companies, to manage their data and support critical business applications.
From an enterprise standpoint, seamless integration of Db2 for z/OS with distributed data sources is therefore very important. It enables enterprises to harness the full potential of their data, uncovering meaningful patterns, trends, and relationships. These insights empower organizations to make informed decisions, optimize processes, enhance customer experiences, and gain a competitive edge in the market.
Microsoft Fabric, the latest unified product suite, is an all-in-one analytics solution for enterprises that covers everything from data movement to data science, real-time analytics, and business intelligence. In this technical blog, we will explore how conveniently and easily Db2 for z/OS data can be opened up in Microsoft Fabric. By transferring data to Fabric, data engineers and data scientists gain access to a wide range of tools and technologies to extract valuable insights from the data. To follow along with this blog, we recommend setting up and enabling Fabric by referring to Part 1 of the Get Started with Microsoft Fabric – All in-one place for all your Analytical needs series.
For a large enterprise with multiple disconnected systems and technologies, Microsoft Fabric makes it easier for data engineers and data scientists to unlock the potential of the data. The whole pain of data integration and analytics can be offloaded to Fabric.
Below are the high-level ideas for opening up Db2 for z/OS data to Fabric:
Major components of Fabric that we will be using to achieve this are:
Data Flow: Dataflows provide a low-code interface for ingesting data from hundreds of data sources and transforming it using 300+ data transformations. Read more here.
Data Pipeline: Data pipelines enable powerful workflow capabilities at cloud scale. Use data pipelines to build complex ETL and data factory workflows that can perform many different tasks at scale. Read more here.
Data Gateway: The on-premises data gateway acts as a secure bridge between on-premises data sources, such as Db2 for z/OS, and Microsoft cloud services. It enables quick and secure data transfer, ensuring seamless connectivity.
Lakehouse: Microsoft Fabric Lakehouse serves as a unified data architecture platform, allowing organizations to store, manage, and analyze structured and unstructured data in a single location. Learn more about the concept of a Lakehouse here.
Power BI: Power BI is the world's leading Business Intelligence platform. It ensures that business owners can access all the data in Fabric quickly and intuitively to make better decisions with data. For more information, see What is Power BI?
DB2 Data flows to Lakehouse
Fabric native engineering capabilities
This architecture diagram shows the data flow from Db2 to Fabric. We will explain the components and the setup in detail.
To get the data into the Lakehouse, we need to establish a connection to the on-premises mainframe and the Db2 database. We can do that using a data gateway. Follow these instructions to install one on a system that has network access to the Db2 for z/OS database: Install an on-premises data gateway | Microsoft Learn
Once done, you should see the connection status as shown below:
Once the gateway is installed, go to the Fabric portal, create a workspace and a Lakehouse (if you haven't already), and build a dataflow from the Db2 database to the Lakehouse.
Click on Data Flow, and then click on "other sources" when you land on the page below:
Select IBM DB2 from all the pre-defined sources:
Click on "learn more" for detailed instructions on filling in the connection details.
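As a rough guide to what the connection dialog expects, the sketch below assembles the typical fields for a Db2 for z/OS source. All values here are hypothetical placeholders (host name, DRDA port, location name, credentials); substitute the real values for your subsystem.

```python
# A minimal sketch of the connection fields the IBM DB2 connector asks for.
# Every value below is a hypothetical placeholder, not a real endpoint.
def build_db2_connection(server: str, port: int, database: str,
                         user: str, password: str) -> dict:
    """Assemble Db2 for z/OS connection details.

    "Server" is typically entered as host:port (446 is the conventional
    DRDA port), and "Database" is the Db2 location name.
    """
    return {
        "Server": f"{server}:{port}",
        "Database": database,
        "Authentication kind": "Basic",
        "Username": user,
        "Password": password,
    }

conn = build_db2_connection("zos.example.com", 446, "DSNDB2A",
                            "DB2USER", "********")
print(conn["Server"])  # zos.example.com:446
```

The same host must, of course, be reachable from the machine running the on-premises data gateway.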
Once the connection is established, select the Lakehouse and the destination, and hit the Publish button:
The dataflow can be set up to sync data from the Db2 database to the Lakehouse at a regular interval. For example, we can kick off the sync after every nightly batch run on the mainframe.
Here is a sample table that we synced from Db2 for z/OS to the Lakehouse:
This space is evolving at a rapid pace, and we expect many more capabilities to be added.
This pane gives a bird's-eye view of all the data sources. At enterprise scale, with multiple data sources, this helps you understand the data and trace it back to its various sources.
Go to the workspace, right-click on the dataflow, and click on "View Lineage."
We can see the IBM mainframe/Db2 for z/OS as a source.
We can use a data pipeline to copy the regularly synced Db2 data from the Lakehouse to an Azure SQL target. This gives us the flexibility to branch out and feed the Db2 data to multiple downstream applications, without putting extra load on the business-critical applications running on the mainframe.
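Conceptually, the pipeline's Copy activity maps a Lakehouse table source to an Azure SQL sink. The sketch below is a simplified, hypothetical representation of that mapping (the activity, table, and connection names are invented for illustration); in practice, the pipeline is authored in the Fabric UI.

```python
# Illustrative only: a simplified, hypothetical model of a pipeline
# Copy activity. Real pipelines are authored in the Fabric portal.
copy_activity = {
    "name": "CopyDb2DataToAzureSql",
    "type": "Copy",
    "source": {
        "type": "LakehouseTable",
        "table": "ACCT_BALANCES",       # table synced nightly from Db2 for z/OS
    },
    "sink": {
        "type": "AzureSqlTable",
        "connection": "azuresql-conn",  # hypothetical connection name
        "table": "dbo.ACCT_BALANCES",
    },
}

# Additional sinks could branch the same Lakehouse source to several
# downstream applications, all fed from one nightly mainframe sync.
print(copy_activity["sink"]["table"])  # dbo.ACCT_BALANCES
```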
The Lakehouse also provides a built-in SQL endpoint. Although access through it is read-only, it is a unique capability: read-only applications that support T-SQL can be offloaded to the Lakehouse, querying the Db2 data that gets synced on a nightly basis.
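For example, a reporting job could query the Lakehouse SQL endpoint over ODBC like any other SQL Server-style endpoint. The sketch below assumes hypothetical endpoint, database, and table names; copy the real SQL connection string from the Lakehouse settings in the Fabric portal.

```python
# Sketch only: querying the Lakehouse SQL endpoint with T-SQL via pyodbc.
# The server, database, and table names below are hypothetical.
server = "myworkspace.datawarehouse.fabric.microsoft.com"
database = "db2_lakehouse"

conn_str = (
    "Driver={ODBC Driver 18 for SQL Server};"
    f"Server={server};Database={database};"
    "Authentication=ActiveDirectoryInteractive;"
)

# A read-only T-SQL query against the nightly-synced Db2 table.
query = "SELECT TOP 10 * FROM dbo.ACCT_BALANCES ORDER BY BALANCE DESC;"

def run_report(connection_string: str, sql: str):
    import pyodbc  # requires the Microsoft ODBC Driver for SQL Server
    with pyodbc.connect(connection_string) as conn:
        return conn.cursor().execute(sql).fetchall()

# rows = run_report(conn_str, query)  # run against a real endpoint
```

Because the endpoint is read-only, writes still go through dataflows or pipelines; the endpoint is purely for offloading query workloads.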
If you have feedback or suggestions for improving this data migration asset, please contact the Databases SQL CSE/Ninja Engineering Team (datasqlninja@microsoft.com). Thanks for your support!
Note: For additional information about migrating various source databases to Azure, see the Azure Database Migration Guide.