Integrate Synapse Database Templates with SAP Data - Part 1
Published Dec 22 2022 04:48 AM 1,325 Views
Microsoft

Introduction

According to SAP, 77% of the world’s transaction revenue touches an SAP system. To understand these systems and extracting the data is a challenging job. To gain insights, it is always helpful to integrate the data source systems seamlessly with analytics services.

 

During a hack session, we tried to accomplish the challenge of extracting the data from SAP source and integrating it with analytical services using the new Database Templates/Lake Database feature that is available in Synapse workspaces.

  • The SAP source system was SAP S/4HANA 
  • Analytics services was Azure Synapse Analytics and Lake database templates

The scope for our Hack

  • Extract data from the SAP system (using Synapse Pipelines, with SAP connector based on your SAP source system for e.g. SAP ECC, SAP BW or HANA DB etc.). Then land the data in a storage account (we used the same storage account associated with the Synapse workspace).
  • Identify the industry template to map the data based on source data (we chose Manufacturing)
  • Using the above template, design the lake database schema by adding/editing entities and attributes to suit our requirements.
  • Map the data between source and destination using the map data tool (most of the time was spent understanding the source entities and mapping it to entities in lake database)
  • The above mapping process will eventually end up in pipelines and dataflows being created. The pipelines will then run and ingest the data in the lake database.
  • After successful data ingestion use the data in lake database in SQL pools or Spark pool in synapse for exploring the data. It also can be exposed to advanced analytics capabilities such as Machine Learning or use the ingested data for reporting purposes, integrating it with Power BI.

Steps performed for achieving the goal

We integrated SAP S/4 HANA system data with Synapse Lake database templates.

Requirements:

  1. Synapse workspace
  2. SAP source system details and login

1.Extract data from SAP source system

The first step is to create a linked service to the SAP source system.

To establish a connection, we set up a self-hosted integration runtime as a compute engine and a gateway for data connection.

Create a self-hosted integration runtime - Azure Data Factory & Azure Synapse | Microsoft Learn

 

We used SAP Table connector to create linked service for our source system (the latest CDC connector was still not available within Synapse for some regions).

Provide the client id and login details to establish a connection to SAP source system.

 

ssonwane_0-1671624803056.png

 

Next, created the datasets for the source and destination.

Source                                                                    Destination

ssonwane_1-1671624340557.pngssonwane_2-1671624346432.png

 

A simple copy activity in a pipeline to extract the data from source and land in the destination data lake

 

ssonwane_3-1671624430113.png

 

Source and sink settings

 

ssonwane_4-1671624444599.pngssonwane_5-1671624449188.png

 

Once the data lands in ADLS, it should look similar to the below:

(The depth of the table folders should be: <container>/<foldername>/<inputfiles> this is important for the data mapping step later)

 

ssonwane_6-1671624482604.png

 

2.Create an empty Lake database for our source data

 

We created our lake database using the Manufacturing industry template. It is important to identify the industry template relevant for the data.

 

ssonwane_7-1671624505394.png

 

We then added the entities to the database canvas as per our needs and created the database definition.

 

ssonwane_8-1671624526098.png

 

As needed, edit the table definitions with the column names, data types, nullability, relationships etc.

No data is ingested yet, so the database will be empty.

 

ssonwane_9-1671624556143.png

 

We then published to save the changes. Now, we have an empty lake database based off the manufacturing template with the customizations that we needed.

 

Note that the database will show under the "Lake Database" in Synapse but yet the corresponding folders in ADLS would still not be created. These folders will be created while the data gets ingested which brings us to the next step of ingesting data into the Lake Database that is continued in second part of this blog below.

Integrate Synapse Database Templates with SAP Data - Part 2 - Microsoft Community Hub

 

Additional reading:

Quickstart on Azure Synapse lake database and database templates - Azure Synapse Analytics | Microso...

 

 

 

 

Co-Authors
Version history
Last update:
‎Dec 22 2022 12:43 AM
Updated by: