Azure Data Factory is a fully managed data integration service that allows you to create data-driven workflows in a code free visual environment in Azure for orchestrating and automating data movement and data transformation. Azure Database for PostgreSQL is now a supported sink destination in Azure Data Factory. Azure Data factory can be leveraged for secure one-time data movement or running continuous data pipelines which loads data into Azure Database for PostgreSQL from disparate data sources running on-premises, in Azure or other cloud providers for analytics and reporting.
In this tutorial, you create a Data Factory pipeline that copies data from Azure Blob Storage to Azure Database for PostgreSQL. The configuration pattern in this tutorial applies to copying from a file-based data store to a relational data store.
If you don't have an Azure subscription, create a free account before you begin.
Azure Storage account. You use the blob storage as source data store. If you do not have an Azure storage account, see the Create a storage account article for steps to create one.
Azure Database for PostgreSQL. You use the database as sink data store. If you do not have an Azure Database for PostgreSQL, see the Create an Azure Database for PostgreSQL article for steps to create one.
Once you deploy the above template, you should see the following resources in your resource group:
Create a blob and a PostgreSQL table
Now, prepare your Azure Blob and Azure Database for PostgreSQL for the tutorial by performing the following steps:
Create a source blob
1. Launch Notepad. Copy the following text and save it as employee.txt file on your disk.
2. Use tools such as Azure Storage Explorer to create a container named “adftutorial”, and to upload the “employee.txt” file to the container in a folder named “input”
Create a sink PostgreSQL table
1. Use the following SQL script to create the public.employee table in your Azure Database for PostgreSQL :
CREATE TABLE public.employee
ID INT NOT NULL GENERATED ALWAYS AS IDENTITY,
FirstName VARCHAR NOT NULL,
LastName VARCHAR NOT NULL
2. Allow Azure services to access Azure Database for PostgreSQL Server. Ensure that Allow access to Azure services setting is turned ON for your Azure Database for PostgreSQL Server so that the Data Factory service can write data to your Azure Database for PostgreSQL Server. Here are the instructions to verify and turn on this setting.
Important: This option configures the firewall to allow all connections from Azure including connections from the subscriptions of other customers. When selecting this option, make sure your login and user permissions limit access to only authorized users.
Create Azure Data Factory
The following template creates a data factory of version 2 with a pipeline that copies data from a folder in an Azure Blob Storage to a table in an Azure Database for PostgreSQL.
Once the template is deployed successfully, you can monitor status of ADF copy activity by running the following commands in PowerShell: