DataFactory Setup To Use a Private Network


Hi. I have a pipeline that executes an SSIS package (through the SSIS-IR). The package uses a C# script to call a web service, get data, and transfer it into an Azure SQL Database. I have a requirement to run this on a private, secure network. I have researched solutions and found information about a private link IR (sorry, that may not be the exact terminology), which my SSIS-IR would then reference.

 

Is this the proper solution?  Will it give me a secure/private "tunnel" for the data that is received from the API back to the Azure SQL Database? 

 

Is there a way to set up the subscription to use only a private network for all incoming and outgoing traffic? That way, if data is transferred without using the SSIS-IR, it would still not be exposed to public addresses.

 

I appreciate suggestions and recommendations, or links to KB articles about using private networks from Azure Cloud Resources.

Thank you.

1 Reply
Hi Paula,
To secure the connectivity to the API you mention, the focus is on transport layer security (TLS), unless you have another, private means of reaching the API. As long as you use HTTPS and credentials, you get two layers of security: encryption in transit and authenticated access. Credentials can also be replaced with a client certificate (public/private key pair; mutual auth) if the API host offers that option, which is better than basic auth (username + password) IMHO but can be complicated to put in place.
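
As a rough illustration of the HTTPS plus credentials idea (and the optional client certificate), here is a minimal Python sketch. Your package uses a C# script, so treat this only as an outline of the logic. The bearer-token header, the function names and the example URL are all assumptions for illustration, not details of your API.

```python
import ssl
import urllib.request
from typing import Optional
from urllib.parse import urlparse


def build_tls_context(client_cert: Optional[str] = None,
                      client_key: Optional[str] = None) -> ssl.SSLContext:
    """Return a TLS context that verifies the server and can present a client certificate."""
    # create_default_context() enables server certificate and hostname verification.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    if client_cert:
        # Mutual TLS: only works if the API host is set up to accept this certificate.
        context.load_cert_chain(certfile=client_cert, keyfile=client_key)
    return context


def build_request(url: str, token: str) -> urllib.request.Request:
    """Build an authenticated request, refusing anything that is not HTTPS."""
    if urlparse(url).scheme != "https":
        raise ValueError("refusing to send credentials over a non-HTTPS URL")
    # Hypothetical auth scheme: the real API may expect a different header.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def fetch(url: str, token: str, context: ssl.SSLContext) -> bytes:
    """Call the API over TLS and return the raw response body."""
    request = build_request(url, token)
    with urllib.request.urlopen(request, context=context, timeout=30) as response:
        return response.read()
```

The point of the sketch is that the transport (TLS context) and the access control (token or certificate) are separate layers, so either can be tightened without touching the other.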

For the ADF runtime you can use an SSIS package to pull the data in; otherwise you can go native and use something like the following:
https://docs.microsoft.com/en-us/azure/data-factory/connector-rest
This should achieve the same result without the need for a package. If you haven't seen it already there is a quickstart with an example here:
https://docs.microsoft.com/en-us/azure/data-factory/quickstart-create-data-factory-rest-api

As for the terminology, what "secure" and "private" actually mean needs to be clarified with the person, client or people setting the requirement. When Azure services talk to each other, the traffic stays within the Microsoft (MS) backbone network(s) and, if both services are in the same region, generally within that region's datacenters. It won't traverse public infrastructure outside the MS edge. In your example, the only point where this may occur is if the API exists outside of Microsoft's services and/or hosting (e.g. AWS, GCP, local providers, etc.). The traffic from ADF to the API would then go over "public" infrastructure, but it would be secured in transit (HTTPS/TLS + credentials/keys for access).

You can use Private Link to secure connectivity if you want, but that means all connectivity to the resource would then need to be private, which in turn means involving the API provider.
There is some information here that should be useful to consider, based on what I think you are after:
https://docs.microsoft.com/en-us/azure/data-factory/data-access-strategies
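
If you do go down the Private Link route, one simple sanity check is whether a resource's hostname resolves to a private address from inside your network. A minimal sketch, assuming you can run Python on a machine in the VNet; the function names are hypothetical and the hostname in the usage note is only a placeholder:

```python
import ipaddress
import socket


def is_private_address(address: str) -> bool:
    """True if the address is in a non-public range (RFC 1918, loopback, link-local, ...)."""
    # Drop any IPv6 scope id (e.g. "%eth0") before parsing.
    return ipaddress.ip_address(address.split("%")[0]).is_private


def resolves_privately(hostname: str, port: int = 443) -> bool:
    """Resolve the hostname with the local DNS settings and check every returned address."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    addresses = {info[4][0] for info in infos}
    return all(is_private_address(address) for address in addresses)
```

For example, calling `resolves_privately("myserver.database.windows.net")` (placeholder server name) from inside the VNet would be expected to return `True` only when private DNS is in place for that resource. Note that it tells you about name resolution only, not about whether public access has been disabled on the resource.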

The question around "exposed to public addresses" is a bit broad, but hopefully the above and the following give you some details.

It may be a good idea to separate out private connectivity and secure connectivity. Secured connectivity is a must; private connectivity depends on the solution and does come at a cost, such as provisioning a VNet and NSGs, outlining the private network space and subnets (IP Address Management, hub and spoke or other designs), etc. You can also delve into Private DNS if you want to avoid using public DNS records for addressing, but again at a cost (price and complexity). There is a decent amount of starter material for private networks here:
https://docs.microsoft.com/en-us/azure/virtual-network/virtual-network-vnet-plan-design-arm
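
To give a feel for the "private network space and subnets" part, here is a small Python sketch that carves a VNet address space into equally sized subnets and checks for overlap with another range. The address ranges and subnet names are made up for illustration; your own IP plan will differ.

```python
import ipaddress
from typing import Dict, List


def plan_subnets(vnet_cidr: str, names: List[str],
                 new_prefix: int) -> Dict[str, ipaddress.IPv4Network]:
    """Assign consecutive subnets of the VNet address space to the given names."""
    vnet = ipaddress.ip_network(vnet_cidr)
    plan = dict(zip(names, vnet.subnets(new_prefix=new_prefix)))
    if len(plan) < len(names):
        raise ValueError("address space too small for the requested subnets")
    return plan


def overlaps(cidr_a: str, cidr_b: str) -> bool:
    """True if the two ranges share any address (a problem for peering or VPN)."""
    return ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


if __name__ == "__main__":
    # Hypothetical layout: one subnet for the SSIS-IR, one for private endpoints.
    for name, subnet in plan_subnets("10.10.0.0/16", ["ssis-ir", "private-endpoints"], 24).items():
        print(name, subnet)
    print(overlaps("10.10.0.0/16", "10.0.0.0/8"))
```

Checking for overlap early matters because ranges that collide with an on-premises or peered network are painful to change once resources are deployed.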

Hopefully that doesn't raise more questions than answers :)