Azure Purview is a unified data governance service for managing and governing your on-premises, multicloud, and SaaS data. If you have used Azure Purview before, you will know that scanning Azure data assets (Blob Storage, Azure Data Lake Storage, etc.) requires an authentication method, typically the Purview managed identity or a service principal. That covers a wide array of assets, but you can also scan data sources in Azure Purview through a self-hosted integration runtime (SHIR). A SHIR is installed on a machine you control, which makes it particularly useful for scanning resources inside a private network (on-prem or VNET). One of the top use cases is scanning on-prem SQL Servers, which is a frequent ask from customers.
In this blog, we will review how to set up a self-hosted integration runtime in Azure Purview and demonstrate how to use it to scan an on-prem SQL Server.
Background on Self-Hosted Integration Runtimes
The Integration Runtime (IR) is the compute infrastructure used by Azure Data Factory to provide data integration capabilities across different network environments. A linked service defines a target data store or compute service, and an activity defines the action to perform. The IR is the bridge between the two: it provides the compute environment where the activity either runs or is dispatched from. There are several types of IR, but we will focus on the self-hosted integration runtime.
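To make that relationship concrete, here is a toy model of the three concepts. It is not Azure SDK code: the class names, fields, the `dispatch` method, and the sample values are all hypothetical and exist only to show how the IR sits between an activity and its linked service.

```python
from dataclasses import dataclass


@dataclass
class LinkedService:
    """Defines the target data store or compute service."""
    name: str
    target: str


@dataclass
class Activity:
    """Defines the action to perform against a linked service."""
    name: str
    linked_service: LinkedService


@dataclass
class IntegrationRuntime:
    """Compute environment that runs or dispatches an activity."""
    name: str
    kind: str  # for example "self-hosted"

    def dispatch(self, activity: Activity) -> str:
        # A real IR would execute the work; this only describes the routing.
        return (f"{self.kind} IR '{self.name}' runs '{activity.name}' "
                f"against {activity.linked_service.target}")


# Hypothetical sample values for illustration only.
sql = LinkedService(name="OnPremSql", target="sqlserver.contoso.local")
scan = Activity(name="scan", linked_service=sql)
shir = IntegrationRuntime(name="my-shir", kind="self-hosted")
print(shir.dispatch(scan))
```

The point of the sketch is that the activity never talks to the target directly; whichever runtime it is handed to decides where the work actually executes.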
First, navigate to the Management Center in Purview Studio and select Integration Runtimes.
Select New at the top, select Self-Hosted and press Continue.
Enter the name that you want for the Integration Runtime along with any description you would like to add and click Create.
Then follow the steps for the Manual setup. Make sure to download the integration runtime and install it on the machine that will host it.
Navigate through the Setup Wizard to install the Integration Runtime.
Once the installation finishes, paste the authentication key from step 4 into the input box shown in the image below. You do not need to change the proxy settings.
After the key is verified, go ahead and register the runtime. Verify the node name information and click Finish.
Now that the SHIR is set up on-premises, you can use it to bring certain sources into Purview. We will demonstrate this by scanning an on-prem SQL Server.
Navigate to Azure Purview and register a SQL Server source.
Make sure to enter the name of the source along with the Server endpoint.
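Before relying on the Server endpoint, it can help to confirm that the machine hosting the SHIR can actually reach it. The sketch below is a minimal TCP check, assuming the server listens on the default SQL Server port 1433. `can_reach` is a hypothetical helper, not part of any Purview tooling, and the host name in the usage comment is a placeholder.

```python
import socket


def can_reach(host: str, port: int = 1433, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example usage (placeholder endpoint, run on the SHIR machine):
#   can_reach("sqlserver.contoso.local")
```

A successful connection only shows that the port is open from this machine; it does not validate the SQL Authentication credential used later in the scan.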
After this, configure your scan by entering the scan name, selecting the Integration Runtime created earlier, confirming the Server endpoint (which should already be filled in), and choosing the credential for SQL Authentication. Then continue as you would for other sources: select the scan rule set, set the recurrence, and run the scan.
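The settings collected in this step can be summarized as a small checklist. The sketch below is purely illustrative: `ScanSettings`, its field names, and the sample values are hypothetical and do not correspond to a Purview API or SDK object. They only mirror the fields entered in the scan form.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScanSettings:
    """Hypothetical container mirroring the fields of the scan form."""
    scan_name: str
    integration_runtime: str  # name of the SHIR created earlier
    server_endpoint: str      # pre-filled from the registered source
    credential_name: str      # credential used for SQL Authentication

    def missing_fields(self) -> List[str]:
        """Return the names of fields that are empty or whitespace only."""
        return [name for name, value in vars(self).items() if not value.strip()]


settings = ScanSettings(
    scan_name="weekly-sql-scan",
    integration_runtime="my-shir",
    server_endpoint="sqlserver.contoso.local",
    credential_name="",  # left blank on purpose to show the check
)
print(settings.missing_fields())  # prints ['credential_name']
```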
Congratulations! You have now set up your Integration Runtime and can scan through SQL Servers and much more!