Azure Databricks activities now support Managed Identity authentication
Published Nov 23 2020 03:27 AM
Microsoft

Azure Databricks supports Azure Active Directory (AAD) tokens (generally available) for authenticating to REST API 2.0. AAD token support lets us provide a more secure authentication mechanism that uses Azure Data Factory's system-assigned managed identity when integrating with Azure Databricks.

 

Benefits of using Managed Identity authentication:

  • Managed identities remove the need for data engineers to manage credentials: the Azure resource is given an identity in Azure AD, and that identity is used to obtain Azure AD tokens. In our case, Data Factory obtains the tokens using its managed identity and uses them to access the Databricks REST APIs.
  • It lets you provide fine-grained access control to particular Data Factory instances using Azure AD.
  • It avoids the use of Databricks Personal Access Tokens, which act as passwords and must be handled with care, placing additional responsibility on data engineers to secure them.

Earlier, you could access the Databricks Personal Access Token through Azure Key Vault using Managed Identity. Now you can use Managed Identity directly in the Databricks linked service, removing the need for Personal Access Tokens entirely.
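
To make the token-based flow concrete, here is a minimal Python sketch of what a Databricks REST API 2.0 call looks like once an Azure AD token is in hand: the token travels as a bearer token in the Authorization header. How Data Factory acquires the token internally is not covered in this post, so the sketch takes the token as an input. The helper name, the workspace URL, and the token value are placeholders invented for illustration, and `clusters/list` is used only as an example endpoint. The request is built but not sent.

```python
import urllib.request


def build_cluster_list_request(workspace_url: str, aad_token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request to the clusters/list endpoint."""
    url = workspace_url.rstrip("/") + "/api/2.0/clusters/list"
    return urllib.request.Request(
        url,
        # The Azure AD access token replaces a Personal Access Token here.
        headers={"Authorization": f"Bearer {aad_token}"},
        method="GET",
    )


# Placeholder values for illustration only.
request = build_cluster_list_request(
    "https://adb-0000000000000000.0.azuredatabricks.net/",
    "<aad-access-token>",
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send the call; that step is left out because it needs a real workspace and a valid token.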

 

High-level steps on getting started:

  1. Grant the Data Factory instance 'Contributor' permissions in the Azure Databricks workspace's Access control (IAM).
    [Screenshots: databricks-grant-access-to-adf-msi-1.jpg, databricks-grant-access-to-adf-msi-2.jpg]
  2. Create a new 'Azure Databricks' linked service in the Data Factory UI, select the Databricks workspace (from step 1), and select 'Managed service identity' under authentication type.
    [Screenshot: databricks-grant-access-to-adf-msi-3.jpg]
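
If you would rather script step 1 than click through the portal, the role assignment can also be expressed as an Azure Resource Manager REST call. The sketch below is not from the original walkthrough: it assumes the ARM role assignments API (`api-version=2022-04-01`) and the built-in Contributor role GUID, and only builds the URL and request body without sending anything. The function name and all IDs passed in are hypothetical placeholders; `principal_id` stands for the object ID of the Data Factory's managed identity.

```python
import uuid

# GUID of the built-in 'Contributor' role definition.
CONTRIBUTOR_ROLE_ID = "b24988ac-6180-42a0-ab88-20f7382dd24c"


def build_role_assignment(subscription_id: str, workspace_resource_id: str, principal_id: str):
    """Return the (url, body) pair for an ARM role-assignment PUT request."""
    assignment_name = str(uuid.uuid4())  # role assignment names are GUIDs
    url = (
        "https://management.azure.com"
        f"{workspace_resource_id}"
        f"/providers/Microsoft.Authorization/roleAssignments/{assignment_name}"
        "?api-version=2022-04-01"
    )
    body = {
        "properties": {
            "roleDefinitionId": (
                f"/subscriptions/{subscription_id}"
                "/providers/Microsoft.Authorization/roleDefinitions/"
                f"{CONTRIBUTOR_ROLE_ID}"
            ),
            # Object ID of the Data Factory's managed identity.
            "principalId": principal_id,
            "principalType": "ServicePrincipal",
        }
    }
    return url, body


# Placeholder IDs for illustration only.
SUBSCRIPTION = "00000000-0000-0000-0000-000000000000"
WORKSPACE = (
    f"/subscriptions/{SUBSCRIPTION}/resourceGroups/work-rg"
    "/providers/Microsoft.Databricks/workspaces/my-workspace"
)
url, body = build_role_assignment(SUBSCRIPTION, WORKSPACE, "11111111-1111-1111-1111-111111111111")
print(url)
```

Scoping the assignment to the workspace resource, rather than the resource group or subscription, keeps the Data Factory's permissions limited to that one workspace.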

 

Note: If the dropdowns under 'workspace id' are not populated even after you have successfully granted the permissions (step 1), try toggling between the cluster types.

 

Sample Linked Service payload:

 

{
    "name": "AzureDatabricks_ls",
    "type": "Microsoft.DataFactory/factories/linkedservices",
    "properties": {
        "annotations": [],
        "type": "AzureDatabricks",
        "typeProperties": {
            "domain": "https://adb-***.*.azuredatabricks.net",
            "authentication": "MSI",
            "workspaceResourceId": "/subscriptions/******-3ab0-48f2-b171-0f50ec******/resourceGroups/work-rg/providers/Microsoft.Databricks/workspaces/databricks-****",
            "existingClusterId": "****-030259-dent495"
        }
    }
}

 

Note: There are no secrets or personal access tokens in the linked service definitions!
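
As a quick sanity check of that claim, a small script can scan a linked service definition for token properties. This is a sketch under stated assumptions: `accessToken` is the property a token-based Azure Databricks linked service carries, the `find_secret_keys` helper and the deny-list are invented for illustration, and the payloads use placeholder values rather than a real factory export.

```python
import json

# Property names treated as secrets in this sketch (assumed deny-list).
SECRET_KEYS = {"accessToken"}


def find_secret_keys(node, path=""):
    """Recursively collect dotted paths of any secret-bearing keys."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if key in SECRET_KEYS:
                hits.append(child)
            hits.extend(find_secret_keys(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            hits.extend(find_secret_keys(item, f"{path}[{index}]"))
    return hits


msi_payload = json.loads("""
{
    "name": "AzureDatabricks_ls",
    "properties": {
        "type": "AzureDatabricks",
        "typeProperties": {
            "domain": "https://<workspace-url>",
            "authentication": "MSI",
            "workspaceResourceId": "<workspace-resource-id>",
            "existingClusterId": "<cluster-id>"
        }
    }
}
""")

# Same shape, but using a Personal Access Token instead of MSI.
pat_payload = {
    "properties": {
        "typeProperties": {
            "domain": "https://<workspace-url>",
            "accessToken": {"type": "SecureString", "value": "<token>"},
        }
    }
}

print(find_secret_keys(msi_payload))
print(find_secret_keys(pat_payload))
```

A check like this could run in a deployment pipeline to flag linked services that still rely on Personal Access Tokens.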

Last update: Nov 23 2020 05:58 AM