Using the workspace MSI to authenticate a Synapse notebook when accessing an Azure Storage account
Published May 06 2021 11:04 AM
Microsoft

Background

 

When a Synapse notebook accesses an Azure Storage account, it uses an AAD identity for authentication.

 

How the notebook is run controls which AAD identity is used:

  • If a user runs the notebook interactively, the user's AAD identity is used. We often call this "AAD passthrough" because it passes the user's AAD identity through to Azure Storage.
  • If the notebook is run through a pipeline, the workspace MSI is used.

This blog will show you how to force the notebook to always use the workspace MSI.

 

Audience

This post is for beginners who have some knowledge of configuring a workspace with linked services.

 

STEP 1: Ensure the workspace MSI has permission to access the data in the storage account

 

The easiest way of doing this is to assign the Storage Blob Data Contributor role on the storage account to the workspace MSI.

 

STEP 2: Configuring the storage account firewall (if needed)

 

If you have enabled the firewall on the storage account, follow these instructions: Configure Azure Storage firewalls and virtual networks | Microsoft Docs

 

Here is an example with firewall enabled on the storage account:

post.png

When you grant access to trusted Azure services in the storage account's networking settings, you grant the following types of access:

 

  • Trusted access for select operations to resources that are registered in your subscription.
  • Trusted access to resources based on system-assigned managed identity.

Additional information on this topic can be found in this document: Connect to a secure storage account from your Azure Synapse workspace – Azure Synapse Analytics | Mi...

 

 

Liliam_Leme_0-1620285112785.png

 

 

STEP 3: Configuring the Linked Service

Open Synapse Studio and configure the Linked Service to use the workspace MSI:

 

Liliam_Leme_2-1620285112806.png

 

STEP 4: Test the configuration and see if it is successful

 

Click Test connection to verify that you have configured everything correctly.

 

STEP 5: Update the notebook code to use the Linked Service configuration 

 

%%spark
// Replace with your linked service name
val linked_service_name = "LinkedServerName"

// Route storage authentication through the linked service, which uses the workspace MSI
spark.conf.set("spark.storage.synapse.linkedServiceName", linked_service_name)
spark.conf.set("fs.azure.account.oauth.provider.type", "com.microsoft.azure.synapse.tokenlibrary.LinkedServiceBasedTokenProvider")

// Replace the container and storage account names
val path = "abfss://Container@StorageAccount.dfs.core.windows.net/"

println("Remote blob path: " + path)

// List the container root to confirm that access works
mssparkutils.fs.ls(path)
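If several notebooks need the same setup, the settings from the notebook code above could be collected in a small helper. The following is only a sketch: the object `LinkedServiceAuth` and its methods `sparkSettings`, `abfssPath`, and `checkAccess` are hypothetical names introduced here for illustration and are not part of Synapse or mssparkutils. The two configuration keys and the token provider class name are the ones used in the notebook code above.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical helper (not part of any Synapse library) that gathers the
// linked service settings so they can be reused across notebooks.
object LinkedServiceAuth {
  // Token provider class name, copied from the notebook code above.
  val TokenProvider: String =
    "com.microsoft.azure.synapse.tokenlibrary.LinkedServiceBasedTokenProvider"

  // Spark settings that route storage authentication through a linked service.
  def sparkSettings(linkedServiceName: String): Map[String, String] = Map(
    "spark.storage.synapse.linkedServiceName" -> linkedServiceName,
    "fs.azure.account.oauth.provider.type" -> TokenProvider
  )

  // Builds an abfss:// URI from a container, a storage account, and an
  // optional relative path (a leading slash on the path is dropped).
  def abfssPath(container: String, storageAccount: String, relativePath: String = ""): String = {
    require(container.nonEmpty, "container must not be empty")
    require(storageAccount.nonEmpty, "storageAccount must not be empty")
    s"abfss://${container}@${storageAccount}.dfs.core.windows.net/${relativePath.stripPrefix("/")}"
  }

  // Runs a caller-supplied listing action against the path and converts any
  // non-fatal exception into a readable message instead of a stack trace.
  def checkAccess(path: String)(list: String => Any): Either[String, String] =
    Try(list(path)) match {
      case Success(_) => Right(s"Access to $path succeeded")
      case Failure(e) => Left(s"Access to $path failed: ${e.getMessage}")
    }
}
```

In a notebook cell, the settings could then be applied with `LinkedServiceAuth.sparkSettings("LinkedServerName").foreach { case (k, v) => spark.conf.set(k, v) }`, and the check run with `LinkedServiceAuth.checkAccess(path)(p => mssparkutils.fs.ls(p))`. Passing the listing action in as a function keeps the helper free of any dependency on the notebook runtime.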

 


Additional Resources

Learn more about how Synapse workspaces perform authentication and use managed identities by reading these documents:

 

 

That is it!

Liliam UK Engineer

Last update: May 07 2021 08:00 PM