Azure Data Factory Blog

Storage Event Trigger - Permission and RBAC setting

ChenyeCharlieZhu
Jan 27, 2021

Storage Event Trigger in Azure Data Factory is the building block for an event-driven ETL/ELT architecture (EDA). Data Factory's native integration with Azure Event Grid lets you trigger processing pipelines based on certain events. Currently, Storage Event Triggers support the Blob Created and Blob Deleted events on Azure Data Lake Storage Gen2 and General Purpose version 2 storage accounts.
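
As a rough illustration of what such a trigger looks like, here is a minimal Python sketch that assembles a trigger definition as a dictionary. It assumes the `BlobEventsTrigger` JSON shape that Data Factory uses for storage event triggers. The helper name `build_storage_event_trigger`, the trigger and pipeline names, and the `<...>` placeholders in the scope are hypothetical and must be replaced with your own values.

```python
import json

# Hypothetical placeholder: substitute your own subscription, resource group,
# and storage account name.
STORAGE_ACCOUNT_ID = (
    "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
    "/providers/Microsoft.Storage/storageAccounts/<storage-account-name>"
)

# The two event types the post lists as supported.
SUPPORTED_EVENTS = {
    "Microsoft.Storage.BlobCreated",
    "Microsoft.Storage.BlobDeleted",
}


def build_storage_event_trigger(trigger_name, pipeline_name, container, events):
    """Return a trigger definition dict (illustrative helper, not an ADF API)."""
    unsupported = set(events) - SUPPORTED_EVENTS
    if unsupported:
        raise ValueError(f"Unsupported event types: {sorted(unsupported)}")
    return {
        "name": trigger_name,
        "properties": {
            "type": "BlobEventsTrigger",
            "typeProperties": {
                "scope": STORAGE_ACCOUNT_ID,
                "events": list(events),
                # Only fire for blobs inside the given container.
                "blobPathBeginsWith": f"/{container}/blobs/",
                "ignoreEmptyBlobs": True,
            },
            "pipelines": [
                {
                    "pipelineReference": {
                        "referenceName": pipeline_name,
                        "type": "PipelineReference",
                    }
                }
            ],
        },
    }


definition = build_storage_event_trigger(
    "OnNewBlob", "ProcessNewBlob", "landing", ["Microsoft.Storage.BlobCreated"]
)
print(json.dumps(definition, indent=2))
```

The printed JSON is only a starting point; the authoring UI in Data Factory produces an equivalent definition when you configure the trigger there.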

 

As with any architecture, it's sometimes critical to enforce role-based access control (RBAC) so that only certain members of the team can access certain sensitive information. Unauthorized access to listen to, subscribe to updates from, and trigger pipelines linked to blob accounts should be strictly prohibited.

 

Azure Data Factory makes this easy for you and enforces the following rules:

  1. To create a new Storage Event Trigger or update an existing one, the Azure account that is signed in to Data Factory and publishes the trigger must have appropriate access to the relevant storage account. Otherwise, the operation fails with Access Denied.
  2. Data Factory needs no special permission to your Event Grid, and you do not need to assign any special RBAC permission to the Data Factory service principal for the operation.

 

To understand how Azure Data Factory delivers on these two promises, let's take a step back and take a sneak peek behind the scenes. The following is the high-level architecture for the integration among Data Factory, Storage, and Event Grid.

  1. Create a new Storage Event Trigger
    Three points worth noting in this flow:
    1. Azure Data Factory makes no direct contact with the Storage account. The request to create a subscription is instead relayed to and processed by Event Grid. Hence, your Data Factory needs no permission to the Storage account at this stage.
    2. Access control and permission checking happen on the Azure Data Factory side. Before ADF issues a request to subscribe to Storage events, it checks the user's permissions. More specifically, it checks whether the Azure account that is signed in and attempting to create the Event Trigger has appropriate access to the relevant Storage account. If the permission check fails, trigger creation also fails.
    3. Any of the following RBAC settings works:
      1. Owner role on the storage account
      2. Contributor role on the storage account
      3. Microsoft.EventGrid/EventSubscriptions/Write permission to /subscriptions/####/resourceGroups/####/providers/Microsoft.Storage/storageAccounts/storageAccountName
  2. Trigger a Data Factory pipeline run from a storage event

    When an event triggers a pipeline in Data Factory, three points in the workflow are worth noting:

    1. Event Grid uses a push model: it relays the message as soon as possible after Storage drops it into the system. This differs from messaging systems such as Kafka, which use a pull model.
    2. The Event Trigger in Azure Data Factory serves as an active listener for incoming messages and triggers the associated pipeline.
    3. The Storage Event Trigger itself makes no direct contact with the Storage account.
      1. That said, if the pipeline contains a Copy or other activity that processes data in the Storage account, Data Factory will make direct contact with Storage, using the credentials stored in the Linked Service. Please ensure that the Linked Service is set up appropriately.
      2. However, if the pipeline makes no reference to the Storage account, you do not need to grant Data Factory permission to access the Storage account.

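The difference between the push model used by Event Grid and a pull model can be shown with a toy Python sketch. Both broker classes are hypothetical, in-memory stand-ins written for this illustration; they are not Event Grid or Kafka client APIs.

```python
from collections import deque


class PushBroker:
    """Toy push model: every subscriber is called as soon as an event is published."""

    def __init__(self):
        self._handlers = []

    def subscribe(self, handler):
        self._handlers.append(handler)

    def publish(self, event):
        for handler in self._handlers:
            handler(event)


class PullBroker:
    """Toy pull model: events wait in a log until a consumer asks for them."""

    def __init__(self):
        self._log = deque()

    def publish(self, event):
        self._log.append(event)

    def poll(self):
        # Returns the oldest pending event, or None if nothing is waiting.
        return self._log.popleft() if self._log else None


pipeline_runs = []

push = PushBroker()
# The trigger acts as the listener: it starts a run when the event arrives.
push.subscribe(lambda event: pipeline_runs.append(f"run started for {event}"))
push.publish("BlobCreated")

pull = PullBroker()
pull.publish("BlobCreated")
# Nothing happens until the consumer polls.
pending = pull.poll()
```

With the push broker, the run is recorded during `publish`; with the pull broker, the event stays queued until `poll` is called.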

What's in store for the future?

The team is currently expanding the functionality of the Event Trigger family. Soon, we will support Custom Events in Event Grid to give customers even more flexibility in defining their event-driven architecture. Please keep an eye out for the announcement as we test the functionality thoroughly and gradually roll it out to General Availability.

Updated Mar 12, 2021
Version 6.0
  • NCJ
    Copper Contributor

    Where do ADF Storage Event Triggers register?

     

    This article is referenced by the following: Working with ADF Storage Event Trigger Over SFTP - Microsoft Community Hub

     

     

    SFTP Storage Events must be added to the data.api values of the Storage trigger, but where does one find the correct storage trigger to modify? Looking in the Event Grid System Topics, there's nothing listed for my trigger. Does ADF create standard Event Grid subscriptions that can be modified to add the SFTPCommit value to the data.api settings? If not, how does one achieve the outcomes listed in the referencing article?