ADF does not directly support copying a folder or multiple files from SharePoint Online, but there are workarounds to achieve this. Compared with a single-file copy, two additional steps are needed: getting the list of files in the folder and looping over that list.
The pipeline flow looks like this:
Web1 – Get the access token from SPO
Web2 – Get the list of files from SPO folder
ForEach1 – Loop the list of file names
Copy1 – Copy data with HTTP connector as source
Step 1:
Grab Access token from SPO
Copying files from SharePoint Online relies on AAD service principal authentication and the SharePoint API to retrieve the files.
a) Register AAD Application
b) Grant SharePoint site permission to your registered app (this requires site owner permission on SharePoint)
Full details on registering the app and granting permissions are in the prerequisites here: https://docs.microsoft.com/en-us/azure/data-factory/connector-sharepoint-online-list#prerequisites
c) Create an ADF pipeline. Start by creating a Web activity to get the access token.
Headers:
Run a debug to confirm that the activity succeeds, and check the activity output to see that it returns the access token in the payload. You can also use the Postman client to verify that the token is valid.
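To make the token request concrete, here is a minimal Python sketch of what the Web activity would send. It assumes the client-credentials request shape described in the linked connector documentation (an `accounts.accesscontrol.windows.net` token endpoint and a form-encoded body). The helper name `build_token_request` and all argument values are hypothetical placeholders, so check them against your own tenant before relying on this.

```python
from urllib.parse import urlencode

# Application ID of SharePoint Online used in the resource string
# (assumption: taken from the connector documentation, verify for your tenant).
SHAREPOINT_PRINCIPAL = "00000003-0000-0ff1-ce00-000000000000"


def build_token_request(tenant_id, tenant_name, client_id, client_secret):
    """Hypothetical helper: return the pieces of the token request."""
    url = f"https://accounts.accesscontrol.windows.net/{tenant_id}/tokens/OAuth/2"
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    # urlencode percent-encodes '@' and '/', which is valid for a form body.
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": f"{client_id}@{tenant_id}",
        "client_secret": client_secret,
        "resource": f"{SHAREPOINT_PRINCIPAL}/{tenant_name}.sharepoint.com@{tenant_id}",
    })
    return {"method": "POST", "url": url, "headers": headers, "body": body}
```

The returned URL, method, header, and body correspond to the fields you fill in on the Web activity. The access token is then read from the `access_token` property of the activity output.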
Step 2:
Get the list of Files
Headers:
Run a debug to confirm that the activity succeeds and that the output shows the list of files under the folder.
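As a rough illustration of the file-list call, the sketch below builds the request and pulls the file names out of a response. It assumes the SharePoint REST endpoint `GetFolderByServerRelativeUrl(...)/Files` and an `Accept` header that makes SharePoint return the entries under a `value` array. The helper names, site URL, and folder path are hypothetical.

```python
import json
from urllib.parse import quote


def build_list_files_request(site_url, folder_path, access_token):
    """Hypothetical helper: request that lists the files in one folder."""
    # quote() keeps '/' but encodes spaces and quotes in the folder path.
    url = (f"{site_url}/_api/web/"
           f"GetFolderByServerRelativeUrl('{quote(folder_path)}')/Files")
    headers = {
        "Authorization": f"Bearer {access_token}",
        # Assumption: this Accept value yields a {"value": [...]} payload.
        "Accept": "application/json;odata=nometadata",
    }
    return {"method": "GET", "url": url, "headers": headers}


def extract_file_names(response_text):
    """Return the Name of every file entry in the response body."""
    payload = json.loads(response_text)
    return [item["Name"] for item in payload.get("value", [])]
```

In the pipeline, the Web activity plays the role of `build_list_files_request`, and its output array is what the next step iterates over.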
Step 3:
Loop the list of relative file names
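The loop can be pictured with the short Python sketch below, which turns each file name into the download URL that the copy step would request. It assumes the SharePoint REST endpoint `GetFileByServerRelativeUrl(...)/$value` for fetching file content. The function name and sample paths are hypothetical, and in ADF the ForEach activity would typically iterate over the list returned by the previous Web activity (for example its `value` array, which is also an assumption here).

```python
from urllib.parse import quote


def build_download_urls(site_url, folder_path, file_names):
    """Hypothetical helper: one download URL per file name in the folder."""
    urls = []
    for name in file_names:
        # Join folder and file name into a server-relative path.
        server_relative = f"{folder_path.rstrip('/')}/{name}"
        urls.append(
            f"{site_url}/_api/web/"
            f"GetFileByServerRelativeUrl('{quote(server_relative)}')/$value"
        )
    return urls
```

Each URL produced here corresponds to one iteration of the ForEach activity, with the file name supplied by the current item.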
Step 4:
Create Copy activity
a) HTTP linked service
b) Configure copy activity HTTP source
Dataset properties:
Tip: You can test with a static access token obtained from the previous Web activity output first. You can also use an expression (add dynamic content): @{concat('Authorization: Bearer ',activity('WebActivityName').output.access_token)}
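For readers less familiar with the ADF expression language, the `concat` expression in the tip is equivalent to the small Python sketch below. The function name and the sample token response are hypothetical; only the `access_token` property name comes from the expression itself.

```python
def build_auth_header(token_response):
    """Hypothetical helper mirroring the concat() expression in the tip."""
    # Prefix the token with the header name and the Bearer scheme.
    return "Authorization: Bearer " + token_response["access_token"]


# Example with a made-up token value standing in for the Web activity output.
sample_output = {"token_type": "Bearer", "access_token": "abc123"}
header_line = build_auth_header(sample_output)
```

The resulting string is what the Copy activity source passes as an additional header on each file request.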
c) Configure Linked Service properties
2. Create Copy sink as below
Successful pipeline run as follows:
Thanks to @Jijo Puthooran for helping me in authoring this blog.