Data Architecture Blog

CI CD in Azure Synapse Analytics Part 3

Bradley_Ball
Microsoft
Dec 18, 2020

Here's a quick review of the road so far:

 

CI CD in Azure Synapse Analytics Part 1:

  • Creating an Azure DevOps project
  • Linking our Azure Synapse Analytics environment to that Project via Git
  • Validating that our Azure DevOps Repo was populated with our Azure Synapse Analytics environment

CI CD in Azure Synapse Analytics Part 2:

  • Create a new branch on our Repo
  • Edit our Azure Synapse Analytics environment
    • Specifically my SQL scripts have demos all over the place and Buck Woody said I have to clean up my very messy room .... Azure Synapse Analytics environment
  • Create a Pull Request in Azure Synapse Analytics to merge our new branch with the main
  • Approve the Pull Request in Azure DevOps
  • Validate our main branch is updated in our Azure Synapse Analytics Environment

This time we will:

  • Create an Artifact pipeline
    • This is to create an Artifact we can use to deploy to another environment

 

First we are going to examine a very important part of our Azure Synapse Analytics environment.  The Publish button.

 

 

Why all the arrows and boxes?  Because this is important.  This Publish button saves the templates that we will use to deploy our environment to another Azure Synapse Analytics workspace.  When you click Publish, a few messages should appear: Publishing In progress, Publishing completed, Generating templates, and Generating templates completed.

 

If you get an error, do not fear: validation will run and show you where the error is in your workspace.  I've encountered this a time or two.  Eventually I will intentionally write a blog in this series where we break things just to fix them.  For now, let us presume that everything went just fine.

 

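Next we will move over to our Azure DevOps Repo.  Find the folder that has the same name as your Azure Synapse Analytics workspace.  In this picture mine is bballasw.  Under that folder you will find two files, TemplateForWorkspace.json and TemplateParametersForWorkspace.json.  If the folder is missing, publish at least one change from the workspace and check which branch the workspace publishes to; the templates are written to the publish branch (workspace_publish by default), which is not necessarily main.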

 

*NOTE - these templates are not the same templates you would use to deploy a new environment.  These are only for deploying the artifacts from one environment to another.  In part 5 we will look at generating the ARM templates needed for deploying a new environment from Azure DevOps.

 

 

We will be using these files to create our artifact build pipeline.  Also highlighted is the WorkspaceDefaultSqlServer_connectionString parameter, which is of the type secureString.  This becomes important when we reach our release pipeline in Part 4: if we do not handle this string properly, the release will fail.
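To see which parameters will need this special handling, you could scan the template for secureString parameters.  Here is a minimal Python sketch of that check.  It assumes the standard ARM layout with a top-level "parameters" object; the sample_template below is a trimmed, hypothetical stand-in for TemplateForWorkspace.json, not the real file.

```python
import json


def secure_parameters(template: dict) -> list[str]:
    """Return the names of template parameters declared as secureString."""
    params = template.get("parameters", {})
    return sorted(
        name
        for name, spec in params.items()
        # Compare case-insensitively so "securestring" is also caught.
        if str(spec.get("type", "")).lower() == "securestring"
    )


# Trimmed, hypothetical stand-in for TemplateForWorkspace.json.
sample_template = json.loads("""
{
  "parameters": {
    "workspaceName": {"type": "string"},
    "WorkspaceDefaultSqlServer_connectionString": {"type": "secureString"}
  }
}
""")

print(secure_parameters(sample_template))
```

Against your own repo you would load the real file with json.load instead of the inline sample.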

 

For this exercise I've created another Azure Synapse Analytics Environment for us to deploy to named bballaswqa in a separate resource group from bballasw.  

 

Right now there is nothing in bballaswqa.

 

Especially compared to bballasw.

 

 

With our destination of bballaswqa in mind, we begin with our build pipeline.  Moving over to Azure DevOps we want to move to Pipelines.

 

Click New pipeline.

 

 

At the very bottom of the page, in super tiny font, you will find Use the classic editor.  Make sure to click on that link.

 

 

This is where we configure the project, repository, and the default branch for our builds.  All of this information is correct.  We are using Azure Repos Git, so we will click the Continue button.

 

At the very top of our next page we have different options for our pipeline template.  We will click on the Empty job link.

 

 

Now we are finally at the pipeline.  First we will change the default name.  I use ASW in my naming convention, as it stands for Azure Synapse Workspace.  We rename the pipeline to ASW Build Pipeline.  Then we click on the Triggers section.  The Triggers section is where we will configure the CI portion of our build.

 

 

Check the Enable continuous integration check box.  Under Branch filters we want main, as that is the branch we are publishing to for all of our builds.  But we also need to add a Path filter.  Without one, every time we merge a branch to main it would cause the build pipeline to run.  I only want the pipeline to run when we publish from our Azure Synapse Analytics environment.
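The idea behind combining the two filters can be sketched in a few lines of Python.  This is only a simplified model, not how Azure DevOps evaluates filters internally.  The path filter value (the workspace folder, bballasw/) is my assumption about what to include, and the sqlscript path in the example is hypothetical.

```python
def should_trigger(
    branch: str,
    changed_paths: list[str],
    branch_filter: str = "main",
    path_filter: str = "bballasw/",  # assumed: the folder Publish writes templates to
) -> bool:
    """Simplified model: run only for main AND only if a template path changed."""
    if branch != branch_filter:
        return False
    # Strip a leading slash so "/bballasw/x" and "bballasw/x" behave the same.
    return any(p.lstrip("/").startswith(path_filter) for p in changed_paths)


# A publish from the workspace rewrites the template files: build runs.
print(should_trigger("main", ["bballasw/TemplateForWorkspace.json"]))

# A merge that only touches a script (hypothetical path): build is skipped.
print(should_trigger("main", ["sqlscript/demo_cleanup.json"]))
```

The first call prints True and the second prints False, which is the behavior we are after: merges alone do not produce a new artifact, publishes do.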

 

After this return to the tasks window.

 

 

On Agent job 1, click the + button.  We need the Copy files task.  Add that to our pipeline.

 

 

Azure DevOps has a set of predefined variables.  One of the Build variables is Build.ArtifactStagingDirectory; for a full list, see the Docs article Predefined variables - Azure Pipelines | Microsoft Docs.  This variable holds a local path on the build agent, and that is where we want our build files copied to.  In Part 4 we will build a Release Pipeline, and that Release will be linked to the artifacts we produce.

 

We will change our Display name to Copy ARM Template Files to: $(Build.ArtifactStagingDirectory).  When you use a variable in Azure DevOps you invoke it with $(variableName).  Now click on the ellipsis (...) next to the Source Folder text box.
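If the $(variableName) syntax is new to you, think of it as a text substitution that happens before the task runs.  The Python sketch below imitates that idea.  The agent path is a made-up value for illustration, and leaving unknown names untouched is a choice made for this sketch.

```python
import re

# Matches $(Some.Name) and captures the name between the parentheses.
_MACRO = re.compile(r"\$\(([^()]+)\)")


def expand_macros(text: str, variables: dict[str, str]) -> str:
    """Replace each $(name) with its value; unknown names are left as-is."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        return variables.get(name, match.group(0))

    return _MACRO.sub(replace, text)


# Hypothetical value; the real path is assigned by the build agent.
variables = {"Build.ArtifactStagingDirectory": "/agent/_work/1/a"}

print(expand_macros("$(Build.ArtifactStagingDirectory)/ARM", variables))
# prints /agent/_work/1/a/ARM
```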

 

 

We will select the folder named after our Azure Synapse Analytics workspace, which holds our template JSON files.

 

We finish configuring this task by setting Contents to *.json.  This will pull in only the JSON files under our folder.  We set the Target Folder to $(Build.ArtifactStagingDirectory)/ARM.

 

We don't strictly need the ARM subfolder.  At this time there are no other objects in my build pipeline.  If we ever want to add them we can have additional subfolders, but for now my OCD won and I created a folder.
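To make it concrete what this task does with those settings, here is a rough Python equivalent.  It is a sketch of the behavior as I understand it (copy the *.json files from the source folder into an ARM subfolder of the staging directory), not the task's actual implementation.  The temp directories and file contents are stand-ins.

```python
import shutil
import tempfile
from pathlib import Path


def copy_json_templates(source_folder: Path, staging_dir: Path, subfolder: str = "ARM") -> list[str]:
    """Copy *.json files from source_folder into staging_dir/subfolder."""
    target = staging_dir / subfolder
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    # glob("*.json") is non-recursive: only files directly in source_folder.
    for path in sorted(source_folder.glob("*.json")):
        if path.is_file():
            shutil.copy2(path, target / path.name)
            copied.append(path.name)
    return copied


# Demo with throwaway folders standing in for the repo and the build agent.
with tempfile.TemporaryDirectory() as tmp:
    workspace_folder = Path(tmp) / "bballasw"
    workspace_folder.mkdir()
    (workspace_folder / "TemplateForWorkspace.json").write_text("{}")
    (workspace_folder / "TemplateParametersForWorkspace.json").write_text("{}")
    (workspace_folder / "readme.txt").write_text("not a template")

    staging = Path(tmp) / "a"
    print(copy_json_templates(workspace_folder, staging))
```

Only the two JSON files are reported as copied; readme.txt stays behind because it does not match the pattern.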

 

 

Now click on the + sign on Agent job 1.  We need to add the Publish Pipeline Artifacts task.

 

 

Our File or Directory path will be $(Build.ArtifactStagingDirectory).  Our Artifact name will be ASW_Drop.

 

Hit Save & queue.

 

 

Enter a save comment, click Save and run. 

 

 

Click on the Agent job 1 section of the page and open the build agent window.

 

 

OH WOW!  IT ALL TURNED GREEN AND WORKED!!! .....it's not like I did this a few 100 times failing miserably until I figured it out......  Now click on the small arrow next to the Jobs run to return to the pipeline.

 

 

Under the header Related 0 work items click on 1 published; 1 consumed. 

 

 

Expand the arrows next to ASW_Drop and ARM, and we can see our Template files.  Success!  We have a build artifact that we can now call in a Release pipeline.
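If you download the artifact and want to check it without clicking through the UI, a small script can do the same inspection.  This Python sketch assumes the artifact was extracted to a local folder with the ASW_Drop/ARM layout; the function name and the temp folder in the demo are hypothetical.

```python
import tempfile
from pathlib import Path

EXPECTED_TEMPLATES = (
    "TemplateForWorkspace.json",
    "TemplateParametersForWorkspace.json",
)


def missing_templates(artifact_root: Path, subfolder: str = "ARM") -> list[str]:
    """Return the expected template files that are NOT present in the artifact."""
    folder = artifact_root / subfolder
    return [name for name in EXPECTED_TEMPLATES if not (folder / name).is_file()]


# Demo: build a fake ASW_Drop folder and confirm nothing is missing.
with tempfile.TemporaryDirectory() as tmp:
    drop = Path(tmp) / "ASW_Drop"
    (drop / "ARM").mkdir(parents=True)
    for name in EXPECTED_TEMPLATES:
        (drop / "ARM" / name).write_text("{}")
    print(missing_templates(drop))  # prints []
```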

 

Ok Dear Reader, it's late and we are done for today.  In our next blog we will cover the release pipeline and look at what was deployed to our QA environment!

 

As always, thank you for stopping by.

 

 

Updated Dec 18, 2020
Version 1.0

18 Comments

  • aturlov
    Copper Contributor

    I am reading this article in year 2024. While I appreciate the work Bradley did by writing and sharing this article I am disappointed with the content presented in this article. It's very old school even for 2020 when the article was written let alone 2024. Multiple reasons for my disappointment:

    • Classic pipelines are not satisfactory for modern CI/CD approach. YAML pipelines are the way to go. In 2020 that was definitely possible and Microsoft already released the Synapse deployment ADO extension.
    • Using manual publishing in Synapse Studio. This assumes that developers have to do the work and it is a contradiction with the CI/CD approach: the only thing developers should do is to integrate their changes into a Git collaboration branch. Validation they do during development and testing should be enough to justify their commits and PRs.
    • The main disappointment is this entire article has really nothing to do with automating the CI/CD process for Synapse developers. All it does is copying the manually created ARM template from Git branch and exposing it as a pipeline artifact. It's absolutely unnecessary when the ARM template is already in Git and can be checked out by any DevOps pipeline at any time.

    I do hope there is a more modern and really automated way of implementing CI/CD for Synapse. I am exploring it myself even though this article series is advertised by Microsoft as an "official approach". I am well familiar with how CI/CD is done for the Data Factory, and how to use a Synapse Deployment ADO extension. It would be really beneficial for the community to have an updated version of this article that uses a more modern CI/CD approach and toolset.

  • Vini_Napoleao
    Copper Contributor

    Enable Classic editor - To people who don't know.

     

    https://www.youtube.com/watch?v=c0UhygUkBrE

  • JohnLEdwards
    Copper Contributor

    I've changed my workspace_publish branch to main and I see no change -- no folder.  I'm stuck about 10% of the way into this blog entry.  The first two were top notch, this one has me stumped.

     

    Edit -- Got it.  I needed to publish something.  I needed to make a file and change it after creating the DevOps in order for the Publish command to create a set of output files for the repository.  I had a ton of stuff in the workspace prior to setting up DevOps, but nothing that I changed.  I added one empty file and everything got captured in the TemplateForWorkspace.json and TemplateParametersForWorkspace.json.

     

    At this point things are becoming less intuitive.  Up until this point I could figure out what each step was doing and why.  This part is a little less apparent.

  • Victor2260
    Copper Contributor

    "folder that is the same name as your Azure Synapse Analytics workspace." - yeah, no. The folder doesn't exist. What now?

     

    Edit: Found the solution buried in the comments... I need to change my workspace_publish branch to main as the directory got created in workspace_publish at the moment...

  • hps2022
    Copper Contributor

    Hi Bradley_Ball 

    Thanks for sharing the article. It is really helpful.

    It would be great if you could also include the CI/CD process for a SQL database inside a Synapse workspace (serverless). I went through the entire process, but I was not able to deploy the Synapse SQL database for external tables.

  • anujsen18
    Copper Contributor

    Hi Bradley, 

    Nice article !!

    I have one question on managing linked services / integration runtimes across environments.

    How should this be handled in the CI/CD process so that we can run pipelines in test and prod with little or no manual effort?

    Thanks 

  • theCorbin8or
    Copper Contributor

    Hi Bradley,
    ive read through these articles and i am currently trying to implement a CICD solution for synapse artifacts across 3 environments DEV, REL, PROD 
    I am using azure YAML pipelines for this. i have followed the comment above of changing the publish branch to `main` as suggested.. However i am running into a permissions issue when invoking publish on the DEV workspace via studio. 


    It states "Failed to save ARM templates. Error: you are not allowed to save to current branch, either select another branch or resolve permissions in azure devops" obviously i dont want to allow "push master` as that will void any PR policies in place 

    How to handle this please ? 

    Cheers 

    Corbin 

  • qzhou
    Copper Contributor

    In part 5 we will look at generating the ARM templates needed for deploying a new environment from Azure DevOps.

    Is there still a Part 5?

  • sugidwani
    Copper Contributor

    "folder that is the same name as your Azure Synapse Analytics workspace." We will see the folder only after we create the pull request and commit the changes post publish. Otherwise we will not be able to see the workspace folder in the repo.

  • CheongSin
    Copper Contributor

    Hi Brad, great series of articles. This is the most comprehensive tutorial that anyone can follow and works like charm.    Now the part that I am struggling with the most which i didn't anticipate it was going to be was creating a synapse workspace with just built-in serverless pool using ARM template.   I used the portal to generate the template and downloaded it and setup a Arm template deployment task in release pipeline in devop.   It's creating the synapse space and storage and all, but i am getting an error.

     

     'Authorization failed for template resource 'datalakegen2synapseacct/Microsoft.Authorization/5dc0d056-0e65-53d3-89fc-6e17ac29d7e7' of type 'Microsoft.Storage/storageAccounts/providers/roleAssignments'. The client '4656517f-f949-417d-a1f8-3271ccbcc404' with object id '4656517f-f949-417d-a1f8-3271ccbcc404' does not have permission to perform action '...   
     
    I would think this has something to do with the Service principal not having sufficient privilege or something.. but I can not resolve it.   I would think this should just work with out of box with imported templates.