Azure Synapse Studio is the primary tool for interacting with the many components of Azure Synapse Analytics, allowing you to perform a wide range of activities against your data and build a fully integrated analytics solution. Integrating Synapse Studio with a source control system such as Azure DevOps Git or GitHub has become one of Studio’s preferred features for collaborative work and version control.
Working collaboratively and tracking code changes in an integrated analytics platform that combines data warehousing, big data analytics, data integration, and visualization can be quite challenging. Multidisciplinary teams work on different projects and features, within complex application lifecycles that require agility and... automation!
The figure below illustrates the Synapse CICD lifecycle for Workspace artifacts. It includes some manual steps that prevent a fully automated process.
Figure 1: Current CICD flow in Synapse
The goal of this article is to deep dive into the cool new features recently introduced in the Synapse Workspace Deployment Task V2 (preview) to automate the publishing step of the process, allowing you to deploy code from any user branch without any manual intervention in the UI.
Let’s take a closer look at the current CICD flow in Synapse before the V2 release of Synapse Workspace Deployment task. And let’s use this simple scenario as an example: two developers, John and Mary, are working in different projects and features, developing their code in a single Synapse Workspace.
Figure 2: Source control and publishing in Synapse Workspace
John has been developing a new feature for Project X, which must be deployed and tested in UAT tomorrow. Mary has also been developing a new feature, but for Project Y, which is scheduled for deployment and acceptance only next week. For the past few weeks, both developers have been developing their code in their own feature branches, publishing their changes, and executing their code to make sure everything works before deploying their features to the UAT environment.
The day for deploying John’s feature has come. John is about to trigger the DevOps release pipeline to deploy his code to UAT, but before that, he wants to review the ARM templates that are going to be deployed. He realizes that Mary’s artifacts are already part of the ARM templates, and they are not supposed to be deployed until next week. John really wanted to cherry pick his feature, but since the Workspace Deployment Task V1 requires the ARM templates generated from the collaboration branch, Mary’s artifacts will also be included in the deployment. It’s an all or nothing approach.
So, when using the Workspace Deployment Task V1, the ARM templates must be generated from the collaboration branch through a manual Publish in Synapse Studio, and everything published there is deployed together.
The Workspace Deployment Task v2 (in Preview at the time of writing) now introduces some new cool deployment modes (aka Operation Types): “Validate” (only available in YAML pipelines) and “Validate and Deploy”. These operations will facilitate the CICD automation, introducing a new CICD flow.
Figure 3: The new CICD workflow in Synapse
I will explain in more detail the rationale behind each operation type and some of the applicable use cases:
The Validate operation is only available in YAML pipelines.
The goal of the Validate operation is to validate the files in a non-publish branch and export the Workspace ARM templates as pipeline artifacts.
This is useful when you want to automate the validation and generation of ARM templates from any user branch. Before V2, this was only possible from the collaboration branch, and you had to do it manually from Synapse Studio.
The goal of the Validate and Deploy operation is to deploy the artifacts to a target from any user branch (a non-publish branch). It does the job of the Validate operation (generating the ARM templates based on the specified branch) and adds this extra deploy step to deploy only the artifacts from that branch.
This is a useful operation when you want to cherry pick the code that you want to deploy from your lower environment to your target environment, bypassing the manual Publish operation. Before V2, the deploy operation would only consider the ARM code published from the collaboration branch. In V2 the task can consider the code from any non-publish branch!
The Deploy operation remains unchanged from V1: its goal is to take the ARM templates manually generated using the Publish action in Synapse Studio (from the collaboration branch) and deploy the artifacts to a target environment. With this operation you cannot deploy the code of an individual branch separately; you can only deploy the code published from your collaboration branch.
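To keep the three operation types apart, here is a minimal Python sketch that encodes what this article says about each of them. The class, field, and function names are my own illustration and are not part of the Synapse Workspace Deployment task or any Azure SDK.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class OperationType:
    name: str
    template_source: str        # branch the ARM templates are generated from
    needs_manual_publish: bool  # True if someone must hit Publish in Synapse Studio
    deploys_to_target: bool     # True if artifacts are deployed to a target workspace
    yaml_only: bool             # True if only available in YAML pipelines


# Summary of the operation types as described in this article (illustrative only).
OPERATIONS = (
    OperationType("Validate", "any user branch", False, False, True),
    OperationType("Validate and Deploy", "any user branch", False, True, False),
    OperationType("Deploy", "publish branch", True, True, False),
)


def candidates(deploy: bool, allow_manual_publish: bool) -> list[str]:
    """Return the names of the operation types that fit the given requirements."""
    return [
        op.name
        for op in OPERATIONS
        if op.deploys_to_target == deploy
        and (allow_manual_publish or not op.needs_manual_publish)
    ]


if __name__ == "__main__":
    # Deploying without a manual Publish leaves a single option.
    print(candidates(deploy=True, allow_manual_publish=False))
```

If you want to deploy without the manual Publish step, the only candidate left is “Validate and Deploy”, which is the operation used in the two use cases that follow.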
I'm sharing below a couple of use cases where you can benefit from using the “Validate and Deploy” feature to automate the publish action of Synapse Workspace artifacts from any user branch:
When you need to orchestrate the code that you have recently developed in a feature branch, the typical flow is to create a pull request to merge your code into the master (collaboration) branch and then, from this branch, manually hit the Publish button to persist your code in Live Mode. This will also generate the ARM template files in the publish branch.
This can be a showstopper in scenarios where you are not allowed to push your changes directly to the master (collaboration) branch or you simply don’t want those changes to be propagated to the ARM templates in the publish branch.
This will no longer be a problem if you use this new “Validate and Deploy” feature, as it will allow you to publish your code directly to Live Mode, bypassing the merge operation with the master branch and the ARM template generation to the publish branch.
Here’s an example on how you can make it work:
Configuring the release trigger allows you to choose whether the release runs whenever a push occurs in your user branch or when you create a pull request.
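The trigger choice can be pictured with a small Python model. This is only a sketch of the idea, under the assumption that a trigger is described by the events it reacts to plus a list of branch filters; it is not how Azure DevOps evaluates triggers internally. The `TriggerConfig` class, the `should_trigger` function, and the branch names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class TriggerConfig:
    on_push: bool                    # react to pushes on a matching branch
    on_pull_request: bool            # react to pull requests targeting a matching branch
    branch_filters: tuple[str, ...]  # branch names or wildcard patterns


def should_trigger(config: TriggerConfig, event: str, branch: str) -> bool:
    """Decide whether an event on a branch would start the release.

    For a pull request, `branch` is the target branch of the pull request.
    """
    if event == "push":
        enabled = config.on_push
    elif event == "pull_request":
        enabled = config.on_pull_request
    else:
        enabled = False
    # The event type must be enabled and the branch must match a filter.
    return enabled and any(fnmatchcase(branch, p) for p in config.branch_filters)


# Hypothetical setup: release on every push to a feature branch.
push_only = TriggerConfig(on_push=True, on_pull_request=False,
                          branch_filters=("feature/*",))
```

With this hypothetical configuration, a push to `feature/my-change` would start the release, while a pull request or a push to any other branch would not.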
In this example, I’m using my dev workspace as the target for this deployment, as I want to automate the publishing of my code to the dev environment.
Switching to Live Mode, you can see that the code from your feature branch has been published.
This is a very common use case, where the need arises to isolate the tests and the acceptance of the code that is being developed in the source environment. If we look at the Deploy flow (figure above), the deployment operation will use the ARM templates from the publish branch, that contain all features published in our development environment. If we deploy these ARM templates to the target environment, we might be delivering unnecessary features to the target environment. We can, however, take advantage of this new "Validate and Deploy" feature to cherry pick and publish the code that we want to test in the target environment. Here’s an example:
Consider Mary's "featureA", which is ready to be tested in the UAT environment.
Several Project X features have already been published in Live Mode by her teammate John, but John’s code is not supposed to go to UAT yet, only Mary's code (Project Y).
If Mary follows the typical CICD flow, merging her code with the collaboration branch and publishing her changes, the generated ARM templates in the publish branch will contain both John’s and Mary’s code. These ARM templates cannot be used in this use case, as they will deploy John’s code to the UAT environment as well. So, we need to take advantage of the “Validate and Deploy” new feature to achieve Mary's goals to deploy her code only.
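A toy model with Python sets makes the difference concrete. The artifact names are hypothetical, and I'm assuming the branch used for “Validate and Deploy” holds only the artifacts already in UAT plus Mary's merged feature.

```python
# Hypothetical artifact names, used only to illustrate the scenario.
BASELINE = frozenset({"pipeline/Baseline_Load"})
JOHN_PROJECT_X = frozenset({"pipeline/ProjectX_Ingest", "notebook/ProjectX_Clean"})
MARY_PROJECT_Y = frozenset({"pipeline/ProjectY_FeatureA"})

# Publish branch: everything published from the collaboration branch.
publish_branch = BASELINE | JOHN_PROJECT_X | MARY_PROJECT_Y

# Branch handed to "Validate and Deploy": assumed to hold only the
# baseline plus Mary's merged feature.
mary_only_branch = BASELINE | MARY_PROJECT_Y


def new_in_target(source_artifacts, target_artifacts):
    """Artifacts that a deployment would add to the target workspace."""
    return set(source_artifacts) - set(target_artifacts)


# The UAT workspace currently holds only the baseline artifacts.
uat_workspace = set(BASELINE)

# "Deploy" uses the publish branch, so John's artifacts come along.
deploy_result = new_in_target(publish_branch, uat_workspace)

# "Validate and Deploy" uses the chosen branch, so only Mary's feature is added.
validate_and_deploy_result = new_in_target(mary_only_branch, uat_workspace)
```

In this model, `deploy_result` contains John's Project X artifacts as well as Mary's, while `validate_and_deploy_result` contains only Mary's feature.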
In this example, as part of the branching strategy, we will be using a "gateway" branch called "UAT-ready". This is the branch into which the code of any feature that is ready to be deployed to the UAT environment gets merged. We will use this "gateway" branch as a target branch filter to trigger our release pipeline. You will get more details on this below. Here’s an example on how we can do that:
The branching strategy in place will allow Mary to merge her code from her feature branch into the "UAT-ready" gateway branch. You can configure the continuous deployment trigger as follows:
1. From the Task version selection dropdown, select “2 (preview)”
2. Select the “Validate and deploy” operation type
3. Select the root folder of the "UAT-ready" branch. This is where you are merging your code through a Pull Request.
In this second use case, I’m using my UAT workspace as the target for this deployment.
Save your changes and do not create any release.
And, as expected, there is no sign of any code related to Project Y being published in the DEV environment.
And with this last step, we conclude this second use case.
Automating the publish action has been one of the most challenging tasks in Synapse CICD. By introducing these new operation types (“Validate” and “Validate and Deploy”) in the Workspace Deployment task V2, we are bringing new enhancements that will facilitate this automation. In this article, I have demonstrated how to take advantage of these new features using two simple scenarios. Combining creativity with engineering will bring new automation capabilities to your CICD use cases.