Automated publish improvement in ADF's CI/CD flow
Published Feb 08 2021 05:40 PM 15.2K Views
Microsoft

The "Automated publish" improvement takes the 'Validate all' and 'Export ARM template' functionality from the ADF UI and makes the logic consumable via a publicly available npm package, @microsoft/azure-data-factory-utilities. This lets you trigger these actions programmatically instead of going to the ADF UI and clicking 'Publish', which gives your CI/CD pipelines a truer continuous integration experience.

 

Current CI/CD flow

  1. Each user makes changes in their private branches.
  2. Push to master is forbidden; users must create a PR to master to make changes.
  3. Users must open the ADF UI and click Publish to deploy changes to Data Factory and generate the ARM templates in the publish branch.
  4. DevOps Release pipeline is configured to create a new release and deploy the ARM template each time a new change is pushed to the publish branch.

current-ci-cd-flow.png

 

 

Manual step in current CI/CD flow

In the current CI/CD flow, the ADF UI is the intermediary that creates the ARM template; therefore, a user must go to the ADF UI and manually click Publish to start the ARM template generation and drop the template in the publish branch. This manual step can be a problem for teams that expect the CI/CD process to be fully automated.

 

 

The new CI/CD flow

  1. Each user makes changes in their private branches.
  2. Push to master is forbidden; users must create a PR to master to make changes.
  3. An Azure DevOps build pipeline is triggered every time a new commit is made to master. It validates the resources and, if validation succeeds, generates an ARM template as an artefact.
  4. DevOps Release pipeline is configured to create a new release and deploy the ARM template each time a new build is available.

new-ci-cd-flow.png

 

For more details on implementing the new flow, check out our documentation
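
As a rough illustration of step 3 of the new flow, a build can call the package through an npm script. The sketch below is not taken from this article: it assumes the repository has a package.json with a `build` script that points at the package's entry point, and that the `validate` and `export` commands take a root folder, the factory's resource ID and (for export) an output folder, as described in the linked documentation. The subscription, resource group, factory and folder names are placeholders, so check everything against the documentation before relying on it.

```python
import subprocess


def factory_resource_id(subscription_id: str, resource_group: str, factory_name: str) -> str:
    """Build the Azure resource ID of a data factory."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.DataFactory/factories/{factory_name}"
    )


def build_commands(root_folder: str, factory_id: str, output_folder: str) -> dict:
    """Return the assumed npm invocations for the validate and export steps."""
    return {
        "validate": ["npm", "run", "build", "validate", root_folder, factory_id],
        "export": ["npm", "run", "build", "export", root_folder, factory_id, output_folder],
    }


def run_build(root_folder: str, factory_id: str, output_folder: str) -> None:
    """Validate first; export the ARM template only if validation succeeds."""
    commands = build_commands(root_folder, factory_id, output_folder)
    # check=True raises CalledProcessError on a non-zero exit code,
    # so a failed validation stops the export from running.
    subprocess.run(commands["validate"], check=True)
    subprocess.run(commands["export"], check=True)


if __name__ == "__main__":
    # Placeholder values, not taken from the article.
    factory_id = factory_resource_id(
        "00000000-0000-0000-0000-000000000000", "my-resource-group", "my-dev-factory"
    )
    for name, command in build_commands(".", factory_id, "ArmTemplate").items():
        print(name, "->", " ".join(command))
```

The example only prints the two command lines; `run_build` would execute them on a build agent where Node.js and the package are installed.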

 

Note: You can continue to use the existing mechanism (adf_publish branch) or use the new flow. Both mechanisms will be supported; the new one simply removes the additional 'publish' step in the ADF UI and does not rely on the 'adf_publish' branch. Choose the flow that works better for you.

32 Comments
Copper Contributor

@Abhishek Narain, I've noticed that this utility doesn't consider managedVirtualNetworks and managedPrivateEndpoints when it creates ARM templates. Because of this, I am unable to use this method to automate publishing. Do you have a solution or workaround for this issue?

Microsoft

Is it possible to run validate all against a data factory in a feature branch?  Would be nice to block merges that won't pass validation before they get to master.

Brass Contributor

@bcassell-MSFT We use a dev branch AND a master branch, and make dev branch the Collaborative branch in our dev Data Factory resource. We make feature branches off dev, and then PR to dev first. The release pipeline in Azure DevOps deploys to the Dev environment first.

I think the issue with the old guidance is that the CI/CD diagram made it look like you must only have a master branch, and it looks like the new guidance suffers from the same shortsightedness. Hopefully, someone special on the DF team is reading this.

Microsoft

@kolangareth The issue was fixed in the latest NPM package. Please update the NPM package and try.

Microsoft

@Jason Kohlhoff That's fair feedback. It should be the customer's choice. The major problem we sorted with this is the automated 'publish' and 'validate', getting rid of the adf_publish branch dependency, which was updated on publish using the collaboration branch (it could be master or any other).
We will come up with better guidance on deploying feature/dev branches across environments and on the many ways customers can set up CI/CD.

Copper Contributor

@Abhishek Narain I am using the latest version, 0.1.3, but managedVirtualNetworks and managedPrivateEndpoints are still not included in the ARM template with this version.

Microsoft

@bcassell-MSFT 

This package is not exclusive to the collaboration branch; you can generate an ARM template based on any branch.

Also, it is possible to run a build when a PR is raised and add it as a mandatory check before merging. You can do this in the branch policies section, for example:

 

CesarBerard_0-1614623144737.png

 

Note that you can also run a build every time there's a new commit in master.

Copper Contributor

@Abhishek Narain I have noticed, while testing the ADF NPM utility, that deleted pipelines are not being removed from the ADF environment.

 

Does the ADF utility not support this feature?

 

Do we need to follow some steps apart from the utility?

Copper Contributor

Hello Microsoft team,

 

NPM Utility is removing Azure DevOps Git configuration from my Data Factory integration.

 

I have to set it up again after every build. Is there a fix for this?

Microsoft

@skammili 

 

About the resources not deleted:

Since ARM template deployment is an incremental operation, resources are not deleted by design. To delete them, you need to run a post-deployment PowerShell script, which is included in the output of the package (PrePostDeploymentScript.ps1). Here's more info about this script and how to implement it in your release definition.
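
To see why a cleanup step is needed, the sketch below compares the resource names already deployed in a target factory with the names found in a generated template; whatever is deployed but missing from the template is what an incremental deployment leaves behind. It is an illustration only, not the logic of PrePostDeploymentScript.ps1. The resource name format and the sample names are assumptions.

```python
import re

# Assumed shape of a resource name in a generated ADF ARM template:
#   "[concat(parameters('factoryName'), '/MyPipeline')]"
_NAME_PATTERN = re.compile(r"'/([^']+)'\)\]$")

PIPELINE_TYPE = "Microsoft.DataFactory/factories/pipelines"


def names_in_template(template: dict, resource_type: str) -> set:
    """Collect the short names of all resources of one type in the template."""
    names = set()
    for resource in template.get("resources", []):
        if resource.get("type") != resource_type:
            continue
        match = _NAME_PATTERN.search(resource.get("name", ""))
        if match:
            names.add(match.group(1))
    return names


def orphaned(deployed_names, template: dict, resource_type: str) -> list:
    """Names that exist in the target factory but not in the template."""
    return sorted(set(deployed_names) - names_in_template(template, resource_type))


# Hypothetical data: the template holds one pipeline, the factory holds two.
template = {
    "resources": [
        {
            "type": PIPELINE_TYPE,
            "name": "[concat(parameters('factoryName'), '/CopySales')]",
        }
    ]
}
deployed = ["CopySales", "OldPipeline"]

print(orphaned(deployed, template, PIPELINE_TYPE))  # ['OldPipeline']
```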

 

 

Regarding the Git problem:

 

It is possible that you are including global parameters in the ARM template. Global parameters are located in the factory entity, so when they are added, the factory entity is added to the ARM template as well; the Git configuration, however, is not. You can choose not to include them in the ARM template and use the Global Parameters post-deployment script to update them in the target environment instead. This script is also provided in the output of the package (GlobalParametersUpdateScript.ps1). Here's more info about global parameters and CI/CD, and here is more info about deployment with the PS script.
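
One way to spot this situation before deploying is to inspect the generated template for a factory-level resource. The sketch below assumes that the factory entity appears with the type Microsoft.DataFactory/factories and that Git settings would sit under properties.repoConfiguration; both are assumptions about the template layout, and the sample template is made up.

```python
import json

FACTORY_TYPE = "Microsoft.DataFactory/factories"


def factory_resources(template: dict) -> list:
    """Return the factory-level resources contained in the template."""
    return [r for r in template.get("resources", []) if r.get("type") == FACTORY_TYPE]


def may_reset_git_config(template: dict) -> bool:
    """True if a factory resource is present without any Git configuration."""
    return any(
        "repoConfiguration" not in resource.get("properties", {})
        for resource in factory_resources(template)
    )


# Hypothetical template: global parameters pulled the factory entity in.
sample = json.loads("""
{
  "resources": [
    {
      "type": "Microsoft.DataFactory/factories",
      "name": "[parameters('factoryName')]",
      "properties": {
        "globalParameters": {"Env": {"type": "string", "value": "dev"}}
      }
    },
    {
      "type": "Microsoft.DataFactory/factories/pipelines",
      "name": "[concat(parameters('factoryName'), '/CopySales')]",
      "properties": {}
    }
  ]
}
""")

print(may_reset_git_config(sample))  # True
```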

Copper Contributor

Hello Microsoft team,

 

I have a data factory for the Development environment. In this data factory, both the collaboration and publish branches are set to master.

I created a feature branch from master and, in this feature branch, added a new ADF pipeline with one Set Variable activity. I then generated the ARM template using the npm utility.

The ARM template was generated successfully. I also verified that my new ADF pipeline is present in ARMTemplateForFactory.json.

 

However, after performing the deployment via the release pipeline using the ARM template generated above,

I am not able to find the new ADF pipeline in my data factory.

 

Can you tell me what I may be doing wrong?

Let me know if you need more information.

 

Copper Contributor

Hi @Abhishek Narain @CesarBerard 

Can you help me in resolving my above query ?

Copper Contributor

Hi @Abhishek Narain @CesarBerard,

 

My issue is resolved.

Copper Contributor

Hi @Abhishek Narain @CesarBerard,

Now that we are able to create the ARM template from this npm package, can you tell me whether it is possible to disable manual publishing from Data Factory? If so, can you help with some pointers?

 

Thanks in advance !!

Copper Contributor

Hi, is there a way of incorporating this new functionality into an Azure DevOps release pipeline? The example above and in the documentation show how to do this on the command line, but I would like to build this step into my DevOps YAML releases.

Copper Contributor

This has been an issue for us for a long long time. As a result we decided on a different approach and started using a slick tool based on json file deployment. It works great.

 

Check it out at https://sqlplayer.net/adftools/

 

Copper Contributor

Hi @Abhishek Narain ,

Something is not clear to me.

We want to use this new package to validate and test the factory before the change is merged in the collaboration branch.

So: PR to collaboration branch -> build artefact produced -> deployment back in DEV and tests -> PR approved and merged -> build artefact produced for release to int and prod.

Is this correct? I understood from the documentation that this method requires this DEV deployment (which before was not needed), as a replacement of the Publish button.

How does this deployment back to dev work, given that dev also has Git integration enabled?

I have tried it, and the deployment back to dev works, but I can't find my changes in the Factory UI. They are on the feature branch and not on the collaboration branch (of course, since the merge has not happened yet), but where are the changes applied after deploying back to dev?

Thank you in advance!

Copper Contributor

Hi ,

 

I have only one instance of ADF, with different folders such as Dev, QA, Pre-Prod and Prod.

 

Can I still use CI/CD to create the pipelines I built in the Dev folder in the QA folder?

 

Regards,

Amit

Copper Contributor

Hi @Abhishek Narain,

 

This is a great utility for avoiding the manual publish, which has been a pain for our team for more than a year. However, it doesn't seem to be supported for the data factory pipelines in Synapse.

 

https://docs.microsoft.com/en-us/azure/data-factory/continuous-integration-deployment-improvements

 

Can we expect support for Synapse anytime soon?

Copper Contributor

Hello all,

If anyone is interested in building, validating and exporting ARM templates automatically as part of the CI process, this is your friend:

https://marketplace.visualstudio.com/items?itemName=SQLPlayer.DataFactoryTools

One of the tasks uses the Microsoft NPM library behind the scenes and can validate your ADF:

sqlplayer_0-1621364458966.png

Then you can use:

1) the standard approach of deploying ARM templates to the target ADF

2) or deploy from code (directly from branch) using Publish ADF task from the above Azure DevOps extension.

I'm the author of this #adftools. More info: https://sqlplayer.net/adftools/

To understand the differences between these two methods:

Two methods of deployment Azure Data Factory

Copper Contributor

@AmitGhotikar having only one instance of ADF is definitely a bad idea and is asking for trouble.

You should have one ADF per environment. All objects should have the same names, only selected properties must be substituted.

Furthermore, DEV environment (with git configured) should be covered by the DevOps deployment process.

Copper Contributor

Hi @Abhishek Narain , How can we run validation when the ADF has global parameters? Could not find any reference for the same.

Copper Contributor

Hello @Abhishek Narain ,

 

Do you know of any sources that might show how this process would be set up end to end? This would be for someone starting out in Azure DevOps and while I know how to set up a release pipeline, I am getting a little bit confused as to what specific tasks an Agent would need to do to be able to implement what you have provided.

 

Thank you

Copper Contributor

@vdedkov Take a look at this page: https://docs.microsoft.com/en-us/azure/data-factory/continuous-integration-deployment-improvements#c...

There is a whole YAML file which should be helpful.

Copper Contributor

Hi @Abhishek Narain, thanks for this post. In my company we've been using the

 

 

run build validate <datafactory>

 

 

for a while to build our data factory in Azure, but recently we used a Data Flow activity with ADX (as both source and sink), and the validation didn't go through, even though it works and validates correctly in ADF. It seems the package @microsoft/azure-data-factory-utilities hasn't been updated to include ADX in data flows.

Copper Contributor

Hi @Abhishek Narain & Team,

 

If the data factory in the dev environment has global parameters defined and those global parameters are used in a data factory pipeline, the npm package gives a validation error saying that the global parameter is not defined.

I have tried with 'Include in ARM template' both disabled and enabled, and it gives the same error that the global parameter is not found. Is it right to say that global parameters must not be present in the initial ADF dev environment, and only then can we validate, build and generate the ARM template and deploy to another environment? Please correct me if I am missing anything here.

Note that to deploy global parameters to another environment I referred to the doc and used the PowerShell script. The NPM package only gives this issue when a global parameter is already present and used in a data factory pipeline. Please suggest.

Thanks in advance!!

Copper Contributor

Hi team,

 

Any comments on the above query?

 

Thanks!!

Copper Contributor

Hi, 

 

We have the same issue as @kirti20feb.

 

Is there a way to make the npm utils understand or skip the global parameters thing? When I run export via the UI or Validate it works fine but with the NPM tool I get the following:

 

 ERROR === Validator: Unhandled validation error for: pipeline - pl_backbook_data_drift_monitoring, error: {"stack":"TypeError: Cannot read property 'parameters' of undefined\n    at n.getConceptParameters (/home/vsts/work/1/s/downloads/main.js:2:11988289)\n    at n.e.resolve (/home/vsts/work/1/s/downloads/main.js:2:11985799)\n    at Function.n._getResolution (/home/vsts/work/1/s/downloads/main.js:2:12048312)\n    at Object.resolve (/home/vsts/work/1/s/downloads/main.js:2:12050159)\n    at Function.<anonymous> (/home/vsts/work/1/s/downloads/main.js:2:12111012)\n    at /home/vsts/work/1/s/downloads/main.js:2:10601093\n    at Object.next (/home/vsts/work/1/s/downloads/main.js:2:10601198)\n    at /home/vsts/work/1/s/downloads/main.js:2:10600135\n    at new Promise (<anonymous>)\n    at c (/home/vsts/work/1/s/downloads/main.js:2:10599880)","message":"Cannot read property 'parameters' of undefined"}

 

 

Copper Contributor

@edgBR unfortunately I have no experience with global parameters. Did you try raising a support ticket with MS from the Azure portal? A quick search on the internet shows that many have faced issues with global parameters when using npm, but I didn't find any solution.

Copper Contributor

@edgBR @jigar191089 

When global parameters are used in ADF data pipelines, the ADO pipeline fails with the error "Parameter Env was not found under $(azuredatafactory)Npm failed with return code: 1".

It only works when the global parameters are defined in ADF but not used in any ADF data pipeline.

This seems to be as per the current design and appears to be a bug that will be fixed in future releases.
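
While waiting for a fix, it can help to list which global parameters the pipelines actually reference and which of them are missing from the factory definition in the repository. The sketch below does not work around the validation error; it is only a diagnostic. It assumes pipelines reference global parameters with the expression form pipeline().globalParameters.<name> and that the factory JSON stores them under properties.globalParameters; the sample data is made up.

```python
import re

# Matches references such as "@pipeline().globalParameters.Env".
_REFERENCE = re.compile(r"pipeline\(\)\.globalParameters\.([A-Za-z_][A-Za-z0-9_]*)")


def referenced_global_parameters(pipeline_json_text: str) -> set:
    """Names of all global parameters referenced in one pipeline file."""
    return set(_REFERENCE.findall(pipeline_json_text))


def missing_global_parameters(pipeline_json_text: str, factory: dict) -> list:
    """Referenced global parameters that the factory definition does not declare."""
    defined = set(factory.get("properties", {}).get("globalParameters", {}))
    return sorted(referenced_global_parameters(pipeline_json_text) - defined)


# Hypothetical pipeline fragment and factory definition.
pipeline_text = '{"value": "@pipeline().globalParameters.Env", "type": "Expression"}'
factory = {"properties": {"globalParameters": {}}}

print(missing_global_parameters(pipeline_text, factory))  # ['Env']
```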

 

Last update: Oct 06 2021 09:45 AM