When you integrate your Synapse Workspace with your Git repository, you need to define the organizational structure of your source control system (Organization -> Project -> Repository), the branch that holds your workspace's shared codebase (collaboration branch), and the branch that holds your workspace ARM templates (publish branch).
Once you integrate your workspace with your Git repository, you will no longer be authoring your code against the Synapse service. Instead, all the changes will be first committed to your Git repository before getting published in Synapse Service (Live Mode).
The Continuous Integration Process in Synapse Workspace
The Publish operation is divided into two stages. In the first stage, all the pending changes from your Git collaboration branch are stored in your workspace (Live Mode). In the second stage, the workspace ARM templates are generated and saved in the workspace publish branch. These two ARM templates represent the outcome of the Continuous Integration process in your Synapse Workspace:
TemplateForWorkspace.json is the ARM template containing all the workspace artifacts and resources.
TemplateParametersForWorkspace.json is the ARM template parameters file containing only the artifact parameters.
After integrating your Synapse Workspace, your first Publish will generate a TemplateParametersForWorkspace.json file containing a global parameter for your workspace name and a parameter for each workspace default linked service: the default SQL Server and the default Storage account.
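To see which parameters a publish has exposed, you can list the keys of the parameters file. The following is a minimal sketch, assuming the standard ARM parameters-file layout (a top-level "parameters" object whose entries carry a "value"). The sample content, including every parameter name and value, is a hypothetical placeholder and not output from a real workspace.

```python
import json

# Hypothetical parameters file: names and values are placeholders, not real output.
SAMPLE_PARAMETERS = """
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "workspaceName": {"value": "myworkspace"},
    "defaultSqlServer_connectionString": {"value": "<connection string>"},
    "defaultStorage_url": {"value": "https://<account>.dfs.core.windows.net"}
  }
}
"""


def list_parameters(parameters_json: str) -> dict:
    """Map each parameter name to its value (None when no value is given)."""
    document = json.loads(parameters_json)
    return {
        name: entry.get("value")
        for name, entry in document.get("parameters", {}).items()
    }


for name, value in list_parameters(SAMPLE_PARAMETERS).items():
    print(f"{name} = {value}")
```

Against a real publish branch, you would pass the text of TemplateParametersForWorkspace.json instead of the sample string.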
Where are these parameters coming from?
In Synapse Studio, go to the "Manage Hub", select "Linked Services", hover over one of the workspace default linked services (in this example, I'm selecting the default SQL Server), and select "Code" ({}).
You can see these properties highlighted below that are being exposed by the Workspace default parameter template.
Now let’s create a new Linked Service, using the Azure Key Vault connector, and publish the pending changes to generate the new ARM templates.
Check the TemplateParametersForWorkspace.json in your publish branch to confirm that the new AKV linked service "baseUrl" property is also being exposed by the default parameter template.
Now let’s create and publish a Notebook attached to an existing Spark Pool.
Check the TemplateParametersForWorkspace.json in your publish branch. No sign of any notebook property, right?
But if you check the TemplateForWorkspace.json in your publish branch, you will find several notebook properties!! Here’s a clear example of an artifact whose properties are not exposed by the default workspace parameters template.
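One way to confirm that an artifact is present in the main template even though none of its properties are parameterized is to group the template's resources by type. This is a rough sketch, assuming the usual ARM layout of a top-level "resources" array whose entries have "type" and "name" fields. The sample resources are hypothetical and simplified (real resource names are typically ARM expressions rather than plain strings).

```python
import json
from collections import defaultdict

# Hypothetical, heavily simplified main template.
SAMPLE_TEMPLATE = """
{
  "resources": [
    {"type": "Microsoft.Synapse/workspaces/linkedServices", "name": "AKV1"},
    {"type": "Microsoft.Synapse/workspaces/notebooks", "name": "NotebookA"},
    {"type": "Microsoft.Synapse/workspaces/datasets", "name": "DatasetA"}
  ]
}
"""


def resources_by_type(template_json: str) -> dict:
    """Group resource names by their ARM resource type."""
    grouped = defaultdict(list)
    for resource in json.loads(template_json).get("resources", []):
        grouped[resource.get("type", "<unknown>")].append(resource.get("name", ""))
    return dict(grouped)


for resource_type, names in resources_by_type(SAMPLE_TEMPLATE).items():
    print(resource_type, names)
```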
Let's use a different kind of artifact, a Dataset, and see if the default template will expose its properties.
Again, no sign of these Dataset properties in the parameters file:
Yet these Datasets are part of the main template file, with several properties associated:
The use case for a customized parameters template in Synapse is simple: you want to automate your CI/CD process in Synapse and you need to override an artifact property that is not parameterized by the default parameters template.
After publishing your pending changes from the collaboration branch into Synapse Service (Live Mode), Synapse will verify if there is any custom template file stored in the root folder of your collaboration branch with this exact name “template-parameters-definition.json”. If this file exists, Synapse will use its configuration to generate the ARM template parameters; if it does not exist, it will use the default parameters template.
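The lookup described above can be mimicked against a local checkout of the collaboration branch. This is only an illustrative sketch: the function name and return values are made up for this example, and it says nothing about how Synapse performs the check internally. It compares file names exactly, so a differently cased or misspelled file does not count.

```python
from pathlib import Path

CUSTOM_DEFINITION = "template-parameters-definition.json"


def parameter_definition_source(collab_branch_root: Path) -> str:
    """Return "custom" if the exact file name exists in the root folder, else "default"."""
    for entry in collab_branch_root.iterdir():
        # Exact, case-sensitive name match, regardless of the file system.
        if entry.is_file() and entry.name == CUSTOM_DEFINITION:
            return "custom"
    return "default"


print(parameter_definition_source(Path(".")))
```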
From your Azure DevOps collaboration branch, hit the “More Actions” button and then select + New -> File to create a new file in the root folder of your collaboration branch.
Important: Create a new file with this exact name: template-parameters-definition.json
Hit the “Create” button and copy the parameters template definition JSON example from the Microsoft public documentation: Continuous integration & delivery in Azure Synapse Analytics - Azure Synapse Analytics | Microsoft D...
Paste the JSON content into the new template-parameters-definition.json file. Don’t forget to select “Commit” to save your changes.
Now that we have saved the custom template file, it’s time to generate the new Synapse Workspace ARM templates.
Switch to Synapse Studio and make a minor change in your code to force a new commit. Then publish this change to the Live Service to generate the new ARM template files.
Once the ARM templates are generated, check the TemplateParametersForWorkspace.json file in your publish branch. Its content will now look quite different from the original, as you have now exposed more properties to parameterize.
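To see exactly what changed between two publishes, you can compare the parameter names of the old and new parameters files. The sketch below assumes the standard ARM parameters-file layout. The two documents it compares are hypothetical stand-ins for the file before and after the custom template was committed.

```python
import json


def parameter_names(parameters_json: str) -> set:
    """Return the set of parameter names in an ARM parameters file."""
    return set(json.loads(parameters_json).get("parameters", {}))


def diff_parameters(before_json: str, after_json: str) -> dict:
    """List the parameter names added and removed between two versions."""
    before, after = parameter_names(before_json), parameter_names(after_json)
    return {"added": sorted(after - before), "removed": sorted(before - after)}


# Hypothetical content, for illustration only.
before = json.dumps({"parameters": {"workspaceName": {"value": "ws"}}})
after = json.dumps({
    "parameters": {
        "workspaceName": {"value": "ws"},
        "DatasetA_properties_typeProperties_columnDelimiter": {"value": ","},
    }
})
print(diff_parameters(before, after))
```

With a local clone of the publish branch, the two inputs could come from two commits of the same file.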
You may ask: if we have published both the Storage Account and the Synapse SQL pool datasets, why are the properties of the latter missing from TemplateParametersForWorkspace.json?
Let’s take a look at the template-parameters-definition.json file and check the /datasets section:
We are exposing any key-value pairs that are included under the “properties” -> “typeProperties” object.
Let’s analyze the JSON code associated with each dataset.
Starting with the Storage Account dataset, we have four key-value pairs listed under the “typeProperties” object:
Looking at the TemplateParametersForWorkspace.json file, we confirm the presence of these properties.
Now let's look at the Synapse SQL pool dataset JSON.
Since there are no key-value pairs under the “typeProperties” object, no properties will be exposed in the ARM template to parameterize.
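The difference between the two datasets can be illustrated with a small model of the rule. This is a sketch under stated assumptions, not Synapse's actual generator: it assumes each key directly under “properties” -> “typeProperties” becomes one parameter named `<dataset>_properties_typeProperties_<key>`, following the naming pattern seen elsewhere in this article. The dataset names, keys, and values are hypothetical.

```python
def exposed_type_properties(dataset_name: str, dataset: dict) -> dict:
    """Model which typeProperties entries would be exposed as parameters.

    Assumption: one parameter per key directly under properties.typeProperties.
    """
    type_properties = dataset.get("properties", {}).get("typeProperties", {})
    return {
        f"{dataset_name}_properties_typeProperties_{key}": value
        for key, value in type_properties.items()
    }


# Hypothetical datasets: one with four key-value pairs, one with none.
storage_dataset = {
    "properties": {
        "typeProperties": {
            "columnDelimiter": ",",
            "escapeChar": "\\",
            "firstRowAsHeader": True,
            "quoteChar": '"',
        }
    }
}
sql_pool_dataset = {"properties": {"typeProperties": {}}}

print(exposed_type_properties("StorageDataset", storage_dataset))
print(exposed_type_properties("SqlPoolDataset", sql_pool_dataset))
```

An empty “typeProperties” object gives the rule nothing to match, which is why the second dataset contributes no parameters.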
Microsoft strongly recommends that you prepare your pools before migrating the workspace artifacts, making sure you use the same name for your pools across your environments. In some circumstances, you may need to attach your artifacts to a different pool in your target environment. Using a custom parameter template can help you achieve this goal.
In this example, I’m going to show how you can take advantage of custom parameterization in your parameters template to attach a Notebook to a different Spark Pool, when deploying this Notebook to a target environment hosting a Spark pool with a different name.
Consider the case where you have two environments, each hosting a Spark Pool with a different name.
DEV Environment | UAT Environment
Here’s an example of a Notebook in the DEV environment that is attached to a Spark Pool named “mysparkpooldev”.
Taking a closer look at the Notebook JSON code, there are multiple properties where this Spark Pool is being referenced.
Now we need to edit the template definition file (template-parameters-definition.json) and find the Microsoft.Synapse/workspaces/notebooks section to expose these additional properties. This is the code that you need to add to that section:
"Microsoft.Synapse/workspaces/notebooks": {
    "properties": {
        "bigDataPool": {
            "referenceName": "="
        },
        "metadata": {
            "a365ComputeOptions": {
                "id": "=",
                "name": "=",
                "endpoint": "="
            }
        }
    }
}
Don’t forget to select "Commit" to save your changes.
Switch now to Synapse Studio, and make sure you make a minor change in your notebook to force a commit and publish your changes. This will generate the ARM templates based on the new template definition file.
Once the template generation is finished, check the TemplateParametersForWorkspace.json in your workspace publish branch to confirm that the new notebook parameters are now being exposed.
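To know which names to look for, you can derive them from the definition. The sketch below is a simplified model, not Synapse's real generator: it assumes a parameter name is the artifact name followed by the property path, joined with underscores, as in “NotebookA_properties_metadata_a365ComputeOptions_id”. The notebook JSON is a hypothetical reduction with placeholder values, and the helper names are invented for this example.

```python
# Definition fragment matching the notebooks section of the custom template.
NOTEBOOK_DEFINITION = {
    "properties": {
        "bigDataPool": {"referenceName": "="},
        "metadata": {
            "a365ComputeOptions": {"id": "=", "name": "=", "endpoint": "="}
        },
    }
}

# Hypothetical notebook JSON, reduced to its Spark pool references.
NOTEBOOK_A = {
    "properties": {
        "bigDataPool": {"referenceName": "mysparkpooldev", "type": "BigDataPoolReference"},
        "metadata": {
            "a365ComputeOptions": {
                "id": "<dev_spark_pool_resource_id>",
                "name": "mysparkpooldev",
                "endpoint": "<dev_spark_pool_endpoint>",
            }
        },
    }
}


def expected_parameters(artifact_name, definition, artifact, path=()):
    """Yield a parameter name for every property marked "=" that the artifact has."""
    for key, rule in definition.items():
        if not isinstance(artifact, dict) or key not in artifact:
            continue
        if isinstance(rule, dict):
            # Descend into nested objects, extending the property path.
            yield from expected_parameters(artifact_name, rule, artifact[key], path + (key,))
        elif rule == "=":
            yield "_".join((artifact_name, *path, key))


def missing_parameters(expected, published_names):
    """Return the expected names that the published parameters file lacks."""
    return [name for name in expected if name not in published_names]


expected = list(expected_parameters("NotebookA", NOTEBOOK_DEFINITION, NOTEBOOK_A))
print(expected)
```

Passing the parameter names from the published file to `missing_parameters` then shows whether anything expected was left out.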
Once you confirm that the necessary properties are being exposed, you can configure the Workspace Deployment task in your Release Pipeline and add these new parameters in the “OverrideParameters” section.
As an example, I’m overriding the parameter “NotebookA_properties_metadata_a365ComputeOptions_id” with the resource URI of the target Spark Pool:
/subscriptions/<target_workspace_subscription>/resourceGroups/<target_workspace_RG>/providers/Microsoft.Synapse/workspaces/<target_workspace_name>/bigDataPools/<target_spark_pool>
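Building the override value by hand is error-prone, so it can help to assemble it in a small script. The following sketch formats the resource URI shown above and renders the overrides as space-separated `-name value` pairs. That rendering is an assumption about what the “OverrideParameters” field expects, so verify it against your deployment task. All subscription, resource group, workspace, and pool names are hypothetical.

```python
def spark_pool_resource_id(subscription_id: str, resource_group: str,
                           workspace: str, pool: str) -> str:
    """Build the resource URI of a Spark pool in the target workspace."""
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Synapse/workspaces/{workspace}/bigDataPools/{pool}"
    )


def override_parameters(overrides: dict) -> str:
    """Render overrides as '-name value' pairs (assumed task input format)."""
    # Note: values containing spaces would need quoting, which is not handled here.
    return " ".join(f"-{name} {value}" for name, value in overrides.items())


# Hypothetical target environment.
target_id = spark_pool_resource_id(
    "00000000-0000-0000-0000-000000000000", "rg-uat", "myworkspaceuat", "mysparkpooluat"
)
print(override_parameters({
    "NotebookA_properties_metadata_a365ComputeOptions_id": target_id,
}))
```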
After executing your Release Pipeline in Azure DevOps, go to your target Synapse Workspace and open the Notebook to confirm that it is now attached to a Spark pool with a different name.
To simplify the parameter overriding operation and code maintenance, you can take advantage of a custom parameters template to provide shorter names to your parameters.
Let’s take this parameter name as example: NotebookA_properties_metadata_a365ComputeOptions_id. Lengthy name, right?
Let's make this parameter name shorter, like “NotebookA_meta_id”.
You just need to edit the Notebooks section in your template-parameters-definition.json and use the custom parameter syntax described in the Microsoft documentation on custom parameters.
Use the format <action>:<name>:<stype>
<action> -> we are using the “=” character to keep the current value as the default value for the parameter.
<name> -> we are using the “-” character (because we don’t want to keep the default name) followed by the new name.
<stype> -> we don’t want to change the default type, so we are omitting this value (by default the parameter type is a string).
Here’s what the Notebook section will look like:
Now if you switch back to Synapse Studio and publish any pending changes from your collaboration branch to generate the new ARM templates, you will see that this Notebook parameter has been renamed from “NotebookA_properties_metadata_a365ComputeOptions_id” to “NotebookA_meta_id”.
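The three-part syntax can be unpacked with a few lines of code, which makes the role of each field easier to see. This is an illustrative parser only, based on the description in this article. The spec string "=:-meta_id" is a hypothetical value, and how Synapse composes the final parameter name (for example, prefixing the artifact name) is not modeled here.

```python
from typing import NamedTuple


class ParameterSpec(NamedTuple):
    keep_default: bool  # True when <action> is "="
    custom_name: bool   # True when <name> starts with "-"
    name: str           # the new name, or the property name if <name> is blank
    stype: str          # parameter type, "string" when omitted


def parse_spec(spec: str, property_name: str) -> ParameterSpec:
    """Split an <action>:<name>:<stype> string into its three fields."""
    action, _, rest = spec.partition(":")
    name, _, stype = rest.partition(":")
    custom_name = name.startswith("-")
    if custom_name:
        name = name[1:]  # drop the leading "-" marker
    return ParameterSpec(action == "=", custom_name, name or property_name, stype or "string")


print(parse_spec("=:-meta_id", "id"))  # hypothetical spec with a custom name
print(parse_spec("=", "id"))           # plain "=" keeps the default name and type
```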
At the time of this writing, Synapse will fail to generate the ARM templates if either of them exceeds the 20 MB limit.
If you hit this limit and the publish operation fails to generate the ARM templates, you can evaluate whether a custom parameters template that gives your parameters shorter names reduces the ARM file size enough to allow the templates to be generated.
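To judge how close the last successfully generated templates are to the limit, you can measure them in a local clone of the publish branch. This is a minimal sketch: it assumes that “20MB” means 20 × 1024 × 1024 bytes (the exact definition is not stated here), and the folder you pass in is whichever folder of your publish branch holds the two template files.

```python
from pathlib import Path

LIMIT_BYTES = 20 * 1024 * 1024  # assumption: "20MB" read as 20 MiB
TEMPLATE_FILES = ("TemplateForWorkspace.json", "TemplateParametersForWorkspace.json")


def template_usage(template_dir: Path, limit: int = LIMIT_BYTES) -> dict:
    """Return {file name: fraction of the limit used} for the templates that exist."""
    usage = {}
    for name in TEMPLATE_FILES:
        path = template_dir / name
        if path.is_file():
            usage[name] = path.stat().st_size / limit
    return usage


def report(template_dir: Path) -> None:
    """Print how much of the limit each template file uses."""
    for name, fraction in template_usage(template_dir).items():
        print(f"{name}: {fraction:.1%} of the limit")


report(Path("."))
```

Running the same measurement before and after renaming parameters shows whether the shorter names make a meaningful difference.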
When using automated CI/CD in Azure Synapse Analytics, users can take advantage of custom parameters to extend the capabilities of the default Workspace template, allowing the exposure and the overriding of any artifact property that is not parameterized by default.
Source control in Synapse Studio - Azure Synapse Analytics | Microsoft Docs
Learn how to configure source control in Synapse Studio
Create custom parameters in the workspace template
Learn how to use custom parameters in Synapse CICD
Best practices for CI/CD in Azure Synapse Analytics
If you're using Git integration with your Azure Synapse workspace we recommend these best practices