Orchestration allows conditional logic and enables users to take different paths based on the outcome of a previous activity. Building on the concept of conditional paths, ADF and Synapse pipelines let users build versatile, resilient workflows that handle unexpected errors and run smoothly on autopilot.
This is an ongoing series that gradually levels up to help you build more complicated logic that handles more scenarios. We will walk through examples of some common use cases and help you build functional, useful workflows.
Before diving deep into pipeline logic and complicated workflows, we will start with the basic building blocks.
Azure Data Factory and Synapse Pipeline orchestration allows conditional logic and enables users to take different paths based on the outcome of a previous activity. Using different paths allows users to build robust pipelines and incorporate error handling into ETL/ELT logic. In total, four conditional paths are available:
Upon Success: (Default Pass) Execute this path if the current activity succeeded
Upon Failure: Execute this path if the current activity failed
Upon Completion: Execute this path after the current activity completed, regardless of whether it succeeded
Upon Skip: Execute this path if the activity itself didn't run
You may add multiple branches following an activity, with one exception: an "Upon Completion" path can't coexist with either an "Upon Success" or an "Upon Failure" path. Note that for each pipeline run, at most one path is activated, based on the execution outcome of the activity.
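In pipeline JSON, each path is stored as a dependency condition on the downstream activity. The sketch below models, in Python, how the four path names map to `dependencyConditions` values and when each one fires. The function `path_fires` is an illustrative helper, not part of any ADF SDK, and labeling an activity that never ran with the outcome "Skipped" is a modeling convenience of this sketch.

```python
# UI path name -> value stored in an activity's "dependencyConditions" list.
PATH_TO_CONDITION = {
    "Upon Success": "Succeeded",
    "Upon Failure": "Failed",
    "Upon Completion": "Completed",
    "Upon Skip": "Skipped",
}


def path_fires(path: str, upstream_outcome: str) -> bool:
    """Return True if `path` is taken for the given upstream outcome.

    `upstream_outcome` is "Succeeded", "Failed", or "Skipped"
    (labels this sketch uses for the three possible outcomes).
    """
    condition = PATH_TO_CONDITION[path]
    if condition == "Completed":
        # Completion fires whether the activity succeeded or failed,
        # but not when it never ran.
        return upstream_outcome in ("Succeeded", "Failed")
    return condition == upstream_outcome


for outcome in ("Succeeded", "Failed", "Skipped"):
    taken = [p for p in PATH_TO_CONDITION if path_fires(p, outcome)]
    print(outcome, "->", taken)
```

Because Upon Completion overlaps with both Upon Success and Upon Failure, it cannot be combined with either of them on the same activity.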
After an activity has run and completed, you can reference its status with @activity('ActivityName').Status. The value is either "Succeeded" or "Failed". We'll use this expression to build more complicated workflows.
Note that a skipped activity has no Status field, so this field never holds the value "Skipped".
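One way to use the status expression is inside an If Condition activity that follows the monitored activity on its Upon Completion path, so that the expression is evaluated for both outcomes. The sketch below builds such a definition as a Python dictionary and prints it as JSON. The activity names `CopyData` and `CheckCopyStatus` are hypothetical, and the two branch lists are left empty.

```python
import json

# Hypothetical name of the activity whose outcome is being checked.
monitored = "CopyData"

if_condition = {
    "name": "CheckCopyStatus",  # hypothetical
    "type": "IfCondition",
    # Run after CopyData finishes, whether it succeeded or failed.
    "dependsOn": [
        {"activity": monitored, "dependencyConditions": ["Completed"]}
    ],
    "typeProperties": {
        "expression": {
            "value": f"@equals(activity('{monitored}').Status, 'Succeeded')",
            "type": "Expression",
        },
        "ifTrueActivities": [],   # activities for the success branch
        "ifFalseActivities": [],  # activities for the failure branch
    },
}

print(json.dumps(if_condition, indent=2))
```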
Let's get our hands dirty and start building some workflows.
#1 Error handling and notifications
The single most important piece of workflow logic in ADF is error handling and notification. It allows a pipeline to invoke an error handling script or send out a notification when a step fails. It should be incorporated as a best practice for all mission-critical steps that need fallback alternatives or logging.
Add the mission-critical work step
Add an error handling/logging/notification step on the Upon Failure path
(Optional) Add next steps on the Upon Success path
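The three steps above could be expressed in pipeline JSON roughly as follows. This is a minimal sketch that builds the `activities` list as Python dictionaries and prints it. The activity names, the chosen activity types, and the `depends_on` helper are assumptions made for illustration, and each activity's `typeProperties` are omitted.

```python
import json


def depends_on(activity: str, condition: str) -> dict:
    """Build one dependsOn entry (helper defined for this sketch)."""
    return {"activity": activity, "dependencyConditions": [condition]}


activities = [
    # Step 1: the mission-critical work step.
    {"name": "CriticalStep", "type": "Copy", "dependsOn": []},
    # Step 2: error handling on the Upon Failure path.
    {
        "name": "HandleError",
        "type": "WebActivity",
        "dependsOn": [depends_on("CriticalStep", "Failed")],
    },
    # Step 3 (optional): next step on the Upon Success path.
    {
        "name": "NextStep",
        "type": "Wait",
        "dependsOn": [depends_on("CriticalStep", "Succeeded")],
    },
]

print(json.dumps(activities, indent=2))
```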
#2 Best Effort Steps
Certain steps are less critical, and their failures shouldn't block the whole pipeline. For instance, informative logging about the start or end of a job falls into this category. In such cases, we should adopt a best-effort strategy:
Add the non-critical work step
Add the next step on the Upon Completion path
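As a rough illustration of the best-effort pattern, the sketch below attaches the next step to a non-critical logging step with a `Completed` condition and checks that the next step runs for either outcome. The names `LogJobStart`, `MainWork`, and `runs_after` are hypothetical, and the check is a simplified model of the dependency rule rather than ADF's actual scheduler.

```python
# Hypothetical activity names; typeProperties omitted for brevity.
log_start = {"name": "LogJobStart", "type": "WebActivity", "dependsOn": []}
main_work = {
    "name": "MainWork",
    "type": "Copy",
    "dependsOn": [
        {"activity": "LogJobStart", "dependencyConditions": ["Completed"]}
    ],
}


def runs_after(activity: dict, upstream_outcome: str) -> bool:
    """Check the activity's single dependency against an upstream outcome."""
    (dependency,) = activity["dependsOn"]
    conditions = dependency["dependencyConditions"]
    finished = upstream_outcome in ("Succeeded", "Failed")
    return upstream_outcome in conditions or ("Completed" in conditions and finished)


for outcome in ("Succeeded", "Failed"):
    print(f"LogJobStart {outcome}: MainWork runs = {runs_after(main_work, outcome)}")
```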
#3 Blocking Dependencies
A common requirement is that a post-processing script runs only if all previous activities succeeded, for example, sending a success notification once all three copy activities have succeeded. In ADF, this behavior is easy to achieve: declare multiple dependencies for the next step. Graphically, that means multiple lines pointing into the next activity.
To ensure that all previous activities have succeeded, make every connection with an Upon Success path.
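In JSON terms, multiple lines pointing into an activity correspond to multiple entries in its `dependsOn` list. The following sketch generates such a list for three copy activities. The names `Copy1` to `Copy3` and `SendSuccessNotification` are placeholders, and the notification is assumed, for illustration only, to be a Web activity.

```python
import json

copy_names = ["Copy1", "Copy2", "Copy3"]  # hypothetical activity names

notify = {
    "name": "SendSuccessNotification",  # hypothetical
    "type": "WebActivity",
    # One entry per upstream activity; all entries must be satisfied.
    "dependsOn": [
        {"activity": name, "dependencyConditions": ["Succeeded"]}
        for name in copy_names
    ],
}

print(json.dumps(notify, indent=2))
```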
#4 Non-blocking Dependencies
You can also mix Upon Success and Upon Completion paths in the pattern above to declare some steps as non-blocking. Notice that the follow-up step still waits for all steps to complete; nonetheless, it tolerates failures in some activities, adding resilience to the overall workflow.
In this case, the Wait activity still proceeds even when ActivityFailed does not succeed.
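To make the difference between blocking and non-blocking dependencies concrete, here is a small Python model of the rule. It assumes that dependencies on different activities are combined with AND and that conditions listed for the same upstream activity are alternatives. `ActivitySucceeded`, `condition_met`, and `should_run` are names invented for this sketch.

```python
def condition_met(condition: str, outcome: str) -> bool:
    """Match one dependency condition against an upstream outcome."""
    if condition == "Completed":
        return outcome in ("Succeeded", "Failed")
    return condition == outcome


def should_run(depends_on: list, outcomes: dict) -> bool:
    """True when every dependency is satisfied (dependencies are ANDed)."""
    return all(
        any(condition_met(c, outcomes[d["activity"]]) for c in d["dependencyConditions"])
        for d in depends_on
    )


# Wait tolerates a failure of ActivityFailed but requires the other to succeed.
wait_depends_on = [
    {"activity": "ActivitySucceeded", "dependencyConditions": ["Succeeded"]},
    {"activity": "ActivityFailed", "dependencyConditions": ["Completed"]},
]

outcomes = {"ActivitySucceeded": "Succeeded", "ActivityFailed": "Failed"}
print("mixed paths:", should_run(wait_depends_on, outcomes))

# For comparison: the same outcomes with every path set to Upon Success.
strict = [
    {"activity": name, "dependencyConditions": ["Succeeded"]} for name in outcomes
]
print("all Upon Success:", should_run(strict, outcomes))
```

With the mixed paths the follow-up step runs, whereas the all-Upon-Success variant from the previous section would not run for the same outcomes.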
In the next installment, we will advance to more complicated scenarios and discuss how to achieve OR in orchestration, to implement:
Invoke a shared error handling/logging activity if any activity fails
Proceed to the next step if at least one activity succeeds