Blog Post
Smart Pipelines Orchestration: Designing Predictable Data Platforms on Shared Spark
Great thought!
Just wondering: how can we extend this to use cases where different jobs are triggered from parallel pipeline executions and have no common parent pipeline?
Hello Paarath Gupta,
Yes, this can be made fully dynamic using metadata-driven pipelines. In this approach, you would define three parent pipelines based on workload type: Light, Medium, and Heavy.
Each parent pipeline reads from metadata and uses a ForEach activity to control parallelism. For example, the Light parent pipeline can trigger ~20 light workloads in parallel, the Medium parent a smaller number (e.g., 2), and the Heavy parent the heavy workloads at the lowest concurrency.
The master pipeline’s responsibility is only to orchestrate execution order by invoking these three parent pipelines. Each parent pipeline, in turn, is responsible for invoking its corresponding child pipelines based on metadata.
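To make the control flow concrete, here is a minimal Python sketch of this pattern. The class names, parallelism limits, and `run_child` placeholder are illustrative assumptions, not a fixed design; in a real platform, `run_child` would be an Execute Pipeline call and the parallelism limits would be the ForEach batch counts.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-class concurrency limits, mirroring the ForEach
# batch-count settings described above (values are illustrative).
PARALLELISM = {"light": 20, "medium": 2, "heavy": 1}


def run_child(pipeline_name):
    # Placeholder for invoking a child pipeline (e.g. an
    # Execute Pipeline activity in a real orchestrator).
    return f"done:{pipeline_name}"


def run_parent(weight_class, children):
    # Each parent pipeline fans out its children with a bounded
    # degree of parallelism, like a ForEach activity's batch count.
    with ThreadPoolExecutor(max_workers=PARALLELISM[weight_class]) as pool:
        return list(pool.map(run_child, children))


def run_master(metadata):
    # The master pipeline only sequences the three parents:
    # Light -> Medium -> Heavy. Each parent filters the shared
    # metadata down to its own weight class.
    results = []
    for weight_class in ("light", "medium", "heavy"):
        children = [m["name"] for m in metadata if m["weight"] == weight_class]
        results.extend(run_parent(weight_class, children))
    return results
```

The key design point is that the master never knows about individual workloads; adding a new job is purely a metadata change.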
The metadata can be stored as JSON, and an AI agent could be used to determine the weight class for each pipeline automatically.
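As a rough sketch, the metadata document might look like the JSON below; the field names (`name`, `weight`) and pipeline names are assumptions for illustration, not a fixed schema.

```python
import json

# Illustrative metadata document describing each pipeline and its
# weight class (pipeline names and fields are hypothetical).
METADATA_JSON = """
[
  {"name": "ingest_customers",  "weight": "light"},
  {"name": "transform_orders",  "weight": "medium"},
  {"name": "rebuild_warehouse", "weight": "heavy"}
]
"""


def pipelines_by_weight(doc, weight):
    # Each parent pipeline filters the shared metadata down to
    # the pipelines in its own weight class.
    return [p["name"] for p in json.loads(doc) if p["weight"] == weight]
```

An AI agent (or a simple heuristic on historical run times) would then be responsible only for writing the `weight` field, leaving the orchestration logic unchanged.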