Azure Infrastructure Blog

Parallel AKS Node Pool Creation with Crossplane: A Version Compatibility Journey

sbalaji (Microsoft)
Jan 19, 2026

When managing Azure Kubernetes Service (AKS) clusters at scale, the ability to create multiple node pools in parallel can significantly reduce provisioning time. In this article, I'll share my journey troubleshooting and resolving node pool creation bottlenecks when using Crossplane to manage AKS infrastructure, culminating in a solution that enables true parallel creation of 30+ node pools.

 

TL;DR: Upgrading to Crossplane v2.1.3 (community edition) with Azure Provider v2.2.0 resolved sequential node pool creation issues, enabling parallel provisioning with the correct field names and resource references.

The Challenge

While using Crossplane to provision private AKS clusters with multiple node pools, I encountered a critical performance issue: node pools were being created sequentially rather than in parallel, significantly impacting deployment times. With requirements to create 30-50 node pools per cluster, this sequential behavior was unacceptable for production scenarios.

Initial Environment:

Investigation & Key Discoveries

The Initial Problem:

While designing scalable Azure Kubernetes Service (AKS) platforms, customers often use multiple node pools to isolate workloads, optimize costs, and enforce workload-specific configurations. During one such implementation, a customer attempted to provision AKS clusters with 30–50 node pools as part of their landing zone automation. The configuration itself was valid, but overall cluster creation time stretched to multiple hours, which made the approach impractical for production use.

Observed Symptoms

  • AKS cluster creation took hours to complete.
  • Node pools were provisioned sequentially, not in parallel.
  • Each node pool took approximately 2 minutes to provision.
  • A cluster with 33 node pools required more than 1 hour for node pool creation alone.
  • This time did not include base AKS control plane provisioning.
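
These figures are consistent with simple arithmetic. As a rough sketch, assuming a flat 2 minutes per pool and, for the parallel case, a fixed number of pools in flight at once (both simplifications, with illustrative function names), the expected wall-clock time can be estimated like this:

```python
import math


def sequential_minutes(pool_count: int, minutes_per_pool: float) -> float:
    """Wall-clock time when each pool starts only after the previous one finishes."""
    return pool_count * minutes_per_pool


def parallel_minutes(pool_count: int, minutes_per_pool: float, max_concurrent: int) -> float:
    """Wall-clock time when up to max_concurrent pools are created at the same time."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    # Pools are processed in batches; each batch takes one pool's duration.
    batches = math.ceil(pool_count / max_concurrent)
    return batches * minutes_per_pool


if __name__ == "__main__":
    pools, per_pool = 33, 2.0  # figures taken from the symptoms listed above
    print(f"sequential: {sequential_minutes(pools, per_pool):.0f} min")
    print(f"parallel:   {parallel_minutes(pools, per_pool, max_concurrent=pools):.0f} min")
```

With 33 pools at 2 minutes each, the sequential estimate is 66 minutes, which matches the "more than 1 hour" observation. If all pools could be submitted together, the estimate drops to roughly the time of a single pool.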

As a result, the total deployment time exceeded acceptable operational limits. 

At first glance, the customer’s node pool creation logic appeared to be correct. Each node pool was defined independently, and the nodes within a single pool were created in parallel. The sequential behavior appeared only across node pools.

Here’s a simplified version of the Helm template used for node pool creation:

Customer's Approach & Our Alternatives:

The customer was initially using direct managed resources (KubernetesCluster and KubernetesClusterNodePool CRDs) to create Azure AKS infrastructure through Crossplane. While this approach worked, it exhibited the sequential creation behavior that was causing the performance issues.
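
To make the managed-resource approach concrete, here is a minimal sketch that renders one independent KubernetesClusterNodePool manifest per pool, each referencing the same cluster. It is not the customer's template. The API group/version, the forProvider field names (kubernetesClusterIdRef, vmSize, nodeCount), the VM size, and all resource names are assumptions for illustration and should be checked against the CRDs installed by your provider version.

```python
import json


def node_pool_manifest(cluster_name: str, pool_name: str, vm_size: str, node_count: int) -> dict:
    """Build one node pool manifest that references its cluster by name."""
    return {
        # Assumed API group/version; verify against the installed CRD.
        "apiVersion": "containerservice.azure.upbound.io/v1beta1",
        "kind": "KubernetesClusterNodePool",
        # Azure restricts node pool names (short, lowercase alphanumeric), so keep them simple.
        "metadata": {"name": pool_name},
        "spec": {
            "forProvider": {
                # Assumed field name: reference to the KubernetesCluster managed resource.
                "kubernetesClusterIdRef": {"name": cluster_name},
                "vmSize": vm_size,
                "nodeCount": node_count,
            }
        },
    }


def render_pools(cluster_name: str, pool_count: int) -> list[dict]:
    """Render pool_count independent manifests; no pool depends on another pool."""
    return [
        node_pool_manifest(cluster_name, f"np{i:02d}", "Standard_D4s_v5", 1)
        for i in range(1, pool_count + 1)
    ]


if __name__ == "__main__":
    # Hypothetical cluster name. JSON is accepted wherever YAML manifests are.
    manifest_list = {"apiVersion": "v1", "kind": "List", "items": render_pools("demo-aks", 3)}
    print(json.dumps(manifest_list, indent=2))
```

Because every manifest depends only on the cluster and not on any sibling pool, nothing in the definitions themselves forces a one-at-a-time order.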

To address this, we explored an alternative approach using Crossplane Compositions - a higher-level abstraction that allows you to define reusable infrastructure templates. We created composite resource definitions (XRDs) that could orchestrate cluster and node pool creation. However, during our investigation, we discovered that the root cause wasn't the abstraction level (managed resources vs. compositions).

Rate Limiting Investigation:

Initially, we suspected Azure API rate limiting might be the culprit behind the sequential behavior. To validate this hypothesis, we attempted to create the same AKS infrastructure using alternative Infrastructure as Code (IaC) tools: Terraform, Azure Resource Manager (ARM) templates, and PowerShell (CLI).

The results were revealing: Terraform, ARM templates, and PowerShell (CLI) all created node pools in parallel without issues. This critical discovery confirmed that Azure API rate limiting was NOT the cause.
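
The scripts used for this comparison are not shown here. One way such a check might be reproduced is with the Azure CLI's `--no-wait` flag, which returns without waiting for the long-running operation to finish, so several node pool requests can be in flight at once. The sketch below only builds and prints the commands. The resource group, cluster, and pool names are hypothetical.

```python
import shlex


def nodepool_add_command(resource_group: str, cluster: str, pool: str, node_count: int) -> list[str]:
    """Build an `az aks nodepool add` command that does not block on completion."""
    return [
        "az", "aks", "nodepool", "add",
        "--resource-group", resource_group,
        "--cluster-name", cluster,
        "--name", pool,
        "--node-count", str(node_count),
        "--no-wait",  # submit the request and return immediately
    ]


def build_commands(resource_group: str, cluster: str, pool_count: int) -> list[list[str]]:
    """One command per pool; submitting them back to back puts the requests in flight together."""
    return [
        nodepool_add_command(resource_group, cluster, f"np{i:02d}", 1)
        for i in range(1, pool_count + 1)
    ]


if __name__ == "__main__":
    # Hypothetical names; print the commands instead of executing them.
    for cmd in build_commands("demo-rg", "demo-aks", 3):
        print(shlex.join(cmd))
```

If requests submitted this way complete in roughly the time of a single pool, the Azure API is accepting parallel node pool operations and the bottleneck lies elsewhere.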

Open Issues:

During our investigation, we searched the official Crossplane GitHub repository (https://github.com/crossplane/crossplane) for any reported issues related to sequential node pool creation bottlenecks. No open or closed issues were found that documented this specific problem. This suggests the issue was specific to the combination of older provider versions and our use case, rather than a widely reported or known limitation.

Discovery: Version Compatibility Matters

After extensive testing with various Crossplane and provider versions, the breakthrough came with:

Crossplane: v2.1.3 (community edition)
Azure Provider Family: v2.2.0
- upbound/provider-family-azure
- upbound/provider-azure-containerservice
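
As an illustration of pinning these versions, the sketch below renders Crossplane Provider package manifests for the two packages listed above. The registry host (xpkg.upbound.io) and the resulting package paths are assumptions, so confirm them against the registry you actually pull from.

```python
import json

REGISTRY = "xpkg.upbound.io"  # assumed registry host


def provider_manifest(package: str, version: str) -> dict:
    """Build a Provider manifest that pins one provider package to an exact version."""
    name = package.split("/")[-1]  # e.g. "provider-azure-containerservice"
    return {
        "apiVersion": "pkg.crossplane.io/v1",
        "kind": "Provider",
        "metadata": {"name": name},
        "spec": {"package": f"{REGISTRY}/{package}:{version}"},
    }


if __name__ == "__main__":
    packages = [
        "upbound/provider-family-azure",
        "upbound/provider-azure-containerservice",
    ]
    for pkg in packages:
        print(json.dumps(provider_manifest(pkg, "v2.2.0"), indent=2))
```

Pinning an exact tag rather than a floating one keeps the Crossplane and provider combination reproducible across environments.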

 

Conclusion:

The journey from sequential to parallel node pool creation highlights the importance of staying current with Crossplane and provider versions. What initially appeared to be an Azure API rate limiting issue turned out to be a version compatibility challenge. The combination of Crossplane v2.1.3 and Azure Provider v2.2.0 delivers parallel resource creation capabilities, improving AKS cluster provisioning times at scale.

Key takeaways:

  1. Version compatibility is critical for optimal Crossplane performance
  2. Performance issues aren't always what they seem - thorough investigation revealed it wasn't Azure rate limiting
  3. Both managed resources and compositions can work, but provider version was the key factor
  4. Proper resource references enable parallel submissions
  5. CRD inspection provides authoritative field definitions
  6. Testing with alternative IaC tools (Terraform, ARM) can help validate hypotheses
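
For the CRD inspection point, a small sketch: given a CRD exported as JSON (for example with `kubectl get crd <name> -o json`), the helper below lists the field names under spec.forProvider for a chosen version. The sample CRD is a hand-written stand-in with assumed field names, not output from a real cluster.

```python
def for_provider_fields(crd: dict, version: str) -> list[str]:
    """Return the sorted field names under spec.forProvider for one CRD version."""
    for served in crd["spec"]["versions"]:
        if served["name"] == version:
            schema = served["schema"]["openAPIV3Schema"]
            for_provider = schema["properties"]["spec"]["properties"]["forProvider"]
            return sorted(for_provider.get("properties", {}))
    raise KeyError(f"version {version!r} not found in CRD")


# Hand-written stand-in for a real CRD export; field names are assumptions.
SAMPLE_CRD = {
    "spec": {
        "versions": [
            {
                "name": "v1beta1",
                "schema": {
                    "openAPIV3Schema": {
                        "properties": {
                            "spec": {
                                "properties": {
                                    "forProvider": {
                                        "properties": {
                                            "vmSize": {"type": "string"},
                                            "nodeCount": {"type": "number"},
                                            "kubernetesClusterIdRef": {"type": "object"},
                                        }
                                    }
                                }
                            }
                        }
                    }
                },
            }
        ]
    }
}

if __name__ == "__main__":
    for field in for_provider_fields(SAMPLE_CRD, "v1beta1"):
        print(field)
```

Running the same helper against the CRD from the installed provider version shows which field names that version actually accepts.
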
Updated Jan 19, 2026
Version 1.0