Azure Architecture Blog

Flexible Cooling for AI Growth: How Zonal Architecture Supports Diverse Hardware Needs

stsolo
Apr 28, 2026

Microsoft is advancing a zonal cooling strategy in its next-generation AI datacenters to handle the growing mix of liquid-cooled accelerators and traditional air-cooled systems. By tailoring cooling to the specific needs of each workload, this design increases efficiency and performance while supporting Microsoft’s energy, carbon, and water reduction goals.

By: Ricardo Bianchini, Steve Solomon, Brijesh Warrier, Martin Herbert, Jay Jochim, Husam Alissa, Pulkit Misra, Eric Peterson and Cam Turner

Context

Microsoft is pioneering zonal cooling in its next-generation AI datacenters, enabling flexible, performant, efficient, and sustainable thermal management for diverse workloads.

The unprecedented growth of artificial intelligence (AI) is transforming datacenter infrastructure. Modern facilities must now support a diverse array of IT equipment, each with distinct cooling requirements. For example, modern GPUs and other AI accelerators require liquid cooling: at power draws exceeding 1 kW per accelerator, air cooling becomes impractical because air's limited heat capacity cannot carry away the resulting thermal load. Meanwhile, general-purpose (i.e., non-AI-accelerator) hardware such as CPU-based compute, storage, and networking is expected to remain mostly air-cooled for the foreseeable future.
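To make the heat-capacity argument concrete, here is a rough back-of-the-envelope sketch using the steady-state energy balance Q = m_dot * cp * delta_T. The fluid properties are approximate textbook values near room temperature, and the 10 K coolant temperature rise is an illustrative assumption, not a figure from this post.

```python
# Approximate fluid properties near room temperature (textbook values).
AIR = {"density_kg_m3": 1.2, "cp_j_per_kg_k": 1005.0}
WATER = {"density_kg_m3": 997.0, "cp_j_per_kg_k": 4186.0}


def volumetric_flow_m3_per_s(heat_w, delta_t_k, fluid):
    """Flow needed to carry heat_w with a coolant temperature rise of delta_t_k.

    Uses the steady-state energy balance Q = m_dot * cp * delta_T.
    """
    mass_flow_kg_s = heat_w / (fluid["cp_j_per_kg_k"] * delta_t_k)
    return mass_flow_kg_s / fluid["density_kg_m3"]


heat_w = 1000.0    # 1 kW accelerator, the threshold cited above
delta_t_k = 10.0   # assumed coolant temperature rise (illustrative)

air_flow = volumetric_flow_m3_per_s(heat_w, delta_t_k, AIR)
water_flow = volumetric_flow_m3_per_s(heat_w, delta_t_k, WATER)

print(f"air:   {air_flow * 1000:.1f} L/s")
print(f"water: {water_flow * 1000 * 60:.2f} L/min")
print(f"air needs roughly {air_flow / water_flow:,.0f}x the volume of water")
```

Under these assumptions, air has to move several thousand times the volume that water does to remove the same heat, which is the intuition behind moving high-power accelerators to liquid cooling.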

Furthermore, liquid cooling offers a significant efficiency advantage: its superior heat dissipation allows coolant supply temperatures at the chip as high as 45 °C without sacrificing peak performance. In contrast, air-cooled equipment requires much lower supply temperatures, around 30 °C, for optimal efficiency.


The divergence in hardware cooling requirements creates a complex landscape that demands a flexible, adaptive strategy. As shown in Figure 1, relying on a unified facility water system (FWS) introduces major inefficiencies. For example, when served by a single-temperature loop, liquid-cooled GPU racks may receive coolant that is colder than they need, because the loop must be set for the air-cooled equipment. This inefficiency becomes more pronounced as the ratio of liquid- to air-cooled equipment increases (e.g., a 90:10 liquid-to-air ratio for NVIDIA GB300 servers), since a larger share of the equipment is unnecessarily overcooled.
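The overcooling effect of a single-temperature loop can be sketched with a small model. This is an illustration only: the `EquipmentClass` type and `overcooling` helper are hypothetical names, and the 45 °C and 30 °C figures from this post are treated here as simple per-class supply limits, which glosses over the difference between chip-level and facility-level temperatures.

```python
from dataclasses import dataclass


@dataclass
class EquipmentClass:
    name: str
    load_share: float          # fraction of total IT heat load
    max_supply_temp_c: float   # warmest supply temperature it tolerates


def overcooling(classes, loop_supply_temp_c):
    """Per-class margin (K) by which the loop is colder than needed."""
    return {c.name: c.max_supply_temp_c - loop_supply_temp_c for c in classes}


classes = [
    EquipmentClass("liquid-cooled", 0.90, 45.0),  # 90:10 split from the text
    EquipmentClass("air-cooled", 0.10, 30.0),
]

# A single loop must satisfy the most demanding (coldest) class.
unified_temp_c = min(c.max_supply_temp_c for c in classes)
margins = overcooling(classes, unified_temp_c)
overcooled_share = sum(c.load_share for c in classes if margins[c.name] > 0)

print(f"unified loop supply: {unified_temp_c:.0f} C")
print(f"margins (K): {margins}")
print(f"share of load overcooled: {overcooled_share:.0%}")
```

In this simplified view, the unified loop runs at the air-cooled limit, so the liquid-cooled share of the load receives coolant 15 K colder than it needs.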

Beyond operational efficiency, sustainability is a key priority for Microsoft even as we grow our AI infrastructure. Among our sustainability commitments, we have set goals to become carbon negative and to eliminate water evaporation as a cooling method in our next-generation datacenters. A key lever for reducing carbon emissions is improving PUE (Power Usage Effectiveness, i.e., total facility power divided by IT power), a standard measure of datacenter power and energy efficiency. Improving PUE requires dynamically matching cooling delivery to the specific needs of each equipment type, which preserves performance while reducing energy consumption.
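As a quick illustration of the PUE definition, the sketch below computes the ratio for two hypothetical facilities. The `pue` helper and all power figures are made-up examples, not Microsoft measurements.

```python
def pue(it_power_kw, cooling_power_kw, other_overhead_kw=0.0):
    """Power Usage Effectiveness: total facility power divided by IT power."""
    if it_power_kw <= 0:
        raise ValueError("IT power must be positive")
    total_kw = it_power_kw + cooling_power_kw + other_overhead_kw
    return total_kw / it_power_kw


# Hypothetical facility: same IT load, two different cooling overheads.
baseline = pue(it_power_kw=10_000, cooling_power_kw=1_500, other_overhead_kw=500)
improved = pue(it_power_kw=10_000, cooling_power_kw=1_000, other_overhead_kw=500)

print(f"baseline PUE: {baseline:.2f}")
print(f"improved PUE: {improved:.2f}")
```

A PUE of 1.0 would mean every watt goes to IT equipment, so any cooling power saved moves the ratio toward that floor.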

Zonal Cooling: Flexible by Design

Zonal cooling is a facility design that introduces multiple independent water loops, each supplying coolant at a different temperature. Figure 2 illustrates a specific implementation of the zonal concept with two facility-level zones: one loop serves air-cooled equipment, maintaining lower temperatures for human comfort and general-purpose hardware, and the other loop serves liquid-cooled AI accelerators, which can operate efficiently at higher supply temperatures. This separation enables datacenter operators to precisely match cooling supply to the requirements of each zone, avoiding the inefficiency of overcooling all equipment to the lowest common denominator.
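One way to picture the two-zone design is as an assignment problem: each piece of equipment is served by the warmest loop that is still cold enough for it. The sketch below is a hypothetical model, not Microsoft's control logic; the `Loop` type, the `assign_loop` helper, the rack names, and the loop temperatures (borrowed from the 30 °C and 45 °C figures earlier in this post) are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Loop:
    name: str
    supply_temp_c: float


def assign_loop(required_max_supply_c, loops):
    """Pick the warmest loop that is still cold enough for the equipment."""
    eligible = [l for l in loops if l.supply_temp_c <= required_max_supply_c]
    if not eligible:
        raise ValueError("no loop is cold enough for this equipment")
    return max(eligible, key=lambda l: l.supply_temp_c)


# Two facility-level zones (temperatures are illustrative assumptions).
loops = [Loop("air-zone", 30.0), Loop("liquid-zone", 45.0)]

# Hypothetical equipment and the warmest supply temperature each tolerates.
equipment = {"gpu-rack": 45.0, "storage-rack": 30.0, "network-rack": 30.0}

assignment = {name: assign_loop(t, loops).name for name, t in equipment.items()}
print(assignment)
```

Choosing the warmest eligible loop is what avoids overcooling: the GPU rack lands on the warm loop rather than sharing the colder one with general-purpose hardware.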

A key strength of zonal cooling is its flexibility. As new generations of IT hardware emerge with varying thermal profiles, zonal cooling allows datacenters to adapt without major infrastructure overhauls. For example, future AI accelerators may need different liquid temperature ranges (see 30℃ Coolant - A Durable Roadmap for the Future), or technological improvements such as microfluidics may enable operation at even higher coolant temperatures, while general-purpose equipment requirements remain unchanged. The zonal architecture supports these changes by enabling operators to adjust loop temperatures and reconfigure cooling assignments as needed.

Forms of Zonal Cooling

Liquid cooling expands the allowable coolant supply temperature range and enables temperature-specific zones. This zonal approach can be applied at multiple layers:

  • Facility-level: Two distinct temperature zones within a datacenter—one for air-cooled equipment and another for liquid-cooled equipment.
  • Row-level: Tailor coolant temperature for each row based on deployed hardware (e.g., general-purpose vs GPU servers).
  • Rack-level: Enable multiple temperature zones within a single rack for fine-grained optimization across servers.
  • Chip-level: Apply zonal cooling inside the server. For example, use colder coolant for a GPU’s high-bandwidth memory (HBM) while supplying warmer coolant for the SoC and CPUs. This fine-grained approach can enable higher HBM stacking for improved performance, while avoiding unnecessary cooling overhead.

Microsoft is building facility-level zonal cooling in the next generation of its AI datacenters going live in 2028 and beyond, while exploring the other three approaches in the lab. Facility-level zonal cooling is expected to reduce PUEs by up to 10%.

Benefits from Zonal Cooling

Zonal cooling is a strategic enabler for performance and efficiency. It can deliver:

  1. Improved energy efficiency and sustainability: By reducing the load on datacenter cooling infrastructure, zonal cooling improves energy efficiency as captured by annualized PUE, which measures average efficiency across all operating conditions. Lower annualized PUE means energy savings and lower carbon emissions.
  2. Increased server density: Tailored zonal cooling reduces peak cooling power demand during the hottest days, which in turn lowers peak PUE. Designers can leverage this reduction to reserve power for lower water temperatures (anticipating future accelerator needs), add more servers within the same utility power envelope, or contract less utility power per datacenter.
  3. Higher performance: Strategic control of coolant temperatures unlocks higher chip performance without sacrificing efficiency. For example, colder loops allow GPUs and CPUs to sustain elevated clock speeds via safe overclocking, while optimized memory cooling supports greater stacking density and increased bandwidth.
  4. Improved flexibility: With independent zones, operators can easily adjust coolant supply temperatures or reconfigure zones as new generations of hardware with varied cooling requirements emerge. This flexibility ensures compatibility with future innovations while maintaining optimal performance.
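The server-density benefit in item 2 follows from simple arithmetic: for a fixed utility power envelope, the IT power available is the envelope divided by peak PUE. The sketch below works through one example; the helper names and all numbers (envelope, PUE values, per-server power) are hypothetical and chosen only to show the mechanics.

```python
def max_it_power_kw(utility_power_kw, peak_pue):
    """IT power that fits in the utility envelope at a given peak PUE."""
    return utility_power_kw / peak_pue


def extra_servers(utility_power_kw, old_peak_pue, new_peak_pue, server_power_kw):
    """Whole servers that fit in the IT power freed by a lower peak PUE."""
    gained_kw = (max_it_power_kw(utility_power_kw, new_peak_pue)
                 - max_it_power_kw(utility_power_kw, old_peak_pue))
    return int(gained_kw // server_power_kw)


# Hypothetical 100 MW envelope, peak PUE improving from 1.30 to 1.20,
# and 10 kW per server.
added = extra_servers(
    utility_power_kw=100_000,
    old_peak_pue=1.30,
    new_peak_pue=1.20,
    server_power_kw=10.0,
)
print(f"additional servers in the same envelope: {added}")
```

The same freed power could instead be held in reserve for colder water in the future, or left uncontracted, which are the other two options the list mentions.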

Looking Ahead

Zonal cooling represents a paradigm shift in datacenter thermal management. Its flexible, zone-specific approach to cooling air- and liquid-cooled IT equipment positions datacenters to adapt efficiently to future hardware innovations and workload diversity. As the industry continues to push boundaries in performance and sustainability, zonal cooling will be a foundational strategy for building performant and efficient infrastructure that meets tomorrow’s challenges.

Updated Apr 28, 2026
Version 1.0