By Microsoft Azure and Intel
In a competitive landscape, Microsoft Azure, like other major cloud service providers, must continuously balance two competing objectives: maximizing performance and improving power efficiency. By using power more effectively, Azure can deploy more servers within its existing datacenter footprint to quickly meet growing customer compute demands and improve sustainability.
While power management encompasses a broad range of technologies, this article focuses on uncore power management, which targets components outside the CPU cores but within the processor package. The uncore domain includes the mesh interconnect, memory controllers, and I/O subsystem.
Figure 1: An illustration of a diurnal workload’s resource utilization
The Need for Uncore Power Management
Cloud servers often operate under low load due to diurnal resource utilization patterns (e.g., user-facing workloads such as Microsoft Teams), which exhibit reduced demand during weeknights and weekends, as shown in Figure 1. In addition, customers often provision VMs for peak demand, causing servers to run under reduced load during off-peak periods.
Even under reduced load, server CPUs continue to consume significant power. Although idle cores can enter deep low-power states (e.g., core C6), the uncore typically remains active and operates at its highest frequency, as the presence of even a single active core prevents it from entering an idle state. The few active cores may be running background Azure server agents for monitoring and maintenance, which generally have relaxed performance requirements. Moreover, workloads operating under reduced load can often tolerate slightly higher latency without degrading tail performance. Together, these characteristics make it feasible to leverage active low-power techniques, such as reducing uncore frequency, to improve power efficiency.
While modern CPUs support dynamic uncore frequency scaling, software-only approaches to reducing uncore frequency under low load are limited in effectiveness, as they struggle to respond quickly to sudden bursts of workload activity.
Hardware/Software Co-design for Improving CPU Power Efficiency
Intel and Microsoft Azure co-designed Efficiency Latency Control (ELC), a mechanism for managing uncore frequency that is now available on Intel Xeon 6 (Granite Rapids) processors. The implementation allows software to define CPU utilization thresholds and their corresponding uncore frequency targets, which are communicated to the CPU firmware for enforcement. This division of responsibility enables software to tailor power–performance behavior to workload characteristics, while the hardware ensures fast and reliable execution of the frequency control logic.
ELC mode allows software to specify three uncore frequency points (Low, Mid, and High) along with two CPU utilization thresholds (Low and High). When utilization is at or below the Low threshold, firmware sets the uncore frequency to the defined minimum (the Low point), thereby maximizing power savings. When utilization rises above the Low threshold but does not exceed the High threshold, the frequency is raised to the Mid point, balancing performance and power efficiency. Finally, when utilization exceeds the High threshold, the uncore frequency is increased up to the defined maximum (the High point), subject to package power constraints, to meet performance demands under heavy load.
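The threshold logic described above can be sketched as a small model. This is an illustration only, not Intel's firmware or any real driver interface: the names `ElcConfig` and `target_uncore_freq_mhz`, the frequency and threshold values, and the treatment of the package power constraint as a simple frequency ceiling (`power_cap_mhz`) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ElcConfig:
    """Hypothetical container for the five ELC parameters named in the text."""
    low_freq_mhz: int      # uncore frequency at or below the Low threshold
    mid_freq_mhz: int      # uncore frequency between the two thresholds
    high_freq_mhz: int     # upper bound above the High threshold
    low_util_pct: float    # Low CPU utilization threshold (percent)
    high_util_pct: float   # High CPU utilization threshold (percent)

    def __post_init__(self) -> None:
        if not (self.low_freq_mhz <= self.mid_freq_mhz <= self.high_freq_mhz):
            raise ValueError("frequency points must satisfy Low <= Mid <= High")
        if not (0 <= self.low_util_pct <= self.high_util_pct <= 100):
            raise ValueError("thresholds must satisfy 0 <= Low <= High <= 100")


def target_uncore_freq_mhz(cfg: ElcConfig, util_pct: float,
                           power_cap_mhz: Optional[int] = None) -> int:
    """Return the uncore frequency target for a given CPU utilization.

    power_cap_mhz is an assumed stand-in for package power constraints:
    when given, it caps the frequency chosen above the High threshold.
    """
    if util_pct <= cfg.low_util_pct:
        return cfg.low_freq_mhz
    if util_pct <= cfg.high_util_pct:
        return cfg.mid_freq_mhz
    if power_cap_mhz is not None:
        return min(cfg.high_freq_mhz, power_cap_mhz)
    return cfg.high_freq_mhz


# Illustrative values only; real frequency points depend on the processor SKU.
example = ElcConfig(low_freq_mhz=800, mid_freq_mhz=1400, high_freq_mhz=2200,
                    low_util_pct=10, high_util_pct=60)
for util in (5, 30, 90):
    print(util, target_uncore_freq_mhz(example, util))
```

In the actual mechanism this decision runs in CPU firmware; software only supplies the parameters, which is what lets the hardware react quickly to bursts.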
Figure 2 illustrates several ELC configuration strategies, each representing a different tradeoff between latency and power efficiency. Config #1 prioritizes latency by maintaining a consistently high uncore frequency across all CPU utilization levels, mirroring the default high-performance mode. This delivers optimal responsiveness but incurs higher power consumption, particularly under low-load conditions. Config #2 lowers the uncore frequency under very low utilization, improving power efficiency when background tasks (e.g., agents) are active and VMs are largely idle. Finally, Config #3 offers a balanced approach, allowing moderate frequency scaling at low load to conserve power while maintaining acceptable performance. This configuration is appropriate when a slight tradeoff in responsiveness is tolerable in exchange for improved power efficiency. The Perf/Watt Optimized curve represents the ideal dynamic scaling behavior, adjusting uncore frequency to maximize performance per watt across varying workload intensities.
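As a rough companion to the configuration strategies described above, the sketch below tabulates a step curve of uncore frequency against CPU utilization for three hypothetical parameter sets. The numbers are invented for illustration and are not the values plotted in Figure 2; the names `CONFIGS` and `step_curve` are likewise assumptions.

```python
# Hypothetical parameter sets, loosely modeled on the three strategies in the
# text. Each tuple is (low_util_pct, high_util_pct, low_mhz, mid_mhz, high_mhz).
CONFIGS = {
    "config1_latency": (10, 60, 2200, 2200, 2200),    # always high frequency
    "config2_idle_only": (5, 5, 800, 2200, 2200),     # drop only when nearly idle
    "config3_balanced": (10, 60, 800, 1400, 2200),    # scale in two steps
}


def step_curve(cfg, util_points):
    """Return (utilization, frequency) pairs for one configuration."""
    low_util, high_util, low_mhz, mid_mhz, high_mhz = cfg
    curve = []
    for util in util_points:
        if util <= low_util:
            freq = low_mhz
        elif util <= high_util:
            freq = mid_mhz
        else:
            freq = high_mhz
        curve.append((util, freq))
    return curve


for name, cfg in CONFIGS.items():
    print(name, step_curve(cfg, [0, 5, 25, 50, 75, 100]))
```

Setting all three frequency points equal, as in the first entry, reproduces a flat latency-optimized curve, while collapsing the two thresholds, as in the second, yields a single step.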
Real-World Impact
ELC mode provides compelling benefits:
Figure 3: Power and performance impact for SPEC CPU Integer benchmark suite
- ELC reduces power consumption by up to 11% under iso-performance for moderate loads. Figure 3 shows the performance and power impact of ELC Config #1 (latency-optimized) and Config #3 on the SPEC CPU Integer benchmark suite under moderate load, where only a subset of CPU cores are active while the rest remain idle. As the figure illustrates, Config #3 achieves comparable performance to Config #1 while reducing power consumption by up to 11% (9% on average). At higher loads (not shown), the power savings of Config #3 diminish, as the uncore must operate at higher frequencies to match the performance of Config #1.
Figure 4: Performance/watt improvement under lightly loaded storage operations
- ELC provides up to 1.5× improvement in performance per watt under very low load. Figure 4 compares the performance-per-watt of ELC Config #1 and Config #3 under very low storage loads. Config #1 maintains a consistently high uncore frequency, which limits efficiency. In contrast, Config #3 can lower the uncore frequency to the Low setting under light load, slightly reducing absolute performance but achieving substantially higher performance per watt.
These results demonstrate that ELC’s configurability can deliver performance comparable to latency-optimized mode with significantly higher power efficiency, enabling Azure to increase server deployments within its existing datacenter power footprint to quickly meet customer compute demands while also improving sustainability.
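To relate the two metrics reported above, a power reduction at equal performance can be converted into a performance-per-watt ratio. The sketch below does only that arithmetic with the 11% and 9% figures from the text; the helper name `perf_per_watt_gain` is an assumption, and treating "comparable performance" as exactly equal performance is a simplification.

```python
def perf_per_watt_gain(perf_ratio: float, power_ratio: float) -> float:
    """Performance-per-watt of a configuration relative to a baseline.

    perf_ratio:  performance relative to the baseline (1.0 = iso-performance)
    power_ratio: power relative to the baseline (0.89 = 11% less power)
    """
    if perf_ratio <= 0 or power_ratio <= 0:
        raise ValueError("ratios must be positive")
    return perf_ratio / power_ratio


# Iso-performance with the reported power reductions (11% peak, 9% average).
peak_gain = perf_per_watt_gain(1.0, 1.0 - 0.11)
avg_gain = perf_per_watt_gain(1.0, 1.0 - 0.09)
print(f"peak: {peak_gain:.3f}x, average: {avg_gain:.3f}x")  # about 1.124x and 1.099x
```

Under this simplification, the moderate-load savings correspond to roughly a 10% to 12% gain in performance per watt, smaller than the up to 1.5× gain reported for very low storage loads, where the uncore can drop to the Low setting.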
Looking Forward
As cloud workloads continue to evolve, the importance of hardware–software co-design in enabling adaptive infrastructure will increase. The integration of hardware and software controls for CPU uncore frequency management marks a significant step towards improving server power and energy efficiency. Looking ahead, further collaboration between Microsoft Azure and hardware vendors will unlock new opportunities for efficiency, sustainability, and cost effectiveness.
Appendix
ELC mode details: Intel® Xeon® 6 Processors - Performance and Power Profiles - Default, Latency-Optimized Mode, and Other Options Technical Article