Every Copilot prompt or Microsoft Teams call runs on infrastructure most users never see. Behind these experiences, spanning general-purpose compute and AI, is a purpose-built, end-to-end hardware and software system engineered for modern AI workloads. Microsoft’s investment in custom silicon and systems is designed to enable infrastructure optimized for performance, power efficiency, and cost.
In Silicon to Systems, Microsoft leaders and engineers walk through how custom silicon, servers, accelerators, and datacenters are designed as a single integrated stack to support AI at global scale.
Silicon as the Foundation
At the core of our approach is custom silicon spanning CPUs, AI accelerators, network accelerators, and security modules. In Silicon to Systems, we share more about the recently announced Cobalt 200, a custom system-on-chip (SoC) deployed in Azure servers and a key component used across Microsoft products and services. Each Cobalt 200 chip contains 132 cores, backed by hardware and software designed for secure compute sharing across multiple workloads. The server itself supports two independent system deployments per blade, improving power efficiency and compute density for large-scale cloud workloads.
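To make the idea of secure compute sharing concrete, here is a minimal sketch of core partitioning using the standard Linux cgroup v2 cpuset interface: two workloads are confined to disjoint halves of a 132-core part. This is a generic OS-level analogy for the kind of isolation described above, not Cobalt 200's actual mechanism; the cgroup paths and partition names are assumptions.

```python
# Illustrative sketch: partitioning a many-core CPU between isolated
# workloads with Linux cgroup v2 cpusets. Generic OS-level analogy only;
# assumes cgroup v2 is mounted at /sys/fs/cgroup and the cpuset
# controller is enabled in the parent's cgroup.subtree_control.
import os

CGROUP_ROOT = "/sys/fs/cgroup"   # cgroup v2 mount point (assumed)
TOTAL_CORES = 132                # Cobalt 200 core count per chip

def create_partition(name: str, cpus: str) -> str:
    """Create a cgroup restricted to the given CPU list, e.g. '0-65'."""
    path = os.path.join(CGROUP_ROOT, name)
    os.makedirs(path, exist_ok=True)
    with open(os.path.join(path, "cpuset.cpus"), "w") as f:
        f.write(cpus)
    return path

def assign(path: str, pid: int) -> None:
    """Move a process into the partition; it may then run only on those cores."""
    with open(os.path.join(path, "cgroup.procs"), "w") as f:
        f.write(str(pid))

if __name__ == "__main__":
    half = TOTAL_CORES // 2
    tenant_a = create_partition("tenant_a", f"0-{half - 1}")              # cores 0-65
    tenant_b = create_partition("tenant_b", f"{half}-{TOTAL_CORES - 1}")  # cores 66-131
    assign(tenant_a, os.getpid())  # example: confine this process to tenant A's cores
```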
Purpose‑Built AI Acceleration
AI workloads introduce significantly higher power and thermal demands, driving the need for specialized accelerator systems. Microsoft’s Maia AI Accelerator platform addresses these demands at the module, server, and rack levels, integrating closed-loop liquid cooling. Coolant flows directly across the surface of the chip and is continuously recirculated, supporting higher power delivery with no additional water consumption under designed operating conditions. This cooling design allows AI workloads to run at scale while maintaining thermal and resource efficiency.
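The "closed loop" here refers to feedback control: measured chip temperature drives the coolant flow rate, and the same coolant recirculates indefinitely. Below is a minimal sketch of that control idea, with an entirely assumed setpoint, gains, and toy thermal model; none of these values are Maia's real parameters.

```python
# Minimal sketch of closed-loop cooling control: a proportional-integral
# (PI) controller adjusts coolant flow to hold die temperature near a
# setpoint. All numbers are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class CoolantLoop:
    setpoint_c: float = 70.0   # target die temperature in Celsius (assumed)
    kp: float = 0.8            # proportional gain (assumed)
    ki: float = 0.05           # integral gain (assumed)
    integral: float = 0.0      # accumulated error

    def flow_rate(self, temp_c: float, dt_s: float) -> float:
        """Return coolant flow (L/min) for the measured die temperature."""
        error = temp_c - self.setpoint_c
        self.integral += error * dt_s
        # Hotter die -> more flow; clamp to assumed pump limits.
        return max(1.0, min(40.0, 10.0 + self.kp * error + self.ki * self.integral))

# Toy simulation: the die heats under load and sheds heat with flow.
loop, temp = CoolantLoop(), 65.0
for step in range(5):
    flow = loop.flow_rate(temp, dt_s=1.0)
    temp += 2.0 - 0.05 * flow   # crude thermal model: heat in minus heat out
    print(f"t={step}s temp={temp:.1f}C flow={flow:.1f}L/min")
```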
From Architecture to Deployment
Building custom silicon is a multi-year engineering process that starts with identifying workload requirements and defining both the silicon and the overall system architecture. Before first silicon is ever manufactured, Microsoft brings up the full hardware platform and software stack in parallel, using pre-silicon models and emulation to validate the system architecture, enable early software development, and work toward having the platform ready when the silicon arrives. In-house silicon designs are sent to foundry partners for wafer and interconnect fabrication and packaging. Assembled chips undergo testing, post-silicon validation, and hardware-software bring-up in Microsoft labs while, in parallel, racks are built for datacenter deployment. Each chip contains billions of transistors and interconnections, validated extensively to meet power, performance, and reliability targets.
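One common way to realize this pre-silicon pattern is to develop driver code against a functional model of the chip's register interface, then retarget the unchanged code to first silicon. The sketch below shows the idea; the register map, names, and bring-up protocol are hypothetical, not Cobalt or Maia specifics.

```python
# Sketch of pre-silicon software bring-up: driver code runs against a
# functional model of the chip's registers long before tape-out, then
# runs unchanged on real silicon. Register offsets and bits are
# hypothetical illustrations.
from typing import Protocol

class RegisterBus(Protocol):
    def read32(self, addr: int) -> int: ...
    def write32(self, addr: int, value: int) -> None: ...

CTRL_REG, STATUS_REG = 0x0000, 0x0004   # hypothetical register offsets
CTRL_ENABLE, STATUS_READY = 0x1, 0x1    # hypothetical bit fields

class EmulatedChip:
    """Pre-silicon functional model standing in for the real device."""
    def __init__(self) -> None:
        self.regs = {CTRL_REG: 0, STATUS_REG: 0}

    def read32(self, addr: int) -> int:
        return self.regs[addr]

    def write32(self, addr: int, value: int) -> None:
        self.regs[addr] = value
        if addr == CTRL_REG and value & CTRL_ENABLE:
            self.regs[STATUS_REG] |= STATUS_READY  # model powers up instantly

def bring_up(bus: RegisterBus) -> bool:
    """Driver logic identical for the model and for first silicon."""
    bus.write32(CTRL_REG, CTRL_ENABLE)
    return bool(bus.read32(STATUS_REG) & STATUS_READY)

assert bring_up(EmulatedChip())  # validated long before the chip exists
```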
Silicon, servers, and our datacenters are designed together as an integrated whole. Rack layouts, power density limits, and cooling capacity are co-engineered with our hardware designs. Our focus on power efficiency spans from the datacenter all the way to the lowest levels of the silicon: each Cobalt 200 core can operate at its own performance point, allowing fine-grained control of power consumption, while closed-loop cooling systems manage thermal loads efficiently at the facility level.
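As an illustration of per-core performance points, the sketch below uses the standard Linux cpufreq sysfs interface to cap different cores at different frequencies. It shows the general mechanism, not Cobalt 200's firmware interface, and the frequency values and core groupings are assumptions.

```python
# Illustrative sketch: per-core performance control via the Linux
# cpufreq sysfs interface. Generic mechanism only; frequency ceilings
# and the fast/slow core split below are assumed, and the paths
# presume a standard Linux host with cpufreq enabled.
from pathlib import Path

def set_max_freq_khz(core: int, khz: int) -> None:
    """Cap one core's frequency, trading peak performance for power."""
    path = Path(f"/sys/devices/system/cpu/cpu{core}/cpufreq/scaling_max_freq")
    path.write_text(str(khz))

def get_cur_freq_khz(core: int) -> int:
    """Read a core's current operating frequency in kHz."""
    path = Path(f"/sys/devices/system/cpu/cpu{core}/cpufreq/scaling_cur_freq")
    return int(path.read_text())

# Example policy: latency-sensitive cores run fast, batch cores run slower.
for core in range(0, 4):
    set_max_freq_khz(core, 3_400_000)  # 3.4 GHz ceiling (assumed)
for core in range(4, 8):
    set_max_freq_khz(core, 2_000_000)  # 2.0 GHz ceiling to save power (assumed)
```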
An Integrated Stack for AI Workloads
Rather than optimizing individual components in isolation, Microsoft engineers silicon, servers, accelerators, networking, cooling, and datacenters as a single system. Together, custom silicon such as Cobalt 200 and Maia 200 forms the infrastructure layer that helps enable modern AI experiences, including Copilot.
Watch Silicon to Systems to explore how Microsoft builds AI infrastructure, from custom silicon to global datacenters: aka.ms/silicontosystems.