Overview
Over the years, a growing number of Azure Functions customers have asked for improved performance, virtual network injection for a Consumption SKU, scale to zero, and burst scale to thousands of instances. Similarly, in late 2021, the newly launched Azure Container Apps team was looking for an architecture with improved networking capabilities and performance in anticipation of future customer demand. To meet these requirements, the teams realized they needed to build a new architecture. Thus, the idea of Legion was born: an Azure Linux-based solution that uses Azure Virtual Machine Scale Sets (VMSS) and containerd as the OCI runtime.
Legion has since become the backbone for a number of Azure's core PaaS offerings. It serves as the infrastructure for the Azure Container Apps Consumption workload profile, which was announced in May 2023 and was the first service to reach general availability (GA) on the Legion architecture. Azure Functions Flex Consumption, announced at this year's Build conference, will also run on Legion. Finally, Legion has enabled the creation of a new PaaS service, Azure Container Apps dynamic sessions (sessions), which allows customers to run untrusted code.
Legion design principles
- Must be secured with a Hyper-V isolation boundary
- Must burst scale to thousands of instances with VNET injection
- Must use the latest Azure security and reliability standards, with Availability Zones support
- Must maintain close feature parity with the Kubernetes Pod API spec
- Must operate at a growing scale
- Must automate and optimize operational workstreams from day 0
How Legion Works
Legion’s infrastructure manages a regional pool of virtual machines hosted in VMSS, which can scale organically on demand. Each regional instance is composed of multiple sub-units we call stamps, each with its own data store and compute instances. Legion uses well-established Azure services such as:
- Cosmos DB for all configuration store needs
- App Service Environment v3 for all API services
- Nested Virtualization enabled VMSS for all hosting needs
Legion exposes several APIs, which allow consumers to build Azure services on it. Some of the APIs worth noting:
- Create, Read, Update, Delete (CRUD) operations on the Kubernetes Pod API object
- CRUD operations on Pool Groups
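To make the Pod-shaped API surface more concrete, here is a minimal sketch of the kind of object such a CRUD call would carry. It builds a standard Kubernetes v1 Pod manifest as a plain dictionary. Legion's actual endpoints, authentication, and the subset of Pod fields it accepts are not described in this post, so the helper name `build_pod_manifest`, the image reference, and the resource values are illustrative assumptions only.

```python
import json


def build_pod_manifest(name, image, cpu="500m", memory="1Gi"):
    """Return a minimal Kubernetes v1 Pod manifest as a plain dict.

    Hypothetical helper for illustration; it only mirrors the public
    Kubernetes Pod schema and does not call any Legion API.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "resources": {
                        # Requests and limits are set equal here for simplicity.
                        "requests": {"cpu": cpu, "memory": memory},
                        "limits": {"cpu": cpu, "memory": memory},
                    },
                }
            ],
            "restartPolicy": "Always",
        },
    }


# Placeholder image reference, not a real registry path.
manifest = build_pod_manifest("hello", "example.azurecr.io/hello:1.0")
print(json.dumps(manifest, indent=2))
```

A consumer service would presumably serialize an object like this and send it to a create or update operation; the transport and any Legion-specific extensions are outside what this sketch covers.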
Other features of Legion include:
- Startup boost for Functions, which gives a pod more resources than it requested during startup to improve performance
- Overprovisioning of CPU resources, which makes Legion very cost-effective for its consumers and enables support for Consumption tier SKUs
- Automated management of infrastructure security updates, which drastically reduces the cost of operations for consumers of Legion
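The first two features above can be illustrated with a small arithmetic sketch. The post does not state Legion's boost factor, boost duration, or overprovisioning ratio, so every number and name below (`BoostPolicy`, `cpu_limit`, `overcommit_ratio`) is a hypothetical chosen only to show the idea: a pod temporarily gets more CPU than it requested, and the sum of CPU requests on a host can exceed its physical cores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoostPolicy:
    """Hypothetical startup boost settings (not Legion's real values)."""
    factor: float = 2.0           # multiplier applied to the requested CPU
    window_seconds: float = 30.0  # how long the boost lasts after start


def cpu_limit(requested_cores, seconds_since_start, policy=BoostPolicy()):
    """CPU cores a pod may use at a given time since it started."""
    if seconds_since_start < policy.window_seconds:
        return requested_cores * policy.factor
    return requested_cores


def overcommit_ratio(pod_requests, host_cores):
    """Total requested cores divided by physical cores on the host."""
    return sum(pod_requests) / host_cores


print(cpu_limit(0.5, seconds_since_start=5))    # during the boost window
print(cpu_limit(0.5, seconds_since_start=60))   # after the boost window
print(overcommit_ratio([0.5] * 48, host_cores=16))
```

A ratio above 1.0 means the host is overprovisioned. That works well for Consumption-style workloads only if most pods are idle most of the time, which is the assumption this sketch relies on.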
Legion in Action
Azure Container Apps
Azure Container Apps was one of the first services we were able to light up with Legion, which powers its Consumption workload profiles. Legion's networking architecture was specifically designed for Azure Container Apps customers, who had significant demand for User Defined Routes and for optimized IP address allocation across replicas to reduce subnet size restrictions. Legion's design allowed Azure Container Apps to support these core enterprise networking requirements, thereby unblocking our customers.
Azure Functions
The Azure Functions and Legion teams co-designed several features to enhance Legion’s core value. One key item worth detailing is the work on Pool Groups. Functions has a stringent cold start goal for every language. To meet this cold start metric across all languages and versions, while also supporting Functions image updates for all of these variants, we needed a construct that lets Functions declare all the parameters of a pool, as well as its networking and upgrade policies.
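As a rough mental model of that construct, the sketch below groups per-language, per-version pools under one declaration that also carries networking and upgrade settings. The real Pool Group schema is not shown in this post, so every class and field name here (`PoolSpec`, `PoolGroup`, `subnet_id`, `max_unavailable_during_upgrade`, and so on) is a hypothetical stand-in.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PoolSpec:
    """One pre-warmed pool for a language/version pair (hypothetical)."""
    language: str
    version: str
    image: str
    warm_instances: int


@dataclass
class PoolGroup:
    """Hypothetical grouping of pools with shared network and upgrade policy."""
    name: str
    subnet_id: str
    max_unavailable_during_upgrade: int
    pools: List[PoolSpec] = field(default_factory=list)

    def find_pool(self, language: str, version: str) -> Optional[PoolSpec]:
        """Return the pool matching the language and version, if declared."""
        for pool in self.pools:
            if pool.language == language and pool.version == version:
                return pool
        return None


# Placeholder values throughout; none come from the Legion documentation.
group = PoolGroup(
    name="functions-pools",
    subnet_id="example-subnet",
    max_unavailable_during_upgrade=1,
    pools=[
        PoolSpec("python", "3.11", "example.azurecr.io/functions-python:3.11", 10),
        PoolSpec("node", "20", "example.azurecr.io/functions-node:20", 10),
    ],
)
print(group.find_pool("python", "3.11"))
```

Declaring pools this way keeps a warm instance available for each language and version variant, which is what makes a consistent cold start target plausible across all of them.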
Azure Container Apps Dynamic Sessions
Legion infrastructure and the subsequent innovations on Pool Groups unblocked yet another nascent scenario: sessions. Sessions allows untrusted code to run in a secure sandbox and has enabled key scenarios like the Code Interpreter feature. This feature powers Microsoft Copilot and is integrated with popular frameworks like LangChain.
Future of Legion
Microsoft is continually investing in the Legion architecture to improve our customers’ experiences and unlock new scenarios. Upcoming investments include workstreams to improve cold-start performance for Azure Container Apps through faster image pulls, image caching, and startup boost.
Conclusion
Legion has been instrumental in the development of Azure Container Apps, Azure Functions Flex Consumption, and sessions. We are excited to have more customer workloads running on Legion, helping us push its limits as we continue to improve the architecture.