We continually work to improve performance and mitigate Azure Functions cold starts - the extra time it takes for a function that hasn’t been used recently to respond to an event. We understand that no matter when your functions were last called, you want fast executions and little lag time.
In measuring Azure Functions performance, we prioritize the cold start of synchronous HTTP triggers in the Consumption and Flex Consumption hosting plans. That means looking at what our platform and Azure Functions host need to do to execute the first HTTP trigger function on a new instance. Then we improve it. We are also working to improve cold start for asynchronous scenarios.
To assess our progress, we run sample HTTP trigger function apps that measure cold start latencies for all supported versions of Azure Functions, in all languages, for both Windows and Linux Consumption. These sample apps are deployed in all Azure regions and subregions where Azure Functions runs. Our test function calls these sample apps every few hours to trigger a true cold start and currently generates nearly 85,000 daily cold start samples. Through this testing infrastructure, we observed over the past 18 months a reduction in cold start latency of approximately 53 percent across all regions and for all supported languages and platforms.
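The core of such a probe is simply timing an invocation end to end after the app has been idle long enough for the platform to deallocate its instance. Here is a minimal sketch of that idea; the helper name and the use of a local callable in place of a real HTTPS request to a sample app are illustrative assumptions, not the actual test infrastructure:

```python
import time
from typing import Callable


def measure_latency_ms(invoke: Callable[[], None]) -> float:
    """Time one invocation in milliseconds.

    In a real cold start probe, `invoke` would issue an HTTPS request to a
    deployed HTTP trigger function app; after a sufficiently long idle
    period, the first request observes a true cold start.
    """
    start = time.perf_counter()
    invoke()
    return (time.perf_counter() - start) * 1000.0


# Stand-in for a request to a function app; sleeps to simulate latency.
latency = measure_latency_ms(lambda: time.sleep(0.01))
```

Repeating this measurement on a cadence longer than the platform's idle timeout is what yields a stream of genuine cold start samples rather than warm invocations.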
If any of the tracked metrics start to regress, we’re immediately notified and start investigating. Daily emails, alerts, and historical dashboards tell us the end-to-end cold start latencies across various percentiles. We also perform specific analyses and trigger alerts if our fiftieth percentile, ninety-ninth percentile, or maximum latency numbers regress.
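The percentile checks described above can be illustrated with a short sketch. The nearest-rank percentile method and the 10 percent regression budget are illustrative assumptions; the actual alerting pipeline is not public:

```python
import math


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile (0 < p <= 100) of a list of latency samples."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]


def regressions(samples: list[float],
                baselines: dict[str, float],
                tolerance: float = 1.10) -> list[str]:
    """Compare p50, p99, and max latency against baselines.

    Returns the names of metrics that exceed baseline * tolerance
    (here, a 10 percent regression budget) and should raise an alert.
    """
    observed = {
        "p50": percentile(samples, 50),
        "p99": percentile(samples, 99),
        "max": max(samples),
    }
    return [name for name, value in observed.items()
            if value > baselines[name] * tolerance]
```

For example, with latency samples of 1 through 100 ms and a p50 baseline of 45 ms, an observed p50 of 50 ms exceeds the 10 percent budget and would be flagged, while p99 and max within budget would not.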
In addition, we collect detailed PerfView profiles of the sample apps deployed in select regions. The breakdown includes full call stacks (user mode and kernel mode) for every millisecond spent during cold start. The profiles reveal CPU usage and call stacks, context switches, disk reads, HTTP calls, memory hard faults, common language runtime (CLR) just-in-time (JIT) compiler, garbage collector (GC), type loads, and many more details about .NET internals. We report all these details in our logging pipelines and receive alerts if metrics regress. And we’re always looking for ways to make improvements based on these profiles.
Since launching Azure Functions, we've made performance improvements across the Azure platform it runs on to achieve the observed reduction in cold starts. These enhancements extend to the platform shared with Azure App Service, the new Legion platform, the operating system, storage, .NET Core, and communication channels.
We aim to optimize for the ninety-ninth–percentile latency. We delve into cold start scenarios at the millisecond level and continually fine-tune the algorithms that allocate capacity. In short, we're always working to improve Azure Functions cold start. The following areas are our current focus:
Here are a few strategies you can follow to further improve cold starts for your apps:
If your Azure Functions app still doesn’t perform as well as you’d like, consider the following:
Note: This article is a modified version of the article originally published on The New Stack.