Apps on Azure Blog

Announcing General Availability of Azure App Service Automatic Scaling

gauravseth
Microsoft
Mar 25, 2024

Azure App Service is pleased to announce the general availability of the "Automatic Scaling" feature. We received important feedback during the preview phase and have made the following enhancements to the feature.

 

  • Automatic scaling is available for Premium V2 (P1V2, P2V2, P3V2) and Premium V3 (P0V3, P1V3, P2V3, P3V3, P1MV3, P2MV3, P3MV3, P4MV3, P5MV3) pricing tiers, and supported for all app types: Windows, Linux, and Windows container. 
  • A new metric, "Automatic Scaling Instance Count", is now available for web apps that have Automatic Scaling enabled. It reports the number of virtual machines on which the app is running, including the pre-warmed instance if one is deployed.

In addition to these enhancements, please remember that the Automatic Scaling feature continues to support the following key capabilities:

 

  • The App Service platform will automatically scale out the number of running instances of your application to keep up with the flow of incoming HTTP requests, and automatically scale in your application by reducing the number of running instances when incoming request traffic slows down.
  • Developers can define scaling per web app and control the minimum number of running instances for each web app.
  • Developers can control the maximum number of instances that an underlying app service plan can scale out to. This ensures that connected resources like databases do not become a bottleneck once automatic scaling is triggered.
  • Enable or disable automatic scaling for existing app service plans, as well as apps within these plans.
  • Address cold start issues for your web apps with pre-warmed instances. These instances act as a buffer when scaling out your web apps.
  • Automatic scaling is billed on a per-second basis and uses the existing Pv2 and Pv3 billing meters.
  • Pre-warmed instances are also charged on a per-second basis using the existing Pv2 and Pv3 billing meters once they are allocated for use by your web app.
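
As a rough illustration of what per-second billing means in practice, the sketch below adds up the seconds each instance was allocated and prices them at an hourly meter rate. The rate and the usage figures are hypothetical placeholders, not Azure prices or measured data, and the helper names are made up for this example.

```python
from dataclasses import dataclass

SECONDS_PER_HOUR = 3600


@dataclass(frozen=True)
class InstanceUsage:
    """Number of seconds one instance was allocated to the web app."""
    seconds: int


def per_second_cost(usages, hourly_rate):
    """Cost when every allocated second is charged at hourly_rate / 3600."""
    total_seconds = sum(u.seconds for u in usages)
    return total_seconds * hourly_rate / SECONDS_PER_HOUR


# Hypothetical hourly meter rate; check the Azure pricing page for real values.
HYPOTHETICAL_HOURLY_RATE = 0.10

# Two instances running for a full hour plus one instance
# that was scaled out for only 15 minutes (900 seconds).
usages = [InstanceUsage(3600), InstanceUsage(3600), InstanceUsage(900)]

cost = per_second_cost(usages, HYPOTHETICAL_HOURLY_RATE)
print(f"{cost:.3f}")
```

Under these assumptions, the instance that ran for 15 minutes contributes only a quarter of an hour of charges rather than a full hour.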

 

For more information about the Automatic Scaling feature, please refer to:

 

How to enable automatic scaling - Azure App Service | Microsoft Learn

Azure App Service Automatic Scaling - Microsoft Community Hub

 

The section below showcases the "Performance Benefits" for a web app deployed to App Service with the "Automatic Scaling" feature enabled.

 

*** It is important to note that the "Performance Benefits" of enabling Automatic Scaling for a web app may vary across scenarios. They depend on multiple factors, such as the web app architecture and configuration, the App Service plan pricing tier, and the database load and pricing tier.

 

For the sample scenario, I deployed an ASP.NET Framework 4.8 web app to App Service. The web app is connected to an Azure SQL database, and the App Service plan it is deployed to uses the P0V3 pricing tier.

 

Initially, the web application is scaled manually to 2 instances of P0V3. Refer to the screenshot below:

 

Let us now configure an HTTP load test for our sample web app using Azure Load Testing with the following configuration:

 

 

*** The load test configuration remains the same for subsequent HTTP load tests.

 

Once the load test starts, you can use Live Metrics in Application Insights to view the number of instances currently available for your web application. Refer to the screenshot below:

 

 

 

The screenshot below showcases important client and server metrics from the Azure Load Testing service once the load test is completed:

 

We have now enabled the "Automatic Scaling" feature on the same web app, as shown below.

 

We have configured three settings. "Maximum Burst" (the maximum number of instances any web app within the plan can scale out to) is set to 4 for the App Service plan. "Always ready instances" (the number of always-ready instances for this specific web app) is set to 2. "Maximum scale limit" (the maximum number of instances this specific web app can scale out to, even if Maximum Burst for the App Service plan is higher) is set to 4 for this specific web app.

 

*** Please note that the "Maximum scale limit" value can be lower than the "Maximum Burst" value of the App Service plan. This is useful in scenarios where you want to avoid throttling because your backend (for example, a database) may not be able to scale as fast as your web app.
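
The relationship between these three settings can be sketched as a small model. This is only an illustration of the rule described above (the per-app limit applies even when the plan's Maximum Burst is higher); the class, its field names, and the sanity checks are assumptions made for this example and are not Azure API properties or the platform's actual validation logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoScaleSettings:
    """Illustrative model only; field names are hypothetical."""
    maximum_burst: int        # plan-level cap shared by all apps in the plan
    always_ready: int         # per-app number of always ready instances
    maximum_scale_limit: int  # per-app cap on scale-out

    def effective_max_instances(self) -> int:
        # The app can never exceed the smaller of the plan-level cap
        # and its own per-app limit.
        return min(self.maximum_burst, self.maximum_scale_limit)

    def check(self) -> None:
        # Sanity checks assumed for this sketch; the platform may differ.
        if min(self.maximum_burst, self.always_ready, self.maximum_scale_limit) < 0:
            raise ValueError("instance counts cannot be negative")
        if self.always_ready > self.effective_max_instances():
            raise ValueError("always ready instances exceed the effective maximum")


# Values used in this post: Maximum Burst 4, always ready 2, scale limit 4.
blog_settings = AutoScaleSettings(maximum_burst=4, always_ready=2, maximum_scale_limit=4)
blog_settings.check()
print(blog_settings.effective_max_instances())

# A lower per-app limit, e.g. to protect a database that scales slowly.
capped = AutoScaleSettings(maximum_burst=4, always_ready=2, maximum_scale_limit=3)
capped.check()
print(capped.effective_max_instances())
```

With the second configuration, the app would stop scaling out at 3 instances even though the plan allows 4, which is the throttling-avoidance scenario mentioned in the note.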

 

 

Let us rerun the HTTP load test for our sample web app using Azure Load Testing with the following configuration:

 

Once the load test starts, you can use Live Metrics in Application Insights to view the number of instances currently available for your web application. Refer to the screenshot below:

 

 

The screenshot below showcases important client and server metrics from the Azure Load Testing service once the load test is completed for the same web app with "Automatic Scaling" enabled:

 

 

The screenshots below showcase a side-by-side comparison of the same client and server-side metrics for the same web app when it is scaled out manually (PINK) and when it is scaled out using Automatic Scaling (BLUE):

 

 

 

 

Although the load test results above show significant performance benefits from enabling Automatic Scaling, you can further improve the performance of your web app with simple yet effective changes: enabling Redis cache to reduce database round trips, enabling a CDN for static content (for example, images and videos), moving static content to Azure Files, reducing page size, or making the required code changes.

 

The home page of the sample web app used for this blog post loads multiple images stored on the App Service storage. I simply compressed the images to reduce the overall size of the home page and ran the load test again using a similar configuration.

 

Please refer to the screenshots below that showcase side-by-side comparison of the client and server-side metrics for the same web app when it is scaled out manually (PINK) and when it is scaled out using Automatic Scaling (BLUE) after image compression:

 

You can also view the number of instances your web app scaled out to during an Automatic Scaling event. The "Automatic Scaling Instance Count" metric reports the number of virtual machines on which the app is running, including the pre-warmed instance if one is deployed. This metric can also be used to track the maximum number of instances your web app scaled out to during an Automatic Scaling event. It is available only for apps that have Automatic Scaling enabled.
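
As a minimal sketch of how the maximum could be tracked, the snippet below finds the peak in a series of (timestamp, instance count) samples. The sample values are made up for illustration and are not measurements from the load tests in this post; exporting the real metric values from Azure Monitor is not shown here, and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def peak_instance_count(samples):
    """Return the (timestamp, count) sample with the highest instance count.

    If several samples share the highest count, the earliest one is returned.
    """
    if not samples:
        raise ValueError("no metric samples provided")
    return max(samples, key=lambda sample: sample[1])


# Made-up one-minute samples of the instance count during a scaling event.
start = datetime(2024, 1, 1, 12, 0)
counts = [2, 2, 3, 4, 4, 3, 2]
samples = [(start + timedelta(minutes=i), c) for i, c in enumerate(counts)]

peak_time, peak_count = peak_instance_count(samples)
print(peak_time.isoformat(), peak_count)
```

Comparing the peak against the configured maximum scale limit is one way to see whether the app reached its cap during the event.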

 

 

 

To summarize, the client and server-side metrics showcased above indicate that enabling Automatic Scaling can provide significant performance improvements for a web app without major code changes.

 

In addition to its easy-to-use yet powerful scale-out and scale-in capabilities, the Automatic Scaling feature also opens up another critical app migration and modernization scenario. You can migrate and modernize your existing ASP.NET or Java web app to App Service using the Migration Tools and then simply enable Automatic Scaling for that web app to get seamless scale-out and scale-in, improving end-user and customer delight.

Updated Mar 20, 2024
Version 1.0
  • Jayendran
    Iron Contributor

    Thanks, gauravseth, for this cool feature! I have a query about the number of apps in a single App Service plan. I assume the data above comes from a scenario where a single app is deployed on an App Service plan, so automatic scaling works great for that particular app.

     

    But let's say I have 10 different apps sharing a single App Service plan. Would automatic scaling be the right choice then? Some apps will have different traffic compared to others, so would enabling automatic scaling still be beneficial? Is there any data or benchmark that you produced for this scenario?

  • Stolpe
    Copper Contributor

    Is there a built-in way to scale up based on rules or a schedule? Let's say you know there will be a large single operation around a certain time (e.g., a file being downloaded, decrypted, and handled by a WebJob; totally just a hypothetical scenario), and you need additional memory available to a single instance around that time. Scaling out would not be the solution in that case.