Performance and Scalability
Announcing the open Public Preview of the Premium v2 tier of Azure API Management
Today, we are excited to announce the public preview of the Azure API Management Premium v2 tier. Superior capacity, the highest entity limits, unlimited included calls, and the most comprehensive set of features set the Premium tier apart from the other API Management tiers. Customers rely on the Premium tier to run enterprise-wide API programs at scale, with high availability and performance.

The Premium v2 tier has a new architecture that eliminates management traffic from the customer VNet, making private networking much more secure and easier to set up. During the creation of a Premium v2 instance, you can choose between the VNet injection and VNet integration (introduced in the Standard v2 tier) options.

New and improved VNet injection

Using VNet injection in Premium v2 no longer requires any network security group rules, route tables, or service endpoints. Customers can secure their API workloads without impacting API Management dependencies, and Microsoft can secure the infrastructure without interfering with customer API workloads. In short, the new VNet injection implementation lets both parties manage network security and configuration independently, without affecting each other. You can now configure your APIs with complete networking flexibility: force tunnel all outbound traffic on-premises, send all outbound traffic through an NVA, or add a WAF device to monitor all inbound traffic to your API Management Premium v2 instance, all without constraints. (A minimal deployment sketch appears at the end of this announcement.)

Region availability

The public preview of the Premium v2 tier is available in six public regions (Australia East, East US 2, Germany West Central, Korea Central, Norway East, and UK South) and requires creating a new service instance. For pricing information and regional availability, please visit the API Management pricing page.

Learn more

API Management v2 tiers documentation
API Management v2 tiers FAQ
API Management overview documentation
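As an illustration of choosing the networking model at creation time, here is a minimal sketch using the Azure SDK for Python (azure-mgmt-apimanagement). This is not an official sample: the SKU name used for Premium v2 ("PremiumV2"), the networking properties, and all resource names are assumptions for illustration only; verify the exact values supported during the preview against the Premium v2 documentation.

# Minimal sketch (assumptions noted): create an API Management instance with VNet injection.
from azure.identity import DefaultAzureCredential
from azure.mgmt.apimanagement import ApiManagementClient
from azure.mgmt.apimanagement.models import (
    ApiManagementServiceResource,
    ApiManagementServiceSkuProperties,
    VirtualNetworkConfiguration,
)

credential = DefaultAzureCredential()
client = ApiManagementClient(credential, subscription_id="<subscription-id>")

service = ApiManagementServiceResource(
    location="uksouth",
    publisher_name="Contoso",
    publisher_email="apis@contoso.com",
    sku=ApiManagementServiceSkuProperties(name="PremiumV2", capacity=1),  # assumed preview SKU name
    virtual_network_type="Internal",  # VNet injection; property names may differ for v2 tiers
    virtual_network_configuration=VirtualNetworkConfiguration(
        subnet_resource_id="/subscriptions/<subscription-id>/resourceGroups/<rg>/providers/"
                           "Microsoft.Network/virtualNetworks/<vnet>/subnets/<subnet>"  # placeholder
    ),
)

poller = client.api_management_service.begin_create_or_update(
    resource_group_name="<rg>", service_name="<apim-name>", parameters=service
)
print(poller.result().provisioning_state)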
Logic App Standard - When High Memory / CPU usage strikes and what to do

Introduction

Monitoring your applications is essential: it ensures that you know what's happening and are not caught by surprise when something goes wrong. One possible event is the performance of your application starting to degrade, with processing becoming slower than usual. This can happen for various reasons, and in this blog post we will discuss high Memory and CPU usage and why it affects your Logic App. We will also look at some scenarios we've seen turn out to be the root cause for some customers.

How high Memory and high CPU affect processing

When instructions and data are loaded into memory, they occupy space that cannot be used for other work. The more memory is occupied, the harder the operating system has to work to locate the right instructions and retrieve or write the data. If the OS needs more time to find or write your instructions, it has less time to spend on the actual processing. The same applies to the CPU: when CPU load is high, everything slows down, because the available workers cannot process multiple items at the same time. In the Logic App this shows up as overall slow performance. When CPU or memory crosses a certain threshold, run durations go up and internal retries increase as well, because the runtime workers are busy and the tasks have timeout limits.

For example, consider a simple run with a Read Blob built-in connector action, where the blob is very large (say 400 MB). The flow is: Request trigger -> Read Blob -> Send email. The trigger has a very short duration and little overhead, because it does not load much data. The Read Blob action, however, reads the payload into memory, because built-in connectors load all the information into memory (see Built-in connector overview - Azure Logic Apps | Microsoft Learn).

So, not counting background processes, Kudu, and maintenance jobs, we've loaded 400 MB into memory. On a WS1 plan we have 3.5 GB available. Even a blank Logic App occupies some memory, although the amount varies; if we assume ~500 MB for the base runtime and related processes, we are left with about 3 GB. If we load four such files at the same time, we are using roughly 2.1 GB (files plus base usage), already around 60% of the total memory. And that is just one workflow and four runs. Of course the memory is released after each run completes, but on a broader scale, with multiple runs and multiple actions executing at the same time, it's easy to see how quickly the thresholds are reached (the short sketch below works through these numbers). When memory stays above ~70%, the background tasks may behave in unexpected ways, so it's essential to have a clear idea of how your Logic App works and what data you're loading into it.

The same goes for CPU: the more you load onto it, the slower it gets. You may have low memory usage, but if you're running highly complex tasks such as XML transformations or other built-in data transforms, the CPU will be heavily used, and the bigger the file and the more complex the transformation, the more CPU is consumed.
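A minimal back-of-the-envelope sketch of the memory budget described above. The plan size, base-runtime estimate, and payload sizes are the illustrative assumptions from the example, not measured values:

# Rough memory-budget estimate for a Logic App Standard plan (illustrative numbers only).
PLAN_MEMORY_GB = 3.5        # WS1 plan
BASE_RUNTIME_GB = 0.5       # assumed footprint of the runtime and host processes
PAYLOAD_GB = 0.4            # one 400 MB blob read by a built-in connector
CONCURRENT_RUNS = 4

used_gb = BASE_RUNTIME_GB + CONCURRENT_RUNS * PAYLOAD_GB
print(f"Estimated usage: {used_gb:.1f} GB of {PLAN_MEMORY_GB} GB "
      f"({used_gb / PLAN_MEMORY_GB:.0%})")
# Estimated usage: 2.1 GB of 3.5 GB (60%)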
How to check memory/CPU

Correctly monitoring your resource usage is vital and can prevent serious impact. To help your Standard logic app workflows run with high availability and performance, the Logic App product group has created the Health Check feature. This feature is still in preview, but it is already a big help in monitoring. You can read more about it in the following article, written by PG members Rohitha Hewawasam and Kent Weare: Monitoring Azure Logic Apps (Standard) Health using Metrics. The official documentation is also available: Monitor Standard workflows with Health Check - Azure Logic Apps | Microsoft Learn.

The Metrics can also give you a better view of current usage. Logic App metrics don't drill down into CPU usage, because those metrics are not available at the app level, only at the App Service Plan level, but you can see the working memory set and workflow-related metrics.

Example metric: Private Bytes (AVG) - Logic App metrics

On the App Service Plan overview, you will see charts with these metrics. It's an entry point to understand what is currently going on with your ASP and its current health status.

Example: ASP Dashboard

In the Metrics tab you can build your own charts with much greater granularity and save them as dashboards. You can also create alerts on these metrics, which greatly increases your ability to monitor and act on abnormal situations, such as high memory usage for prolonged periods of time.

Example: Memory Percentage (Max) - ASP Metrics

There are multiple solutions available to analyze your Logic App behavior and metrics, such as dashboards and Azure Monitor Logs. I highly recommend these two articles from our PG that discuss and exemplify these topics: Logic Apps Standard Monitoring Dashboards | Microsoft Community Hub and Monitoring Azure Logic Apps (Standard) with Azure Monitor Logs.

How to mitigate - a few possibilities

Platform settings on 32 bits

If your Logic App was created long ago, it may be running on an old setting. Early Logic Apps were created as 32-bit processes, which severely limits memory scalability, because that architecture only allows a maximum of about 3 GB of usage; this comes from operating system limitations and the memory allocation architecture. Later the default became 64-bit, which allows the Logic App to scale and fully use the maximum memory available in all ASP tiers (up to 14 GB in WS3). This can be checked and updated in the Configuration tab, under Platform settings.

Orphaned runs

It is possible for some runs never to finish, for various reasons: they may be genuinely long-running, or they may have failed with unexpected exceptions. Runs that linger in the system increase memory usage, because their information is never unloaded from memory. When runs become orphaned they may go unnoticed while they keep eating up resources. The easiest way to find them is to check each workflow's run history filtered by the "Running" status and look for runs that are still running well past their expected completion. In my example, I had multiple runs that had started hours earlier but had not finished. This works, but it requires checking each workflow manually. You can also use Log Analytics and run a query that returns all runs that have not yet finished; this requires Diagnostic Settings to be enabled, as described in this blog post: Monitoring Azure Logic Apps (Standard) with Azure Monitor Logs. To make your troubleshooting easier, I've created a query that does this for you.
It checks only the run records and returns those that do not have a matching Completed event. The OperationName field registers the Start, Dispatched, and Completed events. By excluding the Dispatched events, we are left with Start and Completed, so by grouping and counting the run IDs, the query returns all run IDs that have a Start event but no matching Completed event.

LogicAppWorkflowRuntime
| where OperationName contains "WorkflowRun" and OperationName != 'WorkflowRunDispatched'
| project RunId, WorkflowName, TimeGenerated, OperationName, Status, StartTime, EndTime
| summarize Runs = count() by RunId, WorkflowName, StartTime
| where Runs == 1
| project RunId, WorkflowName, StartTime
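If you want to run this check on a schedule rather than in the portal, the same query can be executed programmatically. Below is a minimal sketch using the azure-monitor-query Python package; the workspace ID is a placeholder, and the query text is the one shown above.

# Minimal sketch: run the orphaned-runs query against a Log Analytics workspace.
from datetime import timedelta
from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

QUERY = """
LogicAppWorkflowRuntime
| where OperationName contains "WorkflowRun" and OperationName != 'WorkflowRunDispatched'
| project RunId, WorkflowName, TimeGenerated, OperationName, Status, StartTime, EndTime
| summarize Runs = count() by RunId, WorkflowName, StartTime
| where Runs == 1
| project RunId, WorkflowName, StartTime
"""

client = LogsQueryClient(DefaultAzureCredential())
response = client.query_workspace(
    workspace_id="<log-analytics-workspace-id>",  # placeholder
    query=QUERY,
    timespan=timedelta(days=1),
)

# Print any run that has a Start event but no Completed event in the last day.
for table in response.tables:
    for run_id, workflow, start_time in table.rows:
        print(f"Still running: {workflow} run {run_id} started at {start_time}")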
Large payloads

As previously discussed, large payloads can create a big overload and greatly increase memory usage. This applies not only to built-in connectors but also to managed connectors, because the information still needs to be processed: although the data is not loaded into the ASP memory for managed connectors, there is still a lot of data flowing through and being processed in CPU time. A Logic App is capable of processing a large amount of data, but when you combine large payloads, a very large number of concurrent runs, and a large number of actions and incoming/outgoing requests, you get a mixture that, if left unattended and allowed to keep growing, will cause performance issues over time.

Usage of built-in connectors

The built-in (In-App) connectors run natively in the Azure Logic Apps runtime, which in most cases gives better performance, capabilities, and pricing. Because they run inside the Logic App runtime, all the data is loaded in memory. This requires good architectural planning and forecasting for high usage and heavy payloads. As previously shown, built-in connectors that handle very large payloads can cause unexpected errors such as Out of Memory exceptions: the connector tries to load the payload into memory, and if memory is no longer available, it can crash the worker and return an Out of Memory exception. This will be visible in the runtime logs and may also leave the run orphaned, stuck in a state that is not recoverable. Internally, the runtime will attempt to gracefully retry these failed tasks multiple times, but there is always the possibility that the state is not recoverable and the worker crashes. This makes it all the more important to monitor closely and plan for high-usage scenarios, so you can scale your App Service Plans up and out in time.

Learnings

Monitoring can also be done through the Log Stream, which requires a Log Analytics connection to be configured but provides a great deal of insight into what the runtime is doing. It can give you Verbose level or just Warning/Error levels. It produces a lot of information and can be a bit tricky to read, but the level of detail can be a huge help in troubleshooting, both on your side and on the Support side. To use it, navigate to the Log Stream tab, enable it, switch to "Filesystem Logs", and enjoy the show. If an Out of Memory exception is caught, it will show up in red letters (as other exceptions do) and will look similar to this:

Job dispatching error: operationName='JobDispatchingWorker.Run', jobPartition='', jobId='', message='The job dispatching worker finished with unexpected exception.', exception='System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
   at System.Threading.Thread.StartCore()
   at System.Threading.Thread.Start(Boolean captureContext)
   at System.Threading.Thread.Start()

No PROD Logic App should be left without monitoring and alerting. Critical system or not, you should always plan not only for disaster scenarios but also for higher-than-usual volumes, because nothing is static, and a system with low usage today may well be scaled and used in ways it was never intended for. Implementing monitoring on the resource metrics is very valuable and can detect issues before they become overwhelming and cause a show-stopper scenario. You can use the Logic App metrics that are provided out of the box, or the metrics in the ASP; the latter cover a wider range of signals, since they are not as specific as the Logic App ones. You can also create custom alerts on these metrics and thereby increase your coverage of distress signals from Logic App processing (a minimal alert sketch follows at the end of this post). Leaving your Logic App without proper monitoring will likely catch you, your system administrators, and your business by surprise when processing falls outside the normal parameters and chaos starts to arise. There is one key insight that should be applied whenever possible: expect the best, prepare for the worst. Always plan ahead, monitor the current status, and think proactively, not just reactively.

Disclaimer: The base memory and CPU values are specific to your app and can vary based on the number of apps in the App Service Plan, the number of instances you have set as Always Running, the number of workflows in the app, how complex those workflows are, and which internal jobs need to be provisioned.
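As a concrete example of the alerting recommended above, here is a minimal sketch that creates a metric alert on the App Service Plan's memory percentage using the azure-mgmt-monitor Python package. The resource IDs, thresholds, and alert name are placeholders, and the metric name (MemoryPercentage) should be verified against the metrics shown for your plan in the Metrics blade.

# Minimal sketch: alert when the App Service Plan's average memory stays above 70%.
from azure.identity import DefaultAzureCredential
from azure.mgmt.monitor import MonitorManagementClient
from azure.mgmt.monitor.models import (
    MetricAlertResource,
    MetricAlertSingleResourceMultipleMetricCriteria,
    MetricCriteria,
)

ASP_ID = ("/subscriptions/<subscription-id>/resourceGroups/<rg>/providers/"
          "Microsoft.Web/serverfarms/<plan-name>")  # placeholder App Service Plan resource ID

client = MonitorManagementClient(DefaultAzureCredential(), "<subscription-id>")

alert = MetricAlertResource(
    location="global",
    description="ASP memory above 70% for 15 minutes",
    severity=2,
    enabled=True,
    scopes=[ASP_ID],
    evaluation_frequency="PT5M",
    window_size="PT15M",
    criteria=MetricAlertSingleResourceMultipleMetricCriteria(
        all_of=[
            MetricCriteria(
                name="HighMemory",
                metric_name="MemoryPercentage",  # assumed ASP metric name; verify in the Metrics blade
                time_aggregation="Average",
                operator="GreaterThan",
                threshold=70,
            )
        ]
    ),
    actions=[],  # add MetricAlertAction entries pointing at your action group
)

client.metric_alerts.create_or_update("<rg>", "asp-high-memory-alert", alert)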
Autoscaling Now Available in Azure API Management v2 Tiers

Gateway-Level Metrics: Deep Insight into Performance

Azure API Management now exposes fine-grained metrics for each API Management v2 gateway instance, giving you more control and observability. These enhancements give you deeper visibility into your infrastructure and the ability to scale automatically based on real-time usage, without manual effort.

Key Gateway Metrics

CPU Percentage of Gateway – available in Basic v2, Standard v2, and Premium v2
Memory Percentage of Gateway – available in Basic v2 and Standard v2

These metrics are essential for performance monitoring, diagnostics, and intelligent scaling.

Native Autoscaling: Adaptive, Metric-Driven Scaling

With gateway-level metrics in place, Azure Monitor autoscale rules can now drive automatic scaling of Azure API Management v2 gateways.

How It Works

You define scaling rules that automatically increase or decrease the number of gateway instances based on:

CPU percentage
Memory percentage (for Basic v2 and Standard v2)

Autoscale evaluates these metrics against your thresholds and acts accordingly, eliminating the need for manual scaling or custom scripts. A minimal programmatic sketch is included at the end of this post.

Benefits of Autoscaling in Azure API Management v2 Tiers

Autoscaling in Azure API Management brings several critical benefits for operational resilience, efficiency, and cost control:

Reliability: Maintain consistent performance by automatically scaling out during periods of high traffic. Your APIs stay responsive and available, even under sudden load spikes.
Operational Efficiency: Automated scaling eliminates manual, error-prone intervention, letting teams focus on innovation rather than infrastructure management.
Cost Optimization: When traffic drops, autoscale scales in to reduce the number of gateway instances, helping you save on infrastructure costs without sacrificing performance.

Use Case Highlights

Autoscaling is ideal for:

APIs with unpredictable or seasonal traffic
Enterprise systems needing automated resiliency
Teams seeking cost control and governance
Premium environments that demand always-on performance

Get Started Today

Enabling autoscaling is easy via the Azure portal:

1. Open your API Management instance.
2. Go to Settings > Scale out (Autoscale).
3. Enable autoscaling and define rules using gateway metrics.
4. Monitor performance in real time via Azure Monitor.

Configuration walkthrough: Autoscale your Azure API Management v2 instance
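For teams that prefer to configure this outside the portal, here is a minimal sketch of an equivalent autoscale rule using the azure-mgmt-monitor Python package. The resource ID, capacity limits, and the gateway CPU metric name are placeholder assumptions; check the metric name shown in the Azure Monitor metrics blade for your API Management v2 instance before using it.

# Minimal sketch: scale an API Management v2 instance out when gateway CPU is high.
from azure.identity import DefaultAzureCredential
from azure.mgmt.monitor import MonitorManagementClient
from azure.mgmt.monitor.models import (
    AutoscaleSettingResource,
    AutoscaleProfile,
    ScaleCapacity,
    ScaleRule,
    MetricTrigger,
    ScaleAction,
)

APIM_ID = ("/subscriptions/<subscription-id>/resourceGroups/<rg>/providers/"
           "Microsoft.ApiManagement/service/<apim-name>")  # placeholder resource ID

client = MonitorManagementClient(DefaultAzureCredential(), "<subscription-id>")

scale_out = ScaleRule(
    metric_trigger=MetricTrigger(
        metric_name="CpuPercentage_Gateway",  # assumed name for "CPU Percentage of Gateway"
        metric_resource_uri=APIM_ID,
        time_grain="PT1M",
        statistic="Average",
        time_window="PT10M",
        time_aggregation="Average",
        operator="GreaterThan",
        threshold=70,
    ),
    scale_action=ScaleAction(direction="Increase", type="ChangeCount", value="1", cooldown="PT10M"),
)

setting = AutoscaleSettingResource(
    location="uksouth",
    target_resource_uri=APIM_ID,
    enabled=True,
    profiles=[
        AutoscaleProfile(
            name="default",
            capacity=ScaleCapacity(minimum="1", maximum="4", default="1"),
            rules=[scale_out],  # a matching scale-in rule is recommended as well
        )
    ],
)

client.autoscale_settings.create_or_update("<rg>", "apim-gateway-autoscale", setting)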