Azure Load Testing
Azure Load Test Pricing
Hi, this is regarding Azure Load Testing pricing. Please advise.

Virtual User Hour (VUH) usage:
- 0 - 10,000 Virtual User Hours: $0.15/VUH
- 10,000+ Virtual User Hours: $0.06/VUH

I am trying to understand the above pricing. Let's say I want to run a test with 10k users just to log in to my website. This will take at most 10 seconds to complete. How will the pricing be calculated?

Regards,
Sharukh
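(Not an official answer.) As a rough rule of thumb, VUH usage is computed from the number of virtual users multiplied by the test duration in hours; exact rounding rules and any minimums are listed on the official pricing page, so treat the sketch below as illustrative only:

```python
# Unofficial, back-of-the-envelope estimate of Virtual User Hour (VUH) usage and cost.
# Assumption: usage ~= virtual_users * duration_in_hours, and the tiered rates quoted
# above apply to cumulative monthly usage. Check the official pricing page for exact
# rounding rules and any per-test minimums.

def estimate_cost(virtual_users: int, duration_seconds: float,
                  tier1_rate: float = 0.15, tier2_rate: float = 0.06,
                  tier1_limit: float = 10_000) -> float:
    vuh = virtual_users * (duration_seconds / 3600)   # virtual user hours
    tier1 = min(vuh, tier1_limit)
    tier2 = max(vuh - tier1_limit, 0)
    return tier1 * tier1_rate + tier2 * tier2_rate

vuh = 10_000 * (10 / 3600)                            # 10k users for ~10 seconds, about 27.8 VUH
print(f"{vuh:.1f} VUH -> ${estimate_cost(10_000, 10):.2f}")   # roughly $4.17 at the first-tier rate
```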
Throughput Testing at Scale for Azure Functions

Introduction

Ensuring reliable, high-performance serverless applications is central to our work on Azure Functions. With new plans like Flex Consumption expanding the platform’s capabilities, it's critical to continuously validate that our infrastructure can scale—reliably and efficiently—under real-world load. To meet that need, we built PerfBench (Performance Benchmarker), a comprehensive benchmarking system designed to measure, monitor, and maintain our performance baselines—catching regressions before they impact customers.

This infrastructure now runs close to 5,000 test executions every month, spanning multiple SKUs, regions, runtimes, and workloads—with Flex Consumption accounting for more than half of the total volume. This scale of testing helps us not only identify regressions early, but also understand system behavior over time across an increasingly diverse set of scenarios.

[Figure: … of all Python Function apps across regions (SKU: Flex Consumption, Instance Size: 2048 – 1000 VUs over 5 mins, HTML Parsing test)]

Motivation: Why We Built PerfBench

The Need for Scale

Azure Functions supports a range of triggers, from HTTP requests to event-driven flows like Service Bus or Storage Queue messages. With an ever-growing set of runtimes (e.g., .NET, Node.js, Python, Java, PowerShell) and versions (like Python 3.11 or .NET 8.0), multiple SKUs, and regions, the possible test combinations explode quickly. Manual testing or single-scenario benchmarks no longer cut it. The current scope of test coverage is shown below.

Plan              | Pricing Tier | Distinct Test Names
------------------|--------------|--------------------
FlexConsumption   | FLEX2048     | 110
FlexConsumption   | FLEX512      | 20
Consumption       | CNS          | 36
App Service Plan  | P1V3         | 32
Functions Premium | EP1          | 46

Table 1: Different test combinations per plan based on stack, pricing tier, scenario, etc. This doesn’t include the Service Bus tests.

The Flex Consumption Plan

There have been many iterations of this infrastructure within the team, and we’ve been continuously monitoring Functions performance for more than 4 years now, with more than a million runs to date. But with the introduction of the Flex Consumption plan (in preview at the time PerfBench was built), we had to redesign the testing from the ground up: Flex Consumption unlocks new scaling behaviors and needed thorough testing—millions of messages or tens of thousands of requests per second—to ensure confidence in performance goals and regression prevention.

[Figure: … (Flex Consumption, Instance Size: 2048)]

PerfBench: High-Level Architecture Overview

PerfBench is composed of several key pieces:

- Resource Creator – Uses meta files and Bicep templates to deploy receiver function apps (test targets) at scale.
- Test Infra Generator – Deploys and configures the system that actually does the load generation (e.g., SBLoadGen function app, Scheduler function app, ALT webhook function).
- Test Infra – The “brain” of testing, including the Scheduler, Azure Load Testing integration, and SBLoadGen.
- Receiver Function Apps – Deployed once per combination of runtime, version, region, OS, SKU, and scenario.
- Data Aggregation & Dashboards – Gathers test metrics from Azure Load Testing (ALT) or SBLoadGen, stores them in Azure Data Explorer (ADX), and displays trends in ADX dashboards.

Below is a simplified architecture diagram illustrating these components:

[Figure: PerfBench architecture diagram]

Components

Resource Creator

The Resource Creator uses meta files and Jinja templates to generate the Bicep templates that create resources.

- Meta Files: We define test scenarios in simple text-based files (e.g., os.txt, runtime_version.txt, sku.txt, scenario.txt). Each file lists possible values (like python|3.11 or dotnet|8.0) and short codes for resource naming.
- Template Generation: A script reads these meta files and uses them to produce Bicep templates—one template per valid combination—deploying receiver function apps into dedicated resource groups.
- Filters: Regex-like patterns in a filter.txt file exclude unwanted combos, keeping the matrix manageable.
- CI/CD Flow: Whenever we add a new runtime or region, a pull request updates the relevant meta file. Once merged, our pipeline regenerates Bicep and redeploys resources (these are idempotent updates).

A sketch of what this generation step can look like follows below.
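For illustration only (this is not the actual PerfBench code), here is a minimal sketch of how meta files and a Jinja template can be combined to emit one Bicep file per valid combination. The meta-file format, template file name, and filter.txt layout are assumptions:

```python
# Hypothetical sketch of meta-file-driven Bicep generation (not the real PerfBench script).
import itertools
import re
from pathlib import Path

from jinja2 import Template

def read_meta(path: str) -> list[dict]:
    """Parse a meta file where each line is 'value|short_code' (format assumed)."""
    entries = []
    for line in Path(path).read_text().splitlines():
        if not line.strip():
            continue
        *value_parts, code = line.strip().split("|")
        entries.append({"value": "|".join(value_parts), "code": code})
    return entries

# filter.txt is assumed to hold one regex per line describing combinations to skip.
filters = [re.compile(p) for p in Path("filter.txt").read_text().split() if p]
template = Template(Path("receiver_app.bicep.j2").read_text())   # assumed template name
Path("out").mkdir(exist_ok=True)

dims = {d: read_meta(f"{d}.txt") for d in ("os", "runtime_version", "sku", "scenario")}
for os_, runtime, sku, scenario in itertools.product(*dims.values()):
    # Short codes from the meta files drive resource naming.
    app_name = "-".join(x["code"] for x in (os_, runtime, sku, scenario))
    if any(f.search(app_name) for f in filters):
        continue                                                  # excluded combination
    bicep = template.render(os=os_["value"], runtime=runtime["value"],
                            sku=sku["value"], scenario=scenario["value"],
                            app_name=app_name)
    Path(f"out/{app_name}.bicep").write_text(bicep)
```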
Test Infra Generator

Deploys and configures the Scheduler function app, the SBLoadGen Durable Functions app, and the ALT webhook function. It follows a similar CI/CD approach—merging changes triggers the creation (or update) of these infrastructure components.

Test Infra: Load Generation, Scheduling, and Reporting

Scheduler

The conductor of the whole operation. It runs every 5 minutes and loads test configurations (test_configs.json) from Blob Storage. The configuration includes details on which tests to run, at what time (e.g., “run at 13:45 daily”), and references to either ALT for HTTP tests or SBLoadGen for non-HTTP tests, so each can be scheduled on the appropriate system. Some tests run multiple times daily, others once a day; a scheduled downtime is built in for maintenance.

HTTP Load Generator - Azure Load Testing (ALT)

We use Azure Functions to trigger Azure Load Testing (ALT) for HTTP-based scenarios. ALT is a production-grade load generation service that provides an easy-to-configure way to send load to different server endpoints using JMeter and Locust. We worked closely with the ALT team to optimize the JMeter scripts for different scenarios; the service itself recently completed its second year. We created an abstraction on top of ALT, a custom function app exposing a webhook to start tests and report when they finish, that does the following:

- Initiates a test run using a predefined JMX file.
- Continuously polls until the test execution is complete.
- Retrieves the test results and transforms them into the required format.
- Transmits the formatted results to the data aggregation system.

Sample ALT test run: 8.8 million requests in under 6 minutes, with a 90th percentile response time of 80 ms and zero errors. The system maintained a throughput of 28K+ RPS.

Some more details of our ALT setup:

- 25 runtime controllers manage the test logic and concurrency.
- 40 engines handle the actual load execution, distributing test plans.
- 1,000 clients total for 5-minute runs to measure throughput, error rates, and latency.

Test types:

- HelloWorld (GET request, to establish a baseline for the system).
- HtmlParser (POST request sending HTML for parsing, to simulate moderate CPU usage).

Service Bus Load Generator - SBLoadGen (Durable Functions)

For event-driven scenarios (e.g., Service Bus–based triggers), we built SBLoadGen. It’s a Durable Function that uses the fan-out pattern to distribute work across multiple workers—each responsible for sending a portion of the total load. In a typical run, we aim to generate around one million messages in under a minute to stress-test the system. We intentionally avoid a fan-in step—once messages are in flight, the system defers to the receiver function apps to process them and emit the relevant telemetry.

Highlights:

- Generates ~1 million messages in under a minute.
- Durable Function apps are deployed regionally and are triggered via webhook.
- Implemented as a Python function app using the v2 programming model.
- Note: SBLoadGen will be open sourced in the coming days.

A simplified sketch of the fan-out pattern appears below.
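This is not the actual SBLoadGen implementation (which isn't published yet), just a minimal illustration of the fan-out pattern in a Python (v2 model) Durable Functions app. The queue name, worker count, default message total, and connection-string setting are assumptions:

```python
# Illustrative fan-out sketch (not the actual SBLoadGen implementation).
import os

import azure.durable_functions as df
import azure.functions as func
from azure.servicebus import ServiceBusClient, ServiceBusMessage

app = df.DFApp(http_auth_level=func.AuthLevel.FUNCTION)

@app.orchestration_trigger(context_name="context")
def loadgen_orchestrator(context: df.DurableOrchestrationContext):
    total = context.get_input() or 1_000_000      # messages to generate (assumed default)
    workers = 100                                  # assumed fan-out width
    per_worker = total // workers
    # Fan out: each activity sends its share of the messages in parallel.
    tasks = [context.call_activity("send_messages", per_worker) for _ in range(workers)]
    results = yield context.task_all(tasks)
    return sum(results)                            # total messages actually sent

@app.activity_trigger(input_name="count")
def send_messages(count: int) -> int:
    conn = os.environ["SERVICEBUS_CONNECTION"]     # assumed app setting name
    sent = 0
    with ServiceBusClient.from_connection_string(conn) as client:
        with client.get_queue_sender(queue_name="perf-queue") as sender:   # assumed queue name
            while sent < count:
                batch = sender.create_message_batch()
                while sent < count:
                    try:
                        batch.add_message(ServiceBusMessage(f"msg-{sent}"))
                        sent += 1
                    except ValueError:             # current batch is full, send and start a new one
                        break
                sender.send_messages(batch)
    return sent
```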
Receiver Function Apps (Test apps)

These are the actual apps receiving all the generated load. They are deployed in different combinations and updated rarely. Each valid combination (region + OS + runtime + SKU + scenario) gets its own function app, receiving load from ALT or SBLoadGen.

HTTP scenarios:

- HelloWorld: No-op test to measure the overhead of the system and establish a baseline.
- HTML Parser: POST with an HTML document for parsing (simulating a small CPU load).

Non-HTTP (Service Bus) scenario:

- CSV-to-JSON conversion plus blob storage operations, blending compute and I/O overhead.

Collected metrics:

- RPS: Requests per second, success/error rates, and latency distributions for HTTP workloads.
- MPPS: Messages processed per second and success/error rates for non-HTTP (e.g., Service Bus) workloads.

Data Aggregation & Dashboards

Capturing results at scale is just as important as generating load. PerfBenchV2 uses a modular data pipeline to reliably ingest and visualize metrics from both HTTP and Service Bus–based tests. All test results flow through Event Hubs, which act as an intermediary between the test infrastructure and our analytics platform. The webhook function (used with ALT) and the SBLoadGen app both emit structured logs that are routed through Event Hub streams and ingested into dedicated Azure Data Explorer (ADX) tables.

We use three main tables in ADX:

- HTTPTestResults for test runs executed via Azure Load Testing.
- SBLoadGenRuns for recording message counts and timing data from Service Bus scenarios.
- SchedulerRuns to log when and how each test was initiated.

On top of this telemetry, we’ve built custom ADX dashboards that allow us to monitor trends in latency, throughput, and error rates over time. These dashboards provide clear, actionable views into system behavior across dozens of runtimes, regions, and SKUs. Because our focus is on long-term trend analysis, rather than real-time anomaly detection, this batch-oriented approach works well and reduces operational complexity.

CI/CD Pipeline Integration

- Continuous Updates: Once a new language version or scenario is added to the runtime_version.txt or scenario.txt meta files, the pipeline regenerates the Bicep templates and deploys new receiver apps. The Test Infra Generator also updates or redeploys the needed function apps (Scheduler, SBLoadGen, or the ALT webhook) whenever their logic changes.
- Release Confidence: We run throughput tests on these new apps early and often, catching performance regressions before shipping to customers.

Challenges & Lessons Learned

Designing and running this infrastructure hasn't been easy, and we've learned many valuable lessons along the way. Here are a few:

- Exploding Matrix - Handling every runtime, OS, SKU, region, and scenario can lead to thousands of permutations. Meta files and a robust filter system help keep this under control, but it remains an ongoing effort.
- Cloud Transience - With ephemeral infrastructure, tests sometimes fail due to network hiccups or short-lived capacity constraints. We built in retries and redundancy to mitigate transient failures (a generic sketch of this pattern follows below).
- Early Adoption - PerfBench was among the first heavy “customers” of the new Flex Consumption plan. At times, we had to wait for Bicep features or platform fixes—but it gave us great insight into the plan’s real-world performance.
- Maintenance & Cleanup - When certain stacks or SKUs near end-of-life, we have to decommission their resources—this also means regular grooming of meta files and filter rules.
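The retry logic itself isn't described in the post; purely as an illustration of the kind of mitigation involved, a generic retry helper with exponential backoff might look like this (all names and limits are assumptions):

```python
# Generic retry helper with jittered exponential backoff (illustrative only; not PerfBench code).
import random
import time

def with_retries(operation, attempts: int = 5, base_delay: float = 1.0,
                 retriable=(ConnectionError, TimeoutError)):
    """Run `operation`, retrying transient failures with exponential backoff plus jitter."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retriable as err:
            if attempt == attempts:
                raise                                       # out of retries, surface the error
            delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, 0.5)
            print(f"Transient failure ({err!r}); retrying in {delay:.1f}s")
            time.sleep(delay)

# Hypothetical usage: response = with_retries(lambda: requests.get(url, timeout=10))
```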
Success Stories

- Proactive Regression Detection: PerfBench surfaced critical performance regressions early—often before they could impact customers. These insights enabled timely fixes and gave us the confidence to move forward with the General Availability of Flex Consumption.
- Production-Level Confidence: By continuously running tests across live production regions, PerfBench provided a realistic view of system behavior under load. This allowed the team to fine-tune performance, eliminate bottlenecks, and achieve improvements measured in single-digit milliseconds.
- Influencing Product Evolution: As one of the first large-scale internal adopters of the Flex Consumption plan, PerfBench served as a rigorous validation tool. The feedback it generated played a direct role in shaping feature priorities and improving platform reliability—well before broader customer adoption.

Future Directions

- Open sourcing: We are in the process of open sourcing the relevant parts of PerfBench (SBLoadGen, the Bicep templates generator, etc.).
- Production Synthetic Validation and Alerting: Adapting PerfBench’s resource generation approach for ongoing synthetic tests in production, ensuring real environments consistently meet performance SLOs. This will also open up alerting and monitoring scenarios across the production fleet.
- Expanding Trigger Coverage and Variations: Exploring additional triggers, like Storage queues or Event Hubs, to broaden test coverage, and testing different settings within the same scenario (e.g., larger payloads, concurrency changes).

Conclusion

PerfBench underscores our commitment to high-performance Azure Functions. By automating test app creation (via meta files and Bicep), orchestrating load (via ALT and SBLoadGen), and collecting data in ADX, we maintain a continuous pulse on throughput. This approach has already proven invaluable for Flex Consumption, and we’re excited to expand scenarios and triggers in the future. For more details on Flex Consumption and other hosting plans, check out the Azure Functions documentation. We hope the insights shared here spark ideas for your own large-scale performance testing needs—whether on Azure Functions or any other distributed cloud services.

Acknowledgements

We’d like to acknowledge the entire Functions Platform and Tooling teams for their foundational work in enabling this testing infrastructure. Special thanks to the Azure Load Testing (ALT) team for their continued support and collaboration. And finally, sincere appreciation to our leadership for making performance a first-class engineering priority across the stack.

Further Reading

- Azure Functions
- Azure Functions Flex Consumption Plan
- Azure Durable Functions
- Azure Functions Python Developer Reference Guide
- Azure Functions Performance Optimizer
- Example case study: GitHub and Azure Functions
- Azure Load Testing Overview
- Azure Data Explorer Dashboards

If you have any questions or want to share your own performance testing experiences, feel free to reach out in the comments!
Introducing AI-Powered Actionable Insights in Azure Load Testing

We’re excited to announce the preview of AI-powered Actionable Insights in Azure Load Testing—a new capability that helps teams quickly identify performance issues and understand test results through AI-driven analysis.

Performance testing is an essential part of ensuring application reliability and responsiveness, but interpreting the results can often be challenging. It typically involves manually correlating client-side load test telemetry with backend service metrics, which can be both time-consuming and error-prone. Actionable Insights simplifies this process by automatically analyzing test data, surfacing key issues, and offering clear, actionable recommendations—so teams can focus on fixing what matters, not sifting through raw data.

AI-powered diagnostics

Actionable Insights uses AI to detect performance issues such as latency spikes, failed requests, throughput anomalies, and resource bottlenecks. It presents insights clearly, highlighting patterns and root causes so teams can quickly understand what went wrong and how to fix it. Insights leverage telemetry from both client-side and server-side metrics, which are collected via Azure Monitor. When server-side monitoring is enabled, Azure Load Testing correlates frontend traffic patterns with backend system behavior. For example, if an increase in virtual users coincides with latency spikes in Azure Cosmos DB, the insight will highlight this relationship and suggest corrective actions—giving teams a comprehensive view of system behavior under load. You can learn how to enable server-side metrics here.

Rich, integrated experience for faster issue resolution

Actionable Insights provides a unified, intuitive experience within your test results, clearly illustrating the context of detected performance issues. By consolidating metrics, conditions, and recommendations into a single view, your team can diagnose and resolve issues faster, without switching tools or piecing data together manually.

Get Started

Actionable Insights is now available in preview. To try it out, trigger a new test run in Azure Load Testing. For best results, enable server-side metrics when configuring your test. Once the run completes, AI-powered insights will be available in the test results view—no additional setup required.

This is just the beginning. We are actively working on improving the quality of these insights and adding more capabilities. Your feedback is essential. Let us know what’s working well and where we can improve by using the thumbs-up or thumbs-down option on each generated insight in the Azure Load Testing portal. You can also share your feedback on our community. Learn more about Actionable Insights.
Optimize Azure Functions for Performance and Costs using Azure Load Testing

Performance optimizer is a tool that helps you find the optimal balance between cost and performance for your Azure Functions. It runs load tests on different configurations and recommends the best one for your app.
Announcing CI/CD Enhancements for Azure Load Testing

We are excited to announce a significant update to our Azure Load Testing service, aimed at enhancing the experience of setting up and running load tests from CI/CD systems, including Azure DevOps and GitHub. This update is a direct response to customer feedback and is designed to streamline the process, making it more efficient and user-friendly.

Key Features and Improvements

Enhanced CI/CD Integration: Developers and testers can now configure application components and the metrics to monitor directly from a CI/CD pipeline. This integration allows monitoring the application infrastructure during the test run. You can make the following changes to your load test YAML config:

```yaml
appComponents:
  - resourceId: "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/samplerg/providers/microsoft.insights/components/appComponentResource"
    resourceName: appComponentResource # Optional
    kind: web # Optional
    metrics:
      - name: "requests/duration"
        namespace: microsoft.insights/components
        aggregation: "Average"
      - name: "requests/count"
        aggregation: "Total"
        namespace: microsoft.insights/components
  - resourceId: "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/samplerg/providers/microsoft.insights/components/appComponentResource"
    resourceName: appComponentResource # Optional
    kind: web # Optional
    metrics:
      - name: "requests/duration"
        aggregation: "Average"
        namespace: microsoft.insights/components
      - name: "requests/count"
        aggregation: "Total"
        namespace: microsoft.insights/components
```

Pass/Fail Criteria on Server Metrics: Users can set pass/fail criteria on server metrics from a CI/CD pipeline, providing more granular control over test outcomes. This feature helps in maintaining high performance standards by automatically flagging any performance issues. You can make the following changes to your load test YAML config:

```yaml
failureCriteria:
  clientMetrics:
    - avg(responseTimeMs) > 300
    - percentage(error) > 50
    - getCustomerDetails: avg(latency) > 200
  serverMetrics:
    - resourceId: /subscriptions/abcdef01-2345-6789-0abc-def012345678/resourceGroups/sample-rg/providers/Microsoft.Compute/virtualMachines/sample-vm
      metricNamespace: Microsoft.Compute/virtualMachines
      metricName: Percentage CPU
      aggregation: Average
      condition: GreaterThan
      value: 80
    - resourceId: /subscriptions/abcdef01-2345-6789-0abc-def012345678/resourceGroups/sample-rg/providers/Microsoft.Compute/virtualMachines/sample-vm
      metricNamespace: Microsoft.Compute/virtualMachines
      metricName: Available Memory
      aggregation: Average
      condition: LessThan
      value: 20
```

Parameter Overrides: The ability to override parameters of a load test configuration YAML from the Azure DevOps task or GitHub Action adds flexibility and customization to the testing process.

Output Variables: The Azure DevOps task now includes output variables that can be consumed in downstream steps, jobs, and stages. This makes it possible to take further actions on the load test results within the pipeline.

Pipeline Cancellation: If a pipeline in Azure Pipelines or a workflow in GitHub is cancelled, any load test triggered by that pipeline or workflow will also be cancelled. This avoids incurring costs for unnecessary tests.

Traceability and Results Viewing: Users can trace a test run back to the pipeline that ran it, directly from the Azure portal. This provides end-to-end traceability and helps you understand which changes might have triggered a test failure.

Conclusion

These enhancements are designed to provide a more integrated and efficient load testing experience for our users.
We believe these updates will help developers, testers, and DevOps engineers better manage their load testing processes, ensuring high performance and reliability for their applications. We look forward to your feedback and are excited to see how these new features improve your CI/CD workflows. Stay tuned for more updates, and happy testing!
Azure Load Testing Celebrates Two Years with Two Exciting Announcements!

[Update on March 18, 2025: AI-powered load test generation, referred to in the third section below, is in preview now!]

Azure Load Testing (ALT) has been an essential tool for performance testing, enabling customers across industries to run thousands of tests every month. We are thrilled to celebrate its second anniversary with two major announcements. In this blog post, we will delve into the remarkable capabilities of ALT and reveal the exciting developments that will redefine load testing for you.

Why do customers love ALT?

ALT is a powerful service designed to ensure that your applications can handle high traffic and perform optimally under peak load. Here are some key features of ALT:

- Large-scale tests: Simulate over 100,000 concurrent users.
- Long-duration tests: Run tests for up to 24 hours.
- Multi-region tests: Simultaneously simulate users from any of the 20 supported regions.
- Continuous tests: Catch performance regressions early by integrating with Azure Pipelines, GitHub Actions, or other CI/CD systems.
- Comprehensive test results: Correlate server-side metrics with client-side metrics for end-to-end insights.
- Analytics and insights: Quickly and easily identify performance bottlenecks with detailed analytics.

Pricing Changes: Listening to You

We have heard your feedback and are excited to announce significant pricing changes, effective March 1, 2025:

- No monthly resource fee: We have eliminated the $10 monthly resource fee to help you save on overall costs.
- 20% price reduction: The cost per Virtual User Hour (VUH) for usage beyond 10,000 VUH is reduced from 7.5 cents to 6 cents.

Additionally, we are introducing a feature to set a consumption limit per resource. This will enable central teams, such as a Performance Center of Excellence, to effectively manage and control the costs incurred by each team. These changes reflect our commitment to making ALT more accessible and cost-effective, ensuring that you can optimize your applications without worrying about budget constraints.

Locust-Based Tests: Offering You a Choice

In another exciting development, we are delighted to announce the availability of Locust-based tests. This addition allows you to leverage the power, flexibility, and developer-friendly nature of the Python-based Locust load testing framework, in addition to the already supported Apache JMeter framework. We are also working on making it easy for you to generate tests by leveraging AI. With our integration with GitHub Copilot, you will be able to simply start with a Postman Collection or an HTTP file and leverage the copilot to generate Locust-based tests. Stay tuned!

This update opens new possibilities for you, providing a choice of load testing frameworks and making it easy to generate tests.

In Summary

As we celebrate the second anniversary, we are committed to continually improving and evolving the service to meet your needs. With the introduction of half a dozen features (1. consumption limits, 2. Locust-based tests, 3. support for multiple test files, 4. scheduling, 5. notifications, 6. support for managed identity), in addition to the pricing changes, we are confident that ALT will continue to be an indispensable tool in your performance testing arsenal. We are excited about all the 50+ updates over two years and look forward to seeing how they enhance your testing processes. Thank you for being a part of our journey, and we can't wait to see what you achieve with ALT.
If you would like to share how you were able to leverage ALT for an interesting scenario, email me at shon dot shah at microsoft dot com or post your feedback at https://aka.ms/malt-feedback. Happy load testing!
AI-Powered Load Testing in VS Code with Azure Load Testing & GitHub Copilot

There's a better way than writing load test scripts by hand. The new Azure Load Testing extension for Visual Studio Code (Preview), now integrated with GitHub Copilot, automatically generates realistic, Locust-based load tests. It seamlessly handles authentication, API request sequencing, response validation, and test data—helping you save time and ensure realistic performance testing.

With this AI-driven tool, you can:

- Instantly generate Locust test scripts from Postman collections, Insomnia collections, or .http files.
- Easily enhance tests with GitHub Copilot, like adding random data or dynamic user flows.
- Quickly iterate by running tests locally before scaling up.
- Easily execute large-scale tests in Azure Load Testing to uncover performance bottlenecks.

Spend less time wrestling with test scripts and more time optimizing your app's scalability.

Key Features

🔹 AI-Generated Locust Scripts in Seconds

Skip manual script creation. In just a few clicks, provide a Postman collection, Insomnia collection, or .http file to generate a complete load test script. Copilot then:

- Sequences API requests, passing values from one request to the next to simulate real workflows.
- Integrates authentication and headers without hardcoding credentials.
- Validates requests and handles errors gracefully with validations and logs.
- Aggregates response metrics for detailed success and failure rates.
- Cleans up test data when the test finishes.
- Produces reusable scripts you can refine locally and scale to Azure Load Testing.

🔹 Refine with AI-Powered Customization

Need to adjust your test script? Just ask:

💬 "Randomize request payloads"
💬 "Use data from a CSV file for username/password"
💬 "Retry requests for HTTP errors"
💬 "Define a custom load test shape for simulating spikes"

Copilot suggests the changes, and you can apply them with a single click—no need to manually edit the script.

🔹 Run Tests Locally for Fast Iteration

Quickly validate your Locust script in VS Code before scaling:

- Run Locust locally right from within VS Code.
- Monitor performance and errors in real time in the Locust web UI.
- Fine-tune and troubleshoot quickly.

By iterating locally, you ensure your tests work correctly before moving to large-scale testing.

🔹 Scale with One Click in Azure Load Testing

Ready for bigger tests? Move your scripts to Azure Load Testing with simple YAML-based configuration:

- Set virtual users and load parameters in VS Code.
- Choose regions to simulate distributed traffic.
- Track client and server performance metrics in real time to pinpoint bottlenecks.

Azure Load Testing can handle many thousands of concurrent users, so you can find issues before they hit production.

🔹 Secure Test Secrets with Azure Key Vault

Performance tests often require API keys or tokens. The Azure Load Testing extension integrates with Azure Key Vault for secure secret management during test runs.

🚀 Get Started

Stop struggling with manual load testing scripts. Let GitHub Copilot and Azure Load Testing do the heavy lifting, so you can focus on building scalable, high-performing apps.

🔹 Install the Azure Load Testing extension for VS Code (Preview) and start testing smarter today!
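For readers new to Locust, here is a minimal hand-written example of the kind of script these tools produce. It is illustrative only; the endpoints and validation logic are assumptions, not output of the extension:

```python
# Minimal Locust script (illustrative; not generated by the VS Code extension).
from locust import HttpUser, task, between

class WebsiteUser(HttpUser):
    wait_time = between(1, 3)   # think time between requests, in seconds

    @task
    def browse_and_search(self):
        # Basic response validation: mark non-200 responses as failures.
        with self.client.get("/", name="home", catch_response=True) as resp:
            if resp.status_code != 200:
                resp.failure(f"Unexpected status: {resp.status_code}")

        # A follow-up request in the same simulated user session (hypothetical endpoint).
        self.client.get("/products?query=chair", name="search")

# Run locally with:  locust -f locustfile.py --host https://example.contoso.com
```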