Azure Load Testing
Azure Load Test Pricing
Hi, this is regarding Azure Load Testing pricing. Please advise.

Virtual User Hour (VUH) usage:
- 0 - 10,000 Virtual User Hours: $0.15/VUH
- 10,000+ Virtual User Hours: $0.06/VUH

I am trying to understand the above pricing. Let's say I want to run a test with 10k users just to log in to my website. This will take at most 10 seconds to complete. How will the pricing be calculated?

Regards,
Sharukh
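(Not an official answer.) As a rough rule of thumb, VUH usage is computed from the number of virtual users multiplied by the test duration in hours; exact rounding rules and any minimums are listed on the official pricing page, so treat the sketch below as illustrative only:

```python
# Unofficial, back-of-the-envelope estimate of Virtual User Hour (VUH) usage and cost.
# Assumption: usage ~= virtual_users * duration_in_hours, and the tiered rates quoted
# above apply to cumulative monthly usage. Check the official pricing page for exact
# rounding rules and any per-test minimums.

def estimate_cost(virtual_users: int, duration_seconds: float,
                  tier1_rate: float = 0.15, tier2_rate: float = 0.06,
                  tier1_limit: float = 10_000) -> float:
    vuh = virtual_users * (duration_seconds / 3600)   # virtual user hours
    tier1 = min(vuh, tier1_limit)
    tier2 = max(vuh - tier1_limit, 0)
    return tier1 * tier1_rate + tier2 * tier2_rate

vuh = 10_000 * (10 / 3600)                            # 10k users for ~10 seconds, about 27.8 VUH
print(f"{vuh:.1f} VUH -> ${estimate_cost(10_000, 10):.2f}")   # roughly $4.17 at the first-tier rate
```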
Throughput Testing at Scale for Azure Functions

Introduction

Ensuring reliable, high-performance serverless applications is central to our work on Azure Functions. With new plans like Flex Consumption expanding the platform’s capabilities, it's critical to continuously validate that our infrastructure can scale—reliably and efficiently—under real-world load. To meet that need, we built PerfBench (Performance Benchmarker), a comprehensive benchmarking system designed to measure, monitor, and maintain our performance baselines—catching regressions before they impact customers.

This infrastructure now runs close to 5,000 test executions every month, spanning multiple SKUs, regions, runtimes, and workloads—with Flex Consumption accounting for more than half of the total volume. This scale of testing helps us not only identify regressions early, but also understand system behavior over time across an increasingly diverse set of scenarios.

[Figure: … of all Python Function apps across regions (SKU: Flex Consumption, Instance Size: 2048 – 1000 VUs over 5 mins, HTML Parsing test)]

Motivation: Why We Built PerfBench

The Need for Scale

Azure Functions supports a range of triggers, from HTTP requests to event-driven flows like Service Bus or Storage Queue messages. With an ever-growing set of runtimes (e.g., .NET, Node.js, Python, Java, PowerShell) and versions (like Python 3.11 or .NET 8.0), multiple SKUs, and regions, the possible test combinations explode quickly. Manual testing or single-scenario benchmarks no longer cut it. The current scope of test coverage is shown below.

Plan              | Pricing Tier | Distinct Test Names
------------------|--------------|--------------------
FlexConsumption   | FLEX2048     | 110
FlexConsumption   | FLEX512      | 20
Consumption       | CNS          | 36
App Service Plan  | P1V3         | 32
Functions Premium | EP1          | 46

Table 1: Different test combinations per plan based on stack, pricing tier, scenario, etc. This doesn’t include the Service Bus tests.

The Flex Consumption Plan

There have been many iterations of this infrastructure within the team, and we’ve been continuously monitoring Functions performance for more than 4 years now, with more than a million runs to date. But with the introduction of the Flex Consumption plan (in preview at the time PerfBench was built), we had to redesign the testing from the ground up: Flex Consumption unlocks new scaling behaviors and needed thorough testing—millions of messages or tens of thousands of requests per second—to ensure confidence in performance goals and regression prevention.

[Figure: … (Flex Consumption, Instance Size: 2048)]

PerfBench: High-Level Architecture Overview

PerfBench is composed of several key pieces:

- Resource Creator – Uses meta files and Bicep templates to deploy receiver function apps (test targets) at scale.
- Test Infra Generator – Deploys and configures the system that actually does the load generation (e.g., SBLoadGen function app, Scheduler function app, ALT webhook function).
- Test Infra – The “brain” of testing, including the Scheduler, Azure Load Testing integration, and SBLoadGen.
- Receiver Function Apps – Deployed once per combination of runtime, version, region, OS, SKU, and scenario.
- Data Aggregation & Dashboards – Gathers test metrics from Azure Load Testing (ALT) or SBLoadGen, stores them in Azure Data Explorer (ADX), and displays trends in ADX dashboards.

Below is a simplified architecture diagram illustrating these components:

[Figure: PerfBench architecture diagram]

Components

Resource Creator

The Resource Creator uses meta files and Jinja templates to generate the Bicep templates that create resources.

- Meta Files: We define test scenarios in simple text-based files (e.g., os.txt, runtime_version.txt, sku.txt, scenario.txt). Each file lists possible values (like python|3.11 or dotnet|8.0) and short codes for resource naming.
- Template Generation: A script reads these meta files and uses them to produce Bicep templates—one template per valid combination—deploying receiver function apps into dedicated resource groups.
- Filters: Regex-like patterns in a filter.txt file exclude unwanted combos, keeping the matrix manageable.
- CI/CD Flow: Whenever we add a new runtime or region, a pull request updates the relevant meta file. Once merged, our pipeline regenerates Bicep and redeploys resources (these are idempotent updates).

A sketch of what this generation step can look like follows below.
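For illustration only (this is not the actual PerfBench code), here is a minimal sketch of how meta files and a Jinja template can be combined to emit one Bicep file per valid combination. The meta-file format, template file name, and filter.txt layout are assumptions:

```python
# Hypothetical sketch of meta-file-driven Bicep generation (not the real PerfBench script).
import itertools
import re
from pathlib import Path

from jinja2 import Template

def read_meta(path: str) -> list[dict]:
    """Parse a meta file where each line is 'value|short_code' (format assumed)."""
    entries = []
    for line in Path(path).read_text().splitlines():
        if not line.strip():
            continue
        *value_parts, code = line.strip().split("|")
        entries.append({"value": "|".join(value_parts), "code": code})
    return entries

# filter.txt is assumed to hold one regex per line describing combinations to skip.
filters = [re.compile(p) for p in Path("filter.txt").read_text().split() if p]
template = Template(Path("receiver_app.bicep.j2").read_text())   # assumed template name
Path("out").mkdir(exist_ok=True)

dims = {d: read_meta(f"{d}.txt") for d in ("os", "runtime_version", "sku", "scenario")}
for os_, runtime, sku, scenario in itertools.product(*dims.values()):
    # Short codes from the meta files drive resource naming.
    app_name = "-".join(x["code"] for x in (os_, runtime, sku, scenario))
    if any(f.search(app_name) for f in filters):
        continue                                                  # excluded combination
    bicep = template.render(os=os_["value"], runtime=runtime["value"],
                            sku=sku["value"], scenario=scenario["value"],
                            app_name=app_name)
    Path(f"out/{app_name}.bicep").write_text(bicep)
```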
Test Infra Generator

Deploys and configures the Scheduler function app, the SBLoadGen Durable Functions app, and the ALT webhook function. It follows a similar CI/CD approach—merging changes triggers the creation (or update) of these infrastructure components.

Test Infra: Load Generation, Scheduling, and Reporting

Scheduler

The conductor of the whole operation. It runs every 5 minutes and loads test configurations (test_configs.json) from Blob Storage. The configuration includes details on which tests to run, at what time (e.g., “run at 13:45 daily”), and references to either ALT for HTTP tests or SBLoadGen for non-HTTP tests, so each can be scheduled on the appropriate system. Some tests run multiple times daily, others once a day; a scheduled downtime is built in for maintenance.

HTTP Load Generator - Azure Load Testing (ALT)

We use Azure Functions to trigger Azure Load Testing (ALT) for HTTP-based scenarios. ALT is a production-grade load generation service that provides an easy-to-configure way to send load to different server endpoints using JMeter and Locust. We worked closely with the ALT team to optimize the JMeter scripts for different scenarios; the service itself recently completed its second year. We created an abstraction on top of ALT, a custom function app exposing a webhook to start tests and report when they finish, that does the following:

- Initiates a test run using a predefined JMX file.
- Continuously polls until the test execution is complete.
- Retrieves the test results and transforms them into the required format.
- Transmits the formatted results to the data aggregation system.

Sample ALT test run: 8.8 million requests in under 6 minutes, with a 90th percentile response time of 80 ms and zero errors. The system maintained a throughput of 28K+ RPS.

Some more details of our ALT setup:

- 25 runtime controllers manage the test logic and concurrency.
- 40 engines handle the actual load execution, distributing test plans.
- 1,000 clients total for 5-minute runs to measure throughput, error rates, and latency.

Test types:

- HelloWorld (GET request, to establish a baseline for the system).
- HtmlParser (POST request sending HTML for parsing, to simulate moderate CPU usage).

Service Bus Load Generator - SBLoadGen (Durable Functions)

For event-driven scenarios (e.g., Service Bus–based triggers), we built SBLoadGen. It’s a Durable Function that uses the fan-out pattern to distribute work across multiple workers—each responsible for sending a portion of the total load. In a typical run, we aim to generate around one million messages in under a minute to stress-test the system. We intentionally avoid a fan-in step—once messages are in flight, the system defers to the receiver function apps to process them and emit the relevant telemetry.

Highlights:

- Generates ~1 million messages in under a minute.
- Durable Function apps are deployed regionally and are triggered via webhook.
- Implemented as a Python function app using the v2 programming model.
- Note: SBLoadGen will be open sourced in the coming days.

A simplified sketch of the fan-out pattern appears below.
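This is not the actual SBLoadGen implementation (which isn't published yet), just a minimal illustration of the fan-out pattern in a Python (v2 model) Durable Functions app. The queue name, worker count, default message total, and connection-string setting are assumptions:

```python
# Illustrative fan-out sketch (not the actual SBLoadGen implementation).
import os

import azure.durable_functions as df
import azure.functions as func
from azure.servicebus import ServiceBusClient, ServiceBusMessage

app = df.DFApp(http_auth_level=func.AuthLevel.FUNCTION)

@app.orchestration_trigger(context_name="context")
def loadgen_orchestrator(context: df.DurableOrchestrationContext):
    total = context.get_input() or 1_000_000      # messages to generate (assumed default)
    workers = 100                                  # assumed fan-out width
    per_worker = total // workers
    # Fan out: each activity sends its share of the messages in parallel.
    tasks = [context.call_activity("send_messages", per_worker) for _ in range(workers)]
    results = yield context.task_all(tasks)
    return sum(results)                            # total messages actually sent

@app.activity_trigger(input_name="count")
def send_messages(count: int) -> int:
    conn = os.environ["SERVICEBUS_CONNECTION"]     # assumed app setting name
    sent = 0
    with ServiceBusClient.from_connection_string(conn) as client:
        with client.get_queue_sender(queue_name="perf-queue") as sender:   # assumed queue name
            while sent < count:
                batch = sender.create_message_batch()
                while sent < count:
                    try:
                        batch.add_message(ServiceBusMessage(f"msg-{sent}"))
                        sent += 1
                    except ValueError:             # current batch is full, send and start a new one
                        break
                sender.send_messages(batch)
    return sent
```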
Receiver Function Apps (Test apps)

These are the actual apps receiving all the generated load. They are deployed in different combinations and updated rarely. Each valid combination (region + OS + runtime + SKU + scenario) gets its own function app, receiving load from ALT or SBLoadGen.

HTTP scenarios:

- HelloWorld: No-op test to measure the overhead of the system and establish a baseline.
- HTML Parser: POST with an HTML document for parsing (simulating a small CPU load).

Non-HTTP (Service Bus) scenario:

- CSV-to-JSON conversion plus blob storage operations, blending compute and I/O overhead.

Collected metrics:

- RPS: Requests per second, success/error rates, and latency distributions for HTTP workloads.
- MPPS: Messages processed per second and success/error rates for non-HTTP (e.g., Service Bus) workloads.

Data Aggregation & Dashboards

Capturing results at scale is just as important as generating load. PerfBenchV2 uses a modular data pipeline to reliably ingest and visualize metrics from both HTTP and Service Bus–based tests. All test results flow through Event Hubs, which act as an intermediary between the test infrastructure and our analytics platform. The webhook function (used with ALT) and the SBLoadGen app both emit structured logs that are routed through Event Hub streams and ingested into dedicated Azure Data Explorer (ADX) tables.

We use three main tables in ADX:

- HTTPTestResults for test runs executed via Azure Load Testing.
- SBLoadGenRuns for recording message counts and timing data from Service Bus scenarios.
- SchedulerRuns to log when and how each test was initiated.

On top of this telemetry, we’ve built custom ADX dashboards that allow us to monitor trends in latency, throughput, and error rates over time. These dashboards provide clear, actionable views into system behavior across dozens of runtimes, regions, and SKUs. Because our focus is on long-term trend analysis, rather than real-time anomaly detection, this batch-oriented approach works well and reduces operational complexity.

CI/CD Pipeline Integration

- Continuous Updates: Once a new language version or scenario is added to the runtime_version.txt or scenario.txt meta files, the pipeline regenerates the Bicep templates and deploys new receiver apps. The Test Infra Generator also updates or redeploys the needed function apps (Scheduler, SBLoadGen, or the ALT webhook) whenever their logic changes.
- Release Confidence: We run throughput tests on these new apps early and often, catching performance regressions before shipping to customers.

Challenges & Lessons Learned

Designing and running this infrastructure hasn't been easy, and we've learned many valuable lessons along the way. Here are a few:

- Exploding Matrix - Handling every runtime, OS, SKU, region, and scenario can lead to thousands of permutations. Meta files and a robust filter system help keep this under control, but it remains an ongoing effort.
- Cloud Transience - With ephemeral infrastructure, tests sometimes fail due to network hiccups or short-lived capacity constraints. We built in retries and redundancy to mitigate transient failures (a generic sketch of this pattern follows below).
- Early Adoption - PerfBench was among the first heavy “customers” of the new Flex Consumption plan. At times, we had to wait for Bicep features or platform fixes—but it gave us great insight into the plan’s real-world performance.
- Maintenance & Cleanup - When certain stacks or SKUs near end-of-life, we have to decommission their resources—this also means regular grooming of meta files and filter rules.
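The retry logic itself isn't described in the post; purely as an illustration of the kind of mitigation involved, a generic retry helper with exponential backoff might look like this (all names and limits are assumptions):

```python
# Generic retry helper with jittered exponential backoff (illustrative only; not PerfBench code).
import random
import time

def with_retries(operation, attempts: int = 5, base_delay: float = 1.0,
                 retriable=(ConnectionError, TimeoutError)):
    """Run `operation`, retrying transient failures with exponential backoff plus jitter."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retriable as err:
            if attempt == attempts:
                raise                                       # out of retries, surface the error
            delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, 0.5)
            print(f"Transient failure ({err!r}); retrying in {delay:.1f}s")
            time.sleep(delay)

# Hypothetical usage: response = with_retries(lambda: requests.get(url, timeout=10))
```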
Success Stories

- Proactive Regression Detection: PerfBench surfaced critical performance regressions early—often before they could impact customers. These insights enabled timely fixes and gave us the confidence to move forward with the General Availability of Flex Consumption.
- Production-Level Confidence: By continuously running tests across live production regions, PerfBench provided a realistic view of system behavior under load. This allowed the team to fine-tune performance, eliminate bottlenecks, and achieve improvements measured in single-digit milliseconds.
- Influencing Product Evolution: As one of the first large-scale internal adopters of the Flex Consumption plan, PerfBench served as a rigorous validation tool. The feedback it generated played a direct role in shaping feature priorities and improving platform reliability—well before broader customer adoption.

Future Directions

- Open sourcing: We are in the process of open sourcing the relevant parts of PerfBench (SBLoadGen, the Bicep templates generator, etc.).
- Production Synthetic Validation and Alerting: Adapting PerfBench’s resource generation approach for ongoing synthetic tests in production, ensuring real environments consistently meet performance SLOs. This will also open up alerting and monitoring scenarios across the production fleet.
- Expanding Trigger Coverage and Variations: Exploring additional triggers, like Storage queues or Event Hubs, to broaden test coverage, and testing different settings within the same scenario (e.g., larger payloads, concurrency changes).

Conclusion

PerfBench underscores our commitment to high-performance Azure Functions. By automating test app creation (via meta files and Bicep), orchestrating load (via ALT and SBLoadGen), and collecting data in ADX, we maintain a continuous pulse on throughput. This approach has already proven invaluable for Flex Consumption, and we’re excited to expand scenarios and triggers in the future. For more details on Flex Consumption and other hosting plans, check out the Azure Functions documentation. We hope the insights shared here spark ideas for your own large-scale performance testing needs—whether on Azure Functions or any other distributed cloud services.

Acknowledgements

We’d like to acknowledge the entire Functions Platform and Tooling teams for their foundational work in enabling this testing infrastructure. Special thanks to the Azure Load Testing (ALT) team for their continued support and collaboration. And finally, sincere appreciation to our leadership for making performance a first-class engineering priority across the stack.

Further Reading

- Azure Functions
- Azure Functions Flex Consumption Plan
- Azure Durable Functions
- Azure Functions Python Developer Reference Guide
- Azure Functions Performance Optimizer
- Example case study: GitHub and Azure Functions
- Azure Load Testing Overview
- Azure Data Explorer Dashboards

If you have any questions or want to share your own performance testing experiences, feel free to reach out in the comments!
Introducing AI-Powered Actionable Insights in Azure Load Testing

We’re excited to announce the preview of AI-powered Actionable Insights in Azure Load Testing—a new capability that helps teams quickly identify performance issues and understand test results through AI-driven analysis.

Performance testing is an essential part of ensuring application reliability and responsiveness, but interpreting the results can often be challenging. It typically involves manually correlating client-side load test telemetry with backend service metrics, which can be both time-consuming and error-prone. Actionable Insights simplifies this process by automatically analyzing test data, surfacing key issues, and offering clear, actionable recommendations—so teams can focus on fixing what matters, not sifting through raw data.

AI-powered diagnostics

Actionable Insights uses AI to detect performance issues such as latency spikes, failed requests, throughput anomalies, and resource bottlenecks. It presents insights clearly, highlighting patterns and root causes so teams can quickly understand what went wrong and how to fix it. Insights leverage telemetry from both client-side and server-side metrics, which are collected via Azure Monitor. When server-side monitoring is enabled, Azure Load Testing correlates frontend traffic patterns with backend system behavior. For example, if an increase in virtual users coincides with latency spikes in Azure Cosmos DB, the insight will highlight this relationship and suggest corrective actions—giving teams a comprehensive view of system behavior under load. You can learn how to enable server-side metrics here.

Rich, integrated experience for faster issue resolution

Actionable Insights provides a unified, intuitive experience within your test results, clearly illustrating the context of detected performance issues. By consolidating metrics, conditions, and recommendations into a single view, your team can diagnose and resolve issues faster, without switching tools or piecing data together manually.

Get Started

Actionable Insights is now available in preview. To try it out, trigger a new test run in Azure Load Testing. For best results, enable server-side metrics when configuring your test. Once the run completes, AI-powered insights will be available in the test results view—no additional setup required.

This is just the beginning. We are actively working on improving the quality of these insights and adding more capabilities. Your feedback is essential. Let us know what’s working well and where we can improve by using the thumbs-up or thumbs-down option on each generated insight in the Azure Load Testing portal. You can also share your feedback on our community. Learn more about Actionable Insights.
Optimize Azure Functions for Performance and Costs using Azure Load Testing

Performance optimizer is a tool that helps you find the optimal balance between cost and performance for your Azure Functions. It runs load tests on different configurations and recommends the best one for your app.
Announcing CI/CD Enhancements for Azure Load Testing

We are excited to announce a significant update to our Azure Load Testing service, aimed at enhancing the experience of setting up and running load tests from CI/CD systems, including Azure DevOps and GitHub. This update is a direct response to customer feedback and is designed to streamline the process, making it more efficient and user-friendly.

Key Features and Improvements

Enhanced CI/CD Integration: Developers and testers can now configure application components and the metrics to monitor directly from a CI/CD pipeline. This integration allows monitoring the application infrastructure during the test run. You can make the following changes to your load test YAML config:

```yaml
appComponents:
  - resourceId: "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/samplerg/providers/microsoft.insights/components/appComponentResource"
    resourceName: appComponentResource # Optional
    kind: web # Optional
    metrics:
      - name: "requests/duration"
        namespace: microsoft.insights/components
        aggregation: "Average"
      - name: "requests/count"
        aggregation: "Total"
        namespace: microsoft.insights/components
  - resourceId: "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/samplerg/providers/microsoft.insights/components/appComponentResource"
    resourceName: appComponentResource # Optional
    kind: web # Optional
    metrics:
      - name: "requests/duration"
        aggregation: "Average"
        namespace: microsoft.insights/components
      - name: "requests/count"
        aggregation: "Total"
        namespace: microsoft.insights/components
```

Pass/Fail Criteria on Server Metrics: Users can set pass/fail criteria on server metrics from a CI/CD pipeline, providing more granular control over test outcomes. This feature helps in maintaining high performance standards by automatically flagging any performance issues. You can make the following changes to your load test YAML config:

```yaml
failureCriteria:
  clientMetrics:
    - avg(responseTimeMs) > 300
    - percentage(error) > 50
    - getCustomerDetails: avg(latency) > 200
  serverMetrics:
    - resourceId: /subscriptions/abcdef01-2345-6789-0abc-def012345678/resourceGroups/sample-rg/providers/Microsoft.Compute/virtualMachines/sample-vm
      metricNamespace: Microsoft.Compute/virtualMachines
      metricName: Percentage CPU
      aggregation: Average
      condition: GreaterThan
      value: 80
    - resourceId: /subscriptions/abcdef01-2345-6789-0abc-def012345678/resourceGroups/sample-rg/providers/Microsoft.Compute/virtualMachines/sample-vm
      metricNamespace: Microsoft.Compute/virtualMachines
      metricName: Available Memory
      aggregation: Average
      condition: LessThan
      value: 20
```

Parameter Overrides: The ability to override parameters of a load test configuration YAML from the Azure DevOps task or GitHub Action adds flexibility and customization to the testing process.

Output Variables: The Azure DevOps task now includes output variables that can be consumed in downstream steps, jobs, and stages. This makes it possible to take further actions on the load test results within the pipeline.

Pipeline Cancellation: If a pipeline in Azure Pipelines or a workflow in GitHub is cancelled, any load test triggered by that pipeline or workflow will also be cancelled. This avoids incurring costs for unnecessary tests.

Traceability and Results Viewing: Users can trace a test run back to the pipeline that ran it, directly from the Azure portal. This provides end-to-end traceability and helps you understand which changes might have triggered a test failure.

Conclusion

These enhancements are designed to provide a more integrated and efficient load testing experience for our users.
We believe these updates will help developers, testers, and DevOps engineers better manage their load testing processes, ensuring high performance and reliability for their applications. We look forward to your feedback and are excited to see how these new features improve your CI/CD workflows. Stay tuned for more updates, and happy testing!
Azure Load Testing Celebrates Two Years with Two Exciting Announcements!

[Update on March 18, 2025: AI-powered load test generation, referred to in the third section below, is in preview now!]

Azure Load Testing (ALT) has been an essential tool for performance testing, enabling customers across industries to run thousands of tests every month. We are thrilled to celebrate its second anniversary with two major announcements. In this blog post, we will delve into the remarkable capabilities of ALT and reveal the exciting developments that will redefine load testing for you.

Why do customers love ALT?

ALT is a powerful service designed to ensure that your applications can handle high traffic and perform optimally under peak load. Here are some key features of ALT:

- Large-scale tests: Simulate over 100,000 concurrent users.
- Long-duration tests: Run tests for up to 24 hours.
- Multi-region tests: Simultaneously simulate users from any of the 20 supported regions.
- Continuous tests: Catch performance regressions early by integrating with Azure Pipelines, GitHub Actions, or other CI/CD systems.
- Comprehensive test results: Correlate server-side metrics with client-side metrics for end-to-end insights.
- Analytics and insights: Quickly and easily identify performance bottlenecks with detailed analytics.

Pricing Changes: Listening to You

We have heard your feedback and are excited to announce significant pricing changes, effective March 1, 2025:

- No monthly resource fee: We have eliminated the $10 monthly resource fee to help you save on overall costs.
- 20% price reduction: The cost per Virtual User Hour (VUH) for usage beyond 10,000 VUH is reduced from 7.5 cents to 6 cents.

Additionally, we are introducing a feature to set a consumption limit per resource. This will enable central teams, such as a Performance Center of Excellence, to effectively manage and control the costs incurred by each team. These changes reflect our commitment to making ALT more accessible and cost-effective, ensuring that you can optimize your applications without worrying about budget constraints.

Locust-Based Tests: Offering You a Choice

In another exciting development, we are delighted to announce the availability of Locust-based tests. This addition allows you to leverage the power, flexibility, and developer-friendly nature of the Python-based Locust load testing framework, in addition to the already supported Apache JMeter framework. We are also working on making it easy for you to generate tests by leveraging AI. With our integration with GitHub Copilot, you will be able to simply start with a Postman Collection or an HTTP file and leverage the copilot to generate Locust-based tests. Stay tuned!

This update opens new possibilities for you, providing a choice of load testing frameworks and making it easy to generate tests.

In Summary

As we celebrate the second anniversary, we are committed to continually improving and evolving the service to meet your needs. With the introduction of half a dozen features (1. consumption limits, 2. Locust-based tests, 3. support for multiple test files, 4. scheduling, 5. notifications, 6. support for managed identity), in addition to the pricing changes, we are confident that ALT will continue to be an indispensable tool in your performance testing arsenal. We are excited about all the 50+ updates over two years and look forward to seeing how they enhance your testing processes. Thank you for being a part of our journey, and we can't wait to see what you achieve with ALT.
If you would like to share how you were able to leverage ALT for an interesting scenario, email me at shon dot shah at microsoft dot com or post your feedback at https://aka.ms/malt-feedback. Happy load testing!
AI-Powered Load Testing in VS Code with Azure Load Testing & GitHub Copilot

There's a better way than writing load test scripts by hand. The new Azure Load Testing extension for Visual Studio Code (Preview), now integrated with GitHub Copilot, automatically generates realistic, Locust-based load tests. It seamlessly handles authentication, API request sequencing, response validation, and test data—helping you save time and ensure realistic performance testing.

With this AI-driven tool, you can:

- Instantly generate Locust test scripts from Postman collections, Insomnia collections, or .http files.
- Easily enhance tests with GitHub Copilot, like adding random data or dynamic user flows.
- Quickly iterate by running tests locally before scaling up.
- Easily execute large-scale tests in Azure Load Testing to uncover performance bottlenecks.

Spend less time wrestling with test scripts and more time optimizing your app's scalability.

Key Features

🔹 AI-Generated Locust Scripts in Seconds

Skip manual script creation. In just a few clicks, provide a Postman collection, Insomnia collection, or .http file to generate a complete load test script. Copilot then:

- Sequences API requests, passing values from one request to the next to simulate real workflows.
- Integrates authentication and headers without hardcoding credentials.
- Validates requests and handles errors gracefully with validations and logs.
- Aggregates response metrics for detailed success and failure rates.
- Cleans up test data when the test finishes.
- Produces reusable scripts you can refine locally and scale to Azure Load Testing.

🔹 Refine with AI-Powered Customization

Need to adjust your test script? Just ask:

💬 "Randomize request payloads"
💬 "Use data from a CSV file for username/password"
💬 "Retry requests for HTTP errors"
💬 "Define a custom load test shape for simulating spikes"

Copilot suggests the changes, and you can apply them with a single click—no need to manually edit the script.

🔹 Run Tests Locally for Fast Iteration

Quickly validate your Locust script in VS Code before scaling:

- Run Locust locally right from within VS Code.
- Monitor performance and errors in real time in the Locust web UI.
- Fine-tune and troubleshoot quickly.

By iterating locally, you ensure your tests work correctly before moving to large-scale testing.

🔹 Scale with One Click in Azure Load Testing

Ready for bigger tests? Move your scripts to Azure Load Testing with simple YAML-based configuration:

- Set virtual users and load parameters in VS Code.
- Choose regions to simulate distributed traffic.
- Track client and server performance metrics in real time to pinpoint bottlenecks.

Azure Load Testing can handle many thousands of concurrent users, so you can find issues before they hit production.

🔹 Secure Test Secrets with Azure Key Vault

Performance tests often require API keys or tokens. The Azure Load Testing extension integrates with Azure Key Vault for secure secret management during test runs.

🚀 Get Started

Stop struggling with manual load testing scripts. Let GitHub Copilot and Azure Load Testing do the heavy lifting, so you can focus on building scalable, high-performing apps.

🔹 Install the Azure Load Testing extension for VS Code (Preview) and start testing smarter today!
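For readers new to Locust, here is a minimal hand-written example of the kind of script these tools produce. It is illustrative only; the endpoints and validation logic are assumptions, not output of the extension:

```python
# Minimal Locust script (illustrative; not generated by the VS Code extension).
from locust import HttpUser, task, between

class WebsiteUser(HttpUser):
    wait_time = between(1, 3)   # think time between requests, in seconds

    @task
    def browse_and_search(self):
        # Basic response validation: mark non-200 responses as failures.
        with self.client.get("/", name="home", catch_response=True) as resp:
            if resp.status_code != 200:
                resp.failure(f"Unexpected status: {resp.status_code}")

        # A follow-up request in the same simulated user session (hypothetical endpoint).
        self.client.get("/products?query=chair", name="search")

# Run locally with:  locust -f locustfile.py --host https://example.contoso.com
```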