Scaling Azure Functions Python with orjson
Azure Functions now supports ORJSON in the Python worker, giving developers an easy way to boost performance by simply adding the library to their environment. Benchmarks show that ORJSON delivers measurable gains in throughput and latency, with the biggest improvements on the small-to-medium payloads common in real-world workloads. In tests, ORJSON improved throughput by up to 6% on 35 KB payloads and significantly reduced response times under load, while also eliminating dropped requests in high-throughput scenarios. With its Rust-based speed, standards compliance, and drop-in adoption, ORJSON offers a straightforward path to faster, more scalable Python Functions without any code changes.

Build Long-Running AI Agents on Azure App Service with Microsoft Agent Framework
UPDATE 10/22/2025: An alternative implementation of this sample app has been added to this blog post. The alternate version uses a WebJob for background processing instead of an in-process hosted service. WebJobs are a great alternative for background processing in App Service, providing better separation of concerns, independent restarts, and dedicated logging. To learn more about WebJobs on App Service, see the Azure App Service WebJobs documentation. The AI landscape is evolving rapidly, and with the introduction of Microsoft Agent Framework, developers now have a powerful platform for building sophisticated AI agents that go far beyond simple chat completions. These agents can execute complex, multi-step workflows with persistent state, conversation threads, and structured execution—capabilities that are essential for production AI applications. Today, we're excited to share how Azure App Service provides an excellent platform for running Agent Framework workloads, especially those involving long-running operations. Let's explore why App Service is a great choice and walk through a practical example. 
🔗 Quick link to sample app GitHub repo: https://github.com/Azure-Samples/app-service-agent-framework-travel-agent-dotnet
🔗 Quick link to WebJob sample app GitHub repo: https://github.com/Azure-Samples/app-service-agent-framework-travel-agent-dotnet-webjob

The Challenge: Long-Running Agent Framework Flows

Agent Framework enables AI agents to perform complex tasks that can take significant time to complete:
- Multi-turn reasoning: Iterative calls to large language models (LLMs) where each response informs the next prompt
- Tool integration: Function calling and external API interactions for real-time data
- Complex processing: Budget calculations, content optimization, multi-phase generation
- Persistent context: Maintaining conversation state across multiple interactions

These workflows often take 30 seconds to several minutes to complete, far too long for synchronous HTTP request handling. Traditional web applications run into several constraints:
⏱️ Timeout Limitations: HTTP requests have timeout constraints (typically 30-230 seconds)
⚠️ Connection Issues: Clients may disconnect due to network interruptions or browser navigation
📈 Scalability Concerns: Long-running requests block worker threads and don't survive app restarts
🎯 Poor User Experience: Users see endless loading spinners with no progress feedback

The Solution: Async Pattern with App Service

Azure App Service provides a robust solution through the asynchronous request-reply pattern combined with background processing:
- API immediately returns (202 Accepted) with a task ID
- Background worker processes the Agent Framework workflow
- Client polls for status with real-time progress updates
- Durable state storage (Cosmos DB) maintains task status and results

This pattern ensures:
✅ No HTTP timeouts: the API responds in milliseconds
✅ Resilient to restarts: state survives deployments and scale events
✅ Progress tracking: users see real-time updates (10%, 45%, 100%)
✅ Better scalability: background workers process independently
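To make the flow concrete, here is a minimal sketch of the asynchronous request-reply pattern in Python. It is not the sample app's code (the sample is written in .NET), and everything named here is an assumption for illustration: an in-memory dict stands in for the durable status store (Cosmos DB in the article), a `queue.Queue` stands in for the message broker (Service Bus in the article), and `submit_task`, `get_status`, `worker`, and `run_demo` are hypothetical names.

```python
import queue
import threading
import time
import uuid

# Hypothetical in-memory stand-ins: a dict plays the role of the durable
# status store and queue.Queue plays the role of the message broker.
task_store = {}
task_queue = queue.Queue()
store_lock = threading.Lock()


def submit_task(request):
    """API side: record the task, enqueue it, and return at once (HTTP 202)."""
    task_id = str(uuid.uuid4())
    with store_lock:
        task_store[task_id] = {"status": "queued", "progress": 0, "result": None}
    task_queue.put((task_id, request))
    return 202, {"taskId": task_id}


def get_status(task_id):
    """API side: a snapshot of what a polling client would receive."""
    with store_lock:
        return dict(task_store[task_id])


def update(task_id, **fields):
    """Write a status change to the store under the lock."""
    with store_lock:
        task_store[task_id].update(fields)


def worker():
    """Background side: drain the queue and report progress as work advances."""
    while True:
        item = task_queue.get()
        if item is None:  # sentinel used to stop the worker
            break
        task_id, request = item
        update(task_id, status="processing", progress=10)
        time.sleep(0.01)  # placeholder for the long-running agent workflow
        update(task_id, progress=45)
        time.sleep(0.01)
        update(task_id, status="completed", progress=100,
               result=f"itinerary for {request['destination']}")


def run_demo():
    """Submit one task, poll until it completes, then stop the worker."""
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    code, body = submit_task({"destination": "Lisbon"})
    # Client side: poll until the task reaches a terminal state.
    while get_status(body["taskId"])["status"] != "completed":
        time.sleep(0.005)
    task_queue.put(None)
    thread.join()
    return code, get_status(body["taskId"])


if __name__ == "__main__":
    print(run_demo())
```

The point of the sketch is that `submit_task` never waits on the work itself, so the caller gets its task ID immediately. In a real deployment the store and queue would need to be durable services so that task state survives restarts, which an in-memory dict does not.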
NOTE! This pattern can be implemented with either an in-process BackgroundService or as a separate WebJob process.

Deployment Patterns: BackgroundService vs WebJob

The following compares the two deployment options for this implementation.

BackgroundService Pattern:
✅ Simpler deployment (single project)
✅ Shared process and memory
✅ Good for moderate workloads
⚠️ API and worker restart together

WebJob Pattern (alternative):
✅ Separate processes (API + WebJob)
✅ Independent restart without API downtime
✅ Dedicated WebJob monitoring in portal
✅ Better for production operations
⚠️ Slightly more complex deployment (manual WebJob upload)

Either of these options is a great way to get started with implementing long-running processes on App Service. To learn more about WebJobs on App Service, see the Azure App Service WebJobs documentation.

Rapid Innovation Support

The AI landscape is changing at an unprecedented pace. New models, frameworks, and capabilities are released constantly. Azure App Service's managed platform ensures your applications can adapt quickly without infrastructure rewrites:
- Framework Updates: Deploy new Agent Framework SDK versions like any application update
- Model Upgrades: Switch between GPT-4, GPT-4o, or future models with configuration changes
- Scaling Patterns: Start with combined API+worker, split into separate apps as needs grow
- New Capabilities: Integrate emerging AI services without changing hosting infrastructure

App Service handles the platform complexity so you can focus on building great AI experiences.

Sample Application: AI Travel Planner

To demonstrate this pattern, we've built a Travel Planner application that uses Agent Framework to generate detailed, multi-day travel itineraries.
The agent performs complex reasoning including: Researching destination attractions and activities Optimizing daily schedules based on location proximity Calculating detailed budget breakdowns Generating personalized travel tips and recommendations The entire application runs on a single P0v4 App Service with both the API and background worker combined—showcasing App Service's flexibility for hosting diverse workload patterns in one deployment. Key Architecture Components Azure App Service (P0v4 Premium) Hosts both REST API and background worker in a single app "Always On" feature keeps background worker running continuously Managed identity for secure, credential-less authentication Azure Service Bus Decouples API from long-running Agent Framework processing Reliable message delivery with automatic retries Dead letter queue for error handling Azure Cosmos DB Stores task status with real-time progress updates Automatic 24-hour TTL for cleanup Rich query capabilities for complex itinerary data Azure AI Foundry Hosts persistent agents with conversation threads Structured execution with Agent Framework runtime GPT-4o model for intelligent travel planning One of the powerful features of using Azure AI Foundry with Agent Framework is the ability to inspect agents and conversation threads directly in the Azure portal. This provides valuable visibility into what's happening during execution. Viewing Agents and Threads in Azure AI Foundry When you submit a travel plan request, the application creates an agent in Azure AI Foundry. You can navigate to your AI Foundry project in the Azure portal to see: Agents The application creates an agent for each request Important: Agents are **automatically deleted** after the itinerary is generated to keep your project clean Tip: You'll need to be quick! 
Navigate to Azure AI Foundry right after submitting a request to see the agent in action Once processing completes, the agent is removed as part of the cleanup process Conversation Threads Unlike agents, threads persist even after the agent completes You can view the complete conversation history at any time See the exact prompts sent to the model and the responses generated Useful for debugging, understanding agent behavior, and improving prompts The ephemeral nature of agents (created per request, deleted after completion) keeps your Azure AI Foundry project clean while the persistent threads give you full traceability of every interaction. Alternative Architecture: WebJob Pattern The alternate version of this app uses a WebJob for background processing instead of an in-process hosted service. However, just a single App Service is still required. WebJobs are a great alternative for background processing in App Service, providing better separation of concerns, independent restarts, and dedicated logging. To learn more about WebJobs on App Service, see the Azure App Service WebJobs documentation. Get Started Today The complete Travel Planner application is available as a reference implementation so you can quickly get started building your own apps with Agent Framework on App Service. Try one or both of these today! 
🔗 GitHub Repository for background process version: https://github.com/Azure-Samples/app-service-agent-framework-travel-agent-dotnet
🔗 GitHub Repository for WebJob version: https://github.com/Azure-Samples/app-service-agent-framework-travel-agent-dotnet-webjob

The repo includes:
- Complete .NET 9 source code with Agent Framework integration
- Infrastructure as Code (Bicep) for automated deployment
- Web UI with real-time progress tracking
- Comprehensive README with deployment instructions

Deploy in minutes:

    git clone https://github.com/Azure-Samples/app-service-agent-framework-travel-agent-dotnet.git
    cd app-service-agent-framework-travel-agent-dotnet
    azd auth login
    azd up

IMPORTANT! For the WebJob version, you will also need to manually deploy the WebJob. See the instructions in the README to learn how to do this.

Key Takeaways

✅ Agent Framework enables sophisticated AI agents beyond simple chat completions
✅ Long-running workflows (30s-minutes) require async patterns to avoid timeouts
✅ App Service provides a simple, cost-effective platform for these workloads
✅ Async request-reply pattern with Service Bus + Cosmos DB ensures reliability
✅ Rapid innovation in AI is supported by App Service's adaptable platform

Whether you're building travel planners, document processors, research assistants, or other AI-powered applications, Azure App Service gives you the flexibility and reliability you need, without the complexity of container orchestration or function programming models.

What's Next? Build on This Foundation

This Travel Planner is just the starting point: a foundation to help you understand the patterns and architecture. Agent Framework is designed to grow with your needs, making it easy to add sophisticated capabilities with minimal effort:

🛠️ Add Tool Calling
Connect your agent to real-time APIs for weather, flight prices, hotel availability, and actual booking systems. Agent Framework's built-in tool calling makes this straightforward.
🤝 Implement Multi-Agent Systems
Create specialized agents (flight expert, hotel specialist, activity planner) that collaborate to build comprehensive travel plans. Agent Framework handles the orchestration.

🧠 Enhance with RAG
Add retrieval-augmented generation to give your agent deep knowledge of destinations, local customs, and insider tips from your own content library.

📊 Expand Functionality
- Real-time pricing and availability
- Interactive refinement based on user feedback
- Personalized recommendations from past trips
- Multi-language support for global users

The beauty of Agent Framework is that these advanced features integrate seamlessly into the pattern we've built. Start with this sample, explore the Agent Framework documentation, and unlock powerful AI capabilities for your applications!

Learn More
- Microsoft Agent Framework Documentation
- Azure App Service Documentation
- Async Request-Reply Pattern
- Azure App Service WebJobs documentation

Have you built AI agents on App Service? We'd love to hear about your experience! Share your thoughts in the comments below. Questions about Agent Framework on App Service? Drop a comment and our team will help you get started.

Announcing the Public Preview of the New App Service Quota Self-Service Experience
Update 9/15/2025: The App Service Quota Self-Service experience has been temporarily taken offline to incorporate feedback received during this public preview. As this is public preview, availability and features are subject to change as we receive and incorporate feedback. We will post another update when the self-serve experience is available once more. In the meantime, if you require assistance, please file a support ticket following the guidance at the bottom of this post in the Filing a Support Ticket section. We appreciate your patience while we work to build the best experience possible for this scenario. What’s New? The updated experience introduces a dedicated App Service Quota blade in the Azure portal, offering a streamlined and intuitive interface to: View current usage and limits across the various SKUs Set custom quotas tailored to your App Service plan needs This new experience empowers developers and IT admins to proactively manage resources, avoid service disruptions, and optimize performance. Quick Reference - Start here! If your deployment requires quota for ten or more subscriptions, then file a support ticket with problem type Quota following the instructions at the bottom of this post. If any subscription included in your request requires zone redundancy, then file a support ticket with problem type Quota following the instructions at the bottom of this post. Otherwise, leverage the new self-service experience to increase your quota automatically. Self-service Quota Requests For non-zone-redundant needs, quota alone is sufficient to enable App Service deployment or scale-out. Follow the provided steps to place your request. 1. Navigate to the Quotas resource provider in the Azure portal 2. Select App Service Navigating the primary interface: Each App Service VM size is represented as a separate SKU. 
If the intention is to be able to scale up or down within a specific offering (e.g., Premium v3), then an equivalent number of VMs needs to be requested for each applicable size of that offering (e.g., request 5 instances for both P1v3 and P3v3). As with other quotas, you can filter by region, subscription, provider, or usage. You can also group the results by usage, quota (App Service VM type), or location (region). Current usage is represented as App Service VMs. This allows you to quickly identify which SKUs are nearing their quota limits. Adjustments can be made inline: there is no need to visit another page. This is covered in detail in the next section.

3. Request quota adjustments
Clicking the pen icon opens a flyout window to capture the quota request: The quota type (App Service SKU) is already populated, along with current usage. Note that your request is not incremental: you must specify the new limit that you wish to see reflected in the portal. For example, to request two additional instances of P1v2 VMs, you would file the request like this: Click submit to send the request for automatic processing.

How quota approvals work: Immediately upon submitting a quota request, you will see a processing dialog like the one shown: If the quota request can be automatically fulfilled, then no support request is needed. You should receive this confirmation within a few minutes of submission: If the request cannot be automatically fulfilled, then you will be given the option to file a support request with the same information. In the example below, the requested new limit exceeds what can be automatically granted for the region:

4. If applicable, create support ticket
When creating a support ticket, you will need to repopulate the Region and App Service plan details; the new limit has already been populated for you.
If you forget the region or SKU that was requested, you can reference them in your notifications pane: If you choose to create a support ticket, then you will interact with the capacity management team for that region. This is a 24x7 service, so requests may be created at any time. Once you have filed the support request, you can track its status via the Help + support dashboard. Known issues The self-service quota request experience for App Service is in public preview. Here are some caveats worth mentioning while the team finalizes the release for general availability: Closing the quota request flyout window will stop meaningful notifications for that request. You can still view the outcome of your quota requests by checking actual quota, but if you want to rely on notifications for alerts, then we recommend leaving the quota request window open for the few minutes that it is processing. Some SKUs are not yet represented in the quota dashboard. These will be added later in the public preview. The Activity Log does not currently provide a meaningful summary of previous quota requests and their outcomes. This will also be addressed during the public preview. As noted in the walkthrough, the new experience does not enable zone-redundant deployments. Quota is an inherently regional construct, and zone-redundant enablement requires a separate step that can only be taken in response to a support ticket being filed. Quota API documentation is being drafted to enable bulk non-zone redundant quota requests without requiring you to file a support ticket. Filing a Support Ticket If your deployment requires zone redundancy or contains many subscriptions, then we recommend filing a support ticket with issue type "Technical" and problem type "Quota": We want your feedback! 
If you notice any aspect of the experience that does not work as expected, or you have feedback on how to make it better, please use the comments below to share your thoughts!

Beyond the Desktop: The Future of Development with Microsoft Dev Box and GitHub Codespaces
The modern developer platform has already moved past the desktop. We’re no longer defined by what’s installed on our laptops; instead, we look at what tooling we can use to move from idea to production. An organisation's developer platform strategy is no longer a nice-to-have. It sets the ceiling for what’s possible: an organisation can’t iterate its way to developer nirvana if the foundation itself is brittle. A great developer platform shrinks TTFC (time to first commit), accelerates release velocity, and maybe most importantly, helps alleviate the everyday frictions that lead to developer burnout. Very few platforms deliver everything an organization needs from a developer platform in one product. Modern development spans multiple dimensions: local tooling, cloud infrastructure, compliance, security, cross-platform builds, collaboration, and rapid onboarding. Organizations are then left to either compromise on one or more of these areas or force developers into rigid environments that slow productivity and innovation. This is where Microsoft Dev Box and GitHub Codespaces come into play. On their own, each addresses critical parts of the modern developer platform: Microsoft Dev Box provides a full, managed cloud workstation. Dev Box gives developers a consistent, high-performance environment while letting central IT apply strict governance and control. Internally at Microsoft, we estimate that usage of Dev Box by our development teams delivers savings of 156 hours per year per developer purely on local environment setup and upkeep. We have also seen significant gains in other key SPACE metrics, reducing context-switching friction and improving build/test cycles. Although the benefits of Dev Box are clear in the results demonstrated by our customers, it is not without its challenges. The biggest challenge often faced by Dev Box customers is its lack of native Linux support.
At the time of writing, and for the foreseeable future, Dev Box does not support native Linux developer workstations. While WSL2 provides partial parity, I know from my own engineering projects that it still does not deliver the full experience. This is where GitHub Codespaces comes into this story. GitHub Codespaces delivers instant, Linux-native environments spun up directly from your repository. It’s lightweight, reproducible, and ephemeral, which makes it ideal for rapid iteration, PR testing, and cross-platform development where you need Linux parity or containerized workflows. Unlike Dev Box, Codespaces can run fully in Linux, giving developers access to native tools, scripts, and runtimes without workarounds. It also removes much of the friction around onboarding: a new developer can open a repository and be coding in minutes, with the exact environment defined by the project’s devcontainer.json. That said, Codespaces isn’t a complete replacement for a full workstation. While it’s perfect for isolated project work or ephemeral testing, it doesn’t provide the persistent, policy-controlled environment that enterprise teams often require for heavier workloads or complex toolchains. Used together, they fill the gaps that neither can cover alone: Dev Box gives the enterprise-grade foundation, while Codespaces provides the agile, cross-platform sandbox. For organizations, this pairing sets a higher ceiling for developer productivity, delivering a truly hybrid, agile and well-governed developer platform.

Better Together: Dev Box and GitHub Codespaces in action

Together, Microsoft Dev Box and GitHub Codespaces deliver a hybrid developer platform that combines consistency, speed, and flexibility. Teams can spin up full, policy-compliant Dev Box workstations preloaded with enterprise tooling, IDEs, and local testing infrastructure, while Codespaces provides ephemeral, Linux-native environments tailored to each project.
One of my favourite use cases is having local testing setups, like a Docker Swarm cluster, ready to go in either Dev Box or Codespaces. New developers can jump in and start running services or testing microservices immediately, without spending hours on environment setup. Anecdotally, my time to first commit and time to delivering “impact” has been significantly faster on projects where one or both technologies provide local development services out of the box. Switching between Dev Boxes and Codespaces is seamless: every environment keeps its own libraries, extensions, and settings intact, so developers can jump between projects without reconfiguring or breaking dependencies. The result is a turnkey, ready-to-code experience that maximizes productivity, reduces friction, and lets teams focus entirely on building, testing, and shipping software.

To showcase this value, I thought I would walk through an example scenario that simulates a typical modern developer workflow. Let's look at a day in the life of a developer on this hybrid platform building an IoT project using Python and React.
- Spin up a ready-to-go workstation (Dev Box) for Windows development and heavy builds.
- Launch a Linux-native Codespace for cross-platform services, ephemeral testing, and PR work.
- Run "local" testing like a Docker Swarm cluster, database, and message queue ready to go out of the box.
- Switch seamlessly between environments without losing project-specific configurations, libraries, or extensions.

9:00 AM – Morning Kickoff on Dev Box

I start my day on my Microsoft Dev Box, which gives me a fully configured Windows environment with VS Code, design tools, and Azure integrations. I select my team's project, and the environment is pre-configured for me through the Dev Box catalogue. Fortunately for me, it's already provisioned. I could always self-service another one using the "New Dev Box" button if I wanted to.
I'll connect through the browser, but I could use the desktop app too if I wanted to.

My tasks are:
- Prototype a new dashboard widget for monitoring IoT device temperature.
- Use GUI-based tools to tweak the UI and preview changes live.
- Review my Visio architecture.
- Join my morning stand-up.
- Write documentation notes and plan API interactions for the backend.

In a flash, I have access to my modern work tooling like Teams, this project's files are already preloaded, and all my peripherals are working without additional setup. The only downside was that I did seem to be the only person on my stand-up this morning?

Why Dev Box first:
- GUI-heavy tasks are fast and responsive. Dev Box’s environment allows me to use a full desktop.
- Great for early-stage design, planning, and visual work.
- Enterprise apps are ready for me to use out of the box (P.S. It also supports my multi-monitor setup).

I use my Dev Box to make a very complicated change to my IoT dashboard: changing the title from "IoT Dashboard" to "Owain's IoT Dashboard". I preview this change live in a browser. (Time for a coffee after this hard work.) The rest of the dashboard isn't loading as my backend isn't running... yet.

10:30 AM – Switching to Linux Codespaces

Once the UI is ready, I push the code to GitHub and spin up a Linux-native GitHub Codespace for backend development.

Tasks:
- Implement FastAPI endpoints to support the new IoT feature.
- Run the service on my Codespace and debug any errors.

Why Codespaces now:
- Linux-native tools ensure compatibility with the production server.
- Docker and containerized testing run natively, avoiding WSL translation overhead.
- The environment is fully reproducible across any device I log in from.

12:30 PM – Midday Testing & Sync

I toggle between Dev Box and Codespaces to test and validate the integration. I do this in my Dev Box Edge browser viewing my Codespace (I use my Codespace in a browser through this demo to highlight the difference in environments.
In reality I would leverage the VS Code "Remote Explorer" extension and its GitHub Codespaces integration to use my Codespace from within my own desktop VS Code, but that is personal preference) and I use the same browser to view my frontend preview. I update the environment variable for my frontend, which is running locally in my Dev Box, and point it at the port running my API in my Codespace. In this case it was a WebSocket connection and HTTPS calls to port 8000. I can make this public by changing the port visibility in my Codespace.

    https://fluffy-invention-5x5wp656g4xcp6x9-8000.app.github.dev/api/devices
    wss://fluffy-invention-5x5wp656g4xcp6x9-8000.app.github.dev/ws

This allows me to:
- Preview the frontend widget on Dev Box, connecting to the backend running in Codespaces.
- Make small frontend adjustments in Dev Box while monitoring backend logs in Codespaces.
- Commit changes to GitHub, keeping both environments in sync and leveraging my CI/CD for deployment to the next environment.

We can see the Dev Box running the local frontend and the Codespace running the API connected to each other, making requests and displaying the data in the frontend!

Hybrid advantage:
- Dev Box handles GUI previews comfortably and allows me to live test frontend changes.
- Codespaces handles production-aligned backend testing and Linux-native tools.
- Dev Box allows me to view all of my files on one screen, with potentially multiple Codespaces running in the browser or VS Code Desktop.

Thanks to all of those platform efficiencies, I have completed my day's goals within an hour or two, and now I can spend the rest of my day learning about how to enable my developers to inner source using GitHub Copilot and MCP (shameless plug).

The bottom line

There are some additional considerations when architecting a developer platform for an enterprise, such as private networking and security, that are not covered in this post, but these are implementation details to deliver the described developer experience.
Architecting such a platform is a valuable investment to deliver the developer platform foundations we discussed at the top of the article. While in this quickly built demo I was working in a mono repository, in real engineering teams it is likely (I hope) that an application is built from many different repositories. The great thing about Dev Box and Codespaces is that this wouldn’t slow down the rapid development I can achieve when using both. My Dev Box would be specific to the project or development team, preloaded with all the tools I need and potentially some repos too! When I need to, I can quickly switch over to Codespaces, work in a clean, isolated environment, and push my changes. In both cases, any changes I want to deliver locally are pushed into GitHub (or ADO) and merged, and my CI/CD ensures that my next step, potentially a staging environment or, who knows, perhaps *whispering* straight into production, is taken care of. Once I’m finished, I delete my Codespace, and potentially my Dev Box if I am done with the project, knowing I can self-service either one of these anytime and be up and running again! Now, is there overlap in terms of what can be developed in a Codespace vs what can be developed in Azure Dev Box? Of course, but as organisations prioritise developer experience to ensure release velocity while maintaining organisational standards and governance, providing developers with a Windows-native and a Linux-native service, both of which are primarily charged on the consumption of the compute*, is a no-brainer. There are also gaps that neither fills at the moment: for example, Microsoft Dev Box only provides Windows compute, while GitHub Codespaces only supports VS Code as your chosen IDE. It's not a question of which service to choose for my developers: these two services are better together!

* Changes have been announced to Dev Box pricing. A W365 license is already required today and dev boxes will continue to be managed through Azure.
For more information please see: Microsoft Dev Box capabilities are coming to Windows 365 - Microsoft Dev Box | Microsoft Learn

Deployment and Build from Azure Linux based Web App
TOC Introduction Deployment Sources From Laptop From CI/CD tools Build Source From Oryx Build From Runtime From Deployment Sources Walkthrough Laptop + Oryx Laptop + Runtime Laptop CI/CD concept Conclusion 1. Introduction Deployment on Azure Linux Web Apps can be done through several different methods. When a deployment issue occurs, the first step is usually to identify which method was used. The core of these methods revolves around the concept of Build, the process of preparing and loading the third-party dependencies required to run an application. For example, a Python app defines its build process as pip install packages, a Node.js app uses npm install modules, and PHP or Java apps rely on libraries. In this tutorial, I’ll use a simple Python app to demonstrate four different Deployment/Build approaches. Each method has its own use cases and limitations. You can even combine them, for example, using your laptop as the deployment tool while still using Oryx as the build engine. The same concepts apply to other runtimes such as Node.js, PHP, and beyond. 2. Deployment Sources From Laptop Scenarios: Setting up a proof of concept Developing in a local environment Advantages: Fast development cycle Minimal configuration required Limitations: Difficult for the local test environment to interact with cloud resources OS differences between local and cloud environments may cause integration issues From CI/CD tools Scenarios: Projects with established development and deployment workflows Codebases requiring version control and automation Advantages: Developers can focus purely on coding Automatic deployment upon branch commits Limitations: Build and runtime environments may still differ slightly at the OS level 3. Build Source From Oryx Build Scenarios: Offloading resource-intensive build tasks from your local or CI/CD environment directly to the Azure Web App platform, reducing local computing overhead. 
Advantages:
- Minimal extra configuration
- Multi-language support
Limitations:
- Build performance is limited by the App Service SKU and may face performance bottlenecks
- The build environment may differ from the runtime environment, so apps sensitive to minor package versions should take caution

From Runtime
Scenarios:
- When you want the benefits and pricing of a PaaS solution but need control similar to an IaaS setup
Advantages:
- Build occurs in the runtime environment itself
- Allows greater flexibility for low-level system operations
Limitations:
- Certain system-level settings (e.g., NTP time sync) remain inaccessible

From Deployment Sources
Scenarios:
- Pre-package all dependencies and deploy them together, eliminating the need for a separate build step.
Advantages:
- Supports proprietary or closed-source company packages
Limitations:
- Incompatibility may arise if the development and runtime environments differ significantly in OS or package support

| Type | Method | Scenario | Advantage | Limitation |
|------|--------|----------|-----------|------------|
| Deployment | From Laptop | POC / Dev | Fast setup | Poor cloud link |
| Deployment | From CI/CD | Auto pipeline | Focus on code | OS mismatch |
| Build | From Oryx | Platform build | Simple, multi-lang | Performance cap |
| Build | From Runtime | High control | Flexible ops | Limited access |
| Build | From Deployment | Pre-built deploy | Use private pkg | Env mismatch |

4. Walkthrough

Laptop + Oryx

Add Environment Variables

    SCM_DO_BUILD_DURING_DEPLOYMENT=false

(Purpose: prevents the deployment environment from packaging during publish; this must also be set in the deployment environment itself.)

    WEBSITE_RUN_FROM_PACKAGE=false

(Purpose: tells Azure Web App not to run the app from a prepackaged file.)

    ENABLE_ORYX_BUILD=true

(Purpose: allows the Azure Web App platform to handle the build process automatically after a deployment event.)

Add startup command

    bash /home/site/wwwroot/run.sh

(The run.sh file corresponds to the script in your project code.)

Check sample code

requirements.txt: defines Python packages (similar to package.json in Node.js).

    Flask==3.0.3
    gunicorn==23.0.0

app.py: main Python application code.

    from flask import Flask

    app = Flask(__name__)

    @app.route("/")
    def home():
        return "Deploy from Laptop + Oryx"

    if __name__ == "__main__":
        # Local development entry point; on App Service, run.sh starts gunicorn instead.
        app.run(host="0.0.0.0", port=8000)

run.sh: script used to start the application.

    #!/bin/bash
    gunicorn --bind=0.0.0.0:8000 app:app

.deployment: VS Code deployment configuration file.

    [config]
    SCM_DO_BUILD_DURING_DEPLOYMENT=false

Deployment

Once both the deployment and build processes complete successfully, you should see the expected result.

Laptop + Runtime

Add Environment Variables
(Screenshots omitted since the process is similar to previous steps)

SCM_DO_BUILD_DURING_DEPLOYMENT=false
Purpose: Prevents the deployment environment from packaging during the publishing process. This setting must also be added in the deployment environment itself.

WEBSITE_RUN_FROM_PACKAGE=false
Purpose: Instructs Azure Web App not to run the application from a prepackaged file.

ENABLE_ORYX_BUILD=false
Purpose: Ensures that Azure Web App does not perform any build after deployment; all build tasks are instead handled by the startup script.

Add Startup Command
(Screenshots omitted since the process is similar to previous steps)

    bash /home/site/wwwroot/run.sh

(The run.sh file corresponds to the script of the same name in your project code.)

Check Sample Code
(Screenshots omitted since the process is similar to previous steps)

requirements.txt: Defines Python packages (similar to package.json in Node.js).

    Flask==3.0.3
    gunicorn==23.0.0

app.py: The main Python application code.

    from flask import Flask

    app = Flask(__name__)

    @app.route("/")
    def home():
        return "Deploy from Laptop + Runtime"

    if __name__ == "__main__":
        # Local development entry point; on App Service, run.sh starts gunicorn instead.
        app.run(host="0.0.0.0", port=8000)

run.sh: Startup script. In addition to launching the app, it creates a virtual environment and installs dependencies. All build-related tasks happen here.

    #!/bin/bash
    # Build step: create the virtual environment and install dependencies at startup
    python -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt
    # Run step: start the app
    gunicorn --bind=0.0.0.0:8000 app:app

.deployment: VS Code deployment configuration file.

    [config]
    SCM_DO_BUILD_DURING_DEPLOYMENT=false

Deployment
(Screenshots omitted since the process is similar to previous steps)

Once both deployment and build are completed, you should see the expected output.

Laptop

Add Environment Variables
(Screenshots omitted as the process is similar to previous steps)

SCM_DO_BUILD_DURING_DEPLOYMENT=false
Purpose: Prevents the deployment environment from packaging during publish. This must also be set in the deployment environment itself.

WEBSITE_RUN_FROM_PACKAGE=false
Purpose: Instructs Azure Web App not to run the app from a prepackaged file.

ENABLE_ORYX_BUILD=false
Purpose: Prevents Azure Web App from building after deployment. In this scenario the build is performed locally before deployment, and the startup script only activates the resulting environment.

Add Startup Command
(Screenshots omitted as the process is similar to previous steps)

    bash /home/site/wwwroot/run.sh

(The run.sh corresponds to the same-named file in your project code.)

Check Sample Code
(Screenshots omitted as the process is similar to previous steps)

requirements.txt: Defines Python packages (like package.json in Node.js).

    Flask==3.0.3
    gunicorn==23.0.0

app.py: The main Python application.

    from flask import Flask

    app = Flask(__name__)

    @app.route("/")
    def home():
        return "Deploy from Laptop"

    if __name__ == "__main__":
        # Local development entry point; on App Service, run.sh starts gunicorn instead.
        app.run(host="0.0.0.0", port=8000)

run.sh: The startup script. In addition to launching the app, it activates an existing virtual environment. The creation of that environment and installation of dependencies will occur in the next section.

    #!/bin/bash
    # Activate the virtual environment that was built locally and deployed with the app
    source venv/bin/activate
    gunicorn --bind=0.0.0.0:8000 app:app

.deployment: VS Code deployment configuration file.

    [config]
    SCM_DO_BUILD_DURING_DEPLOYMENT=false

Deployment

Before deployment, you must perform a local build process.
Run commands locally (depending on the language, usually for installing dependencies).

    python -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt

After completing the local build, deploy your app. Once deployment finishes, you should see the expected result.

CI/CD concept

For example, when using Azure DevOps (ADO) as your CI/CD tool, its behavior conceptually mirrors deploying directly from a laptop, but with enhanced automation, governance, and reproducibility. Essentially, ADO pipelines translate your manual local deployment steps into codified, repeatable workflows defined in a YAML pipeline file and executed by Microsoft-hosted or self-hosted agents.

A typical azure-pipelines.yml defines the stages (e.g., build, deploy) and their corresponding jobs and steps. Each stage runs on a specified VM image (e.g., ubuntu-latest) and executes the same commands, such as npm install or pip install, that you would normally run on your laptop.

The ADO pipeline acts as your automated laptop: every build command, environment variable, and deployment step you would normally execute locally is simply formalized in YAML. Whether you build inline, use Oryx, or deploy pre-built artifacts, the underlying concept remains identical: compile, package, and deliver code to Azure. The distinction lies in who performs it.

5. Conclusion

Different deployment and build methods lead to different debugging and troubleshooting approaches. Understanding the selected deployment method and its corresponding troubleshooting process is therefore an essential skill for every developer and DevOps engineer.

From Timeouts to Triumph: Optimizing GPT-4o-mini for Speed, Efficiency, and Reliability
The Challenge

Large-scale generative AI deployments can stretch system boundaries, especially when thousands of concurrent requests require both high throughput and low latency. In one such production environment, GPT-4o-mini deployments running under Provisioned Throughput Units (PTUs) began showing sporadic 408 (timeout) and 429 (throttling) errors. Requests that normally completed in seconds were occasionally hitting the 60-second timeout window, causing degraded experiences and unnecessary retries. Initial suspicion pointed toward PTU capacity limitations, but deeper telemetry revealed a different cause.

What the Data Revealed

Using Azure Data Explorer (Kusto), API Management (APIM) logs, and OpenAI billing telemetry, a detailed investigation uncovered several insights:
- Latency was not correlated with PTU utilization: PTU resources were healthy and performing within SLA even during spikes.
- Time-Between-Tokens (TBT) stayed consistently low (~8–10 ms): The model was generating tokens steadily.
- Excessive token output was the real bottleneck: Requests generating 6K–8K tokens simply required more time than allowed in the 60-second completion window.

In short, the model wasn't slow; the workload was oversized.

The Optimization Opportunity

The analysis opened a broader optimization opportunity:
- Balance token length with throughput targets.
- Introduce architectural patterns to prevent timeout or throttling cascades under load.
- Enforce automatic token governance instead of relying on client-side discipline.

The Solution

Three engineering measures delivered immediate impact: token optimization, spillover routing, and policy enforcement.

Right-size the Token Budget
- Empirical throughput for GPT-4o-mini: ~33 tokens/sec → ~2K tokens in 60s.
- Enforced max_tokens = 2000 for synchronous requests.
- Enabled streaming responses for longer outputs, allowing incremental delivery without hitting timeout limits.
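
The token-budget arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function names (max_tokens_for_timeout, generation_time_sec, clamp_max_tokens) are hypothetical helpers, not part of any Azure or OpenAI SDK, and the sketch assumes generation speed is roughly constant at the ~33 tokens/sec figure quoted above.

```python
import math
from typing import Optional


def max_tokens_for_timeout(tokens_per_sec: float, timeout_sec: float) -> int:
    """Largest completion length that can finish inside the timeout window."""
    return math.floor(tokens_per_sec * timeout_sec)


def generation_time_sec(num_tokens: int, tokens_per_sec: float) -> float:
    """Approximate time needed to generate num_tokens at a steady rate."""
    return num_tokens / tokens_per_sec


def clamp_max_tokens(requested: Optional[int], budget: int) -> int:
    """Cap a client-requested max_tokens at the budget (hypothetical helper)."""
    if requested is None:
        return budget
    return min(requested, budget)


THROUGHPUT = 33.0  # tokens/sec, the empirical figure quoted in the post
TIMEOUT = 60.0     # seconds, the completion window from the post

budget = max_tokens_for_timeout(THROUGHPUT, TIMEOUT)
print(f"Budget within {TIMEOUT:.0f}s: {budget} tokens")

# The oversized requests described above (6K-8K tokens) cannot fit the window.
for n in (6000, 8000):
    print(f"{n} tokens need about {generation_time_sec(n, THROUGHPUT):.0f}s")

# Synchronous requests are capped at the enforced limit of 2000 tokens.
print(clamp_max_tokens(8000, 2000))
```

At 33 tokens/sec the 60-second window holds 1,980 tokens, which is where the rounded 2,000-token cap comes from, while a 6K–8K token response would need roughly three to four minutes. That gap is why streaming is the better fit for longer outputs.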
Enable Spillover for Continuity
- Implemented multi-region spillover using Azure Front Door and APIM Premium gateways.
- When PTU queues reached capacity or 429s appeared, requests were routed to Standard deployments in secondary regions.
- The result: graceful degradation and uninterrupted user experience.

Govern with APIM Policies
- Added inbound policies to inspect and adjust max_tokens dynamically.
- On 408/429 responses, APIM retried and rerouted traffic based on spillover logic.

The Results

After optimization, improvements were immediate and measurable:
- Latency Reduction: Significant improvement in end-to-end response times across high-volume workloads.
- Reliability Gains: 408/429 errors fell from >1% to near zero.
- Cost Efficiency: Average token generation decreased by ~60%, reducing per-request costs.
- Scalability: Spillover routing ensured consistent performance during regional or capacity surges.
- Governance: APIM policies established a reusable token-control framework for future AI workloads.

Lessons Learned
- Latency isn't always about capacity: Investigate workload patterns before scaling hardware.
- Token budgets define the user experience: Over-generation can quietly break SLA compliance.
- Design for elasticity: Spillover and multi-region routing maintain continuity during spikes.
- Measure everything: Combine KQL telemetry with latency and token tracking for faster diagnostics.

The Outcome

By applying data-driven analysis, architectural tuning, and automated governance, the team turned an operational bottleneck into a model of consistent, scalable performance. The result: Faster responses. Lower costs. Higher trust. A blueprint for building resilient, high-throughput AI systems on Azure.

How to set up subdirectory Multisite in WordPress on Azure App Service
WordPress Multisite is a feature of WordPress that enables you to run and manage multiple WordPress websites using the same WordPress installation. Follow these steps to set up Multisite in your WordPress website on App Service...