Microsoft Ignite 2024
What’s new in Azure Container Apps at Ignite’24
Azure Container Apps is a fully managed serverless container service that enables you to build and deploy modern, cloud-native applications and microservices at scale. It offers a simplified developer experience while providing the flexibility and portability of containers. Azure Container Apps supports a variety of languages and frameworks, making it a versatile platform for developers. At the same time, it offers enterprise-grade features such as configurable network topology, secret and key management, and robust security and governance, making it a trusted platform for mission-critical and high-security workloads. At Ignite'24, we're announcing features for serverless GPUs, intelligent apps, and other enterprise capabilities that further deepen this commitment.

Azure Container Apps Serverless GPUs

One of the major pain points for customers has been the complexity and cost associated with deploying and monetizing custom models, fine-tuned models, and other open-source models within their environment. Managing the necessary infrastructure and ensuring data governance can be both time-consuming and expensive. To address this challenge, we're thrilled to announce the public preview of Azure Container Apps Serverless GPUs, bringing the power of NVIDIA A100 and T4 GPUs to a serverless environment. This feature allows AI development teams to focus on their core AI code without worrying about managing infrastructure. With serverless GPUs, you get a middle layer between Azure AI Model Catalog's serverless APIs and hosting models on managed compute, with full data governance because your data never leaves the container boundaries. Serverless GPUs offer several key benefits, including scale-to-zero capabilities, built-in data governance, and flexible compute options with NVIDIA A100 and T4 GPUs.
This managed, serverless compute platform is perfect for a wide range of AI workloads, from real-time inferencing with custom models to fine-tuning generative AI models and video rendering. Available now in the West US 3 and Australia East regions, serverless GPUs can be easily set up through the Azure portal or CLI. To get started, ensure you have quota enabled on your subscription. This new feature is designed to meet the growing demands of modern applications, providing powerful GPU resources without the need for dedicated infrastructure management.

Azure Container Apps Dynamic Sessions

Azure Container Apps dynamic sessions, announced in May 2024, is now generally available. This feature provides instant access to compute sandboxes for running untrusted code at scale, with each session protected by industry-standard Hyper-V isolation. Dynamic sessions are available in two modes: Python code interpreter and custom container sessions. The Python code interpreter sessions offer easy access to built-in Python code interpreter sandboxes, while custom container sessions allow users to run any custom container, supporting any scenario where sandboxes are needed to run untrusted code or applications. Additionally, the public preview of JavaScript code interpreter sessions is now available, supporting the execution of untrusted code on the Node.js runtime.

Private Endpoints

Private endpoints are now supported in public preview for workload profile environments in Azure Container Apps. This enables customers to connect to their Container Apps environment using a private IP address in their Azure Virtual Network, eliminating exposure to the public internet and securing access to their applications. With private endpoints, customers can also connect directly from Azure Front Door to their workload profile environments over a private link instead of the public internet.
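Because a private endpoint gives the environment an address from your own virtual network, one quick sanity check is to confirm that the app's hostname resolves to a private IP. Here is a minimal sketch using only the Python standard library (the hostname is a placeholder; a live lookup like this only makes sense from a machine inside the VNet):

```python
import socket
import ipaddress

def resolves_to_private_ip(hostname: str) -> bool:
    """Return True if every address the hostname resolves to is private (RFC 1918 / loopback)."""
    addresses = {info[4][0] for info in socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)}
    return all(ipaddress.ip_address(addr).is_private for addr in addresses)

# Checking literal addresses rather than a live DNS lookup:
assert ipaddress.ip_address("10.0.0.4").is_private       # a typical private endpoint IP
assert not ipaddress.ip_address("20.50.60.70").is_private  # a public Azure IP
```

If the app's FQDN still resolves to a public address from inside the VNet, check that the private DNS zone for the environment is linked to that network.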
Today, customers can enable Private Link to a container apps origin for Azure Front Door through the CLI, with portal support coming soon. Private endpoints will be free during public preview, but this is subject to change upon GA. Currently, private endpoints are only supported in the public cloud.

Planned Maintenance

Planned maintenance is now supported in public preview for Azure Container Apps. This CLI feature allows you to control when non-critical updates, such as minor security patches, bug fixes, and new releases, are applied to your Container Apps environment to minimize downtime and impact to applications. To configure a weekly maintenance window, you simply specify a day of the week, a start time in the UTC time zone, and a duration. Planned maintenance support is available for all container apps and jobs, except those running on consumption workload profiles. We are working on adding support for these profiles soon. If you're interested in this feature, please fill out this survey to share your use case and help us prioritize accordingly.

Path-Based Routing Early Access

Path-based routing is now supported as an early-access feature in Azure Container Apps. This feature allows customers to configure routing rules that determine which app incoming traffic to the Azure Container Apps environment is sent to, without configuring an additional reverse proxy like nginx. You can configure path-based routing rules for your container apps through ARM or Bicep, with CLI support coming shortly. See the quickstart and samples for guidance on getting started with path-based routing.

.NET Aspire on Azure Container Apps

Earlier this year, we announced the public preview of the Aspire Dashboard for Azure Container Apps, providing a developer-centric live view of telemetry across all apps in a container environment. This helps you evaluate app performance and debug errors with comprehensive logs, metrics, and traces.
At .NET Conf last week, we announced .NET 9, which simplifies the acquisition of .NET Aspire and adds new features like starting and stopping apps from the dashboard, viewing scaled-to-zero apps, and an improved UI. This release is only available in new environments in Australia East, Germany West Central, Italy North, and Switzerland North. Additional regions will be supported in the future. We also announced preview support for Azure Functions deployed to Azure Container Apps. The new .NET Aspire Azure Functions integration enables developers to develop, debug, and orchestrate Azure Functions .NET projects directly within the app host. This integration supports several key triggers, including Event Hubs, Service Bus, Storage Blobs, and HTTP, providing a versatile and powerful toolset for serverless applications. By leveraging the familiar programming model of Azure Functions and existing tools such as Visual Studio and the .NET CLI, developers can now seamlessly integrate their serverless workflows into Azure Container Apps, benefiting from the unified environment and streamlined deployment processes.

Java on Azure Container Apps

Azure Container Apps has added multiple features that make it an ideal platform for deploying Java Spring applications, offering seamless integration with popular development tools like IntelliJ, VS Code, Maven, and Gradle. The service supports multiple deployment types, including source, binaries, or container images, alongside automation tools such as Azure DevOps, GitHub Actions, and Jenkins. Additionally, it offers multiple Java-specific features required for modern Java deployments, such as out-of-the-box JVM metrics, automatic JVM memory fitting, and a Java in-process agent for log stream and console, as well as various Spring components as managed services. Azure Spring Apps is a fully managed service for running Java Spring applications, jointly built by Microsoft and VMware by Broadcom.
After careful consideration and analysis, Microsoft and Broadcom made the difficult decision earlier this year to retire the Azure Spring Apps service. Azure Container Apps is the primary recommended target service for migrating workloads running on Azure Spring Apps. See the migration guide to learn how to move your Spring Boot applications to Azure Container Apps.

Workload Profile Metrics

We have deployed new workload profile metrics in preview. For apps, we now support CPU Usage Percentage, Memory Percentage, and Average Response Time. These metrics help you understand node capacity and set alerts for performance issues. For environments, we now support Workload Profile Node Count to determine node utilization, so you can update the maximum count. Until the metrics blade is available in the portal for Container Apps environments, you can view the new metrics in the portal blade for Azure Monitor. We will continue to add more Azure Container Apps metrics and observability features over time! See the metrics documentation to learn more about the metrics available today.

Azure Container Apps at the Ignite’24 conference

If you're at Ignite, come see us at the following sessions:

Breakout Session 145: Building serverless intelligent apps with Flex Consumption and GPUs
Breakout Session 146: Streamline AI App development with Azure App Platform
Breakout Session 147: Modernize and scale enterprise Java applications on Azure
Breakout Session 144: Delivering business results with app innovation: Customer Insights
Lab 413: Mastering Azure Container Apps and GenAI for Intelligent Solutions
Community Roundtable 1008: Empower Devs with Advanced Experiences for Production-Ready AI Apps
Partner Breakout Session 387: Accelerate generative AI adoption with NVIDIA AI on Azure

Or come talk to us at the Serverless booth in the Expert Meet-up area at the Hub!

Wrapping up

For feedback, feature requests, or questions about Azure Container Apps, visit our GitHub page.
You can open a new issue or up-vote existing ones. If you're curious about what we're working on next, check out our roadmap. We look forward to hearing from you!

Introducing Azure Managed Redis, cost-effective caching for your AI apps
Azure Managed Redis, announced at Microsoft's Ignite conference, is a new service that brings the latest Redis innovations to the hyperscale cloud. It features four tiers—Memory Optimized, Balanced, Compute Optimized, and Flash Optimized—designed to enhance performance and scalability for GenAI applications. With up to a 99.999% availability SLA, cost-effective total cost of ownership, and seamless interoperability with Azure services, it supports high-performance, scalable AI workloads.

Connect Privately to Azure Front Door with Azure Container Apps
Azure Container Apps is a fully managed serverless container service that enables you to deploy and run containerized applications with per-second billing and autoscaling, without having to manage infrastructure. The service also supports a number of enhanced networking capabilities to address security and compliance needs, such as network security groups (NSGs), Azure Firewall, and more. Today, Azure Container Apps is excited to announce the public preview of another key networking capability: private endpoints for workload profile environments. This feature allows customers to connect to their Container Apps environment using a private IP address in their Azure Virtual Network, thereby eliminating exposure to the public internet and securing access to their applications. With the introduction of private endpoints for workload profile environments, you can now also establish a direct connection from Azure Front Door to your Container Apps environment via Private Link. By enabling Private Link for an Azure Container Apps origin, customers benefit from an extra layer of security that further isolates their traffic from the public internet. Currently, you can configure this connectivity through the CLI (portal support is coming soon). In this post, we give a brief overview of private endpoints on Azure Container Apps and the process of privately connecting them to Azure Front Door.

Getting started with private endpoints on Azure Container Apps

Private endpoints can be enabled either during the creation of a new environment or within an existing one. For new environments, you simply navigate to the Networking tab, disable public network access, and enable private endpoints. To manage the creation of private endpoints in an existing environment, you can use the new Networking blade, which is also in public preview. Since private endpoints use a private IP address, the endpoint for a container app is inaccessible through the public internet.
This can be confirmed by the lack of connectivity when opening the application URL. If you prefer using the CLI, you can find further guidance on enabling private endpoints at Use a private endpoint with an Azure Container Apps environment (preview).

Adding container apps as a private origin for Azure Front Door

With private endpoints, you can securely connect them to Azure Front Door through Private Link as well. The current process involves CLI commands that guide you in enabling an origin for Private Link and approving the private endpoint connection. Once approved, Azure Front Door assigns a private IP address from a managed regional private network, and you can verify the connectivity between your container app and Azure Front Door. For a detailed tutorial, please navigate to Create a private link to an Azure Container App with Azure Front Door (preview).

Troubleshooting

Having trouble testing the private endpoints? After creating a private endpoint for a container app, you can build and deploy a virtual machine to test the private connection. With no public inbound ports, this virtual machine would be associated with the virtual network defined during creation of the private endpoint. After creating the virtual machine, you can connect via Bastion and verify the private connectivity. You can find step-by-step instructions at Verify the private endpoint connection.

Conclusion

The public preview of private endpoints and private connectivity to Azure Front Door for workload profile environments is a long-awaited feature in Azure Container Apps. We encourage you to implement private endpoints for enhanced security, and we look forward to your feedback on this experience at our GitHub page.
Additional Resources

To learn more, please visit the following official documentation:

Networking in Azure Container Apps environment - Private Endpoints
Use a private endpoint with an Azure Container Apps environment
Create a private link to an Azure Container App with Azure Front Door (preview)
What is a private endpoint?
What is Azure Private Link?

Unlock New AI and Cloud Potential with .NET 9 & Azure: Faster, Smarter, and Built for the Future
.NET 9, now available to developers, marks a significant milestone in the evolution of the .NET platform, pushing the boundaries of performance, cloud-native development, and AI integration. This release, shaped by contributions from over 9,000 community members worldwide, introduces thousands of improvements that set the stage for the future of application development. With seamless integration with Azure and a focus on cloud-native development and AI capabilities, .NET 9 empowers developers to build scalable, intelligent applications with unprecedented ease.

Expanding Azure PaaS Support for .NET 9

With the release of .NET 9, a comprehensive range of Azure Platform as a Service (PaaS) offerings now fully support the platform's new capabilities, including the latest .NET SDK for any Azure developer. This extensive support allows developers to build, deploy, and scale .NET 9 applications with optimal performance and adaptability on Azure. Additionally, developers can access a wealth of architecture references and sample solutions to guide them in creating high-performance .NET 9 applications on Azure's powerful cloud services:

Azure App Service: Run, manage, and scale .NET 9 web applications efficiently. Check out this blog to learn more about what's new in Azure App Service.
Azure Functions: Leverage serverless computing to build event-driven .NET 9 applications with improved runtime capabilities.
Azure Container Apps: Deploy microservices and containerized .NET 9 workloads with integrated observability.
Azure Kubernetes Service (AKS): Run .NET 9 applications in a managed Kubernetes environment with expanded ARM64 support.
Azure AI Services and Azure OpenAI Services: Integrate advanced AI and OpenAI capabilities directly into your .NET 9 applications.
Azure API Management, Azure Logic Apps, Azure Cognitive Services, and Azure SignalR Service: Ensure seamless integration and scaling for .NET 9 solutions.
These services provide developers with a robust platform to build high-performance, scalable, and cloud-native applications while leveraging Azure's optimized environment for .NET.

Streamlined Cloud-Native Development with .NET Aspire

.NET Aspire is a game-changer for cloud-native applications, enabling developers to build distributed, production-ready solutions efficiently. Available in preview with .NET 9, Aspire streamlines app development, with cloud efficiency and observability at its core. The latest updates in Aspire include secure defaults, Azure Functions support, and enhanced container management. Key capabilities include:

Optimized Azure Integrations: Aspire works seamlessly with Azure, enabling fast deployments, automated scaling, and consistent management of cloud-native applications.
Easier Deployments to Azure Container Apps: Designed for containerized environments, .NET Aspire integrates with Azure Container Apps (ACA) to simplify the deployment process. Using the Azure Developer CLI (azd), developers can quickly provision and deploy .NET Aspire projects to ACA, with built-in support for Redis caching, application logging, and scalability.
Built-In Observability: A real-time dashboard provides insights into logs, distributed traces, and metrics, enabling local and production monitoring with Azure Monitor.

With these capabilities, .NET Aspire allows developers to deploy microservices and containerized applications effortlessly on ACA, streamlining the path from development to production in a fully managed, serverless environment.

Integrating AI into .NET: A Seamless Experience

In our ongoing effort to empower developers, we've made integrating AI into .NET applications simpler than ever. Our strategic partnerships, including collaborations with OpenAI, LlamaIndex, and Qdrant, have enriched the AI ecosystem and strengthened .NET's capabilities.
This year alone, usage of Azure OpenAI services has surged to nearly a billion API calls per month, illustrating the growing impact of AI-powered .NET applications.

Real-World AI Solutions with .NET: .NET has been pivotal in driving AI innovations. From internal teams like Microsoft Copilot creating AI experiences with .NET Aspire to tools like GitHub Copilot, developed with .NET to enhance productivity in Visual Studio and VS Code, the platform showcases AI at its best. KPMG Clara is a prime example, developed to enhance audit quality and efficiency for 95,000 auditors worldwide. By leveraging .NET and scaling securely on Azure, KPMG implemented robust AI features aligned with strict industry standards, underscoring .NET and Azure as the backbone for high-performing, scalable AI solutions.

Performance Enhancements in .NET 9: Raising the Bar for Azure Workloads

.NET 9 introduces substantial performance upgrades, with over 7,500 merged pull requests focused on speed and efficiency, ensuring .NET 9 applications run optimally on Azure. These improvements contribute to reduced cloud costs and provide a high-performance experience across Windows, Linux, and macOS. To see how significant these performance gains can be for cloud services, take a look at what past .NET upgrades achieved for Microsoft's high-scale internal services:

Bing achieved a major reduction in startup times, enhanced efficiency, and decreased latency across its high-performance search workflows.
Microsoft Teams improved efficiency by 50%, reduced latency by 30–45%, and achieved up to 100% gains in CPU utilization for key services, resulting in faster user interactions.
Microsoft Copilot and other AI-powered applications benefited from optimized runtime performance, enabling scalable, high-quality experiences for users.

Upgrading to the latest .NET version offers similar benefits for cloud apps, optimizing both performance and cost-efficiency.
For more information on updating your applications, check out the .NET Upgrade Assistant. For additional details on ASP.NET Core, .NET MAUI, NuGet, and more enhancements across the .NET platform, check out the full Announcing .NET 9 blog post.

Conclusion: Your Path to the Future with .NET 9 and Azure

.NET 9 isn't just an upgrade—it's a leap forward, combining cutting-edge AI integration, cloud-native development, and unparalleled performance. Paired with Azure's scalability, these advancements provide a trusted, high-performance foundation for modern applications. Get started by downloading .NET 9 and exploring its features. Leverage .NET Aspire for streamlined cloud-native development, deploy scalable apps with Azure, and embrace new productivity enhancements to build for the future. Explore the future of cloud-native and AI development with .NET 9 and Azure—your toolkit for creating the next generation of intelligent applications.

Secure Unique Default Hostnames: GA on App Service Web Apps and Public Preview on Functions
Back in May 2024, we announced the public preview of Secure Unique Default Hostnames on Web Apps. We are excited to announce that this feature is now generally available on Web Apps and is now in public preview for Functions! The feature works similarly for both Web Apps and Functions, so you can refer to the public preview announcement for more in-depth information. The secure unique default hostname feature is a long-term solution to protect your resources from dangling DNS entries and subdomain takeover. If you have this feature enabled for your App Service resources, then no one outside of your organization can recreate resources with the same default hostname. This means that malicious actors can no longer take advantage of your dangling DNS entries and take over your subdomains. We highly encourage everyone to enable secure unique default hostnames on their net-new App Service deployments.

Addressing pre-existing resources without secure unique default hostnames enabled

Since this feature can only be enabled upon resource creation, if you'd like to use it for your pre-existing resources, you can:

Clone a pre-existing app to a new app with secure unique default hostname enabled
[Screenshot of cloning a pre-existing app to an app that's about to be created with secure unique default hostname enabled.]
Use a backup of a pre-existing app to restore to a new app with secure unique default hostname enabled
[Screenshot of using a backup of a pre-existing app to restore to an app that's about to be created with secure unique default hostname enabled.]

Looking ahead

We highly encourage everyone to enable secure unique default hostnames on all net-new App Service deployments. This is the time to integrate and adopt this feature in your testing and production environments so that you can build more secure App Service resources, prevent dangling DNS entries, and avoid subdomain takeover.
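To illustrate the idea behind unique default hostnames, here is a minimal sketch: appending an unguessable random token to the app name means a deleted app's hostname cannot simply be re-created by registering the same app name again. The token length, alphabet, and region label below are illustrative assumptions, not App Service's actual naming scheme.

```python
import secrets

def unique_default_hostname(app_name: str, region: str = "westus-01") -> str:
    """Sketch of a secure unique default hostname.

    NOTE: the token format and region label are illustrative assumptions;
    Azure's real scheme may differ.
    """
    token = secrets.token_hex(8)  # 16 hex characters of randomness
    return f"{app_name}-{token}.{region}.azurewebsites.net"

print(unique_default_hostname("myapp"))
# e.g. myapp-1f3a9c0d2b847e65.westus-01.azurewebsites.net
```

Because the token is random, a CNAME record pointing at the old hostname can never be matched by a new resource created by someone else, which is exactly what defeats subdomain takeover.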
Keep an eye out for future announcements where we will launch secure unique default hostnames in public preview for Logic Apps (Standard)!

Announcing Conversational Diagnostics (Public Preview) for AKS at Ignite 2024
We are thrilled to announce that Conversational Diagnostics (Public Preview) is coming to Azure Kubernetes Service (AKS)! This new functionality will be available in the "Diagnose and solve problems" section of the Azure portal, starting at Ignite 2024.

Diagnose and Solve Problems with Conversational Diagnostics

Conversational Diagnostics leverages natural language processing to help you troubleshoot and resolve issues with your AKS clusters more efficiently. By simply describing your problem in plain language, you can get targeted diagnostics and solutions, making it easier to maintain the health and performance of your AKS deployments.

Availability and Future Developments

Conversational Diagnostics (Public Preview) for AKS will be available as part of Ignite 2024. Shortly after, the AKS + Conversational Diagnostics (Public Preview) integration will expand to Copilot in Azure and GitHub Copilot via @azure. With the integration of Conversational Diagnostics into Copilot in Azure, you can access powerful diagnostic tools from anywhere in the Azure portal, allowing you to quickly identify and resolve issues without leaving your workflow, enhancing productivity and reducing downtime. The expansion of Conversational Diagnostics to GitHub Copilot means that developers can benefit from advanced troubleshooting capabilities while writing code, with near real-time diagnostics and suggestions to help catch and fix issues early in the development process. Stay tuned for more updates and detailed demonstrations at Ignite 2024. We look forward to seeing how Conversational Diagnostics will transform the way you manage and troubleshoot your AKS clusters. Thank you!

Build AI faster and run with confidence
Intelligence is the new baseline for modern apps. We're seeing the energy and excitement from AI coming to life in apps being built and modernized today. AI is now a critical component of nearly every application and business strategy. And most importantly, we've reached an inflection point where organizations are moving from widespread experimentation to full production. In fact, we just completed a global survey of more than 1,500 developers and IT influencers: 60% of respondents have AI apps under development that will go to production over the next year. What's even more exciting is the evolution of what organizations are looking to achieve. Agents and intelligent assistants continue to shape customer service, make customer experiences more personal, and help employees make faster, more accurate decisions. And agents that unlock organizational knowledge are increasingly vital. But more than ever, AI is reinventing core business processes. From long-established businesses to startups (where our business has increased 200%), this holds incredible promise to drive revenue and growth as well as competitive differentiation. AI is both an inspiration for how the next generation of apps will be defined and an accelerator for how we will build and deliver on that promise. This is the tension that we see with every business: intense pressure to move fast, to take advantage of incredible innovation with speed, while delivering every app securely with the performance and scale that meets the unique needs of AI apps. Developers are on the front lines, responsible for moving quickly from idea to code to cloud while delivering secure, private, and trusted AI services that meet ROI and cost requirements. This combination of building fast and running with confidence is core to Microsoft's strategy. So, as we kick off Microsoft Ignite 2024 today, we're bringing a slate of new and enhanced innovation that makes this transition to production faster and easier for all.
With new innovations across the Copilot and AI stack, new integrations that bring together services across our AI platform and developer tools, and an expanding set of partnerships across the AI toolchain, there's a ton of great innovation at Ignite. Let's look at just a few.

GitHub Copilot

GitHub Copilot is the most widely used AI pair-programming tool in the world, with millions of developers using it daily. We're seeing developers code up to 55% faster through real-time code suggestions and solutions, while elevating quality and generating cleaner, more resilient code that is easier to maintain. This week we're showing how Copilot is evolving to drive efficiency across the entire application lifecycle. Developers in VS Code now have the flexibility to select from an array of industry-leading models, including OpenAI's GPT-4o and Anthropic's Claude 3.5 Sonnet. And they have the power to use natural language chat to implement complex code changes across multiple files.

GitHub Copilot upgrade assistant for Java

Now the power of Copilot can help with existing apps as well. Keeping Java apps up to date can be a time-consuming task. GitHub Copilot upgrade assistant for Java uses AI to simplify Java application upgrades with autonomous agents. The process is transparent, keeping a human in the loop, while your environment actively learns from your context and adjustments, improving accuracy for future upgrades.

GitHub Copilot for Azure

GitHub Copilot for Azure streamlines the path from code to production on Azure for every developer, even those new to Azure. Through Copilot, you can use natural language to learn about your Azure services and resources, find sample applications and templates, and quickly deploy to Azure while supporting your enterprise standards and guidelines. Once in production, GitHub Copilot for Azure helps you troubleshoot and resolve application issues and stay on top of costs.
Copilot knows the full context of you as a developer and your systems to make every recommendation tailored to your unique needs. Available now in public preview, it does it all from the tools you already use, helping you minimize interruptions and stay focused.

Azure AI Foundry

New at Ignite, Azure AI Foundry brings together an end-to-end AI platform across models, tooling, safety, and monitoring to help you efficiently and cost-effectively design and scale your AI applications. By integrating with popular developer tools like GitHub, Visual Studio, and Copilot Studio, Azure AI Foundry opens up this full portfolio of services for developers, giving them access to the best, most advanced models in the world along with tools for building agents on Azure and a unified toolchain to access AI services through one interface. Azure AI Foundry is a key offering enabling easy integration of Azure AI capabilities into your applications.

AI Template Gallery

The AI App Template Gallery is a new resource designed to help you build and deploy AI applications in a matter of minutes, with the flexibility to use the programming language, framework, and architecture of your choice. The gallery offers more than 25 curated, ready-to-use application templates, creating a clear path to kickstart your AI projects with confidence and efficiency. And developers can easily discover and access each of them through GitHub Copilot for Azure, further simplifying access through your preferred developer tool.

Azure Native Integrations

Azure Native Integrations gives developers access to a curated set of ISV services available directly in the Azure portal, SDK, and CLI. This means that developers have the flexibility to work with their preferred vendors across the AI toolchain and other common solution areas, with simplified single sign-on and management, while staying in Azure.
Joining our portfolio of integrated services are Pinecone, Weights & Biases, Arize, and LambdaTest, all now available in private preview. Neon, Pure Storage Cloud for Azure VMware Solution (AVS), and Dell APEX File Storage will also be available soon as part of Azure Native Integrations.

Azure Container Apps with Serverless GPUs

Azure Container Apps now supports serverless GPUs in public preview, enabling effortless scaling and flexibility for real-time custom model inferencing and other machine learning tasks. Serverless GPUs enable you to seamlessly run your AI workloads on demand, accessing powerful NVIDIA accelerated computing resources, with automatic scaling, optimized cold start, and per-second billing, without the need for dedicated infrastructure management.

Azure Essentials for AI Adoption

We also recognize great technology is only part of your success. Microsoft has published design patterns, baseline reference architectures, application landing zones, and a variety of Azure service guides for Azure OpenAI workloads, along with FinOps guidance for AI. This week, we are excited to announce new AI-specific guidance in the Cloud Adoption Framework and the Azure Well-Architected Framework to help you adopt AI at scale while meeting requirements for reliability, security, operations, and cost.

Introducing Serverless GPUs on Azure Container Apps
We're excited to announce the public preview of Azure Container Apps serverless GPUs, accelerated by NVIDIA. This feature provides customers with NVIDIA A100 GPUs and NVIDIA T4 GPUs in a serverless environment, enabling effortless scaling and flexibility for real-time custom model inferencing and other machine learning tasks.

Serverless GPUs accelerate your AI development team by letting you focus on your core AI code rather than on managing infrastructure when using NVIDIA accelerated computing. They provide an excellent middle-layer option between Azure AI Model Catalog's serverless APIs and hosting models on managed compute, and they offer full data governance, since your data never leaves the boundaries of your container, while still providing a managed, serverless platform from which to build your applications. Serverless GPUs are designed to meet the growing demands of modern applications by providing powerful NVIDIA accelerated computing resources without the need for dedicated infrastructure management.

"Azure Container Apps' serverless GPU offering is a leap forward for AI workloads. Serverless NVIDIA GPUs are well suited for a wide array of AI workloads, from real-time inferencing scenarios with custom models to fine-tuning. NVIDIA is also working with Microsoft to bring NVIDIA NIM microservices to Azure Container Apps to optimize AI inference performance." - Dave Salvator, Director, Accelerated Computing Products, NVIDIA

Key benefits of serverless GPUs

- Scale-to-zero GPUs: Support for serverless scaling of NVIDIA A100 and T4 GPUs.
- Per-second billing: Pay only for the GPU compute you use.
- Built-in data governance: Your data never leaves the container boundary.
- Flexible compute options: Choose between NVIDIA A100 and T4 GPUs.
- Middle layer for AI development: Bring your own model on a managed, serverless compute platform.

Scenarios

Whether you choose NVIDIA A100 or T4 GPUs depends on the types of apps you're creating.
The following are a couple of example scenarios. In each scenario with serverless GPUs, you pay only for the compute you use with per-second billing, and your apps automatically scale in and out from zero to meet demand.

NVIDIA T4

- Real-time and batch inferencing: Use custom open-source models with fast startup times, automatic scaling, and a per-second billing model. Serverless GPUs are ideal for dynamic applications that don't already have a serverless API in the model catalog.

NVIDIA A100

- Compute-intensive machine learning scenarios: Significantly speed up applications that implement fine-tuned custom generative AI models, deep learning, or neural networks.
- High-performance computing (HPC) and data analytics: Applications that require complex calculations or simulations, such as scientific computing and financial modeling, as well as accelerated data processing and analysis across massive datasets.

Get started with serverless GPUs

Serverless GPUs are now available for workload profile environments in the West US 3, Australia East, and Sweden Central regions, with more regions to come. You will need quota enabled on your subscription in order to use serverless GPUs. By default, all Microsoft Enterprise Agreement customers will have one quota. If additional quota is needed, please request it here.

Note: To achieve the best performance with serverless GPUs, use an Azure Container Registry (ACR) with artifact streaming enabled for your image tag. Follow the steps here to enable artifact streaming on your ACR.

From the portal, you can enable GPUs for your Consumption app in the container tab when creating your Container App or your Container App Job. You can also add a new Consumption GPU workload profile to your existing Container Apps environment through the workload profiles UX in the portal or through the CLI commands for managing workload profiles.
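As a rough sketch of the CLI path mentioned above, the following adds a Consumption GPU workload profile to an existing environment with the Azure CLI. The resource group and environment names are placeholders, and the workload profile type string is an assumption; list the types actually supported in your region first and substitute accordingly.

```shell
# Ensure a recent version of the containerapp CLI extension
az extension add --name containerapp --upgrade

RESOURCE_GROUP=my-rg      # placeholder: your resource group
ENVIRONMENT=my-aca-env    # placeholder: your Container Apps environment

# List the workload profile types supported in a GPU-enabled region,
# e.g. West US 3, to confirm the exact GPU profile type names
az containerapp env workload-profile list-supported \
  --location westus3 --output table

# Add a Consumption GPU workload profile to the existing environment.
# The type name below is an assumption; use a value from the list above.
az containerapp env workload-profile add \
  --resource-group "$RESOURCE_GROUP" \
  --name "$ENVIRONMENT" \
  --workload-profile-name gpu-t4 \
  --workload-profile-type Consumption-GPU-NC8as-T4
```

Once the profile is added, apps created in this environment can target it by the workload profile name (`gpu-t4` here).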
Deploy a sample Stable Diffusion app

To try out serverless GPUs, you can use the Stable Diffusion image that's provided as a quickstart during the container app create experience:

1. In the container tab, select the Use quickstart image box.
2. In the quickstart image dropdown, select GPU hello world container.

If you wish to pull the GPU container image into your own ACR to enable artifact streaming for improved performance, or if you wish to enter the image manually, you can find it at mcr.microsoft.com/k8se/gpu-quickstart:latest. For full steps on using your own image with serverless GPUs, see the tutorial on using serverless GPUs in Azure Container Apps.

Learn more about serverless GPUs

With serverless GPUs, Azure Container Apps now simplifies the development of your AI applications by providing scale-to-zero compute, pay-as-you-go pricing, reduced infrastructure management, and more. To learn more, visit:

- Using serverless GPUs in Azure Container Apps (preview) | Microsoft Learn
- Tutorial: Generate images using serverless GPUs in Azure Container Apps (preview) | Microsoft Learn
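The portal steps above can also be approximated from the Azure CLI. This is a sketch, not the tutorial's exact commands: the resource names and the `gpu-t4` workload profile name are placeholders, the target port is an assumption, and only the quickstart image path comes from the article.

```shell
RESOURCE_GROUP=my-rg      # placeholder: your resource group
ENVIRONMENT=my-aca-env    # placeholder: environment with a GPU workload profile

# Create a container app running the GPU quickstart image on the GPU profile.
# --min-replicas 0 keeps the scale-to-zero behavior described above;
# the target port is an assumption about the quickstart image.
az containerapp create \
  --name gpu-quickstart \
  --resource-group "$RESOURCE_GROUP" \
  --environment "$ENVIRONMENT" \
  --image mcr.microsoft.com/k8se/gpu-quickstart:latest \
  --workload-profile-name gpu-t4 \
  --min-replicas 0 \
  --ingress external --target-port 80
```

For better cold starts, push the image to your own ACR with artifact streaming enabled and point `--image` at that copy, as recommended in the note above.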