azure container instances
12 TopicsBuilding the Agentic Future
As a business built by developers, for developers, Microsoft has spent decades making it faster, easier and more exciting to create great software. And developers everywhere have turned everything from BASIC and the .NET Framework, to Azure, VS Code, GitHub and more into the digital world we all live in today. But nothing compares to what’s on the horizon as agentic AI redefines both how we build and the apps we’re building. In fact, the promise of agentic AI is so strong that market forecasts predict we’re on track to reach 1.3 billion AI Agents by 2028. Our own data, from 1,500 organizations around the world, shows agent capabilities have jumped as a driver for AI applications from near last to a top three priority when comparing deployments earlier this year to applications being defined today. Of those organizations building AI agents, 41% chose Microsoft to build and run their solutions, significantly more than any other vendor. But within software development the opportunity is even greater, with approximately 50% of businesses intending to incorporate agentic AI into software engineering this year alone. Developers face a fascinating yet challenging world of complex agent workflows, a constant pipeline of new models, new security and governance requirements, and the continued pressure to deliver value from AI, fast, all while contending with decades of legacy applications and technical debt. This week at Microsoft Build, you can see how we’re making this future a reality with new AI-native developer practices and experiences, by extending the value of AI across the entire software lifecycle, and by bringing critical AI, data, and toolchain services directly to the hands of developers, in the most popular developer tools in the world. Agentic DevOps AI has already transformed the way we code, with 15 million developers using GitHub Copilot today to build faster. But coding is only a fraction of the developer’s time. Extending agents across the entire software lifecycle, means developers can move faster from idea to production, boost code quality, and strengthen security, while removing the burden of low value, routine, time consuming tasks. We can even address decades of technical debt and keep apps running smoothly in production. This is the foundation of agentic DevOps—the next evolution of DevOps, reimagined for a world where intelligent agents collaborate with developer teams and with each other. Agents introduced today across GitHub Copilot and Azure operate like a member of your development team, automating and optimizing every stage of the software lifecycle, from performing code reviews, and writing tests to fixing defects and building entire specs. Copilot can even collaborate with other agents to complete complex tasks like resolving production issues. Developers stay at the center of innovation, orchestrating agents for the mundane while focusing their energy on the work that matters most. Customers like EY are already seeing the impact: “The coding agent in GitHub Copilot is opening up doors for each developer to have their own team, all working in parallel to amplify their work. Now we're able to assign tasks that would typically detract from deeper, more complex work, freeing up several hours for focus time." - James Zabinski, DevEx Lead at EY You can learn more about agentic DevOps and the new capabilities announced today from Amanda Silver, Corporate Vice President of Product, Microsoft Developer Division, and Mario Rodriguez, Chief Product Office at GitHub. And be sure to read more from GitHub CEO Thomas Dohmke about the latest with GitHub Copilot. At Microsoft Build, see agentic DevOps in action in the following sessions, available both in-person May 19 - 22 in Seattle and on-demand: BRK100: Reimagining Software Development and DevOps with Agentic AI BRK 113: The Agent Awakens: Collaborative Development with GitHub Copilot BRK118: Accelerate Azure Development with GitHub Copilot, VS Code & AI BRK131: Java App Modernization Simplified with AI BRK102: Agent Mode in Action: AI Coding with Vibe and Spec-Driven Flows BRK101: The Future of .NET App Modernization Streamlined with AI New AI Toolchain Integrations Beyond these new agentic capabilities, we’re also releasing new integrations that bring key services directly to the tools developers are already using. From the 150 million GitHub users to the 50 million monthly users of the VS Code family, we’re making it easier for developers everywhere to build AI apps. If GitHub Copilot changed how we write code, Azure AI Foundry is changing what we can build. And the combination of the two is incredibly powerful. Now we’re bringing leading models from Azure AI Foundry directly into your GitHub experience and workflow, with a new native integration. GitHub models lets you experiment with leading models from OpenAI, Meta, Cohere, Microsoft, Mistral and more. Test and compare performance while building models directly into your codebase all within in GitHub. You can easily select the best model performance and price side by side and swap models with a simple, unified API. And keeping with our enterprise commitment, teams can set guardrails so model selection is secure, responsible, and in line with your team’s policies. Meanwhile, new Azure Native Integrations gives developers seamless access to a curated set of 20 software services from DataDog, New Relic, Pinecone, Pure Storage Cloud and more, directly through Azure portal, SDK, and CLI. With Azure Native Integrations, developers get the flexibility to work with their preferred vendors across the AI toolchain with simplified single sign-on and management, while staying in Azure. Today, we are pleased to announce the addition of even more developer services: Arize AI: Arize’s platform provides essential tooling for AI and agent evaluation, experimentation, and observability at scale. With Arize, developers can easily optimize AI applications through tools for tracing, prompt engineering, dataset curation, and automated evaluations. Learn more. LambdaTest HyperExecute: LambdaTest HyperExecute is an AI-native test execution platform designed to accelerate software testing. It enables developers and testers to run tests up to 70% faster than traditional cloud grids by optimizing test orchestration, observability and streamlining TestOps to expedite release cycles. Learn more. Mistral: Mistral and Microsoft announced a partnership today, which includes integrating Mistral La Plateforme as part of Azure Native Integrations. Mistral La Plateforme provides pay-as-you-go API access to Mistral AI's latest large language models for text generation, embeddings, and function calling. Developers can use this AI platform to build AI-powered applications with retrieval-augmented generation (RAG), fine-tune models for domain-specific tasks, and integrate AI agents into enterprise workflows. MongoDB (Public Preview): MongoDB Atlas is a fully managed cloud database that provides scalability, security, and multi-cloud support for modern applications. Developers can use it to store and search vector embeddings, implement retrieval-augmented generation (RAG), and build AI-powered search and recommendation systems. Learn more. Neon: Neon Serverless Postgres is a fully managed, autoscaling PostgreSQL database designed for instant provisioning, cost efficiency, and AI-native workloads. Developers can use it to rapidly spin up databases for AI agents, store vector embeddings with pgvector, and scale AI applications seamlessly. Learn more. Java and .Net App Modernization Shipping to production isn’t the finish line—and maintaining legacy code shouldn’t slow you down. Today we’re announcing comprehensive resources to help you successfully plan and execute app modernization initiatives, along with new agents in GitHub Copilot to help you modernize at scale, in a fraction of the time. In fact, customers like Ford China are seeing breakthrough results, reducing up to 70% of their Java migration efforts by using GitHub Copilot to automate middleware code migration tasks. Microsoft’s App Modernization Guidance applies decades of enterprise apps experience to help you analyze production apps and prioritize modernization efforts, while applying best practices and technical patterns to ensure success. And now GitHub Copilot transforms the modernization process, handling code assessments, dependency updates, and remediation across your production Java and .NET apps (support for mainframe environments is coming soon!). It generates and executes update plans automatically, while giving you full visibility, control, and a clear summary of changes. You can even raise modernization tasks in GitHub Issues from our proven service Azure Migrate to assign to developer teams. Your apps are more secure, maintainable, and cost-efficient, faster than ever. Learn how we’re reimagining app modernization for the era of AI with the new App Modernization Guidance and the modernization agent in GitHub Copilot to help you modernize your complete app estate. Scaling AI Apps and Agents Sophisticated apps and agents need an equally powerful runtime. And today we’re advancing our complete portfolio, from serverless with Azure Functions and Azure Container Apps, to the control and scale of Azure Kubernetes Service. At Build we’re simplifying how you deploy, test, and operate open-source and custom models on Kubernetes through Kubernetes AI Toolchain Operator (KAITO), making it easy to inference AI models with the flexibility, auto-scaling, pay-per-second pricing, and governance of Azure Container Apps serverless GPU, helping you create real-time, event-driven workflows for AI agents by integrating Azure Functions with Azure AI Foundry Agent Service, and much, much more. The platform you choose to scale your apps has never been more important. With new integrations with Azure AI Foundry, advanced automation that reduces developer overhead, and simplified operations, security and governance, Azure’s app platform can help you deliver the sophisticated, secure AI apps your business demands. To see the full slate of innovations across the app platform, check out: Powering the Next Generation of AI Apps and Agents on the Azure Application Platform Tools that keep pace with how you need to build This week we’re also introducing new enhancements to our tooling to help you build as fast as possible and explore what’s next with AI, all directly from your editor. GitHub Copilot for Azure brings Azure-specific tools into agent mode in VS Code, keeping you in the flow as you create, manage, and troubleshoot cloud apps. Meanwhile the Azure Tools for VS Code extension pack brings everything you need to build apps on Azure using GitHub Copilot to VS Code, making it easy to discover and interact with cloud services that power your applications. Microsoft’s gallery of AI App Templates continues to expand, helping you rapidly move from concept to production app, deployed on Azure. Each template includes fully working applications, complete with app code, AI features, infrastructure as code (IaC), configurable CI/CD pipelines with GitHub Actions, along with an application architecture, ready to deploy to Azure. These templates reflect the most common patterns and use cases we see across our AI customers, from getting started with AI agents to building GenAI chat experiences with your enterprise data and helping you learn how to use best practices such as keyless authentication. Learn more by reading the latest on Build Apps and Agents with Visual Studio Code and Azure Building the agentic future The emergence of agentic DevOps, the new wave of development powered by GitHub Copilot and new services launching across Microsoft Build will be transformative. But just as we’ve seen over the first 50 years of Microsoft’s history, the real impact will come from the global community of developers. You all have the power to turn these tools and platforms into advanced AI apps and agents that make every business move faster, operate more intelligently and innovate in ways that were previously impossible. Learn more and get started with GitHub Copilot2KViews2likes0CommentsReimagining App Modernization for the Era of AI
This blog highlights the key announcements and innovations from Microsoft Build 2025. It focuses on how AI is transforming the software development lifecycle, particularly in app modernization. Key topics include the use of GitHub Copilot for accelerating development and modernization, the introduction of Azure SRE agent for managing production systems, and the launch of the App Modernization Guidance to help organizations modernize their applications with AI-first design. The blog emphasizes the strategic approach to modernization, aiming to reduce complexity, improve agility, and deliver measurable business outcomes3.9KViews2likes0CommentsAnnouncing Public Preview of Larger Container Sizes on Azure Container Instances
ACI provides a fast and simple way to run containers in the cloud. As a serverless solution, ACI eliminates the need to manage underlying infrastructure, automatically scaling to meet application demands. Customers benefit from using ACI because it offers flexible resource allocation, pay-per-use pricing, and rapid deployment, making it easier to focus on development and innovation without worrying about infrastructure management. Today, we are excited to announce the public preview of larger container sizes on Azure Container Instances (ACI). Customers can now deploy workloads with higher vCPU and memory for standard containers, confidential containers, containers with virtual networks, and containers utilizing virtual nodes to connect to Azure Kubernetes Service (AKS). ACI now supports vCPU counts greater than 4 and memory capacities of 16 GB, with a maximum of 32 vCPU and 256 GB for standard containers and a maximum of 32 vCPU and 192 GB for confidential containers. Benefits of Larger Container Sizes on ACI Enhanced Performance More vCPUs mean more processing power, allowing for more efficient handling of complex tasks and applications. The enhanced performance from more vCPUs and larger GB capacity offers faster processing times and reduced latency, which can translate to cost savings in terms of time and productivity. Larger container groups with more GB can handle bigger datasets and more extensive workloads, making them ideal for data-intensive applications. Simplified Scalability Larger container groups provide the flexibility to scale up resources even higher as needed, accommodating growing business demands without compromising performance. Larger container SKUs can simplify the scaling process. Instead of managing many smaller containers, you can scale your applications with fewer, larger ones, potentially reducing the need for frequent scaling adjustments. Scenarios for Larger Container Sizes Data Inferencing Larger container SKUs are ideal for data inferencing tasks that require robust computational power. Examples include real-time fraud detection in financial transactions, predictive maintenance in manufacturing, and personalized recommendation engines in e-commerce. These containers ensure efficient and secure processing of large datasets for accurate predictions and insights. Collaborative Analytics When multiple parties need to share and analyze data, larger container SKUs provide a secure and efficient solution. For instance, companies in healthcare can collaborate on patient data analytics while maintaining confidentiality. Similarly, research institutions can share large datasets for scientific studies without compromising data privacy. Big Data Processing Organizations dealing with large-scale data processing can benefit from the enhanced capacity of larger container SKUs. Examples include processing customer data for targeted marketing campaigns, analyzing social media trends for sentiment analysis, and conducting large-scale financial modeling for risk assessment. These containers ensure efficient handling of extensive workloads. High-Performance Computing High-performance computing applications, such as climate modeling, genomic research, and computational fluid dynamics, demand substantial computational power. Larger container SKUs provide the necessary resources to support these intensive tasks, enabling precise simulations and faster results. How to start using Larger Container Sizes To begin using Larger Container Sizes, follow these steps. If you plan to run containers larger than 4 vCPU and 16 GB, you must request quota. Once your quota has been allocated, you can deploy your container groups through Azure portal, Azure CLI, PowerShell, ARM template, or any other method that allows you to connect to your container groups in Azure. Here are some tutorials for how to deploy containers using different methods. Quickstart - Deploy Docker container to container instance - Azure CLI - Azure Container Instances | Microsoft Learn Quickstart - Deploy Docker container to container instance - Portal - Azure Container Instances | Microsoft Learn Quickstart - Deploy Docker container to container instance - PowerShell - Azure Container Instances | Microsoft Learn Quickstart - create a container instance - Bicep - Azure Container Instances | Microsoft Learn Quickstart - Create a container instance - Azure Resource Manager template - Azure Container Instances | Microsoft Learn To learn more about Azure Container Instances, see Serverless containers in Azure - Azure Container Instances | Microsoft Learn.590Views2likes0CommentsUsing NVIDIA Triton Inference Server on Azure Container Apps
TOC Introduction to Triton System Architecture Architecture Focus of This Tutorial Setup Azure Resources File and Directory Structure ARM Template ARM Template From Azure Portal Testing Azure Container Apps Conclusion References 1. Introduction to Triton Triton Inference Server is an open-source, high-performance inferencing platform developed by NVIDIA to simplify and optimize AI model deployment. Designed for both cloud and edge environments, Triton enables developers to serve models from multiple deep learning frameworks, including TensorFlow, PyTorch, ONNX Runtime, TensorRT, and OpenVINO, using a single standardized interface. Its goal is to streamline AI inferencing while maximizing hardware utilization and scalability. A key feature of Triton is its support for multiple model execution modes, including dynamic batching, concurrent model execution, and multi-GPU inferencing. These capabilities allow organizations to efficiently serve AI models at scale, reducing latency and optimizing throughput. Triton also offers built-in support for HTTP/REST and gRPC endpoints, making it easy to integrate with various applications and workflows. Additionally, it provides model monitoring, logging, and GPU-accelerated inference optimization, enhancing performance across different hardware architectures. Triton is widely used in AI-powered applications such as autonomous vehicles, healthcare imaging, natural language processing, and recommendation systems. It integrates seamlessly with NVIDIA AI tools, including TensorRT for high-performance inference and DeepStream for video analytics. By providing a flexible and scalable deployment solution, Triton enables businesses and researchers to bring AI models into production with ease, ensuring efficient and reliable inferencing in real-world applications. 2. System Architecture Architecture Development Environment OS: Ubuntu Version: Ubuntu 18.04 Bionic Beaver Docker version: 26.1.3 Azure Resources Storage Account: SKU - General Purpose V2 Container Apps Environments: SKU - Consumption Container Apps: N/A Focus of This Tutorial This tutorial walks you through the following stages: Setting up Azure resources Publishing the project to Azure Testing the application Each of the mentioned aspects has numerous corresponding tools and solutions. The relevant information for this session is listed in the table below. Local OS Windows Linux Mac V How to setup Azure resources and deploy Portal (i.e., REST api) ARM Bicep Terraform V 3. Setup Azure Resources File and Directory Structure Please open a terminal and enter the following commands: git clone https://github.com/theringe/azure-appservice-ai.git cd azure-appservice-ai After completing the execution, you should see the following directory structure: File and Path Purpose triton/tools/arm-template.json The ARM template to setup all the Azure resources related to this tutorial, including a Container Apps Environments, a Container Apps, and a Storage Account with the sample dataset. ARM Template We need to create the following resources or services: Manual Creation Required Resource/Service Container Apps Environments Yes Resource Container Apps Yes Resource Storage Account Yes Resource Blob Yes Service Deployment Script Yes Resource Let’s take a look at the triton/tools/arm-template.json file. Refer to the configuration section for all the resources. Since most of the configuration values don’t require changes, I’ve placed them in the variables section of the ARM template rather than the parameters section. This helps keep the configuration simpler. However, I’d still like to briefly explain some of the more critical settings. As you can see, I’ve adopted a camelCase naming convention, which combines the [Resource Type] with [Setting Name and Hierarchy]. This makes it easier to understand where each setting will be used. The configurations in the diagram are sorted by resource name, but the following list is categorized by functionality for better clarity. Configuration Name Value Purpose storageAccountContainerName data-and-model [Purpose 1: Blob Container for Model Storage] Use this fixed name for the Blob Container. scriptPropertiesRetentionInterval P1D [Purpose 2: Script for Uploading Models to Blob Storage] No adjustments are needed. This script is designed to launch a one-time instance immediately after the Blob Container is created. It downloads sample model files and uploads them to the Blob Container. The Deployment Script resource will automatically be deleted after one day. caeNamePropertiesPublicNetworkAccess Enabled [Purpose 3: For Testing] ACA requires your local machine to perform tests; therefore, external access must be enabled. appPropertiesConfigurationIngressExternal true [Purpose 3: For Testing] Same as above. appPropertiesConfigurationIngressAllowInsecure true [Purpose 3: For Testing] Same as above. appPropertiesConfigurationIngressTargetPort 8000 [Purpose 3: For Testing] The Triton service container uses port 8000. appPropertiesTemplateContainers0Image nvcr.io/nvidia/tritonserver:22.04-py3 [Purpose 3: For Testing] The Triton service container utilizes this online resource. ARM Template From Azure Portal In addition to using az cli to invoke ARM Templates, if the JSON file is hosted on a public network URL, you can also load its configuration directly into the Azure Portal by following the method described in the article [Deploy to Azure button - Azure Resource Manager]. This is my example. Click Me After filling in all the required information, click Create. And we could have a test once the creation process is complete. 4. Testing Azure Container App In our local environment, use the following command to start a one-time Docker container. We will use NVIDIA's official test image and send a sample image from within it to the Triton service that was just deployed to Container Apps. # Replace XXX.YYY.ZZZ.azurecontainerapps.io with the actual FQDN of your app. There is no need to add https:// docker run --rm nvcr.io/nvidia/tritonserver:22.04-py3-sdk /workspace/install/bin/image_client -u XXX.YYY.ZZZ.azurecontainerapps.io -m densenet_onnx -c 3 -s INCEPTION /workspace/images/mug.jpg After sending the request, you should see the prediction results, indicating that the deployed Triton server service is functioning correctly. 5. Conclusion Beyond basic model hosting, Triton Inference Server's greatest strength lies in its ability to efficiently serve AI models at scale. It supports multiple deep learning frameworks, allowing seamless deployment of diverse models within a single infrastructure. With features like dynamic batching, multi-GPU execution, and optimized inference pipelines, Triton ensures high performance while reducing latency. While it may not replace custom-built inference solutions for highly specialized workloads, it excels as a standardized and scalable platform for deploying AI across cloud and edge environments. Its flexibility makes it ideal for applications such as real-time recommendation systems, autonomous systems, and large-scale AI-powered analytics. 6. References Quickstart — NVIDIA Triton Inference Server Deploying an ONNX Model — NVIDIA Triton Inference Server Model Repository — NVIDIA Triton Inference Server Triton Tutorials — NVIDIA Triton Inference Server650Views0likes0CommentsVulnerability Assessment on Azure Container Registry with Microsoft Defender and Docker Hub
Azure Container Registry: Microsoft Defender for Containers and Docker Scout. Rather than comparing these tools, I want to highlight how they complement each other to enhance container security on ACR. Containers revolutionize the way applications are deployed, providing a consistent and efficient way to package code and dependencies. However, with this convenience comes the critical need for robust security measures. In this post, we'll explore why container security is essential, focusing on preventing malicious code, avoiding vulnerabilities, and managing configuration and deployment risks.3.8KViews3likes2CommentsConfidential containers on Azure Container Instances General Availability
We are announcing the general availability of confidential containers on ACI making it easier to adopt confidential computing with the simplicity and benefits of a fully managed serverless container platform.5.2KViews0likes0CommentsGive us your thoughts on running Kubernetes anywhere for a chance to win a $300 USD gift card!
Do you work with Kubernetes? Are you interested in improving Kubernetes experience across hybrid and multicloud environments? Tell us about your experience for a chance to win a $300 USD gift card.1.5KViews1like0CommentsAnnouncing public preview of confidential containers on Azure Container Instances
We are announcing the public preview of confidential containers on ACI makes it easier to adopt confidential computing with the simplicity and benefits of a fully managed serverless container platform.14KViews3likes0Comments