observability
Model Mondays S2E12: Models & Observability
1. Weekly Highlights

This week’s top news in the Azure AI ecosystem included:

- GPT Real Time (GA): Azure AI Foundry now offers GPT Real Time in general availability, with lifelike voices, improved instruction following, higher audio fidelity, function calling, support for image context, and lower pricing. Read the announcement and check out the model card for more details.
- Azure AI Translator API (Public Preview): Choose between fast Neural Machine Translation (NMT) or nuanced LLM-powered translations, with real-time flexibility for multilingual workflows. Read the announcement, then check out the Azure AI Translator documentation for more details.
- Azure AI Foundry Agents Learning Plan: Build agents with autonomous goal pursuit, memory, collaboration, and deep fine-tuning (SFT, RFT, DPO) on Azure AI Foundry. Read the announcement on what agentic AI involves, then follow this comprehensive learning plan with step-by-step guidance.
- CalcLM Agent Grid (Azure AI Foundry Labs): Project CalcLM: Agent Grid is an open-source prototype that explores how agents might live in a grid-like surface (like Excel). It is formula-first and lightweight, defining agentic workflows like calculations. Try the prototype and visit Foundry Labs to learn more.
- Agent Factory Blog: Observability in Agentic AI: Agentic AI tools and workflows are gaining rapid adoption in the enterprise, but delivering safe, reliable, and performant agents requires foundational support for observability. Read the six-part Agent Factory series and check out the Top 5 agent observability best practices for reliable AI blog post for more details.

2. Spotlight On: Observability in Azure AI Foundry

This week’s spotlight featured a deep dive and demo by Han Che (Senior PM, Core AI, Microsoft), showing end-to-end observability for agent workflows.

Why Observability?
- Ensures AI quality, performance, and safety throughout the development lifecycle.
- Enables monitoring, root cause analysis, optimization, and governance for agents and models.

Key Features & Demos:
- Leaderboard: Pick the best model for your agent with real-time evaluation.
- Playground: Chat with and prototype agents, and view instant quality and safety metrics.
- Evaluators: Assess quality, risk, safety, intent resolution, tool accuracy, code vulnerability, and custom metrics.
- Governance: Integrate with partners like Credo AI and Saidot for policy mapping and evidence archiving.
- Red Teaming Agent: Automatically test for vulnerabilities and unsafe behavior.
- CI/CD Integration: Automate evaluation in GitHub Actions and Azure DevOps pipelines.
- Monitoring Dashboard: Resource usage, application analytics, input/output tokens, request latency, cost breakdown (via Azure Cost Management), and evaluation scores.
- SDKs & Local Evaluation: Run evaluations locally or in the cloud with the Azure AI Evaluation SDK (a local-evaluation sketch follows the demo highlights below).

Demo Highlights:
- Chat with a travel planning agent and view run metrics and tool usage.
- Drill into run details, debugging, and real-time safety/quality scores.
- Configure and run large-scale agent evaluations in CI/CD pipelines.
- Compare agents, review statistical analysis, and monitor in production dashboards.
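To make the local-evaluation path concrete, here is a minimal sketch using the azure-ai-evaluation Python package. It assumes a JSONL file of query/context/response records and an Azure OpenAI deployment for the AI-assisted evaluators; the endpoint, key, deployment, and file names are placeholders, and the exact evaluator names and required columns may differ by SDK version.

```python
# pip install azure-ai-evaluation
# Minimal sketch: run AI-assisted quality evaluators locally over a JSONL dataset.
# The endpoint, key, deployment, and file names below are placeholders.
from azure.ai.evaluation import evaluate, GroundednessEvaluator, RelevanceEvaluator

model_config = {
    "azure_endpoint": "https://<your-aoai-resource>.openai.azure.com",
    "api_key": "<your-api-key>",
    "azure_deployment": "<your-gpt-deployment>",
}

# AI-assisted evaluators use the model above as a judge.
groundedness = GroundednessEvaluator(model_config)
relevance = RelevanceEvaluator(model_config)

# Each line of eval_data.jsonl is expected to contain
# {"query": ..., "context": ..., "response": ...}
result = evaluate(
    data="eval_data.jsonl",
    evaluators={
        "groundedness": groundedness,
        "relevance": relevance,
    },
)

print(result["metrics"])  # aggregate scores; per-row results are in result["rows"]
```

The same script can run as a step in a GitHub Actions or Azure DevOps pipeline, failing the build when an aggregate score drops below a threshold you choose, which is one way to wire the CI/CD integration described above.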
3. Customer Story: Saifr

Saifr is a RegTech company that uses artificial intelligence to streamline compliance for marketing, communications, and creative teams in regulated industries. Incubated at Fidelity Labs (Fidelity Investments’ innovation arm), Saifr helps enterprises create, review, and approve content that meets regulatory standards faster and with less manual effort.

What Saifr Offers:
- AI-Powered Compliance: Saifr’s platform leverages proprietary AI models trained on decades of regulatory expertise to automatically detect potential compliance risks in text, images, audio, and video.
- Automated Guardrails: The solution flags risky or non-compliant language, suggests compliant alternatives, and provides explanations, all in real time.
- Workflow Integration: Saifr integrates seamlessly with enterprise content creation and approval workflows, including cloud platforms and agentic AI systems like Azure AI Foundry.
- Multimodal Support: Goes beyond text to check images, videos, and audio for compliance risks, supporting modern marketing and communications teams.

4. Key Takeaways

- Observability is Essential: Azure AI Foundry offers complete monitoring, evaluation, tracing, and governance for agentic AI, making production safe, reliable, and compliant.
- Built-In Evaluation and Red Teaming: Use leaderboards, evaluators, and red teaming agents to assess and continuously improve model safety and quality.
- CI/CD and Dashboard Integration: Automate evaluations in GitHub Actions or Azure DevOps, then monitor and optimize agents in production with detailed dashboards.
- Compliance Made Easy: Saifr’s agents and models help financial services and other regulated industries proactively meet compliance standards for content and communications.

Sharda's Tips: How I Wrote This Blog

I focus on organizing highlights, summarizing customer stories, and linking to official Microsoft docs and real working resources. For this recap, I explored the Azure AI Foundry observability docs, tested CI/CD pipeline integration, and watched the customer demo to share best practices for regulated industries. Here’s my Copilot prompt for this episode:

"Generate a technical blog post for Model Mondays S2E12 based on the transcript and episode details. Focus on observability, agent dashboards, CI/CD, compliance, and customer stories. Add correct, working Microsoft links!"

Coming Up Next Week

Next week: Open Source Models! Join us for the final episode with Hugging Face’s VP of Product, live demos, and open model workflows. Register for the livestream on Sep 15, 2025.

About Model Mondays

Model Mondays is your weekly Azure AI learning series:
- 5-Minute Highlights: Latest AI news and product updates
- 15-Minute Spotlight: Demos and deep dives with product teams
- 30-Minute AMA Fridays: Ask anything in Discord or the forum

Start building: Watch Past Replays | Register For AMA | Recap Past AMAs

Join The Community

Don’t build alone! The Azure AI Developer Community is here for real-time chats, events, and support: Join the Discord and Explore the Forum.

About Me

I'm Sharda, a Gold Microsoft Learn Student Ambassador focused on cloud and AI. Find me on GitHub, Dev.to, Tech Community, and LinkedIn. In this blog series, I share takeaways from each week’s Model Mondays livestream.
The Future of AI: Harnessing AI agents for Customer Engagements

Discover how AI-powered agents are revolutionizing customer engagement: enhancing real-time support, automating workflows, and empowering human professionals with intelligent orchestration. Explore the future of AI-driven service, including Customer Assist created with Azure AI Foundry.
eBPF-Powered Observability Beyond Azure: A Multi-Cloud Perspective with Retina

Kubernetes simplifies container orchestration but introduces observability challenges due to dynamic pod lifecycles and complex inter-service communication. eBPF technology addresses these issues by providing deep system insights and efficient monitoring. The open-source Retina project leverages eBPF for comprehensive, cloud-agnostic network observability across AKS, GKE, and EKS, enhancing troubleshooting and optimization through real-world demo scenarios.
The Future of AI: Reduce AI Provisioning Effort - Jumpstart your solutions with AI App Templates

In the previous post, we introduced Contoso Chat, an open-source RAG-based retail chat sample for Azure AI Foundry that serves as both an AI App template (for builders) and the basis for a hands-on workshop (for learners). We also briefly covered the five stages of the developer workflow (provision, setup, ideate, evaluate, deploy) that take you from the initial prompt to a deployed product. But how can that sample help you build your app? The answer lies in developer tools and AI App templates that jumpstart productivity by giving you a fast start and a solid foundation to build on. In this post, we answer that question with a closer look at Azure AI App templates: what they are, and how we can jumpstart our productivity with a reuse-and-extend approach that builds on open-source samples for core application architectures.
Automating the Linux Quality Assurance with LISA on Azure

Introduction

Building on the insights from our previous blog on how Microsoft ensures the quality of Linux images, this article elaborates on the open-source tools that are instrumental in securing exceptional performance, reliability, and overall excellence of virtual machines on Azure.

While numerous testing tools are available for validating Linux kernels, guest OS images, and user space packages across various cloud platforms, finding a comprehensive testing framework that addresses the entire platform stack remains a significant challenge. A robust framework is essential: one that seamlessly integrates with Azure's environment, provides coverage for major testing tools such as LTP and kselftest, and covers critical areas like networking, storage, and specialized workloads, including Confidential VMs, HPC, and GPU scenarios. Such a unified testing framework is invaluable for developers, Linux distribution providers, and customers who build custom kernels and images.

This is where LISA (Linux Integration Services Automation) comes into play. LISA is an open-source tool specifically designed to automate and enhance the testing and validation processes for Linux kernels and guest OS images on Azure. In this blog, we cover the history of LISA, its key advantages, the wide range of test cases it supports, and why it is an indispensable resource for the open-source community. Moreover, LISA is available under the MIT License, making it free to use, modify, and contribute to.

History of LISA

LISA was initially developed as an internal tool by Microsoft to streamline the testing of Linux images and kernel validation on Azure. Recognizing the value it could bring to the broader community, Microsoft open-sourced LISA, inviting developers and organizations worldwide to leverage and enhance its capabilities. This move aligned with Microsoft's growing commitment to open-source collaboration, fostering innovation and shared growth within the industry.

LISA serves as a robust solution to validate and certify that Linux images meet the stringent requirements of modern cloud environments. By integrating LISA into the development and deployment pipeline, teams can:

- Enhance Quality Assurance: Catch and resolve issues early in the development cycle.
- Reduce Time to Market: Accelerate deployment by automating repetitive testing tasks.
- Build Trust with Users: Deliver stable and secure applications, bolstering user confidence.
- Collaborate and Innovate: Leverage community-driven improvements and share insights.

Benefits of Using LISA

- Scalability: Designed to run large test passes, from 1 test case to 10,000 test cases in one command.
- Multiple platform orchestration: LISA has a modular design that supports running the same test cases on various platforms, including Microsoft Azure, Windows Hyper-V, bare metal, and other cloud-based platforms.
- Customization: Users can customize test cases, workflows, and other components to fit specific needs, allowing for targeted testing strategies, such as building kernels on the fly or sending results to a custom database.
- Community Collaboration: Being open source under the MIT License, LISA encourages community contributions, fostering continuous improvement and shared expertise.
- Extensive Test Coverage: It offers a rich suite of test cases covering many aspects of Azure and Linux VM compatibility, from kernel, storage, and networking to middleware. A sketch of what a LISA test case looks like follows this list.
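To give a feel for how test logic stays free of environment setup, here is a minimal sketch of a LISA test suite in Python, modeled on the project's documented hello-world style examples. The decorator fields and the exact Node API shown are assumptions that may differ between LISA versions; treat it as an illustration rather than a drop-in test.

```python
# Minimal sketch of a LISA test suite (field names and APIs may vary by LISA version).
from assertpy import assert_that

from lisa import Logger, Node, TestCaseMetadata, TestSuite, TestSuiteMetadata


@TestSuiteMetadata(
    area="demo",
    category="functional",
    description="Smoke checks that run on any SSH-reachable Linux node.",
)
class UnameSmoke(TestSuite):
    @TestCaseMetadata(
        description="Verify the kernel release string is reported by the guest.",
        priority=1,
    )
    def verify_kernel_release(self, node: Node, log: Logger) -> None:
        # LISA connects over SSH; the test only issues commands and asserts on results.
        result = node.execute("uname -r", shell=True)
        log.info(f"kernel release: {result.stdout}")
        assert_that(result.exit_code).is_equal_to(0)
        assert_that(result.stdout).is_not_empty()
```

Because the suite talks to the node only through commands executed over SSH, the same test logic can be orchestrated against Azure, Hyper-V, or bare metal once a platform runbook supplies the machine, which is the design described in the next section.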
How It Works

Infrastructure: LISA is componentized to maximize compatibility across distributions, so test cases can focus purely on test logic. Once test requirements (machines, CPU, memory, etc.) are defined, you write the test logic without worrying about environment setup or stopping services on different distributions.

Orchestration: LISA uses platform APIs to create, modify, and delete VMs. For example, LISA uses the Azure API to create VMs, run test cases, and delete VMs. During a test run, LISA uses the Azure API to collect serial logs and can hot add/remove data disks. If other platforms implement the same serial log and data disk APIs, the same test cases run on those platforms seamlessly.

- Distro compatibility: LISA abstracts over 100 commands used in test cases, allowing focus on validation logic rather than per-distribution differences.
- Pre-processing workflow: Assists in building the kernel on the fly, installing the kernel from package repositories, or modifying all test environments.
- Test matrix: One run can test it all. For example, a single run can cover different VM sizes on Azure, different images, or different VM sizes and images together. Anything that is parameterizable can be tested in a matrix.
- Customizable notifiers: Enable saving test results and files to any type of storage or database.

Agentless and Low Dependency

LISA operates test systems via SSH without requiring additional dependencies, ensuring compatibility with any system that supports SSH. Although some test cases require installing extra dependencies, LISA itself does not. This allows LISA to perform tests on systems with limited resources or even different operating systems; for instance, LISA can run on Linux, FreeBSD, Windows, and ESXi.

Getting Started with LISA

Ready to dive in? Visit the LISA project at aka.ms/lisa to access the documentation.

- Install: Follow the installation guide provided in the repository to set up LISA in your testing environment.
- Run: Follow the instructions to run LISA on a local machine, on Azure, or against existing systems.
- Extend: Follow the documentation to extend LISA with test cases, data sources, tools, platforms, workflows, and more.
- Join the Community: Engage with other users and contributors through forums and discussions to share experiences and best practices.
- Contribute: Modify existing test cases or create new ones to suit your needs, and share your contributions with the community to enhance LISA's capabilities.

Conclusion

LISA offers open-source, collaborative testing solutions designed to operate across diverse environments and scenarios, effectively narrowing the gap between enterprise demands and community-led innovation. By leveraging LISA, customers can ensure their Linux deployments are reliable and optimized for performance. Its comprehensive testing capabilities, combined with the flexibility and support of an active community, make LISA an indispensable tool for anyone involved in Linux quality assurance and testing. Your feedback is invaluable, and we would greatly appreciate your insights.
Effective Cloud Governance: Leveraging Azure Activity Logs with Power BI

We all generally accept that governance in the cloud is a continuous journey, not a destination. There's no one-size-fits-all solution, and depending on the size of your Azure cloud estate, staying on top of things can be challenging even at the best of times. One way of keeping your finger on the pulse is to closely monitor your Azure Activity Log. This log contains a wealth of information ranging from noise to interesting to actionable data. One could set up alerts for delete and update signals; however, that can result in a flood of notifications.

To address this challenge, you could develop a Power BI report, similar to this one, that pulls in the Azure Activity Log and allows you to group and summarize data by various dimensions. You still need someone to review the report regularly, but consuming the data this way makes it a whole lot easier. This by no means replaces the need for setting up alerts for key signals; however, it does give you a great view of what's happened in your environment. If you're interested, this is the KQL query I'm using in Power BI:

```kusto
let start_time = ago(24h);
let end_time = now();
AzureActivity
| where TimeGenerated > start_time and TimeGenerated < end_time
| where OperationNameValue contains 'WRITE' or OperationNameValue contains 'DELETE'
| project TimeGenerated, Properties_d.resource, ResourceGroup, OperationNameValue,
    Authorization_d.scope, Authorization_d.action, Caller, CallerIpAddress, ActivityStatusValue
| order by TimeGenerated asc
```
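If you prefer to pull the same Activity Log data programmatically, for example to feed a scheduled export alongside the Power BI report, a Log Analytics workspace that receives the Activity Log can be queried with the azure-monitor-query Python SDK. This is a minimal sketch under that assumption; the workspace ID is a placeholder and the query mirrors the one above.

```python
# pip install azure-monitor-query azure-identity
# Sketch: run a similar Activity Log query against a Log Analytics workspace.
# Assumes the Activity Log is routed to the workspace via diagnostic settings.
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient, LogsQueryStatus

QUERY = """
AzureActivity
| where OperationNameValue contains 'WRITE' or OperationNameValue contains 'DELETE'
| project TimeGenerated, ResourceGroup, OperationNameValue, Caller, CallerIpAddress, ActivityStatusValue
| order by TimeGenerated asc
"""

client = LogsQueryClient(DefaultAzureCredential())
response = client.query_workspace(
    workspace_id="<your-log-analytics-workspace-id>",  # placeholder
    query=QUERY,
    timespan=timedelta(hours=24),  # same 24-hour window as the report
)

if response.status == LogsQueryStatus.SUCCESS:
    for table in response.tables:
        for row in table.rows:
            print(dict(zip(table.columns, row)))
```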
How Microsoft Ensures the Quality of Linux VM Images and Platform Experiences on Azure?

In the continuously evolving landscape of cloud computing and AI, the quality and reliability of virtual machines (VMs) play a vital role for businesses running mission-critical workloads. With over 65% of Azure workloads running Linux, our commitment to delivering high-quality Linux VM images and platforms remains unwavering. This involves overcoming unique challenges and implementing rigorous validation processes to ensure that every Linux VM image offered on Azure meets high standards of quality and reliability.

Ensuring the quality of Linux images and the overall platform experience on Azure involves addressing the challenges posed by a unique platform stack and the complexity of managing and validating multiple independent release cycles. High-quality Linux VMs are essential for ensuring consistent performance, minimizing downtime and regressions, and enhancing security by addressing vulnerabilities with timely updates.

Figure 1: Complexity of Linux VMs in Azure

- VM Image Updates: Azure's Marketplace offers a diverse array of Linux distributions, each maintained by its respective publisher. These distributions release updates on their own schedules, independent of Azure's infrastructure updates.
- Package Updates: Within each Linux distribution, numerous packages are maintained and updated separately, adding another layer of complexity to the update and validation process.
- Extension and Agent Updates: Azure provides more than 75 guest VM extensions to enhance operating system capabilities, security, recovery, and more. These extensions are updated independently, requiring careful validation to ensure compatibility and stability.
- Azure Infrastructure Updates: Azure regularly updates its underlying infrastructure, including components like Azure Boost, to improve reliability, performance, and security.
- VM SKUs and Sizes: Azure provides thousands of VM sizes with various combinations of CPU, memory, disk, and network configurations to meet diverse customer needs.

Managing concurrent updates across all VMs poses significant QA challenges. To address this, Azure uses rigorous testing, gating, and validation processes to ensure all components function reliably and meet customer expectations.

Azure's Approach to Overcoming Challenges

To address these challenges, we have implemented a comprehensive validation strategy that involves testing at every stage of the image and kernel lifecycle. By adopting a shift-left approach, we execute Linux VM-specific test cases as early as possible. This strategy helps us catch failures close to the source of changes, before they are deployed to the Azure fleet. Our validation gates integrate with various entry points and provide coverage for a wide variety of scenarios on Azure.

- Upstream Kernel Validation: As a founding member of KernelCI, Microsoft validates commits from the Linux next and stable trees using Linux VMs in Azure and shares results with the community via the KernelCI database. This enables us to detect regressions at early stages.
- Azure-Tuned Kernel Validation: Azure-tuned kernels provided by our endorsed distribution partners are thoroughly validated and signed off by Microsoft before they are released to the Azure fleet.
- Linux Guest Image Validation: The quality team works with endorsed distribution partners on major releases to conduct thorough validation. Each refreshed image, including those from third-party publishers, is validated and certified before being added to the Marketplace.
  Automated pipelines are in place to validate the images once they are available in the Marketplace.
- Package Validation (Unattended Update): We validate package updates against the target distribution to prevent regressions and ensure that only tested snapshots are used to update Linux VMs in Azure.
- Guest Extension Validation: Every Azure-provided extension undergoes Basic Validation Testing (BVT) across all images and kernel versions to ensure compatibility and functionality amidst any changes. Additionally, comprehensive release testing is conducted for major releases to maintain reliability and compatibility.
- New VM SKU Validation: Any new VM SKU undergoes validation to confirm it supports Linux before its release to the Azure fleet. This process includes functionality, performance, and stress testing across various Linux distributions, plus compatibility tests with existing Linux images in the fleet.
- Azure Host OS & Host Agent Validation: Updates to the Azure Host OS and agents are thoroughly tested from the Linux guest OS perspective to confirm that changes in the Azure host environment do not cause regressions in compatibility, performance, or stability for Linux VMs.

At any stage where regressions or bugs are identified, we block those releases to ensure they never reach customers. All issues are resolved and rigorously retested before images, kernels, or extension updates are made available. Through these robust validation processes, Azure ensures that Linux VMs consistently deliver on customer expectations, providing a reliable, secure, and high-performance environment for mission-critical workloads.

Validation Tools for VM Guest Images and Kernels

To ensure the quality and reliability of Linux VM images and kernels on Azure, we leverage open-source kernel testing frameworks like LTP, kselftest, and fstest, along with extensive Azure-specific test cases available in LISA, to comprehensively validate all aspects of the platform.

LISA (Linux Integration Services Automation): Microsoft is committed to open source, and that is no different with our testing framework, LISA. LISA is an open-source core testing framework designed to meet all Linux validation needs. It includes over 400 tests covering performance, features, and security, ensuring comprehensive validation of Linux images on Azure. By automating diverse test scenarios, LISA enables early detection and resolution of issues, enhancing the stability and performance of Linux VMs.

Conclusion

At Azure, Linux quality is a fundamental aspect of our commitment to delivering reliable VM images and platforms. Through comprehensive testing and strong collaboration with Linux distribution partners, we ensure the quality and reliability of VMs while proactively identifying and resolving potential issues. This approach allows us to continually refine our processes and maintain the quality that customers expect from Azure. Quality is a core focus, and we remain dedicated to continuous improvement, delivering world-class Linux environments to businesses and customers. For us, quality is not just a priority; it's our standard. Your feedback is invaluable, and we would greatly appreciate your insights.
AI reports: Improve AI governance and GenAIOps with consistent documentation

AI reports are designed to help organizations improve cross-functional observability, collaboration, and governance when developing, deploying, and operating generative AI applications and fine-tuned or custom models. These reports support AI governance best practices by helping developers document the purpose of their AI model or application, its features, potential risks or harms, and applied mitigations, so that cross-functional teams can track and assess production-readiness throughout the AI development lifecycle and then monitor it in production. Starting in December, AI reports will be available in private preview in a US and EU Azure region for Azure AI Foundry customers. To request access to the private preview of AI reports, please complete the Interest Form.

Furthermore, we are excited to announce new collaborations with Credo AI and Saidot to support customers' end-to-end AI governance. By integrating the best of Azure AI with innovative and industry-leading AI governance solutions, we hope to provide our customers with choice and help empower greater cross-functional collaboration to align AI solutions with their own principles and regulatory requirements.

Building on learnings at Microsoft

Microsoft's approach for governing generative AI applications builds on our Responsible AI Standard and the National Institute of Standards and Technology's AI Risk Management Framework. This approach requires teams to map, measure, and manage risks for generative applications throughout their development cycle. A core asset of the first (and iterative) map phase is the Responsible AI Impact Assessment. These assessments help identify potential risks and their associated harms, as well as mitigations to address them. As development of an AI system progresses, additional iterations can help development teams document their progress in risk mitigation and allow experts to review the evaluations and mitigations and make further recommendations or requirements before products are launched. Post-deployment, these assessments become a source of truth for ongoing governance and audits, and help guide how to monitor the application in production. You can learn more about Microsoft's approach to AI governance in our Responsible AI Transparency Report and find a Responsible AI Impact Assessment Guide and example template on our website.

How AI reports support AI impact assessments and GenAIOps

AI reports can help organizations govern their GenAI models and applications by making it easier for developers to provide the information needed for cross-functional teams to assess production-readiness throughout the GenAIOps lifecycle. Developers will be able to assemble key project details, such as the intended business use case, potential risks and harms, model card, model endpoint configuration, content safety filter settings, and evaluation results, into a unified AI report from within their development environment. Teams can then publish these reports to a central dashboard in the Azure AI Foundry portal, where business leaders can track, review, update, and assess reports from across their organization. Users can also export AI reports in PDF and industry-standard SPDX 3.0 AI BOM formats for integration into existing GRC workflows. These reports can then be used by the development team, their business leaders, and AI, data, and other risk professionals to determine whether an AI model or application is fit for purpose and ready for production as part of their AI impact assessment processes.
Being versioned assets, AI reports can also help organizations build a consistent bridge across experimentation, evaluation, and GenAIOps by documenting which metrics were evaluated, what will be monitored in production, and the thresholds that will be used to flag an issue for incident response. For even greater control, organizations can choose to implement a release gate or policy as part of their GenAIOps that validates whether an AI report has been reviewed and approved for production.

Key benefits of these capabilities include:
- Observability: Provide cross-functional teams with a shared view of AI models and applications in development, in review, and in production, including how these projects perform in key quality and safety evaluations.
- Collaboration: Enable consistent information-sharing between GRC, development, and operational teams using a consistent and extensible AI report template, accelerating feedback loops and minimizing non-coding time for developers.
- Governance: Facilitate responsible AI development across the GenAIOps lifecycle, reinforcing consistent standards, practices, and accountability as projects evolve or expand over time.

Build production-ready GenAI apps with Azure AI Foundry

If you are interested in testing AI reports and providing feedback to the product team, please request access to the private preview by completing the Interest Form. Want to learn more about building trustworthy GenAI applications with Azure AI? Here's more guidance and exciting announcements to support your GenAIOps and governance workflows from Microsoft Ignite:
- Learn about new GenAI evaluation capabilities in Azure AI Foundry
- Learn about new GenAI monitoring capabilities in Azure AI Foundry
- Learn about new IT governance capabilities in Azure AI Foundry

Whether you're joining in person or online, we can't wait to see you at Microsoft Ignite 2024. We'll share the latest from Azure AI and go deeper into capabilities that support trustworthy AI with these sessions:
- Keynote: Microsoft Ignite Keynote
- Breakout: Trustworthy AI: Future trends and best practices
- Breakout: Trustworthy AI: Advanced AI risk evaluation and mitigation
- Demo: Simulate, evaluate, and improve GenAI outputs with Azure AI Foundry
- Demo: Track and manage GenAI app risks with AI reports in Azure AI Foundry

We'll also be available for questions in the Connection Hub on Level 3, where you can find "ask the expert" stations for Azure AI and Trustworthy AI.