# Microsoft’s A-Grade Azure AI Stack: From Dissertation Prototype to Smart Campus Pilot
This post isn't just about the Student Support Agent (SSA) I built, which earned me a Distinction. It's about how Microsoft's tools made it possible to go from a rough concept to a robust pilot, proving their developer stack is one of the most convenient and powerful options for building intelligent, ethical, and scalable educational systems.

## The Vision: Cutting Through Campus Complexity

University life is full of fragmented systems. Students constantly juggle multiple logins, websites, and interfaces just to check a timetable, book a room, or find a policy. My goal was simple: reduce that cognitive load by creating a unified assistant that could manage all these tasks through a single, intelligent conversation.

## The Stack That Made It Possible

The core of the system relied on a few key, interconnected technologies:

| Technology | Core Function | Impact |
| --- | --- | --- |
| Azure AI Search | Hybrid Data Retrieval | Anchored responses in official documents. |
| Azure OpenAI | Natural Language Generation | Created human-like, accurate answers. |
| Semantic Kernel (SK) | Multi-Agent Orchestration | Managed complex workflows and memory. |
| Azure Speech SDK | Multimodal Interface | Enabled accessible voice input and output. |

The foundation was built using Streamlit and FastAPI for rapid prototyping. Building a system that's context-aware, accessible, and extensible is a huge challenge, but it's exactly where the Microsoft AI stack shined.

## From Simple Chatbot to Multi-Agent Powerhouse

Early campus chatbots are often single-agent models: great for basic FAQs, but they quickly fail when tasks span multiple services. I used Semantic Kernel (SK), Microsoft's powerful open-source framework, to build a modular, hub-and-spoke multi-agent system. A central orchestrator routes a request (like "book a study room") to a specialist agent (the Booking Agent) that knows exactly how to handle that task. This modularity was a game-changer: I could add new features (like an Events Agent) without breaking the core system, ensuring the architecture stayed clean and ready for expansion.

## Agentic Retrieval-Augmented Generation (Agentic RAG): Trust and Transparency

To ensure the assistant was trustworthy, I used Agentic RAG to ground responses in real campus (Imperial College London) documentation, covering everything from admission fee payments to campus shuttle times. Azure AI Search indexed all handbooks and policies, allowing the assistant to pull relevant chunks of data and then cite the sources directly in its response.

Result: the system avoids common hallucinations by refusing to answer when confidence is low. Students can verify every piece of advice, dramatically improving trust and transparency.
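The retrieve-then-cite loop described above can be sketched in a few lines. This is an illustrative sketch, not the SSA's actual code: the index name, field names, score threshold, and deployment name are all assumptions.

```python
# Illustrative sketch of the retrieve-then-cite loop (Agentic RAG).
# Index name, field names, score threshold, and deployment name are assumptions.
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from openai import AzureOpenAI

search = SearchClient(
    endpoint="https://<search>.search.windows.net",
    index_name="campus-docs",  # hypothetical index of handbooks and policies
    credential=AzureKeyCredential("<search-key>"),
)
llm = AzureOpenAI(
    azure_endpoint="https://<aoai>.openai.azure.com",
    api_key="<aoai-key>",
    api_version="2024-02-01",
)

def answer(question: str) -> str:
    hits = list(search.search(search_text=question, top=3))
    # Refuse before calling the LLM when retrieval confidence is low.
    if not hits or hits[0]["@search.score"] < 1.0:  # threshold is an assumption
        return "I couldn't find this in official campus documentation."
    context = "\n".join(f"[{h['title']}] {h['content']}" for h in hits)
    reply = llm.chat.completions.create(
        model="gpt-4o",  # your Azure OpenAI deployment name
        messages=[
            {"role": "system",
             "content": "Answer ONLY from these sources and cite [title]:\n" + context},
            {"role": "user", "content": question},
        ],
    )
    return reply.choices[0].message.content
```

The design point is that the refusal branch runs before generation, so low-confidence questions never reach the model at all.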
## Results: A Foundation for Scalable Support

A pilot study with 15 students was highly successful:

- 100% positive feedback on ease of use and perceived benefit.
- 93% satisfaction with the voice features.
- High trust, established through transparent citations.

The SSA proved it could save students time by centralising tasks like booking rooms, checking policies, and offering study tips!

## Final Thoughts

Microsoft’s AI ecosystem didn’t just support my dissertation; it shaped it. The tools were reliable, well-documented, and flexible enough to handle real-world complexity. More importantly, they allowed me to focus on student experience, ethics, and pedagogy, rather than wrestling with infrastructure.

If you’re a student, educator, or developer looking to build intelligent systems that are transparent, inclusive, and scalable, Microsoft’s AI stack is a great place to start! 🙋🏽‍♀️

## About Me

I’m Tyana Tshiota, a postgraduate student in Applied Computational Science and Engineering at Imperial College London. Leveraging Microsoft’s AI stack and the extensive documentation on Microsoft Learn played a key role in achieving a Distinction in my dissertation. Moving forward, I’m excited to deepen my expertise by pursuing Azure certifications.

I’d like to extend my sincere gratitude to my supervisor, Lee_Stott, for his invaluable mentorship and support throughout this project. If you haven’t already, check out his insightful posts on the Educator Developer Blog, or try building your own agent with the AI Agents for Beginners curriculum developed by Lee and his team! You can reach out via my LinkedIn if you’re interested in smart campus systems, AI in education, collaborative development, or would like to discuss opportunities.
# Edge AI for Student Developers: Learn to Run AI Locally

AI isn’t just for the cloud anymore. With the rise of Small Language Models (SLMs) and powerful local inference tools, developers can now run intelligent applications directly on laptops, phones, and edge devices, no internet required. If you're a student developer curious about building AI that works offline, privately, and fast, Microsoft’s Edge AI for Beginners course is your perfect starting point.

## What Is Edge AI?

Edge AI refers to running AI models directly on local hardware, like your laptop, mobile device, or embedded system, without relying on cloud servers. This approach offers:

- ⚡ Real-time performance
- 🔒 Enhanced privacy (no data leaves your device)
- 🌐 Offline functionality
- 💸 Reduced cloud costs

Whether you're building a chatbot that works without Wi-Fi or optimizing AI for low-power devices, Edge AI is the future of intelligent, responsive apps.

## About the Course

Edge AI for Beginners is a free, open-source curriculum designed to help you:

- Understand the fundamentals of Edge AI and local inference
- Explore Small Language Models like Phi-2, Mistral-7B, and Gemma
- Deploy models using tools like Llama.cpp, Olive, MLX, and OpenVINO
- Build cross-platform apps that run AI locally on Windows, macOS, Linux, and mobile

The course is hosted on GitHub and includes hands-on labs, quizzes, and real-world examples. You can fork it, remix it, and contribute to the community.

## What You’ll Learn

| Module | Focus |
| --- | --- |
| 01. Introduction | What is Edge AI and why it matters |
| 02. SLMs | Overview of small language models |
| 03. Deployment | Running models locally with various tools |
| 04. Optimization | Speeding up inference and reducing memory |
| 05. Applications | Building real-world Edge AI apps |

Each module is beginner-friendly and includes practical exercises to help you build and deploy your own local AI solutions.

## Who Should Join?

- Student developers curious about AI beyond the cloud
- Hackathon participants looking to build offline-capable apps
- Makers and builders interested in privacy-first AI
- Anyone who wants to explore the future of on-device intelligence

No prior AI experience is required, just a willingness to learn and experiment.

## Why It Matters

Edge AI is a game-changer for developers. It enables smarter, faster, and more private applications that work anywhere. By learning how to deploy AI locally, you’ll gain skills that are increasingly in demand across industries, from healthcare to robotics to consumer tech. Plus, the course is:

- 💯 Free and open-source
- 🧠 Backed by Microsoft’s best practices
- 🧪 Hands-on and project-based
- 🌐 Continuously updated

## Ready to Start?

Head to aka.ms/edgeai-for-beginners and dive into the modules. Whether you're coding in your dorm room or presenting at your next hackathon, this course will help you build smarter AI apps that run right where you need them: on the edge.
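For a taste of what the deployment modules feel like, here is a minimal local-inference sketch using llama-cpp-python, the Python binding for Llama.cpp mentioned above. The model path is an assumption: any GGUF-format SLM you have downloaded locally will work.

```python
# Minimal local-inference sketch with llama-cpp-python (pip install llama-cpp-python).
# The model file path is a hypothetical local download; any GGUF-format SLM works.
from llama_cpp import Llama

llm = Llama(model_path="./models/phi-2.Q4_K_M.gguf", n_ctx=2048)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain edge AI in one sentence."}],
    max_tokens=64,
)
# Runs fully offline once the model file is on disk.
print(out["choices"][0]["message"]["content"])
```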
# Model Mondays S2E13: Open Source Models (Hugging Face)

## 1. Weekly Highlights

Here are the key updates we covered in the Season 2 finale:

- O1 Mini Reinforcement Fine-Tuning (GA): Fine-tune models with as few as ~100 samples using built-in Python code graders.
- Azure Live Interpreter API (Preview): Real-time speech-to-speech translation supporting 76 input languages and 143 locales with near human-level latency.
- Agent Factory – Part 5: Connecting agents using open standards like MCP (Model Context Protocol) and A2A (Agent-to-Agent protocol).
- Ask Ralph by Ralph Lauren: A retail example of agentic AI for conversational styling assistance, built on Azure OpenAI and Foundry’s agentic toolset.
- VS Code August Release: Brings auto-model selection, stronger safety guards for sensitive edits, and improved agent workflows through new agents.md support.

## 2. Spotlight – Open Source Models in Azure AI Foundry

Guest: Jeff Boudier, VP of Product at Hugging Face

Jeff showcased the deep integration between the Hugging Face community and Azure AI Foundry, where developers can access over 10,000 open-source models across multiple modalities: LLMs, speech recognition, computer vision, and even specialized domains like protein modeling and robotics.

Demo highlights:

- Discover models through Azure AI Foundry’s task-based catalog filters.
- Deploy directly from Hugging Face Hub to Azure with one-click deployment.
- Explore use cases such as multilingual speech recognition and vision-language-action models for robotics.

Jeff also highlighted notable models, including:

- SmolLM3 – a 3B-parameter model with hybrid reasoning capabilities
- Qwen3 Coder – a mixture-of-experts model optimized for coding tasks
- Parakeet ASR – multilingual speech recognition
- The Microsoft Research protein-modeling collection
- MAGMA – a vision-language-action model for robotics

Integration extends beyond deployment to programmatic access through the Azure CLI and Python SDKs, plus local development via new VS Code extensions.
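Once an open model is deployed from the catalog, calling it from Python looks much like calling any other Foundry endpoint. Here is a minimal sketch using the azure-ai-inference package; the endpoint URL and key are placeholders, not values from the episode.

```python
# Minimal sketch: chat with an open model deployed via Azure AI Foundry
# (pip install azure-ai-inference). Endpoint and key are illustrative assumptions.
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint="https://<your-deployment-endpoint>",  # hypothetical serverless endpoint
    credential=AzureKeyCredential("<your-key>"),
)

response = client.complete(
    messages=[
        SystemMessage(content="You are a concise coding assistant."),
        UserMessage(content="Write a one-line Python list comprehension that squares 1..10."),
    ],
)
print(response.choices[0].message.content)
```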
## 3. Customer Story – DraftWise (BUILD 2025 Segment)

The finale featured a customer spotlight on DraftWise, where CEO James Ding shared how the company accelerates contract drafting with Azure AI Foundry.

- Problem: Legal contract drafting is time-consuming and error-prone.
- Solution: DraftWise uses Azure AI Foundry to fine-tune Hugging Face language models on legal data, generating contract drafts and redline suggestions.
- Impact: Faster drafting cycles and higher consistency, easy model management and deployment with Foundry’s secure workflows, and transparent evaluation for legal compliance.

## 4. Community Story – Hugging Face & Microsoft

The episode also celebrated the ongoing collaboration between Hugging Face and Microsoft and the impact of open-source AI on the global developer ecosystem. Community benefits include:

- Access to state-of-the-art models without licensing barriers
- Transparent performance through public leaderboards and benchmarks
- Rapid innovation as improvements and bug fixes spread quickly
- Education and empowerment via tutorials, docs, and active forums
- Responsible AI practices encouraged through community oversight

## 5. Key Takeaways

- Open Source AI Is Here to Stay: Azure AI Foundry and Hugging Face make deploying, fine-tuning, and benchmarking open models easier than ever.
- Community Drives Innovation: Collaboration accelerates progress, improves transparency, and makes AI accessible to everyone.
- Responsible AI and Transparency: Open-source models come with clear documentation, licensing, and community-driven best practices.
- Easy Deployment & Customization: Azure AI Foundry lets you deploy, automate, and customize open models from a single, unified platform.
- Learn, Build, Share: The open-model ecosystem is a great place for students, developers, and researchers to learn, build, and share their work.

## Sharda's Tips: How I Wrote This Blog

For this final recap, I focused on capturing the energy of the open-source AI movement and the practical impact of the Hugging Face and Azure AI Foundry collaboration. I watched the livestream, took notes on the demos and interviews, and linked directly to official resources for models, docs, and community sites. Here’s my Copilot prompt for this episode:

"Generate a technical blog post for Model Mondays S2E13 based on the transcript and episode details. Focus on open source models, Hugging Face, Azure AI Foundry, and community workflows. Include practical links and actionable insights for developers and students!"

## Learn & Connect

- Explore Open Models in Azure AI Foundry
- Hugging Face Leaderboard
- Responsible AI in Azure Machine Learning
- Llama-3 by Meta
- Hugging Face Community
- Azure AI Documentation

## About Model Mondays

Model Mondays is your weekly Azure AI learning series:

- 5-Minute Highlights: Latest AI news and product updates
- 15-Minute Spotlight: Demos and deep dives with product teams
- 30-Minute AMA Fridays: Ask anything in Discord or the forum

Start building:

- Watch Past Replays
- Register For AMA
- Recap Past AMAs

## Join The Community

Don’t build alone! The Azure AI Developer Community is here for real-time chats, events, and support:

- Join the Discord
- Explore the Forum

## About Me

I'm Sharda, a Gold Microsoft Learn Student Ambassador focused on cloud and AI. Find me on GitHub, Dev.to, Tech Community, and LinkedIn. In this blog series, I share takeaways from each week’s Model Mondays livestream.
# Model Mondays S2E01 Recap: Advanced Reasoning Session

## About Model Mondays

Want to know what reasoning models are and how you can build advanced reasoning scenarios like a Deep Research agent using Azure AI Foundry? Check out this recap from Model Mondays Season 2 Episode 1. Model Mondays is a weekly series to help you build your model IQ in three steps:

1. Catch the 5-min Highlights on Monday, to get up to speed on model news
2. Catch the 15-min Spotlight on Monday, for a deep-dive into a model or tool
3. Catch the 30-min AMA on Friday, for a Q&A session with subject matter experts

Want to follow along?

- Register Here - to watch upcoming livestreams for Season 2
- Visit The Forum - to see the full AMA schedule for Season 2
- Register Here - to join the AMA on Friday Jun 20

## Spotlight On: Advanced Reasoning

This week, the Model Mondays spotlight was on Advanced Reasoning with subject matter expert Marlene Mhangami. In this blog post, I'll talk about my five takeaways from this episode:

1. Why Are Reasoning Models Important?
2. What Does Advanced Reasoning Involve?
3. How Can I Get Started with Reasoning Models?
4. Spotlight: My Aha Moment
5. Highlights: What’s New in Azure AI

## 1. Why Are Reasoning Models Important?

In today's fast-evolving AI landscape, it's no longer enough for models to just complete text or summarize content. We need AI that can:

- Understand multi-step tasks
- Make decisions based on logic
- Plan sequences of actions or queries
- Connect context across turns

Reasoning models are large language models (LLMs) trained with reinforcement learning techniques to "think" before they answer. Rather than simply generating a response based on probability, these models follow an internal thought process, producing a chain of reasoning before responding. This makes them ideal for complex problem-solving tasks, and they’re the foundation of building intelligent, context-aware agents. They enable next-gen AI workflows in everything from customer support to legal research and healthcare diagnostics. The reason: they allow AI to go beyond surface-level response and deliver solutions that reflect understanding, not just language patterning.

## 2. What Does Advanced Reasoning Involve?

An advanced reasoning scenario is one where a model:

- Breaks a complex prompt into smaller steps
- Retrieves relevant external data
- Uses logic to connect dots
- Outputs a structured, reasoned answer

Example: a user asks, "What are the financial and operational risks of expanding a startup to Southeast Asia in 2025?" This is the kind of question that requires extensive research and analysis. A reasoning model might tackle this by:

- Retrieving reports on Southeast Asia market conditions
- Breaking down risks into financial, political, and operational buckets
- Cross-referencing data with recent trends
- Returning a reasoned, multi-part answer

## 3. How Can I Get Started with Reasoning Models?

To get started, visit a catalog that has examples of these models:

- Try the GitHub Models Marketplace and look for the reasoning category in the filter.
- Try the Azure AI Foundry model catalog and look for reasoning models by name, for example the o-series of models from Azure OpenAI, the DeepSeek-R1 models, the Grok 3 models, and the Phi-4 reasoning models.

Next, you can use SDKs or the Playground to explore model capabilities (a minimal code sketch follows this list):

1. Try Lab 331 - for a beginner-friendly guide.
2. Try Lab 333 - for an advanced project.
3. Try the GitHub Model Playground - to compare reasoning and GPT models.
4. Try the Deep Research Agent using LangChain - sample as a great starting project.
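If you prefer to try a reasoning model from code rather than the Playground, here is a minimal sketch against the GitHub Models endpoint, which speaks the OpenAI chat API. The model name and token variable are assumptions; any reasoning model from the catalog can be substituted.

```python
# Minimal sketch: calling a reasoning model through GitHub Models (pip install openai).
# Assumes a GITHUB_TOKEN environment variable; the model name is illustrative.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://models.inference.ai.azure.com",  # GitHub Models endpoint
    api_key=os.environ["GITHUB_TOKEN"],
)

response = client.chat.completions.create(
    model="DeepSeek-R1",  # pick any reasoning model from the catalog
    messages=[{"role": "user", "content": (
        "What are the financial and operational risks of expanding "
        "a startup to Southeast Asia in 2025? Answer in three bullets."
    )}],
)
# Reasoning models typically emit their chain of thought before the final answer.
print(response.choices[0].message.content)
```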
Have questions or comments? Join the Friday AMA on the Azure AI Foundry Discord.

## 4. Spotlight: My Aha Moment

Before this session, I thought reasoning meant longer or more detailed responses. But this session helped me realize that reasoning means structured thinking: models now plan, retrieve, and respond with logic. This inspired me to think about building AI agents that go beyond chat and actually assist users like a teammate. It also made me want to dive deeper into LangChain + Azure AI workflows to build mini-agents for real-world use.

## 5. Highlights: What’s New in Azure AI

Here’s what’s new in Azure AI Foundry:

- Direct From Azure Models - Try hosted models like OpenAI GPT on PTU plans
- SORA Video Playground - Generate video from prompts via SORA models
- Grok 3 Models - Now available for secure, scalable LLM experiences
- DeepSeek R1-0528 - A reasoning-optimized, Microsoft-tuned open-source model

These are all available in the Azure Model Catalog and can be tried with your Azure account.

## Did You Know?

Your first step is to find the right model for your task. But what if the model could be automatically selected for you, based on the prompt you provide? That's the magic of Model Router, a deployable AI chat model that dynamically selects the best LLM based on your prompt. Instead of choosing one model manually, the Router makes that choice in real time. Currently, this works with a fixed set of Azure OpenAI models, including a reasoning model option. Keep an eye on the documentation for more updates. Why it’s powerful:

- Saves cost by switching between models based on complexity
- Optimizes performance by selecting the right model for the task
- Lets you test and compare model outputs quickly

Try it out in Azure AI Foundry or read more in the Model Catalog.

## Coming Up Next

Next week, we dive into Model Context Protocol, an open protocol that empowers agentic AI applications by making it easier to discover and integrate knowledge and action tools with your model choices. Register Here to get reminded, and join us live on Monday!

## Join The Community

Great devs don't build alone! In a fast-paced developer ecosystem, there's no time to hunt for help. That's why we have the Azure AI Developer Community. Join us today and let's journey together!

- Join the Discord - for real-time chats, events & learning
- Explore the Forum - for AMA recaps, Q&A, and help!

## About Me

I'm Sharda, a Gold Microsoft Learn Student Ambassador interested in cloud and AI. Find me on GitHub, Dev.to, Tech Community, and LinkedIn. In this blog series, I summarize my takeaways from each week's Model Mondays livestream.
# Model Mondays S2:E7 · AI-Assisted Azure Development

Welcome to Episode 7! This week, we explore how AI is transforming Azure development. We’ll break down two key tools, Azure MCP Server and GitHub Copilot for Azure, and see how they make working with Azure resources easier for everyone. We’ll also look at a real customer story from SightMachine, showing how AI streamlines manufacturing operations.
# Model Mondays S2E12: Models & Observability

## 1. Weekly Highlights

This week’s top news in the Azure AI ecosystem included:

- GPT Real Time (GA): Azure AI Foundry now offers GPT Real Time in general availability, with lifelike voices, improved instruction following, better audio fidelity, and function calling, plus support for image context and lower pricing. Read the announcement and check out the model card for more details.
- Azure AI Translator API (Public Preview): Choose between fast Neural Machine Translation (NMT) or nuanced LLM-powered translations, with real-time flexibility for multilingual workflows. Read the announcement, then check out the Azure AI Translator documentation for more details.
- Azure AI Foundry Agents Learning Plan: Build agents with autonomous goal pursuit, memory, collaboration, and deep fine-tuning (SFT, RFT, DPO) on Azure AI Foundry. Read the announcement on what agentic AI involves, then follow this comprehensive learning plan with step-by-step guidance.
- CalcLM Agent Grid (Azure AI Foundry Labs): Project CalcLM: Agent Grid is a prototype and open-source experiment that illustrates how agents might live in a grid-like surface (like Excel). It's formula-first and lightweight, defining agentic workflows like calculations. Try the prototype and visit Foundry Labs to learn more.
- Agent Factory Blog: Observability in Agentic AI: Agentic AI tools and workflows are gaining rapid adoption in the enterprise, but delivering safe, reliable, and performant agents requires foundational support for observability. Read the 6-part Agent Factory series and check out the "Top 5 agent observability best practices for reliable AI" blog post for more details.

## 2. Spotlight On: Observability in Azure AI Foundry

This week’s spotlight featured a deep dive and demo by Han Che (Senior PM, Core AI, Microsoft), showing observability end-to-end for agent workflows.

Why observability?

- Ensures AI quality, performance, and safety throughout the development lifecycle.
- Enables monitoring, root cause analysis, optimization, and governance for agents and models.

Key features and demos across the development lifecycle:

- Leaderboard: Pick the best model for your agent with real-time evaluation.
- Playground: Chat and prototype agents, view instant quality and safety metrics.
- Evaluators: Assess quality, risk, safety, intent resolution, tool accuracy, code vulnerability, and custom metrics.
- Governance: Integrate with partners like Cradle AI and SideDot for policy mapping and evidence archiving.
- Red Teaming Agent: Automatically test for vulnerabilities and unsafe behavior.
- CI/CD Integration: Automate evaluation in GitHub Actions and Azure DevOps pipelines.
- Monitoring Dashboard: Resource usage, application analytics, input/output tokens, request latency, cost breakdown (via Azure Cost Management), and evaluation scores.
- SDKs & Local Evaluation: Run evaluations locally or in the cloud with the Azure AI Evaluation SDK (a minimal sketch follows this list).

Demo highlights:

- Chat with a travel planning agent, view run metrics and tool usage.
- Drill into run details, debugging, and real-time safety/quality scores.
- Configure and run large-scale agent evaluations in CI/CD pipelines.
- Compare agents, review statistical analysis, and monitor in production dashboards.
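To make the SDK mention concrete, here is a minimal local-evaluation sketch with the azure-ai-evaluation package. The endpoint, key, and deployment values are placeholders, and the evaluator shown is just one of the built-ins covered in the demo.

```python
# Minimal sketch: scoring one agent response locally with the Azure AI Evaluation SDK
# (pip install azure-ai-evaluation). All config values are illustrative assumptions.
from azure.ai.evaluation import RelevanceEvaluator

model_config = {
    "azure_endpoint": "https://<your-aoai>.openai.azure.com",
    "api_key": "<your-key>",
    "azure_deployment": "gpt-4o",  # judge-model deployment name (assumption)
}

relevance = RelevanceEvaluator(model_config)
score = relevance(
    query="Plan a 3-day trip to Kyoto in autumn.",
    response="Day 1: Fushimi Inari at sunrise, then the Gion district...",
)
print(score)  # a dict of relevance scores for this single query/response pair
```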
## 3. Customer Story: Saifr

Saifr is a RegTech company that uses artificial intelligence to streamline compliance for marketing, communications, and creative teams in regulated industries. Incubated at Fidelity Labs (Fidelity Investments’ innovation arm), Saifr helps enterprises create, review, and approve content that meets regulatory standards, faster and with less manual effort.

What Saifr offers:

- AI-Powered Compliance: Saifr’s platform leverages proprietary AI models trained on decades of regulatory expertise to automatically detect potential compliance risks in text, images, audio, and video.
- Automated Guardrails: The solution flags risky or non-compliant language, suggests compliant alternatives, and provides explanations, all in real time.
- Workflow Integration: Saifr seamlessly integrates with enterprise content creation and approval workflows, including cloud platforms and agentic AI systems like Azure AI Foundry.
- Multimodal Support: Goes beyond text to check images, videos, and audio for compliance risks, supporting modern marketing and communications teams.

## 4. Key Takeaways

- Observability is essential: Azure AI Foundry offers complete monitoring, evaluation, tracing, and governance for agentic AI, making production safe, reliable, and compliant.
- Built-in evaluation and red teaming: Use leaderboards, evaluators, and red teaming agents to assess and continuously improve model safety and quality.
- CI/CD and dashboard integration: Automate evaluations in GitHub Actions or Azure DevOps, then monitor and optimize agents in production with detailed dashboards.
- Compliance made easy: Saifr's agents and models help financial services and regulated industries proactively meet compliance standards for content and communications.

## Sharda's Tips: How I Wrote This Blog

I focus on organizing highlights, summarizing customer stories, and linking to official Microsoft docs and real working resources. For this recap, I explored the Azure AI Foundry observability docs, tested CI/CD pipeline integration, and watched the customer demo to share best practices for regulated industries. Here’s my Copilot prompt for this episode:

"Generate a technical blog post for Model Mondays S2E12 based on the transcript and episode details. Focus on observability, agent dashboards, CI/CD, compliance, and customer stories. Add correct, working Microsoft links!"

## Coming Up Next Week

Next week: Open Source Models! Join us for the final episode with Hugging Face's VP of Product, live demos, and open model workflows. Register For The Livestream – Sep 15, 2025

## About Model Mondays

Model Mondays is your weekly Azure AI learning series:

- 5-Minute Highlights: Latest AI news and product updates
- 15-Minute Spotlight: Demos and deep dives with product teams
- 30-Minute AMA Fridays: Ask anything in Discord or the forum

Start building:

- Watch Past Replays
- Register For AMA
- Recap Past AMAs

## Join The Community

Don’t build alone! The Azure AI Developer Community is here for real-time chats, events, and support:

- Join the Discord
- Explore the Forum

## About Me

I'm Sharda, a Gold Microsoft Learn Student Ambassador focused on cloud and AI. Find me on GitHub, Dev.to, Tech Community, and LinkedIn. In this blog series, I share takeaways from each week’s Model Mondays livestream.
# Model Mondays S2E11: Exploring Speech AI in Azure AI Foundry

## 1. Weekly Highlights

This week’s top news in the Azure AI ecosystem included:

- Lakuna (Copilot Studio Agent for Product Teams): A hackathon project built with Copilot Studio and Azure AI Foundry, Lakuna analyzes your requirements and docs to surface hidden assumptions, helping teams reflect, test, and reduce bias in product planning.
- Azure ND H200 v5 VMs for AI: Azure Machine Learning introduced ND H200 v5 VMs, featuring NVIDIA H200 GPUs (over 1TB of GPU memory per VM!) for massive models, bigger context windows, and ultra-fast throughput.
- Agent Factory Blog Series: The next wave of agentic AI is about extensibility: plug your agents into hundreds of APIs and services using Model Context Protocol (MCP) for portable, reusable tool integrations.
- GPT-5 Tool Calling on Azure AI Foundry: GPT-5 models now support free-form tool calling, no more rigid JSON! Output SQL, Python, configs, and more in your preferred format for natural, flexible workflows.
- Microsoft a Leader in the 2025 Gartner Magic Quadrant: Azure was again named a Leader for Cloud-Native Application Platforms, validating its end-to-end runway for AI, microservices, DevOps, and more.

## 2. Spotlight On: Azure AI Foundry Speech Playground

The main segment featured a live demo of the new Azure AI Speech Playground (now part of Foundry), showing how developers can experiment with and deploy cutting-edge voice, transcription, and avatar capabilities.

Key features and demos (a minimal Speech SDK sketch follows this list):

- Speech Recognition (Speech-to-Text): Try real-time transcription directly in the playground, recognizing natural speech, pauses, accents, and domain terms. Batch and fast transcription options are available for large files and blob storage.
- Custom Speech: Fine-tune models for your industry, vocabulary, and noise conditions.
- Text to Speech (TTS): Instantly convert text into natural, expressive audio in 150+ languages with 600+ neural voices. The demo explored pre-built voices plus whispering, cheerful, angry, and more styles.
- Custom Neural Voice: Clone and train your own professional or personal voice (with strict Responsible AI controls).
- Avatars & Video Translation: Bring your apps to life with prebuilt avatars and video translation, which syncs voice-overs to speakers in multilingual videos.
- Voice Live API (Preview): Integrates all premium speech capabilities with large language models, enabling real-time, proactive voice agents and chatbots. The demo showed a language-learning agent with voice, avatars, and proactive engagement, plus one-click code export for deployment in your IDE.
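The playground's code export produces Speech SDK calls much like this minimal text-to-speech sketch. The key, region, and voice name here are placeholder assumptions; any of the neural voices from the catalog can be substituted.

```python
# Minimal sketch: text-to-speech with the Azure Speech SDK
# (pip install azure-cognitiveservices-speech). Key, region, and voice are assumptions.
import azure.cognitiveservices.speech as speechsdk

speech_config = speechsdk.SpeechConfig(subscription="<your-key>", region="<your-region>")
speech_config.speech_synthesis_voice_name = "en-US-AvaMultilingualNeural"  # one of 600+ voices

# With no audio config supplied, output goes to the default speaker.
synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)
result = synthesizer.speak_text_async("Welcome to the Azure AI Speech Playground!").get()

if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
    print("Audio synthesized and played.")
```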
## 3. Customer Story: Hilo Health

This week’s customer spotlight featured Hilo Health, a healthcare technology company using Azure AI to boost efficiency for doctors, staff, and patients. How Hilo uses Azure AI:

- Document Management: Automates fax and document filing, splits multi-page faxes by patient, and reduces staff effort and errors using Azure Computer Vision and Document Intelligence.
- Ambient Listening: Ambient clinical note transcription captures doctor-patient conversations and summarizes them for easy EHR documentation.
- Genie AI Contact Center: Agentic voice assistants handle patient calls, book appointments, answer billing and refill questions, escalate to humans, and assist human agents, using Azure Communication Services, Azure Functions, FastAPI (community), and Azure OpenAI.
- Conversational Campaigns: Outbound reminders, procedure preps, and follow-ups are all handled by voice AI, freeing up human staff.

Impact: Hilo reaches 16,000+ physician practices and 180,000 providers, automates millions of communications, and processes $2B+ in payments annually, demonstrating how multimodal AI transforms patient journeys from first call to post-visit care.

## 4. Key Takeaways

Here’s what you need to know from S2E11:

- Speech AI is accessible: The Azure AI Foundry Speech Playground makes experimenting with voice recognition, TTS, and avatars easy for everyone.
- From playground to production: Fine-tune, export code, and deploy speech models in your own apps with Azure Speech Service.
- Responsible AI built in: Custom Neural Voice and avatars require application and approval, ensuring ethical, secure use.
- Agentic AI everywhere: The Voice Live API brings real-time, multimodal voice agents to any workflow.
- Healthcare example: Hilo’s use of Azure AI shows the real-world impact of speech and agentic AI, from patient intake to after-visit care.
- Join the community: Keep learning and building by joining the Discord and Forum.

## Sharda's Tips: How I Wrote This Blog

I organize key moments from each episode, highlight product demos and customer stories, and use GitHub Copilot for structure. For this recap, I tested the Speech Playground myself, explored the docs, and summarized answers to common developer questions on security, dialects, and deployment. Here’s my favorite Copilot prompt this week:

"Generate a technical blog post for Model Mondays S2E11 based on the transcript and episode details. Focus on Azure Speech Playground, TTS, avatars, Voice Live API, and healthcare use cases. Add practical links for developers and students!"

## Coming Up Next Week

Next week: Observability! Learn how to monitor, evaluate, and debug your AI models and workflows using Azure and OpenAI tools.

- Register For The Livestream – Sep 1, 2025
- Register For The AMA – Sep 5, 2025
- Ask Questions & View Recaps – Discussion Forum

## About Model Mondays

Model Mondays is your weekly Azure AI learning series:

- 5-Minute Highlights: Latest AI news and product updates
- 15-Minute Spotlight: Demos and deep dives with product teams
- 30-Minute AMA Fridays: Ask anything in Discord or the forum

Start building:

- Register For Livestreams
- Watch Past Replays
- Register For AMA
- Recap Past AMAs

## Join The Community

Don’t build alone! The Azure AI Developer Community is here for real-time chats, events, and support:

- Join the Discord
- Explore the Forum

## About Me

I'm Sharda, a Gold Microsoft Learn Student Ambassador focused on cloud and AI. Find me on GitHub, Dev.to, Tech Community, and LinkedIn. In this blog series, I share takeaways from each week’s Model Mondays livestream.
# Model Mondays S2E10: Automating Document Processing with AI

## 1. Weekly Highlights

We kicked off with the top news and updates in the Azure AI ecosystem:

- Agent Factory Blog Series: A new 6-part blog series on designing reliable, agentic AI, exploring multi-step, collaborative agents that reflect, plan, and adapt using tool integrations and design patterns.
- Text PII Preview in Azure AI Language: Now redacts PII (like date of birth and license plates) in major European languages, with better accuracy for UK bank entities.
- Claude Opus 4.1 in Copilot Pro & Enterprise: Public preview brings smarter summaries, tool assistant thinking, and "Ask Mode" in VS Code.
- Azure Document Intelligence: Now leverages stronger computer vision algorithms for table parsing, achieving 94-97% accuracy across Latin, Chinese, Japanese, and Korean, with sub-10ms latency.
- Mistral Document AI in Azure Foundry: Instantly turn PDFs, contracts, and scanned docs into structured JSON with tables, headings, and LaTeX support. Serverless, multilingual, secure, and perfect for regulated industries.

## 2. Spotlight On: Document Intelligence with Azure & Mistral

This week’s spotlight was a hands-on exploration of document processing, featuring both Microsoft and Mistral AI experts.

Why document processing? Unstructured data, such as receipts, forms, and handwritten notes, is everywhere. Modern document AI can extract, structure, and even annotate this data, fueling everything from search to RAG pipelines.

Azure Document Intelligence:

- State-of-the-art OCR and table extraction with super-high accuracy and speed.
- Handles multiple languages and complex layouts, and returns structured outputs ready for programmatic use.

Mistral Document AI:

- Transforms PDFs and scanned docs into JSON, retaining complex formatting, tables, images, and even LaTeX.
- Supports custom schema extraction and image/document annotations, and returns everything in one API call.
- Integrates seamlessly with Azure AI Foundry and developer workflows.

Demo highlights (a minimal receipt-extraction sketch follows this list):

- Extracting receipts: OCR accurately pulls out store, date, and transaction details from photos.
- Handwriting recognition: Even historical documents (like Thomas Jefferson’s letters) are parsed with surprising accuracy.
- Tables & structured data: Financial statements and reports are converted into structured markdown and JSON, ready for downstream apps.
- Advanced annotations: Define your own schema (via JSON Schema or Pydantic), extract custom fields, classify images, summarize documents, and even translate summaries, all in a single call.
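As a concrete counterpart to the receipt demo, here is a minimal sketch using the azure-ai-formrecognizer package's prebuilt receipt model. The endpoint, key, and file name are placeholder assumptions, and the exact fields returned depend on the receipt.

```python
# Minimal sketch: extracting receipt fields with Azure Document Intelligence
# (pip install azure-ai-formrecognizer). Endpoint, key, and file are assumptions.
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

client = DocumentAnalysisClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com",
    credential=AzureKeyCredential("<your-key>"),
)

with open("receipt.jpg", "rb") as f:
    poller = client.begin_analyze_document("prebuilt-receipt", document=f)
result = poller.result()

for receipt in result.documents:
    merchant = receipt.fields.get("MerchantName")
    total = receipt.fields.get("Total")
    if merchant:
        print("Store:", merchant.value)
    if total:
        print("Total:", total.value)
```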
## 3. Customer Story: Oracle Health

Oracle Health shared how agentic AI and fine-tuned models are revolutionizing clinical workflows:

- Problem: Clinicians spend hours on documentation, searching records, and manual data entry, reducing time for patient care.
- Solution: Oracle’s clinical AI agents automate chart reviews, data extraction, and even conversational Q&A, while keeping humans in the loop for safety.

Technical highlights:

- A multi-agent architecture understands provider specialty and context.
- An orchestrator model "routes" requests to the right agent or plugin, extracting the needed arguments from context.
- Fine-tuning was key: for low latency, Oracle used lightweight models (like GPT-4 Mini) fine-tuned on their data, achieving sub-800ms responses with accuracy matching larger models.
- Fine-tuning also allowed for nuanced tool selection, argument extraction, and rule-based orchestration, better than prompt engineering alone.
- Oracle used LoRA for efficient, targeted fine-tuning without erasing base model knowledge.

Live demo: The agent summarizes patient history, retrieves lab results, filters for abnormals, and answers follow-up questions, all conversationally. The fine-tuned orchestrator chooses the right tool and context for each doctor’s workflow. Result: 1-2 hours saved per day, more time for patients, and happier doctors!

## 4. Key Takeaways

Here are the key learnings from this episode:

- Document AI is production-ready: Azure Document Intelligence and Mistral Document AI offer fast, accurate, and customizable document parsing for real enterprise needs.
- Schema-driven extraction & annotation: Define your own schemas and extract exactly what you want, no more one-size-fits-all.
- Fine-tuning unlocks performance: For low latency and high accuracy, fine-tuning lightweight models beats prompt engineering in complex, rule-based agent workflows.
- Agentic workflows in action: Multi-agent systems can automate complex tasks, route requests, and keep humans in control, especially in regulated domains like healthcare.
- Community & support: Join the Discord and Forum to ask questions, share use cases, and connect with the team.

## Sharda's Tips: How I Wrote This Blog

Writing this recap is all about sharing what I learned and making it practical for the community! I start by organizing the key highlights, then walk through customer stories and demos, using simple language and real-world examples. Copilot helps me structure and clarify my notes, especially when summarizing technical sections. Here’s the prompt I used for Copilot this week:

"Generate a technical blog post for Model Mondays S2E10 based on the transcript and episode details. Focus on document processing with Azure AI and Mistral, include customer demos, and highlight practical workflows and fine-tuning. Make it clear and approachable for developers and students."

Every episode inspires me to try these tools myself, and I hope this blog makes it easy for you to start, too. If you have questions or want to share your own experience, I’d love to hear from you!

## Coming Up Next Week

Next week: Text & Speech AI Playgrounds! Learn how to build and test language and speech models, with live demos and expert guests.

- Register For The Livestream – Aug 25, 2025
- Register For The AMA – Aug 29, 2025
- Ask Questions & View Recaps – Discussion Forum

## About Model Mondays

Model Mondays is a weekly series to build your Azure AI IQ with:

- 5-Minute Highlights: News & updates on Mondays
- 15-Minute Spotlight: Deep dives into new features, models, and protocols
- 30-Minute AMA Fridays: Live Q&A with product teams and experts

Get started:

- Register For Livestreams
- Watch Past Replays
- Register For AMA
- Recap Past AMAs

## Join The Community

Don’t build alone! Join the Azure AI Developer Community for real-time chats, events, support, and more:

- Join the Discord
- Explore the Forum

## About Me

I'm Sharda, a Gold Microsoft Learn Student Ambassador focused on cloud and AI. Find me on GitHub, Dev.to, Tech Community, and LinkedIn. In this blog series, I share takeaways from each week’s Model Mondays livestream.
# Model Mondays S2E8: On-Device & Local AI

Welcome to Episode 8! This week, we explored how AI is moving from the cloud to your own device, making it faster, more private, and more accessible. We also saw a real-world customer story from Xander Glasses, showing how AI can help people with hearing loss.

Highlights from the episode:

- Observability tools in Azure AI Foundry: Real-time model telemetry, auto evals, quick evals, and a Python grader.
- GitHub Copilot Pro with Spark: AI pair programmer for code explanation and workflow suggestions.
- Synthetic Data for Vision Models: Training accurate models with procedurally generated data.
- Agent-Friendly Websites: Making sites accessible to AI agents via APIs, semantic markup, and OpenAPI specs.
- MCP (Model Context Protocol): Standardizing agent memory and context for scalable AI (see the sketch after this list).
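To make the MCP bullet concrete, here is a minimal sketch of an MCP tool server using the official Python SDK's FastMCP helper. The tool itself is a toy assumption, not something shown in the episode.

```python
# Minimal sketch: an MCP server exposing one tool, using the official Python SDK
# (pip install "mcp[cli]"). The tool body is a hypothetical example.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("campus-tools")

@mcp.tool()
def room_availability(building: str, hour: int) -> str:
    """Report study-room availability (hypothetical data source)."""
    # A real server would query a booking system here.
    return f"Rooms in {building} are free at {hour}:00."

if __name__ == "__main__":
    mcp.run()  # serves over stdio, so any MCP-capable agent can connect
```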
# How Microsoft Semantic Kernel Transforms Proven Workflows into Intelligent Agents

Most developers today face a common challenge when integrating AI into their applications: the gap between natural language prompts and actual code execution. While services like OpenAI's ChatGPT excel at generating responses, they can't directly interact with your existing systems, databases, or business logic. You're left building complex orchestration layers, managing function calls manually, and creating brittle workflows that break when requirements change.

Microsoft Semantic Kernel changes this paradigm entirely. Unlike traditional LLM integrations where you send a prompt and receive text, Semantic Kernel acts as an AI orchestration layer that bridges natural language with your existing codebase. Semantic Kernel intelligently decides which of your trusted functions to execute, chains your reliable workflows together automatically, and handles the complete path from user intent to business outcome, using your proven business logic rather than asking the LLM to handle complex tasks with the risk of hallucinated solutions.

## What Makes Semantic Kernel Different

### The Traditional (Novice) LLM Integration Problem

Meet Kemi, a data analyst who has spent months perfecting a Python script that generates exactly the sales visualizations her team needs. Her workflow is reliable: run the script, review the charts, write insights based on patterns she knows matter to the business, and deliver a concise report.

Excited about AI's potential, Kemi decides to "upgrade" her process with ChatGPT. She uploads her sales data and asks the model to create visualizations and analysis. The LLM responds by generating an entirely new script with a dozen different chart types, many irrelevant to her business needs. She then has to upload the generated images back to the model for analysis, hoping it will provide accurate insights.

The result? Instead of streamlining her proven workflow, Kemi now has:

- Unreliable outputs: The LLM generates different charts each time, some irrelevant to business decisions
- Loss of domain expertise: Her carefully crafted analysis logic is replaced by generic AI interpretations
- Broken workflow: What was once a single script is now a multi-step process of uploading, generating, downloading, and re-uploading
- Reduced confidence: She can't trust the AI's business recommendations the way she trusted her own tested methodology
- More complexity, not less: Her "AI upgrade" created more steps and uncertainty than her original manual process

Kemi's experience reflects a common pitfall: replacing proven business logic with unpredictable LLM generation rather than enhancing existing workflows with intelligent orchestration.

### A Better Approach: Semantic Kernel Integration

In this article, I present a better approach that solves Kemi's problem entirely. Instead of replacing her proven workflows with unpredictable AI generation, we'll show how Microsoft Semantic Kernel transforms her existing script into an intelligent agent that preserves her business logic while adding natural language control. By the end of this article, you'll have a solid grasp of how to integrate Semantic Kernel into your own workflows, whether that's connecting weather APIs for automated marketing campaigns, database queries for sales reporting, or Teams notifications for development task management. The principles you'll learn here apply to automating any specific marketing, sales, or development task where you want AI orchestration without sacrificing the reliability of your existing business logic.
## The Semantic Kernel Transformation

Let's see how Semantic Kernel solves Kemi's workflow problem by transforming her existing script into an intelligent agent that preserves her business logic while adding natural language orchestration.

### The Complete Example

- Before: Kemi's Original Script
- After: Smart Business Agent
- Full Repository: semantic-kernel-business-agent

### Kemi's Original Functions

Kemi's script contains two core functions that she's refined over months:

- get_sales_summary(): Calculates total sales, daily averages, and key metrics
- create_basic_chart(): Generates a reliable sales trend visualization

These functions work perfectly for her needs, but they require manual orchestration and individual execution.

### Setting Up the Foundation

First, Kemi needs to install the required libraries and set up her OpenAI credentials:

```bash
pip install semantic-kernel pandas matplotlib python-dotenv
```

She creates a .env file to securely store her OpenAI API key:

```
OPENAI_API_KEY=your-openai-api-key-here
```

Get your OpenAI API key from platform.openai.com → API Keys.

### Step 1: From Manual Function Calls to Kernel Functions

In her original script, Kemi had to manually orchestrate everything:

```python
# From basic_data_analysis.py - Kemi's manual workflow
analyzer = DataAnalyzer()
print(analyzer.get_sales_summary())  # She manually calls this
analyzer.create_basic_chart()        # Then manually calls this
```

With Semantic Kernel, she transforms these exact same functions into AI-discoverable capabilities:

```python
from semantic_kernel.functions import kernel_function
from typing import Annotated

@kernel_function(
    description="Get sales performance summary with total sales, averages, and trends",
    name="get_sales_summary"
)
def get_sales_summary(self) -> Annotated[str, "Sales summary with key metrics"]:
    # Kemi's exact same trusted business logic - unchanged!
    total_sales = self.sales_data['sales'].sum()
    avg_daily_sales = self.sales_data['sales'].mean()
    return f"Total: ${total_sales:,}, Daily Avg: ${avg_daily_sales:.2f}"
```

She's not replacing her proven logic with AI generation; she's making her existing, reliable functions available to intelligent orchestration.

### Step 2: Enhancing Her Chart Function with Smart Parameters

Kemi's original create_basic_chart() only made one type of chart. With SK, she can enhance it to be more versatile while keeping the core logic:

```python
@kernel_function(
    description="Create and save a sales performance chart visualization",
    name="create_sales_chart"
)
def create_sales_chart(
    self,
    chart_type: Annotated[str, "Type of chart: 'trend', 'regional', or 'product'"] = "trend"
) -> Annotated[str, "Confirmation that chart was created"]:
    # Kemi's same matplotlib logic, now with intelligent chart selection
    plt.figure(figsize=(12, 8))
    if chart_type == "trend":
        plt.plot(self.sales_data['date'], self.sales_data['sales'], marker='o')
        plt.title('Sales Trend Over Time', fontsize=16)
    # ... rest of her charting logic
```
### Step 3: Adding New Capabilities She Always Wanted

Now she can add functions she never had time to build manually, like automated insights and report sending:

```python
@kernel_function(
    description="Send performance report via email to team",
    name="send_report"
)
def send_report(self, recipient: Annotated[str, "Email address"]) -> Annotated[str, "Confirmation"]:
    # For now, simulated - but she could easily integrate real email here
    return f"📧 Performance report sent to {recipient}"
```

### Step 4: Creating the Intelligent Agent

Here's where the magic happens: connecting her functions to Semantic Kernel:

```python
import os

from dotenv import load_dotenv
from semantic_kernel import Kernel
from semantic_kernel.connectors.ai import FunctionChoiceBehavior
from semantic_kernel.connectors.ai.open_ai import (
    OpenAIChatCompletion,
    OpenAIChatPromptExecutionSettings,
)

load_dotenv()  # Load her OpenAI key

class SmartBusinessAgent:
    def __init__(self):
        # Initialize the kernel
        self.kernel = Kernel()

        # Connect to OpenAI
        self.kernel.add_service(
            OpenAIChatCompletion(
                service_id="business_agent",
                api_key=os.getenv("OPENAI_API_KEY"),
                ai_model_id="gpt-4o-mini"
            )
        )

        # Register Kemi's functions as AI-accessible tools
        self.kernel.add_plugin(SmartBusinessPlugin(), plugin_name="business")

        # Enable automatic function orchestration
        self.execution_settings = OpenAIChatPromptExecutionSettings(
            function_choice_behavior=FunctionChoiceBehavior.Auto()
        )
```

### Step 5: The Natural Language Interface

Now Kemi can interact with her proven workflows using natural language:

```python
from semantic_kernel.functions import KernelArguments

async def process_request(self, user_request: str) -> str:
    result = await self.kernel.invoke_prompt(
        prompt=(
            "You are a business intelligence agent. You can analyze sales data, "
            "create charts, generate insights, and send reports.\n\n"
            f"Request: {user_request}"
        ),
        arguments=KernelArguments(settings=self.execution_settings)
    )
    return str(result)
```

### The Transformation in Action

Before, Kemi's manual, step-by-step process:

```python
analyzer = DataAnalyzer()
summary = analyzer.get_sales_summary()  # She decides to call this
chart = analyzer.create_basic_chart()   # Then she decides to call this
# Then she manually writes insights and sends emails
```

After, intelligent orchestration of her same functions:

```python
agent = SmartBusinessAgent()
response = await agent.process_request(
    "Analyze our sales performance, create relevant charts, and email the full report to sarah@company.com"
)
# SK automatically calls: get_sales_summary() → create_sales_chart("trend") →
# create_sales_chart("regional") → get_business_insights() → send_report("sarah@company.com")
```

The breakthrough: Kemi keeps her trusted business logic intact while gaining an intelligent interface that can understand complex requests, automatically determine which of her functions to call, and handle multi-step workflows, all while using her proven, reliable analysis methods instead of unpredictable AI generation.

This is the core power of Semantic Kernel: enhancing existing workflows with AI orchestration rather than replacing proven business logic with risky, hallucination-prone generation. Whether you're working with weather APIs for marketing automation, database queries for sales reporting, or Teams notifications for development workflows, these same patterns apply. You can keep your proven logic and enhance it with AI orchestration.

## Try It Yourself

Ready to transform your own workflows? Here's how to get started.

### 1. Clone and Run the Complete Example

```bash
git clone https://github.com/your-username/semantic-kernel-business-agent
cd semantic-kernel-business-agent
pip install -r requirements.txt
```
### 2. Set Up Your Environment

```bash
# Add your OpenAI API key
cp .env.example .env
# Edit .env and add: OPENAI_API_KEY=your-key-here
```

### 3. Experience the Transformation

```bash
# Run Kemi's original manual script
python basic_data_analysis.py

# Then run the intelligent agent
python smart_business_agent.py
```

### 4. Experiment with Natural Language Requests

Try these prompts with the smart agent:

- "Give me a comprehensive sales analysis with multiple chart types"
- "Create regional performance charts and send insights to my email"
- "What trends should we focus on for next quarter's strategy?"

Watch how Semantic Kernel automatically orchestrates Kemi's trusted functions to fulfill complex, multi-step requests.

### Next Steps: Adapt to Your Workflow

Take your own scripts and apply the same transformation:

1. Identify your core functions (like Kemi's get_sales_summary() and create_basic_chart())
2. Add @kernel_function decorators with clear descriptions
3. Create your agent class connecting to your preferred LLM
4. Test with natural language requests that combine multiple functions

The full repository includes additional examples and documentation to help you extend these concepts to your specific use cases. The goal isn't to replace your expertise with AI; it's to make your expertise accessible through intelligent, natural language orchestration. Start with the working example, then gradually transform your own workflows. You'll discover that Semantic Kernel doesn't just automate tasks: it amplifies your existing capabilities while keeping you in control of the business logic that matters.

## Further Reading

- Introduction to Semantic Kernel