agents
126 Topics

Customer Story: EPAM's Data-Driven Copilot Adoption Journey
EPAM, a global engineering consulting organization, used the Microsoft 365 admin center and Super User reports to drive Copilot adoption at scale, layering customized reporting on top for deeper insights and measurable impact. EPAM's Copilot story is one of proactive measurement, data-driven license management, and continuous improvement, combining early adoption, custom analytics, and Microsoft's evolving tools to lead in AI-driven enterprise productivity.

Implementing the A2A Protocol in .NET: A Practical Guide
As AI systems mature into multi-agent ecosystems, the need for agents to communicate reliably and securely has become fundamental. Traditionally, agents built on different frameworks (Semantic Kernel, LangChain, custom orchestrators, or enterprise APIs) do not share a common communication model. This creates brittle integrations, duplicate logic, and siloed intelligence. The Agent2Agent (A2A) protocol addresses this gap by defining a universal, vendor-neutral standard for structured agent interoperability.

A2A establishes a common language for agents, built on familiar web primitives: JSON-RPC 2.0 for messaging and HTTPS for transport. Each agent exposes a machine-readable Agent Card describing its capabilities, supported input/output modes, and authentication requirements. Interactions are modeled as Tasks, which support synchronous, streaming, and long-running workflows. Messages exchanged within a task contain Parts (text, structured data, files, or streams) that allow agents to collaborate without exposing internal implementation details.

By standardizing discovery, communication, authentication, and task orchestration, A2A enables organizations to build composable AI architectures. Specialized agents can coordinate deep reasoning, planning, data retrieval, or business automation regardless of their underlying frameworks or hosting environments. This modularity, combined with industry adoption and Linux Foundation governance, positions A2A as a foundational protocol for interoperable AI systems.

A2A in .NET: Implementation Guide

Prerequisites
• .NET 8 SDK
• Visual Studio 2022 (17.8+)
• The A2A and A2A.AspNetCore NuGet packages
• curl/Postman (optional, for direct endpoint testing)

The open-source A2A project provides a full-featured .NET SDK, enabling developers to build and host A2A agents using ASP.NET Core, or to integrate with other agents as a client. Two packages, A2A and A2A.AspNetCore, power the experience. The SDK offers:

• A2AClient - to call remote agents
• TaskManager - to manage incoming tasks and message routing
• AgentCard / Message / Task models - strongly typed protocol objects
• MapA2A() - ASP.NET Core routing integration that auto-generates the protocol endpoints

This allows you to expose an A2A-compliant agent with minimal boilerplate.

Project Setup

Create two separate projects:

• CurrencyAgentService → ASP.NET Core web project that hosts the agent
• A2AClient → Console app that discovers the agent card and sends a message

Install the packages listed in the prerequisites into both projects.

Building a Simple A2A Agent (Currency Agent Example)

Below is a minimal Currency Agent implemented in ASP.NET Core. It responds by converting amounts between currencies.

Step 1: In the CurrencyAgentService project, create the CurrencyAgentImplementation class that implements the A2A agent. The class contains the logic for:

a) Describing itself (agent "card" metadata).
b) Processing incoming text messages such as "100 USD to EUR".
c) Returning a single text response with the conversion.

The AttachTo(ITaskManager taskManager) method hooks two delegates onto the provided taskManager:

a) OnAgentCardQuery → GetAgentCardAsync: returns the agent metadata.
b) OnMessageReceived → ProcessMessageAsync: handles incoming messages and produces a response.

Step 2: In Program.cs of the CurrencyAgentService project, create a TaskManager, attach the agent to it, and expose the A2A endpoint.
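To make the two steps concrete, here is a minimal sketch of what CurrencyAgentImplementation and Program.cs can look like. The A2A types and hooks it uses (ITaskManager, TaskManager, OnAgentCardQuery, OnMessageReceived, AgentCard, AgentMessage, MessageRole, TextPart, MessageSendParams, MapA2A) are the ones named in this guide, but the exact delegate signatures, property names, and MapA2A overload are assumptions to verify against the A2A package version you install; the hard-coded exchange rates and the parsing logic are purely illustrative.

// CurrencyAgentService/CurrencyAgentImplementation.cs -- illustrative sketch, not production code.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using A2A;

public class CurrencyAgentImplementation
{
    // Hypothetical static rate table, purely for illustration.
    private static readonly Dictionary<(string From, string To), decimal> Rates = new()
    {
        [("USD", "EUR")] = 0.92m,
        [("EUR", "USD")] = 1.09m,
    };

    public void AttachTo(ITaskManager taskManager)
    {
        taskManager.OnAgentCardQuery = GetAgentCardAsync;    // a) describe the agent
        taskManager.OnMessageReceived = ProcessMessageAsync; // b) + c) convert and reply
    }

    private Task<AgentCard> GetAgentCardAsync(string agentUrl, CancellationToken cancellationToken)
    {
        // Field names mirror the Agent Card contract described above (name, description, URL, version).
        return Task.FromResult(new AgentCard
        {
            Name = "Currency Agent",
            Description = "Converts an amount between currencies, e.g. '100 USD to EUR'.",
            Url = agentUrl,
            Version = "1.0.0",
        });
    }

    private Task<AgentMessage> ProcessMessageAsync(MessageSendParams sendParams, CancellationToken cancellationToken)
    {
        // Expect a single TextPart such as "100 USD to EUR".
        var text = sendParams.Message.Parts.OfType<TextPart>().FirstOrDefault()?.Text ?? string.Empty;
        var tokens = text.Split(' ', StringSplitOptions.RemoveEmptyEntries);

        string reply = "Please send a request in the form '<amount> <from> to <to>', e.g. '100 USD to EUR'.";
        if (tokens.Length == 4 &&
            decimal.TryParse(tokens[0], out var amount) &&
            Rates.TryGetValue((tokens[1].ToUpperInvariant(), tokens[3].ToUpperInvariant()), out var rate))
        {
            reply = $"{amount} {tokens[1].ToUpperInvariant()} = {amount * rate} {tokens[3].ToUpperInvariant()}";
        }

        return Task.FromResult(new AgentMessage
        {
            Role = MessageRole.Agent,              // assumed counterpart of MessageRole.User
            MessageId = Guid.NewGuid().ToString(),
            Parts = new List<Part> { new TextPart { Text = reply } },
        });
    }
}

// CurrencyAgentService/Program.cs -- hosting the agent and exposing the A2A endpoint.
using A2A;
using A2A.AspNetCore;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

var taskManager = new TaskManager();                // routes A2A requests to the delegates above
new CurrencyAgentImplementation().AttachTo(taskManager);

app.MapA2A(taskManager, "/agent");                  // auto-generates the A2A protocol endpoints at /agent

app.Run();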
Typical flow:

• GET /agent → the A2A host asks OnAgentCardQuery → returns the card.
• POST /agent with a text message → the A2A host calls OnMessageReceived → returns the conversion text.

All fully A2A-compliant.

Calling an A2A Agent from .NET

To interact with any A2A-compliant agent from .NET, the client follows a predictable sequence: identify where the agent lives, discover its capabilities through the Agent Card, initialize a correctly configured A2AClient, construct a well-formed message, send it asynchronously, and finally interpret the structured response. This ensures your client is fully aligned with the agent's advertised contract and remains resilient as capabilities evolve. Below are the steps implemented to call the A2A agent from the A2A client (a sketch of the whole sequence follows the list):

1. Identify the agent endpoint.
Why: You need a stable base URL to resolve the agent's metadata and send messages.
What: Construct a Uri pointing to the agent service, e.g., https://localhost:7009/agent.

2. Discover agent capabilities via the Agent Card.
Why: Agent Cards provide a contract: name, description, the final URL to call, and features (like streaming). This decouples your client from hard-coded assumptions and enables dynamic capability checks.
What: Use A2ACardResolver with the endpoint Uri, then call GetAgentCardAsync() to obtain an AgentCard.

3. Initialize the A2AClient with the resolved URL.
Why: The client encapsulates transport details and ensures messages are sent to the correct agent endpoint, which may differ from the discovery URL.
What: Create an A2AClient using new Uri(currencyCard.Url) from the Agent Card for correctness.

4. Construct a well-formed agent request message.
Why: Agents typically require structured messages for roles, traceability, and multi-part inputs. A unique message ID supports deduplication and logging.
What: Build an AgentMessage:
• Role = MessageRole.User clarifies intent.
• MessageId = Guid.NewGuid().ToString() ensures uniqueness.
• Parts contains the content; for simple queries, a single TextPart with the prompt (e.g., "100 USD to EUR").

5. Package and send the message.
Why: MessageSendParams can carry the message plus any optional settings (e.g., streaming flags or context). Using a dedicated params object keeps the API extensible.
What: Wrap the AgentMessage in MessageSendParams and call SendMessageAsync(...) on the A2AClient.
Outcome: Await the asynchronous response to avoid blocking and to stay scalable.

6. Interpret the agent response.
Why: Agents can return multiple Parts (text, data, attachments). Extracting the appropriate part avoids assumptions and keeps your client robust.
What: Cast the result to AgentMessage, then read the first TextPart's Text for the conversion result in this scenario.
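Here are the same six steps, end to end, as a minimal console client sketch. Only the types called out above are used; the localhost URL is the example endpoint from step 1, and the exact constructors, property names, and the response type of SendMessageAsync are assumptions to verify against the installed A2A package.

// A2AClient/Program.cs -- illustrative sketch of the client sequence.
using System;
using System.Collections.Generic;
using System.Linq;
using A2A;

// 1. Identify the agent endpoint (example URL from step 1).
var endpoint = new Uri("https://localhost:7009/agent");

// 2. Discover the agent's capabilities via its Agent Card.
var resolver = new A2ACardResolver(endpoint);
AgentCard currencyCard = await resolver.GetAgentCardAsync();
Console.WriteLine($"Discovered agent: {currencyCard.Name}");

// 3. Initialize the client against the URL advertised in the card.
var client = new A2AClient(new Uri(currencyCard.Url));

// 4. Construct a well-formed request message.
var message = new AgentMessage
{
    Role = MessageRole.User,
    MessageId = Guid.NewGuid().ToString(),
    Parts = new List<Part> { new TextPart { Text = "100 USD to EUR" } },
};

// 5. Package and send the message asynchronously.
var response = await client.SendMessageAsync(new MessageSendParams { Message = message });

// 6. Interpret the response: read the first TextPart of the returned message.
if (response is AgentMessage reply)
{
    Console.WriteLine(reply.Parts.OfType<TextPart>().FirstOrDefault()?.Text ?? "No text part returned.");
}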
Best Practices

1. Keep Agents Focused and Single-Purpose
Design each agent around a clear, narrow capability (e.g., currency conversion, scheduling, document summarization). Single-responsibility agents are easier to reason about, scale, and test, especially when they become part of larger multi-agent workflows.

2. Maintain Accurate and Helpful Agent Cards
The Agent Card is the first interaction point for any client. Ensure it accurately reflects:
• Supported input/output formats
• Streaming capabilities
• Authentication requirements (if any)
• Version information
A clean and honest card helps clients integrate reliably without guesswork.

3. Prefer Structured Inputs and Outputs
Although A2A supports plain text, using structured payloads through DataPart objects significantly improves consistency. JSON inputs and outputs reduce ambiguity, eliminate prompt-engineering edge cases, and make agent behavior more deterministic, especially when interacting with other automated agents.

4. Use Meaningful Task States
Treat A2A Tasks as proper state machines. Transition through states intentionally (Submitted → Working → Completed, or Working → InputRequired → Completed). This gives clients clarity on progress, makes long-running operations manageable, and enables more sophisticated control flows.

5. Provide Helpful Error Messages
Make use of A2A and JSON-RPC error codes such as -32602 (invalid params) or -32603 (internal error), and include additional context in the error payload. Avoid opaque messages; error details should guide the client toward recovery or correction.

6. Keep Agents Stateless Where Possible
Stateless agents are easier to scale and less prone to hidden failures. When state is necessary, ensure it is stored externally or passed through messages or task contexts. For local POCs, in-memory state is acceptable, but design with future statelessness in mind.

7. Validate Input Strictly
Do not assume incoming messages are well-formed. Validate fields, formats, and required parameters before processing. For example, a currency conversion agent should confirm that both currencies exist and that the value is numeric before attempting a conversion (a small validation sketch follows this list).

8. Design for Streaming Even if Disabled
Streaming is optional, but it is a powerful pattern for agents that perform progressive reasoning or long computations. Structuring your logic so it can later emit partial TextPart updates makes it easy to upgrade from synchronous to streaming workflows.

9. Include Traceability Metadata
Embed and log identifiers such as TaskId, MessageId, and timestamps. These become crucial for debugging multi-agent scenarios, improving observability, and correlating distributed workflows, especially once multiple agents collaborate.

10. Offer Clear Guidance When Input Is Missing
Instead of returning a generic failure, consider shifting the task to InputRequired and explaining what the client should provide. This improves usability and makes your agent self-documenting for new consumers.
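As a concrete illustration of practices 7 and 10, here is a small, framework-independent validation helper for the currency agent. It uses only the standard library; the supported-currency set and the guidance strings are illustrative, and in a real agent the guidance text would accompany an InputRequired task state rather than a generic failure.

// Sketch of strict input validation for the currency agent (best practices 7 and 10).
using System;
using System.Collections.Generic;

static class CurrencyRequestValidator
{
    private static readonly HashSet<string> KnownCurrencies =
        new(StringComparer.OrdinalIgnoreCase) { "USD", "EUR", "GBP", "JPY" };

    // Returns true when the request is usable; otherwise 'guidance' tells the
    // client exactly what to send next.
    public static bool TryValidate(string text, out (decimal Amount, string From, string To) request, out string guidance)
    {
        request = default;
        guidance = "Send '<amount> <from> to <to>', e.g. '100 USD to EUR'.";

        var tokens = (text ?? string.Empty).Split(' ', StringSplitOptions.RemoveEmptyEntries);
        if (tokens.Length != 4 || !tokens[2].Equals("to", StringComparison.OrdinalIgnoreCase))
            return false;

        if (!decimal.TryParse(tokens[0], out var amount) || amount <= 0)
        {
            guidance = "The amount must be a positive number, e.g. '100 USD to EUR'.";
            return false;
        }

        if (!KnownCurrencies.Contains(tokens[1]) || !KnownCurrencies.Contains(tokens[3]))
        {
            guidance = $"Unknown currency. Supported currencies: {string.Join(", ", KnownCurrencies)}.";
            return false;
        }

        request = (amount, tokens[1].ToUpperInvariant(), tokens[3].ToUpperInvariant());
        guidance = string.Empty;
        return true;
    }
}

Returning the guidance string to the caller keeps the agent self-documenting for new consumers, in line with practice 10.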
Engineering a Local-First Agentic Podcast Studio: A Deep Dive into Multi-Agent Orchestration

The transition from standalone Large Language Models (LLMs) to agentic orchestration marks the next frontier in AI development. We are moving away from simple "prompt-and-response" cycles toward a paradigm where specialized, autonomous units (AI agents) collaborate to solve complex, multi-step problems. As a Technology Evangelist, my focus is on building these production-grade systems entirely on the edge, ensuring privacy, speed, and cost-efficiency.

This technical guide explores the architecture and implementation of The AI Podcast Studio. The project demonstrates the seamless integration of the Microsoft Agent Framework, local Small Language Models (SLMs), and VibeVoice to automate a complete tech podcast pipeline.

I. The Strategic Intelligence Layer: Why Local-First?

At the core of our studio is a local-first philosophy. While cloud-based LLMs are powerful, they introduce friction in high-frequency, creative pipelines. By using Ollama as a model manager, we run SLMs like Qwen-3-8B directly on user hardware.

1. Architectural Comparison: Local vs. Cloud

Choosing the deployment environment is a fundamental architectural decision. For an agentic podcasting workflow, the edge offers distinct advantages:

Dimension | Local Models (e.g., Qwen-3-8B) | Cloud Models (e.g., GPT-5.2)
Latency | Zero/ultra-low: instant token generation without network "jitter". | Variable: dependent on network stability and API traffic.
Privacy | Total sovereignty: creative data and drafts never leave the local device. | Shared risk: data is processed on third-party servers.
Cost | Zero API fees: one-time hardware investment; free to run unlimited tokens. | Pay-as-you-go: costs scale with token count and frequency of calls.
Availability | Offline: the studio remains functional without an internet connection. | Online only: requires a stable, high-speed connection.

2. Reasoning and Tool-Calling on the Edge

To move beyond simple chat, we implement Reasoning Mode, utilizing Chain-of-Thought (CoT) prompting. This allows our local agents to "think" through the podcast structure before writing. Furthermore, we grant them "superpowers" through tool-calling, allowing them to execute Python functions for real-time web searches to gather the latest news.

II. The Orchestration Engine: Microsoft Agent Framework

The true complexity of this project lies in agent orchestration: the coordination of specialized agents to work as a cohesive team. We distinguish between Agents, who act as "jazz musicians" making flexible decisions, and Workflows, which act as the "orchestra" following a predefined score.

1. Advanced Orchestration Patterns

Drawing from the WorkshopForAgentic architecture, the studio utilizes several sophisticated patterns:

• Sequential: A strict pipeline where the output of the Researcher flows into the Scriptwriter.
• Concurrent (Parallel): Multiple agents search different news sources simultaneously to speed up data gathering.
• Handoff: An agent dynamically "transfers" control to another specialist based on the context of the task.
• Magentic-One: A high-level "Manager" agent decides which specialist should handle the next task in real time.

III. Implementation: Code Analysis (Workshop Patterns)

To maintain a production-grade codebase, we follow the modular structure found in the WorkshopForAgentic/code directory. This ensures that agents, clients, and workflows are decoupled and maintainable.

1. Configuration: Connecting to Local SLMs

The first step is initializing the local model client using the framework's Ollama integration.
# Based on WorkshopForAgentic/code/config.py
from agent_framework.ollama import OllamaChatClient

# Initialize the local client for Qwen-3-8B
# Standard Ollama endpoint on localhost
chat_client = OllamaChatClient(
    model_id="qwen3:8b",
    endpoint="http://localhost:11434"
)

2. Agent Definition: Specialized Roles

Each agent is a ChatAgent instance defined by its persona and instructions.

# Based on WorkshopForAgentic/code/agents.py
from agent_framework import ChatAgent

# web_search is the tool function defined in the workshop code (not shown here);
# it performs a real-time search and returns the results as text.

# The Researcher Agent: responsible for web discovery
researcher_agent = chat_client.create_agent(
    name="SearchAgent",
    instructions="You are my assistant. Answer the questions based on the search engine.",
    tools=[web_search],
)

# The Scriptwriter Agent: responsible for the conversational narrative
generate_script_agent = chat_client.create_agent(
    name="GenerateScriptAgent",
    instructions="""
    You are my podcast script generation assistant.
    Please generate a 10-minute Chinese podcast script based on the provided content.
    The podcast script should be co-hosted by Lucy (the host) and Ken (the expert).
    The script content should be generated based on the input, and the final output format should be as follows:
    Speaker 1: ……
    Speaker 2: ……
    Speaker 1: ……
    Speaker 2: ……
    Speaker 1: ……
    Speaker 2: ……
    """
)

3. Workflow Setup: The Sequential Pipeline

For a deterministic production line, we use the WorkflowBuilder to connect our agents.

# Based on WorkshopForAgentic/code/workflow_setup.py
from agent_framework import WorkflowBuilder

# AgentExecutor wraps an agent as a workflow step; ReviewExecutor is the custom
# approval step defined in the workshop code.
search_executor = AgentExecutor(agent=researcher_agent, id="search_executor")
gen_script_executor = AgentExecutor(agent=generate_script_agent, id="gen_script_executor")
review_executor = ReviewExecutor(id="review_executor", genscript_agent_id="gen_script_executor")

# Build workflow with approval loop
# search_executor -> gen_script_executor -> review_executor
# If not approved, review_executor -> gen_script_executor (loop back)
workflow = (
    WorkflowBuilder()
    .set_start_executor(search_executor)
    .add_edge(search_executor, gen_script_executor)
    .add_edge(gen_script_executor, review_executor)
    .add_edge(review_executor, gen_script_executor)  # Loop back for regeneration
    .build()
)

IV. Multimodal Synthesis: VibeVoice Technology

The "Future Bytes" podcast is brought to life using VibeVoice, a specialized technology from Microsoft Research designed for natural conversational synthesis.

• Conversational rhythm: It automatically handles natural turn-taking and speech cadences.
• High efficiency: By operating at an ultra-low 7.5 Hz frame rate, it significantly reduces the compute power required for high-fidelity audio.
• Scalability: The system supports up to 4 distinct voices and can generate up to 90 minutes of continuous audio.

V. Observability and Debugging: DevUI

Building multi-agent systems requires deep visibility into the agentic "thinking" process. We leverage DevUI, a specialized web interface for testing and tracing:

• Interactive tracing: Developers can watch the message flow and tool-calling in real time.
• Automatic discovery: DevUI auto-discovers agents defined within the project structure.
• Input auto-generation: The UI generates input fields based on workflow requirements, allowing for rapid iteration.

VI. Technical Requirements for Edge Deployment

Deploying this studio locally requires specific hardware and software configurations to handle simultaneous LLM and TTS inference:

• Software: Python 3.10+, Ollama, and the Microsoft Agent Framework.
• Hardware: 16GB+ RAM is the minimum requirement; 32GB is recommended for running multiple agents and VibeVoice concurrently.
• Compute: A modern GPU/NPU (e.g., NVIDIA RTX or Snapdragon X Elite) is essential for smooth inference.

Final Perspective: From Coding to Directing

The AI Podcast Studio represents a significant shift toward agentic content creation. By mastering these orchestration patterns and leveraging local EdgeAI, developers move from simply writing code to directing entire ecosystems of intelligent agents. This "local-first" model ensures that the future of creativity is private, efficient, and infinitely scalable.

Download the sample here.

Resources
• EdgeAI for Beginners - https://github.com/microsoft/edgeai-for-beginners
• Microsoft Agent Framework - https://github.com/microsoft/agent-framework
• Microsoft Agent Framework Samples - https://github.com/microsoft/agent-framework-samples
From Cloud to Chip: Building Smarter AI at the Edge with Windows AI PCs

As AI engineers, we've spent years optimizing models for the cloud: scaling inference, wrangling latency, and chasing compute across clusters. But the frontier is shifting. With the rise of Windows AI PCs and powerful local accelerators, the edge is no longer a constraint; it's now a canvas. Whether you're deploying vision models to industrial cameras, optimizing speech interfaces for offline assistants, or building privacy-preserving apps for healthcare, Edge AI is where real-world intelligence meets real-time performance.

Why Edge AI, Why Now?

Edge AI isn't just about running models locally; it's about rethinking the entire lifecycle:

- Latency: Decisions in milliseconds, not round-trips to the cloud.
- Privacy: Sensitive data stays on-device, enabling HIPAA/GDPR compliance.
- Resilience: Offline-first apps that don't break when the network does.
- Cost: Reduced cloud compute and bandwidth overhead.

With Windows AI PCs powered by Intel and Qualcomm NPUs, and tools like ONNX Runtime, DirectML, and Olive, developers can now optimize and deploy models with unprecedented efficiency.

What You'll Learn in Edge AI for Beginners

The Edge AI for Beginners curriculum is a hands-on, open-source guide designed for engineers ready to move from theory to deployment.

Multi-Language Support
The content is available in over 48 languages, so you can read and study in your native language.

What You'll Master
This course takes you from fundamental concepts to production-ready implementations, covering:
- Small Language Models (SLMs) optimized for edge deployment
- Hardware-aware optimization across diverse platforms
- Real-time inference with privacy-preserving capabilities
- Production deployment strategies for enterprise applications

Why EdgeAI Matters
Edge AI represents a paradigm shift that addresses critical modern challenges:
- Privacy & Security: Process sensitive data locally without cloud exposure
- Real-time Performance: Eliminate network latency for time-critical applications
- Cost Efficiency: Reduce bandwidth and cloud computing expenses
- Resilient Operations: Maintain functionality during network outages
- Regulatory Compliance: Meet data sovereignty requirements

Edge AI
Edge AI refers to running AI algorithms and language models locally on hardware, close to where data is generated, without relying on cloud resources for inference. It reduces latency, enhances privacy, and enables real-time decision-making.

Core Principles:
- On-device inference: AI models run on edge devices (phones, routers, microcontrollers, industrial PCs)
- Offline capability: Functions without persistent internet connectivity
- Low latency: Immediate responses suited for real-time systems
- Data sovereignty: Keeps sensitive data local, improving security and compliance

Small Language Models (SLMs)
SLMs like Phi-4, Mistral-7B, Qwen, and Gemma are optimized versions of larger LLMs, trained or distilled for:
- Reduced memory footprint: Efficient use of limited edge device memory
- Lower compute demand: Optimized for CPU and edge GPU performance
- Faster startup times: Quick initialization for responsive applications

They unlock powerful NLP capabilities while meeting the constraints of:
- Embedded systems: IoT devices and industrial controllers
- Mobile devices: Smartphones and tablets with offline capabilities
- IoT devices: Sensors and smart devices with limited resources
- Edge servers: Local processing units with limited GPU resources
- Personal computers: Desktop and laptop deployment scenarios

Course Modules & Navigation
Course duration: 10 hours of content.

Module | Topic | Focus Area | Key Content | Level | Duration
📖 00 | Introduction to EdgeAI | Foundation & Context | EdgeAI Overview • Industry Applications • SLM Introduction • Learning Objectives | Beginner | 1-2 hrs
📚 01 | EdgeAI Fundamentals | Cloud vs Edge AI comparison | EdgeAI Fundamentals • Real World Case Studies • Implementation Guide • Edge Deployment | Beginner | 3-4 hrs
🧠 02 | SLM Model Foundations | Model families & architecture | Phi Family • Qwen Family • Gemma Family • BitNET • μModel • Phi-Silica | Beginner | 4-5 hrs
🚀 03 | SLM Deployment Practice | Local & cloud deployment | Advanced Learning • Local Environment • Cloud Deployment | Intermediate | 4-5 hrs
⚙️ 04 | Model Optimization Toolkit | Cross-platform optimization | Introduction • Llama.cpp • Microsoft Olive • OpenVINO • Apple MLX • Workflow Synthesis | Intermediate | 5-6 hrs
🔧 05 | SLMOps Production | Production operations | SLMOps Introduction • Model Distillation • Fine-tuning • Production Deployment | Advanced | 5-6 hrs
🤖 06 | AI Agents & Function Calling | Agent frameworks & MCP | Agent Introduction • Function Calling • Model Context Protocol | Advanced | 4-5 hrs
💻 07 | Platform Implementation | Cross-platform samples | AI Toolkit • Foundry Local • Windows Development | Advanced | 3-4 hrs
🏭 08 | Foundry Local Toolkit | Production-ready samples | Sample applications (see details below) | Expert | 8-10 hrs

Each module includes Jupyter notebooks, code samples, and deployment walkthroughs, perfect for engineers who learn by doing.

Developer Highlights

- 🔧 Olive: Microsoft's optimization toolchain for quantization, pruning, and acceleration.
- 🧩 ONNX Runtime: Cross-platform inference engine with support for CPU, GPU, and NPU.
- 🎮 DirectML: GPU-accelerated ML API for Windows, ideal for gaming and real-time apps (see the ONNX Runtime + DirectML sketch at the end of this article).
- 🖥️ Windows AI PCs: Devices with built-in NPUs for low-power, high-performance inference.

Local AI: Beyond the Edge

Local AI isn't just about inference; it's about autonomy. Imagine agents that:

- Learn from local context
- Adapt to user behavior
- Respect privacy by design

With tools like the Agent Framework, Azure AI Foundry, Windows Copilot Studio, and Foundry Local, developers can orchestrate local agents that blend LLMs, sensors, and user preferences, all without cloud dependency.

Try It Yourself

Ready to get started? Clone the Edge AI for Beginners GitHub repo, run the notebooks, and deploy your first model to a Windows AI PC or IoT device. Whether you're building smart kiosks, offline assistants, or industrial monitors, this curriculum gives you the scaffolding to go from prototype to production.
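To make the Developer Highlights tangible, here is a minimal sketch of running a local ONNX model on a Windows AI PC GPU through ONNX Runtime's DirectML execution provider. It assumes the Microsoft.ML.OnnxRuntime.DirectML NuGet package; the model file name, the input tensor name "input", and the 1x3x224x224 shape are placeholders for your own model.

// Local inference sketch with ONNX Runtime + DirectML on a Windows AI PC.
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;

// Route inference to DirectML device 0 (the default GPU/NPU adapter).
var options = new SessionOptions();
options.AppendExecutionProvider_DML(0);

using var session = new InferenceSession("model.onnx", options);

// Inspect the model's declared inputs (name, element type, shape).
foreach (var input in session.InputMetadata)
    Console.WriteLine($"{input.Key}: {input.Value.ElementType} [{string.Join(",", input.Value.Dimensions)}]");

// Build a dummy input tensor for an assumed 1x3x224x224 image input.
var tensor = new DenseTensor<float>(new[] { 1, 3, 224, 224 });
var inputs = new List<NamedOnnxValue> { NamedOnnxValue.CreateFromTensor("input", tensor) };

// Run inference entirely on-device; no data leaves the machine.
using var results = session.Run(inputs);
var output = results.First().AsEnumerable<float>().ToArray();
Console.WriteLine($"Output values: {output.Length}");

On machines without a DirectML-capable adapter, omitting the AppendExecutionProvider_DML call leaves the default CPU provider in place, so the same code still runs, just more slowly.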
AI Upskilling Framework Level 3 Building

The Global AI Community is excited to bring you the latest updates on AI Upskilling Framework Level 3 Building, straight from Microsoft Ignite! This session dives deep into advanced concepts for building agentic workflows and showcases new announcements that will help developers accelerate their agentic AI journey.
Microsoft 365 & Power Platform product updates call

💡 The Microsoft 365 & Power Platform product updates call concentrates on the different use cases and features within Microsoft 365 and the Power Platform. The call covers topics like Microsoft 365 Copilot, Copilot Studio, Microsoft Teams, Power Platform, Microsoft Graph, Microsoft Viva, Microsoft Search, Microsoft Lists, SharePoint, Power Automate, Power Apps, and more.

👏 The weekly Tuesday call is for all community members to see Microsoft PMs, engineering, and Cloud Advocates showcasing the art of the possible with Microsoft 365 and Power Platform.

📅 On the 16th of December we'll have the following agenda:

• News and updates from Microsoft
• Together mode group photo
• Fabian Williams – Copilot Studio × MCP: From Zero to Shipping a Real Event Agent
• Vesa Juvonen – Introduction to SharePoint Pages Agent in Microsoft 365 Copilot

📞 & 📺 Join the Microsoft Teams meeting live at https://aka.ms/community/ms-speakers-call-join

👋 See you in the call! 🎅🎄 Note that this call goes on a holiday break after the 16th of December call and is back on the 13th of January 2026.

💡 Building something cool for Microsoft 365 or Power Platform? We are always looking for presenters - volunteer for a community call demo at https://aka.ms/community/request/demo

📖 Resources:
• Previous community call recordings and demos on the Microsoft Community Learning YouTube channel: https://aka.ms/community/youtube
• Microsoft 365 & Power Platform samples from Microsoft and the community: https://aka.ms/community/samples
• Microsoft 365 & Power Platform community details: https://aka.ms/community/home