Rethinking Documentation Translation: Treating Translations as Versioned Software Assets This article is written from the perspective of maintaining large, open-source documentation repositories in the Microsoft ecosystem. I am the maintainer of Co-op Translator, an open-source tool for automating multilingual documentation translation, used across multiple large documentation repositories, including Microsoft’s For Beginners series. In large documentation repositories, translation problems rarely fail loudly. They fail quietly, and they accumulate over time. Recently, we made a fundamental design decision in how Co-op Translator handles translations. Translations are treated as versioned software assets, not static outputs. This article explains why we reached that conclusion, and what this perspective enables for teams maintaining large, fast-moving documentation repositories. When translations quietly become a liability In most documentation projects, translations are treated as finished outputs. Once a file is translated, it is assumed to remain valid until someone explicitly notices a problem. But documentation rarely stands still. Text changes. Code examples evolve. Screenshots are replaced. Notebooks are updated to reflect new behavior. The problem is that these changes are often invisible in translated content. A translation may still read fluently, while the information it contains is already out of date. At that point, the issue is no longer about translation quality. It becomes a maintenance problem. Reframing the question Most translation workflows implicitly ask: Is this translation correct? In practice, maintainers struggle with a different question: Is this translation still synchronized with the current source? This distinction matters. A translation can be correct and still be out of sync. Once we acknowledged this, it became clear that treating translations as static content was no longer sufficient. The design decision: translations as versioned assets Starting with Co-op Translator 0.16.2, we made a deliberate design decision: Translations are treated as versioned software assets. This applies not only to Markdown files, but also to images, notebooks, and any other translated artifacts. Translated content is not just text. It is an artifact generated from a specific version of a source. To make this abstraction operational rather than theoretical, we did not invent a new mechanism. Instead, we looked to systems that already solve a similar problem: pip, poetry, and npm. These tools are designed to track artifacts as their sources evolve. We applied the same thinking to translated content. Closer to dependency management than translation jobs The closest analogy is software dependency management. When a dependency becomes outdated: it is not suddenly “wrong,” it is simply no longer aligned with the current version. Translations behave the same way. When the source document changes: the translated file does not immediately become incorrect, it becomes out of sync with its source version. This framing shifts the problem away from translation output and toward state and synchronization. Why file-level versioning matters Many translation systems operate at the string or segment level. That model works well for UI text and relatively stable resources. Documentation is different. A Markdown file is an artifact. A screenshot is an artifact. A notebook is an artifact. They are consumed as units, not as isolated strings. 
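To make file-level tracking concrete, here is a minimal sketch of per-language state and drift detection. The JSON layout, field names, and hashing choice are illustrative assumptions, not Co-op Translator's actual schema; the point is only that each translated artifact records the source version it was generated from.

```python
import hashlib
import json
from pathlib import Path

def source_version(path: Path) -> str:
    """Version a source artifact by hashing its bytes (a commit SHA works equally well)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()[:12]

def find_drift(state_file: Path, repo_root: Path) -> list[str]:
    """Return translated artifacts whose recorded source version no longer matches the source."""
    state = json.loads(state_file.read_text(encoding="utf-8"))
    stale = []
    for entry in state["artifacts"]:
        current = source_version(repo_root / entry["source"])
        if current != entry["source_version"]:
            stale.append(entry["translation"])
    return stale

# Example language-scoped state file (translations/ko/state.json), illustrative only:
# {
#   "language": "ko",
#   "artifacts": [
#     {"source": "docs/intro.md",       "translation": "translations/ko/docs/intro.md",       "source_version": "9f2c1a7b3d4e"},
#     {"source": "images/pipeline.png", "translation": "translations/ko/images/pipeline.png", "source_version": "51e0aa92c8f1"}
#   ]
# }

if __name__ == "__main__":
    for path in find_drift(Path("translations/ko/state.json"), Path(".")):
        print(f"out of sync: {path}")
```

Whether the version is a content hash or a commit SHA, the workflow is the same: regenerate or review only the artifacts the check flags, exactly as a lock file flags outdated dependencies.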
Managing translation state at the file level allows maintainers to reason about translations using the same mental model they already apply to other repository assets.

What changed in practice

From embedded markers to explicit state
Previously, translation metadata lived inside translated files as embedded comments or markers. This approach had clear limitations: translation state was fragmented, difficult to inspect globally, and easy to miss as repositories grew. We moved to language-scoped JSON state files that explicitly track the source version, the translated artifact, and its synchronization status. Translation state is no longer hidden inside content. It is a first-class, inspectable part of the repository.

Extending the model to images and notebooks
The same model now applies consistently to translated images, localized notebooks, and other non-text artifacts. If an image changes in the source language, the translated image becomes out of sync. If a notebook is updated, its translated versions are evaluated against the new source version. The format does not matter. The lifecycle does. Once translations are treated as versioned assets, the system remains consistent across all content types.

What this enables
This design enables:
- Explicit drift detection: see which translations are out of sync without guessing.
- Consistent maintenance signals: text, images, and notebooks follow the same rules.
- Clear responsibility boundaries: the system reports state; humans decide action.
- Scalability for fast-moving repositories: translation maintenance becomes observable, not reactive.
In large documentation sets, this difference determines whether translation maintenance is sustainable at all.

What this is not
This system does not judge translation quality, determine semantic correctness, or auto-approve content. It answers one question only: is this translated artifact synchronized with its source version?

Who this is for
This approach is designed for teams that maintain multilingual documentation, update content frequently, and need confidence in what is actually up to date. When documentation evolves faster than translations, treating translations as versioned assets becomes a necessity, not an optimization.

Closing thought
Once translations are modeled as software assets, long-standing ambiguities disappear. State becomes visible. Maintenance becomes manageable. And translations fit naturally into existing software workflows. At that point, the question is no longer whether translation drift exists, but: can you see it?

Reference
Co-op Translator repository: https://github.com/Azure/co-op-translator

The Perfect Fusion of GitHub Copilot SDK and Cloud Native
In today's rapidly evolving AI landscape, we've witnessed the transformation from simple chatbots to sophisticated agent systems. As a developer and technology evangelist, I've observed an emerging trend—it's not about making AI omnipotent, but about enabling each AI Agent to achieve excellence in specific domains. Today, I want to share an exciting technology stack: GitHub Copilot SDK (a development toolkit that embeds production-grade agent engines into any application) + Agent-to-Agent (A2A) Protocol (a communication standard enabling standardized agent collaboration) + Cloud Native Deployment (the infrastructure foundation for production systems). Together, these three components enable us to build truly collaborative multi-agent systems. 1. From AI Assistants to Agent Engines: Redefining Capability Boundaries Traditional AI assistants often pursue "omnipotence"—attempting to answer any question you throw at them. However, in real production environments, this approach faces serious challenges: Inconsistent Quality: A single model trying to write code, perform data analysis, and generate creative content struggles to achieve professional standards in each domain Context Pollution: Mixing prompts from different tasks leads to unstable model outputs Difficult Optimization: Adjusting prompts for one task type may negatively impact performance on others High Development Barrier: Building agents from scratch requires handling planning, tool orchestration, context management, and other complex logic GitHub proposed a revolutionary approach—instead of forcing developers to build agent frameworks from scratch, provide a production-tested, programmable agent engine. This is the core value of the GitHub Copilot SDK. Evolution from Copilot CLI to SDK GitHub Copilot CLI is a powerful command-line tool that can: Plan projects and features Modify files and execute commands Use custom agents Delegate tasks to cloud execution Integrate with MCP servers The GitHub Copilot SDK extracts the agentic core behind Copilot CLI and offers it as a programmable layer for any application. This means: You're no longer confined to terminal environments You can embed this agent engine into GUI applications, web services, and automation scripts You gain access to the same execution engine validated by millions of users Just like in the real world, we don't expect one person to be a doctor, lawyer, and engineer simultaneously. Instead, we provide professional tools and platforms that enable professionals to excel in their respective domains. 2. GitHub Copilot SDK: Embedding Copilot CLI's Agentic Core into Any App Before diving into multi-agent systems, we need to understand a key technology: GitHub Copilot SDK. What is GitHub Copilot SDK? GitHub Copilot SDK (now in technical preview) is a programmable agent execution platform. It allows developers to embed the production-tested agentic core from GitHub Copilot CLI directly into any application. Simply put, the SDK provides: Out-of-the-box Agent Loop: No need to build planners, tool orchestration, or context management from scratch Multi-model Support: Choose different AI models (like GPT-4, Claude Sonnet) for different task phases Tool and Command Integration: Built-in file editing, command execution, and MCP server integration capabilities Streaming Real-time Responses: Support for progress updates on long-running tasks Multi-language Support: SDKs available for Node.js, Python, Go, and .NET Why is the SDK Critical for Building Agents? 
Building an agentic workflow from scratch is extremely difficult. You need to handle: Context management across multiple conversation turns Orchestration of tools and commands Routing between different models MCP server integration Permission control, safety boundaries, and error handling GitHub Copilot SDK abstracts away all this underlying complexity. You only need to focus on: Defining agent professional capabilities (through Skill files) Providing domain-specific tools and constraints Implementing business logic SDK Usage Examples Python Example (from actual project implementation): from copilot import CopilotClient # Initialize client copilot_client = CopilotClient() await copilot_client.start() # Create session and load Skill session = await copilot_client.create_session({ "model": "claude-sonnet-4.5", "streaming": True, "skill_directories": ["/path/to/skills/blog/SKILL.md"] }) # Send task await session.send_and_wait({ "prompt": "Write a technical blog about multi-agent systems" }, timeout=600) Skill System: Professionalizing Agents While the SDK provides a powerful execution engine, how do we make agents perform professionally in specific domains? The answer is Skill files. A Skill file is a standardized capability definition containing: Capability Declaration: Explicitly tells the system "what I can do" (e.g., blog generation, PPT creation) Domain Knowledge: Preset best practices, standards, and terminology guidelines Workflow: Defines the complete execution path from input to output Output Standards: Ensures generated content meets format and quality requirements Through the combination of Skill files + SDK, we can build truly professional agents rather than generic "jack-of-all-trades assistants." 3. A2A Protocol: Enabling Seamless Agent Collaboration Once we have professional agents, the next challenge is: how do we make them work together? This is the core problem the Agent-to-Agent (A2A) Protocol aims to solve. Three Core Mechanisms of A2A Protocol 1. Agent Discovery (Service Discovery) Each agent exposes its capability card through the standardized /.well-known/agent-card.json endpoint, acting like a business card that tells other agents "what I can do": { "name": "blog_agent", "description": "Blog generation with DeepSearch", "primaryKeywords": ["blog", "article", "write"], "skills": [{ "id": "blog_generation", "tags": ["blog", "writing"], "examples": ["Write a blog about..."] }], "capabilities": { "streaming": true } } 2. Intelligent Routing The Orchestrator matches tasks with agent capabilities through scoring. The project's routing algorithm implements keyword matching and exclusion detection: Positive Matching: If a task contains an agent's primaryKeywords, score +0.5 Negative Exclusion: If a task contains other agents' keywords, score -0.3 This way, when users say "write a blog about cloud native," the system automatically selects the Blog Agent; when they say "create a tech presentation PPT," it routes to the PPT Agent. 3. SSE Streaming (Real-time Streaming) For time-consuming tasks (like generating a 5000-word blog), A2A uses Server-Sent Events to push real-time progress, allowing users to see the agent working instead of just waiting. This is crucial for user experience. 4. Cloud Native Deployment: Making Agent Systems Production-Ready Even the most powerful technology is just a toy if it can't be deployed to production environments. This project demonstrates a complete deployment of a multi-agent system to a cloud-native platform (Azure Container Apps). 
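Before turning to deployment, here is a compact sketch of the keyword-scoring router described in section 3. The +0.5 and -0.3 weights come from the description above and the agent-card fields mirror the earlier example; the function names, tie-breaking, and second agent card are illustrative rather than the project's exact implementation.

```python
AGENT_CARDS = [
    {"name": "blog_agent", "primaryKeywords": ["blog", "article", "write"]},
    {"name": "ppt_agent",  "primaryKeywords": ["ppt", "presentation", "slides"]},
]

def score_agent(task: str, agent: dict, others: list[dict]) -> float:
    """+0.5 for each of the agent's own keywords found in the task, -0.3 for each keyword of another agent."""
    text = task.lower()
    score = 0.0
    for kw in agent["primaryKeywords"]:
        if kw in text:
            score += 0.5
    for other in others:
        for kw in other["primaryKeywords"]:
            if kw in text:
                score -= 0.3
    return score

def route(task: str) -> str:
    """Pick the agent card with the highest score for the incoming task."""
    scored = []
    for agent in AGENT_CARDS:
        others = [a for a in AGENT_CARDS if a is not agent]
        scored.append((score_agent(task, agent, others), agent["name"]))
    return max(scored)[1]

print(route("write a blog about cloud native"))      # -> blog_agent ("write" + "blog" match)
print(route("create a tech presentation ppt deck"))  # -> ppt_agent
```

The same scoring idea extends naturally to more agents: the Orchestrator only needs each card's primaryKeywords, which it already fetches from the /.well-known/agent-card.json endpoint.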
Why Choose Cloud Native? Elastic Scaling: When blog generation requests surge, the Blog Agent can auto-scale; it scales down to zero during idle times to save costs Independent Evolution: Each agent has its own Docker image and deployment pipeline; updating the Blog Agent doesn't affect the PPT Agent Fault Isolation: If one agent crashes, it won't bring down the entire system; the Orchestrator automatically degrades Global Distribution: Through Azure Container Apps, agents can be deployed across multiple global regions to reduce latency Container Deployment Essentials Each agent in the project has a standardized Dockerfile: FROM python:3.12-slim WORKDIR /app COPY requirements.txt . RUN pip install --no-cache-dir -r requirements.txt COPY . . EXPOSE 8001 CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8001"] Combined with the deploy-to-aca.sh script, one-click deployment to Azure: # Build and push image az acr build --registry myregistry --image blog-agent:latest . # Deploy to Container Apps az containerapp create \ --name blog-agent \ --resource-group my-rg \ --environment my-env \ --image myregistry.azurecr.io/blog-agent:latest \ --secrets github-token=$COPILOT_TOKEN \ --env-vars COPILOT_GITHUB_TOKEN=secretref:github-token 5. Real-World Results: From "Works" to "Works Well" Let's see how this system performs in real scenarios. Suppose a user initiates a request: "Write a technical blog about Kubernetes multi-tenancy security, including code examples and best practices" System Execution Flow: Orchestrator receives the request and scans all agents' capability cards Keyword matching: "write" + "blog" → Blog Agent scores 1.0, PPT Agent scores 0.0 Routes to Blog Agent, loads technical writing Skill Blog Agent initiates DeepSearch to collect latest K8s security materials SSE real-time push: "Collecting materials..." → "Generating outline..." → "Writing content..." Returns complete blog after 5 minutes, including code highlighting, citation sources, and best practices summary Compared to traditional "omnipotent" AI assistants, this system's advantages: ✅ Professionalism: Blog Agent trained with technical writing Skills produces content with clear structure, accurate terminology, and executable code ✅ Visibility: Users see progress throughout, knowing what the AI is doing ✅ Extensibility: Adding new agents (video script, data analysis) in the future requires no changes to existing architecture 6. Key Technical Challenges and Solutions Challenge 1: Inaccurate Agent Capability Descriptions Leading to Routing Errors Solution: Define clear primaryKeywords and examples in Agent Cards Implement exclusion detection mechanism to prevent tasks from being routed to unsuitable agents Challenge 2: Poor User Experience for Long-Running Tasks Solution: Fully adopt SSE streaming, pushing working/completed/error status in real-time Display progress hints in status messages so users know what the system is doing Challenge 3: Sensitive Information Leakage Risk Solution: Use Azure Key Vault or Container Apps Secrets to manage GitHub Tokens Inject via environment variables, never hardcode in code or images Check required environment variables in deployment scripts to prevent configuration errors 7. Future Outlook: SDK-Driven Multi-Agent Ecosystem This project is just the beginning. 
As GitHub Copilot SDK and A2A Protocol mature, we can build richer agent ecosystems: Actual SDK Application Scenarios According to GitHub's official blog, development teams have already used the Copilot SDK to build: YouTube Chapter Generator: Automatically generates timestamped chapter markers for videos Custom Agent GUIs: Visual agent interfaces for specific business scenarios Speech-to-Command Workflows: Control desktop applications through voice AI Battle Games: Interactive competitive experiences with AI Intelligent Summary Tools: Automatic extraction and summarization of key information Multi-Agent System Evolution Directions 🏪 Agent Marketplace: Developers can publish specialized agents (legal documents, medical reports, etc.) that plug-and-play via A2A protocol 🔗 Cascade Orchestration: Orchestrator automatically breaks down complex tasks, calling multiple agents collaboratively (e.g., "write blog + generate images + create PPT") 🌐 Cross-Platform Interoperability: Based on A2A standards, agents developed by different companies can call each other, breaking down data silos ⚙️ Automated Workflows: Delegate routine repetitive work to agent chains, letting humans focus on creative work 🎯 Vertical Domain Specialization: Combined with Skill files, build high-precision agents in professional fields like finance, healthcare, and legal Core Value of the SDK The significance of GitHub Copilot SDK lies in: it empowers every developer to become a builder of agent systems. You don't need deep learning experts, you don't need to implement agent frameworks yourself, and you don't even need to manage GPU clusters. You only need to: Install the SDK (npm install github/copilot-sdk) Define your business logic and tools Write Skill files describing professional capabilities Call the SDK's execution engine And you can build production-grade intelligent agent applications. Summary: From Demo to Production GitHub Copilot SDK + A2A + Cloud Native isn't three independent technology stacks, but a complete methodology: GitHub Copilot SDK provides an out-of-the-box agent execution engine—handling planning, tool orchestration, context management, and other underlying complexity Skill files enable agents with domain-specific professional capabilities—defining best practices, workflows, and output standards A2A Protocol enables standardized communication and collaboration between agents—implementing service discovery, intelligent routing, and streaming Cloud Native makes the entire system production-ready—containerization, elastic scaling, fault isolation For developers, this means we no longer need to build agent frameworks from scratch or struggle with the black magic of prompt engineering. We only need to: Use GitHub Copilot SDK to obtain a production-grade agent execution engine Write domain-specific Skill files to define professional capabilities Follow A2A protocol to implement standard interfaces between agents Deploy to cloud platforms through containerization And we can build AI Agent systems that are truly usable, well-designed, and production-ready. 🚀 Start Building Complete project code is open source: https://github.com/kinfey/Multi-AI-Agents-Cloud-Native/tree/main/code/GitHubCopilotAgents_A2A Follow the README guide and deploy your first Multi-Agent system in 30 minutes! 
References
- GitHub Copilot SDK Official Announcement - Build an agent into any app with the GitHub Copilot SDK
- GitHub Copilot SDK Repository - github.com/github/copilot-sdk
- A2A Protocol Official Specification - a2a-protocol.org/latest/
- Project Source Code - Multi-AI-Agents-Cloud-Native
- Azure Container Apps Documentation - learn.microsoft.com/azure/container-apps

What is trending in Hugging Face on Microsoft Foundry? Feb 2, 2026
Open‑source AI is moving fast, with important breakthroughs in reasoning, agentic systems, multimodality, and efficiency emerging every day. Hugging Face has been a leading platform where researchers, startups, and developers share and discover new models. Microsoft Foundry brings these trending Hugging Face models into a production‑ready experience, where developers can explore, evaluate, and deploy them within their Azure environment. Our weekly Model Monday’s series highlights Hugging Face models available in Foundry, focusing on what matters most to developers: why a model is interesting, where it fits, and how to put it to work quickly. This week’s Model Mondays edition highlights three Hugging Face models, including a powerful Mixture-of-Experts model from Z. AI designed for lightweight deployment, Meta’s unified foundation model for image and video segmentation, and MiniMax’s latest open-source agentic model optimized for complex workflows. Models of the week Z.AI’s GLM-4.7-flash Model Basics Model name: zai-org/GLM-4.7-Flash Parameters / size: 30B total -3B active Default settings: 131,072 max new tokens Primary task: Agentic, Reasoning and Coding Why this model matters Why it’s interesting: It utilizes a Mixture-of-Experts (MoE) architecture (30B total parameters and 3B active parameters) to offer a new option for lightweight deployment. It demonstrates strong performance on logic and reasoning benchmarks, outperforming similar sized models like gpt-oss-20b on AIME 25 and GPQA benchmarks. It supports advanced inference features like "Preserved Thinking" mode for multi-turn agentic tasks. Best‑fit use cases: Lightweight local deployment, multi-turn agentic tasks, and logical reasoning applications. What’s notable: From the Foundry catalog, users can deploy on a A100 instance or unsloth/GLM-4.7-Flash-GGUF on a CPU. ource SOTA scores among models of comparable size. Additionally, compared to similarly sized models, GLM-4.7-Flash demonstrates superior frontend and backend development capabilities. Click to see more: https://docs.z.ai Try it Use case Best‑practice prompt pattern Agentic coding (multi‑step repo work, debugging, refactoring) Treat the model as an autonomous coding agent, not a snippet generator. Explicitly require task decomposition and step‑by‑step execution, then a single consolidated result. Long‑context agent workflows (local or low‑cost autonomous agents) Call out long‑horizon consistency and context preservation. Instruct the model to retain earlier assumptions and decisions across turns. Now that you know GLM‑4.7‑Flash works best when you give it a clear goal and let it reason through a bounded task, here’s an example prompt that a product or engineering team might use to identify risks and propose mitigations: You are a software reliability analyst for a mid‑scale SaaS platform. Review recent incident reports, production logs, and customer issues to uncover edge‑case failures outside normal usage (e.g., rare inputs, boundary conditions, timing/concurrency issues, config drift, or unexpected feature interactions). Prioritize low‑frequency, high‑impact risks that standard testing misses. Recommend minimal, low‑cost fixes (validation, guardrails, fallback logic, or documentation). Deliver a concise executive summary with sections: Observed Edge Cases, Root Causes, User Impact, Recommended Lightweight Fixes, and Validation Steps. 
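If you have deployed GLM-4.7-Flash from the Foundry catalog, a prompt like the one above can be sent from any OpenAI-compatible client. The sketch below assumes the deployment exposes an OpenAI-compatible chat-completions endpoint; the environment variable names and the model/deployment name are placeholders, not fixed values.

```python
import os
from openai import OpenAI

# Placeholders: point these at your own Foundry deployment's endpoint and key.
client = OpenAI(
    base_url=os.environ["FOUNDRY_ENDPOINT"],
    api_key=os.environ["FOUNDRY_API_KEY"],
)

prompt = (
    "You are a software reliability analyst for a mid-scale SaaS platform. "
    "Review recent incident reports, production logs, and customer issues to uncover "
    "edge-case failures outside normal usage. Deliver a concise executive summary with "
    "sections: Observed Edge Cases, Root Causes, User Impact, "
    "Recommended Lightweight Fixes, and Validation Steps."
)

# Send the reliability-analyst prompt to the deployed model and print the report.
response = client.chat.completions.create(
    model="GLM-4.7-Flash",  # use the deployment name shown in Foundry
    messages=[{"role": "user", "content": prompt}],
    max_tokens=2048,
)
print(response.choices[0].message.content)
```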
Meta's Segment Anything 3 (SAM3) Model Basics Model name: facebook/sam3 Parameters / size: 0.9B Primary task: Mask Generation, Promptable Concept Segmentation (PCS) Why this model matters Why it’s interesting: It handles a vastly larger set of open-vocabulary prompts than SAM 2, and unifies image and video segmentation capabilities. It includes a "SAM 3 Tracker" mode that acts as a drop-in replacement for SAM 2 workflows with improved performance. Best‑fit use cases: Open-vocabulary object detection, video object tracking, and automatic mask generation What’s notable: Introduces Promptable Concept Segmentation (PCS), allowing users to find all matching objects (e.g., "dial") via text prompt rather than just single instances. Try it This model enables users to identify specific objects within video footage and isolate them over extended periods. With just one line of code, it is possible to detect multiple similar objects simultaneously. The accompanying GIF demonstrates how SAM3 efficiently highlights players wearing white on the field as they appear and disappear from view. Additional examples are available at the following repository: https://github.com/facebookresearch/sam3/blob/main/assets/player.gif Use case Best‑practice prompt pattern Agentic coding (multi‑step repo work, debugging, refactoring) Treat SAM 3 as a concept detector, not an interactive click tool. Use short, concrete noun‑phrase concept prompts instead of describing the scene or asking questions. Example prompt: “yellow school bus” or “shipping containers”. Avoid verbs or full sentences. Video segmentation + object tracking Specify the same concept prompt once, then apply it across the video sequence. Do not restate the prompt per frame. Let the model maintain identity continuity. Example: “person wearing a red jersey”. Hard‑to‑name or visually subtle objects Use exemplar‑based prompts (image region or box) when text alone is ambiguous. Optionally combine positive and negative exemplars to refine the concept. Avoid over‑constraining with long descriptions. Using the GIF above as a leading example, here is a prompt that shows how SAM 3 turns raw sports footage into structured, reusable data. By identifying and tracking players based on visual concepts like jersey color so that sports leagues can turn tracked data into interactive experiences where automated player identification can relay stats, fun facts, etc when built into a larger application. Here is a prompt that will allow you to start identifying specific players across video: Act as a sports analytics operator analyzing football match footage. Segment and track all football players wearing blue jerseys across the video. Generate pixel‑accurate segmentation masks for each player and assign persistent instance IDs that remain stable during camera movement, zoom, and player occlusion. Exclude referees, opposing team jerseys, sidelines, and crowd. Output frame‑level masks and tracking metadata suitable for overlays, player statistics, and downstream analytics pipelines. MiniMax AI's MiniMax-M2.1 Model Basics Model name: MiniMaxAI/MiniMax-M2.1 Parameters / size: 229B-10B Active Default settings: 200,000 max new tokens Primary task: Agentic and Coding Why this model matters Why it’s interesting: It is optimized for robustness in coding, tool use, and long-horizon planning, outperforming Claude Sonnet 4.5 in multilingual scenarios. It excels in full-stack application development, capable of architecting apps "from zero to one”. 
Previous coding models focused on Python optimization; M2.1 brings enhanced capabilities in Rust, Java, Golang, C++, Kotlin, Objective-C, TypeScript, JavaScript, and other languages. The model delivers exceptional stability across various coding agent frameworks.
Best-fit use cases: Lightweight local deployment, multi-turn agentic tasks, and logical reasoning applications.
What's notable: The release of open-source weights for M2.1 delivers a massive leap over M2 on software engineering leaderboards. https://www.minimax.io/

Try it
Use case / best-practice prompt pattern:
- End-to-end agentic coding (multi-file edits, run-fix loops): Treat the model as an autonomous coding agent, not a snippet generator. Explicitly require task decomposition and step-by-step execution, then a single consolidated result.
- Long-horizon tool-using agents (shell, browser, Python): Explicitly request stepwise planning and sequential tool use. M2.1's interleaved thinking and improved instruction-constraint handling are designed for complex, multi-step analytical tasks that require evidence tracking and coherent synthesis, not conversational back-and-forth.
- Long-context reasoning and analysis (large documents / logs): Declare the scope and desired output structure up front. MiniMax-M2.1 performs best when the objective and final artifact are clear, allowing it to manage long context and maintain coherence.

Because MiniMax-M2.1 is designed to act as a long-horizon analytical agent, it shines when you give it a clear end goal and let it work through large volumes of information. Here is a prompt a risk or compliance team could use in practice:

You are a financial risk analysis agent. Analyze the following transaction logs and compliance policy documents to identify potential regulatory violations and systemic risk patterns. Plan your approach before executing. Work through the data step by step, referencing evidence where relevant. Deliver a final report with the following sections: Key Risk Patterns Identified, Supporting Evidence, Potential Regulatory Impact, Recommended Mitigations. Your response should be a complete, executive-ready report, not a conversational draft.

Getting started
You can deploy open-source Hugging Face models directly in Microsoft Foundry by browsing the Hugging Face collection in the Foundry model catalog and deploying to managed endpoints in just a few clicks. You can also start from the Hugging Face Hub: select any supported model and choose "Deploy on Microsoft Foundry", which brings you straight into Azure with secure, scalable inference already configured. Learn how to discover models and deploy them using the Microsoft Foundry documentation.
- Follow along with the Model Mondays series and access the GitHub repo to stay up to date on the latest
- Read the Hugging Face on Azure docs
- Learn about one-click deployments from the Hugging Face Hub on Microsoft Foundry
- Explore models in Microsoft Foundry

GitHub Copilot SDK and Hybrid AI in Practice: Automating README to PPT Transformation
Introduction In today's rapidly evolving AI landscape, developers often face a critical choice: should we use powerful cloud-based Large Language Models (LLMs) that require internet connectivity, or lightweight Small Language Models (SLMs) that run locally but have limited capabilities? The answer isn't either-or—it's hybrid models—combining the strengths of both to create AI solutions that are secure, efficient, and powerful. This article explores hybrid model architectures through the lens of GenGitHubRepoPPT, demonstrating how to elegantly combine Microsoft Foundry Local, GitHub Copilot SDK, and other technologies to automatically generate professional PowerPoint presentations from GitHub README files. 1. Hybrid Model Scenarios and Value 1.1 What Are Hybrid Models? Hybrid AI Models strategically combine locally-running Small Language Models (SLMs) with cloud-based Large Language Models (LLMs) within the same application, selecting the most appropriate model for each task based on its unique characteristics. Core Principles: Local Processing for Sensitive Data: Privacy-critical content analysis happens on-device Cloud for Value Creation: Complex reasoning and creative generation leverage cloud power Balancing Cost and Performance: High-frequency, simple tasks run locally to minimize API costs 1.2 Typical Hybrid Model Use Cases Use Case Local SLM Role Cloud LLM Role Value Proposition Intelligent Document Processing Text extraction, structural analysis Content refinement, format conversion Privacy protection + Professional output Code Development Assistant Syntax checking, code completion Complex refactoring, architecture advice Fast response + Deep insights Customer Service Systems Intent recognition, FAQ handling Complex issue resolution Reduced latency + Enhanced quality Content Creation Platforms Keyword extraction, outline generation Article writing, multilingual translation Cost control + Creative assurance 1.3 Why Choose Hybrid Models? Three Core Advantages: Privacy and Security Sensitive data never leaves local devices Compliant with GDPR, HIPAA, and other regulations Ideal for internal corporate documents and personal information Cost Optimization Reduces cloud API call frequency Local models have zero usage fees Predictable operational costs Performance and Reliability Local processing eliminates network latency Partial functionality in offline environments Cloud models ensure high-quality output 2. Core Technology Analysis 2.1 Large Language Models (LLMs): Cloud Intelligence Representatives What are LLMs? Large Language Models are deep learning-based natural language processing models, typically with billions to trillions of parameters. Through training on massive text datasets, they've acquired powerful language understanding and generation capabilities. 
Representative Models: Claude Sonnet 4.5: Anthropic's flagship model, excelling at long-context processing and complex reasoning GPT-5.2 Series: OpenAI's general-purpose language models Gemini: Google's multimodal large models LLM Advantages: ✅ Exceptional text generation quality ✅ Powerful contextual understanding ✅ Support for complex reasoning tasks ✅ Continuous model updates and optimization Typical Applications: Professional document writing (technical reports, business plans) Code generation and refactoring Multilingual translation Creative content creation 2.2 Small Language Models (SLMs) and Microsoft Foundry Local 2.2.1 SLM Characteristics Small Language Models typically have 1B-7B parameters, designed specifically for resource-constrained environments. Mainstream SLM Model Families: Microsoft Phi Family (Phi Family): Inference-optimized efficient models Alibaba Qwen Family (Qwen Family): Excellent Chinese language capabilities Mistral Series: Outstanding performance with small parameter counts SLM Advantages: ⚡ Low-latency response (millisecond-level) 💰 Zero API costs 🔒 Fully local, data stays on-device 📱 Suitable for edge device deployment 2.2.2 Microsoft Foundry Local: The Foundation of Local AI Foundry Local is Microsoft's local AI runtime tool, enabling developers to easily run SLMs on Windows or macOS devices. Core Features: OpenAI-Compatible API # Using Foundry Local is like using OpenAI API from openai import OpenAI from foundry_local import FoundryLocalManager manager = FoundryLocalManager("qwen2.5-7b-instruct") client = OpenAI( base_url=manager.endpoint, api_key=manager.api_key ) Hardware Acceleration Support CPU: General computing support GPU: NVIDIA, AMD, Intel graphics acceleration NPU: Qualcomm, Intel AI-specific chips Apple Silicon: Neural Engine optimization Based on ONNX Runtime Cross-platform compatibility Highly optimized inference performance Supports model quantization (INT4, INT8) Convenient Model Management # View available models foundry model list # Run a model foundry model run qwen2.5-7b-instruct-generic-cpu:4 # Check running status foundry service ps Foundry Local Application Value: 🎓 Educational Scenarios: Students can learn AI development without cloud subscriptions 🏢 Enterprise Environments: Process sensitive data while maintaining compliance 🧪 R&D Testing: Rapid prototyping without API cost concerns ✈️ Offline Environments: Works on planes, subways, and other no-network scenarios 2.3 GitHub Copilot SDK: The Express Lane from Agent to Business Value 2.3.1 What is GitHub Copilot SDK? GitHub Copilot SDK, released as a technical preview on January 22, 2026, is a game-changer for AI Agent development. Unlike other AI SDKs, Copilot SDK doesn't just provide API calling interfaces—it delivers a complete, production-grade Agent execution engine. Why is it revolutionary? Traditional AI application development requires you to build: ❌ Context management systems (multi-turn conversation state) ❌ Tool orchestration logic (deciding when to call which tool) ❌ Model routing mechanisms (switching between different LLMs) ❌ MCP server integration ❌ Permission and security boundaries ❌ Error handling and retry mechanisms Copilot SDK provides all of this out-of-the-box, letting you focus on business logic rather than underlying infrastructure. 
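Before looking at those advantages in detail, here is a sketch of the local half of the hybrid pattern: using Foundry Local's OpenAI-compatible endpoint (the wiring shown in section 2.2.2, plus the get_model_info call from Foundry Local's documented samples) to turn a README into a slide outline on-device. The model alias, prompt, and file names are illustrative assumptions, not the project's exact code.

```python
from pathlib import Path

from openai import OpenAI
from foundry_local import FoundryLocalManager

# Start (or attach to) a local model via Foundry Local -- no data leaves the machine.
alias = "qwen2.5-7b-instruct"
manager = FoundryLocalManager(alias)
client = OpenAI(base_url=manager.endpoint, api_key=manager.api_key)

readme = Path("README.md").read_text(encoding="utf-8")

# Stage 1 of the hybrid flow: the local SLM turns the README into a structured outline.
response = client.chat.completions.create(
    model=manager.get_model_info(alias).id,
    messages=[
        {"role": "system", "content": "You turn README files into concise presentation outlines."},
        {"role": "user", "content": f"Create a 6-8 slide outline (title plus 3 bullets each) for:\n\n{readme}"},
    ],
)

outline = response.choices[0].message.content
Path("outline.md").write_text(outline, encoding="utf-8")  # handed to the cloud stage next
```

The outline file is then the only thing the cloud stage needs to see, which is what keeps the raw README local.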
2.3.2 Core Advantages: The Ultra-Short Path from Concept to Code Production-Grade Agent Engine: Battle-Tested Reliability Copilot SDK uses the same Agent core as GitHub Copilot CLI, which means: ✅ Validated in millions of real-world developer scenarios ✅ Capable of handling complex multi-step task orchestration ✅ Automatic task planning and execution ✅ Built-in error recovery mechanisms Real-World Example: In the GenGitHubRepoPPT project, we don't need to hand-write the "how to convert outline to PPT" logic—we simply tell Copilot SDK the goal, and it automatically: Analyzes outline structure Plans slide layouts Calls file creation tools Applies formatting logic Handles multilingual adaptation # Traditional approach: requires hundreds of lines of code for logic def create_ppt_traditional(outline): slides = parse_outline(outline) for slide in slides: layout = determine_layout(slide) content = format_content(slide) apply_styling(content, layout) # ... more manual logic return ppt_file # Copilot SDK approach: focus on business intent session = await client.create_session({ "model": "claude-sonnet-4.5", "streaming": True, "skill_directories": [skills_dir] }) session.send_and_wait({"prompt": prompt}, timeout=600) Custom Skills: Reusable Encapsulation of Business Knowledge This is one of Copilot SDK's most powerful features. In traditional AI development, you need to provide complete prompts and context with every call. Skills allow you to: Define once, reuse forever: # .copilot_skills/ppt/SKILL.md # PowerPoint Generation Expert Skill ## Expertise You are an expert in business presentation design, skilled at transforming technical content into easy-to-understand visual presentations. ## Workflow 1. **Structure Analysis** - Identify outline hierarchy (titles, subtitles, bullet points) - Determine topic and content density for each slide 2. **Layout Selection** - Title slide: Use large title + subtitle layout - Content slides: Choose single/dual column based on bullet count - Technical details: Use code block or table layouts 3. **Visual Optimization** - Apply professional color scheme (corporate blue + accent colors) - Ensure each slide has a visual focal point - Keep bullets to 5-7 items per page 4. **Multilingual Adaptation** - Choose appropriate fonts based on language (Chinese: Microsoft YaHei, English: Calibri) - Adapt text direction and layout conventions ## Output Requirements Generate .pptx files meeting these standards: - 16:9 widescreen ratio - Consistent visual style - Editable content (not images) - File size < 5MB Business Code Generation Capability This is the core value of this project. Unlike generic LLM APIs, Copilot SDK with Skills can generate truly executable business code. 
Comparison Example: Aspect Generic LLM API Copilot SDK + Skills Task Description Requires detailed prompt engineering Concise business intent suffices Output Quality May need multiple adjustments Professional-grade on first try Code Execution Usually example code Directly generates runnable programs Error Handling Manual implementation required Agent automatically handles and retries Multi-step Tasks Manual orchestration needed Automatic planning and execution Comparison of manual coding workload: Task Manual Coding Copilot SDK Processing logic code ~500 lines ~10 lines configuration Layout templates ~200 lines Declared in Skill Style definitions ~150 lines Declared in Skill Error handling ~100 lines Automatically handled Total ~950 lines ~10 lines + Skill file Tool Calling & MCP Integration: Connecting to the Real World Copilot SDK doesn't just generate code—it can directly execute operations: 🗃️ File System Operations: Create, read, modify files 🌐 Network Requests: Call external APIs 📊 Data Processing: Use pandas, numpy, and other libraries 🔧 Custom Tools: Integrate your business logic 3. GenGitHubRepoPPT Case Study 3.1 Project Overview GenGitHubRepoPPT is an innovative hybrid AI solution that combines local AI models with cloud-based AI agents to automatically generate professional PowerPoint presentations from GitHub repository README files in under 5 minutes. Technical Architecture: 3.2 Why Adopt a Hybrid Model? Stage 1: Local SLM Processes Sensitive Data Task: Analyze GitHub README, extract key information, generate structured outline Reasons for choosing Qwen-2.5-7B + Foundry Local: Privacy Protection README may contain internal project information Local processing ensures data doesn't leave the device Complies with data compliance requirements Cost Effectiveness Each analysis processes thousands of tokens Cloud API costs are significant in high-frequency scenarios Local models have zero additional fees Performance Qwen-2.5-7B excels at text analysis tasks Outstanding Chinese support Acceptable CPU inference latency (typically 2-3 seconds) Stage 2: Cloud LLM + Copilot SDK Creates Business Value Task: Create well-formatted PowerPoint files based on outline Reasons for choosing Claude Sonnet 4.5 + Copilot SDK: Automated Business Code Generation Traditional approach pain points: Need to hand-write 500+ lines of code for PPT layout logic Require deep knowledge of python-pptx library APIs Style and formatting code is error-prone Multilingual support requires additional conditional logic Copilot SDK solution: Declare business rules and best practices through Skills Agent automatically generates and executes required code Zero-code implementation of complex layout logic Development time reduced from 2-3 days to 2-3 hours Ultra-Short Path from Intent to Execution Comparison: Different ways to implement "Generate professional PPT" 3. Production-Grade Reliability and Quality Assurance Battle-tested Agent engine: Uses the same core as GitHub Copilot CLI Validated in millions of real-world scenarios Automatically handles edge cases and errors Consistent output quality: Professional standards ensured through Skills Automatic validation of generated files Built-in retry and error recovery mechanisms 4. Rapid Iteration and Optimization Capability Scenario: Client requests PPT style adjustment The GitHub Repo https://github.com/kinfey/GenGitHubRepoPPT 4. 
Summary

4.1 Core Value of Hybrid Models + Copilot SDK
The GenGitHubRepoPPT project demonstrates how combining hybrid models with Copilot SDK creates a new paradigm for AI application development.

Privacy and Cost Balance: The hybrid approach allows sensitive README analysis to happen locally using Qwen-2.5-7B, ensuring data never leaves the device while incurring zero API costs. Meanwhile, the value-creating work—generating professional PowerPoint presentations—leverages Claude Sonnet 4.5 through Copilot SDK, delivering quality that justifies the per-use cost.

From Code to Intent: Traditional AI development required writing hundreds of lines of code to handle PPT generation logic, layout selection, style application, and error handling. With Copilot SDK and Skills, developers describe what they want in natural language, and the Agent automatically generates and executes the necessary code. What once took 3-5 days now takes 3-4 hours, with 95% less code to maintain.

Automated Business Code Generation: Copilot SDK doesn't just provide code examples—it generates complete, executable business logic. When you request a multilingual PPT, the Agent understands the requirement, selects appropriate fonts, generates the implementation code, executes it with error handling, validates the output, and returns a ready-to-use file. Developers focus on business intent rather than implementation details.

4.2 Technology Trends

The Shift to Intent-Driven Development: We're witnessing a fundamental change in how developers work. Rather than mastering every programming language detail and framework API, developers are increasingly defining what they want through declarative Skills. Copilot SDK represents this future: you describe capabilities in natural language, and AI Agents handle the code generation and execution automatically.

Edge AI and Cloud AI Integration: The evolution from pure cloud LLMs (powerful but privacy-concerning) to pure local SLMs (private but limited) has led to today's hybrid architectures. GenGitHubRepoPPT exemplifies this trend: local models handle data analysis and structuring, while cloud models tackle complex reasoning and professional output generation. This combination delivers fast, secure, and professional results.

Democratization of Agent Development: Copilot SDK dramatically lowers the barrier to building AI applications. Senior engineers see 10-20x productivity gains. Mid-level engineers can now build sophisticated agents that were previously beyond their reach. Even junior engineers and business experts can participate by writing Skills that capture domain knowledge without deep technical expertise. The future isn't about whether we can build AI applications—it's about how quickly we can turn ideas into reality.

References

Projects and Code
- GenGitHubRepoPPT GitHub Repository - Case study project
- Microsoft Foundry Local - Local AI runtime
- GitHub Copilot SDK - Agent development SDK
- Copilot SDK Getting Started Tutorial - Official quick start

Deep Dive: Copilot SDK
- Build an Agent into Any App with GitHub Copilot SDK - Official announcement
- GitHub Copilot SDK Cookbook - Practical examples
- Copilot CLI Official Documentation - CLI tool documentation

Learning Resources
- Edge AI for Beginners - Edge AI introductory course
- Azure AI Foundry Documentation - Azure AI documentation
- GitHub Copilot Extensions Guide - Extension development guide

Building Agents with GitHub Copilot SDK: A Practical Guide to Automated Tech Update Tracking
Introduction In the rapidly evolving tech landscape, staying on top of key project updates is crucial. This article explores how to leverage GitHub's newly released Copilot SDK to build intelligent agent systems, featuring a practical case study on automating daily update tracking and analysis for Microsoft's Agent Framework. GitHub Copilot SDK: Embedding AI Capabilities into Any Application SDK Overview On January 22, 2026, GitHub officially launched the GitHub Copilot SDK technical preview, marking a new era in AI agent development. The SDK provides these core capabilities: Production-grade execution loop: The same battle-tested agentic engine powering GitHub Copilot CLI Multi-language support: Node.js, Python, Go, and .NET Multi-model routing: Flexible model selection for different tasks MCP server integration: Native Model Context Protocol support Real-time streaming: Support for streaming responses and live interactions Tool orchestration: Automated tool invocation and command execution Core Advantages Building agentic workflows from scratch presents numerous challenges: Context management across conversation turns Orchestrating tools and commands Routing between models Handling permissions, safety boundaries, and failure modes The Copilot SDK encapsulates all this complexity. As Mario Rodriguez, GitHub's Chief Product Officer, explains: "The SDK takes the agentic power of Copilot CLI and makes it available in your favorite programming language... GitHub handles authentication, model management, MCP servers, custom agents, and chat sessions plus streaming. That means you are in control of what gets built on top of those building blocks." Quick Start Examples Here's a simple TypeScript example using the Copilot SDK: import { CopilotClient } from "@github/copilot-sdk"; const client = new CopilotClient(); await client.start(); const session = await client.createSession({ model: "gpt-5", }); await session.send({ prompt: "Hello, world!" }); And in Python, it's equally straightforward: from copilot import CopilotClient client = CopilotClient() await client.start() session = await client.create_session({ "model": "claude-sonnet-4.5", "streaming": True, "skill_directories": ["./.copilot_skills/pr-analyzer/SKILL.md"] }) await session.send_and_wait({ "prompt": "Analyze PRs from microsoft/agent-framework merged yesterday" }) Real-World Case Study: Automated Agent Framework Daily Updates Project Background agent-framework-update-everyday is an automated system built with GitHub Copilot SDK and CLI that tracks daily code changes in Microsoft's Agent Framework and generates high-quality technical blog posts. 
System Architecture The project leverages the following technology stack: GitHub Copilot CLI (@github/copilot): Command-line AI capabilities GitHub Copilot SDK (github-copilot-sdk): Programmatic AI interactions Copilot Skills: Custom PR analysis behaviors GitHub Actions: CI/CD automation pipeline Core Workflow The system runs fully automated via GitHub Actions, executing Monday through Friday at UTC 00:00 with these steps: Step Action Description 1 Checkout repository Clone the repo using actions/checkout@v4 2 Setup Node.js Configure Node.js 22 environment for Copilot CLI 3 Install Copilot CLI Install via npm i -g github/copilot 4 Setup Python Configure Python 3.11 environment 5 Install Python dependencies Install github-copilot-sdk package 6 Run PR Analysis Execute pr_trigger_v2.py with Copilot authentication 7 Commit and push Auto-commit generated blog posts to repository Technical Implementation Details 1. Copilot Skill Definition The project uses a custom Copilot Skill (.copilot_skills/pr-analyzer/SKILL.md) to define: PR analysis behavior patterns Blog post structure requirements Breaking changes priority strategy Code snippet extraction rules This skill-based approach enables the AI agent to focus on domain-specific tasks and produce higher-quality outputs. 2. Python SDK Integration The core script pr_trigger_v2.py demonstrates Python SDK usage: from copilot import CopilotClient # Initialize client client = CopilotClient() await client.start() # Create session with model and skill specification session = await client.create_session({ "model": "claude-sonnet-4.5", "streaming": True, "skill_directories": ["./.copilot_skills/pr-analyzer/SKILL.md"] }) # Send analysis request await session.send_and_wait({ "prompt": "Analyze PRs from microsoft/agent-framework merged yesterday" }) 3. CI/CD Integration The GitHub Actions workflow (.github/workflows/daily-pr-analysis.yml) ensures automated execution: name: Daily PR Analysis on: schedule: - cron: '0 0 * * 1-5' # Monday-Friday at UTC 00:00 workflow_dispatch: # Support manual triggers jobs: analyze: runs-on: ubuntu-latest steps: - name: Setup and Run Analysis env: COPILOT_GITHUB_TOKEN: ${{ secrets.COPILOT_GITHUB_TOKEN }} run: | npm i -g github/copilot pip install github-copilot-sdk --break-system-packages python pr_trigger_v2.py Output Results The system automatically generates structured blog posts saved in the blog/ directory with naming convention: blog/agent-framework-pr-summary-{YYYY-MM-DD}.md Each post includes: Breaking Changes (highlighted first) Major Updates (with code examples) Minor Updates and Bug Fixes Summary and impact assessment Latest Advancements in GitHub Copilot CLI Released alongside the SDK, Copilot CLI has also received major updates, making it an even more powerful development tool: Enhanced Core Capabilities Persistent Memory: Cross-session context retention and intelligent compaction Multi-Model Collaboration: Choose different models for explore, plan, and review workflows Autonomous Execution: Custom agent support Agent skill system Full MCP support Async task delegation Real-World Applications Development teams have already built innovative applications using the SDK: YouTube chapter generators Custom GUI interfaces for agents Speech-to-command workflows for desktop apps Games where you compete with AI Content summarization tools These examples showcase the flexibility and power of the Copilot SDK. 
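Returning to the daily-update project, the pieces described above (the PR-analyzer skill, the dated blog/ naming convention, and the scheduled run) fit together in a fairly small script. The sketch below is illustrative rather than the actual pr_trigger_v2.py: it only recombines the SDK calls already shown, and the dated output path passed through the prompt is a hypothetical detail.

```python
import asyncio
from datetime import date, timedelta
from pathlib import Path

from copilot import CopilotClient

async def main() -> None:
    # The workflow runs each weekday, so "yesterday" is the window to analyze.
    yesterday = (date.today() - timedelta(days=1)).isoformat()
    out_path = Path("blog") / f"agent-framework-pr-summary-{yesterday}.md"

    client = CopilotClient()
    await client.start()

    # Same session options as the article's example: model, streaming, and the PR-analyzer skill.
    session = await client.create_session({
        "model": "claude-sonnet-4.5",
        "streaming": True,
        "skill_directories": ["./.copilot_skills/pr-analyzer/SKILL.md"],
    })

    # Ask for yesterday's merged PRs and for the post to be written to the dated path;
    # a generous timeout keeps unattended CI runs from hanging indefinitely.
    await session.send_and_wait({
        "prompt": (
            f"Analyze PRs from microsoft/agent-framework merged on {yesterday} "
            f"and write the blog post to {out_path}."
        )
    }, timeout=600)

asyncio.run(main())
```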
SDK vs CLI: Complementary, Not Competing
Understanding the relationship between SDK and CLI is important:
- CLI: An interactive tool for end users, providing a complete development experience
- SDK: A programmable layer for developers to build customized applications
The SDK essentially provides programmatic access to the CLI's core capabilities, enabling developers to:
- Integrate Copilot agent capabilities into any environment
- Build graphical user interfaces
- Create personal productivity tools
- Run custom internal agents in enterprise workflows
GitHub handles the underlying authentication, model management, MCP servers, and session management, while developers focus on building value on top of these building blocks.

Best Practices and Recommendations
Based on experience from the agent-framework-update-everyday project, here are practical recommendations:
1. Leverage Copilot Skills Effectively. Define clear skill files that specify input and output formats for tasks, rules for handling edge cases, and quality standards and priorities.
2. Choose Models Wisely. Use different models for different tasks: more powerful models (e.g., GPT-5) for exploratory tasks, faster models (e.g., Claude Sonnet) for execution tasks, and a balance of performance and budget for cost-sensitive tasks.
3. Implement Robust Error Handling. AI calls in CI/CD environments need to consider network timeout and retry strategies, API rate limit handling, and output validation and fallback mechanisms.
4. Secure Authentication Management. Use fine-grained Personal Access Tokens (PATs): create dedicated Copilot access tokens, set the minimum permission scope (Copilot Requests: Read), and store them securely using GitHub Secrets.
5. Version Control and Traceability. Automated systems should log metadata for each execution, preserve historical outputs for comparison, and implement auditable change tracking.

Future Outlook
The release of GitHub Copilot SDK marks the democratization of AI agent development. Developers can now:
- Lower Development Barriers: No need to deeply understand complex AI infrastructure
- Accelerate Innovation: Focus on business logic rather than underlying implementation
- Flexible Integration: Embed AI capabilities into any application scenario
- Production-Ready: Leverage proven execution loops and security mechanisms
As the SDK moves from technical preview to general availability, we can expect official support for more languages, a richer tool ecosystem, more powerful MCP integration capabilities, and community-driven best practice libraries.

Conclusion
This article demonstrates how to build practical automation systems with the GitHub Copilot SDK through the agent-framework-update-everyday project. This case study not only validates the SDK's technical capabilities but, more importantly, showcases a new development paradigm: using AI agents as programmable building blocks, integrated into daily development workflows, to liberate developer creativity. Whether you want to build personal productivity tools, enterprise internal agents, or innovative AI applications, the Copilot SDK provides a solid technical foundation. Visit github/copilot-sdk to start your AI agent journey today!

Reference Resources
- GitHub Copilot SDK Official Repository
- Agent Framework Update Everyday Project
- GitHub Copilot CLI Documentation
- Microsoft Agent Framework
- Build an agent into any app with the GitHub Copilot SDK

Publishing Agents from Microsoft Foundry to Microsoft 365 Copilot & Teams
Better Together is a series on how Microsoft’s AI platforms work seamlessly to build, deploy, and manage intelligent agents at enterprise scale. As organizations embrace AI across every workflow, Microsoft Foundry, Microsoft 365, Agent 365, and Microsoft Copilot Studio are coming together to deliver a unified approach—from development to deployment to day-to-day operations. This three-part series explores how these technologies connect to help enterprises build AI agents that are secure, governed, and deeply integrated with Microsoft’s product ecosystem. Series Overview Part 1: Publishing from Foundry to Microsoft 365 Copilot and Microsoft Teams Part 2: Foundry + Agent 365 — Native Integration for Enterprise AI Part 3: Microsoft Copilot Studio Integration with Foundry Agents This blog focuses on Part 1: Publishing from Foundry to Microsoft 365 Copilot—how developers can now publish agents built in Foundry directly to Microsoft 365 Copilot and Teams in just a few clicks. Build once. Publish everywhere. Developers can now take an AI agent built in Microsoft Foundry and publish it directly to Microsoft 365 Copilot and Microsoft Teams in just a few clicks. The new streamlined publishing flow eliminates manual setup across Entra ID, Azure Bot Service, and manifest files, turning hours of configuration into a seamless, guided flow in the Foundry Playground. Simplifying Agent Publishing for Microsoft 365 Copilot & Microsoft Teams Previously, deploying a Foundry AI agent into Microsoft 365 Copilot and Microsoft Teams required multiple steps: app registration, bot provisioning, manifest editing, and admin approval. With the new Foundry → M365 integration, the process is straightforward and intuitive. Key capabilities No-code publishing — Prepare, package, and publish agents directly from Foundry Playground. Unified build — A single agent package powers multiple Microsoft 365 channels, including Teams Chat, Microsoft 365 Copilot Chat, and BizChat. Agent-type agnostic — Works seamlessly whether you have a prompt agent, hosted agent, or workflow agent. Built-in Governance — Every agent published to your organization is automatically routed through Microsoft 365 Admin Center (MAC) for review, approval, and monitoring. Downloadable package — Developers can download a .zip for local testing or submission to the Microsoft Marketplace. For pro-code developers, the experience is also simplified. A C# code-first sample in the Agent Toolkit for Visual Studio is searchable, featured, and ready to use. Why It Matters This integration isn’t just about convenience; it’s about scale, control, and trust. Faster time to value — Deliver intelligent agents where people already work, without infrastructure overhead. Enterprise control — Admins retain full oversight via Microsoft 365 Admin Center, with built-in approval, review and governance flows. Developer flexibility — Both low-code creators and pro-code developers benefit from the unified publishing experience. Better Together — This capability lays the groundwork for Agent 365 publishing and deeper M365 integrations. Real-world scenarios YoungWilliams built Priya, an AI agent that helps handle government service inquiries faster and more efficiently. Using the one-click publishing flow, Priya was quickly deployed to Microsoft Teams and M365 Copilot without manual setup. This allowed Young Williams’ customers to provide faster, more accurate responses while keeping governance and compliance intact. 
“Integrating Microsoft Foundry with Microsoft 365 Copilot fundamentally changed how we deliver AI solutions to our government partners,” said John Tidwell, CTO of YoungWilliams. “With Foundry’s one-click publishing to Teams and Copilot, we can take an idea from prototype to production in days instead of weeks—while maintaining the enterprise-grade security and governance our clients expect. It’s a game changer for how public services can adopt AI responsibly and at scale.”

Availability

Publishing from Foundry to M365 is in Public Preview within the Foundry Playground. Developers can explore the preview in Microsoft Foundry and test the Teams / M365 publishing flow today. SDK and CLI extensions for code-first publishing are generally available.

What’s Next in the Better Together Series

This blog is part of the broader Better Together series connecting Microsoft Foundry, Microsoft 365, Agent 365, and Microsoft Copilot Studio. Continue the journey:

- Foundry + Agent 365 — Native Integration for Enterprise AI (Link)

Start building today

- Quickstart — Publish an Agent to Microsoft 365
- Try it now in the new Foundry Playground
OptiMind: A small language model with optimization expertise

Turning a real-world decision problem into a solver-ready optimization model can take days—sometimes weeks—even for experienced teams. The hardest part is often not solving the problem; it's translating business intent into precise mathematical objectives, constraints, and variables. OptiMind aims to remove that bottleneck. This optimization-aware language model translates natural-language problem descriptions into solver-ready mathematical formulations, helping organizations move from ideas to decisions faster. Now available in public preview as an experimental model in Microsoft Foundry, OptiMind targets one of the more expertise-intensive steps in modern optimization workflows.

Addressing the Optimization Bottleneck

Mathematical optimization underpins many enterprise-critical decisions—from designing supply chains and scheduling workforces to structuring financial portfolios and deploying networks. While today's solvers can handle enormous and complex problem instances, formulating those problems remains a major obstacle. Defining objectives, constraints, and decision variables is an expertise-driven process that often takes days or weeks, even when the underlying business problem is well understood.

OptiMind aims to close this gap by automating and accelerating formulation. Developed by Microsoft Research, OptiMind transforms what was once a slow, error-prone modeling task into a streamlined, repeatable step—freeing teams to focus on decision quality rather than syntax.

What makes OptiMind different?

OptiMind is not just a language model but a specialized system built for real-world optimization tasks. Unlike general-purpose large language models adapted for optimization through prompting, OptiMind is purpose-built for mixed integer linear programming (MILP), and its design reflects this singular focus.

At inference time, OptiMind follows a multi-stage process:

- Problem classification (e.g., scheduling, routing, network design)
- Hint retrieval tailored to the identified problem class
- Solution generation in solver-compatible formats such as GurobiPy
- Optional self-correction, where multiple candidate formulations are generated and validated

A hand-written example of the kind of solver-ready GurobiPy output this process targets appears at the end of this section.

This design can improve reliability without relying on agentic orchestration or multiple large models. In internal evaluations on cleaned public benchmarks—including IndustryOR, Mamo-Complex, and OptMATH—OptiMind demonstrated higher formulation accuracy than similarly sized open models and competitive performance relative to significantly larger systems: it improved accuracy by approximately 10 percent over its base model and matched or exceeded open-source models under 32 billion parameters.

For more information on the model, please read the official research blog or the technical paper for OptiMind.
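To show what "solver-ready" means in practice, here is a small, hand-written GurobiPy formulation of the kind OptiMind is designed to emit from a plain-language description such as "assign three jobs to two machines at minimum cost, each job on exactly one machine, within machine capacity." The data values are illustrative and the code is not actual OptiMind output.

```python
import gurobipy as gp
from gurobipy import GRB

# Illustrative data: cost[j][k] of running job j on machine k,
# job durations in hours, and machine capacities in hours.
cost = [[4, 6], [5, 3], [7, 5]]
hours = [2, 3, 4]
capacity = [5, 6]

m = gp.Model("job_assignment")

# x[j, k] = 1 if job j is assigned to machine k.
x = m.addVars(3, 2, vtype=GRB.BINARY, name="x")

# Each job is assigned to exactly one machine.
m.addConstrs((x.sum(j, "*") == 1 for j in range(3)), name="assign")

# Respect each machine's capacity.
m.addConstrs(
    (gp.quicksum(hours[j] * x[j, k] for j in range(3)) <= capacity[k] for k in range(2)),
    name="capacity",
)

# Minimize total assignment cost.
m.setObjective(
    gp.quicksum(cost[j][k] * x[j, k] for j in range(3) for k in range(2)), GRB.MINIMIZE
)
m.optimize()

if m.Status == GRB.OPTIMAL:
    for j in range(3):
        for k in range(2):
            if x[j, k].X > 0.5:
                print(f"job {j} -> machine {k}")
```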
Practical use cases: Unlocking efficiency across domains

Typical use cases include:

- Supply Chain Network Design: faster formulation of multi-period network models and logistics flows
- Manufacturing and Workforce Scheduling: easier capacity planning under complex operational constraints
- Logistics and Routing Optimization: rapid modeling that captures real-world constraints and variability
- Financial Portfolio Optimization: more efficient exploration of portfolios under regulatory and market constraints

By reducing the time and expertise required to move from problem description to validated model, OptiMind helps teams reach actionable decisions faster and with greater confidence.

Getting started

OptiMind is available today as an experimental model, and Microsoft Research welcomes feedback from practitioners and enterprise teams. Next steps:

- Explore the research details: read more about the model on Foundry Labs and the technical paper on arXiv
- Try the model: access OptiMind through Microsoft Foundry (a hedged sketch of one way to call a Foundry deployment from Python follows this article)
- Test sample code: available in the OptiMind GitHub repository

Take the next step in optimization innovation with OptiMind—empowering faster, more accurate, and cost-effective problem solving for the future of decision intelligence.
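For the "Try the model" step, the sketch below shows one plausible way to send a natural-language problem description to a Foundry model deployment using the azure-ai-inference Python package. The endpoint URL and the deployment name "optimind" are placeholders and assumptions, not confirmed values; check your Foundry project's deployment details for the actual endpoint, name, and supported APIs.

```python
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

# Placeholders: use the endpoint and key shown for your own Foundry deployment.
client = ChatCompletionsClient(
    endpoint="https://<your-foundry-resource>.services.ai.azure.com/models",
    credential=AzureKeyCredential("<your-api-key>"),
)

problem = (
    "A bakery makes bread and cakes. Bread needs 2 hours of oven time and earns $3 profit; "
    "cakes need 3 hours and earn $5. The oven is available 12 hours per day. "
    "How many of each should be baked daily to maximize profit?"
)

response = client.complete(
    model="optimind",  # assumed deployment name; replace with your own
    messages=[
        SystemMessage(content="Formulate the user's problem as a mixed integer linear program in GurobiPy."),
        UserMessage(content=problem),
    ],
)

print(response.choices[0].message.content)
```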
OpenAI’s GPT-5.1-codex-max in Microsoft Foundry: Igniting a New Era for Enterprise Developers

Announcing GPT-5.1-codex-max: The Future of Enterprise Coding Starts Now

We’re thrilled to announce the general availability of OpenAI's GPT-5.1-codex-max in Microsoft Foundry Models: a leap forward that redefines what’s possible for enterprise-grade coding agents. This isn’t just another model release; it’s a celebration of innovation, partnership, and the relentless pursuit of developer empowerment.

At Microsoft Ignite, we unveiled Microsoft Foundry: a unified platform where businesses can confidently choose the right model for every job, backed by enterprise-grade reliability. Foundry brings together the best from OpenAI, Anthropic, xAI, Black Forest Labs, Cohere, Meta, Mistral, and Microsoft’s own breakthroughs, all under one roof. Our partnership with Anthropic is a testament to our commitment to giving developers access to the most advanced, safe, and high-performing models in the industry. And now, with GPT-5.1-codex-max joining the Foundry family, the possibilities for intelligent applications and agentic workflows have never been greater.

GPT-5.1-codex-max is available today in Microsoft Foundry and accessible in Visual Studio Code via the Foundry extension.

Meet GPT-5.1-codex-max: Enterprise-Grade Coding Agent for Complex Projects

GPT-5.1-codex-max is engineered for those who build the future. Imagine tackling complex, long-running projects without losing context or momentum. GPT-5.1-codex-max delivers efficiency at scale, cross-platform readiness, and proven performance, with a top score on SWE-Bench (77.9), the gold standard for AI coding. With GPT-5.1-codex-max, developers can focus on creativity and problem-solving while the model handles the heavy lifting.

GPT-5.1-codex-max isn’t just powerful; it’s practical, designed to solve real challenges for enterprise developers:

- Multi-Agent Coding Workflows: Automate repetitive tasks across microservices, maintaining shared context for seamless collaboration.
- Enterprise App Modernization: Effortlessly refactor legacy .NET and Java applications into cloud-native architectures.
- Secure API Development: Generate and validate secure API endpoints, with compliance checks built in for peace of mind.
- Continuous Integration Support: Integrate GPT-5.1-codex-max into CI/CD pipelines for automated code reviews and test generation, accelerating delivery cycles (a minimal sketch of this pattern follows the article).

These use cases are just the beginning. GPT-5.1-codex-max is your partner in building robust, scalable, and secure solutions.

Foundry: A Platform Built for Developers Who Build the Future

Foundry is more than a model catalog—it’s an enterprise AI platform designed for developers who need choice, reliability, and speed.

- Choice Without Compromise: Access the widest range of models, including frontier models from leading model providers.
- Enterprise-Grade Infrastructure: Built-in security, observability, and governance for responsible AI at scale.
- Integrated Developer Experience: From GitHub to Visual Studio Code, Foundry connects with tools developers love for a frictionless build-to-deploy journey.

Start Building Smarter with GPT-5.1-codex-max in Foundry

The future is here, and it’s yours to shape. Supercharge your coding workflows with GPT-5.1-codex-max in Microsoft Foundry today.

- Learn more about Microsoft Foundry: aka.ms/IgniteFoundryModels.
- Watch Ignite sessions for deep dives and demos: ignite.microsoft.com.
- Build faster, smarter, and with confidence on the platform redefining enterprise AI.
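As an illustration of the Continuous Integration Support use case above, here is a hedged sketch of a CI step that asks the model to review a pull-request diff. It assumes the deployment is exposed through an Azure OpenAI-compatible endpoint; the environment variable names, API version, and the "gpt-5.1-codex-max" deployment name are placeholders to adapt to your own Foundry setup.

```python
import os
import subprocess

from openai import AzureOpenAI

# Placeholders: endpoint, API version, and deployment name depend on your Foundry setup.
client = AzureOpenAI(
    azure_endpoint=os.environ["FOUNDRY_OPENAI_ENDPOINT"],
    api_key=os.environ["FOUNDRY_API_KEY"],
    api_version="2024-10-21",
)

# Diff of the current branch against main, as produced inside a CI job.
diff = subprocess.run(
    ["git", "diff", "origin/main...HEAD"],
    capture_output=True, text=True, check=True,
).stdout

review = client.chat.completions.create(
    model="gpt-5.1-codex-max",  # assumed deployment name; replace with your own
    messages=[
        {"role": "system", "content": "You are a strict code reviewer. Flag bugs, security issues, and missing tests."},
        {"role": "user", "content": "Review this diff and list concrete issues:\n\n" + diff[:20000]},
    ],
)

print(review.choices[0].message.content)
```

In practice the review text would be posted back to the pull request as a comment, and the credentials would be stored as CI secrets rather than hard-coded.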
Anonymizing Sensitive Data with the Azure AI Platform

The protection of personal and sensitive data required by the LGPD has made it essential for organizations to adopt effective mechanisms to process and anonymize information contained in documents, especially in sectors such as government and healthcare, where the risk of exposure is high. This article presents a modern architecture based on Azure AI services, including Document Intelligence, Language Service, custom NER models, and Computer Vision, to automatically identify, classify, and mask sensitive content.
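As a hedged illustration of just the Language Service piece of this architecture, the sketch below calls the PII detection API from the azure-ai-textanalytics package to mask personal data in a short text. The endpoint, key, and sample sentence are placeholders; the full pipeline described in the article also relies on Document Intelligence, custom NER models, and Computer Vision for scanned documents and images.

```python
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

# Placeholders: use your Language resource endpoint and key (ideally from a secret store).
client = TextAnalyticsClient(
    endpoint="https://<your-language-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

documents = [
    "Paciente Maria Silva, CPF 123.456.789-00, internada em 12/03, contato maria@example.com."
]

# Detect personally identifiable information and get a redacted copy of the text.
result = client.recognize_pii_entities(documents, language="pt")

for doc in result:
    if doc.is_error:
        continue
    print("Masked text:", doc.redacted_text)
    for entity in doc.entities:
        print(f"  {entity.category}: {entity.text} (score {entity.confidence_score:.2f})")
```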
From Cloud to Chip: Building Smarter AI at the Edge with Windows AI PCs

As AI engineers, we’ve spent years optimizing models for the cloud: scaling inference, wrangling latency, and chasing compute across clusters. But the frontier is shifting. With the rise of Windows AI PCs and powerful local accelerators, the edge is no longer a constraint; it’s now a canvas. Whether you're deploying vision models to industrial cameras, optimizing speech interfaces for offline assistants, or building privacy-preserving apps for healthcare, Edge AI is where real-world intelligence meets real-time performance.

Why Edge AI, Why Now?

Edge AI isn’t just about running models locally; it’s about rethinking the entire lifecycle:

- Latency: Decisions in milliseconds, not round-trips to the cloud.
- Privacy: Sensitive data stays on-device, enabling HIPAA/GDPR compliance.
- Resilience: Offline-first apps that don’t break when the network does.
- Cost: Reduced cloud compute and bandwidth overhead.

With Windows AI PCs powered by Intel and Qualcomm NPUs, and tools like ONNX Runtime, DirectML, and Olive, developers can now optimize and deploy models with unprecedented efficiency.

What You’ll Learn in Edge AI for Beginners

The Edge AI for Beginners curriculum is a hands-on, open-source guide designed for engineers ready to move from theory to deployment.

Multi-Language Support

This content is available in over 48 languages, so you can read and study in your native language.

What You'll Master

This course takes you from fundamental concepts to production-ready implementations, covering:

- Small Language Models (SLMs) optimized for edge deployment
- Hardware-aware optimization across diverse platforms
- Real-time inference with privacy-preserving capabilities
- Production deployment strategies for enterprise applications

Why EdgeAI Matters

Edge AI represents a paradigm shift that addresses critical modern challenges:

- Privacy & Security: Process sensitive data locally without cloud exposure
- Real-time Performance: Eliminate network latency for time-critical applications
- Cost Efficiency: Reduce bandwidth and cloud computing expenses
- Resilient Operations: Maintain functionality during network outages
- Regulatory Compliance: Meet data sovereignty requirements

Edge AI

Edge AI refers to running AI algorithms and language models locally on hardware, close to where data is generated, without relying on cloud resources for inference. It reduces latency, enhances privacy, and enables real-time decision-making.

Core Principles:

- On-device inference: AI models run on edge devices (phones, routers, microcontrollers, industrial PCs)
- Offline capability: Functions without persistent internet connectivity
- Low latency: Immediate responses suited for real-time systems
- Data sovereignty: Keeps sensitive data local, improving security and compliance

Small Language Models (SLMs)

SLMs like Phi-4, Mistral-7B, Qwen, and Gemma are optimized versions of larger LLMs, trained or distilled for:

- Reduced memory footprint: Efficient use of limited edge device memory
- Lower compute demand: Optimized for CPU and edge GPU performance
- Faster startup times: Quick initialization for responsive applications

They unlock powerful NLP capabilities while meeting the constraints of:

- Embedded systems: IoT devices and industrial controllers
- Mobile devices: Smartphones and tablets with offline capabilities
- IoT devices: Sensors and smart devices with limited resources
- Edge servers: Local processing units with limited GPU resources
- Personal computers: Desktop and laptop deployment scenarios

A minimal sketch of the quantization step that produces these smaller footprints follows.
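As a taste of the hardware-aware optimization covered in the curriculum, here is a minimal sketch of post-training dynamic quantization with ONNX Runtime, one of several optimization paths (alongside Olive, OpenVINO, and Apple MLX) used to shrink models for edge devices. The file names are placeholders for a model you have already exported to ONNX.

```python
from onnxruntime.quantization import QuantType, quantize_dynamic

# Convert the weights of an exported ONNX model to 8-bit integers.
# "model_fp32.onnx" and "model_int8.onnx" are placeholder file names.
quantize_dynamic(
    model_input="model_fp32.onnx",
    model_output="model_int8.onnx",
    weight_type=QuantType.QInt8,
)
print("Wrote quantized model to model_int8.onnx")
```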
Course Modules & Navigation

Course duration: 10 hours of content

| Module | Topic | Focus Area | Key Content | Level | Duration |
|---|---|---|---|---|---|
| 📖 00 | Introduction to EdgeAI | Foundation & Context | EdgeAI Overview • Industry Applications • SLM Introduction • Learning Objectives | Beginner | 1-2 hrs |
| 📚 01 | EdgeAI Fundamentals | Cloud vs Edge AI comparison | EdgeAI Fundamentals • Real World Case Studies • Implementation Guide • Edge Deployment | Beginner | 3-4 hrs |
| 🧠 02 | SLM Model Foundations | Model families & architecture | Phi Family • Qwen Family • Gemma Family • BitNET • μModel • Phi-Silica | Beginner | 4-5 hrs |
| 🚀 03 | SLM Deployment Practice | Local & cloud deployment | Advanced Learning • Local Environment • Cloud Deployment | Intermediate | 4-5 hrs |
| ⚙️ 04 | Model Optimization Toolkit | Cross-platform optimization | Introduction • Llama.cpp • Microsoft Olive • OpenVINO • Apple MLX • Workflow Synthesis | Intermediate | 5-6 hrs |
| 🔧 05 | SLMOps Production | Production operations | SLMOps Introduction • Model Distillation • Fine-tuning • Production Deployment | Advanced | 5-6 hrs |
| 🤖 06 | AI Agents & Function Calling | Agent frameworks & MCP | Agent Introduction • Function Calling • Model Context Protocol | Advanced | 4-5 hrs |
| 💻 07 | Platform Implementation | Cross-platform samples | AI Toolkit • Foundry Local • Windows Development | Advanced | 3-4 hrs |
| 🏭 08 | Foundry Local Toolkit | Production-ready samples | Sample applications (see details below) | Expert | 8-10 hrs |

Each module includes Jupyter notebooks, code samples, and deployment walkthroughs, perfect for engineers who learn by doing.

Developer Highlights

- 🔧 Olive: Microsoft's optimization toolchain for quantization, pruning, and acceleration.
- 🧩 ONNX Runtime: Cross-platform inference engine with support for CPU, GPU, and NPU.
- 🎮 DirectML: GPU-accelerated ML API for Windows, ideal for gaming and real-time apps.
- 🖥️ Windows AI PCs: Devices with built-in NPUs for low-power, high-performance inference.

A short inference sketch using ONNX Runtime with the DirectML execution provider appears at the end of this article.

Local AI: Beyond the Edge

Local AI isn't just about inference; it's about autonomy. Imagine agents that:

- Learn from local context
- Adapt to user behavior
- Respect privacy by design

With tools like Agent Framework, Azure AI Foundry, Windows Copilot Studio, and Foundry Local, developers can orchestrate local agents that blend LLMs, sensors, and user preferences, all without cloud dependency.

Try It Yourself

Ready to get started? Clone the Edge AI for Beginners GitHub repo, run the notebooks, and deploy your first model to a Windows AI PC or IoT device. Whether you're building smart kiosks, offline assistants, or industrial monitors, this curriculum gives you the scaffolding to go from prototype to production.
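To close, here is a minimal sketch of local inference on a Windows AI PC using ONNX Runtime with the DirectML execution provider (available via the onnxruntime-directml package on Windows), with a CPU fallback elsewhere. The model file name and input shape are placeholders for a model you have exported and quantized yourself.

```python
import numpy as np
import onnxruntime as ort

# Prefer DirectML (GPU/NPU-accelerated on Windows) and fall back to CPU if unavailable.
preferred = ["DmlExecutionProvider", "CPUExecutionProvider"]
providers = [p for p in preferred if p in ort.get_available_providers()]

session = ort.InferenceSession("model_int8.onnx", providers=providers)

# Placeholder input: adjust the name and shape to match your exported model.
input_name = session.get_inputs()[0].name
dummy_input = np.random.rand(1, 3, 224, 224).astype(np.float32)

outputs = session.run(None, {input_name: dummy_input})
print("Active providers:", session.get_providers())
print("Output shape:", outputs[0].shape)
```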