Build a Fully Offline RAG App with Foundry Local: No Cloud Required
A practical guide to building an on-device AI support agent using Retrieval-Augmented Generation, JavaScript, and Microsoft Foundry Local.

The Problem: AI That Can't Go Offline

Most AI-powered applications today are firmly tethered to the cloud. They assume stable internet, low-latency API calls, and the comfort of a managed endpoint. But what happens when your users are in an environment with zero connectivity: a gas pipeline in a remote field, a factory floor, an underground facility?

That's exactly the scenario that motivated this project: a fully offline RAG-powered support agent that runs entirely on a laptop. No cloud. No API keys. No outbound network calls. Just a local model, a local vector store, and domain-specific documents, all accessible from a browser on any device.

The Gas Field Support Agent - running entirely on-device

What is RAG and Why Should You Care?

Retrieval-Augmented Generation (RAG) is a pattern that makes language models genuinely useful for domain-specific tasks. Instead of hoping the model "knows" the answer from pre-training, you:

Retrieve relevant chunks from your own documents
Augment the model's prompt with those chunks as context
Generate a response grounded in your actual data

The result: fewer hallucinations, traceable answers, and an AI that works with your content. If you're building internal tools, customer support bots, field manuals, or knowledge bases, RAG is the pattern you want.

Why fully offline? Data sovereignty, air-gapped environments, field operations, latency-sensitive workflows, and regulatory constraints all demand AI that doesn't phone home. Running everything locally gives you complete control over your data and eliminates any external dependency.
The Tech Stack

This project is deliberately simple — no frameworks, no build steps, no Docker:

| Layer | Technology | Why |
| --- | --- | --- |
| AI Model | Foundry Local + Phi-3.5 Mini | Runs locally, OpenAI-compatible API, no GPU needed |
| Backend | Node.js + Express | Lightweight, fast, universally known |
| Vector Store | SQLite via better-sqlite3 | Zero infrastructure, single file on disk |
| Retrieval | TF-IDF + cosine similarity | No embedding model required, fully offline |
| Frontend | Single HTML file with inline CSS | No build step, mobile-responsive, field-ready |

The total dependency footprint is just four npm packages: express, openai, foundry-local-sdk, and better-sqlite3.

Architecture Overview

The system has five layers — all running on a single machine:

Five-layer architecture: Client → Server → RAG Pipeline → Data → AI Model

Client Layer — A single HTML file served by Express, with quick-action buttons and responsive chat
Server Layer — Express.js handles API routes for chat (streaming + non-streaming), document upload, and health checks
RAG Pipeline — The chat engine orchestrates retrieval and generation; the chunker handles TF-IDF vectorization
Data Layer — SQLite stores document chunks and their TF-IDF vectors; source docs live as .md files
AI Layer — Foundry Local runs Phi-3.5 Mini Instruct on CPU/NPU, exposing an OpenAI-compatible API

Getting Started in 5 Minutes

You need two prerequisites:

Node.js 20+ — nodejs.org
Foundry Local — Microsoft's on-device AI runtime:

```
winget install Microsoft.FoundryLocal
```

Then clone, install, ingest, and run:

```
git clone https://github.com/leestott/local-rag.git
cd local-rag
npm install
npm run ingest   # Index the 20 gas engineering documents
npm start        # Start the server + Foundry Local
```

Open http://127.0.0.1:3000 and start chatting. Foundry Local auto-downloads Phi-3.5 Mini (~2 GB) on first run.

How the RAG Pipeline Works

Let's trace what happens when a user asks: "How do I detect a gas leak?"
RAG query flow: Browser → Server → Vector Store → Model → Streaming response

Step 1: Document Ingestion

Before any queries happen, npm run ingest reads every .md file from the docs/ folder, splits each into overlapping chunks (~200 tokens, 25-token overlap), computes a TF-IDF vector for each chunk, and stores everything in SQLite.

Chunking example:

```
docs/01-gas-leak-detection.md
→ Chunk 1: "Gas Leak Detection – Safety Warnings: Ensure all ignition..."
→ Chunk 2: "...sources are eliminated. Step-by-step: 1. Perform visual..."
→ Chunk 3: "...inspection of all joints. 2. Check calibration date..."
```

The overlap ensures no information falls between chunk boundaries — a critical detail in any RAG system.

Step 2: Query → Retrieval

When the user sends a question, the server converts it into a TF-IDF vector, compares it against every stored chunk using cosine similarity, and returns the top-K most relevant results. For 20 documents (~200 chunks), this executes in under 10 ms.

src/vectorStore.js:

```javascript
/** Retrieve top-K most relevant chunks for a query. */
search(query, topK = 5) {
  const queryTf = termFrequency(query);
  const rows = this.db.prepare("SELECT * FROM chunks").all();
  const scored = rows.map((row) => {
    const chunkTf = new Map(JSON.parse(row.tf_json));
    const score = cosineSimilarity(queryTf, chunkTf);
    return { ...row, score };
  });
  scored.sort((a, b) => b.score - a.score);
  return scored.slice(0, topK).filter((r) => r.score > 0);
}
```

Step 3: Prompt Construction

The retrieved chunks are injected into the prompt alongside system instructions:

```
System: You are an offline gas field support agent. Safety-first...
Context:
[Chunk 1: Gas Leak Detection – Safety Warnings...]
[Chunk 2: Gas Leak Detection – Step-by-step...]
[Chunk 3: Purging Procedures – Related safety...]
User: How do I detect a gas leak?
```

Step 4: Generation + Streaming

The prompt is sent to Foundry Local via the OpenAI-compatible API.
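The search method above relies on two helpers, termFrequency and cosineSimilarity, that aren't shown in the post. A minimal sketch of what they might look like, assuming a simple bag-of-words term-frequency map (the repo's actual implementation may add IDF weighting, stemming, or stop-word removal):

```javascript
// Hypothetical helpers for the search() method above; the repo's real
// versions may differ (e.g. IDF weighting, stop-word removal).

// Map each lowercase word in the text to its occurrence count.
function termFrequency(text) {
  const tf = new Map();
  for (const word of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    tf.set(word, (tf.get(word) ?? 0) + 1);
  }
  return tf;
}

// Cosine similarity between two sparse term-frequency maps.
function cosineSimilarity(a, b) {
  let dot = 0;
  for (const [term, weight] of a) dot += weight * (b.get(term) ?? 0);
  const norm = (m) => Math.sqrt([...m.values()].reduce((s, w) => s + w * w, 0));
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}
```

With these in place, identical texts score 1.0 and texts sharing no terms score 0, which is all the ranking in search() needs.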
The response streams back token-by-token through Server-Sent Events (SSE) to the browser:

Safety-first response with structured guidance
Expandable sources with relevance scores

Foundry Local: Your Local AI Runtime

Foundry Local is what makes the "offline" part possible. It's a runtime from Microsoft that runs small language models (SLMs) on CPU or NPU — no GPU required. It exposes an OpenAI-compatible API and manages model downloads, caching, and lifecycle automatically.

The integration code is minimal; if you've used the OpenAI SDK before, this will feel instantly familiar:

src/chatEngine.js:

```javascript
import { FoundryLocalManager } from "foundry-local-sdk";
import { OpenAI } from "openai";

// Start Foundry Local and load the model
const manager = new FoundryLocalManager();
const modelInfo = await manager.init("phi-3.5-mini");

// Use the standard OpenAI client — pointed at the local endpoint
const client = new OpenAI({
  baseURL: manager.endpoint,
  apiKey: manager.apiKey,
});

// Chat completions work exactly like the cloud API
const stream = await client.chat.completions.create({
  model: modelInfo.id,
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "How do I detect a gas leak?" }
  ],
  stream: true,
});
```

Portability matters: because Foundry Local uses the OpenAI API format, any code you write here can be ported to Azure OpenAI or OpenAI's cloud API with a single config change. You're not locked in.

Why TF-IDF Instead of Embeddings?

Most RAG tutorials use embedding models for retrieval.
We chose TF-IDF for this project because:

Fully offline — no embedding model to download or run
Zero latency — vectorization is instantaneous (just math on word frequencies)
Good enough — for a curated collection of 20 domain-specific documents, TF-IDF retrieves the right chunks reliably
Transparent — you can inspect the vocabulary and weights, unlike neural embeddings

For larger collections (thousands of documents) or when semantic similarity matters more than keyword overlap, you'd swap in an embedding model. But for this use case, TF-IDF keeps the stack simple and dependency-free.

Mobile-Responsive Field UI

Field engineers use this app on phones and tablets, often while wearing gloves. The UI is designed for harsh conditions with a dark, high-contrast theme, large touch targets (minimum 48px), and horizontally scrollable quick-action buttons.

Desktop view
Mobile view

The entire frontend is a single index.html file — no React, no build step, no bundler. This keeps the project accessible and easy to deploy anywhere.

Runtime Document Upload

Users can upload new documents without restarting the server. The upload endpoint receives markdown content, chunks it, computes TF-IDF vectors, and inserts the chunks into SQLite — all in memory, immediately available for retrieval.

Drag-and-drop document upload with instant indexing

Adapt This for Your Own Domain

This project is a scenario sample designed to be forked and customized. Here's the three-step process:

1. Replace the Documents

Delete the gas engineering docs in docs/ and add your own .md files with optional YAML front-matter:

docs/my-procedure.md:

```
---
title: Troubleshooting Widget Errors
category: Support
id: KB-001
---

# Troubleshooting Widget Errors

...your content here...
```

2. Edit the System Prompt

Open src/prompts.js and rewrite the instructions for your domain:

src/prompts.js:

```javascript
export const SYSTEM_PROMPT = `You are an offline support agent for [YOUR DOMAIN].

Rules:
- Only answer using the retrieved context
- If the answer isn't in the context, say so
- Use structured responses: Summary → Details → Reference
`;
```

3. Tune the Retrieval

Adjust chunking and retrieval parameters in src/config.js:

src/config.js:

```javascript
export const config = {
  model: "phi-3.5-mini",
  chunkSize: 200,    // smaller = more precise, less context per chunk
  chunkOverlap: 25,  // prevents info from falling between chunks
  topK: 3,           // chunks per query (more = richer context, slower)
};
```

Extending to Multi-Agent Architectures

Once you have a working RAG agent, the natural next step is multi-agent orchestration, where specialized agents collaborate to handle complex workflows. With Foundry Local's OpenAI-compatible API, you can compose multiple agent roles on the same machine:

Multi-agent concept:

```javascript
// Each agent is just a different system prompt + RAG scope
const agents = {
  safety: { prompt: safetyPrompt, docs: "safety/*.md" },
  diagnosis: { prompt: diagnosisPrompt, docs: "faults/*.md" },
  procedure: { prompt: procedurePrompt, docs: "procedures/*.md" },
};

// Router determines which agent handles the query
function route(query) {
  if (query.match(/safety|warning|hazard/i)) return agents.safety;
  if (query.match(/fault|error|code/i)) return agents.diagnosis;
  return agents.procedure;
}

// Each agent uses the same Foundry Local model endpoint
const response = await client.chat.completions.create({
  model: modelInfo.id,
  messages: [
    { role: "system", content: selectedAgent.prompt },
    { role: "system", content: `Context:\n${retrievedChunks}` },
    { role: "user", content: userQuery }
  ],
  stream: true,
});
```

This pattern lets you build specialized agent pipelines: a triage agent routes to the right specialist, each with its own document scope and system prompt, all running on the same local Foundry instance. For production multi-agent systems, explore Microsoft Foundry for cloud-scale orchestration when connectivity is available.
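As a side note on the chunkSize and chunkOverlap settings shown earlier: the underlying technique is an overlapping sliding window. A word-based sketch (the actual project chunks by tokens, so treat this as an approximation, not the repo's chunker):

```javascript
// Hypothetical overlapping-window chunker, word-based for simplicity.
// Assumes chunkOverlap < chunkSize; the real project splits by tokens
// (~200 per chunk, 25 overlap).
function chunkText(text, chunkSize = 200, chunkOverlap = 25) {
  const words = text.split(/\s+/).filter(Boolean);
  const step = chunkSize - chunkOverlap; // how far the window advances
  const chunks = [];
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(" "));
    if (start + chunkSize >= words.length) break; // last window reached the end
  }
  return chunks;
}
```

Because each window advances by chunkSize minus chunkOverlap, every chunk repeats the tail of its predecessor, so a sentence straddling a boundary survives intact in at least one chunk.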
Local-first, cloud-ready: start with Foundry Local for development and offline scenarios. When your agents need cloud scale, swap to Azure AI Foundry with the same OpenAI-compatible API; your agent code stays the same.

Key Takeaways

1. RAG = Retrieve + Augment + Generate. Ground your AI in real documents — dramatically reducing hallucination and making answers traceable.
2. Foundry Local makes local AI accessible. OpenAI-compatible API running on CPU/NPU. No GPU required. No cloud dependency.
3. TF-IDF + SQLite is viable. For small-to-medium document collections, you don't need a dedicated vector database.
4. Same API, local or cloud. Build locally with Foundry Local, deploy with Azure OpenAI — zero code changes.

What's Next?

Embedding-based retrieval — swap TF-IDF for a local embedding model for better semantic matching
Conversation memory — persist chat history across sessions
Multi-agent routing — specialized agents for safety, diagnostics, and procedures
PWA packaging — make it installable as a standalone app on mobile devices
Hybrid retrieval — combine keyword search with semantic embeddings for best results

Get the code

Clone the repo, swap in your own documents, and start building:

```
git clone https://github.com/leestott/local-rag.git
```

github.com/leestott/local-rag — MIT licensed, contributions welcome. Open source under the MIT License. Built with Foundry Local and Node.js.

Building High-Performance Agentic Systems
Most enterprise chatbots fail in the same quiet way. They answer questions. They impress in demos. And then they stall in production. Knowledge goes stale. Answers cannot be audited. The system cannot act beyond generating text. When workflows require coordination, execution, or accountability, the chatbot stops being useful.

Agentic systems exist because that model is insufficient. Instead of treating the LLM as the product, agentic architecture embeds it inside a bounded control loop:

plan → act (tools) → observe → refine

The model becomes one component in a runtime system with explicit state management, safety policies, identity enforcement, and operational telemetry.

This shift is not speculative. A late-2025 MIT Sloan Management Review / BCG study reports that 35% of organizations have already adopted AI agents, with another 44% planning deployment. Microsoft is advancing open protocols for what it calls the "agentic web," including Agent-to-Agent (A2A) interoperability and Model Context Protocol (MCP), with integration paths emerging across Copilot Studio and Azure AI Foundry.

The real question is no longer whether agents are coming. It is whether enterprise architecture is ready for them. This article translates "agentic" into engineering reality: the runtime layers, latency and cost levers, orchestration patterns, and governance controls required for production deployment.

The Core Capabilities of Agentic AI

What makes an AI "agentic" is not a single feature—it's the interaction of different capabilities. Together, they form the minimum set needed to move from "answering" to "operating".

Autonomy – Goal-Driven Task Completion

Traditional bots are reactive: they wait for a prompt and produce output. Autonomy introduces a goal state and a control loop. The agent is given an objective (or a trigger) and it can decide the next step without being micromanaged.
The critical engineering distinction is that autonomy must be bounded: in production, you implement it with explicit budgets and stop conditions—maximum tool calls, maximum retries, timeouts, and confidence thresholds. The typical execution shape is a loop: plan → act → observe → refine.

A project-management agent, for example, doesn't just answer "what's the status?" It monitors signals (work items, commits, build health), detects a risk pattern (slippage, dependency blockage), and then either surfaces an alert or prepares a remediation action (re-plan milestones, notify owners). In high-stakes environments, autonomy is usually human-in-the-loop by design: the agent can draft changes, propose next actions, and only execute after approval. Over time, teams expand the autonomy envelope for low-risk actions while keeping approvals for irreversible or financially sensitive operations.

Tool Integration – Taking Action and Staying Current

A standalone LLM cannot fetch live enterprise state and cannot change it. Tool integration is how an agent becomes operational: it can query systems of record, call APIs, trigger workflows, and produce outputs that reflect the current world rather than the model's pretraining snapshot. There are two classes of tools that matter in enterprise agents:

Retrieval tools (grounding / RAG): When the agent needs facts, it retrieves them. This is the backbone of reducing hallucination: instead of guessing, the agent pulls authoritative content (SharePoint, Confluence, policy repositories, CRM records, Fabric datasets) and uses it as evidence. In practice, retrieval works best when it is engineered as a pipeline: query rewrite (optional) → hybrid search (keyword + vector) → filtering (metadata/ACL) → reranking → compact context injection.
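The stages of that retrieval pipeline compose naturally as functions. With placeholder implementations (every name below is illustrative, not a specific SDK), the control flow might look like:

```javascript
// Sketch of a retrieval pipeline with placeholder stages; in production each
// stage would call a real search service, ACL check, and reranker.
function retrieveContext(query, user, index, topK = 3) {
  const rewritten = query.trim().toLowerCase();          // query rewrite (no-op here)
  const terms = rewritten.split(/\s+/).filter(Boolean);
  const candidates = index.filter((doc) =>               // "hybrid search" stand-in:
    terms.some((term) => doc.text.toLowerCase().includes(term)));
  const permitted = candidates.filter((doc) =>           // metadata/ACL filtering
    doc.acl.includes(user.role));
  const reranked = [...permitted].sort((a, b) => b.score - a.score); // reranking
  return reranked.slice(0, topK)                         // compact context injection
    .map((doc) => doc.text)
    .join("\n");
}
```

Note that the ACL filter runs before reranking: a document the user cannot see never reaches the prompt, regardless of how relevant it scores.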
The point is not "stuff the prompt with documents," but "inject only the minimum evidence required to answer accurately."

Action tools (function calling / connectors): These are the hands of the agent: update a CRM record, create a ticket, send an email, schedule a meeting, generate a report, run a pipeline. Tool integration shifts value from "advice" to "execution," but also introduces risk—so action tools need guardrails: least-privilege permissions, input validation, idempotency keys, and post-condition checks (confirm the update actually happened). In Microsoft ecosystems, this tool plane often maps to Graph actions + business connectors (via Logic Apps/Power Automate) + custom APIs, with Copilot Studio (low code) or Foundry-style runtimes (pro code) orchestrating the calls.

Memory (Context & Learning) – Context Awareness and Adaptation

"Memory" is not just a long prompt. In agentic systems, memory is an explicit state strategy:

Working memory: what the agent has learned during the current run (intermediate tool results, constraints, partial plans).
Session memory: what should persist across turns (user preferences, ongoing tasks, summarized history).
Long-term memory: enterprise knowledge the agent can retrieve (indexed documents, structured facts, embeddings + metadata).

Short-term memory enables multi-step workflows without repeating questions. An HR onboarding agent can carry a new hire's details from intake through provisioning without re-asking, because the workflow state is persisted and referenced. Long-term "learning" is typically implemented through feedback loops rather than real-time model weight updates: capturing corrections, storing validated outcomes, and periodically improving prompts, routing logic, retrieval configuration, or (where appropriate) fine-tuning. The key design rule is that memory must be policy-aware: retention rules, PII handling, and permission trimming apply to stored state as much as they apply to retrieved documents.
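The bounded control loop described in the autonomy section (plan → act → observe → refine, with explicit budgets and stop conditions) can be sketched as follows; planNextStep and the tools map are illustrative stand-ins for your planner and tool plane, not any real SDK:

```javascript
// Sketch of a bounded agent control loop: explicit budgets, not open-ended autonomy.
function runAgent(goal, planNextStep, tools,
                  budget = { maxToolCalls: 5, maxRetries: 2 }) {
  const state = { goal, observations: [], toolCalls: 0, done: false, answer: null };
  while (!state.done && state.toolCalls < budget.maxToolCalls) {
    const step = planNextStep(state);                 // plan
    if (step.type === "finish") {                     // explicit stop condition
      state.done = true;
      state.answer = step.answer;
      break;
    }
    let result;
    for (let attempt = 0; ; attempt++) {              // act, with bounded retries
      try {
        result = tools[step.tool](step.args);
        break;
      } catch (err) {
        if (attempt >= budget.maxRetries) { result = { error: String(err) }; break; }
      }
    }
    state.toolCalls += 1;
    state.observations.push(result);                  // observe; refine on next pass
  }
  return state;
}
```

The budget guarantees termination: the agent either reaches a finish step or halts at maxToolCalls, and each tool call gets at most maxRetries retries before an error observation is recorded for the planner to react to.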
Orchestration – Coordinating Multi-Agent Teams

Complex enterprise work is rarely single-skill. Orchestration is how agentic systems scale capability without turning one agent into an unmaintainable monolith. The pattern is "manager + specialists": an orchestrator decomposes the goal into subtasks, routes each to the best tool or sub-agent, and then composes a final response. This can be done sequentially or in parallel. Employee onboarding is a classic: HR intake, IT account creation, equipment provisioning, and training scheduling can run in parallel where dependencies allow.

The engineering challenge is making orchestration reliable: defining strict input/output contracts between agents (often structured JSON), handling failures (timeouts, partial completion), and ensuring only one component has authority to send the final user-facing message to avoid conflicting outputs. In Microsoft terms, orchestration can be implemented as agentic flows in Copilot Studio, connected-agent patterns in Foundry, or explicit orchestrators in code using structured tool schemas and shared state.

Strategic Impact – How Agentic AI Changes Knowledge Work

Agentic AI is no longer an experimental overlay to enterprise systems. It is becoming an embedded operational layer inside core workflows. Unlike earlier chatbot deployments that answered isolated questions, modern enterprise agents execute end-to-end processes, interact with structured systems, maintain context, and operate within governed boundaries. The shift is not about conversational intelligence alone; it is about workflow execution at scale. The transformation becomes clearer when examining real implementations across industries. In legal services, agentic systems have moved beyond document summarization into operational case automation.
Assembly Software’s NeosAI, built on Azure AI infrastructure, integrates directly into legal case management systems and automates document analysis, structured data extraction, and first-draft generation of legal correspondence. What makes this deployment impactful is not merely the generative drafting capability, but the integration architecture. NeosAI is not an isolated chatbot; it operates within the same document management systems, billing systems, and communication platforms lawyers already use. Firms report time savings of up to 25 hours per case, with document drafting cycles reduced from days to minutes for first-pass outputs. Importantly, the system runs within secure Azure environments with zero data retention policies, addressing one of the most sensitive concerns in legal AI adoption: client confidentiality. JPMorgan’s COiN platform represents another dimension of legal and financial automation. Instead of conversational assistance, COiN performs structured contract intelligence at production scale. It analyzes more than 12,000 commercial loan agreements annually, extracting over 150 clause attributes per document. Work that previously required approximately 360,000 human hours now executes in seconds. The architecture emphasizes structured NLP pipelines, taxonomy-based clause classification, and private cloud deployment for regulatory compliance. Rather than replacing legal professionals, the system flags unusual clauses for human review, maintaining oversight while dramatically accelerating analysis. Over time, COiN has also served as a knowledge retention mechanism, preserving institutional contract intelligence that would otherwise be lost with employee turnover. In financial services, the impact is similarly structural. Morgan Stanley’s internal AI Assistant allows wealth advisors to query over 100,000 proprietary research documents using natural language. 
Adoption has reached nearly universal usage across advisor teams, not because it replaces expertise, but because it compresses research time and surfaces insights instantly. Building on this foundation, the firm introduced an AI meeting debrief agent that transcribes client conversations using speech-to-text models and generates CRM notes and follow-up drafts through GPT-based reasoning. Advisors review outputs before finalization, preserving human judgment. The result is faster client engagement and measurable productivity improvements. What differentiates Morgan Stanley’s approach is not only deployment scale, but disciplined evaluation before release. The firm established rigorous benchmarking frameworks to test model outputs against expert standards for accuracy, compliance, and clarity. Only after meeting defined thresholds were systems expanded firmwide. This pattern—evaluation before scale—is becoming a defining trait of successful enterprise agent deployment. Human Resources provides a different perspective on agentic AI. Johnson Controls deployed an AI HR assistant inside Slack to manage policy questions, payroll inquiries, and onboarding support across a global workforce exceeding 100,000 employees. By embedding the agent in a channel employees already use, adoption barriers were reduced significantly. The result was a 30–40% reduction in live HR call volume, allowing HR teams to redirect focus toward strategic workforce initiatives. Similarly, Ciena integrated an AI assistant directly into Microsoft Teams, unifying HR and IT support through a single conversational interface. Employees no longer navigate separate portals; the agent orchestrates requests across backend systems such as Workday and ServiceNow. The technical lesson here is clear: integration breadth drives usability, and usability drives adoption. Engineering and IT operations reveal perhaps the most technically sophisticated application of agentic AI: multi-agent orchestration. 
In a proof-of-concept developed through collaboration between Microsoft and ServiceNow, an AI-driven incident response system coordinates multiple agents during high-priority outages. Microsoft 365 Copilot transcribes live war-room discussions and extracts action items, while ServiceNow's Now Assist executes operational updates within IT service management systems. A Semantic Kernel–based manager agent maintains shared context and synchronizes activity across platforms. This eliminates the longstanding gap between real-time discussion and structured documentation, automatically generating incident reports while freeing engineers to focus on remediation rather than clerical tasks. The system demonstrates that orchestration is not conceptual—it is operational.

Across these examples, the pattern is consistent. Agentic AI changes knowledge work by absorbing structured cognitive labor: document parsing, compliance classification, research synthesis, workflow routing, transcription, and task coordination. Humans remain essential for judgment, ethics, and accountability, but the operational layer increasingly runs through AI-mediated execution. The result is not incremental productivity improvement; it is structural acceleration of knowledge processes.

Design and Governance Challenges – Managing the Risks

As agentic AI shifts from answering questions to executing workflows, governance must mature accordingly. These systems retrieve enterprise data, invoke APIs, update records, and coordinate across platforms. That makes them operational actors inside your architecture—not just assistants. The primary shift is this: autonomy increases responsibility. Agents must be observable. Every retrieval, reasoning step, and tool invocation should be traceable. Without structured telemetry and audit trails, enterprises lose visibility into why an agent acted the way it did. Agents must also operate within scoped authority.
Least-privilege access, role-based identity, and bounded credentials are essential. An HR agent should not access finance systems. A finance agent should not modify compliance data without policy constraints. Autonomy only works when it is deliberately constrained.

Execution boundaries are equally critical. High-risk actions—financial approvals, legal submissions, production changes—should include embedded thresholds or human approval gates. Autonomy should be progressive, not absolute.

Cost and performance must be governed just like cloud infrastructure. Agentic systems can trigger recursive calls and model loops. Without usage monitoring, rate limits, and model-tier routing, compute consumption can escalate unpredictably.

Finally, agentic systems require continuous evaluation. Real-world testing, live monitoring, and drift detection ensure the system remains aligned with business rules and compliance requirements. These are not "set and forget" deployments. In short, agentic AI becomes sustainable only when autonomy is paired with observability, scoped authority, embedded guardrails, cost control, and structured oversight.

Conclusion – Towards the Agentic Enterprise

The organizations achieving meaningful returns from agentic AI share a common pattern. They do not treat AI agents as experimental tools. They design them as production systems with defined roles, scoped authority, measurable KPIs, embedded observability, and formal governance layers. When autonomy is paired with integration, memory, orchestration, and governance discipline, agentic AI becomes more than automation—it becomes an operational architecture. Enterprises that master this architecture are not merely reducing costs; they are redefining how knowledge work is executed. In this emerging model, human professionals focus on strategic judgment and innovation, while AI agents manage structured cognitive execution at scale.
The competitive advantage will not belong to those who deploy the most AI, but to those who deploy it with architectural rigor and governance maturity. Before we rush to deploy more agents, a few questions are worth asking:

If an AI agent executes a workflow in your enterprise today, can you trace every reasoning step and tool invocation behind that decision?
Does your architecture treat AI as a conversational layer, or as an operational actor with scoped identity, cost controls, and policy enforcement?
Where should autonomy stop in your organization, and who defines that boundary?

Agentic AI is not just a capability shift. It is an architectural decision. Curious to hear how others are designing their control planes and orchestration layers.

References

MIT Sloan – "Agentic AI, Explained" by Beth Stackpole: A foundational overview of agentic AI, its distinction from traditional generative AI, and its implications for enterprise workflows, governance, and strategy.
Microsoft TechCommunity – "Introducing Multi-Agent Orchestration in Foundry Agent Service": Details Microsoft's multi-agent orchestration capabilities, including Connected Agents, Multi-Agent Workflows, and integration with A2A and MCP protocols.
Microsoft Learn – "Extend the Capabilities of Your Agent – Copilot Studio": Explains how to build and extend custom agents in Microsoft Copilot Studio using tools, connectors, and enterprise data sources.
Assembly Software's NeosAI case – Microsoft Customer Stories
JPMorgan COiN platform – GreenData Case Study
HR support AI (Johnson Controls, Ciena, Databricks) – Moveworks case studies
ServiceNow & Semantic Kernel multi-agent P1 Incident – Microsoft Semantic Kernel Blog

Level up your Python + AI skills with our complete series
We've just wrapped up our live series on Python + AI, a comprehensive nine-part journey diving deep into how to use generative AI models from Python. The series introduced multiple types of models, including LLMs, embedding models, and vision models. We dug into popular techniques like RAG, tool calling, and structured outputs. We assessed AI quality and safety using automated evaluations and red-teaming. Finally, we developed AI agents using popular Python agent frameworks and explored the new Model Context Protocol (MCP).

To help you apply what you've learned, all of our code examples work with GitHub Models, a service that provides free models to every GitHub account holder for experimentation and education. Even if you missed the live series, you can still access all the material using the links below! If you're an instructor, feel free to use the slides and code examples in your own classes. If you're a Spanish speaker, check out the Spanish version of the series.

Python + AI: Large Language Models

📺 Watch recording

In this session, we explore Large Language Models (LLMs), the models that power ChatGPT and GitHub Copilot. We use Python to interact with LLMs using popular packages like the OpenAI SDK and LangChain. We experiment with prompt engineering and few-shot examples to improve outputs. We also demonstrate how to build a full-stack app powered by LLMs and explain the importance of concurrency and streaming for user-facing AI apps.

Slides for this session
Code repository with examples: python-openai-demos

Python + AI: Vector embeddings

📺 Watch recording

In our second session, we dive into a different type of model: the vector embedding model. A vector embedding is a way to encode text or images as an array of floating-point numbers. Vector embeddings enable similarity search across many types of content. In this session, we explore different vector embedding models, such as the OpenAI text-embedding-3 series, through both visualizations and Python code.
We compare distance metrics, use quantization to reduce vector size, and experiment with multimodal embedding models.

Slides for this session
Code repository with examples: vector-embedding-demos

Python + AI: Retrieval Augmented Generation

📺 Watch recording

In our third session, we explore one of the most popular techniques used with LLMs: Retrieval Augmented Generation. RAG is an approach that provides context to the LLM, enabling it to deliver well-grounded answers for a particular domain. The RAG approach works with many types of data sources, including CSVs, webpages, documents, and databases. In this session, we walk through RAG flows in Python, starting with a simple flow and culminating in a full-stack RAG application based on Azure AI Search.

Slides for this session
Code repository with examples: python-openai-demos

Python + AI: Vision models

📺 Watch recording

Our fourth session is all about vision models! Vision models are LLMs that can accept both text and images, such as GPT-4o and GPT-4o mini. You can use these models for image captioning, data extraction, question answering, classification, and more! We use Python to send images to vision models, build a basic chat-with-images app, and create a multimodal search engine.

Slides for this session
Code repository with examples: openai-chat-vision-quickstart

Python + AI: Structured outputs

📺 Watch recording

In our fifth session, we discover how to get LLMs to output structured responses that adhere to a schema. In Python, all you need to do is define a Pydantic BaseModel to get validated output that perfectly meets your needs. We focus on the structured outputs mode available in OpenAI models, but you can use similar techniques with other model providers. Our examples demonstrate the many ways you can use structured responses, such as entity extraction, classification, and agentic workflows.
Slides for this session
Code repository with examples: python-openai-demos

Python + AI: Quality and safety

📺 Watch recording

This session covers a crucial topic: how to use AI safely and how to evaluate the quality of AI outputs. There are multiple mitigation layers when working with LLMs: the model itself, a safety system on top, the prompting and context, and the application user experience. We focus on Azure tools that make it easier to deploy safe AI systems into production. We demonstrate how to configure the Azure AI Content Safety system when working with Azure AI models and how to handle errors in Python code. Then we use the Azure AI Evaluation SDK to evaluate the safety and quality of output from your LLM.

Slides for this session
Code repository with examples: ai-quality-safety-demos

Python + AI: Tool calling

📺 Watch recording

In the seventh session, we turn to the technologies needed to build AI agents, starting with the foundation: tool calling (also known as function calling). We define tool call specifications using both JSON schema and Python function definitions, then send these definitions to the LLM. We demonstrate how to properly handle tool call responses from LLMs, enable parallel tool calling, and iterate over multiple tool calls. Understanding tool calling is absolutely essential before diving into agents, so don't skip over this foundational session.

Slides for this session
Code repository with examples: python-openai-demos

Python + AI: Agents

📺 Watch recording

In the penultimate session, we build AI agents! We use Python AI agent frameworks such as the new agent-framework from Microsoft and the popular LangGraph framework. Our agents start simple and then increase in complexity, demonstrating different architectures such as multiple tools, supervisor patterns, graphs, and human-in-the-loop workflows.
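The tool-calling flow described above comes down to a dispatch step: the LLM names a function and supplies JSON arguments, and your code looks the function up and invokes it. A minimal sketch, with a simulated tool call in the OpenAI chat-completions shape rather than a live API response:

```python
import json

# Python function exposed to the model as a tool
def get_weather(city: str) -> str:
    # A stub; a real tool would call a weather API
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

# A tool call shaped like the OpenAI API's tool_calls entries
# (simulated here instead of coming from a chat completion).
tool_call = {
    "function": {
        "name": "get_weather",
        "arguments": '{"city": "Seattle"}',
    }
}

# Dispatch: look up the named tool and invoke it with the parsed arguments
fn = TOOLS[tool_call["function"]["name"]]
args = json.loads(tool_call["function"]["arguments"])
result = fn(**args)
print(result)  # → Sunny in Seattle
```

In a real loop you would append `result` to the conversation as a tool message and call the model again, repeating until it stops requesting tools.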
Slides for this session
Code repository with examples: python-ai-agent-frameworks-demos

Python + AI: Model Context Protocol

📺 Watch recording

In the final session, we dive into the hottest technology of 2025: MCP (Model Context Protocol). This open protocol makes it easy to extend AI agents and chatbots with custom functionality, making them more powerful and flexible. We demonstrate how to use the Python FastMCP SDK to build an MCP server running locally and consume that server from chatbots like GitHub Copilot. Then we build our own MCP client to consume the server. Finally, we discover how easy it is to connect AI agent frameworks like LangGraph and Microsoft agent-framework to MCP servers. With great power comes great responsibility, so we briefly discuss the security risks that come with MCP, both as a user and as a developer.

Slides for this session
Code repository with examples: python-mcp-demo

Staying in the flow: SleekFlow and Azure turn customer conversations into conversions
A customer adds three items to their cart but never checks out. Another asks about shipping, gets stuck waiting eight minutes, only to drop the call. A lead responds to an offer but never gets a timely follow-up. Each of these moments represents lost revenue, and they happen to businesses every day.

SleekFlow was founded in 2019 to help companies turn those almost-lost-customer moments into connection, retention, and growth. Today we serve more than 2,000 mid-market and enterprise organizations across industries including retail and e-commerce, financial services, healthcare, travel and hospitality, telecommunications, real estate, and professional services. In total, those customers rely on SleekFlow to orchestrate more than 600,000 daily customer interactions across WhatsApp, Instagram, web chat, email, and more.

Our name reflects what makes us different. Sleek is about unified, polished experiences—consolidating conversations into one intelligent, enterprise-ready platform. Flow is about orchestration—AI and human agents working together to move each conversation forward, from first inquiry to purchase to renewal.

The drive for enterprise-ready agentic AI

Enterprises today expect always-on, intelligent conversations—but delivering that at scale proved daunting. When we set out to build AgentFlow, our agentic AI platform, we quickly ran into familiar roadblocks: downtime that disrupted peak-hour interactions, vector search delays that hurt accuracy, and costs that ballooned under multi-tenant workloads. Development slowed from limited compatibility with other technologies, while customer onboarding stalled without clear compliance assurances. To move past these barriers, we needed a foundation that could deliver the performance, trust, and global scale enterprises demand.

The platform behind the flow: How Azure powers AgentFlow

We chose Azure because building AgentFlow required more than raw compute power.
Chatbots built on a single-agent model often stall out. They struggle to retrieve the right context, they miss critical handoffs, and they return answers too slowly to keep a customer engaged. To fix that, we needed an ecosystem capable of supporting a team of specialized AI agents working together at enterprise scale.

Azure Cosmos DB provides the backbone for memory and context, managing short-term interactions, long-term histories, and vector embeddings in containers that respond in 15–20 milliseconds. Our agents use Azure OpenAI models within Azure AI Foundry to understand and generate responses natively in multiple languages. Whether in English, Chinese, or Portuguese, the responses feel natural and aligned with the brand. Semantic Kernel acts as the conductor, orchestrating multiple agents, each of which retrieves the necessary knowledge and context, including chat histories, transactional data, and vector embeddings, directly from Azure Cosmos DB. For example, one agent could be retrieving pricing data, another summarizing it, and a third preparing it for a human handoff.

The result is not just responsiveness but accuracy. A telecom provider can resolve a billing question while surfacing an upsell opportunity in the same dialogue. A financial advisor can walk into a call with a complete dossier prepared in seconds rather than hours. A retailer can save a purchase by offering an in-stock substitute before the shopper abandons the cart. Each of these conversations is different, yet the foundation is consistent on AgentFlow.

Fast, fluent, and focused: Azure keeps conversations moving

Speed is the heartbeat of a good conversation. A delayed answer feels like a dropped call, and an irrelevant one breaks trust. For AgentFlow to keep customers engaged, every operation behind the scenes has to happen in milliseconds. A single interaction can involve dozens of steps.
One agent pulls product information from embeddings, another checks it against structured policy data, and a third generates a concise, brand-aligned response. If any of these steps lag, the dialogue falters. On Azure, they don’t.

Azure Cosmos DB manages conversational memory and agent state across dedicated containers for short-term exchanges, long-term history, and vector search. Sharded DiskANN indexing powers semantic lookups that resolve in the 15–20 millisecond range—fast enough that the customer never feels a pause. Microsoft’s Phi-4 model, as well as Azure OpenAI in Foundry Models like o3-mini and o4-mini, provides the reasoning, and Azure Container Apps scale elastically, so performance holds steady during event-driven bursts, such as campaign broadcasts that can push the platform from a few to thousands of conversations per minute, and during daily peak-hour surges.

To support that level of responsiveness, we run Azure Container Apps on the Pay-As-You-Go consumption plan, using KEDA-based autoscaling to expand from five idle containers to more than 160 within seconds. Meanwhile, Microsoft Orleans coordinates lightweight in-memory clustering to keep conversations sleek and flowing.

The results are tangible. Retrieval-augmented generation recall improved from 50 to 70 percent. Execution speed is about 50 percent faster. For SleekFlow’s customers, that means carts are recovered before they’re abandoned, leads are qualified in real time, and support inquiries move forward instead of stalling out. With Azure handling the complexity under the hood, conversations flow naturally on the surface—and that’s what keeps customers engaged.

Secure enough for enterprises, human enough for customers

AgentFlow was built with security-by-design as a first principle, giving businesses confidence that every interaction is private, compliant, and reliable. On Azure, every AI agent operates inside guardrails enterprises can depend on.
Azure Cosmos DB enforces strict per-tenant isolation through logical partitioning, encryption, and role-based access control, ensuring chat histories, knowledge bases, and embeddings remain auditable and contained. Models deployed through Azure AI Foundry, including Azure OpenAI and Microsoft Phi, process data entirely within SleekFlow’s Azure environment, with guarantees that it is never used to train public models and with activity logged for transparency. And Azure’s certifications and compliance frameworks—including ISO 27001, SOC 2, and GDPR—are backed by continuous monitoring and regional data residency options, proving compliance at a global scale.

But trust is more than a checklist of certifications. AgentFlow brings human-like fluency and empathy to every interaction, powered by Azure OpenAI running with high token-per-second throughput so responses feel natural in real time. Quality control isn’t left to chance. Human override workflows are orchestrated through Azure Container Apps and Azure App Service, ensuring AI agents can carry conversations confidently until they’re ready to hand off to human agents. Enterprises gain the confidence to let AI handle revenue-critical moments, knowing Azure provides the foundation and SleekFlow provides the human-centered design.

Shaping the next era of conversational AI on Azure

The benefits of Azure show up not only in customer conversations but also in the way our own teams work. Faster processing speeds and high token-per-second throughput reduce latency, so we spend less time debugging and more time building. Stable infrastructure minimizes downtime and troubleshooting, lowering operational costs. That same reliability and scalability have transformed the way we engineer AgentFlow. AgentFlow started as part of our monolithic system. Shipping new features used to take about a month of development and another week of heavy testing to make sure everything held together.
After moving AgentFlow to a microservices architecture on Azure Container Apps, we can now deploy updates almost daily with no downtime or customer impact, all thanks to native support for rolling updates and blue-green deployments.

This agility is what excites us most about what's ahead. With Azure as our foundation, SleekFlow is not simply keeping pace with the evolution of conversational AI—we are shaping what comes next. Every interaction we refine, every second we save, and every workflow we streamline brings us closer to our mission: keeping conversations sleek, flowing, and valuable for enterprises everywhere.

🎉 Announcing General Availability of AI & RAG Connectors in Logic Apps (Standard)
We’re excited to share that a comprehensive set of AI and Retrieval-Augmented Generation (RAG) capabilities is now Generally Available in Azure Logic Apps (Standard). This release brings native support for document processing, semantic retrieval, embeddings, and grounded reasoning directly into the Logic Apps workflow engine.

🔌 Available AI Connectors in Logic Apps Standard

Logic Apps (Standard) had previously previewed four AI-focused connectors that open the door for a new generation of intelligent automation across the enterprise. Whether you're processing large volumes of documents, enriching operational data with intelligence, or enabling employees to interact with systems using natural language, these connectors provide the foundation for building solutions that are smarter, faster, and more adaptable to business needs. These are now in GA. They allow teams to move from routine workflow automation to AI-assisted decisioning, contextual responses, and multi-step orchestration that reflects real business intent. Below is the full set of built-in connectors and their actions as they appear in the designer.

1. Azure OpenAI

Actions
- Get an embedding
- Get chat completions
- Get chat completions using Prompt Template
- Get completion
- Get multiple chat completions
- Get multiple embeddings

What this unlocks
Bring natural language reasoning and structured AI responses directly into workflows. Common scenarios include guided decisioning, user-facing assistants, classification and routing, or preparing embeddings for semantic search and RAG workflows.

2. Azure AI Search

Actions
- Delete a document
- Delete multiple documents
- Get agentic retrieval output (Preview)
- Index a document
- Index multiple documents
- Merge document
- Search vectors
- Search vectors with natural language

What this unlocks
Add vector, hybrid semantic, and natural language search directly to workflow logic.
Ideal for retrieving relevant content from enterprise data, powering search-driven workflows, and grounding AI responses with context from your own documents.

3. Azure AI Document Intelligence

Action
- Analyze document

What this unlocks
Document Intelligence serves as the entry point for document-heavy scenarios. It extracts structured information from PDFs, images, and forms, allowing workflows to validate documents, trigger downstream processes, or feed high-quality data into search and embeddings pipelines.

4. AI Operations

Actions
- Chunk text with metadata
- Parse document with metadata

What this unlocks
Transform unstructured files into enriched, structured content. Enables token-aware chunking, page-level metadata, and clean preparation of content for embeddings and semantic search at scale.

🤖 Advanced AI & Agentic Workflows with AgentLoop

Logic Apps (Standard) also supports AgentLoop (also Generally Available), allowing AI models to use workflow actions as tools and iterate until the task is complete. Combined with chunking, embeddings, and natural language search, this opens the door to advanced agentic scenarios such as document intelligence agents, RAG-based assistants, and iterative evaluators.

Conclusion

With these capabilities now built into Logic Apps Standard, teams can bring AI directly into their integration workflows without additional infrastructure or complexity. Whether you’re streamlining document-heavy processes, enabling richer search experiences, or exploring more advanced agentic patterns, these capabilities provide a strong foundation to start building today.

📢 Agent Loop Ignite Update - New Set of AI Features Arrive in Public Preview
Today at Ignite, we announced the General Availability of Agent Loop in Logic Apps Standard—bringing production-ready agentic automation to every customer. But GA is just the beginning. We’re also releasing a broad set of new and powerful AI-first capabilities in Public Preview that dramatically expand what developers can build: run agents in the Consumption SKU, bring your own models through APIM AI Gateway, call any tool through MCP, deploy agents directly into Teams, secure RAG with document-level permissions, onboard with Okta, and build in a completely redesigned workflow designer.

With these preview features layered on top of GA, customers can build AI applications that bring together secure tool calling, user identity, governance, observability, and integration with their existing systems—whether they’re running in Standard, Consumption, or the Microsoft 365 ecosystem. Here’s a closer look at the new capabilities now available in Public Preview.

Public Preview of Agentic Workflows in Consumption SKU

Agent Loop is now available in Azure Logic Apps Consumption, bringing autonomous and conversational AI agents to everyone through a fully serverless, pay-as-you-go experience. You can now turn any workflow into an intelligent workflow using the agent loop action—without provisioning infrastructure or managing AI models. This release provides instant onboarding, simple authentication, and a frictionless entry point for building agentic automation. Customers can also tap into Logic Apps’ ecosystem of 1,400+ connectors for tool calling and system integrations. This update makes AI-powered automation accessible for rapid prototyping while still offering a clear path to scale and production-ready deployments in Logic Apps Standard, including BYOM, VNET integration, and enterprise-grade controls. Preview limitations include limited regions, no VS Code local development, and no nested agents or MCP tools yet. Read more about this in our announcement blog!
Bring your Own Model

We’re excited to introduce Bring Your Own Model (BYOM) support in Agent Loop for Logic Apps Standard - making it possible to use any AI model in your agentic workflows from Foundry, and even on-prem or private cloud models.

The key highlight of this feature is the deep integration with the Azure API Management (APIM) AI Gateway, which now serves as the control plane for how Agent Loop connects to models. Instead of wiring agents directly to individual endpoints, AI Gateway creates a single, governed interface that manages authentication, keys, rate limits, and quotas in one place. It provides built-in monitoring, logging, and observability, giving you full visibility into every request. It also ensures a consistent API shape for model interactions, so your workflows remain stable even as backends evolve. With AI Gateway in front, you can test, upgrade, and refine your model configuration without changing your Logic Apps, making model management safer, more predictable, and easier to operate at scale.

Beyond AI Gateway, Agent Loop also supports:
- Direct external model integration when you want lightweight, point-to-point access to a third-party model API.
- Local/VNET model integration for on-prem, private cloud, or custom fine-tuned models that require strict data residency and private networking.

Together, these capabilities let you treat the model as a pluggable component - start with the model you have today, bring in specialized or cost-optimized models as needed, and maintain enterprise-grade governance, security, and observability throughout. This makes Logic Apps one of the most flexible platforms for building model-agnostic, production-ready AI agent workflows. Ready to try this out? Go to http://aka.ms/agentloop/byom to learn more and get started.
MCP support for Agent Loop in Logic Apps Standard

Agent Loop in Azure Logic Apps Standard now supports the Model Context Protocol (MCP), enabling agents to discover and call external tools through an open, standardized interface. This brings powerful, flexible tool extensibility to both conversational and autonomous agents. Agent Loop offers three ways to bring MCP tools into your workflows:

- Bring Your Own MCP connector – Point to any external MCP server using its URL and credentials, instantly surfacing its published tools in your agent.
- Managed MCP connector – Access Azure-hosted MCP servers through the familiar managed connector experience, with shared connections and Azure-managed catalogs.
- Custom MCP connector – Build and publish your own OpenAPI-based MCP connector to expose private or tenant-scoped MCP servers. Ideal for reusing MCP servers across the organization.

Managed and Custom MCP connectors support on-behalf-of (OBO) authentication, allowing agents to call MCP tools using the end user’s identity. This provides user-context-aware, permission-sensitive tool access across your intelligent workflows. Want to learn more? Check out our announcement blog and how-to documents.

Deploy Conversational Agents to Teams/M365

Workflows with conversational agents in Logic Apps can now be deployed directly into Microsoft Teams, so your agentic workflows show up where your users already live all day. Instead of going to a separate app or portal, employees can ask the agent questions, kick off approvals, check order or incident status, or look up internal policies right from a Teams chat or channel. The agent becomes just another teammate in the conversation—joining stand-ups, project chats, and support rooms as a first-class participant.
Because the same Logic Apps agent can also be wired into other Microsoft 365 experiences that speak to Bots and web endpoints, this opens the door to a consistent and personalized “organization copilot” that follows users across the M365 ecosystem: Teams for chat, meetings, and channels today, and additional surfaces over time. Azure Bot Service and your proxy handle identity, tokens, and routing, while Logic Apps takes care of reasoning, tools, and back-end systems. The result is an agent that feels native to Teams and Microsoft 365—secure, governed, and always just one @mention away. Ready to bring your agentic workflows into Teams? Here’s how to get started.

Secure Knowledge Retrieval for AI Agents in Logic Apps

We’ve added native document-level authorization to Agent Loop by integrating Azure AI Search ACLs. This ensures AI agents only retrieve information the requesting user is permitted to access—making RAG workflows secure, compliant, and permission-aware by default. Documents are indexed with user or group permissions, and Agent Loop automatically applies those permissions during search using the caller’s principal ID or group memberships. Only authorized documents reach the LLM, preventing accidental exposure of sensitive data. This simplifies development, removes custom security code, and allows a single agent to safely serve users with different access levels—whether for HR, IT, or internal knowledge assistants. Here is our blogpost to learn more about this feature.

Okta

Agent Loop now supports Okta as an identity provider for conversational agents, alongside Microsoft Entra ID. This makes it easy for organizations using Okta for workforce identity to pass authenticated user context—including user attributes, group membership, and permissions—directly into the agent at runtime. Agents can now make user-aware decisions, enforce access rules, personalize responses, and execute tools with proper user context.
This update helps enterprises adopt Agent Loop without changing their existing identity architecture and enables secure, policy-aligned AI interactions across both Okta and Entra environments. Setting up Okta as the identity provider requires a few steps, and they are all explained in detail here at Logic Apps Labs.

Designer makeover!

We’ve introduced a major redesign of the Azure Logic Apps designer, now in Public Preview for Standard workflows. This release marks the beginning of a broader modernization effort to make building, testing, and operating workflows faster, cleaner, and more intuitive.

The new designer focuses on reducing friction and streamlining the development loop. You now land directly in the designer when creating a workflow, with plans to remove early decisions like stateful/stateless or agentic setup. The interface has been simplified into a single unified view, bringing together the visual canvas, code view, settings, and run history so you no longer switch between blades. A major addition is Draft Mode with auto-save, which preserves your work every few seconds without impacting production. Drafts can be tested safely and only go live when you choose to publish—without restarting the app during editing.

Search has also been completely rebuilt for speed and accuracy, powered by backend indexing instead of loading thousands of connectors upfront. The designer now supports sticky notes and markdown, making it easy to document workflows directly on the canvas. Monitoring is integrated into the same page, letting you switch between runs instantly and compare draft and published results. A new hierarchical timeline view improves debugging by showing every action executed in order. This release is just the start—many more improvements and a unified designer experience across Logic Apps are on the way as we continue to iterate based on your feedback. Learn more about the designer updates in our announcement blog!

What's Next

We’d love your feedback.
Which capabilities should we prioritize, and what would create the biggest impact for your organization?

🎉 Announcing General Availability of Agent Loop in Azure Logic Apps
Transforming Business Automation with Intelligent, Collaborative Multi-Agentic Workflows!

Agent Loop is now Generally Available in Azure Logic Apps Standard, turning the Logic Apps platform into a complete multi-agentic automation system. Build AI agents that work alongside workflows and humans, secured with enterprise-grade identity and access controls, deployed using your existing CI/CD pipelines. Thousands of customers have already built tens of thousands of agents—now you can take them to production with confidence.

Get Started | Workshop | Demo Videos | Ignite 2025 Session

After an incredible journey since we introduced Agent Loop at Build earlier this year, we're thrilled to announce that Agent Loop is now generally available in Azure Logic Apps. This milestone represents more than just a feature release—it's the culmination of learnings from thousands of customers who have been pushing the boundaries of what's possible with agentic workflows. Agent Loop transforms Azure Logic Apps into a complete multi-agentic business process automation platform, where AI agents, automated workflows, and human expertise collaborate seamlessly to solve complex business challenges. With GA, we're delivering enterprise-grade capabilities that organizations need to confidently deploy intelligent automation at scale.

The Journey to GA: Proven by Customers, Built for Production

Since our preview launch at Build, the response has been extraordinary. Thousands of customers—from innovative startups to Fortune 500 enterprises—have embraced Agent Loop, building thousands of active agents that have collectively processed billions of tokens every month for the past six months. The growth of agents, executions, and token usage has accelerated significantly, doubling month over month. Since the launch of Conversational Agents in September, they already account for nearly 30% of all agentic workflows.
Across the platform, agentic workflows now consume billions of tokens, with overall token usage increasing at nearly 3× month over month.

Cyderes: 5X Faster Security Investigation Cycles

Cyderes leveraged Agent Loop to automate triage and handling of security alerts, leading to faster investigation cycles and significant cost savings.

"We were drowning in data—processing over 10,000 alerts daily while analysts spent more time chasing noise than connecting narratives. Agent Loop changed everything. By empowering our team to design and deploy their own AI agents through low-code orchestration, we've achieved 5X faster investigation cycles and significant cost savings, all while keeping pace with increasingly sophisticated cyber threats that now leverage AI to operate 25X faster than traditional attacks." – Eric Summers, Engineering Manager - AI & SOAR

Vertex Pharmaceuticals: Hours Condensed to Minutes

Vertex Pharmaceuticals unlocked knowledge trapped across dozens of systems via a team of agents. VAIDA, built with Logic Apps and Agent Loop, orchestrates multiple AI agents and helps employees find information faster, while maintaining compliance and supporting multiple languages.

"We had knowledge trapped across dozens of systems—ServiceNow, documentation, training materials—and teams were spending valuable time hunting for answers. Logic Apps Agent Loop changed that. VAIDA now orchestrates multiple AI agents to summarize, search, and analyze this knowledge, then routes approvals right in Teams and Outlook. We've condensed hours into minutes while maintaining compliance and delivering content in multiple languages." – Pratik Shinde, Director, Digital Infrastructure & GenAI Platforms

Where Customers Are Deploying Agent Loop

Customers across industries are using Agent Loop to build AI applications that power both everyday tasks and mission-critical business processes across Healthcare, Retail, Energy, Financial Services, and beyond.
These applications drive impact across a wide range of scenarios:

- Developer Productivity: Write code, generate unit tests, create workflows, map data between systems, automate source control, deployment and release pipelines
- IT Operations: Incident management, ticket and issue handling, policy review and enforcement, triage, resource management, cost optimization, issue remediation
- Business Process Automation: Empower sales specialists, retail assistants, order processing/approval flows, and healthcare assistants for intake and scheduling
- Customer & Stakeholder Support: Project planning and estimation, content generation, automated communication, and streamlined customer service workflows

Proven Internally at Microsoft

Agent Loop is also powering Microsoft and the Logic Apps team's own operations, demonstrating its versatility and real-world impact:

- IcM Automation Team: Transforming Microsoft's internal incident automation platform into an agent studio that leverages Logic Apps' Agent Loop, enabling teams across Microsoft to build agentic live site incident automations
- Logic Apps Team Use Cases:
  - Release & Deployment Agent: Streamlines deployment and release management for the Logic Apps platform
  - Incident Management Agent: An extension of our SRE Agent, leveraging Agent Loop to accelerate incident response and remediation
  - Analyst Agent: Assists teams in exploring product usage and health data, generating insights directly from analytics

What's Generally Available Today

Core Agent Loop Capabilities (GA)

- Agent Loop in Logic Apps Standard SKU - Support for both Autonomous and Conversational workflows
  - Autonomous workflows run agents automatically based on triggers and conditions
  - Conversational workflows use A2A to enable interactive chat experiences with agents
- On-Behalf-Of Authentication - Per-user authentication for 1st-party and 3rd-party connectors
- Agent Hand-Off - Enable seamless collaboration in multi-agent workflows
- Python Code Interpreter - Execute Python code dynamically for data analysis and computation
- Nested Agent Action - Use agents as tools within other agents for sophisticated orchestration
- User ACLs Support - Fine-grained document access control for knowledge

Exciting New Agent Loop Features in Public Preview

We've also released several groundbreaking features in Public Preview:

- New Designer Experience - Redesigned interface optimized for building agentic workflows
- Agent Loop in Consumption SKU - Deploy agents in the serverless Consumption tier
- MCP Support - Integrate Model Context Protocol servers as tools, enabling agents to access standardized tool ecosystems
- AI Gateway Integration - Use Azure AI Gateway as a model source for unified governance and monitoring
- Teams/M365 Deployment - Deploy conversational agents directly in Microsoft Teams and Microsoft 365
- Okta Identity Provider - Use Okta as the identity provider for conversational agents

Here’s our Announcement Blog for these new capabilities.

Built on a Platform You Already Trust

Azure Logic Apps is already a proven iPaaS platform with thousands of customers using it for automation – ranging from startups to 100% of Fortune 500 companies. Agent Loop doesn't create a separate "agentic workflow automation platform" you have to learn and operate. Instead, it makes Azure Logic Apps itself your agentic platform:

- Workflows orchestrate triggers, approvals, retries, and branching
- Agent Loop, powered by LLMs, handles reasoning, planning, and tool selection
- Humans stay in control through approvals, exceptions, and guided hand-offs

Agent Loop runs inside your Logic Apps Standard environment, so you get the same benefits you already know: enterprise SLAs, VNET integration, data residency controls, hybrid hosting options, and integration with your existing deployment pipelines and governance model.

Enterprise Ready - Secure, User-Aware Agents by Design

Bringing agents into the enterprise only works if security and compliance are first-class.
With Agent Loop in Azure Logic Apps, security is built into every layer of the stack.

Per-User Actions with On-Behalf-Of (OBO) and Delegated Permissions

Many agent scenarios require tools to act in the context of the signed-in user. Agent Loop supports the OAuth 2.0 On-Behalf-Of (OBO) flow so that supported connector actions can run with delegated, per-user connections rather than a broad app-only identity. That means when an agent sends mail, reads SharePoint, or updates a service desk system, it does so as the user (where supported), respecting that user's licenses, permissions, and data boundaries. This is critical for scenarios like IT operations, HR requests, and finance approvals where "who did what" must be auditable.

Document-Level Security with Microsoft Entra-Based Access Control

Agents should only see the content a user is entitled to see. With Azure AI Search's Entra-based document-level security, your retrieval-augmented workflows can enforce ACLs and RBAC directly in the index, so that queries are automatically trimmed to documents the user has access to.

Secured Chat Entry Point with Easy Auth and Entra ID

The built-in chat client and your custom clients can be protected using App Service Authentication (Easy Auth) and Microsoft Entra ID, so only authorized users and apps can invoke your conversational endpoints.

Together, OBO, document-level security, and Easy Auth give you end-to-end identity and access control, from the chat surface, through the agent, down to your data and systems.
An Open Toolbox: Connectors, Workflows, MCP Servers, and External Agents

Agent Loop inherits the full power of the Logic Apps ecosystem and more:

- 1,400+ connectors for SaaS, on-premises, and custom APIs
- Workflows and agents as tools - compose sophisticated multi-step capabilities
- MCP server support - integrate with the Model Context Protocol for standardized tool access (Preview)
- A2A protocol support - enable agent-to-agent communication across platforms
- Multi-model flexibility - use Azure OpenAI, Azure AI Foundry hosted models, or bring your own model on any endpoint via AI gateway

You're not locked into a single vendor or model provider. Agent Loop gives you an open, extensible framework that works with your existing investments and lets you choose the right tools for each job.

Run Agents Wherever You Run Logic Apps

Agent Loop is native to Logic Apps Standard, so your agentic workflows run consistently across cloud, on-premises, or hybrid environments. They inherit the same deployment, scaling, and networking capabilities as your workflows, bringing adaptive, AI-driven automation to wherever your systems and data live.

Getting Started with Agent Loop

We're in very exciting times, and we can't wait to see our customers go to production and realize the benefits of these capabilities for their business outcomes and success. Here are some useful links to get started on your AI journey with Logic Apps!

- Logic Apps Labs - https://aka.ms/LALabs
- Workshop - https://aka.ms/la-agent-in-a-day
- Demos - https://aka.ms/agentloopdemos

🤖 Agent Loop Demos 🤖
We announced the public preview of Agent Loop at Build 2025. Agent Loop is a new feature in Logic Apps to build AI agents for use cases that span industry domains and patterns. Here are some resources to learn more about them:

- Logic Apps Labs - https://aka.ms/lalabs
- Agent in a day workshop - https://aka.ms/la-agent-in-a-day

In this article, we share use cases implemented in Logic Apps using Agent Loop and other features.

- This video shows a conversational agent that answers questions about Health Plans and their coverage. The demo features document ingestion of policy documents into AI Search and then retrieving them in Agent Loop using natural language.
- This video shows an autonomous agent that generates a sales report. The demo features the Python Code Interpreter, which analyzes Excel data and aggregates it for the LLM to reason over.
- This video shows a conversational agent that helps recruiters with candidate screening and interview scheduling. The demo features OBO (On-Behalf-Of), where the agent uses tools in the security context of the user.
- This video shows a conversational agent for a utility company. The demo features multi-agent orchestration using hand-off, and a native chat client that supports multi-turn conversations and streaming and is secured via Entra for user authentication.
- This video shows an autonomous Loan Approval Agent that handles auto loans for a bank. The demo features an AI agent that uses an Azure OpenAI model, the company's policies, and several tools to process loan applications. For edge cases, a human is involved via the Teams connector.
- This video shows an autonomous Product Return Agent for the Fourth Coffee company. Returns are processed by the agent based on company policy and other criteria. In this case too, a human is involved when decisions are outside the agent's boundaries.
- This video shows a commercial agent that grants credit for purchases of groceries and other products for Northwind Stores. The agent extracts financial information from an IBM Mainframe and an IBM i system to assess each requestor and updates the internal Northwind systems with the approved customers' information. This is a multi-agent scenario including both a codeful and declarative method of implementation. Note: This is pre-release functionality and is subject to change. If you are interested in further discussing Logic Apps codeful Agents, please fill out the following feedback form.
- Operations Agent (part 1): In this conversational agent, we perform Logic Apps operations such as repair and resubmit to ensure our integration platform is healthy and processing transactions. To ensure compliance, all operational activities are logged in ServiceNow.
- Operations Agent (part 2): In this autonomous agent, we perform Logic Apps operations such as repair and resubmit to ensure our integration platform is healthy and processing transactions. To ensure compliance, all operational activities are logged in ServiceNow.

Logic Apps Aviators Newsletter - November 2025
In this issue:

- Ace Aviator of the Month
- News from our product group
- News from our community

Ace Aviator of the Month

November's Ace Aviator: Al Ghoniem

What's your role and title? What are your responsibilities?

As a Senior Integration Consultant, I design and deliver enterprise-grade integration on Microsoft Azure, primarily using Logic Apps Standard, API Management, Service Bus, Event Grid and Azure Functions. My remit covers reference architectures, "golden" templates, governance and FinOps guardrails, CI/CD automation (Bicep and YAML), and production-ready patterns for reliability, observability and cost efficiency. Alongside my technical work, I lead teams of consultants and engineers, helping them adopt standardised delivery models, mentoring through code reviews and architectural walkthroughs, and ensuring we deliver consistent, high-quality outcomes across projects. I also help teams apply decisioning patterns (embedded versus external rules) and integrate AI responsibly within enterprise workflows.

Can you give us some insights into your day-to-day activities and what a typical day in your role looks like?

- Architecture and patterns: refining solution designs, sequence diagrams and rules models for new and existing integrations.
- Build and automation: evolving reusable Logic Apps Standard templates, Bicep modules and pipelines, embedding monitoring, alerts and identity-first security.
- Problem-solving: addressing performance tuning, transient fault handling, poison/DLQ flows and "design for reprocessing."
- Leadership and enablement: mentoring consultants, facilitating technical discussions, and ensuring knowledge is shared across teams.
- Community and writing: publishing articles and examples to demystify real-world integration trade-offs.

What motivates and inspires you to be an active member of the Aviators/Microsoft community?

The community continuously turns hard-won lessons into reusable practices.
Sharing patterns (and anti-patterns) saves others time and incidents, while learning from peers strengthens my own work. Microsoft's product teams also listen closely, and seeing customer feedback directly shape the platform is genuinely rewarding.

Looking back, what advice do you wish you had been given earlier that you'd now share with those looking to get into STEM/technology?

- Optimise for learning speed, not titles. Choose problems that stretch you and deliver in small, measurable increments.
- Master the fundamentals. Naming, idempotency, retries and observability are not glamorous but make systems dependable.
- Document everything. Diagrams, runbooks and ADRs multiply your impact.
- Understand trade-offs. Every decision buys something and costs something; acknowledge both sides clearly.
- Value collaboration over heroics. Ask questions, share knowledge and give credit freely.

What has helped you grow professionally?

- Reusable scaffolding: creating golden templates and reference repositories that capture best practice once and reuse it everywhere.
- Feedback loops: leveraging telemetry, post-incident reviews and peer critique to improve.
- Teaching and mentoring: explaining concepts to others brings clarity and strengthens leadership.
- Cross-disciplinary curiosity: combining architecture, DevOps, FinOps and AI to address problems holistically.

If you had a magic wand that could create a feature in Logic Apps, what would it be and why?

"Stateful Sessions and Decisions" as a first-class capability:

- Built-in session state across multiple workflows, durable correlation and resumable orchestrations without external storage.
- A native decisioning activity with versioned decision tables and rule auditing ("why this rule fired").
- A local-first developer experience with fast testing and contract validation for confident iteration.
This would simplify complex, human-in-the-loop and event-driven scenarios, reduce custom plumbing, and make advanced orchestration patterns accessible to a wider audience.

News from our product group

Logic Apps Community Day 2025
Did you miss, or want to catch up again on, your favorite Logic Apps Community Day videos? Jump back into the action in this four-hour learning session with 10 sessions from our community experts. And stay tuned for individual sessions being shared throughout the week.

Announcing Parse & Chunk with Metadata in Logic Apps: Build Context-Aware RAG Agents
New Parse & Chunk actions add metadata like page numbers and sentence completeness—perfect for context-aware document Q&A using Azure AI Search and Agent Loop.

Introducing the RabbitMQ Connector (Public Preview)
The new connector (Public Preview) lets you send and receive messages with RabbitMQ in Logic Apps Standard and Hybrid—ideal for scalable, reliable messaging across industries.

News from our community

EventGrid And Entra Auth In Logic Apps Standard
Post by Riccardo Viglianisi
Learn how to use Entra Auth for webhook authentication, ditch SAS tokens, and configure private endpoints with public access rules—perfect for secure, scalable integrations.

Debugging XSLT Made Easy in VS Code: .NET-Based Debugging for Logic Apps
Post by Daniel Jonathan
A new .NET-based extension brings real debugging to XSLT for Logic Apps. Set breakpoints, step through transformations, and inspect variables—making XSLT development clear and productive. This is the 3rd post in a 5-part series, so it's worth checking out the other posts too.

Modifying the Logic App Azure Workbook: Custom Views for Multi Workflow Monitoring
Post by Jeff Wessling
Learn how to tailor dashboards with KQL, multi-workflow views, and context panes—boosting visibility, troubleshooting speed, and operational efficiency across your integrations.
Azure AI Agents in Logic Apps: A Guide to Automate Decisions
Post by Imashi Kinigama
Discover how GPT-powered agents, created using Logic Apps Agent Loop, automate decisions, extract data, and adapt in real time. Build intelligent workflows with minimal effort—no hardcoding, just instructions and tools.

How to Turn Logic App Connectors into MCP Servers (Step-by-Step Guide)
Post by Stephen W. Thomas
Learn how to expose connectors like Google Drive or Salesforce as MCP endpoints using Azure API Center—giving AI agents secure, real-time access to 1,400+ services directly from VS Code.

Custom SAP MCP Server with Logic Apps
Post by Sebastian Meyer
Learn how to turn Logic Apps into AI-accessible tools using MCP. From workflow descriptions to Easy Auth setup and VS Code integration—this guide unlocks SAP automation with Copilot.

How Azure Logic Apps as MCP Servers Accelerate AI Agent Development
Post by Monisha S
Turn 1,400+ connectors into AI tools with Logic Apps Standard. Build agents fast, integrate with legacy systems, and scale intelligent workflows across your organization.

Designing Business Rules in Azure Logic Apps: When to Go Embedded vs External
Post by Al Ghoniem
Learn when to use Logic Apps' native Rules Engine or offload to Azure Functions with NRules or JSON RulesEngine. Discover hybrid patterns for scalable, testable decision automation.

Syncing SharePoint with Azure Blob Storage using Logic Apps & Azure Functions for Azure AI Search
Post by Daniel Jonathan
Solve folder delete issues by tagging blobs with SharePoint metadata. Use Logic Apps and a custom Azure Function to clean up orphaned files and keep Azure AI Search in sync.

Step-by-Step Guide: Building a Conversational Agent in Azure Logic Apps
Post by Stephen W. Thomas
Use Azure AI Foundry and Logic Apps Standard to create chatbots that shuffle cards, answer questions, and embed into websites—no code required, just smart workflows and EasyAuth.
You can hide sensitive data from the Logic App run history
Post by Francisco Leal
Learn how to protect sensitive data like authentication tokens, credentials, and personal information in Logic Apps so that it doesn't appear in the run history, where it could pose security and privacy risks.

Introducing langchain-azure-storage: Azure Storage integrations for LangChain
We're excited to introduce langchain-azure-storage, the first official Azure Storage integration package built by Microsoft for LangChain 1.0. As part of its launch, we've built a new Azure Blob Storage document loader (currently in public preview) that improves upon prior LangChain community implementations. This new loader unifies both blob and container level access, simplifying loader integration. More importantly, it offers enhanced security through default OAuth 2.0 authentication, supports reliably loading millions to billions of documents through efficient memory utilization, and allows pluggable parsing, so you can leverage other document loaders to parse specific file formats.

What are LangChain document loaders?

A typical Retrieval-Augmented Generation (RAG) pipeline follows these main steps:

1. Collect source content (PDFs, DOCX, Markdown, CSVs) — often stored in Azure Blob Storage.
2. Parse into text and associated metadata (i.e., represented as LangChain Document objects).
3. Chunk + embed those documents and store them in a vector store (e.g., Azure AI Search, Postgres pgvector, etc.).
4. At query time, retrieve the most relevant chunks and feed them to an LLM as grounded context.

LangChain document loaders make steps 1–2 turnkey and consistent so the rest of the stack (splitters, vector stores, retrievers) "just works". See this LangChain RAG tutorial for a full example of these steps when building a RAG application in LangChain.

How can the Azure Blob Storage document loader help?

The langchain-azure-storage package offers the AzureBlobStorageLoader, a document loader that simplifies retrieving documents stored in Azure Blob Storage for use in a LangChain RAG application. Key benefits of the AzureBlobStorageLoader include:

- Flexible loading of Azure Storage blobs to LangChain Document objects. You can load blobs as documents from an entire container, a specific prefix within a container, or by blob names.
Each document loaded corresponds 1:1 to a blob in the container.

- Lazy loading support for improved memory efficiency when dealing with large document sets. Documents can now be loaded one at a time as you iterate over them instead of all at once.
- Automatically uses DefaultAzureCredential to enable seamless OAuth 2.0 authentication across various environments, from local development to Azure-hosted services. You can also explicitly pass your own credential (e.g., ManagedIdentityCredential, SAS token).
- Pluggable parsing. Easily customize how documents are parsed by providing your own LangChain document loader to parse downloaded blob content.

Using the Azure Blob Storage document loader

Installation

To install the langchain-azure-storage package, run:

```
pip install langchain-azure-storage
```

Loading documents from a container

To load all blobs from an Azure Blob Storage container as LangChain Document objects, instantiate the AzureBlobStorageLoader with the Azure Storage account URL and container name:

```python
from langchain_azure_storage.document_loaders import AzureBlobStorageLoader

loader = AzureBlobStorageLoader(
    "https://<your-storage-account>.blob.core.windows.net/",
    "<your-container-name>"
)

# lazy_load() yields one Document per blob for all blobs in the container
for doc in loader.lazy_load():
    print(doc.metadata["source"])  # The "source" metadata contains the full URL of the blob
    print(doc.page_content)        # The page_content contains the blob's content decoded as UTF-8 text
```

Loading documents by blob names

To only load specific blobs as LangChain Document objects, you can additionally provide a list of blob names:

```python
from langchain_azure_storage.document_loaders import AzureBlobStorageLoader

loader = AzureBlobStorageLoader(
    "https://<your-storage-account>.blob.core.windows.net/",
    "<your-container-name>",
    ["<blob-name-1>", "<blob-name-2>"]
)

# lazy_load() yields one Document per blob for only the specified blobs
for doc in loader.lazy_load():
    print(doc.metadata["source"])  # The "source" metadata contains the full URL of the blob
    print(doc.page_content)        # The page_content contains the blob's content decoded as UTF-8 text
```

Pluggable parsing

By default, loaded Document objects contain the blob's UTF-8 decoded content. To parse non-UTF-8 content (e.g., PDFs, DOCX, etc.) or chunk blob content into smaller documents, provide a LangChain document loader via the loader_factory parameter. When loader_factory is provided, the AzureBlobStorageLoader processes each blob with the following steps:

1. Downloads the blob to a new temporary file
2. Passes the temporary file path to the loader_factory callable to instantiate a document loader
3. Uses that loader to parse the file and yield Document objects
4. Cleans up the temporary file

For example, below shows parsing PDF documents with the PyPDFLoader from the langchain-community package:

```python
from langchain_azure_storage.document_loaders import AzureBlobStorageLoader
from langchain_community.document_loaders import PyPDFLoader  # Requires langchain-community and pypdf packages

loader = AzureBlobStorageLoader(
    "https://<your-storage-account>.blob.core.windows.net/",
    "<your-container-name>",
    prefix="pdfs/",             # Only load blobs whose names start with "pdfs/"
    loader_factory=PyPDFLoader  # PyPDFLoader will parse each blob as a PDF
)

# Each blob is downloaded to a temporary file and parsed by a PyPDFLoader instance
for doc in loader.lazy_load():
    print(doc.page_content)  # Content parsed by PyPDFLoader (yields one Document per page in the PDF)
```

This file path-based interface allows you to use any LangChain document loader that accepts a local file path as input, giving you access to a wide range of parsers for different file formats.

Migrating from community document loaders to langchain-azure-storage

If you're currently using AzureBlobStorageContainerLoader or AzureBlobStorageFileLoader from the langchain-community package, the new AzureBlobStorageLoader provides an improved alternative.
This section provides step-by-step guidance for migrating to the new loader.

Steps to migrate

To migrate to the new Azure Storage document loader, make the following changes:

1. Depend on the langchain-azure-storage package.
2. Update import statements from langchain_community.document_loaders to langchain_azure_storage.document_loaders.
3. Change class names from AzureBlobStorageFileLoader and AzureBlobStorageContainerLoader to AzureBlobStorageLoader.
4. Update document loader constructor calls to:
   - Use an account URL instead of a connection string.
   - Specify UnstructuredLoader as the loader_factory to continue to use Unstructured for parsing documents.
5. Enable Microsoft Entra ID authentication in your environment (e.g., run az login or configure managed identity) instead of using connection string authentication.

Migration samples

Below are code snippets of what usage patterns look like before and after migrating from langchain-community to langchain-azure-storage.

Before migration:

```python
from langchain_community.document_loaders import (
    AzureBlobStorageContainerLoader,
    AzureBlobStorageFileLoader,
)

container_loader = AzureBlobStorageContainerLoader(
    "DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<account-key>;EndpointSuffix=core.windows.net",
    "<container>",
)

file_loader = AzureBlobStorageFileLoader(
    "DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<account-key>;EndpointSuffix=core.windows.net",
    "<container>",
    "<blob>"
)
```

After migration:

```python
from langchain_azure_storage.document_loaders import AzureBlobStorageLoader
from langchain_unstructured import UnstructuredLoader  # Requires langchain-unstructured and unstructured packages

container_loader = AzureBlobStorageLoader(
    "https://<account>.blob.core.windows.net",
    "<container>",
    loader_factory=UnstructuredLoader  # Only needed if continuing to use Unstructured for parsing
)

file_loader = AzureBlobStorageLoader(
    "https://<account>.blob.core.windows.net",
    "<container>",
    "<blob>",
    loader_factory=UnstructuredLoader  # Only needed if continuing to use Unstructured for parsing
)
```

What's next?

We're excited for you to try the new Azure Blob Storage document loader and would love to hear your feedback! Here are some ways you can help shape the future of langchain-azure-storage:

- Show support for interface stabilization - The document loader is currently in public preview and the interface may change in future versions based on feedback. If you'd like to see the current interface marked as stable, upvote the proposal PR to show your support.
- Report issues or suggest improvements - Found a bug or have an idea to make the document loaders better? File an issue on our GitHub repository.
- Propose new LangChain integrations - Interested in other ways to use Azure Storage with LangChain (e.g., checkpointing for agents, persistent memory stores, retriever implementations)? Create a feature request or write to us to let us know.

Your input is invaluable in making langchain-azure-storage better for the entire community!

Resources

- langchain-azure GitHub repository
- langchain-azure-storage PyPI package
- AzureBlobStorageLoader usage guide
- AzureBlobStorageLoader documentation reference