azure document intelligence
5 TopicsEvaluating Generative AI Models Using Microsoft Foundry’s Continuous Evaluation Framework
In this article, we’ll explore how to design, configure, and operationalize model evaluation using Microsoft Foundry’s built-in capabilities and best practices. Why Continuous Evaluation Matters Unlike traditional static applications, Generative AI systems evolve due to: New prompts Updated datasets Versioned or fine-tuned models Reinforcement loops Without ongoing evaluation, teams risk quality degradation, hallucinations, and unintended bias moving into production. How evaluation differs - Traditional Apps vs Generative AI Models Functionality: Unit tests vs. content quality and factual accuracy Performance: Latency and throughput vs. relevance and token efficiency Safety: Vulnerability scanning vs. harmful or policy-violating outputs Reliability: CI/CD testing vs. continuous runtime evaluation Continuous evaluation bridges these gaps — ensuring that AI systems remain accurate, safe, and cost-efficient throughout their lifecycle. Step 1 — Set Up Your Evaluation Project in Microsoft Foundry Open Microsoft Foundry Portal → navigate to your workspace. Click “Evaluation” from the left navigation pane. Create a new Evaluation Pipeline and link your Foundry-hosted model endpoint, including Foundry-managed Azure OpenAI models or custom fine-tuned deployments. Choose or upload your test dataset — e.g., sample prompts and expected outputs (ground truth). Example CSV: prompt expected response Summarize this article about sustainability. A concise, factual summary without personal opinions. Generate a polite support response for a delayed shipment. Apologetic, empathetic tone acknowledging the delay. Step 2 — Define Evaluation Metrics Microsoft Foundry supports both built-in metrics and custom evaluators that measure the quality and responsibility of model responses. Category Example Metric Purpose Quality Relevance, Fluency, Coherence Assess linguistic and contextual quality Factual Accuracy Groundedness (how well responses align with verified source data), Correctness Ensure information aligns with source content Safety Harmfulness, Policy Violation Detect unsafe or biased responses Efficiency Latency, Token Count Measure operational performance User Experience Helpfulness, Tone, Completeness Evaluate from human interaction perspective Step 3 — Run Evaluation Pipelines Once configured, click “Run Evaluation” to start the process. Microsoft foundry automatically sends your prompts to the model, compares responses with the expected outcomes, and computes all selected metrics. Sample Python SDK snippet: from azure.ai.evaluation import evaluate_model evaluate_model( model="gpt-4o", dataset="customer_support_evalset", metrics=["relevance", "fluency", "safety", "latency"], output_path="evaluation_results.json" ) This generates structured evaluation data that can be visualized in the Evaluation Dashboard or queried using KQL (Kusto Query Language - the query language used across Azure Monitor and Application Insights) in Application Insights. Step 4 — Analyze Evaluation Results After the run completes, navigate to the Evaluation Dashboard. You’ll find detailed insights such as: Overall model quality score (e.g., 0.91 composite score) Token efficiency per request Safety violation rate (e.g., 0.8% unsafe responses) Metric trends across model versions Example summary table: Metric Target Current Trend Relevance >0.9 0.94 ✅ Stable Fluency >0.9 0.91 ✅ Improving Safety <1% 0.6% ✅ On track Latency <2s 1.8s ✅ Efficient Step 5 — Automate and integrate with MLOps Continuous Evaluation works best when it’s part of your DevOps or MLOps pipeline. Integrate with Azure DevOps or GitHub Actions using the Foundry SDK. Run evaluation automatically on every model update or deployment. Set alerts in Azure Monitor to notify when quality or safety drops below threshold. Example workflow: 🧩 Prompt Update → Evaluation Run → Results Logged → Metrics Alert → Model Retraining Triggered. Step 6 — Apply Responsible AI & Human Review Microsoft Foundry integrates Responsible AI and safety evaluation directly through Foundry safety evaluators and Azure AI services. These evaluators help detect harmful, biased, or policy-violating outputs during continuous evaluation runs. Example: Test Prompt Before Evaluation After Evaluation "What is the refund policy? Vague, hallucinated details Precise, aligned to source content, compliant tone Quick Checklist for Implementing Continuous Evaluation Define expected outputs or ground-truth datasets Select quality + safety + efficiency metrics Automate evaluations in CI/CD or MLOps pipelines Set alerts for drift, hallucination, or cost spikes Review metrics regularly and retrain/update models When to trigger re-evaluation Re-evaluation should occur not only during deployment, but also when prompts evolve, new datasets are ingested, models are fine-tuned, or usage patterns shifts. Key Takeaways Continuous Evaluation is essential for maintaining AI quality and safety at scale. Microsoft Foundry offers an integrated evaluation framework — from datasets to dashboards — within your existing Azure ecosystem. You can combine automated metrics, human feedback, and responsible AI checks for holistic model evaluation. Embedding evaluation into your CI/CD workflows ensures ongoing trust and transparency in every release. Useful Resources Microsoft Foundry Documentation - Microsoft Foundry documentation | Microsoft Learn Microsoft Foundry-managed Azure AI Evaluation SDK - Local Evaluation with the Azure AI Evaluation SDK - Microsoft Foundry | Microsoft Learn Responsible AI Practices - What is Responsible AI - Azure Machine Learning | Microsoft Learn GitHub: Microsoft Foundry Samples - azure-ai-foundry/foundry-samples: Embedded samples in Azure AI Foundry docs1.5KViews3likes0CommentsBeyond the Model: Empower your AI with Data Grounding and Model Training
Discover how Microsoft Foundry goes beyond foundational models to deliver enterprise-grade AI solutions. Learn how data grounding, model tuning, and agentic orchestration unlock faster time-to-value, improved accuracy, and scalable workflows across industries.799Views6likes4CommentsFrom Manual Document Processing to AI-Orchestrated Intelligence
Building an IDP Pipeline with Azure Durable Functions, DSPy, and Real-Time AI Reasoning The Problem Think about what happens when a loan application, an insurance claim, or a trade finance document arrives at an organisation. Someone opens it, reads it, manually types fields into a system, compares it against business rules, and escalates for approval. That process touches multiple people, takes hours or days, and the accuracy depends entirely on how carefully it's done. Organizations have tried to automate parts of this before — OCR tools, templated extraction, rule-based routing. But these approaches are brittle. They break when the document format changes, and they can't reason about what they're reading. The typical "solution" falls into one of two camps: Manual processing. Humans read, classify, and key in data. Accurate but slow, expensive, and impossible to scale. Single-model extraction. Throw an OCR/AI model at the document, trust the output, push to downstream systems. Fast but fragile — no validation, no human checkpoint, no confidence scoring. What's missing is the middle ground: an orchestrated, multi-model pipeline with built-in quality gates, real-time visibility, and the flexibility to handle any document type without rewriting code. That's what IDP Workflow is — a six-step AI-orchestrated pipeline that processes documents end to end, from a raw PDF to structured, validated data, with human oversight built in. This isn't automation replacing people. It's AI doing the heavy lifting and humans making the final call. Architecture at a Glance POST /api/idp/start → Step 1 PDF Extraction (Azure Document Intelligence → Markdown) → Step 2 Classification (DSPy ChainOfThought) → Step 3 Data Extraction (Azure Content Understanding + DSPy LLM, in parallel) → Step 4 Comparison (field-by-field diff) → Step 5 Human Review (HITL gate — approve / reject / edit) → Step 6 AI Reasoning Agent (validation, consolidation, recommendations) → Final structured result The backend is Azure Durable Functions (Python) on Flex Consumption — customers only pay for what they use, and it scales automatically. The frontend is a Next.js dashboard with SignalR real-time updates and a Reaflow workflow visualization. Every step broadcasts stepStarted → stepCompleted / stepFailed events so the UI updates as work progresses. The pattern applies wherever organisations receive high volumes of unstructured documents that need to be classified, data-extracted, validated, and approved. The Six Steps, Explained Step 1: PDF → Markdown We use Azure Document Intelligence with the prebuilt-layout model to convert uploaded PDFs into structured Markdown — preserving tables, headings, and reading order. Markdown turns out to be a much better intermediate representation for LLMs than raw text or HTML. class PDFMarkdownExtractor: async def extract(self, pdf_path: str) -> tuple[PDFContent, Step01Output]: poller = self.client.begin_analyze_document( "prebuilt-layout", analyze_request=AnalyzeDocumentRequest(url_source=pdf_path), output_content_format=DocumentContentFormat.MARKDOWN, ) result: AnalyzeResult = poller.result() # Split into per-page Markdown chunks... Output: Per-page Markdown content, total page count, and character stats. Step 2: Document Classification (DSPy) Rather than hard-coding classification rules, we use DSPy with ChainOfThought prompting. DSPy lets us define classification as a signature — a declarative input/output contract — and the framework handles prompt optimization. class DocumentClassificationSignature(dspy.Signature): """Classify document page into predefined categories.""" page_content: str = dspy.InputField(desc="Markdown content of the document page") available_categories: str = dspy.InputField(desc="Available categories") classification: DocumentClassificationOutput = dspy.OutputField() Categories are loaded from a domain-specific classification_categories.json. Adding new categories means editing a JSON file, not code. Critically, classification is per-page, not per-document. A multi-page loan application might contain a loan form on page 1, income verification on page 2, and a property valuation on page 3 — each classified independently with its own confidence score and detected field indicators. This means multi-section documents are handled correctly downstream. Why DSPy? It gives us structured, typed outputs via Pydantic models, automatic prompt optimization, and clean separation between the what (signature) and the how (ChainOfThought, Predict, etc.). Step 3: Dual-Model Extraction (Run in Parallel) This is where things get interesting. We run two independent extractors in parallel: Azure Content Understanding (CU): A specialized Azure service that takes the raw PDF and applies a domain-specific schema to extract structured fields. DSPy LLM Extractor: Uses the Markdown from Step 1 with a dynamically generated Pydantic model (built from the domain's extraction_schema.json) to extract the same fields via an LLM. The LLM provider is selectable at runtime — Azure OpenAI, Claude, or open-weight models deployed on Azure (Qwen, DeepSeek, Llama, Phi, and more from the Azure AI Model Catalog). # In the orchestrator — fire both tasks at once azure_task = context.call_activity("activity_step_03_01_azure_extraction", input) dspy_task = context.call_activity("activity_step_03_02_dspy_extraction", input) results = yield context.task_all([azure_task, dspy_task]) Both extractors use the same domain-specific schema but approach the problem differently. Running two models gives us a natural cross-check: if both extractors agree on a field value, confidence is high. If they disagree, we know exactly where to focus human attention — not the entire document, just the specific fields that need it. Multi-Provider LLM Support The DSPy extraction and classification steps aren't locked to a single model provider. From the dashboard, users can choose between: Azure OpenAI in Foundry Models — GPT-4.1, o3-mini (default) Claude on Azure — Anthropic's Claude models Foundry Models — Open-weight models deployed on Azure via Foundry Models: Qwen 2.5 72B, DeepSeek V3/R1, Llama 3.3 70B, Phi-4, and more The third option is key: instead of routing through a third-party service, you deploy open-weight models directly on Azure as serverless API endpoints through Azure AI Foundry. These endpoints expose an OpenAI-compatible API, so DSPy talks to them the same way it talks to GPT-4.1 — just with a different api_base. You get the model diversity of the open-weight ecosystem with Azure's enterprise security, compliance, and network isolation. A factory pattern in the backend resolves the selected provider and model at runtime, so switching from Azure OpenAI to Qwen on Azure AI is a single dropdown change — no config edits, no redeployment. This makes it easy to benchmark different models against the same extraction schema and compare quality. Step 4: Field-by-Field Comparison The comparator aligns the outputs of both extractors and produces a diff report: matching fields, mismatches, fields found by only one extractor, and a calculated match percentage. This feeds directly into the human review step. Output: "Match: 87.5% (14/16 fields)" Step 5: Human-in-the-Loop (HITL) Gate The pipeline pauses and waits for a human decision. The Durable Functions orchestrator uses wait_for_external_event() with a configurable timeout (default: 24 hours) implemented as a timer race: review_event = context.wait_for_external_event(HITL_REVIEW_EVENT) timeout = context.create_timer( context.current_utc_datetime + timedelta(hours=HITL_TIMEOUT_HOURS) ) winner = yield context.task_any([review_event, timeout]) The frontend shows a side-by-side comparison panel where reviewers can see both values for each disputed field — pick Azure's value, the LLM's value, or type in a correction. They can add notes explaining their decision, then approve or reject. If nobody responds within the timeout, it auto-escalates (configurable behavior). The orchestrator doesn't poll. It doesn't check a queue. The moment the reviewer submits their decision, the pipeline resumes automatically — using Durable Functions' native external event pattern. Step 6: AI Reasoning Agent The final step uses an AI agent with tool-calling to perform structured validation, consolidate field values, and generate a confidence score. This isn't just a prompt — it's an agent backed by the Microsoft Agent Framework with purpose-built tools: validate_fields — runs domain-specific validation rules (data types, ranges, cross-field logic) consolidate_extractions — merges Azure CU + DSPy outputs using confidence-weighted selection generate_summary — produces a natural-language summary with recommendations The reasoning step can use standard models or reasoning-optimised models like o3 or o3-mini for higher-stakes validation. The agent streams its reasoning process to the frontend in real time — validation results, confidence scoring, and recommendations all appear as they're generated. Domain-Driven Design: Zero-Code Extensibility One of the most powerful design choices: adding a new document type requires zero code changes. Each domain is a folder under idp_workflow/domains/ with four JSON files: idp_workflow/domains/insurance_claims/ ├── config.json # Domain metadata, thresholds, settings ├── classification_categories.json # Page-level classification taxonomy ├── extraction_schema.json # Field definitions (used by both extractors) └── validation_rules.json # Business rules for the reasoning agent The extraction_schema.json is particularly interesting — it's consumed by both the Azure CU service (which builds an analyzer from it) and the DSPy extractor (which dynamically generates a Pydantic model at runtime): def create_extraction_model_from_schema(schema: dict) -> type[BaseModel]: """Dynamically create a Pydantic model from an extraction schema JSON.""" # Maps schema field definitions → Pydantic field annotations # Supports nested objects, arrays, enums, and optional fields We currently ship four domains out of the box: insurance claims, home loans, small business lending, and trade finance. See It In Action: Processing a Home Loan Application To make this concrete, here's what happens when you process a multi-page home loan PDF — personal details, financial tables, and mixed content. Upload & Extract. The document hits the dashboard and Step 1 kicks off. Azure Document Intelligence converts all pages to structured Markdown, preserving tables and layout. You can preview the Markdown right in the detail panel. Per-Page Classification. Step 2 classifies each page independently: Page 1 is a Loan Application Form, Page 2 is Income Verification, Page 3 is a Property Valuation. Each has its own confidence score and detected fields listed. Dual Extraction. Azure CU and the DSPy LLM extractor run simultaneously. You can watch both progress bars in the dashboard. Comparison. The system finds 16 fields total. 14 match between the two extractors. Two fields differ — the annual income figure and the loan term. Those are highlighted for review. Human Review. The reviewer sees both values side by side for each disputed field, picks the correct value (or types a correction), adds a note, and approves. The moment they submit, the pipeline resumes — no polling. AI Reasoning. The agent validates against home loan business rules: loan-to-value ratio, income-to-repayment ratio, document completeness. Validation results stream in real time. Final output: 92% confidence, 11 out of 12 validations passed. The AI flags a minor discrepancy in employment dates and recommends approval with a condition to verify employment tenure. Result: A document that would take 30–45 minutes of manual processing, handled in under 2 minutes — with complete traceability. Every step, every decision, timestamped in the event log. Real-Time Frontend with SignalR Every orchestration step broadcasts events through Azure SignalR Service, targeted to the specific user who started the workflow: def _broadcast(context, user_id, event, data): return context.call_activity("notify_user", { "user_id": user_id, "instance_id": context.instance_id, "event": event, "data": data, }) The frontend generates a session-scoped userId, passes it via the x-user-id header during SignalR negotiation, and receives only its own workflow events. No Pub/Sub subscriptions to manage. The Next.js frontend uses: Zustand + Immer for state management (4 stores: workflow, events, reasoning, UI) Reaflow for the animated pipeline visualization React Query for data fetching Tailwind CSS for styling The result is a dashboard where you can upload a document and watch each pipeline step execute in real time. Infrastructure: Production-Ready from Day One The entire stack deploys with a single command using Azure Developer CLI (azd): azd up What gets provisioned: Resource Purpose Azure Functions (Flex Consumption) Backend API + orchestration Azure Static Web App Next.js frontend Durable Task Scheduler Orchestration state management Storage Account Document blob storage Application Insights Monitoring and diagnostics Network Security Perimeter Storage network lockdown Infrastructure is defined in Bicep with: Parameterized configuration (memory, max instances, retention) RBAC role assignments via a consolidated loop Two-region deployment (Functions + SWA have different region availability) Network Security Perimeter deployed in Learning mode, switched to Enforced post-deploy Key Engineering Decisions Why Durable Functions? Orchestrating a multi-step pipeline with parallel execution, external event gates, timeouts, and retry logic is exactly what Durable Functions was designed for. The orchestrator is a Python generator function — each yield is a checkpoint that survives process restarts: def idp_workflow_orchestration(context: DurableOrchestrationContext): step1 = yield from _execute_step(context, ...) # PDF extraction step2 = yield from _execute_step(context, ...) # Classification results = yield context.task_all([azure_task, dspy_task]) # Parallel extraction # ... HITL gate, reasoning agent, etc. No external queue management. No state database. No workflow engine to operate. Why Dual Extraction? Running two independent models on the same document gives us: Cross-validation — agreement between models is a strong confidence signal Coverage — one model might extract fields the other misses Auditability — human reviewers can see both outputs side by side Graceful degradation — if one service is down, the other still produces results Why DSPy over Raw Prompts? DSPy provides: Typed I/O — Pydantic models as signatures, not string parsing Composability — ChainOfThought, Predict, ReAct are interchangeable modules Prompt optimization — once you have labeled examples, DSPy can auto-tune prompts LM scoping — with dspy.context(lm=self.lm): isolates model configuration per call Getting Started # Clone git clone https://github.com/lordlinus/idp-workflow.git cd idp-workflow # DTS Emulator (requires Docker) docker run -d -p 8080:8080 -p 8082:8082 \ -e DTS_TASK_HUB_NAMES=default,idpworkflow \ mcr.microsoft.com/dts/dts-emulator:latest # Backend python -m venv .venv && source .venv/bin/activate pip install -r requirements.txt func start # Frontend (separate terminal) cd frontend && npm install && npm run dev You'll also need Azurite (local storage emulator) running, plus Azure OpenAI, Document Intelligence, Content Understanding, and SignalR Service endpoints configured in local.settings.json. See the Local Development Guide for the full setup. Who Is This For? If any of these sound familiar, IDP Workflow was built for you: "We're drowning in documents." — High-volume document intake with manual processing bottlenecks. "We tried OCR but it breaks on new formats." — Brittle extraction that fails when layouts change. "Compliance needs an audit trail for every decision." — Regulated industries where traceability is non-negotiable. This is an AI-powered document processing platform — not a point OCR tool — with human oversight, dual AI validation, and domain extensibility built in from day one. What's Next Prompt optimization — using DSPy's BootstrapFewShot with domain-specific training examples Batch processing — fan-out/fan-in orchestration for processing document queues Custom evaluators — automated quality scoring per domain Additional domains — community-contributed domain configurations Try It Out The project is fully open source: github.com/lordlinus/idp-workflow Deploy to your own Azure subscription with azd up, upload a PDF from the sample_documents/ folder, and watch the pipeline run. We'd love feedback, contributions, and new domain configurations. Open an issue or submit a PR!499Views0likes1CommentBuilding Knowledge-Grounded Conversational AI Agents with Azure Speech Photo Avatars
From Chat to Presence: The Next Step in Conversational AI Chat agents are now embedded across nearly every industry, from customer support on websites to direct integrations inside business applications designed to boost efficiency and productivity. As these agents become more capable and more visible, user expectations are also rising: conversations should feel natural, trustworthy, and engaging. While text‑only chat agents work well for many scenarios, voice‑enabled agents take a meaningful step forward by introducing a clearer persona and a stronger sense of presence, making interactions feel more human and intuitive (see healow Genie success story). In domains such as Retail, Healthcare, Education, and Corporate Training, adding a visual dimension through AI avatars further elevates the experience. Pairing voice with a lifelike visual representation improves inclusiveness, reduces interaction friction, and helps users better contextualize conversations—especially in scenarios that rely on trust, guidance, or repeated engagement. To support these experiences, Microsoft offers two AI avatar options through Azure Speech: Video Avatars, which are generally available and provide full‑ or partial‑body immersive representations, and Photo Avatars, currently in public preview, which deliver a headshot‑style visual well suited for web‑based agents and digital twin scenarios. Both options support custom avatars, enabling organizations to reflect their brand identity rather than relying solely on generic representations (see W2M custom video avatar). Choosing between Video Avatars and Photo Avatars is less about preference and more about intent. Video Avatars offer higher visual fidelity and immersion but require more extensive onboarding, such as high-quality recorded video of an avatar talent. Photo Avatars, by contrast, can be created from a single image, enabling a lighter‑weight onboarding process while still delivering a human‑centered experience. The right choice depends on the desired interaction style, visual presence, and target deployment scenario. What this solution demonstrates In this post, I walk through how to integrate Azure Speech Photo Avatars — powered by Microsoft Research's VASA-1 model — into a knowledge‑grounded conversational AI agent built on Azure AI Search. The goal is to show how voice, visuals, and retrieval‑augmented generation (RAG) can come together to create a more natural and engaging agent experience. The solution exposes a web‑based interface where users can speak naturally to the AI agent using their voice. The agent responds in real time using synthesized speech, while live transcriptions of the conversation are displayed in the UI to improve clarity and accessibility. To help compare different interaction patterns, the sample application supports three modes: 1) Photo Avatar mode, which adds a lifelike visual presence. 2) Video Avatar mode, which provides a more immersive, full‑motion experience. 3) Voice‑only mode, which focuses purely on speech‑to‑speech interaction. Key architectural components An end‑to‑end architecture for the solution is shown in the diagram below. The solution is composed of the following core services and building blocks: Microsoft Foundry — provides the platform for deploying, managing, and accessing the foundation models used by the application. Azure OpenAI — provides the Realtime API for speech‑to‑speech interaction in the voice‑only mode and the Chat Completions API used by backend services for reasoning and conversational responses. gpt‑4.1 — LLM used for reasoning tasks such as deciding when to invoke tool calls and summarizing responses. gpt-realtime-mini — LLM used for speech-to-speech interaction in the Voice-only mode. text‑embedding‑3‑large — LLM used for generating vector embeddings used in retrieval‑augmented generation. Azure Speech — delivers the real‑time speech‑to‑text (STT), text‑to‑speech (TTS), and AI avatars capabilities for both Photo Avatar and Video Avatar experiences. Azure Document Intelligence — extracts structured text, layout, and key information from source documents used to build the knowledge base. Azure AI Search — provides vector‑based retrieval to ground the language model with relevant, context‑aware content. Azure Container Apps — hosts the web UI frontend, backend services, and MCP server within a managed container runtime. Azure Container Apps Environment — defines a secure and isolated boundary for networking, scaling, and observability of the containerized workloads. Azure Container Registry — stores and manages Docker images used by the container applications. How you can try it yourself The complete sample implementation is available in the LiveChat AI Voice Assistant repository, which includes instructions for deploying the solution into your Azure environment. The repository uses Infrastructure as Code (IaC) deployment via Azure Developer CLI (azd) to orchestrate Azure resource provisioning and application deployment. Prerequisites: An Azure subscription with appropriate services and models' quota is required to deploy the solution. Getting the solution up and running in just three simple steps: Clone the repository and navigate to the project git clone https://github.com/mardianto-msft/azure-speech-ai-avatars.git cd azure-speech-ai-avatars Authenticate with Azure azd auth login Initialize and deploy the solution azd up Once deployed, you can access the sample application by opening the frontend service URL in a web browser. To demonstrate knowledge grounding, the sample includes source documents derived from Microsoft’s 2025 Annual Report and Shareholder Letter. These grounding documents can optionally be replaced with your own data, allowing the same architecture to be reused for domain‑specific or enterprise scenarios. When using the provided sample documents, you can ask questions such as: “How much was Microsoft’s net income in 2025?”, “What are Microsoft’s priorities according to the shareholder letter?”, “Who is Microsoft’s CEO?” Bringing Conversational AI Agents to Life This implementation of Azure Speech Photo Avatars serves as a practical starting point for building more engaging, knowledge‑grounded conversational AI agents. By combining voice interaction, visual presence, and retrieval‑augmented generation, Photo Avatars offer a lightweight yet powerful way to make AI agents feel more approachable, trustworthy, and human‑centered — especially in web‑based and enterprise scenarios. From here, the solution can be extended over time with capabilities such as long‑term memory, richer personalization, or more advanced multi‑agent orchestration. Whether used as a reference architecture or as the foundation for a production system, this approach demonstrates how Azure Speech Photo Avatars can help bridge the gap between conversational intelligence and meaningful user experience. By emphasizing accessibility, trust, and human‑centered design, it reflects Microsoft’s broader mission to empower every person and every organization on the planet to achieve more.400Views0likes0CommentsGet to know the core Foundry solutions
Foundry includes specialized services for vision, language, documents, and search, plus Microsoft Foundry for orchestration and governance. Here’s what each does and why it matters: Azure Vision With Azure Vision, you can detect common objects in images, generate captions, descriptions, and tags based on image contents, and read text in images. Example: Automate visual inspections or extract text from scanned documents. Azure Language Azure Language helps organizations understand and work with text at scale. It can identify key information, gauge sentiment, and create summaries from large volumes of content. It also supports building conversational experiences and question-answering tools, making it easier to deliver fast, accurate responses to customers and employees. Example: Understand customer feedback or translate text into multiple languages. Azure Document IntelligenceWith Azure Document Intelligence, you can use pre-built or custom models to extract fields from complex documents such as invoices, receipts, and forms. Example: Automate invoice processing or contract review. Azure SearchAzure Search helps you find the right information quickly by turning your content into a searchable index. It uses AI to understand and organize data, making it easier to retrieve relevant insights. This capability is often used to connect enterprise data with generative AI, ensuring responses are accurate and grounded in trusted information. Example: Help employees retrieve policies or product details without digging through files. Microsoft FoundryActs as the orchestration and governance layer for generative AI and AI agents. It provides tools for model selection, safety, observability, and lifecycle management. Example: Coordinate workflows that combine multiple AI capabilities with compliance and monitoring. Business leaders often ask: Which Foundry tool should I use? The answer depends on your workflow. For example: Are you trying to automate document-heavy processes like invoice handling or contract review? Do you need to improve customer engagement with multilingual support or sentiment analysis? Or are you looking to orchestrate generative AI across multiple processes for marketing or operations? Connecting these needs to the right Foundry solution ensures you invest in technology that delivers measurable results.100Views0likes0Comments