artificial intelligence
392 Topics
Revolutionizing Document Intelligence: Scaling Construction Industries with AI-Driven Extraction
Introduction
Generative AI (GenAI) is poised to transform the construction industry by addressing chronic challenges such as low productivity, cost overruns, schedule delays, and labor shortages. By automating the analysis of drawings, specifications, contracts, and project documentation, GenAI can reduce manual effort, accelerate decision-making, and improve coordination across architects, engineers, contractors, and suppliers. Industry studies indicate that AI-powered workflows can increase productivity by 20–40% in planning, engineering, and administrative functions while reducing costly rework and errors. The result is faster project delivery, improved resource utilization, lower costs, and more predictable project outcomes.

A major opportunity for GenAI in construction lies in its ability to unlock the vast amount of information trapped within AutoCAD drawings, architectural plans, BIM models, specifications, and engineering documents. Today, project teams spend countless hours manually reviewing drawings, performing quantity takeoffs, identifying dependencies, and translating design intent into actionable work packages for downstream trades. GenAI can automate this process by extracting and interpreting dimensions, materials, quantities, assemblies, and building components directly from design artifacts, then intelligently distributing that information to foundation, framing, roofing, insulation, MEP, and finish teams. This creates a digital thread from design through execution, eliminating manual handoffs, reducing human error, and ensuring every stakeholder works from a single source of truth. The impact extends beyond productivity gains: GenAI enables more accurate material forecasting, streamlined procurement, reduced waste, faster response to design changes, fewer change orders, and greater confidence that the architect's vision is executed precisely in the field.
In an industry where margins are tight and inefficiencies are costly, GenAI has the potential to fundamentally redefine how construction projects are planned, coordinated, and delivered. This article specifically demonstrates how organizations can leverage Azure AI services, including Azure Content Understanding, Azure AI Foundry, Azure Blob Storage, and Azure OpenAI, to extract, understand, and operationalize information from construction drawings and project documentation. The solution illustrates how Azure's AI platform can transform unstructured design artifacts into actionable intelligence that improves productivity, reduces risk, accelerates procurement, and enables more efficient execution across the entire construction lifecycle. This transformation is now achievable through a hybrid AI architecture. By combining structured layout understanding models with Generative AI reasoning capabilities, organizations can build highly scalable, intelligent extraction systems that meet the rigorous safety and compliance standards of the construction sector.

The Evolution from GenAI Approach to Deterministic Precision
Starting with a Generative AI-driven approach to extract structured fields from documents is an effective initial strategy. It accelerates early-stage extraction without requiring large, labeled datasets, while also enabling the structured data collection needed to train deterministic models, which typically require thousands of annotated samples. This approach delivers immediate value by rapidly identifying relevant data patterns in documents and uncovering key factors that influence extraction accuracy, such as document quality, layout complexity, and multi-section ambiguity. At the same time, it naturally builds the dataset necessary to transition toward a more scalable and repeatable solution.
However, while powerful for contextual reasoning across document sections, Generative AI is inherently probabilistic and sensitive to input variability. For enterprise-grade reliability, precision, and repeatable structured document extraction, a complementary approach is required. The optimal solution is a hybrid model that combines the strengths of both:
- Azure Content Understanding provides precise, consistent field extraction with per-field confidence scores at scale.
- Azure OpenAI GPT-5.2 (generative) adds contextual reasoning, validates ambiguous fields, fills extraction gaps, and interprets complex multi-section relationships.
- AI Agent (bounded triage) handles exception cases with structured CORRECT/ACCEPT/ESCALATE decisions before human escalation.
Together, they form a superior system, delivering higher accuracy, reduced ambiguity, bounded AI cost, and stronger auditability in complex real-world conditions.

Note: AI cannot compensate for inconsistent input data. Standardized document schemas and operational discipline remain prerequisites for reliable automation.

Solution Components and Architecture
The solution follows a modular, event-driven architecture that combines deterministic document understanding and Generative AI to enable scalable, intelligent extraction workflows. At a high level, documents are ingested, deduplicated, processed through Azure Content Understanding for primary extraction, enhanced with GPT-5.2 for gap-fill verification, validated against business rules, and routed through a confidence-based decision system before persistence. The code repository for the solution can be found here

Conceptual Architecture
Azure Architecture

The pipeline execution follows this flow: a document is uploaded to Azure Blob Storage, triggering the orchestrator. The pipeline checks for duplicates via SHA-256 hash against Cosmos DB.
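As a rough illustration of two steps named in this section, the hash-based duplicate check and the selection of gap fields for the generative verifier, here is a minimal Python sketch. It is not taken from the solution's repository. The function names, the shape of the `extracted` dictionary, and the in-memory `seen_hashes` set (a stand-in for the Cosmos DB lookup) are assumptions made for illustration; the 0.70 cutoff is the threshold the article gives for low-confidence fields.

```python
import hashlib

# Cutoff the article cites for low-confidence fields.
GAP_CONFIDENCE_THRESHOLD = 0.70


def content_hash(data: bytes) -> str:
    """Return the SHA-256 hex digest used as the deduplication key."""
    return hashlib.sha256(data).hexdigest()


def is_duplicate(data: bytes, seen_hashes: set) -> bool:
    """Report whether this content was seen before; record it if it is new.

    `seen_hashes` is an in-memory stand-in (assumption) for the Cosmos DB lookup.
    """
    digest = content_hash(data)
    if digest in seen_hashes:
        return True
    seen_hashes.add(digest)
    return False


def find_gap_fields(extracted: dict, expected_fields: list,
                    threshold: float = GAP_CONFIDENCE_THRESHOLD) -> list:
    """Return expected fields that are missing, empty, or below the threshold.

    `extracted` is assumed to look like {name: {"value": ..., "confidence": float}}.
    """
    gaps = []
    for name in expected_fields:
        field = extracted.get(name)
        if field is None or field.get("value") in (None, ""):
            gaps.append(name)  # missing or empty field
        elif field.get("confidence", 0.0) < threshold:
            gaps.append(name)  # present but low confidence
    return gaps
```

Only the names returned by `find_gap_fields` would be passed on to the generative model, which is what keeps the LLM share of the work bounded.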
New documents are submitted to Azure Content Understanding, which returns structured fields with per-field confidence scores. The AI Schema Mapper then identifies gaps (fields that are missing or have confidence below 0.70) and sends only those to GPT-5.2 for verification. Results are normalized, validated against cross-field business rules, and routed based on aggregate confidence. Throughout the pipeline, built-in feedback loops (quality filtering, validation checks, and confidence gates) ensure that only high-confidence results are persisted automatically, enabling a reliable and production-ready extraction system.

- Azure Blob Storage: Primary storage for source PDFs and extraction artifacts. Standard_LRS, Hot tier, HTTPS-only with SAS-secured access for Content Understanding.
- Azure Content Understanding: Primary deterministic extractor with a custom analyzer supporting 100+ configurable fields. Returns per-field confidence scores (0.0–1.0) plus raw markdown text. Non-LLM, repeatable, and auditable.
- Azure AI Foundry / OpenAI (GPT-5.2): Bounded gap-fill verifier invoked only for missing or low-confidence fields (typically 10–20% of total). Temperature 0.0, JSON response format enforced, schema-aware prompting with domain rules.
- Azure Cosmos DB (Serverless): Document persistence with SHA-256 deduplication, version increment on re-processing, and partition-by-document-type for efficient querying. Pay-per-request scales from zero.
- Azure Service Bus (Basic): Event-driven queue integration with `document-processing` and `human-review` queues for processing triggers and escalation routing.
- Application Insights + OpenTelemetry: End-to-end observability with per-stage telemetry events, custom metrics (fill_rate, record_confidence, extraction_duration_ms), and distributed tracing.

Cost Impact of Hybrid Approach

| Metric | CU-Only | GPT-Only | Hybrid (This Architecture) |
|---|---|---|---|
| Cost per document | ~$0.01 | $0.15–0.30 | $0.03–0.05 |
| Determinism | 100% | Variable | 95%+ |
| Accuracy | 75–80% | 80–90% | 90–95% |
| Auditability | Full | Limited | Per-field source attribution |

Cost savings: 60–80% reduction compared to GPT-only by limiting LLM calls to gap fields.

Security and Enterprise Considerations

Azure Blob Storage: Storage accounts can be secured by minimizing public exposure, enforcing strong identity-based access, protecting data, and continuously monitoring for threats. Organizations should use Private Endpoints and disable public network access wherever possible, authenticate users and applications with Microsoft Entra ID instead of shared keys, and apply least-privilege Azure RBAC with managed identities. Data should be encrypted in transit (TLS 1.2+) and at rest using Microsoft-managed or customer-managed keys stored in Azure Key Vault, while Microsoft Defender for Storage, logging, soft delete, backups, and Azure Policy should be enabled to detect threats, support recovery, and enforce compliance at scale. Content Safety can be called from the application layer to block uploads based on image content. Staging containers can be used to isolate untrusted uploads. Content Safety provides signals; your app enforces policy.

Azure Content Understanding / AI Vision: Azure AI services support enterprise-grade security through Microsoft Entra ID-based authentication and Azure RBAC, ensuring only authorized applications can access extraction models. Network isolation can be enforced using Virtual Network (VNet) integration and Private Link to restrict public internet exposure. All data transmitted is encrypted in transit and at rest.
Microsoft Defender for Cloud provides continuous security posture visibility across these AI workloads.

Azure OpenAI: Govern which models are approved for use and protect model artifacts and training data from unauthorized access through strong identity, network, encryption, and logging controls. AI applications should be designed with layered defenses, including multi-stage content filtering, safety meta-prompts, and least-privilege permissions for agents and plugins to reduce the risk of prompt injection, data leakage, and unintended actions. High-risk AI operations should include human-in-the-loop review to prevent autonomous execution of harmful or incorrect outcomes. Organizations must continuously monitor AI systems for misuse, anomalous behavior, and data exfiltration, and they should perform ongoing AI red teaming to identify vulnerabilities such as jailbreaking, adversarial inputs, and model manipulation before they can be exploited.

Azure Cosmos DB: Azure Cosmos DB enhances network security by supporting access restrictions via Virtual Network (VNet) integration and secure access through Private Link. Data protection is reinforced by integration with Microsoft Purview, which helps classify and label sensitive data, and by Defender for Cosmos DB, which detects threats and exfiltration attempts. Cosmos DB ensures all data is encrypted in transit using TLS 1.2+ (mandatory) and at rest using Microsoft-managed or customer-managed keys (CMKs).

Azure Functions / Compute: Secured with Entra ID authentication and managed identities, least-privilege RBAC, HTTPS-only access, private endpoints, VNet integration, and Key Vault for secrets. Hardened with Azure Policy, Defender for Cloud, and centralized logging.

Microsoft Foundry: Microsoft Foundry supports robust identity management using Azure Role-Based Access Control (RBAC) to assign roles within Microsoft Entra ID, and it supports Managed Identities for secure resource access.
Conditional Access policies allow organizations to enforce access based on location, device, and risk level. For network security, Azure AI Foundry supports Private Link, Managed Network Isolation, and Network Security Groups (NSGs) to restrict resource access. Data is encrypted in transit and at rest using Microsoft-managed keys or optional Customer-Managed Keys (CMKs). Azure Policy enables auditing and enforcing configurations for all resources deployed in the environment. Additionally, Microsoft Entra Agent ID extends identity management and access capabilities to AI agents. AI agents created within Microsoft Foundry are automatically assigned identities in a Microsoft Entra directory, centralizing agent and user management in one solution. AI Security Posture Management can be used to assess the security posture of AI workloads. Defender for AI Services provides threat protection and insights for your AI resources. Purview APIs enable Azure AI Foundry and developers to integrate data security and compliance controls into custom AI apps and agents. This includes enforcing policies based on how users interact with sensitive information in AI applications. Purview Sensitive Information Types can be used to detect sensitive data in user prompts and responses when interacting with AI applications.

DevOps Security: Security is further “shifted left” by integrating automated controls directly into CI/CD pipelines. GitHub Advanced Security for Azure DevOps provides dependency scanning, CodeQL-based static application security testing (SAST), and secret scanning to identify vulnerabilities and exposed credentials in code and third-party libraries. Infrastructure-as-code templates can be validated with Azure Policy and Microsoft Defender for Cloud, while pipeline protections such as protected branches and approvals reduce the risk of unauthorized changes.
DevOps environments can be hardened using Azure Key Vault for secrets management, Managed Identities and Microsoft Entra ID for least-privilege access, and monitoring through Azure Monitor. Microsoft Defender for Cloud DevOps Security provides centralized code-to-cloud visibility across Azure DevOps, GitHub, and GitLab, identifying risks in code, secrets, dependencies, and IaC and helping teams prioritize fixes early in CI/CD pipelines.

Related and Future Scenarios
Although document extraction serves as the initial use case, this architecture establishes a scalable pattern for many applications:
- Insurance Claims Processing: Swap schema to claim fields; update CU analyzer for claim forms
- Legal Contract Analysis: Schema for clauses, parties, dates; add NER in normalization
- Healthcare Medical Records: HIPAA-compliant Cosmos; schema for diagnoses, medications, vitals
- Financial Document Processing: Schema for transactions, accounts; add currency normalization
- Engineering/Construction Plans: Schema for dimensions, materials, specifications
- Digital Twin Integration: Feed extracted data into asset models for real-time facility visualization
- Predictive Analytics: Track extracted values over time for trend detection and forecasting

Conclusion
Modernizing document extraction is not simply about applying AI; it requires aligning technology, operational discipline, and data quality. Early exploration using Generative AI enabled rapid learning and feasibility validation. However, a production-grade solution must be built on structured layout understanding models supported by standardized schema definitions and operational controls. By combining primary structured extraction with Generative AI reasoning for bounded gap-fill verification, organizations can achieve scalable, repeatable, and auditable extraction processes. This hybrid approach enables reduced manual effort, lower error rates, and the transition from batch manual processing to intelligent, automated workflows.
The result is not just an automated extraction tool, but a scalable AI architecture for modern document intelligence, adaptable to any industry, any document type, and any structured data need.

Contributors: This article is maintained by Microsoft. It was originally written by the following contributors.
- Gaurav Bhardwaj | Senior Cloud Solution Architect – US Customer Success
- Manasa Ramalinga | Senior Principal Cloud Solution Architect – US Customer Success
- Abed Sau | Principal Cloud Solution Architect – US Customer Success

Empowering the AI Generation: Microsoft's Open-Source Initiative
In a world increasingly driven by open collaboration and community-driven innovation, Microsoft has undergone a remarkable transformation. The tech giant is on a mission to provide students, startups, AI developers, and entrepreneurs with the tools and resources they need to build groundbreaking solutions. Embracing open source is at the heart of this journey.

Now in Foundry: Command A+ (W4A4), Chandra OCR 2, and GLM-OCR
We are seeing two distinct trends this week. The first is around how low-bit quantization has developed to the point where large reasoning models can fit on a single accelerator with less quality loss. Second, a new wave of OCR-specialized vision-language models are redefining the accuracy-throughput frontier for document understanding. This week we are highlighting three Hugging Face models in Microsoft Foundry: Cohere Labs' Command A+ (W4A4), a 218B-parameter Sparse Mixture-of-Experts (MoE) reasoning model optimized for agentic, multilingual, and reasoning-heavy tasks; Datalab's Chandra OCR 2, a 5.3B vision-language model that converts images and PDFs to markdown, HTML, and JSON while preserving layout, with state-of-the-art results on the olmOCR benchmark and 90+ language coverage; and Z.ai's GLM-OCR, a 0.9B compact OCR model—roughly 6× smaller than Chandra OCR 2—built on the GLM-V encoder–decoder architecture that ranks first on OmniDocBench V1.5 while serving at high concurrency. Models of the week Cohere Labs: Command A+ (W4A4) Model Specs Parameters / size: 218B total, 25B active per token Context length: 128K input, 64K output Primary task: Text generation with vision input, reasoning, and tool use Why it's interesting Efficient, low compute deployment: Command A+ is designed to run on relatively minimal hardware for its size while maintaining high performance. It achieves this through advanced quantization and optimization techniques that reduce compute, latency, and cost. However, reasoning models are especially sensitive to quantization, as errors can accumulate over long decoding sequences. To mitigate this, the quantized student model is post-trained against the full-precision teacher’s output distribution, using fake quantization in the forward pass and straight-through estimators during backpropagation. CohereLabs recommends the W4A4 quantization for its strong balance of speed and latency. 
Multilingual, multimodal, and reasoning-focused performance gains: Command A+ extends to 48 languages (previously 23) and is built for complex reasoning and multimodal tasks, with measurable improvements across document understanding, math reasoning, and enterprise QA workflows.

Try it
Test this prompt in the CohereLabs Hugging Face Space before deploying the model in Foundry:

Sample prompt: You are Command, a legal AI for multinational contract review with access to CONTRACT_VAULT_QUERY and POLICY_TEMPLATE_RETRIEVAL tools. Analyze the input clause by first detecting language and classifying obligation type, then use CONTRACT_VAULT to find comparable {jurisdiction} clauses and retrieve the relevant policy template. Output structured JSON with obligation classification, comparative findings, risk assessment, and English recommendations with exact document citations. Include confidence scores, similarity metrics, and a reasoning trace showing each analysis step. Handle Polish/Japanese legal terminology accurately, preserve legal precision, and ensure all citations reference actual source documents. Use chain-of-thought reasoning, stay within 128K tokens, and never hallucinate references; state limitations explicitly when tools fail.

Datalab: Chandra OCR 2

Model Specs
Parameters / size: 5.3B
Output formats: Markdown, HTML, and JSON
Primary task: Document OCR (image-text-to-text)

Why it's interesting
State-of-the-art on the olmOCR benchmark: Chandra OCR 2 received an 85.9% score on the olmOCR benchmark and a 77.8% multilingual benchmark score (a 12% improvement over Chandra 1).
Support for 90 world languages: Indic scripts, European languages, and languages that read right to left saw substantial improvements based on Datalab's internal benchmarking.
View the full list of languages and the benchmark results here: Chandra 2 Language List Better complex layout understanding: Handles multi-level tables, nested structures, forms, math, and mixed handwriting with structured outputs (HTML/JSON/Markdown + bounding boxes), removing the need for post-OCR layout reconstruction. Take a look here: Try it Build an automated compliance intake pipeline using Chandra OCR 2 for structured extraction across complex, handwritten and form-based documents. In this scenario, you’re supporting a state election commission processing large volumes of candidate filings submitted as scanned forms or mobile-captured images. These documents often include mixed handwriting quality, checkbox selections, signatures, and structured fields that must be validated for compliance. Chandra OCR 2 can extract both printed and handwritten fields, identify form structure, and capture key elements such as candidate information, filing details, checkbox states, and signed declarations in a consistent JSON format. This structured output can then be passed into a compliance workflow to validate completeness, detect inconsistencies, and flag filings that require manual review. This approach helps streamline high-volume intake while improving accuracy and reducing manual processing across complex document types. Sample prompt: Extract all fields from this filing and return a structured JSON output including form type, candidate name, office sought, district, committee name, treasurer, filing date, checkbox states, and a transcription of the signed declaration. Include bounding boxes for each extracted field. 
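The downstream compliance step described in this scenario (validate completeness, detect inconsistencies, flag filings for manual review) might look roughly like the sketch below. It is only an illustration: the JSON key names, the `REQUIRED_FIELDS` tuple, and the `review_filing` helper are assumptions derived from the fields listed in the sample prompt, not part of Chandra OCR 2 or Foundry, and a real workflow would encode the commission's own rules.

```python
from datetime import date

# Hypothetical key names, mirroring the fields requested in the sample prompt.
REQUIRED_FIELDS = (
    "form_type", "candidate_name", "office_sought", "district",
    "committee_name", "treasurer", "filing_date", "signed_declaration",
)


def review_filing(filing: dict) -> dict:
    """Check an extracted filing for completeness and basic consistency."""
    # A field counts as missing when it is absent, None, or blank.
    missing = [
        name for name in REQUIRED_FIELDS
        if not str(filing.get(name) or "").strip()
    ]

    issues = []
    raw_date = filing.get("filing_date")
    if raw_date:
        try:
            date.fromisoformat(str(raw_date))  # expects YYYY-MM-DD
        except ValueError:
            issues.append(f"filing_date is not an ISO date: {raw_date!r}")
    if not isinstance(filing.get("checkbox_states"), dict):
        issues.append("checkbox_states is missing or not an object")

    needs_review = bool(missing or issues)
    return {
        "status": "manual_review" if needs_review else "complete",
        "missing": missing,
        "issues": issues,
    }
```

A filing that comes back with status `manual_review` carries the list of missing fields and issues, so a reviewer can see why it was flagged.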
Z.ai: GLM-OCR Model Specs Parameters / size: 0.9B Languages: Chinese, English, French, Spanish, Russian, German, Japanese, Korean Primary task: Document OCR (image-text-to-text) Why it's interesting High accuracy at a compact scale: GLM-OCR achieves a score of 94.62 on OmniDocBench V1.5, showing strong performance on tasks such as formula recognition, table extraction, and document parsing—even at sub-1B scale Designed for structured document understanding: The model performs well across complex document layouts, enabling extraction of tables, forms, and mixed text-image content Optimized training for consistency across tasks: Uses Multi-Token Prediction (MTP) and full-task reinforcement learning to improve stability and accuracy across diverse document types Efficient for real-world deployment: Its smaller footprint makes it well suited for scalable OCR pipelines where cost, latency, and throughput matter Try it Build a high-throughput document ingestion pipeline using GLM-OCR for structured extraction across diverse document types. Imagine you are operating a customer onboarding platform that processes identity documents, invoices, and proof-of-income statements across multiple languages. GLM-OCR can be used to extract key fields—such as names, ID numbers, dates, and addresses—and output them in a consistent structured format for downstream systems. The model’s compact footprint makes it well suited for scaling high-volume OCR workflows, enabling you to process large batches of documents efficiently while maintaining accuracy across layouts like tables, forms, and mixed text-image content. Sample prompt: Extract the following fields from this document and return a structured JSON output: full name, ID number, date of birth, address, document type, and expiration date. Ensure all fields match the document exactly, including formatting. 
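For the high-throughput ingestion scenario above, one common pattern is to cap the number of OCR requests in flight at once. The sketch below uses Python's `asyncio` for that. `fake_ocr` is a hypothetical stub standing in for a request to a deployed GLM-OCR endpoint, and `process_batch` and the default `max_concurrency` value are assumptions for illustration rather than a documented Foundry client API.

```python
import asyncio


async def fake_ocr(document_id: str) -> dict:
    """Hypothetical stand-in for a request to a deployed OCR endpoint."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return {"document_id": document_id, "fields": {}}


async def process_batch(document_ids, ocr_call=fake_ocr, max_concurrency: int = 8):
    """Run OCR over a batch while limiting the number of concurrent calls."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(doc_id: str) -> dict:
        async with semaphore:  # at most max_concurrency calls run at once
            try:
                return await ocr_call(doc_id)
            except Exception as exc:
                # Record the failure instead of aborting the whole batch.
                return {"document_id": doc_id, "error": str(exc)}

    # gather returns results in the same order as the input ids.
    return await asyncio.gather(*(worker(d) for d in document_ids))


if __name__ == "__main__":
    print(asyncio.run(process_batch(["id-001", "inv-002", "pay-003"], max_concurrency=2)))
```

Capturing per-document errors keeps one bad scan from failing the batch; the failed entries can then be retried or sent for review.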
Getting started
Whether you are coming straight from the Hugging Face hub or are already in Microsoft Foundry, deploying new open models is getting simpler. You can deploy models on Foundry by browsing the Hugging Face collection in the model catalog, or you can choose "Deploy on Microsoft Foundry" on the Hugging Face website, which brings you straight into Foundry with secure, scalable inference already configured.

Read the documentation to learn more:
- Read Hugging Face on Azure docs
- Learn about one-click deployments from the Hugging Face Hub on Microsoft Foundry
- Explore models in Microsoft Foundry

Foundry IQ: Improve recall by up to 54% with knowledge bases
Foundry IQ (Azure AI Search) has improved its agentic retrieval engine, resulting in better answer quality and improved token cost savings. We compared standalone retrieval tools to knowledge bases using the challenging BrowseComp-Plus benchmark and found:
- Replacing single-shot RAG with a knowledge base improves evidence recall by up to 46%.
- Combining a smaller agent model with agentic retrieval improves evidence recall by up to 54% while controlling costs and increasing agent responsiveness.
In both cases, the number of retrieval tool calls your agent makes is reduced, resulting in 34% token cost savings.

Resource Guide: Making Physical AI Practical for Real-World Industrial Operations
Microsoft's adaptive cloud approach enables organizations to turn operational technology (OT) data into intelligent actions autonomously, without requiring everything to live in the cloud, by unifying the cloud-to-edge management plane, data plane, and intelligence platform. At the center of this approach are key foundational technologies:

| Key Purpose | Offering |
|---|---|
| Direct-to-cloud device management + telemetry ingestion | Azure IoT Hub |
| Industrial connectivity + edge data plane | Azure IoT Operations |
| Unified analytics + real-time intelligence | Microsoft Fabric |
| On-device AI inferencing runtime | Foundry Local |

Microsoft Azure IoT Gartner winner: Microsoft named a Leader in the 2025 Gartner® Magic Quadrant™ for Global Industrial IoT Platforms

See it all come together
Before diving into each component, watch this end-to-end demo showing how Azure IoT Operations, Azure IoT Hub, Microsoft Fabric, and Foundry Local work as one stack across the edge-to-cloud lifecycle: Making industrial AI practical for real-world operations with adaptive cloud.

How these components work together
Azure IoT Operations and Azure IoT Hub collect real-time data from operational assets and send semantically-ready, modeled data to Microsoft Fabric, where it's contextualized with enterprise data for downstream analytics. Microsoft Foundry extends to the edge through Foundry Local, so the same tooling used to deploy and manage AI models in the cloud applies to edge use cases. All of it integrates into Azure Resource Manager, bringing OT devices, assets, and edge AI models into the same management and security paradigm as every other Azure-managed resource.

This blog walks through where to get started with each product capability:

1.
Manage Cloud-Connected Devices and Telemetry with Azure IoT Hub Azure IoT Hub is a fully managed cloud service that enables secure bidirectional communication, device-to-cloud telemetry ingestion, cloud-to-device command execution, per-device authentication, remote management and more. Telemetry from IoT Hub can also be routed downstream into analytics platforms like Microsoft Fabric for visualization or AI modeling. Recommended Usage: Devices that utilize IoT Hub are distributed, stand-alone devices with fixed-functions. These devices typically do not require cloud-managed containerized workloads or cloud-managed proximal industrial protocol connectivity. Examples of appropriate device-to-cloud IoT Hub endpoint devices include water monitoring stations, vehicle telematics, distributed fluid level sensors, etc. Resources Current in-market services overview: IoT Hub: What is Azure IoT Hub? - Azure IoT Hub DPS: Overview of Azure IoT Hub Device Provisioning Service - Azure IoT Hub Device Provisioning Service ADU: Introduction to Device Update for Azure IoT Hub Building scalable solutions with Azure IoT platform: Best practices for large-scale IoT deployments - Azure IoT Hub Device Provisioning Service Scale Out an Azure IoT Hub-based Solution to Support Millions of Devices - Azure Architecture Center Azure IoT Hub scaling Try out our preview of new IoT Hub capabilities (integration with Azure Device Registry and Certificate Management) Learn more about these capabilities on our blog post: Azure IoT Hub + Azure Device Registry (Preview Refresh): Device Trust and Management at Fleet Scale… Integration with Azure Device Registry (preview): Integration with Azure Device Registry (preview) - Azure IoT Hub Microsoft-backed X.509 certificate management (preview): What is Microsoft-backed X.509 Certificate Management (Preview)? - Azure IoT Hub How to start with the preview: Deploy IoT Hub with ADR integration and certificate management (Preview) - Azure IoT Hub 2. 
Connect Industrial Assets with Azure IoT Operations Azure IoT Operations provides a unified data plane for the edge that runs on Azure Arc–enabled Kubernetes clusters and supports open industrial standards. It allows organizations to connect and capture equipment telemetry, normalize OT data locally, route hot-path signals to real-time analytics, securely manage layered industrial networks, and more. Edge‑processed data can then be sent upstream to Microsoft Fabric for AI‑driven analysis. Recommended Usage: Azure IoT Operations is intended to be the data plane for an adaptive cloud deployment extending the management, data, and AI capabilities of the Microsoft cloud to an on-prem device. This device binds to these cloud planes providing a platform for local data processing and intermittent connectivity. The target for these devices range from a small-gateway-style PC to a full data center. Azure IoT Operations endpoints enable cloud-managed containerized workloads and cloud-managed proximal industrial protocol connectivity. Examples of appropriate adaptive cloud and Azure IoT Operations endpoints include, on-robot computers, industrial machine controllers, retail store sensor/vision processing, and top-of-factory site infrastructure for line of business applications. 
Resources Azure IoT Operations Overview Azure IoT Operations Documentation Hub Quickstart: explore-iot-operations/quickstart at main · Azure-Samples/explore-iot-operations Open-source framework for scaling robotics from simulation to production on Azure + NVIDIA: microsoft/physical-ai-toolchain Demo video showcasing this in action: Making industrial AI practical for real-world operations with adaptive cloud How we built the demo: explore-iot-operations/quickstart at main · Azure-Samples/explore-iot-operations Edge-AI: microsoft/edge-ai: Production-ready Infrastructure as Code, applications, pluggable components, and… Latest Announcements & Blogs Making Physical AI Practical for Real-World Industrial Operations: Part 1 | Microsoft Community Hub Making Physical AI Practical for Real-World Industrial Operations: Part 2 | Microsoft Community Hub Unlock Industrial Intelligence | Microsoft Hannover Messe 2026 From pilots to production: How Microsoft and partners are accelerating intelligent operations 3. Advanced Analytics with Microsoft Fabric Microsoft Fabric delivers a unified, end‑to‑end analytics platform that transforms streaming OT telemetry into real‑time insights and live dashboards. Fabric Operations Agents monitor industrial signals to recommend targeted actions, while Fabric IQ provides a shared semantic foundation that enables AI agents to reason over enterprise data with business context. Together, Fabric turns live industrial data into AI‑powered operational intelligence. 
Resources
- Get Started with Microsoft Fabric Learning Path
- Fabric Real-Time Intelligence documentation - Microsoft Fabric | Microsoft Learn
- Create and Configure Operations Agents - Microsoft Fabric | Microsoft Learn
- Fabric IQ documentation - Microsoft Fabric | Microsoft Learn

4. Run AI Models On-Device with Foundry Local
Foundry Local extends on-device AI to Arc-enabled Kubernetes edge clusters, providing a Microsoft-validated inferencing layer for running AI models in industrial, disconnected, or sovereign environments.

Resources
- Foundry Local on Azure Local Documentation
- Participate in Foundry Local on Azure Local preview form
- Foundry Local on Azure Local: HELM deployment Demo

Customer Stories
- Chevron: Chevron plans facilities of the future with Azure IoT Operations
- Husqvarna: Husqvarna Group Boosts Operational Efficiency with Azure Adaptive Cloud
- Ecopetrol: Azure IoT Operations and Azure IoT for energy help Ecopetrol optimize energy distribution while lowering operational costs
- P&G: Procter & Gamble cuts model deployment time up to 90% with Azure IoT Operations
- Toyota: Toyota Industries innovates its paint shop processes with Azure industrial AI and Azure IoT Hub

Foundry IQ: New governance and enterprise AI security capabilities
Enterprise AI isn’t just about better retrieval; it’s about secure access to business‑critical content. Discover how Foundry IQ (Azure AI Search) enables governance, compliance, and private connectivity across agentic retrieval workflows. We are introducing the following features:
- Incremental SharePoint permissions sync for indexed document content, SharePoint Lists and ASPX pages
- Purview sensitivity labels in Foundry IQ knowledge bases
- Purview auditing for elevated admin queries
- Private connectivity support between Foundry IQ and Foundry resources via NSP

Responsible Synthetic Data Creation for Fine-Tuning with RAFT Distillation
This blog will explore the process of crafting responsible synthetic data, evaluating it, and using it for fine-tuning models. We’ll also dive into Azure AI’s RAFT distillation recipe, a novel approach to generating synthetic datasets using Meta’s Llama 3.1 model and UC Berkeley’s Gorilla project.

Cloud Native Platforms: Evolve
Audience: Engineering leaders, platform architects, senior developers exploring how to operationalise AI in their teams Reading time: 8 minutes Series: Cloud Native Platforms. Build, Run, Evolve. This is Part 3 of 3. Cloud helped us scale infrastructure. AI is starting to do the same thing for the work around the code: the planning, the testing, the release communication, the incident triage, the writing that surrounds writing software. The conversation about AI in software has narrowed too quickly to "Copilot in the editor". The bigger story is happening across the lifecycle. Planning, design, development, testing, release, and operations are all being augmented at once. The platforms that adopt AI well are not the ones with the most usage. They are the ones with the clearest discipline around how it is used. This post is about that discipline. AI is changing how we engineer, not how we type AI is not changing how we write code. It is changing how we engineer software. Code generation is the surface. Underneath it, AI is reshaping the unit of leverage. The question is no longer how fast a developer can type. It is how well a workflow can be expressed as a reusable engineering asset. Six disciplines determine whether AI moves the needle on outcomes or just adds another tool to the stack. Figure 1. AI across the SDLC. Each phase has clear AI assist points and clear human-owned validations. The boundary is not negotiable. It is the design. 1. From assistance to augmentation Early AI tools focused on assisting individual developers. Code suggestions. Autocomplete. Quick refactors. The value was real but bounded by the editor. The shift now is into structured workflows that span the lifecycle. The unit of leverage is no longer a single suggestion. It is a sequence of actions executed reliably across phases. ("Agentic" later in this post means a system that makes its own next-step decisions inside guardrails. 
A workflow follows a fixed sequence; an agent chooses the path.)

- Code generation has become baseline, not differentiator
- Workflow generation is where the largest gains live
- Multi-step assistance with explicit human checkpoints
- Context that travels across tools, not just within one

In practice
The pattern that works: start with the single highest-volume writing task on the team (commit messages, code review comments, release notes, postmortem first drafts) and turn the AI assist for that task into a shared workflow rather than each individual's private trick. The cost is one engineer's afternoon documenting the workflow and the eval set. The return is that every engineer on the team inherits the work, and the task that used to consume an engineer's morning every two weeks becomes a background step in the release process. Workflow generation, not faster typing, is where the gains compound across a team. Code suggestions help one developer. Reusable workflows help the next ten.

2. AI across the SDLC, with guardrails
AI now has a useful role at every phase of delivery. The role is different at each phase, and the guardrails are different too.

Phase | What AI helps with | What humans must validate
Plan | Breaking down requirements, drafting acceptance criteria | Domain context, business priorities, customer impact
Build | Code generation, refactoring, scaffolding | Architectural fit, security boundaries, performance
Test | Test case generation, edge case discovery | Coverage of business-critical paths, regulatory cases
Release | Release notes, changelog summaries, communication drafts | Accuracy, tone, customer-facing claims
Operate | Log triage, incident summaries, runbook drafts | Root cause attribution, action item ownership

The guardrails are not optional decoration. They are the design.
In practice The pattern that works: stage AI assists for release communication (changelog drafting, customer-facing release notes, internal release announcements) and require a human review before anything goes out. The draft arrives consistently, faster than a human could produce, and easier to compare across releases. The reviewer is not eliminated; the reviewer is moved from author to editor, which is where their judgment actually matters. Teams that adopt this pattern stop missing release-note deadlines and stop publishing inconsistent communication across products. 3. From prompts to reusable assets Many teams begin with prompt experimentation. Individuals find techniques that work for their tasks. The result is a patchwork of personal practices that do not survive a team change. The compounding value comes when prompts mature into reusable engineering assets. Figure 2. The maturity model from prompts to agents. The value compounds at the workflow stage and accelerates at the agent stage. The disciplines that make agents safe are the same ones that made workflows reliable. The maturity stages, in order of leverage: Prompts: ad-hoc, individual, hard to share Templates: parameterised prompts versioned with the project Workflows: multi-step sequences with clear inputs, outputs, checkpoints Agents: autonomous task chains operating within explicit guardrails The diagram is a maturity ladder, not a graduation. In practice teams operate at all four stages simultaneously for different tasks. A senior engineer may use a one-off prompt to explore a refactor, run a versioned template for commit messages, hand off to a workflow for release notes, and trigger an agent for routine PR triage, all in the same hour. The point of the ladder is not to leave earlier stages behind. It is to know which stage a given task belongs to and to invest accordingly. 
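The "Templates" stage of the ladder can be sketched in a few lines. The example below is a minimal illustration, not the author's implementation: the `PromptTemplate` class, the `COMMIT_MESSAGE` template, its version string, and its parameter names are all hypothetical. It only shows what "parameterised prompts versioned with the project" might look like when the prompt lives in the repository as code.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class PromptTemplate:
    """A versioned, parameterised prompt kept in the repository (hypothetical shape)."""

    name: str
    version: str
    body: Template

    def render(self, **params: str) -> str:
        # Template.substitute raises KeyError when a placeholder is missing,
        # so an incomplete call fails loudly instead of sending a vague prompt.
        return self.body.substitute(**params)


# Hypothetical team template for commit messages; reviewed and versioned like code.
COMMIT_MESSAGE = PromptTemplate(
    name="commit-message",
    version="1.2.0",
    body=Template(
        "Write a commit message for the following diff.\n"
        "Style: $style\n"
        "Maximum subject length: $max_subject characters\n"
        "Diff:\n$diff\n"
    ),
)

if __name__ == "__main__":
    print(
        COMMIT_MESSAGE.render(
            style="conventional commits",
            max_subject="72",
            diff="- old line\n+ new line",
        )
    )
```

Because the template is an ordinary module in the repository, a change to the prompt shows up in code review and in version history, which is the property that separates a template from a personal prompt.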
In practice The pattern that works: pick the three prompts your team uses every week, codify them as parameterised templates in the same repository as the application code, and treat them as engineering artefacts (reviewed, versioned, owned). New engineers inherit the team's accumulated practice instead of building their own from scratch. Quality becomes consistent because the variance between individuals shrinks. Investment pays back in weeks, not quarters, and the maturity ladder keeps producing returns as the team moves from templates to workflows to agents. 4. Agentic delivery, with guardrails that survive a security review The next stage is agentic. AI executes sequences of tasks within a defined scope. The risk is not that the agent will fail. It is that the system around the agent will not catch the failure, and that the failure modes are different in kind from traditional automation. Agents are non-deterministic, they can be manipulated through their inputs, and their actions can have side effects in systems the team does not own. Five guardrails make agentic delivery safe. The first four are necessary. The fifth is what carries the agent through a security review at a regulated enterprise. Identity and scope: the agent runs as a managed identity (or scoped service principal) with the smallest set of permissions that lets it do its job. Permissions are expressed as allowlists, not denylists. Tools fetched at runtime are subject to the same identity boundary as the agent itself. Input quarantine: anything the agent reads from a user-controlled source (work item bodies, PR descriptions, customer tickets) is treated as untrusted text. The agent does not execute instructions found in fetched content, and tool calls are validated against an output schema before execution. This is the prompt-injection mitigation, and it is the most common gap in agentic systems shipped today. 
Cost and blast-radius caps: every run has a maximum token budget, a maximum number of tool calls, and a maximum spend. Exceeding any cap aborts the run cleanly. Without caps, scoped credentials are not enough to bound the damage. Evaluations and traceability: agents are evaluated against a fixed test set before deployment, and on every prompt or model change. Every action is logged with inputs, outputs, the model and prompt versions used, and the reasoning trace where the model exposes one. Logs are redacted for secrets and personally identifiable information at write time. Reversibility taxonomy: actions are categorised by reversibility, not asserted to be reversible in general. A draft write to a private store is reversible. A post to a customer-facing channel is not reversible (deletion does not unsend). A database update may be reversible by a compensating transaction or not at all. Irreversible actions require human approval at the boundary, before they happen, not after. The agent is allowed to draft and stage. The human is the only one who is allowed to make the move that cannot be undone. In practice The pattern that works: start with one low-risk agent (release-notes drafter, PR triage assistant) running on read-only inputs, write-only-to-drafts permissions, and a hard cost cap per run. Require explicit human approval at the irreversible step. Wire up an evaluation set on day one, and rerun it on every prompt or model change. Treat regressions as failures, not warnings. The first agent the team ships is rarely the most valuable; it is the rehearsal that establishes the controls every later agent inherits. Teams that skip this rehearsal end up with an agent in production that no one feels safe extending. Implementation note An agent without a reversibility taxonomy and a regression eval set is a liability. 
The discipline is the same one that made workflows reliable: scoped identity, idempotency, traceability, and a clear boundary between machine action and human decision. The YAML below is illustrative, not a runtime contract; it is meant to show the shape of the controls a real agent definition would carry, not the syntax of any specific platform.

    # Agent run definition (illustrative; not a specific platform's syntax)
    name: release-notes-drafter
    trigger: pre-release
    identity:
      type: managed-identity
      scope: tenant=<tenant-id> resource=release-tools/<app-id>
    permissions:
      allow:
        # Values containing ": " are quoted so the file parses as YAML.
        - read: "work-items in milestone (filter: state=Done)"
        - read: "pull-requests in milestone (filter: merged)"
        - write: drafts/release-notes/${run-id}
      # Production channels are NOT in the allowlist. The agent cannot post.
    limits:
      max_tokens_per_run: 80000
      max_tool_calls_per_run: 20
      max_runtime_seconds: 300
      max_cost_usd: 0.40
      on_exceeded: abort_with_partial_artifact
    input_handling:
      treat_fetched_content_as: untrusted
      # Indirect prompt injection is mitigated by the layered discipline below,
      # not by a single feature flag. Each item is a separate control.
      enforce_instruction_hierarchy: true
      validate_tool_args_against_schema: true
      validate_outputs_against_schema: true
    steps:
      - fetch: completed work items in milestone
      - draft: release notes from items
      - validate: required fields present
      - request-review:
          from: release-manager
          idempotency_key: ${milestone-id}-${draft-hash}
      - on-approval:
          action: post-to-internal-channel
          reversibility: not-reversible
          requires: explicit-human-click  # the agent does NOT click this
    audit:
      log_inputs: true
      log_outputs: true
      redact:
        - secrets
        # Pattern-based: handles structured PII like emails, phones, IDs.
        - pii_patterns: [email, phone, national-id, payment-card, ip-address]
        # Entity-based: required for unstructured PII like names. Pattern alone
        # cannot redact a customer name without an entity-recognition step.
        - pii_entities: ner-based  # names, locations, organisations
      retain: 365_days  # tune to your audit policy, not to the demo
    evaluation:
      test_set: tests/release-notes/eval-v3.jsonl
      on_prompt_change: rerun
      on_model_change: rerun
      fail_threshold: 5_percent_regression

5. Where AI still needs human judgment
AI has clear boundaries. The boundaries are not embarrassing. They are the design.

What must stay human-owned:
- Architectural trade-offs and design decisions
- Security validation and threat modelling
- Correctness for business-critical and regulatory paths
- Domain context that has not been written down
- Accountability for outcomes, not just outputs

The goal is collaboration, not replacement. The teams that get the most value from AI are not the ones with the most automation. They are the ones with the clearest sense of where automation ends and judgment begins.

In practice
The pattern that works: name the human-owned items explicitly in the team's working agreement (architecture, security, regulatory correctness, accountability) and audit every AI workflow against that list. When a workflow asks the AI to make a decision in any of those categories, redesign it so the AI prepares the analysis and a human makes the call. Most teams over-trust AI for one of these areas in their first six months and learn the hard way. Naming the boundary up front prevents the lesson from being paid in production. The clarity is the value; the model behind the workflow is interchangeable.

6. Responsible AI is engineering work
The first five disciplines decide whether AI moves the needle. The sixth decides whether the platform can defend the choices it makes with AI. Responsible AI is the engineering practice of building systems whose AI behaviour is fair, transparent, accountable, and safe by design, not by audit after the fact. Treating it as a compliance checkbox at the end of the project is how teams end up shipping AI workflows that fail security review, embarrass the company, or harm users.
Six controls turn responsible AI from a policy into engineering work. These map directly onto the practices Microsoft and the broader industry have converged on, but the names matter less than the practice they enable. Fairness in inputs and outputs. The training data, eval set, and prompts are reviewed for systematic bias against any group the system serves. The eval set covers under-represented cases by design, not by accident, and regressions on those cases fail the build. Transparency to end users. When a user sees AI-generated content, they are told. When a decision is AI-assisted, the path from input to output is explainable in plain language, not just in a model card buried in documentation. Content safety filters. Inputs and outputs pass through safety classifiers (prompt injection, prohibited content, jailbreak patterns) before reaching the model and before reaching the user. Filtering decisions are logged and reviewable. Accountability ownership. Every AI workflow has a named owner who is accountable for its outcomes, not just its uptime. The owner has the authority to pause or roll back the workflow when harm is detected. Data minimisation and residency. The AI sees only the data it needs to do the task. Personally identifiable information and customer data are scoped, redacted, and kept inside the boundary the customer agreed to. Cross-tenant leakage is treated as a P1 incident, not a feature request. Harm evaluation alongside quality evaluation. The eval set measures harm potential (toxicity, hallucination on factual queries, leakage of confidential context) with the same rigour as it measures correctness. Both must pass for a release to ship. Figure 3. Responsible AI as a set of engineering controls around the AI workflow. The six controls fall into four categories: data discipline (fairness, data minimisation), model discipline (content safety, harm evaluation), deployment discipline (transparency to users), and governance (accountability ownership). 
All six are necessary; none is sufficient on its own. In practice The pattern that works: write the responsible AI plan before the first agent ships, not after the first incident. Pick one workflow that touches user data or generates customer-facing content, and use it as the reference implementation: fairness review on the eval set, content safety filters wrapping the model call, transparency annotation in the UI, redaction of identifying details in logs, harm evals running alongside quality evals on every change, and a named owner with explicit pause authority. The first such workflow takes longer to ship than the unconstrained version. Every workflow after it inherits the controls and ships faster than it would have without them. Teams that defer responsible AI to a future quarter end up retrofitting it under pressure, which is the most expensive way to do it. A scenario that ties it together Picture a platform team several months into using Copilot. Adoption is high. Productivity dashboards show gains. But defect rates are not improving and lead time is flat. Leadership asks the obvious question: is AI actually helping, or just feeling like help? The answer is not to stop using AI. It is to change how AI is measured. Move adoption metrics to the background. Move outcome metrics to the front: defect escape rate, lead time for change, change failure rate, mean time to recovery. In parallel, promote the individual prompts that have proved themselves to shared templates, and the templates to versioned workflows. Retrofit responsible AI controls onto the workflows that shipped first: content safety filters, harm evaluations alongside quality evaluations, transparency annotations on customer-facing output, and a named owner for each workflow. Six months later, the picture is different. Defect rate improves on the parts of the codebase where reusable workflows were introduced. Onboarding for new engineers is visibly faster. Release notes are consistent across teams. 
The shift is from celebrating use to tracking outcomes, and once the team measures what matters, the tooling decisions start making themselves. What teams get wrong The common pattern is measuring AI by usage, not by outcome. Adoption metrics tell you who tried Copilot. They do not tell you whether defects dropped, lead time improved, or release notes got better. The fix is not less AI. It is better measurement. The four metrics named in the scenario above (defect escape rate, lead time for change, change failure rate, mean time to recovery) come from the DORA research on software delivery performance and have become a useful default. Two warnings travel with them. First, attribution is hard: an AI workflow rolled out alongside a test refactor and a CI pipeline change cannot claim credit cleanly. Second, baselines matter more than headlines: a single quarter's improvement is not a trend, and a single team's gain is not the platform's gain. Outcome measurement done well needs a baseline window, an attribution discipline, and a kill criterion for workflows that are not paying back. Done poorly, it is just adoption metrics with better names. There is also the question of cost. AI usage carries a per-run token bill, an evaluation bill on every change, and (for agents) a cost cap that limits damage when something goes wrong. None of these are large compared to the engineering time saved when the workflow works. All of them are visible enough that a finance-aware reader will ask. Track them. Where to start The most concrete starter from this post: promote one personal prompt to a shared template. Pick the prompt that gets used most often (commit messages, code reviews, release notes, debugging assist), move it from someone's notes into the repository where the team versions everything else, and watch what changes when the next person on the team runs it. 
That is the smallest unit of the workflow shift this post argues for, and it is the step where prompts stop being individual practice and start becoming engineering assets.

The shift
The shift is from building systems to building smarter systems:
- AI does not replace engineers. It changes what an engineer's leverage looks like.
- The unit of value is the workflow, not the suggestion.
- The discipline that made platforms operable is the same discipline that makes AI useful.
- Responsible AI is not a compliance step. It is the sixth engineering discipline that lets the other five compound safely.

The series ends here, but the arc is consistent across all three posts. The disciplines that make platforms scale are the same disciplines that make AI useful. Build with discipline. Run with discipline. Evolve with discipline. The tools change. The disciplines do not.

Want to discuss?
Where has AI moved the needle most in your delivery, and where has it disappointed you? Drop a comment with patterns you have seen in your environment. Every reply gets read.

Previously in this series:
- Building Cloud Native Platforms That Scale: Patterns That Actually Work. Part 1 covered the design choices that make scale possible.
- Running Cloud Native Platforms: Why Day 2 Decides Everything. Part 2 covered the operational disciplines that decide production outcomes.

This is the third and final post in the series.

OneDrive Photos Restyle with AI - now rolling out on mobile and web
Photos capture real moments. With AI Restyle in OneDrive, you can reimagine them in fresh new styles, right where your photos already live.

Meet AI Restyle
Photos capture real moments. With AI Restyle in OneDrive, you can reimagine those moments in expressive new styles, right where your photos already live. With just a tap, transform everyday photos into cinematic posters, hand‑painted artwork, pencil sketches, anime‑inspired scenes, and more. Choose a style, watch a new version appear in seconds, and keep exploring until it feels just right. Through it all, the people, places, and memories you care about stay unmistakably yours, just seen in a fresh new light.

Your photos stay private
When you use AI Restyle in OneDrive, your photos remain under your control and are processed only to generate the style you choose. For more information on how AI Restyle works, its intended uses, and limitations, see Transparency note for AI Restyle in OneDrive - Microsoft Support.

What you can do with AI Restyle
- Create something beautiful instantly. Choose from a rotating set of one‑tap styles designed to match the content of your photo, so it’s easy to get a great result right away. New styles are added regularly, giving you fresh ways to reimagine your photos.
- Add a personal touch when you want. Include an optional prompt to guide the look; no design skills required.
- Explore until it feels right. Try multiple restyles, undo or redo changes, and keep experimenting until you find the look you love.
- Share in just a few taps. Go from viewing to restyling to sharing with your favourite apps without ever leaving OneDrive Photos.

Availability
AI Restyle is rolling out on OneDrive for iOS, Android, and web for customers with a Microsoft 365 Premium subscription. Availability may vary by region as rollout continues.
What’s next
We’re continuing to expand AI-powered photo experiences in OneDrive, bringing AI Restyle to additional platforms and investing in new editing capabilities that help you create with confidence while keeping your photos authentic.

Try it today
Open OneDrive on iOS, Android, or web, sign in with a Microsoft 365 Premium account, open a photo, and tap ‘AI Restyle’ to start exploring new styles. Have fun creating something new today!

Try it on the OneDrive mobile app.
iOS: Download Microsoft OneDrive from the App Store
Android: Download Microsoft OneDrive from Google Play

We’d love your feedback. Use 👍👎 to help us improve AI Restyle.

#Microsoft #OneDrive #Photos #iOS #Android #Web #AI

* This blog was updated on April 7, 2026 to explain how AI Restyle in OneDrive protects users’ privacy and ensures their photos remain secure and under their control.

From Prompt to Production: Building Azure Architecture Diagrams with AI
Author: Arturo Quiroga, Senior Partner Solutions Architect — Microsoft Cloud architects spend significant time translating ideas into architecture diagrams. They toggle between Visio, draw.io, pricing calculators, and documentation. According to the 2024 Stack Overflow Developer Survey, 61% of developers spend more than 30 minutes a day searching for answers or solutions, time lost to context-switching rather than design. What if you could describe your architecture in plain English and get a diagram, cost estimate, and deployment guide in minutes? The Challenge: Fragmented Architecture Workflows Designing Azure architectures today typically involves multiple disconnected steps: Sketch the architecture in a diagramming tool Look up official Azure icons and drag them into place Research pricing across regions using the Azure Pricing Calculator Validate the design against the Well-Architected Framework (WAF) Write deployment documentation and Infrastructure as Code templates Compare alternative designs manually Each step lives in a different tool, and keeping them in sync as designs evolve is costly. The Azure Architecture Diagram Builder brings these workflows together in a single browser-based experience. How It Works Describe your architecture in natural language, for example "A HIPAA-compliant healthcare platform with FHIR APIs, event-driven processing, and multi-region disaster recovery", and the AI generates a diagram with grouped services, data flow connections, and logical organization. Figure 1. Enter a natural-language prompt describing your architecture. Curated example prompts help you get started, and you can optionally upload an existing diagram for the AI to analyze. The tool uses Azure OpenAI to power generation across multiple models, enabling you to choose the model that best fits your scenario — from fast iterations to deeper reasoning. 
Key Features AI-Powered Architecture Generation Describe what you need in plain English, and the AI creates an architecture diagram with: 714 official Azure service icons across 29 categories Smart grouping: services are logically organized (Frontend, Backend, Data, Security) Data flow connections: labeled edges showing how data moves through the system 13 curated example prompts: from simple web apps to complex enterprise scenarios like Zero Trust networks, Industrial IoT with 5,000+ sensors, and global multiplayer gaming backends Figure 2. A generated industrial IoT architecture. Top: the clean diagram view as initially produced. Bottom: the same diagram with per-service monthly cost overlays toggled on, plus a running subscription total in the toolbar. Architecture Image Import Already have an architecture on a whiteboard or in a screenshot? Upload the image and let the AI analyze it, mapping services to official Azure icons and recreating the architecture as an editable, interactive diagram. Figure 3. Upload a photo of a whiteboard sketch (top-right reference panel) and the AI recreates it as an editable diagram with official Azure service icons and labeled data flow connections. ARM Template Import Import existing ARM templates to visualize your current infrastructure. The AI parses resource definitions and dependencies, groups related resources into logical layers, and produces a meaningful diagram of what you actually have deployed — a fast way to document an inherited environment or sanity-check a template before deployment. Figure 4. ARM template import in action. Top: the parser status banner while resources and dependencies are being analyzed. Bottom: the resulting diagram, with resources auto-grouped into logical layers (Web Tier, Data Layer, Container Platform, Observability & Logging) and a Generated from: ARM Template badge linking the diagram back to its source file. 
Well-Architected Framework Validation Validate your architecture against all five WAF pillars — Security, Reliability, Performance Efficiency, Cost Optimization, and Operational Excellence. The validator provides: An overall WAF score with pillar-level breakdowns Specific findings with severity levels Actionable recommendations you can select and apply Select the recommendations you agree with, and the AI regenerates an improved architecture incorporating those changes. Figure 5. WAF validation results showing the overall score, per-pillar breakdowns, and individual findings with severity badges. Tick the recommendations you want and the AI rebuilds the diagram with those changes applied. Multi-Model Comparison Run the same architecture prompt through multiple AI models side-by-side and compare: Architecture Comparison: service counts, connection counts, groups, token usage, and latency Validation Comparison: WAF scores across models, severity breakdowns, and finding counts Apply Winner: pick the best result and apply it to the canvas with one click Present Critique: a talking avatar narrates the AI-generated ranking with live closed captions Figure 6. Multi-model comparison. Top: select the models and reasoning effort, then enter the prompt. Bottom: side-by-side results across all selected models with service counts, latency, token usage, and Fastest / Cheapest / Most Thorough badges. Multi-Region Cost Estimation Get cost estimates from the Azure Retail Prices API across 8 Azure regions: East US 2, Australia East, Canada Central, Brazil South, Mexico Central, West Europe, Sweden Central, and Southeast Asia. Features include: Color-coded cost legend (green / yellow / red thresholds) SKU and tier information for each service Export options: CSV, JSON, plain-text summary, and an analysis report with top cost drivers, Reserved Instance flags, and a ranked multi-region comparison table Figure 7. 
The cost legend overlay shows per-service pricing with color-coded thresholds. The region selector in the toolbar lets you re-price the entire architecture in any of eight Azure regions. Deployment Guide Generation with Bicep Generate step-by-step deployment documentation including: Prerequisites and Azure resource requirements Step-by-step deployment instructions Bicep templates for each service (Infrastructure as Code) Post-deployment verification steps Security configuration recommendations Figure 8. Each generated Deployment Guide opens with the architecture name, an estimated deployment time, and a prerequisites checklist covering subscription roles, CLI versions, Microsoft Entra ID permissions, and region requirements, followed by numbered, copy-ready deployment steps. Figure 9. The Infrastructure as Code section produces a main.bicep orchestrator plus a per-service module (Log Analytics, Key Vault, Cosmos DB, SQL Database, Event Hubs, Azure Functions, and more). The Download All Templates button packages everything into a ready-to-deploy folder. Workflow Animation & Avatar Presenter Visualize how data flows through your architecture with step-by-step animations that highlight services on the canvas as each step plays. When the Azure Speech Service is configured, a photorealistic talking avatar can narrate the workflow or present model comparison results, with live word-by-word closed captions in a draggable, resizable panel. Figure 10. A workflow step is highlighted on the canvas as the Avatar Presenter narrates that step. Live word-by-word closed captions appear in a draggable, resizable panel, useful for accessibility and stakeholder demos. Export Options Figure 11. A single-slide PowerPoint export, available in dark or light theme, ready to drop straight into a stakeholder deck. 
Format | Use Case
PNG | Documentation, presentations
SVG | Scalable vector graphics
PPTX | Single PowerPoint slide (dark or light theme)
Draw.io | Edit in diagrams.net
JSON | Backup, version control
CSV / ZIP | Cost analysis with multi-region comparison

Highlights

The Azure Architecture Diagram Builder unifies the architecture design lifecycle in a single tool:

- End-to-end workflow: from natural-language description to deployable Bicep templates without tool switching
- Official Azure icons: 714 icons across 29 categories, mapped directly from the Azure service catalog
- Live pricing: queries the Azure Retail Prices API at design time rather than relying on static estimates
- WAF-integrated validation: architectural best practices built into the design loop rather than applied after the fact
- Multi-model flexibility: choose the AI model that best suits each task, with fast models for iteration and reasoning models for complex designs
- Open source: the source code is available for customization and contribution

One-Command Deploy with Azure Developer CLI

The fastest way to get your own instance running is with azd:

    # Install azd (once)
    brew tap azure/azd && brew install azd   # macOS
    winget install microsoft.azd             # Windows

    # Clone, configure, and deploy
    git clone https://github.com/Arturo-Quiroga-MSFT/azure-architecture-diagram-builder
    cd azure-architecture-diagram-builder
    azd auth login
    azd env set AZURE_OPENAI_ENDPOINT "https://your-resource.openai.azure.com/"
    azd env set AZURE_OPENAI_API_KEY "your-key"
    azd up   # Provisions infrastructure + builds + deploys (~8 min)

azd up provisions the following via Bicep:

Resource | Purpose
Azure Container Registry | Stores the Docker image
Azure Container Apps | Runs the app (nginx + token server)
Log Analytics + Application Insights | Monitoring and telemetry
Azure Speech (S0) | Avatar Presenter (optional, keyless auth via managed identity)

Try It Today

The Azure Architecture Diagram Builder is available now:

- Live demo: https://aka.ms/diagram-builder
- Source
code: GitHub repository
- Documentation: See the Getting Started Guide for detailed setup instructions

We welcome feedback and contributions. Use the GitHub Issues page to report bugs, suggest features, or share your experience.
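
As a companion to the Multi-Region Cost Estimation feature described earlier, the following is a minimal sketch of how a query against the public Azure Retail Prices API could be built and how regions could be ranked by estimated monthly cost. It is not the tool's actual implementation, which the article does not show. The function names, the 730-hours-per-month convention, and the sample prices are all assumptions made for illustration; only the endpoint and the `serviceName`, `armRegionName`, and `priceType` filter fields come from the public API.

```python
from urllib.parse import quote

# Public, unauthenticated endpoint of the Azure Retail Prices API.
RETAIL_PRICES_URL = "https://prices.azure.com/api/retail/prices"

# armRegionName values for the eight regions named in the article.
REGIONS = [
    "eastus2", "australiaeast", "canadacentral", "brazilsouth",
    "mexicocentral", "westeurope", "swedencentral", "southeastasia",
]


def build_price_query(service_name: str, region: str) -> str:
    """Return a Retail Prices API URL for one service in one region.

    Hypothetical helper: the OData filter restricts results to
    pay-as-you-go ("Consumption") meters.
    """
    odata_filter = (
        f"serviceName eq '{service_name}' "
        f"and armRegionName eq '{region}' "
        "and priceType eq 'Consumption'"
    )
    return f"{RETAIL_PRICES_URL}?$filter={quote(odata_filter)}"


def rank_regions(hourly_price_by_region: dict[str, float],
                 hours_per_month: float = 730.0) -> list[tuple[str, float]]:
    """Convert hourly prices to monthly estimates and sort cheapest first.

    Assumption: 730 hours approximates one month of continuous use.
    """
    monthly = {
        region: round(price * hours_per_month, 2)
        for region, price in hourly_price_by_region.items()
    }
    return sorted(monthly.items(), key=lambda item: item[1])


if __name__ == "__main__":
    print(build_price_query("Virtual Machines", "eastus2"))
    # Made-up hourly prices, for illustration only.
    sample = {"westeurope": 0.12, "eastus2": 0.10, "brazilsouth": 0.20}
    for region, cost in rank_regions(sample):
        print(f"{region}: {cost} per month")
```

In practice the URL would be fetched with an HTTP client, and the `retailPrice` values from the response's `Items` array would feed `rank_regions`. Keeping the URL builder and the ranking step as pure functions makes both easy to check without network access.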