Deploy PostgreSQL on Azure VMs with Azure NetApp Files: Production-Ready Infrastructure as Code
PostgreSQL is a popular open‑source database for modern web applications and AI/ML workloads, and deploying it on Azure VMs with high‑performance storage should be simple. In practice, however, using Azure NetApp Files requires many coordinated steps—from provisioning networking and storage to configuring NFS, installing and initializing PostgreSQL, and maintaining consistent, secure, and high‑performance environments across development, test, and production. To address this complexity, we’ve built production‑ready Infrastructure as Code templates that fully automate the deployment, from infrastructure setup to database initialization, ensuring PostgreSQL runs on high‑performance Azure NetApp Files storage from day one.

Unlocking Advanced Data Analytics & AI with Azure NetApp Files object REST API
Azure NetApp Files object REST API enables object access to enterprise file data stored on Azure NetApp Files, without copying, moving, or restructuring that data. This capability allows analytics and AI platforms that expect object storage to work directly against existing NFS‑based datasets, while preserving Azure NetApp Files’ performance, security, and governance characteristics.

What's New with Azure NetApp Files VS Code Extension
The latest update to the Azure NetApp Files (ANF) VS Code Extension introduces powerful enhancements designed to simplify cloud storage management for developers. From multi-tenant support to intuitive right-click mounting and AI-powered commands, this release focuses on improving productivity and streamlining workflows within Visual Studio Code. Explore the new features, learn how they accelerate development, and see why this extension is becoming an essential tool for cloud-native applications.

Azure Course Blueprints
Each Blueprint serves as a 1:1 visual representation of the official Microsoft instructor‑led course (ILT), ensuring full alignment with the learning path. This helps learners: see exactly how topics fit into the broader Azure landscape, map concepts interactively as they progress, and understand the “why” behind each module, not just the “what.”

Formats Available: PDF · Visio · Excel · Video. Every icon is clickable and links directly to the related Learn module.

Layers and Cross‑Course Comparisons
For expert‑level certifications like SC‑100 and AZ‑305, the Visio Template+ includes additional layers for each associate-level course. This allows trainers and students to compare certification paths at a glance:
🔐 Security Path: SC‑100 side‑by‑side with SC‑200, SC‑300, AZ‑500
🏗️ Infrastructure & Dev Path: AZ‑305 alongside AZ‑104, AZ‑204, AZ‑700, AZ‑140
This helps learners clearly identify: prerequisites, skill gaps, overlapping modules, and progression paths toward expert roles. Because associate certifications (e.g., SC‑300 → SC‑100 or AZ‑104 → AZ‑305) are often prerequisites or recommended foundations, this comparison layer makes it easy to understand what additional knowledge is required as learners advance.

Azure Course Blueprints + Demo Deploy
Demos are essential for achieving end‑to‑end understanding of Azure. To reduce preparation overhead, we collaborated with Peter De Tender to align each Blueprint with the official Trainer Demo Deploy scenarios. With a single click, trainers can deploy the full environment and guide learners through practical, aligned demonstrations. https://aka.ms/DemoDeployPDF

Benefits for Students
🎯 Defined Goals: Learners clearly see the skills and services they are expected to master.
🔍 Focused Learning: By spotlighting what truly matters, the Blueprint keeps learners oriented toward core learning objectives.
📈 Progress Tracking: Students can easily identify what they’ve already mastered and where more study is needed.
📊 Slide Deck Topic Lists (Excel)
A downloadable .xlsx file provides: a topic list for every module, links to Microsoft Learn, and prerequisite dependencies. This file helps students build their own study plan while keeping all links organized.

Download links

Associate Level (PDF · Demo · Visio · Contents)
AZ-104 Azure Administrator Associate (R: 12/14/2023, U: 12/17/2025): Blueprint · Demo Video · Visio · Excel
AZ-204 Azure Developer Associate (R: 11/05/2024, U: 12/17/2025): Blueprint · Demo · Visio · Excel
AZ-500 Azure Security Engineer Associate (R: 01/09/2024, U: 10/10/2024): Blueprint · Demo · Visio+ · Excel
AZ-700 Azure Network Engineer Associate (R: 01/25/2024, U: 12/17/2025): Blueprint · Demo · Visio · Excel
SC-200 Security Operations Analyst Associate (R: 04/03/2025, U: 04/09/2025): Blueprint · Demo · Visio · Excel
SC-300 Identity and Access Administrator Associate (R: 10/10/2024): Blueprint · Demo · Excel

Specialty (PDF · Visio)
AZ-140 Azure Virtual Desktop Specialty (R: 01/03/2024, U: 12/17/2025): Blueprint · Demo · Visio · Excel

Expert level (PDF · Visio)
AZ-305 Designing Microsoft Azure Infrastructure Solutions (R: 05/07/2024, U: 12/17/2025): Blueprint · Demo · Visio+ (AZ-104, AZ-204, AZ-700, AZ-140) · Excel
SC-100 Microsoft Cybersecurity Architect (R: 10/10/2024, U: 04/09/2025): Blueprint · Demo · Visio+ (AZ-500, SC-300, SC-200) · Excel

Skill based Credentialing (PDF)
AZ-1002 Configure secure access to your workloads using Azure virtual networking (R: 05/27/2024): Blueprint · Visio · Excel
AZ-1003 Secure storage for Azure Files and Azure Blob Storage (R: 02/07/2024, U: 02/05/2024): Blueprint · Excel

Subscribe if you want to be notified of updates such as new releases.
Author: Ilan Nyska, Microsoft Technical Trainer
Email: ilan.nyska@microsoft.com
LinkedIn: https://www.linkedin.com/in/ilan-nyska/
I’ve received so many kind messages, thank-you notes, and reshares — and I’m truly grateful.
But here’s the reality: 💬 The only thing I can use internally to justify continuing this project is your engagement — through this survey https://lnkd.in/gnZ8v4i8
___
Benefits for Trainers
Trainers can follow this plan to design a tailored diagram for their course, filled with notes. They can construct this comprehensive diagram during class on a whiteboard and continuously add to it in each session. This evolving visual aid can be shared with students to enhance their grasp of the subject matter.
Explore Azure Course Blueprints! | Microsoft Community Hub
Visio stencils: Azure icons - Azure Architecture Center | Microsoft Learn
___
Are you curious how grounding Copilot in Azure Course Blueprints transforms your study journey into a smarter, more visual experience?
🧭 Clickable guides that transform modules into intuitive roadmaps
🌐 Dynamic visual maps revealing how Azure services connect
⚖️ Side-by-side comparisons that clarify roles, services, and security models
Whether you're a trainer, a student, or just certification-curious, Copilot becomes your shortcut to clarity, confidence, and mastery.
Navigating Azure Certifications with Copilot and Azure Course Blueprints | Microsoft Community Hub

Cross-Region Zero Trust: Connecting Power Platform to Azure PaaS across different regions
In the modern enterprise cloud landscape, data rarely sits in one place. You might face a scenario where your Power Platform environment (Dynamics 365, Power Apps, or Power Automate) is hosted in Region A for centralized management, while your sensitive SQL Databases or Storage Accounts must reside in Region B due to data sovereignty, latency requirements, or legacy infrastructure. Connecting these two worlds usually involves traversing the public internet - a major "red flag" for security teams.

The Missing Link in Cloud Security
When we talk about enterprise security, "Public Access: Disabled" is the holy grail. But for Power Platform architects, this setting is often followed by a headache. The challenge is simple but daunting: How can a Power Platform Environment (e.g., in Region A) communicate with an Azure PaaS service (e.g., Storage or SQL in Region B) when that resource is completely locked down behind a Private Endpoint?
Existing documentation usually covers single-region setups with no firewalls. This post details a "Zero Trust" architecture that bridges this gap. It is a walkthrough for setting up a Cross-Region Private Link that routes traffic from the Power Platform in Region A, through a secure Azure Hub, and down the Azure Global Backbone to a Private Endpoint in Region B, without a single packet ever touching the public internet.

1. Understanding the Foundation: VNet Support
Before we build, we must understand what moves: Power Platform VNet integration is an "Outbound" technology. It allows the platform to connect to data sources secured within an Azure Virtual Network and "inject" its traffic into your Virtual Network, without needing to install or manage an on-premises data gateway. According to Microsoft's official documentation, this integration supports a wide range of services:
Dataverse: Plugins and Virtual Tables.
Power Automate: Cloud Flows using standard connectors.
Power Apps: Canvas Apps calling private APIs.
This means once the "tunnel" is built, your entire Power Platform ecosystem can reach your private Azure universe. Virtual Network support overview – Power Platform | Microsoft Learn

2. The Architecture: A Cross-Region Global Bridge
Based on the Hub-and-Spoke topology, this architecture relies on four key components working in unison:
Source (Region A): The Power Platform environment utilizes VNet Injection. This injects the platform's outbound traffic into a dedicated, delegated subnet within your Region A Spoke VNet.
The Hub: A central VNet containing an Azure Firewall. This acts as the regional traffic cop and DNS Proxy, inspecting traffic and resolving private names before allowing packets to traverse the global backbone.
The Bridge (Global Backbone): We utilize Global VNet Peering to connect Region A to the Region B Spoke. This keeps traffic on Microsoft's private fiber backbone.
Destination (Region B): The Azure PaaS service (e.g., a Storage Account) is locked down with Public Access Disabled. It is only accessible via a Private Endpoint.

The Architecture: Visualizing the Flow
As illustrated in the diagram below, this solution separates the responsibilities into two distinct layers: the Network Admin (Azure Infrastructure) and the Power Platform Admin (Enterprise Policy).

3. The High Availability Constraint: Regional Pairs
A common pitfall of these deployments is configuring only a single region. Power Platform environments are inherently redundant. In a geography like Europe, your environment is actually hosted across a Regional Pair (e.g., West Europe and North Europe). Why does this matter? If one Azure region in the pair experiences an outage, your Power Platform environment will fail over to the second region. If your VNet Policy isn't already there, your private connectivity will break.
To maintain High Availability (HA) for your private tunnel, your Azure footprint must mirror this:
Two VNets: You must create a Virtual Network in each region of the pair.
Two Delegated Subnets: Each VNet requires a subnet delegated specifically to Microsoft.PowerPlatform/enterprisePolicies.
Two Network Policies: You must create an Enterprise Policy in each region and link both to your environment to ensure traffic flows even during a regional failover.
Ensure your Azure subscription is registered for the Microsoft.PowerPlatform resource provider by running the SetupSubscriptionForPowerPlatform.ps1 script.

4. Solving the DNS Riddle with Azure Firewall
In a Hub-and-Spoke model, peering the VNets is only half the battle. If your Power Platform environment in Region A asks for mystorage.blob.core.windows.net, it will receive a public IP by default, and your connection will be blocked. To fix this, we utilize the Azure Firewall as a DNS Proxy:
Link the Private DNS Zone: Ensure your Private DNS Zones (e.g., privatelink.blob.core.windows.net) are linked to the Hub VNet.
Enable DNS Proxy: Turn on the DNS Proxy feature on your Azure Firewall.
Configure Custom DNS: Set the DNS servers of your Spoke VNets (Region A) to the Firewall’s internal IP.
Now, the DNS query flows through the Firewall, which "sees" the Private DNS Zone and returns the Private IP to the Power Platform.

5. Secretless Security with User-Assigned Managed Identity
Private networking secures the path, but identity secures the access. Instead of managing fragile Client Secrets, we use a User-Assigned Managed Identity (UAMI).
Phase A: The Azure Setup
Create the Identity: Generate a User-Assigned Managed Identity in your Azure subscription.
Assign RBAC Roles: Grant this identity specific permissions on your destination resource. For example, assign the Storage Blob Data Contributor role to allow the identity to manage files in your private storage account.
Phase B: The Power Platform Integration
To make the environment recognize this identity, you must register it as an Application User:
Navigate to the Power Platform Admin Center.
Go to Environments > [Your Environment] > Settings > Users + permissions > Application users.
Add a new app and select the Managed Identity you created in Azure.

6. Creating Enterprise Policy using PowerShell Scripts
One of the most important things to realize is that Enterprise Policies cannot be created manually in the Azure Portal UI. They must be deployed via PowerShell or CLI. While Microsoft provides a comprehensive official GitHub repository with all the necessary templates, it is designed to be highly modular and granular. This means that to achieve a High Availability (HA) setup, an admin usually needs to execute deployments for each region separately and then perform the linking step.
To simplify this workflow, I have developed a Simplified Scripts Repository on my GitHub. These scripts use the official Microsoft templates as their foundation but add an orchestration layer specifically for the Regional Pair requirement:
Regional Pair Automation: Instead of running separate deployments, my script handles the dual-VNet injection in a single flow. It automates the creation of policies in both regions and links them to your environment in one execution.
Focused Scenarios: I’ve distilled the most essential scripts for Network Injection and Encryption (CMK), making it easier for admins to get up and running without navigating the entire modular library.
The Goal: To provide a "Fast-Track" experience that follows Microsoft's best practices while reducing the manual steps required to achieve a resilient, multi-region architecture.

Owning the Keys with Encryption Policies (CMK)
While Microsoft encrypts Dataverse data by default, many enterprise compliance standards require Customer-Managed Keys (CMK). This ensures that you, not Microsoft, control the encryption keys for your environments.
Manage your customer-managed encryption key - Power Platform | Microsoft Learn
Key Requirements:
Key Vault Configuration: Your Key Vault must have Purge Protection and Soft Delete enabled to prevent accidental data loss.
The Identity Bridge: The Encryption Policy uses the User-Assigned Managed Identity (created in Step 5) to authenticate against the Key Vault.
Permissions: You must grant the Managed Identity the Key Vault Crypto Service Encryption User role so it can wrap and unwrap the encryption keys.

7. The Final Handshake: Linking Policies to Your Environment
Creating the Enterprise Policy in Azure is only the first half of the process. You must now "inform" your Power Platform environment that it should use these policies for its outbound traffic and identity.
Linking the Policies to Your Environment:
For VNet Injection: In the Admin Center, go to Security > Data and privacy > Azure Virtual Network Policies. Select your environment and link it to the Network Injection policies you created.
For Encryption (CMK): Go to Security > Data and privacy > Customer-managed encryption key. Select the Encryption Enterprise Policy > Edit Policy > Add Environment.
Crucial Step: You must first grant the Power Platform service "Get", "List", "Wrap" and "Unwrap" permissions on your specific key within Azure Key Vault before the environment can successfully validate the policy.

Verification: The "Smoking Gun" in Log Analytics
After successfully reaching a resource from one of the Power Platform services, you can check whether the connection was private. How do you prove it's private? Use KQL in Azure Log Analytics to verify the Network Security Perimeter (NSP) ID.
The Proof: When you see a GUID in the NetworkPerimeter field, it is strong evidence that the resource accepted the request only because it arrived via your authorized private bridge.
In the Azure Portal, navigate to your resource (for example, a Key Vault) > Logs, and use the following KQL:

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.KEYVAULT"
| where OperationName == "KeyGet" or OperationName == "KeyUnwrap"
| where ResultType == "Success"
| project TimeGenerated, OperationName, VaultName = Resource, ResultType, CallerIP = CallerIPAddress, EnterprisePolicy = identity_claim_xms_mirid_s, NetworkPerimeter = identity_claim_xms_az_nwperimid_s
| sort by TimeGenerated desc

Result: By implementing the Network and Encryption Enterprise Policies, you transition the Power Platform from a public SaaS tool into a fully governed, private extension of your Azure infrastructure. You no longer have to choose between the agility of low-code and the security of a private cloud.
To summarize the transformation from public endpoints to a complete Zero Trust architecture across regions, here is the end-to-end workflow:

PHASE 1: Azure Infrastructure Foundation
Create Network Fabric (HA): Deploy VNets and Delegated Subnets in both regions of the pair.
Deploy the Hub: Set up the Central Hub VNet with Azure Firewall.
Connect Globally: Establish Global VNet Peering between all Spokes and the Hub.
Solve DNS: Enable DNS Proxy on the Firewall and link Private DNS Zones to the Hub VNet.
↓
PHASE 2: Identity & Security Prep
Create Identity: Generate a User-Assigned Managed Identity (UAMI).
Grant Access (RBAC): Give the UAMI permissions on the target PaaS resource (e.g., Storage Blob Data Contributor).
Prepare CMK: Configure Key Vault access policies for the UAMI (Wrap/Unwrap permissions).
↓
PHASE 3: Deploy Enterprise Policies (PowerShell/IaC)
Deploy Network Policies: Create "Network Injection" policies in Azure for both regions.
Deploy Encryption Policy: Create the "CMK" policy linking to your Key Vault and Identity.
↓
PHASE 4: Power Platform Final Link (Admin Center)
Link Network: Associate the Environment with the two Network Policies.
Link Encryption: Activate the Customer-Managed Key on the environment.
Register User: Add the Managed Identity as an "Application User" in the environment.
↓
PHASE 5: Verification
Run Workload: Trigger a Flow or Plugin.
Audit Logs: Use KQL in Log Analytics to confirm the presence of the NetworkPerimeter ID.

From Large Semi-Structured Docs to Actionable Data: In-Depth Evaluation Approaches Guidance
Introduction
Extracting structured data from large, semi-structured documents (the detailed solution implementation overview and architecture is provided in this Tech Community blog: From Large Semi-Structured Docs to Actionable Data: Reusable Pipelines with ADI, AI Search & OpenAI) demands a rigorous evaluation framework. The goal is to ensure our pipeline is accurate, reliable, and scalable before we trust it with mission-critical data. This framework breaks evaluation into clear phases, from how we prepare the document, to how we find relevant parts, to how we validate the final output. It provides metrics, examples, and best practices at each step, forming a generic pattern that can be applied to various domains.

Framework Overview
A structured, stepwise approach to evaluation:
Establish Ground Truth & Sampling: Define a robust ground truth set and sampling method to fairly evaluate all parts of the document.
Preprocessing Evaluation: Verify that OCR, chunking, and any structural augmentation (like adding headers) preserve all content and context.
Labelling Evaluation: Check classification of sections/chunks by topic/entity and ensure irrelevant data is filtered out without losing any important context.
Retrieval Evaluation: Ensure the system can retrieve the right pieces of information (using search) with high precision@k and recall@k.
Extraction Accuracy Evaluation: Measure how well the final structured data matches the expected values (field accuracy, record accuracy, overall precision/recall).
Continuous Improvement Loop with SME: Use findings to retrain, tweak, and improve, enabling the framework to be reused for new documents and iterations. SMEs play a huge role in such scenarios.
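The sampling idea in step 1 can be made concrete with a small helper: pick evaluation pages so that each section type is represented in proportion to its share of the document. This is a minimal sketch with toy data; `stratified_sample` and the section names are illustrative, not part of any specific pipeline.

```python
import random

def stratified_sample(pages_by_section, sample_size, seed=0):
    """Sample evaluation pages so each section type is represented in
    proportion to its share of the document (at least one per stratum)."""
    rng = random.Random(seed)
    total = sum(len(pages) for pages in pages_by_section.values())
    sample = {}
    for section, pages in pages_by_section.items():
        k = max(1, round(sample_size * len(pages) / total))
        sample[section] = rng.sample(pages, min(k, len(pages)))
    return sample

# Toy document: 85 main-body pages, 15 annex pages (~15% of the document)
doc = {
    "main_body": list(range(1, 86)),
    "annexes":   list(range(86, 101)),
}
picked = stratified_sample(doc, sample_size=20)
print({section: len(pages) for section, pages in picked.items()})
# → {'main_body': 17, 'annexes': 3}
```

This guarantees rare-but-important strata (annexes, schedules, footnotes) appear in the evaluation set instead of being crowded out by the main body.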
Detailed Guidance on Evaluation
Below is a step-by-step, in-depth guide to evaluating this kind of IDP (Intelligent Document Processing) pipeline, covering both the overall system and its individual components.

Establish Ground Truth & Sampling
Why: Any evaluation is only as good as the ground truth it’s compared against. Start by assembling a reliable “source of truth” dataset for your documents. This often means manual labelling of some documents by domain experts (e.g., a legal team annotating key clauses in a contract, or accountants verifying invoice fields). Because manual curation is expensive, be strategic in what and how we sample.
Ground Truth Preparation: Identify the critical fields and sections we need to extract, and create an annotated set of documents with those values marked correct. For example, if processing financial statements, we might mark the ground truth values for Total Assets, Net Income, Key Ratios, etc. This ground truth should be the baseline to measure accuracy against. Although creating it is labour-intensive, it yields a precise benchmark for model performance.
Stratified Sampling: Documents like contracts or policies have diverse sections. To evaluate thoroughly, use stratified sampling: ensure your test set covers all major content types and difficulty levels. For instance, if 15% of pages in a set of contracts are annexes or addendums, then ~15% of your evaluation pages should come from annexes, not just the main body. This prevents the evaluation from overlooking challenging or rare sections. In practice, we might partition a document by section type (clauses, tables, schedules, footnotes) and sample a proportion from each. This way, metrics reflect performance on each type of content, not just the easiest portions.
Multi-Voter Agreement (Consensus): It’s often helpful to have multiple automated voters on the outputs before involving humans.
For example, suppose we extracted an invoice amount; we can have:
A regex/format checker or fuzzy-matching voter
A cross-field logic checker or embedding-based matching voter
An ML model confidence score or LLM-as-a-judge vote
If all signals are strong, we label that extraction as Low Risk; if they conflict, we mark it High Risk for human review. By tallying such “votes”, we create tiers of confidence. Why? Because in many cases, a large portion of outputs will be obviously correct (e.g., over 80% might have unanimous high confidence), and we can safely assume those are right, focusing manual review on the remainder. This strategy effectively reduces the human workload while maintaining quality.

Preprocessing Evaluation
Before extracting meaning, make sure the raw text and structure are captured correctly. Any loss here breaks the whole pipeline. Key evaluation checks:
OCR / Text Extraction Accuracy
Character/Word Error Rate: Sample pages to see how many words are recognized correctly (use per-word confidence to spot issues).
Layout Preservation: Ensure reading order isn’t scrambled, especially in multi-column pages or footnotes.
Content Coverage: Verify every sentence from a sample page appears in the extracted text. Missing footers or sidebars count as gaps.
Chunking
Completeness: Combined chunks should reconstruct the full document. Word counts should match.
Segment Integrity: Chunks should align to natural boundaries (paragraphs, tables). Track a metric like “95% clean boundaries.”
Context Preservation: If a table or section spans chunks, mark relationships so downstream logic sees them as connected.
Multi-page Table Handling
Header Insertion Accuracy: Validate that continued pages get the correct header (aim for the high 90s to maintain context across documents).
No False Headers: Ensure new tables aren’t mistakenly treated as continuations. Track a False Continuation Rate and push it to near zero.
Practical Check: Sample multi-page tables across docs to confirm consistent extraction and no missed rows.
Structural Links / References
Link Accuracy: Confirm references (like footnotes or section anchors) map to the right targets (e.g., 98%+ correct).
Ontology / Category Coverage: If content is pre-grouped, check precision (no mis-grouping) and recall (nothing left uncategorized).
Implication: The goal is to ensure the pre-processed chunks are a faithful, complete, and structurally coherent representation of the original document. Metrics like content coverage, boundary cleanliness, and header accuracy help catch issues early. Fixing them here saves significant downstream debugging.

Labelling Evaluation – “Did we isolate the right pieces?”
Once we chunk the document, we label those chunks (with ML or rules) to map them to the right entities and throw out the noise. Think of this step as sorting useful clauses from filler.
Section/Entity Labelling Accuracy: Treat labelling as a multi-class or multi-label classification problem.
Precision (Label Accuracy): Of the chunks we labelled as X, how many were actually X? Example: the model tags 40 chunks as “Financial Data.” If 5 are wrong, precision is 87.5%. High precision avoids polluting a category (topic/entity) with junk.
Recall (Coverage): Of the chunks that truly belong to category X, how many did we catch? Example: ground truth has 50 Financial Data chunks, and the model finds 45. Recall is 90%. High recall prevents missing important sections.
Example: A model labels paper sections as Introduction, Methods, Results, etc. It marks 100 sections as Results and 95 are correct (95% precision). It misses 5 actual Results (slightly lower recall). That’s acceptable if downstream steps can still recover some items. But low precision means the labelling logic needs tightening.
Implication: Low precision means wrong info contaminates the category. Low recall means missing crucial bits.
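The per-label precision and recall just described can be computed with a short helper. This is a minimal sketch over toy label lists; the function name and data are illustrative, not from any specific pipeline.

```python
from collections import defaultdict

def per_label_precision_recall(predicted, actual):
    """Compute precision and recall per label from parallel lists of
    chunk labels (model prediction vs. ground truth)."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for pred, gold in zip(predicted, actual):
        if pred == gold:
            tp[gold] += 1
        else:
            fp[pred] += 1   # predicted label was wrong
            fn[gold] += 1   # true label was missed
    metrics = {}
    for label in set(tp) | set(fp) | set(fn):
        denom_p = tp[label] + fp[label]
        denom_r = tp[label] + fn[label]
        metrics[label] = {
            "precision": tp[label] / denom_p if denom_p else 0.0,
            "recall": tp[label] / denom_r if denom_r else 0.0,
        }
    return metrics

# Toy run mirroring the "Financial Data" example: 40 chunks tagged
# Financial, 5 of them wrong -> precision 0.875
predicted = ["Financial"] * 40 + ["Other"] * 10
actual = ["Financial"] * 35 + ["Other"] * 5 + ["Financial"] * 5 + ["Other"] * 5
print(per_label_precision_recall(predicted, actual))
```

Reporting this per label, rather than one blended accuracy number, makes it obvious which categories need tighter rules or more training data.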
Use these metrics to refine definitions or adjust the labelling logic. Don’t just report one accuracy number; precision and recall per label tell the real story.

Retrieval Evaluation – “Can we find the right info when we ask?”
Many document pipelines use retrieval to narrow a huge file down to the few chunks most likely to contain the answer corresponding to a topic/entity. If we need a “termination date,” we first fetch chunks about dates or termination, then extract from those. Retrieval must be sharp, or everything downstream suffers.
Precision@K: How many of the top K retrieved chunks are actually relevant? If we grab 5 chunks for “Key Clauses” and 4 are correct, Precision@5 is 80%. We usually set K to whatever the next stage consumes (3 or 5). High precision keeps extraction clean. Average it across queries or fields. Critical fields may demand very high Precision@K.
Recall@K: Did we retrieve enough of the relevant chunks? If there are 2 relevant chunks in the doc but the top 5 results include only 1, recall is 50%. Good recall means we aren’t missing mentions in other sections or appendices. Increasing K improves recall but can dilute precision. Tune both together.
Ranking Quality (MRR, NDCG): If order matters, use rank-aware metrics.
MRR: Measures how early the first relevant result appears. Perfect if it’s always at rank 1.
NDCG@K: Rewards having the most relevant chunks at the top. Useful when relevance isn’t binary.
Most pipelines can get away with Precision@K and maybe MRR.
Implication: Test 50 QA pairs from policy documents, retrieving 3 passages per query. Average Precision@3: 85%. Average Recall@3: 92%. MRR: 0.8. Suppose we notice “data retention” answers appear in appendices that sometimes rank low. We increase K to 5 for that query type. Precision@3 rises to 90%, and Recall@5 hits roughly 99%. Retrieval evaluation is a sanity check. If retrieval fails, extraction recall will tank no matter how good the extractor is.
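The retrieval metrics above (Precision@K, Recall@K, MRR) reduce to a few lines of code. This is a minimal sketch over toy chunk ids; the ids and helper names are illustrative.

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved chunk ids that are relevant."""
    return sum(1 for c in retrieved[:k] if c in relevant) / k

def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant chunk ids found in the top-k results."""
    top_k = set(retrieved[:k])
    return sum(1 for c in relevant if c in top_k) / len(relevant)

def mrr(results):
    """Mean Reciprocal Rank over queries.
    `results` is a list of (retrieved_ids, relevant_ids) pairs."""
    total = 0.0
    for retrieved, relevant in results:
        for rank, c in enumerate(retrieved, start=1):
            if c in relevant:
                total += 1.0 / rank
                break
    return total / len(results)

# Toy query: 5 retrieved chunks, 4 relevant -> Precision@5 = 0.8
retrieved = ["c1", "c2", "c3", "c4", "c9"]
relevant = {"c1", "c2", "c3", "c4", "c7"}
print(precision_at_k(retrieved, relevant, 5))  # → 0.8
print(recall_at_k(retrieved, relevant, 5))     # → 0.8
print(mrr([(retrieved, relevant)]))            # → 1.0 (first hit at rank 1)
```

Averaging these over a held-out query set (like the 50 QA pairs in the example) gives the dashboard numbers used to tune K per query type.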
Measure both so we know where the leak is. Also keep an eye on latency and cost if fancy re-rankers slow things down.

Extraction Accuracy Evaluation – “Did we get the right answers?”
Look at each field and measure how often we got the right value.
Precision: Of the values we extracted, what percent are correct? Use exact match, or a lenient version if small format shifts don’t matter. Report both when useful.
Recall: Out of all ground truth values, how many did we actually extract?
Per-field breakdown: Some fields will be easy (invoice numbers, dates), others messy (vendor names, free text). A simple table makes this obvious and shows where to focus improvements.
Error Analysis: Numbers don’t tell the whole story. Look at patterns: OCR mix-ups, bad date or amount formats, the wrong chunk retrieved upstream, misread tables. Find the recurring mistakes. That’s where the fixes live.
Holistic Metrics: If needed, compute overall precision/recall across all extracted fields. But per-field and record-level numbers are usually what matter to stakeholders.
Implication: Precision protects against wrong entries. Recall protects against missing data. Choose your balance based on risk: if false positives hurt more (wrong financial numbers), favour precision. If missing items hurt more (missing red-flag clauses), favour recall.

Continuous Improvement Loop with SME
Continuous improvement means treating evaluation as an ongoing feedback loop rather than a one-time check. Each phase’s errors point to concrete fixes, and every fix is re-measured to ensure accuracy moves in the right direction without breaking other components. The same framework also supports A/B testing alternative methods and monitoring real production data to detect drift or new document patterns. Because the evaluation stages are modular, they generalize well across domains such as contracts, financial documents, healthcare forms, or academic papers with only domain-specific tweaks.
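The per-field precision/recall breakdown described in the extraction phase above can be sketched as a small scorer over batches of documents. A minimal illustration with toy invoice fields; the function name, field names, and counting convention (a wrong value counts as both a false positive and a false negative) are assumptions, not from any specific pipeline.

```python
def field_metrics(extracted, ground_truth):
    """Per-field precision/recall over a batch of documents.
    `extracted` and `ground_truth` are lists of dicts, one per document,
    mapping field name -> value (an absent field means 'not extracted')."""
    stats = {}
    for ext, gold in zip(extracted, ground_truth):
        for field, true_val in gold.items():
            s = stats.setdefault(field, {"tp": 0, "fp": 0, "fn": 0})
            pred = ext.get(field)
            if pred is None:
                s["fn"] += 1          # missed a value that exists
            elif pred == true_val:
                s["tp"] += 1          # exact match
            else:
                s["fp"] += 1          # extracted, but wrong
                s["fn"] += 1          # and the true value was missed
    report = {}
    for field, s in stats.items():
        denom_p = s["tp"] + s["fp"]
        denom_r = s["tp"] + s["fn"]
        report[field] = {
            "precision": round(s["tp"] / denom_p, 3) if denom_p else 0.0,
            "recall": round(s["tp"] / denom_r, 3) if denom_r else 0.0,
        }
    return report

# Toy batch: doc 2 is missing its "total" field
extracted = [{"invoice_no": "INV-1", "total": "100"}, {"invoice_no": "INV-2"}]
ground_truth = [{"invoice_no": "INV-1", "total": "100"},
                {"invoice_no": "INV-2", "total": "250"}]
print(field_metrics(extracted, ground_truth))
```

Printing this as a table per field is exactly the "simple table" the breakdown calls for: easy fields stay near 1.0, and messy fields jump out immediately.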
Over time, this creates a stable, scalable, and measurable path toward higher accuracy, better robustness, and easier adaptation to new document types.

Conclusion
Building an end-to-end evaluation framework isn’t just about measuring accuracy; it’s about creating trust in the entire pipeline. By breaking the process into clear phases, defining robust ground truth, and applying precision/recall-driven metrics at every stage, we ensure that document processing systems are reliable, scalable, and adaptable. This structured approach not only highlights where improvements are needed but also enables continuous refinement through SME feedback and iterative testing. Ultimately, such a framework transforms evaluation from a one-time exercise into a sustainable practice, paving the way for higher-quality outputs across diverse domains.

Azure OpenAI Landing Zone reference architecture
In this article, delve into the synergy of Azure Landing Zones and Azure OpenAI Service, building a secure and scalable AI environment. Unpack the Azure OpenAI Landing Zone architecture, which integrates numerous Azure services for optimal AI workloads. Explore robust security measures and the significance of monitoring for operational success. This journey of deploying Azure OpenAI evolves alongside Azure's continual innovation.

From Large Semi-Structured Docs to Actionable Data: Reusable Pipelines with ADI, AI Search & OpenAI
Problem Space
Large semi-structured documents such as contracts, invoices, hospital tariff/rate cards, multi-page reports, and compliance records often carry essential information that is difficult to extract reliably with traditional approaches. Their layout can span several pages, the structure is rarely consistent, and related fields may appear far apart even though they must be interpreted together. This makes it hard not only to detect the right pieces of information but also to understand how those pieces relate across the document.
LLMs can help, but when documents are long and contain complex cross-references, they may still miss subtle dependencies or generate hallucinated information. That becomes risky in environments where small errors can cascade into incorrect decisions.
At the same time, these documents don’t change frequently, while the extracted data is used repeatedly by multiple downstream systems at scale. Because of this usage pattern, a RAG-style pipeline is often not ideal in terms of cost, latency, or consistency. Instead, organizations need a way to extract data once, represent it consistently, and serve it efficiently in a structured form to a wide range of applications, many of which are not conversational AI systems.
At this point, data stewardship becomes critical, because once information is extracted, it must remain accurate, governed, traceable, and consistent throughout its lifecycle. When the extracted information feeds compliance checks, financial workflows, risk models, or end-user experiences, the organization must ensure that the data is not just captured correctly but also maintained with proper oversight as it moves across systems. Any extraction pipeline that cannot guarantee quality, reliability, and provenance introduces long-term operational risk.
The core problem, therefore, is finding a method that handles the structural and relational complexity of large semi-structured documents, minimizes LLM hallucination risk, produces deterministic results, and supports ongoing data stewardship so that the resulting structured output stays trustworthy and usable across the enterprise.

Target Use Cases

The potential applications for an Intelligent Document Processing (IDP) pipeline differ across industries. Several industry-specific use cases are provided as examples to guide the audience in conceptualising and implementing solutions tailored to their unique requirements.

Hospital Tariff Digitization for Tariff-Invoice Reconciliation in Health Insurance
- Document types: Hospital tariff/rate cards, annexures/guidelines, pre-authorization guidelines etc.
- Technical Challenge: Charges for the same service might appear under different sections or for different hospital room types across different versions of tariff/rate cards. Table + free text mix, abbreviations, and cross-page references.
- Downstream usage: Reimbursement orchestration, claims adjudication.

Commercial Loan Underwriting in Banking
- Document types: Balance sheets, cash-flow statements, auditor reports, collateral documents.
- Technical Challenge: Ratios and covenants must be computed from fields located across pages. Contextual dependencies: "Net revenue excluding exceptional items" or footnotes that override values.
- Downstream usage: Loan decisioning models, covenant monitoring, credit scoring.

Procurement Contract Intelligence in Manufacturing
- Document types: Vendor agreements, SLAs, pricing annexures.
- Technical Challenge: Pricing rules defined across clauses that reference each other. Penalty and escalation conditions hidden inside nested sections.
- Downstream usage: Automated PO creation, compliance checks.

Regulatory Compliance Extraction
- Document types: GDPR/HIPAA compliance docs, audit reports.
- Technical Challenge: Requirements and exceptions buried across many sections. Extraction must be deterministic since compliance logic is strict.
- Downstream usage: Rule engines, audit workflows, compliance checklists.

Solution Approaches

Problem Statement

Across industries from finance and healthcare to legal and compliance, large semi-structured documents serve as the single source of truth for critical workflows. These documents often span hundreds of pages, mixing free text, tables, and nested references. Before any automation can validate transactions, enforce compliance, or perform analytics, this information must be transformed into a structured, machine-readable format.

The challenge isn't just size; it's complexity. Rules and exceptions are scattered, relationships span multiple sections, and formatting inconsistencies make naive parsing unreliable. Errors at this stage ripple downstream, impacting reconciliation, risk models, and decision-making. In short, the fidelity of this digitization step determines the integrity of every subsequent process. Solving this problem requires a pipeline that can handle structural diversity, preserve context, and deliver deterministic outputs at scale.

Challenges

There are many challenges which can arise while solving for such large, complex documents:
- The documents can have ~200-250 pages.
- The document structures and layouts can be extremely complex in nature. A document or a page may contain a mix of various layouts like tables, text blocks, figures etc.
- Sometimes a single table can stretch across multiple pages, but only the first page contains the table header, leaving the remaining pages without column labels.
- A topic on one page may be referenced from a different page, so there can be complex inter-relationships amongst different topics in the same document which need to be structured in a machine-readable format.
- The document can be semi-structured as well (some parts are structured; some parts are unstructured or free text).
- The downstream applications might not always be AI-assisted (they can be core analytics dashboards or existing enterprise legacy systems), so the structural storage of the digitized items from the documents needs to be very well thought out before moving ahead with the solution.

Motivation Behind High Level Approach

- A larger document (number of pages ~200) needs to be divided into smaller chunks so that it becomes readable and digestible (within context length) for the LLM.
- To make the content/input of the LLM truly context-aware, the references must be maintained across pages (for example, table headers of long and continuous tables need to be injected into those chunks which would have the tables without the headers).
- If a pre-defined set of topics/entities is being covered in the documents in consideration, then topic/entity-wise information needs to be extracted for making the system truly context-aware.
- Different chunks can cover a similar topic/entity, which becomes a search problem.
- The retrieval needs to happen for every topic/entity so that all information related to one topic/entity is in a single place, and as a result the downstream applications become efficient, scalable and reliable over time.

Sample Architecture and Implementation

Let's take a possible approach to demonstrate the feasibility of the following architecture, building on the motivation outlined above. The solution divides a large, semi-structured document into manageable chunks, making it easier to maintain context and references across pages. First, the document is split into logical sections. Then, OCR and layout extraction capture both text and structure, followed by structure analysis to preserve semantic relationships. Annotated chunks are labeled and grouped by entity, enabling precise extraction of items such as key-value pairs or table data.
As a result, the system efficiently transforms complex documents into structured, context-rich outputs ready for downstream analytics and automation.

Architecture Components

The key elements of the architecture diagram include components 1-6, which are code modules. Components 7 and 8 represent databases that store data chunks and extracted items, while component 9 refers to potential downstream systems that will use the structured data obtained from extraction.

1. Chunking: Break documents into smaller, logical sections such as pages or content blocks. Enables parallel processing and improves context handling for large files. Technology: Python-based chunking logic using pdf2image and PIL for image handling.
2. OCR & Layout Extraction: Convert scanned images into machine-readable text while capturing layout details like bounding boxes, tables, and reading order for structural integrity. Technology: Azure Document Intelligence or Microsoft Foundry Content Understanding Prebuilt Layout model combining OCR with deep learning for text, tables, and structure extraction.
3. Context-Aware Structural Analysis: Analyse the extracted layout to identify document components such as headers, paragraphs, and tables. Preserves semantic relationships for accurate interpretation. Technology: Custom Python logic leveraging OCR output to inject missing headers and summarize layout (row/column counts, sections per page).
4. Labelling: Assign entity-based labels to chunks according to a predefined schema or SME input. Helps filter irrelevant content and focus on meaningful sections. Technology: Azure OpenAI GPT-4.1-mini with NLI-style prompts for multi-class classification.
5. Entity-Wise Grouping: Organize chunks by entity type (e.g., invoice number, total amount) for targeted extraction. Reduces noise and improves precision in downstream tasks. Technology: Azure AI Search with Hybrid Search and Semantic Reranking for grouping relevant chunks.
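The chunking step can be sketched with plain page-index grouping. This is a minimal illustration, not the pipeline's actual implementation: the chunk size and the one-page overlap are illustrative assumptions, and in the real component each page would be rendered to an image with pdf2image before grouping.

```python
# Minimal sketch of the chunking step: group a document's pages into fixed-size
# chunks with a one-page overlap, so context that straddles a page boundary
# (e.g. a table continuation) is visible in two adjacent chunks.
# Pages are represented by their indices to keep the sketch dependency-free;
# chunk size and overlap below are illustrative, not prescribed by the pipeline.

def chunk_pages(num_pages: int, pages_per_chunk: int = 4, overlap: int = 1):
    """Return a list of page-index lists, each at most `pages_per_chunk` long."""
    if pages_per_chunk <= overlap:
        raise ValueError("pages_per_chunk must exceed overlap")
    chunks, start, step = [], 0, pages_per_chunk - overlap
    while start < num_pages:
        chunks.append(list(range(start, min(start + pages_per_chunk, num_pages))))
        if start + pages_per_chunk >= num_pages:
            break
        start += step
    return chunks

# Example: a 10-page document in chunks of 4 pages with 1 page of overlap.
print(chunk_pages(10))  # [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```

For a ~200-page document this yields roughly num_pages / (pages_per_chunk − overlap) chunks, each small enough to stay well within an LLM's context window.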
6. Item Extraction: Extract specific values such as key-value pairs, line items, or table data from grouped chunks. Converts semi-structured content into structured fields. Technology: Azure OpenAI GPT-4.1-mini with Set-of-Marking style prompts using layout clues (row × column, headers, OCR text).
7. Interim Chunk Storage: Store chunk-level data including OCR text, layout metadata, labels, and embeddings. Supports traceability, semantic search, and audit requirements. Technology: Azure AI Search for chunk indexing and Azure OpenAI Embedding models for semantic retrieval.
8. Document Store: Maintain final extracted items with metadata and bounding boxes. Enables quick retrieval, validation, and integration with enterprise systems. Technology: Azure Cosmos DB, Azure SQL DB, Azure AI Search, or Microsoft Fabric depending on downstream needs (analytics, APIs, LLM apps).
9. Downstream Integration: Deliver structured outputs (JSON, CSV, or database records) to business applications or APIs. Facilitates automation and analytics across workflows. Technology: REST APIs, Azure Functions, or Data Pipelines integrated with enterprise systems.

Algorithms

Consider these key algorithms when implementing the components above:

Structural Analysis – Inject headers: Detect tables page by page; compare the last row of a table on page i with the first row of a table on page i+1. If column counts match and ≥4/5 style features (Font Weight, Background Colour, Font Style, Foreground Colour, Similar Font Family) match, mark it as a continuous table (header missing) and inject the previous page's header into the next page's table, repeating across pages.

Labelling – Prompting Guide: Run NLI checks per SOC chunk image (grounded on OCR text) across N curated entity labels, return {decision ∈ {ENTAILED, CONTRADICTED, NEUTRAL}, confidence ∈ [0,1]}, and output only labels where decision = ENTAILED and confidence > 0.7.
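The header-injection heuristic above can be sketched as follows. The row/cell data model and the style-feature field names are assumptions made for illustration; in the actual component these values come from the Azure Document Intelligence layout output.

```python
# Sketch of the "inject headers" heuristic: two tables on consecutive pages are
# treated as one continuous table when their boundary rows have equal column
# counts and at least 4 of 5 style features agree across all aligned cells.
# The cell dict schema below is an illustrative assumption.

STYLE_FEATURES = ("font_weight", "background_colour", "font_style",
                  "foreground_colour", "font_family")

def cell(text, bold=False, bg="white"):
    """Build an illustrative table cell with text plus five style features."""
    return {"text": text,
            "font_weight": "bold" if bold else "normal",
            "background_colour": bg,
            "font_style": "normal",
            "foreground_colour": "black",
            "font_family": "Arial"}

def is_continuation(prev_last_row, next_first_row, min_matches=4):
    """True if column counts match and >=4/5 style features agree row-wide."""
    if len(prev_last_row) != len(next_first_row):
        return False
    matching = sum(
        1 for f in STYLE_FEATURES
        if all(a.get(f) == b.get(f) for a, b in zip(prev_last_row, next_first_row))
    )
    return matching >= min_matches

def inject_headers(page_tables):
    """page_tables: one table (list of rows) per page. Propagate the header row
    of a table into continuation tables on subsequent pages."""
    for i in range(1, len(page_tables)):
        prev, curr = page_tables[i - 1], page_tables[i]
        if prev and curr and is_continuation(prev[-1], curr[0]):
            curr.insert(0, prev[0])  # prev[0] is the (possibly injected) header
    return page_tables
```

Because the injected header becomes row 0 of each continuation table, the propagation repeats naturally across an arbitrarily long chain of pages, matching the "repeating across pages" step in the description.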
Entity-Wise Grouping – Querying Chunks per Entity & Top-50 Handling: Construct the query from the entity text and apply hybrid search with label filters for Azure AI Search, starting with chunks where the target label is sole, then expanding to observed co-occurrence combinations under a cap to prevent explosion. If label frequency > 50, run staged queries (sole-label → capped co-label combos); otherwise use a single hybrid search with semantic reranking. Merge results and deduplicate before scoring.

Entity-Wise Grouping – Chunk-to-Entity Relevance Scoring: For each retrieved chunk, split the text into spans; compute cosine similarities to the entity text and take the mean s. Boost with a gated nonlinearity b = σ(k(s − m))·s, where σ is the sigmoid function and k, m are tunables that emphasize mid-range relevance while suppressing very low s. Min–max normalize the reranker score r → r_norm; compute the final score F = α·b + (1 − α)·r_norm, and keep the chunk iff F ≥ τ.

Item Extraction – Prompting Guide: Provide the chunk image as input and ground on visual structure (tables, headers, gridlines, alignment, typography) and document structural metadata to segment and align units; reconcile ambiguities via OCR-extracted text, then enumerate associations by positional mapping (header ↔ column, row ↔ cell proximity) and emit normalized objects while filtering narrative/policy text by layout and pattern cues.

Deployment at Scale

There are several ways to implement a document extraction pipeline, each with its own pros and cons. The best deployment model depends on scenario requirements. Below are some common approaches with their advantages and disadvantages.

Host as REST API
- Pros: Enables straightforward creation, customization, and deployment across scalable compute services such as Azure Kubernetes Service.
- Cons: Processing time and memory usage scale with document size and complexity, potentially requiring multiple iterations to optimize performance.
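The relevance-scoring formula can be written out directly in plain Python. This is a sketch under stated assumptions: the span embeddings here are stand-in vectors (in the pipeline they would come from an Azure OpenAI embedding model), and the k, m, alpha, and tau values are illustrative tunables, not the values used in production.

```python
# Sketch of chunk-to-entity relevance scoring:
#   s      = mean cosine similarity of a chunk's spans to the entity text
#   b      = sigmoid(k * (s - m)) * s     (gated boost; suppresses low-s chunks)
#   F      = alpha * b + (1 - alpha) * r_norm, keep the chunk iff F >= tau
# k, m, alpha, tau below are illustrative tunables.

import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def chunk_score(span_embeddings, entity_embedding, r_norm,
                k=10.0, m=0.5, alpha=0.7):
    """Final score F for one chunk, given its span embeddings, the entity
    embedding, and the min-max normalized reranker score r_norm."""
    s = sum(cosine(e, entity_embedding)
            for e in span_embeddings) / len(span_embeddings)
    b = sigmoid(k * (s - m)) * s
    return alpha * b + (1 - alpha) * r_norm

def keep_chunk(score, tau=0.35):
    return score >= tau
```

The gate σ(k(s − m)) behaves as described: for s well above m it approaches 1 (so b ≈ s), while for very low s it pushes b toward 0 even before the multiplication by s, which is what suppresses weakly related chunks.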
Deploy as Azure Machine Learning (ML) Pipeline
- Pros: Facilitates efficient time and memory management, as Azure ML supports processing large datasets at scale.
- Cons: The pipeline may be more challenging to develop, customize, and maintain.

Deploy as Azure Databricks Job
- Pros: Offers robust time and memory management similar to Azure ML, with advanced features such as Data Autoloader for detecting data changes and triggering pipeline execution.
- Cons: The solution is highly tailored to Azure Databricks and may have limited customization options.

Deploy as Microsoft Fabric Pipeline
- Pros: Provides capabilities comparable to Azure ML and Databricks, and features like Fabric Activator replicate Databricks Autoloader functionality.
- Cons: Presents similar limitations found in the Azure ML and Azure Databricks approaches.

Each method should be carefully evaluated to ensure alignment with technical and operational requirements.

Evaluation

Objective: The aim is to evaluate how accurately a document extraction pipeline extracts information by comparing its output with manually verified data.

Approach: Documents are split into sections, labelled, and linked to relevant entities; then, AI tools extract key items through the pipeline outlined above. The extracted data is checked against expert-curated records using both exact and approximate matching techniques.

Key Metrics:
- Individual Item Attribute Match: Assesses the system's ability to identify specific item attributes using strict and flexible comparison methods.
- Combined Item Attribute Match: Evaluates how well multiple attributes are identified together, considering both exact and fuzzy matches.
- Precision Calculation: Precision for each metric reflects the proportion of correctly matched entries compared to all reference entries.

Findings for a real-world scenario: Fuzzy matching of item key attributes yields high precision (over 90%), but accuracy drops for key attribute combinations (between 43% and 48%).
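The exact/fuzzy matching and precision calculation can be sketched with the standard library alone. The 0.85 similarity threshold and the sample tariff values are illustrative assumptions, not figures from the evaluated datasets.

```python
# Sketch of the evaluation step: match extracted attribute values against
# expert-curated reference values using exact and fuzzy comparison, then report
# precision = matched reference entries / all reference entries.
# The 0.85 threshold and the sample values are illustrative assumptions.

from difflib import SequenceMatcher

def fuzzy_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """Exact match after normalization, or character-level similarity >= threshold."""
    a, b = a.strip().lower(), b.strip().lower()
    return a == b or SequenceMatcher(None, a, b).ratio() >= threshold

def attribute_precision(extracted, reference, matcher=fuzzy_match):
    """Fraction of reference values for which some extracted value matches."""
    if not reference:
        return 0.0
    matched = sum(1 for ref in reference
                  if any(matcher(ext, ref) for ext in extracted))
    return matched / len(reference)

extracted = ["MRI Brain (plain)", "X-Ray Chest", "CT Abdomen"]
reference = ["MRI Brain Plain", "X-ray Chest", "Ultrasound Abdomen"]
print(attribute_precision(extracted, reference))  # 2 of 3 matched -> 0.666...
```

Combined-attribute precision follows the same shape, except the matcher compares whole attribute tuples, which is why its precision drops relative to individual attributes: every member of the tuple must match for the entry to count.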
These results come from analysis across several datasets to ensure reliability.

How This Addresses the Problem Statement

The sample architecture described integrates sectioning, entity linking, and attribute extraction as foundational steps. Each extracted item is then evaluated against expert-curated datasets using both strict (exact) and flexible (fuzzy) matching algorithms. This approach directly addresses the problem statement by providing measurable metrics, such as individual and combined attribute match rates and precision calculations, that quantify the system's reliability and highlight areas for improvement. Ultimately, this methodology ensures that the pipeline's output is systematically validated, and its strengths and limitations are clearly understood in real-world contexts.

Plausible Alternative Approaches

No single approach fits every use case; the best method depends on factors like document complexity, structure, sensitivity, and length, as well as the downstream application types. Consider these alternative approaches for different scenarios.

Using Azure OpenAI alone
- Article: Best Practices for Structured Extraction from Documents Using Azure OpenAI

Using Azure OpenAI + Azure Document Intelligence + Azure AI Search: RAG-like solution
- Article 1: Document Field Extraction with Generative AI
- Article 2: Complex Data Extraction using Document Intelligence and RAG
- Article 3: Design and develop a RAG solution

Using Azure OpenAI + Azure Document Intelligence + Azure AI Search: Non-RAG-like solution
- Article: Using Azure AI Document Intelligence and Azure OpenAI to extract structured data from documents
- GitHub Repository: Content processing solution accelerator

Conclusion

Intelligent Document Processing for large semi-structured documents isn't just about extracting data; it's about building trust in that data.
By combining Azure Document Intelligence for layout-aware OCR with OpenAI models for contextual understanding, we create a well-thought-out, in-depth pipeline that is accurate, scalable, and resilient against complexity. Chunking strategies ensure context fits within model limits, while header injection and structural analysis preserve relationships across pages to make it context-aware. Entity-based grouping and semantic retrieval transform scattered content into organized, query-ready data. Finally, rigorous evaluation with a scalable ground-truth strategy roadmap, using precision, recall, and fuzzy matching, closes the loop, ensuring reliability for downstream systems.

This pattern delivers more than automation; it establishes a foundation for compliance, analytics, and AI-driven workflows at enterprise scale. In short, it's a blueprint for turning chaotic documents into structured intelligence: efficient, governed, and future-ready for any kind of downstream application.

End-to-End Evaluation Approaches Guidance

Given the complexity of this system, it should undergo a thorough end-to-end evaluation to ensure correctness, robustness, and performance across the pipeline. Continuous monitoring and observability of these metrics will enable iterative improvements and help the system scale reliably as requirements evolve. If you would like to read more about the end-to-end evaluation approaches guidance, please refer to our tech community blog: From Large Semi-Structured Docs to Actionable Data: In-Depth Evaluation Approaches Guidance.

References
- Azure Content Understanding in Foundry Tools
- Azure Document Intelligence in Foundry Tools
- Azure OpenAI in Microsoft Foundry models
- Azure AI Search
- Azure Machine Learning (ML) Pipelines
- Azure Databricks Job
- Microsoft Fabric Pipeline