azure ai document intelligence
72 Topics

Build Intelligent RAG For Multimodality and Complex Document Structure
Struggling to implement RAG for documents with multiple tables, figures, and plots, or for complex structured scanned documents? Looking for a solution? In this blog you will learn how to implement an end-to-end solution using Azure AI services, including Document Intelligence, AI Search, Azure OpenAI, and our favorite, LangChain 🙂 I have taken advantage of the multimodal capability of the Azure OpenAI GPT-4-Vision model for figures and plots.
17K Views, 9 likes, 5 Comments

Analyze complex documents with Azure Document Intelligence Markdown Output and Azure OpenAI
In today’s digital era, where data is the new gold, efficiently extracting and processing information from complex documents, including those with dynamic tables, is crucial for businesses. Microsoft’s Azure AI services offer robust solutions for tackling these challenges, especially through the Document Intelligence Layout model. In this blog post, we will explore how you can use markdown output to enhance the capabilities of the Azure Document Intelligence Layout model and then feed this refined data into Azure AI for comprehensive information extraction.
23K Views, 8 likes, 1 Comment

Document Generative AI: the Power of Azure AI Document Intelligence & Azure OpenAI Service Combined
Embrace the future of document processing and take your enterprise data to new heights with Document generative AI, a cutting-edge solution that will revolutionize the way you work with documents.
70K Views, 8 likes, 2 Comments

Azure Form Recognizer is now Azure AI Document Intelligence with new and updated capabilities
Azure Form Recognizer is now Azure AI Document Intelligence! Check out the latest updates coming to the service, which include new capabilities for classification and updates to all capabilities, including structure extraction from documents, prebuilt models, and custom models. Document Intelligence provides you with the tools to build your document-centric solutions.
49K Views, 7 likes, 1 Comment

How Copilot helps developers generate code for a Form Recognizer application
Learn how OpenAI-powered GitHub Copilot automatically generates code from natural language sentences to import libraries, make API connections to Azure Form Recognizer, recognize contents on an input receipt image, and print out fields and values from the receipt image.
8.9K Views, 7 likes, 1 Comment

Azure AI Document Intelligence now previewing field extraction with Generative AI and more
The latest preview release from Azure AI Document Intelligence adds Generative AI based field extraction; new prebuilt models for bank statements, pay stubs, and mortgage forms; a unified tax prebuilt model; a searchable PDF response; and service updates that simplify document processing at scale with a batch API and a unified classification and extraction API.
13K Views, 6 likes, 1 Comment

Seamlessly Integrating Azure Document Intelligence with Azure API Management (APIM)
In today’s data-driven world, organizations are increasingly turning to AI for document understanding. Whether it's extracting invoices, contracts, ID cards, or complex forms, Azure Document Intelligence (formerly known as Form Recognizer) provides a robust, AI-powered solution for automated document processing.

But what happens when you want to scale, secure, and load balance your document intelligence backend for high availability and enterprise-grade integration? Enter Azure API Management (APIM), your gateway to efficient, scalable API orchestration.

In this blog, we’ll explore how to integrate Azure Document Intelligence with APIM using a load-balanced architecture that works seamlessly with the Document Intelligence SDK, without rewriting your application logic.

Azure Document Intelligence SDKs simplify working with long-running document analysis operations, particularly asynchronous calls, by handling the polling and response parsing under the hood.

Why Use API Management with Document Intelligence?
While the SDK is great for client-side development, APIM adds essential capabilities for enterprise-scale deployments:
🔐 Security & authentication at the gateway level
⚖️ Load balancing across multiple backend instances
🔁 Circuit breakers, caching, and retries
📊 Monitoring and analytics
🔄 Response rewriting and dynamic routing
By routing all SDK and API calls through APIM, you get full control over traffic flow, visibility into usage patterns, and the ability to scale horizontally with multiple Document Intelligence backends.

SDK Behavior with Document Intelligence
When you call the Document Intelligence SDK (e.g., begin_analyze_document), it follows this two-step pattern:
1. A POST request to initiate document analysis
2. Polling (GET) requests to the operation-location URL until results are ready
This is an asynchronous pattern where the SDK expects a polling URL in the response to the POST.
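The two-step pattern above can be sketched in a few lines of Python. This is only an illustration of the POST-then-poll shape, not the SDK's actual implementation: `analyze_and_poll` and `FakeBackend` are hypothetical names, the in-memory backend stands in for a real service, and the 202 status code and the `running`/`succeeded`/`failed` status values are assumptions about the service's behavior.

```python
import time
from typing import Callable, Dict, List, Tuple

# Assumed terminal states of an analysis operation.
TERMINAL_STATUSES = {"succeeded", "failed"}


def analyze_and_poll(
    post: Callable[[], Tuple[int, Dict[str, str]]],
    get: Callable[[str], Dict[str, str]],
    interval_seconds: float = 0.0,
    max_polls: int = 20,
) -> Tuple[str, Dict[str, str]]:
    """POST once, then GET the operation-location URL until the job finishes."""
    status_code, headers = post()
    if status_code != 202:  # assumed "accepted" response for the initial POST
        raise RuntimeError(f"unexpected status code {status_code}")
    # The client polls whatever URL this header contains.
    poll_url = headers["operation-location"]
    for _ in range(max_polls):
        body = get(poll_url)
        if body["status"] in TERMINAL_STATUSES:
            return poll_url, body
        time.sleep(interval_seconds)
    raise TimeoutError("operation did not finish within max_polls")


class FakeBackend:
    """Hypothetical in-memory stand-in for a Document Intelligence backend."""

    def __init__(self, polls_until_done: int = 2) -> None:
        self.remaining = polls_until_done
        self.polled_urls: List[str] = []

    def post(self) -> Tuple[int, Dict[str, str]]:
        return 202, {"operation-location": "https://doc-intel-backend-1/operations/123"}

    def get(self, url: str) -> Dict[str, str]:
        self.polled_urls.append(url)
        if self.remaining > 0:
            self.remaining -= 1
            return {"status": "running"}
        return {"status": "succeeded"}


backend = FakeBackend()
poll_url, result = analyze_and_poll(backend.post, backend.get)
print(poll_url, result["status"], len(backend.polled_urls))
```

Note that the polling target comes entirely from the operation-location header, so whoever controls that header controls where the follow-up GET requests go.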
If you’re not careful, this polling can bypass APIM, which defeats the purpose of using APIM in the first place. So what do we do?

The Smart Rewrite Strategy
We use APIM to intercept and rewrite the response from the POST call.

POST Flow
1. SDK sends a POST to: https://apim-host/analyze
2. APIM routes the request to one of the backend services: https://doc-intel-backend-1/analyze
3. Backend responds with: operation-location: https://doc-intel-backend-1/operations/123
4. APIM rewrites this header before returning to the client: operation-location: https://apim-host/operations/123?backend=doc-intel-backend-1
Now, the SDK will automatically poll APIM, not the backend directly.

GET (Polling) Flow
Set the path of the GET operation in APIM so that it matches /operations/123.
1. SDK polls: https://apim-host/operations/123?backend=doc-intel-backend-1
2. APIM extracts the query parameter backend=doc-intel-backend-1
3. APIM dynamically sets the backend URL for this request to: https://doc-intel-backend-1
4. It forwards the request to: https://doc-intel-backend-1/operations/123
5. Backend sends the status/result back to APIM, which APIM returns to the SDK.
All of this happens transparently to the SDK.

Sample policies
The sample policies apply the same idea, but pass the full, URL-encoded backend operation-location in a backendUrl query parameter to a single poller endpoint.

//Outbound policies for POST - /documentintelligence/documentModels/prebuilt-read:analyze
//---------------------------------------------------------------------------------------------------
<!--
    - Policies are applied in the order they appear.
    - Position <base/> inside a section to inherit policies from the outer scope.
    - Comments within policies are not preserved.
-->
<!-- Add policies as children to the <inbound>, <outbound>, <backend>, and <on-error> elements -->
<policies>
    <!-- Throttle, authorize, validate, cache, or transform the requests -->
    <inbound>
        <base />
    </inbound>
    <!-- Control if and how the requests are forwarded to services -->
    <backend>
        <base />
    </backend>
    <!-- Customize the responses -->
    <outbound>
        <base />
        <set-header name="operation-location" exists-action="override">
            <value>@{
                // Original operation-location from backend
                var originalOpLoc = context.Response.Headers.GetValueOrDefault("operation-location", "");
                // Encode original URL to pass as query parameter
                var encoded = System.Net.WebUtility.UrlEncode(originalOpLoc);
                // Construct APIM URL pointing to poller endpoint with backendUrl
                var apimUrl = $"https://tstmdapim.azure-api.net/document-intelligent/poller?backendUrl={encoded}";
                return apimUrl;
            }</value>
        </set-header>
    </outbound>
    <!-- Handle exceptions and customize error responses -->
    <on-error>
        <base />
    </on-error>
</policies>

//Inbound policies for GET (note: the path for the GET operation should be set to /document-intelligent/poller)
//----------------------------------------------------------------------------------------------
<!--
    - Policies are applied in the order they appear.
    - Position <base/> inside a section to inherit policies from the outer scope.
    - Comments within policies are not preserved.
-->
<!-- Add policies as children to the <inbound>, <outbound>, <backend>, and <on-error> elements -->
<policies>
    <!-- Throttle, authorize, validate, cache, or transform the requests -->
    <inbound>
        <base />
        <choose>
            <when condition="@(context.Request.Url.Query.ContainsKey("backendUrl"))">
                <set-variable name="decodedUrl" value="@{
                    var backendUrlEncoded = context.Request.Url.Query.GetValueOrDefault("backendUrl", "");
                    // Decode the URL, repeating only while another pass still changes the value,
                    // so that a literal % in the URL cannot cause an endless loop
                    var decoded = System.Net.WebUtility.UrlDecode(backendUrlEncoded);
                    var next = System.Net.WebUtility.UrlDecode(decoded);
                    while (next != decoded)
                    {
                        decoded = next;
                        next = System.Net.WebUtility.UrlDecode(decoded);
                    }
                    return decoded;
                }" />
                <!-- Log the decoded URL for debugging; remove if not needed -->
                <trace source="Decoded URL">@((string)context.Variables["decodedUrl"])</trace>
                <send-request mode="new" response-variable-name="backendResponse" timeout="30" ignore-error="false">
                    <set-url>@((string)context.Variables["decodedUrl"])</set-url>
                    <set-method>GET</set-method>
                    <authentication-managed-identity resource="https://cognitiveservices.azure.com/" />
                </send-request>
                <return-response response-variable-name="backendResponse" />
            </when>
            <otherwise>
                <return-response>
                    <set-status code="400" reason="Missing backendUrl query parameter" />
                    <set-body>{"error": "Missing backendUrl query parameter."}</set-body>
                </return-response>
            </otherwise>
        </choose>
    </inbound>
    <!-- Control if and how the requests are forwarded to services -->
    <backend>
        <base />
    </backend>
    <!-- Customize the responses -->
    <outbound>
        <base />
    </outbound>
    <!-- Handle exceptions and customize error responses -->
    <on-error>
        <base />
    </on-error>
</policies>

Load Balancing in APIM
You can configure multiple backend services in APIM and use built-in load-balancing policies to:
- Distribute POST requests across multiple Document Intelligence instances
- Use custom headers or variables to control backend selection
- Handle failure scenarios
with circuit-breakers and retries

Reference: Azure API Management backends – Microsoft Learn
Sample: Using APIM Circuit Breaker & Load Balancing – Microsoft Community Hub

Conclusion
By integrating Azure Document Intelligence with native Azure API Management capabilities such as load balancing, header rewriting, authentication, and rate-limiting policies, organizations can transform their document processing workflows into scalable, secure, and efficient systems.

1.3K Views, 5 likes, 17 Comments
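To make the header rewrite from the sample policies concrete, here is a minimal Python sketch of the same round trip: encode the backend's operation-location into a backendUrl query parameter on the way out, and recover it on the polling GET. It only mirrors the logic of the policies and is not how APIM runs them. The function names, the `apim-host` poller address, and the `api-version=example` query string are hypothetical and chosen for illustration.

```python
from urllib.parse import parse_qs, quote, urlsplit

# Hypothetical APIM poller endpoint; substitute your own gateway host and path.
APIM_POLLER = "https://apim-host/document-intelligent/poller"


def rewrite_operation_location(backend_url: str) -> str:
    """Outbound step: point the client at APIM, carrying the backend URL along."""
    # safe="" also encodes ":", "/", "?" and "=" so the backend URL's own
    # query string cannot be mistaken for parameters of the poller URL.
    return f"{APIM_POLLER}?backendUrl={quote(backend_url, safe='')}"


def resolve_backend_url(poll_url: str) -> str:
    """Inbound step: recover the backend URL from the client's polling request."""
    query = parse_qs(urlsplit(poll_url).query)  # parse_qs decodes exactly once
    values = query.get("backendUrl")
    if not values:
        raise ValueError("Missing backendUrl query parameter")
    return values[0]


# Example backend operation URL; the api-version value is a made-up placeholder.
original = "https://doc-intel-backend-1/operations/123?api-version=example"
rewritten = rewrite_operation_location(original)
restored = resolve_backend_url(rewritten)
print(rewritten)
print(restored)
```

Encoding the whole backend URL keeps the poller stateless: any gateway instance can serve the GET without remembering which backend handled the original POST.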