azure foundry
4 TopicsModel Mondays S2E10: Automating Document Processing with AI
1. Weekly Highlights We kicked off with the top news and updates in the Azure AI ecosystem: Agent Factory Blog Series: A new 6-part blog series on designing reliable, agentic AI—exploring multi-step, collaborative agents that reflect, plan, and adapt using tool integrations and design patterns. Text PII Preview in Azure AI Language: Now redacts PII (like date of birth, license plates) in major European languages, with better accuracy for UK bank entities. Claude Opus 4.1 in Copilot Pro & Enterprise: Public preview brings smarter summaries, tool assistant thinking, and "Ask Mode" in VS Code.Now leverages stronger computer vision algorithms for table parsing—achieving 94-97% accuracy across Latin, Chinese, Japanese, and Korean—with sub-10ms latency. Mistral Document AI in Azure Foundry: Instantly turn PDFs, contracts, and scanned docs into structured JSON with tables, headings, and LaTeX support. Serverless, multilingual, secure, and perfect for regulated industries. 2. Spotlight On: Document Intelligence with Azure & Mistral This week’s spotlight was a hands-on exploration of document processing, featuring both Microsoft and Mistral AI experts. Why Document Processing? Unstructured data—receipts, forms, handwritten notes—are everywhere. Modern document AI can extract, structure, and even annotate this data, fueling everything from search to RAG pipelines. Azure Document Intelligence: State-of-the-art OCR and table extraction with super-high accuracy and speed. Handles multi-language, complex layouts, and returns structured outputs ready for programmatic use. Mistral Document AI: Transforms PDFs and scanned docs into JSON, retaining complex formatting, tables, images, and even LaTeX. Supports custom schema extraction, image/document annotations, and returns everything in one API call. Integrates seamlessly with Azure AI Foundry and developer workflows. Demo Highlights: Extracting Receipts: OCR accurately pulls out store, date, and transaction details from photos. Handwriting Recognition: Even historical documents (like Thomas Jefferson’s letters) are parsed with surprising accuracy. Tables & Structured Data: Financial statements and reports converted into structured markdown and JSON—ready for downstream apps. Advanced Annotations: Define your own schema (via JSON Schema or Pydantic), extract custom fields, classify images, summarize documents, and even translate summaries—all in a single call. 3. Customer Story: Oracle Health Oracle Health shared how agentic AI and fine-tuned models are revolutionizing clinical workflows: Problem: Clinicians spend hours on documentation, searching records, and manual data entry—reducing time for patient care. Solution: Oracle’s clinical AI agents automate chart reviews, data extraction, and even conversational Q&A—while keeping humans in the loop for safety. Technical Highlights: Multi-agent architecture understands provider specialty and context. Orchestrator model "routes" requests to the right agent or plugin, extracting needed arguments from context. Fine-tuning was key: For low latency, Oracle used lightweight models (like GPT-4 Mini) and fine-tuned on their data—achieving sub-800ms responses, with accuracy matching larger models. Fine-tuning also allowed for nuanced tool selection, argument extraction, and rule-based orchestration—better than prompt engineering alone. Used LoRA for efficient, targeted fine-tuning without erasing base model knowledge. Live Demo: Agent summarizes patient history, retrieves lab results, filters for abnormals, and answers follow-up questions—all conversationally. Fine-tuned orchestrator chooses the right tool and context for each doctor’s workflow. Result: 1-2 hours saved per day, more time for patients, and happier doctors! 4. Key Takeaways Here are the key learnings from this episode: Document AI is Production-Ready: Azure Document Intelligence and Mistral Document AI offer fast, accurate, and customizable document parsing for real enterprise needs. Schema-Driven Extraction & Annotation: Define your own schemas and extract exactly what you want—no more one-size-fits-all. Fine-Tuning Unlocks Performance: For low latency and high accuracy, fine-tuning lightweight models beats prompt engineering in complex, rule-based agent workflows. Agentic Workflows in Action: Multi-agent systems can automate complex tasks, route requests, and keep humans in control, especially in regulated domains like healthcare. Community & Support: Join the Discord and Forum to ask questions, share use cases, and connect with the team. Sharda's Tips: How I Wrote This Blog Writing this recap is all about sharing what I learned and making it practical for the community! I start by organizing the key highlights, then walk through customer stories and demos, using simple language and real-world examples. Copilot helps me structure and clarify my notes, especially when summarizing technical sections. Here’s the prompt I used for Copilot this week: "Generate a technical blog post for Model Mondays S2E10 based on the transcript and episode details. Focus on document processing with Azure AI and Mistral, include customer demos, and highlight practical workflows and fine-tuning. Make it clear and approachable for developers and students." Every episode inspires me to try these tools myself, and I hope this blog makes it easy for you to start, too. If you have questions or want to share your own experience, I’d love to hear from you! Coming Up Next Week Next week: Text & Speech AI Playgrounds! Learn how to build and test language and speech models, with live demos and expert guests. | Register For The Livestream – Aug 25, 2025 | Register For The AMA – Aug 29, 2025 | Ask Questions & View Recaps – Discussion Forum About Model Mondays Model Mondays is a weekly series to build your Azure AI IQ with: 5-Minute Highlights: News & updates on Mondays 15-Minute Spotlight: Deep dives into new features, models, and protocols 30-Minute AMA Fridays: Live Q&A with product teams and experts Get started: Register For Livestreams Watch Past Replays Register For AMA Recap Past AMAs Join The Community Don’t build alone! Join the Azure AI Developer Community for real-time chats, events, support, and more: Join the Discord Explore the Forum About Me I'm Sharda, a Gold Microsoft Learn Student Ambassador focused on cloud and AI. Find me on GitHub, Dev.to, Tech Community, and LinkedIn. In this blog series, I share takeaways from each week’s Model Mondays livestream.197Views0likes0CommentsSmart Auditing: Leveraging Azure AI Agents to Transform Financial Oversight
In today's data-driven business environment, audit teams often spend weeks poring over logs and databases to verify spending and billing information. This time-consuming process is ripe for automation. But is there a way to implement AI solutions without getting lost in complex technical frameworks? While tools like LangChain, Semantic Kernel, and AutoGen offer powerful AI agent capabilities, sometimes you need a straightforward solution that just works. So, what's the answer for teams seeking simplicity without sacrificing effectiveness? This tutorial will show you how to use Azure AI Agent Service to build an AI agent that can directly access your Postgres database to streamline audit workflows. No complex chains or graphs required, just a practical solution to get your audit process automated quickly. The Auditing Challenge: It's the month end, and your audit team is drowning in spreadsheets. As auditors reviewing financial data across multiple SaaS tenants, you're tasked with verifying billing accuracy by tracking usage metrics like API calls, storage consumption, and user sessions in Postgres databases. Each tenant generates thousands of transactions daily, and traditionally, this verification process consumes weeks of your team's valuable time. Typically, teams spend weeks: Manually extracting data from multiple database tables. Cross-referencing usage with invoices. Investigating anomalies through tedious log analysis. Compiling findings into comprehensive reports. With an AI-powered audit agent, you can automate these tasks and transform the process. Your AI assistant can: Pull relevant usage data directly from your database Identify billing anomalies like unexpected usage spikes Generate natural language explanations of findings Create audit reports that highlight key concerns For example, when reviewing a tenant's invoice, your audit agent can query the database for relevant usage patterns, summarize anomalies, and offer explanations: "Tenant_456 experienced a 145% increase in API usage on April 30th, which explains the billing increase. This spike falls outside normal usage patterns and warrants further investigation." Let’s build an AI agent that connects to your Postgres database and transforms your audit process from manual effort to automated intelligence. Prerequisites: Before we start building our audit agent, you'll need: An Azure subscription (Create one for free). The Azure AI Developer RBAC role assigned to your account. Python 3.11.x installed on your development machine. OR You can also use GitHub Codespaces, which will automatically install all dependencies for you. You’ll need to create a GitHub account first if you don’t already have one. Setting Up Your Database: For this tutorial, we'll use Neon Serverless Postgres as our database. It's a fully managed, cloud-native Postgres solution that's free to start, scales automatically, and works excellently for AI agents that need to query data on demand. Creating a Neon Database on Azure: Open the Neon Resource page on the Azure portal Fill out the form with the required fields and deploy your database After creation, navigate to the Neon Serverless Postgres Organization service Click on the Portal URL to access the Neon Console Click "New Project" Choose an Azure region Name your project (e.g., "Audit Agent Database") Click "Create Project" Once your project is successfully created, copy the Neon connection string from the Connection Details widget on the Neon Dashboard. It will look like this: postgresql://[user]:[password]@[neon_hostname]/[dbname]?sslmode=require Note: Keep this connection string saved; we'll need it shortly. Creating an AI Foundry Project on Azure: Next, we'll set up the AI infrastructure to power our audit agent: Create a new hub and project in the Azure AI Foundry portal by following the guide. Deploy a model like GPT-4o to use with your agent. Make note of your Project connection string and Model Deployment name. You can find your connection string in the overview section of your project in the Azure AI Foundry portal, under Project details > Project connection string. Once you have all three values on hand: Neon connection string, Project connection string, and Model Deployment Name, you are ready to set up the Python project to create an Agent. All the code and sample data are available in this GitHub repository. You can clone or download the project. Project Environment Setup: Create a .env file with your credentials: PROJECT_CONNECTION_STRING="<Your AI Foundry connection string> "AZURE_OPENAI_DEPLOYMENT_NAME="gpt4o" NEON_DB_CONNECTION_STRING="<Your Neon connection string>" Create and activate a virtual environment: python -m venv .venv source .venv/bin/activate # on macOS/Linux .venv\Scripts\activate # on Windows Install required Python libraries: pip install -r requirements.txt Example requirements.txt: Pandas python-dotenv sqlalchemy psycopg2-binary azure-ai-projects ==1.0.0b7 azure-identity Load Sample Billing Usage Data: We will use a mock dataset for tenant usage, including computed percent change in API calls and storage usage in GB: tenant_id date api_calls storage_gb tenant_456 2025-04-01 1000 25.0 tenant_456 2025-03-31 950 24.8 tenant_456 2025-03-30 2200 26.0 Run python load_usage_data.py Python script to create and populate the usage_data table in your Neon Serverless Postgres instance: # load_usage_data.py file import os from dotenv import load_dotenv from sqlalchemy import ( create_engine, MetaData, Table, Column, String, Date, Integer, Numeric, ) # Load environment variables from .env load_dotenv() # Load connection string from environment variable NEON_DB_URL = os.getenv("NEON_DB_CONNECTION_STRING") engine = create_engine(NEON_DB_URL) # Define metadata and table schema metadata = MetaData() usage_data = Table( "usage_data", metadata, Column("tenant_id", String, primary_key=True), Column("date", Date, primary_key=True), Column("api_calls", Integer), Column("storage_gb", Numeric), ) # Create table with engine.begin() as conn: metadata.create_all(conn) # Insert mock data conn.execute( usage_data.insert(), [ { "tenant_id": "tenant_456", "date": "2025-03-27", "api_calls": 870, "storage_gb": 23.9, }, { "tenant_id": "tenant_456", "date": "2025-03-28", "api_calls": 880, "storage_gb": 24.0, }, { "tenant_id": "tenant_456", "date": "2025-03-29", "api_calls": 900, "storage_gb": 24.5, }, { "tenant_id": "tenant_456", "date": "2025-03-30", "api_calls": 2200, "storage_gb": 26.0, }, { "tenant_id": "tenant_456", "date": "2025-03-31", "api_calls": 950, "storage_gb": 24.8, }, { "tenant_id": "tenant_456", "date": "2025-04-01", "api_calls": 1000, "storage_gb": 25.0, }, ], ) print("✅ usage_data table created and mock data inserted.") Create a Postgres Tool for the Agent: Next, we configure an AI agent tool to retrieve data from Postgres. The Python script billing_agent_tools.py contains: The function billing_anomaly_summary() that: Pulls usage data from Neon. Computes % change in api_calls. Flags anomalies with a threshold of > 1.5x change. Exports user_functions list for the Azure AI Agent to use. You do not need to run it separately. # billing_agent_tools.py file import os import json import pandas as pd from sqlalchemy import create_engine from dotenv import load_dotenv # Load environment variables load_dotenv() # Set up the database engine NEON_DB_URL = os.getenv("NEON_DB_CONNECTION_STRING") db_engine = create_engine(NEON_DB_URL) # Define the billing anomaly detection function def billing_anomaly_summary( tenant_id: str, start_date: str = "2025-03-27", end_date: str = "2025-04-01", limit: int = 10, ) -> str: """ Fetches recent usage data for a SaaS tenant and detects potential billing anomalies. :param tenant_id: The tenant ID to analyze. :type tenant_id: str :param start_date: Start date for the usage window. :type start_date: str :param end_date: End date for the usage window. :type end_date: str :param limit: Maximum number of records to return. :type limit: int :return: A JSON string with usage records and anomaly flags. :rtype: str """ query = """ SELECT date, api_calls, storage_gb FROM usage_data WHERE tenant_id = %s AND date BETWEEN %s AND %s ORDER BY date DESC LIMIT %s; """ df = pd.read_sql(query, db_engine, params=(tenant_id, start_date, end_date, limit)) if df.empty: return json.dumps( {"message": "No usage data found for this tenant in the specified range."} ) df.sort_values("date", inplace=True) df["pct_change_api"] = df["api_calls"].pct_change() df["anomaly"] = df["pct_change_api"].abs() > 1.5 return df.to_json(orient="records") # Register this in a list to be used by FunctionTool user_functions = [billing_anomaly_summary] Create and Configure the AI Agent: Now we'll set up the AI agent and integrate it with our Neon Postgres tool using the Azure AI Agent Service SDK. The Python script does the following: Creates the agent Instantiates an AI agent using the selected model (gpt-4o, for example), adds tool access, and sets instructions that tell the agent how to behave (e.g., “You are a helpful SaaS assistant…”). Creates a conversation thread A thread is started to hold a conversation between the user and the agent. Posts a user message Sends a question like “Why did my billing spike for tenant_456 this week?” to the agent. Processes the request The agent reads the message, determines that it should use the custom tool to retrieve usage data, and processes the query. Displays the response Prints the response from the agent with a natural language explanation based on the tool’s output. # billing_anomaly_agent.py import os from datetime import datetime from azure.ai.projects import AIProjectClient from azure.identity import DefaultAzureCredential from azure.ai.projects.models import FunctionTool, ToolSet from dotenv import load_dotenv from pprint import pprint from billing_agent_tools import user_functions # Custom tool function module # Load environment variables from .env file load_dotenv() # Create an Azure AI Project Client project_client = AIProjectClient.from_connection_string( credential=DefaultAzureCredential(), conn_str=os.environ["PROJECT_CONNECTION_STRING"], ) # Initialize toolset with our user-defined functions functions = FunctionTool(user_functions) toolset = ToolSet() toolset.add(functions) # Create the agent agent = project_client.agents.create_agent( model=os.environ["AZURE_OPENAI_DEPLOYMENT_NAME"], name=f"billing-anomaly-agent-{datetime.now().strftime('%Y%m%d%H%M')}", description="Billing Anomaly Detection Agent", instructions=f""" You are a helpful SaaS financial assistant that retrieves and explains billing anomalies using usage data. The current date is {datetime.now().strftime("%Y-%m-%d")}. """, toolset=toolset, ) print(f"Created agent, ID: {agent.id}") # Create a communication thread thread = project_client.agents.create_thread() print(f"Created thread, ID: {thread.id}") # Post a message to the agent thread message = project_client.agents.create_message( thread_id=thread.id, role="user", content="Why did my billing spike for tenant_456 this week?", ) print(f"Created message, ID: {message.id}") # Run the agent and process the query run = project_client.agents.create_and_process_run( thread_id=thread.id, agent_id=agent.id ) print(f"Run finished with status: {run.status}") if run.status == "failed": print(f"Run failed: {run.last_error}") # Fetch and display the messages messages = project_client.agents.list_messages(thread_id=thread.id) print("Messages:") pprint(messages["data"][0]["content"][0]["text"]["value"]) # Optional cleanup: # project_client.agents.delete_agent(agent.id) # print("Deleted agent") Run the agent: To run the agent, run the following command python billing_anomaly_agent.py Snippet of output from agent: Using the Azure AI Foundry Agent Playground: After running your agent using the Azure AI Agent SDK, it is saved within your Azure AI Foundry project. You can now experiment with it using the Agent Playground. To try it out: Go to the Agents section in your Azure AI Foundry workspace. Find your billing anomaly agent in the list and click to open it. Use the playground interface to test different financial or billing-related questions, such as: “Did tenant_456 exceed their API usage quota this month?” “Explain recent storage usage changes for tenant_456.” This is a great way to validate your agent's behavior without writing more code. Summary: You’ve now created a working AI agent that talks to your Postgres database, all using: A simple Python function Azure AI Agent Service A Neon Serverless Postgres backend This approach is beginner-friendly, lightweight, and practical for real-world use. Want to go further? You can: Add more tools to the agent Integrate with vector search (e.g., detect anomaly reasons from logs using embeddings) Resources: Introduction to Azure AI Agent Service Develop an AI agent with Azure AI Agent Service Getting Started with Azure AI Agent Service Neon on Azure Build AI Agents with Azure AI Agent Service and Neon Multi-Agent AI Solution with Neon, Langchain, AutoGen and Azure OpenAI Azure AI Foundry GitHub Discussions That's it, folks! But the best part? You can become part of a thriving community of learners and builders by joining the Microsoft Learn Student Ambassadors Community. Connect with like-minded individuals, explore hands-on projects, and stay updated with the latest in cloud and AI. 💬 Join the community on Discord here and explore more benefits on the Microsoft Learn Student Hub.605Views5likes1CommentAI Agents: Mastering Agentic RAG - Part 5
This blog post, Part 5 of a series on AI agents, explores Agentic RAG (Retrieval-Augmented Generation), a paradigm shift in how LLMs interact with external data. Unlike traditional RAG, Agentic RAG allows LLMs to autonomously plan their information retrieval process through an iterative loop of actions and evaluations. The post highlights the importance of the LLM "owning" the reasoning process, dynamically selecting tools and refining queries. It covers key implementation details, including iterative loops, tool integration, memory management, and handling failure modes. Practical use cases, governance considerations, and code examples demonstrating Agentic RAG with AutoGen, Semantic Kernel, and Azure AI Agent Service are provided. The post concludes by emphasizing the transformative potential of Agentic RAG and encourages further exploration through linked resources and previous blog posts in the series.2.6KViews1like0CommentsCreate your own QA RAG Chatbot with LangChain.js + Azure OpenAI Service
Demo: Mpesa for Business Setup QA RAG Application In this tutorial we are going to build a Question-Answering RAG Chat Web App. We utilize Node.js and HTML, CSS, JS. We also incorporate Langchain.js + Azure OpenAI + MongoDB Vector Store (MongoDB Search Index). Get a quick look below. Note: Documents and illustrations shared here are for demo purposes only and Microsoft or its products are not part of Mpesa. The content demonstrated here should be used for educational purposes only. Additionally, all views shared here are solely mine. What you will need: An active Azure subscription, get Azure for Student for free or get started with Azure for 12 months free. VS Code Basic knowledge in JavaScript (not a must) Access to Azure OpenAI, click here if you don't have access. Create a MongoDB account (You can also use Azure Cosmos DB vector store) Setting Up the Project In order to build this project, you will have to fork this repository and clone it. GitHub Repository link: https://github.com/tiprock-network/azure-qa-rag-mpesa . Follow the steps highlighted in the README.md to setup the project under Setting Up the Node.js Application. Create Resources that you Need In order to do this, you will need to have Azure CLI or Azure Developer CLI installed in your computer. Go ahead and follow the steps indicated in the README.md to create Azure resources under Azure Resources Set Up with Azure CLI. You might want to use Azure CLI to login in differently use a code. Here's how you can do this. Instead of using az login. You can do az login --use-code-device OR you would prefer using Azure Developer CLI and execute this command instead azd auth login --use-device-code Remember to update the .env file with the values you have used to name Azure OpenAI instance, Azure models and even the API Keys you have obtained while creating your resources. Setting Up MongoDB After accessing you MongoDB account get the URI link to your database and add it to the .env file along with your database name and vector store collection name you specified while creating your indexes for a vector search. Running the Project In order to run this Node.js project you will need to start the project using the following command. npm run dev The Vector Store The vector store used in this project is MongoDB store where the word embeddings were stored in MongoDB. From the embeddings model instance we created on Azure AI Foundry we are able to create embeddings that can be stored in a vector store. The following code below shows our embeddings model instance. //create new embedding model instance const azOpenEmbedding = new AzureOpenAIEmbeddings({ azureADTokenProvider, azureOpenAIApiInstanceName: process.env.AZURE_OPENAI_API_INSTANCE_NAME, azureOpenAIApiEmbeddingsDeploymentName: process.env.AZURE_OPENAI_API_DEPLOYMENT_EMBEDDING_NAME, azureOpenAIApiVersion: process.env.AZURE_OPENAI_API_VERSION, azureOpenAIBasePath: "https://eastus2.api.cognitive.microsoft.com/openai/deployments" }); The code in uploadDoc.js offers a simple way to do embeddings and store them to MongoDB. In this approach the text from the documents is loaded using the PDFLoader from Langchain community. The following code demonstrates how the embeddings are stored in the vector store. // Call the function and handle the result with await const storeToCosmosVectorStore = async () => { try { const documents = await returnSplittedContent() //create store instance const store = await MongoDBAtlasVectorSearch.fromDocuments( documents, azOpenEmbedding, { collection: vectorCollection, indexName: "myrag_index", textKey: "text", embeddingKey: "embedding", } ) if(!store){ console.log('Something wrong happened while creating store or getting store!') return false } console.log('Done creating/getting and uploading to store.') return true } catch (e) { console.log(`This error occurred: ${e}`) return false } } In this setup, Question Answering (QA) is achieved by integrating Azure OpenAI’s GPT-4o with MongoDB Vector Search through LangChain.js. The system processes user queries via an LLM (Large Language Model), which retrieves relevant information from a vectorized database, ensuring contextual and accurate responses. Azure OpenAI Embeddings convert text into dense vector representations, enabling semantic search within MongoDB. The LangChain RunnableSequence structures the retrieval and response generation workflow, while the StringOutputParser ensures proper text formatting. The most relevant code snippets to include are: AzureChatOpenAI instantiation, MongoDB connection setup, and the API endpoint handling QA queries using vector search and embeddings. There are some code snippets below to explain major parts of the code. Azure AI Chat Completion Model This is the model used in this implementation of RAG, where we use it as the model for chat completion. Below is a code snippet for it. const llm = new AzureChatOpenAI({ azTokenProvider, azureOpenAIApiInstanceName: process.env.AZURE_OPENAI_API_INSTANCE_NAME, azureOpenAIApiDeploymentName: process.env.AZURE_OPENAI_API_DEPLOYMENT_NAME, azureOpenAIApiVersion: process.env.AZURE_OPENAI_API_VERSION }) Using a Runnable Sequence to give out Chat Output This shows how a runnable sequence can be used to give out a response given the particular output format/ output parser added on to the chain. //Stream response app.post(`${process.env.BASE_URL}/az-openai/runnable-sequence/stream/chat`, async (req,res) => { //check for human message const { chatMsg } = req.body if(!chatMsg) return res.status(201).json({ message:'Hey, you didn\'t send anything.' }) //put the code in an error-handler try{ //create a prompt template format template const prompt = ChatPromptTemplate.fromMessages( [ ["system", `You are a French-to-English translator that detects if a message isn't in French. If it's not, you respond, "This is not French." Otherwise, you translate it to English.`], ["human", `${chatMsg}`] ] ) //runnable chain const chain = RunnableSequence.from([prompt, llm, outPutParser]) //chain result let result_stream = await chain.stream() //set response headers res.setHeader('Content-Type','application/json') res.setHeader('Transfer-Encoding','chunked') //create readable stream const readable = Readable.from(result_stream) res.status(201).write(`{"message": "Successful translation.", "response": "`); readable.on('data', (chunk) => { // Convert chunk to string and write it res.write(`${chunk}`); }); readable.on('end', () => { // Close the JSON response properly res.write('" }'); res.end(); }); readable.on('error', (err) => { console.error("Stream error:", err); res.status(500).json({ message: "Translation failed.", error: err.message }); }); }catch(e){ //deliver a 500 error response return res.status(500).json( { message:'Failed to send request.', error:e } ) } }) To run the front end of the code, go to your BASE_URL with the port given. This enables you to run the chatbot above and achieve similar results. The chatbot is basically HTML+CSS+JS. Where JavaScript is mainly used with fetch API to get a response. Thanks for reading. I hope you play around with the code and learn some new things. Additional Reads Introduction to LangChain.js Create an FAQ Bot on Azure Build a basic chat app in Python using Azure AI Foundry SDK569Views0likes0Comments