AzureAI

9 Topics

Deploy Open Web UI on Azure VM via Docker: A Step-by-Step Guide with Custom Domain Setup.
Introductions Open Web UI (often referred to as "Ollama Web UI" in the context of LLM frameworks like Ollama) is an open-source, self-hostable interface designed to simplify interactions with large language models (LLMs) such as GPT-4, Llama 3, Mistral, and others. It provides a user-friendly, browser-based environment for deploying, managing, and experimenting with AI models, making advanced language model capabilities accessible to developers, researchers, and enthusiasts without requiring deep technical expertise. This article will delve into the step-by-step configurations on hosting OpenWeb UI on Azure. Requirements: Azure Portal Account - For students you can claim $USD100 Azure Cloud credits from this URL. Azure Virtual Machine - with a Linux of any distributions installed. Domain Name and Domain Host Caddy Open WebUI Image Step One: Deploy a Linux – Ubuntu VM from Azure Portal Search and Click on “Virtual Machine” on the Azure portal search bar and create a new VM by clicking on the “+ Create” button > “Azure Virtual Machine”. Fill out the form and select any Linux Distribution image – In this demo, we will deploy Open WebUI on Ubuntu Pro 24.04. Click “Review + Create” > “Create” to create the Virtual Machine. Tips: If you plan to locally download and host open source AI models via Open on your VM, you could save time by increasing the size of the OS disk / attach a large disk to the VM. You may also need a higher performance VM specification since large resources are needed to run the Large Language Model (LLM) locally. Once the VM has been successfully created, click on the “Go to resource” button. You will be redirected to the VM’s overview page. Jot down the public IP Address and access the VM using the ssh credentials you have setup just now. Step Two: Deploy the Open WebUI on the VM via Docker Once you are logged into the VM via SSH, run the Docker Command below: docker run -d --name open-webui --network=host --add-host=host.docker.internal:host-gateway -e PORT=8080 -v open-webui:/app/backend/data --restart always ghcr.io/open-webui/open-webui:dev This Docker command will download the Open WebUI Image into the VM and will listen for Open Web UI traffic on port 8080. Wait for a few minutes and the Web UI should be up and running. If you had setup an inbound Network Security Group on Azure to allow port 8080 on your VM from the public Internet, you can access them by typing into the browser: [PUBLIC_IP_ADDRESS]:8080 Step Three: Setup custom domain using Caddy Now, we can setup a reverse proxy to map a custom domain to [PUBLIC_IP_ADDRESS]:8080 using Caddy. The reason why Caddy is useful here is because they provide automated HTTPS solutions – you don’t have to worry about expiring SSL certificate anymore, and it’s free! You must download all Caddy’s dependencies and set up the requirements to install it using this command: sudo apt install -y debian-keyring debian-archive-keyring apt-transport-https curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' | sudo gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' | sudo tee /etc/apt/sources.list.d/caddy-stable.list sudo apt update && sudo apt install caddy Once Caddy is installed, edit Caddy’s configuration file at: /etc/caddy/Caddyfile , delete everything else in the file and add the following lines: yourdomainname.com { reverse_proxy localhost:8080 } Restart Caddy using this command: sudo systemctl restart caddy Next, create an A record on your DNS Host and point them to the public IP of the server. Step Four: Update the Network Security Group (NSG) To allow public access into the VM via HTTPS, you need to ensure the NSG/Firewall of the VM allow for port 80 and 443. Let’s add these rules into Azure by heading to the VM resources page you created for Open WebUI. Under the “Networking” Section > “Network Settings” > “+ Create port rule” > “Inbound port rule” On the “Destination port ranges” field, type in 443 and Click “Add”. Repeat these steps with port 80. Additionally, to enhance security, you should avoid external users from directly interacting with Open Web UI’s port - port 8080. You should add an inbound deny rule to that port. With that, you should be able to access the Open Web UI from the domain name you setup earlier. Conclusion And just like that, you’ve turned a blank Azure VM into a sleek, secure home for your Open Web UI, no magic required! By combining Docker’s simplicity with Caddy’s “set it and forget it” HTTPS magic, you’ve not only made your app accessible via a custom domain but also locked down security by closing off risky ports and keeping traffic encrypted. Azure’s cloud muscle handles the heavy lifting, while you get to enjoy the perks of a pro setup without the headache. If you are interested in using AI models deployed on Azure AI Foundry on OpenWeb UI via API, kindly read my other article: Step-by-step: Integrate Ollama Web UI to use Azure Open AI API with LiteLLM Proxy
suzarilshah
Jun 27, 2025 Place Educator Developer Blog
2.1KViews
1like
1Comment
Configure Embedding Models on Azure AI Foundry with Open Web UI
Introduction Let’s take a closer look at an exciting development in the AI space. Embedding models are the key to transforming complex data into usable insights, driving innovations like smarter chatbots and tailored recommendations. With Azure AI Foundry, Microsoft’s powerful platform, you’ve got the tools to build and scale these models effortlessly. Add in Open Web UI, a intuitive interface for engaging with AI systems, and you’ve got a winning combo that’s hard to beat. In this article, we’ll explore how embedding models on Azure AI Foundry, paired with Open Web UI, are paving the way for accessible and impactful AI solutions for developers and businesses. Let’s dive in! To proceed with configuring the embedding model from Azure AI Foundry on Open Web UI, please firstly configure the requirements below. Requirements: Setup Azure AI Foundry Hub/Projects Deploy Open Web UI – refer to my previous article on how you can deploy Open Web UI on Azure VM. Optional: Deploy LiteLLM with Azure AI Foundry models to work on Open Web UI - refer to my previous article on how you can do this as well. Deploying Embedding Models on Azure AI Foundry Navigate to the Azure AI Foundry site and deploy an embedding model from the “Model + Endpoint” section. For the purpose of this demonstration, we will deploy the “text-embedding-3-large” model by OpenAI. You should be receiving a URL endpoint and API Key to the embedding model deployed just now. Take note of that credential because we will be using it in Open Web UI. Configuring the embedding models on Open Web UI Now head to the Open Web UI Admin Setting Page > Documents and Select Azure Open AI as the Embedding Model Engine. Copy and Paste the Base URL, API Key, the Embedding Model deployed on Azure AI Foundry and the API version (not the model version) into the fields below: Click “Save” to reflect the changes. Expected Output Now let us look into the scenario for when the embedding model configured on Open Web UI and when it is not. Without Embedding Models configured. With Azure Open AI Embedding models configured. Conclusion And there you have it! Embedding models on Azure AI Foundry, combined with the seamless interaction offered by Open Web UI, are truly revolutionizing how we approach AI solutions. This powerful duo not only simplifies the process of building and deploying intelligent systems but also makes cutting-edge technology more accessible to developers and businesses of all sizes. As we move forward, it’s clear that such integrations will continue to drive innovation, breaking down barriers and unlocking new possibilities in the AI landscape. So, whether you’re a seasoned developer or just stepping into this exciting field, now’s the time to explore what Azure AI Foundry and Open Web UI can do for you. Let’s keep pushing the boundaries of what’s possible!
suzarilshah
Jun 11, 2025 Place Educator Developer Blog
601Views
0likes
0Comments
Building a Basic Chatbot with Azure OpenAI
Overview In this turorial, we'll build a simple chatbot that uses Azure OpenAI to generate responses to user queries. To create a basic chatbot, we need to set up a language model resource that enables conversation capabilities. In this tutorial, we will: Set up the Azure OpenAI resource using the Azure AI Foundry portal. Retrieve the API key needed to connect the resource to your chatbot application. Once the API key is configured in your code, you will be able to integrate the language model into your chatbot and enable it to generate responses. By the end of this tutorial, you'll have a working chatbot that can generate responses using the Azure OpenAI model. Signing In and Setting Up Your Azure AI Foundry Workspace Signing In to Azure AI Foundry Open the Azure AI Foundry page in your web browser. Login to your Azure account. If you don't have an account, you can sign up. Setting Up Your Azure AI Foundry Workspace Select + Create project to create a new project. Perform the following tasks: Enter Project name. It must be a unique value. Select Hub you'd like to use (create a new one if needed). Select Create. Setting Up the Azure OpenAI Resource in Azure AI Foundry In this step, you'll learn how to set up the Azure OpenAI resource in Azure AI Foundry. Azure OpenAI is a pre-trained language model that can generate responses to user queries. We'll be using it in our chatbot. Select Models + endpoints from the left side menu. On this page, you can deploy language models and set up Azure AI resources. In this step, we will deploy the Azure OpenAI GPT-4 language model. Select + Deploy model. Select Deploy base model. In this tutorial, we will deploy the GPT-4o model. Select GPT-4o. Select Confirm. Select Deploy. The model will be deployed. Once the deployment is complete, you will see the model listed on the Models + endpoints page. Now that the model is deployed, you can retrieve the API key needed to connect the model to your chatbot application. Select the model you deployed on the Models + endpoints page. ` On the model details page, you can view information about the model, including the API key. We will come back this page later to add the required information into the environment variables. Setting Up the Project and Install the Libraries Now, you will create a folder to work in and set up a virtual environment to develop a program. Creating a Folder to Work Inside It Open a terminal window and type the following command to create a folder named basic-chatbot in the default path. mkdir basic-chatbot Type the following command inside your terminal to navigate to the basic-chatbot folder you created. cd basic-chatbot Creating a Virtual Environment Type the following command inside your terminal to create a virtual environment named .venv. python -m venv .venv Type the following command inside your terminal to activate the virtual environment. .venv\Scripts\activate.bat NOTE If it worked, you should see (.venv) before the command prompt. Installing the Required Packages Type the following commands inside your terminal to install the required packages. openai: A Python library that provides integration with the Azure OpenAI API. python-dotenv: A Python library for managing environment variables stored in an .env file. pip install openai python-dotenv Setting up the Project in Visual Studio Code To create a basic chatbot program, you will need two files: example.py: This file will contain the code to interact with Azure resources. .env: This file will store the Azure credentials and configuration details. NOTE Purpose of the .env File The .env file is essential for storing the Azure information required to connect and use the resources you created. By keeping the Azure credentials in the .env file, you can ensure a secure and organized way to manage sensitive information. Setting Up example.py File Open Visual Studio Code. Select File from the menu bar. Select Open Folder. Select the basic-chatbot folder that you created, which is located at C:\Users\yourUserName\basic-chatbot. In the left pane of Visual Studio Code, right-click and select New File to create a new file named example.py. Add the following code to the example.py file to import the required libraries. from openai import AzureOpenAI from dotenv import load_dotenv import os # Load environment variables from the .env file load_dotenv() # Retrieve environment variables AZURE_OPENAI_ENDPOINT = os.getenv("AZURE_OPENAI_ENDPOINT") AZURE_OPENAI_API_KEY = os.getenv("AZURE_OPENAI_API_KEY") AZURE_OPENAI_MODEL_NAME = os.getenv("AZURE_OPENAI_MODEL_NAME") AZURE_OPENAI_CHAT_DEPLOYMENT_NAME = os.getenv("AZURE_OPENAI_CHAT_DEPLOYMENT_NAME") AZURE_OPENAI_API_VERSION = os.getenv("AZURE_OPENAI_API_VERSION") # Initialize Azure OpenAI client client = AzureOpenAI( api_key=AZURE_OPENAI_API_KEY, api_version=AZURE_OPENAI_API_VERSION, base_url=f"{AZURE_OPENAI_ENDPOINT}/openai/deployments/{AZURE_OPENAI_CHAT_DEPLOYMENT_NAME}" ) print("Chatbot: Hello! How can I assist you today? Type 'exit' to end the conversation.") while True: user_input = input("You: ") if user_input.lower() == "exit": print("Chatbot: Ending the conversation. Have a great day!") break response = client.chat.completions.create( model=AZURE_OPENAI_MODEL_NAME, messages=[ {"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": user_input} ], max_tokens=200 ) print("Chatbot:", response.choices[0].message.content.strip()) Setting Up .env File To set up your development environment, we will create a .env file and store the necessary credentials directly. NOTE Complete folder structure: └── YourUserName . └── basic-chatbot . ├── example.py . └── .env In the left pane of Visual Studio Code, right-click and select New File to create a new file named .env. Add the following code to the .env file to include your Azure information. AZURE_OPENAI_API_KEY=your_azure_openai_api_key AZURE_OPENAI_ENDPOINT=https://your_azure_openai_endpoint AZURE_OPENAI_MODEL_NAME=your_model_name AZURE_OPENAI_CHAT_DEPLOYMENT_NAME=your_deployment_name AZURE_OPENAI_API_VERSION=your_api_version Retrieving Environment Variables from Azure AI Foundry Now, you will retrieve the required information from Azure AI Foundry and update the .env file. Go to the Models + endpoints page and select your deployed model. On the Model Details page, copy the following information in to the .env file.: AZURE_OPENAI_API_KEY AZURE_OPENAI_ENDPOINT AZURE_OPENAI_MODEL_NAME AZURE_OPENAI_CHAT_DEPLOYMENT_NAME Paste this information into the .env file in the respective placeholders. Running the Chatbot Program Type the following command inside your terminal to run the program and see if it can answer questions. python example.py Interact with the chatbot by typing your questions or messages. The chatbot will generate responses based on the Azure OpenAI model you deployed. NOTE You can find the full example of this chatbot, including the code and .env template, in my GitHub repository: GitHub Repository
Minseok_Song
Jun 08, 2025 Place Educator Developer Blog
1.7KViews
2likes
1Comment
How to build Tool-calling Agents with Azure OpenAI and Lang Graph
Introducing MyTreat Our demo is a fictional website that shows customers their total bill in dollars, but they have the option of getting the total bill in their local currencies. The button sends a request to the Node.js service and a response is simply returned from our Agent given the tool it chooses. Let’s dive in and understand how this works from a broader perspective. Prerequisites An active Azure subscription. You can sign up for a free trial here or get $100 worth of credits on Azure every year if you are a student. A GitHub account (not necessarily) Node.js LTS 18 + VS Code installed (or your favorite IDE) Basic knowledge of HTML, CSS, JS Creating an Azure OpenAI Resource Go over to your browser and key in portal.azure.com to access the Microsoft Azure Portal. Over there navigate to the search bar and type Azure OpenAI. Go ahead and click on + Create. Fill in the input boxes with appropriate, for example, as shown below then press on next until you reach review and submit then finally click on Create. After the deployment is done, go to the deployment and access Azure AI Foundry portal using the button as show below. You can also use the link as demonstrated below. In the Azure AI Foundry portal, we have to create our model instance so we have to go over to Model Catalog on the left panel beneath Get Started. Select a desired model, in this case I used gpt-35-turbo for chat completion (in your case use gpt-4o). Below is a way of doing this. Choose a model (gpt-4o) Click on deploy Give the deployment a new name e.g. myTreatmodel, then click deploy and wait for it to finish On the left panel go over to deployments and you will see the model you have created. Access your Azure OpenAI Resource Key Go back to Azure portal and specifically to the deployment instance that we have and select on the left panel, Resource Management. Click on Keys and Endpoints. Copy any of the keys as shown below and keep it very safe as we will use it in our .env file. Configuring your project Create a new project folder on your local machine and add these variables to the .env file in the root folder. AZURE_OPENAI_API_INSTANCE_NAME= AZURE_OPENAI_API_DEPLOYMENT_NAME= AZURE_OPENAI_API_KEY= AZURE_OPENAI_API_VERSION="2024-08-01-preview" LANGCHAIN_TRACING_V2="false" LANGCHAIN_CALLBACKS_BACKGROUND = "false" PORT=4556 Starting a new project Go over to https://github.com/tiprock-network/mytreat.git and follow the instructions to setup the new project, if you do not have git installed, go over to the Code button and press Download ZIP. This will enable you get the project folder and follow the same procedure for setting up. Creating a custom tool In the utils folder the math tool was created, this code show below uses tool from Langchain to build a tool and the schema of the tool is created using zod.js, a library that helps in validating an object’s property value. The price function takes in an array of prices and the exchange rate, adds the prices up and converts them using the exchange rate as shown below. import { tool } from '@langchain/core/tools' import { z } from 'zod' const priceConv = tool((input) =>{ //get the prices and add them up after turning each into let sum = 0 input.prices.forEach((price) => { let price_check = parseFloat(price) sum += price_check }) //now change the price using exchange rate let final_price = parseFloat(input.exchange_rate) * sum //return return final_price },{ name: 'add_prices_and_convert', description: 'Add prices and convert based on exchange rate.', schema: z.object({ prices: z.number({ required_error: 'Price should not be empty.', invalid_type_error: 'Price must be a number.' }).array().nonempty().describe('Prices of items listed.'), exchange_rate: z.string().describe('Current currency exchange rate.') }) }) export { priceConv } Utilizing the tool In the controller’s folder we then bring the tool in by importing it. After that we pass it in to our array of tools. Notice that we have the Tavily Search Tool, you can learn how to implement in the Additional Reads Section or just remove it. Agent Model and the Call Process This code defines an AI agent using LangGraph and LangChain.js, powered by GPT-4o from Azure OpenAI. It initializes a ToolNode to manage tools like priceConv and binds them to the agent model. The StateGraph handles decision-making, determining whether the agent should call a tool or return a direct response. If a tool is needed, the workflow routes the request accordingly; otherwise, the agent responds to the user. The callModel function invokes the agent, processing messages and ensuring seamless tool integration. The searchAgentController is a GET endpoint that accepts user queries (text_message). It processes input through the compiled LangGraph workflow, invoking the agent to generate a response. If a tool is required, the agent calls it before finalizing the output. The response is then sent back to the user, ensuring dynamic and efficient tool-assisted reasoning. //create tools the agent will use //const agentTools = [new TavilySearchResults({maxResults:5}), priceConv] const agentTools = [ priceConv] const toolNode = new ToolNode(agentTools) const agentModel = new AzureChatOpenAI({ model:'gpt-4o', temperature:0, azureOpenAIApiKey: AZURE_OPENAI_API_KEY, azureOpenAIApiInstanceName:AZURE_OPENAI_API_INSTANCE_NAME, azureOpenAIApiDeploymentName:AZURE_OPENAI_API_DEPLOYMENT_NAME, azureOpenAIApiVersion:AZURE_OPENAI_API_VERSION }).bindTools(agentTools) //make a decision to continue or not const shouldContinue = ( state ) => { const { messages } = state const lastMessage = messages[messages.length -1] //upon tool call we go to tools if("tool_calls" in lastMessage && Array.isArray(lastMessage.tool_calls) && lastMessage.tool_calls?.length) return "tools"; //if no tool call is made we stop and return back to the user return END } const callModel = async (state) => { const response = await agentModel.invoke(state.messages) return { messages: [response] } } //define a new graph const workflow = new StateGraph(MessagesAnnotation) .addNode("agent", callModel) .addNode("tools", toolNode) .addEdge(START, "agent") .addConditionalEdges("agent", shouldContinue, ["tools", END]) .addEdge("tools", "agent") const appAgent = workflow.compile() The above is implemented with the following code: Frontend The frontend is a simple HTML+CSS+JS stack that demonstrated how you can use an API to integrate this AI Agent to your website. It sends a GET request and uses the response to get back the right answer. Below is an illustration of how fetch API has been used. const searchAgentController = async ( req, res ) => { //get human text const { text_message } = req.query if(!text_message) return res.status(400).json({ message:'No text sent.' }) //invoke the agent const agentFinalState = await appAgent.invoke( { messages: [new HumanMessage(text_message)] }, {streamMode: 'values'} ) //const agentFinalState_b = await agentModel.invoke(text_message) /*return res.status(200).json({ answer:agentFinalState.messages[agentFinalState.messages.length - 1].content })*/ //console.log(agentFinalState_b.tool_calls) res.status(200).json({ text: agentFinalState.messages[agentFinalState.messages.length - 1].content }) } There you go! We have created a basic tool-calling agent using Azure and Langchain successfully, go ahead and expand the code base to your liking. If you have questions you can comment below or reach out on my socials. Additional Reads Azure Open AI Service Models Generative AI for Beginners AI Agents for Beginners Course Lang Graph Tutorial Develop Generative AI Apps in Azure AI Foundry Portal
theophilusO
Jun 02, 2025 Place Educator Developer Blog
2.9KViews
1like
2Comments
Power Up Your Open WebUI with Azure AI Speech: Quick STT & TTS Integration
Introduction Ever found yourself wishing your web interface could really talk and listen back to you? With a few clicks (and a bit of code), you can turn your plain Open WebUI into a full-on voice assistant. In this post, you’ll see how to spin up an Azure Speech resource, hook it into your frontend, and watch as user speech transforms into text and your app’s responses leap off the screen in a human-like voice. By the end of this guide, you’ll have a voice-enabled web UI that actually converses with users, opening the door to hands-free controls, better accessibility, and a genuinely richer user experience. Ready to make your web app speak? Let’s dive in. Why Azure AI Speech? We use Azure AI Speech service in Open Web UI to enable voice interactions directly within web applications. This allows users to: Speak commands or input instead of typing, making the interface more accessible and user-friendly. Hear responses or information read aloud, which improves usability for people with visual impairments or those who prefer audio. Provide a more natural and hands-free experience especially on devices like smartphones or tablets. In short, integrating Azure AI Speech service into Open Web UI helps make web apps smarter, more interactive, and easier to use by adding speech recognition and voice output features. If you haven’t hosted Open WebUI already, follow my other step-by-step guide to host Ollama WebUI on Azure. Proceed to the next step if you have Open WebUI deployed already. Learn More about OpenWeb UI here. Deploy Azure AI Speech service in Azure. Navigate to the Azure Portal and search for Azure AI Speech on the Azure portal search bar. Create a new Speech Service by filling up the fields in the resource creation page. Click on “Create” to finalize the setup. After the resource has been deployed, click on “View resource” button and you should be redirected to the Azure AI Speech service page. The page should display the API Keys and Endpoints for Azure AI Speech services, which you can use in Open Web UI. Settings things up in Open Web UI Speech to Text settings (STT) Head to the Open Web UI Admin page > Settings > Audio. Paste the API Key obtained from the Azure AI Speech service page into the API key field below. Unless you use different Azure Region, or want to change the default configurations for the STT settings, leave all settings to blank. Text to Speech settings (TTS) Now, let's proceed with configuring the TTS Settings on OpenWeb UI by toggling the TTS Engine to Azure AI Speech option. Again, paste the API Key obtained from Azure AI Speech service page and leave all settings to blank. You can change the TTS Voice from the dropdown selection in the TTS settings as depicted in the image below: Click Save to reflect the change. Expected Result Now, let’s test if everything works well. Open a new chat / temporary chat on Open Web UI and click on the Call / Record button. The STT Engine (Azure AI Speech) should identify your voice and provide a response based on the voice input. To test the TTS feature, click on the Read Aloud (Speaker Icon) under any response from Open Web UI. The TTS Engine should reflect Azure AI Speech service! Conclusion And that’s a wrap! You’ve just given your Open WebUI the gift of capturing user speech, turning it into text, and then talking right back with Azure’s neural voices. Along the way you saw how easy it is to spin up a Speech resource in the Azure portal, wire up real-time transcription in the browser, and pipe responses through the TTS engine. From here, it’s all about experimentation. Try swapping in different neural voices or dialing in new languages. Tweak how you start and stop listening, play with silence detection, or add custom pronunciation tweaks for those tricky product names. Before you know it, your interface will feel less like a web page and more like a conversation partner.
suzarilshah
May 16, 2025 Place Educator Developer Blog
509Views
1like
0Comments
Smart Auditing: Leveraging Azure AI Agents to Transform Financial Oversight
In today's data-driven business environment, audit teams often spend weeks poring over logs and databases to verify spending and billing information. This time-consuming process is ripe for automation. But is there a way to implement AI solutions without getting lost in complex technical frameworks? While tools like LangChain, Semantic Kernel, and AutoGen offer powerful AI agent capabilities, sometimes you need a straightforward solution that just works. So, what's the answer for teams seeking simplicity without sacrificing effectiveness? This tutorial will show you how to use Azure AI Agent Service to build an AI agent that can directly access your Postgres database to streamline audit workflows. No complex chains or graphs required, just a practical solution to get your audit process automated quickly. The Auditing Challenge: It's the month end, and your audit team is drowning in spreadsheets. As auditors reviewing financial data across multiple SaaS tenants, you're tasked with verifying billing accuracy by tracking usage metrics like API calls, storage consumption, and user sessions in Postgres databases. Each tenant generates thousands of transactions daily, and traditionally, this verification process consumes weeks of your team's valuable time. Typically, teams spend weeks: Manually extracting data from multiple database tables. Cross-referencing usage with invoices. Investigating anomalies through tedious log analysis. Compiling findings into comprehensive reports. With an AI-powered audit agent, you can automate these tasks and transform the process. Your AI assistant can: Pull relevant usage data directly from your database Identify billing anomalies like unexpected usage spikes Generate natural language explanations of findings Create audit reports that highlight key concerns For example, when reviewing a tenant's invoice, your audit agent can query the database for relevant usage patterns, summarize anomalies, and offer explanations: "Tenant_456 experienced a 145% increase in API usage on April 30th, which explains the billing increase. This spike falls outside normal usage patterns and warrants further investigation." Let’s build an AI agent that connects to your Postgres database and transforms your audit process from manual effort to automated intelligence. Prerequisites: Before we start building our audit agent, you'll need: An Azure subscription (Create one for free). The Azure AI Developer RBAC role assigned to your account. Python 3.11.x installed on your development machine. OR You can also use GitHub Codespaces, which will automatically install all dependencies for you. You’ll need to create a GitHub account first if you don’t already have one. Setting Up Your Database: For this tutorial, we'll use Neon Serverless Postgres as our database. It's a fully managed, cloud-native Postgres solution that's free to start, scales automatically, and works excellently for AI agents that need to query data on demand. Creating a Neon Database on Azure: Open the Neon Resource page on the Azure portal Fill out the form with the required fields and deploy your database After creation, navigate to the Neon Serverless Postgres Organization service Click on the Portal URL to access the Neon Console Click "New Project" Choose an Azure region Name your project (e.g., "Audit Agent Database") Click "Create Project" Once your project is successfully created, copy the Neon connection string from the Connection Details widget on the Neon Dashboard. It will look like this: postgresql://[user]:[password]@[neon_hostname]/[dbname]?sslmode=require Note: Keep this connection string saved; we'll need it shortly. Creating an AI Foundry Project on Azure: Next, we'll set up the AI infrastructure to power our audit agent: Create a new hub and project in the Azure AI Foundry portal by following the guide. Deploy a model like GPT-4o to use with your agent. Make note of your Project connection string and Model Deployment name. You can find your connection string in the overview section of your project in the Azure AI Foundry portal, under Project details > Project connection string. Once you have all three values on hand: Neon connection string, Project connection string, and Model Deployment Name, you are ready to set up the Python project to create an Agent. All the code and sample data are available in this GitHub repository. You can clone or download the project. Project Environment Setup: Create a .env file with your credentials: PROJECT_CONNECTION_STRING="<Your AI Foundry connection string> "AZURE_OPENAI_DEPLOYMENT_NAME="gpt4o" NEON_DB_CONNECTION_STRING="<Your Neon connection string>" Create and activate a virtual environment: python -m venv .venv source .venv/bin/activate # on macOS/Linux .venv\Scripts\activate # on Windows Install required Python libraries: pip install -r requirements.txt Example requirements.txt: Pandas python-dotenv sqlalchemy psycopg2-binary azure-ai-projects ==1.0.0b7 azure-identity Load Sample Billing Usage Data: We will use a mock dataset for tenant usage, including computed percent change in API calls and storage usage in GB: tenant_id date api_calls storage_gb tenant_456 2025-04-01 1000 25.0 tenant_456 2025-03-31 950 24.8 tenant_456 2025-03-30 2200 26.0 Run python load_usage_data.py Python script to create and populate the usage_data table in your Neon Serverless Postgres instance: # load_usage_data.py file import os from dotenv import load_dotenv from sqlalchemy import ( create_engine, MetaData, Table, Column, String, Date, Integer, Numeric, ) # Load environment variables from .env load_dotenv() # Load connection string from environment variable NEON_DB_URL = os.getenv("NEON_DB_CONNECTION_STRING") engine = create_engine(NEON_DB_URL) # Define metadata and table schema metadata = MetaData() usage_data = Table( "usage_data", metadata, Column("tenant_id", String, primary_key=True), Column("date", Date, primary_key=True), Column("api_calls", Integer), Column("storage_gb", Numeric), ) # Create table with engine.begin() as conn: metadata.create_all(conn) # Insert mock data conn.execute( usage_data.insert(), [ { "tenant_id": "tenant_456", "date": "2025-03-27", "api_calls": 870, "storage_gb": 23.9, }, { "tenant_id": "tenant_456", "date": "2025-03-28", "api_calls": 880, "storage_gb": 24.0, }, { "tenant_id": "tenant_456", "date": "2025-03-29", "api_calls": 900, "storage_gb": 24.5, }, { "tenant_id": "tenant_456", "date": "2025-03-30", "api_calls": 2200, "storage_gb": 26.0, }, { "tenant_id": "tenant_456", "date": "2025-03-31", "api_calls": 950, "storage_gb": 24.8, }, { "tenant_id": "tenant_456", "date": "2025-04-01", "api_calls": 1000, "storage_gb": 25.0, }, ], ) print("✅ usage_data table created and mock data inserted.") Create a Postgres Tool for the Agent: Next, we configure an AI agent tool to retrieve data from Postgres. The Python script billing_agent_tools.py contains: The function billing_anomaly_summary() that: Pulls usage data from Neon. Computes % change in api_calls. Flags anomalies with a threshold of > 1.5x change. Exports user_functions list for the Azure AI Agent to use. You do not need to run it separately. # billing_agent_tools.py file import os import json import pandas as pd from sqlalchemy import create_engine from dotenv import load_dotenv # Load environment variables load_dotenv() # Set up the database engine NEON_DB_URL = os.getenv("NEON_DB_CONNECTION_STRING") db_engine = create_engine(NEON_DB_URL) # Define the billing anomaly detection function def billing_anomaly_summary( tenant_id: str, start_date: str = "2025-03-27", end_date: str = "2025-04-01", limit: int = 10, ) -> str: """ Fetches recent usage data for a SaaS tenant and detects potential billing anomalies. :param tenant_id: The tenant ID to analyze. :type tenant_id: str :param start_date: Start date for the usage window. :type start_date: str :param end_date: End date for the usage window. :type end_date: str :param limit: Maximum number of records to return. :type limit: int :return: A JSON string with usage records and anomaly flags. :rtype: str """ query = """ SELECT date, api_calls, storage_gb FROM usage_data WHERE tenant_id = %s AND date BETWEEN %s AND %s ORDER BY date DESC LIMIT %s; """ df = pd.read_sql(query, db_engine, params=(tenant_id, start_date, end_date, limit)) if df.empty: return json.dumps( {"message": "No usage data found for this tenant in the specified range."} ) df.sort_values("date", inplace=True) df["pct_change_api"] = df["api_calls"].pct_change() df["anomaly"] = df["pct_change_api"].abs() > 1.5 return df.to_json(orient="records") # Register this in a list to be used by FunctionTool user_functions = [billing_anomaly_summary] Create and Configure the AI Agent: Now we'll set up the AI agent and integrate it with our Neon Postgres tool using the Azure AI Agent Service SDK. The Python script does the following: Creates the agent Instantiates an AI agent using the selected model (gpt-4o, for example), adds tool access, and sets instructions that tell the agent how to behave (e.g., “You are a helpful SaaS assistant…”). Creates a conversation thread A thread is started to hold a conversation between the user and the agent. Posts a user message Sends a question like “Why did my billing spike for tenant_456 this week?” to the agent. Processes the request The agent reads the message, determines that it should use the custom tool to retrieve usage data, and processes the query. Displays the response Prints the response from the agent with a natural language explanation based on the tool’s output. # billing_anomaly_agent.py import os from datetime import datetime from azure.ai.projects import AIProjectClient from azure.identity import DefaultAzureCredential from azure.ai.projects.models import FunctionTool, ToolSet from dotenv import load_dotenv from pprint import pprint from billing_agent_tools import user_functions # Custom tool function module # Load environment variables from .env file load_dotenv() # Create an Azure AI Project Client project_client = AIProjectClient.from_connection_string( credential=DefaultAzureCredential(), conn_str=os.environ["PROJECT_CONNECTION_STRING"], ) # Initialize toolset with our user-defined functions functions = FunctionTool(user_functions) toolset = ToolSet() toolset.add(functions) # Create the agent agent = project_client.agents.create_agent( model=os.environ["AZURE_OPENAI_DEPLOYMENT_NAME"], name=f"billing-anomaly-agent-{datetime.now().strftime('%Y%m%d%H%M')}", description="Billing Anomaly Detection Agent", instructions=f""" You are a helpful SaaS financial assistant that retrieves and explains billing anomalies using usage data. The current date is {datetime.now().strftime("%Y-%m-%d")}. """, toolset=toolset, ) print(f"Created agent, ID: {agent.id}") # Create a communication thread thread = project_client.agents.create_thread() print(f"Created thread, ID: {thread.id}") # Post a message to the agent thread message = project_client.agents.create_message( thread_id=thread.id, role="user", content="Why did my billing spike for tenant_456 this week?", ) print(f"Created message, ID: {message.id}") # Run the agent and process the query run = project_client.agents.create_and_process_run( thread_id=thread.id, agent_id=agent.id ) print(f"Run finished with status: {run.status}") if run.status == "failed": print(f"Run failed: {run.last_error}") # Fetch and display the messages messages = project_client.agents.list_messages(thread_id=thread.id) print("Messages:") pprint(messages["data"][0]["content"][0]["text"]["value"]) # Optional cleanup: # project_client.agents.delete_agent(agent.id) # print("Deleted agent") Run the agent: To run the agent, run the following command python billing_anomaly_agent.py Snippet of output from agent: Using the Azure AI Foundry Agent Playground: After running your agent using the Azure AI Agent SDK, it is saved within your Azure AI Foundry project. You can now experiment with it using the Agent Playground. To try it out: Go to the Agents section in your Azure AI Foundry workspace. Find your billing anomaly agent in the list and click to open it. Use the playground interface to test different financial or billing-related questions, such as: “Did tenant_456 exceed their API usage quota this month?” “Explain recent storage usage changes for tenant_456.” This is a great way to validate your agent's behavior without writing more code. Summary: You’ve now created a working AI agent that talks to your Postgres database, all using: A simple Python function Azure AI Agent Service A Neon Serverless Postgres backend This approach is beginner-friendly, lightweight, and practical for real-world use. Want to go further? You can: Add more tools to the agent Integrate with vector search  (e.g., detect anomaly reasons from logs using embeddings) Resources: Introduction to Azure AI Agent Service Develop an AI agent with Azure AI Agent Service Getting Started with Azure AI Agent Service Neon on Azure Build AI Agents with Azure AI Agent Service and Neon Multi-Agent AI Solution with Neon, Langchain, AutoGen and Azure OpenAI Azure AI Foundry GitHub Discussions That's it, folks! But the best part? You can become part of a thriving community of learners and builders by joining the Microsoft Learn Student Ambassadors Community. Connect with like-minded individuals, explore hands-on projects, and stay updated with the latest in cloud and AI. 💬 Join the community on Discord here and explore more benefits on the Microsoft Learn Student Hub.
MuhammadSamiullah
May 06, 2025 Place Educator Developer Blog
422Views
5likes
1Comment
Create your own QA RAG Chatbot with LangChain.js + Azure OpenAI Service
Demo: Mpesa for Business Setup QA RAG Application In this tutorial we are going to build a Question-Answering RAG Chat Web App. We utilize Node.js and HTML, CSS, JS. We also incorporate Langchain.js + Azure OpenAI + MongoDB Vector Store (MongoDB Search Index). Get a quick look below. Note: Documents and illustrations shared here are for demo purposes only and Microsoft or its products are not part of Mpesa. The content demonstrated here should be used for educational purposes only. Additionally, all views shared here are solely mine. What you will need: An active Azure subscription, get Azure for Student for free or get started with Azure for 12 months free. VS Code Basic knowledge in JavaScript (not a must) Access to Azure OpenAI, click here if you don't have access. Create a MongoDB account (You can also use Azure Cosmos DB vector store) Setting Up the Project In order to build this project, you will have to fork this repository and clone it. GitHub Repository link: https://github.com/tiprock-network/azure-qa-rag-mpesa . Follow the steps highlighted in the README.md to setup the project under Setting Up the Node.js Application. Create Resources that you Need In order to do this, you will need to have Azure CLI or Azure Developer CLI installed in your computer. Go ahead and follow the steps indicated in the README.md to create Azure resources under Azure Resources Set Up with Azure CLI. You might want to use Azure CLI to login in differently use a code. Here's how you can do this. Instead of using az login. You can do az login --use-code-device OR you would prefer using Azure Developer CLI and execute this command instead azd auth login --use-device-code Remember to update the .env file with the values you have used to name Azure OpenAI instance, Azure models and even the API Keys you have obtained while creating your resources. Setting Up MongoDB After accessing you MongoDB account get the URI link to your database and add it to the .env file along with your database name and vector store collection name you specified while creating your indexes for a vector search. Running the Project In order to run this Node.js project you will need to start the project using the following command. npm run dev The Vector Store The vector store used in this project is MongoDB store where the word embeddings were stored in MongoDB. From the embeddings model instance we created on Azure AI Foundry we are able to create embeddings that can be stored in a vector store. The following code below shows our embeddings model instance. //create new embedding model instance const azOpenEmbedding = new AzureOpenAIEmbeddings({ azureADTokenProvider, azureOpenAIApiInstanceName: process.env.AZURE_OPENAI_API_INSTANCE_NAME, azureOpenAIApiEmbeddingsDeploymentName: process.env.AZURE_OPENAI_API_DEPLOYMENT_EMBEDDING_NAME, azureOpenAIApiVersion: process.env.AZURE_OPENAI_API_VERSION, azureOpenAIBasePath: "https://eastus2.api.cognitive.microsoft.com/openai/deployments" }); The code in uploadDoc.js offers a simple way to do embeddings and store them to MongoDB. In this approach the text from the documents is loaded using the PDFLoader from Langchain community. The following code demonstrates how the embeddings are stored in the vector store. // Call the function and handle the result with await const storeToCosmosVectorStore = async () => { try { const documents = await returnSplittedContent() //create store instance const store = await MongoDBAtlasVectorSearch.fromDocuments( documents, azOpenEmbedding, { collection: vectorCollection, indexName: "myrag_index", textKey: "text", embeddingKey: "embedding", } ) if(!store){ console.log('Something wrong happened while creating store or getting store!') return false } console.log('Done creating/getting and uploading to store.') return true } catch (e) { console.log(`This error occurred: ${e}`) return false } } In this setup, Question Answering (QA) is achieved by integrating Azure OpenAI’s GPT-4o with MongoDB Vector Search through LangChain.js. The system processes user queries via an LLM (Large Language Model), which retrieves relevant information from a vectorized database, ensuring contextual and accurate responses. Azure OpenAI Embeddings convert text into dense vector representations, enabling semantic search within MongoDB. The LangChain RunnableSequence structures the retrieval and response generation workflow, while the StringOutputParser ensures proper text formatting. The most relevant code snippets to include are: AzureChatOpenAI instantiation, MongoDB connection setup, and the API endpoint handling QA queries using vector search and embeddings. There are some code snippets below to explain major parts of the code. Azure AI Chat Completion Model This is the model used in this implementation of RAG, where we use it as the model for chat completion. Below is a code snippet for it. const llm = new AzureChatOpenAI({ azTokenProvider, azureOpenAIApiInstanceName: process.env.AZURE_OPENAI_API_INSTANCE_NAME, azureOpenAIApiDeploymentName: process.env.AZURE_OPENAI_API_DEPLOYMENT_NAME, azureOpenAIApiVersion: process.env.AZURE_OPENAI_API_VERSION }) Using a Runnable Sequence to give out Chat Output This shows how a runnable sequence can be used to give out a response given the particular output format/ output parser added on to the chain. //Stream response app.post(`${process.env.BASE_URL}/az-openai/runnable-sequence/stream/chat`, async (req,res) => { //check for human message const { chatMsg } = req.body if(!chatMsg) return res.status(201).json({ message:'Hey, you didn\'t send anything.' }) //put the code in an error-handler try{ //create a prompt template format template const prompt = ChatPromptTemplate.fromMessages( [ ["system", `You are a French-to-English translator that detects if a message isn't in French. If it's not, you respond, "This is not French." Otherwise, you translate it to English.`], ["human", `${chatMsg}`] ] ) //runnable chain const chain = RunnableSequence.from([prompt, llm, outPutParser]) //chain result let result_stream = await chain.stream() //set response headers res.setHeader('Content-Type','application/json') res.setHeader('Transfer-Encoding','chunked') //create readable stream const readable = Readable.from(result_stream) res.status(201).write(`{"message": "Successful translation.", "response": "`); readable.on('data', (chunk) => { // Convert chunk to string and write it res.write(`${chunk}`); }); readable.on('end', () => { // Close the JSON response properly res.write('" }'); res.end(); }); readable.on('error', (err) => { console.error("Stream error:", err); res.status(500).json({ message: "Translation failed.", error: err.message }); }); }catch(e){ //deliver a 500 error response return res.status(500).json( { message:'Failed to send request.', error:e } ) } }) To run the front end of the code, go to your BASE_URL with the port given. This enables you to run the chatbot above and achieve similar results. The chatbot is basically HTML+CSS+JS. Where JavaScript is mainly used with fetch API to get a response. Thanks for reading. I hope you play around with the code and learn some new things. Additional Reads Introduction to LangChain.js Create an FAQ Bot on Azure Build a basic chat app in Python using Azure AI Foundry SDK
theophilusO
Mar 12, 2025 Place Educator Developer Blog
458Views
0likes
0Comments
IA y NET LATAM - Episodio 6
Buenas, Es un placer para nosotros, Bruno y Pablito Piova compartir con ustedes nuestras impresiones sobre el episodio 6 de la serie AI + .NET LATAM que tuvimos el honor de presentar el 6 de Diciembre junto con Jose Luis Latorre y Luis Beltran En el episodio número 6 de nuestra serie en Microsoft Reactor, exploramos cómo la inteligencia artificial (IA) está transformando el panorama tecnológico a través de herramientas innovadoras como Agentes autónomos, Semantic Kernel y otras tecnologías avanzadas. Además, discutimos las tendencias clave de IA que marcarán el 2025 y pudimos revisar algunas noticias frescas posteriores al gran evento Microsoft Ignite 2024. A continuación, destacamos algunos de los puntos más interesantes que se mencionaron en la charla y compartimos los enlaces de referencia: 6 AI trends you’ll see more of in 2025 Un repaso a las tendencias que marcan la hoja de ruta de la IA para el futuro próximo, desde modelos más potentes y accesibles, hasta el auge de los agentes inteligentes. Microsoft Ignite 2024 Book of News Un resumen completo de todos los anuncios más relevantes presentados en Ignite, incluyendo nuevos servicios, herramientas y mejoras para desarrolladores y profesionales de TI. Introducing Microsoft Copilot actions, new agents, and tools to empower IT| Microsoft 365 Blog Copilot va más allá del simple chat; ahora incluye agentes y acciones que automatizan tareas y mejoran la productividad empresarial. Ignite 2024: Announcing the Azure AI Foundry SDK Un nuevo SDK que unifica y facilita el despliegue y la orquestación de soluciones de IA en Azure, acelerando los ciclos de desarrollo. Introducing Azure AI Agent Service Nuevas funcionalidades que facilitan la creación y administración de agentes de IA capaces de interactuar con otras herramientas y servicios. New Copilot Prompt Gallery helps you discover, save, and share your favorite prompts | Microsoft Community Hub Una galería para descubrir, guardar y compartir prompts, facilitando el trabajo con modelos generativos. Ideal para estandarizar y reutilizar buenas prácticas. Unlocking the Power of Memory: Announcing General Availability of Semantic Kernel’s Memory Packages Una galería para descubrir, guardar y compartir prompts, facilitando el trabajo con modelos generativos. Ideal para estandarizar y reutilizar buenas prácticas. eShopLite-SemanticSearch | eShopLite-SemanticSearch-AzureAISearch Ejemplos prácticos sobre cómo incorporar búsqueda semántica e IA en aplicaciones, utilizando .NET y Azure. Azure AI Content Understanding Servicio en vista previa para procesar y comprender contenidos complejos (texto, imágenes, audio, video) y extraer información relevante. Estamos muy entusiasmados con la creciente participación e interés de la comunidad. Seguiremos comprometidos en ofrecer contenido de alta calidad que promueva el conocimiento y la innovación. Los invitamos a dejar sus comentarios, compartir sus opiniones y contarnos qué más les gustaría ver en futuros episodios. Agradecemos su apoyo y esperamos verlos en el próximo episodio, el 10 de enero de 2025. Registro: https://aka.ms/IAyNET-LATAM Redes de LinkedIn de Microsoft-Reactor: https://www.linkedin.com/showcase/microsoft-reactor/ Un saludo, Bruno y Pablito
pablitopiova
Dec 12, 2024 Place Educator Developer Blog
132Views
0likes
0Comments
Mastering Azure AI: #30DaysOfAzureAI Week Two Recap
Week two of #30DaysOfAzureAI focused on Intelligent App Developers is complete! Here is a quick recap.
glovebox
Apr 14, 2023 Place Educator Developer Blog
2.7KViews
2likes
0Comments