Azure OpenAI
Configure Embedding Models on Azure AI Foundry with Open Web UI
Introduction

Let's take a closer look at an exciting development in the AI space. Embedding models are the key to transforming complex data into usable insights, driving innovations like smarter chatbots and tailored recommendations. With Azure AI Foundry, Microsoft's powerful platform, you have the tools to build and scale these models effortlessly. Add in Open Web UI, an intuitive interface for engaging with AI systems, and you have a winning combination. In this article, we'll explore how embedding models on Azure AI Foundry, paired with Open Web UI, pave the way for accessible and impactful AI solutions for developers and businesses. Let's dive in!

Before configuring the embedding model from Azure AI Foundry on Open Web UI, please first complete the requirements below.

Requirements:

- Set up an Azure AI Foundry Hub/Project.
- Deploy Open Web UI – refer to my previous article on how you can deploy Open Web UI on an Azure VM.
- Optional: Deploy LiteLLM with Azure AI Foundry models to work with Open Web UI – refer to my previous article on how you can do this as well.

Deploying Embedding Models on Azure AI Foundry

Navigate to the Azure AI Foundry site and deploy an embedding model from the "Models + Endpoints" section. For the purpose of this demonstration, we will deploy the "text-embedding-3-large" model by OpenAI. You should receive a URL endpoint and an API key for the embedding model you just deployed. Take note of these credentials, because we will be using them in Open Web UI.

Configuring the embedding model on Open Web UI

Now head to the Open Web UI Admin Settings page > Documents and select Azure OpenAI as the embedding model engine. Copy and paste the base URL, the API key, the name of the embedding model deployed on Azure AI Foundry, and the API version (not the model version) into the corresponding fields. Click "Save" to apply the changes.
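If you want to sanity-check the endpoint and API key before relying on the Open Web UI configuration, you can call the deployed model directly. Below is a minimal sketch using the official openai Python package; the endpoint, key, and API version values are placeholders for your own deployment details.

```python
# A minimal sketch of calling the deployed embedding model directly, to verify
# the endpoint and API key from Azure AI Foundry. All values in angle brackets
# are placeholders for your own deployment.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",  # placeholder endpoint
    api_key="<your-api-key>",                                   # placeholder API key
    api_version="2024-02-01",  # the API version, not the model version
)

response = client.embeddings.create(
    model="text-embedding-3-large",  # the deployment name from Azure AI Foundry
    input="Hello from Open Web UI",
)

# text-embedding-3-large returns 3072-dimensional vectors by default
print(len(response.data[0].embedding))
```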
Expected Output

Now let us compare how Open Web UI behaves with and without the embedding model configured: first without any embedding model configured, then with the Azure OpenAI embedding model configured.

Conclusion

And there you have it! Embedding models on Azure AI Foundry, combined with the seamless interaction offered by Open Web UI, are changing how we approach AI solutions. This pairing simplifies the process of building and deploying intelligent systems and makes cutting-edge technology more accessible to developers and businesses of all sizes. Integrations like this will continue to drive innovation, breaking down barriers and unlocking new possibilities in the AI landscape. So, whether you're a seasoned developer or just stepping into this exciting field, now is the time to explore what Azure AI Foundry and Open Web UI can do for you. Let's keep pushing the boundaries of what's possible!

Azure AI Search - Tag Scoring profile on azureopenai extra_body

I created an index on Azure AI Search and connected it to Azure OpenAI using the extra_body. It works perfectly. However, I created a default scoring profile for my index, which boosts documents containing the string "zinc" in the VITAMINS field by a factor of 10. Since doing this, I can no longer run the query that worked previously without issues. Now, the query is asking for a scoringParameter, and when I attempt to pass it, I receive an error.

Here is the code that works fine when I remove the scoring function:

```python
client.chat.completions.create(
    model=os.getenv('DEPLOYMENT'),
    messages=messages,
    temperature=0.5,
    extra_body={
        "data_sources": [{
            "type": "azure_search",
            "parameters": {
                "endpoint": os.getenv('ENDPOINT'),
                "index_name": os.getenv('INDEX'),
                "semantic_configuration": os.getenv('RANK'),
                "query_type": "hybrid",
                "in_scope": True,
                "role_information": None,
                "strictness": 1,
                "top_n_documents": 3,
                "authentication": {
                    "type": "api_key",
                    "key": os.getenv('KEY')
                },
                "embedding_dependency": {
                    "type": "deployment_name",
                    "deployment_name": os.getenv('ADA_VIT')
                }
            }
        }]
    }
)
```

However, if I activate the default scoring profile, I get the following error:

> An error occurred: Error code: 400 - {'error': 'message': 'An error occurred when calling Azure Cognitive Search: Azure Search Error: 400, message=\'Server responded with status 400. Error message: {"error":{"code":"MissingRequiredParameter","message":"Expected 1 parameter(s) but 0 were supplied.\\r\\nParameter name: scoringParameter","details":[{"code":"MissingScoringParameter","message":"Expected 1 parameter(s) but 0 were supplied."}]}}\', api-version=2024-03-01-preview\'\nCall to Azure Search instance failed.\nAPI Users: Please ensure you are using the right instance, index_name, and provide admin_key as the api_key.\n'}

**If I try to pass the scoringParameter anywhere in the extra_body**, I receive this error:

> An error occurred: Error code: 400 - {'error': {'requestid': '', 'code': 400, 'message': 'Validation error at #/data_sources/0/azure_search/parameters/scoringParameter: Extra inputs are not permitted'}}

This error is even more confusing. I've been looking through various resources, but none of them seem to provide a clear example of how to properly pass the scoring profile or scoring parameters in the extra_body. Here's how I define my scoring profile using tags:

```python
from azure.search.documents.indexes.models import (
    ScoringProfile,
    TagScoringFunction,
    TagScoringParameters,
)

scoring_profiles = [
    ScoringProfile(
        name="my-scoring-profile",
        functions=[
            TagScoringFunction(
                field_name="VITAMINS",
                boost=10.0,
                parameters=TagScoringParameters(
                    tags_parameter="tags",
                ),
            )
        ]
    )
]
```

How do I pass the scoring parameters correctly in the `extra_body` of `client.chat.completions.create`?

PS: The only way I can get my code to work is to delete the scoring profile, or not make it the default scoring profile, but I do want to use it.
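For comparison (this is an illustration of how the profile's "tags" parameter is satisfied, not an answer to the extra_body question), a direct query through the azure-search-documents SDK passes each scoring parameter as a "name-value" string. The endpoint, index name, and key below are placeholders.

```python
# A sketch of exercising the same tag scoring profile with a direct Azure AI
# Search query, where scoring parameters are supplied as "<name>-<value>".
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

search_client = SearchClient(
    endpoint="https://<your-search>.search.windows.net",  # placeholder
    index_name="<your-index>",                            # placeholder
    credential=AzureKeyCredential("<admin-key>"),         # placeholder
)

results = search_client.search(
    search_text="supplements",
    scoring_profile="my-scoring-profile",
    scoring_parameters=["tags-zinc"],  # satisfies the profile's "tags" parameter
)
for doc in results:
    print(doc["@search.score"])
```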
Building an AI-Powered ESG Consultant Using Azure AI Services: A Case Study

In today's corporate landscape, Environmental, Social, and Governance (ESG) compliance has become increasingly important to stakeholders. To address the challenges of analyzing vast amounts of ESG data efficiently, a comprehensive AI-powered solution called ESGai has been developed. This blog explores how Azure AI services were leveraged to create a sophisticated ESG consultant for publicly listed companies.

https://youtu.be/5-oBdge6Q78?si=Vb9aHx79xk3VGYAh

The Challenge: Making Sense of Complex ESG Data

Organizations face significant challenges when analyzing ESG compliance data. Manual analysis is time-consuming, prone to errors, and difficult to scale. ESGai was designed to address these pain points by creating an AI-powered virtual consultant that provides detailed insights based on publicly available ESG data.

Solution Architecture: The Three-Agent System

ESGai implements a three-agent architecture, all powered by Azure's AI capabilities (a simplified sketch follows this list):

- Manager Agent: Breaks down complex user queries into manageable sub-questions containing specific keywords that facilitate vector-search retrieval. The system prompt includes generalized document headers from the vector database for context.
- Worker Agent: Processes the sub-questions generated by the Manager, connects to the vector database to retrieve relevant text chunks, and answers the sub-questions. Results are stored in Cosmos DB for later use.
- Director Agent: Consolidates the answers from the Worker agent into a comprehensive final response tailored specifically to the user's original query.

It's important to note that while conceptually there are three agents, the Worker is actually a single agent that gets called multiple times - once for each sub-question generated by the Manager.
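Here is a minimal sketch of that flow, assuming (as the current MVP does) that a single GPT-4o deployment serves all three roles. The `retrieve_chunks` argument is a hypothetical stand-in for the vector-search call against Azure AI Search, and the prompts are illustrative only.

```python
# A simplified sketch of the Manager -> Worker -> Director flow described above.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",  # placeholder
    api_key="<your-api-key>",                                   # placeholder
    api_version="2024-02-01",
)

def ask(system_prompt: str, user_msg: str) -> str:
    chat = client.chat.completions.create(
        model="gpt-4o",  # one deployment serves all three agent roles in the MVP
        messages=[{"role": "system", "content": system_prompt},
                  {"role": "user", "content": user_msg}],
    )
    return chat.choices[0].message.content

def answer_query(user_query: str, retrieve_chunks) -> str:
    # Manager: break the query into keyword-rich sub-questions (one per line).
    subqs = ask("Split this ESG query into sub-questions, one per line.", user_query).splitlines()
    # Worker: called once per sub-question, answering from retrieved context.
    answers = [ask(f"Answer using only this context:\n{retrieve_chunks(q)}", q)
               for q in subqs if q.strip()]
    # Director: consolidate the partial answers into the final response.
    return ask("Consolidate these partial answers for the user's original query.",
               user_query + "\n\n" + "\n\n".join(answers))
```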
Current Implementation State

The current MVP implementation has several limitations that are planned for expansion:

- Limited Company Coverage: The vector database currently stores data for only 2 companies, with 3 documents per company (Sustainability Report, XBRL, and BRSR).
- Single Model Deployment: Only one GPT-4o model is currently deployed to handle all agent functions.
- Basic Storage Structure: The Blob container has a simple structure with a single directory. While Azure Blob Storage doesn't natively support hierarchical folders, the team plans to implement virtual folders in the future.
- Free Tier Limitations: Due to funding constraints, the AI Search service is using the free tier, which limits vector data storage to 50 MB.
- Simplified Vector Database: The current index stores all 6 files (3 documents × 2 companies) in a single vector database without filtering capabilities or a schema definition.

Azure Services Powering ESGai

The implementation of ESGai leverages multiple Azure services for a robust and scalable architecture:

- Azure AI Services: Provides pre-built APIs, SDKs, and services that incorporate AI capabilities without requiring extensive machine-learning expertise. This includes access to 62 pre-trained models for chat completions through the AI Foundry portal.
- Azure OpenAI: Hosts the GPT-4o model for generating responses and the Ada embedding model for vectorization. The service combines OpenAI's advanced language models with Azure's security and enterprise features.
- Azure AI Foundry: Serves as an integrated platform for developing, deploying, and governing generative AI applications. It offers a centralized management centre that consolidates subscription information, connected resources, access privileges, and usage quotas.
- Azure AI Search (formerly Cognitive Search): Provides both full-text and vector search capabilities, using the OpenAI ada-002 embedding model for vectorization. It is configured with a hybrid search algorithm (BM25 with Reciprocal Rank Fusion) for optimal chunk ranking.
- Azure Storage Services: Utilizes Blob Storage for storing PDFs, Business Responsibility and Sustainability Reports (BRSRs), and other essential documents. It integrates seamlessly with AI Search using indexers that track database changes.
- Cosmos DB: Employs the MongoDB API within Cosmos DB as a NoSQL database for storing chat history between agents and users.
- Azure App Service: Hosts the web application using a B3-tier plan optimized for cost efficiency, with GitHub Actions integrated for continuous deployment.

Project Evolution: From Concept to Deployment

The development of ESGai followed a structured approach through several phases:

Phase 1: Data Cleaning
- Extracted specific KPIs from XML/XBRL datasets and BRSR reports containing ESG data for 1,000 listed companies.
- Cleaned and standardized data to ensure consistency and accuracy.

Phase 2: RAG Framework Development
- Implemented Retrieval-Augmented Generation (RAG) to enhance responses by dynamically fetching relevant information.
- Created a workflow that includes query processing, data retrieval, and response generation.

Phase 3: Initial Deployment
- Deployed models locally using Docker and n8n automation tools for testing.
- Identified the need for more scalable web services.

Phase 4: Transition to Azure Services
- Migrated automation workflows from n8n to Azure AI Foundry services.
- Leveraged Azure's comprehensive suite of AI services, storage solutions, and app-hosting capabilities.

Technical Implementation Details

Model configurations. The GPT model is configured with:

- Model version: 2024-11-20
- Temperature: 0.7
- Max response tokens: 800
- Past messages: 10
- Top-p: 0.95
- Frequency/presence penalties: 0

The embedding model uses OpenAI text-embedding-ada-002 with 1,536 dimensions and hybrid semantic search (BM25 RRF), as reflected in the sketch below.
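Applied to a chat-completions call, those settings look roughly like this. Only the parameter values come from the write-up; the client setup, deployment name, and message are placeholder assumptions, and the "past messages: 10" window would be enforced in application code by trimming the history before each call.

```python
# The documented GPT-4o settings applied to a chat-completions request.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",  # placeholder
    api_key="<your-api-key>",                                   # placeholder
    api_version="2024-02-01",                                   # placeholder API version
)

completion = client.chat.completions.create(
    model="gpt-4o",  # deployment name placeholder
    messages=[{"role": "user", "content": "Summarize this company's ESG posture."}],
    temperature=0.7,      # per the configuration above
    max_tokens=800,       # "Max response tokens: 800"
    top_p=0.95,
    frequency_penalty=0,
    presence_penalty=0,
)
print(completion.choices[0].message.content)
```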
Cost Analysis and Efficiency

A detailed cost breakdown reveals the following (per-query figures as measured, alongside the app server cost):

- App Server: $390-400
- AI Search: $5 per query
- RAG query processing: $4.76 per query

Agent-specific costs:
- Manager: $0.05 (30 input tokens, 210 output tokens)
- Worker: $3.71 (1,500 input tokens, 1,500 output tokens)
- Director: $1.00 (600 input tokens, 600 output tokens)

Challenges and Solutions

The team faced several challenges during implementation:

- Quota Limitations: Initial deployments encountered token quota restrictions, which were resolved through Azure support requests (typically granted within 24 hours).
- Cost Optimization: High costs associated with vectorization required careful monitoring. The team addressed this by shutting down unused services and deploying on services with free tiers.
- Integration Issues: GitHub Actions raised errors during deployment, which were resolved using GitHub's App Service Build Service.
- Azure UI Complexity: The team noted that Azure AI service naming conventions were sometimes confusing, as the same name is used for both parent and child resources.
- Free Tier Constraints: The AI Search free tier's 50 MB limit on vector data storage restricts the amount of company information that can be included in the current implementation.

Future Roadmap

The current implementation is an MVP with several areas for expansion:

- Expand the database to include more publicly available sustainability reports beyond the current two companies.
- Optimize token usage by refining query-handling processes.
- Research alternative embedding models to reduce costs while maintaining accuracy.
- Implement a more structured storage system with virtual folders in Blob Storage.
- Upgrade from the free tier of AI Search to support larger data volumes.
- Develop a proper schema for the vector database to enable filtering and more targeted searches.
- Scale to multiple GPT model deployments for improved performance and redundancy.

Conclusion

ESGai demonstrates how advanced AI techniques like Retrieval-Augmented Generation can transform data-intensive domains such as ESG consulting. By leveraging Azure's comprehensive suite of AI services alongside a robust agent-based architecture, this solution provides users with actionable insights while maintaining scalability and cost efficiency.
Create your own QA RAG Chatbot with LangChain.js + Azure OpenAI Service

Demo: Mpesa for Business Setup QA RAG Application

In this tutorial we are going to build a question-answering RAG chat web app. We use Node.js with HTML, CSS, and JavaScript, and incorporate LangChain.js + Azure OpenAI + a MongoDB vector store (MongoDB search index). Get a quick look below.

Note: Documents and illustrations shared here are for demo purposes only, and Microsoft and its products are not part of Mpesa. The content demonstrated here should be used for educational purposes only. Additionally, all views shared here are solely mine.

What you will need:

- An active Azure subscription - get Azure for Students for free, or get started with Azure for 12 months free.
- VS Code
- Basic knowledge of JavaScript (not a must)
- Access to Azure OpenAI - click here if you don't have access.
- A MongoDB account (you can also use the Azure Cosmos DB vector store)

Setting Up the Project

To build this project, you will have to fork this repository and clone it. GitHub repository link: https://github.com/tiprock-network/azure-qa-rag-mpesa . Follow the steps highlighted in the README.md to set up the project under "Setting Up the Node.js Application".

Create the Resources that you Need

To do this, you will need to have the Azure CLI or Azure Developer CLI installed on your computer. Go ahead and follow the steps indicated in the README.md to create Azure resources under "Azure Resources Set Up with Azure CLI".

You might want to log in to Azure CLI with a device code instead. Here's how you can do this. Instead of using az login, you can do

```
az login --use-device-code
```

or, if you prefer the Azure Developer CLI, execute this command instead:

```
azd auth login --use-device-code
```

Remember to update the .env file with the values you used to name the Azure OpenAI instance, the Azure models, and the API keys you obtained while creating your resources.

Setting Up MongoDB

After accessing your MongoDB account, get the URI link to your database and add it to the .env file, along with your database name and the vector store collection name you specified while creating your indexes for vector search.

Running the Project

To run this Node.js project, start it with the following command:

```
npm run dev
```

The Vector Store

The vector store used in this project is MongoDB, where the word embeddings are stored. From the embeddings model instance we created on Azure AI Foundry, we are able to create embeddings that can be stored in a vector store. The following code shows our embeddings model instance:

```javascript
// create new embedding model instance
const azOpenEmbedding = new AzureOpenAIEmbeddings({
    azureADTokenProvider,
    azureOpenAIApiInstanceName: process.env.AZURE_OPENAI_API_INSTANCE_NAME,
    azureOpenAIApiEmbeddingsDeploymentName: process.env.AZURE_OPENAI_API_DEPLOYMENT_EMBEDDING_NAME,
    azureOpenAIApiVersion: process.env.AZURE_OPENAI_API_VERSION,
    azureOpenAIBasePath: "https://eastus2.api.cognitive.microsoft.com/openai/deployments"
});
```

The code in uploadDoc.js offers a simple way to create embeddings and store them in MongoDB. In this approach the text from the documents is loaded using the PDFLoader from the LangChain community package. The following code demonstrates how the embeddings are stored in the vector store:
```javascript
// Call the function and handle the result with await
const storeToCosmosVectorStore = async () => {
    try {
        const documents = await returnSplittedContent()

        // create store instance
        const store = await MongoDBAtlasVectorSearch.fromDocuments(
            documents,
            azOpenEmbedding,
            {
                collection: vectorCollection,
                indexName: "myrag_index",
                textKey: "text",
                embeddingKey: "embedding",
            }
        )

        if (!store) {
            console.log('Something wrong happened while creating store or getting store!')
            return false
        }

        console.log('Done creating/getting and uploading to store.')
        return true
    } catch (e) {
        console.log(`This error occurred: ${e}`)
        return false
    }
}
```

In this setup, question answering (QA) is achieved by integrating Azure OpenAI's GPT-4o with MongoDB Vector Search through LangChain.js. The system processes user queries via an LLM (Large Language Model), which retrieves relevant information from a vectorized database, ensuring contextual and accurate responses. Azure OpenAI embeddings convert text into dense vector representations, enabling semantic search within MongoDB. The LangChain RunnableSequence structures the retrieval and response-generation workflow, while the StringOutputParser ensures proper text formatting. The most relevant pieces are the AzureChatOpenAI instantiation, the MongoDB connection setup, and the API endpoint that handles QA queries using vector search and embeddings. The snippets below explain the major parts of the code.

Azure AI Chat Completion Model

This is the model used in this implementation of RAG as the chat completion model:

```javascript
const llm = new AzureChatOpenAI({
    azTokenProvider,
    azureOpenAIApiInstanceName: process.env.AZURE_OPENAI_API_INSTANCE_NAME,
    azureOpenAIApiDeploymentName: process.env.AZURE_OPENAI_API_DEPLOYMENT_NAME,
    azureOpenAIApiVersion: process.env.AZURE_OPENAI_API_VERSION
})
```

Using a Runnable Sequence to Stream Chat Output

This shows how a runnable sequence can be used to produce a response in the format defined by the output parser added to the chain:

```javascript
// Stream response
app.post(`${process.env.BASE_URL}/az-openai/runnable-sequence/stream/chat`, async (req, res) => {
    // check for human message
    const { chatMsg } = req.body
    if (!chatMsg) return res.status(201).json({
        message: 'Hey, you didn\'t send anything.'
    })

    // put the code in an error-handler
    try {
        // create a prompt template
        const prompt = ChatPromptTemplate.fromMessages([
            ["system", `You are a French-to-English translator that detects if a message isn't in French. If it's not, you respond, "This is not French." Otherwise, you translate it to English.`],
            ["human", `${chatMsg}`]
        ])

        // runnable chain
        const chain = RunnableSequence.from([prompt, llm, outPutParser])

        // chain result
        let result_stream = await chain.stream()

        // set response headers
        res.setHeader('Content-Type', 'application/json')
        res.setHeader('Transfer-Encoding', 'chunked')

        // create readable stream
        const readable = Readable.from(result_stream)

        res.status(201).write(`{"message": "Successful translation.", "response": "`);
        readable.on('data', (chunk) => {
            // Convert chunk to string and write it
            res.write(`${chunk}`);
        });
        readable.on('end', () => {
            // Close the JSON response properly
            res.write('" }');
            res.end();
        });
        readable.on('error', (err) => {
            console.error("Stream error:", err);
            res.status(500).json({ message: "Translation failed.", error: err.message });
        });
    } catch (e) {
        // deliver a 500 error response
        return res.status(500).json({
            message: 'Failed to send request.',
            error: e
        })
    }
})
```

To run the front end of the code, go to your BASE_URL with the given port. This lets you run the chatbot shown above and achieve similar results. The chatbot is plain HTML + CSS + JS, with JavaScript's Fetch API used to get responses from the server.

Thanks for reading. I hope you play around with the code and learn some new things.

Additional Reads

- Introduction to LangChain.js
- Create an FAQ Bot on Azure
- Build a basic chat app in Python using the Azure AI Foundry SDK
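If you want to exercise the streaming endpoint from a script rather than the browser front end, a minimal test client could look like the sketch below. The host, port, and BASE_URL value are assumptions for a local run; the route and request body come from the server code above.

```python
# A minimal sketch of a test client for the chunked streaming endpoint.
import requests

# Assumption: the server runs locally on port 3000 and BASE_URL is set to /api in .env
url = "http://localhost:3000/api/az-openai/runnable-sequence/stream/chat"

# POST the chat message the endpoint expects and print the chunked response as it arrives
with requests.post(url, json={"chatMsg": "Bonjour tout le monde"}, stream=True) as resp:
    for chunk in resp.iter_content(chunk_size=None, decode_unicode=True):
        print(chunk, end="", flush=True)
```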
How to Use SemanticKernel with OpenAI and Azure OpenAI in C#

Discover the future of AI with Semantic Kernel for C# — your gateway to integrating cutting-edge language models. Jumpstart your projects with our easy-to-follow guides and examples. Get ready to elevate your applications to new heights!