langchain
15 Topics

LangChain integration with Azure Database for PostgreSQL (Part 1)
Use LangChain to split documents into smaller chunks, generate embeddings for each chunk using Azure OpenAI, and store them in a PostgreSQL database via the pgvector extension. Then, we’ll perform a vector similarity search on the embedded documents.
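A rough sketch of that pipeline is shown below. The package choices, embedding deployment, and connection string are illustrative assumptions, not the article's exact code:

import os
from langchain_openai import AzureOpenAIEmbeddings
from langchain_postgres import PGVector
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Split a document into overlapping chunks
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.create_documents(["...your document text here..."])

# Embed each chunk with Azure OpenAI (deployment name is an assumption)
embeddings = AzureOpenAIEmbeddings(model="text-embedding-3-small")

# Store the vectors in Postgres via pgvector, then run a similarity search
store = PGVector.from_documents(
    chunks,
    embeddings,
    connection="postgresql+psycopg://user:password@server:5432/db",  # placeholder connection string
)
print(store.similarity_search("What is pgvector?", k=3))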
Democratizing Data: Unleashing Power of AI Analytics with LangChain and Azure OpenAI Services

Explore how LangChain and Azure OpenAI Services are revolutionizing data analytics. Discover the transformative potential of Generative AI and Large Language Models in making data analytics accessible to everyone, irrespective of their coding expertise or data science background. Dive into the behind-the-scenes magic of the LangChain agent and learn how it simplifies the user experience by dynamically generating Python code for data analysis.
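The core idea of an agent that writes and runs Python over your data can be sketched with LangChain's experimental pandas agent. This is an illustrative assumption, not the article's exact code; the deployment name and toy data are placeholders:

import pandas as pd
from langchain_experimental.agents import create_pandas_dataframe_agent
from langchain_openai import AzureChatOpenAI

# Toy dataset standing in for real analytics data
df = pd.DataFrame({"region": ["East", "West", "North"], "sales": [120, 95, 143]})

# Assumes an Azure OpenAI chat deployment; endpoint and key come from environment variables
llm = AzureChatOpenAI(azure_deployment="gpt-4o", api_version="2024-08-01-preview")

# The agent dynamically generates and executes pandas code to answer plain-language questions
agent = create_pandas_dataframe_agent(llm, df, allow_dangerous_code=True, verbose=True)
agent.invoke("Which region has the highest sales, and by how much?")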
How to build Tool-calling Agents with Azure OpenAI and LangGraph

Introducing MyTreat

Our demo is a fictional website that shows customers their total bill in dollars, with the option of getting the total bill in their local currencies. The button sends a request to the Node.js service, and a response is returned by our agent based on the tool it chooses. Let’s dive in and understand how this works from a broader perspective.

Prerequisites

- An active Azure subscription. You can sign up for a free trial here, or get $100 worth of credits on Azure every year if you are a student.
- A GitHub account (optional)
- Node.js LTS 18+
- VS Code installed (or your favorite IDE)
- Basic knowledge of HTML, CSS and JavaScript

Creating an Azure OpenAI Resource

Go to your browser and key in portal.azure.com to access the Microsoft Azure portal. There, navigate to the search bar and type Azure OpenAI. Click + Create, fill in the input boxes with appropriate values (for example, as shown below), then click Next until you reach Review + submit, and finally click Create. After the deployment is done, go to the deployment and open the Azure AI Foundry portal using the button shown below. You can also use the link, as demonstrated below.

In the Azure AI Foundry portal, we have to create our model instance, so go to Model catalog on the left panel beneath Get started. Select a desired model; in this case I used gpt-35-turbo for chat completion (in your case, use gpt-4o). Here is one way of doing this:

- Choose a model (gpt-4o)
- Click Deploy
- Give the deployment a new name, e.g. myTreatmodel, then click Deploy and wait for it to finish

On the left panel, go to Deployments and you will see the model you have created.

Access your Azure OpenAI Resource Key

Go back to the Azure portal, specifically to the deployment instance we created, and select Resource Management on the left panel. Click Keys and Endpoint. Copy either of the keys as shown below and keep it very safe, as we will use it in our .env file.

Configuring your project

Create a new project folder on your local machine and add these variables to the .env file in the root folder:

AZURE_OPENAI_API_INSTANCE_NAME=
AZURE_OPENAI_API_DEPLOYMENT_NAME=
AZURE_OPENAI_API_KEY=
AZURE_OPENAI_API_VERSION="2024-08-01-preview"
LANGCHAIN_TRACING_V2="false"
LANGCHAIN_CALLBACKS_BACKGROUND="false"
PORT=4556

Starting a new project

Go to https://github.com/tiprock-network/mytreat.git and follow the instructions to set up the new project. If you do not have Git installed, click the Code button and press Download ZIP; this will give you the project folder, and the setup procedure is the same.

Creating a custom tool

In the utils folder, the math tool was created. The code shown below uses tool from LangChain to build a tool, and the schema of the tool is created using Zod, a library that helps validate an object’s property values. The price function takes in an array of prices and the exchange rate, adds the prices up, and converts the sum using the exchange rate, as shown below.
import { tool } from '@langchain/core/tools'
import { z } from 'zod'

const priceConv = tool((input) => {
    // Get the prices and add them up after turning each into a number
    let sum = 0
    input.prices.forEach((price) => {
        let price_check = parseFloat(price)
        sum += price_check
    })
    // Now convert the total using the exchange rate
    let final_price = parseFloat(input.exchange_rate) * sum
    // Return the converted total
    return final_price
}, {
    name: 'add_prices_and_convert',
    description: 'Add prices and convert based on exchange rate.',
    schema: z.object({
        prices: z.number({
            required_error: 'Price should not be empty.',
            invalid_type_error: 'Price must be a number.'
        }).array().nonempty().describe('Prices of items listed.'),
        exchange_rate: z.string().describe('Current currency exchange rate.')
    })
})

export { priceConv }

Utilizing the tool

In the controllers folder we bring the tool in by importing it, then pass it into our array of tools. Notice that we also have the Tavily search tool; you can learn how to implement it in the Additional Reads section, or simply remove it.

Agent Model and the Call Process

This code defines an AI agent using LangGraph and LangChain.js, powered by gpt-4o from Azure OpenAI. It initializes a ToolNode to manage tools like priceConv and binds them to the agent model. The StateGraph handles decision-making, determining whether the agent should call a tool or return a direct response. If a tool is needed, the workflow routes the request accordingly; otherwise, the agent responds to the user. The callModel function invokes the agent, processing messages and ensuring seamless tool integration.

// Create the tools the agent will use
// const agentTools = [new TavilySearchResults({ maxResults: 5 }), priceConv]
const agentTools = [priceConv]
const toolNode = new ToolNode(agentTools)

const agentModel = new AzureChatOpenAI({
    model: 'gpt-4o',
    temperature: 0,
    azureOpenAIApiKey: AZURE_OPENAI_API_KEY,
    azureOpenAIApiInstanceName: AZURE_OPENAI_API_INSTANCE_NAME,
    azureOpenAIApiDeploymentName: AZURE_OPENAI_API_DEPLOYMENT_NAME,
    azureOpenAIApiVersion: AZURE_OPENAI_API_VERSION
}).bindTools(agentTools)

// Decide whether to continue with a tool call or stop
const shouldContinue = (state) => {
    const { messages } = state
    const lastMessage = messages[messages.length - 1]
    // Upon a tool call we go to the tools node
    if ("tool_calls" in lastMessage && Array.isArray(lastMessage.tool_calls) && lastMessage.tool_calls?.length) return "tools";
    // If no tool call is made we stop and return back to the user
    return END
}

const callModel = async (state) => {
    const response = await agentModel.invoke(state.messages)
    return { messages: [response] }
}

// Define a new graph
const workflow = new StateGraph(MessagesAnnotation)
    .addNode("agent", callModel)
    .addNode("tools", toolNode)
    .addEdge(START, "agent")
    .addConditionalEdges("agent", shouldContinue, ["tools", END])
    .addEdge("tools", "agent")

const appAgent = workflow.compile()

The searchAgentController is a GET endpoint that accepts user queries (text_message). It processes the input through the compiled LangGraph workflow, invoking the agent to generate a response. If a tool is required, the agent calls it before finalizing the output. The response is then sent back to the user, ensuring dynamic and efficient tool-assisted reasoning. It is implemented with the following code:
const searchAgentController = async (req, res) => {
    // Get the human text
    const { text_message } = req.query
    if (!text_message) return res.status(400).json({
        message: 'No text sent.'
    })
    // Invoke the agent
    const agentFinalState = await appAgent.invoke(
        { messages: [new HumanMessage(text_message)] },
        { streamMode: 'values' }
    )
    // Send back the agent's final answer
    res.status(200).json({
        text: agentFinalState.messages[agentFinalState.messages.length - 1].content
    })
}

Frontend

The frontend is a simple HTML + CSS + JS stack that demonstrates how you can use an API to integrate this AI agent into your website. It sends a GET request and uses the response to display the right answer. Below is an illustration of how the Fetch API has been used.

There you go! We have successfully created a basic tool-calling agent using Azure and LangChain. Go ahead and expand the code base to your liking. If you have questions, you can comment below or reach out on my socials.

Additional Reads

- Azure OpenAI Service Models
- Generative AI for Beginners
- AI Agents for Beginners Course
- LangGraph Tutorial
- Develop Generative AI Apps in Azure AI Foundry Portal
Build Enterprise-Ready AI Agents with the New Azure Postgres LangChain + LangGraph Connector

AI agents are only as powerful as the data layer behind them. That’s why we’re excited to announce a native LangChain + LangGraph connector for Azure Database for PostgreSQL. With this release, Postgres becomes your single source of truth for AI agents, handling knowledge retrieval, chat history, and long-term memory all in one place.

This new connector is packed with everything you need to build secure, scalable, enterprise-ready AI agents on Azure without the complexity. With Entra ID authentication, DiskANN acceleration, a vector store, and a dedicated agent store, you can go from prototype to production on Azure faster than ever. You can quickly get started with the connector today:

pip install langchain-azure-postgresql

In this post, we’ll cover:

- How the Azure Postgres connector for LangGraph can serve as the single persistence + retrieval layer for an AI agent
- The new first-class connector for LangChain + LangGraph
- A practical example to help you get started

Azure PostgreSQL as the single persistence + retrieval layer for an AI agent

When building AI agents today, developers face a fragmented stack:

- Vector storage and search require a library, service, or separate database.
- Chat history and short-term memory need yet another data source.
- Long-term memory often means bolting on yet another system.

This sprawl leads to complex integrations, higher costs, and weaker security, making it hard to scale AI agents reliably.

The Solution

The new Azure Postgres connector for LangChain + LangGraph turns your Azure Postgres database into the single persistence + retrieval layer for AI agents. Instead of working with a fragmented stack, developers can now:

- Run embeddings + semantic search with built-in DiskANN acceleration in the same database that powers their application logic.
- Persist chat history and short-term memory, and keep agent conversations grounded via seamless context retrieval from data stored in Postgres.
- Capture, retrieve, and evolve knowledge over time with built-in long-term memory, without bolting on external systems.

All in one database: simplified, secure, and enterprise ready. Postgres becomes the persistence and retrieval data layer for your AI agent.

Built for Enterprise Readiness: LangChain + LangGraph Connector

This release unlocks several new capabilities that make it easy to build robust, production-ready agents:

- Auth with Entra ID: Enterprise-grade identity to securely connect LangChain + LangGraph workflows to Azure Database for PostgreSQL within a centrally managed, identity-based security perimeter.
- DiskANN & extensions: First-class support for faster vector search using pgvector combined with DiskANN indexing, enabling support for high-dimensional vectors and cost-efficient search. Additionally, helper functions ensure your favorite extensions are installed.
- Native vector store: Store and query embeddings, enabling semantic search and retrieval-augmented generation (RAG) scenarios.
- Dedicated agent store: Persist agent state, memory, and chat history with structured access patterns, perfect for multi-turn conversations and long-term context.

Together, these features give developers a turnkey persistence solution for building reliable AI agents without stitching together multiple storage systems.

Using LangGraph on Azure Database for PostgreSQL

Using LangGraph with Azure Database for PostgreSQL takes just a few steps.

1. Enable the vector & pg_diskann extensions: allowlist the vector and pg_diskann extensions within your server configuration.
2. Install the LangChain + LangGraph connector and its companion packages:

pip install langchain-azure-postgresql
pip install -qU langchain-openai
pip install -qU azure-identity

3. Log in to Azure with your Entra ID. Run az login in the terminal where you will also run the LangGraph code:

az login

To get started, set up a production-ready vector store for your agent in a few lines of code:

# 1. Auth: securely connect to Azure Postgres
connection_pool = AzurePGConnectionPool(azure_conn_info=ConnectionInfo(host=os.environ["PGHOST"]))

# 2. Create embeddings
embeddings = AzureOpenAIEmbeddings(model="text-embedding-3-small")

# 3. Initialize a vector store in Postgres with DiskANN
vector_store = AzurePGVectorStore(connection=connection_pool, embedding=embeddings)

Next, use LangGraph to build a sample agent. Here’s a practical example that combines vector search and a checkpointer inside Postgres:

# 4. Define the tool for data retrieval.
def get_data_from_vector_store(query: str) -> str:
    """Get data from the vector store."""
    results = vector_store.similarity_search(query)
    return results

# 5. Define the agent, checkpointer and memory store.
with connection_pool.getconn() as conn:
    agent = create_react_agent(
        model=model,
        tools=[get_data_from_vector_store],
        checkpointer=PostgresSaver(conn)
    )

    # 6. Run the agent and print results
    config = {"configurable": {"thread_id": "1", "user_id": "1"}}
    response = agent.invoke(
        {"messages": [{"role": "user", "content": "What does my database say about cats? Make sure you address me with my name"}]},
        config
    )
    for msg in response["messages"][-2:]:
        msg.pretty_print()

With just a few lines of code, you can:

- Use the vector store backed by Postgres
- Enable DiskANN for semantic search
- Use checkpointers for short-term conversation history

Learn More

This is just the beginning. With native LangChain + LangGraph support in Azure PostgreSQL, developers can now rely on a single, secure, high-performance data layer for building the next generation of AI agents.

👉 Ready to start? All the code is available in the Azure Postgres Agents Demo GitHub repository. See how easy it is to bring your AI agent to life on Azure.

👉 Check out the docs for more details on the LangChain + LangGraph connector.
How to use any Python AI agent framework with free GitHub Models

I ❤️ when companies offer free tiers for developer services, since it gives everyone a way to learn new technologies without breaking the bank. Free tiers are especially important for students and people between jobs, when the desire to learn is high but the available cash is low.

That's why I'm such a fan of GitHub Models: free, high-quality generative AI models available to anyone with a GitHub account. The available models include the latest OpenAI LLMs (like o3-mini), LLMs from the research community (like Phi and Llama), LLMs from other popular providers (like Mistral and Jamba), multimodal models (like gpt-4o and llama-vision-instruct), and even a few embedding models (from OpenAI and Cohere). With access to such a range of models, you can prototype complex multi-model workflows to improve your productivity or heck, just make something fun for yourself. 🤗

To use GitHub Models, you can start off in no-code mode: open the playground for a model, send a few requests, tweak the parameters, and check out the answers. When you're ready to write code, select "Use this model". A screen will pop up where you can select a programming language (Python/JavaScript/C#/Java/REST) and select an SDK (which varies depending on the model). Then you'll get instructions and code for that model, language, and SDK.

But here's what's really cool about GitHub Models: you can use them with all the popular Python AI frameworks, even if the framework has no specific integration with GitHub Models. How is that possible? The vast majority of Python AI frameworks support the OpenAI Chat Completions API, since that API became a de facto standard supported by many LLM API providers besides OpenAI itself. GitHub Models also provides OpenAI-compatible endpoints for chat completion models. Therefore, any Python AI framework that supports OpenAI-like models can be used with GitHub Models as well. 🎉

To prove it, I've made a new repository with examples from eight different Python AI agent packages, all working with GitHub Models: python-ai-agent-frameworks-demos. There are examples for AutoGen, LangGraph, LlamaIndex, OpenAI Agents SDK, OpenAI standard SDK, PydanticAI, Semantic Kernel, and SmolAgents. You can open that repository in GitHub Codespaces, install the packages, and get the examples running immediately.

Now let's walk through the API connection code for GitHub Models for each framework. Even if I missed your favorite framework, I hope my tips here will help you connect any framework to GitHub Models.

OpenAI

I'll start with openai, the package that started it all!

import os

import openai

client = openai.OpenAI(
    api_key=os.environ["GITHUB_TOKEN"],
    base_url="https://models.inference.ai.azure.com")

The code above demonstrates the two key parameters we'll need to configure for all frameworks:

- api_key: When using OpenAI.com, you pass your OpenAI API key here. When using GitHub Models, you pass in a personal access token (PAT). If you open the repository (or any repository) in GitHub Codespaces, a PAT is already stored in the GITHUB_TOKEN environment variable. However, if you're working locally with GitHub Models, you'll need to generate a PAT yourself and store it. PATs expire after a while, so you need to generate new PATs every so often.
- base_url: This parameter tells the OpenAI client to send all requests to "https://models.inference.ai.azure.com" instead of the OpenAI.com API servers. That's the domain that hosts the OpenAI-compatible endpoint for GitHub Models, so you'll always pass that domain as the base URL.
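With the client configured, a quick smoke test is an ordinary chat completion call; only the model name and base URL differ from OpenAI.com usage. This snippet isn't from the repository, just a minimal sanity check:

import os

import openai

client = openai.OpenAI(
    api_key=os.environ["GITHUB_TOKEN"],
    base_url="https://models.inference.ai.azure.com")

# Same Chat Completions API as OpenAI.com, served by GitHub Models
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Say hello in Spanish."}])
print(response.choices[0].message.content)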
OpenAI Agents SDK

If we're working with the new openai-agents SDK, we use very similar code, but we must use the AsyncOpenAI client from openai instead. Lately, Python AI packages are defaulting to async, because it's so much better for performance.

import os

import agents
import openai

client = openai.AsyncOpenAI(
    base_url="https://models.inference.ai.azure.com",
    api_key=os.environ["GITHUB_TOKEN"])

model = agents.OpenAIChatCompletionsModel(
    model="gpt-4o",
    openai_client=client)

spanish_agent = agents.Agent(
    name="Spanish agent",
    instructions="You only speak Spanish.",
    model=model)

PydanticAI

Now let's look at all of the packages that make it really easy for us, by allowing us to directly bring in an instance of either OpenAI or AsyncOpenAI. For PydanticAI, we configure an AsyncOpenAI client, then construct an OpenAIModel object from PydanticAI, and pass that model to the agent:

import os

import openai
import pydantic_ai
import pydantic_ai.models.openai
from pydantic_ai.providers.openai import OpenAIProvider

client = openai.AsyncOpenAI(
    api_key=os.environ["GITHUB_TOKEN"],
    base_url="https://models.inference.ai.azure.com")

model = pydantic_ai.models.openai.OpenAIModel(
    "gpt-4o",
    provider=OpenAIProvider(openai_client=client))

spanish_agent = pydantic_ai.Agent(
    model,
    system_prompt="You only speak Spanish.")

Semantic Kernel

For Semantic Kernel, the code is very similar. We configure an AsyncOpenAI client, then construct an OpenAIChatCompletion object from Semantic Kernel, and add that object to the kernel:

import os

import openai
import semantic_kernel
import semantic_kernel.agents
import semantic_kernel.connectors.ai.open_ai

chat_client = openai.AsyncOpenAI(
    api_key=os.environ["GITHUB_TOKEN"],
    base_url="https://models.inference.ai.azure.com")

chat = semantic_kernel.connectors.ai.open_ai.OpenAIChatCompletion(
    ai_model_id="gpt-4o",
    async_client=chat_client)

# Create the kernel that will host the chat service
kernel = semantic_kernel.Kernel()
kernel.add_service(chat)

spanish_agent = semantic_kernel.agents.ChatCompletionAgent(
    kernel=kernel,
    name="Spanish agent",
    instructions="You only speak Spanish")

AutoGen

Next, we'll check out a few frameworks that have their own wrapper of the OpenAI clients, so we won't be using any classes from openai directly. For AutoGen, we configure both the OpenAI parameters and the model name in the same object, then pass that to each agent:

import os

import autogen_agentchat.agents
import autogen_ext.models.openai

client = autogen_ext.models.openai.OpenAIChatCompletionClient(
    model="gpt-4o",
    api_key=os.environ["GITHUB_TOKEN"],
    base_url="https://models.inference.ai.azure.com")

spanish_agent = autogen_agentchat.agents.AssistantAgent(
    "spanish_agent",
    model_client=client,
    system_message="You only speak Spanish")

LangGraph

For LangGraph, we configure a very similar object, which even has the same parameter names:

import os

import langchain_openai
import langgraph.graph
from langgraph.graph import MessagesState

model = langchain_openai.ChatOpenAI(
    model="gpt-4o",
    api_key=os.environ["GITHUB_TOKEN"],
    base_url="https://models.inference.ai.azure.com",
)

def call_model(state):
    messages = state["messages"]
    response = model.invoke(messages)
    return {"messages": [response]}

workflow = langgraph.graph.StateGraph(MessagesState)
workflow.add_node("agent", call_model)

SmolAgents

Once again, for SmolAgents, we configure a similar object, though with slightly different parameter names:

import os

import smolagents

model = smolagents.OpenAIServerModel(
    model_id="gpt-4o",
    api_key=os.environ["GITHUB_TOKEN"],
    api_base="https://models.inference.ai.azure.com")

agent = smolagents.CodeAgent(model=model)

LlamaIndex

I saved LlamaIndex for last, as it is the most different.
The llama-index package has a different constructor for OpenAI.com versus OpenAI-like servers, so I opted to use the OpenAILike constructor. However, I also needed an embeddings model for my example, and the package doesn't have an OpenAIEmbeddingsLike constructor, so I used the standard OpenAIEmbedding constructor:

import os

import llama_index.core.agent.workflow
import llama_index.embeddings.openai
import llama_index.llms.openai_like
from llama_index.core import Settings

Settings.llm = llama_index.llms.openai_like.OpenAILike(
    model="gpt-4o",
    api_key=os.environ["GITHUB_TOKEN"],
    api_base="https://models.inference.ai.azure.com",
    is_chat_model=True)

Settings.embed_model = llama_index.embeddings.openai.OpenAIEmbedding(
    model="text-embedding-3-small",
    api_key=os.environ["GITHUB_TOKEN"],
    api_base="https://models.inference.ai.azure.com")

# query_engine_tools is defined elsewhere in the full example
agent = llama_index.core.agent.workflow.ReActAgent(
    tools=query_engine_tools,
    llm=Settings.llm)

Choose your models wisely!

In all of the examples above, I specified the gpt-4o model. The gpt-4o model is a great choice for agents because it supports function calling, and many agent frameworks only work (or work best) with models that natively support function calling. Fortunately, GitHub Models includes multiple models that support function calling, at least in my basic experiments:

- gpt-4o
- gpt-4o-mini
- o3-mini
- AI21-Jamba-1.5-Large
- AI21-Jamba-1.5-Mini
- Codestral-2501
- Cohere-command-r
- Ministral-3B
- Mistral-Large-2411
- Mistral-Nemo
- Mistral-small

You might find that some models work better than others, especially if you're using agents with multiple tools. With GitHub Models, it's very easy to experiment and see for yourself, by simply changing the model name and re-running the code, as in the sketch below.
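A minimal way to run that experiment is a loop over candidate model names with a single tool attached, checking which ones come back with a tool call. The weather tool here is a made-up probe, not part of the original examples:

import os

import openai

client = openai.OpenAI(
    api_key=os.environ["GITHUB_TOKEN"],
    base_url="https://models.inference.ai.azure.com")

# Hypothetical tool definition, just to probe function-calling support
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

for model_name in ["gpt-4o", "gpt-4o-mini", "Mistral-Nemo"]:
    response = client.chat.completions.create(
        model=model_name,
        messages=[{"role": "user", "content": "What's the weather in Paris?"}],
        tools=[weather_tool])
    # A model that supports function calling should return a tool call here
    print(model_name, "->", response.choices[0].message.tool_calls)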
Join the AI Agents Hackathon

We are currently running a free virtual hackathon from April 8th to 30th, to challenge developers to create agentic applications using Microsoft technologies. You could build an agent entirely using GitHub Models and submit it to the hackathon for a chance to win amazing prizes! You can also join our 30+ streams about building AI agents, including a stream all about prototyping with GitHub Models. Learn more and register at https://aka.ms/agentshack

Don't miss the 2024 Azure Developers JavaScript Day!

Do you want to discover the latest services and features in Azure designed specifically for JavaScript developers? Are you looking for cutting-edge cloud development techniques that can save you time and money, while providing your customers with the best experience possible?

Ollama on HTTPS for SQL Server
Here is a quick procedure to deploy an Ubuntu container with Ollama and expose its API over HTTPS. The goal is to allow a fast deployment, even for those unfamiliar with Docker or language models, making it easy to set up an offline platform for generating embeddings and using small language models.

This is particularly useful when testing SQL Server 2025 for fully on-premises use cases, since SQL Server only allows access to HTTPS endpoints. However, HTTP remains open for testing purposes. Please note that this example is CPU-based, as deploying with (integrated) GPU support involves additional, less straightforward steps. This example is provided solely to illustrate the concept, is not intended for production use, and comes without any guarantee of performance or security.

Prerequisites

To continue, you need to have Docker Desktop, WSL and SQL Server 2025 (currently Release Candidate 1):

- Docker Desktop
- Install WSL | Microsoft Learn
- SQL Server 2025 Preview | Microsoft Evaluation Center

Create a Dockerfile

First, create a working directory. In this example, C:\Docker\Ollama will be used. Simply create a file named Dockerfile (without an extension) and paste the following content into it.

FROM ubuntu:25.10

RUN apt update && apt install -y curl gnupg2 ca-certificates lsb-release apt-transport-https software-properties-common unzip nano openssl net-tools

RUN curl -fsSL https://ollama.com/install.sh | bash

RUN curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' | gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg
RUN curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' | tee /etc/apt/sources.list.d/caddy-stable.list
RUN apt update && apt install -y caddy

RUN mkdir -p /etc/caddy/certs

RUN cat > /etc/caddy/certs/san.cnf <<EOF
[req]
default_bits = 2048
prompt = no
default_md = sha256
req_extensions = req_ext
distinguished_name = dn
[dn]
CN = 127.0.0.1
[req_ext]
subjectAltName = @alt_names
[alt_names]
IP.1 = 127.0.0.1
DNS.1 = localhost
EOF

RUN openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout /etc/caddy/certs/localhost.key -out /etc/caddy/certs/localhost.crt -config /etc/caddy/certs/san.cnf -extensions req_ext

RUN echo "https://:443 {\n tls /etc/caddy/certs/localhost.crt /etc/caddy/certs/localhost.key\n reverse_proxy localhost:11434\n}" >> /etc/caddy/Caddyfile

RUN echo "#!/bin/bash" > /usr/local/bin/entrypoint.sh && \
    echo "set -e" >> /usr/local/bin/entrypoint.sh && \
    echo "OLLAMA_HOST=0.0.0.0 ollama serve >> /var/log/ollama.log 2>&1 &" >> /usr/local/bin/entrypoint.sh && \
    echo "caddy run --config /etc/caddy/Caddyfile --adapter caddyfile >> /var/log/caddy.log 2>&1 &" >> /usr/local/bin/entrypoint.sh && \
    echo "tail -f /var/log/ollama.log /var/log/caddy.log" >> /usr/local/bin/entrypoint.sh && \
    chmod 755 /usr/local/bin/entrypoint.sh

ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]

For your information, this file builds an image based on Ubuntu 25.10 and includes:

- Ollama, for running the models
- Caddy, for the reverse proxy
- Creation of a certificate for the HTTPS endpoint on localhost

Create the container

After opening a PowerShell terminal, execute the following commands:

cd C:\Docker\Ollama

# Build the image from the Dockerfile.
docker build -t ollama-https .

# Create a container based on the image ollama-https
docker run --name ollama-https -d -it -p 443:443 -p 11434:11434 ollama-https

# Copy the certificate created into the current Windows directory
docker cp ollama-https:/etc/caddy/certs/localhost.crt .
# Install the certificate in Trusted Root Certification Authorities
Import-Certificate -FilePath "localhost.crt" -CertStoreLocation "Cert:\LocalMachine\Root"

# Check HTTPS
(wget https://localhost).Content

# Check HTTP
(wget http://localhost:11434).Content

Ollama is now running

With a browser, connect to https://localhost.

Retrieve Models

No model is retrieved when the image is created, as this depends on each use case, and for some models the size can be substantial. Here’s a quick example for pulling an embedding model, Nomic, and a small language model, Phi3 (browse Ollama Search for other models):

docker exec ollama-https ollama pull nomic-embed-text
docker exec ollama-https ollama pull phi3:mini

A quick example with SQL Server 2025

A quick demonstration using the WideWorldImporters database (Wide World Importers sample database):

use [master]
GO
ALTER DATABASE WideWorldImporters SET COMPATIBILITY_LEVEL = 170 WITH ROLLBACK IMMEDIATE
GO
DBCC TRACEON(466, 474, 13981, -1)
GO

Note: with RC1, you can use the PREVIEW_FEATURES database-scoped configuration instead of these trace flags.

Declare an external model for embeddings:

use [WideWorldImporters]
GO
CREATE EXTERNAL MODEL NomicLocal
AUTHORIZATION dbo
WITH (
    LOCATION = 'https://localhost/api/embed',
    API_FORMAT = 'ollama',
    MODEL_TYPE = EMBEDDINGS,
    MODEL = 'nomic-embed-text'
)

To enable semantic search capabilities on StockItems, we will create a dedicated table to store embeddings (no chunking in this example) along with a vector index optimized for cosine similarity:

use [WideWorldImporters]
GO
CREATE TABLE [Warehouse].[StockItemsEmbedding](
    StockItemEmbeddingID int identity(1,1) PRIMARY KEY,
    StockItemId int,
    SearchDetails nvarchar(max),
    Embedding vector(768))
GO
INSERT INTO [Warehouse].[StockItemsEmbedding]
SELECT si.StockItemID,
       si.SearchDetails,
       AI_GENERATE_EMBEDDINGS(si.SearchDetails USE MODEL NomicLocal) /* Generate embeddings from declared external model */
FROM [Warehouse].[StockItems] si
GO
/* Check */
SELECT * FROM [Warehouse].[StockItemsEmbedding]
GO
CREATE VECTOR INDEX IXV_1 ON [Warehouse].[StockItemsEmbedding] (Embedding)
WITH (METRIC = 'cosine', TYPE = 'DiskANN')
GO

/* User Input */
DECLARE @UserInput varchar(max) = 'Which product is best suited for shipping small items?'
/* and Generate embeddings for user input */
DECLARE @UserInputV vector(768) = AI_GENERATE_EMBEDDINGS(@UserInput USE MODEL NomicLocal)
DECLARE @ModelInput nvarchar(max)
DECLARE @payload nvarchar(max)
DECLARE @response nvarchar(max)

/* Similarity Search on StockItems and Model Input creation */
SELECT @ModelInput = STRING_AGG('ProductDetails: ' + sie.SearchDetails + 'UnitPrice: ' + CAST(si.UnitPrice AS nvarchar(max)), ' \n\n')
FROM VECTOR_SEARCH(
    TABLE = [Warehouse].[StockItemsEmbedding] as sie,
    COLUMN = Embedding,
    SIMILAR_TO = @UserInputV,
    METRIC = 'cosine',
    TOP_N = 10
)
JOIN [Warehouse].[StockItems] si ON si.StockItemId = sie.StockItemId

/* Generate payload for response generation */
SELECT @payload = '{"model": "phi3:mini", "stream": false, "prompt":"You are acting as a customer advisor responsible for recommending the most suitable products based on customer needs, providing clear and personalized suggestions. Question : ' + @UserInput + '\n\nList of Items : ' + @ModelInput + '"}';

EXECUTE sp_invoke_external_rest_endpoint
    @url = 'https://localhost/api/generate',
    @method = 'POST',
    @payload = @payload,
    @timeout = 230,
    @response = @response OUTPUT;

PRINT JSON_VALUE(@response, '$.result.response')

LangChain

You can also have a try with LangChain. It is the same demo, with a small difference: there is no vector index created on the vector store table.
The table has been modified, but only for demonstration purposes. Reference: SQLServer | 🦜️🔗 LangChain

# Prerequisites:
# sudo apt-get update && sudo apt-get install -y unixodbc
# sudo apt-get install -y curl gnupg2
# curl https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add -
# curl https://packages.microsoft.com/config/debian/11/prod.list | sudo tee /etc/apt/sources.list.d/mssql-release.list
# sudo apt-get update
# sudo ACCEPT_EULA=Y apt-get install -y msodbcsql18
# pip3 install langchain langchain-sqlserver langchain-ollama langchain-community

import pyodbc
from langchain_sqlserver import SQLServer_VectorStore
from langchain_ollama import OllamaEmbeddings
from langchain_ollama import ChatOllama
from langchain.schema import Document
from langchain_community.vectorstores.utils import DistanceStrategy

# Prompt for testing
_USER_INPUT = 'Which product is best suited for shipping small items?'

############### Params ##########################################
print("\033[93mSetting up variables...\033[0m")
_SQL_DRIVER = "ODBC Driver 18 for SQL Server"
_SQL_SERVER = "localhost\\SQL2K25"
_SQL_DATABASE = "WideWorldImporters"
_SQL_USERNAME = "lc"
_SQL_PASSWORD = "lc"
_SQL_TRUST_CERT = "yes"
_SQL_VECTOR_STORE_TABLE = "StockItem_VectorStore"  # Table name for vector storage
_MODIFY_TABLE_TO_USE_SQL_VECTOR_INDEX = True  # Vector indexes are not considered currently in LangChain and the default structure does not match vector index requirements
_CONNECTION_STRING = f"Driver={{{_SQL_DRIVER}}};Server={_SQL_SERVER};Database={_SQL_DATABASE};UID={_SQL_USERNAME};PWD={_SQL_PASSWORD};TrustServerCertificate={_SQL_TRUST_CERT}"
_OLLAMA_API_URL = "https://localhost"
_OLLAMA_EMBEDDING_MODEL = "nomic-embed-text:latest"
_OLLAMA_EMBEDDING_VECTOR_SIZE = 768
_OLLAMA_SLM_MODEL = "phi3:mini"  # Model for SLM queries
###################################################################

# Define Ollama embeddings
embeddings = OllamaEmbeddings(
    model=_OLLAMA_EMBEDDING_MODEL,
    base_url=_OLLAMA_API_URL
)

conn = pyodbc.connect(_CONNECTION_STRING)
cursor = conn.cursor()

# Drop embeddings table if it exists
print("\033[93mDropping existing vector store table if it exists...\033[0m")
cursor.execute(f"DROP TABLE IF EXISTS Warehouse.{_SQL_VECTOR_STORE_TABLE};")

print("\033[93mConnecting to SQL Server and fetching data...\033[0m")
cursor.execute("SELECT StockItemId, SearchDetails, UnitPrice FROM Warehouse.StockItems;")
rows = cursor.fetchall()
print(f"\033[93mFound {len(rows)} records to process\033[0m")

# Create documents from the fetched data
documents = [
    Document(
        page_content=row.SearchDetails,
        metadata={
            "StockItemId": row.StockItemId,
            "UnitPrice": float(row.UnitPrice)  # Convert Decimal to float
        }
    )
    for row in rows
]
conn.commit()

# Create vector store
print("\033[93mCreating vector store...\033[0m")
vector_store = SQLServer_VectorStore(
    connection_string=_CONNECTION_STRING,
    distance_strategy=DistanceStrategy.COSINE,  # If not provided, defaults to COSINE
    embedding_function=embeddings,
    embedding_length=_OLLAMA_EMBEDDING_VECTOR_SIZE,
    db_schema="Warehouse",
    table_name=_SQL_VECTOR_STORE_TABLE
)

print("\033[93mAdding to vector store...\033[0m")
try:
    vector_store.add_documents(documents)
    print("\033[93mSuccessfully added to vector store!\033[0m")
except Exception as e:
    print(f"\033[91mError adding documents: {e}\033[0m")

# Vector index not yet integrated in SQL Server VectorStore (drop the auto-created nonclustered PK and generate an int clustered PK)
if (_MODIFY_TABLE_TO_USE_SQL_VECTOR_INDEX):
    print("\033[93mModifying structure to create vector index...\033[0m")
    cursor.execute("DECLARE @AutoCreatedPK sysname, @SQL nvarchar(max);"
        f"SELECT @AutoCreatedPK = name FROM sys.key_constraints WHERE type = 'PK' AND parent_object_id = object_id('Warehouse.{_SQL_VECTOR_STORE_TABLE}');"
        f"SELECT @SQL = 'ALTER TABLE Warehouse.{_SQL_VECTOR_STORE_TABLE} DROP CONSTRAINT ' + @AutoCreatedPK + ';'"
        "EXEC sp_executesql @SQL;"
        f"ALTER TABLE Warehouse.{_SQL_VECTOR_STORE_TABLE} ADD Alt_Id int identity(1,1);"
        f"ALTER TABLE Warehouse.{_SQL_VECTOR_STORE_TABLE} ADD CONSTRAINT PK_{_SQL_VECTOR_STORE_TABLE} PRIMARY KEY (Alt_Id);")
    conn.commit()
    print("\033[93mCreating vector index...\033[0m")
    cursor.execute(f"CREATE VECTOR INDEX IV_{_SQL_VECTOR_STORE_TABLE} ON [Warehouse].[{_SQL_VECTOR_STORE_TABLE}] (embeddings) WITH (METRIC = 'cosine', TYPE = 'DiskANN');")
    conn.commit()

# Generate prompt then answer
print(f"\033[92mUser Input: {_USER_INPUT}\033[0m")
context = [
    {
        "Item": doc.page_content,
        "UnitPrice": doc.metadata.get("UnitPrice", None)
    }
    for doc in vector_store.similarity_search(_USER_INPUT, k=3)
]

llm = ChatOllama(model=_OLLAMA_SLM_MODEL, base_url=_OLLAMA_API_URL)
prompt = (
    f"You are acting as a customer advisor responsible for recommending the most suitable products based on customer needs, providing clear and personalized suggestions"
    f"Context: {context}\n\nQuestion: {_USER_INPUT}\n\n")
response = llm.invoke(prompt)
print(f"\033[36m{response.content}\033[0m")

Note: if using a devcontainer with VS Code, add "runArgs": [ "--network=host" ] to devcontainer.json to allow connections to "localhost", then import and install the previously created certificate:

docker cp C:\Docker\Ollama\localhost.crt <devcontainer name>:/usr/local/share/ca-certificates/localhost.crt
docker exec <devcontainer name> "update-ca-certificates"

Disclaimer

The sample scripts are not supported under any Microsoft standard support program or service. The sample scripts are provided AS IS without warranty of any kind. Microsoft further disclaims all implied warranties including, without limitation, any implied warranties of merchantability or of fitness for a particular purpose. The entire risk arising out of the use or performance of the sample scripts and documentation remains with you. In no event shall Microsoft, its authors, or anyone else involved in the creation, production, or delivery of the scripts be liable for any damages whatsoever (including, without limitation, damages for loss of business profits, business interruption, loss of business information, or other pecuniary loss) arising out of the use of or inability to use the sample scripts or documentation, even if Microsoft has been advised of the possibility of such damages.
