Here is a quick procedure to deploy an Ubuntu container with Ollama and expose its API over HTTPS.
The goal is to enable fast deployment, even for those unfamiliar with Docker or language models, making it easy to set up an offline platform for generating embeddings and running Small Language Models.
This is particularly useful when testing SQL Server 2025 for fully on-premises use cases, since SQL Server only allows calls to HTTPS endpoints. The HTTP endpoint nevertheless remains open for testing purposes.
Please note that this example is CPU-based, as deploying with (integrated) GPU support involves additional, less straightforward steps. This example is provided solely to illustrate the concept, is not intended for production use, and comes without any guarantee of performance or security.
Prerequisites
To continue, you need:
- Docker Desktop and WSL
- SQL Server 2025 (currently Release Candidate 1): SQL Server 2025 Preview | Microsoft Evaluation Center
Create a Dockerfile
First, create a working directory. In this example, C:\Docker\Ollama will be used.
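If you prefer the command line, the directory can be created from PowerShell:
mkdir C:\Docker\Ollama
cd C:\Docker\Ollama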
Simply create a file named Dockerfile (without an extension) and paste the following content into it.
# syntax=docker/dockerfile:1
FROM ubuntu:25.10
RUN apt update && apt install -y curl gnupg2 ca-certificates lsb-release apt-transport-https software-properties-common unzip nano openssl net-tools
RUN curl -fsSL https://ollama.com/install.sh | bash
RUN curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' | gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg
RUN curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' | tee /etc/apt/sources.list.d/caddy-stable.list
RUN apt update && apt install -y caddy
RUN mkdir -p /etc/caddy/certs
RUN cat > /etc/caddy/certs/san.cnf <<EOF
[req]
default_bits = 2048
prompt = no
default_md = sha256
req_extensions = req_ext
distinguished_name = dn
[dn]
CN = 127.0.0.1
[req_ext]
subjectAltName = @alt_names
[alt_names]
IP.1 = 127.0.0.1
DNS.1 = localhost
EOF
RUN openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout /etc/caddy/certs/localhost.key -out /etc/caddy/certs/localhost.crt -config /etc/caddy/certs/san.cnf -extensions req_ext
RUN echo "https://:443 {\n tls /etc/caddy/certs/localhost.crt /etc/caddy/certs/localhost.key\n reverse_proxy localhost:11434\n}" >> /etc/caddy/Caddyfile
RUN echo "#!/bin/bash" > /usr/local/bin/entrypoint.sh && \
echo "set -e" >> /usr/local/bin/entrypoint.sh && \
echo "OLLAMA_HOST=0.0.0.0 ollama serve >> /var/log/ollama.log 2>&1 &" >> /usr/local/bin/entrypoint.sh && \
echo "caddy run --config /etc/caddy/Caddyfile --adapter caddyfile >> /var/log/caddy.log 2>&1 &" >> /usr/local/bin/entrypoint.sh && \
echo "tail -f /var/log/ollama.log /var/log/caddy.log" >> /usr/local/bin/entrypoint.sh && \
chmod 755 /usr/local/bin/entrypoint.sh
ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]
For your information, this file builds an image based on Ubuntu 25.10 and includes:
- Ollama, for running the models
- Caddy, for the reverse proxy
- Creation of a self-signed certificate for the HTTPS endpoint on localhost
- An entrypoint script that starts both Ollama and Caddy and tails their logs
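For reference, the Caddyfile generated above contains a single HTTPS site on port 443, using the self-signed certificate and reverse-proxying to Ollama on port 11434:
https://:443 {
    tls /etc/caddy/certs/localhost.crt /etc/caddy/certs/localhost.key
    reverse_proxy localhost:11434
}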
Create the container
After opening a PowerShell terminal, execute the following commands:
cd C:\Docker\Ollama
# Build the image from the Dockerfile
docker build -t ollama-https .
# Create a container based on the ollama-https image
docker run --name ollama-https -d -it -p 443:443 -p 11434:11434 ollama-https
# Copy the generated certificate into the current Windows directory
docker cp ollama-https:/etc/caddy/certs/localhost.crt .
# Install the certificate in Trusted Root Certification Authorities (requires an elevated terminal)
Import-Certificate -FilePath "localhost.crt" -CertStoreLocation "Cert:\LocalMachine\Root"
# Check HTTPS
(Invoke-WebRequest https://localhost).Content
# Check HTTP
(Invoke-WebRequest http://localhost:11434).Content
Ollama is now running.
With a browser, connect to https://localhost
Retrieve Models
No model is pulled when the image is created, as the choice depends on each use case and some models can be substantial in size.
Here's a quick example of pulling an embedding model, Nomic, and a small language model, Phi3 (see Ollama Search for other options).
docker exec ollama-https ollama pull nomic-embed-text
docker exec ollama-https ollama pull phi3:mini
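You can then verify that the models are available inside the container:
docker exec ollama-https ollama list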
A quick example with SQL Server 2025
A quick demonstration using the WideWorldImporters database (Wide World Importers sample database).
use [master]
GO
ALTER DATABASE WideWorldImporters SET COMPATIBILITY_LEVEL = 170 WITH ROLLBACK IMMEDIATE
GO
DBCC TRACEON(466, 474, 13981, -1)
GO
Note: With RC1, you can use the PREVIEW_FEATURES database-scoped configuration
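For example, from the user database:
use [WideWorldImporters]
GO
ALTER DATABASE SCOPED CONFIGURATION SET PREVIEW_FEATURES = ON
GO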
- T-SQL
Declare an external model for embeddings.
use [WideWorldImporters]
GO
CREATE EXTERNAL MODEL NomicLocal
AUTHORIZATION dbo
WITH (
LOCATION = 'https://localhost/api/embed',
API_FORMAT = 'ollama',
MODEL_TYPE = EMBEDDINGS,
MODEL = 'nomic-embed-text'
)
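Optionally, you can run a quick smoke test of the external model before wiring it into tables (the input text here is arbitrary):
SELECT AI_GENERATE_EMBEDDINGS(N'shipping small items' USE MODEL NomicLocal)
GO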
To enable semantic search capabilities on StockItems, we will create a dedicated table to store the embeddings (no chunking in this example), along with a vector index optimized for cosine similarity.
use [WideWorldImporters]
GO
CREATE TABLE [Warehouse].[StockItemsEmbedding](
StockItemEmbeddingID int identity(1,1) PRIMARY KEY,
StockItemId int,
SearchDetails nvarchar(max),
Embedding vector(768)
)
GO
INSERT INTO [Warehouse].[StockItemsEmbedding]
SELECT
si.StockItemID, si.SearchDetails,
AI_GENERATE_EMBEDDINGS(si.SearchDetails USE MODEL NomicLocal) /* Generate embeddings from declared external model */
FROM
[Warehouse].[StockItems] si
GO
/* Check */
SELECT * FROM [Warehouse].[StockItemsEmbedding]
GO
CREATE VECTOR INDEX IXV_1 ON [Warehouse].[StockItemsEmbedding] (Embedding) WITH (METRIC = 'cosine', TYPE = 'DiskANN')
GO
/* User Input */
DECLARE @UserInput varchar(max) = 'Which product is best suited for shipping small items?'
/* and Generate embeddings for user input */
DECLARE @UserInputV vector(768) = AI_GENERATE_EMBEDDINGS(@UserInput USE MODEL NomicLocal)
DECLARE @ModelInput nvarchar(max)
DECLARE @Payload nvarchar(max)
DECLARE @response nvarchar(max)
/* Similarity Search on StockItems and Model Input creation*/
SELECT
@ModelInput = STRING_AGG('ProductDetails: ' + sie.SearchDetails + ' UnitPrice: ' + CAST(si.UnitPrice AS nvarchar(max)), ' \n\n')
FROM
VECTOR_SEARCH(
TABLE = [Warehouse].[StockItemsEmbedding] as sie,
COLUMN = Embedding,
SIMILAR_TO = @UserInputV,
METRIC = 'cosine',
TOP_N = 10
)
JOIN [Warehouse].[StockItems] si ON si.StockItemId = sie.StockItemId
/* Generate payload for response generation */
SELECT @Payload =
'{"model": "phi3:mini", "stream": false, "prompt":"You are acting as a customer advisor responsible for recommending the most suitable products based on customer needs, providing clear and personalized suggestions. Question : '
+ @UserInput
+ '\n\nList of Items : '
+ @ModelInput + '"}';
EXECUTE sp_invoke_external_rest_endpoint
@url = 'https://localhost/api/generate',
@method = 'POST',
@payload = @Payload,
@timeout = 230,
@response = @response OUTPUT;
PRINT JSON_VALUE(@response, '$.result.response')
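If needed for debugging, the raw JSON document returned by the endpoint can also be inspected, still within the same batch:
/* Optional: inspect the full endpoint response */
SELECT @response AS RawResponse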
- LangChain
You can also give it a try with LangChain.
It is the same demo with one small difference: LangChain does not create a vector index on the vector store table, so the table is modified afterwards, purely for demonstration purposes.
Reference: SQLServer | 🦜️🔗 LangChain
# PREREQ
# sudo apt-get update && sudo apt-get install -y unixodbc
# sudo apt-get install -y curl gnupg2
# curl https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add -
# curl https://packages.microsoft.com/config/debian/11/prod.list | sudo tee /etc/apt/sources.list.d/mssql-release.list
# sudo apt-get update
# sudo ACCEPT_EULA=Y apt-get install -y msodbcsql18
# pip3 install langchain langchain-sqlserver langchain-ollama langchain-community
import pyodbc
from langchain_sqlserver import SQLServer_VectorStore
from langchain_ollama import OllamaEmbeddings
from langchain_ollama import ChatOllama
from langchain.schema import Document
from langchain_community.vectorstores.utils import DistanceStrategy
#Prompt for testing
_USER_INPUT = 'Which product is best suited for shipping small items?'
############### Params ##########################################
print("\033[93mSetting up variables...\033[0m")
_SQL_DRIVER = "ODBC Driver 18 for SQL Server"
_SQL_SERVER = "localhost\\SQL2K25"
_SQL_DATABASE = "WideWorldImporters"
_SQL_USERNAME = "lc"
_SQL_PASSWORD = "lc"
_SQL_TRUST_CERT = "yes"
_SQL_VECTOR_STORE_TABLE = "StockItem_VectorStore" # Table name for vector storage
_MODIFY_TABLE_TO_USE_SQL_VECTOR_INDEX = True  # Vector indexes are not currently handled by LangChain, and the auto-created table structure does not meet vector index requirements
_CONNECTION_STRING = f"Driver={{{_SQL_DRIVER}}};Server={_SQL_SERVER};Database={_SQL_DATABASE};UID={_SQL_USERNAME};PWD={_SQL_PASSWORD};TrustServerCertificate={_SQL_TRUST_CERT}"
_OLLAMA_API_URL = "https://localhost"
_OLLAMA_EMBEDDING_MODEL = "nomic-embed-text:latest"
_OLLAMA_EMBEDDING_VECTOR_SIZE = 768
_OLLAMA_SLM_MODEL = "phi3:mini" # Model for SLM queries
###################################################################
#Define Ollama embeddings
embeddings = OllamaEmbeddings(
    model=_OLLAMA_EMBEDDING_MODEL,
    base_url=_OLLAMA_API_URL
)
conn = pyodbc.connect(_CONNECTION_STRING)
cursor = conn.cursor()
#Drop embeddings table if it exists
print("\033[93mDropping existing vector store table if it exists...\033[0m")
cursor.execute(f"DROP TABLE IF EXISTS Warehouse.{_SQL_VECTOR_STORE_TABLE};")
print("\033[93mConnecting to SQL Server and fetching data...\033[0m")
cursor.execute("SELECT StockItemId, SearchDetails, UnitPrice FROM Warehouse.StockItems;")
rows = cursor.fetchall()
print(f"\033[93mFound {len(rows)} records to process\033[0m")
# Create documents from the fetched data
documents = [
    Document(
        page_content=row.SearchDetails,
        metadata={
            "StockItemId": row.StockItemId,
            "UnitPrice": float(row.UnitPrice)  # Convert Decimal to float
        }
    ) for row in rows
]
conn.commit()
#Creating vector store
print("\033[93mCreating vector store...\033[0m")
vector_store = SQLServer_VectorStore(
    connection_string=_CONNECTION_STRING,
    distance_strategy=DistanceStrategy.COSINE,  # If not provided, defaults to COSINE
    embedding_function=embeddings,
    embedding_length=_OLLAMA_EMBEDDING_VECTOR_SIZE,
    db_schema="Warehouse",
    table_name=_SQL_VECTOR_STORE_TABLE
)
print("\033[93mAdding to vector store...\033[0m")
try:
    vector_store.add_documents(documents)
    print("\033[93mSuccessfully added to vector store!\033[0m")
except Exception as e:
    print(f"\033[91mError adding documents: {e}\033[0m")
# Vector index not yet integrated in SQLServer_VectorStore: drop the auto-created
# nonclustered PK and add an int identity clustered PK instead
if _MODIFY_TABLE_TO_USE_SQL_VECTOR_INDEX:
    print("\033[93mModifying structure to create vector index...\033[0m")
    cursor.execute("DECLARE @AutoCreatedPK sysname, @SQL nvarchar(max);"
                   f"SELECT @AutoCreatedPK = name FROM sys.key_constraints WHERE type = 'PK' AND parent_object_id = object_id('Warehouse.{_SQL_VECTOR_STORE_TABLE}');"
                   f"SELECT @SQL = 'ALTER TABLE Warehouse.{_SQL_VECTOR_STORE_TABLE} DROP CONSTRAINT ' + @AutoCreatedPK + ';';"
                   "EXEC sp_executesql @SQL;"
                   f"ALTER TABLE Warehouse.{_SQL_VECTOR_STORE_TABLE} ADD Alt_Id int identity(1,1);"
                   f"ALTER TABLE Warehouse.{_SQL_VECTOR_STORE_TABLE} ADD CONSTRAINT PK_{_SQL_VECTOR_STORE_TABLE} PRIMARY KEY (Alt_Id);")
    conn.commit()
    print("\033[93mCreating vector index...\033[0m")
    cursor.execute(f"CREATE VECTOR INDEX IV_{_SQL_VECTOR_STORE_TABLE} ON [Warehouse].[{_SQL_VECTOR_STORE_TABLE}] (embeddings) WITH (METRIC = 'cosine', TYPE = 'DiskANN');")
    conn.commit()
#Generate prompt then answer
print(f"\033[92mUser Input: {_USER_INPUT}\033[0m")
context = [
    {
        "Item": doc.page_content,
        "UnitPrice": doc.metadata.get("UnitPrice", None)
    } for doc in vector_store.similarity_search(_USER_INPUT, k=3)
]
llm = ChatOllama(model=_OLLAMA_SLM_MODEL, base_url=_OLLAMA_API_URL)
prompt = (
    "You are acting as a customer advisor responsible for recommending the most suitable products "
    "based on customer needs, providing clear and personalized suggestions.\n\n"
    f"Context: {context}\n\nQuestion: {_USER_INPUT}\n\n")
response = llm.invoke(prompt)
print(f"\033[36m{response.content}\033[0m")
Note: If using a devcontainer with VS Code
- Add "runArgs": [ "--network=host" ] to devcontainer.json to allow connections to "localhost" (see the sketch below).
- Import and install the previously created certificate:
docker cp C:\Docker\Ollama\localhost.crt <devcontainer name>:/usr/local/share/ca-certificates/localhost.crt
docker exec <devcontainer name> update-ca-certificates
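For illustration, a minimal devcontainer.json with this setting could look like the following (the name and image are placeholders, adjust to your project):
{
    "name": "ollama-sql-demo",
    "image": "mcr.microsoft.com/devcontainers/python:3.12",
    "runArgs": [ "--network=host" ]
}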
Disclaimer
The sample scripts are not supported under any Microsoft standard support program or service. The sample scripts are provided AS IS without warranty of any kind. Microsoft further disclaims all implied warranties including, without limitation, any implied warranties of merchantability or of fitness for a particular purpose. The entire risk arising out of the use or performance of the sample scripts and documentation remains with you. In no event shall Microsoft, its authors, or anyone else involved in the creation, production, or delivery of the scripts be liable for any damages whatsoever (including, without limitation, damages for loss of business profits, business interruption, loss of business information, or other pecuniary loss) arising out of the use of or inability to use the sample scripts or documentation, even if Microsoft has been advised of the possibility of such damages.