SQL Server Blog

mssql-python 1.5: Apache Arrow, sql_variant, and Native UUIDs

DavidLevy, Microsoft
Apr 10, 2026

We're excited to announce the release of mssql-python 1.5.0, the latest version of Microsoft's official Python driver for SQL Server, Azure SQL Database, and SQL databases in Fabric. This release delivers Apache Arrow fetch support for high-performance data workflows, first-class sql_variant and native UUID support, and a collection of important bug fixes.


pip install --upgrade mssql-python

Apache Arrow fetch support

If you're working with pandas, Polars, DuckDB, or any Arrow-native data framework, this release changes how you get data out of SQL Server. The new Arrow fetch API returns query results as native Apache Arrow structures, using the Arrow C Data Interface for zero-copy handoff directly from the C++ layer to Python.

This is a significant performance improvement over the traditional fetchall() path, which converts every value through Python objects. With Arrow, columnar data stays in columnar format end-to-end, and your data framework can consume it without any intermediate copies.

Three methods for different workflows

cursor.arrow() fetches the entire result set as a PyArrow Table:

import mssql_python

conn = mssql_python.connect(
    "SERVER=myserver.database.windows.net;"
    "DATABASE=AdventureWorks;"
    "UID=myuser;PWD=mypassword;"
    "Encrypt=yes;"
)
cursor = conn.cursor()
cursor.execute("SELECT * FROM Sales.SalesOrderDetail")

# Get the full result as a PyArrow Table
table = cursor.arrow()

# Convert directly to pandas - zero-copy where possible
df = table.to_pandas()

# Or to Polars - also zero-copy
import polars as pl
df = pl.from_arrow(table)

cursor.arrow_batch() fetches a single RecordBatch of a specified size, useful when you want fine-grained control over memory:

cursor.execute("SELECT * FROM Production.TransactionHistory")

# Process in controlled chunks
while True:
    batch = cursor.arrow_batch(batch_size=10000)
    if batch.num_rows == 0:
        break
    # Process each batch individually
    process(batch.to_pandas())

cursor.arrow_reader() returns a streaming RecordBatchReader, which integrates directly with frameworks that accept readers:

cursor.execute("SELECT * FROM Production.TransactionHistory")
reader = cursor.arrow_reader(batch_size=8192)

# Stream directly to Parquet - no need to load everything into memory
import pyarrow.parquet as pq

with pq.ParquetWriter("output.parquet", reader.schema) as writer:
    for batch in reader:
        writer.write_batch(batch)

# Or iterate batches manually
for batch in reader:
    process(batch)

How it works under the hood

The Arrow integration is built directly into the C++ pybind11 layer. When you call any Arrow fetch method, the driver:

  1. Allocates columnar Arrow buffers based on the result set schema
  2. Fetches rows from SQL Server in batches using bound column buffers
  3. Converts and packs values directly into the Arrow columnar format
  4. Exports the result via the Arrow C Data Interface as PyCapsule objects
  5. PyArrow imports the capsules with zero copy

Every SQL Server type maps to the appropriate Arrow type: INT to int32, BIGINT to int64, DECIMAL(p,s) to decimal128(p,s), DATE to date32, TIME to time64[ns], DATETIME2 to timestamp[us], UNIQUEIDENTIFIER to large_string, VARBINARY to large_binary, and so on.
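As a quick reference, the mapping described above can be expressed as a lookup table. This is illustrative only; the real mapping lives in the driver's C++ layer, and the names here are Arrow type strings, not PyArrow objects:

```python
# Illustrative SQL Server -> Arrow type mapping, per the list above.
SQL_TO_ARROW = {
    "INT": "int32",
    "BIGINT": "int64",
    "DECIMAL(p,s)": "decimal128(p,s)",
    "DATE": "date32",
    "TIME": "time64[ns]",
    "DATETIME2": "timestamp[us]",
    "UNIQUEIDENTIFIER": "large_string",
    "VARBINARY": "large_binary",
}

print(SQL_TO_ARROW["DATETIME2"])  # timestamp[us]
```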

LOB columns (VARCHAR(MAX), NVARCHAR(MAX), VARBINARY(MAX), XML, and UDTs) are handled transparently: the driver falls back to row-by-row GetData fetching while still assembling the result into Arrow format.

Community contribution

The Arrow fetch support was contributed by @ffelixg. This is a substantial contribution spanning the C++ pybind layer, the Python cursor API, and comprehensive tests. Thank you, Felix Graßl, for an outstanding contribution that brings high-performance data workflows to mssql-python.

sql_variant type support

SQL Server's sql_variant type stores values of various data types in a single column. It's commonly used in metadata tables, configuration stores, and EAV (Entity-Attribute-Value) patterns. Version 1.5 adds full support for reading sql_variant values with automatic type resolution.

The driver reads the inner type tag from the sql_variant wire format and returns the appropriate Python type:

cursor.execute("""
    CREATE TABLE #config (
        [key] NVARCHAR(50) PRIMARY KEY,  -- KEY is a reserved word, so bracket it
        value SQL_VARIANT
    )
""")

cursor.execute("INSERT INTO #config VALUES ('max_retries', CAST(5 AS INT))")
cursor.execute("INSERT INTO #config VALUES ('timeout', CAST(30.5 AS FLOAT))")
cursor.execute("INSERT INTO #config VALUES ('app_name', CAST('MyApp' AS NVARCHAR(50)))")
cursor.execute("INSERT INTO #config VALUES ('start_date', CAST('2026-01-15' AS DATE))")

cursor.execute("SELECT value FROM #config ORDER BY [key]")
rows = cursor.fetchall()

from datetime import date

# Each value comes back as the correct Python type (rows are ordered by [key])
assert rows[0][0] == "MyApp"            # str   (app_name)
assert rows[1][0] == 5                  # int   (max_retries)
assert rows[2][0] == date(2026, 1, 15)  # datetime.date (start_date)
assert rows[3][0] == 30.5               # float (timeout)

All 23+ base types are supported, including int, float, Decimal, bool, str, date, time, datetime, bytes, uuid.UUID, and None.
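Conceptually, this automatic type resolution is a dispatch on the variant's inner type tag. Here is a minimal stdlib-only sketch of the idea; the tag names and converters are illustrative, not the driver's internals:

```python
import datetime
import decimal
import uuid

# Hypothetical tag -> converter table, illustrating how a sql_variant
# payload could be resolved to a native Python value by its inner type tag.
CONVERTERS = {
    "int": int,
    "float": float,
    "decimal": decimal.Decimal,
    "nvarchar": str,
    "date": datetime.date.fromisoformat,
    "uniqueidentifier": uuid.UUID,
}

def resolve_variant(tag: str, payload):
    """Convert a raw payload to the Python type implied by its type tag."""
    if payload is None:
        return None
    return CONVERTERS[tag](payload)

print(resolve_variant("int", "5"))            # 5
print(resolve_variant("date", "2026-01-15"))  # 2026-01-15
```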

Native UUID support

Previously, UNIQUEIDENTIFIER columns were returned as strings, requiring manual conversion to uuid.UUID. Version 1.5 changes the default: UUID columns now return native uuid.UUID objects.

import uuid

cursor.execute("SELECT NEWID() AS id")
row = cursor.fetchone()

# Native uuid.UUID object - no manual conversion needed
assert isinstance(row[0], uuid.UUID)
print(row[0])  # e.g., 550e8400-e29b-41d4-a716-446655440000

UUID values also bind natively as input parameters:

my_id = uuid.uuid4()
cursor.execute("INSERT INTO Users (id, name) VALUES (?, ?)", my_id, "Alice")

Migration compatibility

If you're migrating from pyodbc and your code expects string UUIDs, you can opt out at the module or connection level:

# Module level - affects all connections
mssql_python.native_uuid = False

# Connection level - affects all cursors on this connection
conn = mssql_python.connect(conn_str, native_uuid=False)

When native_uuid=False, UUID columns return strings as before.
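With the opt-out in place, you are back to converting by hand, which the standard library handles directly. A small sketch, where the literal string stands in for a fetched column value:

```python
import uuid

# With native_uuid=False, UNIQUEIDENTIFIER columns come back as strings,
# so the pre-1.5 / pyodbc-style manual conversion still applies.
raw = "550E8400-E29B-41D4-A716-446655440000"  # stand-in for row[0]
converted = uuid.UUID(raw)

# uuid.UUID parses case-insensitively and normalizes to lowercase.
print(converted)  # 550e8400-e29b-41d4-a716-446655440000
```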

Row class export

The Row class is now publicly exported from the top-level mssql_python module. This makes it easy to use in type annotations and isinstance checks:

from mssql_python import Row

cursor.execute("SELECT 1 AS id, 'Alice' AS name")
row = cursor.fetchone()

assert isinstance(row, Row)
print(row[0])       # 1 (index access)
print(row.name)     # "Alice" (attribute access)
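For intuition, a Row behaves roughly like this minimal stand-in, supporting both index and attribute access. This is illustrative only; the real class lives in mssql_python:

```python
class RowSketch:
    """Minimal stand-in showing the access patterns a Row supports."""

    def __init__(self, columns, values):
        self._values = tuple(values)
        self._index = {name: i for i, name in enumerate(columns)}

    def __getitem__(self, i):     # index access: row[0]
        return self._values[i]

    def __getattr__(self, name):  # attribute access: row.name
        try:
            return self._values[self._index[name]]
        except KeyError:
            raise AttributeError(name) from None

row = RowSketch(["id", "name"], [1, "Alice"])
print(row[0], row.name)  # 1 Alice
```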

Bug fixes

Qmark false positive fix

The parameter-style detection logic previously misidentified ? characters inside SQL comments, string literals, bracketed identifiers, and double-quoted identifiers as qmark parameter placeholders. A new context-aware scanner correctly skips these SQL quoting contexts:

# These no longer trigger false qmark detection:
cursor.execute("SELECT [is this ok?] FROM t")
cursor.execute("SELECT 'what?' AS col")
cursor.execute("SELECT /* why? */ 1")
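For intuition, here is a minimal stdlib-only sketch of such a context-aware scan (not the driver's actual implementation): it skips string literals, bracketed and double-quoted identifiers, and both comment styles before counting ? placeholders.

```python
def count_qmark_placeholders(sql: str) -> int:
    """Count ? placeholders, ignoring ones inside SQL quoting contexts."""
    count = 0
    i, n = 0, len(sql)
    while i < n:
        if sql[i] == "'":                  # string literal ('' escapes a quote)
            i += 1
            while i < n:
                if sql[i] == "'":
                    if i + 1 < n and sql[i + 1] == "'":
                        i += 2
                        continue
                    break
                i += 1
        elif sql[i] == "[":                # bracketed identifier
            i = sql.find("]", i + 1)
            if i == -1:
                break
        elif sql[i] == '"':                # double-quoted identifier
            i = sql.find('"', i + 1)
            if i == -1:
                break
        elif sql.startswith("--", i):      # line comment
            i = sql.find("\n", i)
            if i == -1:
                break
        elif sql.startswith("/*", i):      # block comment
            i = sql.find("*/", i + 2)
            if i == -1:
                break
            i += 1                         # land on the trailing '/'
        elif sql[i] == "?":
            count += 1
        i += 1
    return count

print(count_qmark_placeholders("SELECT [is this ok?] FROM t"))  # 0
print(count_qmark_placeholders("SELECT 'what?' AS col"))        # 0
print(count_qmark_placeholders("SELECT /* why? */ ?"))          # 1
```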

NULL VARBINARY parameter fix

Fixed NULL parameter type mapping for VARBINARY columns, which previously could fail when passing None as a binary parameter.

Bulkcopy auth fix

Fixed stale authentication fields being retained in the bulk copy context after token acquisition. This could cause Entra ID-authenticated bulk copy operations to fail on subsequent calls.

Explicit module exports

Added explicit __all__ exports from the main library module to prevent import resolution issues in tools like mypy and IDE autocompletion.

Credential cache fix

Fixed the credential instance cache to correctly reuse and invalidate cached credential objects, preventing unnecessary re-authentication.

datetime.time microseconds fix

Fixed datetime.time values incorrectly having their microseconds component set to zero when fetched from TIME columns.

The road to 1.5

Release  Date           Highlights
1.0.0    November 2025  GA release - DDBC architecture, Entra ID auth, connection pooling, DB API 2.0 compliance
1.1.0    December 2025  Parameter dictionaries, Connection.closed property, Copilot prompts
1.2.0    January 2026   Param-as-dict, non-ASCII path handling, fetchmany fixes
1.3.0    January 2026   Initial BCP implementation (internal), SQLFreeHandle segfault fix
1.4.0    February 2026  BCP public API, spatial types, Rust core upgrade, encoding & stability fixes
1.5.0    April 2026     Apache Arrow fetch, sql_variant, native UUIDs, qmark & auth fixes

Get started today

pip install --upgrade mssql-python

We'd love your feedback. Try the new Arrow fetch API with your data workflows, let us know how it performs, and file issues for anything you run into. This driver is built for the Python data community, and your input directly shapes what comes next.
