In this episode, we'll see how to calculate KMeans clusters for vector data, which are then used to perform an approximate similarity search. We'll offload the resource-intensive KMeans calculation, done with scikit-learn, to a container, and then do the cell probing in pure T-SQL.
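To make the two stages concrete, here is a minimal sketch of the idea in plain Python. It is not the code from the episode: a small hand-written KMeans loop stands in for scikit-learn, an in-memory dictionary stands in for the SQL table that maps each vector to its cell, and the function names (`kmeans`, `build_index`, `search`), the `n_probe` parameter, the use of squared Euclidean distance, and the sample data are all assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centroids, v, n=1):
    """Indices of the n centroids closest to v, nearest first."""
    order = sorted(range(len(centroids)), key=lambda i: sq_dist(centroids[i], v))
    return order[:n]


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; a stand-in for a library KMeans."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        cells = [[] for _ in range(k)]
        for v in vectors:
            cells[nearest(centroids, v)[0]].append(v)
        for i, members in enumerate(cells):
            if members:  # keep the old centroid if a cell ends up empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def build_index(vectors, centroids):
    """Inverted file index: centroid id -> ids of the vectors in its cell."""
    index = {i: [] for i in range(len(centroids))}
    for vid, v in enumerate(vectors):
        index[nearest(centroids, v)[0]].append(vid)
    return index


def search(query, vectors, centroids, index, n_probe=2, top=3):
    """Cell probing: scan only the n_probe cells nearest to the query."""
    candidates = [vid for c in nearest(centroids, query, n_probe) for vid in index[c]]
    candidates.sort(key=lambda vid: sq_dist(vectors[vid], query))
    return candidates[:top]


# Hypothetical sample data: four 2-D blobs of 50 points each.
rng = random.Random(42)
vectors = [[rng.gauss(cx, 0.3), rng.gauss(cy, 0.3)]
           for cx, cy in [(0, 0), (5, 5), (0, 5), (5, 0)] for _ in range(50)]

centroids = kmeans(vectors, k=4)          # the expensive, offloaded step
index = build_index(vectors, centroids)   # vector -> cell assignment
result = search([4.8, 5.1], vectors, centroids, index)
print(result)
```

The search is approximate because only the vectors in the probed cells are compared with the query. Raising `n_probe` scans more cells, which improves recall at the cost of speed; probing every cell gives the same result as an exhaustive scan.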
Resources:
Intelligent applications with Azure SQL Database: https://aka.ms/sqlai
Azure SQL Devs’ Corner: https://devblogs.microsoft.com/azure-sql/
Vector Search Optimization via KMeans, Voronoi Cells and Inverted File Index (aka “Cell-Probing”): https://devblogs.microsoft.com/azure-sql/vector-search-optimization-via-kmeans-voronoi-cells-and-inverted-file-index-aka-cell-probing/
View/share our latest episodes on Microsoft Learn and YouTube!
Published Apr 25, 2024
Version 1.0
MarisaMathews
Microsoft