Forum Discussion

SerratedSharp
Copper Contributor
Mar 21, 2024

Vector indexes for image similarity search in Azure AI Search?

I was going through Azure AI Studio and trying to create an AI Search index over images, but it only accepts document file types. Images such as *.png files are not permitted.

I can generate vectorizations of images based on this: https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/how-to/image-retrieval?tabs=csharp#call-the-vectorize-image-api

If I use that API to retrieve image vectors, is it possible to populate an AI Search vector database with these results? I.e., bypass its own vector embedding and use my vectors?

I'm still learning how to translate my rough knowledge of AI into Azure services, so I'd appreciate help getting oriented. My goal is to process a collection of images, generate vectors, and store the vectors in a vector index (which is what I understand AI Search to be). Then, on demand at a later time, I want to process an individual user-supplied image and perform a vector search with it against the AI Search index to find similar images. Note that I am not asking about performing OCR. I understand images embedded in documents can have some analysis performed on them to extract features, but AFAIK that doesn't create a vector embedding of the actual image that would be appropriate for a similarity search.

  • user_2429799
    Copper Contributor
    [Copilot]
    Azure AI Search indeed supports vector search, which is an approach in information retrieval that supports indexing and query execution over numeric representations of content. This means that it can match across multiple content types, including images.

    To use Azure AI Search for image similarity, you need to generate vector embeddings for your images. Azure AI Search doesn't host vectorization models, so you create these embeddings externally. Any image embedding model works, provided the same model is used at indexing time and at query time; the Azure AI Vision Vectorize Image API you linked is one option. (Azure OpenAI embedding models are commonly used with AI Search, but they embed text rather than images.) Once you have the embeddings, you can store them in a vector field of an Azure AI Search index, which then acts as a vector database.
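
    As a rough illustration of storing externally generated vectors, here is a minimal sketch that only builds the JSON bodies for an index definition and a document upload; it does not call the service. The index name, the field names (`id`, `imagePath`, `imageVector`), the profile and algorithm names, and the 1024-dimension default are all assumptions made for the example, and the property names follow my reading of the 2023-11-01 REST API, so check them against the current reference before relying on them.

```python
import json

# Assumption: 1024 is the vector length returned by the image embedding
# model. Check the length of the vectors you actually receive.
VECTOR_DIMENSIONS = 1024


def build_index_definition(index_name: str, dimensions: int = VECTOR_DIMENSIONS) -> dict:
    """Index schema with a string key, the image path, and one vector field.

    All field, profile, and algorithm names are hypothetical.
    """
    return {
        "name": index_name,
        "fields": [
            {"name": "id", "type": "Edm.String", "key": True},
            {"name": "imagePath", "type": "Edm.String", "retrievable": True},
            {
                "name": "imageVector",
                "type": "Collection(Edm.Single)",
                "searchable": True,
                "dimensions": dimensions,
                "vectorSearchProfile": "image-profile",
            },
        ],
        "vectorSearch": {
            "algorithms": [
                {
                    "name": "image-hnsw",
                    "kind": "hnsw",
                    "hnswParameters": {"metric": "cosine"},
                }
            ],
            "profiles": [{"name": "image-profile", "algorithm": "image-hnsw"}],
        },
    }


def build_upload_batch(image_vectors: dict, dimensions: int = VECTOR_DIMENSIONS) -> dict:
    """Turn {image path: vector} into an upload batch for the index."""
    actions = []
    for number, (path, vector) in enumerate(sorted(image_vectors.items())):
        # A vector whose length differs from the field's dimensions is
        # caught here instead of being sent to the service.
        if len(vector) != dimensions:
            raise ValueError(f"{path}: expected {dimensions} values, got {len(vector)}")
        actions.append(
            {
                "@search.action": "upload",
                "id": str(number),
                "imagePath": path,
                "imageVector": list(vector),
            }
        )
    return {"value": actions}


if __name__ == "__main__":
    # Tiny 3-dimensional vectors, purely for demonstration.
    demo = {"cat.png": [0.1, 0.2, 0.3], "dog.png": [0.3, 0.2, 0.1]}
    print(json.dumps(build_index_definition("images", 3), indent=2))
    print(json.dumps(build_upload_batch(demo, 3), indent=2))
```

    To my understanding, the first body is sent with a PUT to the service's index endpoint and the second with a POST to that index's `docs/index` endpoint, each with the service key in an `api-key` header.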

    You can then use these stored vectors to perform similarity searches. For example, you can encode a user-supplied image into a vector using the same process, and then perform a vector search on that image against the AI Search index to find similar images.
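
    To make the query side concrete, here is a small sketch under stated assumptions. `top_k_similar` is a brute-force local stand-in for what the service does (the service itself uses nearest-neighbor algorithms such as HNSW or exhaustive KNN), and `build_vector_query` builds a search request body. The `imageVector` and `imagePath` field names are hypothetical, and the request shape follows my reading of the 2023-11-01 REST API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def top_k_similar(query_vector, indexed_vectors, k=3):
    """Rank stored images by similarity to the query vector (brute force)."""
    scored = [
        (name, cosine_similarity(query_vector, vector))
        for name, vector in indexed_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def build_vector_query(query_vector, k=3):
    """Request body for a vector query against a hypothetical imageVector field."""
    return {
        "select": "id,imagePath",
        "vectorQueries": [
            {
                "kind": "vector",
                "vector": list(query_vector),
                "fields": "imageVector",
                "k": k,
            }
        ],
    }


if __name__ == "__main__":
    # Toy 2-dimensional vectors; real image embeddings are much longer.
    stored = {"cat.png": [1.0, 0.0], "dog.png": [0.0, 1.0], "kitten.png": [0.9, 0.1]}
    for name, score in top_k_similar([1.0, 0.0], stored, k=2):
        print(name, round(score, 3))
```

    The query vector has to come from the same embedding model as the stored vectors; similarity scores between vectors from different models are not meaningful.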

    As for OCR, Azure AI Search does offer OCR capabilities through AI enrichment, which can extract text from images. However, as you noted, this wouldn’t create a vector embedding of the actual image for a similarity search.

    In summary, your understanding is correct. You can use Azure AI Search to create a vector index of images, and then perform similarity searches on user-supplied images against this index. This process would involve generating vector embeddings of your images externally and then storing these embeddings in Azure AI Search.
