In this era of generative AI, the ability to process and analyze large datasets with precision and speed is not just advantageous—it’s essential. Vector databases, such as Weaviate, play a pivotal role in the infrastructure that powers generative AI applications, from natural language processing to image generation. These databases efficiently handle the similarity search operations at the core of generative models, enabling them to parse vast datasets and identify patterns that drive the creation of new, synthetic content.
By leveraging Azure Kubernetes Service (AKS) and using high-performance Azure NetApp Files (ANF) as the back-end storage, deploying Weaviate creates a scalable foundation that effectively meets the demanding requirements of generative AI models. This blog post guides you through setting up Weaviate on AKS, backed by the robust storage solution of Azure NetApp Files. We then benchmark our setup with ANN-Benchmarks—the established framework for testing approximate nearest neighbor search algorithms with vector databases—to quantitatively measure Weaviate's performance in a controlled environment.
Follow along as we streamline the deployment process and benchmarking steps, providing a clear view of Weaviate's performance in a cloud environment. By the end of our journey, you'll have a comprehensive understanding of how to deploy a scalable vector search solution and what to expect from its performance on Azure's robust infrastructure.
Co-authors: Michael Haigh, Senior Technical Marketing Engineer, Kyle Radder, Technical Marketing Engineer (NetApp)
If you’ll be following along step by step, be sure to have the following resources at your disposal:
We use the Weaviate Helm chart to install Weaviate on the AKS cluster. First, SSH to the Linux VM that’s deployed in the same virtual network as your AKS cluster, then add the Weaviate repository:
helm repo add weaviate https://weaviate.github.io/weaviate-helm
helm repo update
To view the possible configuration values for the Weaviate Helm chart, run the following command:
helm show values weaviate/weaviate
Depending on your generative AI application, you may want to configure additional Weaviate replica pods or enable local machine learning (ML) models. For our performance benchmarking, we leave all the defaults except for the following settings:
cat <<EOF > values.yaml
storage:
  size: 30Ti
service:
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
grpcService:
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
EOF
As mentioned in the prerequisites, a 30TiB Azure NetApp Files Ultra volume provides 30Gbps of throughput, which is roughly equivalent to the 30,000Mbps of bandwidth provided by the Standard_D64_v4 AKS node. If you’re using a smaller AKS node, you can reduce your volume size to result in an equivalent throughput (each TiB of an Ultra volume provides 1Gbps of throughput).
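The sizing rule above can be sketched as a small calculation. This is only an illustration: the function name is hypothetical, and it relies on the approximate figure quoted in the text (about 1 Gbps per TiB for an Ultra volume), so check current Azure NetApp Files limits before sizing a real volume.

```python
import math

# Figure taken from the text: each TiB of an Ultra volume provides about 1 Gbps.
ULTRA_GBPS_PER_TIB = 1.0


def ultra_volume_size_tib(node_bandwidth_mbps: float) -> int:
    """Smallest whole-TiB Ultra volume whose throughput matches the node bandwidth."""
    node_gbps = node_bandwidth_mbps / 1000  # Mbps -> Gbps
    return math.ceil(node_gbps / ULTRA_GBPS_PER_TIB)


# Standard_D64_v4 with 30,000 Mbps, as stated above
print(ultra_volume_size_tib(30_000))  # 30
# A hypothetical smaller node with 12,500 Mbps of bandwidth
print(ultra_volume_size_tib(12_500))  # 13
```

The result is the value you would place in the storage size setting of values.yaml, for example 13Ti for the hypothetical 12,500 Mbps node.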
The other two Helm settings assign internal IP addresses to the HTTP and gRPC Weaviate services, so network traffic stays confined to our internal virtual network.
To deploy Weaviate with these values, run the following command:
helm install weaviate -n weaviate --create-namespace weaviate/weaviate -f values.yaml
To check on the status of the deployment, run the following command:
kubectl -n weaviate get all,pvc
It takes less than a minute to get the external IPs populated, and about 5 to 10 minutes for the volume to go into a Bound state:
$ kubectl -n weaviate get all,pvc
NAME             READY   STATUS    RESTARTS   AGE
pod/weaviate-0   1/1     Running   0          8m21s

NAME                        TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)           AGE
service/weaviate            LoadBalancer   172.16.213.188   10.20.0.8     80:31961/TCP      8m21s
service/weaviate-grpc       LoadBalancer   172.16.23.238    10.20.0.9     50051:30943/TCP   8m21s
service/weaviate-headless   ClusterIP      None             <none>        80/TCP            8m21s

NAME                        READY   AGE
statefulset.apps/weaviate   1/1     8m21s

NAME                                             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS               AGE
persistentvolumeclaim/weaviate-data-weaviate-0   Bound    pvc-4c354c0d-29fe-4af8-a611-35e993c9ecab   30Ti       RWO            azure-netapp-files-ultra   8m21s
Depending on your virtual network settings, your external IPs will probably be different, but verify that they’re RFC 1918 internal IP addresses to ensure that network traffic stays on the internal virtual network. Take note of these IPs for use in the next section. Once the volume is bound and the weaviate-0 pod is in a Running state, we’re ready to start performance testing.
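If you'd rather not check the address ranges by eye, the RFC 1918 test can be sketched with Python's standard ipaddress module. The helper name is hypothetical; the three networks are the private IPv4 ranges that RFC 1918 defines.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address: str) -> bool:
    """Return True if the IPv4 address falls inside an RFC 1918 private range."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


# The two external IPs from the example output above
print(is_rfc1918("10.20.0.8"))     # True
print(is_rfc1918("10.20.0.9"))     # True
# A documentation-only public address, for contrast
print(is_rfc1918("203.0.113.10"))  # False
```

The explicit network list is used instead of the is_private attribute because is_private also covers loopback, link-local, and other reserved ranges that are not part of RFC 1918.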
ANN-Benchmarks is a benchmarking environment for approximate nearest neighbor (ANN) algorithms. ANN algorithms find the nearest neighbors to a point in a dataset, where approximate means that the algorithm is allowed to return points that are close to the nearest neighbors, rather than the exact ones. This trade-off enables significantly faster processing times, which is especially useful when dealing with very large datasets.
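To make the accuracy side of that trade-off concrete, here is a minimal sketch of how an approximate result can be scored against the exact answer. The points, the query, and the "approximate" answer are made up for illustration; recall here means the fraction of the true nearest neighbors that the approximate result contains.

```python
import math


def exact_neighbors(points, query, k):
    """Brute-force search: indices of the k points closest to the query (Euclidean)."""
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))
    return order[:k]


def recall(approximate, exact):
    """Fraction of the true nearest neighbors that the approximate result contains."""
    return len(set(approximate) & set(exact)) / len(exact)


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0), (5.0, 1.0)]
query = (0.2, 0.1)

truth = exact_neighbors(points, query, k=3)  # exact answer: [0, 1, 2]
approx = [0, 1, 3]  # hypothetical ANN answer that misses one true neighbor
print(truth)
print(recall(approx, truth))  # 2 of the 3 true neighbors were found
```

The brute-force search compares the query against every point, which is exactly the cost that ANN indexes avoid on large datasets.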
ANN benchmarks are an effective way to test vector databases because they exercise the large datasets that are typical in real-world applications such as recommendation systems and natural language processing. By simulating practical use cases, they let you evaluate a vector database's ability to balance accuracy and speed, a critical aspect of user experience. These tests also offer insights into the scalability and resource efficiency of the databases, revealing how performance evolves with growing data volumes and complexity. ANN testing can also reveal how the underlying infrastructure affects the database's performance, which is vital for optimizing deployments.
The Weaviate test module in ANN-Benchmarks uses the v3 Weaviate client and an embedded Weaviate instance. Because the v3 client is deprecated and is no longer recommended, and we’re using an external (running on AKS) Weaviate instance, the test module must be modified. A fork of the ANN-Benchmarks repository has been created with these modifications. If you’re curious about the specific changes, see this diff.
From your workstation VM, run the following commands to clone the forked ANN-Benchmarks repository and change into the created directory:
git clone https://github.com/MichaelHaigh/ann-benchmarks.git
cd ann-benchmarks
We now install Python 3.10, which is the validated Python version for ANN-Benchmarks:
sudo apt install -y software-properties-common
sudo add-apt-repository -y ppa:deadsnakes/ppa
sudo apt update
sudo apt install -y python3.10 python3.10-distutils python3.10-venv
(This example is for Ubuntu; if you’re running a different flavor of Linux, the commands will be different.)
Next, we create our Python virtual environment and install the necessary packages:
python3.10 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
pip install weaviate-client
Finally, open the Weaviate module.py file with your favorite text editor:
vim ann_benchmarks/algorithms/weaviate/module.py
Take note of lines 14-21, and especially lines 15 and 18:
14         self.client = weaviate.connect_to_custom(
15             http_host="10.20.0.8",
16             http_port="80",
17             http_secure=False,
18             grpc_host="10.20.0.9",
19             grpc_port="50051",
20             grpc_secure=False,
21         )
Lines 15 and 18 must be updated with the external IPs of the weaviate and weaviate-grpc services, respectively, from the previous section. When complete, save the file and exit the text editor.
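If you prefer to script this edit, a helper along the following lines could rewrite the two host values. This is a sketch, not part of the forked repository: the function name is hypothetical, and the regular expressions assume the file uses the exact http_host="..." and grpc_host="..." form shown above.

```python
import re


def set_weaviate_hosts(source: str, http_host: str, grpc_host: str) -> str:
    """Return the module source with the http_host and grpc_host values replaced."""
    source = re.sub(r'http_host="[^"]*"', f'http_host="{http_host}"', source)
    source = re.sub(r'grpc_host="[^"]*"', f'grpc_host="{grpc_host}"', source)
    return source


# Two lines in the same form as lines 15 and 18 of module.py
snippet = 'http_host="10.20.0.8",\ngrpc_host="10.20.0.9",\n'
# 10.30.0.4 and 10.30.0.5 are placeholder IPs; use the ones from your services
updated = set_weaviate_hosts(snippet, "10.30.0.4", "10.30.0.5")
print(updated)
```

To apply it to the real file, read ann_benchmarks/algorithms/weaviate/module.py with pathlib.Path.read_text(), pass the contents through the helper, and write the result back with write_text().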
We're now ready to start our performance benchmarking with the following command:
python run.py --algorithm weaviate --local
Note: This task will take 1 to 2 days to complete, depending on your setup; it took 30 hours with the configuration just described.
The --algorithm argument instructs ANN-Benchmarks to run only the Weaviate tests, and the --local argument instructs it to run the tests “locally” rather than with the default Docker method. Because we’ve modified the test module to connect to our external Weaviate instance running on AKS, it’s not truly a “local” test.
The first action of the benchmark is to download the GloVe 100 Angular dataset. (This can be changed with the --dataset argument, as shown in the next section.) The benchmark then prints a long list showing the order in which the tests will run:
$ python run.py --algorithm weaviate --local
downloading https://ann-benchmarks.com/glove-100-angular.hdf5 -> data/glove-100-angular.hdf5...
2024-08-19 19:52:11,777 - annb - INFO - running only weaviate
2024-08-19 19:52:12,622 - annb - INFO - Order: [Definition(algorithm='weaviate', constructor='Weaviate', module='ann_benchmarks.algorithms.weaviate', docker_tag='ann-benchmarks-weaviate', arguments=['angular', 64, 128], query_argument_groups=[[16], [32], [48], [64], [96], [128], [256], [512], [768]], disabled=False), Definition(algorithm='weaviate', constructor='Weaviate',
...
In this example (order varies because the tests are randomized), the first test is:
These tests are controlled by the config.yml file located in the algorithm directory, so feel free to modify that file to reduce the number of tests, if desired.
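To get a feel for how much work each entry in that list represents, the sketch below expands the first Definition from the log output. The values are copied from that log line; the interpretation (one index build per Definition, followed by one query pass per query argument group) is our reading of how ANN-Benchmarks runs a definition, so treat it as an assumption.

```python
# Values copied from the first Definition in the log output above.
index_arguments = ["angular", 64, 128]
query_argument_groups = [[16], [32], [48], [64], [96], [128], [256], [512], [768]]


def query_passes(groups):
    """Number of query passes run against the single index a Definition builds."""
    return len(groups)


# Assumption: the index is built once, then queried once per group.
for group in query_argument_groups:
    print(f"index args {index_arguments[1:]} -> query args {group}")

print(query_passes(query_argument_groups))  # 9
```

Removing entries from the configuration file shrinks either the number of Definitions (index builds) or the number of query passes per Definition, and both reduce total run time.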
After the tests have been running for a few hours, open the overview page for the PVC’s volume in the Azure portal to view its metrics. Make sure that the “throughput limit reached” chart stays at 0; otherwise, your volume is too small relative to the bandwidth of your selected node.
After 1 to 2 days, the GloVe 100 Angular benchmarking will be complete, and we can move on to our next dataset.
Because the GloVe 100 Angular dataset is geared toward word vectors, we’ll use the Sift 128 Euclidean dataset, which is geared toward image vectors:
This time when we execute the benchmark, we’ll use the --dataset argument to specify this dataset:
python run.py --algorithm weaviate --local --dataset sift-128-euclidean
Again, this command can take 1 to 2 days to complete; in our testing it took roughly 24 hours. When complete, you can continue to run additional benchmarks with more datasets, if desired. However, we’ll now move on to analysis.
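The dataset names also encode the distance metric used to rank neighbors: angular for GloVe and Euclidean for Sift. As a rough sketch of the difference, the functions below use the standard Euclidean distance and one common definition of angular distance (1 minus cosine similarity). The function names are illustrative and are not taken from the ANN-Benchmarks code.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance; sensitive to vector magnitude."""
    return math.dist(a, b)


def angular_distance(a, b):
    """1 - cosine similarity; depends only on the angle between the vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


a, b, c = (1.0, 0.0), (2.0, 0.0), (0.0, 1.0)

# Same direction, different length: far apart in Euclidean terms, identical in angle.
print(euclidean_distance(a, b), angular_distance(a, b))
# Perpendicular vectors: the angular distance is 1.
print(euclidean_distance(a, c), angular_distance(a, c))
```

Because the two metrics rank neighbors differently, results from the two datasets are not directly interchangeable.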
There are a handful of ways to analyze the results of our benchmark testing:
We’ll go with option 2 here, because a single command yields many interesting images. However, feel free to play around with options 1 and 3 in your environment. On your workstation, run the following command:
python create_website.py
If your workstation has a desktop environment, open the weaviate.html file that was generated. Otherwise, run the following command to copy the file to your physical machine:
scp <user>@<ip>:/home/<user>/ann-benchmarks/weaviate.html weaviate.html
Once opened, scroll through the page to view the results. The entire page of results is included in the results section of this blog, but let’s dig into just two of the images.
The axes of the above chart represent:
Values toward the upper right are better, meaning that in our tests Weaviate performed better with the Sift 128 Euclidean dataset than with the GloVe 100 Angular dataset. This suggests that Weaviate may be better suited to computer vision tasks than to natural language processing, although the two datasets also differ in dimensionality and distance metric. We recommend testing against additional datasets and vector databases to find the best match for your specific application.
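The "upper right is better" reading can be expressed as a Pareto frontier: a run is worth keeping only if no other run beats it on both axes. The sketch below assumes the chart plots recall against queries per second, as the standard ANN-Benchmarks plot does. The numbers are placeholders for illustration, not measured results.

```python
def pareto_frontier(points):
    """Keep the (recall, qps) points that no other point beats on both axes."""
    frontier = []
    best_qps = float("-inf")
    # Walk from highest recall to lowest; a point survives only if it is
    # faster than every point seen so far (all of which have higher or equal recall).
    for recall, qps in sorted(points, key=lambda p: (-p[0], -p[1])):
        if qps > best_qps:
            frontier.append((recall, qps))
            best_qps = qps
    return sorted(frontier)


# Placeholder numbers for illustration only, not measured results.
runs = [(0.80, 900.0), (0.90, 700.0), (0.85, 600.0), (0.95, 400.0), (0.95, 350.0)]
print(pareto_frontier(runs))
# [(0.8, 900.0), (0.9, 700.0), (0.95, 400.0)]
```

When comparing two datasets or two databases, the curve whose frontier lies further toward the upper right offers more throughput at the same recall.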
Let’s investigate one more chart:
While the previous chart focused purely on the query phase, this chart focuses on the trade-off between the quality of the search results and the memory footprint of the index. The X axis (Recall) is the same; however, the Y axis represents the amount of memory used by the vector database to store the data structure that facilitates the neighbor search. As we can see, for certain levels of recall, Weaviate has a lower memory footprint for the GloVe 100 Angular dataset, but for other levels of recall the Sift 128 Euclidean dataset’s memory footprint is lower.
Note: For more information about the tooltip, see this page of the Weaviate documentation.
Depending on your generative AI application, you may value memory footprint over query speed—for example, a computer vision application in embedded systems. Other applications, like a chatbot or coding assistant, may value query speed over memory footprint. Performing benchmarks against potential vector databases with relevant datasets can help determine the ideal configuration for your generative AI applications.
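One way to turn those priorities into a decision is to filter benchmark runs by a recall floor and an optional memory ceiling, then take the fastest run that remains. The sketch below is hypothetical throughout: the Run fields, the run names, and every number are placeholders, not values from our results.

```python
from typing import NamedTuple, Optional


class Run(NamedTuple):
    name: str
    recall: float
    qps: float
    index_size_kb: float


def pick_run(runs, min_recall, max_index_size_kb=None) -> Optional[Run]:
    """Fastest run that meets the recall floor and, optionally, a memory ceiling."""
    candidates = [
        r for r in runs
        if r.recall >= min_recall
        and (max_index_size_kb is None or r.index_size_kb <= max_index_size_kb)
    ]
    return max(candidates, key=lambda r: r.qps, default=None)


# Placeholder runs for illustration only, not measured results.
runs = [
    Run("config-a", recall=0.90, qps=800.0, index_size_kb=900_000.0),
    Run("config-b", recall=0.95, qps=500.0, index_size_kb=600_000.0),
    Run("config-c", recall=0.99, qps=200.0, index_size_kb=1_200_000.0),
]

# A chatbot-style workload: speed first, with a recall floor.
print(pick_run(runs, min_recall=0.90).name)  # config-a
# An embedded-style workload: the same floor, but a tight memory budget.
print(pick_run(runs, min_recall=0.90, max_index_size_kb=700_000.0).name)  # config-b
```

If no run satisfies both limits, the helper returns None, which is a signal to relax a constraint or benchmark more configurations.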
The remaining results of the ANN-Benchmarks testing are shown here.
The deployment and benchmarking of Weaviate on Azure Kubernetes Service with Azure NetApp Files demonstrates the platform's robust capabilities in handling generative AI workloads. The detailed walk-through in this blog simplifies the setup process, and it also equips users with the necessary insights to make informed decisions about their vector database deployments.
The results from the ANN-Benchmarks reveal valuable performance metrics that are essential for optimizing AI applications. Weaviate's impressive handling of the Sift 128 Euclidean dataset suggests a strong suit in computer vision tasks, and its performance with the GloVe 100 Angular dataset opens avenues for natural language processing applications. However, users must consider the specific requirements of their applications, because trade-offs can significantly impact the user experience and operational costs.
By leveraging Azure's scalable infrastructure and Weaviate's vector search capabilities, developers and organizations can confidently scale their AI solutions, knowing that they have a reliable and efficient system in place. The benchmarks are a testament to the potential of Weaviate on AKS and Azure NetApp Files, providing a solid foundation for future generative AI endeavors. Whether your focus is on maximizing recall, query throughput, or maintaining a minimal memory footprint, this setup means that you can achieve your goals with efficiency and precision.