Vector Drift in Azure AI Search: Three Hidden Reasons Your RAG Accuracy Degrades After Deployment
Hi Akanksha, I really enjoyed this article! You hit the nail on the head regarding why RAG systems can start to feel a bit off after a few months in production. Cause 2 was a major lightbulb moment for me because it’s so easy to forget that semantic meaning just isn't static.
I've found that hybrid search can be a great safety net while the vector space is shifting, since the keyword half of the query doesn't depend on the embeddings at all. Also, for larger datasets where a full re-index is too expensive, a rolling re-index that starts with the top 10 or 20 percent of high-impact docs (the ones retrieved most often, for example) usually clears up the most visible drift issues pretty fast.
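To make the rolling re-index idea concrete, here is a minimal sketch of picking the first batch. It assumes "high-impact" means "retrieved most often" and that you already log a retrieval count per document; the function name, the `retrieval_counts` dict, and the doc IDs are all made up for illustration, and the actual re-embedding and upload step is left out.

```python
from math import ceil


def select_reindex_batch(retrieval_counts, fraction=0.2):
    """Return the doc IDs in the top `fraction` of documents by retrieval count.

    `retrieval_counts` maps doc ID -> number of times the doc was retrieved
    (a hypothetical metric; swap in whatever "impact" means for your data).
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Highest count first; ties are broken by doc ID so the order is stable.
    ranked = sorted(retrieval_counts, key=lambda d: (-retrieval_counts[d], d))
    batch_size = ceil(len(ranked) * fraction) if ranked else 0
    return ranked[:batch_size]


# Illustrative data only.
counts = {"doc-a": 120, "doc-b": 5, "doc-c": 48, "doc-d": 300, "doc-e": 0}
first_batch = select_reindex_batch(counts, fraction=0.4)
print(first_batch)
```

Once that batch is re-embedded, the same call with a larger fraction (minus the docs already done) gives the next slice, so the expensive work is spread out over time.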
On the monitoring side, tracking the average similarity score of top results over time has been a real lifesaver for us. It acts like a canary in the coal mine to catch alignment slips before users even notice the accuracy drop.
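For anyone who wants to try this, a rough sketch of the canary check could look like the following. It assumes you log the similarity scores of the top results for each query and roll them up into one average per day; the function names, the 5 percent tolerance, and the sample numbers are placeholders, not recommendations.

```python
from statistics import mean


def mean_top_k_score(scores, k=5):
    """Average of the k highest similarity scores returned for one query."""
    top = sorted(scores, reverse=True)[:k]
    if not top:
        raise ValueError("no scores to average")
    return mean(top)


def drift_alert(baseline_means, recent_means, tolerance=0.05):
    """Return True if the recent average fell more than `tolerance`
    (as a relative fraction) below the baseline average.

    Assumes positive scores where higher means more similar.
    """
    return mean(recent_means) < mean(baseline_means) * (1 - tolerance)


# Illustrative daily averages only.
baseline = [0.82, 0.81, 0.83]   # captured right after the last re-index
recent = [0.74, 0.75, 0.73]     # the last few days in production
if drift_alert(baseline, recent):
    print("Average top-k similarity dropped; check for drift.")
```

One caveat: the comparison only means something if the baseline and recent scores come from the same kind of query and the same similarity metric, so it is worth keeping pure vector queries and hybrid queries in separate series.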
One thing I’d love to add to your point on chunking is the metadata lineage aspect. If the chunking strategy changes, the pointers from each chunk back to its original source doc can get misaligned. The chunks effectively become orphans, which makes citations a nightmare for users even if the answer is technically right.
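A simple lineage check can catch these orphans before users do. The sketch below assumes each chunk stores a `source_id` plus `start` and `end` character offsets into the source text; those field names and the sample documents are hypothetical, so adapt them to whatever metadata your index actually carries.

```python
def find_orphan_chunks(chunks, sources):
    """Return (chunk_id, reason) pairs for chunks whose lineage is broken.

    `sources` maps source doc ID -> full text. A chunk is flagged when its
    source is gone or when the text at its stored offsets no longer matches.
    """
    orphans = []
    for chunk in chunks:
        source_text = sources.get(chunk["source_id"])
        if source_text is None:
            orphans.append((chunk["id"], "missing source"))
            continue
        span = source_text[chunk["start"]:chunk["end"]]
        if span != chunk["text"]:
            orphans.append((chunk["id"], "offset mismatch"))
    return orphans


# Illustrative data only.
first = "Vector drift happens slowly."
second = "Re-index on a schedule."
sources = {"doc-1": first + " " + second}
chunks = [
    # Healthy chunk: offsets still point at its own text.
    {"id": "c1", "source_id": "doc-1", "start": 0, "end": len(first), "text": first},
    # Stale offsets left over from an older chunking strategy.
    {"id": "c2", "source_id": "doc-1", "start": 0, "end": len(second), "text": second},
    # Source document no longer exists.
    {"id": "c3", "source_id": "doc-9", "start": 0, "end": len(first), "text": first},
]
print(find_orphan_chunks(chunks, sources))
```

Running a check like this after every chunking change gives you a list of chunks to re-link or re-generate, which keeps citations pointing at the right passage.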
Thanks for sharing these insights! It's definitely going to be a go-to resource for the team.