
Azure Architecture Blog

Optimize Azure OpenAI Applications with Semantic Caching

sudarsan
Microsoft
Apr 10, 2024
Introduction

One of the ways to optimize the cost and performance of applications built on Large Language Models (LLMs) is to cache the responses from the LLM; this is sometimes referred to as "semantic caching". In this blog,...
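As a minimal illustration of the idea (not the specific implementation this post describes), a semantic cache stores an embedding alongside each cached response and serves a hit when a new query's embedding is sufficiently similar to a stored one. The `embed` function below is a toy bag-of-words stand-in for a real embedding model, and the `SemanticCache` class and its 0.8 threshold are hypothetical names chosen for this sketch:

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy embedding: bag-of-words term counts over lowercase tokens.
    # A real semantic cache would call an embedding model instead.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    def __init__(self, threshold=0.8):
        self.threshold = threshold  # minimum similarity for a cache hit
        self.entries = []           # list of (embedding, response) pairs

    def get(self, query):
        # Return the cached response whose query embedding is most
        # similar to this query, if it clears the threshold.
        q = embed(query)
        best, best_sim = None, 0.0
        for emb, response in self.entries:
            sim = cosine(q, emb)
            if sim > best_sim:
                best, best_sim = response, sim
        return best if best_sim >= self.threshold else None

    def put(self, query, response):
        self.entries.append((embed(query), response))

cache = SemanticCache(threshold=0.8)
cache.put("what is azure openai", "Azure OpenAI is a managed service...")
hit = cache.get("what is azure openai?")   # near-identical phrasing: hit
miss = cache.get("how do I bake bread")    # unrelated query: None
```

On a cache hit the application skips the LLM call entirely, which is where the cost and latency savings come from; the similarity threshold trades hit rate against the risk of returning a response to a subtly different question.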
Updated Apr 10, 2024
Version 1.0