Introduction
One way to optimize the cost and performance of Large Language Models (LLMs) is to cache their responses; this is sometimes referred to as "semantic caching". In this blog,...
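To make the idea concrete before diving in, here is a minimal sketch of semantic caching in Python. Everything in it is illustrative, not an API from this blog or any library: `SemanticCache`, `toy_embed`, and the `threshold` parameter are hypothetical names, and `embed` stands in for whatever real embedding model you would call (for example, an embeddings deployment in Azure OpenAI). The core technique is the part that matters: embed each prompt, compare it against the embeddings of previously answered prompts, and return the cached response when the similarity clears a threshold.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class SemanticCache:
    """Illustrative semantic cache: keys are prompt embeddings, not exact strings."""

    def __init__(self, embed, threshold=0.9):
        self.embed = embed          # callable: str -> list[float] (your embedding model)
        self.threshold = threshold  # minimum similarity to count as a cache hit
        self.entries = []           # list of (embedding, cached_response) pairs

    def get(self, prompt):
        """Return the cached response for the most similar prior prompt, or None."""
        query = self.embed(prompt)
        best_response, best_score = None, 0.0
        for emb, response in self.entries:
            score = cosine_similarity(query, emb)
            if score > best_score:
                best_response, best_score = response, score
        return best_response if best_score >= self.threshold else None

    def put(self, prompt, response):
        """Store an LLM response under the embedding of the prompt that produced it."""
        self.entries.append((self.embed(prompt), response))

# Toy usage with a fake "embedding" model; in practice you would call a real
# embeddings API here (this stub exists only to keep the example runnable).
def toy_embed(text):
    t = text.lower()
    return [t.count("cache"), t.count("llm"), len(t) / 100.0]

cache = SemanticCache(embed=toy_embed, threshold=0.95)
cache.put("What is semantic caching for LLMs?", "It stores LLM responses...")
print(cache.get("Explain semantic caching for LLMs"))  # similar prompt -> cache hit
```

On a cache miss, the application would call the LLM and `put` the new response before returning it. A production setup would also replace the linear scan above with a vector index (Azure Cache for Redis Enterprise and Azure AI Search, for example, both support vector similarity lookups) and tune the threshold to balance hit rate against the risk of returning a mismatched answer.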
Updated Apr 10, 2024 · Version 1.0 · sudarsan, Microsoft