Introduction
One way to optimize the cost and performance of Large Language Models (LLMs) is to cache their responses; this is sometimes referred to as "semantic caching". In this blog,...
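To make the idea concrete before diving in, here is a minimal sketch of semantic caching in Python. Everything in it is illustrative, not an API from this blog or any library: `SemanticCache`, `toy_embed`, and the `threshold` parameter are hypothetical names, and `embed` stands in for whatever real embedding model you would call (for example, an embeddings deployment in Azure OpenAI). The core technique is the part that matters: embed each prompt, compare it against the embeddings of previously answered prompts, and return the cached response when the similarity clears a threshold.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class SemanticCache:
    """Illustrative semantic cache: keys are prompt embeddings, not exact strings."""

    def __init__(self, embed, threshold=0.9):
        self.embed = embed          # callable: str -> list[float] (your embedding model)
        self.threshold = threshold  # minimum similarity to count as a cache hit
        self.entries = []           # list of (embedding, cached_response) pairs

    def get(self, prompt):
        """Return the cached response for the most similar prior prompt, or None."""
        query = self.embed(prompt)
        best_response, best_score = None, 0.0
        for emb, response in self.entries:
            score = cosine_similarity(query, emb)
            if score > best_score:
                best_response, best_score = response, score
        return best_response if best_score >= self.threshold else None

    def put(self, prompt, response):
        """Store an LLM response under the embedding of the prompt that produced it."""
        self.entries.append((self.embed(prompt), response))

# Toy usage with a fake "embedding" model; in practice you would call a real
# embeddings API here (this stub exists only to keep the example runnable).
def toy_embed(text):
    t = text.lower()
    return [t.count("cache"), t.count("llm"), len(t) / 100.0]

cache = SemanticCache(embed=toy_embed, threshold=0.95)
cache.put("What is semantic caching for LLMs?", "It stores LLM responses...")
print(cache.get("Explain semantic caching for LLMs"))  # similar prompt -> cache hit
```

On a cache miss, the application would call the LLM and `put` the new response before returning it. A production setup would also replace the linear scan above with a vector index (Azure Cache for Redis Enterprise and Azure AI Search, for example, both support vector similarity lookups) and tune the threshold to balance hit rate against the risk of returning a mismatched answer.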
Updated Apr 10, 2024 · Version 1.0 · sudarsan, Microsoft