Introduction
One way to optimize the cost and performance of Large Language Models (LLMs) is to cache their responses, a technique sometimes referred to as "semantic caching". In this blog, we take a closer look at how it works.
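To make the idea concrete, here is a minimal sketch of a semantic cache in Python. It is illustrative only, not a specific library's API: the `embed` callable is a hypothetical stand-in for an embeddings model, and the class name, similarity threshold, and linear scan over cached entries are all simplifying assumptions (a production cache would use a vector store).

```python
import math

class SemanticCache:
    """Minimal in-memory semantic cache (illustrative sketch).

    `embed` is assumed to be a callable that maps text to a vector,
    e.g. a call to an embeddings model; it is not defined here.
    """

    def __init__(self, embed, threshold=0.9):
        self.embed = embed          # callable: str -> list[float]
        self.threshold = threshold  # cosine-similarity cutoff for a "hit"
        self.entries = []           # list of (embedding, cached response)

    @staticmethod
    def _cosine(a, b):
        # Cosine similarity between two vectors of equal length.
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0

    def get(self, prompt):
        """Return a cached response if a semantically similar prompt was seen."""
        vec = self.embed(prompt)
        best_response, best_sim = None, 0.0
        for emb, response in self.entries:  # naive linear scan for clarity
            sim = self._cosine(vec, emb)
            if sim > best_sim:
                best_response, best_sim = response, sim
        return best_response if best_sim >= self.threshold else None

    def put(self, prompt, response):
        """Store a prompt/response pair for future lookups."""
        self.entries.append((self.embed(prompt), response))
```

In use, the application checks the cache before calling the LLM; on a miss it calls the model as usual and stores the new prompt/response pair with `put`, so near-duplicate questions are answered from the cache instead of a fresh (and billable) model call.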