In May 2024, we introduced GenAI Gateway capabilities – a set of features designed specifically for GenAI use cases. Today, we are happy to announce that we are adding new policies to support a wider range of large language models through Azure AI Model Inference API. These new policies work in a similar way to the previously announced capabilities, but now can be used with a wider range of LLMs.
Azure AI Model Inference API enables you to consume the capabilities of models, available in Azure AI model catalog, in a uniform and consistent way. It allows you to talk with different models in Azure AI Studio without changing the underlying code.
Working with large language models presents unique challenges, particularly around managing token resources. Token consumption impacts cost and performance of intelligent apps calling the same model, making it crucial to have robust mechanisms for monitoring and controlling token usage. The new policies aim to address challenges by providing detailed insights and control over token resources, ensuring efficient and cost-effective use of models deployed in Azure AI Studio.
LLM Token Limit policy (preview) provides the flexibility to define and enforce token limits when interacting with large language models available through the Azure AI Model Inference API.
Key Features
Learn more about this policy here.
LLM Emit Token Metric policy (preview) provides detailed metrics on token usage, enabling better cost management and insights into model usage across your application portfolio.
Key Features
Learn more about this policy here.
LLM Semantic Caching policy (preview) is designed to reduce latency and reduce token consumption by caching responses based on the semantic content of prompts.
Key Features
Learn more about this policy here.
We are committed to continuously improving our platform and providing the tools you need to leverage the full potential of large language models. Stay tuned as we roll out these new policies across all regions and watch for further updates and enhancements as we continue to expand our capabilities. Get started today and bring your intelligent application development to the next level with Azure API Management.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.