azure cosmos db
33 Topics
Azure APIM Cost Rate Limiting with Cosmos & Flex Functions
Azure API Management (APIM) provides built-in rate limiting policies, but implementing sophisticated dollar-cost quota management for Azure OpenAI services requires a more tailored approach. This solution combines Azure Functions, Cosmos DB, and stored procedures to implement cost-based quota management with automatic renewal periods.

Architecture

    Client → APIM (with RateLimitConfig) → Azure Function Proxy → Azure OpenAI
                                                   ↓
                                        Cosmos DB (quota tracking)

Technical Implementation

1. Rate Limit Configuration in APIM

The rate limiting configuration is injected into the request body by APIM using a policy fragment. Here's an example for a basic $5 quota:

    <set-variable name="rateLimitConfig" value="@{
        var productId = context.Product.Id;
        var config = new JObject();
        config["counterKey"] = productId;
        config["quota"] = 5;
        return config.ToString();
    }" />
    <include-fragment fragment-id="RateLimitConfig" />

For more advanced scenarios, you can customize token costs. Here's an example for a $10 quota with custom token pricing:

    <set-variable name="rateLimitConfig" value="@{
        var productId = context.Product.Id;
        var config = new JObject();
        config["counterKey"] = productId;
        config["startDate"] = "2025-03-02T00:00:00Z";
        config["renewal_period"] = 86400;
        config["explicitEndDate"] = null;
        config["quota"] = 10;
        config["input_cost_per_token"] = 0.00003;
        config["output_cost_per_token"] = 0.00006;
        return config.ToString();
    }" />
    <include-fragment fragment-id="RateLimitConfig" />

Flexible Counter Keys

The counterKey parameter is highly flexible and can be set to any unique identifier that makes sense for your rate limiting strategy:

- Product ID: Limit all users of a specific APIM product (e.g., "starter", "professional")
- User ID: Apply individual limits per user
- Subscription ID: Track usage at the subscription level
- Custom combinations: Combine identifiers for granular control (e.g., "product_starter_user_12345")

Rate Limit Configuration Parameters

Parameter | Description | Example Value | Required
counterKey | Unique identifier for tracking quota usage | "starter10" or "user_12345" | Yes
quota | Maximum cost allowed in the renewal period | 10 | Yes
startDate | When the quota period begins. If not provided, the system uses the time when the policy is first applied | "2025-03-02T00:00:00Z" | No
renewal_period | Seconds until quota resets (86400 = daily). If not provided, no automatic reset occurs | 86400 | No
endDate | Optional end date for the quota period | null or "2025-12-31T23:59:59Z" | No
input_cost_per_token | Custom cost per input token | 0.00003 | No
output_cost_per_token | Custom cost per output token | 0.00006 | No

Scheduling and Time Windows

The time-based parameters work together to create flexible quota schedules:

- If the current date falls outside the range defined by startDate and endDate, requests are rejected with an error
- The renewal window begins either on the specified startDate or when the policy is first applied
- The renewal_period determines how frequently the accumulated cost resets to zero
- Without a renewal_period, the cost accumulates indefinitely until the endDate is reached

2. Quota Checking and Cost Tracking

The Azure Function performs two key operations:

- Pre-request quota check: Before processing each request, it verifies whether the user has exceeded their quota
- Post-request cost tracking: After a successful request, it calculates the cost and updates the accumulated usage

Cost Calculation

For cost calculation, the system uses:

- Custom pricing: If input_cost_per_token and output_cost_per_token are provided in the rate limit config
- LiteLLM pricing: If custom pricing is not specified, the system falls back to LiteLLM's model prices for accurate cost estimation based on the model being used

The function returns appropriate HTTP status codes and headers:

- HTTP 429 (Too Many Requests) when the quota is exceeded
- Response headers with usage information:

    x-counter-key: starter5
    x-accumulated-cost: 5.000915
    x-quota: 5

3. Cosmos DB for State Management

Cosmos DB maintains the quota state with documents that track:

    {
      "id": "starter5",
      "counterKey": "starter5",
      "accumulatedCost": 5.000915,
      "startDate": "2025-03-02T00:00:00.000Z",
      "renewalPeriod": 86400,
      "renewalStart": 1741132800000,
      "endDate": null,
      "quota": 5
    }

A stored procedure handles atomic updates to ensure accurate tracking, including:

- Adding costs to the accumulated total
- Automatically resetting costs when the renewal period is reached
- Updating quota values when configuration changes

Benefits

- Fine-grained Cost Control: Track actual API usage costs rather than just request counts
- Flexible Quotas: Set daily, weekly, or monthly quotas with automatic renewal
- Transparent Usage: Response headers provide real-time quota usage information
- Product Differentiation: Different APIM products can have different quota levels
- Custom Pricing: Override default token costs for special pricing tiers
- Flexible Tracking: Use any identifier as the counter key for versatile quota management
- Time-based Scheduling: Define active periods and automatic reset windows for quota management

Getting Started

- Deploy the Azure Function with Cosmos DB integration
- Configure APIM policies to include rate limit configuration
- Set up different product policies for various quota levels

For a detailed implementation, visit our GitHub repository.

Demo Video: https://www.youtube.com/watch?v=vMX86_XpSAo

Tags: #AzureOpenAI #APIM #CosmosDB #RateLimiting #Serverless
278 Views, 0 likes, 2 Comments

Built-in Enterprise Readiness with Azure AI Agent Service
Ensure enterprise-grade security and compliance with Private Network Isolation (BYO VNet) in Azure AI Agent Service. This feature allows AI agents to operate within a private, isolated network, giving organizations full control over data and networking configurations. Learn how Private Network Isolation enhances security, scalability, and compliance for mission-critical AI workloads.
1.7K Views, 2 likes, 0 Comments

Introducing Azure AI Agent Service
Introducing Azure AI Agent Service at Microsoft Ignite 2024
Discover how Azure AI Agent Service is revolutionizing the development and deployment of AI agents. This service empowers developers to build, deploy, and scale high-quality AI agents tailored to business needs within hours. With features like rapid development, extensive data connections, flexible model selection, and enterprise-grade security, Azure AI Agent Service sets a new standard in AI automation.
64K Views, 10 likes, 8 Comments

Adopting Hybrid Search with Azure Cosmos DB
In an era where data accessibility and retrieval are crucial, Azure Cosmos DB introduces Hybrid Search, a cutting-edge feature that merges the capabilities of Vector Search and Full-Text Search. This integration enhances search relevance by combining semantic understanding with traditional keyword-based methods, making it ideal for diverse applications such as e-commerce, content management, and AI-driven chatbots. The blog provides a comprehensive guide on enabling and configuring Hybrid Search within Azure Cosmos DB, detailing the processes for setting up Vector Search and Full-Text Search. It also explores the underlying mechanics of Hybrid Search, which utilizes Reciprocal Rank Fusion (RRF) to combine multiple scoring functions for more accurate search results. Additionally, practical use cases and a step-by-step project example demonstrate how to implement an enterprise knowledge management system using Nest.js integrated with Azure Cosmos DB's Hybrid Search capabilities. This powerful combination offers developers and businesses the tools needed to create sophisticated, efficient, and intelligent search experiences within their applications.
347 Views, 2 likes, 1 Comment

[pt2] Choosing the right Data Storage Source (Under Preview) for Azure AI Search
This blog introduces new preview data sources for Azure AI Search, including Fabric OneLake Files, Azure Cosmos DB for Gremlin, Azure Cosmos DB for MongoDB, SharePoint, and Azure Files. Each data source supports incremental indexing, metadata extraction, and AI enrichment, making Azure AI Search more powerful for enterprise search applications.
161 Views, 1 like, 0 Comments

[pt1] Choosing the right Data Storage Source (Generally available) for Azure AI Search
When integrating Azure AI Search into your solutions, choosing the right data storage and data sources is crucial for efficient and scalable indexing. This blog dives into three primary data source connectors for Azure AI Search: Azure Blob Storage, Azure Cosmos DB for NoSQL, and Azure SQL Database. Each data source type offers distinct advantages and use cases depending on the structure of your data and the desired search functionality.
246 Views, 0 likes, 0 Comments

Full-Text Search in Azure Cosmos DB
Full-Text Search in Azure Cosmos DB: A Powerful Way to Enhance Search Capabilities
Searching through vast amounts of unstructured data can be challenging, but Full-Text Search simplifies this by allowing advanced querying beyond simple keyword matching. Now available in Azure Cosmos DB for NoSQL (Preview), this feature enables faster, more accurate searches using techniques like tokenization, stemming, stop-word removal, and indexing.

How It Works

Full-text search operates in two stages:

- Indexing: Text is analyzed, broken into tokens, and indexed for efficient retrieval.
- Searching: Queries are run against the index using functions like FullTextContains, FullTextContainsAll, and FullTextScore, allowing ranked and relevant results.
391 Views, 1 like, 0 Comments

Six reasons why startups and at-scale cloud native companies build their GenAI Apps with Azure
Azure has become a platform of choice for many startups, including Perplexity and Moveworks, as well as for at-scale companies. Here are six reasons why we see companies of all sizes building their GenAI apps on Azure OpenAI Service.
3.1K Views, 2 likes, 0 Comments

Enhancing E-Commerce Product Search with Vector Similarity in Azure Cosmos DB
Learn how to implement vector similarity search in your e-commerce API using Azure Cosmos DB and TypeScript. Boost search accuracy and user experience with advanced embedding techniques and scalable NoSQL solutions.
939 Views, 0 likes, 0 Comments

Develop a Library Web API: Integrating Azure Cosmos DB for MongoDB with ASP.NET Core
As a software developer, you’re always seeking ways to build scalable, high-performance applications. Azure Cosmos DB for MongoDB offers the flexibility of MongoDB with the reliability and global reach of Azure. In this blog, we’ll explore how to integrate Azure Cosmos DB for MongoDB with your ASP.NET Core application, walking through the key steps for setting up a simple API to perform CRUD operations. By leveraging this powerful combination, you can streamline your development process and unlock new possibilities for your data-driven projects. In our previous blog, we delved into the capabilities of Azure Cosmos DB for MongoDB using the Open MongoDB shell in the Azure portal. I highly recommend checking it out to understand the fundamentals.
2.6K Views, 0 likes, 0 Comments