Cosmos DB Patterns for Performance Efficiency
Published Jan 11 2023 04:07 AM


Performance is one of the key reasons to use Azure Cosmos DB. Traditional databases are often limited in their ability to scale up resources like CPU, RAM, disk and IOPS. By distributing workloads into multiple partitions, Cosmos DB databases can exceed the limits of vertical scaling and provide high performance no matter the size of the use case.
However, properly designing for distributed databases is essential to make sure the technology delivers what it promises.
In this article, we discuss patterns and features designed to improve your Cosmos DB usage from the perspective of Performance Efficiency. This is not a complete list, but it provides some of the most important design considerations for you to get started. Keep in mind, these recommendations are targeted at the Cosmos DB NoSQL API, but may apply to other account types as well.

Model data for distributed storage


Modeling is how you fit your data into the Cosmos DB resource model: Accounts, Databases, Containers and Items. In a Cosmos DB NoSQL database, data is not restricted to a tabular format with a designated schema - this, however, does not mean there are no rules for how you should organize your data. When it comes to data modeling, consider the following recommendations:


  • If starting from scratch, try to visualize data in an embedded format. That is, a nested JSON structure representing one-to-one and one-to-many relationships. For example, say your model has a Person entity, which contains many Addresses and ContactDetails. This is what your Person document could look like in Cosmos DB:
        {
            "id": "1",
            "firstName": "Thomas",
            "lastName": "Andersen",
            "addresses": [
                {
                    "line1": "100 Some Street",
                    "line2": "Unit 1",
                    "city": "Seattle",
                    "state": "WA",
                    "zip": 98012
                }
            ],
            "contactDetails": [
                {"email": ""},
                {"phone": "+1 555 555-5555", "extension": 5555}
            ]
        }

    This format is efficient for reads - notice how you can read a Person document along with all related entities in a single read operation.

  • Embedding will not be a good design for all data. Here are a few reasons it may not work: 
    • Modeling many-to-many relationships would cause high repetition across documents;
    • Unbounded one-to-many relationships may cause documents to be too large, increasing cost and hindering performance;
    • Updating related entities frequently will require you to update a large document often.
  • When the embedded format does not work, consider using normalization - or putting related entities into separate documents, referenced by a foreign key. Here's an example for modeling books and reviews in Cosmos DB:
    Book documents:
        {
            "id": "b1",
            "name": "Azure Cosmos DB 101",
            "bookId": "b1",
            "type": "book"
        }
    Review documents:
        {
            "id": "r1",
            "content": "This book is awesome",
            "bookId": "b1",
            "type": "review"
        }
        {
            "id": "r2",
            "content": "Best book ever!",
            "bookId": "b1",
            "type": "review"
        }
  • The normalized format is efficient for writes - notice above how a new review may be inserted without updating a large document;
  • Related documents often benefit from being in the same container. This will allow you to place them in the same logical partition, thus enabling stored procedures and overall faster queries over multiple documents. Use different containers for different sections of your application so that they may scale independently from each other.
  • Be wary of large documents. Most operations will become slower and more expensive as documents grow in size. If documents grow over 100 kB, there is likely room for modeling optimizations.
  • Avoid duplicating information. Any data that is duplicated will require additional logic and resources to be kept in sync. When duplication is unavoidable, make an effort to keep duplicated data constrained to a single container/partition, and use triggers and stored procedures, or even change feeds to keep data in sync. Duplication is not always bad - keeping a short list of child entities or pre-calculated aggregates about them on the parent entity document is a common technique for improving performance;
  • Avoid using application logic for data integrity. Operations that require multiple queries to complete should be carried out by stored procedures. Those are likely to be more efficient, and they keep data valid at all times by rolling back changes in case of errors.
  • If using Analytical Store and Synapse Link, keep in mind you are simultaneously editing the transactional and analytical models.
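The normalized model and the document-size guideline above can be sketched in a few lines of Python. The documents come from the example above; the helper names (`reviews_for`, `is_oversized`) are illustrative only, and the query is a local stand-in for a Cosmos DB SQL query rather than an SDK call.

```python
import json

# Normalized model from the example above: reviews reference their book
# via "bookId", so new reviews are cheap inserts rather than large updates.
documents = [
    {"id": "b1", "name": "Azure Cosmos DB 101", "bookId": "b1", "type": "book"},
    {"id": "r1", "content": "This book is awesome", "bookId": "b1", "type": "review"},
    {"id": "r2", "content": "Best book ever!", "bookId": "b1", "type": "review"},
]

def reviews_for(book_id, docs):
    """Local stand-in for: SELECT * FROM c WHERE c.bookId = @id AND c.type = 'review'."""
    return [d for d in docs if d["bookId"] == book_id and d["type"] == "review"]

def is_oversized(doc, limit_kb=100):
    """Flag documents above the ~100 kB guideline mentioned above."""
    return len(json.dumps(doc).encode("utf-8")) > limit_kb * 1024

print(len(reviews_for("b1", documents)))        # 2 reviews for book b1
print(any(is_oversized(d) for d in documents))  # False - all documents are small
```

Keeping a size check like `is_oversized` in your test suite is a cheap way to catch unbounded one-to-many growth before it hurts performance.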


Choose optimal partition keys


Partition keys define how your data and workload get distributed across logical and physical partitions. Good partition design is essential to making the most out of Cosmos DB. Also, keep in mind partition keys are not easily edited - once a container is created, the only way to change its partition key is to recreate it and migrate data over. Here are some tips for choosing your partition keys:


  •  Start by ruling out any partition key candidates that do not meet the requirements:
    • Partition keys must be string values;
    • A document's partition key value cannot be edited after it's inserted;
    • Logical partitions are limited to 20 GB (considering all documents with that partition key value)
  • Next, consider the following best practices:
    • Choose a property commonly used in filters. Queries spanning a single partition are much faster and cheaper;
    • Choose partition keys with high cardinality (number of distinct values). This tends to reduce partition size;
    • Make sure the heaviest workloads get evenly distributed across partitions;
    • Avoid time and location-related fields. Those are often unbalanced and change over time;
    • Combine data modeling and partition design when planning stored procedures and triggers. Remember those can only span a single partition.
    • Primary keys are often good choices for partition keys in simple containers. More complex designs often have better candidates.
    • When no single candidate can be chosen, consider synthetic or hierarchical partition keys;
    • You should not need to worry about physical partitions in most cases. Cosmos DB will allocate logical partitions into physical partitions, and redistribute them when necessary. This is a fully managed component of Cosmos DB.
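A synthetic partition key, mentioned above, is simply a property computed from two or more existing ones. The sketch below builds one from a hypothetical device ID and date bucket (both names are illustrative), then does a quick skew check by counting documents per logical partition:

```python
from collections import Counter

# Hypothetical telemetry documents: "deviceId" alone has low cardinality,
# so a synthetic key combines it with a date bucket to spread the load.
def synthetic_key(device_id, date):
    return f"{device_id}-{date}"

docs = [
    {"deviceId": f"dev{i % 4}", "date": f"2023-01-{day:02d}"}
    for i in range(100) for day in (1, 2)
]
for d in docs:
    d["partitionKey"] = synthetic_key(d["deviceId"], d["date"])

# Quick skew check: count documents per logical partition value.
sizes = Counter(d["partitionKey"] for d in docs)
print(len(sizes))           # 8 distinct logical partitions (4 devices x 2 days)
print(max(sizes.values()))  # 25 documents in the largest partition
```

The trade-off: queries filtering only on `deviceId` now span multiple partitions, so pick the components of a synthetic key from the filters your heaviest queries already use.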


Optimize indexing policies


Indexes create auxiliary data structures to improve read performance at the expense of write performance. All properties in Cosmos DB are indexed by default, but you can change this behavior by creating custom indexing policies at the container level.


  • Eliminate indexes on fields that are not used for queries. This will greatly improve performance for most write operations;
  • Choose the appropriate index type for each property:
    • Range indexes (the default type) are used for general equality, range, and string function queries, as well as ordering and joining;
    • Spatial indexes are more suited to spatial data. E.g. checking whether a point is within bounds;
    • Composite indexes increase efficiency for queries on multiple fields.
  • Temporarily disable indexes when loading large amounts of data into Cosmos DB - especially data migrations.
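A custom indexing policy is a JSON document set at the container level. The sketch below excludes a hypothetical large free-text field from the default "index everything" behavior and adds a composite index for a common multi-field sort; the property paths are illustrative, not from this article:

```python
import json

# A sketch of a container-level indexing policy: keep the default range
# indexes on everything except a bulky free-text field, and add a
# composite index to serve a two-field ORDER BY efficiently.
indexing_policy = {
    "indexingMode": "consistent",
    "automatic": True,
    "includedPaths": [{"path": "/*"}],
    "excludedPaths": [{"path": "/description/?"}],
    "compositeIndexes": [
        [
            {"path": "/lastName", "order": "ascending"},
            {"path": "/firstName", "order": "ascending"},
        ]
    ],
}

print(json.dumps(indexing_policy, indent=2))
```

This document is what you would pass as the container's indexing policy (for example, via the portal or your SDK's container-creation call).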

Choose the best pricing model


Cosmos DB comes in three flavors when it comes to pricing: Provisioned Throughput, Autoscale Throughput and Serverless. Each pricing model has its pros and cons, and your choice can directly impact the performance and cost of your deployment. Consider the following before choosing a pricing model:


  • With Provisioned Throughput, you will choose a fixed amount of RU/s, and if demand exceeds this value, requests will be throttled. Use this for stable or predictable workloads where you can easily schedule scaling to match demand.
  • With Autoscale Throughput, you will set a minimum and maximum amount of RU/s, and Cosmos DB will automatically scale within that range to match your application's demand. Use it for variable or unpredictable workloads, or to simplify the process of scaling up/down RU throughput;
  • With Serverless, you are charged a flat rate per million RUs consumed. Use it for bursty, unpredictable traffic, or for development;
  • While Autoscale Throughput may be the most versatile option, be aware that it is significantly more expensive per RU/s than Provisioned Throughput. If your average RU consumption is higher than 70% of the peak usage, Provisioned Throughput is likely to be a better choice.
  • You may freely switch between Provisioned Throughput and Autoscale Throughput, but Serverless can only be chosen on account creation;


Define data expiration with TTL

TTL (time-to-live) is a configuration that automatically removes data after it reaches a certain age. Archive data can be moved to a cheaper form of storage once it is no longer in active use, freeing resources for the data that matters most.


  • Consider TTL for data that can safely expire without impacting your application;
  • Transactional and Analytical TTL can be set at different values without interfering with each other;
  • TTL can be set at the container or at the item level.
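The container-level and item-level settings interact according to documented resolution rules, sketched below as a small pure function: a container default of `None` disables TTL entirely, `-1` means no default expiry, a number is the default lifetime in seconds, and an item-level `"ttl"` property overrides the default.

```python
# A sketch of Cosmos DB's TTL resolution rules.
def expires_after(container_default_ttl, item):
    """Return the item's effective lifetime in seconds, or None if it never expires."""
    if container_default_ttl is None:
        return None                      # TTL disabled on the container
    ttl = item.get("ttl", container_default_ttl)
    return None if ttl == -1 else ttl    # -1 on an item means "never expire"

print(expires_after(None, {"ttl": 60}))   # None - container TTL is off
print(expires_after(-1, {}))              # None - no default, item never expires
print(expires_after(3600, {}))            # 3600 - container default applies
print(expires_after(3600, {"ttl": 60}))   # 60 - item-level override wins
```

Note the first case: item-level `ttl` values are ignored unless TTL is enabled on the container.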


Decouple analytical workloads with Synapse Link


Analytical queries - especially those spanning multiple partitions - are likely to be expensive or unsupported under Cosmos DB's resource model. Analytical store is a way of using Cosmos DB data for analytics without impacting transactional RU consumption, and can be used to decouple analytical workloads from your transactional application's performance.
  • Analytical Store makes data available through Synapse SQL and Spark pools;
  • Consider analytical store when you need to run queries that:
    • Return large amounts of data
    • Join data from multiple containers
    • Aggregate data over many documents
  • Analytical store is updated within 2 minutes of any transactional store updates, allowing for near-real-time analytics.

Use Integrated Cache with Dedicated Gateways


You may find situations where your application is constantly reading data that rarely gets updated. In this scenario, you may want to consider using Integrated Cache, which creates a transparent cache layer that does not consume RUs on cache hits.


  • Integrated cache requires creating a Dedicated Gateway resource;
  • Consider integrated cache for read-heavy containers that are not updated frequently;
  • Limitations:
    • Dedicated gateways are only supported on API for NoSQL accounts
    • You can't provision a dedicated gateway in Azure Cosmos DB accounts with availability zones.
    • You can't use role-based access control (RBAC) to authenticate data plane requests routed through the dedicated gateway
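The billing behavior that makes the integrated cache attractive can be illustrated with a toy simulation: point reads served from the cache consume no RUs, so repeated reads of rarely-updated items only pay for the first (miss) read. This is a conceptual sketch, not the SDK's actual cache, and the RU charge is illustrative:

```python
# Toy simulation of the integrated cache's billing behavior: cache hits
# consume zero RUs, only the initial miss is charged against throughput.
class CachedReader:
    READ_COST_RU = 1.0  # illustrative point-read charge on a cache miss

    def __init__(self, store):
        self.store = store       # stand-in for the Cosmos DB backend
        self.cache = {}
        self.rus_consumed = 0.0

    def read(self, item_id):
        if item_id in self.cache:
            return self.cache[item_id]       # cache hit: zero RUs
        self.rus_consumed += self.READ_COST_RU
        self.cache[item_id] = self.store[item_id]
        return self.cache[item_id]

reader = CachedReader({"p1": {"id": "p1", "price": 10}})
for _ in range(100):
    reader.read("p1")
print(reader.rus_consumed)  # 1.0 - only the first of 100 reads was charged
```

In the real service, staleness is bounded by a configurable window rather than being unbounded as in this toy, so reads can still observe slightly old data within that window.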

Choose the best connectivity mode


Applications may connect to Cosmos DB in two ways - gateway mode and direct mode. In simple terms, gateway mode uses an intermediary component - the gateway - to connect to backend nodes. This has implications for networking and performance that need to be taken into consideration when choosing a connectivity mode:


  • Gateway mode is more suitable for applications with strict firewall restrictions, as it communicates through specific network ports and uses a single DNS endpoint. It's also the only connection mode supported by all SDKs, and the only way to take advantage of Integrated Cache;
  • Direct mode tends to have better performance, due to fewer network hops. However, connections happen through a wider range of network ports, and direct mode is currently supported only in the Java and .NET SDKs.




How many practices and features above are you already applying to your Cosmos DB deployment? Are there any other best practices you've successfully implemented? Leave a comment and let me know!
