Forum Discussion
Cosmos DB multi-region writes and creating a globally unique value
Your reasoning is incredibly solid, and you're tackling one of the trickiest problems in distributed systems: achieving global uniqueness under eventual consistency.
Multi-Region Writes and Conflicts
You're absolutely right: when you enable multi-region writes, each region accepts writes independently. Even with a conflict resolution policy in place (Cosmos DB's "last write wins" or a custom policy), two users who register the same handle nearly simultaneously in different regions can both succeed locally. Conflict resolution then kicks in after replication and discards one of the writes, which leads to the surprise you mentioned: a user who was told the handle was theirs loses it. That's a real UX risk.
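To make the race concrete, here is a minimal sketch in plain Python that simulates it. Nothing here talks to Cosmos DB: `Region`, `replicate_lww`, the region names, and the integer timestamps are all hypothetical stand-ins, and the merge is only a simplified model of a last-write-wins policy.

```python
from dataclasses import dataclass


@dataclass
class Write:
    handle: str   # the contested username
    user_id: str  # who tried to claim it
    region: str   # region that accepted the write
    ts: int       # timestamp compared by last-write-wins


class Region:
    """Hypothetical stand-in for one write region's local copy of the data."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.items: dict[str, Write] = {}

    def register(self, handle: str, user_id: str, ts: int) -> bool:
        # The uniqueness check only sees this region's local state.
        if handle in self.items:
            return False
        self.items[handle] = Write(handle, user_id, self.name, ts)
        return True


def replicate_lww(regions: list[Region]) -> None:
    """Merge all regions; for each handle the write with the highest ts wins."""
    winners: dict[str, Write] = {}
    for region in regions:
        for handle, write in region.items.items():
            current = winners.get(handle)
            if current is None or write.ts > current.ts:
                winners[handle] = write
    for region in regions:
        region.items = dict(winners)


west = Region("west-us")
east = Region("east-asia")

# Both registrations succeed locally: neither region has seen the other write yet.
alice_ok = west.register("neo", "alice", ts=100)
bob_ok = east.register("neo", "bob", ts=101)

# After replication the later write wins everywhere.
replicate_lww([west, east])
owner = west.items["neo"].user_id  # "bob": alice's accepted write was overwritten
```

Both callers were told "success", yet only one of them owns the handle once the regions converge.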
Why Strong Consistency Isn't Enough
Strong consistency can mitigate this, but only with a single write region. Cosmos DB does not allow strong consistency on an account with multiple write regions, so you're limited to the weaker levels: bounded staleness, session, consistent prefix, or eventual. Even with bounded staleness there's a window for race conditions, because each region accepts a write before it has seen the competing write from the other region.
Your Proposed Solution: Single Write Region
You're spot on: keeping the username registration logic centralized in a single write region is the safest strategy. The tentative query from a nearby read replica improves latency, but the final write must go through the authoritative region to enforce uniqueness reliably.
Here’s a refined breakdown of your architecture:
- Clients perform a pre-check (tentative query) via local read regions.
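- Final write request is routed to the master region, where you call CreateItem with the username as both the item id and the partition key. Cosmos DB enforces id uniqueness within a logical partition, so a second create for the same handle fails with a 409 Conflict and you know the handle's taken: clean and atomic.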
- Optionally, you could implement a retry mechanism that asks users for a new handle if there's a conflict.
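The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not SDK code: `Container`, `HandleTaken`, and `register_handle` are hypothetical names, and the in-memory dictionary stands in for a container whose item id is the username. With the real service, the duplicate create would surface as a 409 Conflict instead of `HandleTaken`.

```python
class HandleTaken(Exception):
    """Raised when the authoritative create finds the handle already exists."""


class Container:
    """Hypothetical in-memory stand-in for a container keyed by item id."""

    def __init__(self) -> None:
        self.items: dict[str, dict] = {}

    def exists(self, item_id: str) -> bool:
        return item_id in self.items

    def create_item(self, item: dict) -> None:
        # Create-only semantics: never overwrite an existing id.
        if item["id"] in self.items:
            raise HandleTaken(item["id"])
        self.items[item["id"]] = item


def register_handle(handle: str, user_id: str,
                    read_replica: Container, write_region: Container) -> bool:
    """Return True if the handle was claimed, False if the user must pick another."""
    # Step 1: cheap tentative check against a nearby, possibly stale, replica.
    if read_replica.exists(handle):
        return False
    # Step 2: the authoritative create in the single write region decides.
    try:
        write_region.create_item({"id": handle, "username": handle, "userId": user_id})
    except HandleTaken:
        return False  # Step 3: the caller prompts for a different handle.
    return True


write_region = Container()
stale_replica = Container()  # has not received any replicated data yet

first = register_handle("neo", "alice", stale_replica, write_region)
second = register_handle("neo", "bob", stale_replica, write_region)
```

Here `first` is `True` and `second` is `False` even though the stale replica reported the handle as free both times. The pre-check is only a latency optimization; correctness comes entirely from the create in the write region.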
Alternatives (with trade-offs):
- Centralized Registration Service: Use a dedicated service (maybe an Azure Function or API) in the master region that clients call for username registration.
- Leasing-based system: Use a temporary hold or reservation on handles before final commit, but this gets complex fast in distributed environments.
- Globally coordinated lock or cache (like Redis with the Redlock locking algorithm): Advanced but potentially overkill for this case.
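For the leasing option, a rough sketch of what a reservation with expiry might look like is below. It assumes the reservations live in one authoritative store, so it does not remove the need for a single point of truth; it only adds a hold step before the final commit. `ReservationStore`, `FakeClock`, and the 30-second TTL are hypothetical choices for illustration.

```python
class ReservationStore:
    """Hypothetical single-region store that holds a handle for a limited time."""

    def __init__(self, ttl_seconds: float, clock) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injected so expiry can be exercised without sleeping
        self.holds: dict[str, tuple[str, float]] = {}  # handle -> (user, expires_at)
        self.committed: dict[str, str] = {}            # handle -> user

    def reserve(self, handle: str, user_id: str) -> bool:
        now = self.clock()
        if handle in self.committed:
            return False
        hold = self.holds.get(handle)
        if hold is not None and hold[1] > now and hold[0] != user_id:
            return False  # someone else holds an unexpired reservation
        self.holds[handle] = (user_id, now + self.ttl)
        return True

    def commit(self, handle: str, user_id: str) -> bool:
        now = self.clock()
        hold = self.holds.get(handle)
        if hold is None or hold[0] != user_id or hold[1] <= now:
            return False  # no hold, wrong owner, or the hold expired
        del self.holds[handle]
        self.committed[handle] = user_id
        return True


class FakeClock:
    """Manually advanced clock used only for this example."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now


clock = FakeClock()
store = ReservationStore(ttl_seconds=30, clock=clock)

alice_holds = store.reserve("neo", "alice")  # True
bob_blocked = store.reserve("neo", "bob")    # False while alice's hold is live
clock.now = 31                               # alice's hold has expired
bob_holds = store.reserve("neo", "bob")      # True
alice_commit = store.commit("neo", "alice")  # False, she no longer holds it
bob_commit = store.commit("neo", "bob")      # True
```

Even this small version needs expiry handling, ownership checks, and a separate commit step, which is where the complexity mentioned above comes from.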
So yes—your reasoning not only makes sense, it's the go-to approach used in many large-scale apps to handle unique user identifiers.