# Deploying Azure Redis Enterprise with Geo-Replication Using Terraform
This post walks through a production-proven pattern for running stateful services across Azure regions using Terraform. We'll cover a primary–replica Redis architecture, regional isolation with Key Vault and networking, and a clean Terraform parameterization strategy that scales from development to production without duplication.

## Why Multi-Region State Is Hard

Running applications globally is easy when everything is stateless: if something fails, you redeploy. But stateful services tell a different story. Caches, message brokers, and data stores can't be treated as disposable. They hold business-critical data, and downtime or inconsistency quickly becomes customer-visible.

In real-world systems, common requirements include:

- Low-latency reads from multiple regions
- Automatic recovery when a region becomes unavailable
- Predictable data consistency
- Repeatable infrastructure from dev through production

Manually configuring this per region doesn't scale. Drift sets in. Failover is unclear. Backups get forgotten. That's where Terraform plus Azure Managed Redis geo-replication shines.

GitHub link: https://github.com/vsakash5/Managed-redis.git

## High-Level Architecture

We use a primary–replica Redis Enterprise model.

**Primary Redis**

- Single write endpoint
- Highly available inside its region
- Source of truth

**Replica Redis**

- Read-only
- Asynchronously synced from the primary
- Can be promoted during disaster recovery

Each region is fully isolated:

- Separate subnets
- Separate Key Vaults
- Private endpoints only (no public exposure)

This prevents shared failure domains and allows each region to operate independently if needed.

## The Terraform Design Principle

Instead of maintaining separate Terraform stacks per region, the key idea is: one reusable module, one tfvars file per environment, multiple regions inside it.

The module is written once. Regional differences are supplied via parameter suffixes like:

- `_replica`
- `_secondary`
- `_tertiary`

This keeps logic centralized and environments consistent.
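As a minimal sketch of this suffix-based pattern (the module path and variable names here are illustrative assumptions, not the actual contents of the linked repository), the same module can be applied once per region from a single root configuration:

```hcl
# Sketch only: module source and variable names are assumptions,
# not the repository's actual code.

# variables.tf — one set of inputs per region, distinguished by suffix
variable "location"                    { type = string }
variable "resource_group_name"         { type = string }
variable "location_replica"            { type = string }
variable "resource_group_name_replica" { type = string }

# main.tf — the same reusable module applied twice, with no branching logic
module "redis_primary" {
  source              = "./modules/managed-redis"
  location            = var.location
  resource_group_name = var.resource_group_name
}

module "redis_replica" {
  source              = "./modules/managed-redis"
  location            = var.location_replica
  resource_group_name = var.resource_group_name_replica
}
```

With this shape, every regional difference lives in the per-environment tfvars file, and adding a region means adding another suffixed set of variables rather than another stack.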
## Core Parameter Layers

### 1. Environment Identity (Shared)

```hcl
environment    = "dev" # dev | staging | prod
context_prefix = "app"
```

These values are reused everywhere: names, tags, and identifiers.

### 2. Primary Region

```hcl
location            = "eastus2"
resource_group_name = "rg-app-dev-primary"
```

### 3. Replica Region

```hcl
location_replica            = "uksouth"
resource_group_name_replica = "rg-app-dev-replica"
```

The symmetry is intentional. Terraform can now apply the same module twice without branching logic.

## Regional Isolation: Networking and Secrets

### Why isolation matters

Geo-replication copies data, not dependencies. If both Redis instances depend on:

- the same subnet
- the same Key Vault

then a failure in one region can cascade into the other.

### Networking (One Subnet per Region)

Benefits:

- Independent NSGs
- Independent routing
- Independent capacity planning

### Key Vault (One per Region)

Why this matters:

- Redis credentials are not replicated
- Each region stores its own secrets
- A Key Vault outage doesn't take both regions down

## Redis Configuration

### Primary Redis (Writes Enabled)

The geo-replication group name must match across instances. That's the logical binding Azure uses to link them.

### Private Endpoint-Only Access

No Redis instance is exposed publicly. Each region uses:

- A private endpoint
- A workload subnet
- Internal DNS resolution

This means:

- No public IPs
- No inbound attack surface
- Traffic stays on the Azure backbone

## Linking Primary and Replica

Terraform explicitly defines the relationship:

```hcl
managed_redis_geo_replication_config = {
  primary_to_replica = {
    primary_redis_key = "primary"
    replica_keys      = ["replica"]
  }
}
```

Terraform ensures:

1. The primary is created first
2. The replica is deployed second
3. Geo-replication is established last

## Environment Scaling: Dev → Staging → Prod

The infrastructure pattern never changes. Only values do.

| Environment | Group Name |
|-------------|------------|
| Dev         | dev-grp    |
| Staging     | stg-grp    |
| Prod        | prod-grp   |

This is how you avoid "snowflake" environments.
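Under the hood, a module like this typically maps that geo-replication config onto the azurerm provider's Redis Enterprise resources. A hedged sketch (this is not the repository's code; the argument names follow the azurerm provider's documented schema, but verify them against your provider version):

```hcl
# Sketch only, not the repository's module code. Assumes clusters named
# "primary" and "replica" already exist as azurerm_redis_enterprise_cluster
# resources in this configuration.

resource "azurerm_redis_enterprise_database" "replica" {
  name       = "default"
  cluster_id = azurerm_redis_enterprise_cluster.replica.id

  # The logical binding Azure uses to link instances: every member of the
  # geo-replication group must use the exact same nickname.
  linked_database_group_nickname = "dev-grp"

  # Listing both members creates the link; referencing the primary's
  # cluster also gives Terraform the create-primary-first dependency order.
  linked_database_id = [
    "${azurerm_redis_enterprise_cluster.primary.id}/databases/default",
    "${azurerm_redis_enterprise_cluster.replica.id}/databases/default",
  ]
}
```

The nickname plays the role of the geo-replication group name from the tfvars config above, which is why matching it across regions is non-negotiable.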
## Disaster Recovery Strategy

If the primary region fails:

1. Applications fail over to the replica read endpoint.
2. The Terraform configuration is updated to remove geo-replication and promote the replica config to primary.
3. Traffic is fully restored.

Once the original region recovers, roles can be re-established cleanly. No click-ops. No guesswork.

## Key Lessons Learned

1. **Naming is infrastructure.** Predictable names enable automation, discovery, and auditing.
2. **Key Vault isolation beats availability.** A shared Key Vault is a shared outage.
3. **Parameterization beats copy-paste.** Fix once → benefit everywhere.
4. **Geo-replication is a contract.** Matching replication group names is non-negotiable.
5. **The tfvars file is the source of truth.** If it's not in Terraform, it's not real.

## Final Thoughts

Running stateful services in multiple regions doesn't require magic; it requires discipline:

- Isolate aggressively
- Parameterize consistently
- Automate everything
- Test failure often

With this approach, adding a new region becomes configuration, not redesign. That's how infrastructure scales.