Using Application Gateway to secure access to the Azure OpenAI Service: Customer success story
Introduction

A large enterprise customer set out to build a generative AI application using Azure OpenAI. While the app would be hosted on-premises, the customer wanted to leverage the latest large language models (LLMs) available through Azure OpenAI. However, they faced a critical challenge: how to securely access Azure OpenAI from an on-premises environment without private network connectivity or a full Azure landing zone. This blog post walks through how the customer overcame these limitations using Application Gateway as a reverse proxy in front of Azure OpenAI, along with other Azure services, to meet their security and governance requirements.

Customer landscape and challenges

The customer's environment lacked:
- Private network connectivity (no Site-to-Site VPN or ExpressRoute), because they were using a new Azure Government environment and did not yet have a cloud operations team in place
- A common network topology such as Virtual WAN or a hub-spoke network design
- A full Enterprise Scale Landing Zone (ESLZ) of common infrastructure
- Security components such as private DNS zones, DNS resolvers, API Management, and firewalls

This meant they couldn't use private endpoints or the other standard security controls typically available in mature Azure environments. Security was non-negotiable: public access to Azure OpenAI was unacceptable. The customer needed to:
- Restrict access to specific IP CIDR ranges from on-premises user machines and data centers
- Limit the ports used to communicate with Azure OpenAI
- Implement a reverse proxy with SSL termination and Web Application Firewall (WAF)
- Use a customer-provided SSL certificate to secure traffic

Proposed solution

To address these challenges, the customer designed a secure architecture using the following Azure components.

Key Azure services
- Application Gateway – Layer 7 reverse proxy, SSL termination, and Web Application Firewall (WAF)
- Public IP – Allows communication over the public internet between the customer's IP addresses and Azure IP addresses
- Virtual Network – Allows control of network traffic in Azure
- Network Security Group (NSG) – Layer 4 network controls such as port numbers and service tags, using five-tuple information (source, source port, destination, destination port, protocol)
- Azure OpenAI – Large language model (LLM) service

NSG configuration
- Inbound rules: Allow traffic only from specific IP CIDR ranges and HTTP(S) ports
- Outbound rules: Target AzureCloud.<region> with HTTP(S) ports (there is no service tag for Azure OpenAI yet)

Application Gateway setup
- SSL certificate: Issued by the customer's on-premises Certificate Authority
- HTTPS listener: Uses the on-premises certificate to terminate SSL
- Traffic flow: Decrypt incoming traffic, scan it with WAF, re-encrypt using a well-known Azure CA, and override the backend hostname
- Custom health probe: Configured to accept a 404 response from Azure OpenAI as healthy (since no dedicated health check endpoint exists)

Azure OpenAI configuration
- IP firewall restrictions: Only allow traffic from the Application Gateway subnet

Outcome

By combining Application Gateway, NSGs, and custom SSL configurations, the customer successfully secured their Azure OpenAI deployment without needing a full ESLZ or private connectivity. This approach enabled them to move forward with their generative AI app while maintaining enterprise-grade security and governance.
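To make the result concrete, here is a minimal client-side sketch of what a call through the gateway looks like. The hostname, deployment name, API version, and key handling are assumptions for illustration, not details from the customer's environment; TLS terminates at the Application Gateway listener, and the WAF inspects the decrypted request before it is re-encrypted to the Azure OpenAI backend.

```python
import requests

# Hypothetical gateway hostname fronting the Azure OpenAI backend pool;
# TLS here uses the customer-issued certificate bound to the HTTPS listener.
GATEWAY_HOST = "https://openai-gw.contoso.local"
DEPLOYMENT = "gpt-4o"        # assumed deployment name
API_VERSION = "2024-02-01"   # assumed API version

url = (
    f"{GATEWAY_HOST}/openai/deployments/{DEPLOYMENT}"
    f"/chat/completions?api-version={API_VERSION}"
)

# The api-key header is forwarded unchanged; the WAF scans the request body.
headers = {"api-key": "<key-from-key-vault>", "Content-Type": "application/json"}
body = {"messages": [{"role": "user", "content": "Hello"}], "max_tokens": 50}

resp = requests.post(url, headers=headers, json=body, timeout=30)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

From the client's perspective nothing changes except the hostname: the on-premises CA-issued certificate makes the gateway look like a trusted internal endpoint, while the gateway's backend settings handle hostname override and re-encryption toward Azure OpenAI.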
Microsoft Azure scales Hollow Core Fiber (HCF) production through outsourced manufacturing

Introduction

As cloud and AI workloads surge, the pressure on datacenter (DC), metro, and Wide Area Network (WAN) networks has never been greater. Microsoft is tackling the physical limits of traditional networking head-on. From pioneering research in microLED technologies to deploying Hollow Core Fiber (HCF) at global scale, Microsoft is reimagining connectivity to power the next era of cloud networking. Azure's HCF journey has been one of relentless innovation, collaboration, and a vision to redefine the physical layer of the cloud. Microsoft's HCF, based on the proprietary Double Nested Antiresonant Nodeless Fiber (DNANF) design, delivers up to 47% faster data transmission and approximately 33% lower latency compared to conventional Single Mode Fiber (SMF), bringing significant advantages to the network that powers Azure.

Today, Microsoft is announcing a major milestone: the industrial scale-up of HCF production, powered by new strategic manufacturing collaborations with Corning Incorporated (Corning) and Heraeus Covantics (Heraeus). These collaborations will enable Azure to increase global HCF production to meet the demands of its growing network infrastructure, advancing the performance and reliability customers expect for cloud and AI workloads.

Real-world benefits for Azure customers

Since 2023, Microsoft has deployed HCF across multiple Azure regions, with production links meeting performance and reliability targets. As manufacturing scales, Azure plans to expand deployment of the full end-to-end HCF network solution to help increase capacity, resiliency, and speed for customers, with the potential to set new benchmarks for latency and efficiency in fiber infrastructure.

Why it matters

Microsoft's proprietary HCF design brings the following improvements for Azure customers:
- Up to 47% faster data transmission with approximately 33% lower latency.
- Enhanced signal performance that improves data transmission quality.
- Improved optical efficiency, resulting in higher bandwidth rates compared to conventional fiber.

How Microsoft is making it possible

To operationalize HCF across Azure with production-grade performance, Microsoft is:
- Deploying a standardized HCF solution with end-to-end systems and components for operational efficiency, streamlined network management, and reliable connectivity across Azure's infrastructure.
- Ensuring interoperability with standard SMF environments, enabling seamless integration with existing optical infrastructure for faster deployment and scalable growth.
- Creating a multinational production supply chain to scale next-generation fiber production, ensuring the volumes and speed to market needed for widespread HCF deployment across the Azure network.

Scaling up and out

With Corning and Heraeus as Microsoft's first HCF manufacturing collaborators, Azure plans to accelerate deployment to meet surging demand for high-performance connectivity. These collaborations underscore Microsoft's commitment to enhancing its global infrastructure and delivering a reliable customer experience. They also reinforce Azure's continued investment in deploying HCF, with a vision for this technology to potentially set the global benchmark for high-capacity fiber innovation.
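A back-of-envelope calculation shows where figures of this magnitude come from: light in the glass core of standard SMF travels at roughly c divided by the group index, while in an air core it travels at close to c. The index values below are assumptions for illustration; the exact percentages depend on the fiber design.

```python
C = 299_792.458  # speed of light in vacuum, km/s

n_smf = 1.468    # assumed group index of standard single mode fiber
n_hcf = 1.003    # assumed effective index of an air-core fiber (close to 1)

v_smf = C / n_smf
v_hcf = C / n_hcf

speedup = v_hcf / v_smf - 1       # how much faster light propagates in HCF
latency_cut = 1 - v_smf / v_hcf   # corresponding one-way latency reduction

print(f"propagation speedup: {speedup:.0%}")     # ~46% faster
print(f"latency reduction:  {latency_cut:.0%}")  # ~32% lower
```

With these assumed indices the arithmetic lands at roughly 46% faster propagation and a 32% latency cut, consistent with the up-to-47% and approximately-33% figures quoted above.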
"This milestone marks a new chapter in reimagining the cloud's physical layer. Our collaborations with Corning and Heraeus establish a resilient, global HCF supply chain so Azure can deliver a standardized, world-class customer experience with ultra-low latency and high reliability for modern AI and cloud workloads." – Jamie Gaudette, Partner Cloud Network Engineering Manager at Microsoft

To scale HCF production, Microsoft will utilize Corning's established U.S. facilities, while Heraeus will produce out of its sites in both Europe and the U.S.

"Corning is excited to expand our longtime collaboration with Microsoft, leveraging Corning's fiber and cable manufacturing facilities in North Carolina to accelerate the production of Microsoft's Hollow Core Fiber. This collaboration not only strengthens our existing relationship but also underscores our commitment to advancing U.S. leadership in AI innovation and infrastructure. By working closely with Microsoft, we are poised to deliver solutions that meet the demands of AI workloads, setting new benchmarks for speed and efficiency in fiber infrastructure." – Mike O'Day, Senior Vice President and General Manager, Corning Optical Communications

"We started our work on HCF a decade ago, teamed up with the Optoelectronics Research Centre (ORC) at the University of Southampton and then with Lumenisity prior to its acquisition. Now, we are excited to continue working with Microsoft on shaping the datacom industry. With leading solutions in glass, tube, preform, and fiber manufacturing, we are ready to scale this disruptive HCF technology to significant volumes. We'll leverage our proven track record of taking glass and fiber innovations from the lab to widespread adoption, just as we did in the telecom industry, where approximately 2 billion kilometers of fiber are made using Heraeus products." – Dr. Jan Vydra, Executive Vice President Fiber Optics, Heraeus Covantics

Azure engineers are working alongside Corning and Heraeus to operationalize Microsoft's manufacturing process intellectual property (IP), deliver targeted training programs, and drive the yield, metrology, and reliability improvements required for scaled production. The collaborations are foundational to a growing standardized, global ecosystem that supports:
- Glass preform/tubing supply
- Fiber production at scale
- Cable and connectivity for deployment into carrier-grade environments

Building on a foundation of innovation: Microsoft's HCF program

In 2022, Microsoft acquired Lumenisity, a spin-out from the Optoelectronics Research Centre (ORC) at the University of Southampton, UK. That same year, Microsoft launched the world's first state-of-the-art HCF fabrication facility in the UK to expand production and drive innovation. This purpose-built site continues to support long-term HCF research, prototyping, and testing, ensuring that Azure remains at the forefront of HCF technology. Working with industry leaders, Microsoft has developed an end-to-end ecosystem of the components, equipment, and HCF-specific hardware necessary for deployment, proven in production operations.

Pushing the boundaries: recent breakthrough research

Today, the University of Southampton announced a landmark achievement in optical communications: in collaboration with Azure Fiber researchers, they have demonstrated the lowest signal loss ever recorded for optical fibers (<0.1 dB/km) using research-grade DNANF HCF technology (see figure 4).
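To put a sub-0.1 dB/km attenuation figure in context, the small sketch below converts a dB/km loss rate into the fraction of launched optical power left after an unamplified span, using the standard decibel loss formula. The SMF comparison value is an assumption for illustration.

```python
def remaining_power_fraction(alpha_db_per_km: float, length_km: float) -> float:
    """Fraction of launched optical power left after length_km of fiber."""
    return 10 ** (-alpha_db_per_km * length_km / 10)

span_km = 100
for label, alpha in [("typical SMF (~0.17 dB/km, assumed)", 0.17),
                     ("research-grade DNANF HCF (<0.1 dB/km)", 0.099)]:
    frac = remaining_power_fraction(alpha, span_km)
    print(f"{label}: {frac:.1%} of power left after {span_km} km")
```

Over a 100 km span, roughly 2% of the power survives at 0.17 dB/km versus about 10% at just under 0.1 dB/km, which is why lower attenuation translates directly into longer unamplified spans.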
This breakthrough, detailed in a research paper published in Nature Photonics earlier this month, paves the way for a potential revolution in the field, enabling unprecedented data transmission capacities and longer unamplified spans.

[Figure 4: Lowest fiber attenuation records at around 1550 nm – Nagayama et al. [1], Sato et al. [2], and research-grade DNANF HCF by Petrovich et al. [3]]

This breakthrough highlights the potential for this technology to transform global internet infrastructure and DC connectivity. Expected benefits include:
- Faster: Approximately 47% faster transmission, reducing latency and powering real-time AI inference, cloud gaming, and other interactive workloads.
- More capacity: A wider optical spectrum window enabling significantly greater bandwidth.
- Future-ready: Lays the groundwork for quantum-safe links, quantum computing infrastructure, advanced sensing, and remote laser delivery.

Looking ahead: Unlocking the future of cloud networking

The future of cloud networking is being built today! With record-breaking [3] fiber innovations, a rapidly expanding collaborative ecosystem, and the industrialized scale to deliver next-generation performance, Azure continues to evolve to meet the demands for speed, reliability, and connectivity. As we accelerate the deployment of HCF across our global network, we're not just keeping pace with the demands of AI and cloud, we're redefining what's possible.

References:
[1] Nagayama, K., Kakui, M., Matsui, M., Saitoh, T. & Chigusa, Y. Ultra-low-loss (0.1484 dB/km) pure silica core fibre and extension of transmission distance. Electron. Lett. 38, 1168–1169 (2002).
[2] Sato, S., Kawaguchi, Y., Sakuma, H., Haruna, T. & Hasegawa, T. Record low loss optical fiber with 0.1397 dB/km. In Proc. Optical Fiber Communication Conference (OFC) 2024 Tu2E.1 (Optica Publishing Group, 2024).
[3] Petrovich, M., Numkam Fokoua, E., Chen, Y., Sakr, H., Isa Adamu, A., Hassan, R., Wu, D., Fatobene Ando, R., Papadimopoulos, A., Sandoghchi, S., Jasion, G., & Poletti, F. Broadband optical fibre with an attenuation lower than 0.1 decibel per kilometre. Nat. Photon. (2025). https://doi.org/10.1038/s41566-025-01747-5

Useful Links:
- The Deployment of Hollow Core Fiber (HCF) in Azure's Network
- How hollow core fiber is accelerating AI | Microsoft Azure Blog
- Learn more about Microsoft global infrastructure
Azure Networking Portfolio Consolidation

Overview

Over the past decade, Azure Networking has expanded rapidly, bringing incredible tools and capabilities to help customers build, connect, and secure their cloud infrastructure. But we've also heard strong feedback: with over 40 different products, it hasn't always been easy to navigate and find the right solution. The complexity often led to confusion, slower onboarding, and missed capabilities. That's why we're excited to introduce a more focused, streamlined, and intuitive experience across Azure.com, the Azure portal, and our documentation, organized around four core networking scenarios:

- Network foundations: Network foundations provide the core connectivity for your resources, using Virtual Network, Private Link, and DNS to build the foundation for your Azure network. Try it with this link: Network foundations
- Hybrid connectivity: Hybrid connectivity securely connects on-premises, private, and public cloud environments, enabling seamless integration, global availability, and end-to-end visibility, presenting major opportunities as organizations advance their cloud transformation. Try it with this link: Hybrid connectivity
- Load balancing and content delivery: Load balancing and content delivery helps you choose the right option to ensure your applications are fast, reliable, and tailored to your business needs. Try it with this link: Load balancing and content delivery
- Network security: Securing your environment is just as essential as building and connecting it. The Network Security hub brings together Azure Firewall, DDoS Protection, and Web Application Firewall (WAF) to provide a centralized, unified approach to cloud protection. With unified controls, it helps you manage security more efficiently and strengthen your security posture. Try it with this link: Network security

This new structure makes it easier to discover the right networking services and get started with just a few clicks, so you can focus more on building and less on searching.

What you'll notice:
- Clearer starting points: Azure Networking is now organized around four core scenarios and twelve essential services, reflecting the most common customer needs. Additional services are presented within the context of these scenarios, helping you stay focused and find the right solution without feeling overwhelmed.
- Simplified choices: We've merged overlapping or closely related services to reduce redundancy. That means fewer, more meaningful options that are easier to evaluate and act on.
- Sunsetting outdated services: To reduce clutter and improve clarity, we're sunsetting underused offerings such as white-label CDN services and China CDN. These capabilities have been rolled into newer, more robust services, so you can focus on what's current and supported.

What this means for you
- Faster decision-making: With clearer guidance and fewer overlapping products, it's easier to discover what you need and move forward confidently.
- More productive sales conversations: With this simplified approach, you'll get more focused recommendations and less confusion among sellers.
- Better product experience: This update makes the Azure Networking portfolio more cohesive and consistent, helping you get started quickly, stay aligned with best practices, and unlock more value from day one.

The portfolio consolidation initiative is a strategic effort to simplify and enhance the Azure Networking portfolio, ensuring better alignment with customer needs and industry best practices.
By focusing on top-line services, combining related products, and retiring outdated offerings, Azure Networking aims to provide a more cohesive and efficient product experience.

Azure.com

Before: Our original solution page on Azure.com was disorganized and static, displaying a small portion of services in no discernible order.
After: The revised solution page is now dynamic, allowing customers to click deeper into each networking and network security category and displaying the top-line services, simplifying the customer experience.

Azure Portal

Before: With over 40 networking services available, we know it can feel overwhelming to figure out what's right for you and where to get started.
After: To make it easier, we've introduced four streamlined networking hubs, each built around a specific scenario, to help you quickly identify the services that match your needs. Each offers an overview to set the stage, key services to help you get started, guidance to support decision-making, and a streamlined left-hand navigation for easy access to all services and features.

Documentation

For documentation, we reviewed our existing assets and created new ones aligned with the changes in the portal experience. As with Azure.com, we found the old experiences were disorganized and poorly aligned. We updated our assets to focus on our top-line networking services and to call out the pillars. We believe these changes will allow our customers to more easily find the relevant and important information they need for their Azure infrastructure.

Azure Network Hub
Before the updates, the hub page was organized around loose categories and not well laid out. The updated hub page provides relevant links for top-line services within all of the Azure networking scenarios, as well as a section linking to each scenario's hub page.

Scenario Hub pages
We added scenario hub pages for each of the scenarios. These provide our customers with a central hub for information about the top-line services in each scenario and how to get started. We also included common scenarios and use cases, along with references for deeper learning across the Azure Architecture Center, Well-Architected Framework, and Cloud Adoption Framework libraries.

Scenario Overview articles
We created new overview articles for each scenario. These articles introduce the services included in each scenario, provide guidance on choosing the right solutions, and introduce the new portal experience. Here's the Load balancing and content delivery overview:

Documentation links
Azure Networking hub page: Azure networking documentation | Microsoft Learn
Scenario Hub pages:
- Azure load balancing and content delivery | Microsoft Learn
- Azure network foundation documentation | Microsoft Learn
- Azure hybrid connectivity documentation | Microsoft Learn
- Azure network security documentation | Microsoft Learn
Scenario Overview pages:
- What is load balancing and content delivery? | Microsoft Learn
- Azure Network Foundation Services Overview | Microsoft Learn
- What is hybrid connectivity? | Microsoft Learn
- What is Azure network security? | Microsoft Learn

Improving the user experience is a journey, and in the coming months we plan to do more. Watch for more blogs over the next few months covering further improvements.
GA: Enhanced Audit in Azure Security Baseline for Linux

We're thrilled to announce the General Availability (GA) of the enhanced Azure Security Baseline for Linux—a major milestone in cloud-native security and compliance. This release brings powerful, audit-only capabilities to over 1.6 million Linux devices across all Azure regions, helping enterprise customers and IT administrators monitor and maintain secure configurations at scale.

What Is the Azure Security Baseline for Linux?

The Azure Security Baseline for Linux is a set of pre-configured security recommendations delivered through Azure Policy and Azure Machine Configuration. It enables organizations to continuously audit Linux virtual machines and Arc-enabled servers against industry-standard benchmarks—without enforcing changes or triggering auto-remediation. This GA release focuses on enhanced audit capabilities, giving teams deep visibility into configuration drift and compliance gaps across their Linux estate. For the remediation experience, a limited public preview is available here: What is the Azure security baseline for Linux? | Microsoft Learn

Why Enhanced Audit Matters

In today's hybrid environments, maintaining compliance across diverse Linux distributions is a challenge. The enhanced audit mode provides:
- Granular insights into each configuration check
- An industry-aligned benchmark for a standardized security posture
- Detailed rule-level reporting with evidence and context
- Scalable deployment across Azure and Arc-enabled machines

Whether you're preparing for an audit, hardening your infrastructure, or simply tracking configuration drift, enhanced audit gives you the clarity and control you need—without enforcing changes.

Key Features at GA

✅ Broad Linux Distribution Support
📘 Full distro list: Supported Client Types

🔍 Industry-Aligned Audit Checks
The baseline audits more than 200 security controls per machine, aligned to industry benchmarks such as CIS. These checks cover:
- OS hardening
- Network and firewall configuration
- SSH and remote access settings
- Logging and auditing
- Kernel parameters and system services
Each finding includes a description and the actual configuration state—making it easy to understand and act on.

🌐 Hybrid Cloud Coverage
The baseline works across:
- Azure virtual machines
- Arc-enabled servers (on-premises or other clouds)
This means you can apply a consistent compliance standard across your entire Linux estate—whether it's in Azure, on-premises, or multi-cloud.

🧠 Powered by Azure OSConfig
The audit engine is built on the open-source Azure OSConfig framework, which performs Linux-native checks with minimal performance impact. OSConfig is modular, transparent, and optimized for scale—giving you confidence in the accuracy of audit results.

📊 Enterprise-Scale Reporting
Audit results are surfaced in:
- Azure Policy compliance dashboard
- Azure Resource Graph Explorer
- Microsoft Defender for Cloud (Recommendations view)
You can query, export, and visualize compliance data across thousands of machines—making it easy to track progress and share insights with stakeholders.

💰 Cost
No premium SKU or license is required to use the audit capabilities; charges apply only to Azure Arc-managed workloads hosted on-premises or in other CSP environments—making it easy to adopt across your estate.

How to Get Started
1. Review the Quickstart Guide: 📘 Quickstart: Audit Azure Security Baseline for Linux
2. Assign the Built-In Policy: Search for "Linux machines should meet requirements for the Azure compute security baseline" in Azure Policy and assign it to your desired scope.
3. Monitor Compliance: Use Azure Policy and Resource Graph to track audit results and identify non-compliant machines.
4. Plan Remediation: While this release does not include auto-remediation, the detailed audit findings make it easy to plan manual or scripted fixes.

Final Thoughts

This GA release marks a major step forward in securing Linux workloads at scale. With enhanced audit now available, enterprise teams can:
- Improve visibility into Linux security posture
- Align with industry benchmarks
- Streamline compliance reporting
- Reduce risk across cloud and hybrid environments
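As a hedged example of the query-and-export workflow from the monitoring step above, the sketch below uses the Python SDK to pull non-compliant policy states from Azure Resource Graph. The exact query columns and table shape are assumptions; adjust them to your assignment name and verify against your installed azure-mgmt-resourcegraph version.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.resourcegraph import ResourceGraphClient
from azure.mgmt.resourcegraph.models import QueryRequest

credential = DefaultAzureCredential()
client = ResourceGraphClient(credential)

# Assumed query shape: surface non-compliant policy states for review.
query = """
policyresources
| where type == 'microsoft.policyinsights/policystates'
| where properties.complianceState == 'NonCompliant'
| project machine = properties.resourceId,
          policy = properties.policyAssignmentName
"""

request = QueryRequest(subscriptions=["<subscription-id>"], query=query)
result = client.resources(request)
for row in result.data:
    print(row)
```

The same query runs unchanged in Azure Resource Graph Explorer in the portal, which is a quick way to validate it before scripting exports.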
Designing for Certainty: How Azure Capacity Reservations Safeguard Mission-Critical Workloads

Why capacity reservations matter now

Cloud isn't running out of metal, but demand is compounding and often spikes. Resource strain shows up in specific regions, zones, and VM SKUs, especially for popular CPU families, memory-optimized sizes, and anything involving GPUs. Seasonal events (retail peaks), regulatory cutovers, emergency response, and bursty AI pipelines can trigger sudden surges. Even with healthy regional capacity, a single zone or a specific SKU can be tight. Capacity reservations acknowledge this reality and make it designable instead of probabilistic.
- Root reality: Capacity is finite at the SKU-in-zone granularity, and demand arrives in waves.
- Risk profile: The risk is not "no capacity in the cloud," but "no capacity for this exact size in this exact place at this exact moment."
- Strategic move: Reserve what matters, where it matters, before you need it.

What capacity means in practice

Think of three dimensions: region, zone, and SKU. Your workload's SLO ties to all three.
- Region: The biggest pool of resources. It gives you flexibility but doesn't guarantee availability in a specific zone.
- Zone: Where fault isolation happens, and where you'll often feel the pinch first when demand spikes.
- SKU: The specific type of machine you're asking for. This is usually the tightest constraint, especially for popular sizes like Dv5, Ev5, or anything with GPUs.
Azure Capacity Reservations let you lock capacity for a specific VM size at the regional or zonal scope and then place VMs/scale sets into that reservation.

Pay-as-you-go vs. capacity reservations vs. reserved instances

| Attribute | Pay-as-you-go | Capacity Reservations | Reserved Instances |
| --- | --- | --- | --- |
| Primary purpose | Flexibility, no commitment | Guarantee availability for a VM size | Reduce price for steady usage |
| What it guarantees | Nothing beyond current availability | Capacity in region/zone for N of a SKU | Discount on matching usage (1- or 3-year term) |
| Scope | Region/zone at runtime, best-effort | Bound to region or specific zone | Billing benefit across scope rules |
| Commitment | None | Active while you keep it (on-demand) | Term commitment (1 or 3 years) |

Key clarifications
- Capacity reservations ≠ discount tool: They exist to secure availability. You pay while the reservation is active (even if idle) because Azure is holding that capacity for you.
- Reserved Instances ≠ capacity guarantee: They reduce the rate you pay when you run matching VMs, but they don't hold hardware for you.
- Together: Use Capacity Reservations to ensure the VMs can run; use Reserved Instances to lower the cost of the runtime those VMs consume.

This is universal, not just Azure

Every major cloud faces the same physics: finite hardware, localized spikes, SKU-specific constraints, and growth in high-demand families (especially GPUs). AWS offers On-Demand Capacity Reservations; Google Cloud offers zonal reservations. The names differ; the pattern and the need are the same. If your architecture depends on "must run here, as this size, and right now," you either design for capacity or accept availability risk.

When mission-critical means "reserve it"

If failure to acquire capacity breaks your SLO, treat capacity as a dependency to engineer, not a variable to assume.
- High-stakes cutovers and events: Black Friday, tax deadlines, trading close, clinical batch windows. Action: Pre-reserve the exact SKU in the exact zones for the surge window.
- HA across zones: Survive a zone failure by scaling in the remaining active zones.
Action: Consider keeping extra capacity in each zone based on your failover plan, whether that's N+1 or matching peak load, depending on active/active vs. active/passive.
- Change windows that deallocate/recreate: If a VM is deallocated during maintenance, it might not get the same placement when restarted. Action: Associate VMs/VMSS with a capacity reservation group before deallocation.
- Fixed-SKU dependencies: Performance needs, licensing rules, or hardware accelerators that lock you into a specific VM family. Action: Reserve by SKU. If possible, define fallback SKUs and split reservations across them.
- Regulated or latency-sensitive workloads: Must run in a specific zone or region due to compliance or latency. Action: Prefer zonal reservations to control both locality and availability.

How reserved instances complement capacity reservations

Two-layer strategy:
- Layer 1, availability: Capacity reservations ensure your compute can be placed when needed.
- Layer 2, economics: Reserved Instances (or Savings Plans) apply a pricing benefit to the steady-state hours you actually run.

Practical pairing:
- Steady base load: Cover with 1- or 3-year Reserved Instances for maximum savings.
- Critical surge headroom: Hold with Capacity Reservations; if the surge is predictable, you can still layer partial RI coverage aligned to expected utilization.
- Dynamic burst: Leave as pay-as-you-go, or use short-lived reservations during known windows.

FinOps hygiene:
- Coverage ratios: Track RI coverage and capacity reservation utilization separately.
- Rightsizing: Align reservations to the SKU mix you truly run; shift or cancel idle capacity reservations quickly.
- Chargeback: Attribute the cost of "insurance" (capacity) to the workloads that require the SLO, separately from the cost of "fuel" (compute hours).

Conclusion

In today's cloud landscape, resilience isn't just redundancy; it's about assured access to the exact resources your workload demands. Capacity Reservations remove uncertainty by guaranteeing placement, while Reserved Instances drive cost efficiency for predictable use. Together, they form a strategic duo that keeps mission-critical services running smoothly under any demand surge. Build with both in mind, and you turn capacity from a risk into a controlled asset.
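As a minimal sketch of the availability layer with the Python SDK: reserve capacity for a specific SKU in a specific zone, and let RI purchases handle pricing separately. Resource names, region, zone, and the SKU are placeholders, and the operation names are worth verifying against your installed azure-mgmt-compute version.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient
from azure.mgmt.compute.models import (
    CapacityReservation, CapacityReservationGroup, Sku,
)

client = ComputeManagementClient(DefaultAzureCredential(), "<subscription-id>")

# 1. A zonal group to hold reservations (zone 1 of East US, assumed).
client.capacity_reservation_groups.create_or_update(
    "rg-capacity", "crg-eastus-z1",
    CapacityReservationGroup(location="eastus", zones=["1"]),
)

# 2. Hold 10 instances of a specific SKU: availability is now guaranteed,
#    and Reserved Instances can separately discount whatever actually runs.
reservation = client.capacity_reservations.begin_create_or_update(
    "rg-capacity", "crg-eastus-z1", "cr-d4sv5-z1",
    CapacityReservation(
        location="eastus", zones=["1"],
        sku=Sku(name="Standard_D4s_v5", capacity=10),
    ),
).result()
print(reservation.provisioning_state)
```

Billing for the reserved capacity starts as soon as the reservation is active, whether or not VMs are placed into it, which is exactly the "insurance vs. fuel" distinction made above.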
Network Redundancy Between AVS, On-Premises, and Virtual Networks in a Multi-Region Design

By Mays_Algebary and shruthi_nair

Establishing redundant network connectivity is vital to ensuring the availability, reliability, and performance of workloads operating in hybrid and cloud environments. Proper planning and implementation of network redundancy are key to achieving high availability and sustaining operational continuity. This article focuses on network redundancy in a multi-region architecture. For details on single-region design, refer to this blog.

The diagram below illustrates a common network design pattern for multi-region deployments, using either a Hub-and-Spoke or Azure Virtual WAN (vWAN) topology, and serves as the baseline for establishing redundant connectivity throughout this article. In each region, the Hub or Virtual Hub (VHub) extends Azure connectivity to Azure VMware Solution (AVS) via an ExpressRoute circuit. The regional Hub/VHub is connected to on-premises environments by cross-connecting (bowtie) both local and remote ExpressRoute circuits, ensuring redundancy. The concept of weight, used to influence traffic routing preferences, is discussed in the next section. The diagram below illustrates the traffic flow when both circuits are up and running.

Design Considerations

If a region loses its local ExpressRoute connection, AVS in that region will lose connectivity to the on-premises environment. However, VNets will still retain connectivity to on-premises via the remote region's ExpressRoute circuit. The solutions discussed in this article aim to ensure redundancy for both AVS and VNets.

Looking at the diagram above, you might wonder: why do we need to set weights at all, and why do the AVS-ER connections (1b/2b) use the same weight as the primary on-premises connections (1a/2a)? Weight is used to influence routing decisions and ensure optimal traffic flow. In this scenario, both ExpressRoute circuits, ER1-EastUS and ER2-WestUS, advertise the same prefixes to the Azure ExpressRoute gateway. As a result, traffic from the VNet to on-premises would be load-balanced via ECMP across both circuits. To avoid suboptimal routing and ensure that traffic from the VNets prefers the local ExpressRoute circuit, a higher weight is assigned to the local path.

It's also critical that the ExpressRoute gateway connections to on-premises (1a/2a) and to AVS (1b/2b) are assigned the same weight. Otherwise, traffic from the VNet to AVS will follow a less efficient route, because AVS routes are also learned over ER1-EastUS via Global Reach. For instance, VNets in EastUS would connect to AVS EUS through the ER1-EastUS circuit via Global Reach (as shown by the blue dotted line) instead of using the direct local path (orange line). This suboptimal routing is illustrated in the diagram below.

Now let us look at the solutions available to achieve redundant connectivity. The following solutions apply to both Hub-and-Spoke and vWAN topologies unless noted otherwise.

Note: The diagrams in the upcoming solutions illustrate only the failover traffic flow.

Solution 1: Network Redundancy via ExpressRoute in a Different Peering Location

In this solution, deploy an additional ExpressRoute circuit in a different peering location within the same metro area (e.g., ER2–PeeringLocation2), and enable Global Reach between this new circuit and the existing AVS ExpressRoute (e.g., AVS-ER1). If you intend to use this second circuit as a failover path, apply prepends to the on-premises prefixes advertised over it, as illustrated in the sketch below.
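The reason prepending works as a failover mechanism is BGP best-path selection: all else being equal, the route with the shorter AS path wins. A toy illustration follows; the circuit names match the example above, and the ASN is from the range reserved for documentation.

```python
# Toy BGP best-path selection on AS-path length alone. Real BGP evaluates
# many attributes before AS-path length; this isolates the prepending effect.
routes = {
    "ER1-EastUS":          ["64496"],                      # primary: no prepend
    "ER-PeeringLocation2": ["64496", "64496", "64496"],    # backup: prepended twice
}

def best_path(available: dict[str, list[str]]) -> str:
    """Pick the circuit advertising the shortest AS path."""
    return min(available, key=lambda circuit: len(available[circuit]))

print(best_path(routes))      # ER1-EastUS while both circuits are up
del routes["ER1-EastUS"]      # simulate the primary circuit failing
print(best_path(routes))      # traffic fails over to ER-PeeringLocation2
```

Because the backup path only looks worse, not invalid, failover is automatic the moment the primary circuit's routes are withdrawn.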
Alternatively, if you want to use it as an active-active redundant path, do not prepend routes; in this case, both AVS and Azure VNets will use ECMP to distribute traffic across both circuits (e.g., ER1–EastUS and ER–PeeringLocation2) when both are available.

Note: Compared to the standard topology, this design removes both the ExpressRoute cross-connect (bowtie) and the weight settings. When adding a second circuit in the same metro, there is no benefit in keeping them; otherwise traffic from the Azure VNet will prefer the local AVS circuit (AVS-ER1/AVS-ER2) to reach on-premises due to the higher weight, as on-premises routes are also learned over the AVS circuit (AVS-ER1/AVS-ER2) via Global Reach. Also, when connecting the new circuit (e.g., ER–PeeringLocation2), remove all weight settings across the connections. Traffic will follow the optimal path based on BGP prepending on the new circuit, or load-balance (ECMP) if no prepend is applied.

Note: Use a public ASN to prepend the on-premises prefixes, as the AVS circuit (e.g., AVS-ER) will strip private ASNs toward AVS.

Solution Insights
- Ideal for mission-critical applications, providing predictable throughput and bandwidth for backup.
- Could be cost-prohibitive depending on the bandwidth of the second circuit.

Solution 2: Network Redundancy via ExpressRoute Direct

In this solution, ExpressRoute Direct is used to provision multiple circuits from a single port pair in each region; for example, ER2-WestUS and ER4-WestUS are created from the same port pair. This allows you to dedicate one circuit for local traffic and another for failover to a remote region. To ensure optimal routing, prepend the on-premises prefixes using a public ASN on the newly created circuits (e.g., ER3-EastUS and ER4-WestUS). Remove all weight settings across the connections; traffic will follow the optimal path based on BGP prepending on the new circuits. For instance, if ER1-EastUS becomes unavailable, traffic from AVS and VNets in the EastUS region will automatically route through the ER4-WestUS circuit, ensuring continuity.

Note: Compared to the standard topology, this design connects the newly created ExpressRoute circuits (e.g., ER3-EastUS/ER4-WestUS) to the remote region's ExpressRoute gateway (black dotted lines) instead of having the bowtie to the primary circuits (e.g., ER1-EastUS/ER2-WestUS).

Solution Insights
- Easy to implement if you already have ExpressRoute Direct.
- ExpressRoute Direct supports over-provisioning: you can create logical ExpressRoute circuits on top of your existing 10-Gbps or 100-Gbps ExpressRoute Direct resource, up to 20 Gbps or 200 Gbps of subscribed bandwidth. For example, you can create two 10-Gbps ExpressRoute circuits within a single 10-Gbps ExpressRoute Direct resource (port pair).
- Ideal for mission-critical applications, providing predictable throughput and bandwidth for backup.

Solution 3: Network Redundancy via ExpressRoute Metro

Metro ExpressRoute is a new configuration that enables dual-homed connectivity to two different peering locations within the same city. This setup enhances resiliency by allowing traffic to continue flowing even if one peering location goes down, using the same circuit.

Solution Insights
- Higher resiliency: Provides increased reliability with a single circuit.
- Limited regional availability: Currently available in select regions, with more being added over time.
- Cost-effective: Offers redundancy without significantly increasing costs.
Solution 4: Deploy VPN as a Backup to ExpressRoute

This solution mirrors Solution 1 for a single region but extends it to multiple regions. In this approach, a VPN serves as the backup path for each region in the event of an ExpressRoute failure. In a Hub-and-Spoke topology, a backup path to and from AVS can be established by deploying Azure Route Server (ARS) in the hub VNet. ARS enables seamless transit routing between ExpressRoute and the VPN gateway. In a vWAN topology, ARS is not required; the vHub's built-in routing service automatically provides transitive routing between the VPN gateway and ExpressRoute.

In this design, you should not cross-connect the ExpressRoute circuits (e.g., ER1-EastUS and ER2-WestUS) to the ExpressRoute gateways in the Hub VNets (e.g., Hub-EUS or Hub-WUS). Doing so will lead to routing issues, where the Hub VNet programs only the on-premises routes learned via ExpressRoute. For instance, in the EastUS region, if the primary circuit (ER1-EastUS) goes down, Hub-EUS will receive on-premises routes from both the VPN tunnel and the remote ER2-WestUS circuit. However, it will prefer and program only the ExpressRoute-learned routes from the ER2-WestUS circuit. Since ExpressRoute gateways do not support route transitivity between circuits, AVS connected via AVS-ER will not receive the on-premises prefixes, resulting in routing failures.

Note: In a vWAN topology, to ensure optimal route convergence when failing back to ExpressRoute, you should prepend the prefixes advertised from on-premises over the VPN. Without route prepending, VNets may continue to use the VPN as the primary path to on-premises. If prepending is not an option, you can trigger the failover manually by bouncing the VPN tunnel.

Solution Insights
- Cost-effective and straightforward to deploy.
- Increased latency: The VPN tunnel over the internet adds latency due to encryption overhead.
- Bandwidth considerations: Multiple VPN tunnels might be needed to achieve bandwidth comparable to a high-capacity ExpressRoute circuit (e.g., over 1 Gbps). For details on VPN gateway SKUs and tunnel throughput, refer to this link.
- Because you can't cross-connect the ExpressRoute circuits, VNets will fail over via the VPN instead of leveraging the remote region's ExpressRoute circuit.

Solution 5: Network Redundancy with Multiple On-Premises Locations (Split Prefix)

In many scenarios, customers advertise the same prefix from multiple on-premises locations to Azure. However, if the customer can split prefixes across different on-premises sites, it simplifies the implementation of a failover strategy using the existing ExpressRoute circuits. In this design, each on-premises site advertises region-specific prefixes (e.g., 10.10.0.0/16 for EastUS and 10.70.0.0/16 for WestUS), along with a common supernet (e.g., 10.0.0.0/8). Under normal conditions, AVS and VNets in each region use longest prefix match to route traffic efficiently to the appropriate on-premises location (see the sketch after this list of insights). If ER1-EastUS becomes unavailable, AVS and VNets in EastUS will automatically fail over to ER2-WestUS, routing traffic via the supernet prefix to maintain connectivity.

Solution Insights
- Cost-effective: No additional deployment; uses the existing ExpressRoute circuits.
- Advertising specific prefixes over each region may require additional planning.
- Ideal for mission-critical applications, providing predictable throughput and bandwidth for backup.
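The longest-prefix-match behavior that Solution 5 relies on can be illustrated with only the Python standard library; the prefixes are the example values from the design above.

```python
import ipaddress

# Prefixes advertised to Azure in the split-prefix design.
advertised = [
    ipaddress.ip_network("10.10.0.0/16"),  # EastUS-specific, via ER1-EastUS
    ipaddress.ip_network("10.70.0.0/16"),  # WestUS-specific, via ER2-WestUS
    ipaddress.ip_network("10.0.0.0/8"),    # common supernet, via both circuits
]

def chosen_route(dst: str, routes):
    """Return the most specific advertised prefix containing dst."""
    matches = [n for n in routes if ipaddress.ip_address(dst) in n]
    return max(matches, key=lambda n: n.prefixlen)  # longest prefix wins

print(chosen_route("10.10.5.4", advertised))  # 10.10.0.0/16 -> local circuit
# If ER1-EastUS fails, its /16 is withdrawn and the supernet takes over:
remaining = [n for n in advertised if str(n) != "10.10.0.0/16"]
print(chosen_route("10.10.5.4", remaining))   # 10.0.0.0/8 -> failover path
```

No tuning of weights or AS paths is needed in this design: withdrawal of the specific prefix alone is enough to shift traffic onto the supernet route.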
Solution 6: Prioritize Network Redundancy for One Region Over Another

If you're operating under budget constraints and can prioritize one region (such as hosting critical workloads in a single location), and you want to continue using your existing ExpressRoute setup, this solution could be an ideal fit. In this design, assume AVS in EastUS (AVS-EUS) hosts the critical workloads. To ensure high availability, AVS-ER1 is configured with Global Reach connections to both the local ExpressRoute circuit (ER1-EastUS) and the remote circuit (ER2-WestUS). Make sure to prepend the on-premises prefixes advertised to ER2-WestUS using a public ASN to ensure optimal routing (no ECMP) from AVS-EUS over both circuits (ER1-EastUS and ER2-WestUS).

AVS in WestUS (AVS-WUS), on the other hand, is connected via Global Reach only to its local region's ExpressRoute circuit (ER2-WestUS). If that circuit becomes unavailable, you can establish an on-demand Global Reach connection to ER1-EastUS, either manually or through automation such as a triggered script (a sketch follows at the end of this article). This approach introduces temporary downtime until the Global Reach link is established.

You might be thinking: why not set up Global Reach between the AVS-WUS circuit and the remote region's circuits (such as connecting AVS-ER2 to ER1-EastUS), just as we did for AVS-EUS? Because it would lead to suboptimal routing. Due to AS-path prepending on ER2-WestUS, if both ER1-EastUS and ER2-WestUS were linked to AVS-ER2, traffic would favor the remote ER1-EastUS circuit, since it presents a shorter AS path. As a result, traffic would bypass the local ER2-WestUS circuit, causing inefficient routing. That is why, for AVS-WUS, it's better to use on-demand Global Reach to ER1-EastUS as a backup path, enabled manually or via automation only when ER2-WestUS becomes unavailable.

Note: VNets will fail over via the local AVS circuit. For example, Hub-EUS will route to on-premises through AVS-ER1 and ER2-WestUS via Global Reach Secondary (purple line).

Solution Insights
- Cost-effective.
- Workloads hosted in AVS in the non-critical region will experience downtime if the local region's ExpressRoute circuit becomes unavailable, until the on-demand Global Reach connection is established.

Conclusion

Each solution has its own advantages and considerations, such as cost-effectiveness, ease of implementation, and increased resiliency. By carefully planning and implementing these solutions, organizations can ensure operational continuity and optimal traffic routing in multi-region deployments.
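Returning to Solution 6, the on-demand Global Reach link can be scripted. Below is a hedged sketch with the Python SDK: circuit and resource names, the peering IDs, and the /29 address prefix are placeholders, and the exact model fields should be checked against your installed azure-mgmt-network version.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import ExpressRouteCircuitConnection, SubResource

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Link the AVS WestUS circuit's private peering to ER1-EastUS (Global Reach),
# run only when monitoring detects that ER2-WestUS is down.
poller = client.express_route_circuit_connections.begin_create_or_update(
    resource_group_name="rg-network",
    circuit_name="avs-er2",
    peering_name="AzurePrivatePeering",
    connection_name="gr-to-er1-eastus",
    express_route_circuit_connection_parameters=ExpressRouteCircuitConnection(
        express_route_circuit_peering=SubResource(
            id="<avs-er2-private-peering-resource-id>"),
        peer_express_route_circuit_peering=SubResource(
            id="<er1-eastus-private-peering-resource-id>"),
        address_prefix="172.16.0.0/29",  # placeholder /29 for the GR link
    ),
)
print(poller.result().circuit_connection_status)
```

Wiring this into an Azure Monitor alert or a scheduled health check turns the "on-demand" step into automated failover, at the cost of the brief convergence window noted above.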
🚨 Azure Service Health Built-In Policy (Preview) – Now Available!

Resiliency is a key focus for Microsoft in making sure our customers experience minimal impact from planned or unexpected outages. Until now, there has been no native, scalable solution for providing consistent Service Health notifications across Azure subscriptions. Building on the success of Azure Monitor Baseline Alerts (AMBA), where this functionality is currently available, the AMBA team has worked with the Service Health product team to bring this capability into the Azure-native experience.

We're excited to announce the release of the Azure Service Health Built-In Policy (Preview), a new built-in Azure Policy designed to simplify and scale the deployment of Service Health alerts across your Azure environment. This policy enables customers to automatically deploy Service Health alerts across subscriptions, ensuring consistent visibility into platform-level issues that may impact workloads. Existing subscriptions can be remediated in bulk, and new Azure subscriptions created after the policy has been assigned will automatically be configured to receive Service Health alerts.

🔍 What's the purpose of this announcement?
- It addresses situations where customers only permit the use of built-in policies.
- It automates the setup of Service Health alerts across all subscriptions when deployed at the management group level.
- It ensures consistent alert coverage for platform events.
- It helps reduce manual setup and ongoing maintenance.

🛠️ What options are available with the Policy?
All the learnings from AMBA were taken into consideration in designing this policy, and a wide range of options is available to provide flexibility based on your needs. These options are surfaced as parameters within the policy:
- Audit the existing environment for compliance.
- Provide custom alert rule names that align with your naming standards.
- Choose the types of Service Health events to monitor.
- Bring your own action group, or create a new action group as part of the policy assignment.
- For ARM role notifications, choose from a preset list of built-in roles.
- Choose from email, Logic App, Event Hubs, webhook, and Azure Functions targets within the action group.
- Name resource groups and choose their location flexibly.
- Add resource tags.

🧩 What about Azure Monitor Baseline Alerts?
The AMBA team has been working to incorporate the new built-in policy into a future release. The team plans to roll this out in the next few weeks, along with details for existing customers on replacing the existing AMBA custom policy. These changes will then be consumed into Azure Landing Zones. AMBA continues to offer a wide range of alerts for both platform and workload services in addition to Service Health alerts. This announcement does not replace AMBA; it complements the AMBA solution.

📣 What's Next?
Check out the guidance on leveraging this policy in your environment: Deploy Service Health alert rules at scale using Azure Policy - Azure Service Health. Should you require support for this policy, please raise a support ticket via the portal, as comments raised below may not be addressed in a timely manner.
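Assignment at management group scope can also be automated. Below is a hedged Python sketch of a built-in policy assignment; the policy definition ID is a placeholder (look up the real Service Health definition in the portal), and a DeployIfNotExists policy like this one additionally needs a managed identity and location on the assignment, omitted here for brevity.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import PolicyClient
from azure.mgmt.resource.policy.models import PolicyAssignment

# PolicyClient wants a subscription for its other operations; the assignment
# itself is created at management group scope via the scope string below.
client = PolicyClient(DefaultAzureCredential(), "<subscription-id>")

scope = "/providers/Microsoft.Management/managementGroups/contoso-root"
assignment = client.policy_assignments.create(
    scope=scope,
    policy_assignment_name="service-health-alerts",
    parameters=PolicyAssignment(
        display_name="Deploy Service Health alert rules",
        # Placeholder ID: substitute the built-in definition's real GUID.
        policy_definition_id=(
            "/providers/Microsoft.Authorization/policyDefinitions/<built-in-guid>"
        ),
    ),
)
print(assignment.id)
```

After assignment, existing subscriptions still need a remediation task to be brought into compliance in bulk, as described above; only newly created subscriptions are configured automatically.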
Azure ExpressRoute Direct: A Comprehensive Overview

What is ExpressRoute?

Azure ExpressRoute allows you to extend your on-premises network into the Microsoft cloud over a private connection made possible through a connectivity provider. With ExpressRoute, you can establish connections to Microsoft cloud services such as Microsoft Azure and Microsoft 365. ExpressRoute lets you create a connection between your on-premises network and the Microsoft cloud in four different ways: CloudExchange colocation, point-to-point Ethernet connection, any-to-any (IPVPN) connection, and ExpressRoute Direct. ExpressRoute Direct gives you the ability to connect directly into the Microsoft global network at peering locations strategically distributed around the world, providing dual 100-Gbps or 10-Gbps connectivity that supports active-active connectivity at scale.

Why ExpressRoute Direct Is Becoming the Preferred Choice for Customers

ExpressRoute Direct with ExpressRoute Local – Free Egress
ExpressRoute Direct includes ExpressRoute Local, which allows private connectivity to Azure services within the same metro or peering location. This setup is particularly cost-effective because egress (outbound) data transfer is free, regardless of whether you're on a metered or unlimited data plan. By avoiding Microsoft's global backbone, ExpressRoute Local offers high-speed, low-latency connections for regionally co-located workloads without incurring additional data transfer charges.

Dual Port Architecture
Both ExpressRoute Direct and the service provider model feature a dual-port architecture, with two physical fiber pairs connected to separate Microsoft router ports and configured in an active/active BGP setup that distributes traffic across both links simultaneously for redundancy and improved throughput. What sets Microsoft apart is making this level of resiliency standard, not optional. Forward-thinking customers in regions like Sydney take it even further by deploying ExpressRoute Direct across multiple colocation facilities, for example placing one port pair in Equinix SY2 and another in NextDC S1, creating four connections across two geographically separate sites. This design protects against facility-level outages from power failures, natural disasters, or accidental infrastructure damage, ensuring business continuity for organizations where downtime is simply not an option.

When Geography Limits Your Options
Not every region offers facility diversity. For example, New Zealand has only one ExpressRoute peering location, so businesses needing geographic redundancy must connect to Sydney, incurring Auckland-to-Sydney link costs but gaining critical diversity to mitigate outages. While ExpressRoute's dual ports provide active/active redundancy, both are on the same Microsoft edge, so true disaster recovery requires using Sydney's edge. ExpressRoute Direct scales from basic dual-port setups to multi-facility deployments and offers another advantage: free data transfer within the same geopolitical region. Once traffic enters Microsoft's network, New Zealand customers can move data between Azure services across the trans-Tasman link without per-GB fees, with Microsoft absorbing those costs.

Premium SKU: Global Reach
Azure ExpressRoute Direct with the Premium SKU enables Global Reach, allowing private connectivity between your on-premises networks across different geographic regions through Microsoft's global backbone.
This means you can link ExpressRoute circuits in different countries or continents, facilitating secure and high-performance data exchange between global offices or data centers. The Premium SKU extends the capabilities of ExpressRoute Direct by supporting cross-region connectivity, increased route limits, and access to more Azure regions, making it ideal for multinational enterprises with distributed infrastructure.

MACsec: Defense in Depth and Enterprise Security
ExpressRoute Direct uniquely supports MACsec (IEEE 802.1AE) encryption at the data-link layer, allowing your router and Microsoft's router to establish encrypted communication even within the colocation facility. This optional feature provides additional security for compliance-sensitive workloads in banking or government environments.

High-Performance Data Transfer for the Enterprise
Azure ExpressRoute Direct enables ultra-fast and secure data transfer between on-premises infrastructure and Azure by offering dedicated bandwidth of 10 to 100 Gbps. This high-speed connectivity is ideal for large-scale data movement scenarios such as AI workloads, backup, and disaster recovery. It ensures consistent performance, low latency, and enhanced reliability, making it well suited for hybrid and multicloud environments that require frequent or time-sensitive data synchronization.

FastPath Support
Azure ExpressRoute Direct now supports FastPath for Private Endpoints and Private Link, enabling low-latency, high-throughput connections by bypassing the virtual network gateway. This feature is available only with ExpressRoute Direct circuits (10 Gbps or 100 Gbps) and is in limited general availability. While a gateway is still needed for route exchange, traffic flows directly once FastPath is enabled. (See: Supported gateway.)

ExpressRoute Direct Setup Workflow

Before provisioning ExpressRoute Direct resources, proper planning is essential. A key consideration for connectivity is understanding the two connectivity patterns available for ExpressRoute Direct from the customer edge to the Microsoft Enterprise Edge (MSEE).

Option 1: Colocation of Customer Equipment
This is a common pattern where the customer racks their network device (edge router) in the same third-party data center facility that houses Microsoft's networking gear (e.g., Equinix or NextDC). They install their router or firewall there and then order a short cross-connect from their cage to Microsoft's cage in that facility. The cross-connect is simply a fiber cable run through the facility's patch panel connecting the two parties. This direct colocation approach has the advantage of a single, highly efficient physical link (no intermediate hops) between the customer and Microsoft, completing the layer-1 connectivity in one step.

Option 2: Using a Carrier/Exchange Provider
If the customer prefers not to move hardware into a new facility (due to cost or complexity), they can leverage a provider that already has a presence in the relevant colocation facility. In this case, the customer connects from their data center to the provider's network, and the provider extends connectivity into the Microsoft peering location. For instance, the customer could contract with Megaport or a local telco to carry traffic from their on-premises location into Megaport's equipment, and Megaport in turn handles the cross-connection to Microsoft in the target facility. In the scenario discussed here, the customer had already set up connections to Megaport in their data center.
Using an exchange can simplify logistics, since the provider arranges the cross-connect and often provides an LOA on the customer's behalf. It may also be more cost-effective where the customer's location is far from any Microsoft peering site. Many enterprises find that placing equipment in a well-connected colocation facility works best for their needs; banks and large organizations have successfully taken this approach, placing routers in Equinix Sydney or NextDC Sydney to establish a direct fiber link to Azure. However, we understand that not every organization wants the capital expense or complexity of managing physical equipment in a new location. For those situations, using a cloud exchange like Megaport offers a practical alternative that still delivers the dedicated connectivity you're looking for, while letting someone else handle the infrastructure management.

Once the decision on the connectivity pattern is made, the next step is to provision ExpressRoute Direct ports and establish the physical link.

Step 1: Provision ExpressRoute Direct Ports
Through the Azure portal (or CLI), the customer creates an ExpressRoute Direct resource. The customer must select an appropriate peering location, which corresponds to the colocation facility housing Azure's routers; for example, the specific facility (such as "Vocus Auckland" or "Equinix Sydney SY2") where they intend to connect. The customer also chooses the port bandwidth (either 10 Gbps or 100 Gbps) and the encapsulation type (Dot1Q or QinQ) during this setup. Azure then allocates two ports on two separate Microsoft devices in that location, essentially giving the customer a primary and a secondary interface for redundancy, removing any single point of failure affecting their connectivity.

Critical considerations for this step:
- Encapsulation: When configuring ExpressRoute Direct ports, the customer must choose an encapsulation method. Dot1Q (802.1Q) uses a single VLAN tag for the circuit, whereas Q-in-Q (802.1ad) uses stacked VLAN tags (an outer S-tag and an inner C-tag). Q-in-Q allows multiple circuits on one physical port with overlapping customer VLAN IDs, because Azure assigns a unique outer tag per circuit, making it ideal if the customer needs several ExpressRoute circuits on the same port. Dot1Q, by contrast, requires each VLAN ID to be unique across all circuits on the port, and is often used if the equipment doesn't support Q-in-Q. (Most modern deployments prefer Q-in-Q for flexibility.)
- Capacity planning: This offering allows customers to overprovision and utilize 20 Gbps of capacity, but design for 10 Gbps with redundancy, not 20 Gbps of total capacity. During Microsoft's monthly maintenance windows, one port may go offline, and your network must handle this seamlessly.

Step 2: Generate a Letter of Authorization (LOA)
After the ExpressRoute Direct resource is created, Microsoft generates a Letter of Authorization. The LOA is a document (often a PDF) that authorizes the data center operator to connect a specific Microsoft port to the designated port. It includes details like the facility name, patch panel identifier, and port numbers on Microsoft's side. If co-locating your own gear, you will also obtain a corresponding LOA from the facility for your port (or simply indicate your port details on the cross-connect order form). If a provider like Megaport is involved, that provider will generate an LOA for their port as well.
Two LOAs are typically needed, one for Microsoft's ports and one for the other party's ports, which are then submitted to the facility to execute the cross-connect.

Step 3: Complete the Cross-Connect with the Data Center Provider
Using the LOAs, the data center's technicians perform the cross-connection in the meet-me room (MMR). At this point, the physical fiber link is established between the Microsoft router and the customer (or provider) equipment. The link goes through a patch panel in the MMR rather than a direct cable between cages, for security and manageability. After patching, the circuit is in place but typically kept "administratively down" until ready.

Critical considerations for this step:
- When port allocation conflicts occur, engage Microsoft Support rather than recreating resources. They coordinate with colocation providers to resolve conflicts or issue new LOAs.

Step 4: Change the Admin Status of Each Link
Once the cross-connect is physically completed, you can head into Azure's portal and flip the admin state of each ExpressRoute Direct link to "Enabled." This action lights up the optical interface on Microsoft's side and starts your billing meter running, so you'll want to make sure everything is working properly first. Azure gives you visibility into the health of your fiber connection through optical power metrics: you can check the receive light levels right in the portal, and a healthy connection will show power readings somewhere between -1 dBm and -9 dBm, indicating a strong fiber signal. If you're seeing readings outside this range, or worse, no light at all, that's a red flag pointing to a potential issue like a mis-patch or a faulty fiber connector. In one real case, a bad fiber connector was caught because the light levels were too low, and the facility had to come back and re-patch the connection. This optical power check is your first line of defence; once you see good light levels within the acceptable range, you know your physical layer is solid and you're ready to move on.

Critical considerations for this step:
- Proactive monitoring: Set up alerts for BGP session failures and optical power thresholds. Link failures might not immediately impact users but require quick restoration to maintain full redundancy.

At this stage, you've successfully navigated the physical infrastructure challenge: the ExpressRoute Direct port pair is provisioned, fiber cross-connects are in place, and the critical optical power levels are showing healthy readings. Essentially, the private physical highway directly connecting your network edge to Microsoft's backbone infrastructure has been built.

Step 5: Create ExpressRoute Circuits
ExpressRoute circuits represent the logical layer that transforms your physical ExpressRoute Direct ports into functional network connections. Through the Azure portal, organizations create circuit resources linked to their ExpressRoute Direct infrastructure, specifying bandwidth requirements and selecting the appropriate SKU (Local, Standard, or Premium) based on connectivity needs. A key advantage is the ability to provision multiple circuits on the same physical port pair, provided the aggregate bandwidth stays within physical limits. For example, an organization with 10 Gbps of ExpressRoute Direct capacity might run a 1 Gbps non-production circuit alongside a 5 Gbps production circuit on the same infrastructure. Azure handles the technical complexity through automatic VLAN management.
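As a hedged illustration of Step 5, the sketch below provisions a circuit on an existing ExpressRoute Direct port pair with the Python SDK. The resource names, region, SKU, and bandwidth are placeholders, and the model fields are worth verifying against your installed azure-mgmt-network version.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import (
    ExpressRouteCircuit, ExpressRouteCircuitSku, SubResource,
)

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Resource ID of the existing ExpressRoute Direct port pair (placeholder).
port_id = (
    "/subscriptions/<subscription-id>/resourceGroups/rg-network"
    "/providers/Microsoft.Network/expressRoutePorts/erdirect-syd"
)

# A 5 Gbps circuit carved out of a 10 Gbps Direct port pair (Step 5).
circuit = client.express_route_circuits.begin_create_or_update(
    "rg-network", "er-prod-5g",
    ExpressRouteCircuit(
        location="australiaeast",
        express_route_port=SubResource(id=port_id),
        bandwidth_in_gbps=5.0,
        sku=ExpressRouteCircuitSku(
            name="Premium_MeteredData", tier="Premium", family="MeteredData",
        ),
    ),
).result()
print(circuit.provisioning_state)
```

Running this a second time with a different circuit name and a smaller bandwidth is how the 1 Gbps non-production circuit from the example above would share the same port pair.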
Step 6: Establish Peering
Once your ExpressRoute circuit is created and VLAN connectivity is established, the next crucial step involves setting up BGP (Border Gateway Protocol) sessions between your network and Microsoft's infrastructure. ExpressRoute supports two primary BGP peering types: private peering for accessing Azure virtual networks, and Microsoft peering for reaching Microsoft SaaS services like Office 365 and Azure PaaS offerings. For most enterprise scenarios connecting data centers to Azure workloads, private peering is the focal point. Azure provides specific BGP IP addresses for your circuit configuration, defining /30 subnets for both primary and secondary link peering, which you configure on your edge router to exchange routing information. The typical flow involves your organization advertising on-premises network prefixes while Azure advertises VNet prefixes through these BGP sessions, creating dynamic route discovery between your environments. Importantly, both primary and secondary links maintain active BGP sessions, ensuring that if one connection fails, the secondary BGP session seamlessly maintains connectivity and keeps your network resilient against single points of failure.

Step 7: Routing and Testing
Once BGP sessions are established, your ExpressRoute circuit becomes fully operational, seamlessly extending your on-premises network into Azure virtual networks. Connectivity testing with ping, traceroute, and application traffic confirms that your on-premises servers can now communicate directly with Azure VMs through the private ExpressRoute path, bypassing the public internet entirely. The traffic remains completely isolated to your circuit via VLAN tags, ensuring no intermingling with other tenants while delivering the low latency and predictable performance that only dedicated connectivity can provide.

At the end of this stage, the customer's data center is linked to Azure at layer 3 via a private, resilient connection. They can access Azure resources as if they were on the same LAN extension, with low latency and high throughput. All that remains is to connect this circuit to the relevant Azure virtual networks (via an ExpressRoute gateway) and verify end-to-end application traffic.

Step-by-step instructions are available below:
- Configure Azure ExpressRoute Direct using the Azure portal | Microsoft Learn
- Azure ExpressRoute: Configure ExpressRoute Direct | Microsoft Learn
- Azure ExpressRoute: Configure ExpressRoute Direct: CLI | Microsoft Learn
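To round out Step 6, here is a hedged sketch configuring Azure private peering on the circuit created earlier. The /30 subnets, VLAN ID, and ASN are example values, and the model fields should be checked against your installed azure-mgmt-network version.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import ExpressRouteCircuitPeering

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

# BGP sessions come up on both the primary and secondary links (Step 6).
peering = client.express_route_circuit_peerings.begin_create_or_update(
    resource_group_name="rg-network",
    circuit_name="er-prod-5g",
    peering_name="AzurePrivatePeering",
    peering_parameters=ExpressRouteCircuitPeering(
        peering_type="AzurePrivatePeering",
        peer_asn=65001,                                  # example on-prem ASN
        primary_peer_address_prefix="192.168.100.0/30",  # primary /30
        secondary_peer_address_prefix="192.168.100.4/30",# secondary /30
        vlan_id=200,                                     # example VLAN tag
    ),
).result()
print(peering.provisioning_state)  # 'Succeeded' once the peering is configured
```

The matching half of the configuration, the same /30s, VLAN, and ASN, is applied on the customer edge router, after which the Step 7 connectivity tests (ping, traceroute, application traffic) validate the end-to-end path.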