Azure Networking
Extending Layer-2 (VXLAN) Networks Over Layer-3 IP Networks
Introduction

Virtual Extensible LAN (VXLAN) is a network virtualization technology that encapsulates Layer-2 Ethernet frames inside Layer-3 UDP/IP packets. In essence, VXLAN creates a logical Layer-2 overlay network on top of an IP network, allowing Ethernet segments (VLANs) and the traffic they carry to be stretched across routed infrastructure. A key advantage is scale: VXLAN uses a 24-bit segment ID (VNI) instead of the 12-bit VLAN ID, supporting around 16 million isolated networks versus the 4,094 VLAN limit. This makes VXLAN ideal for large cloud data centers and multi-tenant environments that demand many distinct network segments.

VXLAN’s Layer-2 overlays bring flexibility and mobility to modern architectures. Because VXLAN tunnels can span multiple Layer-3 domains, organizations can extend VLANs across different sites or subnets – for example, stretching a VLAN between two data centers over an IP WAN, as long as the underlying tunnel endpoint IPs are reachable. This enables seamless workload mobility and disaster recovery: virtual machines or applications can move between physical locations without changing IP addresses, since they remain in the same virtual L2 network. The overlay approach also decouples the logical network from the physical underlay, meaning you can run your familiar L2 segments over any IP routing infrastructure while leveraging features like ECMP load balancing and avoiding large spanning-tree domains. In short, VXLAN combines the best of both worlds – the simplicity of Layer-2 adjacency with the scalability of Layer-3 routing – making it a foundational tool in cloud networking and software-defined data centers.

A Layer-2 VXLAN overlay on a Layer-3 IP network allows customers or edge networks to stretch Ethernet (VLAN) segments across geographically distributed sites using an IP backbone. This approach preserves VLAN tags end-to-end and enables flexible segmentation across locations without needing an extended or contiguous Layer-2 network in the core. It also hides, or avoids exposing, the underlying IP network’s complexities. However, it is crucial to account for MTU overhead (VXLAN adds ~50 bytes of headers) so that the overlay’s VLAN MTU is set smaller than the underlay IP MTU – otherwise fragmentation or packet loss can occur. Additionally, because VXLAN doesn’t inherently signal link status, implementing Bidirectional Forwarding Detection (BFD) on the VXLAN interfaces provides rapid detection of neighbor failures, ensuring quick rerouting or recovery when a tunnel endpoint goes down.

VXLAN Overlay Use Case and Benefits

Virtual Extensible LAN (VXLAN) is a standard protocol (IETF RFC 7348) that encapsulates Layer-2 Ethernet frames into Layer-3 UDP/IP packets. By doing so, VXLAN creates an L2 overlay network on top of an L3 underlay. The VXLAN tunnel endpoints (VTEPs), which can be routers, switches, or hosts, wrap the original Ethernet frame (including its VLAN tag) with an IP/UDP header plus a VXLAN header, then send it through the IP network. The default UDP port for VXLAN is 4789. This mechanism offers several key benefits:

- Preserves VLAN Tags and L2 Segmentation: The entire Ethernet frame is carried across, so the original VLAN ID (802.1Q tag) is maintained end-to-end through the tunnel. Even if an extra tag is added at the ingress for local tunneling, the customer’s inner VLAN tag remains intact across the overlay. This means a VLAN defined at one site will be recognized at the other site as the same VLAN, enabling seamless L2 adjacency.
In practice, VXLAN can transport multiple VLANs transparently by mapping each VLAN or service to a VXLAN Network Identifier (VNI).

- Flexible Network Segmentation at Scale: VXLAN uses a 24-bit VNI (VXLAN Network ID), supporting about 16 million distinct segments, far exceeding the 4,094-VLAN limit of traditional 802.1Q networks. This gives architects the freedom to create many isolated L2 overlay networks (for multi-tenant scenarios, application tiers, etc.) over a shared IP infrastructure. Geographically distributed sites can share the same VLANs and broadcast domain via VXLAN, without the WAN routers needing any VLAN configuration. The IP/MPLS core only sees routed VXLAN packets, not individual VLANs, simplifying the underlay configuration.

- No Need for End-to-End VLANs in Underlay: Traditional solutions to extend L2 might rely on methods like MPLS/VPLS or long Ethernet trunk lines, which often require configuring VLANs across the WAN and can’t scale well. In a VXLAN overlay, the intermediate L3 network remains unaware of customer VLANs, and you don’t need to trunk VLANs across the WAN. Each site’s VTEP encapsulates and decapsulates traffic, so the core routers/switches just forward IP/UDP packets. This isolation improves scalability and stability: core devices don’t carry massive MAC address tables or STP domains from all sites. It also means the underlay can use robust IP routing (OSPF, BGP, etc.) with ECMP, rather than extending spanning tree across sites. In short, VXLAN lets you treat the WAN like an IP cloud while still maintaining Layer-2 connectivity between specific endpoints.

- Multi-Path and Resilience: Since the overlay runs on IP, it naturally leverages IP routing features. Equal-cost multi-path (ECMP) in the underlay, for example, can load-balance VXLAN traffic across multiple links, something not possible with a single bridged VLAN spanning the WAN. The encapsulated traffic’s UDP header even provides entropy (via source-port hashing) to help load-sharing across multiple paths. Furthermore, if one underlay path fails, routing protocols can reroute VXLAN packets via alternate paths without disrupting the logical L2 network. This increases reliability and bandwidth utilization compared to a Layer-2-only approach.

Diagram: VXLAN Overlay Across a Layer-3 WAN – Below is a simplified illustration of two sites using a VXLAN overlay. “Site A” and “Site B” each have a local VLAN (e.g. VLAN 100) that they want to bridge across an IP WAN. The VTEPs at each site encapsulate the Layer-2 frames into VXLAN/UDP packets and send them over the IP network. Inside the tunnel, the original VLAN tag is preserved. In this example, a BFD session (red dashed line) runs between the VTEPs to monitor the tunnel’s health, as explained later.

Figure 1: Two sites (A and B) extend “VLAN 100” across an IP WAN using a VXLAN tunnel. The inner VLAN tag is preserved over the L3 network. A BFD keepalive (every 900 ms) runs between the VXLAN endpoints to detect failures.

The practical effect of this design is that devices in Site A and Site B can be in the same VLAN and IP subnet, broadcast to each other, and so on, even though they are connected by a routed network. For example, if Site A has a machine in VLAN 100 with IP 10.1.100.5/24 and Site B has another in VLAN 100 with IP 10.1.100.10/24, they can communicate as if on one LAN – ARP, switching, and VLAN tagging function normally across the tunnel.
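To make the encapsulation concrete, the following Python sketch builds the same kind of packet with the Scapy packet library (which must be installed separately). The MAC addresses, underlay IPs, and the VNI are illustrative, not taken from any real deployment:

    from scapy.all import Ether, Dot1Q, IP, UDP, Raw
    from scapy.layers.vxlan import VXLAN

    # Original customer frame: VLAN 100, Site A host talking to Site B host
    inner = (Ether(src="00:11:22:33:44:01", dst="00:11:22:33:44:02")
             / Dot1Q(vlan=100)
             / IP(src="10.1.100.5", dst="10.1.100.10")
             / Raw(b"application payload"))

    # VTEP encapsulation: outer Ethernet/IP/UDP(4789)/VXLAN, with VLAN 100 mapped to VNI 10100
    outer = (Ether()
             / IP(src="192.0.2.1", dst="198.51.100.1")   # VTEP-A -> VTEP-B underlay addresses
             / UDP(sport=49152, dport=4789)
             / VXLAN(vni=10100)
             / inner)                                     # inner frame, VLAN tag preserved intact

    print("inner frame  :", len(inner), "bytes")
    print("encapsulated :", len(outer), "bytes")
    print("overhead     :", len(outer) - len(inner), "bytes")   # roughly 50 bytes for an IPv4 underlay

The printed overhead is the ~50 bytes discussed in the next section, which is why MTU planning matters.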
MTU and Overhead Considerations

One critical consideration for deploying VXLAN overlays is handling the increased packet size due to encapsulation. A VXLAN packet includes additional headers on top of the original Ethernet frame: an outer IP header, UDP header, and VXLAN header (plus an outer Ethernet header on the WAN interface). This encapsulation adds approximately 50 bytes of overhead to each packet (for IPv4; about 70 bytes for IPv6). In practical terms, if your original Ethernet frame carried the typical 1500-byte payload (1518 bytes with Ethernet header and CRC, or 1522 with a VLAN tag), the VXLAN-encapsulated version will be ~1550 bytes. The underlying IP network must accommodate these larger frames, or you’ll get fragmentation or drops. Many network links by default only support 1500-byte MTUs, so without adjustments, a VXLAN packet carrying a full-sized VLAN frame would exceed that. Even on modern networks that run jumbo frames (~9,000 bytes), an inner frame larger than roughly 8,950 bytes will exceed the jumbo MTU once encapsulated, which can cause problems such as control-plane failures (for example, BGP session teardown) or fragmentation of data packets leading to out-of-order delivery.

Solution: Either raise the MTU on the underlay network or enforce a lower MTU on the overlay. Network architects generally prefer to increase the IP MTU of the core so the overlay can carry standard 1500-byte Ethernet frames unfragmented. For example, one vendor’s guide recommends configuring at least a 1550-byte MTU on all network segments to account for VXLAN’s ~50-byte overhead. In enterprise environments, it’s common to use “baby jumbo” frames (e.g. 1600 bytes) or full jumbo (9000 bytes) in the datacenter/WAN to accommodate various tunneling overheads. If increasing the underlay MTU is not possible (say, over an ISP that only supports 1500), then the VLAN MTU on the overlay should be reduced – for instance, set the VLAN interface MTU to 1450 bytes, so that even with the 50-byte VXLAN overhead the outer packet remains 1500 bytes. This prevents any IP fragmentation.

Why Fragmentation is Undesirable: VXLAN itself doesn’t include any fragmentation mechanism; it relies on the underlay IP to fragment if needed. But IP fragmentation can harm performance, and some devices or drop policies might simply discard oversized VXLAN packets instead of fragmenting them. In fact, certain implementations don’t support VXLAN fragmentation or Path MTU Discovery on tunnels. The safe approach is to ensure no encapsulated packet ever exceeds the physical MTU. That means planning your MTUs end-to-end: make the core links slightly larger than the largest expected overlay packet.

Diagram: VXLAN Encapsulation and MTU Layering – The figure below illustrates the components of a VXLAN-encapsulated frame and how they contribute to packet size. The original Ethernet frame (yellow) with a VLAN tag is wrapped with a new outer Ethernet, IP, UDP, and VXLAN header. The extra headers add ~50 bytes. If the inner (yellow) frame was, say, 1500 bytes of payload plus 18 bytes of Ethernet overhead, the outer packet becomes ~1568 bytes (including new headers and FCS). In practice the old FCS is replaced by a new one, so the net growth is ~50 bytes. The key takeaway: the IP transport must handle the total size.

Figure 2: Layered view of a VXLAN-encapsulated packet (not to scale). The original Ethernet frame with VLAN tag (yellow) is encapsulated by outer headers (blue/green/red/gray), resulting in ~50 bytes of overhead for IPv4. The outer packet must fit within the WAN MTU (e.g. 1518 bytes if the inner frame is 1468 bytes) to avoid fragmentation.
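The MTU planning itself is simple arithmetic. The small Python helper below (a sketch, using the approximate ~50-byte IPv4 and ~70-byte IPv6 overhead figures from above) shows the largest overlay MTU a given underlay can carry without fragmentation:

    VXLAN_OVERHEAD = {"ipv4": 50, "ipv6": 70}   # approximate encapsulation overhead in bytes

    def max_overlay_mtu(underlay_mtu: int, underlay_ip: str = "ipv4") -> int:
        """Largest inner (VLAN/SVI) MTU that fits in the underlay without fragmentation."""
        return underlay_mtu - VXLAN_OVERHEAD[underlay_ip]

    print(max_overlay_mtu(1500))           # 1450 -> set the overlay VLAN MTU to ~1450
    print(max_overlay_mtu(1600))           # 1550 -> a "baby jumbo" underlay carries full 1500-byte frames
    print(max_overlay_mtu(9000))           # 8950 -> keep jumbo inner frames at or below this size
    print(max_overlay_mtu(1500, "ipv6"))   # 1430 -> an IPv6 underlay needs a little more headroom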
In summary, ensure the IP underlay’s MTU is configured to accommodate the VXLAN overhead. If using standard 1500-byte MTUs on the WAN, set your overlay interfaces (VLAN SVIs or bridge MTUs) to around 1450 bytes. Where possible, raising the WAN MTU to 1600 bytes or using jumbo frames throughout is the best practice, as it provides ample headroom. Always test your end-to-end path with ping sweeps (e.g. using the DF bit and varying sizes) to verify that the encapsulated packets aren’t being dropped due to MTU limits.

Neighbor Failure Detection with BFD

One challenge with overlays like VXLAN is that the logical link lacks immediate visibility into physical link status. If one end of the VXLAN tunnel goes down or the path fails, the other end’s VXLAN interface may remain “up” (since its own underlay interface is still up), potentially blackholing traffic until higher-level protocols notice. VXLAN itself doesn’t send continuous “link alive” messages to check the remote VTEP’s reachability. To address this, network engineers deploy Bidirectional Forwarding Detection (BFD) on VXLAN endpoints. BFD is a lightweight protocol specifically designed for rapid failure detection, independent of media or routing protocol. It works by having two endpoints periodically send very fast, small hello packets to each other (often every 50 ms or less). If a few consecutive hellos are missed, BFD declares the peer down – often within less than a second, versus several seconds (or tens of seconds) with conventional detection.

Applying BFD to VXLAN: Many router and switch vendors support running BFD over a VXLAN tunnel or on the VTEPs’ loopback adjacencies. When enabled, the two VTEPs will continuously ping each other at the configured interval. If the VXLAN tunnel fails (e.g. one site loses connectivity), BFD on the surviving side will quickly detect the loss of response. This can then trigger corrective actions: for instance, BFD can generate logs for the logical interface or notify the routing protocol to withdraw routes via that tunnel. In designs with redundant tunnels or redundant VTEPs, BFD helps achieve sub-second failover – traffic can switch to a backup VXLAN tunnel almost immediately upon a primary failure. Even in a single-tunnel scenario, BFD gives an early alert to the network operator or applications that the link is down, rather than quietly dropping packets.

Example: If Site A and Site B have two VXLAN tunnels (primary and backup) connecting them, running BFD on each tunnel interface means that if the primary’s path goes down, BFD at Site A and B will detect it within milliseconds and inform the routing control plane. The network can then shift traffic to the backup tunnel right away. Without BFD, the network might have to wait for a timeout (e.g. the OSPF dead interval or even ARP timeouts) to realize the primary tunnel is dead, causing a noticeable outage.

BFD is protocol-agnostic – it can integrate with any routing protocol. For VXLAN, it’s purely a monitoring mechanism: lightweight and with minimal overhead on the tunnel. Its messages are small UDP packets (often on ports 3784/3785) that can be sourced from the VTEP’s IP. The frequency is configurable based on how fast you need detection versus how much overhead you can afford; common timers are 300 ms with a 3x multiplier (detect in ~1 s) for moderate speeds, or even 50 ms with a 3x multiplier (150 ms detection) for high-speed failover requirements.
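Detection time is simply the transmit interval multiplied by the detect multiplier; a quick sketch of the timer combinations mentioned above:

    def bfd_detection_time_ms(tx_interval_ms: int, multiplier: int) -> int:
        """Worst-case detection time: the peer is declared down after
        'multiplier' consecutive hellos go unanswered."""
        return tx_interval_ms * multiplier

    for interval, mult in [(300, 3), (50, 3)]:
        print(f"{interval} ms x {mult} -> failure detected in ~{bfd_detection_time_ms(interval, mult)} ms")
    # 300 ms x 3 -> ~900 ms (sub-second); 50 ms x 3 -> ~150 ms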
Bottom line: Implementing BFD dramatically improves the reliability of a VXLAN-based L2 extension. Since VXLAN tunnels don’t automatically signal when a neighbor is unreachable, BFD acts as the heartbeat. Many platforms even allow BFD to directly influence interface state (for example, the VXLAN interface can be configured to go down when BFD fails) so that any higher-level protocols (like VRRP, dynamic routing, etc.) immediately react to the loss. This prevents lengthy outages and ensures the overlay network remains robust even over a complex WAN.

Conclusion

Deploying a Layer-2 VXLAN overlay across a Layer-3 WAN unlocks powerful capabilities: you can keep using familiar VLAN-based segmentation across sites while taking advantage of an IP network’s scalability and resilience. It’s a vendor-neutral solution widely supported in modern networking gear. By preserving VLAN tags over the tunnel, VXLAN makes it possible to stretch subnets and broadcast domains to remote locations for workloads that require Layer-2 adjacency. With the huge VNI address space, segmentation can scale for large enterprises or cloud providers well beyond traditional VLAN limits.

However, to realize these benefits successfully, careful attention must be paid to MTU and link monitoring. Always accommodate the ~50-byte VXLAN overhead by configuring proper MTUs (or adjusting the overlay’s MTU) – this prevents fragmentation and packet loss that can be very hard to troubleshoot after deployment. And since a VXLAN tunnel’s health isn’t apparent to switches and hosts by default, use tools like BFD to add fast failure detection, thereby avoiding black holes and improving convergence times. In doing so, you ensure that your stretched network is not only functional but also resilient and performant.

By following these guidelines – leveraging VXLAN for flexible L2 overlays, minding the MTU, and bolstering with BFD – network engineers can build a robust, wide-area Layer-2 extension that behaves nearly indistinguishably from a local LAN, yet rides on the efficiency and reliability of a Layer-3 IP backbone. Enjoy the best of both worlds: VLANs without borders, and an IP network without unnecessary constraints.

References: VXLAN technical overviews and best practices from vendor documentation and industry sources were used to ensure accuracy in the explanations above. This keeps the blog grounded in real-world, proven knowledge while remaining vendor-neutral and applicable to a broad audience of cloud and network professionals.

Cut the Noise & Cost with Container Network Metrics Filtering in ACNS for AKS
We’re excited to announce that Container Network Metrics Filtering in Azure Container Networking Services (ACNS) for Azure Kubernetes Service (AKS) is now in Public Preview! This capability transforms how you manage network observability in Kubernetes clusters by giving you control over what metrics matter most.

Why Excessive Metrics Are a Problem (And How We’re Fixing It)

In today’s large-scale, microservices-driven environments, teams often face metrics bloat, collecting far more data than they need. The result?

- High Storage & Ingestion Costs: Paying for data you’ll never use.
- Cluttered Dashboards: Hunting for critical latency spikes in a sea of irrelevant pod restarts.
- Operational Overhead: Slower queries, higher maintenance, and fatigue.

Our new filtering capability solves this by letting you define precise filters at the pod level using standard Kubernetes Custom Resources. You collect only what matters, before it ever reaches your monitoring stack.

Key Benefits: Signal Over Noise

Benefit                    Your Gain
Fine-Grained Control       Filter by namespace or pod label. Target critical services and ignore noise.
Cost Optimization          Reduce ingestion costs for Prometheus, Grafana, and other tools.
Improved Observability     Cleaner dashboards and faster troubleshooting with relevant metrics only.
Dynamic & Zero-Downtime    Apply or update filters without restarting Cilium agents or Prometheus.

How It Works: Filtering at the Source

Unlike traditional sampling or post-processing, filtering happens at the Cilium agent level, inside the kernel’s data plane. You define filters using the ContainerNetworkMetric Custom Resource to include or exclude metrics such as:

- DNS lookups
- TCP connection metrics
- Flow metrics
- Drop (error) metrics

This reduces data volume before metrics leave the host, ensuring your observability tools receive only curated, high-value data.

Example: Filtering Flow Metrics to Reduce Noise

Here’s a sample ContainerNetworkMetric CRD that filters only dropped flows from the traffic/http namespace and excludes flows from traffic/fortio pods:

apiVersion: acn.azure.com/v1alpha1
kind: ContainerNetworkMetric
metadata:
  name: container-network-metric
spec:
  filters:
    - metric: flow
      includeFilters:
        # Include only DROPPED flows from traffic namespace
        verdict:
          - "dropped"
        from:
          namespacedPod:
            - "traffic/http"
      excludeFilters:
        # Exclude traffic/fortio flows to reduce noise
        from:
          namespacedPod:
            - "traffic/fortio"

Figure: flow metrics dashboards before filtering and after applying the filters.

Getting Started Today

Ready to simplify your network observability?

1. Enable ACNS: Make sure ACNS is enabled on your AKS cluster.
2. Define Your Filter: Apply the ContainerNetworkMetric CRD with your include/exclude rules.
3. Validate: Check your settings via the ConfigMap and Cilium agent logs.
4. See the Impact: Watch ingestion costs drop and dashboards become clearer!

👉 Learn more in the Metrics Filtering Guide. Try the Public Preview today and take control of your container network metrics.

Layer 7 Network Policies for AKS: Now Generally Available for Production Security and Observability!
We are thrilled to announce that Layer 7 (L7) Network Policies for Azure Kubernetes Service (AKS), powered by Cilium and Advanced Container Networking Services (ACNS), has reached General Availability (GA)! The journey from public preview to GA signifies a critical step: L7 Network Policies are now fully supported, highly optimized, and ready for your most demanding, mission-critical production workloads.

A Practical Example: Securing a Multi-Tier Retail Application

Let’s walk through a common production scenario. Imagine a standard retail application running on AKS with three core microservices:

- frontend-app: Handles user traffic and displays product information.
- inventory-api: A backend service that provides product stock levels. It should be read-only for the frontend.
- payment-gateway: A highly sensitive service that processes transactions. It should only accept POST requests from the frontend to a specific endpoint.

The Security Challenge: A traditional L4 policy would allow the frontend-app to talk to the inventory-api on its port, but it couldn’t prevent a compromised frontend pod from trying to exploit a potential vulnerability by sending a DELETE or POST request to modify inventory data.

The L7 Policy Solution: With GA L7 policies, you can enforce the principle of least privilege at the application layer. Here’s how you would protect the inventory-api:

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: protect-inventory-api
spec:
  endpointSelector:
    matchLabels:
      app: inventory-api
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend-app
      toPorts:
        - ports:
            - port: "8080"        # The application port
              protocol: TCP
          rules:
            http:
              - method: "GET"               # ONLY allow the GET method
                path: "/api/inventory/.*"   # For paths under /api/inventory/

The Outcome:

- Allowed: A legitimate request from the frontend-app (GET /api/inventory/item123) is seamlessly forwarded.
- Blocked: If the frontend-app is compromised, any malicious request (like DELETE /api/inventory/item123) originating from it is blocked at the network layer. This Zero Trust approach protects the inventory-api service from the threat, regardless of the security state of the source service.

This same principle can be applied to protect the payment-gateway, ensuring it only accepts POST requests to the /process-payment endpoint, and nothing else.

Beyond L7: Supporting Zero Trust with Enhanced Security

In addition to L7 application-level policies, we support Layer 3/4 network security and advanced egress controls such as Fully Qualified Domain Name (FQDN) filtering to help ensure Zero Trust. This comprehensive approach allows administrators to:

- Restrict Outbound Connections (L3/L4 & FQDN): Implement strict egress control by ensuring that workloads can only communicate with approved external services. FQDN filtering is crucial here, allowing pods to connect exclusively to trusted external domains (e.g., www.trusted-partner.com), significantly reducing the risk of data exfiltration and maintaining compliance. To learn more, visit the FQDN Filtering Overview.
- Enforce Uniform Policy Across the Cluster (CCNP): Extend protections beyond individual namespaces. By defining security measures as a Cilium Clusterwide Network Policy (CCNP), now also generally available, administrators can ensure uniform policy enforcement across multiple namespaces or the entire Kubernetes cluster, simplifying management and strengthening the overall security posture of all workloads. To learn more about CCNP, see the Cilium Clusterwide Network Policy documentation.
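Returning to the retail example above, a quick way to sanity-check enforcement is a small probe run from inside a frontend-app pod. This is only a sketch: the service name, port, and paths are the hypothetical ones used in the example, and a denied request may be answered with an error status by the in-line proxy or receive no response at all, depending on how the policy is enforced:

    import requests

    BASE = "http://inventory-api:8080"   # hypothetical in-cluster service name from the example

    def probe(method: str, path: str) -> None:
        """Issue one HTTP request and report how the policy treated it."""
        try:
            resp = requests.request(method, BASE + path, timeout=3)
            print(f"{method} {path} -> HTTP {resp.status_code}")
        except requests.exceptions.RequestException as exc:
            print(f"{method} {path} -> no response ({type(exc).__name__})")

    probe("GET", "/api/inventory/item123")      # matches the allow rule -> expected to succeed
    probe("DELETE", "/api/inventory/item123")   # not in the allow list -> expected to be denied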
Example: L4 Egress Policy with FQDN Filtering

This policy ensures that all pods across the cluster (CiliumClusterwideNetworkPolicy) are only allowed to establish outbound connections to the domain *.example.com on the standard web ports (80 and 443).

apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: allow-egress-to-example-com
spec:
  endpointSelector: {}   # Applies to all pods in the cluster
  egress:
    - toFQDNs:
        - matchPattern: "*.example.com"   # Allows access to any subdomain of example.com
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP
            - port: "80"
              protocol: TCP

Operational Excellence: Observability You Can Trust

A secure system must be observable. With GA, the integrated visibility of your L7 traffic is production ready. In our example above, the blocked DELETE request isn’t silent. It is immediately visible in your Azure Managed Grafana dashboards as a “Dropped” flow, attributed directly to the protect-inventory-api policy. This makes security incidents auditable and easy to diagnose, enabling operations teams to detect misconfigurations or threats in real time. Below is a sample dashboard layout screenshot.

Next Steps: Upgrade and Secure Your Production!

We encourage you to enable L7 Network Policies on your AKS clusters and level up your network security controls for containerized workloads. We value your feedback as we continue to develop and improve this feature. Please refer to the Layer 7 Policy Overview for more information and visit How to Apply L7 Policy for an example scenario.

Introducing eBPF Host Routing: High performance AI networking with Azure CNI powered by Cilium
AI-driven applications demand low-latency workloads for optimal user experience. To meet this need, services are moving to containerized environments, with Kubernetes as the standard. Kubernetes networking relies on the Container Network Interface (CNI) for pod connectivity and routing. Traditional CNI implementations use iptables for packet processing, adding latency and reducing throughput. Azure CNI powered by Cilium natively integrates the Azure Kubernetes Service (AKS) data plane with Azure CNI networking modes for superior performance, hardware offload support, and enterprise-grade reliability. Azure CNI powered by Cilium delivers up to 30% higher throughput in both benchmark and real-world customer tests compared to a bring-your-own Cilium setup on AKS.

The next leap forward: AKS data-plane performance can now be optimized even further with eBPF host routing, an open-source Cilium CNI capability that accelerates packet forwarding by executing routing logic directly in eBPF. As shown in the figure, this architecture eliminates reliance on iptables and connection tracking (conntrack) within the host network namespace. As a result, packet-processing efficiency improves significantly, CPU overhead is reduced, and performance is optimized for modern workloads.

Figure: Comparison of host routing using the Linux kernel stack vs eBPF.

Azure CNI powered by Cilium is battle-tested for mission-critical workloads, backed by Microsoft support, and enriched with Advanced Container Networking Services features for security, observability, and accelerated performance. eBPF host routing is now included as part of the Advanced Container Networking Services suite, delivering network performance acceleration. In this blog, we highlight the performance benefits of eBPF host routing, explain how to enable it in an AKS cluster, and provide a deep dive into its implementation on Azure. We start by examining AKS cluster performance before and after enabling eBPF host routing.

Performance comparison

Our comparative benchmarks measure the difference made by enabling eBPF host routing in Azure CNI Powered by Cilium. For these measurements, we use AKS clusters on Kubernetes version 1.33, with 16-core host nodes running Ubuntu 24.04. We are interested in throughput and latency numbers for pod-to-pod traffic in these clusters. For throughput measurements, we deploy netperf client and server pods and measure TCP_STREAM throughput at varying message sizes in tests running 20 seconds each. The wide range of message sizes is meant to capture the variety of workloads running on AKS clusters, ranging from AI training and inference to messaging systems and media streaming. For latency, we run TCP_RR tests, measuring latency at various percentiles, as well as transaction rates.

The following figure compares pods on the same node; eBPF-based routing results in a dramatic improvement in throughput (~30%). This is because, on the same node, throughput is not constrained by factors such as the VM NIC limits and is almost entirely determined by host routing performance. For pod-to-pod throughput across different nodes in the cluster, eBPF host routing also results in better throughput, and the difference is more pronounced with smaller message sizes (up to 3x). This is because, with smaller messages, the per-message overhead incurred in the host network stack has a bigger impact on performance.
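For reference, the TCP_STREAM measurements described above can be driven with a small wrapper like the one below. This is only a sketch: it assumes netperf is installed in the client pod, a netserver instance is reachable at SERVER_IP (a placeholder), and it simply reads the throughput figure from the last line of netperf's default report:

    import subprocess

    SERVER_IP = "10.244.1.23"               # placeholder: IP of the netperf server pod
    MESSAGE_SIZES = [1024, 16384, 262144]   # bytes; varied to mimic different workload profiles

    def tcp_stream_mbps(msg_size: int, duration_s: int = 20) -> str:
        """Run one 20-second TCP_STREAM test and return the reported throughput."""
        cmd = ["netperf", "-H", SERVER_IP, "-t", "TCP_STREAM", "-l", str(duration_s),
               "--", "-m", str(msg_size)]
        out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
        # The default report ends with a data line whose last column is throughput in 10^6 bits/sec
        return out.strip().splitlines()[-1].split()[-1]

    for size in MESSAGE_SIZES:
        print(f"TCP_STREAM, {size}-byte messages: {tcp_stream_mbps(size)} Mbit/s")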
Next, we compare latency for pod-to-pod traffic. We limit this benchmark to intra-node traffic, because cross-node traffic latency is determined by factors other than the routing latency incurred in the hosts. eBPF host routing results in reduced latency compared to the non-accelerated configuration at all measured percentiles. We have also measured the transaction rate between client and server pods, with and without eBPF host routing. This benchmark is an alternative measurement of latency, because a transaction is essentially a small TCP request/response pair. We observe that eBPF host routing improves transactions per second by around 27% compared to legacy host routing.

Transactions/second (same node)

Azure CNI configuration     Transactions/second
eBPF host routing           20396.9
Traditional host routing    16003.7

Enabling eBPF routing through Advanced Container Networking Services

eBPF host routing is disabled by default in Advanced Container Networking Services because bypassing iptables in the host network namespace can ignore custom user rules and host-level security policies. This may lead to visible failures such as dropped traffic or broken network policies, as well as silent issues like unintended access or missed audit logs. To mitigate these risks, eBPF host routing is offered as an opt-in feature, enabled through Advanced Container Networking Services on Azure CNI powered by Cilium.

The Advanced Container Networking Services advantage: built-in safeguards

Enabling eBPF host routing in ACNS enhances the open-source offering with strong built-in safeguards. Before activation, ACNS validates existing iptables rules in the host network namespace and blocks enablement if user-defined rules are detected. Once enabled, kernel-level protections prevent new iptables rules and generate Kubernetes events for visibility. These measures allow customers to benefit from eBPF’s performance gains while maintaining security and reliability. Thanks to these additional safeguards, eBPF host routing in Advanced Container Networking Services is a safer and more robust option for customers who wish to obtain the best possible networking performance on their Kubernetes infrastructure.

How to enable eBPF Host Routing with ACNS

Visit the documentation on how to enable eBPF Host Routing for new and existing Azure CNI Powered by Cilium clusters. Verify the network profile with the new performance `accelerationMode` field set to `BpfVeth`:

"networkProfile": {
  "advancedNetworking": {
    "enabled": true,
    "performance": {
      "accelerationMode": "BpfVeth"
    },
    …

For more information on Advanced Container Networking Services and ACNS performance, please visit https://aka.ms/acnsperformance.

Resources

For more info about Advanced Container Networking Services, please visit (Container Network Security with Advanced Container Networking Services (ACNS) - Azure Kubernetes Service | Microsoft Learn). For more info about Azure CNI Powered by Cilium, please visit (Configure Azure CNI Powered by Cilium in Azure Kubernetes Service (AKS) - Azure Kubernetes Service | Microsoft Learn).

Deploying Third-Party Firewalls in Azure Landing Zones: Design, Configuration, and Best Practices
As enterprises adopt Microsoft Azure for large-scale workloads, securing network traffic becomes a critical part of the platform foundation. Azure’s Well-Architected Framework provides the blueprint for enterprise-scale Landing Zone design and deployments, and while Azure Firewall is a built-in PaaS option, some organizations prefer third-party firewall appliances for familiarity, feature depth, and vendor alignment. This blog explains the basic design patterns, key configurations, and best practices when deploying third-party firewalls (Palo Alto, Fortinet, Check Point, etc.) as part of an Azure Landing Zone.

1. Landing Zone Architecture and Firewall Role

The Azure Landing Zone is Microsoft’s recommended enterprise-scale architecture for adopting cloud at scale. It provides a standardized, modular design that organizations can use to deploy and govern workloads consistently across subscriptions and regions. At its core, the Landing Zone follows a hub-and-spoke topology:

- Hub (Connectivity Subscription): The central place for shared services like DNS, private endpoints, VPN/ExpressRoute gateways, Azure Firewall (or third-party firewall appliances), Bastion, and monitoring agents. It provides consistent security controls and connectivity for all workloads. Firewalls are deployed here to act as the traffic inspection and enforcement point.
- Spokes (Workload Subscriptions): Application workloads (e.g., SAP, web apps, data platforms) are placed in spoke VNets. Additional specialized spokes may exist for Identity, Shared Services, Security, or Management. These are isolated for governance and compliance, but all connectivity back to other workloads or on-premises routes through the hub.

Traffic Flows Through Firewalls

- North-South Traffic: Inbound connections from the Internet (e.g., customer access to applications), outbound connections from Azure workloads to Internet services, and hybrid connectivity to on-premises datacenters or other clouds. This traffic is routed through the external firewall set for inspection and policy enforcement.
- East-West Traffic: Lateral traffic between spokes (e.g., Application VNet to Database VNet) and communication across environments like Dev → Test → Prod (if allowed). This traffic is routed through an internal firewall set to apply segmentation and zero-trust principles and to prevent lateral movement of threats.

Why Firewalls Matter in the Landing Zone

While Azure provides NSGs (Network Security Groups) and route tables for basic packet filtering and routing, these are not sufficient for advanced security scenarios. Firewalls add:

- Deep packet inspection (DPI) and application-level filtering.
- Intrusion Detection/Prevention (IDS/IPS) capabilities.
- Centralized policy management across multiple spokes.
- Segmentation of workloads to reduce the blast radius of potential attacks.
- Consistent enforcement of enterprise security baselines across hybrid and multi-cloud.

Organizations May Choose

Depending on security needs, cost tolerance, and operational complexity, organizations typically adopt one of two models for third-party firewalls:

- Two sets of firewalls: One set dedicated to north-south traffic (external to Azure) and another set for east-west traffic (between VNets and spokes). This provides the highest security granularity, but comes with higher cost and management overhead.
- A single set of firewalls: A consolidated deployment where the same firewall cluster handles both east-west and north-south traffic. This is simpler and more cost-effective, but may introduce complexity in routing and policy segregation.
This design choice is usually made during Landing Zone design, balancing security requirements, budget, and operational maturity.

2. Why Choose Third-Party Firewalls Over Azure Firewall?

While Azure Firewall provides simplicity as a managed service, customers often choose third-party solutions due to:

- Advanced features – Deep packet inspection, IDS/IPS, SSL decryption, threat feeds.
- Vendor familiarity – Network teams trained on Palo Alto, Fortinet, or Check Point.
- Existing contracts – Enterprise license agreements and support channels.
- Hybrid alignment – The same vendor firewalls across on-premises and Azure.

Azure Firewall is a fully managed PaaS service, ideal for customers who want a simple, cloud-native solution without worrying about underlying infrastructure. However, many enterprises continue to choose third-party firewall appliances (Palo Alto, Fortinet, Check Point, etc.) when implementing their Landing Zones. The decision usually depends on capabilities, familiarity, and enterprise strategy.

Key Reasons to Choose Third-Party Firewalls

- Feature Depth and Advanced Security: Third-party vendors offer advanced capabilities such as deep packet inspection (DPI) for application-aware filtering, intrusion detection and prevention (IDS/IPS), SSL/TLS decryption and inspection, and advanced threat feeds, malware protection, sandboxing, and botnet detection. While Azure Firewall continues to evolve, these vendors have a longer track record in advanced threat protection.
- Operational Familiarity and Skills: Network and security teams often have years of experience managing Palo Alto, Fortinet, or Check Point appliances on-premises. Adopting the same technology in Azure reduces the learning curve and ensures faster troubleshooting, smoother operations, and reuse of existing playbooks.
- Integration with Existing Security Ecosystem: Many organizations already use vendor-specific management platforms (e.g., Panorama for Palo Alto, FortiManager for Fortinet, or SmartConsole for Check Point). Extending the same tools into Azure allows centralized management of policies across on-premises and cloud, ensuring consistent enforcement.
- Compliance and Regulatory Requirements: Certain industries (finance, healthcare, government) require proven, certified firewall vendors for security compliance. Customers may already have third-party solutions validated by auditors and prefer extending those to Azure for consistency.
- Hybrid and Multi-Cloud Alignment: Many enterprises run a hybrid model, with workloads split across on-premises, Azure, AWS, or GCP. Third-party firewalls provide a common security layer across environments, simplifying multi-cloud operations and governance.
- Customization and Flexibility: Unlike Azure Firewall, which is a managed service with limited backend visibility, third-party firewalls give admins full control over operating systems, patching, advanced routing, and custom integrations. This flexibility can be essential when supporting complex or non-standard workloads.
- Licensing Leverage (BYOL): Enterprises with existing enterprise agreements or volume discounts can bring their own firewall licenses (BYOL) to Azure. This often reduces cost compared to pay-as-you-go Azure Firewall pricing.

When Azure Firewall Might Still Be Enough

- Organizations with simple security needs (basic north-south inspection, FQDN filtering).
- Cloud-first teams that prefer managed services with minimal infrastructure overhead.
- Customers who want to avoid the manual scaling and VM patching that comes with IaaS appliances.
In practice, many large organizations use a hybrid approach: Azure Firewall for lightweight scenarios or specific environments, and third-party firewalls for enterprise workloads that require advanced inspection, vendor alignment, and compliance certifications.

3. Deployment Models in Azure

Third-party firewalls in Azure are primarily IaaS-based appliances deployed as virtual machines (VMs). Leading vendors publish Azure Marketplace images and ARM/Bicep templates, enabling rapid, repeatable deployments across multiple environments. These firewalls allow organizations to enforce advanced network security policies, perform deep packet inspection, and integrate with Azure-native services such as Virtual Network (VNet) peering, Azure Monitor, and Azure Sentinel.

Note: Some vendors now also release PaaS versions of their firewalls, offering managed firewall services with simplified operations. However, for the purposes of this blog, we will focus mainly on IaaS-based firewall deployments.

Common Deployment Modes

Active-Active

- Description: In this mode, multiple firewall VMs operate simultaneously, sharing the traffic load. An Azure Load Balancer distributes inbound and outbound traffic across all active firewall instances.
- Use Cases: Ideal for environments requiring high throughput, resilience, and near-zero downtime, such as enterprise data centers, multi-region deployments, or mission-critical applications.
- Considerations: Requires careful route and policy synchronization between firewall instances to ensure consistent traffic handling. Typically involves BGP or user-defined routes (UDRs) for optimal traffic steering. Scaling is easier: additional firewall VMs can be added behind the load balancer to handle traffic spikes.

Active-Passive

- Description: One firewall VM handles all traffic (active), while the secondary VM (passive) stands by for failover. When the active node fails, Azure service principals manage IP reassignment and traffic rerouting.
- Use Cases: Suitable for environments where simpler management and lower operational complexity are preferred over continuous load balancing.
- Considerations: Failover may result in brief downtime, typically measured in seconds to a few minutes. Synchronization between the active and passive nodes ensures firewall policies, sessions, and configurations are mirrored. Recommended for smaller deployments or those with predictable traffic patterns.

Network Interfaces (NICs)

Third-party firewall VMs often include multiple NICs, each dedicated to a specific type of traffic:

- Untrust/Public NIC: Connects to the Internet or external networks. Handles inbound/outbound public traffic and enforces perimeter security policies.
- Trust/Internal NIC: Connects to private VNets or subnets. Manages internal traffic between application tiers and enforces internal segmentation.
- Management NIC: Dedicated to firewall management traffic. Keeps administration separate from data-plane traffic, improving security and reducing performance interference.
- HA NIC (Active-Passive setups): Facilitates synchronization between active and passive firewall nodes, ensuring session and configuration state is maintained across failovers.

Figure: NICs of Palo Alto external firewalls and FortiGate internal firewalls in a two-firewall-set scenario.
4. Key Configuration Considerations

When deploying third-party firewalls in Azure, several design and configuration elements play a critical role in ensuring security, performance, and high availability. These considerations should be carefully aligned with organizational security policies, compliance requirements, and operational practices.

Routing

- User-Defined Routes (UDRs): Define UDRs in spoke Virtual Networks to ensure all outbound traffic flows through the firewall, enforcing inspection and security policies before reaching the Internet or other Virtual Networks. Centralized routing helps standardize controls across multiple application Virtual Networks. Depending on the architecture’s traffic-flow design, use the appropriate load balancer IP as the next hop in the UDRs of spoke Virtual Networks (a scripted example follows at the end of this section).
- Symmetric Routing: Ensure traffic follows symmetric paths (i.e., outbound and inbound flows pass through the same firewall instance). Avoid asymmetric routing, which can cause stateful firewalls to drop return traffic. Leverage BGP with Azure Route Server, where supported, to simplify route propagation across hub-and-spoke topologies.

Figure: Azure UDR directing all traffic from a spoke VNet to the firewall IP address.

Policies

- NAT Rules: Configure DNAT (Destination NAT) rules to publish applications securely to the Internet. Use SNAT (Source NAT) to mask private IPs when workloads access external resources.
- Security Rules: Define granular allow/deny rules for both north-south traffic (Internet to VNet) and east-west traffic (between Virtual Networks or subnets). Ensure least privilege by only allowing required ports, protocols, and destinations.
- Segmentation: Apply firewall policies to separate workloads, environments, and tenants (e.g., Production vs. Development). Enforce compliance by isolating workloads subject to regulatory standards (PCI-DSS, HIPAA, GDPR).
- Application-Aware Policies: Many vendors support Layer 7 inspection, enabling controls based on applications, users, and content (not just IP/port). Integrate with identity providers (Azure AD, LDAP, etc.) for user-based firewall rules.

Figure: Example configuration of NAT rules on a Palo Alto external firewall.

Load Balancers

- Internal Load Balancer (ILB): Use ILBs for east-west traffic inspection between Virtual Networks or subnets. This ensures that traffic between applications always passes through the firewall, regardless of origin.
- External Load Balancer (ELB): Use ELBs for north-south traffic, handling Internet ingress and egress. Required in Active-Active firewall clusters to distribute traffic evenly across firewall nodes.
- Other Configurations: Configure health probes for firewall instances to ensure faulty nodes are automatically bypassed. Validate the Floating IP configuration on load-balancing rules according to the respective vendor recommendations.

Identity Integration

- Azure Service Principals: In Active-Passive deployments, configure service principals to enable automated IP reassignment during failover. This ensures continuous service availability without manual intervention.
- Role-Based Access Control (RBAC): Integrate firewall management with Azure RBAC to control who can deploy, manage, or modify firewall configurations.
- SIEM Integration: Stream logs to Azure Monitor, Sentinel, or third-party SIEMs for auditing, monitoring, and incident response.
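As an illustration of the Routing guidance above, the following Python sketch uses the azure-mgmt-network SDK to create a spoke route table whose default route points at the firewall's internal load balancer. The subscription, resource group, region, route-table name, and ILB IP are placeholders, and the snippet assumes the identity running it has permission to manage the resource group:

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.network import NetworkManagementClient
    from azure.mgmt.network.models import Route, RouteTable

    SUBSCRIPTION_ID = "<subscription-id>"   # placeholder - adjust to your environment
    RESOURCE_GROUP = "rg-spoke-app1"
    LOCATION = "westeurope"
    FIREWALL_ILB_IP = "10.10.1.4"           # frontend IP of the firewall's internal load balancer

    client = NetworkManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

    route_table = client.route_tables.begin_create_or_update(
        RESOURCE_GROUP,
        "rt-spoke-app1",
        RouteTable(
            location=LOCATION,
            # Keep gateway-learned routes out of the spoke so traffic stays behind the firewall
            disable_bgp_route_propagation=True,
            routes=[
                Route(
                    name="default-via-firewall",
                    address_prefix="0.0.0.0/0",
                    next_hop_type="VirtualAppliance",
                    next_hop_ip_address=FIREWALL_ILB_IP,
                )
            ],
        ),
    ).result()

    print(f"Route table {route_table.name} created with {len(route_table.routes)} route(s)")

The same pattern can be repeated per spoke (or expressed in ARM/Bicep) and the resulting route table associated with the spoke subnets.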
Licensing

- Pay-As-You-Go (PAYG): Licenses are bundled into the VM cost when deploying from the Azure Marketplace. Best for short-term projects, PoCs, or variable workloads.
- Bring Your Own License (BYOL): Enterprises can apply existing contracts and licenses with vendors to Azure deployments. Often more cost-effective for large-scale, long-term deployments.
- Hybrid Licensing Models: Some vendors support license mobility from on-premises to Azure, reducing duplication of costs.

5. Common Challenges

Third-party firewalls in Azure provide strong security controls, but organizations often face practical challenges in day-to-day operations:

- Misconfiguration: Incorrect UDRs, route tables, or NAT rules can cause dropped traffic or bypassed inspection. Asymmetric routing is a frequent issue in hub-and-spoke topologies, leading to session drops in stateful firewalls.
- Performance Bottlenecks: Firewall throughput depends on the VM SKU (CPU, memory, NIC limits). Under-sizing causes latency and packet loss, while over-sizing adds unnecessary cost. Continuous monitoring and vendor sizing guides are essential.
- Failover Downtime: Active-Passive models introduce brief service interruptions while IPs and routes are reassigned. Some sessions may be lost even with state sync, making Active-Active more attractive for mission-critical workloads.
- Backup & Recovery: Azure Backup doesn’t support vendor firewall operating systems. Configurations must be exported and stored externally (e.g., storage accounts, repos, or vendor management tools). Without proper backups, recovery from failures or misconfigurations can be slow.
- Azure Platform Limits on Connections: Azure imposes a per-VM cap of 250,000 active connections, regardless of what the firewall vendor appliance supports. This means that even if an appliance is designed for millions of sessions, it will be constrained by Azure’s networking fabric. Hitting this cap can lead to unexplained traffic drops despite available CPU/memory. The workaround is to scale out horizontally (multiple firewall VMs behind a load balancer) and carefully monitor connection distribution.

6. Best Practices for Third-Party Firewall Deployments

To maximize the security, reliability, and performance of third-party firewalls in Azure, organizations should follow these best practices:

- Deploy in Availability Zones: Place firewall instances across different Availability Zones to ensure regional resilience and minimize downtime in case of zone-level failures.
- Prefer Active-Active for Critical Workloads: Where zero downtime is a requirement, use Active-Active clusters behind an Azure Load Balancer. Active-Passive can be simpler but introduces failover delays.
- Use Dedicated Subnets for Interfaces: Separate trust, untrust, HA, and management NICs into their own subnets. This enforces segmentation, simplifies route management, and reduces misconfiguration risk.
- Apply Least Privilege Policies: Always start with a deny-all baseline, then allow only necessary applications, ports, and protocols. Regularly review rules to avoid policy sprawl.
- Standardize Naming & Tagging: Adopt consistent naming conventions and resource tags for firewalls, subnets, route tables, and policies. This aids troubleshooting, automation, and compliance reporting.
- Validate End-to-End Traffic Flows: Test both north-south (Internet ↔ VNet) and east-west (VNet ↔ VNet/subnet) flows after deployment. Use tools like Azure Network Watcher and vendor traffic logs to confirm inspection.
- Plan for Scalability: Monitor throughput, CPU, memory, and session counts to anticipate when scale-out or higher VM SKUs are needed. Some vendors support autoscaling clusters for bursty workloads.
- Maintain Firmware & Threat Signatures: Regularly update the firewall’s software, patches, and threat intelligence feeds to ensure protection against emerging vulnerabilities and attacks. Automate updates where possible.

Conclusion

Third-party firewalls remain a core building block in many enterprise Azure Landing Zones. They provide the deep security controls and operational familiarity enterprises need, while Azure provides the scalable infrastructure to host them. By following the hub-and-spoke architecture, carefully planning deployment models, and enforcing best practices for routing, redundancy, monitoring, and backup, organizations can ensure a secure and reliable network foundation in Azure.

Inter-Hub Connectivity Using Azure Route Server
By Mays_Algebary and shruthi_nair

As your Azure footprint grows with a hub-and-spoke topology, managing User-Defined Routes (UDRs) for inter-hub connectivity can quickly become complex and error-prone. In this article, we’ll explore how Azure Route Server (ARS) can help streamline inter-hub routing by dynamically learning and advertising routes between hubs, reducing manual overhead and improving scalability.

Baseline Architecture

The baseline architecture includes two Hub VNets, each peered with their respective local spoke VNets as well as with the other Hub VNet for inter-hub connectivity. Both hubs are connected to local and remote ExpressRoute circuits in a bowtie configuration to ensure high availability and redundancy, with Weight used to prefer the local ExpressRoute circuit over the remote one. To maintain predictable routing behavior, the VNet-to-VNet configuration on the ExpressRoute Gateway should be disabled.

Note: Adding ARS to an existing hub with a Virtual Network Gateway will cause downtime that is expected to last about 10 minutes.

Scenario 1: ARS and NVA Coexist in the Hub

Option A: Full Traffic Inspection

Figure: ARS and NVA coexist in the hub.

In this scenario, ARS is deployed in each Hub VNet, alongside the Network Virtual Appliances (NVAs). NVA1 in Region1 establishes BGP peering with both the local ARS (ARS1) and the remote ARS (ARS2). Similarly, NVA2 in Region2 peers with both ARS2 (local) and ARS1 (remote). Let’s break down what each BGP peering relationship accomplishes. For clarity, we’ll focus on Region1, though the same logic applies to Region2:

- NVA1 Peering with Local ARS1: Through BGP peering with ARS1, NVA1 dynamically learns the prefixes of Spoke1 and Spoke2 at the OS level, eliminating the need to manually configure these routes. The same applies to NVA2 learning the Spoke3 and Spoke4 prefixes via its BGP peering with ARS2.
- NVA1 Peering with Remote ARS2: When NVA1 peers with ARS2, the Spoke1 and Spoke2 prefixes are propagated to ARS2. ARS2 then injects these prefixes into NVA2 at both the NIC level, with NVA1 as the next hop, and at the OS level. This mechanism removes the need for UDRs on the NVA subnets to enable inter-hub routing. Additionally, ARS2 advertises the Spoke1 and Spoke2 prefixes to both ExpressRoute circuits (EXR2 and, due to the bowtie configuration, EXR1) via GW2, making them reachable from on-premises through either EXR1 or EXR2.

👉Important: To ensure that ARS2 accepts and propagates the Spoke1/Spoke2 prefixes received via NVA1, AS-Override must be enabled. Without AS-Override, BGP loop prevention will block these routes at ARS2: since both ARS1 and ARS2 use the default ASN 65515, ARS2 will consider the routes as already originated locally. The same principle applies in reverse for the Spoke3 and Spoke4 prefixes being advertised from NVA2 to ARS1.

Traffic Flow

Inter-Hub Traffic: Spoke VNets are configured with UDRs that contain only a default route (0.0.0.0/0) pointing to the local NVA as the next hop. Additionally, the “Propagate Gateway Routes” setting should be set to False to ensure all traffic, whether East-West (intra-hub/inter-hub) or North-South (to/from the Internet), is forced through the local NVA for inspection. Local NVAs will have the next hop to the other region’s spokes injected at the NIC level by the local ARS, pointing to the other region’s NVA; for example, NVA2 will have the next hop to Spoke1 and Spoke2 set to NVA1 (10.0.1.4), and vice versa.

Why are UDRs still needed on spokes if ARS handles dynamic routing?
Even with ARS in place, UDRs are required to maintain control of the next hop for traffic inspection. For instance, if Spoke1 and Spoke2 do not have UDRs, they will learn the remote spoke prefixes (e.g., Spoke3/Spoke4) injected via ARS1, which received them from NVA2. This results in Spoke1/Spoke2 attempting to route traffic directly to NVA2, a path that is invalid, since the spokes don’t have a path to NVA2. The UDR ensures traffic correctly routes through NVA1 instead.

On-Premises Traffic: To explain the on-premises traffic flow, we’ll break it down into two directions: Azure to on-premises, and on-premises to Azure.

Azure to On-Premises Traffic Flow: As previously noted, Spokes send all traffic, including traffic to on-premises, via NVA1 due to the default route in the UDR. NVA1 then routes traffic to the local ExpressRoute circuit, using Weight to prefer the local path over the remote. Note: While NVA1 learns on-premises prefixes from both local and remote ARSs at the OS level, this doesn’t affect routing decisions. The actual NIC-level route injection determines the next hop, ensuring traffic is sent via the correct path, even if the OS selects a different “best” route internally. The screenshot below from NVA1 shows four next hops to the on-premises network 10.2.0.0/16. These include the local ARS (ARS1: 10.0.2.5 and 10.0.2.4) and the remote ARS (ARS2: 10.1.2.5 and 10.1.2.4).

On-Premises to Azure Traffic Flow: In a bowtie ExpressRoute configuration, Azure VNet prefixes are advertised to on-premises through both local and remote ExpressRoute circuits. Because of this dual advertisement, the on-premises network must ensure optimal path selection when routing traffic to Azure. From the Azure side, to maintain traffic symmetry, add UDRs at the GatewaySubnet (GW1 and GW2) with specific routes to the local Spoke VNets, using the local NVA as the next hop. This ensures return traffic flows back through the same path it entered.

👉How Does the ExpressRoute Edge Router Select the Optimal Path?

You might ask: if Spoke prefixes are advertised by both GW1 and GW2, how does the ExpressRoute edge router choose the best path? (For example, the diagram below shows EXR1 learning Region1 prefixes from GW1 and GW2.) Here’s how: edge routers (like EXR1) receive the same Spoke prefixes from both gateways. However, these routes have different AS-Path lengths:

- Routes from the local gateway (GW1) have a shorter AS-Path.
- Routes from the remote gateway (GW2) have a longer AS-Path because NVA1’s ASN (e.g., 65001) is prepended twice as part of the AS-Override mechanism.

As a result, the edge router (EXR1) will prefer the local path from GW1, ensuring efficient and predictable routing. For example: EXR1 receives the Spoke1, Spoke2, and Hub1-VNet prefixes from both GW1 and GW2. But because the path via GW1 has a shorter AS-Path, EXR1 will select that as the best route. (Refer to the diagram below for a visual of the AS-Path difference.)
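To make that comparison concrete, here is a tiny, deliberately simplified model of the decision (illustrative AS-paths only; a real BGP best-path algorithm evaluates several attributes before AS-path length):

    # Simplified model: the advertisement with the shortest AS-path wins.
    # 65515 is the ARS/gateway ASN used in this design; 65001 is the NVA ASN,
    # which appears twice on the path learned via the remote region because of AS-Override.
    routes_to_spoke1 = {
        "via GW1 (local)":  [65515],                 # Spoke1 advertised directly by the local gateway
        "via GW2 (remote)": [65515, 65001, 65001],   # re-advertised through the remote hub's NVA
    }

    for name, as_path in routes_to_spoke1.items():
        print(f"{name}: AS-path {as_path} (length {len(as_path)})")

    best = min(routes_to_spoke1, key=lambda name: len(routes_to_spoke1[name]))
    print("Best path selected:", best)   # -> via GW1 (local), the shorter AS-path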
Figure: Final traffic flow (Option A).

Option-A Insights:

- This design simplifies UDR configuration for inter-hub routing, which is especially useful when dealing with non-contiguous prefixes or operating across multiple hubs.
- For simplicity, we used a single NVA in each Hub-VNet while explaining the setup and traffic flow throughout this article. However, a highly available (HA) NVA deployment is recommended. To maintain traffic symmetry in an HA setup, you’ll need to enable the next-hop IP feature when peering with Azure Route Server (ARS).
- When on-premises traffic inspection is required, the UDR setup in the GatewaySubnet becomes more complex as the number of Spokes increases. Additionally, each route table is currently limited to 600 UDR entries.
- As your Azure network scales, keep in mind that Azure Route Server supports a maximum of 8 BGP peers per instance (as of the time of writing). This limit can impact architectures involving multiple NVAs or hubs.

Option B: Bypass On-Premises Inspection

If on-premises traffic inspection is not required, NVAs can advertise a supernet prefix summarizing the local Spoke VNets to the remote ARS. This approach provides granular control over which traffic is routed through the NVA and eliminates the need for BGP peering between the local NVA and the local ARS. All other aspects of the architecture remain the same as described in Option A.

For example, NVA2 can advertise the supernet 192.168.2.0/23 (the supernet of Spoke3 and Spoke4) to ARS1. As a result, Spoke1 and Spoke2 will learn this route with NVA2 as the next hop. To ensure proper routing (as discussed earlier) and inter-hub inspection, you need to apply a UDR in Spoke1 and Spoke2 that overrides this exact supernet prefix, redirecting traffic to NVA1 as the next hop. At the same time, traffic destined for on-premises will follow the system route through the local ExpressRoute gateway, bypassing NVA1 altogether. In this setup:

- UDRs on the Spokes should have “Propagate Gateway Routes” set to True.
- No UDRs are needed in the GatewaySubnet.

👉Can NVA2 Still Advertise Specific Spoke Prefixes?

You might wonder: can NVA2 still advertise specific prefixes (e.g., Spoke3 and Spoke4) learned from ARS2 to ARS1 instead of a supernet? Yes, this is technically possible, but it requires maintaining BGP peering between NVA2 and ARS2. However, this introduces UDR complexity in Spoke1 and Spoke2, as you’d need to manually override each specific prefix. This also defeats the purpose of using ARS for simplified route propagation, undermining the efficiency and scalability of the design.

Figure: Bypass on-premises inspection.

Figure: Final traffic flow – Option B: bypass on-premises inspection.

Option-B Insights:

- This approach reduces the number of BGP peerings per ARS. Instead of maintaining two BGP sessions (local NVA and remote NVA) per hub, you can limit it to just one, preserving capacity within ARS’s 8-peer limit for additional inter-hub NVA peerings.
- Each NVA should advertise a supernet prefix to the remote ARS. This can be challenging if your Spokes don’t use contiguous IP address spaces, as described in Option B.
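Deriving the summary route an NVA advertises is straightforward with Python's standard ipaddress module. The sketch below assumes Spoke3 and Spoke4 use 192.168.2.0/24 and 192.168.3.0/24 (which collapse into the 192.168.2.0/23 used in the example) and also shows why non-contiguous spoke ranges don't summarize cleanly:

    import ipaddress

    # Contiguous spoke ranges collapse into the single /23 from the example
    spoke3 = ipaddress.ip_network("192.168.2.0/24")
    spoke4 = ipaddress.ip_network("192.168.3.0/24")
    print(list(ipaddress.collapse_addresses([spoke3, spoke4])))   # [IPv4Network('192.168.2.0/23')]

    # Non-contiguous ranges cannot be merged into one exact prefix; a single covering
    # supernet would also advertise address space the spokes don't actually use
    scattered = [ipaddress.ip_network("192.168.2.0/24"), ipaddress.ip_network("192.168.8.0/24")]
    print(list(ipaddress.collapse_addresses(scattered)))          # stays as two separate prefixes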
Scenario 2: ARS in the Hub and the NVA in a Transit VNet

In Scenario 1, we highlighted that when on-premises inspection is required, managing UDRs at the GatewaySubnet becomes increasingly complex as the number of Spoke VNets grows. This is due to the need for UDRs to include specific prefixes for each Spoke VNet. In this scenario, we eliminate the need to apply UDRs at the GatewaySubnet altogether. In this design, the NVA is deployed in a Transit VNet, where:
- The Transit-VNet is peered with the local Spoke VNets and with the local Hub-VNet to enable intra-hub and on-premises connectivity.
- The Transit-VNet is also peered with the remote Transit VNets (e.g., Transit-VNet1 peered with Transit-VNet2) to handle inter-hub connectivity through the NVAs.
- Additionally, the Transit-VNets are peered with the remote Hub-VNets to establish BGP peering with the remote ARS.
- The NVA OS needs static routes for the local Spoke VNet prefixes (either specific prefixes or a supernet), which are later advertised to the ARSs over BGP peering; ARS then advertises them to on-premises via ExpressRoute.
- The NVAs BGP peer with the local ARS and also with the remote ARS.

To understand the reasoning behind this design, let's take a closer look at the setup in Region1, focusing on how ARS and the NVA are configured to connect to Region2. This will help illustrate both inter-hub and on-premises connectivity. The same concept applies in reverse from Region2 to Region1.

Inter-Hub: To enable NVA1 in Region1 to learn prefixes from Region2, NVA2 configures static routes at the OS level for Spoke3 and Spoke4 (or their supernet prefix) and advertises them to ARS1 via the remote BGP peering. As a result, these prefixes are received by NVA1, both at the NIC level (with NVA2 as the next hop) and at the OS level for proper routing. Spoke1 and Spoke2 have a UDR with a default route pointing to NVA1 as the next hop. For instance, when Spoke1 needs to communicate with Spoke3, the traffic first routes through NVA1; NVA1 then forwards the traffic to NVA2 over the VNet peering between the two Transit VNets.

The mirror configuration applies in the other direction: NVA1 configures static routes at the OS level for Spoke1 and Spoke2 (or their supernet prefix) and advertises them to ARS2 via the remote BGP peering. As a result, these prefixes are received by NVA2, both at the NIC level (injected by ARS2, with NVA1 as the next hop) and at the OS level for proper routing.

Note: At the OS level, NVA1 learns the Spoke3 and Spoke4 prefixes from both the local and remote ARSs. However, the NIC-level route injection determines the actual next hop, so even if the OS selects a different best route, it won't affect forwarding behavior. The same applies to NVA2.

On-Premises Traffic: To explain the on-premises traffic flow, we'll break it down into two directions: Azure to on-premises, and on-premises to Azure.

Azure to On-Premises Traffic Flow: Spokes in Region1 route all traffic through NVA1 via a default route defined in their UDRs. Because of the BGP peering between NVA1 and ARS1, ARS1 advertises the Spoke1 and Spoke2 prefixes (or their supernet prefix) to on-premises through ExpressRoute (EXR1). Transit-VNet1 (hosting NVA1) is peered with Hub1-VNet with "Use Remote Gateway" enabled. This allows NVA1 to learn on-premises prefixes from the local ExpressRoute gateway (GW1), and traffic to on-premises is routed through the local ExpressRoute circuit (EXR1) due to the higher BGP Weight configuration.

Note: At the OS level, NVA1 learns on-premises prefixes from both the local and remote ARSs. However, the NIC-level route injection determines the actual next hop, so even if the OS selects a different best route, it won't affect forwarding behavior. The same applies to NVA2.

On-Premises to Azure Traffic Flow: Through BGP peering with ARS1, NVA1 enables ARS1 to advertise the Spoke1 and Spoke2 prefixes (or their supernet prefix) to both the EXR1 and EXR2 circuits (due to the ExpressRoute bowtie setup). Additionally, due to the BGP peering between NVA1 and ARS2, ARS2 also advertises the Spoke1 and Spoke2 prefixes (or their supernet prefix) to the EXR2 and EXR1 circuits. As a result, both ExpressRoute edge routers in Region1 and Region2 learn the same Spoke prefixes (or their supernet prefix) from both GW1 and GW2, with identical AS-Path lengths, as shown below.

EXR1 learns Region1 Spokes' supernet prefixes from GW1 and GW2
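The sketch below illustrates the NVA-side plumbing that produces these advertisements for NVA1: registering it as a BGP peer on both the local and the remote Route Server, and an FRRouting-style configuration that originates the local Spoke supernet from an OS-level static route. The Transit-VNet1 addressing (10.10.1.0/24, NVA1 at 10.10.1.4), the Spoke supernet 192.168.0.0/23, and the use of FRRouting are assumptions for illustration; the ARS instance IPs (10.0.2.4/.5 and 10.1.2.4/.5) and the ASNs (65001 for the NVA, 65515 for ARS) follow the values used in this article.

    # Register NVA1 as a BGP peer on the local (ARS1) and remote (ARS2) Route Servers
    az network routeserver peering create -g rg-hub1 --routeserver ars1 \
      -n nva1 --peer-ip 10.10.1.4 --peer-asn 65001
    az network routeserver peering create -g rg-hub2 --routeserver ars2 \
      -n nva1 --peer-ip 10.10.1.4 --peer-asn 65001

    # /etc/frr/frr.conf on NVA1 (sketch): OS-level static route for the local Spoke
    # supernet, advertised to all four ARS instance IPs
    ip route 192.168.0.0/23 10.10.1.1
    router bgp 65001
     neighbor 10.0.2.4 remote-as 65515
     neighbor 10.0.2.5 remote-as 65515
     neighbor 10.1.2.4 remote-as 65515
     neighbor 10.1.2.5 remote-as 65515
     neighbor 10.0.2.4 ebgp-multihop 2
     neighbor 10.0.2.5 ebgp-multihop 2
     neighbor 10.1.2.4 ebgp-multihop 2
     neighbor 10.1.2.5 ebgp-multihop 2
     address-family ipv4 unicast
      network 192.168.0.0/23
     exit-address-family

The static route simply points the supernet at the subnet's default gateway so the Azure fabric (via VNet peering) delivers it; the network statement is what limits the advertisement to the supernet rather than anything else in the NVA's routing table.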
This dual advertisement with identical AS-Path lengths causes non-optimal inbound routing, where traffic from on-premises destined to the Region1 Spokes may first land in Region2's Hub2-VNet before traversing to NVA1 in Region1. However, return traffic from Spoke1 and Spoke2 will always exit through Hub1-VNet.

To prevent this suboptimal routing, configure NVA1 to prepend the AS path for Spoke1 and Spoke2 (or their supernet prefix) when advertising them to the remote ARS2. Likewise, ensure NVA2 prepends the AS path for Spoke3 and Spoke4 (or their supernet prefix) when advertising to ARS1. This approach helps maintain optimal routing under normal conditions and during ExpressRoute failover scenarios.

The diagram below shows NVA1 setting an AS-path prepend for the Spoke1 and Spoke2 supernet prefix when BGP peering with the remote ARS (ARS2); the same applies for NVA2 when advertising the Spoke3 and Spoke4 prefixes to ARS1. (A route-map sketch is included at the end of this scenario.)

Final Traffic Flow:
Full Inspection: Traffic flow when the NVA is in the Transit-VNet

Insights:
- This solution is ideal when full traffic inspection is required. Unlike Scenario 1 - Option A, it eliminates the need for UDRs in the GatewaySubnet.
- When ARS is deployed in a VNet (typically a Hub VNet), that VNet is limited to 500 VNet peerings (at the time of writing). However, in this design, the Spokes peer with the Transit-VNet instead of directly with the ARS VNet, allowing you to scale beyond the 500-peer limit by leveraging Azure Virtual Network Manager (AVNM) or submitting a support request.
- Some enterprise customers may encounter the 1,000-route advertisement limit on the ExpressRoute circuit from the ExpressRoute gateway. In traditional hub-and-spoke designs, there's no native control over what is advertised to ExpressRoute. With this architecture, the NVAs provide greater control over route advertisement to the circuit.
- For simplicity, we used a single NVA in each Hub-VNet while explaining the setup and traffic flow throughout this article. However, a highly available (HA) NVA deployment is recommended. To maintain traffic symmetry in an HA setup, you'll need to enable the next-hop IP feature when peering with Azure Route Server (ARS).
- This design does require additional VNet peerings, including: between Transit-VNets (inter-region), between Transit-VNets and the local Spokes, and between Transit-VNets and both the local and remote Hub-VNets.
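The route-map sketch below shows one way to implement the AS-path prepending described above on NVA1, again assuming an FRRouting-based NVA with ASN 65001 and the ARS2 instance IPs used earlier in this article (10.1.2.4 and 10.1.2.5). Prepending the NVA's own ASN twice is an illustrative choice; any prepend length that makes the remote path longer than the local one achieves the goal.

    # Prepend NVA1's ASN only on advertisements toward the remote Route Server (ARS2)
    route-map TO-REMOTE-ARS permit 10
     set as-path prepend 65001 65001
    !
    router bgp 65001
     address-family ipv4 unicast
      neighbor 10.1.2.4 route-map TO-REMOTE-ARS out
      neighbor 10.1.2.5 route-map TO-REMOTE-ARS out
     exit-address-family

Because the route-map is applied only outbound toward ARS2, the advertisement to the local ARS1 keeps its shorter AS-Path and remains the preferred inbound path from on-premises.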
Using Application Gateway to secure access to the Azure OpenAI Service: Customer success story

Introduction
A large enterprise customer set out to build a generative AI application using Azure OpenAI. While the app would be hosted on-premises, the customer wanted to leverage the latest large language models (LLMs) available through Azure OpenAI. However, they faced a critical challenge: how to securely access Azure OpenAI from an on-premises environment without private network connectivity or a full Azure landing zone. This blog post walks through how the customer overcame these limitations using Application Gateway as a reverse proxy in front of Azure OpenAI, along with other Azure services, to meet their security and governance requirements.

Customer landscape and challenges
The customer's environment lacked:
- Private network connectivity (no Site-to-Site VPN or ExpressRoute), due to using a new Azure Government environment and not yet having a cloud operations team set up
- A common network topology such as Virtual WAN or a hub-and-spoke network design
- A full Enterprise Scale Landing Zone (ESLZ) of common infrastructure
- Security components like private DNS zones, DNS resolvers, API Management, and firewalls
This meant they couldn't use private endpoints or other standard security controls typically available in mature Azure environments. Security was non-negotiable: public access to Azure OpenAI was unacceptable. The customer needed to:
- Restrict access to specific IP CIDR ranges from on-premises user machines and data centers
- Limit the ports communicating with Azure OpenAI
- Implement a reverse proxy with SSL termination and Web Application Firewall (WAF)
- Use a customer-provided SSL certificate to secure traffic

Proposed solution
To address these challenges, the customer designed a secure architecture using the following Azure components.

Key Azure services
- Application Gateway – Layer 7 reverse proxy, SSL termination, and Web Application Firewall (WAF)
- Public IP – Allows communication over the public internet between the customer's IP addresses and Azure IP addresses
- Virtual Network – Allows control of network traffic in Azure
- Network Security Group (NSG) – Layer 4 network controls such as port numbers and service tags, using five-tuple information (source, source port, destination, destination port, protocol)
- Azure OpenAI – Large language model (LLM) service

NSG configuration
- Inbound rules: Allow traffic only from specific IP CIDR ranges and HTTP(S) ports
- Outbound rules: Target AzureCloud.<region> with HTTP(S) ports (there is no service tag for Azure OpenAI yet)

Application Gateway setup
- SSL certificate: Issued by the customer's on-premises Certificate Authority
- HTTPS listener: Uses the on-premises certificate to terminate SSL
- Traffic flow: Decrypt incoming traffic, scan it with WAF, re-encrypt using a well-known Azure CA, and override the backend hostname
- Custom health probe: Configured to treat the 404 response from Azure OpenAI as healthy, since no health check endpoint exists (see the sketch at the end of this post)

Azure OpenAI configuration
- IP firewall restrictions: Only allow traffic from the Application Gateway subnet

Outcome
By combining Application Gateway, NSGs, and custom SSL configurations, the customer successfully secured their Azure OpenAI deployment, without needing a full ESLZ or private connectivity. This approach enabled them to move forward with their generative AI app while maintaining enterprise-grade security and governance.
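The commands below sketch two of the settings called out above: the custom health probe that accepts the Azure OpenAI endpoint's 404 response as healthy, and the outbound NSG rule scoped to the regional AzureCloud service tag over HTTPS. The resource names, the OpenAI hostname, and the region suffix are illustrative assumptions; substitute the values for your own environment and region.

    # Custom health probe: treat a 404 from the OpenAI endpoint as a healthy response
    az network application-gateway probe create -g rg-aoai \
      --gateway-name appgw-aoai -n aoai-probe --protocol Https \
      --host contoso-aoai.openai.azure.com --path / \
      --interval 30 --timeout 30 --threshold 3 \
      --match-status-codes 404

    # Outbound NSG rule: allow HTTPS only toward the regional AzureCloud service tag
    az network nsg rule create -g rg-aoai --nsg-name nsg-appgw -n allow-azure-https-out \
      --priority 200 --direction Outbound --access Allow --protocol Tcp \
      --destination-address-prefixes AzureCloud.eastus --destination-port-ranges 443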
Unlock visibility, flexibility, and cost efficiency with Application Gateway logging enhancements

Introduction
In today's cloud-native landscape, organizations are accelerating the deployment of web applications at unprecedented speed. But with rapid scale comes increased complexity, and a growing need for deep, actionable visibility into the underlying infrastructure. As businesses embrace modern architectures, the demand for scalable, secure, and observable web applications continues to rise. Azure Application Gateway is evolving to meet these needs, offering enhanced logging capabilities that empower teams to gain richer insights, optimize costs, and simplify operations.

This article highlights three powerful enhancements that are transforming how teams use logging in Azure Application Gateway:
- Resource-specific tables
- Data collection rule (DCR) transformations
- Basic log plan
Resource-specific tables improve organization and query performance. DCR transformations give teams fine-grained control over the structure and content of their log data. And the Basic log plan makes comprehensive logging more accessible and cost-effective. Together, these capabilities deliver a smarter, more structured, and cost-aware approach to observability.

Resource-specific tables: Structured and efficient logging
Azure Monitor stores logs in a Log Analytics workspace powered by Azure Data Explorer. Previously, when you configured Log Analytics, all diagnostic data for Application Gateway was stored in a single, generic table called AzureDiagnostics. This approach often led to slower queries and increased complexity, especially when working with large datasets. With resource-specific logging, Application Gateway logs are now organized into dedicated tables, each optimized for a specific log type:
- AGWAccessLogs – Contains access log information
- AGWPerformanceLogs – Contains performance metrics and data
- AGWFirewallLogs – Contains Web Application Firewall (WAF) log data

This structured approach delivers several key benefits:
- Simplified queries – Reduces the need for complex filtering and data manipulation
- Improved schema discovery – Makes it easier to understand log structure and fields
- Enhanced performance – Speeds up both ingestion and query execution
- Granular access control – Allows you to grant Azure role-based access control (RBAC) permissions on specific tables

Example: Azure diagnostics vs. resource-specific table approach

Traditional AzureDiagnostics query:

    AzureDiagnostics
    | where ResourceType == "APPLICATIONGATEWAYS" and Category == "ApplicationGatewayAccessLog"
    | extend clientIp_s = todynamic(properties_s).clientIP
    | where clientIp_s == "203.0.113.1"

New resource-specific table query:

    AGWAccessLogs
    | where ClientIP == "203.0.113.1"

The resource-specific approach is cleaner, faster, and easier to maintain, as it eliminates complex filtering and data manipulation.
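Resource-specific tables only receive data when the Application Gateway diagnostic setting is created (or updated) in resource-specific mode. The command below shows the general shape of that call; the setting name and the APPGW_ID/LAW_ID variables are placeholders for your own Application Gateway and Log Analytics workspace resource IDs, and the flag usage should be verified against the current Azure CLI documentation before relying on it.

    # Send Application Gateway logs to Log Analytics using resource-specific tables
    az monitor diagnostic-settings create \
      --name appgw-diag \
      --resource "$APPGW_ID" \
      --workspace "$LAW_ID" \
      --export-to-resource-specific true \
      --logs '[{"categoryGroup":"allLogs","enabled":true}]'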
Data collection rules (DCR) log transformations: Take control of your log pipeline
DCR transformations offer a flexible way to shape log data before it reaches your Log Analytics workspace. Instead of ingesting raw logs and filtering them post-ingestion, you can now filter, enrich, and transform logs at the source, giving you greater control and efficiency.

Why DCR transformations matter:
- Optimize costs: Reduce ingestion volume by excluding non-essential data
- Support compliance: Strip out personally identifiable information (PII) before logs are stored, helping meet GDPR and CCPA requirements
- Manage volume: Ideal for high-throughput environments where only actionable data is needed

Real-world use cases
Whether you're handling high-traffic e-commerce workloads, processing sensitive healthcare data, or managing development environments with cost constraints, DCR transformations help tailor your logging strategy to meet specific business and regulatory needs. For implementation guidance and best practices, refer to Transformations Azure Monitor - Azure Monitor.

Basic log plan: Cost-effective logging for low-priority data
Not all logs require real-time analysis. Some are used only for occasional debugging or compliance audits. The Basic log plan in Log Analytics provides a cost-effective way to retain high-volume, low-priority diagnostic data, without paying for premium features you may not need.

When to use the Basic log plan
- Save on costs: Pay-as-you-go pricing with lower ingestion rates
- Debugging and forensics: Retain data for troubleshooting and incident analysis without paying premium costs for features you don't use regularly

Understanding the trade-offs
While the Basic plan offers significant savings, it comes with limitations:
- No real-time alerts: Not suitable for monitoring critical health metrics
- Query constraints: Limited KQL functionality and additional query costs
Choose the Basic plan when deep analytics and alerting aren't required, and focus premium resources on critical logs.

Building a smart logging strategy with Azure Application Gateway
To get the most out of Azure Application Gateway logging, combine the strengths of all three capabilities:
- Assess your needs: Identify which logs require real-time monitoring versus those used for compliance or debugging
- Design for efficiency: Use the Basic log plan for low-priority data, and reserve standard tiers for critical logs
- Transform at the source: Apply DCR transformations to reduce costs and meet compliance requirements before ingestion (a minimal transform sketch follows below)
- Query with precision: Use resource-specific tables to simplify queries and improve performance
This integrated approach helps teams achieve deep visibility, maintain compliance, and manage costs.
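To make the "transform at the source" step concrete: a DCR transformation is a KQL statement applied to the incoming stream (referenced as source) before it lands in the workspace. The sketch below drops health-probe noise and strips the client IP column prior to ingestion. The column names follow the AGWAccessLogs example earlier in this post and are illustrative; verify them against your table's actual schema before using a transform like this.

    source
    | where RequestUri !startswith "/health"   // drop health-probe noise before ingestion
    | project-away ClientIP                    // strip client IPs to reduce PII exposure

This query goes into the transformKql property of the data collection rule that feeds the workspace, so filtering and PII removal happen before any data is billed or stored.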
Microsoft Azure scales Hollow Core Fiber (HCF) production through outsourced manufacturing

Introduction
As cloud and AI workloads surge, the pressure on datacenter (DC), Metro, and Wide Area Network (WAN) networks has never been greater. Microsoft is tackling the physical limits of traditional networking head-on. From pioneering research in microLED technologies to deploying Hollow Core Fiber (HCF) at global scale, Microsoft is reimagining connectivity to power the next era of cloud networking. Azure's HCF journey has been one of relentless innovation, collaboration, and a vision to redefine the physical layer of the cloud. Microsoft's HCF, based on the proprietary Double Nested Antiresonant Nodeless Fiber (DNANF) design, delivers up to 47% faster data transmission and approximately 33% lower latency compared to conventional Single Mode Fiber (SMF), bringing significant advantages to the network that powers Azure.

Today, Microsoft is announcing a major milestone: the industrial scale-up of HCF production, powered by new strategic manufacturing collaborations with Corning Incorporated (Corning) and Heraeus Covantics (Heraeus). These collaborations will enable Azure to increase the global fiber production of HCF to meet the demands of the growing network infrastructure, advancing the performance and reliability customers expect for cloud and AI workloads.

Real-world benefits for Azure customers
Since 2023, Microsoft has deployed HCF across multiple Azure regions, with production links meeting performance and reliability targets. As manufacturing scales, Azure plans to expand deployment of the full end-to-end HCF network solution to help increase capacity, resiliency, and speed for customers, with the potential to set new benchmarks for latency and efficiency in fiber infrastructure.

Why it matters
Microsoft's proprietary HCF design brings the following improvements for Azure customers:
- Increased data transmission speeds, with up to 33% lower latency.
- Enhanced signal performance that improves data transmission quality for customers.
- Improved optical efficiency, resulting in higher bandwidth rates compared to conventional fiber.

How Microsoft is making it possible
To operationalize HCF across Azure with production-grade performance, Microsoft is:
- Deploying a standardized HCF solution with end-to-end systems and components for operational efficiency, streamlined network management, and reliable connectivity across Azure's infrastructure.
- Ensuring interoperability with standard SMF environments, enabling seamless integration with existing optical infrastructure in the network for faster deployment and scalable growth.
- Creating a multinational production supply chain to scale next-generation fiber production, ensuring the volumes and speed to market needed for widespread HCF deployment across the Azure network.

Scaling up and out
With Corning and Heraeus as Microsoft's first HCF manufacturing collaborators, Azure plans to accelerate deployment to meet surging demand for high-performance connectivity. These collaborations underscore Microsoft's commitment to enhancing its global infrastructure and delivering a reliable customer experience. They also reinforce Azure's continued investment in deploying HCF, with a vision for this technology to potentially set the global benchmark for high-capacity fiber innovation.
"This milestone marks a new chapter in reimagining the cloud's physical layer. Our collaborations with Corning and Heraeus establish a resilient, global HCF supply chain so Azure can deliver a standardized, world-class customer experience with ultra-low latency and high reliability for modern AI and cloud workloads." - Jamie Gaudette, Partner Cloud Network Engineering Manager at Microsoft

To scale HCF production, Microsoft will utilize Corning's established U.S. facilities, while Heraeus will produce out of its sites in both Europe and the U.S.

"Corning is excited to expand our longtime collaboration with Microsoft, leveraging Corning's fiber and cable manufacturing facilities in North Carolina to accelerate the production of Microsoft's Hollow Core Fiber. This collaboration not only strengthens our existing relationship but also underscores our commitment to advancing U.S. leadership in AI innovation and infrastructure. By working closely with Microsoft, we are poised to deliver solutions that meet the demands of AI workloads, setting new benchmarks for speed and efficiency in fiber infrastructure." - Mike O'Day, Senior Vice President and General Manager, Corning Optical Communications

"We started our work on HCF a decade ago, teamed up with the Optoelectronics Research Centre (ORC) at the University of Southampton and then with Lumenisity prior to its acquisition. Now, we are excited to continue working with Microsoft on shaping the datacom industry. With leading solutions in glass, tube, preform, and fiber manufacturing, we are ready to scale this disruptive HCF technology to significant volumes. We'll leverage our proven track record of taking glass and fiber innovations from the lab to widespread adoption, just as we did in the telecom industry, where approximately 2 billion kilometers of fiber are made using Heraeus products." - Dr. Jan Vydra, Executive Vice President Fiber Optics, Heraeus Covantics

Azure engineers are working alongside Corning and Heraeus to operationalize Microsoft manufacturing process intellectual property (IP), deliver targeted training programs, and drive the yield, metrology, and reliability improvements required for scaled production. The collaborations are foundational to a growing standardized, global ecosystem that supports:
- Glass preform/tubing supply
- Fiber production at scale
- Cable and connectivity for deployment into carrier-grade environments

Building on a foundation of innovation: Microsoft's HCF program
In 2022, Microsoft acquired Lumenisity, a spin-out from the Optoelectronics Research Centre (ORC) at the University of Southampton, UK. That same year, Microsoft launched the world's first state-of-the-art HCF fabrication facility in the UK to expand production and drive innovation. This purpose-built site continues to support long-term HCF research, prototyping, and testing, ensuring that Azure remains at the forefront of HCF technology. Working with industry leaders, Microsoft has developed an end-to-end ecosystem of the components, equipment, and HCF-specific hardware required, successfully proven in production deployments and operations.

Pushing the boundaries: recent breakthrough research
Today, the University of Southampton announced a landmark achievement in optical communications: in collaboration with Azure Fiber researchers, they have demonstrated the lowest signal loss ever recorded for optical fibers (<0.1 dB/km) using research-grade DNANF HCF technology (see figure 4).
This breakthrough, detailed in a research paper published in Nature Photonics earlier this month, paves the way for a potential revolution in the field, enabling unprecedented data transmission capacities and longer unamplified spans.

Figure 4: Optical fiber attenuation records at around 1550 nm - [1] Nagayama et al., [2] Sato et al., [3] research-grade DNANF HCF, Petrovich et al.

This breakthrough highlights the potential for this technology to transform global internet infrastructure and DC connectivity. Expected benefits include:
- Faster: Approximately 47% faster, reducing latency and powering real-time AI inference, cloud gaming, and other interactive workloads.
- More capacity: A wider optical spectrum window enabling exponentially greater bandwidth.
- Future-ready: Lays the groundwork for quantum-safe links, quantum computing infrastructure, advanced sensing, and remote laser delivery.

Looking ahead: Unlocking the future of cloud networking
The future of cloud networking is being built today! With record-breaking [3] fiber innovations, a rapidly expanding collaborative ecosystem, and the industrialized scale to deliver next-generation performance, Azure continues to evolve to meet the demands for speed, reliability, and connectivity. As we accelerate the deployment of HCF across our global network, we're not just keeping pace with the demands of AI and cloud, we're redefining what's possible.

References:
[1] Nagayama, K., Kakui, M., Matsui, M., Saitoh, T. & Chigusa, Y. Ultra-low-loss (0.1484 dB/km) pure silica core fibre and extension of transmission distance. Electron. Lett. 38, 1168-1169 (2002).
[2] Sato, S., Kawaguchi, Y., Sakuma, H., Haruna, T. & Hasegawa, T. Record low loss optical fiber with 0.1397 dB/km. In Proc. Optical Fiber Communication Conference (OFC) 2024, Tu2E.1 (Optica Publishing Group, 2024).
[3] Petrovich, M., Numkam Fokoua, E., Chen, Y., Sakr, H., Isa Adamu, A., Hassan, R., Wu, D., Fatobene Ando, R., Papadimopoulos, A., Sandoghchi, S., Jasion, G., & Poletti, F. Broadband optical fibre with an attenuation lower than 0.1 decibel per kilometre. Nat. Photon. (2025). https://doi.org/10.1038/s41566-025-01747-5

Useful Links:
- The Deployment of Hollow Core Fiber (HCF) in Azure's Network
- How hollow core fiber is accelerating AI | Microsoft Azure Blog
- Learn more about Microsoft global infrastructure
Introducing WireGuard In-Transit Encryption for AKS (Public Preview)

As organizations continue to scale containerized workloads in Azure Kubernetes Service (AKS), the need to secure network traffic between applications and services has never been more critical, especially in regulated or security-sensitive environments. We're excited to announce the public preview of WireGuard-based in-transit encryption in AKS, a new capability in Advanced Container Networking Services that enhances inter-node traffic protection with minimal operational overhead.

What is WireGuard?
WireGuard is a modern, high-performance VPN protocol known for its simplicity and robust cryptography. Integrated into the Cilium data plane and managed as part of AKS networking, WireGuard offers an efficient way to encrypt traffic transparently within your cluster. With this new feature, WireGuard is now natively supported as part of Azure CNI powered by Cilium with Advanced Container Networking Services; there is no need for third-party encryption tools or custom key management systems.

What Gets Encrypted?
The WireGuard integration in AKS focuses on the most critical traffic path:
✅ Encrypted:
- Inter-node pod traffic: Network communication between pods running on different nodes in the AKS cluster. This traffic traverses the underlying network infrastructure and is encrypted using WireGuard to ensure confidentiality and integrity.
❌ Not encrypted:
- Same-node pod traffic: Communication between pods that are running on the same node. Since this traffic does not leave the node, it bypasses WireGuard and remains unencrypted.
- Node-generated traffic: Traffic initiated by the node itself, which is currently not routed through WireGuard and thus not encrypted.
This scope strikes the right balance between strong protection and performance by securing the most critical traffic: data that leaves the host and traverses the network.

Key Benefits
- Simple configuration: Enable WireGuard with just a few flags during AKS cluster creation or update.
- Automatic key management: Each node generates and exchanges WireGuard keys automatically; no manual configuration is needed.
- Transparent to applications: No application-level changes are required. Encryption happens at the network layer.
- Cloud-native integration: Fully managed as part of Advanced Container Networking Services and Cilium, offering a seamless and reliable experience.

Architecture: How It Works
When WireGuard is enabled:
- Each node generates a unique public/private key pair.
- The public keys are securely shared between nodes via the CiliumNode custom resource.
- A dedicated network interface (cilium_wg0) is created and managed by the Cilium agent running on each node.
- Peers are dynamically updated, and keys are rotated automatically every 120 seconds to minimize risk.
This mechanism ensures that only validated nodes can participate in encrypted communication.

WireGuard and VNet Encryption
AKS now offers two powerful in-transit encryption options:

    Feature                   WireGuard Encryption                  VNet Encryption
    Scope                     Pod-to-pod inter-node traffic         All traffic in the VNet
    VM Support                Works on all VM SKUs                  Requires hardware support (e.g., Gen2 VMs)
    Deployment Flexibility    Cloud-agnostic, hybrid ready          Azure-only
    Performance               Software-based, moderate CPU usage    Hardware-accelerated, low overhead

Choose WireGuard if you want encryption flexibility across clouds or have VM SKUs that don't support VNet encryption. Choose VNet Encryption for full-network coverage and ultra-low CPU overhead.
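A quick way to confirm the behavior described above on a running cluster is to query the Cilium agent directly. The sketch below assumes the agent pods run in kube-system with the k8s-app=cilium label (the usual layout for Azure CNI powered by Cilium); output details vary by Cilium version, so treat this as a verification sketch rather than an official procedure.

    # Pick one Cilium agent pod
    CILIUM_POD=$(kubectl -n kube-system get pod -l k8s-app=cilium -o name | head -n 1)

    # The agent status should report WireGuard-based encryption when the feature is active
    kubectl -n kube-system exec "$CILIUM_POD" -- cilium status | grep -i encryption

    # The dedicated interface managed by the agent should be present on the node
    kubectl -n kube-system exec "$CILIUM_POD" -- ip link show cilium_wg0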
Conclusion and Next Steps
WireGuard in AKS, now in public preview, delivers strong encryption that protects traffic as it leaves the host and traverses the network, right where it's needed most. It offers a balanced approach to securing container networking without compromising usability.

Ready to get started? Check out our how-to guide for step-by-step instructions on enabling WireGuard in your cluster and securing your container networking with ease.

Explore more about Advanced Container Networking Services:
- Container Network Observability
- L7 network policies
- FQDN-based Policy