Announcing Azure DNS security policy with Threat Intelligence feed general availability
Azure DNS security policy with Threat Intelligence feed allows early detection and prevention of security incidents on customer Virtual Networks: known malicious domains, sourced from Microsoft's Security Response Center (MSRC), can be blocked from name resolution. Azure DNS security policy with Threat Intelligence feed is now available to all customers, with regional availability in all public regions.

Azure Virtual Network Manager + Azure Virtual WAN
Azure continues to expand its networking capabilities, with Azure Virtual Network Manager and Azure Virtual WAN (vWAN) standing out as two of the most transformative services. When deployed together, they offer the best of both worlds: the operational simplicity of a managed hub architecture combined with the ability for spoke VNets to communicate directly, avoiding additional hub hops and minimizing latency.

Revisiting the classic hub-and-spoke pattern

| Element | Traditional hub-and-spoke role |
|---|---|
| Hub VNet | Centralized network that hosts shared services including firewalls (e.g., Azure Firewall, NVAs), VPN/ExpressRoute gateways, DNS servers, domain controllers, and central route tables for traffic management. Acts as the connectivity and security anchor for all spoke networks. |
| Spoke VNets | Host individual application workloads and peer directly to the hub VNet. Traffic flows through the hub for north-south connectivity (to/from on-premises or internet) and cross-spoke communication (east-west traffic between spokes). |

Benefits:
- Single enforcement point for security policies and network controls
- No duplication of shared services across environments
- Simplified routing logic and traffic flow management
- Clear network segmentation and isolation between workloads
- Cost optimization through centralized resources

However, this architecture comes with a trade-off: every spoke-to-spoke packet must route through the hub, introducing additional network hops, increased latency, and potential throughput constraints.

How Virtual WAN modernizes that design

Virtual WAN replaces a do-it-yourself hub VNet with a fully managed hub service:
- Managed hubs: Azure owns and operates the hub infrastructure.
- Automatic route propagation: routes learned once are usable everywhere.
- Integrated add-ons: firewalls, VPN, and ExpressRoute ports are first-class citizens.

By default, Virtual WAN enables any-to-any routing between spokes.
Traffic transits the hub fabric automatically, no configuration required.

Why direct spoke mesh?

Certain patterns require single-hop connectivity:
- Micro-service meshes that sit in different spokes and exchange chatty RPC calls.
- Database replication / backups where throughput counts, and hub bandwidth is precious.
- Dev / Test / Prod spokes that need to sync artifacts quickly yet stay isolated from hub services.
- Segmentation mandates where a workload must bypass hub inspection for compliance yet still talk to a partner VNet.

Benefits:
- Lower latency: the hub detour disappears.
- Better bandwidth: no hub congestion or firewall throughput cap.
- Higher resilience: spoke pairs can keep talking even if the hub is under maintenance.

The peering explosion problem

With pure VNet peering, the math escalates fast: for n spokes you need n × (n-1)/2 links. Ten spokes? 45 peerings. Add four more? Now 91. Each extra peering forces you to:
- Touch multiple route tables.
- Update NSG rules to cover the new paths.
- Repeat every time you add or retire a spoke.
- Troubleshoot an ever-growing spider web.

Where Azure Virtual Network Manager steps in
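The quadratic growth is easy to verify with a throwaway shell helper (illustrative only, not an Azure tool):

```shell
# Full-mesh links needed for n spoke VNets: n * (n - 1) / 2
peerings() { echo $(( $1 * ($1 - 1) / 2 )); }

peerings 10   # prints 45 for ten spokes
peerings 14   # prints 91 after adding four more
```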
Azure Virtual Network Manager introduces Network Groups plus a Mesh connectivity policy:

| Azure Virtual Network Manager concept | What it gives you |
|---|---|
| Network group | A logical container that groups multiple VNets together, allowing you to apply configurations and policies to all members simultaneously |
| Mesh connectivity | Automated peering between all VNets in the group, ensuring every member can communicate directly with every other member without manual configuration |
| Declarative config | Intent-based approach where you define the desired network state, and Azure Virtual Network Manager handles the implementation and ongoing maintenance |
| Dynamic updates | Automatic topology management: when VNets are added to or removed from a group, Azure Virtual Network Manager reconfigures all necessary connections without manual intervention |

Operational complexity collapses from O(n²) to O(1): you manage a group, not 100+ individual peerings.

A complementary model: Azure Virtual Network Manager mesh inside vWAN

Since Azure Virtual Network Manager works on any Azure VNet, including the VNets you already attach to a vWAN hub, you can apply mesh policies on top of your existing managed hub architecture:
- Spoke VNets join a vWAN hub for branch connectivity, centralized firewalling, or multi-region reach.
- The same spokes are added to an Azure Virtual Network Manager Network Group with a mesh policy.
- Azure Virtual Network Manager builds direct peering links between the spokes, while vWAN continues to advertise and learn routes.

Result:
- All VNets still benefit from vWAN's global routing and on-premises integration.
- Latency-critical east-west flows now travel the shortest path, one hop, as if the VNets were traditionally peered.

Rather than choosing one over the other, organizations can leverage both vWAN and Azure Virtual Network Manager together, as the combination enhances the strengths of each service.
Performance illustration

Spoke-to-spoke communication with Virtual WAN without Azure Virtual Network Manager mesh (diagram).

Spoke-to-spoke communication with Virtual WAN with Azure Virtual Network Manager mesh (diagram).

Observability & protection

- NSG flow logs: granular packet logs on every peered VNet.
- Azure Virtual Network Manager admin rules: org-wide guardrails that trump local NSGs.
- Azure Monitor + SIEM: route flow logs to Log Analytics, Sentinel, or a third-party SIEM for threat detection.
- Layered design: hub firewalls inspect north-south traffic; NSGs plus admin rules secure east-west flows.

Putting it all together

Virtual WAN offers fully managed global connectivity, simplifying the integration of branch offices and on-premises infrastructure into your Azure environment. Azure Virtual Network Manager mesh establishes direct communication paths between spoke VNets, making it ideal for workloads requiring high throughput or minimal latency in east-west traffic patterns.

When combined, these services provide architects with granular control over traffic routing. Each flow can be directed through hub services when needed or routed directly between spokes for optimal performance, all without re-architecting your network or creating additional management complexity. By pairing Azure Virtual Network Manager's group-based mesh with vWAN's managed hubs, you get the best of both worlds: worldwide reach, centralized security, and single-hop performance where it counts.

Delivering web applications over IPv6
The IPv4 address space pool has been exhausted for some time now, meaning there is no new public address space available for allocation from Internet Registries. The internet continues to run on IPv4 through technical measures such as Network Address Translation (NAT) and Carrier Grade NAT, and through reallocation of address space via IPv4 address trading. IPv6 will ultimately be the dominant network protocol on the internet, as the IPv4 life-support mechanisms used by network operators, hosting providers and ISPs will eventually reach the limits of their scalability. Mobile networks are already changing to IPv6-only APNs; reachability of IPv4-only destinations from these mobile networks is through NAT64 gateways, which sometimes causes problems.

Client uptake of IPv6 is progressing steadily. Google reports 49% of clients connecting to its services over IPv6 globally, with France leading at 80%.

IPv6 client access measured by Google (chart).

Meanwhile, countries around the world are requiring IPv6 reachability for public web services. Examples include the United States, European Union member states such as the Netherlands, as well as Norway, India, and Japan.

IPv6 adoption per country measured by Google (chart).

Entities needing to comply with these mandates are looking to Azure's networking capabilities for solutions. Azure supports IPv6 for both private and public networking, and these capabilities have developed and expanded over time. This article discusses strategies to build and deploy IPv6-enabled public, internet-facing applications that are reachable from IPv6(-only) clients.

Azure Networking IPv6 capabilities

Azure's private networking capabilities center on Virtual Networks (VNETs) and the components deployed within them. Azure VNETs are IPv4/IPv6 dual stack capable: a VNET must always have IPv4 address space allocated, and can also have IPv6 address space.
Virtual machines in a dual stack VNET will have both an IPv4 and an IPv6 address from the VNET range, and can sit behind IPv6-capable External and Internal Load Balancers. VNETs can be connected through VNET peering, which effectively turns the peered VNETs into a single routing domain. It is now possible to peer only the IPv6 address spaces of VNETs, so that the IPv4 space assigned to VNETs can overlap and communication across the peering is over IPv6. The same is true for connectivity to on-premises networks over ExpressRoute: the Private Peering can be enabled for IPv6 only, so that VNETs in Azure do not have to have unique IPv4 address space assigned, which may be in short supply in an enterprise.

Not all internal networking components are IPv6 capable yet. The most notable exceptions are VPN Gateway, Azure Firewall and Virtual WAN; IPv6 compatibility is on the roadmap for these services, but target availability dates have not been communicated.

But now let's focus on Azure's externally facing, public network services. Azure is ready to let customers publish their web applications over IPv6. IPv6-capable externally facing network services include:

- Azure Front Door
- Application Gateway
- External Load Balancer
- Public IP addresses and Public IP address prefixes
- Azure DNS
- Azure DDoS Protection
- Traffic Manager
- App Service (IPv6 support is in public preview)

IPv6 Application Delivery

IPv6 Application Delivery refers to the architectures and services that enable your web application to be accessible via IPv6. The goal is to provide an IPv6 address and connectivity for clients, while often continuing to run your application on IPv4 internally. Key benefits of adopting IPv6 in Azure include:

✅ Expanded Client Reach: IPv4-only websites risk being unreachable from IPv6-only networks. By enabling IPv6, you expand your reach into growing mobile and IoT markets that use IPv6 by default. Governments and enterprises increasingly mandate IPv6 support for public-facing services.
✅ Address Abundance & No NAT: IPv6 provides a virtually unlimited address pool, mitigating IPv4 exhaustion concerns. This abundance means each service can have its own public IPv6 address, often removing the need for complex NAT schemes. End-to-end addressing can simplify connectivity and troubleshooting.

✅ Dual-Stack Compatibility: Azure supports dual-stack deployments where services listen on both IPv4 and IPv6. This allows a single application instance or endpoint to serve both types of clients seamlessly. Dual-stack ensures you don't lose any existing IPv4 users while adding IPv6 capability.

✅ Performance and Future Services: Some networks and clients might experience better performance over IPv6. Also, being IPv6-ready prepares your architecture for future Azure features and services as IPv6 integration deepens across the platform.

General steps to enable IPv6 connectivity for a web application in Azure are:

Plan and Enable IPv6 Addressing in Azure: Define an IPv6 address space in your Azure Virtual Network. Azure allows adding IPv6 address space to existing VNETs, making them dual-stack. A /56 prefix for the VNET is recommended; Azure requires a /64 prefix for each subnet. If you have existing infrastructure, you might need to create new subnets or migrate resources, especially since older Application Gateway v1 instances cannot simply be "upgraded" to dual-stack.

Deploy or Update Frontend Services with IPv6: Choose a suitable Azure service (Application Gateway, External / Global Load Balancer, etc.) and configure it with a public IPv6 address on the frontend. This usually means selecting *Dual Stack* configuration so the service gets both an IPv4 and an IPv6 public IP. For instance, when creating an Application Gateway v2, you would specify IP address type: DualStack (IPv4 & IPv6). Azure Front Door by default provides dual-stack capabilities with its global endpoints.
Configure Backends and Routing: Usually your backend servers or services will remain on IPv4. At the time of writing (October 2025), Azure Application Gateway does not support IPv6 for backend pool addresses. This is fine because the frontend terminates the client's IPv6 connection, and the service initiates a separate IPv4 connection to the backend pool or origin. Ensure that your load balancing rules, listener configurations, and health probes are all set up to route traffic to these backends. Both IPv4 and IPv6 frontend listeners can share the same backend pool. Azure Front Door does support IPv6 origins.

Update DNS Records: Publish a DNS AAAA record for your application's host name, pointing to the new IPv6 address. This step is critical so that IPv6-only clients can discover the IPv6 address of your service. If your service also has an IPv4 address, you will have both A (IPv4) and AAAA (IPv6) records for the same host name. DNS will thus allow clients of either IP family to connect. (In multi-region scenarios using Traffic Manager or Front Door, DNS configuration might be handled through those services, as discussed later.)

Test IPv6 Connectivity: Once set up, test from an IPv6-enabled network or use online tools to ensure the site is reachable via IPv6. Azure services like Application Gateway and Front Door will handle the dual-stack routing, but it's good to verify that content loads on an IPv6-only connection and that SSL certificates, etc., work over IPv6 as they do for IPv4.

Next, we explore specific Azure services and architectures for IPv6 web delivery in detail.

External Load Balancer - single region

Azure External Load Balancer (also known as Public Load Balancer) can be deployed in a single region to provide IPv6 access to applications running on virtual machines or VM scale sets. External Load Balancer acts as a Layer 4 entry point for IPv6 traffic, distributing connections across backend instances.
This scenario is ideal when you have stateless applications or services that do not require Layer 7 features like SSL termination or path-based routing.

Key IPv6 Features of External Load Balancer:
- Dual-Stack Frontend: Standard Load Balancer supports both IPv4 and IPv6 frontends simultaneously. When configured as dual-stack, the load balancer gets two public IP addresses, one IPv4 and one IPv6, and can distribute traffic from both IP families to the same backend pool.
- Zone-Redundant by Default: Standard Load Balancer is zone-redundant by default, providing high availability across Azure Availability Zones within a region without additional configuration.
- IPv6 Frontend Availability: IPv6 support in Standard Load Balancer is available in all Azure regions. Basic Load Balancer does not support IPv6, so you must use the Standard SKU.
- IPv6 Backend Pool Support: While the frontend accepts IPv6 traffic, the load balancer will not translate IPv6 to IPv4. Backend pool members (VMs) must have private IPv6 addresses. You will need to add private IPv6 addressing to your existing IPv4-only VM infrastructure. This is in contrast to Application Gateway, discussed below, which terminates inbound IPv6 network sessions and connects to the backend over IPv4.
- Protocol Support: Supports TCP and UDP load balancing over IPv6, making it suitable for web applications and APIs, but also for non-web TCP- or UDP-based services accessed by IPv6-only clients.

To set up an IPv6-capable External Load Balancer in one region, follow this high-level process:

Enable IPv6 on the Virtual Network: Ensure the VNET where your backend VMs reside has an IPv6 address space. Add a dual-stack address space to the VNET (e.g., add an IPv6 space like 2001:db8:1234::/56 to complement your existing IPv4 space). Configure subnets that are dual-stack, containing both IPv4 and IPv6 prefixes (/64 for IPv6).
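The VNET step above can be sketched with Azure CLI; the resource group, names, and documentation-range prefixes are illustrative, and the command needs an Azure subscription to run:

```shell
# Create a dual-stack VNet (IPv4 plus IPv6 space) with one dual-stack
# subnet. Azure requires IPv6 subnets to be exactly /64.
az network vnet create \
  --resource-group rg-ipv6web \
  --location swedencentral \
  --name vnet-ipv6web \
  --address-prefixes 10.1.0.0/16 2001:db8:1234::/56 \
  --subnet-name snet-backend \
  --subnet-prefixes 10.1.0.0/24 2001:db8:1234::/64
```

An existing VNET can instead be made dual-stack by adding an IPv6 prefix to its address space and to each subnet.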
Create Standard Load Balancer with IPv6 Frontend: In the Azure Portal, create a new Standard Load Balancer. During creation, configure the frontend IP with both IPv4 and IPv6 public IP addresses. Create or select existing Standard SKU public IP resources, one for IPv4 and one for IPv6.

Configure Backend Pool: Add your virtual machines or VM scale set instances to the backend pool. Note that your backend instances will need to have private IPv6 addresses, in addition to IPv4 addresses, to receive inbound IPv6 traffic via the load balancer.

Set Up Load Balancing Rules: Create load balancing rules that map frontend ports to backend ports. For web applications, typically map port 80 (HTTP) and 443 (HTTPS) from both the IPv4 and IPv6 frontends to the corresponding backend ports. Configure health probes to ensure only healthy instances receive traffic.

Configure Network Security Groups: Ensure an NSG is present on the backend VMs' subnet, allowing inbound traffic from the internet to the port(s) of the web application. Inbound traffic is "secure by default", meaning that inbound connectivity from the internet is blocked unless an NSG explicitly allows it.

DNS Configuration: Create DNS records for your application: an A record pointing to the IPv4 address and an AAAA record pointing to the IPv6 address of the load balancer frontend.

Outcome: In this single-region scenario, IPv6-only clients will resolve your application's hostname to an IPv6 address and connect to the External Load Balancer over IPv6.

Example: Consider a web application running on a VM (or a VM scale set) behind an External Load Balancer in Sweden Central. The VM runs the Azure Region and Client IP Viewer containerized application exposed on port 80, which displays the region the VM is deployed in and the calling client's IP address. The load balancer's front-end IPv6 address has a DNS name of ipv6webapp-elb-swedencentral.swedencentral.cloudapp.azure.com.
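The load balancer and DNS steps above might look like the following in Azure CLI. All names are illustrative, a production setup would also add HTTPS rules, and dual-stack load balancers use a separate backend pool per IP family:

```shell
# Standard SKU public IPs, one per IP family (Basic does not support IPv6).
az network public-ip create -g rg-ipv6web -n pip-elb-v4 --sku Standard --version IPv4
az network public-ip create -g rg-ipv6web -n pip-elb-v6 --sku Standard --version IPv6

# Standard load balancer with the IPv4 frontend and an IPv4 backend pool.
az network lb create -g rg-ipv6web -n elb-ipv6web --sku Standard \
  --frontend-ip-name fe-v4 --public-ip-address pip-elb-v4 \
  --backend-pool-name bepool-v4

# Add the IPv6 frontend and a backend pool for the VMs' IPv6 ipconfigs.
az network lb frontend-ip create -g rg-ipv6web --lb-name elb-ipv6web \
  -n fe-v6 --public-ip-address pip-elb-v6
az network lb address-pool create -g rg-ipv6web --lb-name elb-ipv6web -n bepool-v6

# Health probe shared by both rules.
az network lb probe create -g rg-ipv6web --lb-name elb-ipv6web \
  -n probe-http --protocol Http --port 80 --path /

# One HTTP rule per IP family.
az network lb rule create -g rg-ipv6web --lb-name elb-ipv6web -n rule-http-v4 \
  --protocol Tcp --frontend-port 80 --backend-port 80 \
  --frontend-ip-name fe-v4 --backend-pool-name bepool-v4 --probe-name probe-http
az network lb rule create -g rg-ipv6web --lb-name elb-ipv6web -n rule-http-v6 \
  --protocol Tcp --frontend-port 80 --backend-port 80 \
  --frontend-ip-name fe-v6 --backend-pool-name bepool-v6 --probe-name probe-http

# Publish A and AAAA records if you host the zone in Azure DNS
# (zone name and addresses are placeholders).
az network dns record-set a add-record -g rg-dns -z example.com -n www \
  --ipv4-address <elb-ipv4-address>
az network dns record-set aaaa add-record -g rg-dns -z example.com -n www \
  --ipv6-address <elb-ipv6-address>
```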
When called from a client with an IPv6 address, the application shows its region and the client's address.

Limitations & Considerations:
- Standard SKU Required: Basic Load Balancer does not support IPv6. You must use Standard Load Balancer.
- Layer 4 Only: Unlike Application Gateway, External Load Balancer operates at Layer 4 (transport layer). It cannot perform SSL termination, cookie-based session affinity, or path-based routing. If you need these features, consider Application Gateway instead.
- Dual-Stack IPv4/IPv6 Backend Required: Backend pool members must have private IPv6 addresses to receive inbound IPv6 traffic via the load balancer. The load balancer does not translate between the IPv6 frontend and an IPv4 backend.
- Outbound Connectivity: If your backend VMs need outbound internet access over IPv6, you need to configure an IPv6 outbound rule.

Global Load Balancer - multi-region

Azure Global Load Balancer (also known as Cross-region Load Balancer) provides a cloud-native global network load balancing solution for distributing traffic across multiple Azure regions. Unlike DNS-based solutions, Global Load Balancer uses anycast IP addressing to automatically route clients to the nearest healthy regional deployment through Microsoft's global network.

Key Features of Global Load Balancer:
- Static Anycast Global IP: Global Load Balancer provides a single static public IP address (both IPv4 and IPv6 are supported) that is advertised from all Microsoft WAN edge nodes globally. This anycast address ensures clients always connect to the nearest available Microsoft edge node without requiring DNS resolution.
- Geo-Proximity Routing: The geo-proximity load-balancing algorithm minimizes latency by directing traffic to the nearest region where the backend is deployed. Unlike DNS-based routing, there is no DNS lookup delay: clients connect directly to the anycast IP and are immediately routed to the best region.
- Layer 4 Pass-Through: Global Load Balancer operates as a Layer 4 pass-through network load balancer, preserving the original client IP address (including IPv6 addresses) for backend applications to use in their logic.
- Regional Redundancy: If one region fails, traffic is automatically routed to the next closest healthy regional load balancer within seconds, providing instant global failover without DNS propagation delays.

Architecture Overview: Global Load Balancer sits in front of multiple regional Standard Load Balancers, each deployed in a different Azure region. Each regional load balancer serves a local deployment of your application with IPv6 frontends. The global load balancer provides a single anycast IP address that clients worldwide can use to access your application, with automatic routing to the nearest healthy region.

Multi-Region Deployment Steps:

Deploy Regional Load Balancers: Create Standard External Load Balancers in multiple Azure regions (e.g. Sweden Central, East US 2). Configure each with dual-stack frontends (IPv4 and IPv6 public IPs) and connect them to regional VM deployments or VM scale sets running your application.

Configure Global Frontend IP Address: Create a Global tier public IPv6 address for the frontend, in one of the supported Global Load Balancer home regions. This becomes your application's global anycast address.

Create Global Load Balancer: Deploy the Global Load Balancer in the same home region. The home region is where the global load balancer resource is deployed; it does not affect traffic routing.

Add Regional Backends: Configure the backend pool of the Global Load Balancer to include your regional Standard Load Balancers. Each regional load balancer becomes an endpoint in the global backend pool. The global load balancer automatically monitors the health of each regional endpoint.

Set Up Load Balancing Rules: Create load balancing rules mapping frontend ports to backend ports.
For web applications, typically map port 80 (HTTP) and 443 (HTTPS). The backend port on the global load balancer must match the frontend port of the regional load balancers.

Configure Health Probes: Global Load Balancer automatically monitors the health of regional load balancers every 5 seconds. If a regional load balancer's availability drops to 0, it is automatically removed from rotation, and traffic is redirected to other healthy regions.

DNS Configuration: Create DNS records pointing to the global load balancer's anycast IP addresses. Create both A (IPv4) and AAAA (IPv6) records for your application's hostname pointing to the global load balancer's static IPs.

Outcome: IPv6 clients connecting to your application's hostname will resolve to the global load balancer's anycast IPv6 address. When they connect to this address, the Microsoft global network infrastructure automatically routes their connection to the nearest participating Azure region. The regional load balancer then distributes the traffic across local backend instances. If that region becomes unavailable, subsequent connections are automatically routed to the next nearest healthy region.

Example: Our web application, which displays the region it is in, and the calling client's IP address, now runs on VMs behind External Load Balancers in Sweden Central and East US 2. The External Load Balancers' front-ends are in the backend pool of a Global Load Balancer, which has a Global tier front-end IPv6 address. The front-end has an FQDN of `ipv6webapp-glb.eastus2.cloudapp.azure.com` (the region designation `eastus2` in the FQDN refers to the Global Load Balancer's "home region", into which the Global tier public IP must be deployed). When called from a client in Europe, Global Load Balancer directs the request to the instance deployed in Sweden Central. When called from a client in the US, Global Load Balancer directs the request to the instance deployed in East US 2.
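A Global Load Balancer deployment along these lines can be sketched with the `az network cross-region-lb` command group. Treat this as an illustration: resource names and the subscription ID are placeholders, and exact flag names can differ between Azure CLI versions.

```shell
# Global tier public IPv6 address, created in a supported home region.
az network public-ip create -g rg-glb -n pip-glb-v6 \
  --location eastus2 --sku Standard --tier Global --version IPv6

# Cross-region (global) load balancer with that anycast frontend.
az network cross-region-lb create -g rg-glb -n glb-ipv6web \
  --location eastus2 \
  --frontend-ip-name fe-global-v6 --public-ip-address pip-glb-v6 \
  --backend-pool-name regional-lbs

# Each backend address references a regional Standard Load Balancer's
# frontend IP configuration by resource ID (placeholder ID below).
az network cross-region-lb address-pool address add \
  -g rg-glb --lb-name glb-ipv6web --pool-name regional-lbs \
  -n swedencentral-elb \
  --frontend-ip-address /subscriptions/<sub-id>/resourceGroups/rg-ipv6web/providers/Microsoft.Network/loadBalancers/elb-ipv6web/frontendIPConfigurations/fe-v6

# Rule: the backend port must match the regional frontend port.
az network cross-region-lb rule create -g rg-glb --lb-name glb-ipv6web \
  -n rule-http --protocol Tcp --frontend-port 80 --backend-port 80 \
  --frontend-ip-name fe-global-v6 --backend-pool-name regional-lbs
```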
Features:
- Client IP Preservation: The original IPv6 client address is preserved and available to backend applications, enabling IP-based logic and compliance requirements.
- Floating IP Support: Configure floating IP at the global level for advanced networking scenarios requiring direct server return or high availability clustering.
- Instant Scaling: Add or remove regional deployments behind the global endpoint without service interruption, enabling dynamic scaling for traffic events.
- Multiple Protocol Support: Supports both TCP and UDP traffic distribution across regions, suitable for various application types beyond web services.

Limitations & Considerations:
- Home Region Requirement: Global Load Balancer can only be deployed in specific home regions, though this doesn't affect traffic routing performance.
- Public Frontend Only: Global Load Balancer currently supports only public frontends; internal/private global load balancing is not available.
- Standard Load Balancer Backends: The backend pool can only contain Standard Load Balancers, not Basic Load Balancers or other resource types.
- Same IP Version Requirement: NAT64 translation isn't supported; frontend and backend must use the same IP version (IPv4 or IPv6).
- Port Consistency: The backend port on the global load balancer must match the frontend port of regional load balancers for proper traffic flow.
- Health Probe Dependencies: Regional load balancers must have proper health probes configured for the global load balancer to accurately assess regional health.

Comparison with DNS-Based Solutions:

Unlike Traffic Manager or other DNS-based global load balancing solutions, Global Load Balancer provides:
- Instant Failover: No DNS TTL delays; failover happens within seconds at the network level.
- True Anycast: A single IP address that works globally without client-side DNS resolution.
- Consistent Performance: Geo-proximity routing through Microsoft's backbone network ensures optimal paths.
- Simplified Management: No DNS record management or TTL considerations.

This architecture delivers global high availability and optimal performance for IPv6 applications through anycast routing, making it a good solution for latency-sensitive applications requiring worldwide accessibility with near-instant regional failover.

Application Gateway - single region

Azure Application Gateway can be deployed in a single region to provide IPv6 access to applications in that region. Application Gateway acts as the entry point for IPv6 traffic, terminating HTTP/S from IPv6 clients and forwarding to backend servers over IPv4. This scenario works well when your web application is served from one Azure region and you want to enable IPv6 connectivity for it.

Key IPv6 Features of Application Gateway (v2 SKU):
- Dual-Stack Frontend: Application Gateway v2 supports both [IPv4 and IPv6 frontends](https://learn.microsoft.com/en-us/azure/application-gateway/application-gateway-faq). When configured as dual-stack, the gateway gets two IP addresses, one IPv4 and one IPv6, and can listen on both. (IPv6-only is not supported; IPv4 is always paired.) IPv6 support requires Application Gateway v2; v1 does not support IPv6.
- No IPv6 on Backends: The backend pool must use IPv4 addresses. IPv6 addresses for backend servers are currently not supported. This means your web servers can remain on IPv4 internal addresses, simplifying adoption because you only enable IPv6 on the frontend.
- WAF Support: The Application Gateway Web Application Firewall (WAF) will inspect IPv6 client traffic just as it does IPv4.

Single Region Deployment Steps:

To set up an IPv6-capable Application Gateway in one region, consider the following high-level process:

Enable IPv6 on the Virtual Network: Ensure the region's VNET where the Application Gateway will reside has an IPv6 address space.
Configure a subnet for the Application Gateway that is dual-stack (contains both an IPv4 subnet prefix and an IPv6 /64 prefix).

Deploy Application Gateway (v2) with Dual-Stack Frontend: Create a new Application Gateway using the Standard_v2 or WAF_v2 SKU.

Populate Backend Pool: Ensure your backend pool (the target application servers or service) contains IPv4 addresses of your actual web servers, or DNS names pointing to them. IPv6 addresses are not supported for backends.

Configure Listeners and Rules: Set up listeners on the Application Gateway for your site. When creating an HTTP(S) listener, you choose which frontend IP to use; you would create one listener for the IPv4 address and one for IPv6. Both listeners can use the same domain name (hostname) and the same underlying routing rule to your backend pool.

Testing and DNS: After the gateway is deployed and configured, note the IPv6 address of the frontend (you can find it in the gateway's overview or in the associated Public IP resource). Update your application's DNS records: create an AAAA record pointing to this IPv6 address (and update the A record to point to the IPv4 address if it changed). With DNS in place, test the application by accessing it from an IPv6-enabled client or tool.

Outcome: In this single-region scenario, IPv6-only clients will resolve your website's hostname to an IPv6 address and connect to the Application Gateway over IPv6. The Application Gateway then handles the traffic and forwards it to your application over IPv4 internally. From the user perspective, the service now appears natively on IPv6. Importantly, this does not require any changes to the web servers, which can continue using IPv4. Application Gateway will include the source IPv6 address in an X-Forwarded-For header, so that the backend application has visibility of the originating client's address.
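The testing step can be done from any IPv6-connected machine with standard tools; the hostname here is taken from this article's example deployment, so substitute your own endpoint:

```shell
# Does the name have an AAAA record?
dig +short AAAA ipv6webapp-appgw-swedencentral.swedencentral.cloudapp.azure.com

# Force the connection over IPv6 (-6) and over IPv4 (-4) and compare.
curl -6 -s -o /dev/null -w 'IPv6: %{http_code} via %{remote_ip}\n' \
  http://ipv6webapp-appgw-swedencentral.swedencentral.cloudapp.azure.com/
curl -4 -s -o /dev/null -w 'IPv4: %{http_code} via %{remote_ip}\n' \
  http://ipv6webapp-appgw-swedencentral.swedencentral.cloudapp.azure.com/
```

Online dual-stack checkers are an alternative when no IPv6-enabled network is at hand.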
Example: Our web application, which displays the region it is deployed in and the calling client's IP address, now runs on a VM behind Application Gateway in Sweden Central. The front-end has an FQDN of `ipv6webapp-appgw-swedencentral.swedencentral.cloudapp.azure.com`. Application Gateway terminates the IPv6 connection from the client and proxies the traffic to the application over IPv4. The client's IPv6 address is passed in the X-Forwarded-For header, which is read and displayed by the application. Calling the application's API endpoint at `/api/region` shows additional detail, including the IPv4 address of the Application Gateway instance that initiates the connection to the backend, and the original client IPv6 address (with the source port number appended) preserved in the X-Forwarded-For header:

```json
{
  "region": "SwedenCentral",
  "clientIp": "2001:1c04:3404:9500:fd9b:58f4:1fb2:db21:60769",
  "xForwardedFor": "2001:1c04:3404:9500:fd9b:58f4:1fb2:db21:60769",
  "remoteAddress": "::ffff:10.1.0.4",
  "isPrivateIP": false,
  "expressIp": "2001:1c04:3404:9500:fd9b:58f4:1fb2:db21:60769",
  "connectionInfo": {
    "remoteAddress": "::ffff:10.1.0.4",
    "remoteFamily": "IPv6",
    "localAddress": "::ffff:10.1.1.68",
    "localPort": 80
  },
  "allHeaders": {
    "x-forwarded-for": "2001:1c04:3404:9500:fd9b:58f4:1fb2:db21:60769"
  },
  "deploymentAdvice": "Public IP detected successfully"
}
```

Limitations & Considerations:
- Application Gateway v1 SKUs are not supported for IPv6. If you have an older deployment on v1, you'll need to migrate to v2.
- IPv6-only Application Gateway is not allowed. You must have IPv4 alongside IPv6 (the service must be dual-stack). This is usually fine, as dual-stack ensures all clients are covered.
- No IPv6 backend addresses: The backend pool must have IPv4 addresses.
- Management and Monitoring: Application Gateway logs traffic from IPv6 clients to Log Analytics (the client IP field will show IPv6 addresses).
- Security: Azure's infrastructure provides basic DDoS protection for IPv6 endpoints just as for IPv4. However, it is highly recommended to deploy Azure DDoS Protection Standard: this provides enhanced mitigation tailored to your specific deployment. Consider using the Web Application Firewall function for protection against application-layer attacks.

Application Gateway - multi-region

Mission-critical web applications should be deployed in multiple Azure regions, achieving higher availability and lower latency for users worldwide. In a multi-region scenario, you need a mechanism to direct IPv6 client traffic to the "nearest" or healthiest region. Azure Application Gateway by itself is a regional service, so to use it in multiple regions, we use Azure Traffic Manager for global DNS load balancing, or use Azure Front Door (covered in the next section) as an alternative. This section focuses on the Traffic Manager + Application Gateway approach to multi-region IPv6 delivery.

Azure Traffic Manager is a DNS-based load balancer that can distribute traffic across endpoints in different regions. It works by responding to DNS queries with the appropriate endpoint FQDN or IP, based on the routing method (Performance, Priority, Geographic) configured. Traffic Manager is agnostic to the IP version: it either returns CNAMEs, or AAAA records for IPv6 endpoints and A records for IPv4. This makes it suitable for routing IPv6 traffic globally.

Architecture Overview: Each region has its own dual-stack Application Gateway. Traffic Manager is configured with an endpoint entry for each region's gateway. The application's FQDN is now a domain name hosted by Traffic Manager, such as ipv6webapp.trafficmanager.net, or a CNAME that ultimately points to it. DNS resolution will go through Traffic Manager, which decides which regional gateway's FQDN to return. The client then connects directly to that Application Gateway's IPv6 address, as follows:

1. DNS query: The client asks for ipv6webapp.trafficmanager.net, which is hosted in a Traffic Manager profile.
2. Traffic Manager decision: Traffic Manager sees an incoming DNS request and chooses the best endpoint (say, Sweden Central) based on routing rules (e.g., geographic proximity or lowest latency).
3. Traffic Manager response: Traffic Manager returns the FQDN of the Sweden Central Application Gateway to the client.
4. DNS resolution: The client resolves the regional FQDN and receives a AAAA response containing the IPv6 address.
5. Client connects: The client's browser connects to the Sweden Central Application Gateway's IPv6 address directly. The HTTP/S session is established via IPv6 to that regional gateway, which then handles the request.
6. Failover: If that region becomes unavailable, Traffic Manager's health checks will detect it and subsequent DNS queries will be answered with the FQDN of the secondary region's gateway.

Deployment Steps for Multi-Region with Traffic Manager:

Set up dual-stack Application Gateways in each region: Similar to the single-region case, deploy an Azure Application Gateway v2 in each desired region (e.g., one in North America, one in Europe). Configure the web application in each region; these should be parallel deployments serving the same content.

Configure a Traffic Manager profile: In Azure Traffic Manager, create a profile and choose a routing method (such as Performance for nearest-region routing, or Priority for primary/backup failover). Add endpoints for each region. Since our endpoints are Azure services with IPs, we can either use Azure endpoints (if the Application Gateways have Azure-provided DNS names) or External endpoints using the IP addresses. The simplest way is to use the Public IP resource of each Application Gateway as an Azure endpoint – ensure each Application Gateway's public IP has a DNS label (so it has an FQDN). Traffic Manager will detect those and also be aware of their IPs.
Alternatively, use the IPv6 address as an External endpoint directly. Traffic Manager allows IPv6 addresses and will return AAAA records for them.

DNS setup: Traffic Manager profiles have an FQDN (like ipv6webapp.trafficmanager.net). You can either use that as your service's CNAME, or you can configure your custom domain to CNAME to the Traffic Manager profile.

Health probing: Traffic Manager continuously checks the health of endpoints. When endpoints are Azure Application Gateways, it uses HTTP/S probes to a specified URI path on each gateway's address. Make sure each Application Gateway has a listener on the probing endpoint (e.g., a health check page) and that health probes are enabled.

Testing failover and distribution: Test the setup by querying DNS from different geographical locations (to see if you get the nearest region's IP). Also simulate a region going down (stop the Application Gateway or backend) and observe whether Traffic Manager directs traffic to the other region. Because DNS TTLs are involved, failover isn't instant, but it typically completes within a couple of minutes depending on TTL and probe interval.

Considerations in this Architecture:
- Latency vs. failover: Traffic Manager as a DNS load balancer directs users at connect time, but once a client has an answer (IP address), it keeps sending to that address until the DNS record TTL expires and it re-resolves. This is fine for most web apps. Ensure the TTL in the Traffic Manager profile is not too high (the default is 30 seconds).
- IPv6 DNS and connectivity: Confirm that each region's IPv6 address is correctly configured and reachable globally. Azure's public IPv6 addresses are globally routable. Traffic Manager itself is a global service and fully supports IPv6 in its decision-making.
- Cost: Using multiple Application Gateways and Traffic Manager incurs costs for each component (Application Gateway is billed per hour plus capacity units, Traffic Manager per million DNS queries). This is a trade-off for high availability.
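The endpoint selection behavior described above can be modeled in a few lines. This is a simplified illustration (not Azure SDK code) of how Priority routing picks the healthiest endpoint by priority and Performance routing picks the lowest-latency healthy endpoint:

```python
# Illustrative model of Traffic Manager routing methods. Endpoint data is
# hypothetical; real health state comes from Traffic Manager's HTTP/S probes.
def pick_priority(endpoints):
    healthy = [e for e in endpoints if e["healthy"]]
    return min(healthy, key=lambda e: e["priority"])["fqdn"] if healthy else None

def pick_performance(endpoints):
    healthy = [e for e in endpoints if e["healthy"]]
    return min(healthy, key=lambda e: e["latency_ms"])["fqdn"] if healthy else None

endpoints = [
    {"fqdn": "ipv6webapp-appgw-swedencentral.swedencentral.cloudapp.azure.com",
     "priority": 1, "latency_ms": 18, "healthy": True},
    {"fqdn": "ipv6webappr2-appgw-eastus2.eastus2.cloudapp.azure.com",
     "priority": 2, "latency_ms": 95, "healthy": True},
]

print(pick_performance(endpoints))  # nearest (lowest-latency) region for this client
endpoints[0]["healthy"] = False     # simulate Sweden Central going down
print(pick_priority(endpoints))     # subsequent answers fail over to East US 2
```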
- Alternative: Azure Front Door: Azure Front Door is an alternative to the Traffic Manager + Application Gateway combination. Front Door can automatically handle global routing and failover at layer 7 without DNS-based limitations, offering potentially faster failover. Azure Front Door is discussed in the next section.

In summary, a multi-region IPv6 web delivery with Application Gateways uses Traffic Manager for global DNS load balancing. Traffic Manager will seamlessly return IPv6 addresses for IPv6 clients, ensuring that no matter where an IPv6-only client is, they are pointed to the nearest available regional deployment of your app. This design achieves global resiliency (withstanding a regional outage) and low-latency access, leveraging IPv6 connectivity on each regional endpoint.

Example: The global FQDN of our application is now ipv6webapp.trafficmanager.net, and clients will use this FQDN to access the application regardless of their geographical location. Traffic Manager will return the FQDN of one of the regional deployments, `ipv6webapp-appgw-swedencentral.swedencentral.cloudapp.azure.com` or `ipv6webappr2-appgw-eastus2.eastus2.cloudapp.azure.com`, depending on the routing method configured, the health state of the regional endpoints and the client's location. The client then resolves the regional FQDN through its local DNS server and connects to the regional instance of the application.
DNS resolution from a client in Europe:

```
Resolve-DnsName ipv6webapp.trafficmanager.net

Name                          Type  TTL Section NameHost
----                          ----  --- ------- --------
ipv6webapp.trafficmanager.net CNAME 59  Answer  ipv6webapp-appgw-swedencentral.swedencentral.cloudapp.azure.com

Name       : ipv6webapp-appgw-swedencentral.swedencentral.cloudapp.azure.com
QueryType  : AAAA
TTL        : 10
Section    : Answer
IP6Address : 2603:1020:1001:25::168
```

And from a client in the US:

```
Resolve-DnsName ipv6webapp.trafficmanager.net

Name                          Type  TTL Section NameHost
----                          ----  --- ------- --------
ipv6webapp.trafficmanager.net CNAME 60  Answer  ipv6webappr2-appgw-eastus2.eastus2.cloudapp.azure.com

Name       : ipv6webappr2-appgw-eastus2.eastus2.cloudapp.azure.com
QueryType  : AAAA
TTL        : 10
Section    : Answer
IP6Address : 2603:1030:403:17::5b0
```

Azure Front Door

Azure Front Door is an application delivery network with built-in CDN, SSL offload, WAF, and routing capabilities. It provides a single, unified frontend distributed across Microsoft's edge network. Azure Front Door natively supports IPv6 connectivity. For applications that have users worldwide, Front Door offers advantages:
- Global anycast endpoint: Provides anycast IPv4 and IPv6 addresses, advertised out of all edge locations, with automatic A and AAAA DNS record support.
- IPv4 and IPv6 origin support: Azure Front Door supports both IPv4 and IPv6 origins (i.e., backends), both within Azure and externally (i.e., accessible over the internet).
- Simplified DNS: Custom domains can be mapped using CNAME records.
- Layer-7 routing: Supports path-based routing and automatic backend health detection.
- Edge security: Includes DDoS protection and optional WAF integration.

Front Door enables "cross-IP-version" scenarios: a client can connect to the Front Door frontend over IPv6, and Front Door can then connect to an IPv4 origin. Conversely, an IPv4-only client can retrieve content from an IPv6 backend via Front Door.
Front Door preserves the client's source IP address in the X-Forwarded-For header.

Note: Front Door provides managed IPv6 addresses that are not customer-owned resources. Custom domains should use CNAME records pointing to the Front Door hostname rather than direct IP address references.

Private Link Integration

Azure Front Door Premium introduces Private Link integration, enabling secure, private connectivity between Front Door and backend resources, without exposing them to the public internet. When Private Link is enabled, Azure Front Door establishes a private endpoint within a Microsoft-managed virtual network. This endpoint acts as a secure bridge between Front Door's global edge network and your origin resources, such as Azure App Service, Azure Storage, Application Gateway, or workloads behind an internal load balancer.

Traffic from end users still enters through Front Door's globally distributed POPs, benefiting from features like SSL offload, caching, and WAF protection. However, instead of routing to your origin over public, internet-facing endpoints, Front Door uses the private Microsoft backbone to reach the private endpoint. This ensures that all traffic between Front Door and your origin remains isolated from external networks. The private endpoint connection requires approval from the origin resource owner, adding an extra layer of control. Once approved, the origin can restrict public access entirely, enforcing that all traffic flows through Private Link.

Private Link integration brings the following benefits:
- Enhanced security: By removing public exposure of backend services, Private Link significantly reduces the risk of DDoS attacks, data exfiltration, and unauthorized access.
- Compliance and governance: Many regulatory frameworks mandate private connectivity for sensitive workloads. Private Link helps meet these requirements without sacrificing global availability.
- Performance and reliability: Traffic between Front Door and your origin travels over Microsoft's high-speed backbone network, delivering low latency and consistent performance compared to public internet paths.
- Defense in depth: Combined with Web Application Firewall (WAF), TLS encryption, and DDoS protection, Private Link strengthens your security posture across multiple layers.
- Isolation and control: Resource owners maintain control over connection approvals, ensuring that only authorized Front Door profiles can access the origin.
- Integration with hybrid architectures: For scenarios involving AKS clusters, custom APIs, or workloads behind internal load balancers, Private Link enables secure connectivity without requiring public IPs or complex VPN setups.

Private Link transforms Azure Front Door from a global entry point into a fully private delivery mechanism for your applications, aligning with modern security principles and enterprise compliance needs.

Example: Our application is now placed behind Azure Front Door. We are combining a public backend endpoint and Private Link integration, to show both in action in a single example. The Sweden Central origin endpoint is the public IPv6 endpoint of the regional External Load Balancer, and the origin in US East 2 is connected via Private Link integration. The global FQDN is `ipv6webapp-d4f4euhnb8fge4ce.b01.azurefd.net`, and clients will use this FQDN to access the application regardless of their geographical location. The FQDN resolves to Front Door's global anycast address, and the internet will route client requests to the nearest Microsoft edge location from which this address is advertised. Front Door will then transparently route the request to the nearest origin deployment in Azure. Although public endpoints are used in this example, that traffic will be routed over the Microsoft network.
From a client in Europe:

Calling the application's API endpoint on `ipv6webapp-d4f4euhnb8fge4ce.b01.azurefd.net/api/region` shows some more detail.

```json
{
  "region": "SwedenCentral",
  "clientIp": "2001:1c04:3404:9500:fd9b:58f4:1fb2:db21",
  "xForwardedFor": "2001:1c04:3404:9500:fd9b:58f4:1fb2:db21",
  "remoteAddress": "2a01:111:2053:d801:0:afd:ad4:1b28",
  "isPrivateIP": false,
  "expressIp": "2001:1c04:3404:9500:fd9b:58f4:1fb2:db21",
  "connectionInfo": {
    "remoteAddress": "2a01:111:2053:d801:0:afd:ad4:1b28",
    "remoteFamily": "IPv6",
    "localAddress": "2001:db8:1:1::4",
    "localPort": 80
  },
  "allHeaders": {
    "x-forwarded-for": "2001:1c04:3404:9500:fd9b:58f4:1fb2:db21",
    "x-azure-clientip": "2001:1c04:3404:9500:fd9b:58f4:1fb2:db21"
  },
  "deploymentAdvice": "Public IP detected successfully"
}
```

`"remoteAddress": "2a01:111:2053:d801:0:afd:ad4:1b28"` is the address from which Front Door sources its request to the origin.

From a client in the US:

The detailed view shows that the IP address calling the backend instance is now a local VNet address. Private Link sources incoming traffic from a local address taken from the VNet it is placed in. The original client IP address is again preserved in the X-Forwarded-For header.

```json
{
  "region": "eastus2",
  "clientIp": "2603:1030:501:23::68:55658",
  "xForwardedFor": "2603:1030:501:23::68:55658",
  "remoteAddress": "::ffff:10.2.1.5",
  "isPrivateIP": false,
  "expressIp": "2603:1030:501:23::68:55658",
  "connectionInfo": {
    "remoteAddress": "::ffff:10.2.1.5",
    "remoteFamily": "IPv6",
    "localAddress": "::ffff:10.2.2.68",
    "localPort": 80
  },
  "allHeaders": {
    "x-forwarded-for": "2603:1030:501:23::68:55658"
  },
  "deploymentAdvice": "Public IP detected successfully"
}
```

Conclusion

IPv6 adoption for web applications is no longer optional. It is essential as public IPv4 address space is depleted, mobile networks increasingly use IPv6 only, and governments mandate IPv6 reachability for public services.
Azure's comprehensive dual-stack networking capabilities provide a clear path forward, enabling organizations to leverage IPv6 externally without sacrificing IPv4 compatibility or requiring complete infrastructure overhauls. Azure's externally facing services — including Application Gateway, External Load Balancer, Global Load Balancer, and Front Door — support IPv6 frontends, while Application Gateway and Front Door maintain IPv4 backend connectivity. This architecture allows applications to remain unchanged while instantly becoming accessible to IPv6-only clients. For single-region deployments, Application Gateway offers layer-7 features like SSL termination and WAF protection. External Load Balancer provides high-performance layer-4 distribution. Multi-region scenarios benefit from Traffic Manager's DNS-based routing combined with regional Application Gateways, or the superior performance and failover capabilities of Global Load Balancer's anycast addressing. Azure Front Door provides global IPv6 delivery with edge optimization, built-in security, and seamless failover across Microsoft's network. Private Link integration allows secure global IPv6 distribution while maintaining backend isolation. The transition to IPv6 application delivery on Azure is straightforward: enable dual-stack addressing on virtual networks, configure IPv6 frontends on load balancing services, and update DNS records. With Application Gateway or Front Door, backend applications require no modifications. These Azure services handle the IPv4-to-IPv6 translation seamlessly. This approach ensures both immediate IPv6 accessibility and long-term architectural flexibility as IPv6 adoption accelerates globally.202Views1like0CommentsDNS flow trace logs in Azure Firewall are now generally available
Background

Azure Firewall helps secure your network by filtering traffic and enforcing policies for your workloads and applications. DNS Proxy, a key capability in Azure Firewall, enables the firewall to act as a DNS forwarder for DNS traffic. Today, we're introducing the general availability of DNS flow trace logs — a new logging capability that provides end-to-end visibility into DNS traffic and name resolution across your environment, exposing critical metadata including query types, response codes, queried domains, upstream DNS servers, and the source and destination IPs of each request.

Why DNS flow trace logs?

Existing Azure Firewall DNS Proxy logs provide visibility for DNS queries as they initially pass through Azure Firewall. While helpful, customers have asked for deeper insights to troubleshoot, audit, and analyze DNS behavior more comprehensively. DNS flow trace logs address this by offering richer, end-to-end logging, including DNS query paths, cache usage, forwarding decisions, and resolution outcomes.

With these logs, you can:
- Troubleshoot faster with detailed query and response information throughout the full resolution flow
- Validate caching behavior by determining whether Azure Firewall's DNS cache was used
- Gain deeper insights into query types, response codes, forwarding logic, and errors

Example scenarios
- Custom DNS configurations – verify traffic forwarding paths and ensure custom DNS servers are functioning and responding as expected
- Connectivity issues – debug DNS resolution issues that prevent apps from connecting to critical services

Getting started in Azure Portal
1. Navigate to your Azure Firewall resource in the Azure Portal.
2. Select Diagnostic settings under Monitoring.
3. Choose an existing diagnostic setting or create a new one.
4. Under Log, select DNS flow trace logs.
5. Stream logs to Log Analytics, Storage, or Event Hub as needed.
6. Save the settings.
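Once the logs are streaming, the metadata fields listed above (queried domain, query type, response code, upstream server) lend themselves to simple aggregation. A hypothetical sketch in Python, where the record field names are ours for illustration and NOT the actual log schema (see the Azure Firewall monitoring data reference for the real column names):

```python
from collections import Counter

# Hypothetical parsed log records; the keys mirror the metadata the logs
# expose (domain, query type, response code, upstream server) but are an
# assumed shape, not the real schema.
records = [
    {"domain": "contoso.com", "qtype": "A", "rcode": "NOERROR", "upstream": "168.63.129.16"},
    {"domain": "api.fabrikam.com", "qtype": "AAAA", "rcode": "SERVFAIL", "upstream": "10.0.0.53"},
    {"domain": "api.fabrikam.com", "qtype": "AAAA", "rcode": "SERVFAIL", "upstream": "10.0.0.53"},
]

# Surface failing resolutions and the upstream servers involved - the kind of
# question these logs are meant to answer (e.g., a broken custom DNS server).
failures = Counter((r["domain"], r["upstream"]) for r in records if r["rcode"] != "NOERROR")
print(failures.most_common(1))  # [(('api.fabrikam.com', '10.0.0.53'), 2)]
```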
Azure Firewall logging

✨ Next steps

DNS flow trace logs give you greater visibility and control over DNS traffic in Azure Firewall, helping you secure, troubleshoot, and optimize your network with confidence.

🚀 Try DNS flow trace logs today, now generally available – and share your feedback with the team.

Learn more about how to configure and monitor these logs in the Azure Firewall monitoring data reference documentation.
As enterprises adopt Microsoft Azure for large-scale workloads, securing network traffic becomes a critical part of the platform foundation. Microsoft's Cloud Adoption Framework provides the blueprint for enterprise-scale Landing Zone design and deployment, and while Azure Firewall is a built-in PaaS option, some organizations prefer third-party firewall appliances for familiarity, feature depth, and vendor alignment. This blog explains the basic design patterns, key configurations, and best practices when deploying third-party firewalls (Palo Alto, Fortinet, Check Point, etc.) as part of an Azure Landing Zone.

1. Landing Zone Architecture and Firewall Role

The Azure Landing Zone is Microsoft's recommended enterprise-scale architecture for adopting cloud at scale. It provides a standardized, modular design that organizations can use to deploy and govern workloads consistently across subscriptions and regions. At its core, the Landing Zone follows a hub-and-spoke topology:

Hub (Connectivity Subscription):
- Central place for shared services like DNS, private endpoints, VPN/ExpressRoute gateways, Azure Firewall (or third-party firewall appliances), Bastion, and monitoring agents.
- Provides consistent security controls and connectivity for all workloads. Firewalls are deployed here to act as the traffic inspection and enforcement point.

Spokes (Workload Subscriptions):
- Application workloads (e.g., SAP, web apps, data platforms) are placed in spoke VNets.
- Additional specialized spokes may exist for Identity, Shared Services, Security, or Management.
- These are isolated for governance and compliance, but all connectivity back to other workloads or on-premises routes through the hub.

Traffic Flows Through Firewalls

North-South Traffic:
- Inbound connections from the Internet (e.g., customer access to applications).
- Outbound connections from Azure workloads to Internet services.
- Hybrid connectivity to on-premises datacenters or other clouds.
- Routed through the external firewall set for inspection and policy enforcement.

East-West Traffic:
- Lateral traffic between spokes (e.g., Application VNet to Database VNet).
- Communication across environments like Dev → Test → Prod (if allowed).
- Routed through an internal firewall set to apply segmentation, zero-trust principles, and prevent lateral movement of threats.

Why Firewalls Matter in the Landing Zone

While Azure provides NSGs (Network Security Groups) and Route Tables for basic packet filtering and routing, these are not sufficient for advanced security scenarios. Firewalls add:
- Deep packet inspection (DPI) and application-level filtering.
- Intrusion Detection/Prevention (IDS/IPS) capabilities.
- Centralized policy management across multiple spokes.
- Segmentation of workloads to reduce the blast radius of potential attacks.
- Consistent enforcement of enterprise security baselines across hybrid and multi-cloud.

Organizations May Choose

Depending on security needs, cost tolerance, and operational complexity, organizations typically adopt one of two models for third-party firewalls:

Two sets of firewalls:
- One set dedicated for north-south traffic (external to Azure).
- Another set for east-west traffic (between VNets and spokes).
- Provides the highest security granularity, but comes with higher cost and management overhead.

A single set of firewalls:
- A consolidated deployment where the same firewall cluster handles both east-west and north-south traffic.
- Simpler and more cost-effective, but may introduce complexity in routing and policy segregation.

This design choice is usually made during Landing Zone design, balancing security requirements, budget, and operational maturity.

2. Why Choose Third-Party Firewalls Over Azure Firewall?

While Azure Firewall provides simplicity as a managed service, customers often choose third-party solutions due to:
- Advanced features – Deep packet inspection, IDS/IPS, SSL decryption, threat feeds.
- Vendor familiarity – Network teams trained on Palo Alto, Fortinet, or Check Point.
- Existing contracts – Enterprise license agreements and support channels.
- Hybrid alignment – Same vendor firewalls across on-premises and Azure.

Azure Firewall is a fully managed PaaS service, ideal for customers who want a simple, cloud-native solution without worrying about underlying infrastructure. However, many enterprises continue to choose third-party firewall appliances (Palo Alto, Fortinet, Check Point, etc.) when implementing their Landing Zones. The decision usually depends on capabilities, familiarity, and enterprise strategy.

Key Reasons to Choose Third-Party Firewalls

Feature Depth and Advanced Security
- Third-party vendors offer advanced capabilities such as Deep Packet Inspection (DPI) for application-aware filtering, Intrusion Detection and Prevention (IDS/IPS), SSL/TLS decryption and inspection, and advanced threat feeds, malware protection, sandboxing, and botnet detection.
- While Azure Firewall continues to evolve, these vendors have a longer track record in advanced threat protection.

Operational Familiarity and Skills
- Network and security teams often have years of experience managing Palo Alto, Fortinet, or Check Point appliances on-premises.
- Adopting the same technology in Azure reduces the learning curve and ensures faster troubleshooting, smoother operations, and reuse of existing playbooks.

Integration with Existing Security Ecosystem
- Many organizations already use vendor-specific management platforms (e.g., Panorama for Palo Alto, FortiManager for Fortinet, or SmartConsole for Check Point).
- Extending the same tools into Azure allows centralized management of policies across on-premises and cloud, ensuring consistent enforcement.

Compliance and Regulatory Requirements
- Certain industries (finance, healthcare, government) require proven, certified firewall vendors for security compliance.
- Customers may already have third-party solutions validated by auditors and prefer extending those to Azure for consistency.

Hybrid and Multi-Cloud Alignment
- Many enterprises run a hybrid model, with workloads split across on-premises, Azure, AWS, or GCP.
- Third-party firewalls provide a common security layer across environments, simplifying multi-cloud operations and governance.

Customization and Flexibility
- Unlike Azure Firewall, which is a managed service with limited backend visibility, third-party firewalls give admins full control over operating systems, patching, advanced routing, and custom integrations.
- This flexibility can be essential when supporting complex or non-standard workloads.

Licensing Leverage (BYOL)
- Enterprises with existing enterprise agreements or volume discounts can bring their own firewall licenses (BYOL) to Azure.
- This often reduces cost compared to pay-as-you-go Azure Firewall pricing.

When Azure Firewall Might Still Be Enough
- Organizations with simple security needs (basic north-south inspection, FQDN filtering).
- Cloud-first teams that prefer managed services with minimal infrastructure overhead.
- Customers who want to avoid the manual scaling and VM patching that come with IaaS appliances.

In practice, many large organizations use a hybrid approach: Azure Firewall for lightweight scenarios or specific environments, and third-party firewalls for enterprise workloads that require advanced inspection, vendor alignment, and compliance certifications.

3. Deployment Models in Azure

Third-party firewalls in Azure are primarily IaaS-based appliances deployed as virtual machines (VMs). Leading vendors publish Azure Marketplace images and ARM/Bicep templates, enabling rapid, repeatable deployments across multiple environments.
These firewalls allow organizations to enforce advanced network security policies, perform deep packet inspection, and integrate with Azure-native services such as Virtual Network (VNet) peering, Azure Monitor, and Azure Sentinel.

Note: Some vendors now also release PaaS versions of their firewalls, offering managed firewall services with simplified operations. However, for the purposes of this blog, we will focus mainly on IaaS-based firewall deployments.

Common Deployment Modes

Active-Active
- Description: In this mode, multiple firewall VMs operate simultaneously, sharing the traffic load. An Azure Load Balancer distributes inbound and outbound traffic across all active firewall instances.
- Use Cases: Ideal for environments requiring high throughput, resilience, and near-zero downtime, such as enterprise data centers, multi-region deployments, or mission-critical applications.
- Considerations: Requires careful route and policy synchronization between firewall instances to ensure consistent traffic handling. Typically involves BGP or user-defined routes (UDRs) for optimal traffic steering. Scaling is easier: additional firewall VMs can be added behind the load balancer to handle traffic spikes.

Active-Passive
- Description: One firewall VM handles all traffic (active), while the secondary VM (passive) stands by for failover. When the active node fails, Azure service principals manage IP reassignment and traffic rerouting.
- Use Cases: Suitable for environments where simpler management and lower operational complexity are preferred over continuous load balancing.
- Considerations: Failover may result in a brief downtime, typically measured in seconds to a few minutes. Synchronization between the active and passive nodes ensures firewall policies, sessions, and configurations are mirrored. Recommended for smaller deployments or those with predictable traffic patterns.
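The Active-Active pattern above relies on the load balancer hashing each flow to one of the healthy firewall instances and taking failed instances out of rotation. A simplified, hypothetical model of that behavior (not Azure code; the instance names and hashing scheme are illustrative):

```python
import hashlib

firewalls = ["fw-a", "fw-b"]  # hypothetical Active-Active firewall instances

def pick_firewall(five_tuple, healthy):
    """Deterministically map a flow's 5-tuple to a healthy firewall instance,
    mimicking how a load balancer keeps a flow pinned to one backend while
    health probes remove failed instances from the pool."""
    pool = [fw for fw in firewalls if healthy[fw]]
    digest = hashlib.sha256(repr(five_tuple).encode()).digest()
    return pool[digest[0] % len(pool)]

flow = ("10.1.0.4", 51000, "8.8.8.8", 443, "tcp")
print(pick_firewall(flow, {"fw-a": True, "fw-b": True}))   # stable choice per flow
print(pick_firewall(flow, {"fw-a": False, "fw-b": True}))  # probe failure: only fw-b remains
```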
Network Interfaces (NICs)

Third-party firewall VMs often include multiple NICs, each dedicated to a specific type of traffic:
- Untrust/Public NIC: Connects to the Internet or external networks. Handles inbound/outbound public traffic and enforces perimeter security policies.
- Trust/Internal NIC: Connects to private VNets or subnets. Manages internal traffic between application tiers and enforces internal segmentation.
- Management NIC: Dedicated to firewall management traffic. Keeps administration separate from data plane traffic, improving security and reducing performance interference.
- HA NIC (Active-Passive setups): Facilitates synchronization between active and passive firewall nodes, ensuring session and configuration state is maintained across failovers.

Figure: NICs of Palo Alto external firewalls and FortiGate internal firewalls in a two-sets-of-firewalls scenario

4. Key Configuration Considerations

When deploying third-party firewalls in Azure, several design and configuration elements play a critical role in ensuring security, performance, and high availability. These considerations should be carefully aligned with organizational security policies, compliance requirements, and operational practices.

Routing

User-Defined Routes (UDRs):
- Define UDRs in spoke Virtual Networks to ensure all outbound traffic flows through the firewall, enforcing inspection and security policies before reaching the Internet or other Virtual Networks. Centralized routing helps standardize controls across multiple application Virtual Networks.
- Depending on the architecture's traffic flow design, use the appropriate Load Balancer IP as the Next Hop in the UDRs of spoke Virtual Networks.

Symmetric Routing:
- Ensure traffic follows symmetric paths (i.e., outbound and inbound flows pass through the same firewall instance).
- Avoid asymmetric routing, which can cause stateful firewalls to drop return traffic.
- Leverage BGP with Azure Route Server where supported, to simplify route propagation across hub-and-spoke topologies.

Figure: Azure UDR directing all traffic from a spoke VNet to the firewall IP address

Policies

NAT Rules:
- Configure DNAT (Destination NAT) rules to publish applications securely to the Internet.
- Use SNAT (Source NAT) to mask private IPs when workloads access external resources.

Security Rules:
- Define granular allow/deny rules for both north-south traffic (Internet to VNet) and east-west traffic (between Virtual Networks or subnets).
- Ensure least privilege by only allowing required ports, protocols, and destinations.

Segmentation:
- Apply firewall policies to separate workloads, environments, and tenants (e.g., Production vs. Development).
- Enforce compliance by isolating workloads subject to regulatory standards (PCI-DSS, HIPAA, GDPR).

Application-Aware Policies:
- Many vendors support Layer 7 inspection, enabling controls based on applications, users, and content (not just IP/port).
- Integrate with identity providers (Azure AD, LDAP, etc.) for user-based firewall rules.

Figure: Example configuration of NAT rules on a Palo Alto external firewall

Load Balancers

Internal Load Balancer (ILB):
- Use ILBs for east-west traffic inspection between Virtual Networks or subnets.
- Ensures that traffic between applications always passes through the firewall, regardless of origin.

External Load Balancer (ELB):
- Use ELBs for north-south traffic, handling Internet ingress and egress.
- Required in Active-Active firewall clusters to distribute traffic evenly across firewall nodes.

Other Configurations:
- Configure health probes for firewall instances to ensure faulty nodes are automatically bypassed.
- Validate Floating IP configuration on Load Balancing Rules according to the respective vendor recommendations.
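The UDR behavior discussed in the Routing section, where a spoke's traffic is matched against the most specific prefix and a default route steers everything else to the firewall's load balancer IP, can be modeled with a small longest-prefix-match sketch (route entries and IPs are examples, not a real route table):

```python
import ipaddress

# Illustrative spoke route table: a default route to the internal LB fronting
# the firewalls, plus the VNet-local prefix that stays inside the spoke.
routes = [
    ("0.0.0.0/0", "VirtualAppliance", "10.0.1.4"),  # next hop: firewall ILB (example IP)
    ("10.1.0.0/16", "VnetLocal", None),             # intra-VNet traffic stays local
]

def next_hop(dest: str):
    """Return (next-hop type, next-hop IP) for the most specific matching route."""
    dest_ip = ipaddress.ip_address(dest)
    matches = [(ipaddress.ip_network(p), t, h) for p, t, h in routes
               if dest_ip in ipaddress.ip_network(p)]
    _, hop_type, hop_ip = max(matches, key=lambda m: m[0].prefixlen)
    return hop_type, hop_ip

print(next_hop("8.8.8.8"))   # Internet-bound traffic is steered to the firewall ILB
print(next_hop("10.1.2.5"))  # same-VNet traffic matches the more specific local route
```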
Identity Integration

- Azure Service Principals: In Active-Passive deployments, configure service principals to enable automated IP reassignment during failover. This ensures continuous service availability without manual intervention.
- Role-Based Access Control (RBAC): Integrate firewall management with Azure RBAC to control who can deploy, manage, or modify firewall configurations.
- SIEM Integration: Stream logs to Azure Monitor, Sentinel, or third-party SIEMs for auditing, monitoring, and incident response.

Licensing

- Pay-As-You-Go (PAYG): Licenses are bundled into the VM cost when deploying from the Azure Marketplace. Best for short-term projects, PoCs, or variable workloads.
- Bring Your Own License (BYOL): Enterprises can apply existing vendor contracts and licenses to Azure deployments. Often more cost-effective for large-scale, long-term deployments.
- Hybrid Licensing Models: Some vendors support license mobility from on-premises to Azure, reducing duplication of costs.

5. Common Challenges

Third-party firewalls in Azure provide strong security controls, but organizations often face practical challenges in day-to-day operations:

- Misconfiguration: Incorrect UDRs, route tables, or NAT rules can cause dropped traffic or bypassed inspection. Asymmetric routing is a frequent issue in hub-and-spoke topologies, leading to session drops in stateful firewalls.
- Performance Bottlenecks: Firewall throughput depends on the VM SKU (CPU, memory, NIC limits). Under-sizing causes latency and packet loss, while over-sizing adds unnecessary cost. Continuous monitoring and vendor sizing guides are essential.
- Failover Downtime: Active-Passive models introduce brief service interruptions while IPs and routes are reassigned. Some sessions may be lost even with state sync, making Active-Active more attractive for mission-critical workloads.
- Backup & Recovery: Azure Backup doesn't support vendor firewall OSes.
Configurations must be exported and stored externally (e.g., storage accounts, repos, or vendor management tools). Without proper backups, recovery from failures or misconfigurations can be slow.

- Azure Platform Limits on Connections: Azure imposes a per-VM cap of 250,000 active connections, regardless of what the firewall vendor appliance supports. This means that even if an appliance is designed for millions of sessions, it will be constrained by Azure's networking fabric. Hitting this cap can lead to unexplained traffic drops despite available CPU/memory. The workaround is to scale out horizontally (multiple firewall VMs behind a load balancer) and carefully monitor connection distribution.

6. Best Practices for Third-Party Firewall Deployments

To maximize security, reliability, and performance of third-party firewalls in Azure, organizations should follow these best practices:

- Deploy in Availability Zones: Place firewall instances across different Availability Zones to ensure regional resilience and minimize downtime in case of zone-level failures.
- Prefer Active-Active for Critical Workloads: Where zero downtime is a requirement, use Active-Active clusters behind an Azure Load Balancer. Active-Passive can be simpler but introduces failover delays.
- Use Dedicated Subnets for Interfaces: Separate trust, untrust, HA, and management NICs into their own subnets. This enforces segmentation, simplifies route management, and reduces misconfiguration risk.
- Apply Least Privilege Policies: Always start with a deny-all baseline, then allow only necessary applications, ports, and protocols. Regularly review rules to avoid policy sprawl.
- Standardize Naming & Tagging: Adopt consistent naming conventions and resource tags for firewalls, subnets, route tables, and policies. This aids troubleshooting, automation, and compliance reporting.
- Validate End-to-End Traffic Flows: Test both north-south (Internet ↔ VNet) and east-west (VNet ↔ VNet/subnet) flows after deployment.
Use tools like Azure Network Watcher and vendor traffic logs to confirm inspection.

- Plan for Scalability: Monitor throughput, CPU, memory, and session counts to anticipate when scale-out or higher VM SKUs are needed. Some vendors support autoscaling clusters for bursty workloads.
- Maintain Firmware & Threat Signatures: Regularly update the firewall's software, patches, and threat intelligence feeds to ensure protection against emerging vulnerabilities and attacks. Automate updates where possible.

Conclusion

Third-party firewalls remain a core building block in many enterprise Azure Landing Zones. They provide the deep security controls and operational familiarity enterprises need, while Azure provides the scalable infrastructure to host them. By following the hub-and-spoke architecture, carefully planning deployment models, and enforcing best practices for routing, redundancy, monitoring, and backup, organizations can ensure a secure and reliable network foundation in Azure.

Extending Layer-2 (VXLAN) networks over Layer-3 IP network
Introduction

Virtual Extensible LAN (VXLAN) is a network virtualization technology that encapsulates Layer-2 Ethernet frames inside Layer-3 UDP/IP packets. In essence, VXLAN creates a logical Layer-2 overlay network on top of an IP network, allowing Ethernet segments (VLANs) to be stretched across a routed IP underlay. A key advantage is scale: VXLAN uses a 24-bit segment ID (VNI) instead of the 12-bit VLAN ID, supporting around 16 million isolated networks versus the 4,094 VLAN limit. This makes VXLAN ideal for large cloud data centers and multi-tenant environments that demand many distinct network segments.

VXLAN's Layer-2 overlays bring flexibility and mobility to modern architectures. Because VXLAN tunnels can span multiple Layer-3 domains, organizations can extend VLANs across different sites or subnets – for example, creating a tunnel between two data centers over an IP WAN, as long as the underlying tunnel endpoint IPs are reachable. This enables seamless workload mobility and disaster recovery: virtual machines or applications can move between physical locations without changing IP addresses, since they remain in the same virtual L2 network. The overlay approach also decouples the logical network from the physical underlay, meaning you can run your familiar L2 segments over any IP routing infrastructure while leveraging features like equal-cost multi-path (ECMP) load balancing and avoiding large spanning-tree domains. In short, VXLAN combines the best of both worlds – the simplicity of Layer-2 adjacency with the scalability of Layer-3 routing – making it a foundational tool in cloud networking and software-defined data centers.

A Layer-2 VXLAN overlay on a Layer-3 IP network allows customers or edge networks to stretch Ethernet (VLAN) segments across geographically distributed sites using an IP backbone.
This approach preserves VLAN tags end-to-end and enables flexible segmentation across locations without needing an extended or contiguous Layer-2 network in the core. It also hides the underlying IP network's complexity from the overlay. However, it's crucial to account for MTU overhead (VXLAN adds ~50 bytes of header), so the overlay's VLAN MTU must be set smaller than the underlay IP MTU – otherwise fragmentation or packet loss can occur. Additionally, because VXLAN doesn't inherently signal link status, implementing Bidirectional Forwarding Detection (BFD) on the VXLAN interfaces provides rapid detection of neighbor failures, ensuring quick rerouting or recovery when a tunnel endpoint goes down.

VXLAN overlay use case and benefits

VXLAN is a standard protocol (IETF RFC 7348) that encapsulates Layer-2 Ethernet frames into Layer-3 UDP/IP packets. By doing so, VXLAN creates an L2 overlay network on top of an L3 underlay. The VXLAN tunnel endpoints (VTEPs), which can be routers, switches, or hosts, wrap the original Ethernet frame (including its VLAN tag) with an IP/UDP header plus a VXLAN header, then send it through the IP network. The default UDP port for VXLAN is 4789. This mechanism offers several key benefits:

- Preserves VLAN tags and L2 segmentation: The entire Ethernet frame is carried across, so the original VLAN ID (802.1Q tag) is maintained end-to-end through the tunnel. Even if an extra tag is added at the ingress for local tunneling, the customer's inner VLAN tag remains intact across the overlay. This means a VLAN defined at one site will be recognized at the other site as the same VLAN, enabling seamless L2 adjacency. In practice, VXLAN can transport multiple VLANs transparently by mapping each VLAN or service to a VXLAN Network Identifier (VNI).
- Flexible network segmentation at scale: VXLAN uses a 24-bit VNI (VXLAN Network ID), supporting about 16 million distinct segments, far exceeding the 4094 VLAN limit of traditional 802.1Q networks.
This gives architects freedom to create many isolated L2 overlay networks (for multi-tenant scenarios, application tiers, etc.) over a shared IP infrastructure. Geographically distributed sites can share the same VLANs and broadcast domain via VXLAN, without the WAN routers needing any VLAN configuration. The IP/MPLS core only sees routed VXLAN packets, not individual VLANs, simplifying the underlay configuration.

- No need for end-to-end VLANs in the underlay: Traditional solutions to extend L2 might rely on methods like MPLS/VPLS or long Ethernet trunk lines, which often require configuring VLANs across the WAN and can't scale well. In a VXLAN overlay, the intermediate L3 network remains unaware of customer VLANs, and you don't need to trunk VLANs across the WAN. Each site's VTEP encapsulates and decapsulates traffic, so the core routers/switches just forward IP/UDP packets. This isolation improves scalability and stability—core devices don't carry massive MAC address tables or STP domains from all sites. It also means the underlay can use robust IP routing (OSPF, BGP, etc.) with ECMP, rather than extending spanning tree across sites. In short, VXLAN lets you treat the WAN like an IP cloud while still maintaining Layer-2 connectivity between specific endpoints.
- Multi-path and resilience: Since the overlay runs on IP, it naturally leverages IP routing features. ECMP in the underlay, for example, can load-balance VXLAN traffic across multiple links, something not possible with a single bridged VLAN spanning the WAN. The encapsulated traffic's UDP header even provides entropy (via source-port hashing) to help load-sharing across multiple paths. Furthermore, if one underlay path fails, routing protocols can reroute VXLAN packets via alternate paths without disrupting the logical L2 network. This increases reliability and bandwidth utilization compared to a Layer-2-only approach.
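To ground these concepts, here is how a simple point-to-point VTEP of this kind might be configured on a Linux host with iproute2. The interface names, VNI, and addresses are hypothetical, and this sketch assumes static point-to-point tunneling rather than a full EVPN control plane:

```shell
# Create a VXLAN interface (VNI 100) between two VTEP addresses, UDP port 4789
ip link add vxlan100 type vxlan id 100 \
    local 192.0.2.1 remote 198.51.100.1 dstport 4789

# Keep the overlay MTU 50 bytes below a 1500-byte underlay MTU (see next section)
ip link set dev vxlan100 mtu 1450

# Bridge the tunnel with the local VLAN 100 sub-interface so frames are stitched
ip link add br100 type bridge
ip link set dev vxlan100 master br100
ip link set dev eth1.100 master br100
ip link set dev vxlan100 up
ip link set dev br100 up
```

With the mirror-image configuration on the remote VTEP, hosts in VLAN 100 at both sites share one broadcast domain across the routed WAN.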
Diagram: VXLAN overlay across a Layer-3 WAN – Below is a simplified illustration of two sites using a VXLAN overlay. Site A and Site B each have a local VLAN (e.g., VLAN 100) that they want to bridge across an IP WAN. The VTEPs at each site encapsulate the Layer-2 frames into VXLAN/UDP packets and send them over the IP network. Inside the tunnel, the original VLAN tag is preserved. In this example, a BFD session (red dashed line) runs between the VTEPs to monitor the tunnel's health, as explained later.

Figure 1: Two sites (A and B) extend "VLAN 100" across an IP WAN using a VXLAN tunnel. The inner VLAN tag is preserved over the L3 network. A BFD keepalive (every 900ms) runs between the VXLAN endpoints to detect failures.

The practical effect of this design is that devices in Site A and Site B can be in the same VLAN and IP subnet, broadcast to each other, etc., even though they are connected by a routed network. For example, if Site A has a machine in VLAN 100 with IP 10.1.100.5/24 and Site B has another in VLAN 100 with IP 10.1.100.10/24, they can communicate as if on one LAN – ARP, switching, and VLAN tagging function normally across the tunnel.

MTU and overhead considerations

One critical consideration for deploying VXLAN overlays is handling the increased packet size due to encapsulation. A VXLAN packet includes additional headers on top of the original Ethernet frame: an outer IP header, UDP header, and VXLAN header (plus an outer Ethernet header on the WAN interface). This encapsulation adds approximately 50 bytes of overhead to each packet (for IPv4; about 70 bytes for IPv6). In practical terms, if your original Ethernet frame had the typical 1500-byte payload (1518 bytes with Ethernet header and CRC, or 1522 with a VLAN tag), the VXLAN-encapsulated version will be ~1550 bytes. The underlying IP network must accommodate these larger frames, or you'll get fragmentation or drops.
Many network links support only 1500-byte MTUs by default, so without adjustments a VXLAN packet carrying a full-sized VLAN frame would exceed that. Even where modern networks run jumbo frames (~9000 bytes), an inner frame above ~8950 bytes no longer fits after encapsulation, causing the same class of problems: control-plane failures (e.g., BGP session teardown) or fragmentation of data packets leading to out-of-order delivery.

Solution: Either raise the MTU on the underlay network or enforce a lower MTU on the overlay. Network architects generally prefer to increase the IP MTU of the core so the overlay can carry standard 1500-byte Ethernet frames unfragmented. For example, one vendor's guide recommends configuring at least a 1550-byte MTU on all network segments to account for VXLAN's ~50B overhead. In enterprise environments, it's common to use "baby jumbo" frames (e.g., 1600 bytes) or full jumbo (9000 bytes) in the datacenter/WAN to accommodate various tunneling overheads. If increasing the underlay MTU is not possible (say, over an ISP that only supports 1500), then the VLAN MTU on the overlay should be reduced – for instance, set the VLAN interface MTU to 1450 bytes, so that even with the 50B VXLAN overhead the outer packet remains 1500 bytes. This prevents any IP fragmentation.

Why fragmentation is undesirable: VXLAN itself doesn't include any fragmentation mechanism; it relies on the underlay IP layer to fragment if needed. But IP fragmentation harms performance, and some devices or drop policies simply discard oversized VXLAN packets instead of fragmenting. In fact, certain implementations don't support VXLAN fragmentation or Path MTU Discovery on tunnels. The safe approach is to ensure no encapsulated packet ever exceeds the physical MTU. That means planning your MTUs end-to-end: make the core links slightly larger than the largest expected overlay packet.
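To make the MTU planning concrete, here is the IPv4 header arithmetic behind the ~50-byte figure used throughout this section:

```shell
# VXLAN IPv4 encapsulation overhead, byte by byte
OUTER_ETH=14   # outer Ethernet header
OUTER_IP=20    # outer IPv4 header
OUTER_UDP=8    # outer UDP header
VXLAN_HDR=8    # VXLAN header
OVERHEAD=$(( OUTER_ETH + OUTER_IP + OUTER_UDP + VXLAN_HDR ))   # = 50 bytes

# With a fixed 1500-byte underlay MTU, derive the safe overlay/VLAN MTU
UNDERLAY_MTU=1500
OVERLAY_MTU=$(( UNDERLAY_MTU - OVERHEAD ))                     # = 1450 bytes
echo "overhead=${OVERHEAD}B, safe overlay MTU=${OVERLAY_MTU}B"

# To verify the path, a DF-bit ping of this size from an overlay host would be:
#   ping -M do -s $(( OVERLAY_MTU - 28 )) <remote-overlay-ip>
# (28 bytes = IP + ICMP headers; the command is illustrative, not executed here)
```

The same subtraction applies to IPv6 underlays with ~70 bytes of overhead, or to jumbo-frame cores with a 9000-byte underlay MTU.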
Diagram: VXLAN encapsulation and MTU layering – The figure below illustrates the components of a VXLAN-encapsulated frame and how they contribute to packet size. The original Ethernet frame (yellow) with a VLAN tag is wrapped with new outer Ethernet, IP, UDP, and VXLAN headers. The extra headers add ~50 bytes. If the inner (yellow) frame was, say, 1500 bytes of payload plus 18 bytes of Ethernet overhead, the outer packet becomes ~1568 bytes (including new headers and FCS). In practice the old FCS is replaced by a new one, so the net growth is ~50 bytes. The key takeaway: the IP transport must handle the total size.

Figure 2: Layered view of a VXLAN-encapsulated packet (not to scale). The original Ethernet frame with VLAN tag (yellow) is encapsulated by outer headers (blue/green/red/gray), resulting in ~50 bytes of overhead for IPv4. The outer packet must fit within the WAN MTU (e.g., 1518B if the inner frame is 1468B) to avoid fragmentation.

In summary, ensure the IP underlay's MTU is configured to accommodate the VXLAN overhead. If using standard 1500-byte MTUs on the WAN, set your overlay interfaces (VLAN SVIs or bridge MTUs) to around 1450 bytes. Where possible, raising the WAN MTU to 1600 or using jumbo frames throughout is the best practice, providing ample headroom. Always test your end-to-end path with ping sweeps (e.g., using the DF bit and varying sizes) to verify that encapsulated packets aren't being dropped due to MTU limits.

Neighbor failure detection with BFD

One challenge with overlays like VXLAN is that the logical link lacks immediate visibility into physical link status. If one end of the VXLAN tunnel goes down or the path fails, the other end's VXLAN interface may remain "up" (since its own underlay interface is still up), potentially blackholing traffic until higher-level protocols notice. VXLAN itself doesn't send continuous "link alive" messages to check the remote VTEP's reachability.
To address this, network engineers deploy BFD on VXLAN endpoints. BFD is a lightweight protocol designed specifically for rapid failure detection, independent of media or routing protocol. It works by having two endpoints periodically send small, fast hello packets to each other (often every 50ms or less). If a few consecutive hellos are missed, BFD declares the peer down – often within <1 second, versus several seconds (or tens of seconds) with conventional detection.

Applying BFD to VXLAN: Many router and switch vendors support running BFD over a VXLAN tunnel or on the VTEPs' loopback adjacencies. When enabled, the two VTEPs continuously ping each other at the configured interval. If the VXLAN tunnel fails (e.g., one site loses connectivity), BFD on the surviving side quickly detects the loss of response. This can then trigger corrective actions: for instance, BFD can generate logs for the logical interface or notify the routing protocol to withdraw routes via that tunnel. In designs with redundant tunnels or redundant VTEPs, BFD helps achieve sub-second failover – traffic can switch to a backup VXLAN tunnel almost immediately upon a primary failure. Even in a single-tunnel scenario, BFD gives an early alert to the network operator or applications that the link is down, rather than quietly dropping packets.

Example: If Site A and Site B have two VXLAN tunnels (primary and backup) connecting them, running BFD on each tunnel interface means that if the primary's path goes down, BFD at Site A and B will detect it within milliseconds and inform the routing control plane. The network can then shift traffic to the backup tunnel right away. Without BFD, the network might have to wait for a timeout (e.g., an OSPF dead interval or even ARP timeouts) to realize the primary tunnel is dead, causing a noticeable outage. BFD is protocol-agnostic – it can integrate with any routing protocol.
For VXLAN, it's purely a monitoring mechanism: lightweight, with minimal overhead on the tunnel. Its messages are small UDP packets (on ports 3784/3785) that can be sourced from the VTEP's IP. The frequency is configurable based on how fast you need detection versus how much overhead you can afford; common timers are 300ms with a 3x multiplier (detection in ~1s) for moderate speeds, or even 50ms with 3x (150ms detection) for high-speed failover requirements.

Bottom line: Implementing BFD dramatically improves the reliability of a VXLAN-based L2 extension. Since VXLAN tunnels don't automatically signal when a neighbor is unreachable, BFD acts as the heartbeat. Many platforms even allow BFD to directly influence interface state (for example, the VXLAN interface can be configured to go down when BFD fails) so that any higher-level protocols (like VRRP, dynamic routing, etc.) immediately react to the loss. This prevents lengthy outages and ensures the overlay network remains robust even over a complex WAN.

Conclusion

Deploying a Layer-2 VXLAN overlay across a Layer-3 WAN unlocks powerful capabilities: you can keep using familiar VLAN-based segmentation across sites while taking advantage of an IP network's scalability and resilience. It's a vendor-neutral solution widely supported in modern networking gear. By preserving VLAN tags over the tunnel, VXLAN makes it possible to stretch subnets and broadcast domains to remote locations for workloads that require Layer-2 adjacency. With the huge VNI address space, segmentation can scale for large enterprises or cloud providers well beyond traditional VLAN limits. However, to realize these benefits, careful attention must be paid to MTU and link monitoring. Always accommodate the ~50-byte VXLAN overhead by configuring proper MTUs (or adjusting the overlay's MTU) – this prevents fragmentation and packet loss that can be very hard to troubleshoot after deployment.
And since a VXLAN tunnel's health isn't apparent to switches and hosts by default, use tools like BFD to add fast failure detection, thereby avoiding black holes and improving convergence times. In doing so, you ensure that your stretched network is not only functional but also resilient and performant. By following these guidelines – leveraging VXLAN for flexible L2 overlays, minding the MTU, and bolstering with BFD – network engineers can build a robust, wide-area Layer-2 extension that behaves nearly indistinguishably from a local LAN, yet rides on the efficiency and reliability of a Layer-3 IP backbone. Enjoy the best of both worlds: VLANs without borders, and an IP network without unnecessary constraints.

References: VXLAN technical overviews and best practices from vendor documentation and industry sources have been used to ensure accuracy in the above explanations. This keeps the blog grounded in real-world, proven knowledge while remaining vendor-neutral and applicable to a broad audience of cloud and network professionals.

Cut the Noise & Cost with Container Network Metrics Filtering in ACNS for AKS
We're excited to announce that Container Network Metrics Filtering in Azure Container Networking Services (ACNS) for Azure Kubernetes Service (AKS) is now in Public Preview! This capability transforms how you manage network observability in Kubernetes clusters by giving you control over what metrics matter most.

Why Excessive Metrics Are a Problem (And How We're Fixing It)

In today's large-scale, microservices-driven environments, teams often face metrics bloat: collecting far more data than they need. The result?

- High Storage & Ingestion Costs: Paying for data you'll never use.
- Cluttered Dashboards: Hunting for critical latency spikes in a sea of irrelevant pod restarts.
- Operational Overhead: Slower queries, higher maintenance, and fatigue.

Our new filtering capability solves this by letting you define precise filters at the pod level using standard Kubernetes Custom Resources. You collect only what matters, before it ever reaches your monitoring stack.

Key Benefits: Signal Over Noise

- Fine-Grained Control: Filter by namespace or pod label. Target critical services and ignore noise.
- Cost Optimization: Reduce ingestion costs for Prometheus, Grafana, and other tools.
- Improved Observability: Cleaner dashboards and faster troubleshooting with relevant metrics only.
- Dynamic & Zero-Downtime: Apply or update filters without restarting Cilium agents or Prometheus.

How It Works: Filtering at the Source

Unlike traditional sampling or post-processing, filtering happens at the Cilium agent level—inside the kernel's data plane. You define filters using the ContainerNetworkMetric Custom Resource to include or exclude metrics such as:

- DNS lookups
- TCP connection metrics
- Flow metrics
- Drop (error) metrics

This reduces data volume before metrics leave the host, ensuring your observability tools receive only curated, high-value data.
Example: Filtering Flow Metrics to Reduce Noise

Here's a sample ContainerNetworkMetric CRD that keeps only dropped flows from the traffic/http pods and excludes flows from the traffic/fortio pods:

apiVersion: acn.azure.com/v1alpha1
kind: ContainerNetworkMetric
metadata:
  name: container-network-metric
spec:
  filters:
    - metric: flow
      includeFilters: # Include only DROPPED flows from traffic namespace
        verdict:
          - "dropped"
        from:
          namespacedPod:
            - "traffic/http"
      excludeFilters: # Exclude traffic/fortio flows to reduce noise
        from:
          namespacedPod:
            - "traffic/fortio"

Figure: Flow metrics before filtering vs. after applying filters.

Getting Started Today

Ready to simplify your network observability?

1. Enable ACNS: Make sure ACNS is enabled on your AKS cluster.
2. Define Your Filter: Apply the ContainerNetworkMetric CRD with your include/exclude rules.
3. Validate: Check your settings via ConfigMap and Cilium agent logs.
4. See the Impact: Watch ingestion costs drop and dashboards become clearer!

👉 Learn more in the Metrics Filtering Guide. Try the Public Preview today and take control of your container network metrics.

Layer 7 Network Policies for AKS: Now Generally Available for Production Security and Observability!
We are thrilled to announce that Layer 7 (L7) Network Policies for Azure Kubernetes Service (AKS), powered by Cilium and Advanced Container Networking Services (ACNS), has reached General Availability (GA)! The journey from public preview to GA signifies a critical step: L7 Network Policies are now fully supported, highly optimized, and ready for your most demanding, mission-critical production workloads.

A Practical Example: Securing a Multi-Tier Retail Application

Let's walk through a common production scenario. Imagine a standard retail application running on AKS with three core microservices:

- frontend-app: Handles user traffic and displays product information.
- inventory-api: A backend service that provides product stock levels. It should be read-only for the frontend.
- payment-gateway: A highly sensitive service that processes transactions. It should only accept POST requests from the frontend to a specific endpoint.

The Security Challenge: A traditional L4 policy would allow the frontend-app to talk to the inventory-api on its port, but it couldn't prevent a compromised frontend pod from trying to exploit a potential vulnerability by sending a DELETE or POST request to modify inventory data.

The L7 Policy Solution: With GA L7 policies, you can enforce the Principle of Least Privilege at the application layer. Here's how you would protect the inventory-api:

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: protect-inventory-api
spec:
  endpointSelector:
    matchLabels:
      app: inventory-api
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend-app
      toPorts:
        - ports:
            - port: "8080" # The application port
              protocol: TCP
          rules:
            http:
              - method: "GET" # ONLY allow the GET method
                path: "/api/inventory/.*" # For paths under /api/inventory/

The Outcome:

- Allowed: A legitimate request from the frontend-app (GET /api/inventory/item123) is seamlessly forwarded.
- Blocked: If the frontend-app is compromised, any malicious request originating from it (such as DELETE /api/inventory/item123) is blocked at the network layer.

This Zero Trust approach protects the inventory-api service from the threat, regardless of the security state of the source service. The same principle can be applied to protect the payment-gateway, ensuring it only accepts POST requests to the /process-payment endpoint, and nothing else.

Beyond L7: Supporting Zero Trust with Enhanced Security

In addition to L7 application-level policies, Zero Trust is supported through Layer 3/4 network security and advanced egress controls such as Fully Qualified Domain Name (FQDN) filtering. This comprehensive approach allows administrators to:

- Restrict Outbound Connections (L3/L4 & FQDN): Implement strict egress control by ensuring that workloads can only communicate with approved external services. FQDN filtering is crucial here, allowing pods to connect exclusively to trusted external domains (e.g., www.trusted-partner.com), significantly reducing the risk of data exfiltration and maintaining compliance. To learn more, visit the FQDN Filtering Overview.
- Enforce Uniform Policy Across the Cluster (CCNP): Extend protections beyond individual namespaces. By defining security measures as a Cilium Clusterwide Network Policy (CCNP), now generally available, administrators can ensure uniform policy enforcement across multiple namespaces or the entire Kubernetes cluster, simplifying management and strengthening the overall security posture of all workloads. To learn more, see the Cilium Clusterwide Network Policy documentation.

Example: L4 Egress Policy with FQDN Filtering

This policy ensures that all pods across the cluster (CiliumClusterwideNetworkPolicy) are only allowed to establish outbound connections to the domain *.example.com on the standard web ports (80 and 443).
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: allow-egress-to-example-com
spec:
  endpointSelector: {} # Applies to all pods in the cluster
  egress:
    - toFQDNs:
        - matchPattern: "*.example.com" # Allows access to any subdomain of example.com
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP
            - port: "80"
              protocol: TCP

Operational Excellence: Observability You Can Trust

A secure system must be observable. With GA, the integrated visibility of your L7 traffic is production-ready. In the example above, the blocked DELETE request isn't silent. It is immediately visible in your Azure Managed Grafana dashboards as a "Dropped" flow, attributed directly to the protect-inventory-api policy. This makes security incidents auditable and easy to diagnose, enabling operations teams to detect misconfigurations or threats in real time. Below is a sample dashboard layout screenshot.

Next Steps: Upgrade and Secure Your Production!

We encourage you to enable L7 Network Policies on your AKS clusters and level up your network security controls for containerized workloads. We value your feedback as we continue to develop and improve this feature. Please refer to the Layer 7 Policy Overview for more information and visit How to Apply L7 Policy for an example scenario.

Introducing eBPF Host Routing: High performance AI networking with Azure CNI powered by Cilium
AI-driven applications demand low-latency workloads for an optimal user experience. To meet this need, services are moving to containerized environments, with Kubernetes as the standard. Kubernetes networking relies on the Container Network Interface (CNI) for pod connectivity and routing. Traditional CNI implementations use iptables for packet processing, adding latency and reducing throughput. Azure CNI powered by Cilium natively integrates the Azure Kubernetes Service (AKS) data plane with Azure CNI networking modes for superior performance, hardware offload support, and enterprise-grade reliability. Azure CNI powered by Cilium delivers up to 30% higher throughput in both benchmark and real-world customer tests compared to a bring-your-own Cilium setup on AKS.

The next leap forward: AKS data plane performance can now be optimized even further with eBPF host routing, an open-source Cilium CNI capability that accelerates packet forwarding by executing routing logic directly in eBPF. As shown in the figure, this architecture eliminates reliance on iptables and connection tracking (conntrack) within the host network namespace, significantly improving packet processing efficiency, reducing CPU overhead, and optimizing performance for modern workloads.

Figure: Comparison of host routing using the Linux kernel stack vs. eBPF.

Azure CNI powered by Cilium is battle-tested for mission-critical workloads, backed by Microsoft support, and enriched with Advanced Container Networking Services features for security, observability, and accelerated performance. eBPF host routing is now included in the Advanced Container Networking Services suite, delivering network performance acceleration. In this blog, we highlight the performance benefits of eBPF host routing, explain how to enable it in an AKS cluster, and provide a deep dive into its implementation on Azure. We start by examining AKS cluster performance before and after enabling eBPF host routing.
Performance comparison

Our comparative benchmarks measure the difference in Azure CNI Powered by Cilium with eBPF host routing enabled versus disabled. For these measurements, we use AKS clusters on Kubernetes version 1.33, with 16-core host nodes running Ubuntu 24.04. We are interested in throughput and latency numbers for pod-to-pod traffic in these clusters.

For throughput measurements, we deploy netperf client and server pods and measure TCP_STREAM throughput at varying message sizes in tests running 20 seconds each. The wide range of message sizes is meant to capture the variety of workloads running on AKS clusters, ranging from AI training and inference to messaging systems and media streaming. For latency, we run TCP_RR tests, measuring latency at various percentiles as well as transaction rates.

The following figure compares pods on the same node; eBPF-based routing results in a dramatic improvement in throughput (~30%). This is because, on the same node, throughput is not constrained by factors such as the VM NIC limits and is almost entirely determined by host routing performance.

The next comparison covers pod-to-pod throughput across different nodes in the cluster. eBPF host routing results in better pod-to-pod throughput across nodes, and the difference is more pronounced with smaller message sizes (up to 3x). This is because, with smaller messages, the per-message overhead incurred in the host network stack has a bigger impact on performance.

Next, we compare latency for pod-to-pod traffic. We limit this benchmark to intra-node traffic, because cross-node traffic latency is determined by factors other than the routing latency incurred in the hosts. eBPF host routing results in reduced latency compared to the non-accelerated configuration at all measured percentiles. We have also measured the transaction rate between client and server pods, with and without eBPF host routing.
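For reference, the TCP_STREAM and TCP_RR tests described in this section correspond to netperf invocations roughly like the following (the server pod's IP address is a hypothetical placeholder):

```shell
# Throughput: 20-second TCP_STREAM run at a given message size
netperf -H 10.244.1.5 -t TCP_STREAM -l 20 -- -m 16384

# Latency / transactions per second: TCP_RR issues small request/response pairs
netperf -H 10.244.1.5 -t TCP_RR -l 20
```

Repeating the TCP_STREAM run across a sweep of `-m` message sizes reproduces the throughput curves discussed above.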
This benchmark is an alternative measurement of latency, because a transaction is essentially a small TCP request/response pair. We observe that eBPF host routing improves transactions per second by around 27% compared to legacy host routing.

Transactions/second (same node)

| Azure CNI configuration | Transactions/second |
|---|---|
| eBPF host routing | 20396.9 |
| Traditional host routing | 16003.7 |

Enabling eBPF host routing through Advanced Container Networking Services

eBPF host routing is disabled by default in Advanced Container Networking Services because bypassing iptables in the host network namespace can ignore custom user rules and host-level security policies. This can lead to visible failures such as dropped traffic or broken network policies, as well as silent issues like unintended access or missed audit logs. To mitigate these risks, eBPF host routing is offered as an opt-in feature, enabled through Advanced Container Networking Services on Azure CNI powered by Cilium.

The Advanced Container Networking Services advantage: built-in safeguards. Enabling eBPF host routing in ACNS enhances the open-source offering with strong built-in safeguards. Before activation, ACNS validates existing iptables rules in the host network namespace and blocks enablement if user-defined rules are detected. Once enabled, kernel-level protections prevent new iptables rules and generate Kubernetes events for visibility. These measures allow customers to benefit from eBPF's performance gains while maintaining security and reliability. Thanks to these safeguards, eBPF host routing in Advanced Container Networking Services is a safer and more robust option for customers who want the best possible networking performance on their Kubernetes infrastructure.

How to enable eBPF host routing with ACNS

Visit the documentation on how to enable eBPF host routing for new and existing Azure CNI powered by Cilium clusters.
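To get a feel for the pre-activation check described above, you can count user-defined rules in a node's host network namespace yourself. A rough sketch, run on the node (e.g. via `kubectl debug node/<node>`); the chain-name filter below is an illustration, not the exact validation ACNS performs:

```shell
#!/bin/sh
# Count iptables rules outside well-known Kubernetes/Cilium-managed chains.
# The exclusion pattern is illustrative only; reports 0 when iptables-save
# is unavailable (e.g. when run off-node).
RULES=$( { iptables-save 2>/dev/null || true; } \
  | grep '^-A' \
  | grep -cvE 'KUBE-|CILIUM_|IP-MASQ' ) || true
echo "user-defined iptables rules found: ${RULES:-0}"
```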
Verify the network profile with the new performance `accelerationMode` field set to `BpfVeth`:

```json
"networkProfile": {
  "advancedNetworking": {
    "enabled": true,
    "performance": {
      "accelerationMode": "BpfVeth"
    },
    …
```

For more information on Advanced Container Networking Services and ACNS performance, please visit https://aka.ms/acnsperformance.

Resources

For more info about Advanced Container Networking Services, please visit Container Network Security with Advanced Container Networking Services (ACNS) - Azure Kubernetes Service | Microsoft Learn. For more info about Azure CNI Powered by Cilium, please visit Configure Azure CNI Powered by Cilium in Azure Kubernetes Service (AKS) - Azure Kubernetes Service | Microsoft Learn.
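The same field can be pulled out of the cluster's JSON programmatically. A small sketch against an inlined sample of the profile above (in practice the JSON would come from `az aks show -g <rg> -n <name> -o json`); the jq path mirrors the structure shown, with a sed fallback for environments without jq:

```shell
#!/bin/sh
set -eu
# Sample slice of `az aks show -o json` output (inlined for illustration).
PROFILE='{"networkProfile":{"advancedNetworking":{"enabled":true,"performance":{"accelerationMode":"BpfVeth"}}}}'
# Prefer jq; fall back to sed where jq is unavailable.
MODE=$(printf '%s' "$PROFILE" \
  | jq -r '.networkProfile.advancedNetworking.performance.accelerationMode' 2>/dev/null) \
  || MODE=$(printf '%s' "$PROFILE" \
  | sed -n 's/.*"accelerationMode":"\([^"]*\)".*/\1/p')
echo "accelerationMode: $MODE"
```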
Introduction

Azure Application Gateway Web Application Firewall (WAF) now supports custom HTTP status codes and custom response bodies for blocked requests. This Public Preview feature gives you more control over user experience and client-side handling, aligning with capabilities already available on Azure Front Door WAF.

Why this matters

Previously, WAF returned a fixed 403 response with a generic message. Now you can:
- Set a custom status code (e.g., 403, 429) to match your app logic.
- Provide a custom response body (e.g., a friendly error page or troubleshooting steps).
- Ensure consistency across all blocked requests under the WAF policy.

This feature improves user experience (UX), helps with compliance, and simplifies troubleshooting.

Key capabilities

- Custom status codes: allowed values are 200, 403, 405, 406, 429, and 990–999.
- Custom response body: up to 32 KB, base64-encoded for ARM/REST.
- Policy-level setting: applies to all blocked requests under that WAF policy.
- Limit: up to 20 WAF policies with a custom block response per Application Gateway.

Configure in the Azure portal

Follow these steps:
1. Sign in to the Azure portal (https://portal.azure.com).
2. Navigate to the WAF policy linked to your Application Gateway.
3. Under Settings, select Policy settings.
4. In the Custom block response section, set:
   - Block response status code: choose from the allowed values (e.g., 403 or 429).
   - Block response body: enter your custom message (plain text or HTML).
5. Save the policy.
6. Apply the policy to your Application Gateway if it is not already associated.
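Since ARM/REST expects the body base64-encoded and capped at 32 KB, it helps to prepare and sanity-check the page up front. A hedged sketch (the file name and page content are placeholders; `base64 -w0` is the GNU flag for disabling line wrapping):

```shell
#!/bin/sh
set -eu
# Placeholder custom block page.
printf '<html><body><h1>Request blocked</h1><p>Contact support if unexpected.</p></body></html>' \
  > custompage.html
# Enforce the 32 KB raw-body limit before encoding.
SIZE=$(wc -c < custompage.html)
[ "$SIZE" -le 32768 ] || { echo "body too large: ${SIZE} bytes" >&2; exit 1; }
# Encode without line wrapping, as ARM expects a single-line value.
BODY=$(base64 -w0 custompage.html)
echo "encoded ${SIZE} raw bytes -> ${#BODY} base64 chars"
```

The resulting `$BODY` value is what you would pass as the custom block response body in ARM/REST calls.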
Configure via CLI

```shell
# base64 -w0 (GNU) disables line wrapping, which would otherwise
# corrupt the single-line property value.
az network application-gateway waf-policy update \
  --name MyWafPolicy \
  --resource-group MyRG \
  --custom-block-response-status-code 429 \
  --custom-block-response-body "$(base64 -w0 custompage.html)"
```

Configure via PowerShell

```powershell
# Read the page as bytes and base64-encode it (works in both
# Windows PowerShell and PowerShell 7+).
$bytes = [System.IO.File]::ReadAllBytes("custompage.html")
$body  = [System.Convert]::ToBase64String($bytes)

Set-AzApplicationGatewayFirewallPolicy `
  -Name MyWafPolicy `
  -ResourceGroupName MyRG `
  -CustomBlockResponseStatusCode 429 `
  -CustomBlockResponseBody $body
```

Tip: for ARM/REST, the body must be base64-encoded.

Best practices

- Use meaningful status codes (e.g., 429 for rate limiting).
- Keep the response body lightweight and informative.
- Test thoroughly to ensure downstream systems handle custom status codes correctly.

Resources

Configure Custom Response code
Learn more about Application Gateway WAF