Azure Networking
Network Redundancy from On-Premises to Azure VMware Solution and VNets
Establishing redundant network connectivity is vital to ensuring the availability, reliability, and performance of workloads operating in hybrid and cloud environments. Proper planning and implementation of network redundancy are key to achieving high availability and sustaining operational continuity. This guide presents common architectural patterns for building redundant connectivity between on-premises datacenters, Azure Virtual Networks (VNets), and Azure VMware Solution (AVS) within a single-region deployment. AVS allows organizations to run VMware-based workloads directly on Azure infrastructure, offering a streamlined path for migrating existing VMware environments to the cloud without significant re-architecture or modification.

Connectivity Between AVS, On-Premises, and Virtual Networks
The diagram below illustrates a common network design pattern using either a Hub-and-Spoke or Virtual WAN (vWAN) topology, deployed within a single Azure region. ExpressRoute is used to establish connectivity between on-premises environments and VNets. The same ExpressRoute circuit is extended to connect AVS to the on-premises infrastructure through ExpressRoute Global Reach.
Consideration: This design presents a single point of failure. If the ExpressRoute circuit (ER1-EastUS) experiences an outage, connectivity between the on-premises environment, VNets, and AVS will be disrupted. Let's examine some solutions for establishing redundant connectivity in case ER1-EastUS experiences an outage.

Solution 1: Network Redundancy via VPN
In this solution, one or more VPN tunnels are deployed as a backup to ExpressRoute. If ExpressRoute becomes unavailable, the VPN provides an alternative connectivity path from the on-premises environment to VNets and AVS. In a Hub-and-Spoke topology, a backup path to and from AVS can be established by deploying Azure Route Server (ARS) in the hub VNet. ARS enables seamless transit routing between ExpressRoute and the VPN gateway. In a vWAN topology, ARS is not required: the vHub's built-in routing service automatically provides transitive routing between the VPN gateway and ExpressRoute.
Note: In a vWAN topology, to ensure optimal route convergence when failing back to ExpressRoute, you should prepend the prefixes advertised from on-premises over the VPN. Without route prepending, VNets may continue to use the VPN as the primary path to on-premises. If prepending is not an option, you can trigger the failover manually by bouncing the VPN tunnel.
Solution Insights:
Cost-effective and straightforward to deploy.
Latency: The VPN tunnel introduces additional latency due to its reliance on the public internet and the overhead associated with encryption.
Bandwidth Considerations: Multiple VPN tunnels might be needed to achieve bandwidth comparable to a high-capacity ExpressRoute circuit (e.g., over 10G). For details on VPN gateway SKU and tunnel throughput, refer to this link.

Solution 2: Network Redundancy via SD-WAN
In this solution, SD-WAN tunnels are deployed as a backup to ExpressRoute. If ExpressRoute becomes unavailable, the SD-WAN provides an alternative connectivity path from the on-premises environment to VNets and AVS. In a Hub-and-Spoke topology, a backup path to and from AVS can be established by deploying ARS in the hub VNet. ARS enables seamless transit routing between ExpressRoute and the SD-WAN appliance. In a vWAN topology, ARS is not required: the vHub's built-in routing service automatically provides transitive routing between the SD-WAN and ExpressRoute.
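In both Solution 1 and Solution 2, the Hub-and-Spoke variant hinges on Azure Route Server providing transit between the ExpressRoute gateway and the VPN or SD-WAN path. As a rough, hedged sketch only (the resource group, VNet, and address prefix below are assumptions, not part of the design above, and flags should be verified against the current CLI reference), deploying ARS in an existing hub VNet could look like this:
# Dedicated subnet for Azure Route Server (must be named RouteServerSubnet, /27 or larger)
az network vnet subnet create --resource-group hub-rg --vnet-name hub-vnet --name RouteServerSubnet --address-prefixes 10.0.255.0/27
az network public-ip create --resource-group hub-rg --name ars-pip --sku Standard --version IPv4
az network routeserver create --resource-group hub-rg --name hub-ars --hosted-subnet <RouteServerSubnet-resource-id> --public-ip-address ars-pip
# Branch-to-branch must be enabled so routes learned from the VPN/SD-WAN side are exchanged with ExpressRoute
az network routeserver update --resource-group hub-rg --name hub-ars --allow-b2b-traffic true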
Note: In a vWAN topology, to ensure optimal convergence from the VNets when failing back to ExpressRoute, prepend the prefixes advertised from on-premises over the SD-WAN so that their AS path is longer than that of the ExpressRoute-learned routes. If you don't prepend, VNets will continue to use the SD-WAN path to on-premises as the primary path. In this design using Azure vWAN, the SD-WAN can be deployed either within the vHub or in a spoke VNet connected to the vHub; the same principle applies in both cases.
Solution Insights:
If you have an existing SD-WAN, no additional deployment is needed.
Bandwidth Considerations: Vendor-specific.
Management Considerations: Third-party dependency and the need to manage the HA deployment, except for SD-WAN SaaS solutions.

Solution 3: Network Redundancy via ExpressRoute in Different Peering Locations
Deploy an additional ExpressRoute circuit at a different peering location and enable Global Reach between ER2-peeringlocation2 and AVS-ER. To use this circuit as a backup path, prepend the on-premises prefixes on the second circuit; otherwise, AVS or the VNets will perform Equal-Cost Multi-Path (ECMP) routing across both circuits to on-premises.
Note: Use a public ASN to prepend the on-premises prefixes, as AVS-ER will strip private ASNs toward AVS. Refer to this link for more details.
Solution Insights:
Ideal for mission-critical applications, providing predictable throughput and bandwidth for backup.
Could be cost-prohibitive depending on the bandwidth of the second circuit.

Solution 4: Network Redundancy via Metro ExpressRoute
Metro ExpressRoute is a new ExpressRoute configuration. It provides a dual-homed setup with diverse connections to two distinct ExpressRoute peering locations within a city, offering enhanced resiliency if one peering location experiences an outage.
Solution Insights:
Higher Resiliency: Provides increased reliability with a single circuit.
Limited regional availability: Availability is restricted to specific regions within the metro area.
Cost-effective.

Conclusion: The choice of failover connectivity should be guided by the specific latency and bandwidth requirements of your workloads. Ultimately, achieving high availability and ensuring continuous operations depend on careful planning and effective implementation of network redundancy strategies.

Azure CNI now supports Node Subnet IPAM mode with Cilium Dataplane
Azure CNI Powered by Cilium is a high-performance data plane leveraging extended Berkeley Packet Filter (eBPF) technologies to enable features such as network policy enforcement, deep observability, and improved service routing. Legacy Azure CNI supports Node Subnet mode, where every pod gets an IP address from a given subnet. AKS clusters that require VNet IP addressing mode (non-overlay scenarios) are typically advised to use Pod Subnet mode. However, AKS clusters that do not face the risk of IP exhaustion can continue to use Node Subnet mode for legacy reasons and switch the CNI dataplane to take advantage of Cilium's features. With this feature launch, we are providing that migration path!
Users often leverage Node Subnet mode in Azure Kubernetes Service (AKS) clusters for ease of use: it suits users who do not want to manage multiple subnets, especially in smaller clusters. Let's highlight some additional benefits unlocked by this feature.
Improved Network Debugging Capabilities through Advanced Container Networking Services
By upgrading to Azure CNI Powered by Cilium with Node Subnet, Advanced Container Networking Services opens the possibility of using eBPF tools to gather request metrics at the node and pod level. Advanced Observability tools provide a managed Grafana dashboard to inspect these metrics for a streamlined incident response experience.
Advanced Network Policies
Network policies with the legacy CNI present a challenge because IP-based filtering rules require constant updating in a Kubernetes cluster where pod IP addresses frequently change. Enabling the Cilium data plane offers an efficient and scalable approach to managing network policies.
Create an Azure CNI Powered by Cilium cluster with Node Subnet as the IP Address Management (IPAM) networking model. This is the default option with the `--network-plugin azure` flag.
az aks create --name <clusterName> --resource-group <resourceGroupName> --location <location> --network-plugin azure --network-dataplane cilium --generate-ssh-keys
A flat network can lead to less efficient use of IP addresses. Careful planning through the List Usage operation on a given VNet helps to see current usage of the subnet space. AKS creates a VNet and subnet automatically at cluster creation. Note that the resource group for this VNet is generated based on the resource group for the cluster, the cluster name, and the location. From the portal, under Settings > Networking for the AKS cluster, we can see the names of the resources created automatically.
az rest --method get \
  --url https://management.azure.com/subscriptions/{subscription-id}/resourceGroups/MC_acn-pm_node-subnet-test_westus2/providers/Microsoft.Network/virtualNetworks/aks-vnet-34761072/usages?api-version=2024-05-01
{
  "value": [
    {
      "currentValue": 87,
      "id": "/subscriptions/9b8218f9-902a-4d20-a65c-e98acec5362f/resourceGroups/MC_acn-pm_node-subnet-test_westus2/providers/Microsoft.Network/virtualNetworks/aks-vnet-34761072/subnets/aks-subnet",
      "isAdjustable": false,
      "limit": 65531,
      "name": {
        "localizedValue": "Subnet size and usage",
        "value": "SubnetSpace"
      },
      "unit": "Count"
    }
  ]
}
To better understand this utilization, click the link of the virtual network and then open the list of Connected Devices. The view also shows which IPs are utilized on a given node. There are a total of 87 devices, consistent with the previous command-line output of subnet usage.
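If you prefer to script this check instead of using the portal, the same information can be approximated with standard Azure CLI commands. This is a hedged sketch only; it reuses the managed resource group and VNet names from the example above, which you would replace with your own:
# List the cluster VNet's subnets and their address prefixes
az network vnet show --resource-group MC_acn-pm_node-subnet-test_westus2 --name aks-vnet-34761072 --query "subnets[].{name:name, prefix:addressPrefix}" -o table
# Count the IP configurations attached in the managed resource group (a rough equivalent of the Connected Devices view)
az network nic list --resource-group MC_acn-pm_node-subnet-test_westus2 --query "length([].ipConfigurations[])" -o tsv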
Since the default creates three nodes with a max pod count of 30 per node (configurable up to 250), IP exhaustion is not a concern here, although careful planning is required for larger clusters. Next, we will enable Advanced Container Networking Services (ACNS) on this cluster.
az aks update --resource-group <resourceGroupName> --name <clusterName> --enable-acns
Create a default-deny Cilium network policy. The namespace is `default`, and we will use `app: server` as the label in this example.
kubectl apply -f - <<EOF
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: default-deny
  namespace: default
spec:
  endpointSelector:
    matchLabels:
      app: server
  ingress:
  - {}
  egress:
  - {}
EOF
The empty brackets under ingress and egress represent all traffic. Next, we will use `agnhost`, a network connectivity utility used in Kubernetes upstream testing that can help set up a client/server scenario.
kubectl run server --image=k8s.gcr.io/e2e-test-images/agnhost:2.41 --labels="app=server" --port=80 --command -- /agnhost serve-hostname --tcp --http=false --port "80"
Get the server pod's IP address:
kubectl get pod server -o wide
NAME     READY   STATUS    RESTARTS   AGE   IP            NODE                                NOMINATED NODE   READINESS GATES
server   1/1     Running   0          9m    10.224.0.57   aks-nodepool1-20832547-vmss000002   <none>           <none>
Create a client that will use the agnhost utility to test the network policy. Open a new terminal window, as this will also open a new shell.
kubectl run -it client --image=k8s.gcr.io/e2e-test-images/agnhost:2.41 --command -- bash
Test connectivity to the server from the client. A timeout is expected, since the network policy is default deny for all traffic in the default namespace. Your pod IP may differ from the example.
bash-5.0# ./agnhost connect 10.224.0.57:80 --timeout=3s --protocol=tcp --verbose
TIMEOUT
Remove the network policy. In practice, additional policies would be added to retain the default-deny policy while allowing connectivity for applications that satisfy the conditions (a sketch of such an allow rule appears at the end of this article).
kubectl delete cnp default-deny
From a shell in the client pod, verify the connection is now allowed. If successful, there is simply no output.
kubectl attach client -c client -i -t
bash-5.0# ./agnhost connect 10.224.0.57:80 --timeout=3s --protocol=tcp
Connectivity between server and client is restored. Additional tools such as Hubble UI for debugging can be found in Container Network Observability - Advanced Container Networking Services (ACNS) for Azure Kubernetes Service (AKS) - Azure Kubernetes Service | Microsoft Learn.
Conclusion
Building a seamless migration path is critical to the continued growth and adoption of Azure CNI Powered by Cilium (ACPC). The goal is to provide a best-in-class experience by offering an upgrade path to the Cilium data plane, enabling high-performance networking across various IP addressing modes. This gives you the flexibility to fit your IP address plans and build varied workload types using AKS networking. Keep an eye on the AKS public roadmap for more developments in the near future.
Resources
Learn more about Azure CNI Powered by Cilium.
Learn more about IP address planning.
Visit Azure CNI Powered by Cilium benchmarking to see performance benchmarks using an eBPF dataplane.
Learn more about Advanced Container Networking Services.
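Picking up the earlier note that, in practice, you would layer allow rules on top of the default-deny policy, here is a hedged sketch of one such rule for the demo above. It assumes the client pod carries the run=client label that kubectl run applies by default; adjust the selector if your client is labeled differently.
kubectl apply -f - <<EOF
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-client-to-server
  namespace: default
spec:
  endpointSelector:
    matchLabels:
      app: server
  ingress:
  - fromEndpoints:
    - matchLabels:
        run: client
    toPorts:
    - ports:
      - port: "80"
        protocol: TCP
EOF
With this applied alongside the default-deny policy, only the client pod can reach the server on TCP port 80; all other traffic to the server remains dropped.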
Azure virtual network terminal access point (TAP) public preview announcement
What is virtual network TAP?
Virtual network TAP allows customers to continuously stream virtual machine network traffic to a network packet collector or analytics tool. Many security and performance monitoring tools rely on packet-level insights that are difficult to access in cloud environments. Virtual network TAP bridges this gap by integrating with our industry partners to offer:
Enhanced security and threat detection: Security teams can inspect full packet data in real time to detect and respond to potential threats.
Performance monitoring and troubleshooting: Operations teams can analyze live traffic patterns to identify bottlenecks, troubleshoot latency issues, and optimize application performance.
Regulatory compliance: Organizations subject to compliance frameworks such as the Health Insurance Portability and Accountability Act (HIPAA) and the General Data Protection Regulation (GDPR) can use virtual network TAP to capture network activity for auditing and forensic investigations.
Why use virtual network TAP?
Unlike traditional packet capture solutions that require deploying additional agents or network appliances, virtual network TAP leverages Azure's native infrastructure to enable seamless traffic mirroring without complex configurations and without impacting the performance of the virtual machine. A key advantage is that mirrored traffic does not count towards the virtual machine's network limits, ensuring complete visibility without compromising application performance. Additionally, virtual network TAP supports all Azure virtual machine SKUs.
Deploying virtual network TAP
The portal is a convenient way to get started with Azure virtual network TAP. However, if you have a lot of Azure resources and want to automate the setup, you may want to use PowerShell, the CLI, or the REST API. Add a TAP configuration on a network interface that is attached to a virtual machine deployed in your virtual network. The destination is a virtual network IP address in the same virtual network as the monitored network interface or in a peered virtual network. The collector solution for virtual network TAP can be deployed behind an Azure internal load balancer for high availability. You can use the same virtual network TAP resource to aggregate traffic from multiple network interfaces in the same or different subscriptions. If the monitored network interfaces are in different subscriptions, the subscriptions must be associated with the same Microsoft Entra tenant. Additionally, the monitored network interfaces and the destination endpoint for aggregating the TAP traffic can be in peered virtual networks in the same region.
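As a rough sketch of the CLI path mentioned above, the two steps are to create the TAP resource and then attach a TAP configuration to each monitored NIC. The resource names here are placeholders and the exact parameter names may vary by CLI version, so treat this as an assumption to verify against the current az network reference:
# Create the virtual network TAP, pointing at the collector's IP configuration (or load balancer frontend) as the destination
az network vnet tap create --resource-group tap-rg --name vm-traffic-tap --destination <collector-ipconfig-or-lb-frontend-resource-id>
# Mirror a monitored NIC's traffic to the TAP
az network nic vtap-config create --resource-group tap-rg --nic-name monitored-vm-nic --vnet-tap vm-traffic-tap --name tap-config1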
Partnering with industry leaders to enhance network monitoring in Azure
To maximize the value of virtual network TAP, we are proud to collaborate with industry-leading security and network visibility partners. Our partners provide deep packet inspection, analytics, threat detection, and monitoring solutions that seamlessly integrate with virtual network TAP:
Network packet brokers (Partner | Product):
Gigamon | GigaVUE Cloud Suite for Azure
cPacket | cPacket Cloud Suite
Keysight | CloudLens
Security analytics, network/application performance management (Partner | Product):
Darktrace | Darktrace /NETWORK
Netscout | Omnis Cyber Intelligence NDR
Corelight | Corelight Open NDR Platform
LinkShadow | LinkShadow NDR
Fortinet | FortiNDR Cloud, FortiGate VM
TrendMicro | Trend Vision One™ Network Security
Extrahop | RevealX
Bitdefender | GravityZone Extended Detection and Response for Network
eSentire | eSentire MDR
Vectra | Vectra NDR
AttackFence | AttackFence NDR
Arista Networks | Arista NDR
See our partner blogs:
Streamline Traffic Mirroring in the Cloud with Azure Virtual Network Terminal Access Point (TAP) and Keysight Visibility | Keysight Blogs
eSentire | Unlocking New Possibilities for Network Monitoring and…
LinkShadow Unified Identity, Data, and Network Platform Integrated with Microsoft Virtual Network TAP
Extrahop and Microsoft Extend Coverage for Azure Workloads
Resources | Announcing cPacket Partnership with Azure virtual network terminal access point (TAP)
Gain Network Traffic Visibility with FortiGate and Azure virtual network TAP
Get started with virtual network TAP
To learn more and get started, visit our website. We look forward to seeing how you leverage virtual network TAP to enhance security, performance, and compliance in your cloud environment. Stay tuned for more updates as we continue to refine and expand on our feature set!

What's New in the World of eBPF from Azure Container Networking!
Azure Container Networking Interface (CNI) continues to evolve, now bolstered by the innovative capabilities of Cilium. Azure CNI Powered by Cilium (ACPC) leverages Cilium's extended Berkeley Packet Filter (eBPF) technologies to enable features such as network policy enforcement, deep observability, and improved service routing. Here's a deeper look into the latest features that make management of Azure Kubernetes Service (AKS) clusters more efficient, scalable, and secure.
Improved Performance: Cilium Endpoint Slices
One of the standout features in the recent updates is the introduction of CiliumEndpointSlice. This feature significantly enhances the performance and scalability of the Cilium dataplane in AKS clusters. Previously, Cilium used Custom Resource Definitions (CRDs) called CiliumEndpoints to manage pods. Each pod had a CiliumEndpoint associated with it, which contained information about the pod's status and properties. However, this approach placed significant stress on the control plane, especially in larger clusters. To alleviate this load, CiliumEndpointSlice batches CiliumEndpoints and their updates, reducing the number of updates propagated to the control plane. Our performance testing has shown remarkable improvements:
Average API Server Responsiveness: Up to 50% decrease in latency, meaning faster processing of queries.
Pod Startup Latencies: Up to 60% reduction, allowing for faster deployment and scaling.
In-Cluster Network Latency: Up to 80% decrease, translating to better application performance.
Note that this feature is Generally Available in AKS clusters by default with Cilium 1.17 and above, and it does not require additional configuration changes! Learn more about the improvements unlocked by CiliumEndpointSlices with Azure CNI by Cilium - High-Scale Kubernetes Networking with Azure CNI Powered by Cilium | Microsoft Community Hub.
Deployment Flexibility: Dual Stack for Cilium Network Policies
Kubernetes clusters operating on an IPv4/IPv6 dual-stack network enable workloads to natively access both IPv4 and IPv6 endpoints without incurring additional complexities or performance drawbacks. Previously, we had enabled dual-stack networking on AKS clusters (starting with AKS 1.29) running Azure CNI Powered by Cilium in preview mode. Now, we are happy to announce that the feature is Generally Available! By enabling both IPv4 and IPv6 addressing, you can manage your production AKS clusters in mixed environments, accommodating various network configurations seamlessly. More importantly, dual-stack support in Azure CNI's Cilium network policies extends security benefits to AKS clusters in those complex environments. For instance, you can enable dual-stack AKS clusters using the eBPF dataplane as follows:
az aks create \
  --location <region> \
  --resource-group <resourceGroupName> \
  --name <clusterName> \
  --network-plugin azure \
  --network-plugin-mode overlay \
  --network-dataplane cilium \
  --ip-families ipv4,ipv6 \
  --generate-ssh-keys
Learn more about Azure CNI's network policies - Configure Azure CNI Powered by Cilium in Azure Kubernetes Service (AKS) - Azure Kubernetes Service | Microsoft Learn.
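Because the Cilium policy engine is now dual-stack aware, a single rule can cover both address families. The snippet below is only an illustrative sketch with made-up labels and CIDRs (none of them come from this article); it shows an egress rule that allows traffic to an IPv4 range and an IPv6 range at the same time:
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-dual-stack-egress
spec:
  endpointSelector:
    matchLabels:
      app: web            # hypothetical workload label
  egress:
  - toCIDR:
    - 10.10.0.0/16        # example IPv4 range
    - fd00:10:10::/48     # example IPv6 range
As with any Cilium policy, applying an egress rule like this puts the selected pods into default-deny for egress traffic that is not explicitly allowed.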
Ease of Use: Node Subnet Mode with Cilium
Azure CNI now supports Node Subnet IPAM mode with the Cilium dataplane. In Node Subnet mode, IP addresses for pods are assigned from the same subnet as the node itself, simplifying routing and policy management. This mode is particularly beneficial for smaller clusters where managing multiple subnets is cumbersome. AKS clusters using this mode also gain the benefits of improved network observability, Cilium Network Policies, FQDN filtering, and other capabilities unlocked by Advanced Container Networking Services (ACNS). More notably, with this feature we now support all IPAM configuration options with the eBPF dataplane on AKS clusters. You can create an AKS cluster with Node Subnet IPAM mode and the eBPF dataplane as follows:
az aks create \
  --name <clusterName> \
  --resource-group <resourceGroupName> \
  --location <location> \
  --network-plugin azure \
  --network-dataplane cilium \
  --generate-ssh-keys
Learn more about Node Subnet - Configure Azure CNI Powered by Cilium in Azure Kubernetes Service (AKS) - Azure Kubernetes Service | Microsoft Learn.
Defense-in-depth: Cilium Layer 7 Policies
Azure CNI by Cilium extends its comprehensive Layer 4 network policy capabilities to Layer 7, offering granular control over application traffic. This feature enables users to define security policies based on application-level protocols and metadata, adding a powerful layer of security and compliance management. Layer 7 policies are implemented using Envoy, an open-source service proxy, which is part of the ACNS Security Agent operating in conjunction with the Cilium agent. Envoy handles traffic between services and provides the necessary visibility and control at the application layer. Policies can be enforced based on HTTP and gRPC methods, paths, headers, and other application-specific attributes. Additionally, Cilium Network Policies support Kafka-based workflows, enhancing security and traffic management. This feature is currently in public preview, and you can learn more about the getting-started experience here - Introducing Layer 7 Network Policies with Advanced Container Networking Services for AKS Clusters! | Microsoft Community Hub.
Coming Soon: Transparent Encryption with WireGuard
By leveraging Cilium's WireGuard support, customers can achieve regulatory compliance by ensuring that all network traffic, whether HTTP-based or not, is encrypted. Users can enable inter-node transparent encryption in their Kubernetes environments using Cilium's open-source-based solution. When WireGuard is enabled, the Cilium agent on each cluster node establishes a secure WireGuard tunnel with all other known nodes in the cluster to encrypt traffic between Cilium endpoints. This feature will soon be in public preview and will be enabled as part of ACNS. Stay tuned for more details.
Conclusion
These new features in Azure CNI Powered by Cilium underscore our commitment to enhancing default network performance and security in your AKS environments, all while collaborating with the open-source community. From the impressive performance boost with CiliumEndpointSlice to the adaptability of dual-stack support and the advanced security of Layer 7 policies and WireGuard-based encryption, these innovations ensure your AKS clusters are not just ready for today but are primed for the future. Also, don't forget to dive into the fascinating world of eBPF-based observability in multi-cloud environments! Check out our latest post - Retina: Bridging Kubernetes Observability and eBPF Across the Clouds. Why wait? Try these out now! Stay tuned to the AKS public roadmap for more exciting developments!
For additional information, visit the following resources:
For more info about Azure CNI Powered by Cilium, visit Configure Azure CNI Powered by Cilium in AKS.
For more info about ACNS, visit Advanced Container Networking Services (ACNS) for AKS | Microsoft Learn.

Deploying Virtual Networks Across Tenants Using Azure Virtual Network Manager IP Address Management
Learn how to deploy a virtual network in a managed Azure tenant using an IPAM pool from a central management tenant with Azure Virtual Network Manager. This guide helps central enterprise IT teams streamline IP allocation and prevent CIDR conflicts across multi-tenant environments.

Introducing Layer 7 Network Policies with Advanced Container Networking Services for AKS Clusters!
We have been on an exciting journey to enhance the network observability and security capabilities of Azure Kubernetes Service (AKS) clusters through our Azure Container Networking Services offering. Our initial step, the launch of Fully Qualified Domain Name (FQDN) filtering, was a foundational one, enabling policy-driven egress control. By allowing traffic management at the domain level, we set the stage for more advanced and fine-grained security capabilities that align with modern, distributed workloads. This was just the beginning, a glimpse into our commitment to providing AKS users with robust and granular security controls. Today, we are thrilled to announce the public preview of Layer 7 (L7) Network Policies for AKS and Azure CNI Powered by Cilium users with Advanced Container Networking Services enabled. This update brings a whole new dimension of security to your containerized environments, offering much more granular control over your application-layer traffic.
Overview of L7 Policy
Unlike traditional Layer 3 and Layer 4 policies that operate at the network and transport layers, L7 policies operate at the application layer. This enables more precise and effective traffic management based on application-specific attributes. L7 policies enable you to define security rules based on application-layer protocols such as HTTP(S), gRPC, and Kafka. For example, you can create policies that allow traffic based on HTTP(S) methods (GET, POST, etc.), headers, paths, and other protocol-specific attributes. This level of control helps in implementing fine-grained access control, restricting traffic based on the actual content of the communication, and gaining deeper insights into your network traffic at the application layer.
Use cases of L7 policy
API Security: For applications exposing APIs, L7 policies provide fine-grained control over API access. You can define policies that only allow specific HTTP(S) methods (e.g., GET for read-only operations, POST for creating resources) on particular API paths. This helps in enforcing API security best practices and preventing unnecessary access.
Zero-Trust Implementation: L7 policies are a key component in implementing a Zero-Trust security model within your AKS environment. By default-denying all traffic and then explicitly allowing only necessary communication based on application-layer context, you can significantly reduce the attack surface and improve overall security posture.
Microservice Segmentation and Isolation: In a microservice architecture, it's essential to isolate services to limit the blast radius of potential security breaches. L7 policies allow you to define precise rules for inter-service communication. For example, you can ensure that a billing service can only be accessed by an order processing service via specific API endpoints and HTTP(S) methods, preventing unauthorized access from other services.
How Does It Work?
When a pod sends out network traffic, it is first checked against your defined rules using a small, efficient program called an extended Berkeley Packet Filter (eBPF) probe. If L7 policies are enabled for that pod, the probe marks the traffic, which is then redirected to a local Envoy proxy. The Envoy proxy here is part of the ACNS Security Agent, separate from the Cilium agent, and acts as a gatekeeper, deciding whether the traffic is allowed to proceed based on your policy criteria. If the traffic is permitted, it flows to its destination. If not, the application receives an error message saying access is denied.
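For instance, here is a hedged sketch of how that denial surfaces to a client, reusing the hypothetical frontend-app and backend-app names from the example that follows and assuming the client image ships curl; the /admin path is made up purely for illustration:
# An HTTP request that the L7 rule does not allow is answered by the node-local Envoy
# with an HTTP 403 (access denied) rather than timing out, as an L3/L4 policy drop would:
kubectl exec deploy/frontend-app -- curl -s -o /dev/null -w "%{http_code}\n" http://backend-app/admin
# Expected output when the request is denied at Layer 7: 403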
Example: Restricting HTTP POST Requests
Let's say you have a web application running on your AKS cluster, and you want to ensure that a specific backend service (backend-app) only accepts GET requests on the /data path from your frontend application (frontend-app). With L7 Network Policies, you can define a policy like this:
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-get-products
spec:
  endpointSelector:
    matchLabels:
      app: backend-app   # Replace with your backend app name
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: frontend-app   # Replace with your frontend app name
    toPorts:
    - ports:
      - port: "80"
        protocol: TCP
      rules:
        http:
        - method: "GET"
          path: "/data"
How Can You Observe L7 Traffic?
Advanced Container Networking Services with L7 policies also provides observability into L7 traffic, including HTTP(S), gRPC, and Kafka, through Hubble. Hubble is enabled by default with Advanced Container Networking Services. To facilitate easy analysis, Advanced Container Networking Services with L7 policy offers pre-configured Azure Managed Grafana dashboards, located in the Dashboards > Azure Managed Prometheus folder. These dashboards, such as "Kubernetes/Networking/L7 (Namespace)" and "Kubernetes/Networking/L7 (Workload)", provide granular visibility into L7 flow data at the cluster, namespace, and workload levels. The screenshot below shows a Grafana dashboard visualizing incoming HTTP traffic for the http-server-866b29bc75 workload in AKS over the last 5 minutes. It displays request rates, policy verdicts (forwarded/dropped), method/status breakdown, and dropped flow rates for real-time monitoring and troubleshooting. This is just an example; similar detailed metrics and visualizations, including heatmaps, are also available for gRPC and Kafka traffic on our pre-created dashboards. For real-time log analysis, you can also leverage the Hubble CLI and UI options offered as part of the Advanced Container Networking Services observability solution, allowing you to inspect individual L7 flows and troubleshoot policy enforcement.
Call to Action
We encourage you to try out the public preview of L7 Network Policies on your AKS clusters and level up your network security controls for containerized workloads. We value your feedback as we continue to develop and improve this feature. Please refer to the Layer 7 Policy Overview for more information and visit How to Apply L7 Policy for an example scenario.
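As a complement to the HTTP example above, L7 rules can also express Kafka-level intent. The snippet below is only an illustrative sketch with made-up labels, port, and topic name; check the Layer 7 Policy Overview for the fields supported in the current preview:
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-kafka-consume-orders
spec:
  endpointSelector:
    matchLabels:
      app: kafka-broker        # hypothetical broker label
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: order-processor   # hypothetical consumer label
    toPorts:
    - ports:
      - port: "9092"
        protocol: TCP
      rules:
        kafka:
        - role: "consume"
          topic: "orders"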
The Deployment of Hollow Core Fiber (HCF) in Azure's Network
Co-authors: Jamie Gaudette, Frank Rey, Tony Pearson, Russell Ellis, Chris Badgley and Arsalan Saljoghei
In the evolving Cloud and AI landscape, Microsoft is deploying state-of-the-art Hollow Core Fiber (HCF) technology in Azure's network to optimize infrastructure and enhance performance for customers. By deploying cabled HCF technology together with HCF-supportable datacenter (DC) equipment, this solution creates ultra-low latency traffic routes with faster data transmission to meet the demands of Cloud and AI workloads. The successful adoption of HCF technology in Azure's network relies on developing a new ecosystem to take full advantage of the solution, including new cables, field splicing, installation and testing… and Microsoft has done exactly that. Azure has collaborated with industry leaders to deliver components and equipment, cable manufacturing, and installation. These efforts, along with advancements in HCF technology, have paved the way for its deployment in the field. HCF is now operational and carrying live customer traffic in multiple Microsoft Azure regions, proving it is as reliable as conventional fiber with no field failures or outages. This article explores the installation activities, testing, and link performance of a recent HCF deployment, showcasing the benefits that Azure customers can leverage from HCF technology.
HCF connected Azure DCs are ready for service
The latest HCF cable deployment connects two Azure DCs in a major city, with two metro routes each over 20 km long. The hybrid cables both include 32 HCF and 48 single mode fiber (SMF) strands, with HCFs delivering high-capacity Dense Wavelength Division Multiplexing (DWDM) transmission comparable to SMF. The cables are installed over two diverse paths (the red and blue lines shown in image 1), each with different entry points into the DC. Route diversity at the physical layer enhances network resilience and reliability by allowing traffic to be rerouted through alternate paths, minimizing the risk of a network outage should there be a disruption. It also allows for increased capacity by distributing network traffic more evenly, improving overall network performance and operational efficiency.
Image 1: Satellite image of two Azure DC sites (A & Z) within a metro region interconnected with new ultra-low latency HCF technology, using two diverse paths (blue & red)
Image 2 shows the optical routing that the deployed HCF cables take through both Inside Plant (ISP) and Outside Plant (OSP) for interconnecting terminal equipment within key sites in the region (comprising DCs, Network Gateways, and PoPs).
Image 2: Optical connectivity at the physical layer between DCA and DCZ
The HCF OSP cables have been developed for outdoor use in harsh environments without degrading the propagation properties of the fiber. The cable technology is smaller, faster, and easier to install (using a blown installation method). Alongside cables, various other technologies have been developed and integrated to provide a reliable end-to-end HCF network solution. This includes dedicated HCF-compatible equipment (shown in image 3), such as custom cable joint enclosures, fusion splicing technology, HCF patch tails for cable termination in the DC, and an HCF custom-designed Optical Time Domain Reflectometer (OTDR) to locate faults in the link. These solutions work with commercially available transponders and DWDM technologies to deliver multi-Tb/s capacities for Azure customers.
Looking more closely at an HCF cable installation, in image 4 the cable is installed by passing it through a blowing head (1) and inserting it into pre-installed conduit in segments underground along the route. As with traditional installations with conventional cable, the conduit, cable entry/exit, and cable joints are accessible through pre-installed access chambers, typically a few hundred meters apart. The blowing head uses high-pressure air from a compressor to push the cable into the conduit. A single drum-length of cable can be re-fleeted above ground (2) at multiple access points and re-jetted (3) over several kilometers. After the cables are installed inside the conduit, they are jointed at pre-designated access chamber locations. These house the purpose-designed cable joint enclosures.
Image 4: Cable preparation and installation during in-field deployment
Image 5 shows a custom HCF cable joint enclosure in the field, tailored to protect HCFs for reliable data transmission. These enclosures organize the many HCF splices inside and are placed in underground chambers across the link.
Image 5: 1) HCF joint enclosure in a chamber in-field 2) Open enclosure showing fiber loop storage protected by colored tubes at the rear side of the joint 3) Open enclosure showing HCF spliced on multiple splice tray layers
Inside the DC, connectorized 'plug-and-play' HCF-specific patch tails have been developed and installed for use with existing DWDM solutions. The patch tails interface between the HCF transmission and SMF active and passive equipment, each containing two SMF-compatible connectors coupled to the ISP HCF cable. In image 6, this has been terminated to a patch panel and mated with existing DWDM equipment inside the DC.
Image 6: HCF patch tail solution connected to DWDM equipment
Testing
To validate the end-to-end quality of the installed HCF links (post-deployment and during operation), field-deployable solutions have been developed and integrated to ensure all required transmission metrics are met and to identify and restore any faults before the link is ready for customer traffic. One such solution is Microsoft's custom-designed HCF-specific OTDR, which helps measure individual splice losses and verify attenuation in all cable sections. This is checked against rigorous Azure HCF specification requirements. The OTDR tool is invaluable for locating high splice losses or faults that need to be reworked before the link can be brought into service. The diagram below shows an OTDR trace detecting splice locations and splice loss levels (dB) across a single strand of installed HCF. The OTDR can also continuously monitor HCF links and quickly locate faults, such as cable cuts, for quick recovery and remediation. For this deployment, a mean splice loss of 0.16 dB was achieved, with some splices as low as 0.04 dB, comparable to conventional fiber. Low attenuation and splice loss help to maintain higher signal integrity, supporting longer transmission reach and higher traffic capacity. There are ongoing Azure HCF roadmap programs to continually improve this.
Performance
Before running customer traffic on the link, the fiber is tested to ensure reliable, error-free data transmission across the operating spectrum by counting lost or errored bits. Once confirmed, the link is moved into production, allowing customer traffic to flow on the route. These optical tests, tailored to HCF, are carried out by the installer to meet Azure's acceptance requirements.
Image 8 illustrates the flow of traffic across an HCF link, dictated by changing capacity demand and routing protocols in the region, which fluctuate throughout the day. The HCF span has supported varying levels of customer traffic from the point the link was made live, without incurring any outages or link flaps. A critical metric for measuring transmission performance over each HCF path is the instantaneous Pre-Forward Error Correction (FEC) Bit Error Rate (BER) level. Pre-FEC BER measures errors in a digital data stream at the receiver before any error correction is applied. This is crucial for transmission quality when the link carries data traffic; lower levels mean fewer errors and higher signal quality, essential for reliable data transmission. The following graph (image 9) shows the evolution of the Pre-FEC BER level on an HCF span once the link is live. Each HCF strand is represented by a color, with all showing minimal fluctuation. This demonstrates very stable Pre-FEC BER levels, well below 10^-3.4 (that is, below -3.4 in log10 terms), across all 400G optical transponders operating over all channels during a 24-day period. This indicates the network can handle high-data transmission efficiently with no post-FEC errors, leading to high customer traffic performance and reliability.
Image 9: Very stable Pre-FEC BER levels across the HCF span over 20 days
The graph below demonstrates the optical loss stability over one entire span, which comprises two HCF strands. It was monitored continuously over 20 days using the inbuilt line system and measured in both directions to assess the optical health of the HCF link. The new HCF cable paths are live and carrying customer traffic across multiple Azure regions. Having demonstrated the end-to-end deployment capabilities and network compatibility of the HCF solution, it is possible to take full advantage of the ultra-stable, high-performance, and reliable connectivity HCF delivers to Azure customers.
What's next?
Unlocking the full potential of HCF requires compatible, end-to-end solutions. This blog outlines the holistic and deployable HCF systems we have developed to better serve our customers. While we further integrate HCF into more Azure regions, our development roadmap continues: smaller cables with more fibers and enhanced system components to further increase the capacity of our solutions, standardized and simplified deployment and operations, and extended deployable distances for HCF long-haul transmission solutions. Creating a more stable, higher-capacity, faster network will allow Azure to better serve all its customers. Learn more about how hollow core fiber is accelerating AI. If you are attending the Optical Fiber Communication (OFC) Conference in San Francisco, US (March 30 – April 3, 2025), find out more during a panel discussion given by Russ Ellis (Azure Principal Cloud Network Engineer), titled 'Deployment of Hollow Core Fiber (HCF) in the Microsoft Azure Cloud'. The session is at 16:00-18:30 on Sunday, March 30, in meeting room 201-202 (Level 2): OFC Program

Announcing the general availability of Route-Maps for Azure virtual WAN
Route-maps is a feature that gives you the ability to control route advertisements and routing for Virtual WAN virtual hubs. Route-maps lets you have more control of the routing that enters and leaves Virtual WAN site-to-site (S2S) VPN connections, User VPN point-to-site (P2S) connections, ExpressRoute connections, and virtual network (VNet) connections.
Why use Route-maps?
Route-maps can be used to summarize routes when you have on-premises networks connected to Virtual WAN via ExpressRoute or VPN and are limited by the number of routes that can be advertised from/to the virtual hub.
You can use Route-maps to control routes entering and leaving your Virtual WAN deployment among on-premises and virtual networks.
You can control routing decisions in your Virtual WAN deployment by modifying a BGP attribute such as AS-PATH to make a route more, or less, preferable. This is helpful when there are destination prefixes reachable via multiple paths and customers want to use AS-PATH to control best-path selection.
You can easily tag routes using the BGP community attribute in order to manage routes.
What is a Route-map?
A Route-map is an ordered sequence of one or more rules that are applied to routes received or sent by the virtual hub. Each Route-map rule comprises three sections:
1. Match conditions
Route-maps allows you to match routes using Route-prefix, BGP community, and AS-Path. These are the set of conditions that a processed route must meet in order to be considered a match for the rule. Below are the supported match conditions (Property | Criterion | Value (example) | Interpretation):
Route-prefix | equals | 10.1.0.0/16, 10.2.0.0/16, 10.3.0.0/16, 10.4.0.0/16 | Matches these 4 routes only. Specific prefixes under these routes won't be matched.
Route-prefix | contains | 10.1.0.0/16, 10.2.0.0/16, 192.168.16.0/24, 192.168.17.0/24 | Matches all the specified routes and all prefixes underneath. (Example: 10.2.1.0/24 is underneath 10.2.0.0/16.)
Community | equals | 65001:100, 65001:200 | The Community property of the route must have both communities. Order isn't relevant.
Community | contains | 65001:100, 65001:200 | The Community property of the route can have one or more of the specified communities.
AS-Path | equals | 65001, 65002, 65003 | The AS-PATH of the routes must have the ASNs listed in the specified order.
AS-Path | contains | 65001, 65002, 65003 | The AS-PATH in the routes can contain one or more of the ASNs listed. Order isn't relevant.
2. Actions to be performed
Route-maps allows you to Drop or Modify routes. Below are the supported actions (Property | Action | Value | Interpretation):
Route-prefix | Drop | 10.3.0.0/8, 10.4.0.0/8 | The routes specified in the rule are dropped.
Route-prefix | Replace | 10.0.0.0/8, 192.168.0.0/16 | Replace all the matched routes with the routes specified in the rule.
AS-Path | Add | 64580, 64581 | Prepend the AS-PATH with the list of ASNs specified in the rule. These ASNs are applied in the same order for the matched routes.
AS-Path | Replace | 65004, 65005 | The AS-PATH will be set to this list in the same order, for every matched route. See key considerations for reserved AS numbers.
AS-Path | Replace | No value specified | Remove all ASNs in the AS-PATH of the matched routes.
Community | Add | 64580:300, 64581:300 | Add all the listed communities to the community attribute of all the matched routes.
Community | Replace | 64580:300, 64581:300 | Replace the community attribute for all the matched routes with the list provided.
Community | Replace | No value specified | Remove the community attribute from all the matched routes.
Community | Remove | 65001:100, 65001:200 | Remove any of the listed communities that are present in the matched routes' Community attribute.
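Putting the two tables together, a single Route-map rule pairs one or more match conditions with one or more actions. The JSON below is only an illustrative sketch of that shape, not the exact resource schema; the property names are assumptions, so refer to the Route-maps documentation or the Virtual WAN REST reference for the authoritative format:
{
  "name": "prepend-onprem-routes",
  "matchCriteria": [
    {
      "matchCondition": "Contains",
      "routePrefix": [ "10.1.0.0/16" ]
    }
  ],
  "actions": [
    {
      "type": "Add",
      "parameters": [
        { "asPath": [ "65001", "65001" ] }
      ]
    }
  ],
  "nextStepIfMatched": "Continue"
}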
3. Apply to a Connection
You can apply Route-maps on each connection in the inbound direction, the outbound direction, or both.
Where can I find Route-maps?
Route-maps can be found in the Routing section of your Virtual WAN hub.
How do I troubleshoot Route-maps?
The Route-Map Dashboard lets you view the routes for a connection. You can view the routes before or after a Route-map is applied.
Demo

Introducing Subnet Peering in Azure
Subnet peering refers to a method of connecting two virtual networks (VNets) by linking only subnet address spaces rather than the entire VNet address spaces. It lets users specify which subnets should participate in the peering on the local and remote VNets. It adds flexibility on top of VNet peering, giving users the option to choose the specific subnets that need to be peered across VNets. Users enter the list of subnets across the VNets that they want to peer. In contrast, with regular VNet peering, the entire address space (all subnets) of both VNets gets peered.
Note: The feature is available via Terraform, CLI, API, ARM, and PowerShell across all regions. The feature is being rolled out to selected customers on a need basis. To use it, you would need to allowlist your subscription. Please fill this form to get your subscription registered.
Key customer use cases:
Peering subnets across VNets that have overlapping address space (provided the subnets to be peered have unique addresses and belong to unique address spaces; the typical scenario below depicts a hub-and-spoke model). Refer to the image below:
Peering subnets over IPv6 only in the case of dual-stack subnets - With subnet peering, users get an option to peer over the IPv6 address space when they have dual-stack subnets across the local and remote VNets. This enables customers to rely on IPv6 for peering connectivity. Refer to the image below:
Gateway scenarios related to on-premises connectivity, where admins want to expose only certain subnets to on-premises rather than all of them. Refer to the image below:
Subnet peering - Checks and Limitations
We would like to discuss a few critical aspects of subnet peering in the present release. For this, consider the following setup of three VNets, where VNET A is subnet peered with VNET B and with VNET C. Key points to note are:
The participating subnets must be unique and must belong to unique address spaces. For example, in the VNET A and VNET C peering illustrated in the figure above (black arrow-headed line), VNET A cannot subnet peer over Subnet 1, Subnet 2, and Subnet 3 with any of the subnets in VNET C, as these subnets of VNET A belong to the 10.1.0.0/16 address space, which is also present in VNET C. However, VNET A's Subnet 4 (10.0.1.0/24) can still subnet peer with Subnet 5 in VNET C (10.6.1.0/24), as these subnets are unique across the VNets and belong to unique address spaces. Note that Subnet 4 belongs to the 10.0.0.0/16 address space in VNET A and Subnet 5 belongs to the 10.6.0.0/16 address space in VNET C.
There can be only one peering link between any two VNets. If you want to add or remove subnets from the peering, the same peering link must be updated. This also means multiple exclusive peerings between sets of subnets are not possible. Also, a given peering link type cannot be changed. That means if there is a VNet peering between VNET A and VNET B and the user wants to change it to subnet peering, the existing VNet peering link needs to be deleted and a new peering created with the required parameters for subnet peering, and vice versa.
The number of subnets that can be part of the peering link should be less than or equal to 400 (a limit of 200 on each of the local and remote sides). For example, in the VNET A and VNET B peering link (illustrated by the blue arrow-headed line), the total number of subnets participating in the peering is 4 (two from the VNET A side and two from the VNET B side). This number should be <= 400.
In the present release (public preview, with GA in March 2025; the feature will remain behind a subscription flag), a forward route from non-peered subnets to peered subnets exists. That is, in the current VNET A and VNET B peering scenario, even though Subnet 2 on the VNET A side is not peered, it will still have routes for Subnet 1 and Subnet 2 in VNET B. To clarify further, customers would expect only Subnet 1 and Subnet 3 in VNET A to have routes for Subnet 1 and Subnet 2 in the remote VNET B; however, Subnet 2 and Subnet 4 (on the local VNET A side, which are not peered) also have routes for Subnet 1 and Subnet 2 on the remote side (VNET B), although the packets get dropped and do not reach the VM. This limitation will be removed in the post-GA release.
Subnet Peering and AVNM Connected Group
Note that if two VNets are connected in a Connected Group, and subnet peering is configured over these two VNets, subnet peering takes preference and the connectivity between non-peered subnets gets dropped.
AVNM Connectivity Configuration
AVNM today cannot differentiate between VNet peering and subnet peering. So, if subnet peering exists between VNET A and VNET B, and later an AVNM user tries to establish a VNet peering between VNET A and VNET B through some connectivity configuration (say, a Hub and Spoke deployment), AVNM will assume that a peering between VNET A and VNET B already exists and will ignore the new peering request. We recommend users exercise caution in such conflicting scenarios when using AVNM and subnet peering.
Subnet Peering and Virtual WAN
Currently, Azure Virtual WAN does not support subnet peering.
Note: The feature is being rolled out to selected customers on a need basis. To use it, you would need to allowlist your subscription. Please fill this form to get your subscription registered.
How to configure subnet peering?
In the existing VNet peering create process, a few new optional parameters are introduced (refer to A through D below). Below is a description of each:
A. --peer-complete-vnet
This parameter lets the user select subnet peering. By default, the value is set to true, which means the entire VNet is peered (all address spaces/subnets across the VNets). To enable subnet peering, this parameter needs to be set to false (0/f/n/no).
Accepted values: 0, 1, f, false, n, no, t, true, y, yes. Default value: true.
az network vnet peering create --name --remote-vnet --resource-group --vnet-name
  [--allow-forwarded-traffic {0, 1, f, false, n, no, t, true, y, yes}]
  [--allow-gateway-transit {0, 1, f, false, n, no, t, true, y, yes}]
  [--allow-vnet-access {0, 1, f, false, n, no, t, true, y, yes}]
  [--no-wait {0, 1, f, false, n, no, t, true, y, yes}]
  [--use-remote-gateways {0, 1, f, false, n, no, t, true, y, yes}]
  [--peer-complete-vnet {0, 1 (default), f, false, n, no, t, true, y, yes}]
B. --local-subnet-names
This parameter lets the user enter the local subnet names they want to peer with the remote subnets, when subnet peering is enabled by setting the --peer-complete-vnet parameter to false.
C. --remote-subnet-names
This parameter lets the user enter the remote subnet names they want to peer with the local subnets, when subnet peering is enabled by setting the --peer-complete-vnet parameter to false.
Sample CLI input:
az network vnet peering create -g MyResourceGroup -n MyVnet1ToMyVnet2 --vnet-name MyVnet1 --remote-vnet MyVnet2Id --allow-vnet-access --peer-complete-vnet 0 --local-subnet-names subnet1 subnet2 --remote-subnet-names subnet1 subnet2
Subnet peering and IPv6-only peering
In addition to the above, in the case of dual-stack subnets, you can also configure IPv6-only peering, wherein the participating subnets interact over IPv6 only. Earlier, the entire address spaces across the VNets were evaluated before peering, and if the overlap check cleared, the entire VNet address spaces got peered, i.e., both IPv4 and IPv6. With this release, we are providing additional flexibility on top of subnet peering, where users can specifically choose to peer over the IPv6 address space only. Note that currently this capability is built on top of subnet peering, and to use it, an additional parameter, enable-only-ipv6, needs to be set to true in the VNet peering create experience.
D. --enable-only-ipv6
This parameter lets the user select subnet peering with IPv6-only peering functionality. By default, the value is set to false, which means peering is done over IPv4 addresses. If set to true, peering is done over IPv6 in the case of dual-stack subnets.
Accepted values: 0, 1, f, false, n, no, t, true, y, yes.
Sample CLI input:
az network vnet peering create -g MyResourceGroup -n MyVnet1ToMyVnet2 --vnet-name MyVnet1 --remote-vnet MyVnet2Id --allow-vnet-access --peer-complete-vnet 0 --enable-only-ipv6 1 --local-subnet-names subnet1 subnet2 --remote-subnet-names subnet1 subnet2
In this case, if the subnets mentioned are dual stack (or IPv6-only subnets in the future), peering will be over the IPv6 address space only.
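After configuring a peering like the samples above, one practical way to confirm which remote prefixes a given VM has actually learned is to dump the effective routes on its NIC. This is a hedged sketch; the resource group and NIC name are placeholders for your own resources:
az network nic show-effective-route-table --resource-group MyResourceGroup --name MyVmNic --output table
# The peered subnets' prefixes should appear with the VNetPeering next hop type; prefixes of non-peered subnets may still appear in the current release, but traffic to them is dropped as described in the limitations above.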
Learn more:
Azure Virtual Network - VNet peering
How to configure subnet peering
Allowlist your subscription - Fill the form to use the feature.
Questions and concerns: amitmishra@microsoft.com

ExpressRoute Metro and Global Reach Expansion
Following the General Availability of ExpressRoute Metro in October 2024, we're excited to roll it out in four new locations. This broader footprint enables more customers to benefit from increased resiliency and to strengthen the reliability of their private connectivity to Azure. In addition, we are introducing two new ExpressRoute locations in Brussels, further enhancing secure and private cloud connectivity across Europe.
New ExpressRoute Metro Locations:
Atlanta, United States of America
Jakarta, Indonesia
Madrid, Spain
Milan, Italy
New ExpressRoute Peering Locations:
Brussels, Belgium
Brussels2, Belgium
ExpressRoute Direct is now available in all of these new locations. ExpressRoute Global Reach, which allows private connectivity between your on-premises sites through Microsoft's global network, is now offered in the following additional locations:
Belgium
Italy
Spain
Learn more about ExpressRoute Metro.
Learn more about ExpressRoute Locations.
Learn more about ExpressRoute Global Reach.
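If you want to check programmatically which ExpressRoute Direct peering locations (including the new metros above) and service providers are available before ordering, here is a hedged sketch using existing Azure CLI commands; output shapes may vary by CLI version:
# Peering locations and available bandwidths for ExpressRoute Direct ports
az network express-route port location list --output table
# Provider-based circuits: list service providers and the peering locations they serve
az network express-route list-service-providers --output table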