The Deployment of Hollow Core Fiber (HCF) in Azure’s Network
Co-authors: Jamie Gaudette, Frank Rey, Tony Pearson, Russell Ellis, Chris Badgley and Arsalan Saljoghei In the evolving Cloud and AI landscape, Microsoft is deploying state-of-the-art Hollow Core Fiber (HCF) technology in Azure’s network to optimize infrastructure and enhance performance for customers. By deploying cabled HCF technology together with HCF-supportable datacenter (DC) equipment, this solution creates ultra-low latency traffic routes with faster data transmission to meet the demands of Cloud & AI workloads. The successful adoption of HCF technology in Azure’s network relies on developing a new ecosystem to take full advantage of the solution, including new cables, field splicing, installation and testing… and Microsoft has done exactly that. Azure has collaborated with industry leaders to deliver components and equipment, cable manufacturing and installation. These efforts, along with advancements in HCF technology, have paved the way for its deployment in-field. HCF is now operational and carrying live customer traffic in multiple Microsoft Azure regions, proving it is as reliable as conventional fiber with no field failures or outages. This article will explore the installation activities, testing, and link performance of a recent HCF deployment, showcasing the benefits that Azure customers can leverage from HCF technology. HCF connected Azure DCs are ready for service The latest HCF cable deployment connects two Azure DCs in a major city, with two metro routes each over 20km long. The hybrid cables both include 32 HCF and 48 single mode fiber (SMF) strands, with HCFs delivering high-capacity Dense Wavelength Division Multiplexing (DWDM) transmission comparable to SMF. The cables are installed over two diverse paths (the red and blue lines shown in image 1), each with different entry points into the DC. 
Route diversity at the physical layer enhances network resilience and reliability by allowing traffic to be rerouted through alternate paths, minimizing the risk of network outage should there be a disruption. It also allows for increased capacity by distributing network traffic more evenly, improving overall network performance and operational efficiency. Image 1: Satellite image of two Azure DC sites (A & Z) within a metro region interconnected with new ultra-low latency HCF technology, using two diverse paths (blue & red) Image 2 shows the optical routing that the deployed HCF cables take through both Inside Plant (ISP) and Outside Plant (OSP), for interconnecting terminal equipment within key sites in the region (comprised of DCs, Network Gateways and PoPs). Image 2: Optical connectivity at the physical layer between DCA and DCZ The HCF OSP cables have been developed for outdoor use in harsh environments without degrading the propagation properties of the fiber. The cable technology is smaller, faster, and easier to install (using a blown installation method). Alongside cables, various other technologies have been developed and integrated to provide a reliable end-to-end HCF network solution. This includes dedicated HCF-compatible equipment (shown in image 3), such as custom cable joint enclosures, fusion splicing technology, HCF patch tails for cable termination in the DC, and a HCF custom-designed Optical Time Domain Reflectometer (OTDR) to locate faults in the link. These solutions work with commercially available transponders and DWDM technologies to deliver multi-Tb/s capacities for Azure customers. Looking more closely at a HCF cable installation, in image 4 the cable is installed by passing it through a blowing-head (1) and inserting it into pre-installed conduit in segments underground along the route. 
As with traditional installations with conventional cable, the conduit, cable entry/exit, and cable joints are accessible through pre-installed access chambers, typically a few hundred meters apart. The blowing head uses high-pressure air from a compressor to push the cable into the conduit. A single drum-length of cable can be re-fleeted above ground (2) at multiple access points and re-jetted (3) over several kilometers. After the cables are installed inside the conduit, they are jointed at pre-designated access chamber locations. These house the purpose designed cable joint enclosures. Image 4: Cable preparation and installation during in-field deployment Image 5 shows a custom HCF cable joint enclosure in the field, tailored to protect HCFs for reliable data transmission. These enclosures organize the many HCF splices inside and are placed in underground chambers across the link. Image 5: 1) HCF joint enclosure in a chamber in-field 2) Open enclosure showing fiber loop storage protected by colored tubes at the rear-side of the joint 3) Open enclosure showing HCF spliced on multiple splice tray layers Inside the DC, connectorized ‘plug-and-play’ HCF-specific patch tails have been developed and installed for use with existing DWDM solutions. The patch tails interface between the HCF transmission and SMF active and passive equipment, each containing two SMF compatible connectors, coupled to the ISP HCF cable. In image 6, this has been terminated to a patch panel and mated with existing DWDM equipment inside the DC. Image 6: HCF patch tail solution connected to DWDM equipment Testing To validate the end-to-end quality of the installed HCF links (post deployment and during its operation), field deployable solutions have been developed and integrated to ensure all required transmission metrics are met and to identify and restore any faults before the link is ready for customer traffic. 
One such solution is Microsoft’s custom-designed HCF-specific OTDR, which helps measure individual splice losses and verify attenuation in all cable sections. This is checked against rigorous Azure HCF specification requirements. The OTDR tool is invaluable for locating high splice losses or faults that need to be reworked before the link can be brought into service. The diagram below shows an OTDR trace detecting splice locations and splice loss levels (dB) across a single strand of installed HCF. The OTDR can also continuously monitor HCF links and quickly locate faults, such as cable cuts, for quick recovery and remediation. For this deployment, a mean splice loss of 0.16dB was achieved, with some splices as low as 0.04dB, comparable to conventional fiber. Low attenuation and splice loss helps to maintain higher signal integrity, supporting longer transmission reach and higher traffic capacity. There are ongoing Azure HCF roadmap programs to continually improve this. Performance Before running customer traffic on the link, the fiber is tested to ensure reliable, error-free data transmission across the operating spectrum by counting lost or error bits. Once confirmed, the link is moved into production, allowing customer traffic to flow on the route. These optical tests, tailored to HCF, are carried out by the installer to meet Azure’s acceptance requirements. Image 8 illustrates the flow of traffic across a HCF link, dictated by changing demand on capacity and routing protocols in the region, which fluctuate throughout the day. The HCF span supports varying levels of customer traffic from the point the link was made live, without incurring any outages or link flaps. A critical metric for measuring transmission performance over each HCF path is the instantaneous Pre-Forward Error Correction (FEC) Bit Error Rate (BER) level. Pre-FEC BERs measure errors in a digital data stream at the receiver before any error correction is applied. 
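As a rough illustration of the definition above, pre-FEC BER is the ratio of errored bits to total bits seen at the receiver before correction, usually quoted as a base-10 logarithm. The sketch below is illustrative only: the function names and the sample counter values are assumptions, not Azure tooling or measured data.

```python
import math


def pre_fec_ber(errored_bits: int, total_bits: int) -> float:
    """Ratio of bits received in error to bits received, before FEC."""
    if total_bits <= 0:
        raise ValueError("total_bits must be positive")
    if not 0 <= errored_bits <= total_bits:
        raise ValueError("errored_bits must be between 0 and total_bits")
    return errored_bits / total_bits


def log10_ber(ber: float) -> float:
    """Express a BER on the log10 scale; a zero BER maps to -infinity."""
    return math.log10(ber) if ber > 0 else float("-inf")


if __name__ == "__main__":
    # Hypothetical counters read from a receiver over one interval.
    sample = pre_fec_ber(errored_bits=25, total_bits=1_000_000)
    print(f"BER = {sample:.1e}, log10(BER) = {log10_ber(sample):.2f}")
```

On this scale, a more negative value means fewer raw errors left for the FEC to correct.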
Pre-FEC BER is crucial for transmission quality when the link carries data traffic; lower levels mean fewer errors and higher signal quality, essential for reliable data transmission. The following graph (image 9) shows the evolution of the Pre-FEC BER level on a HCF span once the link is live. Each color represents a single strand of HCF, and all strands show minimal fluctuation. This demonstrates very stable Pre-FEC BER levels, well below -3.4 (log10), across all 400G optical transponders, operating over all channels during a 24-day period. This indicates the network can handle high-capacity data transmission efficiently with no Post-FEC errors, leading to high customer traffic performance and reliability.
Image 9: Very stable Pre-FEC BER levels across the HCF span over 20 days
The graph below demonstrates the optical loss stability over one entire span, which comprises two HCF strands. It was monitored continuously over 20 days using the inbuilt line system and measured in both directions to assess the optical health of the HCF link. The new HCF cable paths are live and carrying customer traffic across multiple Azure regions. Having demonstrated the end-to-end deployment capabilities and network compatibility of the HCF solution, it is possible to take full advantage of the ultra-stable, high-performance and reliable connectivity HCF delivers to Azure customers.
What’s next?
Unlocking the full potential of HCF requires compatible, end-to-end solutions. This blog outlines the holistic and deployable HCF systems we have developed to better serve our customers. While we further integrate HCF into more Azure Regions, our development roadmap continues. It includes smaller cables with more fibers and enhanced system components to further increase the capacity of our solutions, standardized and simplified deployment and operations, and extending the deployable distance of HCF long-haul transmission solutions.
Creating a more stable, higher capacity, faster network will allow Azure to better serve all its customers. Learn more about how hollow core fiber is accelerating AI.
Recently published HCF research papers:
Ultra high resolution and long range OFDRs for characterizing and monitoring HCF DNANFs
Unrepeated HCF transmission over spans up to 301.7km

Combining firewall protection and SD-WAN connectivity in Azure virtual WAN
Virtual WAN (vWAN) introduces new security and connectivity features in Azure, including the ability to operate managed third-party firewalls and SD-WAN virtual appliances, integrated natively within a virtual WAN hub (vhub). This article will discuss updated network designs resulting from these integrations and examine how to combine firewall protection and SD-WAN connectivity when using vWAN. The objective is not to delve into the specifics of the security or SD-WAN connectivity solutions, but to provide an overview of the possibilities. Firewall protection in vWAN In a vWAN environment, the firewall solution is deployed either automatically inside the vhub (Routing Intent) or manually in a transit VNet (VM-series deployment). Routing Intent (managed firewall) Routing Intent refers to the concept of implementing a managed firewall solution within the vhub for internet protection or private traffic protection (VNet-to-VNet, Branch-to-VNet, Branch-to-Branch), or both. The firewall could be either an Azure Firewall or a third-party firewall, deployed within the vhub as Network Virtual Appliances or a SaaS solution. A vhub containing a managed firewall is called a secured hub. For an updated list of Routing Intent supported third-party solutions please refer to the following links: managed NVAs SaaS solution Transit VNet (unmanaged firewall) Another way to provide inspection in vWAN is to manually deploy the firewall solution in a spoke of the vhub and to cascade the actual spokes behind that transit firewall VNet (aka indirect spoke model or tiered-VNet design). In this discussion, the primary reasons for choosing unmanaged deployments are: either the firewall solution lacks an integrated vWAN offer, or it has an integrated offer but falls short in horizontal scalability or specific features compared to the VM-based version. For a detailed analysis on the pros and cons of each design please refer to this article. 
SD-WAN connectivity in vWAN Similar to the firewall deployment options, there are two main methods for extending an SDWAN overlay into an Azure vWAN environment: a managed deployment within the vhub, or a standard VM-series deployment in a spoke of the vhub. More options here. SD-WAN in vWAN deployment (managed) In this scenario, a pair of virtual SD-WAN appliances are automatically deployed and integrated in the vhub using dynamic routing (BGP) with the vhub router. Deployment and management processes are streamlined as these appliances are seamlessly provisioned in Azure and set up for a simple import into the partner portal (SD-WAN orchestrator). For an updated list of supported SDWAN partners please refer to this link. For more information on SD-WAN in vWAN deployments please refer to this article. VM-series deployment (unmanaged) This solution requires manual deployment of the virtual SD-WAN appliances in a spoke of the vhub. The underlying VMs and the horizontal scaling are managed by the customer. Dynamic route exchange with the vWAN environment is achieved leveraging BGP peering with the vhub. Alternatively, and depending on the complexity of your addressing plan, static routing may also be possible. Firewall protection and SD-WAN in vWAN THE CHALLENGE! Currently, it is only possible to chain managed third-party SD-WAN connectivity with Azure Firewall in the same vhub, or to use dual-role SD-WAN connectivity and security appliances. Routing Intent provided by third-party firewalls combined with another managed SD-WAN solution inside the same vhub is not yet supported. But how can firewall protection and SD-WAN connectivity be integrated together within vWAN? Solution 1: Routing Intent with Azure Firewall and managed SD-WAN (same vhub) Firewall solution: managed. SD-WAN solution: managed. 
This design is only compatible with Routing Intent using Azure Firewall, as it is the sole firewall solution that can be combined with a managed SD-WAN in vWAN deployment in that same vhub. With the private traffic protection policy enabled in Routing Intent, all East-West flows (VNet-to-VNet, Branch-to-VNet, Branch-to-Branch) are inspected. Solution 2: Routing Intent with a third-party firewall and managed SD-WAN (2 vhubs) Firewall solution: managed. SD-WAN solution: managed. To have both a third-party firewall managed solution in vWAN and an SD-WAN managed solution in vWAN in the same region, the only option is to have a vhub dedicated to the security solution deployment and another vhub dedicated to the SD-WAN solution deployment. In each region, spoke VNets are connected to the secured vhub, while SD-WAN branches are connected to the vhub containing the SD-WAN deployment. In this design, Routing Intent private traffic protection provides VNet-to-VNet and Branch-to-VNet inspection. However, Branch-to-Branch traffic will not be inspected. Solution 3: Routing Intent and SD-WAN spoke VNet (same vhub) Firewall solution: managed. SD-WAN solution: unmanaged. This design is compatible with any Routing Intent supported firewall solution (Azure Firewall or third-party) and with any SD-WAN solution. With Routing Intent private traffic protection enabled, all East-West flows (VNet-to-VNet, Branch-to-VNet, Branch-to-Branch) are inspected. Solution 4: Transit firewall VNet and managed SDWAN (same vhub) Firewall solution: unmanaged. SD-WAN solution: managed. This design utilizes the indirect spoke model, enabling the deployment of managed SD-WAN in vWAN appliances. This design provides VNet-to-VNet and Branch-to-VNet inspection. But because the firewall solution is not hosted in the hub, Branch-to-Branch traffic will not be inspected. Solution 5 - Transit firewall VNet and SD-WAN spoke VNet (same vhub) Firewall solution: unmanaged. SD-WAN solution: unmanaged. 
This design integrates both the security and the SD-WAN connectivity as unmanaged solutions, placing the responsibility for deploying and managing the firewall and the SD-WAN hub on the customer. Just like in solution #4, only VNet-to-VNet and Branch-to-VNet traffic is inspected.
Conclusion
Although it is currently not possible to combine a managed third-party firewall solution with a managed SD-WAN deployment within the same vhub, numerous design options are still available to meet various needs, whether managed or unmanaged approaches are preferred.

Microsoft Azure scales Hollow Core Fiber (HCF) production through outsourced manufacturing
Introduction As cloud and AI workloads surge, the pressure on datacenter (DC), Metro and Wide Area Network (WAN) networks has never been greater. Microsoft is tackling the physical limits of traditional networking head-on. From pioneering research in microLED technologies to deploying Hollow Core Fiber (HCF) at global scale, Microsoft is reimagining connectivity to power the next era of cloud networking. Azure’s HCF journey has been one of relentless innovation, collaboration, and a vision to redefine the physical layer of the cloud. Microsoft’s HCF, based on the proprietary Double Nested Antiresonant Nodeless Fiber (DNANF) design, delivers up to 47% faster data transmission and approximately 33% lower latency compared to conventional Single Mode Fiber (SMF), bringing significant advantages to the network that powers Azure. Today, Microsoft is announcing a major milestone: the industrial scale-up of HCF production, powered by new strategic manufacturing collaborations with Corning Incorporated (Corning) and Heraeus Covantics (Heraeus). These collaborations will enable Azure to increase the global fiber production of HCF to meet the demands of the growing network infrastructure, advancing the performance and reliability customers expect for cloud and AI workloads. Real-world benefits for Azure customers Since 2023, Microsoft has deployed HCF across multiple Azure regions, with production links meeting performance and reliability targets. As manufacturing scales, Azure plans to expand deployment of the full end-to-end HCF network solution to help increase capacity, resiliency, and speed for customers, with the potential to set new benchmarks for latency and efficiency in fiber infrastructure. Why it matters Microsoft’s proprietary HCF design brings the following improvements for Azure customers: Increased data transmission speeds with up to 33% lower latency. Enhanced signal performance that improves data transmission quality for customers. 
Improved optical efficiency resulting in higher bandwidth rates compared to conventional fiber. How Microsoft is making it possible To operationalize HCF across Azure with production grade performance, Microsoft is: Deploying a standardized HCF solution with end-to-end systems and components for operational efficiency, streamlined network management, and reliable connectivity across Azure’s infrastructure. Ensuring interoperability with standard SMF environments, enabling seamless integration with existing optical infrastructure in the network for faster deployment and scalable growth. Creating a multinational production supply chain to scale next generation fiber production, ensuring the volumes and speed to market needed for widespread HCF deployment across the Azure network. Scaling up and out With Corning and Heraeus as Microsoft’s first HCF manufacturing collaborators, Azure plans to accelerate deployment to meet surging demand for high-performance connectivity. These collaborations underscore Microsoft’s commitment to enhancing its global infrastructure and delivering a reliable customer experience. They also reinforce Azure’s continued investment in deploying HCF, with a vision for this technology to potentially set the global benchmark for high-capacity fiber innovation. “This milestone marks a new chapter in reimagining the cloud’s physical layer. Our collaborations with Corning and Heraeus establish a resilient, global HCF supply chain so Azure can deliver a standardized, world-class customer experience with ultra-low latency and high reliability for modern AI and cloud workloads.” - Jamie Gaudette, Partner Cloud Network Engineering Manager at Microsoft To scale HCF production, Microsoft will utilize Corning’s established U.S. facilities, while Heraeus will produce out of its sites in both Europe and the U.S. 
"Corning is excited to expand our longtime collaboration with Microsoft, leveraging Corning’s fiber and cable manufacturing facilities in North Carolina to accelerate the production of Microsoft's Hollow Core Fiber. This collaboration not only strengthens our existing relationship but also underscores our commitment to advancing U.S. leadership in AI innovation and infrastructure. By working closely with Microsoft, we are poised to deliver solutions that meet the demands of AI workloads, setting new benchmarks for speed and efficiency in fiber infrastructure." - Mike O'Day, Senior Vice President and General Manager, Corning Optical Communications “We started our work on HCF a decade ago, teamed up with the Optoelectronics Research Centre (ORC) at the University of Southampton and then with Lumenisity prior to its acquisition. Now, we are excited to continue working with Microsoft on shaping the datacom industry. With leading solutions in glass, tube, preform, and fiber manufacturing, we are ready to scale this disruptive HCF technology to significant volumes. We’ll leverage our proven track record of taking glass and fiber innovations from the lab to widespread adoption, just as we did in the telecom industry, where approximately 2 billion kilometers of fiber are made using Heraeus products.” - Dr. Jan Vydra, Executive Vice President Fiber Optics, Heraeus Covantics Azure engineers are working alongside Corning and Heraeus to operationalize Microsoft manufacturing process intellectual property (IP), deliver targeted training programs, and drive the yield, metrology, and reliability improvements required for scaled production. 
The collaborations are foundational to a growing standardized, global ecosystem that supports:
Glass preform/tubing supply
Fiber production at scale
Cable and connectivity for deployment into carrier-grade environments
Building on a foundation of innovation: Microsoft’s HCF program
In 2022, Microsoft acquired Lumenisity, a spin-out from the Optoelectronics Research Centre (ORC) at the University of Southampton, UK. That same year, Microsoft launched the world’s first state-of-the-art HCF fabrication facility in the UK to expand production and drive innovation. This purpose-built site continues to support long-term HCF research, prototyping, and testing, ensuring that Azure remains at the forefront of HCF technology. Working with industry leaders, Microsoft has developed an end-to-end ecosystem of components, equipment, and HCF-specific hardware that has been successfully proven in production deployments and operations.
Pushing the boundaries: recent breakthrough research
Today, the University of Southampton announced a landmark achievement in optical communications: in collaboration with Azure Fiber researchers, they have demonstrated the lowest signal loss ever recorded for optical fibers (<0.1 dB/km) using research-grade DNANF HCF technology (see figure 4). This breakthrough, detailed in a research paper published in Nature Photonics earlier this month, paves the way for a potential revolution in the field, enabling unprecedented data transmission capacities and longer unamplified spans.
Figure 4: Records at around 1550nm. [1] 2002, Nagayama et al.; [2] 2025, Sato et al.; [3] 2025, research-grade DNANF HCF, Petrovich et al.
This breakthrough highlights the potential for this technology to transform global internet infrastructure and DC connectivity. Expected benefits include:
Faster: Approximately 47% faster, reducing latency, powering real-time AI inference, cloud gaming and other interactive workloads.
More capacity: A wider optical spectrum window enabling exponentially greater bandwidth.
Future-ready: Lays the groundwork for quantum-safe links, quantum computing infrastructure, advanced sensing, and remote laser delivery.
Looking ahead: Unlocking the future of cloud networking
The future of cloud networking is being built today! With record-breaking [3] fiber innovations, a rapidly expanding collaborative ecosystem, and the industrialized scale to deliver next-generation performance, Azure continues to evolve to meet the demands for speed, reliability, and connectivity. As we accelerate the deployment of HCF across our global network, we’re not just keeping pace with the demands of AI and cloud, we’re redefining what’s possible.
References:
[1] Nagayama, K., Kakui, M., Matsui, M., Saitoh, T. & Chigusa, Y. Ultra-low-loss (0.1484 dB/km) pure silica core fibre and extension of transmission distance. Electron. Lett. 38, 1168–1169 (2002).
[2] Sato, S., Kawaguchi, Y., Sakuma, H., Haruna, T. & Hasegawa, T. Record low loss optical fiber with 0.1397 dB/km. In Proc. Optical Fiber Communication Conference (OFC) 2024 Tu2E.1 (Optica Publishing Group, 2024).
[3] Petrovich, M., Numkam Fokoua, E., Chen, Y., Sakr, H., Isa Adamu, A., Hassan, R., Wu, D., Fatobene Ando, R., Papadimopoulos, A., Sandoghchi, S., Jasion, G., & Poletti, F. Broadband optical fibre with an attenuation lower than 0.1 decibel per kilometre. Nat. Photon. (2025). https://doi.org/10.1038/s41566-025-01747-5
Useful Links:
The Deployment of Hollow Core Fiber (HCF) in Azure’s Network
How hollow core fiber is accelerating AI | Microsoft Azure Blog
Learn more about Microsoft global infrastructure

Azure virtual network terminal access point (TAP) public preview announcement
What is virtual network TAP?
Virtual network TAP allows customers to continuously stream virtual machine network traffic to a network packet collector or analytics tool. Many security and performance monitoring tools rely on packet-level insights that are difficult to access in cloud environments. Virtual network TAP bridges this gap by integrating with our industry partners to offer:
Enhanced security and threat detection: Security teams can inspect full packet data in real time to detect and respond to potential threats.
Performance monitoring and troubleshooting: Operations teams can analyze live traffic patterns to identify bottlenecks, troubleshoot latency issues, and optimize application performance.
Regulatory compliance: Organizations subject to compliance frameworks such as the Health Insurance Portability and Accountability Act (HIPAA) and the General Data Protection Regulation (GDPR) can use virtual network TAP to capture network activity for auditing and forensic investigations.
Why use virtual network TAP?
Unlike traditional packet capture solutions that require deploying additional agents or network appliances, virtual network TAP leverages Azure's native infrastructure to enable seamless traffic mirroring without complex configurations and without impacting the performance of the virtual machine. A key advantage is that mirrored traffic does not count towards the virtual machine’s network limits, ensuring complete visibility without compromising application performance. Additionally, virtual network TAP supports all Azure virtual machine SKUs.
Deploying virtual network TAP
The portal is a convenient way to get started with Azure virtual network TAP. However, if you have a lot of Azure resources and want to automate the setup, you may want to use PowerShell, the CLI, or the REST API. Add a TAP configuration on a network interface that is attached to a virtual machine deployed in your virtual network.
The destination is a virtual network IP address in the same virtual network as the monitored network interface or a peered virtual network. The collector solution for virtual network TAP can be deployed behind an Azure internal load balancer for high availability. You can use the same virtual network TAP resource to aggregate traffic from multiple network interfaces in the same or different subscriptions. If the monitored network interfaces are in different subscriptions, the subscriptions must be associated with the same Microsoft Entra tenant. Additionally, the monitored network interfaces and the destination endpoint for aggregating the TAP traffic can be in peered virtual networks in the same region.
Partnering with industry leaders to enhance network monitoring in Azure
To maximize the value of virtual network TAP, we are proud to collaborate with industry-leading security and network visibility partners. Our partners provide deep packet inspection, analytics, threat detection, and monitoring solutions that seamlessly integrate with virtual network TAP:
Network packet brokers
Partner: Product
Gigamon: GigaVUE Cloud Suite for Azure
Keysight: CloudLens
Security analytics, network/application performance management
Partner: Product
Darktrace: Darktrace /NETWORK
Netscout: Omnis Cyber Intelligence NDR
Corelight: Corelight Open NDR Platform
LinkShadow: LinkShadow NDR
Fortinet: FortiNDR Cloud, FortiGate VM
cPacket: cPacket Cloud Suite
TrendMicro: Trend Vision One™ Network Security
Extrahop: RevealX
Bitdefender: GravityZone Extended Detection and Response for Network
eSentire: eSentire MDR
Vectra: Vectra NDR
AttackFence: AttackFence NDR
Arista Networks: Arista NDR
See our partner blogs:
Bitdefender + Microsoft Virtual Network TAP: Deepening Visibility, Strengthening Security
Streamline Traffic Mirroring in the Cloud with Azure Virtual Network Terminal Access Point (TAP) and Keysight Visibility | Keysight Blogs
eSentire | Unlocking New Possibilities for Network Monitoring and…
LinkShadow Unified Identity, Data, and Network Platform Integrated with Microsoft Virtual Network TAP
Extrahop and Microsoft Extend Coverage for Azure Workloads
Resources | Announcing cPacket Partnership with Azure virtual network terminal access point (TAP)
Gain Network Traffic Visibility with FortiGate and Azure virtual network TAP
Get started with virtual network TAP
To learn more and get started, visit our website. We look forward to seeing how you leverage virtual network TAP to enhance security, performance, and compliance in your cloud environment. Stay tuned for more updates as we continue to refine and expand on our feature set! If you have any questions, please reach out to us at azurevnettap@microsoft.com.

Introducing Azure Gateway Load Balancer: Deploy and scale network virtual appliances with ease
Today, we are pleased to announce the preview of Gateway Load Balancer, a fully managed service enabling you to deploy, scale, and enhance the availability of third-party NVAs in Azure. You can add your favorite third-party appliance, whether it is a firewall, inline DDoS appliance, deep packet inspection system, or even your own custom appliance, into the network path transparently, all with a single click.

Announcing the General Availability of Azure Load Balancer Health Event Logs
Health event logs are now fully available in all public, Azure China, and Government regions under the Azure Monitor resource log category LoadBalancerHealthEvent, providing you with enhanced capabilities to monitor and troubleshoot your load balancer resources.
Health Event Types
As announced in our previous public preview blog, the following health events are now logged when detected by the Azure Load Balancer platform. These events are designed to address the most critical issues affecting your load balancer’s health and availability:
LoadBalancerHealthEventType: Scenario
DataPathAvailabilityWarning: Detect when the Data Path Availability metric of the frontend IP is less than 90% due to platform issues
DataPathAvailabilityCritical: Detect when the Data Path Availability metric of the frontend IP is less than 25% due to platform issues
NoHealthyBackends: Detect when all backend instances in a pool are not responding to the configured health probes
HighSnatPortUsage: Detect when a backend instance utilizes more than 75% of its allocated ports from a single frontend IP
SnatPortExhaustion: Detect when a backend instance has exhausted all allocated ports and will fail further outbound connections until ports have been released or more ports are allocated
Benefits of Using Health Event Logs
Health event logs provide deeper insights into the health of your load balancer, eliminating the need to set thresholds for metric-based alerts or manage complex metric data for historical analysis. Here’s how you can get started using these logs today:
Create Diagnostic Settings: Archive or analyze these logs for long-term insights.
Leverage Log Analytics: Use powerful querying capabilities to gain detailed insights.
Configure Alerts: Set up alerts to trigger actions based on the generated logs.
For more detailed instructions on how to enable and use health event logs, refer to our documentation here.
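To make the conditions listed under Health Event Types concrete, the following sketch maps sample readings to the event names above. It is only a model of the documented thresholds: the platform generates these events itself, and the function names, parameters, and the choice to return only the most severe matching event are assumptions made for illustration.

```python
from typing import Optional


def data_path_event(availability_pct: float) -> Optional[str]:
    """Event implied by a frontend IP's Data Path Availability (percent)."""
    # Assumption: report only the most severe condition that applies.
    if availability_pct < 25:
        return "DataPathAvailabilityCritical"
    if availability_pct < 90:
        return "DataPathAvailabilityWarning"
    return None


def snat_event(used_ports: int, allocated_ports: int) -> Optional[str]:
    """Event implied by one backend instance's SNAT usage on a frontend IP."""
    if allocated_ports <= 0:
        raise ValueError("allocated_ports must be positive")
    if used_ports >= allocated_ports:
        return "SnatPortExhaustion"
    if used_ports / allocated_ports > 0.75:
        return "HighSnatPortUsage"
    return None


def backend_event(healthy_instances: int) -> Optional[str]:
    """NoHealthyBackends applies when no instance answers the health probes."""
    return "NoHealthyBackends" if healthy_instances == 0 else None
```

Because the SNAT conditions are expressed relative to the allocated ports, they do not need retuning when the per-instance allocation changes.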
Contoso’s Story

Context: Contoso uses a Standard Public Load Balancer with outbound rules to connect their application to public APIs. They allocate 8k ports to each backend instance using an outbound rule, anticipating up to 8 backend instances in a pool.

Problem: Contoso is concerned about SNAT port exhaustion and wants to create alerts to warn them if backend instances are close to consuming all allocated SNAT ports.

Solution with metrics: Initially, they create an alert using the Used SNAT ports metric, triggering when the value exceeds 6k ports (out of 8k). However, this requires constant adjustment as they scale their infrastructure and update port allocation on outbound rules.

Solution with health event logs: With the new health event logs, Contoso configures two alerts:

HighSnatPortUsage: Sends an email and creates an incident whenever this event is generated, warning network engineers to allocate more SNAT ports.
SnatPortExhaustion: Notifies the on-call engineer immediately to address critical impact to outbound connectivity due to lack of SNAT ports.

Now, Contoso no longer needs to adjust alert rules as they scale, ensuring seamless monitoring and response.

What’s Next?

This general availability announcement marks a significant step in enhancing the health and monitoring capabilities of Azure Load Balancer. We are committed to expanding these capabilities with additional health event types, providing configuration guidance, best practices, and warnings for service-related limits. We welcome your feedback and look forward to hearing about your experiences with health event logs. Get started today by exploring our public documentation. Stay tuned on Azure Updates for future announcements and enhancements!

Troubleshoot health probe failures with Azure Load Balancer Health Status
In today's fast-paced cloud computing environment, maintaining the optimal performance and reliability of your applications is crucial. Azure Load Balancer's Health Status feature, now generally available to customers, significantly simplifies this task by providing detailed health information about your backend instances without the need to file a support ticket. This tool offers insights into the health state of each backend instance and the specific reasons behind their status, whether user-triggered or platform-triggered. By leveraging this feature, customers can proactively address issues, minimize downtime, and enhance the overall user experience, all while reducing reliance on support services.

What is Health Status?

Health Status is an Azure Load Balancer feature that gives you detailed health information about the backend instances connected to your Azure Load Balancer’s backend pool. Each status is linked to your load balancing rules and provides two key insights: the health state of each backend instance and the reasoning behind its state. The health state indicates whether your backend instance is healthy ("Up") or unhealthy ("Down"). The reasoning behind these states is explained through reason codes, which fall into two categories:

User Triggered Reason Codes are based on how you configured your load balancer setup and can be addressed by you.
Platform Triggered Reason Codes are based on the Azure Load Balancer platform and cannot be addressed by you.

For more information about the different reason codes, view our public documentation.

Why use Health Status?

In the past, customers were not provided with insights into why their backend instances were deemed healthy or unhealthy.
To access this crucial information, customers often had to follow troubleshooting procedures such as taking packet captures or creating a support ticket and relying on support engineers to identify the cause of a failed health probe. This process was complex and time-consuming, and it also incurred additional costs and added significant management overhead. Now, with the Health Status feature, customers can easily access real-time health information about their backend instances. This empowers them to make swift and informed decisions, minimizing downtime, reducing support costs, and enhancing the overall user experience. By leveraging these insights, customers can proactively manage their environment and ensure optimal performance.

Retrieving Health Status

Health Status can be retrieved on a per load balancing rule basis. To retrieve Health Status:

1. Sign in to the Azure Portal and search for "Load balancers".
2. Select your load balancer and navigate to "Load balancing rules" under Settings.
3. View the health status of a rule by clicking the “View details” value of that rule. Use the Refresh button to get the latest status.

Figure 1: Sample Health Status in Azure Portal

Contoso's Utilization of Health Status for Game Server Maintenance

Let’s explore how one of our customers, Contoso, uses the Health Status feature for efficient decision-making and troubleshooting.

Who is Contoso and what is their issue?

Contoso, a prominent name in the gaming industry, has been leveraging Azure Load Balancer to distribute traffic to their highly popular game server hosted on Azure Virtual Machine Scale Sets. Their users love Contoso’s servers because of their reliability and performance. Recently, Contoso encountered an issue where one of their game servers became unhealthy, leading to disruptions in the gaming experience for their users.
How Health Status resolved their issue

Thanks to the Azure Load Balancer Health Status feature, the Contoso team was able to quickly navigate to the Load balancing rules page in the Azure Portal and view the health status of the unhealthy virtual machine instance. By doing so, they retrieved detailed insights into why their game server was marked unhealthy. This real-time information highlighted that “the backend instance was unhealthy due to Admin State set to Down”. Armed with this data, Contoso's Network team swiftly addressed the configuration issue by setting the Admin State value of the unhealthy server to “None”, thereby restoring the server to a healthy state. A root cause analysis determined that an engineer had mistakenly set the wrong server's Admin State to Down while working on another server.

Benefits of using Health Status

Instead of creating a support ticket and waiting for assistance, Contoso used the Health Status feature to diagnose and resolve the problem independently. This approach minimized downtime, reduced support costs, and enhanced the overall user experience.

Conclusion

By incorporating the Health Status feature into their operational workflow, Contoso has been able to make efficient, data-driven decisions and troubleshoot issues promptly, ensuring their gaming services remain robust and reliable for their users.

Get Started

We are excited to bring the Azure Load Balancer’s Health Status feature to you. This feature provides valuable insights into the health of your backend instances, helping you troubleshoot faster and keep your applications performant and reliable. For more information and to get started, visit the following links:

Overview of health status concepts
How to retrieve health status

We hope you can take advantage of this feature, and we welcome your feedback. Please feel free to leave a comment below.
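As a rough illustration of the triage in the Contoso example, the Python sketch below models one health status entry and suggests a next step based on the two reason code categories described earlier. Everything here is hypothetical: `BackendHealth`, `ReasonCategory`, `next_step`, and the instance name are invented for illustration, the `reason` field is free text rather than an actual reason code, and treating "Admin State set to Down" as user-triggered is an assumption based on Contoso fixing it through configuration.

```python
from dataclasses import dataclass
from enum import Enum


class ReasonCategory(Enum):
    """The two reason code categories named in the article."""
    USER_TRIGGERED = "User Triggered"
    PLATFORM_TRIGGERED = "Platform Triggered"


@dataclass(frozen=True)
class BackendHealth:
    """Hypothetical model of one backend instance's health status."""
    instance: str             # illustrative instance name
    state: str                # "Up" or "Down"
    category: ReasonCategory  # who can address the reason
    reason: str               # free-text reason, not a real reason code


def next_step(health: BackendHealth) -> str:
    """Suggest a next step for a health status entry."""
    if health.state == "Up":
        return "No action needed"
    if health.category is ReasonCategory.USER_TRIGGERED:
        # User-triggered reasons stem from your own configuration.
        return f"Review your configuration: {health.reason}"
    # Platform-triggered reasons cannot be addressed by the customer.
    return "Platform-triggered; cannot be addressed through your configuration"


if __name__ == "__main__":
    contoso_server = BackendHealth(
        instance="game-server-1",
        state="Down",
        category=ReasonCategory.USER_TRIGGERED,  # assumption of this sketch
        reason="Admin State set to Down",
    )
    print(next_step(contoso_server))
```

The point of splitting on the category first is that it tells you immediately whether a fix is within your control, which is the decision Contoso's team made before changing the Admin State.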