This document outlines the process of upgrading the ExpressRoute gateway to a zone-redundant SKU and its public IP to a Standard SKU, which improves reliability, performance, and security. It highlights the importance of following ITSM guidelines, including change management protocols, scheduling the work in a maintenance window, and thorough testing.
Objective
The objective of this document is to help with transitioning the ExpressRoute gateway from a non-zone-redundant SKU to a zone-redundant SKU. This upgrade enhances the reliability and availability of the gateway by ensuring that it is resilient to zone failures. Additionally, the public IP associated with the gateway will be upgraded from a Basic SKU to a Standard SKU. This upgrade provides improved performance, security features, and availability guarantees.
The entire migration should be conducted in accordance with IT Service Management (ITSM) guidelines, ensuring that all best practices and standards are followed. Change management protocols should be strictly adhered to, including obtaining necessary approvals, documenting the change, and communicating with stakeholders. Pre-migration and post-migration testing should be performed to validate the success of the migration and to ensure that there are no disruptions to services.
The migration should be scheduled within a planned maintenance window to minimize impact on users and services. This window should be carefully selected to ensure that it aligns with business requirements and minimizes downtime. Throughout the process, detailed monitoring and logging should be in place to track progress and quickly address any issues that may arise.
[Figure: Single-zone ExpressRoute Gateway]
[Figure: Zone-redundant ExpressRoute Gateway]
Background
- The ExpressRoute Gateway Standard SKU is non-zone-redundant, which lowers the resiliency of the service.
- The Basic SKU public IP is scheduled for retirement at the end of September 2025. After this date, support for this SKU ends, which could affect support for the ExpressRoute Gateway.
- The ExpressRoute Gateway public IP is used internally for control plane communication.
Migration Scenarios
This document is equally relevant to all of the following scenarios:
- ExpressRoute Gateway Standard/High/UltraPerformance to ErGw1Az/ErGw2Az/ErGw3Az SKU
- ExpressRoute Gateway Standard/High/UltraPerformance to Standard/High/UltraPerformance (Multi-Zone) SKU
- Single-zone and multi-zone regions
- Zone-redundant SKU (ErGw1Az/ErGw2Az/ErGw3Az) deployed in a single zone
Prerequisites
- Stakeholder Approvals: Ensure ITSM approvals are in place. This is to ensure that changes to IT systems are properly reviewed and authorized before implementation.
- Change Request (CR): Submit and secure approval for a Change Request to guarantee that all modifications to IT systems are thoroughly reviewed, authorized, and implemented in a controlled manner.
- Maintenance Window: When scheduling a maintenance window for production work, consider the following to minimize disruption and ensure efficiency:
  - Key Considerations
    - Minimizing Disruption: Schedule during low activity periods, often outside standard business hours or on weekends.
    - Ensuring Adequate Staffing: Ensure necessary staff and resources are available, including technical support.
    - Aligning with Production Cycles: Coordinate with departments to align with production cycles.
  - Best Practices
    - Preventive and Predictive Maintenance: Focus on regular inspections, part replacements, and system upgrades.
    - Effective Communication: Inform stakeholders in advance about the maintenance schedule.
    - Proper Planning: Use historical data and insights to identify the best time slots for maintenance.
- Backup Plan: Document rollback or roll-forward procedures in case of failure. The following are some important considerations:
  - Minimizing Disruption: A backup plan minimizes disruptions during planned maintenance, especially for VMs that may shut down or reboot.
  - Ensuring Data Integrity: It protects against data loss or corruption by backing up critical data beforehand.
  - Facilitating Quick Recovery: It allows for quick recovery if issues arise, maintaining business continuity and minimizing downtime.
  - Current Configuration Backup: Back up the properties of the ExpressRoute Gateway, the ExpressRoute Gateway connection, and the route table associated with the gateway (if any).
Here are the PowerShell commands that can be used to back up the ExpressRoute Gateway configuration.
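As a complement to those commands, the same export can also be scripted. The following is a minimal Python sketch, not the original listing. It assumes that the Azure CLI (`az`) is installed and signed in, and that the gateway, its connection, and the optional route table can be read with `az network vnet-gateway show`, `az network vpn-connection show`, and `az network route-table show`. The function names, file names, and resource names are hypothetical.

```python
import json
import subprocess
from pathlib import Path


def build_backup_commands(resource_group, gateway, connection, route_table=None):
    """Return {output file name: az CLI argument list} for each resource to export."""
    commands = {
        "gateway.json": ["az", "network", "vnet-gateway", "show",
                         "--resource-group", resource_group, "--name", gateway],
        "connection.json": ["az", "network", "vpn-connection", "show",
                            "--resource-group", resource_group, "--name", connection],
    }
    if route_table:  # a route table on the GatewaySubnet is optional
        commands["route-table.json"] = ["az", "network", "route-table", "show",
                                        "--resource-group", resource_group,
                                        "--name", route_table]
    return commands


def run_backup(commands, out_dir):
    """Run each command and save its JSON output under out_dir.

    Assumption: `az` is on PATH and already logged in. On Windows the
    executable may need to be given as `az.cmd`.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for file_name, argv in commands.items():
        result = subprocess.run(argv + ["--output", "json"],
                                capture_output=True, text=True, check=True)
        json.loads(result.stdout)  # fail early if the output is not valid JSON
        (out / file_name).write_text(result.stdout, encoding="utf-8")
```

A typical call would be `run_backup(build_backup_commands("<RG>", "<GatewayName>", "<ConnectionName>"), "er-backup")`, run before the maintenance window so the files are available for the rollback plan.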
- Review the gateway migration article.
- Be ready to open a Microsoft support ticket (optional/proactive): In corner cases where the migration encounters a blocker, have the necessary details ready to open a Microsoft support ticket. In the ticket, provide the maintenance plan to the support engineer and make sure they are fully informed about your environment-specific configuration.
Pre-Migration
Testing
- Connectivity Tests: Run network reachability tests to validate the current state. Sample tests include the following:
  - ICMP test: Ping an Azure virtual machine from an on-premises virtual machine to test basic connectivity.
    $ ping <Azure-Virtual-Machine-IP>
  - Application access test: Access your workload application from on-premises to a service running in Azure. The exact test depends on the customer application. For example, if it is a web application, access the web server from a browser on a laptop or an on-premises machine.
  - Latency and throughput tests: You can use the Azure Connectivity Toolkit (ACT) to test latency and throughput. For installation details, see "Troubleshoot network link performance: Azure ExpressRoute" on Microsoft Learn.
    $ Get-LinkPerformance -RemoteHost 10.0.0.1 -TestSeconds 10
  - Jitter and packet loss tests: You can use the following tools.
    PSPing: psping -l 1024 -n 100 <Azure_VM_IP>:443
    PathPing: pathping <Azure_VM_IP>
  Capture the results of the tests above so that you can compare them after the migration.
  “iperf” is another tool widely used for throughput and latency testing.
  A web-based latency tool works as well: https://www.azurespeed.com/
- Test the whole ExpressRoute Gateway migration process in a lower environment (optional): That is, migrate an ExpressRoute Gateway in a non-production environment first.
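To make the captured results easy to compare later, the measurements can be reduced to a few numbers and stored in a file. The following is a minimal Python sketch under stated assumptions: it times TCP handshakes instead of using PSPing or ACT, it defines jitter as the mean absolute difference between consecutive successful probes, and all function names and the output file are hypothetical.

```python
import json
import socket
import statistics
import time


def tcp_connect_ms(host, port, timeout=2.0):
    """Time one TCP handshake in milliseconds; return None if it fails."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        return None


def summarize(samples):
    """Summarize probe results; None entries count as lost probes."""
    ok = [s for s in samples if s is not None]
    loss_pct = 100.0 * (len(samples) - len(ok)) / len(samples) if samples else 0.0
    # Jitter: mean absolute difference between consecutive successful probes.
    deltas = [abs(b - a) for a, b in zip(ok, ok[1:])]
    return {
        "sent": len(samples),
        "loss_pct": round(loss_pct, 2),
        "avg_ms": round(statistics.mean(ok), 2) if ok else None,
        "max_ms": round(max(ok), 2) if ok else None,
        "jitter_ms": round(statistics.mean(deltas), 2) if deltas else 0.0,
    }


def save_summary(summary, path):
    """Write the summary as JSON so it can be reloaded after the migration."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(summary, handle, indent=2)
```

For example, `save_summary(summarize([tcp_connect_ms("<Azure_VM_IP>", 443) for _ in range(100)]), "baseline.json")` records a baseline from an on-premises machine. Running the same probe after the migration gives a like-for-like comparison.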
Advance Notification
Send an email to the relevant stakeholders and impacted users/teams a few weeks in advance.
Send a final notification to the same group a day before.
Stop IOs on hybrid private endpoint
Using private endpoints in Azure over a hybrid connection with ExpressRoute provides a secure, reliable, and high-performance connection to Azure services. By leveraging ExpressRoute's private peering and connectivity models, you can ensure that your traffic remains within the Microsoft global network, avoiding public internet exposure. This setup is ideal for scenarios requiring high security, consistent performance, and seamless integration between on-premises and Azure environments.
Private endpoints (PEs) in a virtual network connected over ExpressRoute private peering might experience a connectivity outage during migration.
To avoid impact on workloads, stop all IOs over hybrid private endpoints before the migration starts.
Validate that you have enough IP addresses for migration
To proceed with the migration, the GatewaySubnet requires a /27 or larger address range (that is, a prefix length of 27 or shorter, such as /27 or /26). The migration feature checks for enough address space during the validation phase.
If there aren’t enough IP addresses available to create the zone-redundant ExpressRoute Gateway, the gateway migration script adds an additional prefix to the subnet. As a user, you don’t have to take any action. The migration feature tells you if it needs more IP addresses.
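The address space check can also be done by hand before the maintenance window. The following is a minimal Python sketch that takes the GatewaySubnet address prefixes as CIDR strings and reports whether at least one IPv4 prefix is /27 or larger. The threshold constant and function name are assumptions for illustration; the validation phase of the migration feature remains the authoritative check.

```python
import ipaddress

# Assumption: /27 or larger, following the guidance in this section.
REQUIRED_PREFIX_LENGTH = 27


def gateway_subnet_report(prefixes, required=REQUIRED_PREFIX_LENGTH):
    """Return (ok, details) for a list of GatewaySubnet prefixes in CIDR form."""
    details = []
    for prefix in prefixes:
        network = ipaddress.ip_network(prefix, strict=True)
        details.append({
            "prefix": str(network),
            "prefix_length": network.prefixlen,
            "addresses": network.num_addresses,
            # Only IPv4 prefixes are compared against the /27 threshold.
            "large_enough": network.version == 4 and network.prefixlen <= required,
        })
    ok = any(item["large_enough"] for item in details)
    return ok, details
```

For example, `gateway_subnet_report(["10.0.255.0/28"])` reports the subnet as too small, while adding a second prefix such as `10.0.254.0/26` makes the check pass.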
Migration Steps
Migration using Azure portal
Step 1: Test connectivity from on-premises to Azure through the ExpressRoute Gateway. Refer to Step 7 for the tests.
Step 2: Verify that the Microsoft Azure support engineer is on standby.
Step 3: Send an email to notify users about the start of the planned connectivity outage.
Step 4: Stop or minimize IOs over the ExpressRoute circuit (downtime). Minimizing the IOs will reduce the impact.
Step 5: Follow the gateway migration article to migrate the ExpressRoute gateway using the Azure portal.
Step 6: Restart IOs over the ExpressRoute circuit.
Step 7: Validate and test post-migration connectivity.
- Verify BGP Peering:
  $ckt = Get-AzExpressRouteCircuit -ResourceGroupName <RG> -Name <CircuitName>
  Get-AzExpressRouteCircuitPeeringConfig -Name AzurePrivatePeering -ExpressRouteCircuit $ckt
- Route Propagation Check:
  Get-AzExpressRouteCircuitRouteTable -ResourceGroupName <RG> -ExpressRouteCircuitName <CircuitName> -PeeringType AzurePrivatePeering -DevicePath Primary
- Connectivity Tests: Run network reachability tests to validate the current state. Sample tests include the following:
  - ICMP test: Ping an Azure virtual machine from an on-premises virtual machine to test basic connectivity.
    $ ping <Azure-Virtual-Machine-IP>
  - Application access test: Access your workload application from on-premises to a service running in Azure. The exact test depends on the customer application. For example, if it is a web application, access the web server from a browser on a laptop or an on-premises machine.
  - Latency and throughput tests: You can use the Azure Connectivity Toolkit (ACT) to test latency and throughput. For installation details, see "Troubleshoot network link performance: Azure ExpressRoute" on Microsoft Learn.
    $ Get-LinkPerformance -RemoteHost 10.0.0.1 -TestSeconds 10
  - Jitter and packet loss tests: You can use the following tools.
    PSPing: psping -l 1024 -n 100 <Azure_VM_IP>:443
    PathPing: pathping <Azure_VM_IP>
  Compare the new results with the ones captured before the migration.
- Validate that the migration is successful: Confirm that the ExpressRoute Gateway is now on the new SKU.
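The comparison of pre-migration and post-migration results can be made repeatable with a small script. The following is a minimal Python sketch that flags metrics that got worse by more than a tolerance. The metric names, the 20% default tolerance, and the set of higher-is-better metrics are assumptions for illustration, not values taken from Microsoft guidance.

```python
# Hypothetical metric name for which a larger value is the better result.
HIGHER_IS_BETTER = {"throughput_mbps"}


def find_regressions(baseline, current, tolerance_pct=20.0):
    """Return {metric: (before, after)} for metrics worse than baseline by more than tolerance_pct."""
    regressions = {}
    factor = tolerance_pct / 100.0
    for metric, before in baseline.items():
        after = current.get(metric)
        # Skip metrics that are missing or not numeric in either result set.
        if not isinstance(before, (int, float)) or not isinstance(after, (int, float)):
            continue
        if metric in HIGHER_IS_BETTER:
            worse = after < before * (1 - factor)
        else:
            worse = after > before * (1 + factor)
        if worse:
            regressions[metric] = (before, after)
    return regressions
```

An empty result means that every shared metric stayed within the tolerance. Any entry in the result is a candidate for investigation before the change request is closed.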
Migration using PowerShell
Step 1: Test connectivity from on-premises to Azure through the ExpressRoute Gateway. Refer to Step 7 for the tests.
Step 2: Verify that the Microsoft Azure support engineer is on standby.
Step 3: Send an email to notify users about the start of the planned connectivity outage.
Step 4: Stop or minimize IOs over the ExpressRoute circuit (downtime). Minimizing the IOs will reduce the impact.
Step 5: Follow the gateway migration article to migrate the ExpressRoute gateway using PowerShell.
Step 6: Restart IOs over the ExpressRoute circuit.
Step 7: Validate and test post-migration connectivity.
- Verify BGP Peering:
  $ckt = Get-AzExpressRouteCircuit -ResourceGroupName <RG> -Name <CircuitName>
  Get-AzExpressRouteCircuitPeeringConfig -Name AzurePrivatePeering -ExpressRouteCircuit $ckt
- Route Propagation Check:
  Get-AzExpressRouteCircuitRouteTable -ResourceGroupName <RG> -ExpressRouteCircuitName <CircuitName> -PeeringType AzurePrivatePeering -DevicePath Primary
- Connectivity Tests: Run network reachability tests to validate the current state. Sample tests include the following:
  - ICMP test: Ping an Azure virtual machine from an on-premises virtual machine to test basic connectivity.
    $ ping <Azure-Virtual-Machine-IP>
  - Application access test: Access your workload application from on-premises to a service running in Azure. The exact test depends on the customer application. For example, if it is a web application, access the web server from a browser on a laptop or an on-premises machine.
  - Latency and throughput tests: You can use the Azure Connectivity Toolkit (ACT) to test latency and throughput. For installation details, see "Troubleshoot network link performance: Azure ExpressRoute" on Microsoft Learn.
    $ Get-LinkPerformance -RemoteHost 10.0.0.1 -TestSeconds 10
  - Jitter and packet loss tests: You can use the following tools.
    PSPing: psping -l 1024 -n 100 <Azure_VM_IP>:443
    PathPing: pathping <Azure_VM_IP>
  Compare the new results with the ones captured before the migration.
- Validate that the migration is successful: Confirm that the ExpressRoute Gateway is now on the new SKU.
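The final SKU validation can be expressed as a check on the exported resource properties. The following is a minimal Python sketch that assumes the gateway and its public IP have been exported to JSON (for example with `az network vnet-gateway show` and `az network public-ip show`) and that the parsed objects expose `sku.name` and `provisioningState`. The function name and the exact field layout are assumptions to verify against your own export.

```python
def check_migration(gateway, public_ip, expected_gateway_sku):
    """Return a list of problems found in the parsed gateway and public IP JSON."""
    problems = []

    gateway_sku = (gateway.get("sku") or {}).get("name", "")
    if gateway_sku.lower() != expected_gateway_sku.lower():
        problems.append(
            f"gateway SKU is {gateway_sku!r}, expected {expected_gateway_sku!r}")

    state = gateway.get("provisioningState")
    if state != "Succeeded":
        problems.append(f"gateway provisioning state is {state!r}")

    ip_sku = (public_ip.get("sku") or {}).get("name", "")
    if ip_sku.lower() != "standard":
        problems.append(f"public IP SKU is {ip_sku!r}, expected 'Standard'")

    return problems
```

The expected gateway SKU is passed in because the target differs between the migration scenarios listed earlier, for example `check_migration(gateway, public_ip, "ErGw1Az")`. An empty list means that the three checks passed.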
Rollback Plan
If any issue arises during the migration, work with the Microsoft support engineer to:
- Restore Previous Gateway: Use the backed-up configuration to either restore the original gateway or create a new one, based on guidance from the support engineer.
- Validate Connectivity: Perform on-premises to Azure connectivity testing as described in Step 7 above.
Post-Migration Steps
- Update Change Request: Document and close the CR.
- Update CMDB: Reflect the new gateway details in the Configuration Management Database.
- Stakeholder Sign-off: Ensure all teams validate and approve the changes.
Contact Information
- Network Team:
- Azure Support: Azure Support Portal