Azure VMware Solution
Azure VMware Solution Performance Design Considerations
Azure VMware Solution Design Series Availability Design Considerations Recoverability Design Considerations Performance Design Considerations Security Design Considerations VMware HCX Design with Azure VMware Solution Overview A global enterprise wants to migrate thousands of VMware vSphere virtual machines (VMs) to Microsoft Azure as part of their application modernization strategy. The first step is to exit their on-premises data centers and rapidly relocate their legacy application VMs to the Azure VMware Solution as a staging area for the first phase of their modernization strategy. What should the Azure VMware Solution look like? Azure VMware Solutionis a VMware validated first party Azure service from Microsoft that provides private clouds containing VMware vSphere clusters built from dedicated bare-metal Azure infrastructure. It enables customers to leverage their existing investments in VMware skills and tools, allowing them to focus on developing and running their VMware-based workloads on Azure. In this post, I will introduce the typical customer workload performance requirements, describe the Azure VMware Solution architectural components, and describe the performance design considerations for Azure VMware Solution private clouds. In the next section, I will introduce the typical performance requirements of a customer’s workload. Customer Workload Requirements A typical customer has multiple application tiers that have specific Service Level Agreement (SLA) requirements that need to be met. These SLAs are normally named by a tiering system such as Platinum, Gold, Silver, and Bronze or Mission-Critical, Business-Critical, Production, and Test/Dev. Each SLA will have different availability, recoverability, performance, manageability, and security requirements that need to be met. For the performance design quality, customers will normally have CPU, RAM, Storage and Network requirements. 
This is normally documented for each application and then aggregated into the total performance requirements for each SLA. For example:

SLA Name | CPU | RAM | Storage | Network
Gold | Low vCPU:pCore ratio (<1 to 2), Low VM to Host ratio (1-8) | No RAM oversubscription (<=1) | High Throughput or High IOPS (for a particular I/O size), Low Latency | High Throughput, Low Latency
Silver | Medium vCPU:pCore ratio (3 to 10), Medium VM to Host ratio (9-15) | Medium RAM oversubscription ratio (1.1-1.4) | Medium Latency | Medium Latency
Bronze | High vCPU:pCore ratio (10-15), High VM to Host ratio (16+) | High RAM oversubscription ratio (1.5-2.5) | High Latency | High Latency

Table 1 – Typical Customer SLA requirements for Performance

The performance concepts introduced in Table 1 have the following dimensions:
CPU: CPU model and speed (this can be important for legacy single threaded applications), number of cores, vCPU to physical core ratios, CPU Ready times.
Memory: Random Access Memory size, Input/Output (I/O) speed and latency, oversubscription ratios.
Storage: Capacity, Read/Write Input/Output per Second (IOPS) with Input/Output (I/O) size, Read/Write Throughput, Read/Write Input/Output Latency.
Network: In/Out Speed, Network Latency (Round Trip Time).

A typical legacy business-critical application will have the following application architecture:
Load Balancer layer: Uses load balancers to distribute traffic across multiple web servers in the web layer to improve application availability.
Web layer: Uses web servers to process client requests made via the secure Hypertext Transfer Protocol (HTTPS). Receives traffic from the load balancer layer and forwards to the application layer.
Application layer: Uses application servers to run software that delivers a business application through a communication protocol. Receives traffic from the web layer and uses the database layer to access stored data.
Database layer: Uses a relational database management system (RDBMS) cluster to store data and provide database services to the application layer.

The application can also be classified as OLTP or OLAP, which have the following characteristics:
Online Transaction Processing (OLTP) is a type of data processing that consists of executing several transactions occurring concurrently. For example, online banking, retail shopping, or sending text messages. OLTP systems tend to have a performance profile that is latency sensitive, with choppy CPU demands and small amounts of data being read and written.
Online Analytical Processing (OLAP) is a technology that organizes large business databases and supports complex analysis. It can be used to perform complex analytical queries without negatively impacting transactional systems (OLTP). For example, data warehouse systems, business performance analysis, or marketing analysis. OLAP systems tend to have a performance profile that is latency tolerant, requires large amounts of storage for records processing, and has a steady state of CPU, RAM and storage throughput.

Depending upon the performance requirements for each service, infrastructure design could be a mix of technologies used to meet the different performance SLAs with cost efficiency.

Figure 1 – Typical Legacy Business-Critical Application Architecture

In the next section, I will introduce the architectural components of the Azure VMware Solution.

Architectural Components

The diagram below describes the architectural components of the Azure VMware Solution.

Figure 2 – Azure VMware Solution Architectural Components

Each Azure VMware Solution architectural component has the following function:
Azure Subscription: Used to provide controlled access, budget, and quota management for the Azure VMware Solution.
Azure Region: Physical locations around the world where we group data centers into Availability Zones (AZs) and then group AZs into regions.
Azure Resource Group: Container used to place Azure services and resources into logical groups.
Azure VMware Solution Private Cloud: Uses VMware software, including vCenter Server, NSX software-defined networking, vSAN software-defined storage, and Azure bare-metal ESXi hosts to provide compute, networking, and storage resources. Azure NetApp Files, Azure Elastic SAN, and Pure Cloud Block Store are also supported.
Azure VMware Solution Resource Cluster: Uses VMware software, including vSAN software-defined storage, and Azure bare-metal ESXi hosts to provide compute, networking, and storage resources for customer workloads by scaling out the Azure VMware Solution private cloud. Azure NetApp Files, Azure Elastic SAN, and Pure Cloud Block Store are also supported.
VMware HCX: Provides mobility, migration, and network extension services.
VMware Site Recovery: Provides Disaster Recovery automation and storage replication services with VMware vSphere Replication. Third party Disaster Recovery solutions Zerto Disaster Recovery and JetStream Software Disaster Recovery are also supported.
Dedicated Microsoft Enterprise Edge (D-MSEE): Router that provides connectivity between Azure cloud and the Azure VMware Solution private cloud instance.
Azure Virtual Network (VNet): Private network used to connect Azure services and resources together.
Azure Route Server: Enables network appliances to exchange dynamic route information with Azure networks.
Azure Virtual Network Gateway: Cross premises gateway for connecting Azure services and resources to other private networks using IPSec VPN, ExpressRoute, and VNet to VNet.
Azure ExpressRoute: Provides high-speed private connections between Azure data centers and on-premises or colocation infrastructure.
Azure Virtual WAN (vWAN): Aggregates networking, security, and routing functions together into a single unified Wide Area Network (WAN).

In the next section, I will describe the performance design considerations for the Azure VMware Solution.
Performance Design Considerations

The architectural design process takes the business problem to be solved and the business goals to be achieved and distills these into customer requirements, design constraints and assumptions. Design constraints can be characterized by the following three categories:
Laws of the Land – data and application sovereignty, governance, regulatory, compliance, etc.
Laws of Physics – data and machine gravity, network latency, etc.
Laws of Economics – owning versus renting, total cost of ownership (TCO), return on investment (ROI), capital expenditure, operational expenditure, earnings before interest, taxes, depreciation, and amortization (EBITDA), etc.

Each design consideration will be a trade-off between availability, recoverability, performance, manageability, and security design qualities. The desired result is to deliver business value with the minimum of risk by working backwards from the customer problem.

Design Consideration 1 – Azure Region: Azure VMware Solution is available in 30 Azure Regions around the world (US Government has 2 additional Azure Regions). Select the relevant Azure Regions that meet your geographic requirements. These locations will typically be driven by your design constraints and the required Azure services that will be dependent upon the Azure VMware Solution. For highest throughput and lowest network latency, the Azure VMware Solution and dependent Azure services such as third-party backup/recovery and Azure NetApp Files volumes should be placed in the same Availability Zone in an Azure Region. Unfortunately, the Azure VMware Solution does not have a Placement Policy Group feature to allow Azure services to be automatically deployed in the same Availability Zone. You can open a ticket with Microsoft to configure a Special Placement Policy to deploy your Azure VMware Solution private cloud to a particular AZ to ensure that your Azure services are placed as closely together as possible.
In addition, the proximity of the Azure Region to the remote users and applications consuming the service should also be considered for network latency and throughput.

Figure 3 – Azure VMware Solution Availability Zone Placement for Performance

Design Consideration 2 – SKU type: Table 2 lists the three SKU types that can be selected for provisioning an Azure VMware Solution private cloud. Depending upon the workload performance requirements, the AV36 and AV36P nodes can be used for general purpose compute and the AV52 nodes can be used for compute intensive and storage heavy workloads. The AV36 SKU is widely available in most Azure regions and the AV36P and AV52 SKUs are limited to certain Azure regions. Azure VMware Solution does not support mixing different SKU types within a private cloud (AV64 SKU is the exception). You can check Azure VMware Solution SKU availability by Azure Region here. The AV64 SKU is currently only available for mixed SKU deployments in certain regions.

Figure 4 – AV64 Mixed SKU Topology

Currently, Azure VMware Solution does not have SKUs that support GPU hardware. The Azure VMware Solution does not natively support Auto-Scale; however, you can use this Auto-Scale function instead. For more information, refer to SKU types.
SKU Type | Purpose | CPU (Cores/GHz) | RAM (GB) | vSAN Cache Tier (TB, raw) | vSAN Capacity Tier (TB, raw) | Network Interface Cards
AV36 | General Purpose Compute | Dual Intel Xeon Gold 6140 CPUs (Skylake microarchitecture) with 18 cores/CPU @ 2.3 GHz, Total 36 physical cores (72 logical cores with hyperthreading) | 576 | 3.2 (NVMe) | 15.20 (SSD) | 4x 25 Gb/s NICs (2 for management & control plane, 2 for customer traffic)
AV36P | General Purpose Compute | Dual Intel Xeon Gold 6240 CPUs (Cascade Lake microarchitecture) with 18 cores/CPU @ 2.6 GHz / 3.9 GHz Turbo, Total 36 physical cores (72 logical cores with hyperthreading) | 768 | 1.5 (Intel Cache) | 19.20 (NVMe) | 4x 25 Gb/s NICs (2 for management & control plane, 2 for customer traffic)
AV52 | Compute/Storage heavy workloads | Dual Intel Xeon Platinum 8270 CPUs (Cascade Lake microarchitecture) with 26 cores/CPU @ 2.7 GHz / 4.0 GHz Turbo, Total 52 physical cores (104 logical cores with hyperthreading) | 1,536 | 1.5 (Intel Cache) | 38.40 (NVMe) | 4x 25 Gb/s NICs (2 for management & control plane, 2 for customer traffic)
AV64 | General Purpose Compute | Dual Intel Xeon Platinum 8370C CPUs (Ice Lake microarchitecture) with 32 cores/CPU @ 2.8 GHz / 3.5 GHz Turbo, Total 64 physical cores (128 logical cores with hyperthreading) | 1,024 | 3.84 (NVMe) | 15.36 (NVMe) | 1x 100 Gb/s

Table 2 – Azure VMware Solution SKUs

Design Consideration 3 – Deployment topology: Select the Azure VMware Solution topology that best matches the performance requirements of your SLAs. For very large deployments, it may make sense to have separate private clouds dedicated to each SLA for optimum performance. The Azure VMware Solution supports a maximum of 12 clusters per private cloud. Each cluster supports a minimum of 3 hosts and a maximum of 16 hosts. Each private cloud supports a maximum of 96 hosts. VMware vCenter Server, VMware HCX Manager, VMware SRM and VMware vSphere Replication Manager are individual appliances that run in Cluster-1.
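The cluster and host limits quoted above lend themselves to a quick pre-check of a proposed layout. The following is a minimal sketch in Python, not an official tool: the function name `validate_private_cloud` and its input shape (a list of host counts, one entry per cluster) are hypothetical, and the limits are simply the figures stated in this post, which should be verified against the current service limits.

```python
# Limits as quoted in this post; confirm against the current service limits.
MAX_CLUSTERS_PER_PRIVATE_CLOUD = 12
MIN_HOSTS_PER_CLUSTER = 3
MAX_HOSTS_PER_CLUSTER = 16
MAX_HOSTS_PER_PRIVATE_CLOUD = 96


def validate_private_cloud(hosts_per_cluster: list[int]) -> list[str]:
    """Return a list of limit violations; an empty list means the layout fits.

    hosts_per_cluster is a hypothetical input: one host count per cluster,
    with the first entry representing Cluster-1.
    """
    problems = []
    if len(hosts_per_cluster) > MAX_CLUSTERS_PER_PRIVATE_CLOUD:
        problems.append(
            f"{len(hosts_per_cluster)} clusters exceeds the maximum of "
            f"{MAX_CLUSTERS_PER_PRIVATE_CLOUD}"
        )
    for index, hosts in enumerate(hosts_per_cluster, start=1):
        if hosts < MIN_HOSTS_PER_CLUSTER:
            problems.append(f"Cluster-{index} has {hosts} hosts, below the minimum")
        elif hosts > MAX_HOSTS_PER_CLUSTER:
            problems.append(f"Cluster-{index} has {hosts} hosts, above the maximum")
    total_hosts = sum(hosts_per_cluster)
    if total_hosts > MAX_HOSTS_PER_PRIVATE_CLOUD:
        problems.append(
            f"{total_hosts} hosts exceeds the maximum of "
            f"{MAX_HOSTS_PER_PRIVATE_CLOUD} per private cloud"
        )
    return problems


if __name__ == "__main__":
    # Seven full clusters pass the per-cluster checks but break the host total.
    for problem in validate_private_cloud([16] * 7):
        print(problem)
```

Note that the per-private-cloud host limit is reached before the cluster limit when clusters are full: six clusters of 16 hosts already total 96 hosts.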
VMware NSX Manager is a cluster of 3 unified appliances that have a VM-VM anti-affinity placement policy to spread them across the hosts of the cluster. The VMware NSX Edge cluster is a pair of appliances that also use a VM-VM anti-affinity placement policy. All northbound customer traffic traverses the NSX Edge cluster. All vSAN storage traffic traverses the VLAN-backed Portgroup of the Management vSphere Distributed Switch, which is part of the management and control plane. The management and control plane cluster (Cluster-1) can be shared with customer workload VMs or be a dedicated cluster for management and control, including customer enterprise services, such as Active Directory, DNS, & DHCP. Additional resource clusters can be added to support customer demand. This also includes the option of using dedicated clusters for each customer SLA. Topology 1 – Mixed:Run mixed SLA workloads in each cluster of the Azure VMware Solution private cloud. Figure 5 – Azure VMware Solution Mixed Workloads Topology Topology 2 – Dedicated Clusters:Use separate clusters for each SLA in the Azure VMware Solution private cloud. Figure 6 – Azure VMware Solution Dedicated Clusters Topology Topology 3 – Dedicated Private Clouds:Use dedicated Azure VMware Solution private clouds for each SLA for optimum performance. Figure 7 – Azure VMware Solution Dedicated Private Cloud Instances Topology Design Consideration 4 – Network Connectivity:Azure VMware Solution private clouds can be connected using IPSec VPN and Azure ExpressRoute circuits, including a variety of Azure Virtual Networking topologies such as Hub-Spoke and Azure Virtual WAN with Azure Firewall and third-party Network Virtualization Appliances. Azure Public IP connectivity with NSX is also available. From a performance perspective, Azure ExpressRoute and AVS Interconnect should be used instead of Azure Virtual WAN and IPSec VPN. The following design considerations (5-9) elaborate on network performance design. 
For more information, refer to the Azure VMware Solution networking and interconnectivity concepts. The Azure VMware Solution Cloud Adoption Framework also hasexample network scenarios that can be considered. Design Consideration 5 – Azure VNet Connectivity: Use FastPath for connecting an Azure VMware Solution private cloud to an Azure VNet for highest throughput and lowest latency. For maximum performance between Azure VMware Solution and Azure native services, a VNet Gateway with the Ultra performance or ErGw3AZ SKU is needed to enable the Fast Path feature when creating the connection. FastPath is designed to improve the data path performance to your VNet. When enabled, FastPath sends network traffic directly to virtual machines in the VNet, bypassing the gateway, resulting in 10 Gbps or higher throughput. For more information, refer toAzure ExpressRoute FastPath. Figure 8 – Azure VMware Solution connected to VNet Gateway with FastPath Design Consideration 6 – Intra-region Connectivity:Use AVS Interconnect for connecting Azure VMware Solution private clouds together in the same Azure Region for the highest throughput and lowest latency. You can select Azure VMware Solution private clouds from another Azure Subscription or Azure Resource Group, the only constraint is it must be in the same Azure Region. A maximum of 10 private clouds can be connected per private cloud instance. For more information, refer toAVS Interconnect. Figure 9 – Azure VMware Solution with AVS Interconnect Design Consideration 7 – Inter-region/On-Premises Connectivity:Use ExpressRoute Global Reach for connecting Azure VMware Solution private clouds together in different Azure Regions or to on-premises vSphere environments for the highest throughput and lowest latency. For more information, refer toAzure VMware Solution network design considerations. 
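Design Considerations 5 to 7 amount to a small decision rule: which connectivity option to prefer depends on what the private cloud is connecting to and whether it sits in the same Azure Region. The sketch below restates that rule in Python. The function name and the target labels (`"vnet"`, `"private_cloud"`, `"on_premises"`) are hypothetical and used only for illustration; the returned strings are the options named in the text, not API identifiers.

```python
# Limit quoted in Design Consideration 6.
MAX_INTERCONNECTED_PRIVATE_CLOUDS = 10


def recommend_connectivity(target: str, same_region: bool = True) -> str:
    """Map a connection target to the option this post recommends for performance.

    target is a hypothetical label: "vnet", "private_cloud", or "on_premises".
    same_region only matters when the target is another private cloud.
    """
    if target == "vnet":
        # Design Consideration 5: FastPath bypasses the VNet gateway data path.
        return "ExpressRoute with FastPath"
    if target == "private_cloud" and same_region:
        # Design Consideration 6: intra-region private cloud to private cloud.
        return "AVS Interconnect"
    if target in ("private_cloud", "on_premises"):
        # Design Consideration 7: inter-region or on-premises vSphere.
        return "ExpressRoute Global Reach"
    raise ValueError(f"Unknown target: {target!r}")


if __name__ == "__main__":
    print(recommend_connectivity("vnet"))
    print(recommend_connectivity("private_cloud", same_region=True))
    print(recommend_connectivity("private_cloud", same_region=False))
    print(recommend_connectivity("on_premises"))
```

The rule captures only the performance recommendation. Availability, security, and cost constraints may still favor a different topology.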
Figure 10 – Azure VMware Solution with ExpressRoute Global Reach Figure 11 – Azure VMware Solution with ExpressRoute Global Reach to On-premises vSphere infrastructure Design Consideration 8 – Host Connectivity:Use NSX Multi-Edge to increase the throughput of north/south traffic from the Azure VMware Solution private cloud. This configuration is available for a management cluster (Cluster-1) with four or more nodes. The additional Edge VMs are added to the Edge Cluster and increase the amount of traffic that can be forwarded through the 25Gbps uplinks across the ESXi hosts. This feature needs to be configured by opening an SR. For more information, refer toAzure VMware Solution network design considerations. Figure 12 – Azure VMware Solution Multi-Edge with NSX Design Consideration 9 – Internet Connectivity:Use Public IP on the NSX Edge if high speed internet access direct to the Azure VMware Solution private cloud is needed. This allows you to bring an Azure-owned Public IPv4 address range directly to the NSX Edge for consumption. You should configure this public range on a network virtual appliance (NVA) to secure the private cloud. For more information, refer toInternet Connectivity Design Considerations. Figure 13 – Azure VMware Solution Public IP Address with NSX Design Consideration 10 – VM Optimization:Use VM Hardware tuning, and Resource Pools to provide peak performance for workloads. VMware vSphere Virtual Machine Hardware should be optimized for the required performance: vNUMA optimization for CPU and RAM Shares Reservations & Limits Latency Sensitive setting Paravirtual network & storage adapters Multiple SCSI controllers Spread vDisks across SCSI controllers Resource Pools can be used to apply CPU and RAM QoS policies for each SLA running in a mixed cluster. For more information, refer toPerformance Best Practices. 
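When sizing Resource Pools for a mixed cluster, it can help to check which CPU band from Table 1 a cluster actually lands in. The following Python sketch computes the vCPU:pCore ratio from the physical core counts in Table 2. It is an illustration under stated assumptions: the function names are hypothetical, all hosts in the cluster are counted (no failover capacity is held back), and because the Table 1 bands leave gaps and overlaps, the cut-offs at 2, 10, and 15 are a simplification chosen for this example.

```python
# Physical cores per host, transcribed from Table 2.
PHYSICAL_CORES_PER_HOST = {"AV36": 36, "AV36P": 36, "AV52": 52, "AV64": 64}


def vcpu_to_pcore_ratio(total_vcpus: int, sku: str, hosts: int) -> float:
    """Return allocated vCPUs divided by physical cores for one cluster."""
    if hosts <= 0:
        raise ValueError("hosts must be a positive integer")
    return total_vcpus / (PHYSICAL_CORES_PER_HOST[sku] * hosts)


def cpu_band(ratio: float) -> str:
    """Classify a ratio against the Table 1 bands (cut-offs are assumptions)."""
    if ratio <= 2:
        return "Gold"
    if ratio <= 10:
        return "Silver"
    if ratio <= 15:
        return "Bronze"
    return "Above Bronze"


if __name__ == "__main__":
    # 720 vCPUs on a four-host AV36 cluster: 720 / (36 * 4) = 5.0
    ratio = vcpu_to_pcore_ratio(720, "AV36", 4)
    print(ratio, cpu_band(ratio))
```

The ratio is only one of the CPU dimensions listed earlier. CPU Ready times observed in the running environment remain the better indicator of whether a band is sustainable.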
Design Consideration 11 – Placement Policies: Placement policies can be used to increase the performance of a service by separating the VMs in an application availability layer across ESXi hosts. They also allow you to pin workloads to a particular host for exclusive access to CPU and RAM resources. Placement policies support VM-VM and VM-Host affinity and anti-affinity rules. The vSphere Distributed Resource Scheduler (DRS) is responsible for migrating VMs to enforce the placement policies. For more information, refer to Placement Policies.

Figure 14 – Azure VMware Solution Placement Policies

Design Consideration 12 – External Datastores: Use a first-party or third-party storage solution to offload lower SLA workloads from VMware vSAN into a separate tier of storage. Azure VMware Solution supports attaching Azure NetApp Files as Network File System (NFS) datastores for offloading virtual machine storage from VMware vSAN. This allows the VMware vSAN datastore to be dedicated to Gold SLA virtual machines. Azure VMware Solution also supports the use of Azure Elastic SAN and Pure Cloud Block Store as attached iSCSI datastores. For more information, refer to Azure NetApp Files datastores.

Figure 15 – Azure VMware Solution External Datastores with Azure NetApp Files

Design Consideration 13 – Storage Policies: Table 3 lists the pre-defined VM Storage Policies available for use with VMware vSAN. The appropriate redundant array of independent disks (RAID) and failures to tolerate (FTT) settings per policy need to be considered to match the customer workload SLAs. Each policy has a trade-off between availability, performance, capacity, and cost that needs to be considered. The highest performing VM Storage Policy for enterprise workloads is the RAID-1 policy. To comply with the Azure VMware Solution SLA, you are responsible for using an FTT=2 storage policy when a standard cluster has 6 or more nodes.
You must also retain a minimum slack space of 25% for backend vSAN operations. For more information, refer to Configure Storage Policy.

Deployment Type | Policy Name | RAID | Failures to Tolerate (FTT) | Site
Standard | RAID-1 FTT-1 | 1 | 1 | N/A
Standard | RAID-1 FTT-2 | 1 | 2 | N/A
Standard | RAID-1 FTT-3 | 1 | 3 | N/A
Standard | RAID-5 FTT-1 | 5 | 1 | N/A
Standard | RAID-6 FTT-2 | 6 | 2 | N/A
Standard | VMware Horizon | 1 | 1 | N/A

Table 3 – VMware vSAN Storage Policies

Design Consideration 14 – Mobility: VMware HCX can be tuned to improve throughput and performance. VMware HCX Manager can be upsized through Run Command. The number of network extension (NE) instances can be increased to allow Portgroups to be distributed over instances to increase layer 2 extension (L2E) performance. You can also establish a dedicated Mobility Cluster, accompanied by a dedicated Service Mesh for each distinct workload cluster, thereby increasing mobility performance. The Azure VMware Solution supports a maximum of 10 service meshes per private cloud; this is due to the allocation of the /22 management IP schema. Application Path Resiliency & TCP Flow Conditioning are also options that can be enabled to improve mobility performance. TCP Flow Conditioning dynamically optimizes the segment size for traffic traversing the Network Extension path. Application Path Resiliency technology creates multiple Foo-Over-UDP (FOU) tunnels between the source and destination Uplink IP pair for improved performance, resiliency, and path diversity. For more information, refer to VMware HCX Best Practices.

Figure 16 – VMware HCX with Dedicated Mobility Cluster

Design Consideration 15 – Anti-Patterns: Try to avoid using these anti-patterns in your performance design.

Anti-Pattern 1 – Stretched Clusters: Azure VMware Solution Stretched Clusters should primarily be used to meet a Multi-AZ requirement or a Recovery Point Objective (RPO) of zero.
If stretched clusters are used, there will be a write throughput and write latency impact for all synchronous writes using the site mirroring storage policy. For more information, refer toStretched Clusters. Figure 17 – Azure VMware Solution Private Cloud with Stretched Clusters In the following section, I will describe the next steps that need to be made to progress this high-level design estimate towards a validated detailed design. Next Steps The Azure VMware Solution sizing estimate should be assessed usingAzure Migrate. With large enterprise solutions for strategic and major customers, an Azure VMware Solution Solutions Architect from Azure, VMware, or a trusted VMware Partner should be engaged to ensure the solution is correctly sized to deliver business value with the minimum of risk. This should also include an application dependency assessment to understand the mapping between application groups and identify areas of data gravity, application network traffic flows, and network latency dependencies. Summary In this post, we took a closer look at the typical performance requirements of a customer workload, the architectural building blocks, and the performance design considerations for the Azure VMware Solution. We also discussed the next steps to continue an Azure VMware Solution design. 
If you are interested in the Azure VMware Solution, please use these resources to learn more about the service:
Homepage: Azure VMware Solution
Documentation: Azure VMware Solution
SLA: SLA for Azure VMware Solution
Azure Regions: Azure Products by Region
Service Limits: Azure VMware Solution subscription limits and quotas
SKU types: Introduction
Storage policies: Configure storage policy
VMware HCX: Configuration & Best Practices
GitHub repository: Azure/azure-vmware-solution
Well-Architected Framework: Azure VMware Solution workloads
Cloud Adoption Framework: Introduction to the Azure VMware Solution adoption scenario
Network connectivity scenarios: Enterprise-scale network topology and connectivity for Azure VMware Solution
Enterprise Scale Landing Zone: Enterprise-scale for Microsoft Azure VMware Solution
Enterprise Scale GitHub repository: Azure/Enterprise-Scale-for-AVS
Azure CLI: Azure Command-Line Interface (CLI) Overview
PowerShell module: Az.VMware Module
Azure Resource Manager: Microsoft.AVS/privateClouds
REST API: Azure VMware Solution REST API
Terraform provider: azurerm_vmware_private_cloud Terraform Registry

Author Bio

René van den Bedem is a Principal Technical Program Manager in the Azure VMware Solution product group at Microsoft. His background is in enterprise architecture with extensive experience across all facets of the enterprise, public cloud, and service provider spaces, including digital transformation and the business, enterprise, and technology architecture stacks. René works backwards from the problem to be solved and designs solutions that deliver business value with the minimum of risk. In addition to being the first quadruple VMware Certified Design Expert (VCDX), he is also a Dell Technologies Certified Master Enterprise Architect, a Nutanix Platform Expert (NPX), and a VMware vExpert.

Link to PPTX Diagrams: azure-vmware-solution/azure-vmware-master-diagrams

Azure VMware Solution Recoverability Design Considerations
Azure VMware Solution Design Series Availability Design Considerations Recoverability Design Considerations Performance Design Considerations Security Design Considerations VMware HCX Design with Azure VMware Solution Overview A global enterprise wants to migrate thousands of VMware vSphere virtual machines (VMs) to Microsoft Azure as part of their application modernization strategy. The first step is to exit their on-premises data centers and rapidly relocate their legacy application VMs to the Azure VMware Solution as a staging area for the first phase of their modernization strategy. What should the Azure VMware Solution look like? Azure VMware Solution is a VMware validated first party Azure service from Microsoft that provides private clouds containing VMware vSphere clusters built from dedicated bare-metal Azure infrastructure. It enables customers to leverage their existing investments in VMware skills and tools, allowing them to focus on developing and running their VMware-based workloads on Azure. In this post, I will introduce the typical customer workload recoverability requirements, describe the Azure VMware Solution architectural components, and describe the recoverability design considerations for Azure VMware Solution private clouds. In the next section, I will introduce the typical recoverability requirements of a customer’s workload. Customer Workload Requirements A typical customer has multiple application tiers that have specific Service Level Agreement (SLA) requirements that need to be met. These SLAs are normally named by a tiering system such as Platinum, Gold, Silver, and Bronze or Mission-Critical, Business-Critical, Production, and Test/Dev. Each SLA will have different availability, recoverability, performance, manageability, and security requirements that need to be met. 
For the recoverability design quality, customers will normally have an uptime percentage requirement with a recovery point objective (RPO), recovery time objective (RTO), work recovery time (WRT), maximum tolerable downtime (MTD) and a Disaster Recovery Site requirement that defines each SLA level. This is normally documented in the customer’s Business Continuity Plan (BCP). For example:

SLA Name | Uptime | RPO | RTO | WRT | MTD | DR Site
Gold | 99.999% (5.26 min downtime/year) | 5 min | 3 min | 2 min | 5 min | Yes
Silver | 99.99% (52.6 min downtime/year) | 1 hour | 20 min | 10 min | 30 min | Yes
Bronze | 99.9% (8.76 hrs downtime/year) | 4 hours | 6 hours | 2 hours | 8 hours | No

Table 1 – Typical Customer SLA requirements for Recoverability

The recoverability concepts introduced in Table 1 have the following definitions:
Recovery Point Objective (RPO): Defines the maximum age of the restored data after a failure.
Recovery Time Objective (RTO): Defines the maximum time to restore the service.
Work Recovery Time (WRT): Defines how long it takes for the recovered service to be brought online and begin serving customers again.
Maximum Tolerable Downtime (MTD): Sum of the RTO and WRT, which is the total time required to recover from a disaster and start serving the business again from the Disaster Recovery Site. This value needs to fit within the downtime value of the SLA for each year.

Figure 1 – Recoverability Concepts

A typical legacy business-critical application will have the following application architecture:
Load Balancer layer: Uses load balancers to distribute traffic across multiple web servers in the web layer to improve application availability.
Web layer: Uses web servers to process client requests made via the secure Hypertext Transfer Protocol (HTTPS). Receives traffic from the load balancer layer and forwards to the application layer.
Application layer: Uses application servers to run software that delivers a business application through a communication protocol.
Receives traffic from the web layer and uses the database layer to access stored data. Database layer: Uses a relational database management service (RDMS) cluster to store data and provide database services to the application layer. Depending upon the recoverability requirements for each service, the disaster recovery protection mechanisms could be a mix of manual runbooks and disaster recovery automation solutions with replication and clustering mechanisms connected to many different regions to meet the customer SLAs. Figure 2 – Typical Legacy Business-Critical Application Architecture In the next section, I will introduce the architectural components of the Azure VMware Solution. Architectural Components The diagram below describes the architectural components of the Azure VMware Solution. Figure 3 – Azure VMware Solution Architectural Components Each Azure VMware Solution architectural component has the following function: Azure Subscription: Used to provide controlled access, budget, and quota management for the Azure VMware Solution. Azure Region: Physical locations around the world where we group data centers into Availability Zones (AZs) and then group AZs into regions. Azure Resource Group: Container used to place Azure services and resources into logical groups. Azure VMware Solution Private Cloud: Uses VMware software, including vCenter Server, NSX software-defined networking, vSAN software-defined storage, and Azure bare-metal ESXi hosts to provide compute, networking, and storage resources.Azure NetApp Files, Azure Elastic SAN, and Pure Cloud Block Store are also supported. Azure VMware Solution Resource Cluster: Uses VMware software, including vSAN software-defined storage, and Azure bare-metal ESXi hosts to provide compute, networking, and storage resources for customer workloads by scaling out the Azure VMware Solution private cloud.Azure NetApp Files, Azure Elastic SAN, and Pure Cloud Block Store are also supported. 
VMware HCX: Provides mobility, migration, and network extension services. VMware Site Recovery: Provides Disaster Recovery automation and storage replication services with VMware vSphere Replication. Third party Disaster Recovery solutions Zerto Disaster Recovery and JetStream Software Disaster Recovery are also supported. Dedicated Microsoft Enterprise Edge (D-MSEE):Router that providesconnectivity between Azure cloud and the Azure VMware Solution private cloud instance. Azure Virtual Network (VNet): Private network used to connect Azure services and resources together. Azure Route Server: Enables network appliances to exchange dynamic route information with Azure networks. Azure Virtual Network Gateway: Cross premises gateway for connecting Azure services and resources to other private networks using IPSec VPN, ExpressRoute, and VNet to VNet. Azure ExpressRoute: Provides high-speed private connections between Azure data centers and on-premises or colocation infrastructure. Azure Virtual WAN (vWAN): Aggregates networking, security, and routing functions together into a single unified Wide Area Network (WAN). In the next section, I will describe the recoverability design considerations for the Azure VMware Solution. Recoverability Design Considerations The architectural design process takes the business problem to be solved and the business goals to be achieved and distills these into customer requirements, design constraints and assumptions. Design constraints can be characterized by the following three categories: Laws of the Land – data and application sovereignty, governance, regulatory, compliance, etc. Laws of Physics – data and machine gravity, network latency, etc. Laws of Economics – owning versus renting, total cost of ownership (TCO), return on investment (ROI), capital expenditure, operational expenditure, earnings before interest, taxes, depreciation, and amortization (EBITDA), etc. 
Each design consideration will be a trade-off between the availability, recoverability, performance, manageability, and security design qualities. The desired result is to deliver business value with the minimum of risk by working backwards from the customer problem.

Design Consideration 1 – Azure Region: Azure VMware Solution is available in 30 Azure Regions around the world (US Government has 2 additional Azure Regions). Select the relevant Azure Regions that meet your geographic requirements. These locations will typically be driven by your design constraints and the required distance the Disaster Recovery Site needs to be from the Primary Site. The Primary Site can be located on-premises, in a co-location or in the public cloud.

Figure 4 – Azure VMware Solution Region for Disaster Recovery

Design Consideration 2 – Deployment topology: Select the Azure VMware Solution Disaster Recovery Pod topology that best matches the uptime and geographic requirements of your SLAs. For very large deployments, it may make sense to have separate Disaster Recovery Pods (private clouds) dedicated to each SLA for cost efficiency.

The management and control plane cluster (Cluster-1) can be shared with customer workload VMs or be a dedicated cluster for management and control, including customer enterprise services, such as Active Directory, DNS, and DHCP. Additional resource clusters can be added to support customer workload demand. This also includes the option of using separate clusters for each customer SLA.

The best practice for Disaster Recovery design is to follow a pod architecture where each protected site has a matching private cloud in the Disaster Recovery Azure Region. Complex mesh topologies should be avoided for operational simplicity.

The required workload Service Level Agreement values must be mapped to the appropriate Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO) and use a naming convention that is easy to understand.
For example, Gold, Silver and Bronze or Tier-1, Tier-2 and Tier-3. Each pod should be designated with an SLA capability for operational simplicity. On a smaller scale, the pod concept could be per cluster instead of per private cloud.

The Disaster Recovery pods are provisioned to support the necessary replicated storage capacity during steady state. When a disaster is declared, the necessary compute resources will be added to the private cloud. This can be configured automatically using this Auto-Scale function with Azure Automation Accounts and PowerShell Runbooks.

Figure 5 – Azure VMware Solution DR Shared Services
Figure 6 – Azure VMware Solution Dedicated DR Pods

Design Consideration 3 – Disaster Recovery Solution: The Azure VMware Solution supports the following first-party and third-party Disaster Recovery solutions. Depending upon your recoverability and cost efficiency requirements, the best solution can be selected from Table 2 below. For cost efficiency, a best effort RPO and RTO can be met using backup replication of daily snapshots to the Disaster Recovery Site or using the Disaster Recovery replication feature of VMware HCX (Solution 4). If these solutions are not viable, you can also consider application, database or message bus clustering as an option.

Solution | RPO | RTO | DR Automation
1. VMware Site Recovery | 5 min – 24 hr | Minutes | Yes, with Protection Groups & Recovery Plans
2. Zerto DR | Seconds | Minutes | Yes, with Virtual Protection Groups (VPGs)
3. JetStream Software DR | Seconds | Minutes | Yes, with Protection Domains, Runbooks & Runbook Groups
4. VMware HCX | 5 min – 24 hr | Hours | No, manual process only
Table 2 – Disaster Recovery Vendor Products

Note: Azure Site Recovery can be used to protect Azure VMware Solution but is not listed here since we are describing how to use Azure VMware Solution to protect on-premises VMware vSphere solutions.
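To make the selection from Table 2 concrete, here is a minimal Python sketch that shortlists solutions for a given RPO and automation requirement. The `DrSolution` class, the `shortlist` function and the `sub_5min_rpo` flag are hypothetical names introduced for illustration only. The sketch assumes that an RPO listed as "Seconds" can satisfy any requirement under 5 minutes, and it does not model RTO or cost.

```python
from dataclasses import dataclass

FIVE_MINUTES = 5 * 60  # lowest RPO listed for the "5 min - 24 hr" rows, in seconds


@dataclass(frozen=True)
class DrSolution:
    name: str
    rpo: str            # RPO column as written in Table 2
    rto: str            # RTO column as written in Table 2
    automated: bool     # DR Automation column (Yes/No)
    sub_5min_rpo: bool  # assumption: True where Table 2 lists the RPO as "Seconds"


TABLE_2 = [
    DrSolution("VMware Site Recovery", "5 min - 24 hr", "Minutes", True, False),
    DrSolution("Zerto DR", "Seconds", "Minutes", True, True),
    DrSolution("JetStream Software DR", "Seconds", "Minutes", True, True),
    DrSolution("VMware HCX", "5 min - 24 hr", "Hours", False, False),
]


def shortlist(required_rpo_seconds: int, needs_automation: bool) -> list[str]:
    """Return the Table 2 solutions that fit the RPO and automation requirement."""
    names = []
    for solution in TABLE_2:
        # An RPO requirement under 5 minutes rules out the "5 min - 24 hr" rows.
        if required_rpo_seconds < FIVE_MINUTES and not solution.sub_5min_rpo:
            continue
        # A requirement for DR automation rules out manual-only solutions.
        if needs_automation and not solution.automated:
            continue
        names.append(solution.name)
    return names


if __name__ == "__main__":
    print(shortlist(required_rpo_seconds=30, needs_automation=True))
    print(shortlist(required_rpo_seconds=3600, needs_automation=False))
```

A shortlist like this is only a first filter. The final choice still depends on the cost efficiency and recoverability trade-offs described above.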
Solution 1 – VMware Site Recovery supports Disaster Recovery automation with an RPO of 5 minutes to 24 hours with VMware SRM Virtual Appliance, VMware vSphere Replication and VMware vSAN. Currently, using VMware Site Recovery with Azure NetApp Files is not supported. When designing a solution with VMware Site Recovery, these Azure VMware Solution limits should be considered.

Figure 7 – Azure VMware Solution with VMware Site Recovery Manager

Solution 2 – Zerto Disaster Recovery supports Disaster Recovery automation with an RPO of seconds using continuous replication with the Zerto Virtual Manager (ZVM), Zerto Virtual Replication Appliance (ZVRA) and VMware vSAN. When designing a solution with Zerto Disaster Recovery, this Zerto Architecture Guide should be considered.

Figure 8 – Azure VMware Solution with Zerto Disaster Recovery

Solution 3 – JetStream Software Disaster Recovery supports Disaster Recovery automation with an RPO of seconds using continuous replication with the JetStream Manager Virtual Appliance (MSA), JetStream DR Virtual Appliance (DRVA) and VMware vSAN. When designing a solution with JetStream Software Disaster Recovery, these JetStream Software resources should be considered.

Figure 9 – Azure VMware Solution with JetStream Software Disaster Recovery

Solution 4 – VMware HCX Disaster Recovery supports manual Disaster Recovery with an RPO of 5 minutes to 24 hours with VMware HCX Manager, VMware vSphere Replication and VMware vSAN. When designing a solution with VMware HCX, these Azure VMware Solution limits should be considered.

Figure 10 – Azure VMware Solution with VMware HCX Disaster Recovery

Design Consideration 5 – SKU type: Three SKU types can be selected for provisioning an Azure VMware Solution private cloud. The smaller AV36 SKU can be used at the Disaster Recovery Site to build a pilot light cluster with the minimum storage resources for cost efficiency while the Primary Site can use the larger and more expensive AV36P and AV52 SKUs.
The AV36 SKU is widely available in most Azure regions and the AV36P and AV52 SKUs are limited to certain Azure regions. Azure VMware Solution does not support mixing different SKU types within a private cloud (AV64 SKU is the exception). You can check Azure VMware Solution SKU availability by Azure Region here. The AV64 SKU is currently only available for mixed SKU deployments in certain regions.

Figure 11 – AV64 Mixed SKU Topology

Design Consideration 6 – Runbook Application Groups: After the application dependency assessment is complete, this data will be used to create the runbook application groups to ensure that the application SLAs are met during a disaster event. If the application dependency assessment is incomplete, the runbook application groups can be initially designed using the process knowledge from your application architecture team and IT operations. The idea is to ensure each application is captured in a runbook that allows the application to be recovered completely and consistently using the runbook architecture and order of operations.

Figure 12 – VMware Site Recovery Application Recovery Plans

Design Consideration 7 – Storage Policies: Table 3 lists the pre-defined VM Storage Policies available for use with VMware vSAN. The appropriate redundant array of independent disks (RAID) and failures to tolerate (FTT) settings per policy need to be considered to match the customer workload SLAs. Each policy has a trade-off between availability, performance, capacity, and cost that needs to be considered.

To comply with the Azure VMware Solution SLA, you are responsible for using an FTT=2 storage policy when the cluster has 6 or more nodes in a standard cluster. You must also retain a minimum slack space of 25% for backend vSAN operations.
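The two customer responsibilities above (FTT=2 at 6 or more nodes, and 25% slack space) can be expressed as a simple check. The following Python sketch is illustrative only: `storage_policy_findings` and its parameters are hypothetical names, the capacity figures are supplied by the caller rather than read from vSAN, and treating an FTT value higher than 2 as acceptable is my assumption.

```python
MIN_SLACK_FRACTION = 0.25   # minimum free space for backend vSAN operations
FTT2_HOST_THRESHOLD = 6     # standard clusters with this many hosts need FTT=2


def storage_policy_findings(hosts: int, ftt: int, used_tb: float, raw_tb: float) -> list[str]:
    """Return the actions needed for a standard cluster to meet the two rules."""
    if raw_tb <= 0:
        raise ValueError("raw_tb must be greater than zero")

    findings = []

    # Rule 1: clusters with 6 or more hosts need an FTT=2 storage policy.
    # Assumption: an FTT value above 2 is treated as acceptable too.
    if hosts >= FTT2_HOST_THRESHOLD and ftt < 2:
        findings.append("use an FTT=2 storage policy")

    # Rule 2: at least 25% of the datastore capacity must remain free.
    slack = (raw_tb - used_tb) / raw_tb
    if slack < MIN_SLACK_FRACTION:
        findings.append("free up capacity to keep 25% slack space")

    return findings


if __name__ == "__main__":
    # Hypothetical 8-host cluster using an FTT=1 policy at 50% utilization.
    print(storage_policy_findings(hosts=8, ftt=1, used_tb=50.0, raw_tb=100.0))
```

An empty list means neither rule is violated for the values supplied.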
Deployment Type | Policy Name | RAID | Failures to Tolerate (FTT) | Site
Standard | RAID-1 FTT-1 | 1 | 1 | N/A
Standard | RAID-1 FTT-2 | 1 | 2 | N/A
Standard | RAID-1 FTT-3 | 1 | 3 | N/A
Standard | RAID-5 FTT-1 | 5 | 1 | N/A
Standard | RAID-6 FTT-2 | 6 | 2 | N/A
Standard | VMware Horizon | 1 | 1 | N/A
Table 3 – VMware vSAN Storage Policies

Design Consideration 8 – Network Connectivity: Azure VMware Solution private clouds can be connected using IPSec VPN and Azure ExpressRoute circuits, including a variety of Azure Virtual Networking topologies such as Hub-Spoke and Virtual WAN with Azure Firewall and third-party Network Virtual Appliances. For more information, refer to the Azure VMware Solution networking and interconnectivity concepts. The Azure VMware Solution Cloud Adoption Framework also has example network scenarios that can be considered.

Design Consideration 9 – Layer 2 Network Extension: VMware HCX can be used to provide Layer 2 network extension functionality to maintain the same IP address schema between sites.

Figure 13 – VMware HCX Layer 2 Network Extension with VMware Site Recovery

Design Consideration 10 – Anti-Patterns: Try to avoid using these anti-patterns in your recoverability design.

Anti-Pattern 1 – Stretched Clusters: Azure VMware Solution Stretched Clusters is the only option for meeting an RPO of 0 requirement. Remember that stretched clusters are considered an availability solution, not disaster recovery, because they form a single fault domain for the management and control plane running in dual Availability Zones (AZs). Azure VMware Solution stretched clusters (GA) currently do not support the VMware Site Recovery add-on.

Figure 14 – Azure VMware Solution Private Cloud with Stretched Clusters

Anti-Pattern 2 – Ransomware Protection: A Disaster Recovery Automation solution does not provide protection against a ransomware attack.
Ransomware protection requires additional security functionality where an isolated and secure area is used to filter through a series of data restores to validate that the point-in-time copy is free from ransomware. This process can take months and it is necessary to access data backups that may be months or years old. This is because the ransomware demand for money is merely the end of a long period of reconnaissance by an attacker and every system needs to be checked for active security vulnerabilities and spyware agents. Disaster Recovery Automation assumes that ransomware is not present, and that data corruption has not replicated to the Disaster Recovery Site. That said, some Disaster Recovery Automation vendors now have a Ransomware Protection feature that can be leveraged as part of the solution.

In the following section, I will describe the next steps that would need to be made to progress this high-level design estimate towards a validated detailed design.

Next Steps

The Azure VMware Solution sizing estimate should be assessed using Azure Migrate. With large enterprise solutions for strategic and major customers, an Azure VMware Solution Solutions Architect from Azure, VMware, or a trusted VMware Partner should be engaged to ensure the solution is correctly sized to deliver business value with the minimum of risk. This should also include an application dependency assessment to understand the mapping between application groups and identify areas of data gravity, application network traffic flows, and network latency dependencies.

Summary

In this post, we took a closer look at the typical recoverability requirements of a customer workload, the architectural building blocks, and the recoverability design considerations for the Azure VMware Solution. We also discussed the next steps to continue an Azure VMware Solution design.
If you are interested in the Azure VMware Solution, please use these resources to learn more about the service:
Homepage: Azure VMware Solution
Documentation: Azure VMware Solution
SLA: SLA for Azure VMware Solution
Azure Regions: Azure Products by Region
Service Limits: Azure VMware Solution subscription limits and quotas
VMware Site Recovery: Deploy disaster recovery with VMware Site Recovery Manager
Zerto DR: Deploy Zerto disaster recovery on Azure VMware Solution
Zerto DR: Architecture Guide
JetStream Software DR: Deploy disaster recovery using JetStream DR
VMware HCX DR: Deploy disaster recovery using VMware HCX
Stretched Clusters (Public Preview): Deploy vSAN stretched clusters
SKU types: Introduction
Storage policies: Configure storage policy
GitHub repository: Azure/azure-vmware-solution
Well-Architected Framework: Azure VMware Solution workloads
Cloud Adoption Framework: Introduction to the Azure VMware Solution adoption scenario
Network connectivity scenarios: Enterprise-scale network topology and connectivity for Azure VMware Solution
Enterprise Scale Landing Zone: Enterprise-scale for Microsoft Azure VMware Solution
Enterprise Scale GitHub repository: Azure/Enterprise-Scale-for-AVS
Azure CLI: Azure Command-Line Interface (CLI) Overview
PowerShell module: Az.VMware Module
Azure Resource Manager: Microsoft.AVS/privateClouds
REST API: Azure VMware Solution REST API
Terraform provider: azurerm_vmware_private_cloud Terraform Registry

Author Bio

René van den Bedem is a Principal Technical Program Manager in the Azure VMware Solution product group at Microsoft. His background is in enterprise architecture with extensive experience across all facets of the enterprise, public cloud, and service provider spaces, including digital transformation and the business, enterprise, and technology architecture stacks. René works backwards from the problem to be solved and designs solutions that deliver business value with the minimum of risk.
In addition to being the first quadruple VMware Certified Design Expert (VCDX), he is also a Dell Technologies Certified Master Enterprise Architect, a Nutanix Platform Expert (NPX), and a VMware vExpert.

Link to PPTX Diagrams: azure-vmware-solution/azure-vmware-master-diagrams

Azure VMware Solution Availability Design Considerations
Azure VMware Solution Design Series Availability Design Considerations Recoverability Design Considerations Performance Design Considerations Security Design Considerations VMware HCX Design with Azure VMware Solution Overview A global enterprise wants to migrate thousands of VMware vSphere virtual machines (VMs) to Microsoft Azure as part of their application modernization strategy. The first step is to exit their on-premises data centers and rapidly relocate their legacy application VMs to the Azure VMware Solution as a staging area for the first phase of their modernization strategy. What should the Azure VMware Solution look like? Azure VMware Solution is a VMware validated first party Azure service from Microsoft that provides private clouds containing VMware vSphere clusters built from dedicated bare-metal Azure infrastructure. It enables customers to leverage their existing investments in VMware skills and tools, allowing them to focus on developing and running their VMware-based workloads on Azure. In this post, I will introduce the typical customer workload availability requirements, describe the Azure VMware Solution architectural components, and describe the availability design considerations for Azure VMware Solution private clouds. In the next section, I will introduce the typical availability requirements of a customer’s workload. Customer Workload Requirements A typical customer has multiple application tiers that have specific Service Level Agreement (SLA) requirements that need to be met. These SLAs are normally named by a tiering system such as Platinum, Gold, Silver, and Bronze or Mission-Critical, Business-Critical, Production, and Test/Dev. Each SLA will have different availability, recoverability, performance, manageability, and security requirements that need to be met. 
For the availability design quality, customers will normally have an uptime percentage requirement with an availability zone (AZ) or region requirement that defines each SLA level. For example:

SLA Name | Uptime | AZ/Region
Gold | 99.999% (5.26 min downtime/year) | Dual Regions
Silver | 99.99% (52.6 min downtime/year) | Dual AZs
Bronze | 99.9% (8.76 hrs downtime/year) | Single AZ
Table 1 – Typical Customer SLA requirements for Availability

A typical legacy business-critical application will have the following application architecture:
Load Balancer layer: Uses load balancers to distribute traffic across multiple web servers in the web layer to improve application availability.
Web layer: Uses web servers to process client requests made via the secure Hypertext Transfer Protocol (HTTPS). Receives traffic from the load balancer layer and forwards to the application layer.
Application layer: Uses application servers to run software that delivers a business application through a communication protocol. Receives traffic from the web layer and uses the database layer to access stored data.
Database layer: Uses a relational database management system (RDBMS) cluster to store data and provide database services to the application layer.

Depending upon the availability requirements for the service, the application components could be many and spread across multiple sites and regions to meet the customer SLA.

Figure 1 – Typical Legacy Business-Critical Application Architecture

In the next section, I will introduce the architectural components of the Azure VMware Solution.

Architectural Components

The diagram below describes the architectural components of the Azure VMware Solution.

Figure 2 – Azure VMware Solution Architectural Components

Each Azure VMware Solution architectural component has the following function:
Azure Subscription: Used to provide controlled access, budget and quota management for the Azure VMware Solution.
Azure Region: Physical locations around the world where we group data centers into Availability Zones (AZs) and then group AZs into regions.
Azure Resource Group: Container used to place Azure services and resources into logical groups.
Azure VMware Solution Private Cloud: Uses VMware software, including vCenter Server, NSX software-defined networking, vSAN software-defined storage, and Azure bare-metal ESXi hosts to provide compute, networking, and storage resources. Azure NetApp Files, Azure Elastic SAN, and Pure Cloud Block Store are also supported.
Azure VMware Solution Resource Cluster: Uses VMware software, including vSAN software-defined storage, and Azure bare-metal ESXi hosts to provide compute, networking, and storage resources for customer workloads by scaling out the Azure VMware Solution private cloud. Azure NetApp Files, Azure Elastic SAN, and Pure Cloud Block Store are also supported.
VMware HCX: Provides mobility, migration, and network extension services.
VMware Site Recovery: Provides Disaster Recovery automation and storage replication services with VMware vSphere Replication. Third-party Disaster Recovery solutions Zerto DR and JetStream DR are also supported.
Dedicated Microsoft Enterprise Edge (D-MSEE): Router that provides connectivity between Azure cloud and the Azure VMware Solution private cloud instance.
Azure Virtual Network (VNet): Private network used to connect Azure services and resources together.
Azure Route Server: Enables network appliances to exchange dynamic route information with Azure networks.
Azure Virtual Network Gateway: Cross-premises gateway for connecting Azure services and resources to other private networks using IPSec VPN, ExpressRoute, and VNet to VNet.
Azure ExpressRoute: Provides high-speed private connections between Azure data centers and on-premises or colocation infrastructure.
Azure Virtual WAN (vWAN): Aggregates networking, security, and routing functions together into a single unified Wide Area Network (WAN).
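Looking back at Table 1, the downtime figures next to each uptime percentage follow from simple arithmetic. The short Python sketch below shows the calculation; the function name is hypothetical and it assumes a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day year


def downtime_minutes_per_year(uptime_percent: float) -> float:
    """Convert an uptime percentage into the allowed downtime per year, in minutes."""
    return (100.0 - uptime_percent) / 100.0 * MINUTES_PER_YEAR


if __name__ == "__main__":
    for name, uptime in [("Gold", 99.999), ("Silver", 99.99), ("Bronze", 99.9)]:
        minutes = downtime_minutes_per_year(uptime)
        print(f"{name}: {minutes:.2f} min/year ({minutes / 60:.2f} hrs/year)")
```

Each additional nine reduces the downtime budget by a factor of ten, which is why the higher SLA tiers in Table 1 call for dual AZs or dual regions.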
In the next section, I will describe the availability design considerations for the Azure VMware Solution.

Availability Design Considerations

The architectural design process takes the business problem to be solved and the business goals to be achieved and distills these into customer requirements, design constraints and assumptions. Design constraints can be characterized by the following three categories:
Laws of the Land – data and application sovereignty, governance, regulatory, compliance, etc.
Laws of Physics – data and machine gravity, network latency, etc.
Laws of Economics – owning versus renting, total cost of ownership (TCO), return on investment (ROI), capital expenditure, operational expenditure, earnings before interest, taxes, depreciation, and amortization (EBITDA), etc.

Each design consideration will be a trade-off between the availability, recoverability, performance, manageability, and security design qualities. The desired result is to deliver business value with the minimum of risk by working backwards from the customer problem.

Design Consideration 1 – Azure Region and AZs: Azure VMware Solution is available in 30 Azure Regions around the world (US Government has 2 additional Azure Regions). Select the relevant Azure Regions and AZs that meet your geographic requirements. These locations will typically be driven by your design constraints.

Design Consideration 2 – Deployment topology: Select the Azure VMware Solution topology that best matches the uptime and geographic requirements of your SLAs. For very large deployments, it may make sense to have separate private clouds dedicated to each SLA for cost efficiency.

The Azure VMware Solution supports a maximum of 12 clusters per private cloud. Each cluster supports a minimum of 3 hosts and a maximum of 16 hosts per cluster. Each private cloud supports a maximum of 96 hosts.
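The cluster and host limits above are easy to check against a proposed layout. Here is a minimal Python sketch; `check_layout` is a hypothetical helper, and the limits are the ones quoted in this post, so verify them against the current service limits before relying on them.

```python
MAX_CLUSTERS = 12                  # clusters per private cloud
MIN_HOSTS_PER_CLUSTER = 3
MAX_HOSTS_PER_CLUSTER = 16
MAX_HOSTS_PER_PRIVATE_CLOUD = 96


def check_layout(hosts_per_cluster: list[int]) -> list[str]:
    """Return a list of limit violations for one private cloud layout."""
    problems = []

    if len(hosts_per_cluster) > MAX_CLUSTERS:
        problems.append(
            f"{len(hosts_per_cluster)} clusters exceeds the maximum of {MAX_CLUSTERS}"
        )

    # Cluster-1 is the first entry, matching the naming used in this post.
    for index, hosts in enumerate(hosts_per_cluster, start=1):
        if not MIN_HOSTS_PER_CLUSTER <= hosts <= MAX_HOSTS_PER_CLUSTER:
            problems.append(
                f"Cluster-{index}: {hosts} hosts is outside "
                f"{MIN_HOSTS_PER_CLUSTER}-{MAX_HOSTS_PER_CLUSTER}"
            )

    total = sum(hosts_per_cluster)
    if total > MAX_HOSTS_PER_PRIVATE_CLOUD:
        problems.append(
            f"{total} hosts exceeds the maximum of {MAX_HOSTS_PER_PRIVATE_CLOUD}"
        )

    return problems


if __name__ == "__main__":
    print(check_layout([3, 16, 8]))   # a small three-cluster layout
    print(check_layout([16] * 7))     # seven full clusters
```

Note that 12 clusters of 16 hosts would be 192 hosts, so in practice the 96-host limit per private cloud is reached before the per-cluster limits are.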
VMware vSphere HA provides protection against ESXi host failures and VMware vSphere DRS provides distributed resource management. VMware vSphere Fault Tolerance is not supported by the Azure VMware Solution. These features are preconfigured as part of the managed service and cannot be changed by the customer. VMware vCenter Server, VMware HCX Manager, VMware SRM and VMware vSphere Replication Manager are individual appliances and are protected by vSphere HA. VMware NSX Manager is a cluster of 3 unified appliances that have a VM-VM anti-affinity placement policy to spread them across the hosts of the cluster. The VMware NSX Edge cluster is a pair of appliances that also use a VM-VM anti-affinity placement policy. Topology 1 – Standard: The Azure VMware Solution standard private cloud is deployed within a single AZ in an Azure Region, which delivers an infrastructure SLA of 99.9%. Figure 3 – Azure VMware Solution Private Cloud Standard Topology Topology 2 – Multi-AZ: Azure VMware Solution private clouds in separate AZs per Azure Region. VMware HCX is used to connect private clouds across AZs. Application clustering is required to provide the multi-AZ availability mechanism. The customer is responsible for ensuring their application clustering solution is within the limits of bandwidth and latency between private clouds. This topology will deliver an SLA of greater than 99.9%, however it will be dependent upon the application clustering solution used by the customer. The Azure VMware Solution does not support AZ selection during provisioning. This is mitigated by having separate Azure Subscriptions with quota in each separate AZ. You can open a ticket with Microsoft to configure a Special Placement Policy to deploy your Azure VMware Solution private cloud to a particular AZ per subscription. 
Figure 4 – Azure VMware Solution Private Cloud Multi-AZ Topology

Topology 3 – Stretched: The Azure VMware Solution stretched clusters private cloud is deployed across dual AZs in an Azure Region, which delivers a 99.99% infrastructure SLA. This also includes a third AZ for the Azure VMware Solution witness site. Stretched clusters support policy-based synchronous replication to deliver a recovery point objective (RPO) of zero. It is possible to use placement policies and storage policies to mix SLA levels within stretched clusters, by pinning lower SLA workloads to a particular AZ, which will experience downtime during an AZ failure. This feature is GA and is currently only available in Australia East, West Europe, UK South and Germany West Central Azure Regions.

Figure 5 – Azure VMware Solution Private Cloud with Stretched Clusters Topology

Topology 4 – Multi-Region: Azure VMware Solution private clouds across Azure regions. VMware HCX is used to connect private clouds across Azure Regions. Application clustering is required to provide the multi-region availability mechanism. The customer is responsible for ensuring their application clustering solution is within the limits of bandwidth and latency between private clouds. This topology will deliver an SLA of greater than 99.9%, however it will be dependent upon the application clustering solution used by the customer. An additional enhancement could be using Azure VMware Solution stretched clusters in one or both Azure Regions.

Figure 6 – Azure VMware Solution Private Cloud Multi-Region Topology

Design Consideration 3 – Shared Services or Separate Services Model: The management and control plane cluster (Cluster-1) can be shared with customer workload VMs or be a dedicated cluster for management and control, including customer enterprise services, such as Active Directory, DNS, and DHCP. Additional resource clusters can be added to support customer workload demand.
This also includes the option of using separate clusters for each customer SLA.

Figure 7 – Azure VMware Solution Shared Services Model
Figure 8 – Azure VMware Solution Separate Services Model

Design Consideration 4 – SKU type: Three SKU types can be selected for provisioning an Azure VMware Solution private cloud. The smaller AV36 SKU can be used to minimize the impact radius of a failed node. The larger AV36P and AV52 SKUs can be used to run more workloads with fewer nodes, which increases the impact radius of a failed node. The AV36 SKU is widely available in most Azure regions and the AV36P and AV52 SKUs are limited to certain Azure regions. Azure VMware Solution does not support mixing different SKU types within a private cloud (AV64 SKU is the exception). You can check Azure VMware Solution SKU availability by Azure Region here. The AV64 SKU is currently only available for mixed SKU deployments in certain regions.

Figure 9 – AV64 Mixed SKU Topology

Design Consideration 5 – Placement Policies: Placement policies are used to increase the availability of a service by separating the VMs in an application availability layer across ESXi hosts. When an ESXi host failure occurs, it would only impact one VM of a multi-part application layer, which would then restart on another ESXi host through vSphere HA. Placement policies support VM-VM and VM-Host affinity and anti-affinity rules. The vSphere Distributed Resource Scheduler (DRS) is responsible for migrating VMs to enforce the placement policies.

To increase the availability of an application cluster, a placement policy with VM-VM anti-affinity rules for each of the web, application and database service layers can be used. Alternatively, VM-Host affinity rules can be used to segment the web, application, and database components to dedicated groups of hosts. The placement policies for stretched clusters can use VM-Host affinity rules to pin workloads to the preferred and secondary sites, if needed.
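To illustrate what a VM-VM anti-affinity rule is protecting against, the following Python sketch models the rule as a plain data check. It is not the Azure VMware Solution or vSphere API. The function name, VM names and host names are hypothetical, and the VM-to-host placement is supplied as a dictionary.

```python
from collections import defaultdict


def anti_affinity_violations(placement: dict[str, str], group: list[str]) -> dict[str, list[str]]:
    """Return the hosts that run more than one VM from the same anti-affinity group.

    placement maps a VM name to the ESXi host it currently runs on.
    group lists the VMs that should be kept on separate hosts.
    """
    vms_by_host = defaultdict(list)
    for vm in group:
        vms_by_host[placement[vm]].append(vm)
    # A host with two or more group members means one host failure hits several VMs.
    return {host: vms for host, vms in vms_by_host.items() if len(vms) > 1}


if __name__ == "__main__":
    # Hypothetical web layer with three VMs spread over two hosts.
    placement = {"web-01": "esx-01", "web-02": "esx-02", "web-03": "esx-01"}
    print(anti_affinity_violations(placement, ["web-01", "web-02", "web-03"]))
```

In the managed service, DRS performs the equivalent enforcement by migrating VMs, so a check like this is only useful for reasoning about a design or reviewing an inventory export.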
Figure 10 – Azure VMware Solution Placement Policies – VM-VM Anti-Affinity
Figure 11 – Azure VMware Solution Placement Policies – VM-Host Affinity

Design Consideration 6 – Storage Policies: Table 2 lists the pre-defined VM Storage Policies available for use with VMware vSAN. The appropriate redundant array of independent disks (RAID) and failures to tolerate (FTT) settings per policy need to be considered to match the customer workload SLAs. Each policy has a trade-off between availability, performance, capacity, and cost that needs to be considered. The storage policies for stretched clusters include a designation for the dual site (synchronous replication), preferred site and secondary site policies that need to be considered.

To comply with the Azure VMware Solution SLA, you are responsible for using an FTT=2 storage policy when the cluster has 6 or more nodes in a standard cluster. You must also retain a minimum slack space of 25% for backend vSAN operations.

Deployment Type | Policy Name | RAID | Failures to Tolerate (FTT) | Site
Standard | RAID-1 FTT-1 | 1 | 1 | N/A
Standard | RAID-1 FTT-2 | 1 | 2 | N/A
Standard | RAID-1 FTT-3 | 1 | 3 | N/A
Standard | RAID-5 FTT-1 | 5 | 1 | N/A
Standard | RAID-6 FTT-2 | 6 | 2 | N/A
Standard | VMware Horizon | 1 | 1 | N/A
Stretched | RAID-1 FTT-1 Dual Site | 1 | 1 | Site mirroring
Stretched | RAID-1 FTT-1 Preferred | 1 | 1 | Preferred
Stretched | RAID-1 FTT-1 Secondary | 1 | 1 | Secondary
Stretched | RAID-1 FTT-2 Dual Site | 1 | 2 | Site mirroring
Stretched | RAID-1 FTT-2 Preferred | 1 | 2 | Preferred
Stretched | RAID-1 FTT-2 Secondary | 1 | 2 | Secondary
Stretched | RAID-1 FTT-3 Dual Site | 1 | 3 | Site mirroring
Stretched | RAID-1 FTT-3 Preferred | 1 | 3 | Preferred
Stretched | RAID-1 FTT-3 Secondary | 1 | 3 | Secondary
Stretched | RAID-5 FTT-1 Dual Site | 5 | 1 | Site mirroring
Stretched | RAID-5 FTT-1 Preferred | 5 | 1 | Preferred
Stretched | RAID-5 FTT-1 Secondary | 5 | 1 | Secondary
Stretched | RAID-6 FTT-2 Dual Site | 6 | 2 | Site mirroring
Stretched | RAID-6 FTT-2 Preferred | 6 | 2 | Preferred
Stretched | RAID-6 FTT-2 Secondary | 6 | 2 | Secondary
Stretched | VMware Horizon | 1 | 1 | Site mirroring
Table 2 – VMware vSAN Storage Policies

Design Consideration 7 – Network Connectivity: Azure VMware Solution private clouds can be connected using IPSec VPN and Azure ExpressRoute circuits, including a variety of Azure Virtual Networking topologies such as Hub-Spoke and Azure Virtual WAN with Azure Firewall and third-party Network Virtual Appliances. Multiple Azure ExpressRoute circuits can be used to provide redundant connectivity. VMware HCX also supports redundant Network Extension appliances to provide high availability for Layer-2 network extensions. For more information, refer to the Azure VMware Solution networking and interconnectivity concepts. The Azure VMware Solution Cloud Adoption Framework also has example network scenarios that can be considered.

And, if you are interested in Azure ExpressRoute design:
Understanding ExpressRoute private peering to address ExpressRoute resiliency
ExpressRoute MSEE hairpin design considerations

In the following section, I will describe the next steps that would need to be made to progress this high-level design estimate towards a validated detailed design.

Next Steps

The Azure VMware Solution sizing estimate should be assessed using Azure Migrate. With large enterprise solutions for strategic and major customers, an Azure VMware Solution Solutions Architect from Azure, VMware, or a VMware Partner should be engaged to ensure the solution is correctly sized to deliver business value with the minimum of risk. This should also include an application dependency assessment to understand the mapping between application groups and identify areas of data gravity, application network traffic flows, and network latency dependencies.

Summary

In this post, we took a closer look at the typical availability requirements of a customer workload, the architectural building blocks, and the availability design considerations for the Azure VMware Solution. We also discussed the next steps to continue an Azure VMware Solution design.
If you are interested in the Azure VMware Solution, please use these resources to learn more about the service:
Homepage: Azure VMware Solution
Documentation: Azure VMware Solution
SLA: SLA for Azure VMware Solution
Azure Regions: Azure Products by Region
Service Limits: Azure VMware Solution subscription limits and quotas
Stretched Clusters: Deploy vSAN stretched clusters
SKU types: Introduction
Placement policies: Create placement policy
Storage policies: Configure storage policy
VMware HCX: Configuration & Best Practices
GitHub repository: Azure/azure-vmware-solution
Well-Architected Framework: Azure VMware Solution workloads
Cloud Adoption Framework: Introduction to the Azure VMware Solution adoption scenario
Network connectivity scenarios: Enterprise-scale network topology and connectivity for Azure VMware Solution
Enterprise Scale Landing Zone: Enterprise-scale for Microsoft Azure VMware Solution
Enterprise Scale GitHub repository: Azure/Enterprise-Scale-for-AVS
Azure CLI: Azure Command-Line Interface (CLI) Overview
PowerShell module: Az.VMware Module
Azure Resource Manager: Microsoft.AVS/privateClouds
REST API: Azure VMware Solution REST API
Terraform provider: azurerm_vmware_private_cloud Terraform Registry

Author Bio

René van den Bedem is a Principal Technical Program Manager in the Azure VMware Solution product group at Microsoft. His background is in enterprise architecture with extensive experience across all facets of the enterprise, public cloud, and service provider spaces, including digital transformation and the business, enterprise, and technology architecture stacks. René works backwards from the problem to be solved and designs solutions that deliver business value with the minimum of risk. In addition to being the first quadruple VMware Certified Design Expert (VCDX), he is also a Dell Technologies Certified Master Enterprise Architect, a Nutanix Platform Expert (NPX), and a VMware vExpert.
Link to PPTX Diagrams: azure-vmware-solution/azure-vmware-master-diagrams
Azure VMware Solution Security Design Considerations
Azure VMware Solution Design Series Availability Design Considerations Recoverability Design Considerations Performance Design Considerations Security Design Considerations VMware HCX Design with Azure VMware Solution Overview A global enterprise wants to migrate thousands of VMware vSphere virtual machines (VMs) to Microsoft Azure as part of their application modernization strategy. The first step is to exit their on-premises data centers and rapidly relocate their legacy application VMs to the Azure VMware Solution as a staging area for the first phase of their modernization strategy. What should the Azure VMware Solution look like? Azure VMware Solution is a VMware validated first party Azure service from Microsoft that provides private clouds containing VMware vSphere clusters built from dedicated bare-metal Azure infrastructure. It enables customers to leverage their existing investments in VMware skills and tools, allowing them to focus on developing and running their VMware-based workloads on Azure. In this post, I will introduce the typical customer workload security requirements, describe the Azure VMware Solution architectural components, describe the zero trust security model, and describe the security design considerations for Azure VMware Solution private clouds. In the next section, I will introduce the typical security requirements of a customer’s workload. Customer Workload Requirements A typical customer has multiple application tiers that have specific Service Level Agreement (SLA) requirements that need to be met. These SLAs are normally named by a tiering system such as Platinum, Gold, Silver, and Bronze or Mission-Critical, Business-Critical, Production, and Test/Dev. Each SLA will have different availability, recoverability, performance, manageability, and security requirements that need to be met. For the security design quality, customers will normally have governance, regulatory, and compliance requirements. 
This is normally documented for each application and then aggregated into the information security policy requirements for each SLA or line of business. For example:
Line of Business | Governance, Regulatory, & Compliance
Finance | ISO/IEC 27001:2022, GLBA, PCI DSS 4.0
US Government | NIST Cybersecurity Framework, FISMA, FedRAMP
Health Care | ISO/IEC 27001:2022, HIPAA, PCI DSS 4.0
Table 1 – Typical Customer requirements for Information Security
The security concepts introduced in Table 1 have the following definitions:
Governance: The process of making and enforcing decisions within an IT organization or system. It encompasses decision-making, rule-setting, and enforcement mechanisms to guide the functioning of IT in alignment with the business goals and strategies.
Regulatory: The set of laws, regulations, and guidelines that govern the collection, storage, processing, and sharing of sensitive information. These regulations are designed to ensure that organizations protect sensitive information from unauthorized access, use, disclosure, and destruction. Compliance with these regulations is mandatory and non-compliance can result in legal penalties, fines, and reputational damage.
Compliance: The act or process of following the rules, regulations, and standards that apply to the IT organization or system. It can also refer to the ability of an IT system to adapt to changing requirements or demands.
A typical legacy business-critical application will have the following application architecture:
Load Balancer layer: Uses load balancers to distribute traffic across multiple web servers in the web layer to improve application availability.
Web layer: Uses web servers to process client requests made via the secure Hypertext Transfer Protocol (HTTPS). Receives traffic from the load balancer layer and forwards to the application layer.
Application layer: Uses application servers to run software that delivers a business application through a communication protocol.
Receives traffic from the web layer and uses the database layer to access stored data.
Database layer: Uses a relational database management system (RDBMS) cluster to store data and provide database services to the application layer.
Depending upon the security requirements for each service, the infrastructure design could use a mix of technologies to meet the different security policies cost-efficiently.
Figure 1 – Typical Legacy Business-Critical Application Architecture
In the next section, I will introduce the architectural components of the Azure VMware Solution.
Architectural Components
The diagram below describes the architectural components of the Azure VMware Solution.
Figure 2 – Azure VMware Solution Architectural Components
Each Azure VMware Solution architectural component has the following function:
Azure Subscription: Used to provide controlled access, budget, and quota management for the Azure VMware Solution.
Azure Region: Physical locations around the world where we group data centers into Availability Zones (AZs) and then group AZs into regions.
Azure Resource Group: Container used to place Azure services and resources into logical groups.
Azure VMware Solution Private Cloud: Uses VMware software, including vCenter Server, NSX software-defined networking, vSAN software-defined storage, and Azure bare-metal ESXi hosts to provide compute, networking, and storage resources. Azure NetApp Files, Azure Elastic SAN, and Pure Cloud Block Store are also supported.
Azure VMware Solution Resource Cluster: Uses VMware software, including vSAN software-defined storage, and Azure bare-metal ESXi hosts to provide compute, networking, and storage resources for customer workloads by scaling out the Azure VMware Solution private cloud. Azure NetApp Files, Azure Elastic SAN, and Pure Cloud Block Store are also supported.
VMware HCX: Provides mobility, migration, and network extension services.
VMware Site Recovery: Provides Disaster Recovery automation and storage replication services with VMware vSphere Replication. Third party Disaster Recovery solutions Zerto Disaster Recovery and JetStream Software Disaster Recovery are also supported. Dedicated Microsoft Enterprise Edge (D-MSEE): Router that provides connectivity between Azure cloud and the Azure VMware Solution private cloud instance. Azure Virtual Network (VNet): Private network used to connect Azure services and resources together. Azure Route Server: Enables network appliances to exchange dynamic route information with Azure networks. Azure Virtual Network Gateway: Cross premises gateway for connecting Azure services and resources to other private networks using IPSec VPN, ExpressRoute, and VNet to VNet. Azure ExpressRoute: Provides high-speed private connections between Azure data centers and on-premises or colocation infrastructure. Azure Virtual WAN (vWAN): Aggregates networking, security, and routing functions together into a single unified Wide Area Network (WAN). In the next section, I will introduce the Zero Trust Security Model which should be used as the framework for securing your Azure cloud resources. Zero Trust Security Model A holistic approach to Zero Trust should extend to your entire digital estate: inclusive of identities, endpoints, network, data, apps, and infrastructure. Zero Trust architecture serves as a comprehensive end-to-end strategy and requires integration across the elements. The foundation of Zero Trust security is identities. Both human and non-human identities need strong authorization, connecting from either personal or corporate endpoints with compliant devices, requesting access based on strong policies grounded in Zero Trust principles of explicit verification, least-privilege access, and assumed breach. 
As a unified policy enforcement, the Zero Trust policy intercepts the request, explicitly verifies signals from all six foundational elements based upon policy configuration and enforces least-privilege access. Signals include the role of the user, location, device compliance, data sensitivity, and application sensitivity. In addition to telemetry and state information, the risk assessment from threat protection feeds into the policy engine to automatically respond to threats in real time. Policy is enforced at the time of access and continuously evaluated throughout the session. This policy is further enhanced by policy optimization. Governance and compliance are critical to a strong Zero Trust implementation. Security posture assessment and productivity optimization are necessary to measure the telemetry throughout the services and systems. Telemetry and analytics feeds into the threat protection system. Large amounts of telemetry and analytics enriched by threat intelligence generate high-quality risk assessments that can be either manually investigated or automated. Attacks happen at cloud speed and, because humans can’t react quickly enough or sift through all the risks, your defense systems must also act at cloud speed. The risk assessment feeds into the policy engine for real-time automated threat protection and additional manual investigation if needed. Traffic filtering and segmentation is applied to the evaluation and enforcement from the Zero Trust policy before access is granted to any public or private network. Data classification, labeling, and encryption should be applied to emails, documents, and structured data. Access to apps should be adaptive, whether SaaS or on-premises. Runtime control is applied to infrastructure with serverless, containers, IaaS, PaaS, and internal sites with just-in-time (JIT) and version controls actively engaged. 
Finally, telemetry, analytics, and assessment from the network, data, apps, and infrastructure are fed back into the policy optimization and threat protection systems.
Figure 3 – Zero Trust Security Model
Many legacy monolithic application stacks may use the older defense in depth security model; however, we introduced the zero trust security model here for consideration. For more information, refer to the Microsoft Security Adoption Framework and the Microsoft Cybersecurity Reference Architectures. In the next section, I will describe the security design considerations for the Azure VMware Solution.
Security Design Considerations
The architectural design process takes the business problem to be solved and the business goals to be achieved and distills these into customer requirements, design constraints, and assumptions. Design constraints can be characterized by the following three categories:
Laws of the Land – data and application sovereignty, governance, regulatory, compliance, etc.
Laws of Physics – data and machine gravity, network latency, etc.
Laws of Economics – owning versus renting, total cost of ownership (TCO), return on investment (ROI), capital expenditure, operational expenditure, earnings before interest, taxes, depreciation, and amortization (EBITDA), etc.
Each design consideration will be a trade-off between availability, recoverability, performance, manageability, and security design qualities. The desired result is to deliver business value with the minimum of risk by working backwards from the customer problem.
Design Consideration 1 – Governance, Regulatory, & Compliance: Learn how Microsoft cloud services protect your data, and how you can manage cloud data security and compliance for your organization with these certifications, regulations, and standards. Also included are reports, whitepapers, artifacts, and industry/regional resources.
You can also use the Microsoft cloud security benchmark to address the common challenges customers have when securing their Azure infrastructure. For more information, visit Security, governance, and compliance disciplines for Azure VMware Solution.
Design Consideration 2 – Azure Region: Select the relevant Azure Regions that meet your governance, regulatory, and compliance requirements. Azure VMware Solution is available in 30 Azure Regions around the world. Azure VMware Solution is also available in two Azure Government regions and in scope for FedRAMP HIGH authorization. Azure VMware Solution in Azure Government is currently pending authorization for DoD IL4/IL5.
Design Consideration 3 – Identity & Access Management: Use role-based access control to provide secure access to the Azure VMware Solution. Use Microsoft Entra Privileged Identity Management to allow time-bound access to the Azure portal and control plane operations. Use privileged identity management audit history to track operations that highly privileged accounts perform. For more information, refer to Security considerations for Azure VMware Solution workloads and Enterprise-scale identity and access management.
Design Consideration 4 – LDAPS Integration: Select an external identity source for your Azure VMware Solution. The local cloud administrator user accounts for vCenter Server and NSX Manager should be used as emergency access accounts for "break glass" scenarios in your private cloud. They are not intended to be used for daily administrative activities or for integration with other services. Use an external identity source to give your administrators and operators access to the Azure VMware Solution. This allows them to use their corporate credentials when accessing the Azure VMware Solution. For more information, refer to Identity and access. This MTC blog post also provides a detailed procedure to Configure LDAPS within Azure VMware Solution.
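The break-glass guidance in Design Consideration 4 can be checked against login records. The following is a minimal sketch only: the account name in `BREAK_GLASS_ACCOUNTS`, the `LoginEvent` record shape, and the sample events are all assumptions made for illustration and are not taken from Azure VMware Solution documentation. In practice the events would come from whatever log source you stream vCenter Server and NSX Manager logs to.

```python
from dataclasses import dataclass
from datetime import datetime

# Assumption: the names of the local cloud administrator accounts.
# Replace with the accounts that exist in your own private cloud.
BREAK_GLASS_ACCOUNTS = {"cloudadmin@vsphere.local"}


@dataclass(frozen=True)
class LoginEvent:
    """Hypothetical, simplified login record."""
    user: str
    timestamp: datetime
    source: str  # e.g. "vcenter" or "nsx-manager"


def find_break_glass_logins(events):
    """Return the events where a local break-glass account logged in."""
    return [e for e in events if e.user.lower() in BREAK_GLASS_ACCOUNTS]


# Illustrative sample data, not real log output.
events = [
    LoginEvent("alice@corp.example", datetime(2024, 5, 1, 9, 0), "vcenter"),
    LoginEvent("cloudadmin@vsphere.local", datetime(2024, 5, 1, 9, 5), "vcenter"),
    LoginEvent("bob@corp.example", datetime(2024, 5, 1, 9, 7), "nsx-manager"),
]

flagged = find_break_glass_logins(events)
for event in flagged:
    print(f"Review: {event.user} logged in to {event.source} at {event.timestamp}")
```

Any flagged event would then be reviewed to confirm that it corresponds to a genuine emergency rather than routine administration, which should go through the external identity source.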
Design Consideration 5 – SIEM Logging: Select the Azure VMware Solution Diagnostic setting to stream platform logs and metrics to your Security Information & Event Management (SIEM) solution. Azure VMware Solution supports Azure Log Analytics, Azure Event Hub, and Azure Blob Storage as logging targets. The log stream contains the following:
vCenter Server logs
ESXi logs
vSAN logs
NSX Manager logs
NSX Data Center Distributed Firewall logs
NSX Data Center Gateway Firewall logs
NSX Data Center Edge Appliance logs
The Azure VMware Solution does not currently support Syslog Forwarding to a customer Syslog server; however, you can use this Syslog Forwarding function instead. For more information, refer to Configure VMware syslogs for Azure VMware Solution.
Figure 4 – Azure VMware Solution Logging Options
Design Consideration 6 – VMware vSAN Encryption: Use a Customer Managed Key to augment the Azure VMware Solution vSAN encryption process. Customer-managed keys give you control over the encrypted vSAN data on Azure VMware Solution. You can use Azure Key Vault to generate customer-managed keys and centralize the key management process. For more information, refer to Configure customer-managed key encryption at rest in Azure VMware Solution.
Figure 5 – Customer Managed Keys with Azure VMware Solution
Design Consideration 7 – VM Security: Use Trusted Launch with Azure VMware Solution to increase the security of your Virtual Machines. Trusted Launch comprises Secure Boot, Virtual Trusted Platform Module (vTPM), and Virtualization-based Security (VBS) to provide a formidable defense against modern cyber threats. Trusted Launch is a requirement for Windows 11 compatibility. For more information, refer to Trusted Launch for Azure VMware Solution virtual machines.
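Because Trusted Launch in Design Consideration 7 is made up of three features, a VM inventory can be screened for VMs that are missing any of them. This is a minimal sketch, assuming a hypothetical `VmSecurityProfile` record and made-up VM names. It does not query vCenter Server; the three flags would need to be populated from your own inventory tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmSecurityProfile:
    """Hypothetical record of the three Trusted Launch features for one VM."""
    name: str
    secure_boot: bool
    vtpm: bool
    vbs: bool


def missing_trusted_launch_features(vm):
    """Return the Trusted Launch features that are not enabled on the VM."""
    checks = {
        "Secure Boot": vm.secure_boot,
        "vTPM": vm.vtpm,
        "VBS": vm.vbs,
    }
    return [feature for feature, enabled in checks.items() if not enabled]


# Illustrative inventory, not real data.
inventory = [
    VmSecurityProfile("web-01", secure_boot=True, vtpm=True, vbs=True),
    VmSecurityProfile("app-01", secure_boot=True, vtpm=False, vbs=False),
]

report = {vm.name: missing_trusted_launch_features(vm) for vm in inventory}
for name, missing in report.items():
    status = "OK" if not missing else "missing " + ", ".join(missing)
    print(f"{name}: {status}")
```

An empty list for a VM means all three features are enabled in the record; anything else names the gaps to close before treating the VM as Trusted Launch ready.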
Design Consideration 8 – Firewall Placement: Select the placement locations of network security firewalls within the Azure VMware Solution or Azure native services, including the zones of trust topology to meet your traffic flow requirements. Zones of trust refer to the concept of segmenting a network into different areas based upon the level of trust assigned to the devices and users within that area. This is used as part of the Zero Trust security model, where access to resources is granted on a need-to-know basis and is strictly enforced through continuous authentication and authorization. Traffic flows refer to the movement of data between different zones of trust in a network. Understanding traffic flows is important for network design, capacity planning, and information security. For Azure VMware Solution, the zones of trust need to account for the Azure VMware Solution private clouds, Azure native services, the internet, and other non-Azure locations. Figure 6 – Zones of Trust There are four options in the Azure Cloud Adoption Framework that describe different ways to secure and manage the network traffic between the Azure VMware Solution and other Azure resources, on-premises, and the internet: Option 1: Use a Secured Virtual WAN hub with default route propagation to route all traffic through an Azure Firewall or a third-party security provider. Option 2: Use Network Virtual Appliances in Azure Virtual Network to inspect all network traffic and apply firewall rules or policies. Option 3: Egress from Azure VMware Solution with or without NSX or NVA to control the outbound traffic from Azure VMware Solution to other destinations. Option 4: Use third-party firewall solutions in a hub virtual network with Azure Route Server to enable dynamic routing between Azure VMware Solution and the firewall appliances. Each option has its own benefits and trade-offs depending upon your requirements and preferences. 
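The zones of trust and traffic flows described in Design Consideration 8 can be written down as an explicit allow-list, with every other cross-zone flow denied by default. The sketch below is illustrative only: the four zones follow the text above, but the `ALLOWED_FLOWS` entries are hypothetical and must be replaced by your own traffic flow requirements. It models policy intent and does not configure any firewall.

```python
from enum import Enum
from itertools import permutations


class Zone(Enum):
    AVS_PRIVATE_CLOUD = "avs-private-cloud"
    AZURE_NATIVE = "azure-native"
    INTERNET = "internet"
    ON_PREMISES = "on-premises"


# Hypothetical allow-list of (source, destination) flows between zones.
# Anything not listed is denied, in line with least-privilege access.
ALLOWED_FLOWS = {
    (Zone.ON_PREMISES, Zone.AVS_PRIVATE_CLOUD),
    (Zone.AVS_PRIVATE_CLOUD, Zone.AZURE_NATIVE),
    (Zone.AZURE_NATIVE, Zone.AVS_PRIVATE_CLOUD),
    (Zone.AVS_PRIVATE_CLOUD, Zone.INTERNET),
}


def is_flow_allowed(source, destination):
    """Default deny: a cross-zone flow is allowed only if explicitly listed.

    Flows within a single zone are not modelled here; those would be
    handled by east-west controls such as micro-segmentation.
    """
    return (source, destination) in ALLOWED_FLOWS


# Enumerate every ordered pair of distinct zones and list the denied ones.
denied_flows = [
    (src, dst)
    for src, dst in permutations(Zone, 2)
    if not is_flow_allowed(src, dst)
]

for src, dst in denied_flows:
    print(f"DENY {src.value} -> {dst.value}")
```

Writing the matrix out this way makes it easier to see which zone boundaries need a firewall in the path, which in turn informs the choice between the placement options.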
In addition, we have VMware NSX features that can be used to secure north-south and east-west traffic, including the optional use of host-based endpoint solutions.
Figure 7 – Azure Firewall and third-party NVA options
Option 1 – Secured Virtual WAN hub: Use a Secured Virtual WAN hub with default route propagation to route all traffic through an Azure Firewall or a third-party security provider. This solution doesn't work for on-premises filtering, and Global Reach bypasses the Virtual WAN hubs. For more information, refer to Azure Cloud Adoption Framework.
Figure 8 – Option 1: Secured Virtual WAN hub
Option 2 – Third-party Firewalls for all traffic: Use Network Virtual Appliances in Azure Virtual Network to inspect all network traffic and apply firewall rules or policies. Choose this option if you want to use your existing NVA and centralize all traffic inspection in your hub virtual network. For more information, refer to Azure Cloud Adoption Framework.
Figure 9 – Option 2: Third-party Firewalls for all traffic
The diagram below provides an example of how FortiNet FortiGate NVAs can be used to build Option 2. For more information, refer to Azure routing and network interfaces.
Figure 10 – Example design of Option 2 with FortiNet FortiGate NVAs
Option 3 – Egress from Azure VMware Solution: Egress from Azure VMware Solution with or without NSX or an NVA to control the outbound traffic from Azure VMware Solution to other destinations. Choose this option if you need to inspect traffic from two or more Azure VMware Solution private clouds. This option lets you use NSX native features. You can also combine this option with NVAs running on Azure VMware Solution between Tier-0 and Tier-1 Gateways. For more information, refer to Azure Cloud Adoption Framework.
Figure 11 – Option 3: Egress from Azure VMware Solution
Option 4 – Third-party Firewall for internet traffic: Use a third-party firewall solution in a hub virtual network with Azure Route Server to enable dynamic routing between Azure VMware Solution and the firewall appliances. Choose this option to advertise the default route from an NVA in your Azure hub virtual network to an Azure VMware Solution. For more information, refer to Azure Cloud Adoption Framework.
Figure 12 – Option 4: Third-party Firewall for internet traffic
Option 5 – NSX Gateways: Use NSX Gateways to secure North-South network traffic. You can use the network filtering capabilities of the Tier-0 and Tier-1 Gateways in NSX to provide North-South traffic filtering. For more information, refer to Azure VMware Solution Network Security.
Figure 13 – VMware NSX Gateway Firewall policies for North-South traffic
Option 6 – Micro segmentation with DFW: Use NSX Distributed Firewall to secure East-West network traffic. You can use the Distributed Firewall in NSX to provide East-West micro segmentation of traffic flows. For more information, refer to Azure VMware Solution Network Security.
Figure 14 – VMware NSX Micro-segmentation (DFW) for East-West traffic
Option 7 – Microsoft Defender for Cloud: Use Microsoft Defender for Cloud or a third-party host-based security solution to secure your traffic from the guest operating system. Microsoft Defender for Cloud provides advanced threat protection across your Azure VMware Solution and on-premises virtual machines (VMs). It assesses the vulnerability of Azure VMware Solution VMs and raises alerts as needed. These security alerts can be forwarded to Azure Monitor for resolution. You can define security policies in Microsoft Defender for Cloud. For more information, refer to Integrate Microsoft Defender for Cloud with Azure VMware Solution.
Figure 15 – Microsoft Defender for Cloud with Azure VMware Solution
Design Consideration 9 – Internet Access: Azure VMware Solution has three options within each private cloud instance that can be configured. These are explicitly described separate from the firewall scenarios in Design Consideration 8 even though they overlap.
Option 1 – Internet Disabled: Use Azure native networking to provide internet access. There are multiple ways to generate a default route in Azure and send it towards your Azure VMware Solution private cloud or on-premises. The options are as follows:
An Azure Firewall in a Virtual WAN Hub.
A third-party Network Virtual Appliance in a Virtual WAN Hub Spoke Virtual Network.
A third-party Network Virtual Appliance in a Native Azure Virtual Network using Azure Route Server.
A default route from on-premises transferred to Azure VMware Solution over Global Reach.
For more information, refer to Internet connectivity design considerations.
Figure 16 – Azure VMware Solution Internet Disabled
Option 2 – Managed SNAT: Managed SNAT service provides a simple method for outbound internet access from an Azure VMware Solution private cloud. Features of Managed SNAT include the following:
Easily enabled.
No control over SNAT rules; all sources that reach the SNAT service are allowed.
No visibility of connection logs.
Two Public IPs are used and rotated to support up to 128k simultaneous outbound connections.
No inbound DNAT capability is available.
For more information, refer to Internet connectivity design considerations.
Figure 17 – Azure VMware Solution Managed SNAT Internet Access
Option 3 – Public IP Address (PIP): Use an allocated Azure Public IPv4 address directly with the NSX Edge for consumption. PIP allows the Azure VMware Solution private cloud to directly consume and apply public network addresses in NSX as required.
These addresses are used for the following types of connections:
Outbound SNAT
Inbound DNAT
Load balancing using VMware NSX Advanced Load Balancer and other third-party Network Virtual Appliances
Applications directly connected to a workload VM interface.
This option also lets you configure the public address on a third-party Network Virtual Appliance to create a DMZ within the Azure VMware Solution private cloud. The Azure Public IP addresses are customer owned and are not actively scanned by Microsoft. Microsoft Defender for Cloud or a suitable third-party solution should be used to fully secure the internet connection. These options are discussed in Design Consideration 8. For more information, refer to Internet connectivity design considerations.
Figure 18 – Azure VMware Solution VMware NSX Public IP Address (PIP)
The use of a “T1 Sandwich” with a third-party NVA allows you to scale beyond the 10 vNIC limitation for VMware vSphere virtual machine hardware. The result is an increase in the number of networks you can protect with an NVA. The diagram below provides an example of how a CheckPoint NVA can be used with a T1 Sandwich topology.
Figure 19 – VMware NSX T1 Gateway Sandwich with CheckPoint
In the following section, I will describe the next steps needed to progress this high-level design estimate towards a validated detailed design.
Next Steps
The Azure VMware Solution sizing estimate should be assessed using Azure Migrate. With large enterprise solutions for strategic and major customers, an Azure VMware Solution Solutions Architect from Azure, VMware, or a trusted VMware Partner should be engaged to ensure the solution is correctly sized to deliver business value with the minimum of risk. This should also include an application dependency assessment to understand the mapping between application groups and identify areas of data gravity, application network traffic flows, and network latency dependencies.
Summary
In this post, we took a closer look at the typical security requirements of a customer workload, the architectural building blocks, the zero trust security model, and the security design considerations for the Azure VMware Solution. We also discussed the next steps to continue an Azure VMware Solution design.
If you are interested in the Azure VMware Solution, please use these resources to learn more about the service:
Homepage: Azure VMware Solution
Documentation: Azure VMware Solution
SLA: SLA for Azure VMware Solution
Azure Regions: Azure Products by Region
Security Fundamentals: Azure security fundamentals
Zero Trust Model: Zero Trust Model - Modern Security Architecture
Zero Trust Security Framework: Zero Trust security in Azure
Microsoft Cybersecurity: Microsoft Cybersecurity Reference Architectures
Adoption Framework: Microsoft Security Adoption Framework
Microsoft cloud: Learn how Microsoft cloud services protect your data
Benchmark: Microsoft cloud security benchmark
Azure VMware Solution: Security, governance, and compliance disciplines
Vulnerabilities: Concepts - How Azure VMware Solution Addresses Vulnerabilities
Security Recommendations: Concepts - Security recommendations for Azure VMware Solution
Security Baseline: Azure security baseline for Azure VMware Solution
Defense in Depth: Microsoft Azure's defense in depth approach to cloud vulnerabilities
Azure Compliance: Azure compliance documentation
WAF: Security considerations for Azure VMware Solution workloads
Identity & Access Management: Enterprise-scale identity and access management
Configure LDAPS: Configure LDAPS within Azure VMware Solution
Syslog: Configure VMware syslogs for Azure VMware Solution
Syslog Forwarding: Syslog Forwarding function
Customer-managed keys: Configure customer-managed key encryption at rest
Trusted Launch: Trusted Launch for Azure VMware Solution virtual machines
Network connectivity scenarios: Enterprise-scale network topology and connectivity for Azure VMware Solution
Network Security: Azure VMware Solution Network Security
Defender for Cloud: Integrate Microsoft Defender for Cloud with Azure VMware Solution
Internet Connectivity: Internet connectivity design considerations
Service Limits: Azure VMware Solution subscription limits and quotas
GitHub repository: Azure/azure-vmware-solution
Well-Architected Framework: Azure VMware Solution workloads
Cloud Adoption Framework: Introduction to the Azure VMware Solution adoption scenario
Enterprise Scale Landing Zone: Enterprise-scale for Microsoft Azure VMware Solution
Enterprise Scale GitHub repository: Azure/Enterprise-Scale-for-AVS
Azure CLI: Azure Command-Line Interface (CLI) Overview
PowerShell module: Az.VMware Module
Azure Resource Manager: Microsoft.AVS/privateClouds
REST API: Azure VMware Solution REST API
Terraform provider: azurerm_vmware_private_cloud Terraform Registry
Author Bio
René van den Bedem is a Principal Technical Program Manager in the Azure VMware Solution product group at Microsoft. His background is in enterprise architecture with extensive experience across all facets of the enterprise, public cloud, and service provider spaces, including digital transformation and the business, enterprise, and technology architecture stacks. René works backwards from the problem to be solved and designs solutions that deliver business value with the minimum of risk. In addition to being the first quadruple VMware Certified Design Expert (VCDX), he is also a Dell Technologies Certified Master Enterprise Architect, a Nutanix Platform Expert (NPX), and a VMware vExpert.
Link to PPTX Diagrams: azure-vmware-solution/azure-vmware-master-diagrams
What's new in Azure Migrate?
Introduction
The journey to the cloud is an essential step for modern enterprises looking to leverage the benefits of scalability, flexibility, and cost-efficiency. A crucial part of this transformation is understanding the current state of your IT infrastructure, including workloads, applications, and their interdependencies. Often, organizations aim to set their migration goals based on the applications they want to move to the cloud, rather than focusing on individual servers or databases in isolation. I am thrilled to share that Azure Migrate is evolving to both simplify and enrich your cloud adoption journey. We are introducing new capabilities in Azure Migrate to help you achieve your goals.
Introducing Application awareness in Azure Migrate [limited preview]
A key step in any cloud transformation plan is a current state analysis of the entire IT estate covering workloads and applications, and the relationships/dependencies among them. I am excited to announce the limited preview of application aware experiences in Azure Migrate – across every phase of the migration journey. This allows you to gain insights into the total cost of ownership, identify suitable IaaS and PaaS targets, and receive tailored migration and modernization guidance. To get started with Azure Migrate, simply create an Azure Migrate project on Azure portal, and leverage Azure Migrate’s wide-ranging discovery capabilities, including the Azure Migrate appliance or importing inventory via RVTools to discover your environment. This allows you to explore inventory across Infra-Data-Web tiers and use the updated dependency analysis to identify application boundaries. As part of the application aware experiences, we are introducing the concept of tags within Azure Migrate. So once dependencies are identified, you can group the dependent workloads comprising an application via tags.
Then, Azure Migrate can be used to create application-specific business cases to identify savings and ROI, assess ideal migration strategies, and get recommendations for Azure services, SKUs, resource costs, and migration/modernization tools. Further, as part of executing the migration and onboarding to Azure, customers can use the recommended tools to modernize via re-platform and refactor (out of band) techniques or use the integrated rehost migration experience to rehost to Azure VM.
Complemented with a refreshed user experience
As part of delivering application awareness and sustainability insights, Azure Migrate will also feature a refreshed user interface. The new experience is designed to help customers across every step of the migration journey – across Decide, Plan and Execute phases. The experience provides you with a new intuitive table of contents and overview page to allow easy navigation. You can explore discovered workloads and their relationships through effective search, sort, and seamless transition from Azure Migrate to other specialized migration tools, depending on your specific goals and requirements. Finally, you can quickly create and visualize different migration and modernization strategies side-by-side.
Expanded support for workloads and platforms
In addition to the capabilities described above, Azure Migrate continues to evolve to support capabilities provided by Azure for customers to evaluate and execute as part of their cloud adoption journey. As part of this effort, I am pleased to announce public preview of the following capabilities. These capabilities are available for customers, partners and sellers to try today!
ROI/TCO of Azure Arc in Azure Migrate Business Case [public preview]
We understand that customers are looking to understand the best path as they evaluate the cloud. This includes continuing to stay on-premises in their current environment while benefiting from Azure services such as Azure Arc.
Knowing the varying needs of every customer and with the goal to meet customers where they are, we are introducing ROI envisioning for Azure Arc in the Azure Migrate Business Case. The Azure Migrate business case helps you compare the Total Cost of Ownership (TCO) for on-premises estates versus Azure, including year-on-year cash flow analysis. With this new capability, the Azure Migrate Business Case now includes the added value of Azure Arc for resources remaining on-premises during the customer’s migration journey. You can now visualize cost savings and other benefits of using Azure security and management tools via Azure Arc for your on-premises servers and see licensing benefits such as Extended Security Updates and SQL Pay-As-You-Go. In addition to visualizing the business case for Arc, customers can identify machines that are not yet Arc-enabled and onboard them at scale directly from the Azure Migrate portal. Additional details and step by step instructions can be found here.
Support for migrations to Azure Stack HCI [public preview]
Azure Stack HCI enables customers to run workloads in the private cloud or edge and offers an ideal platform for modernizing workloads with enhanced performance, scalability, simplified management, and cost efficiency. To support this modernization, we have introduced the ability to migrate virtual machines from Hyper-V and VMware environments to Azure Stack HCI using Azure Migrate: Server Migrations. Like Azure migrations, you can leverage Azure Migrate to discover virtual machines from VMware and Hyper-V environments at scale, without needing prior agent installation. After discovery, you can migrate virtual machines to Azure Stack HCI through an easy-to-use Azure Migrate portal experience, ensuring zero data loss and minimal downtime. The migration keeps the data flow local, from on-premises to Azure Stack HCI. Learn more about this capability here.
Expanded OSS Support in Azure Migrate [public preview]
Azure Migrate has been diligently expanding its capabilities to better support customers using Linux. We are thrilled to highlight three significant updates that enhance your migration experience:
Support for newer Linux Distributions [public preview]
Azure Migrate now supports a range of newer Linux distributions, including Rocky Linux, Alma Linux, SLES 15, RHEL 9, and Ubuntu 22.04. This enhancement ensures a broader compatibility for Linux workloads, allowing you to migrate seamlessly, whether using agentless or agent-based migrations.
Azure Hybrid Benefit (AHB) for Enterprise Linux [public preview]
We've integrated Azure Hybrid Benefit (AHB) for Enterprise Linux (RHEL and SLES) into the migration process. Customers can visualize the savings from AHB directly in Azure Migrate business case assessments, maximizing their return on investment. To leverage AHB, you can directly enable the appropriate licenses for migrating Enterprise Linux machines within Azure Migrate. This integration eliminates the need for manual installation of the AHB extension post migrations, streamlining the migration workflow and ensuring compliance.
Discovery and Assessment of MySQL Databases [public preview]
In our endeavor to increase coverage of OSS workloads in Azure Migrate, we are announcing discovery and modernization assessment of MySQL databases running on Linux servers. Customers previously had limited visibility into their MySQL workloads and often received generalized VM lift-and-shift recommendations. With this new capability, you can now accurately identify the MySQL workloads and assess them for right-sizing into Azure Database for MySQL: Flexible Server.
CSV Import powered discovery for SQL Servers [limited preview]
We understand that deploying an appliance may not be the quickest way to generate migration assessments to enable planning.
Further, customers often can’t provide credentials for SQL Server instances to allow Azure Migrate to capture relevant details and provide accurate readiness and right-sized recommendations. Hence, we are now adding the ability to import SQL Server details which can then be used to discover SQL Server instances and databases and generate accurate assessment reports. Use existing repositories such as SQL Server Dynamic Management Views, SCOM etc. to populate the CSV schema required to discover SQL Server.
Interested in trying the limited preview experience?
The capabilities described above are currently in limited preview. To take advantage of these capabilities for your environment, please share your interest here.
Conclusion
The enhancements in Azure Migrate underscore our commitment to providing comprehensive, user-friendly, and efficient migration solutions. Stay tuned for more updates and join us at Ignite 2024 for a detailed demo of these exciting new features. Curious to learn more? Here is a sneak peek of what we plan to announce at Ignite - https://youtu.be/aquRVLvau7c

VMware HCX Troubleshooting with Azure VMware Solution
Overview
VMware HCX is one of the Azure VMware Solution components that generates a large number of service requests from our customers. The Azure VMware Solution product group has worked to cover the most common troubleshooting considerations that you should know about when using VMware HCX with the Azure VMware Solution. Azure VMware Solution is a VMware validated first party Azure service from Microsoft that provides private clouds containing VMware vSphere clusters built from dedicated bare-metal Azure infrastructure. It enables customers to leverage their existing investments in VMware skills and tools, allowing them to focus on developing and running their VMware-based workloads on Azure. VMware HCX is the mobility and migration software used by the Azure VMware Solution to connect remote VMware vSphere environments to the Azure VMware Solution. These remote VMware vSphere environments can be on-premises, co-location or cloud-based instances.
Figure 1 – Azure VMware Solution with VMware HCX Service Mesh
In the next section, I will introduce the architectural components of the Azure VMware Solution.
Architectural Components
The diagram below describes the architectural components of the Azure VMware Solution.
Figure 2 – Azure VMware Solution Architectural Components
Each Azure VMware Solution architectural component has the following function:
Azure Subscription: Used to provide controlled access, budget and quota management for the Azure VMware Solution.
Azure Region: Physical locations around the world where we group data centers into Availability Zones (AZs) and then group AZs into regions.
Azure Resource Group: Container used to place Azure services and resources into logical groups.
Azure VMware Solution Private Cloud: Uses VMware software, including vCenter Server, NSX software-defined networking, vSAN software-defined storage, and Azure bare-metal ESXi hosts to provide compute, networking, and storage resources. Azure NetApp Files, Azure Elastic SAN, and Pure Cloud Block Store are also supported.
Azure VMware Solution Resource Cluster: Uses VMware software, including vSAN software-defined storage, and Azure bare-metal ESXi hosts to provide compute, networking, and storage resources for customer workloads by scaling out the Azure VMware Solution private cloud. Azure NetApp Files, Azure Elastic SAN, and Pure Cloud Block Store are also supported.
VMware HCX: Provides mobility, migration, and network extension services.
VMware Site Recovery: Provides Disaster Recovery automation, and storage replication services with VMware vSphere Replication. Third party Disaster Recovery solutions Zerto DR and JetStream DR are also supported.
Dedicated Microsoft Enterprise Edge (D-MSEE): Router that provides connectivity between Azure cloud and the Azure VMware Solution private cloud instance.
Azure Virtual Network (VNet): Private network used to connect Azure services and resources together.
Azure Route Server: Enables network appliances to exchange dynamic route information with Azure networks.
Azure Virtual Network Gateway: Cross premises gateway for connecting Azure services and resources to other private networks using IPSec VPN, ExpressRoute, and VNet to VNet.
Azure ExpressRoute: Provides high-speed private connections between Azure data centers and on-premises or colocation infrastructure.
Azure Virtual WAN (vWAN): Aggregates networking, security, and routing functions together into a single unified Wide Area Network (WAN).
In the next section, I will describe the troubleshooting steps you should follow for VMware HCX when used with the Azure VMware Solution.
Troubleshooting Considerations
Before opening a ticket with Microsoft support, please use the following steps as a checklist to ensure you are not impacted by the most common VMware HCX issues.
Troubleshooting Step 1: Download the VMware HCX Connector.
Once VMware HCX is deployed on the Azure VMware Solution side, the download for the VMware HCX Connector OVA is in the VMware HCX UI plugin. Under Administration, there is a Request Download Link option. The OVA can be copied locally or a download link for the OVA can be selected.
Figure 3 – VMware HCX Connector OVA Download
Troubleshooting Step 2: Upgrade to HCX Enterprise.
Azure VMware Solution comes with an Enterprise license key for VMware HCX. If you have a pre-existing VMware HCX Connector on-premises that is licensed for VMware HCX Advanced, please be sure to upgrade the connector to the Enterprise version. To upgrade VMware HCX, navigate to the HCX Connector at https://<hcx_connector_fqdn>:9443, select Licensing and Activation under the Configuration section, edit the current license, and enter the VMware HCX Enterprise license key obtained from the Azure VMware Solution portal. Verify that the license is showing Enterprise.
Figure 4 – VMware HCX Connector License Key
Once you have updated the VMware HCX Connector, be sure to update/edit the VMware HCX Compute Profile and Service Mesh to include the updated VMware HCX services that you would like to take advantage of, such as Replication Assisted vMotion and OS Assisted Migration. OS Assisted Migration is used for migrating and converting Microsoft Hyper-V and Red Hat KVM workloads into Azure VMware Solution.
Figure 5 – VMware HCX Connector Compute Profile Service Activation
Troubleshooting Step 3: Only use the key from the Azure VMware Solution private cloud you are connecting to.
When deploying the VMware HCX Connector on-premises, the activation key should come from the Azure VMware Solution private cloud you are migrating to. In the Azure portal, an activation key can be obtained in the Add-ons section. Simply request an activation key, give it a friendly name, and map that activation key to the on-premises VMware HCX Connector.
Figure 6 – VMware HCX Connector License Key
Troubleshooting Step 4: Do not use an IPSec VPN.
If possible, avoid using an IPSec VPN connection to Azure VMware Solution when migrating with VMware HCX. Migrating with VMware HCX over VPN has been known to cause issues and multiple migration failures. Although using VMware HCX via VPN is supported, it is not the recommended way to migrate virtual machines to Azure VMware Solution. One of the biggest caveats of migrating VMs with VMware HCX over VPN is that a separate uplink network profile is needed on-premises. The management network cannot be used as an uplink profile, as the MTU of the uplink profile needs to be adjusted to 1300 to accommodate the IPSec overhead. Note that VMware HCX uses IPSec VPN natively as part of the VMware HCX Service Mesh.
Troubleshooting Step 5: Check MTU size within your Network Profile.
Be sure to verify the MTU setting on the Network Profiles setup. Within VMware HCX, navigate to the Interconnect section, select Network Profiles, and verify that the correct MTU size is being used for each profile. Be sure to verify this on both ends of the VMware HCX site pair.
Figure 7 – VMware HCX MTU size in Network Profile
Use the recommended MTU sizes in the table below for the Network Profiles when connecting to Azure VMware Solution.
Connectivity Method | Management | Uplink | Replication | vMotion
Azure ExpressRoute | 1500 | 1500 | 1500 or 9000 | 1500 or 9000
VMware HCX over IPSec VPN | 1500 | 1300 | 1500 or 9000 | 1500 or 9000
Table 1 – VMware HCX Network Profile MTU Sizes
Troubleshooting Step 6: Always keep your VMware HCX versions updated (Connectors, Cloud Manager and Service Meshes).
Before you upgrade VMware HCX, check the VMware product interoperability matrix to ensure the integrated versions of on-premises VMware solution software are supported by the new version of VMware HCX you are going to upgrade to. Updates to VMware HCX are released regularly by VMware.
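The MTU recommendations in Table 1 can be turned into a quick sanity check for the values configured in your Network Profiles. The following is a minimal sketch in Python, not part of VMware HCX or Azure tooling: the dictionary keys, the `check_mtu` function, and the way profile values are passed in are all hypothetical names chosen for illustration, and the numbers are simply transcribed from Table 1.

```python
# Recommended MTU sizes per Network Profile, transcribed from Table 1.
# The keys ("expressroute", "ipsec_vpn", "management", ...) are
# illustrative names, not identifiers used by VMware HCX.
RECOMMENDED_MTU = {
    "expressroute": {
        "management": {1500},
        "uplink": {1500},
        "replication": {1500, 9000},
        "vmotion": {1500, 9000},
    },
    "ipsec_vpn": {
        "management": {1500},
        "uplink": {1300},
        "replication": {1500, 9000},
        "vmotion": {1500, 9000},
    },
}


def check_mtu(connectivity, profiles):
    """Compare configured MTU values against Table 1.

    Returns a list of (profile, configured_mtu, allowed_mtus) tuples,
    one for every profile whose MTU is not a recommended value.
    """
    allowed = RECOMMENDED_MTU[connectivity]
    mismatches = []
    for name, mtu in profiles.items():
        if mtu not in allowed[name]:
            mismatches.append((name, mtu, sorted(allowed[name])))
    return mismatches


# Example: an uplink profile left at 1500 while migrating over IPSec VPN.
configured = {"management": 1500, "uplink": 1500,
              "replication": 1500, "vmotion": 9000}
for name, mtu, allowed in check_mtu("ipsec_vpn", configured):
    print(f"{name}: MTU {mtu} is not in the recommended set {allowed}")
```

The values still have to be read from the Network Profiles screen on both sides of the site pair by hand; the sketch only compares them against the table.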
It is the responsibility of the customer to upgrade and maintain VMware HCX on both sides of the Service Mesh (on-premises and Azure VMware Solution). When updating VMware HCX, the VMware HCX Cloud Managers should be updated first. It is recommended to create a backup of the VMware HCX Connector before updating. Backups of the VMware HCX Connector can be taken through the VMware HCX manager UI at https://<hcx_connector_fqdn>:9443 with the admin password created at the time of VMware HCX Connector deployment. Under the Administration section, go to the Backups and restore section. Backups can be taken here on demand or scheduled. Optionally, you can also take a vSphere snapshot of the on-premises VMware HCX Connector.
Figure 8 – VMware HCX Connector Backup & Restore
Updates for the VMware HCX Cloud Managers can be found in the Administration section: select your current version and click the ‘Check for Updates’ button. If a new version is available, you will be able to download and update to the newest version. Backups of the VMware HCX Cloud Manager are taken automatically each day.
Figure 9 – VMware HCX Upgrades
It should be noted that VMware HCX Service Meshes are updated independently of the VMware HCX Cloud Managers and Connectors. Upon completion of the VMware HCX Cloud Manager and Connector updates, Service Meshes should be updated next. VMware HCX Cloud Managers and Service Meshes should be upgraded in order and together so as not to cause issues with mixed mode versions of Managers and Service Meshes. Running mixed mode versions of VMware HCX Cloud Managers, Connectors, and Service Meshes in production is highly discouraged. You can lose certain features and it often creates issues within the environment.
Figure 10 – VMware HCX Manager Service Mesh Update
During the Service Mesh update process, if Network Extension appliances are deployed, a temporary loss of connectivity will occur while the appliances update.
For Network Extension appliances in an HA pair, downtime is approximately a few seconds. Network Extension appliances not in an HA pair will incur downtime of approximately one minute.
Troubleshooting Step 7: On-Premises Network Connectivity and Firewalls.
For VMware HCX to be activated and receive updates, your on-premises firewalls need to allow outbound traffic to port 443 for the following websites:
https://connect.hcx.vmware.com
https://hybridity-depot.vmware.com
https://hcx.<guid>.<region>.avs.azure.com
Your on-premises firewalls will also need to allow outbound traffic to UDP port 4500. Within VMware HCX, UDP port 4500 serves a specific purpose: it allows IPSec VPN communication between VMware HCX components across environments and is essential for communication and data transfer between environments to work. When configuring VMware HCX, you need to ensure that this port is open between your on-premises VMware HCX Connector uplink network profile and the Azure VMware Solution HCX Cloud Manager uplink network profile. Another common issue we see with VMware HCX is that the on-premises VMware HCX Connector is unable to reach the VMware HCX activation and entitlement website. A simple way to verify that your on-premises environment has access to the activation and entitlement website is as follows. SSH into the on-premises VMware HCX Connector and run the curl commands below to verify connectivity:
curl -k -v https://connect.hcx.vmware.com
curl -k -v https://hybridity-depot.vmware.com
A successful connection to the above websites will look like the figure below.
Figure 11 – VMware HCX Connector SSH CURL connectivity test
Troubleshooting Step 8: Diagnostics page on the Service Mesh.
Built into the VMware HCX Service Mesh there is an option to run a diagnostics check on the Service Mesh appliances. This is an effective way to verify the health of your Service Mesh and pinpoint any specific issues the appliances may have.
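As a complement to the curl test in Troubleshooting Step 7, the outbound TCP 443 requirement can also be probed from any on-premises machine that sits behind the same firewall. The sketch below is an illustration only, using the Python standard library: the `check_outbound` function and the `HCX_ENDPOINTS` list are hypothetical names, the host list contains only the two public endpoints named in Step 7 (the per-cloud `hcx.<guid>.<region>.avs.azure.com` name must be filled in by you), and a TCP connect cannot verify the UDP 4500 requirement.

```python
import socket

# Public endpoints named in Troubleshooting Step 7. The private cloud
# specific host (hcx.<guid>.<region>.avs.azure.com) is deliberately left
# out because its value is different for every deployment.
HCX_ENDPOINTS = ["connect.hcx.vmware.com", "hybridity-depot.vmware.com"]


def check_outbound(hosts, port=443, timeout=5.0,
                   connect=socket.create_connection):
    """Try a TCP connection to each host and report success per host.

    `connect` is injectable so the function can be exercised without
    network access; by default it is socket.create_connection.
    This only tests TCP reachability. It does not validate TLS,
    proxies, or the UDP 4500 path used by the Service Mesh.
    """
    results = {}
    for host in hosts:
        try:
            conn = connect((host, port), timeout)
            conn.close()
            results[host] = True
        except OSError:
            # Covers DNS failures, refusals, and timeouts alike.
            results[host] = False
    return results
```

A typical call would be `check_outbound(HCX_ENDPOINTS)`, which returns a dictionary mapping each host name to `True` or `False`. A `False` result usually points at DNS, proxy, or firewall configuration rather than at VMware HCX itself.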
In the VMware HCX Connector user interface, under the Interconnect section, select the Service Mesh you want to run the diagnostics on. Under the “More” link, select Run Diagnostics to perform a health check on the appliances.
Figure 12 – VMware HCX Service Mesh Run Diagnostics
Once the Diagnostics test is completed, if there are any issues, a red banner will appear under the Service Mesh name. You can drill down to the specific issues by clicking on the red alert (!) icon.
Figure 13 – VMware HCX Service Mesh Alert
Troubleshooting Step 9: If you are having issues with the source side interface, reboot the VMware HCX Connector.
VMware HCX Connectors may develop issues over time. It is recommended to reboot the VMware HCX Connector if it has been up and running for an extended period without a reboot. On the Azure VMware Solution side, customers can reboot the VMware HCX Cloud Manager through a Run Command in the Azure portal. A Force or Hard Reboot of the VMware HCX Cloud Manager is also offered. Please use this with caution as it does not check for any active migrations or replications that may be occurring.
Figure 14 – Azure VMware Solution Run Command Restart-HCXManager
Troubleshooting Step 10: Logging into the VMware HCX Cloud Manager directly.
You can log in to the VMware HCX Cloud Manager directly. At times the VMware HCX plugin in your Azure VMware Solution vSphere Client will not be available or will fail to open. You can obtain the IP address of the VMware HCX Cloud Manager in the Azure portal when you are in the Azure VMware Solution resource. In the Add-ons section under “Migration using VMware HCX”, the IP address of the VMware HCX Cloud Manager will be listed. It is part of the /22 network you provided when deploying Azure VMware Solution. Access the manager directly at https://<x.x.x.9>:443 or https://hcx.<guid>.<region>.avs.azure.com.
The VMware HCX Cloud Manager IP address will always end with a .9 octet.
Figure 15 – VMware HCX Cloud Manager Login
Troubleshooting Step 11: Network Extensions are for temporary migration phases, not for permanent use.
At its core VMware HCX is a migration tool. When using Network Extensions in VMware HCX, it is important to understand that these Network Extensions should be a temporary solution used during the migration process to migrate VMs into Azure VMware Solution with no downtime during the migration. It is best practice to remove the network extensions as soon as the migration waves are completed. Leaving network extensions in place for extended periods of time can cause issues and outages in your environment. Use Network Extensions with caution.
Figure 16 – VMware HCX Network Extension
Troubleshooting Step 12: If you have Mobility Optimized Networking (MON) enabled, ensure you have the router location set to the correct side.
When configuring MON, verify where the default gateway resides. The default gateway will always be located on the source side of the network extension. Primarily, it will reside in the on-premises data center when connecting to Azure VMware Solution.
Figure 17 – VMware HCX Mobility Optimized Network (MON)
Troubleshooting Step 13: OS Assisted Migration - Sentinel Gateway Appliances.
When using VMware HCX OS Assisted Migration, it is important to maintain and manage the VMware HCX Sentinel Gateway Appliance (SGW) at the source site (on-premises). The Sentinel Gateway Appliance is responsible for establishing a forwarding connection with the VMware HCX Sentinel Data Receiver (SDR) on the destination site. Managing and maintaining the Sentinel Gateway appliance’s resources (CPU and memory configuration) is the responsibility of the customer.
Next Steps
If this has not resolved the VMware HCX issue in your Azure VMware Solution private cloud, please open a Service Request with Microsoft to continue the resolution process.
Summary
In this post, we described helpful troubleshooting tips when facing some of the most common VMware HCX service issues our customers have with the Azure VMware Solution. If you are interested in the Azure VMware Solution, please use these resources to learn more about the service:
Homepage: Azure VMware Solution
Documentation: Azure VMware Solution
SLA: SLA for Azure VMware Solution
Azure Regions: Azure Products by Region
VMware Ports and Protocols for HCX: VMware HCX - VMware Ports and Protocols
VMware Interoperability Matrix: Product Interoperability Matrix (vmware.com)
VMware HCX: Configuration & Best Practices
Design: Availability Design Considerations
Design: Recoverability Design Considerations
Design: Performance Design Considerations
Design: Security Design Considerations
GitHub repository: Azure/azure-vmware-solution
Well-Architected Framework: Azure VMware Solution workloads
Cloud Adoption Framework: Introduction to the Azure VMware Solution adoption scenario
Network connectivity scenarios: Enterprise-scale network topology and connectivity for Azure VMware Solution
Enterprise Scale Landing Zone: Enterprise-scale for Microsoft Azure VMware Solution
Enterprise Scale GitHub repository: Azure/Enterprise-Scale-for-AVS
Azure CLI: Azure Command-Line Interface (CLI) Overview
PowerShell module: Az.VMware Module
Azure Resource Manager: Microsoft.AVS/privateClouds
REST API: Azure VMware Solution REST API
Terraform provider: azurerm_vmware_private_cloud Terraform Registry
Author Bios
Ricky Perez is a Senior Cloud Solution Architect in the international Customer Success Unit (iCSU) at Microsoft. His background is in solution architecture with experience in public cloud and core infrastructure services.
Jason Trammell is a Senior Software Engineer in the Azure VMware Solution engineering group at Microsoft.
Kenyon Hensler is a Principal Technical Program Manager in the Azure VMware Solution product group at Microsoft. His background is in system engineering with experience across all facets of enterprise networking and compute stacks.
René van den Bedem is a Principal Technical Program Manager in the Azure VMware Solution product group at Microsoft. His background is in enterprise architecture with extensive experience across all facets of the enterprise, public cloud & service provider spaces, including digital transformation and the business, enterprise, and technology architecture stacks. René works backwards from the problem to be solved and designs solutions that deliver business value with the minimum of risk. In addition to being the first quadruple VMware Certified Design Expert (VCDX), he is also a Dell Technologies Certified Master Enterprise Architect, a Nutanix Platform Expert (NPX), and a VMware vExpert.

Azure VMware Solution Syslog Forwarder
Large enterprise and strategic customers have existing and established monitoring solutions that are a constraint for adopting new solutions such as the Azure VMware Solution. Customers do not want to adopt a new real-time monitoring framework for the Azure VMware Solution. This Syslog Forwarder function allows a customer to achieve operational excellence with the Azure VMware Solution and meet their operational management goals. The Azure VMware Solution does not currently support Syslog Forwarding to a customer Syslog server. This Syslog Forwarder function provides this capability by streaming Azure VMware Solution Diagnostic logs as a JSON string to Azure Event Hub, where they are processed by an Azure Function App that converts the JSON string into Syslog format and sends it to an external Syslog server.
Figure 1 – Azure VMware Solution Syslog Forwarder Function
Background
The Azure VMware Solution product group has created this GitHub repository to share prescriptive architectural approaches and tools for customers and partners using the Azure VMware Solution service. This is intended to enhance the value of the Azure VMware Solution service to our customers and partners. To navigate the Azure VMware Solution GitHub repository, select the solution you are interested in from the Table of Contents in the README.md file to open the project folder. Each project has a descriptive README.md file that describes how to use it. Azure VMware Solution is a VMware validated first party Azure service from Microsoft that provides private clouds containing VMware vSphere clusters built from dedicated bare-metal Azure infrastructure. It enables customers to leverage their existing investments in VMware skills and tools, allowing them to focus on developing and running their VMware-based workloads on Azure.
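To make the conversion step of the Syslog Forwarder more concrete, here is a minimal sketch in Python of turning one JSON log record into a syslog line and sending it over UDP. It is not the code from the GitHub repository: the `time` and `message` field names, the `avs` host name, the `avs-diag` application name, and the choice of facility local0 with severity informational are all assumptions made for illustration. The line layout follows the RFC 5424 header order (priority, version, timestamp, host, application, then `-` placeholders for the unused fields).

```python
import json
import socket

# Assumed defaults for illustration: facility local0, severity info.
FACILITY_LOCAL0 = 16
SEVERITY_INFO = 6


def to_syslog(record_json, hostname="avs", app_name="avs-diag"):
    """Convert one JSON log record (a string) into an RFC 5424 style line.

    The "time" and "message" keys are hypothetical field names; real
    diagnostic records may use different ones. When "message" is absent
    the whole JSON string is forwarded as the message body.
    """
    record = json.loads(record_json)
    priority = FACILITY_LOCAL0 * 8 + SEVERITY_INFO  # PRI = facility*8 + severity
    timestamp = record.get("time", "-")             # "-" is the syslog nil value
    message = record.get("message", record_json)
    return f"<{priority}>1 {timestamp} {hostname} {app_name} - - - {message}"


def send_udp(line, server, port=514):
    """Send one syslog line to an external syslog server over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (server, port))
```

In the architecture described above, logic of this kind would run inside the Function App once per event received from Event Hub. Whether UDP, TCP, or TLS is the right transport depends on what the external Syslog server accepts.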
If you are interested in the Azure VMware Solution, please use these resources to learn more about the service:
Homepage: Azure VMware Solution | Microsoft Azure
Learn: Run VMware resources on Azure VMware Solution Training
Documentation: Azure VMware Solution
Azure CLI: Azure Command-Line Interface (CLI) Overview
PowerShell module: Az.VMware Module
Terraform provider: azurerm_vmware_private_cloud Terraform Registry
GitHub repository: Azure/azure-vmware-solution
Cloud Adoption Framework: Introduction to the Azure VMware Solution adoption scenario
Network connectivity scenarios: Enterprise-scale network topology and connectivity for Azure VMware Solution
Enterprise Scale Landing Zone: Enterprise-scale for Microsoft Azure VMware Solution
Enterprise Scale GitHub repository: Azure/Enterprise-Scale-for-AVS
Microsoft Ignite 2022: Azure VMware Solution Update – New Features and Capabilities (microsoft.com)
VMware homepage: VMware to Azure Migration Solutions
VMware Hands-on Labs: Azure VMware Solution Hands-on Labs
VMware Cloud Tech Zone: Azure VMware Solution
VMware Explore 2022: VMware Explore Video Library
Learning Resources: Azure VMware Solution (AVS) (microsoft.github.io)
Author Bio
René van den Bedem is a Principal Technical Program Manager in the Azure VMware Solution product group at Microsoft. His background is in enterprise architecture with extensive experience across all facets of the enterprise, public cloud & service provider spaces, including digital transformation and the business, enterprise, and technology architecture stacks. René works backwards from the problem to be solved and designs solutions that deliver business value with the minimum of risk. In addition to being the first quadruple VMware Certified Design Expert (VCDX), he is also a Dell Technologies Certified Master Enterprise Architect, a Nutanix Platform Expert (NPX), and a VMware vExpert.

Azure VMware Solution Auto-Scale
This Auto-Scale function allows a customer to scale their Azure VMware Solution automatically to cost effectively meet their performance goals. The Azure VMware Solution has Azure Metrics for the percentage usage of cluster CPU, memory, and storage resources. These metrics are incorporated into Azure Alerts with thresholds for high-water mark and low-water mark values to trigger a call to an Azure Automation PowerShell Runbook via a Webhook which triggers the auto-scale node event within the Azure VMware Solution private cloud. Figure 1 – Azure VMware Solution Auto-Scale Function Figure 2 – Azure VMware Solution Auto-Scale with Azure Automation & PowerShell Runbooks Background The Azure VMware Solution product group has created this GitHub repository to share prescriptive architectural approaches and tools for customers and partners using the Azure VMware Solution service. This is intended to enhance the value of the Azure VMware Solution service to our customers and partners. To navigate the Azure VMware Solution GitHub repository, select the solution you are interested in from the Table of Contents in the README.md file to open the project folder. Each project has a descriptive README.md file that describes how to use it. Azure VMware Solution is a VMware validated first party Azure service from Microsoft that provides private clouds containing VMware vSphere clusters built from dedicated bare-metal Azure infrastructure. It enables customers to leverage their existing investments in VMware skills and tools, allowing them to focus on developing and running their VMware-based workloads on Azure. 
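The high-water mark and low-water mark logic behind the Auto-Scale function can be sketched as a small decision function. The example below is a simplified Python illustration, not the PowerShell Runbook from the repository: the threshold values, the node limits, and the `scale_decision` name are assumptions chosen for the example, and in the real solution the comparison is done by Azure Alerts on the Azure Metrics, with the Runbook performing the actual node change.

```python
def scale_decision(usage, nodes, high=80.0, low=30.0,
                   min_nodes=3, max_nodes=16):
    """Decide whether a cluster should add a node, remove one, or hold.

    usage     -- percentage usage per resource, for example
                 {"cpu": 85.0, "memory": 40.0, "storage": 50.0}
    nodes     -- current number of nodes in the cluster
    high, low -- assumed high-water and low-water thresholds (percent)
    min_nodes, max_nodes -- assumed cluster size limits for this sketch

    Scale out when ANY resource reaches the high-water mark, because a
    single exhausted resource is enough to hurt workloads. Scale in only
    when ALL resources are at or below the low-water mark, so removing
    a node does not immediately push another resource over the limit.
    """
    if any(value >= high for value in usage.values()) and nodes < max_nodes:
        return "scale_out"
    if all(value <= low for value in usage.values()) and nodes > min_nodes:
        return "scale_in"
    return "hold"


# Example: CPU is above the high-water mark on a four node cluster.
print(scale_decision({"cpu": 85.0, "memory": 40.0, "storage": 50.0}, nodes=4))
```

Keeping a wide gap between the two thresholds helps avoid a cluster that alternates between adding and removing nodes as usage hovers around a single value.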
If you are interested in the Azure VMware Solution, please use these resources to learn more about the service:
Homepage: Azure VMware Solution | Microsoft Azure
Learn: Run VMware resources on Azure VMware Solution Training
Documentation: Azure VMware Solution
Azure CLI: Azure Command-Line Interface (CLI) Overview
PowerShell module: Az.VMware Module
Terraform provider: azurerm_vmware_private_cloud Terraform Registry
GitHub repository: Azure/azure-vmware-solution
Cloud Adoption Framework: Introduction to the Azure VMware Solution adoption scenario
Network connectivity scenarios: Enterprise-scale network topology and connectivity for Azure VMware Solution
Enterprise Scale Landing Zone: Enterprise-scale for Microsoft Azure VMware Solution
Enterprise Scale GitHub repository: Azure/Enterprise-Scale-for-AVS
Microsoft Ignite 2022: Azure VMware Solution Update – New Features and Capabilities (microsoft.com)
VMware homepage: VMware to Azure Migration Solutions
VMware Hands-on Labs: Azure VMware Solution Hands-on Labs
VMware Cloud Tech Zone: Azure VMware Solution
VMware Explore 2022: VMware Explore Video Library
Learning Resources: Azure VMware Solution (AVS) (microsoft.github.io)
Author Bio
René van den Bedem is a Principal Technical Program Manager in the Azure VMware Solution product group at Microsoft. His background is in enterprise architecture with extensive experience across all facets of the enterprise, public cloud & service provider spaces, including digital transformation and the business, enterprise, and technology architecture stacks. René works backwards from the problem to be solved and designs solutions that deliver business value with the minimum of risk. In addition to being the first quadruple VMware Certified Design Expert (VCDX), he is also a Dell Technologies Certified Master Enterprise Architect, a Nutanix Platform Expert (NPX), and a VMware vExpert.

Azure VMware Solution Advanced Monitoring
The Azure VMware Solution metrics available in the Azure portal provide a standard monitoring library. For customers that require an advanced and comprehensive set of metrics from the VMware vSphere and NSX-T Data Center components of their Azure VMware Solution, this Advanced Monitoring Telegraf plug-in can be used. This provides operational simplicity for our customers using the Azure VMware Solution, removing the need for additional third-party solutions when access to advanced metrics is a customer requirement. This solution add-on deploys a virtual machine running Telegraf in Azure with a managed identity that has contributor and metrics publisher access to the Azure VMware Solution private cloud object. Telegraf then connects to vCenter Server and NSX-T Manager via API and provides responses to API metric requests from the Azure portal.
Figure 1 – Azure VMware Solution Advanced Monitoring with Telegraf
This solution add-on has an ARM Template that can be used to quickly deploy the necessary Telegraf Azure Virtual Machine and configuration, or it can be manually deployed by an Azure administrator.
Background
The Azure VMware Solution product group has created this GitHub repository to share prescriptive architectural approaches and tools for customers and partners using the Azure VMware Solution service. This is intended to enhance the value of the Azure VMware Solution service to our customers and partners. To navigate the Azure VMware Solution GitHub repository, select the solution you are interested in from the Table of Contents in the README.md file to open the project folder. Each project has a descriptive README.md file that describes how to use it. Azure VMware Solution is a VMware validated first party Azure service from Microsoft that provides private clouds containing VMware vSphere clusters built from dedicated bare-metal Azure infrastructure.
It enables customers to leverage their existing investments in VMware skills and tools, allowing them to focus on developing and running their VMware-based workloads on Azure.

Author Bios
Kenyon Hensler is a Principal Technical Program Manager in the Azure VMware Solution product group at Microsoft. His background is in system engineering with experience across all facets of the enterprise networking and compute stacks.

René van den Bedem is a Principal Technical Program Manager in the Azure VMware Solution product group at Microsoft. His background is in enterprise architecture with extensive experience across all facets of the enterprise, public cloud & service provider spaces, including digital transformation and the business, enterprise, and technology architecture stacks.
René works backwards from the problem to be solved and designs solutions that deliver business value with the minimum of risk. In addition to being the first quadruple VMware Certified Design Expert (VCDX), he is also a Dell Technologies Certified Master Enterprise Architect, a Nutanix Platform Expert (NPX), and a VMware vExpert.