hyper-v (328 Topics)

Bypass LBFO Teaming deprecation on Hyper-V and Windows Server 2022
Starting with Windows Server, versions 1903 and 1909, binding a Hyper-V virtual switch to an LBFO team is deprecated (see the documentation). The technology remains supported, but it will no longer evolve. The recommendation is to create a Switch Embedded Teaming (SET) team instead.

In practice
SET is a very interesting technology, but it comes with constraints. The interfaces used must have identical characteristics: manufacturer, model, link speed and configuration. Even if these constraints do not seem huge, we are very far from the flexibility of LBFO teaming, which has practically no constraints at all. In practice, SET is recommended with network interfaces of 10 Gb or more. That is very far from the typical LBFO use case (using all the onboard NICs of a workstation-class motherboard, a home lab, refurbished hardware).

If SET cannot be used
As of Windows Server 2022, the Hyper-V Manager console can no longer create a virtual switch on top of an LBFO team; it raises an error saying that LBFO has been deprecated. However, it is still possible to create this virtual switch with PowerShell. First, create the team of your network cards using Server Manager; in my case the team uses LACP mode and Dynamic load balancing mode. Then run the following PowerShell command to create the virtual switch on top of the team created in the previous step:

New-VMSwitch -Name "LAN" -NetAdapterName "LINK-AGGREGATION" -AllowNetLbfoTeams $true -AllowManagementOS $true

In detail:
- The virtual switch is named "LAN".
- The LBFO team is named "LINK-AGGREGATION".
- The team remains usable for management access to the Hyper-V host.

You will see your network team up and running on the Hyper-V host. That's it!
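For completeness, the team itself can also be created from PowerShell instead of Server Manager. The following is only a minimal sketch; the team name and the LACP/Dynamic settings follow the example above, while the member adapter names are placeholders you must replace with your own.

```powershell
# Create the LBFO team (equivalent to the Server Manager step above).
# "Ethernet 1" / "Ethernet 2" are placeholder adapter names - replace with yours.
New-NetLbfoTeam -Name "LINK-AGGREGATION" `
                -TeamMembers "Ethernet 1","Ethernet 2" `
                -TeamingMode Lacp `
                -LoadBalancingAlgorithm Dynamic

# Bind the Hyper-V virtual switch to the team, explicitly allowing LBFO.
New-VMSwitch -Name "LAN" `
             -NetAdapterName "LINK-AGGREGATION" `
             -AllowNetLbfoTeams $true `
             -AllowManagementOS $true
```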

Great Manager for Storage Spaces Direct from Starwind Manager

Are you deploying Storage Spaces Direct? Then you need to get this amazing free tool from Starwind Software called Starwind Manager: https://www.starwindsoftware.com/starwind-manager. Check it out and let me know what you think.

Unable to read server queue performance data

Has anyone started seeing this on Windows Server 2016?

"Unable to read Server Queue performance data from the Server service. The first four bytes (DWORD) of the Data section contains the status code, the second four bytes contains the IOSB.Status and the next four bytes contains the IOSB.Information."

We have this on two nodes of our cluster at the moment. The same two nodes also end up having issues draining and will lock resources. The other two nodes are fine as far as I can see.
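If you want to decode the values the event text refers to, a rough sketch along these lines may help. It assumes the message is logged by the PerfNet source in the System log and that the binary payload is exposed through the Binary element of the event XML; adjust the filter to match the exact event you are seeing, and treat the whole thing as illustrative only.

```powershell
# Grab the most recent PerfNet event from the System log (assumed source - adjust as needed).
$event = Get-WinEvent -FilterHashtable @{ LogName = 'System'; ProviderName = 'PerfNet' } -MaxEvents 1

# The legacy binary data is rendered as a hex string in the event XML.
$hex   = ([xml]$event.ToXml()).Event.EventData.Binary
$bytes = [byte[]]$(for ($i = 0; $i -lt $hex.Length; $i += 2) { [Convert]::ToByte($hex.Substring($i, 2), 16) })

# Per the event text: DWORD 1 = status code, DWORD 2 = IOSB.Status, DWORD 3 = IOSB.Information.
'Status code      : 0x{0:X8}' -f [BitConverter]::ToUInt32($bytes, 0)
'IOSB.Status      : 0x{0:X8}' -f [BitConverter]::ToUInt32($bytes, 4)
'IOSB.Information : 0x{0:X8}' -f [BitConverter]::ToUInt32($bytes, 8)
```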

BLOG: "Only 16 nodes per cluster?! - but VMware..." - limitations and rightsizing of failover clusters

Greetings, Windows Server community members! Today I am sharing insights on an often discussed matter.

Intro
This is an exercise on technical limitations and rightsizing of Hyper-V based clusters. The article applies to general Hyper-V based failover clusters using Windows Server with shared storage (SAN), dHCI or (Azure Stack) HCI, with the underlying S2D considerations in particular. Seriously, I've stopped counting the number of customers telling me that Hyper-V / Storage Spaces Direct / Azure Stack is not scalable. Especially with Azure Stack HCI in mind, this gives me chuckles. Inspired by a simple question from a Microsoft Tech Community member, I thought it was about time to share my experience with this "limitation". Granted, it is themed around S2D and Azure Stack HCI, and I do see differences for the many use cases out there using shared storage (SAN) or Scale-Out File Server. If you have comments and suggestions, I am all ears; I appreciate your comments and thoughts. This article is written from the top of my mind, so bear with me if I missed aspects or got things wrong - I will certainly look into your comments and corrections.

Thinking about cluster size - I am putting all my eggs in one basket
A great classic song; we will look at it from an IT perspective. As always in IT: it depends™. A cluster with S2D scales from 1 to 16 physical (or, in a lab, virtual) nodes forming one logical cluster. This applies to S2D (Storage Spaces Direct) on Windows Server with Hyper-V as well as to the more advanced, adaptive-cloud (hybrid cloud) focused Azure Stack HCI, which uses the same base technology as Windows Server with some notable extras for deployment and management. Keep in mind, though, that the number of nodes in a failover cluster (and with that the number of disks) does not by itself protect against physical disk errors. It depends on how the storage pool deals with fault domains. This is automatic, and only sometimes is it worth revising and adjusting (if you know what you are doing).

Considering fault domains
Independent of the storage point of view, running one large cluster also means one large fault domain in case of issues with the failover cluster - and there are numerous, from networking and physical problems up to "it is always DNS™", storage issues, configuration issues, changes or drift.

Performance impacts
Running one large cluster also has a higher performance and bandwidth impact. Not so much when using a shared SAN, one might think, but certainly when using SDS, dHCI or HCI such as Microsoft Storage Spaces Direct. This is especially true for S2D rebuild times in case of disk failures, replacements or hardware changes, especially disk capacity expansions.

Costs
When considering cost, S2D requires equal disks in a pool and mostly identical hardware within a cluster. A larger cluster can be less efficient and not tightly targeted and hardware-optimized for its use case, especially for general VM or VDI workloads.

Lifespan and oopsies with physical disks
Granted, NVMe drives, when choosing models with appropriate TBW / DWPD ratings, offer a very long lifespan, excellent response times and performance galore for sequential and especially random operations and IOPS. Today they are more cost efficient than SSDs. However, if you do not follow the advice to patch your OS, firmware and drivers regularly, you might hit sudden outages on NVMe, SSD and HDD drives due to code issues in the firmware. This has happened in the past and just recently also affected Samsung NVMe drives, but such issues have been spotted before they turned into disasters at scale.
Understanding Storage Spaces (Direct) / storage pools
In Windows Server S2D (this always equally includes Azure Stack HCI), all physical disks are pooled. In general, there is just one storage pool available for all servers within a cluster. An exception are stretched clusters, which I do not want to go into detail on here; if you want to learn more about those, I can recommend this epic YouTube series. If you face a problem with your pool, you are facing a problem on all nodes. This is common and likely also what happens with other third-party SAN / RAID / SDS systems. No change here; we are all cooking with water.

Here is a general overview of this amazing and "free" technology. It requires Windows Server Datacenter licensing, and that's all for all the bells and whistles of a highly performant and reliable software-defined storage. It runs best on NVMe-only setups, but allows flexibility based on the use case. A high-level overview for now, to explain the relation to the original topic: Storage Spaces Direct (S2D) was introduced with Windows Server 2016 and uses ReFS 3.1. We are soon at Windows Server 2025 and ReFS 3.12, which comes with a ton of improvements. Next to S2D there is Storage Spaces, a similar technology that does not form shared storage across different servers; it is designed for standalone servers as opposed to server clusters. Combined with ReFS, it is something you should consider for unclustered Hyper-V, Scale-Out File Server and backup servers, instead of RAID or a SAN.

When larger doesn't mean more secure - storage resiliency also affects cluster resilience
On both, you define your storage policies per volume / Cluster Shared Volume, roughly comparable to LUNs. So you can decide how much resiliency, deduplication and performance is required based on the workload that is going to be stored on that volume. Some basic and common policies are Mirror and Nested Mirror. Others exist depending on the number of disks / hosts, but not all of them are recommended for VM workloads. When using these resiliency methods, especially Mirror, adding more disks (or hosts) raises the risk of a full data loss on that volume / CSV in case of unfortunate events. So choose and plan wisely. I can only recommend doing the RTFM job beforehand, as later changes are possible but require juggling the data and having space left in the pool (physical disks) for such storage (migration) operations. Sure, there are other methods that scale better, like dual parity. Be warned that the diagrams in the docs are simplified: the data is not distributed "per disk" as you would expect in traditional RAID, but in 256 MB data blocks (slabs) placed by an algorithm that cares for balanced placement. It is important to understand this small difference to better predict the outcome of disk or host failures within the cluster. Not saying the docs are wrong, just that the illustration is simplified. Read more here: S2D - Nested Resiliency, S2D - Understanding Storage fault tolerance.

Speaking of clusters, the best approach is starting with 2 or 4 nodes. I would avoid an uneven number of nodes like three, or a multiple of it, as they are not very efficient (33%) and expanding to or from these requires changing the storage policy (e.g. three-way mirror). S2D also supports single-node clusters with nested mirror - you have heard right. Still satisfactory performance for many use cases when you do not need full hardware-stack resiliency, at a very low footprint and cost.
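As a concrete illustration of the per-volume policies mentioned above, here is a minimal sketch of creating a regular two-way mirror volume and, for a single-node or two-node cluster, a nested mirror volume. Pool name, tier name, volume names, media type and sizes are placeholders; check the linked nested resiliency documentation for the exact parameters for your setup.

```powershell
# Two-way mirror CSV for general VM workloads (pool and volume names are examples).
New-Volume -StoragePoolFriendlyName "S2D on Cluster01" `
           -FriendlyName "Volume01" `
           -FileSystem CSVFS_ReFS `
           -ResiliencySettingName Mirror `
           -Size 2TB

# Nested mirror: define a tier that keeps four data copies, then carve a volume from it.
New-StorageTier -StoragePoolFriendlyName "S2D on Cluster01" `
                -FriendlyName "NestedMirror" `
                -ResiliencySettingName Mirror `
                -MediaType SSD `
                -NumberOfDataCopies 4

New-Volume -StoragePoolFriendlyName "S2D on Cluster01" `
           -FriendlyName "Volume02" `
           -StorageTierFriendlyNames "NestedMirror" `
           -StorageTierSizes 500GB
```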
Notable upcoming improvements to clustering, Storage Spaces (Direct) and beyond
I trust that Azure Stack HCI will receive some of the recently announced improvements of Windows Server 2025. Be curious about what is coming up next from Microsoft in regards to storage and storage options. Have a look at this later on: S2D - New Storage Options for Windows Server 2025 (release planned later this year).

One large cluster vs. one or more smaller, use-case designed and scaled clusters
Again, it is a common question why the limit is 16 nodes while e.g. VMware supports larger clusters. With no further ado, let's talk about how to do it right (imho). You might seek to create smaller clusters, and Azure Stack HCI by design makes it easier to do management, RBAC / security and LCM operations across several clusters. Having more than one cluster also enables you to leverage different Azure subscriptions (RBAC, Cost Management / Billing > controlling).

Sizing considerations - why you do not need the same amount of hardware you had > smaller clusters

Proper (physical) CPU sizing
Often, sizing does not take rightsizing of the workloads, and of the CPUs used per node, into account. Modern CPUs, compared to e.g. Sandy Bridge, can help consolidate physical servers at roughly 8:1. This way, you can easily save quite some complexity and costs for hardware, licensing, cooling etc. To understand the efficiency gain and why you should not expect current pCPUs to behave like the old ones, the calculators from Intel and AMD help you find a right-sized CPU for your Windows Server and show what to expect in reduced hardware, TCO and environmental impact. That is climate action up to par. You can find the calculator from Intel here; the same exists for AMD.

The vCPU to pCPU ratio in today's environments appears much higher than we were used to in previous clusters. Yes, that's true. I often hear VMware / Hyper-V customers being happy with a 2:1 vCPU:pCPU ratio across their workloads. It depends on the use case, but often CPU resources are wasted and pCPUs are idling for their money, even at the end of a usual hardware lifecycle.

Plan for rightsizing existing workloads before sizing hardware > save costs
Please consider:
- Storage deduplication, included in an even more efficient way with in-line deduplication and compression in Windows Server 2025. Extra points (savings) for keeping all OSes on a similar release.
- Storage and RAM savings by using Windows Server Core for infrastructure VMs.
- Savings through Dynamic Memory.
- vCPU / RAM reduction per VM.
- Disk space (too large or fixed disks, or thin-provisioned disks that had a relevant amount of data deleted and have not been compacted), etc.

All this can be based on your monitoring metrics from your existing solution, or on Azure Monitoring through Azure Arc. There is an enormous potential for savings and for reduction of cluster nodes, RAM and storage when you interpret your metrics before the migration to a more efficient platform.
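To get a quick feel for where you stand before such an assessment, a small inventory sketch like the one below can be run against the cluster. It only uses in-box Hyper-V and failover clustering cmdlets; the output fields are merely examples, not a sizing rule.

```powershell
# Rough vCPU:pCPU overview per cluster node (run from a node or a management host).
foreach ($node in (Get-ClusterNode).Name) {
    $logicalProcs = (Get-CimInstance -ComputerName $node -ClassName Win32_Processor |
                     Measure-Object -Property NumberOfLogicalProcessors -Sum).Sum
    $vms          = Get-VM -ComputerName $node
    $assignedVcpu = ($vms | Measure-Object -Property ProcessorCount -Sum).Sum

    [pscustomobject]@{
        Node              = $node
        LogicalProcessors = $logicalProcs
        AssignedvCPUs     = $assignedVcpu
        Ratio             = if ($logicalProcs) { [math]::Round($assignedVcpu / $logicalProcs, 2) } else { $null }
        VMsWithoutDynMem  = ($vms | Where-Object { -not $_.DynamicMemoryEnabled }).Count
    }
}
```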
VM assessment
As outlined, you can rely on your own methods and monitoring for the assessment of your workloads to rightsize hardware and VMs. In addition, you can leverage Azure Migrate to do that for you. It does not matter whether you finally decide to migrate to Azure or Azure Stack HCI; it can help with the general assessment using live data gathered during operation, which gives you good conclusions on rightsizing, no matter the target.

Consider growth and migrations
There is always business growth or increased requirements to consider. The Azure Stack HCI sizing tool helps you here, but watch out: sometimes it leaves a huge gap of free resources. The tool is not logically perfect; it is math. OS migrations also cause temporary growth that can surpass 50% of resources. The good news is that this is getting better with in-place upgrades (IPU) starting with Windows Server 2022 and later. Additionally, services on older VMs are often not well designed, like file server + DHCP + RDS + certificate authority on a domain controller. These scenarios still exist and scream to be resolved, at the cost of more VMs / resources.

Have you heard Hyper-V isn't scalable?
Get your own idea; here are the facts for Windows Server / Azure Stack HCI, and they often grow with every release. Source: Azure Stack HCI system requirements (Azure Stack HCI and S2D). These limits should not be confused with the capabilities of Hyper-V as a general hypervisor, e.g. not using S2D but an attached SAN: General Hyper-V scalability and limitations.

Conclusion
You see, there is some complexity in the decision and in the "limitation" of 16 nodes per cluster. I personally do not see this as a limitation of Windows Server Hyper-V or Azure Stack HCI, given all of those aspects. Smaller, use-case targeted clusters can also ensure flexibility and carry the motivation for (cost) reductions and rightsizing in the first place. No doubt lift and shift is easy, but it is often more expensive than investing some time into assessments - on-premises as well as in the cloud. So why a 16+ node cluster? I hope this helped you make a decision. Allow me to cite Kenny Lowe, Regional Director and expert for Azure Stack HCI and other topics: "As ever, just because you can, doesn't mean you should."

Looking for more?
Here is an amazing one-hour video you should consider watching if this article just fueled your interest in alternatives to classic clusters: VMware to Hyper-V migration options. The full agenda of Windows Server Summit 2024 is available on demand: https://techcommunity.microsoft.com/t5/tech-community-live/windows-server-summit-2024/ev-p/4068971

Thank you for reading, learning and sharing. Did you find this post helpful? I appreciate your thumbs up, feedback and questions. Let me know in the comments below.

LBFO Teaming deprecation on Hyper-V for Windows Server 2022 - Solved

While creating a virtual switch on a teamed interface in Hyper-V for Windows Server 2022, the following error is encountered. To resolve this, NIC teaming for Hyper-V needs to be configured via PowerShell.

Step 1: Delete the existing, manually created team.

Step 2: Go to PowerShell and run the command:
New-VMSwitch -Name "VMSwitch-1" -NetAdapterName "Embedded NIC 1","Embedded NIC 2"
(Here, I have given the switch the name 'VMSwitch-1' and aggregated it over two adapters; 'Embedded NIC 1' and 'Embedded NIC 2' are the adapter names in the list. This creates a Switch Embedded Teaming (SET) switch.)

Step 3: Check the load balancing algorithm of the virtual switch:
Get-VMSwitchTeam -Name "VMSwitch-1" | FL
(This command displays the algorithm. If it is HyperVPort, proceed to the next step; otherwise, you can skip the last step.)

Step 4: Set the load balancing algorithm to Dynamic:
Set-VMSwitchTeam -Name "VMSwitch-1" -LoadBalancingAlgorithm Dynamic
(This command changes the load balancing algorithm to Dynamic. Verify it using the command from step 3. The teamed interface should now appear in the Hyper-V virtual switch list.)

This does not give you LACP mode. If you want LACP, you only need two steps; this is my recommendation:

Step 1: First, create the team of your network cards using Server Manager. In my case the team uses LACP mode and Dynamic load balancing mode.

Step 2: Then execute the PowerShell command below to create the virtual switch based on the team created:
New-VMSwitch -Name "VMSWITCH-1" -NetAdapterName "SR-LAG-1" -AllowNetLbfoTeams $true -AllowManagementOS $true
(In this case the name of my Hyper-V switch is "VMSWITCH-1" and the teamed network adapter is named "SR-LAG-1".)

Go to Hyper-V and check the virtual switch. That's it!
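A quick way to verify what you ended up with after either method is sketched below; this only uses in-box cmdlets, and the switch and team names follow the examples above, so adjust them to your environment.

```powershell
# For the SET-based switch: show team members and the load balancing algorithm.
Get-VMSwitchTeam -Name "VMSwitch-1" |
    Format-List Name, NetAdapterInterfaceDescription, TeamingMode, LoadBalancingAlgorithm

# For the LBFO-based switch: confirm the team state and that the switch is bound to it.
Get-NetLbfoTeam -Name "SR-LAG-1" |
    Format-List Name, TeamingMode, LoadBalancingAlgorithm, Status
Get-VMSwitch -Name "VMSWITCH-1" |
    Format-List Name, NetAdapterInterfaceDescription, AllowManagementOS
```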

BLOG: Windows Server / Azure Local keeps setting Live Migration to 1 - here is why

Affected products: Windows Server 2022, Windows Server 2025; Azure Local 21H2, 22H2, 23H2; Network ATC

Dear community, I have seen numerous reports from customers running Windows Server 2022 or Azure Local (Azure Stack HCI) that the live migration setting keeps being changed to 1 on every Hyper-V host, as reflected in PowerShell and in the Hyper-V host settings. One customer had previously set the value to 4 via PowerShell, so he could prove it had been a different value at a certain point in time. At first I did not dig deeply into why the configuration changed over time, but I stumbled across the cause quite accidentally when fetching all parameters of Get-Cluster.

According to an article, an LCU back in September 2022 changed the default behaviour and allows specifying the number of live migrations at cluster level. The new live migration default appears to be 1 at cluster level, and this forces the values on the Hyper-V nodes to 1 accordingly. In contrast to the cmdlet documentation, the value is not 2, which would make more sense. This is quite unknown, as it is not documented in the LCU KB5017381 itself, but only referenced in the documentation for the PowerShell cmdlet Get-Cluster. Frankly, neither of these are areas customers or partners check regularly to spot such relevant feature improvements or changes:

"Beginning with the 2022-09 Cumulative Update, you can now configure the number of parallel live migrations within a cluster. For more information, see KB5017381 for Windows Server 2022 and KB5017382 for Azure Stack HCI (Azure Local), version 21H2.

(Get-Cluster).MaximumParallelMigrations = 2

The example above sets the cluster property MaximumParallelMigrations to a value of 2, limiting the number of live migrations that a cluster node can participate in. Both existing and new cluster nodes inherit this value of 2 because it's a cluster property. Setting the cluster property overrides any values configured using the Set-VMHost command."

Network ATC in Azure Local 22H2+ and Windows Server 2025+: When using Network ATC in Windows Server 2025 and Azure Local, it sets the number of live migrations to 1 per default and enforces this across all cluster nodes, disregarding the cluster setting above or the local Hyper-V settings. To change the number of live migrations you can specify a cluster-wide override in Network ATC.

Conclusion: The default values for live migration have changed. The global cluster setting or Network ATC forces them down to the Hyper-V hosts on Windows Server 2022+ / Azure Local nodes and ensures consistency. Previously we thought this happened after using Windows Admin Center (WAC) when opening the WAC cluster settings, but this was not the initial cause.

Finding references: Later that day, as my interest in this change grew, I found an official announcement. In agreement with another article on optimizing live migrations, the default value should be 2, but for some reason at most customers, even on fresh installations and clusters, it is set to 1.

TL;DR:
1. Stop bothering with changing the live migration setting manually, via PowerShell or via DSC / policy.
2. Today and in the future, train your muscle memory to change live migration at cluster level with Get-Cluster, or via Network ATC overrides. These are forced down to all nodes almost immediately and will automatically correct any configuration drift on a node.
3. Check and set the live migration value to 2 as the default, as sketched below, and follow these recommendations: Optimizing Hyper-V Live Migrations on a Hyperconverged Infrastructure | Microsoft Community Hub, and Optimizing your Hyper-V hosts | Microsoft Community Hub.
4. You can stop blaming WAC or overeager colleagues for changing the live migration settings to undesirable values over and over. Starting with Windows Admin Center (WAC) 2306, you can set the live migration settings at cluster level in Cluster > Settings.
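Putting the TL;DR into commands, a minimal sketch along these lines checks the cluster-wide value, sets it to 2, and verifies that it has propagated to each node's Hyper-V settings. The verification loop is only illustrative; on Network ATC managed systems, use the cluster-wide override mentioned above instead of editing hosts directly.

```powershell
# Check the current cluster-wide default for parallel live migrations.
(Get-Cluster).MaximumParallelMigrations

# Set it to 2, in line with the optimization guidance referenced above.
(Get-Cluster).MaximumParallelMigrations = 2

# Verify that the value shows up on every Hyper-V node in the cluster.
Get-ClusterNode | ForEach-Object {
    Get-VMHost -ComputerName $_.Name |
        Select-Object Name, MaximumVirtualMachineMigrations, MaximumStorageMigrations
}
```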
Happy clustering! 😀

Hyper-V 2022 - VMMS logs constantly about Hyper-V-VmSwitch

Hi guys, any Hyper-V gurus around? I have a new 2022 host which will be deployed to production soon. I've found by luck that the vmms process (Virtual Machine Management service) constantly logs verbose messages about "Ioctl Begin ioctlCode: 0xD15" and "Ioctl End ioctlCode: 0xD15, delta (100 ns): 80, ntStatus: 0x80000005 (NT=Buffer Overflow)" with Event ID 0 and source Hyper-V-VmSwitch. I've looked around and had no luck finding the cause. It happens even if I stop all the VMs, and I even removed the vSwitch - there is no vSwitch on the host and it still logs like hell. The source is the well-known SID S-1-5-18 (SECURITY_LOCAL_SYSTEM_RID). Has anyone seen this before, or does anyone have an idea what the issue could be here? Thanks for any ideas, Martin

No network for Server 2022 in Hyper-V with 2012 R2 host - Solved

Greetings, I'm having network issues running virtual Server 2022 in Hyper-V with a Server 2012 host. Neither of my two Server 2022 VMs has any network connection. The first one was an upgrade from a (working) Server 2016 and the second a clean install. The Hyper-V host in question has multiple VMs (2008 through 2019, plus Linux) and none of them have any network issues, only the newly added 2022 ones. I also tried setting a static IP, but no go - same problem. And yes, the 2022 VMs use the same virtual switch as the working ones. I can see on the DHCP server (running Wireshark) that the failing servers are sending DHCP Discovers and the DHCP server is replying with Offers, but the 2022 VMs ignore these responses. Interestingly, when running a virtual 2022 in Win10 Hyper-V, it works... So, known feature or..? /Bjarne

Storage Spaces Direct 2 Node Hyper-V Cluster

I am setting up a 2-node Hyper-V cluster without a SAN and plan to use Storage Spaces Direct for my storage. I have two identical servers in this Hyper-V cluster, each with a multi-disk SSD RAID 5 data array exposed to the OS. I assume S2D will see the two RAID 5 arrays on these servers and that I will be able to mirror them as a single CSV, on which I will then put my clustered VMs. (The two servers will be connected to each other through a 10 Gb switch.) My question is: if VM1 is live on Node1 in this simple 2-node S2D mirror example, will S2D attempt to serve the VM files from the same node, or will all the CSV traffic go out to the physical switch regardless? I understand this gets more complicated with multiple drives scattered across many servers, but are there any optimizations in this 2-node example? Final question: is there a better way I should be constructing a 2-node Hyper-V cluster without a SAN? Thanks in advance!
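For reference, a bare-bones sketch of the 2-node, SAN-less setup described in the question could look like the following. It assumes the disks are presented to the OS directly (HBA / pass-through) rather than as a hardware RAID 5 volume, since S2D handles resiliency itself and does not support pooled disks behind RAID; the cluster name, node names, witness path and volume size are placeholders.

```powershell
# Create the 2-node cluster without adding storage yet (names are examples).
New-Cluster -Name "HVCluster01" -Node "Node1","Node2" -NoStorage

# A 2-node cluster needs a witness; a file share witness is one simple option.
Set-ClusterQuorum -Cluster "HVCluster01" -FileShareWitness "\\witness-server\HVCluster01-Witness"

# Enable Storage Spaces Direct - it pools the eligible local disks of both nodes.
Enable-ClusterStorageSpacesDirect -CimSession "HVCluster01"

# Carve a mirrored CSV out of the pool for the clustered VMs.
New-Volume -CimSession "HVCluster01" `
           -StoragePoolFriendlyName "S2D on HVCluster01" `
           -FriendlyName "CSV01" `
           -FileSystem CSVFS_ReFS `
           -ResiliencySettingName Mirror `
           -Size 1TB
```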