To secure their server infrastructure, customers often employ local or Active Directory-based Group Policies, which are applied uniformly to all domain-joined machines as soon as they join the domain.
To form a Hyper-V-based cluster, both Azure Stack HCI and Windows Server nodes need to be domain joined, which in turn applies the Group Policies that target that container in Active Directory.
If these policies restrict certain operating system areas and functionality, it can lead to errors when you try to deploy, update, or operate AKS hybrid.
Group Policies are a very powerful tool, with many capabilities and hundreds of thousands of ways to configure your system to your liking in a repeatable way.
We have made every effort to test with the baseline templates available from Microsoft and found 10 major areas of impact when using Group Policies:
- URL, port and IPsec rules that interfere with the requirements outlined in the system requirements documentation.
- Permissions for the Cluster Network Account to create and manage objects in Active Directory.
- Automatic update of DNS records is disabled.
- Executables and services with unknown names are prevented from starting.
- Group and network policies impacting virtual machines.
- Restrictive user permissions.
- PowerShell is limited to restricted mode.
- Proxy settings that change frequently.
- Setting Log On As a Service rights to specific accounts.
- Overwriting Root certificate store content and removing unknown certificates.
Let’s look at each of these in more detail.
URL, port and IPsec rules
The documentation for AKS hybrid, Azure Stack HCI, and Azure Arc lists the URLs, ports, and network connectivity requirements.
In particular, look out for ports that are blocked between computers in a Hyper-V cluster beyond the standard SMB, HTTP, HTTPS, and RPC ports.
The solution needs to communicate over ports such as 6443, 65000, and 55000 to function properly.
Learn more in the documentation here and here.
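As a quick sanity check of the port requirements above, the following PowerShell sketch probes a few TCP endpoints with Test-NetConnection. The target names are hypothetical placeholders, and the pairing of each port with a target is an assumption made for illustration; which endpoint listens on which port is defined in the system requirements documentation. A port only reports as reachable when a service is listening on it, so a failure before deployment does not necessarily mean the port is blocked.

```powershell
# Hypothetical targets: replace the names with your own endpoints.
# Ports 6443, 55000 and 65000 are the examples named in this post;
# the target/port pairing below is an assumption for illustration only.
$checks = @(
    @{ Target = 'hci-node-01';       Port = 55000 },
    @{ Target = 'hci-node-01';       Port = 65000 },
    @{ Target = 'aks-control-plane'; Port = 6443 }
)

foreach ($check in $checks) {
    # Attempt a TCP connection; warnings are suppressed so only the summary is shown.
    $result = Test-NetConnection -ComputerName $check.Target -Port $check.Port -WarningAction SilentlyContinue

    [pscustomobject]@{
        Target    = $check.Target
        Port      = $check.Port
        Reachable = $result.TcpTestSucceeded
    }
}
```

Running the same check from each physical node helps to tell a host firewall or IPsec rule on one machine apart from a network-wide block.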
Permissions in Active Directory for the Cluster Network Account
As part of the deployment, AKS hybrid creates a cluster service resource for the cloud agent service, which facilitates communication between the management cluster and the physical Hyper-V cluster nodes.
This cluster service resource is created by calling Hyper-V APIs, which in turn use the Cluster Network Account to access Active Directory and create the respective objects.
We have documented the process for AKS hybrid here.
Automatic updates for DNS records
Along the lines of the previous topic, the physical nodes and Hyper-V cluster resources need to be registered in DNS to enable name resolution. If automatic DNS updates are disabled on the server, or the DNS server in use does not support automatic updates, these DNS records need to be created up front and the information provided during AKS hybrid deployment.
We have documented the steps here.
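If the DNS records are created up front, it can help to confirm that they resolve before starting the deployment. The sketch below uses Resolve-DnsName for this; the host names are hypothetical placeholders and need to be replaced with the names you registered.

```powershell
# Hypothetical names: replace with the DNS names you pre-created.
$names = @('hci-node-01.contoso.local', 'ca-cloudagent.contoso.local')

foreach ($name in $names) {
    # Collect the IPv4 addresses returned for the name; lookup errors are
    # suppressed so that a missing record simply yields an empty list.
    $addresses = @(
        Resolve-DnsName -Name $name -Type A -ErrorAction SilentlyContinue |
            Where-Object { $_.IPAddress } |
            ForEach-Object { $_.IPAddress }
    )

    [pscustomobject]@{
        Name      = $name
        Resolved  = ($addresses.Count -gt 0)
        Addresses = $addresses -join ', '
    }
}
```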
Executables and services with unknown names are prevented from starting
There is a set of policy settings which allow for locking down the system to a set of known executables and services that are allowed to run.
AKS hybrid introduces a set of binaries and services to manage the Hyper-V cluster resources, deploy virtual machines, configure networking and storage etc.
You need to determine:
- Whether your Group Policy restricts which executables and services can run.
- Who to talk to in the operations team to get the following executables added to the list of allowed binaries and services:
  - Cloud Agent: wssdcloudagent.exe (WSSD Cloud Agent Service)
  - Node Agent: wssdagent.exe (WSSD Node Agent Service)
  - Kubernetes CLI: kubectl.exe
  - Management Cluster CLI: kvactl.exe
  - Microsoft On-Premises Cloud CLI: mocctl.exe
  - WSSD Node Agent CLI: nodectl.exe
  - Azure CLI: az (az.bat), which requires python.exe or python310.exe depending on the version.
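To hand the operations team concrete paths for the allow list, a sketch like the following can collect where the executables listed above are found on a node where AKS hybrid is already installed. It only searches the PATH of the current session, so binaries installed elsewhere will show as not found. The service names wssdagent and wssdcloudagent are an assumption derived from the executable names.

```powershell
# Executable names taken from the list above.
$executables = 'wssdcloudagent.exe', 'wssdagent.exe', 'kubectl.exe',
               'kvactl.exe', 'mocctl.exe', 'nodectl.exe'

foreach ($exe in $executables) {
    # Look the executable up on the PATH of the current session.
    $command = Get-Command -Name $exe -CommandType Application -ErrorAction SilentlyContinue |
        Select-Object -First 1

    [pscustomobject]@{
        Executable = $exe
        Path       = if ($command) { $command.Path } else { 'not found on PATH' }
    }
}

# Assumption: the services are registered under the names of their executables.
Get-Service -Name 'wssdagent', 'wssdcloudagent' -ErrorAction SilentlyContinue |
    Select-Object Name, DisplayName, Status
```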
Group and network policies impacting virtual machines
Do not attempt to domain join Windows worker node VMs, or these policies will be applied there too. More importantly, the domain join will be lost at the next AKS hybrid upgrade.
If you need to use Active Directory based accounts for authentication in your containerized application, make use of the gMSAv2 feature in AKS hybrid.
Make sure that IPsec policies on the host do not affect network traffic between the management cluster and the target cluster control plane(s), or communication between the cloud agent cluster service and the node agents on the physical hosts.
Restrictive user permissions
The user account logged in to install AKS hybrid needs certain permissions in the domain.
To install and configure the Hyper-V cluster, you need to be a local administrator on all physical cluster nodes.
In the simplest case, the user is a member of the ‘Domain Admins’ group in Active Directory once the Hyper-V cluster is deployed.
Reducing the permissions to a least privilege model is possible and depends on your Active Directory and administrative role definitions.
The user must have the following permissions at a minimum:
- Local administrator on all physical nodes in the Hyper-V cluster
- Cluster Administrator permissions for the Hyper-V cluster (these are separate from the local administrator rights)
- Access to the internet (proxy authentication)
- Permission to create and install certificates on all cluster nodes
- Create, edit and delete objects in the respective Active Directory container (OU)
- Download and run PowerShell scripts and/or access Windows Admin Center
- Ability to create service principals in the domain
- Delegate access for Kerberos tokens
- Owner permissions to the respective Azure subscription.
Depending on your configuration, other permissions might be required. Check with your domain administrators to make sure you have the required permissions.
PowerShell is limited to restricted mode
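In some environments, we have found that the PowerShell execution policy is set to Restricted.
Make sure to change it to RemoteSigned. If the execution policy is enforced through Group Policy, it takes precedence over a locally set value, so the change has to be made in the Group Policy itself.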
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned
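Before changing the setting, it can be useful to see which scope currently defines the execution policy. The following sketch lists all scopes and flags the case where the MachinePolicy or UserPolicy scope, both of which are set through Group Policy, defines a value.

```powershell
# Show the execution policy for every scope, in order of precedence.
Get-ExecutionPolicy -List

# MachinePolicy and UserPolicy are the scopes controlled by Group Policy.
$enforced = Get-ExecutionPolicy -List |
    Where-Object { $_.Scope -in 'MachinePolicy', 'UserPolicy' -and $_.ExecutionPolicy -ne 'Undefined' }

if ($enforced) {
    'Execution policy is enforced by Group Policy:'
    $enforced
} else {
    'No Group Policy scope defines the execution policy; the effective policy is: ' + (Get-ExecutionPolicy)
}
```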
Proxy settings that change frequently
If you have frequently changing proxy settings that are applied via Group Policy, these changes will not be applied automatically to a deployed AKS hybrid installation.
At the time of writing, AKS hybrid supports updating the noProxy list of proxy parameters.
See the documentation here.
As we add more functionality to manage proxy settings at runtime, the above documentation will be updated.
Limiting “Log On As a Service” rights
If you assign “Log On As a Service” rights to specific accounts in the Group Policy, make sure you add the NT Virtual Machine\Virtual Machines account to the list of accounts that hold this right. Otherwise, you will see an error like the one below:
Error 0x80070569 ('VM_NAME' failed to start worker process: Logon Failure: The user has not been granted the requested logon type at this computer.)
More details are listed here: Starting or live migrating Hyper-V VMs fails - Windows Server | Microsoft Learn
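One way to check whether the account holds the right on a given node is to export the user rights assignments with secedit and look at the SeServiceLogonRight entry. This is a minimal sketch, assuming the export reflects the settings in effect on the node and that the account appears either by its well-known SID (S-1-5-83-0) or by name; the file name is a hypothetical choice.

```powershell
# Run from an elevated PowerShell session on a physical cluster node.
# The export file name is a hypothetical choice.
$exportPath = Join-Path $env:TEMP 'user-rights.inf'
secedit /export /cfg $exportPath /areas USER_RIGHTS | Out-Null

# SeServiceLogonRight is the internal name of the "Log on as a service" right.
$line = Get-Content -Path $exportPath | Where-Object { $_ -like 'SeServiceLogonRight*' }

if (-not $line) {
    'SeServiceLogonRight is not defined in the exported policy.'
} elseif ($line -match 'S-1-5-83-0' -or $line -match 'Virtual Machines') {
    'NT Virtual Machine\Virtual Machines holds the Log on as a service right.'
} else {
    'NT Virtual Machine\Virtual Machines is missing from: ' + $line
}

# Clean up the temporary export.
Remove-Item -Path $exportPath -ErrorAction SilentlyContinue
```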
Restricting and updating the Root certificate store
If your Group Policy updates the root certificate store and/or limits which certificates can be in the store, and you are using a proxy with SSL inspection, make sure to always include the proxy certificate in the list of allowed certificates to prevent failures during install and upgrade.
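To confirm that the proxy certificate is still present after a Group Policy refresh, a check along these lines can be run on each node. The thumbprint is a hypothetical placeholder for your proxy's SSL inspection certificate.

```powershell
# Hypothetical thumbprint: replace with the thumbprint of your proxy certificate.
$proxyThumbprint = '0123456789ABCDEF0123456789ABCDEF01234567'

# Look for the certificate in the local machine's trusted root store.
$match = Get-ChildItem -Path Cert:\LocalMachine\Root |
    Where-Object { $_.Thumbprint -eq $proxyThumbprint }

if ($match) {
    $match | Select-Object Subject, Thumbprint, NotAfter
} else {
    "Proxy certificate $proxyThumbprint was not found in Cert:\LocalMachine\Root."
}
```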
Using Group Policy is a great way to govern and configure your Active Directory-based environment, but some settings can impact the way AKS hybrid works. If you find something else in Active Directory that causes you trouble, let us know in the Azure Kubernetes Services on Azure Stack HCI · Community. (I know. We will be changing that name as well.)