Most of you are probably familiar with Classic Hybrid Deployments, which are configured automatically through the Classic option in the Hybrid Configuration Wizard (HCW). In this blog post we will focus on Modern Hybrid Configuration (aka Modern Hybrid Agent) and, specifically, on troubleshooting the Agent.
The 2 mentioned flavors of hybrid deployments are:
A comparison of the hybrid functionalities configured with these two options can be found in Hybrid Configuration Wizard options article.
Before getting more specific on how to troubleshoot the Modern HCW option, let me give you some information and troubleshooting hints for HCW application in general.
Always make sure you are running the latest HCW version by downloading it from https://aka.ms/hybridwizard or opening it from the Exchange Admin Center. Why run the latest version? Because many things get fixed in newer builds.
The version of the app can be found in several places: in the top right corner of the HCW Graphical User Interface (GUI), in Control Panel under Programs and Features, and in the HCW log. Currently, the HCW build numbers are 17.x; if you have build 16.x, the application will not auto-update to 17.x. Ideally, if you need to re-run HCW, you should update to the latest 17.x build first. Both versions can coexist on the same machine (not really recommended, because you might be confused as to which is which). More info here:
Hybrid Prerequisites and HCW FAQ
Many issues can be prevented by thoroughly reading the prerequisites for the HCW modes and the frequently asked questions. Here they are:
There are 2 types of Admin accounts needed to run HCW:
On-premises Exchange Account
This account needs to be a member of Organization Management. If you change the account and enter credentials, these credentials will automatically be used for Test-MigrationServerAvailability if HCW needs to create a Hybrid Migration Endpoint.
For Modern HCW, you would see the Migration Admin here:
If you get an “invalid username or password” error in this step of the HCW, check if you have the required permissions for a Remote Mailbox Migration and that credentials are correct (use domain\account format).
Office 365 Exchange Online Account.
This needs to be a Global Admin (Exchange Admin included).
Know that you can select 2 types of login to your O365 tenant: Modern Authentication and Basic Authentication (Legacy Login). Note that Basic authentication is being deprecated, but sometimes it’s useful to test it when you have login issues. That option can be found under “Legacy Login”:
HCW F12 Diagnostic tools
Pressing F12 when in HCW will give you an additional Diagnostic Tools section in HCW UI:
This is awesome and very useful when troubleshooting. Let’s mention each of them:
Open Exchange Management Shell
This shortcut opens a PowerShell window and connects to your on-premises Exchange environment.
If, for example, HCW fails to run a command in the on-premises shell, you can quickly copy the failing command from the HCW log, open this shell and paste the same command to see whether the problem is in the on-premises environment / shell or in HCW itself (very rarely the case).
Another example is if you have issues with connecting to on-premises PowerShell in HCW GUI, you can quickly use this and see if the error is the same.
Open Exchange Online PowerShell
This shortcut opens a PowerShell window and connects to your Exchange Online environment.
You would open this up when you see a command failing in Exchange Online PowerShell and use it to run the failing command to see if you get the same outcome.
You would also use it when you get the following error in HCW GUI:
In HCW log, you would look for the entry Activity=Tenant Connection Validation for more details on the issue. This entry below suggests that I don’t use a proxy on the machine running HCW to connect to Office 365 Exchange Online PowerShell:
2020.05.29 20:37:37.469 10179 [Client=UX, HTTP GET=https://outlook.office365.com/, Thread=7] Request for https://outlook.office365.com/ does NOT go thru proxy
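If you need to scan many such entries, a small script can help. The following is only a minimal sketch in Python; the line layout (timestamp, a numeric column, a bracketed key=value context, then the message) is inferred from the single sample entry above, and the function names are made up for illustration. Other HCW log lines may not follow this shape.

```python
import re
from datetime import datetime

# Layout inferred from the one sample entry in this post (an assumption):
# "<timestamp> <number> [Key=Value, Key=Value, ...] <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}\.\d{2}\.\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<number>\d+)\s+"
    r"\[(?P<context>[^\]]*)\]\s+"
    r"(?P<message>.*)$"
)


def parse_hcw_line(line):
    """Split one HCW log line into its parts; return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    context = {}
    for part in match.group("context").split(", "):
        key, _, value = part.partition("=")
        context[key] = value
    return {
        "timestamp": datetime.strptime(match.group("ts"), "%Y.%m.%d %H:%M:%S.%f"),
        "number": match.group("number"),  # meaning of this column is not stated in the post
        "context": context,
        "message": match.group("message"),
    }


def bypasses_proxy(entry):
    """True when the message text says the request did not go through a proxy."""
    return "does NOT go thru proxy" in entry["message"]


sample = (
    "2020.05.29 20:37:37.469 10179 "
    "[Client=UX, HTTP GET=https://outlook.office365.com/, Thread=7] "
    "Request for https://outlook.office365.com/ does NOT go thru proxy"
)
entry = parse_hcw_line(sample)
print(entry["context"]["HTTP GET"], bypasses_proxy(entry))
```

Lines that do not match the assumed layout return None, so they can simply be skipped.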
Open Log File
This opens the current HCW log file. In Notepad, I normally don’t use word wrap when I first open it so that I can have a quick and clear table format view.
This is the main HCW log, date_time.log, where we get information about what HCW version we are using, machine from where we ran HCW, OS and .NET version, on-premises organization current configuration etc.:
We can also see EXO organization current configuration:
Create Support Package
This is very useful for us (people in support). This will make a zip of your HCW log files so that you can send it to Microsoft Support. There is a second screen once you click on the link, where you can specify to ZIP logs from the last 24 hours, or within a date range. You then have the option to copy the file to the clipboard so it can be easily attached to an email.
Open Logging Folder
This shortcut opens the folder where HCW logs are located on the machine where you ran HCW. C:\Users\<admin>\AppData\Roaming\Microsoft\Exchange Hybrid Configuration
Notable files in this folder:
Date_time.log file was mentioned above during Open Log File.
Date_time.xhcw
The second important log is the .xhcw log. This is an XML log which lists all the cmdlets run by HCW (like Get-*, Set-*, Update-*, Remove-*, New-*) in both the EXO and Exchange on-premises PowerShell sessions.
To read it, open the file in Notepad, add an opening XML tag such as <root> at the very beginning and the closing tag </root> at the very end, and then save the file with an .xml extension. You can then open it in a browser and expand the cmdlets to inspect them. Example:
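The Notepad steps above can also be scripted. Below is a minimal sketch in Python, assuming the .xhcw file is UTF-8 text made of sibling XML elements with no XML declaration at the top; the `entry` elements in the sample are hypothetical placeholders, not the real .xhcw schema.

```python
import xml.etree.ElementTree as ET


def wrap_xhcw(text, root_tag="root"):
    """Wrap the XML fragments of an .xhcw log in a single root element."""
    return "<{0}>\n{1}\n</{0}>".format(root_tag, text.strip())


def convert_xhcw(src_path, dst_path):
    """Read an .xhcw file, wrap it, check it parses, and save it as .xml."""
    # "utf-8-sig" also tolerates a byte order mark; the encoding is an assumption.
    with open(src_path, encoding="utf-8-sig") as src:
        wrapped = wrap_xhcw(src.read())
    ET.fromstring(wrapped)  # raises ParseError if the result is not well-formed
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(wrapped)
    return dst_path


# Hypothetical fragment standing in for real .xhcw content.
sample = '<entry name="first" />\n<entry name="second" />'
tree = ET.fromstring(wrap_xhcw(sample))
print(tree.tag, len(tree))
```

Calling `convert_xhcw` with the path of your .xhcw file and a target .xml path would produce a file you can open in the browser.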
Date_time.boot log is the log showing the startup of HCW:
The .cc log is a small log with extra info regarding your Hybrid Configuration:
Date_time.hybridconnector.log
This is the setup log for Hybrid Connector (when you install the Hybrid Agent). This log is therefore not present in Classic Hybrid Configs.
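When the folder holds logs from many HCW runs, it can be handy to find the most recent main log without opening HCW. The following is a rough sketch in Python; it assumes the main log is the newest file ending in .log that is not a hybridconnector log, and the helper names are invented for this example.

```python
import os
from pathlib import Path


def default_log_folder():
    """Build the HCW logging folder path for the current user."""
    # On Windows, %APPDATA% normally points to C:\Users\<user>\AppData\Roaming
    appdata = os.environ.get("APPDATA", "")
    return Path(appdata) / "Microsoft" / "Exchange Hybrid Configuration"


def newest_main_log(folder):
    """Return the most recently modified main .log file, or None if there is none."""
    candidates = [
        path
        for path in Path(folder).glob("*.log")
        # Skip the Hybrid Connector setup log, which also ends in .log
        if not path.name.lower().endswith(".hybridconnector.log")
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda path: path.stat().st_mtime)


print(newest_main_log(default_log_folder()))
```

On a machine where HCW has never run, the function simply returns None.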
Open Process Folder
This link opens a Command Prompt with the directory set to the HCW process folder. This can be helpful when HCW crashes or returns a generic error like “Object reference not set to an instance of an object”, and we can combine it with ProcDump tracing of the process. See here for more info on ProcDump download and syntax. We would normally use a quick command like the one shown below to get the exception stack, but if your troubleshooting gets this far, we advise you to open a support case with us:
procdump.exe -e 1 -f "" Microsoft.Online.CSE.Hybrid.App.exe
Hybrid Agent in Modern HCW
You can learn more about Hybrid Agent architecture here.
This feature will install an agent, built on the same technology as Azure Application Proxy, which will publish the Exchange on-premises environment to Exchange Online to support free/busy and mailbox migrations without many of the challenges customers previously faced with external DNS, publishing of EWS and inbound connections ports having to be opened. The secret sauce here is that the Hybrid Agent registers a custom URL for only your tenant in the following format:
<guid>.resource.mailboxmigration.his.msappproxy.net
This URL is then used by the Organization Relationship or the Intra Organization Connector and the Mailbox Replication Service to route requests from your tenant to on-premises. This URL is only accessible from Exchange Online. Free/busy requests from cloud users to on-premises and mailbox migrations to/from the cloud are the two flows currently supported through the Hybrid Agent.
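When you come across such a host name in a configuration value or a log, a quick format check can catch typos. Here is a minimal sketch in Python, assuming the GUID label is in the usual lowercase hyphenated form; the sample GUID is made up.

```python
import uuid

# Suffix taken from the URL format described above.
SUFFIX = ".resource.mailboxmigration.his.msappproxy.net"


def is_hybrid_agent_host(host):
    """Check that a host name looks like <guid>.resource.mailboxmigration.his.msappproxy.net."""
    host = host.lower().rstrip(".")
    if not host.endswith(SUFFIX):
        return False
    label = host[: -len(SUFFIX)]
    try:
        # Comparing with str(...) accepts only the canonical hyphenated form.
        return str(uuid.UUID(label)) == label
    except ValueError:
        return False


good = "12345678-1234-1234-1234-123456789abc" + SUFFIX  # made-up GUID
bad = "12345678-1234-1234-1234-123456789abc.resource.mailboxmigration.his.msapproxy.net"
print(is_hybrid_agent_host(good), is_hybrid_agent_host(bad))
```

The second sample misspells the domain with two p's instead of three, which is an easy mistake to make when typing it by hand.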
Where exactly in the hybrid configuration can we see this URL ending in “his.msappproxy.net”? Here are some of the cmdlets that will show you where the URL is used:
Determine the Hybrid Agent route
Simply put, the Hybrid Agent is the middle man between the Exchange Online servers and the on-premises Exchange Server(s) and you can think of it as an Inbound Connector for HTTPS traffic from EXO to on-premises Exchange. The Hybrid Agent accepts traffic only from Exchange Online Servers.
A simple schema of the inbound route to Exchange on-premises Modern Hybrid would be the following:
EXO > Hybrid Agent (External URL) > Load Balancer or Exchange On Prem Server (Internal URL).
You can check the Connector route with the Get-HybridApplication cmdlet available with the Hybrid Management PowerShell Module:
From this screenshot, we can tell that the external URI is https://<GUID>.resource.mailboxmigration.his.msappproxy.net (externalUrl parameter) and that connections to this published namespace will be relayed to your Exchange Server or Load Balancer: https://internalFQDN/ (my lab machine in this example being mirebm340vm.domain.lab which is internalUrl)
The external URI should be resolvable in public DNS. You can use nslookup or Resolve-DnsName to check whether the Hybrid Agent is correctly published for your Office 365 tenant.
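If you prefer to script the lookup, the sketch below does a comparable check in Python. Note that it uses whatever resolver the local machine is configured with, which is not necessarily public DNS, so treat it as a rough equivalent of the tools mentioned above; the function name is invented for this example.

```python
import socket


def resolve_addresses(hostname):
    """Return the sorted IP addresses a host name resolves to, or [] if it does not resolve."""
    try:
        infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})


# The .invalid top-level domain is reserved and should never resolve.
print(resolve_addresses("does-not-exist.invalid"))
```

You would pass your own <guid>.resource.mailboxmigration.his.msappproxy.net host name; an empty list means the name did not resolve from that machine.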
You can also check the Hybrid Connector Route in HCW log (ConnectorRoute value) and by looking at the output of the following Graph query: https://graph.microsoft.com/edu/$metadata#applications
ConnectorRoute in HCW log showing externalUrl and internalUrl
Graph Query in HCW log which shows the Hybrid Application:
If there is no application returned in Get-HybridApplication, re-run the HCW with Modern Hybrid Topology option. It will restore the application.
If an application is returned, but the Exchange CAS server machine pointed to is not available, Update-HybridApplication allows you to reset the target URI to another CAS server or load balancer (supported only for Exchange 2013 MRSproxy servers and above). See more on this step here.
Setting up the Hybrid Agent in Modern HCW
There are 4 phases when setting up the Hybrid Agent via HCW:
Starting with HCW version 17.x, we also have a Hybrid Updater Agent step which will be visible in HCW UI:
Determine if the Hybrid Agent is installed and running
The Hybrid Agent may be running on a Client Access Server (CAS), or it may be running in the DMZ, but it must be running somewhere. The first step is to go to that machine and check the status of the service (Microsoft Hybrid Service should be started) and if the Hybrid Connector is up and running.
There are a few methods of checking if the Hybrid Agent is Active (registered and running) or Inactive (not registered or not running):
Note that this cmdlet will return all Azure Proxy Connectors (including Pass-thru Authentication ones), not just Exchange Hybrid Agents, whereas HCW GUI mentioned first shows only Exchange ones.
If a connector for Exchange Online doesn’t show at all or it shows but status is inactive, this means that it’s either not running or not registered.
The Agent would be uninstalled if, for example, you switched from Modern to Classic Hybrid Topology or manually uninstalled the Microsoft Hybrid Service in Programs and Features.
If you didn’t uninstall the Microsoft Hybrid Service and the service is started and running, then you would need to check the Hybrid Service logs.
If the service doesn’t start, look at the service startup logs. To enable logging, navigate to the Hybrid Service installation path, for example C:\Program Files\Microsoft Hybrid Service. In this folder, there is a config file for the Hybrid Service called Microsoft.Online.EME.Hybrid.Agent.Service.exe.config. Run Notepad as Administrator and open this config file to edit it. Remove the <!-- and --> characters (that is, uncomment the XML comments) from the config file and save it.
The file should look like this after removing the XML comments and turning on logging:
Restart the Microsoft Hybrid Service in services.msc. Follow the procedure from here to attempt a connection to the connector. Navigate to these 2 folders and check the HybridService logs:
Once we established that the Hybrid Agent is installed, registered and running, it is time to validate its functionality.
You can use the following steps to validate free/busy and mailbox migration flow via the Agent: Testing and validation of the Hybrid Agent.
If the requests counter goes up when running Test-MigrationServerAvailability, the connector is fine.
Next thing is to check if the request from Hybrid Agent Machine reaches Exchange CAS. There are 3 main logs for EWS (MRSproxy.svc) requests:
Here are some tips on how to use these logs:
When running Test-MigrationServerAvailability, you would see about 5 entries in the IIS logs for mrsproxy.svc: 2 with 401 status and the rest with 200 OK status. If Test-MigrationServerAvailability returns an error, let’s say 503 Service Unavailable, you need to check whether an entry with status 503 is present in the IIS logs.
503 error in IIS log:
2020-02-20 06:57:42 192.168.2.50 POST /EWS/mrsproxy.svc - 443 miry\administrator 4.43.0.1 - 503 0 0 125
If you see it (example above), then investigate the HTTPProxy logs (if Exchange 2013 or higher) and Failed Request Traces in IIS to get more info on the error. For the above example, you could limit the traced status codes to the 500-599 range.
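To pick out such entries from a large IIS log, a short script can do the filtering. The following is a minimal sketch in Python; the column order is inferred from the sample line above (the 10th column is unlabeled there, so it is named `unknown` here), and a more robust parser would read the `#Fields:` header of your actual log instead.

```python
# Column order inferred from the sample 503 line in this post (an assumption).
FIELDS = [
    "date", "time", "s_ip", "method", "uri_stem", "uri_query", "port",
    "username", "c_ip", "unknown", "status", "substatus", "win32_status",
    "time_taken",
]


def parse_iis_line(line):
    """Parse one IIS log line into a dict; return None for headers or unexpected layouts."""
    if line.startswith("#"):
        return None
    parts = line.split()
    if len(parts) != len(FIELDS):
        return None
    entry = dict(zip(FIELDS, parts))
    try:
        entry["status"] = int(entry["status"])
    except ValueError:
        return None
    return entry


def mrsproxy_server_errors(lines):
    """Return the mrsproxy.svc entries whose HTTP status is in the 500-599 range."""
    hits = []
    for line in lines:
        entry = parse_iis_line(line)
        if entry is None:
            continue
        is_mrsproxy = entry["uri_stem"].lower().endswith("/mrsproxy.svc")
        if is_mrsproxy and 500 <= entry["status"] <= 599:
            hits.append(entry)
    return hits


sample = r"2020-02-20 06:57:42 192.168.2.50 POST /EWS/mrsproxy.svc - 443 miry\administrator 4.43.0.1 - 503 0 0 125"
for hit in mrsproxy_server_errors([sample]):
    print(hit["date"], hit["time"], hit["username"], hit["status"])
```

Feeding it the lines of a whole log file gives you the timestamps to look up in the HTTPProxy logs.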
For most common errors at Validate Hybrid Agent phase (Test-MigrationServerAvailability in Modern Hybrid) and their fixes, see my other blog post.
Hope you find this helpful! I realize this is not really a post that you sit down and read from end to end, but it should come in handy when troubleshooting!
I wanted to thank Nino Bilic, Jason Nelson, Rob Whaley, Raymond Fong, Mukesh Kumar, Diganta Deb Roy and Timothy Heeney for their review and comments on this blog post.
Mirela Buruiana