Some of you may have noticed that more migrations are failing with 'too many bad items'. On closer review, you may notice that the migration report contains entries that reference corrupted items and an inability to translate principals. I wanted to take a few minutes to explain what this means, why these errors are now occurring, and what can be done about them. Ready to geek out?
During a mailbox migration, there are several stages we go through. We start off with copying the folder hierarchy (including any views associated with those folders), then perform an initial copy of the data (what we call the Initial Sync). Once the initial data copy process is complete, we then copy rules and security descriptors. Reviewing a move report shows entries similar to these.
Stage: CreatingFolderHierarchy. Percent complete: 10
Initializing folder hierarchy from mailbox <guid>: X folders total
Folder hierarchy initialized for mailbox <guid>: X folders created
Stage: LoadingMessages
Copying messages is complete. Copying rules and security descriptors.
For our discussion today, we are interested in the stage of “Copying rules and security descriptors”. Security descriptors are Access Control Lists (ACLs), which in turn are composed of Access Control Entries (ACEs, or the individual permissions entries) and stored in SDDL format. In the context of a mailbox, we include both the Mailbox security descriptor (Mailbox permissions) as well as Folder security descriptors (permissions on individual folders). When we look at the Mailbox Security descriptor, it should be noted that only Explicit mailbox permissions are copied. These would include permissions granted by using the Add-MailboxPermission cmdlet, or by using the Exchange Management Console (2010) or Exchange Admin Center (2013 and 2016) to add Full Access rights. Any Inherited permissions are not evaluated during the copy process. For example, granting the Receive-As permission on a database object in Active Directory results in an Inherited Allow for Full Access for all mailboxes on that database. When mailboxes on that database are migrated to Exchange Online, those Inherited permissions will not get copied.
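To see which mailbox permissions would qualify as Explicit, a minimal sketch like the following can help. It assumes the permission objects expose an IsInherited property, as the output of Get-MailboxPermission does. The filter name Select-ExplicitPermission and the sample entries are made up for illustration.

```powershell
# Hypothetical helper: pass through only permission entries that are not inherited.
filter Select-ExplicitPermission {
    if (-not $_.IsInherited) { $_ }
}

# On-premises usage (requires the Exchange Management Shell; the identity is a placeholder):
#   Get-MailboxPermission -Identity 'user@contoso.com' | Select-ExplicitPermission

# Made-up sample entries so the filter can be tried without an Exchange session.
$samplePermissions = @(
    [pscustomobject]@{ User = 'CONTOSO\delegate';         AccessRights = 'FullAccess'; IsInherited = $false }
    [pscustomobject]@{ User = 'CONTOSO\Exchange Servers'; AccessRights = 'FullAccess'; IsInherited = $true }
)

$explicitPermissions = @($samplePermissions | Select-ExplicitPermission)
$explicitPermissions | Format-Table User, AccessRights, IsInherited
```

Only the entries that survive this filter are the kind of mailbox permissions the copy process evaluates.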
Now that we have briefly covered security descriptors, let’s look at the issue. About midway through 2016, a change was introduced to Exchange Online whereby if a security principal could not be successfully validated/mapped to an Exchange Online object, it would be marked as a bad item. Previously, invalid permissions were simply ignored, and administrators were left to wonder why some permissions no longer worked after the migration. With this new behavior, corrupt/invalid permissions are logged so that administrators know there are problems with permissions. From my perspective as a Support Engineer, this is a change for the better because, as Administrators, you are now able to see when there are issues with permissions. It is possible that this behavior will continue to evolve over time, but I would advise becoming familiar with it so that you understand what is happening.
Now how does this affect you? Since we are now incrementing the bad item count for each corrupt/invalid permission, this means that if we encounter more corrupt/invalid permissions than your current bad item limit is set to (default is 10 for a migration batch), the migration will fail. Depending on the state of permissions, you could potentially see a LOT of bad entries being logged. If you are looking at the migration report text file (downloadable from the Exchange Online Portal), you may see entries similar to the following:
11/12/2016 8:44:43 AM [EXO MRS Server] A corrupted item was encountered: Unable to translate principals for folder "Folder Name"/"FolderNTSD": Failed to find a principal from the source forest.
5/19/2016 6:33:50 PM [EXO MRS Server] A corrupted item was encountered: Unable to translate principals to the target mailbox: Failed to find a principal in the target forest that corresponds to the following source forest principal values: Alias: <alias>; DisplayName: <Display Name>; MailboxGuid: <mailbox guid>; SID: <SID of User>; ObjectGuid: <Object GUID>; LegDN: <legacyExchangeDN>; Proxies: [X500:<legacyExchangeDN format>; SMTP:user@contoso.com;];.
5/19/2016 6:33:50 PM [EXO MRS Server] A corrupted item was encountered: Unable to translate principals to the target mailbox: Failed to find a principal in the target forest that corresponds to the following source forest principal values: SID: <SID of User>; ObjectGuid: <Object GUID>;.
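If you have the report text file on disk, a rough way to count how many of these entries point at the source forest versus the target forest is to match on the message text shown in the samples above. This is only a sketch: the function name and the file path are hypothetical, and it assumes the wording in your report matches those samples.

```powershell
# Hypothetical helper: classify one report line by the forest named in the message text.
function Get-PrincipalFailureSide {
    param([string] $Line)

    if ($Line -notmatch 'A corrupted item was encountered: Unable to translate principals') {
        return 'NotAPrincipalError'
    }
    if ($Line -match 'from the source forest') { return 'Source' }
    if ($Line -match 'in the target forest')   { return 'Target' }
    return 'Unknown'
}

# Hypothetical local copy of the report text file downloaded from the portal.
$reportPath = 'C:\temp\migrationreport.txt'
if (Test-Path $reportPath) {
    Get-Content $reportPath |
        ForEach-Object { Get-PrincipalFailureSide -Line $_ } |
        Group-Object |
        Select-Object Name, Count
}
```

The per-side counts give a quick sense of whether you are mostly dealing with principals that no longer resolve On-Premises or with principals that are missing in Exchange Online.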
So, what is the logic used to validate permissions?
I’m glad you asked! Here is the process spelled out. There are four basic steps to this process, broken out as follows.
- Exchange Online - I need to resolve this SID which is present in the security descriptor (Folder or Mailbox)
- Exchange Online - Make a request to the On-Premises MRS Proxy, passing the SID to resolve
- On-Premises MRS Proxy - Look up the SID against Active Directory and return a set of attributes (including primary SID and legacyExchangeDN)
- Exchange Online - Take the legacyExchangeDN value provided, and attempt to match it up with a user account in the cloud which has that stamped as an X500 proxy address.
Normally, Directory Synchronization will take care of stamping the legacyExchangeDN from each side as an X500 proxy address, but this does mean that the On-Premises legacyExchangeDN must match a Mail-enabled recipient (e.g. Mailbox, MailUser, Mail-enabled Security Group) in the cloud by an X500 Proxy. If it does not, then resolving that permission entry will fail.
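The final matching step can be approximated by hand when you want to check a single principal. The sketch below assumes you already have the On-Premises legacyExchangeDN and the cloud recipient's proxy addresses as strings. The function name and sample values are hypothetical, and this is an illustration of the idea rather than the actual MRS logic.

```powershell
# Hypothetical helper: does any proxy address equal X500:<legacyExchangeDN>?
function Test-X500ProxyMatch {
    param(
        [Parameter(Mandatory)] [string]   $LegacyExchangeDN,
        [Parameter(Mandatory)] [string[]] $ProxyAddresses
    )

    $expected = "X500:$LegacyExchangeDN"
    foreach ($proxy in $ProxyAddresses) {
        # -eq on strings is case-insensitive in PowerShell.
        if ($proxy -eq $expected) { return $true }
    }
    return $false
}

# Made-up sample values. In practice the proxies could come from something like
# (Get-Recipient <identity>).EmailAddresses in an Exchange Online session.
$legDN   = '/o=Contoso/ou=Exchange Administrative Group/cn=Recipients/cn=user'
$proxies = @('SMTP:user@contoso.com', "X500:$legDN")

Test-X500ProxyMatch -LegacyExchangeDN $legDN -ProxyAddresses $proxies
```

If this returns False for a principal that still holds valid permissions On-Premises, the missing X500 proxy is a likely reason the permission entry cannot be resolved.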
I do want to differentiate between the different types of permissions errors you may see.
SourcePrincipalMappingException – these mean that when MRS Proxy tried to look up the SID against On-Premises Active Directory, it couldn’t be resolved. This is a common scenario when users leave the company and their accounts are deleted. You could also encounter these issues if the SID in question is part of the SIDHistory of an On-Premises account. When MRS Proxy attempts to look up the SID, we only search by ObjectSID or msExchMasterAccountSID. MRS Proxy does not evaluate against SIDHistory, so the SID failing to be resolved would be expected behavior. SIDHistory being populated won’t be a common scenario, but it is nonetheless something to be aware of.
Note: Exchange Online has a special built-in bad item limit of 1000 for these Source Principal Mapping errors, so these moves will not fail unless you encounter more than 1000 of these types of bad items.
TargetPrincipalMappingException – these mean that we can’t map the permission to a user account in the Target forest (Exchange Online). A common scenario here would be if a user or group was given permissions on a mailbox, but that user or group is not in your dirsync scope. After trying to move that mailbox via MRS, that user or group is not going to be present in Exchange Online, so this error would be expected. Another scenario is if a security group (not mail-enabled!) was used to assign permissions. Non mail-enabled security groups are not synchronized to Exchange Online, so they won’t exist in the Target forest.
To resolve this issue, there are really two options.
- Increase the bad item limit to account for permissions errors. In complex legacy environments where multiple Exchange versions have been in place, and there has been a lot of user turnover, I’ve seen permissions errors number into the thousands. Be prepared that you may need to increase the bad item limit to a number higher than you expect. The good news here is that with improvements to Exchange over the years, the odds of encountering actual bad messages are relatively slim, so the vast majority of bad items are likely to be bad permissions. The second bit of good news here is that we log the type of bad item that is encountered and make this information available in the move report. I’ll show you how to dig into a move report and look at the bad items later on in this blog post.
- Cancel the move, fix the bad permissions from the folder or mailbox by either removing them or fixing the issue causing the user/group to not be resolved in Exchange Online, and then submit the move again. But – you may ask – what if I want to fix the permissions on the current move and then resume it? Well, I’m not going to stop you from fixing bad permissions. But I will tell you that it won’t make any difference for the current move. We only evaluate permissions once, at the end of the initial data copy. If the move fails due to bad items (permissions), even if you fix the bad permissions we won’t re-evaluate the now fixed-up permissions and allow the move to complete successfully. You either have to up the bad item limit, or remove the move and fix the permissions and submit a new move.
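For the first option, raising the limit on an existing move request might look like the sketch below. The identity and the new limit are placeholder values, and the guard simply skips the commands when the session does not have the Exchange Online cmdlets loaded.

```powershell
# Hypothetical values: replace with your own move request identity and limit.
$identity = 'user@contoso.com'
$newLimit = 100

# Only run when connected to Exchange Online (the cmdlets are then available).
if (Get-Command Set-MoveRequest -ErrorAction SilentlyContinue) {
    # -AcceptLargeDataLoss is required once the bad item limit is 51 or higher.
    Set-MoveRequest -Identity $identity -BadItemLimit $newLimit -AcceptLargeDataLoss

    # Resume the failed move so it continues with the new limit.
    Resume-MoveRequest -Identity $identity
}
```

Pick the limit based on what the move report shows rather than guessing, since the number of permissions errors can be far higher than expected.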
Now, I promised earlier that I would go through how to review the permissions errors. You can do this by using PowerShell and saving the move report into a variable where it is stored in memory. I typically have the move report exported out to an XML file because I don’t have direct access to customer tenant information. If you are reviewing failed moves within your own tenant, there is no need to do that if you don’t want. I’ll provide the context to do both just in case you want to know both methods.
To save the move report to a variable, you would run the following from PowerShell connected to Exchange Online.
$movereport = Get-MoveRequestStatistics <move request identity> -IncludeReport
To save the move report to an XML file, then import the XML file into PowerShell, you would run the following from PowerShell connected to Exchange Online.
Get-MoveRequestStatistics <move request identity> -IncludeReport | Export-CliXml c:\temp\movereport.xml
Once the file is saved, you can import it into PowerShell. Note that this PowerShell instance does not have to be connected to Exchange Online. It can be just a regular PowerShell instance.
$movereport = Import-CliXml c:\temp\movereport.xml
If you have never dug into a move report, let me just say that there are all sorts of golden nuggets of information buried inside (which won’t show in the text file from the Portal, by the way!)
Now that you have the move report imported as a variable, you can access all the rich information within the report. We specified our variable earlier as $movereport, so we just need to call that variable, and access the information stored inside it.
$movereport.report.baditems – this gives you a list of all the bad items encountered. A cool tip is that you can use the Out-GridView PowerShell cmdlet to open another window with the list.
$movereport.report.baditems | Out-GridView
What is nice about the Grid View is that you can then filter the output. For example, to validate that all of your bad items are permissions errors, you can simply choose “Add criteria”, check the “Kind” box, and click “Add”.
Change “Contains” to “Does not contain”, and type Security. This will quickly show you if there are any other types of bad items.
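If you prefer to stay in the console, the same check can be sketched with Group-Object and Where-Object. The function names and sample objects below are made up for illustration, and the only property the sketch relies on is Kind, the same column used in the Grid View filter. With a real report you would pass $movereport.report.baditems instead of the sample data.

```powershell
# Hypothetical helper: count bad items per Kind, most frequent first.
function Get-BadItemSummary {
    param([object[]] $BadItems)
    if (-not $BadItems) { return }
    $BadItems | Group-Object -Property Kind |
        Sort-Object Count -Descending |
        Select-Object Name, Count
}

# Hypothetical helper: return bad items whose Kind does not contain "Security".
function Get-NonSecurityBadItem {
    param([object[]] $BadItems)
    if (-not $BadItems) { return }
    $BadItems | Where-Object { $_.Kind -notlike '*Security*' }
}

# Made-up sample data standing in for $movereport.report.baditems.
$sampleBadItems = @(
    [pscustomobject]@{ Kind = 'Security';        FolderName = 'Inbox' }
    [pscustomobject]@{ Kind = 'Security';        FolderName = 'Calendar' }
    [pscustomobject]@{ Kind = 'SampleOtherKind'; FolderName = 'Sent Items' }
)

Get-BadItemSummary     -BadItems $sampleBadItems
Get-NonSecurityBadItem -BadItems $sampleBadItems
```

An empty result from the second function means every bad item in the report is a permissions error.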
Now that we have identified the behavior change, and gone over how to address it, let’s end by talking about what approach should be taken for migrating mailboxes.
The recommended approach to this change in behavior is to continue to migrate using low bad item counts, and then manually remediate the moves that fail. We recommend this approach because a migration that fails indicates either a LOT of bad source permissions (more than 1000), or valid, working permissions On-Premises that are failing to be correctly mapped to objects in Exchange Online. Neither of these conditions should be common, so investigation is warranted to ensure that you are in fact dealing with bad permissions.
Special thanks to Brad Hughes and the rest of the MRS team for their assistance and review of this content.
Ben Winzenz