Exchange Team Blog

Content Search for Targeted Collection of Inactive Mailbox Data

The_Exchange_Team
Platinum Contributor
Jan 18, 2023

Inactive mailboxes in Exchange Online are a vital part of the Microsoft Purview Data Lifecycle Management capabilities. They offer a low/no-cost and versatile way of retaining mailbox data according to your organization’s requirements. At times, though, it’s necessary for an inactive mailbox and/or its data to be moved to a user-accessible state. In those cases there are three retrieval methods: recover, restore, or export with content search. Each method involves its own tools and procedures. Which one is best depends, again, on your organization’s requirements. The recover and restore methods are relatively straightforward; most Exchange administrators are familiar with the concepts and cmdlets associated with each. The content search method, on the other hand, is the least intuitive.

Content search, albeit a powerful tool when applied to typical use cases, doesn’t come with much guidance on the topic of retrieving inactive mailbox data. This post offers a way to use its strengths to bring back inactive mailbox data, in part or in whole, while preserving much of the structure of the source mailbox. Further, when inactive mailboxes include auto-expanding archiving and auxiliary archives, only the content search method is available to bring back the data.

Enabling auto-expanding archiving alone does not limit inactive mailbox data retrieval to content search. That happens when one or more auxiliary archive mailboxes are generated. Use the following command to check if a mailbox has auxiliary archives:
Get-Mailbox <MailboxID> | Select-Object -ExpandProperty MailboxLocations

In this example, the archive mailbox has an auxiliary archive mailbox, noted in green, confirming that auto-expanding archiving is enabled and in use. Larger archive mailboxes may have multiple AuxArchive locations.
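
To check several mailboxes at once, the same property can be filtered in a script. The following is a minimal sketch rather than part of the original guidance: Get-AuxArchiveLocation is a hypothetical helper name, and it assumes only that each MailboxLocations entry, when rendered as text, contains its location type (such as AuxArchive).

```powershell
# Hypothetical helper (not an Exchange cmdlet): returns the MailboxLocations
# entries that describe auxiliary archives.
function Get-AuxArchiveLocation {
    param(
        # Pass the MailboxLocations property of a mailbox object.
        [Parameter(Mandatory)]
        [AllowEmptyCollection()]
        [object[]]$MailboxLocations
    )

    # Assumption: the text form of each entry includes its location type.
    $MailboxLocations | Where-Object { "$_" -match 'AuxArchive' }
}

# Usage, in an Exchange Online PowerShell session:
# $mbx = Get-Mailbox <MailboxID>
# $aux = @(Get-AuxArchiveLocation -MailboxLocations $mbx.MailboxLocations)
# "{0} auxiliary archive location(s)" -f $aux.Count
```

A count of zero suggests that no auxiliary archive has been provisioned yet, even if auto-expanding archiving is enabled.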

Methodology

The documentation for content search for targeted collections describes how an admin can run Get-MailboxFolderStatistics to reveal the folder ID value for each folder in a mailbox. This can be done for both primary and archive mailboxes, along with their respective Recoverable Items folders. The folder ID values can be built into a search query identifying each folder to search for items and later export.

This is straightforward, but there’s a catch when it comes to inactive mailboxes. In their case it may not be possible to run Get-MailboxFolderStatistics on a mailbox without an active mailbox identity. For targeted collection content searches to work with inactive mailboxes it’s a critical prerequisite to collect the folder ID values before a mailbox is made inactive. Administrators intending to use the content search method should integrate the following steps for inactive mailboxes as part of data lifecycle management. We cannot guarantee that an inactive mailbox created without first collecting the folder ID values (assuming the mailbox owner’s account is not recoverable from Azure AD) will be eligible for content search for targeted collections.
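
Readers report in the comments on this post that, at the time of writing, folder statistics can still be read from an inactive mailbox with the -IncludeSoftDeletedRecipients switch. A minimal sketch of that fallback follows. The wrapper name Get-InactiveMailboxFolderId is hypothetical, and there is no guarantee that the switch will keep working for inactive mailboxes, which is why collecting the values in advance remains the recommendation.

```powershell
# Hypothetical wrapper around two Exchange Online cmdlets. It must run in a
# connected Exchange Online PowerShell session.
function Get-InactiveMailboxFolderId {
    param(
        # Any identity that uniquely resolves the inactive mailbox.
        [Parameter(Mandatory)]
        [string]$Identity
    )

    # Resolve the inactive mailbox first; this throws if none is found.
    $mailbox = Get-Mailbox -InactiveMailboxOnly -Identity $Identity -ErrorAction Stop

    # The switch below is what commenters report using for inactive mailboxes.
    Get-MailboxFolderStatistics -Identity $mailbox.DistinguishedName -IncludeSoftDeletedRecipients |
        Select-Object Name, FolderPath, ItemsInFolder, FolderId
}

# Usage:
# Get-InactiveMailboxFolderId -Identity <identity of the inactive mailbox> |
#     Format-Table -AutoSize
```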

Preparing for targeted collection content search

Enumerate mailbox folder ID values

Use the guidance in Use Content search for targeted collections to prepare a script for collecting the folder ID values for a mailbox slated to be made inactive. A sample script (a bit modified from the sample in that article) is provided as the GetFolderSearchParameters.zip attachment to this blog post.

Run the script for the mailbox in preparation for conversion to inactive. In this example, the script shows the folder ID values in the console, and it also generates a CSV file with the list for both the primary and archive mailboxes.
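
The conversion at the heart of such a script is small. The FolderId that Get-MailboxFolderStatistics returns is a Base64 string, while the folderid: search property expects a 48-character hexadecimal value. The sketch below follows the approach of the sample in the targeted collections article (skip the first 23 bytes, keep the next 24). It is not the attached GetFolderSearchParameters script, and ConvertTo-FolderQuery is a hypothetical name.

```powershell
# Hypothetical helper: converts a Base64 FolderId into "folderid:<hex>".
function ConvertTo-FolderQuery {
    param(
        [Parameter(Mandatory)]
        [string]$FolderId
    )

    $bytes = [Convert]::FromBase64String($FolderId)
    if ($bytes.Length -lt 47) {
        throw "FolderId decodes to $($bytes.Length) bytes, but at least 47 are expected."
    }

    # Keep bytes 23 through 46 (24 bytes) and write each as two hex digits.
    $hex = -join ($bytes[23..46] | ForEach-Object { $_.ToString('X2') })
    "folderid:$hex"
}

# Usage with live data (Exchange Online PowerShell session required):
# Get-MailboxFolderStatistics <MailboxID> |
#     Select-Object FolderPath, @{ n = 'FolderQuery'; e = { ConvertTo-FolderQuery $_.FolderId } }
```

As a check, the Inbox FolderId quoted in the comments on this post (the value beginning LgAAAADj8u1/) converts to folderid:6307E1BD1652A84DB1644CB06A80FD9400000000010C0000, which has the same shape as the values used in the search queries later in this post.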

Verify and store the folder ID data

Confirm that the list of folders and the FolderID property values match the folder list in the primary and archive mailboxes. Put the list in a safe storage location where it will be retained at least as long as the inactive mailbox. Remember, this data is critical for this method to work; if this data is lost, so is the ability to use content search for a targeted collection.

Recommended best practice: Send the CSV with folder ID data as an attachment by email into the mailbox from where the data came. Use a distinct subject or other attribute that can be easily searched should the need to retrieve the data arise. At that time an admin can use content search to locate the CSV attachment, export it, and then use the data to search the targeted collection. This way no separate storage is required, and the list is retained in the inactive mailbox.

With the folder ID attribute data secured, the process to create an inactive mailbox can proceed.

Targeted Collection of Inactive Mailbox Data Procedure

When the time comes to retrieve data from an inactive mailbox using content search, the first step is to obtain the list of folder ID values for the mailbox. If the folder ID data was sent by email to the now inactive mailbox, a preliminary content search will quickly retrieve that list. Of course, if the data was stored outside the mailbox, it must be opened from the alternative storage. In that case skip to the section called “Create, name and specify a mailbox location for a targeted collection content search.”

Preliminary content search to export the folder ID data

In the Microsoft Purview compliance portal, go to Content search and select New search.

Name the search, provide an optional description, and then choose a location. Switch Exchange mailboxes to On and select “Choose users, groups, or teams.”

Use the email address of the inactive mailbox to locate it. An inactive mailbox will start with a “.” in the result:

In our example, a CSV of the data was generated by the sample script and emailed to the mailbox itself as an attachment with the subject of “FolderID Values for Targeted Collection.” That subject can be used for the search to retrieve the folder ID data.

Once this preliminary search is complete, folder ID data can be exported, allowing the more extensive targeted collection to proceed.
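
The same preliminary search can be staged from Security & Compliance PowerShell. This is a sketch under stated assumptions: the helper name and default search name are hypothetical, the address in the usage comment is a placeholder, and it assumes the convention of prefixing the inactive mailbox address with a period together with -AllowNotFoundExchangeLocationsEnabled. The helper only builds the parameters, so they can be reviewed before anything is created.

```powershell
# Hypothetical helper: builds New-ComplianceSearch parameters for finding the
# email that carries the folder ID CSV.
function New-FolderIdListSearchParameters {
    param(
        [Parameter(Mandatory)]
        [string]$InactiveMailboxAddress,

        # Subject used when the CSV was mailed into the mailbox.
        [string]$Subject = 'FolderID Values for Targeted Collection',

        [string]$SearchName = 'Preliminary FolderID list search'
    )

    @{
        Name                                  = $SearchName
        # Assumption: a leading period marks the location as an inactive mailbox.
        ExchangeLocation                      = '.' + $InactiveMailboxAddress.TrimStart('.')
        AllowNotFoundExchangeLocationsEnabled = $true
        ContentMatchQuery                     = 'subject:"{0}"' -f $Subject
    }
}

# Usage (Security & Compliance PowerShell session required):
# $params = New-FolderIdListSearchParameters -InactiveMailboxAddress 'user@contoso.com'
# New-ComplianceSearch @params
# Start-ComplianceSearch -Identity $params.Name
```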

Content search sample review allowing the export of folderID properties from an inactive mailbox.

Create, name and specify a mailbox location for a targeted collection content search

The first few steps are like those for setting up the preliminary search in the previous section. In the Microsoft Purview compliance portal, go to Content search and select New search.

Name the search, provide an optional description, and then choose a location. Switch Exchange mailboxes to On and select “Choose users, groups, or teams.”

Use the email address of the inactive mailbox to locate it. Inactive mailboxes start with a “.” in the results.

With the inactive mailbox chosen for the location to search, the targeted collection query can be built.

Build the targeted collection search query using the FolderID values

In the “Define your search conditions” section, use the default Keywords query builder and enter the FolderID values to search. To retrieve data from a folder, its FolderID value must be added to the query and joined to the others with an “OR” operator, like this:

FolderID:<FolderIDValue1> OR FolderID:<FolderIDValue2> OR FolderID:<FolderIDValue3>

Suppose, for example, that an admin needs to retrieve data from the Presentations folder in the primary mailbox of Test User87 as well as the Inbox and the Inbox\Lab subfolders from the archive mailbox. The admin should copy the folder ID values for each folder listed in the mailbox’s CSV.

Each folder or subfolder must have its value added to the query; the search will not automatically recurse down from a parent folder.

The admin opens the CSV file and locates the folders, highlighted with green boxes. Each of the three folders has a unique FolderID value, highlighted in red boxes. To build the content search query, those three values are joined with OR, like this:

folderid:A6EDA8618658144A8D6F2C0A43812D6F0000000018EA0000 OR
folderid:15DB8FE3100F1542994F475280A489070000000001280000 OR
folderid:15DB8FE3100F1542994F475280A489070000000001330000
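
When more than a handful of folders are involved, assembling this string by hand is error-prone. A minimal sketch of the joining step, using a hypothetical helper name, could look like this:

```powershell
# Hypothetical helper: joins folder ID values into one OR query.
function Join-FolderIdQuery {
    param(
        [Parameter(Mandatory)]
        [string[]]$FolderId
    )

    $clauses = foreach ($id in $FolderId) {
        # Accept values with or without the "folderid:" prefix.
        $value = $id.Trim() -replace '^folderid:', ''
        if ($value) { "folderid:$value" }
    }

    # Every folder needs its own clause; subfolders are not included implicitly.
    $clauses -join ' OR '
}

# The three values from the example above:
Join-FolderIdQuery -FolderId @(
    'A6EDA8618658144A8D6F2C0A43812D6F0000000018EA0000',
    '15DB8FE3100F1542994F475280A489070000000001280000',
    '15DB8FE3100F1542994F475280A489070000000001330000'
)
```

The call returns the three clauses on a single line separated by OR, ready to paste into the Keywords box.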

 

In the portal, the search query looks like this:

For brevity, a small number of folders are used for illustration here, but if the admin wants to find and export the closest thing to the entire mailbox then a query can include all of the IPM subtree FolderID values along with those for each of the Recoverable Items folders.

It is possible to use PowerShell to create and manage the search using the *-ComplianceSearch cmdlets, as documented here. Consider that a script can be used to automatically parse a CSV of FolderID values into an array to pass into a New-ComplianceSearch instance. How that would work is outside the scope of this post, but it’s mentioned to convey that staging and running the search can be almost entirely automated.
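
As a rough illustration of that idea, and not a supported script, the sketch below reads a folder ID CSV and prepares the parameters for New-ComplianceSearch. Several things here are assumptions: the helper name is hypothetical, the CSV is assumed to have FolderPath and FolderQuery columns (with FolderQuery already in folderid:<value> form), and the inactive mailbox is addressed with a leading period plus -AllowNotFoundExchangeLocationsEnabled.

```powershell
# Hypothetical helper: turns a folder ID CSV into New-ComplianceSearch parameters.
function New-TargetedCollectionSearchParameters {
    param(
        [Parameter(Mandatory)][string]$CsvPath,
        [Parameter(Mandatory)][string]$InactiveMailboxAddress,
        [Parameter(Mandatory)][string]$SearchName,

        # Optional list of folder paths to include; omit to include every row.
        [string[]]$FolderPath
    )

    # Assumption: the CSV has FolderPath and FolderQuery columns.
    $rows = @(Import-Csv -Path $CsvPath)
    if ($FolderPath) {
        $rows = @($rows | Where-Object { $FolderPath -contains $_.FolderPath })
    }
    if ($rows.Count -eq 0) {
        throw 'No matching folders were found in the CSV.'
    }

    @{
        Name                                  = $SearchName
        # Assumption: a leading period marks the location as an inactive mailbox.
        ExchangeLocation                      = '.' + $InactiveMailboxAddress.TrimStart('.')
        AllowNotFoundExchangeLocationsEnabled = $true
        ContentMatchQuery                     = ($rows.FolderQuery -join ' OR ')
    }
}

# Usage (Security & Compliance PowerShell session required):
# $params = New-TargetedCollectionSearchParameters -CsvPath .\folderids.csv `
#     -InactiveMailboxAddress 'user@contoso.com' -SearchName 'Targeted collection'
# New-ComplianceSearch @params
# Start-ComplianceSearch -Identity $params.Name
# Get-ComplianceSearch -Identity $params.Name | Select-Object Name, Status
```

Building the parameters separately from creating the search makes it easy to inspect the generated query before it runs. If the primary and archive mailboxes contain folders with the same path, filter on a column that distinguishes them instead of FolderPath alone.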

Export and download the data

The steps to verify the search, review samples, and export the results are the same as for any other content search, so they aren’t detailed here. For the export, it’s up to the admin to pick the options that make the most sense. In our example, the default of one PST for each mailbox is chosen with no de-duplication enabled:

Note, though, that exporting content search data to a PST file has a size limit of 10GB. In most cases the export will require quite a few PSTs no matter how it is configured. The export result will be a series of PSTs like this:

If each PST is opened in Outlook, the exported folder data will appear this way:

Conclusion

With a prerequisite of preparation (to get folder IDs), the ability to use content search to locate and export data from some or all of the folders in an inactive mailbox (including its archive) is a powerful option. Especially for customers working with auto-expanding archives, this is a potential lifeline to a seamless data lifecycle management plan. The 10GB PST file limit may be burdensome if using only the Outlook client, but there are several different ways of working with large numbers of PST files to import data from them into a mailbox (a topic for a different post). Please leave a comment with any questions or feedback!

I wanted to thank Jay York, Nino Bilic, Linda Harrell, and Scott Schnoll for their collaboration in producing this post.

Jesse Tedoff
Sr. Cloud Solution Architect - Engineering

Updated May 18, 2023
Version 4.0

6 Comments

  • JeremyTBradshaw, thanks for your comments. The -IncludeSoftDeletedRecipients switch is reliable for the time being. I've learned that some organizations are already using it for existing inactive mailboxes to prepare for the method presented in this post, so it's a great option for customers who don't have the data in advance. That said, there are code-based differences between soft-deleted and inactive mailbox handling, as with the Get-Mailbox cmdlet: it has switches for inactive mailboxes but also a separate one for soft-deleted mailboxes. While the -IncludeSoftDeletedRecipients switch presently works in Get-MailboxFolderStatistics for inactive mailboxes, I can't say with certainty that it will 1, 5, or 10 years from now. Cmdlets change, and switches and their behavior are added and dropped, so the "critical prerequisite" statement reflects that concern.

    I agree there's overhead involved to satisfy it (which can be minimized with investment in PowerShell or Power Platform automation), but I think it represents a worthwhile insurance policy in a situation where there's no guarantee/supportability statement that the cmdlets will always work the way they do today.* If the overhead to collect the data is a serious concern maybe a part 2 post makes sense about how to automate the process. I'm thinking of a script approach that ingests a CSV of mailbox IDs to run Get-MailboxFolderStatistics against them and use Graph to put the data for each into their respective mailboxes. Maybe someone else is thinking Power Automate, whatever might work!

    * It's worth stating that my comments are hypothetical and are not statements about real or planned changes to the cmdlets discussed!

  • Jesse_Tedoff thanks for making the updates, but quick question - in my experience, just like Satyajit321 said, I'm always able to do Get-MailboxFolderStatistics -IncludeSoftDeletedRecipients against any of the many thousands of Inactive Mailboxes that clients I work with have.  I've never had that not work.  Considering this, is the note "critical prerequisite to collect the folder ID values before a mailbox is made inactive" actually even true at all, even sometimes?

     

    The text "-Include" is present in this page 3 times currently (now 4, because of this sentence), but not once in the original post itself.  I feel like the post missed this point and, even though the proposed proactive solution is admittedly slick, it's not really necessary and is a lot to take on especially for large org's like Satyajit321 said.

     

    On the other hand, if Get-MailboxFolderStatistics -IncludeSoftDeletedRecipients in fact sometimes cannot work, then I suppose this whole comment can just be ignored.

     

    Thanks in advance.

  • Thanks for the comments to date! Based on the feedback we've made a change that corrects some inaccuracy on whether FolderID data can be captured from inactive mailboxes.

  • Satyajit321
    Iron Contributor

    "critical prerequisite to collect the folder ID values before a mailbox is made inactive" - This is too much of a prerequisite to ask of a large organization. Keeping this data for every mailbox to be deleted, in the hope that we might have to recover data sometime, is a redundant and failure-prone activity. Try it on 1TB+ mailboxes; it never completes.

     

    It would be easier to allow running folder statistics on inactive mailboxes, or to export folder IDs in Content Search entire-mailbox, report-only exports, so that the required folders can be filtered for the actual data export.

     

    "it is impossible to run Get-MailboxFolderStatistics since the cmdlet requires an active mailbox identity" - I don't think this statement is accurate, we can run 'Get-MailboxFolderStatistics -IncludeSoftDeletedRecipients' for inactive mailboxes too and get folderids whenever we need.

     

    Rest of the article is self-explanatory and adds that the folderId based search can be used against  Aux archives too, which is good.

  • This is a nice post, and I hope the Inactive Mailbox recovery/restore pages on Learn/Docs get the deserved updates as well.  It would have been great for this post and updated documentation to have been published BEFORE all the changes happened to break the Inactive Mailbox recovery/restore steps.

     

    The cat is out of the bag: the additional shards that can result from auto-expanding archives can result from other causes too, which are out of the customer's control. So this post ought to be targeted at all customers that use Inactive Mailboxes, not just customers that use auto-expanding archives.

  • In a world where multiple retention policies might be in place for mailboxes, inactive mailboxes are often in a soft-deleted state but not removed when expected (discussed in https://practical365.com/remove-inactive-mailbox/). This means that it is sometimes possible to retrieve folder ids from inactive mailboxes:

     

    [array]$InactiveMailboxes = get-exomailbox -InactiveMailboxOnly
    $InactiveMailboxes.count
    57

    $Mbx = $InactiveMailboxes[15]

     

    Get-EXOMailboxFolderStatistics -Identity $Mbx.distinguishedName -IncludeSoftDeletedRecipients | sort itemsinfolder -desc | ft name, itemsinfolder, folderid

    Name            ItemsInFolder FolderId
    ----            ------------- --------
    Inbox                    5508 LgAAAADj8u1/g9BJR7or4yVe1bjlAQBjB+G9FlKoTbFkTLBqgP2UAAAAAAEMAAAB
    DiscoveryHolds           3956 LgAAAADj8u1/g9BJR7or4yVe1bjlAQBjB+G9FlKoTbFkTLBqgP2UAABelNdzAAAB
    PersonMetadata            194 LgAAAADj8u1/g9BJR7or4yVe1bjlAQBjB+G9FlKoTbFkTLBqgP2UAAHzoNrmAAAD
    Recipient Cache           141 LgAAAADj8u1/g9BJR7or4yVe1bjlAQBjB+G9FlKoTbFkTLBqgP2UAAAAAAEgAAAD
    Sent Items                 93 LgAAAADj8u1/g9BJR7or4yVe1bjlAQBjB+G9FlKoTbFkTLBqgP2UAAAAAAEJAAAB

     

    Just in case this helps someone...