A common question from Azure Information Protection administrators is how to decrypt protected messages and documents as part of eDiscovery processes. This article describes common, supported approaches to performing eDiscovery across mailboxes and PST files.
Discovery First Approach
No matter what service or software your organization uses for eDiscovery, it's important to perform a first-pass discovery on the mailbox or PST file. In general, that pattern looks like:
Export mailbox to PST file (not necessary with Office 365 eDiscovery).
Use eDiscovery tool to perform discovery on contents.
Generate PST of in-scope items.
Use decryption tool to provide decrypted output of PST.
Complete import by adding decrypted output to eDiscovery tool.
Ideally, the eDiscovery process occurs prior to export, but in many organizations that's not the case.
Note: Office 365 Security and Compliance Center eDiscovery performs discovery prior to export. See below for additional details.
Office 365 eDiscovery
eDiscovery in Office 365 Security and Compliance Center is capable of searching for encrypted items prior to export. This has a few benefits. First, the output PST file, while still requiring decryption, will be much smaller than the raw mailbox dump. When eDiscovery is performed as the first step, the exported data can be up to 96% smaller than the full mailbox. Office 365 Security and Compliance Center can reason over protected content stored in Exchange Online and export all discovered items, including the encrypted messages, to a PST file. Optionally, it can decrypt the encrypted mail on export. Note that these decrypted mail items are stored as individual files rather than bundled as a PST, and that this option cannot currently decrypt protected attachments.
Office 365 eDiscovery can generate three types of PST output:
Messages and attachments protected with Azure Information Protection will be indexed and included with the indexed export, as long as they originated in the tenant where eDiscovery is performed. Items that couldn't be decrypted because they originated in an external tenant will be included in the partially indexed output. Once the export of choice is complete, the PST can be processed by the decryption cmdlet, resulting in a PST that contains no encrypted content.
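As a rough illustration of handing an exported PST to the decryption cmdlet, the sketch below only builds the command line and does not run it. It assumes the cmdlet is Unprotect-RMSFile from the Azure Information Protection PowerShell module and that it accepts -File, -OutputFolder, and -ProcessContainers. Treat the cmdlet name and parameters as assumptions and confirm them with Get-Help on the installed module. The function name and the paths are hypothetical.

```python
def build_decrypt_command(pst_path, output_folder):
    """Build (but do not run) a PowerShell invocation for decrypting a PST.

    Assumption: the cmdlet is Unprotect-RMSFile and takes -File,
    -OutputFolder, and -ProcessContainers. Verify before use.
    """
    def quote(value):
        # PowerShell single-quoted strings escape ' by doubling it.
        return "'" + str(value).replace("'", "''") + "'"

    script = (
        "Unprotect-RMSFile"
        f" -File {quote(pst_path)}"
        f" -OutputFolder {quote(output_folder)}"
        " -ProcessContainers"
    )
    return ["powershell.exe", "-NoProfile", "-Command", script]


# Hypothetical paths, shown only to illustrate the call.
command = build_decrypt_command(r"C:\export\in-scope.pst", r"C:\export\decrypted")
```

The returned list can be passed to subprocess.run on a workstation where the module is installed. Building the command separately makes it easy to log or review before anything is decrypted.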
An alternative to the process above involves exporting the entire mailbox to a PST file, then running eDiscovery processes against that PST file. A common pitfall that delays the discovery process is attempting to decrypt the contents of the entire PST prior to performing eDiscovery. The Azure Information Protection PowerShell module supports PSTs up to 5 GB in size. For this reason, it's important to trim down the data set prior to processing.
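Before sending anything to the decryption step, it can help to flag PST files that exceed the size limit mentioned above. The following is a minimal sketch. It assumes the 5 GB limit means 5 × 1024³ bytes, which the source does not specify, and the function name and folder layout are hypothetical.

```python
from pathlib import Path

# Assumption: "5 GB" is interpreted as 5 * 1024**3 bytes.
MAX_PST_BYTES = 5 * 1024**3


def oversized_psts(folder, limit=MAX_PST_BYTES):
    """Return (path, size_in_bytes) for each .pst in folder larger than limit.

    Results are sorted largest first. Only the top level of the folder
    is scanned, and the extension match is case-insensitive.
    """
    hits = []
    for path in Path(folder).iterdir():
        if path.is_file() and path.suffix.lower() == ".pst":
            size = path.stat().st_size
            if size > limit:
                hits.append((path, size))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

Any file this reports is a candidate for trimming through eDiscovery before decryption is attempted.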
Decrypting a massive PST file may take many hours, or even days, even when less than 10% of its contents are encrypted. Instead, the following process is recommended:
Export PST from Exchange Online or Exchange Server, or from workstation where user had stored mail.
Import PST into preferred third-party eDiscovery tool.
eDiscovery tool will likely report errors on all encrypted content. Generate a PST of all encrypted items.
Use PowerShell module to decrypt this smaller PST file that contains all encrypted items.
Import second output PST into eDiscovery tool.
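The steps above can be modeled as a small round trip. This is a conceptual sketch only: the Item type, the encrypted flag, and the decrypt callable are hypothetical stand-ins for whatever the eDiscovery tool and the PowerShell module actually produce.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Item:
    """Hypothetical stand-in for a mail item in a PST."""
    item_id: str
    encrypted: bool


def first_pass(items):
    """Split items into those the tool can index and those it fails on."""
    indexed = [item for item in items if not item.encrypted]
    failed = [item for item in items if item.encrypted]
    return indexed, failed


def round_trip(items, decrypt):
    """Index readable items, decrypt only the failures, then merge both sets."""
    indexed, failed = first_pass(items)
    decrypted = [decrypt(item) for item in failed]  # the smaller PST
    return indexed + decrypted


# Usage with a placeholder decrypt step that just clears the flag.
mailbox = [Item("a", False), Item("b", True), Item("c", False)]
result = round_trip(mailbox, lambda item: replace(item, encrypted=False))
```

The point of the structure is that the decrypt step only ever sees the items that failed the first pass, never the whole mailbox.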
While this results in extra round trips, it greatly reduces the time to resolution: the eDiscovery software makes a single full pass over the data, instead of the decryption cmdlet making a full pass followed by another full pass by the eDiscovery tool.
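A back-of-the-envelope comparison makes the trade-off concrete. The sketch below uses data volume as a rough proxy for time, which is an assumption, since decryption and indexing are unlikely to run at the same speed. The mailbox size and encrypted fraction in the usage line are illustrative, not measured.

```python
def gb_processed(mailbox_gb, encrypted_fraction):
    """Return (decrypt_first_gb, discover_first_gb) of total data handled.

    decrypt_first: full pass by the decryption cmdlet, then a full pass
    by the eDiscovery tool.
    discover_first: full pass by the eDiscovery tool, then decryption and
    re-import of only the encrypted subset.
    """
    subset_gb = mailbox_gb * encrypted_fraction
    decrypt_first = mailbox_gb + mailbox_gb
    discover_first = mailbox_gb + subset_gb + subset_gb
    return decrypt_first, discover_first


# Illustrative values: a 50 GB mailbox where 10% of the content is encrypted.
decrypt_first, discover_first = gb_processed(50, 0.10)
```

With these example inputs, the discovery-first approach handles roughly 60 GB in total versus 100 GB, and only about 5 GB ever passes through the decryption step.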
If the above options aren't ideal for your organization, the best path forward will be to ask your eDiscovery vendor or partner to integrate the Microsoft Information Protection (MIP) SDK into their application or service. The MIP SDK will allow them to decrypt the messages and documents as they're found, and to include the result in their index and discovery output. This does require that the vendor or partner has an account in your tenant with sufficient privileges, most likely super user.
Trimming down the data set before decrypting it with first- or third-party tools reduces the time and complexity required to deliver eDiscovery results to interested parties. The steps outlined above are the common approaches we see customers taking today.