We've noticed that customers may experience Exchange service timeouts or long waits for services or applications to start. This problem occurs when a server has no internet access or, occasionally, when it has limited internet access. The likely cause is a routine check of the Certificate Revocation List (CRL) for .NET assemblies. In this post, I will explain how the CRL check affects Exchange server services and applications, and how some registry settings can contribute to the problem (and the solution).
CRL and CryptoAPI
A CRL is a list of revoked certificates, which is signed by a Certificate Authority (CA) and made freely available at a public distribution point. Each revoked certificate is identified in a CRL by its certificate serial number.
When certificate-enabled software (such as Exchange .NET-based services) uses a certificate, the Cryptographic Application Programming Interface (CryptoAPI), a Windows sub-system, will check the certificate signature and time validity and will also verify up-to-date certificate status to ensure that the certificate being presented has not been revoked.
To increase performance, the CryptoAPI caches CRLs and certificates referenced in Authority Information Access (AIA). The entries are cached in memory on a per-process basis. When a certificate's status is verified against a CRL, CryptoAPI first searches the local certificate stores and the local cache for any CRL signed by the issuer (the CA) of the certificate being validated. If CryptoAPI fails to find a valid CRL locally, it will download a suitable CRL using the information in that certificate.
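The lookup order described above can be sketched as a small model. This is only an illustration of "check locally first, download on a miss", not how CryptoAPI is actually implemented; the `Crl`, `CrlCache`, and `get_crl` names and the `download` callback are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Crl:
    issuer: str
    next_update: datetime                     # the CRL is treated as valid until this time
    revoked_serials: frozenset = frozenset()  # serial numbers of revoked certificates


@dataclass
class CrlCache:
    """Toy stand-in for the per-process, in-memory CRL cache (hypothetical)."""
    entries: Dict[str, Crl] = field(default_factory=dict)

    def find_valid(self, issuer: str, now: datetime) -> Optional[Crl]:
        crl = self.entries.get(issuer)
        if crl is not None and now < crl.next_update:
            return crl
        return None  # missing or expired


def get_crl(issuer: str, cache: CrlCache,
            download: Callable[[str], Crl], now: datetime) -> Tuple[Crl, str]:
    """Return (crl, source): use a valid local CRL if there is one, else download."""
    crl = cache.find_valid(issuer, now)
    if crl is not None:
        return crl, "cache"
    # Slow path: on a server without internet access, this is the call that
    # blocks until the retrieval timeout expires.
    crl = download(issuer)
    cache.entries[issuer] = crl
    return crl, "download"
```

Because the cache is per process, every newly started service or tool begins with an empty cache and takes the slow path at least once.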
For more information, please refer to the CryptoAPI whitepaper Certificate Revocation and Status Checking.
Two registry settings control the timing of CRL retrieval:
This registry setting defines the default timeout for a single CRL retrieval. If this value is set to 0 or if this value is undefined, the default value that is used is 15,000 milliseconds.
Decreasing the amount of time to allow CRL retrieval can significantly improve performance when internet access is poor or non-existent. Setting the value to 200 (milliseconds) may be a reasonable timeout.
Location: HKLM\SOFTWARE\Microsoft\Cryptography\OID\EncodingType 0\CertDllCreateCertificateChainEngine\Config
The value is defined in milliseconds.
This registry setting defines the cumulative timeout for all CRL retrievals. If this value is set to 0 or if this value is undefined, the default value that is used is 20,000 milliseconds.
Decreasing the amount of time to allow all CRL retrievals can significantly improve performance when internet access is poor or non-existent. Setting the value to 500 (milliseconds) may be a reasonable timeout.
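As a rough sketch, the two timeouts could be applied with `reg add` commands generated like this. The key path is the location given above, and 200 and 500 are the values suggested above. The value names are not shown in this post; the names used below are the ones I believe Windows uses for these two settings, so treat them as an assumption and verify them before running anything.

```python
# Key path as given in the text.
CHAIN_CONFIG_KEY = (
    r"HKLM\SOFTWARE\Microsoft\Cryptography\OID\EncodingType 0"
    r"\CertDllCreateCertificateChainEngine\Config"
)

# ASSUMPTION: the value names do not appear in the text above.
PER_RETRIEVAL_TIMEOUT_NAME = "ChainUrlRetrievalTimeoutMilliseconds"
CUMULATIVE_TIMEOUT_NAME = "ChainRevAccumulativeUrlRetrievalTimeoutMilliseconds"


def reg_add_command(key: str, name: str, milliseconds: int) -> str:
    """Build a `reg add` command line that writes a REG_DWORD timeout value."""
    if milliseconds < 0:
        raise ValueError("timeout must not be negative")
    return f'reg add "{key}" /v {name} /t REG_DWORD /d {milliseconds} /f'


def crl_timeout_commands(per_retrieval_ms: int = 200, cumulative_ms: int = 500) -> list:
    """Commands for the single-retrieval and cumulative CRL timeouts."""
    return [
        reg_add_command(CHAIN_CONFIG_KEY, PER_RETRIEVAL_TIMEOUT_NAME, per_retrieval_ms),
        reg_add_command(CHAIN_CONFIG_KEY, CUMULATIVE_TIMEOUT_NAME, cumulative_ms),
    ]


if __name__ == "__main__":
    for command in crl_timeout_commands():
        print(command)
```

The script only prints the commands, so they can be reviewed and then run from an elevated command prompt on the server.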
In Windows, the Service Control Manager (SCM) maintains an installed services database. When a service starts, the SCM will allow a defined amount of time for the service to complete its startup phase. If the SCM does not receive a "service started" notice from the service within this time-out period, the SCM terminates the process that hosts the service.
By default, the timeout threshold is defined as 30 seconds. This default can be modified for the machine by changing the following registry setting:
The value is defined in milliseconds.
Increasing the ServicesPipeTimeout value (for example, to 60000 or 90000) can increase the probability that all Exchange services will start successfully without a timeout error.
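A minimal sketch of changing this setting from Python follows. The registry location is not shown above; the sketch assumes the value lives under `HKLM\SYSTEM\CurrentControlSet\Control`, so confirm that on your system first. The helper names are hypothetical, the write function works only on Windows with administrative rights, and a restart is normally needed before the SCM picks up the new value.

```python
# ASSUMPTION: the key path is not given in the text above.
CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control"
VALUE_NAME = "ServicesPipeTimeout"


def services_pipe_timeout_ms(seconds: int) -> int:
    """Convert a timeout in seconds to the millisecond value stored in the registry."""
    if seconds < 30:
        # 30 seconds is the default; going lower would make startup timeouts more likely.
        raise ValueError("timeout should not be below the 30-second default")
    return seconds * 1000


def set_services_pipe_timeout(seconds: int) -> None:
    """Write ServicesPipeTimeout as a REG_DWORD (Windows only, needs admin rights)."""
    import winreg  # imported here so the module still loads on non-Windows systems

    milliseconds = services_pipe_timeout_ms(seconds)
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CONTROL_KEY, 0,
                        winreg.KEY_SET_VALUE) as key:
        winreg.SetValueEx(key, VALUE_NAME, 0, winreg.REG_DWORD, milliseconds)
```

For example, `set_services_pipe_timeout(60)` would store 60000, the first of the two example values mentioned above.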
Problems with Exchange and CRL
When the .NET Framework loads an assembly that has an Authenticode signature, it will always try to verify that signature. In Exchange Server 2010 (and Exchange Server 2007), all managed binaries and assemblies are signed. That means that whenever an Exchange managed assembly is updated (during a Rollup installation, for example), the certificate will be verified against the CRL. If internet access is impaired, this can be quite time-consuming, since it may require hitting the network several times to download up-to-date CRLs. This behavior affects Exchange operation in the following ways:
Exchange services startup
Applying an Exchange Rollup may introduce a new certificate for updated assemblies in the package. CryptoAPI will send HTTP requests to download the latest CRL. If the server has no internet access or only intermittent access, and a service has managed assemblies signed with two different certificates (because assemblies in Rollups may be built at different times), the system may spend 30 seconds attempting to retrieve the CRLs (15 seconds per certificate). This exhausts the default timeout period for service startup, and the service will not start.
Possible solutions for this situation are:
Prevent the CRL check when an Exchange service starts. This can be done by setting an Exchange configuration option using the technique described in Exchange 2007 managed services might time out during certificate revocation checks.
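The linked article is not reproduced here. One .NET Framework option that is commonly used for this purpose is the `generatePublisherEvidence` element in the service's application configuration file; whether that is exactly what the article prescribes is an assumption on my part. Under that assumption, a small helper to add the element to a config file's XML might look like this (the function name is hypothetical):

```python
import xml.etree.ElementTree as ET


def disable_publisher_evidence(config_xml: str) -> str:
    """Return config XML with <generatePublisherEvidence enabled="false"/> under <runtime>.

    ASSUMPTION: the service reads a standard .NET application config file
    whose root element is <configuration>.
    """
    root = ET.fromstring(config_xml)
    if root.tag != "configuration":
        raise ValueError("expected a <configuration> root element")

    runtime = root.find("runtime")
    if runtime is None:
        runtime = ET.SubElement(root, "runtime")

    element = runtime.find("generatePublisherEvidence")
    if element is None:
        element = ET.SubElement(runtime, "generatePublisherEvidence")
    element.set("enabled", "false")  # skip building publisher evidence at load time

    return ET.tostring(root, encoding="unicode")
```

The function works on a string rather than a file so the result can be inspected before anything on disk is changed. Back up the original config file, and note that ElementTree does not preserve comments or the original formatting.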
Note: You probably did not encounter this problem when you did the initial installation of Exchange Server. That is probably because all assemblies in the original Exchange installation have the same certificate. Since the total default CRL retrieval timeout is 15 seconds (for one CRL download attempt), that generally leaves enough time for the service to successfully start in the 30 seconds allocated by default by the SCM.
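The arithmetic behind both cases can be checked with a few lines. This is a deliberately simplified model: it assumes one full retrieval timeout per distinct signing certificate and ignores the time the service needs for its own startup work. The function names are hypothetical; the default numbers are the ones quoted in this post.

```python
def crl_wait_ms(distinct_certificates: int,
                per_retrieval_timeout_ms: int = 15_000) -> int:
    """Worst-case time spent waiting on CRL downloads that never succeed."""
    return distinct_certificates * per_retrieval_timeout_ms


def service_starts(distinct_certificates: int,
                   per_retrieval_timeout_ms: int = 15_000,
                   scm_timeout_ms: int = 30_000) -> bool:
    """True if the CRL wait alone still fits inside the SCM startup window."""
    return crl_wait_ms(distinct_certificates, per_retrieval_timeout_ms) < scm_timeout_ms


# Fresh installation: one certificate, 15 s of waiting, service starts.
print(service_starts(1))
# After a Rollup: two certificates, 30 s of waiting, the 30 s window is used up.
print(service_starts(2))
# Two certificates with a 200 ms retrieval timeout: only 400 ms of waiting.
print(service_starts(2, per_retrieval_timeout_ms=200))
# Two certificates with the SCM timeout raised to 60 s.
print(service_starts(2, scm_timeout_ms=60_000))
```

In this model, either shortening the retrieval timeout or lengthening the SCM timeout is enough to bring the two-certificate case back inside the startup window.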
Exchange executable startup
When an executable loads a managed assembly, it will attempt to access the CRL. This can lead to sluggish startup if internet access is poor, because the CRL retrieval must time out first. In Exchange, for example, you may encounter this with the Exchange Management Console or the Exchange Management Shell. The delay can be eliminated by turning off the CRL check for the local user account via Internet Explorer. Instructions for turning off the CRL check are given in this article: How to Install the Latest Service Pack or Update Rollup for Exchange 2007.
(Look under the section titled "When Exchange cannot connect to the Internet").
Exchange Rollup or Interim Update installation
A similar delay occurs when installing an Exchange Update Rollup or Interim Update. During installation, the Native Image Generator (NGEN) is run to prepare the assemblies on the machine for running in native mode. During this process, NGEN must load each assembly, which triggers an attempt to retrieve the CRL, hitting the retrieval timeout many times. Again, this delay can be eliminated by following the instructions in the link described above.
Security best practice
The purpose of the CRL check is to help validate the identity of the author of an assembly. In an environment where an Exchange server is connected to the internet, the best practice is to maintain the CRL check process. Only in a secure environment (where the internet access is turned off or is tightly controlled) should the CRL check be disabled. In this environment it should be safe to turn off the CRL check since it is not going to succeed and will only cause performance problems.
-- Yongcai Liu