(This is a first part of the series on troubleshooting RUS issues with the use of Diagnostics Logging. Stay tuned for more... :)
Many RUS problems can be identified through careful examination of the application event log. It is useful to use event logging to troubleshoot the RUS for many different behaviors, such as when the RUS is not stamping objects at all, appears to be taking a long time, or is stamping them with the wrong proxy addresses. This article describes how to use the event log to identify these issues.
It is up to the domain RUS's for a given domain to stamp the mail-enabled objects in that domain naming context. Exchange allows you to create up to one domain RUS for every DC in the domain. If a domain has more than one domain RUS, the first step in the troubleshooting process is to choose one particular RUS to troubleshoot
To begin logging the events of interest to the application log, increase logging on the following objects to Maximum on the Exchange server responsible for the domain RUS you've chosen:
MSExchangeAL\LDAP Operations MSExchangeAL\Address List Synchronization MSExchangeSA\Proxy Generation (Exchange 2003 only)
Once you've chosen a domain RUS to look at and turned up logging on that Exchange server, the next step is to choose an object to test with, such as a user that has not been stamped yet in that domain. Once you know which RUS you're looking at, and which user you're expecting it to stamp, you can begin taking a closer look at what the RUS is doing.
Repeatedly choosing Rebuild on the RUS or Apply This Policy Now on a policy can complicate the troubleshooting process by causing the RUS to process large numbers of objects. This results in the application log quickly overwriting itself and makes it difficult to follow the sequence of events described above. When troubleshooting the RUS, it is best to avoid Rebuilding or Applying and instead focus on a single test user and use only Update Now to check for new and modified objects. After an Update Now, you can walk through the events described above to understand what the RUS is doing to a particular recipient.
Question 1 - Is the RUS querying for changes?
First you should determine if the RUS is even looking for recipients to process. Based on the schedule, or when Update Now is chosen, the RUS will query the domain for any new or modified objects. The first step in troubleshooting a RUS that's not stamping is to verify that it's checking for changes at all.
In ADSI Edit or LDP, connect to the DC that the RUS points to, and find your test user. Look at the USNChanged attribute and make a note of the value.
Next, open up the application event log on the Exchange server responsible for the RUS. Go to View->Find. For Event ID put 8011. For Description put "Base 'DC" (without the enclosing quotes). This will take you to the most recent search for changes against the domain naming context:
Here you can see the RUS is searching for any objects with a USNChanged value between 273870 and 298312. You may notice that there are many other events 8011 in the application log that contain different searches. These are generated by many different operations. For the purpose of troubleshooting the stamping of users, though, the only 8011 events we are interested in are the ones where the base of the search is the domain in question. This is why it is useful to use Find to skip directly to an 8011 event that contains the text "Base 'DC" - this will take you directly to a search against a domain and skip over the rest of the 8011 events. If you have RUS's for different domains running on the same Exchange server, you may want to include the entire name of the domain in the Description portion of the Find window, so you can skip over any 8011 events for other domain RUS's.
If the user you've identified has a USNChanged value higher than the range of USNs in this event, then the RUS has not yet queried for that user. If the USNChanged on your user is much greater than the USNs currently being processed by the RUS, that's an indication that the RUS has fallen behind and is still catching up to the latest changes. One common reason for this is that a Rebuild was run. When you choose Rebuild, the RUS starts over from a USNChanged of 1 and queries for all objects in the domain. In a large domain, it can take hours or in some cases days for the RUS to process all the objects and catch up after running a Rebuild.
If the user you've identified has a USNChanged value lower than the range of USNs in this event, then this RUS has already passed that user. In that case, continue to search back through the application log until you find the 8011 that contains the range of USNs that includes the user in question. If you can not find it, another option is to make another change to the user. Any change to the object, even just changing the description, will cause the USNChanged to be bumped up to the latest value on the DC. So if the RUS has already gone past the user in question and you can not find the associated 8011, just make a change to the user and note the new USNChanged. Then watch the app log for the next 8011, which should include the USN of the user you just updated.
If there is no 8011 with "Base 'DC" in the description, then the domain RUS has not kicked off since logging was turned up, or if it has then the event wrapped out of the log. If a Rebuild is running it may be difficult to catch the 8011 against the domain root, since the application log will be filling up in a very short time during a Rebuild. See the next section for instructions on how to determine if a Rebuild is running.
If there is no 8011 and it does not appear that a Rebuild is running, check the schedule on the RUS. You can try choosing Update Now to get the RUS to kick off immediately, but do not start a Rebuild or Apply a policy. If you still see no 8011 querying the root of the domain within the next few minutes, it is likely that the RUS is hanging waiting for a response to a search against the DC.
When the RUS is hanging in an LDAP query, you can restart the System Attendant service to get it going again. However, the RUS may hang again unless the cause of the LDAP query hang is identified and corrected. This is usually caused by a network problem, and a Network Monitor capture of the query hanging will be needed to identify the cause.
If the 8011 does contain a range of USNChanged values that includes the USNChanged on the user, then you have answered the first question.