USN Rollbacks and Active Directory Replication Issues
Published Nov 30 2005 03:22 PM

I recently worked on an interesting issue where certain Distribution Groups (DGs) in Active Directory (AD) were not replicating properly to the Exchange 5.5 Directory Service (DS).  After we added several members to a DG in AD, the changes did not replicate to the 5.5 server.  One particularly problematic DG, let's call it Execs, had 58 members in AD and only 23 in the 5.5 DS, even after several replication cycles.  We had already gone through the basics of Active Directory Connector (ADC) troubleshooting, some of which are listed in article 253841, but the problem persisted.  After spinning our wheels for some time, we decided to examine the Update Sequence Numbers (USNs) of the problem DG more closely.

 

Before adding another member in AD, we checked the USNCreated and USNChanged values and found the following:

 

USNCreated:   345530

USNChanged: 11240563

 

Nothing particularly strange so far.  After adding another member on the AD side the values had changed to:

 

USNCreated:  307801

USNChanged: 3438089

 

The first odd thing I noticed was that both values had decreased!  What was happening here?  This is really weird because under normal circumstances 1) USNCreated does not change and 2) USNChanged only increases, usually by a small amount (on busier servers it may jump by more, but you can still tell that the new number is part of the same sequence).  For instance, in my test environment, with a single Exchange server and a single Global Catalog (GC), adding a member to a DG causes USNChanged to increment by at most 2 or 3.
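
The two expectations above can be written down as a quick sanity check. This is only a minimal sketch: the sub name usn_reading_problems and the hash layout are made up for illustration, and it assumes both readings were taken from the same DC, because USN values are local to each DC.

```perl
use strict;
use warnings;

# Hypothetical helper: compare two readings of one object's USN attributes
# (both assumed to come from the same DC) and list every broken expectation.
sub usn_reading_problems {
    my ($before, $after) = @_;
    my @problems;
    push @problems, "USNCreated changed from $before->{created} to $after->{created}"
        if $after->{created} != $before->{created};
    push @problems, "USNChanged went backwards from $before->{changed} to $after->{changed}"
        if $after->{changed} < $before->{changed};
    return @problems;
}

# The values observed for Execs before and after adding a member
my @problems = usn_reading_problems(
    { created => 345530, changed => 11240563 },
    { created => 307801, changed => 3438089 },
);
print "$_\n" for @problems;
```

With the Execs values, both checks fire, which is exactly what made the readings look suspicious.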

 

What was even more surprising was that after making this one change, the DG’s membership successfully replicated to the 5.5 DS.  We decided this was probably just luck.  We had previously added members to this same DG and it did not replicate. I think the difference this time was the choice of Domain Controller (DC) that we made changes on.  

 

Before making the DG change the msExchServer1HighestUSNVector looked like this (see the footnote describing this ADC attribute):

 

"OURDCA: 5225033"

"OURDCB: 11333798"

"OURDCC: 11269307"

"OURDCD: 3039867"  <<----- DC we made changes on

"OURDCE: 72170045"

 

"OURDCW: 11316411"

"OURDCX: 22501269"

"OURDCY: 6993025"

"OURDCZ: 21918680"

 

After adding a member to Execs, its USNChanged was now 3438089, higher than the 3039867 stored for OURDCD, so the next time the ADC polled the AD the changes replicated to Exchange 5.5.  I speculated, given the USNChanged for Execs before adding the member (11240563), that the DC responsible for replicating the change to the 5.5 directory was one of the following:

 

"OURDCB: 11333798"

"OURDCC: 11269307"

"OURDCW: 11316411"

 

These three seemed to have USN sequences in the same range as Execs. I also speculated that since their high-water marks were all higher than 11240563, Execs was not going to replicate to 5.5 until its USN exceeded the high-water mark for one of these three DCs.  Graham McIntyre, our resident ADC guru, thinks the problem was that changes to the DG were not making it to the AD bridgehead, which is the endpoint for the connection agreement responsible for the DG.
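
My speculation boils down to a simple comparison against the stored high-water marks. Here is a rough sketch of it, using the numbers from this case. The sub names parse_usn_vector and would_be_picked_up are hypothetical, and the comparison rule is only my reading of how the ADC behaves, not its actual code.

```perl
use strict;
use warnings;

# Hypothetical helper: turn entries such as "OURDCD: 3039867" into a hash
# of DC name => high-water mark.
sub parse_usn_vector {
    my %mark;
    for my $entry (@_) {
        $mark{$1} = $2 if $entry =~ /^\s*([^:\s]+)\s*:\s*(\d+)\s*$/;
    }
    return %mark;
}

# Hypothetical helper: true if an object with this USNChanged on $dc is
# above the mark stored for $dc, i.e. the next poll should pick it up.
sub would_be_picked_up {
    my ($marks, $dc, $usn) = @_;
    return exists $marks->{$dc} && $usn > $marks->{$dc};
}

my %marks = parse_usn_vector(
    'OURDCB: 11333798', 'OURDCC: 11269307', 'OURDCW: 11316411', 'OURDCD: 3039867',
);

# Execs before the change (checked against the three candidate DCs) and after it (OURDCD)
for my $case (['OURDCB', 11240563], ['OURDCC', 11240563],
              ['OURDCW', 11240563], ['OURDCD', 3438089]) {
    my ($dc, $usn) = @$case;
    my $verdict = would_be_picked_up(\%marks, $dc, $usn) ? 'picked up' : 'skipped';
    print "$dc: USNChanged $usn vs mark $marks{$dc} => $verdict\n";
}
```

Only the OURDCD case comes out as picked up, which matches what we saw.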

 

But I still wondered why USNChanged and USNCreated had been reset on OURDCD, so I took a flight of stairs down to talk to our Active Directory folks (Exchange runs on top of Windows, so naturally we sit a floor above the Windows Active Directory team :) ).  They told me that one possible cause is a USN rollback, an occurrence they have seen a few times recently.  USN rollbacks are described in detail in article 875495, and there are two ways to detect them:

 

  1. Applying the fix described in the article and looking out for the listed 2095, 1113, 1115 and 2103 events.

 

  2. Running repadmin /showutdvec * dc=<domain name>,dc=<domain suffix> and then looking at the output to determine whether, for any given DC A, some other remote DC B has a higher watermark USN for A than A has for itself. (If the difference in USN is only slight, it could be a timing issue rather than a rollback, i.e. repadmin ran on A first, A's high-water mark USN incremented, the changes replicated to B, and then repadmin ran on B.)

Ultimately, option 1 was going to be hard to justify because our customer had a rigorous change control process they follow before applying a fix.  Most enterprise customers do, even for fixes that are known to definitively solve a specific problem (and this fix was only going to detect a problem we thought the customer might have).

 

That left us with option 2.  Unfortunately, with 44 DCs, the repadmin output (44 x 44 entries) was going to be much harder to go through than the simple example in the article:

 

Repadmin /showutdvec dc1 dc=contoso,dc=com


Site1\DC1 @ USN 10 @ Time 2004-08-04 15:07:15
Site2\DC2 @ USN 24805 @ Time 2004-08-04 15:06:59


Repadmin /showutdvec dc2 dc=contoso,dc=com

Site1\DC1 @ USN 50 @ Time 2004-08-04 15:07:15
Site2\DC2 @ USN 24805 @ Time 2004-08-04 15:06:59

 

where DC1 has clearly experienced a rollback, since DC2 has a higher USN for DC1 (50) than DC1 has for itself (10).
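
Once the vectors are in a table, the all-pairs comparison is mechanical. The sketch below works on an in-memory table rather than on real repadmin output. The sub name find_rollback_suspects, the table layout and the $slack tolerance (a crude allowance for the timing effect mentioned earlier) are all assumptions made for illustration.

```perl
use strict;
use warnings;

# $vectors maps reporting DC => { DC => highest USN the reporting DC has for it }.
# $slack is a hypothetical tolerance for small, timing-related differences.
sub find_rollback_suspects {
    my ($vectors, $slack) = @_;
    my @suspects;
    for my $reporter (sort keys %$vectors) {
        for my $dc (sort keys %{ $vectors->{$reporter} }) {
            next if $dc eq $reporter;    # a DC's own entry is the baseline, not evidence
            next unless exists $vectors->{$dc} && exists $vectors->{$dc}{$dc};
            my $self   = $vectors->{$dc}{$dc};
            my $remote = $vectors->{$reporter}{$dc};
            push @suspects, [$dc, $self, $reporter, $remote] if $remote > $self + $slack;
        }
    }
    return @suspects;
}

# The two-DC example from the article
my %vectors = (
    DC1 => { DC1 => 10, DC2 => 24805 },
    DC2 => { DC1 => 50, DC2 => 24805 },
);

for my $suspect (find_rollback_suspects(\%vectors, 0)) {
    my ($dc, $self, $reporter, $remote) = @$suspect;
    print "$dc has USN $self for itself, but $reporter has USN $remote for it\n";
}
# prints: DC1 has USN 10 for itself, but DC2 has USN 50 for it
```

How large a slack is reasonable depends on how busy the DCs are and how long the repadmin run takes, so treat any value as a judgment call.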

 

Like a good boy scout, I wrote a Perl script, rollbackchecker.pl (see the script at the end of this post), to parse the output and detect possible rollbacks. Running the script showed several entries that looked like this:

 

---------- rollback -----------------

DC  OURDCD may have experienced a USN rollback. It's  Self Highest USN = 8280087 but the remote DC OURDCA has a USN = 8288138 for OURDCD that is higher than what OURDCD has for itself

 

---------- rollback -----------------

DC  OURDCD may have experienced a USN rollback. It's  Self Highest USN = 8280087 but the remote DC OURDCB has a USN = 8290610 for OURDCD that is higher than what OURDCD has for itself

 

We had indeed made our change to Execs on OURDCD.  According to the article, the most common sources of USN rollbacks are:

 

  1. Virtualized hosting environments, including but not limited to Microsoft Virtual Server 2005 and EMC VMware

  2. Software that backs up and restores an Active Directory operating system installation or a hard disk volume that contains the installation (including but not limited to Norton Ghost)

  3. Advanced disk subsystems that can selectively copy a volume that contains an Active Directory operating system installation that was saved in the past

On further questioning the customer said they may have done a system state restore on OURDCD because it had experienced ‘hardware issues’.  The article further states that there are only 2 ways to recover from a rollback.

 

  1. Use the Active Directory Installation Wizard (dcpromo.exe) to remove and then reinstall Active Directory (If you are not interested in the changes made on the problem DC)

  2. Restore the system state from a good recent backup using a supported method.

Our customer decided to dcpromo OURDCD down and then promote it back to a DC.  Since then they have not experienced DG replication issues.

 

This customer’s rollback manifested as an ADC replication issue but broken AD replication can affect Exchange in numerous ways.  In fact, to say that Exchange relies on AD is to grossly understate it.

 

The moral of the story is that you should avoid doing any of the listed things that can cause a USN rollback.

 

Footnote:

 

For those not familiar with msExchServer1HighestUSNVector, it is an attribute that the ADC uses to store the high-watermark USN for every DC with which it replicates. The ADC periodically polls the AD for objects with higher USNs than the last highest USN that successfully replicated with the 5.5 DS and replicates the new changes. There is a corresponding msExchServer2HighestUSNVector that is used to track replication changes from Exchange 5.5 to the AD. The exact mechanics of this process are described in article 253840.
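
To make the polling idea concrete, here is a toy version of one polling pass against a single DC. It is an illustration of the general high-water mark technique only, not the ADC's real implementation. The sub name poll_once and the Staff object are invented for the example.

```perl
use strict;
use warnings;

# Illustration only: one polling pass in the high-water mark style.
# $objects maps object name => USNChanged on the polled DC.
# Returns the names picked up (as an array ref) and the new high-water mark.
sub poll_once {
    my ($objects, $watermark) = @_;
    my @picked   = grep { $objects->{$_} > $watermark } sort keys %$objects;
    my $new_mark = $watermark;
    for my $name (@picked) {
        $new_mark = $objects->{$name} if $objects->{$name} > $new_mark;
    }
    return (\@picked, $new_mark);
}

# Execs after the change on OURDCD, plus a made-up object below the mark
my %objects = (Execs => 3438089, Staff => 3000000);
my ($picked, $mark) = poll_once(\%objects, 3039867);
print "picked up: @$picked; new high-water mark: $mark\n";
# prints: picked up: Execs; new high-water mark: 3438089
```

The same logic shows why a rollback hurts: changes stamped with USNs at or below the stored mark are never picked up by later polls.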

 

The script I wrote:

 

#=================================================================================
# rollbackchecker by Jasper Kuria (Nov 11, 2005)
#
# Script to parse repadmin output and determine if we have a USN rollback
#
# Usage: perl rollbackchecker.pl <repadmin output file>
#
# <repadmin output file> is generated using:
#
#     repadmin /showutdvec * dc=<domain name>,dc=<domain suffix>
#=================================================================================

use strict;
use warnings;

my $dcCount         = 0;
my $repadminPattern = qr/repadmin running command \/showutdvec/;
my $dcUSNPattern    = qr/(\S+)\s+@ USN +([0-9]+) @/;
my $rollBackFound   = 0;
my %dcUSNTable;     # DC name (uppercase) => highest USN the DC holds for itself
my $currentDC;      # DC whose section of the repadmin output is being read

die "Usage: perl rollbackchecker.pl <repadmin output file>\n" unless @ARGV;
my $inputFile = $ARGV[0];

# First pass to build up the (DC => Self USN) table

open(my $in, '<', $inputFile) or die "Cannot open $inputFile: $!\n";
while (my $line = <$in>)
{
   if ($line =~ $repadminPattern)
   {
      $currentDC = GetCurrentDc($line);
      next;
   }
   if (defined $currentDC && $line =~ $dcUSNPattern)
   {
      AddDcUSNToTable($currentDC, $line);
   }
}
close($in);

PrintDcUSNTable();

# Second pass to check if for a given DC, some other DC has a higher USN for it.

undef $currentDC;
open($in, '<', $inputFile) or die "Cannot open $inputFile: $!\n";
while (my $line = <$in>)
{
   if ($line =~ $repadminPattern)
   {
      $currentDC = GetCurrentDc($line);
      next;
   }
   if (defined $currentDC && $line =~ $dcUSNPattern)
   {
      CheckAndReportRollBack($currentDC, $line);
   }
}
close($in);

# report no rollbacks
if (!$rollBackFound)
{
   print "\nNo USN rollbacks were found\n";
}

# subroutines

# Returns the DC name (uppercase, without the domain part) from a repadmin header line
sub GetCurrentDc
{
   my ($line) = @_;
   my @tokens     = split(' ', $line);          # split on whitespace; also drops the newline
   my @fqdnTokens = split(/\./, $tokens[-1]);   # the last token is the DC name or FQDN
   return uc($fqdnTokens[0]);                   # convert to uppercase for comparison
}

# Returns (DC name in uppercase, USN) from a "Site\DC @ USN n @ Time ..." line
sub ParseDcUSNLine
{
   my ($line) = @_;
   return unless $line =~ $dcUSNPattern;
   my ($source, $usn) = ($1, $2);
   my @nameTokens = split(/\\/, $source);       # "Site\DC" => ("Site", "DC")
   return (uc($nameTokens[-1]), $usn);
}

sub CheckAndReportRollBack
{
   my ($reportingDC, $line) = @_;
   my ($dcToCheck, $usn) = ParseDcUSNLine($line);
   return unless defined $dcToCheck;
   my $selfUSN = $dcUSNTable{$dcToCheck};
   return unless defined $selfUSN;              # does the DC have an entry in the table?
   if ($usn > $selfUSN)
   {
      $rollBackFound = 1;
      print "\n---------- rollback -----------------\n";
      print("DC  $dcToCheck may have experienced a USN rollback. ");
      print("It's  Self Highest USN = $selfUSN but the remote DC $reportingDC ");
      print("has a USN = $usn for $dcToCheck that is higher than what $dcToCheck has for itself\n");
   }
}

# Records the USN only when the line describes the DC that produced the output
sub AddDcUSNToTable
{
   my ($reportingDC, $line) = @_;
   my ($dcToCheck, $usn) = ParseDcUSNLine($line);
   return unless defined $dcToCheck;
   if ($dcToCheck eq $reportingDC)
   {
      $dcUSNTable{$reportingDC} = $usn;
      $dcCount++;
   }
}

sub PrintDcUSNTable
{
   print "\nrepadmin /showutdvec ran successfully on $dcCount DCs\n\n";
   print "DC Name               Self Highest USN\n";
   print "=================     ==================\n";
   my $num = 1;
   foreach my $dc (sort keys %dcUSNTable)
   {
      print("$num.  $dc         $dcUSNTable{$dc}\n");
      $num++;
   }
}

 

Thanks to Kent Dietz for his review of all this!

 

- Jasper Kuria

6 Comments
Not applicable
I love

"they may have done a system state restore on OURDCD"

They MAY have done one? I thought you said they had change control... If they can't tell you if they did a system state restore (exceedingly doubtful to the nth degree, they probably dropped an old disk image into place and called it a system state restore but regardless...) I would argue that they don't really have change control.


joe
Not applicable
hey man, that's what they told me :)
Not applicable
Great Post though... Loved reading it.
Not applicable
thanks. People like you are why I blog!
Not applicable
Understood on the thats what they told me. When I returned to IT and computers back in the mid-90's as a support tech my supervisor sat me down on the first day and said there are two rules you need to always remember here...

1. Believe none of what you hear and only half of what you see.

2. Users lie. They don't always intend to but it is the same result whether they intend it or not.

joe
Not applicable
true, true, Joe. I sometimes like to say "Trust but verify".

Last update: Jul 01 2019 03:09 PM