DC replication only from remote to on-prem

Hi,

 

I have a setup with one DC on premises and two at the remote site. Recently I noticed that the remote DCs are not replicating changes from the on-prem DC. If, however, I make a change on a remote DC, it replicates across all 3 DCs.

 

Also, whatever I try to access on the on-prem DC from the remote DCs fails: shares, Remote Desktop, changing the DC in Active Directory Users and Computers, running various tests. Only ping gets through.

 

From the on-prem DC I can access everything on the remote site.

 

Windows Firewall is off on all DCs for testing, and there are no rules on the physical firewalls that would block any traffic over the tunnel.

 

I am getting various errors when trying to establish a connection or replicate, mostly 1722, or "The naming context is in the process of being removed" if I try to replicate using AD Sites and Services.

 

Any thoughts?

2 Replies

@Milan_Najerica 

 

Yep, what @Dave Patrick said.

 

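I would triple-check your firewall, as error 1722 ("The RPC server is unavailable") relates to RPC. RPC depends on access to TCP port 135 (the endpoint mapper) and to the dynamic RPC ports it hands out (TCP 49152-65535 by default on current Windows Server versions), which a lot of uninformed firewall administrators do not let through. It's an incredibly common scenario.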

 

Dave referenced PortQry already. You want to pull that down and run this specific command for checking RPC:

 

portqry -n [yourRemoteDcFQDN.goes.here] -e 135

 

For example, from one domain controller, you'd test connecting to the others using:

 

portqry -n otherdc01.mydomain.com -e 135
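
If you can't get PortQry onto a box straight away, a plain TCP connect test gives a rough first answer. The snippet below is only a minimal sketch, not a PortQry replacement: it checks whether a TCP handshake completes and does not query the RPC endpoint mapper the way PortQry does. The function names and the port list are my own choices for illustration, and the host name in the comment is the same placeholder as above.

```python
import socket

# Well-known TCP ports a domain controller typically needs reachable
# (illustrative list): 53 DNS, 88 Kerberos, 135 RPC endpoint mapper,
# 389 LDAP, 445 SMB. Dynamic RPC ports are NOT covered here.
DC_PORTS = (53, 88, 135, 389, 445)


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_ports(host, ports=DC_PORTS, timeout=3.0):
    """Return a dict mapping each port to True (reachable) or False."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}


# Example (run from a remote DC against the on-prem one):
#   for port, ok in check_ports("otherdc01.mydomain.com").items():
#       print(port, "open" if ok else "blocked/unreachable")
```

Run it in both directions. In a case like yours I'd expect it to pass from on-prem to remote and fail from remote to on-prem, which points at the firewall or tunnel rather than at Active Directory.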

 

You really shouldn't have the Windows Firewall disabled - not even for testing purposes - on any host, but particularly on a domain controller. A much safer approach is to leave the Windows Firewall enabled and create a manual "allow all" rule scoped to the IP addresses of the remote domain controllers (i.e. as entries in the "Remote IP address" list box within the "Scope" tab.)

 

An "allow all" rule lets everything through while also affording some additional baked-in protections from the Windows Firewall. But I only mention this as an aside.

 

"The naming context is being removed" is expected in a 1722 scenario in any site that has more than a single domain controller.

 

Every domain controller has a process it runs called the "Knowledge Consistency Checker" - aka the KCC. In your site with two domain controllers, the KCC on one will try to connect to the remote domain controller from the other site and when it fails, it'll flag the "naming context" - also known as partitions - for removal. What will also happen is the second local domain controller will have a go at trying to connect to that remote domain controller, and so the cycle will repeat since neither can connect.

 

This results in a cycle of "naming context is being removed" errors listed by tools such as repadmin or the Directory Service event log. Even so, the only thing important to understand here is that this is a symptom and not a root cause.

 

Once you fix the basic connectivity issue causing the RPC failure, the symptoms may resolve themselves. I say may because it's also possible that if this has gone on for a while, there'll be knock-on issues with DNS and a myriad of other things.

 

With those kinds of considerations in mind, once the RPC connectivity is restored, it'd be prudent to point the DNS client entries (i.e. on the network interfaces) of all three domain controllers to a single domain controller, just until replication has succeeded across all three (or however many) domain controllers.

 

After this has succeeded, give it another couple of hours to be really sure important partitions like the Configuration and DomainDnsZones have replicated (they'll be listed in "repadmin /showreps" with a "successful" status). After that, you should be free to put the DNS client addresses back to whatever your preferred domain controller IP addresses are. Don't point them to anything other than a domain controller within the same forest unless you know how to deal with knock-on impacts to service location records.
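
If you'd rather not eyeball the replication report, repadmin can also emit CSV ("repadmin /showrepl * /csv"; as far as I know /showreps is just the older spelling of /showrepl), and you can filter that for links that still show failures. The sketch below assumes the CSV header uses the column names "Destination DSA", "Source DSA", "Naming Context", "Number of Failures" and "Last Failure Status"; check them against your own output, since I'm writing them from memory. The function name is made up for illustration.

```python
import csv
import io


def failing_links(showrepl_csv):
    """Return (destination, source, naming context, failures, last status)
    for every replication link whose failure count is above zero.

    showrepl_csv is the text of `repadmin /showrepl * /csv`.
    Column names are an assumption; adjust them to match your header row.
    """
    reader = csv.DictReader(io.StringIO(showrepl_csv))
    failures = []
    for row in reader:
        count = (row.get("Number of Failures") or "0").strip()
        if count.isdigit() and int(count) > 0:
            failures.append((
                row.get("Destination DSA"),
                row.get("Source DSA"),
                row.get("Naming Context"),
                int(count),
                row.get("Last Failure Status"),
            ))
    return failures
```

An empty list for every partition, including Configuration and DomainDnsZones, is the point at which I'd consider moving the DNS client settings back.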

 

Anyhow, that's just a longer version of what Dave alluded to, and it is almost guaranteed to be a firewall (okay, maybe routing but that's less common) issue.

 

Cheers,

Lain