SOLVED

Authenticating to a RoDC is unsuccessful

I have the requirement to create a segregated network for a group of my users.  The network will contain 1 file server, an RoDC and a bunch of workstations. 

 

The workstations have no connectivity to any RWDC, however the File Server and RODC do have and should always have connectivity as these are dependent on a local connection through a firewall and do not require a VPN or WAN link to be available.

 

Replication is working between my RWDC and the RoDC (confirmed by DNS updates and AD group changes replicating successfully).

 

However, I am still unable to log in to a workstation on the same network as the RoDC.

 

Here are the facts that I know:
* On the workstation, Network Location Awareness is not detecting the domain (it sets the network to Private)

* Appropriate users and workstations have been added to the Password Replication Policy (though as I understand it this should not be required as the RODC has connectivity to the RWDC)

* Appropriate users and workstations have been "pre-populated"

* RODC is a Global Catalog

* IP Address for the workstation is issued via DHCP on the File Server, with DNS entry pointing to the RODC.

 

I don't understand why this is not working.  Am I missing something?

@BrentStobbs 

 

Without any specific diagnostic material to inform us, we're just guessing here.

 

Something that comes to mind is whether the clients and even the RODC itself have come to the conclusion that they're actually in the same site. There are numerous ways to check this, but an easily accessible one is to run the following from PowerShell (administrative elevation not required).

 

 

[System.DirectoryServices.ActiveDirectory.ActiveDirectorySite]::GetComputerSite()

 

 

Other useful sources of information would be the output of dcdiag.exe run on the RODC and any errors reported by the NETLOGON source in the System event log on the clients. There are obviously others, but any errors from these would help us provide more specific feedback.

 

Here's some literature that offers further insight into specific processes and how they behave in an RODC context, which may give you additional clues on what you might like to investigate next.

 

In-depth process reviews and important RODC DNS records:

 

More generalised impact articles not specifically related to authentication issues:

 

Cheers,

Lain

Hi Lain,

Thanks for your reply. The links you provided confirm that what I am trying to achieve is not unusual and that my understanding of the RODC function is correct. The only potential issue is the dynamic update of DNS records, which is a bridge I can cross in the future; it should not block authentication.

I should also mention that I have a Windows 2019 server in the perimeter network with the RODC that allows authentication and the NLA service correctly assigns the connection as DomainAuthenticated. However, at some point I had allowed this server to communicate directly with a RWDC (though communications are now blocked).

Running the PowerShell command you suggested returns a "Domain cannot be contacted" error on the workstation. Running it on the RODC (or other connected server) confirms the correct site.

If I do a lookup for my domain using NSLOOKUP (in either site), the RODC is not listed. Shouldn't it be listed here?

@BrentStobbs 

 

Hi Brent,

 

Without knowing the explicit query you're running through nslookup, I'm not sure. I can only generalise and say if you're looking for SOA or NS records, then no, it won't be there.

 

RODCs do not create NS (and SOA) records, as described below. Queries for those are going to resolve back to a writable domain controller, but this is something of an aside to your authentication issue and isn't the cause of the "domain not found" - at least not on its own.

 

Plan DNS Servers for Branch Office Environments | Microsoft Docs

 

Again, without any actual errors from the clients or RODC to work with, or supporting configuration information from things like ipconfig /all, I'm guessing, but my gut feeling is that your RODC isn't happy about something.

 

Perhaps have a read of the following and see what the nltest.exe command within returns both from your RODC as well as the impacted clients.

 

RODC logs DNS event 4015 with error code 00002095 - Windows Server | Microsoft Docs

 

I'd say start with comprehensively assessing your RODC using things like dcdiag.exe and critical event logs before working backwards to the clients themselves, where things like NETLOGON and SCHANNEL events within the System log are going to be quite useful.

 

One DNS tip I'll toss out there as yet another guess is be careful of DNS search suffixes, if you're using them. I've seen issues with DNS client timeouts where they haven't been implemented efficiently.

 

Similarly, with your nslookup checks, ensure you're testing with and without a trailing period on the hostname, as that is often useful for tracking the impact of search suffixes.

 

For example, my domain is robertsonpayne.com. Running:

 

nslookup -type=SOA robertsonpayne.com.

 

will instruct the DNS client to avoid iterating through any search suffixes, resulting in a different (explicit and therefore more efficient) question being sent to the DNS server. Leaving off the final period does not do this, meaning all search suffixes are tried and devolved until one resolves.
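
To make the trailing-period behaviour a little more concrete, here is a minimal Python sketch of how a resolver might expand a name against a search suffix list. It is a deliberately simplified model, not the actual Windows DNS client or nslookup algorithm; the function name candidate_queries and the suffix corp.example are made up for illustration.

```python
def candidate_queries(name, search_suffixes):
    """Return the fully qualified names a resolver might try, in order.

    Simplified model (an assumption, not the real Windows algorithm):
    - a trailing period marks the name as fully qualified, so it is
      queried exactly once and no suffix is appended;
    - otherwise each search suffix is appended in turn, and the bare
      name is tried last.
    """
    if name.endswith("."):
        return [name]
    candidates = [f"{name}.{suffix.strip('.')}." for suffix in search_suffixes]
    candidates.append(name + ".")
    return candidates


suffixes = ["corp.example"]  # hypothetical search suffix
print(candidate_queries("robertsonpayne.com.", suffixes))
print(candidate_queries("robertsonpayne.com", suffixes))
```

With the trailing period only one question is generated. Without it, this model asks for robertsonpayne.com.corp.example. before it ever asks for the name you actually meant, which is where the extra round trips and timeouts can come from.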

 

Cheers,

Lain

@BrentStobbs 

 

Also, when run on the RODC, did the GetComputerSite() return its own site, or some other site?

 

I'm working under the assumption that the RODC has been put into its own site and the current subnets (particularly those the clients are on) have been assigned to that site. If that isn't the case, then that gets back to what I was fishing for earlier with that command, which is site discovery leading the clients to try and talk across the firewall.

 

Cheers,

Lain

Hi Lain

Thank you for your ongoing assistance with this. Your assumption is correct. The RoDC has been put into its own site with the appropriate subnet and IP Link configured.

Replication between sites is working, as I can add/remove users to my administration group which allows logon to the DC, and this is accurately reflected after initiating a replication.

I wondered if something went amiss when setting up the RoDC, so I demoted it and then promoted it again, but I still have the same issue, except that the cached data has gone (which is a good thing for troubleshooting, in my opinion).

When I run GetComputerSite() on my RoDC it correctly returns the site information (I have censored some information out with *s):

Name : ***Isolation
Domains : {}
Subnets : {192.168.2.0/24}
Servers : {******RODC.local.domain.com}
AdjacentSites : {****-Corporate-Lan}
SiteLinks : {***-Isolation}
InterSiteTopologyGenerator :
Options : None
Location :
BridgeheadServers : {*******RODC.local.domain.com}
PreferredSmtpBridgeheadServers : {}
PreferredRpcBridgeheadServers : {}
IntraSiteReplicationSchedule :

When I run this same command from the client workstation I get "The specified domain either does not exist or could not be contacted"

IPCONFIG /ALL on my RoDC (IP Addresses have been changed from the real addresses):

Connection-specific DNS Suffix . :
Description . . . . . . . . . . . : Microsoft Network Adapter Multiplexor Driver
Physical Address. . . . . . . . . : ********
DHCP Enabled. . . . . . . . . . . : No
Autoconfiguration Enabled . . . . : Yes
Link-local IPv6 Address . . . . . : fe80::e5a5:b81d:9ebe:4303%4(Preferred)
IPv4 Address. . . . . . . . . . . : 192.168.200.10(Preferred)
Subnet Mask . . . . . . . . . . . : 255.255.255.0
Default Gateway . . . . . . . . . : 192.168.200.4
DNS Servers . . . . . . . . . . . : ::1
127.0.0.1
192.168.100.10 (DC1 in Corporate Site)
192.168.100.15 (DC2 in Corporate Site)

IP Configuration on my client is set by DHCP with a single DNS server, the RoDC.

The NSLOOKUP command I ran was NSLOOKUP LOCAL.DOMAIN.COM. When run on the RoDC or Client Workstation this returns the correct IP addresses of all my RWDCs (of which I have 4 in 3 Sites), but not the RoDC.

Based on this thread https://social.technet.microsoft.com/Forums/lync/en-US/22499c64-6016-4be5-8cb5-f538b49dd321/rodc-not... the RoDC should be listed here. I can confirm there is a PTR record for the RoDC in the Reverse Lookup Zone.

Additionally if doing a lookup for _ldap._tcp.dc._msdcs.local.domain.com after setting the type to all, only the RWDCs are listed. The same results occur when running this on my primary DC or within the RoDCs site.

The reason I'm fixated on this being the issue, is if a client workstation connects to the site while all connections to any RWDC are unavailable, the client still has to be able to query DNS to locate the RoDC before it can determine what site it is in and ask the appropriate DC (or RoDC) for authentication.

Currently my workstations are not aware there is a connectable RoDC that can tell them what site they are in.

I'm assuming that I am able to log on to the RoDC because it knows it is a DC and doesn't need to go looking for one.

Or am I barking up completely the wrong tree?

@BrentStobbs 

 

Starting with a quick answer to the last question, about logging on directly to the RODC: it's the same process as logging on to a client, because the RODC does not hold any user credentials locally until it has cached them for the first time.

 

So, it still has to go through precisely the same process of talking to a writable domain controller and pulling down the credentials to cache (via the single object replication request, and assuming it's eligible to do so under the password replication policy) before doing the "normal" thing thereafter of only needing to check itself for subsequent logons.

 

That being the case, my gut feeling at this stage is that your successful logon is going to be because the RODC is communicating reliably with the other writable domain controllers, where something is still amiss from the client perspective.

 

I'll have a proper read of the full details in a bit and see if that jogs anything in my memory, too.

 

Cheers,

Lain

 

 

Okay, so I've done some more homework but it left me with a grey area around the DNS records that aren't related to service discovery, for which I can't find documentation as clear as that for service location records (which I linked previously). As such, I figured it'd be faster just to knock up an RODC quickly.

 

The short version is that it is indeed as I said before (in relation to the MS doco, of course), which is you do not get an NS or SOA record for an RODC, only for the writables.

 

I didn't speak to A or PTR records but here's another "short version": Both exist for the host but not the domain/forest name, which directly contradicts the old TechNet post you've linked above.

 

Here's a quick screenshot showing the writable (testdc01.robertsonpayne.net) and the RODC (rodc01.robertsonpayne.net) where you can see - as per the MS article in DNS service location records - that the RODC only features within the site discovery scopes, not the wider domain/forest scopes.

[Screenshot: DNS records for testdc01.robertsonpayne.net and rodc01.robertsonpayne.net]

 

With respect to marrying up the subnet definition (192.168.2.0/24) from the first command and the ipconfig from the second (192.168.200.0/24), they aren't the same subnet associations. But then, I know you obfuscated the details so I'm not sure if that's an oversight or not. Just figured it was worth a quick mention.

 

If they don't match then that would result in the clients trying to talk across the site, but as I say, it may be just an issue in this post.
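
As a quick way to sanity-check an address against a site's subnet definitions, here is a minimal Python sketch using the standard ipaddress module. The function name site_subnet_for is made up for illustration, and the addresses are the obfuscated ones from this thread, so treat the result as an example rather than a finding.

```python
import ipaddress


def site_subnet_for(address, site_subnets):
    """Return the most specific subnet containing address, or None."""
    ip = ipaddress.ip_address(address)
    matches = [
        net for net in map(ipaddress.ip_network, site_subnets)
        if ip in net
    ]
    # Prefer the longest prefix, on the assumption that the most
    # specific subnet definition is the one that applies.
    return max(matches, key=lambda net: net.prefixlen, default=None)


print(site_subnet_for("192.168.200.10", ["192.168.2.0/24"]))    # None
print(site_subnet_for("192.168.200.10", ["192.168.200.0/24"]))  # 192.168.200.0/24
```

A result of None for a client or RODC address would mean no subnet definition covers it, which is the situation that leaves site discovery guessing.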

 

I'd check the following:

  1. Whether the site-specific service locator DNS records do indeed resolve against the RODC;
  2. The output of nltest /dsgetdc:yourdomain.local on the RODC and clients (clients may fail it from what you've said);
  3. The output of nltest /dnsgetdc:yourdomain.local /site:****Isolation /sitespec (where ****Isolation is your site name of the RODC from above);
  4. The System event log on the clients.

 

The /dnsgetdc variation could prove interesting from the client as you might get nothing back or possibly a writable domain controller via automatic site coverage (though you shouldn't). This test relates specifically to which domain controllers have registered service location records within the ****Isolation site in DNS.

 

If you do get more than just your RODC back in the dnsgetdc variation, then that's going to be one issue (or the issue.)
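
For reference when checking the records by hand, here is a minimal Python sketch that builds the two LDAP service location names being compared: the domain-wide one and the site-specific one. The function name dc_locator_srv_names and the site name Branch-Site are placeholders, not values from this environment.

```python
def dc_locator_srv_names(domain, site=None):
    """Build the LDAP SRV record names used to locate domain controllers.

    Returns the domain-wide name and, if a site is given, the
    site-specific name. Both are returned with a trailing period so
    that no search suffix gets appended when they are queried.
    """
    domain = domain.rstrip(".")
    names = {"domain_wide": f"_ldap._tcp.dc._msdcs.{domain}."}
    if site:
        names["site_specific"] = f"_ldap._tcp.{site}._sites.dc._msdcs.{domain}."
    return names


# "Branch-Site" is a hypothetical site name.
for scope, name in dc_locator_srv_names("yourdomain.local", "Branch-Site").items():
    print(scope, name)
```

Querying each name with nslookup -type=SRV should show the RODC only under the site-specific name, in line with the behaviour described above.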

 

Cheers,

Lain

Thank you again for your on-going assistance here. Yes, the difference in the subnet was simply an oversight in the obfuscation.

My DNS resource records essentially match the screenshot you have posted, and the nltests returned pretty much what you'd expect, correctly working on the RoDC with the correct site and domain information, likewise on my server in the same network, but the client is still unable to find the domain.

I think I have narrowed down the issue, but still not sure what I am overlooking, and it doesn't make sense.

If I run NSLOOKUP and search for the domain (local.domain.com), it lists the (writable) DCs, which we have determined is what is expected. However, if I attempt to ping the domain it cannot be found. It doesn't resolve the name.

If I ping the full DNS name of the DC (PDC.local.domain.com), it still fails to resolve the name.

Back on the RoDC and connected/working member server, I can ping both the domain name and the full DNS name of the DC. The DC and Member Server are both configured to use the DNS server on the RoDC.

So given this information, I opened up the network to allow the client workstation to connect to the DNS in the Corporate Site. The workstation still fails to resolve the name of the DC or Domain.

* The workstation can successfully ping the DC by IP Address.
* The workstation can successfully resolve other DNS zones from my DNS server (Domain.com, Domain.local, etc) and the forwarded requests to the internet come back successfully (i.e. ping www.google.com) but just cannot resolve local.domain.com.

There must be something I'm overlooking, but I do not know what that could be.

@BrentStobbs 

 

No worries.

 

First, are the clients able to reach both UDP 53 and TCP 53 of the RODC as well as the writable domain controller you just opened up access to?

 

If it's only UDP 53 and not TCP 53, you will run into issues with queries that have result sets larger than 512 bytes, as the Windows DNS client will cut over to TCP 53 and re-issue the query by default in such circumstances.
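
The cut-over to TCP is signalled by the "truncated" (TC) flag in the header of the UDP response. Here is a minimal Python sketch of that check, assuming you already have the raw response bytes from a packet capture; the function name is_truncated and the hand-built headers are purely illustrative.

```python
import struct

TC_FLAG = 0x0200  # "truncated" bit in the DNS header flags field


def is_truncated(response: bytes) -> bool:
    """Return True if a raw DNS response has the TC (truncated) bit set."""
    if len(response) < 12:
        raise ValueError("a DNS message header is 12 bytes long")
    _msg_id, flags = struct.unpack("!HH", response[:4])
    return bool(flags & TC_FLAG)


# Hand-built 12-byte headers: message ID, flags, then four zeroed counts.
truncated = struct.pack("!HHHHHH", 0x1234, 0x8380, 0, 0, 0, 0)
complete = struct.pack("!HHHHHH", 0x1234, 0x8180, 0, 0, 0, 0)

print(is_truncated(truncated))  # True
print(is_truncated(complete))   # False
```

If a capture on the client shows a response with this flag set and no TCP 53 conversation following it, that would point at the firewall rather than at DNS itself.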

 

I only mention that as an aside though, as for the explicit host query for "pdc.domain.local", that shouldn't/wouldn't have been big enough.

 

Given your member server is passing all the tests, I'm more suspicious of the workstation side of things, specifically what questions the workstations are asking.

 

Putting the quick check from above aside then, can you run the following for me from an impacted workstation within a PowerShell console? Administrator rights are not required.

 

Get-DnsClientGlobalSetting;

nslookup -type=ALL domain.local. <IP-of-pdc.domain.local>

nslookup pdc.domain.local. <IP-of-pdc.domain.local>

nslookup rodc.domain.local. <IP-of-pdc.domain.local>

 

I'm curious to know what resolves and what does not. I've also mentioned search suffixes before, and the first command (Get-DnsClientGlobalSetting) will show any that are configured. Here's an example of its output.

 

[Screenshot: example Get-DnsClientGlobalSetting output]

 

Here's example output for the nslookup -type=ALL command (which asks for all matching record types, making the bottom part of the results a bit busy, but useful) using my own test environment:

 

[Screenshot: example nslookup -type=ALL output]

 

You should see your "pdc.domain.local" in the results, all things being equal. PS: I don't need to see these results, it's just something for you to check given you said "pdc.domain.local" isn't resolving from the clients even when talking directly to "pdc.domain.local".

 

Be sure to include the trailing period after each hostname, as shown in the three nslookup commands above. This is important as it instructs the DNS client to not append any client search suffixes.

 

So, if we let the IPv4 address of your "pdc.domain.local" = 192.168.1.1, then an example based on the first nslookup command above would look like:

 

nslookup domain.local. 192.168.1.1

 

You can also use the -debug switch on nslookup to see what impact devolution is having (though this requires leaving the trailing period out). For example, slightly changing the first nslookup command above to the following will show whether the initial question is actually for domain.local or for something else based on any search suffixes.

 

nslookup -debug domain.local 192.168.1.1

 

Cheers,

Lain

best response confirmed by BrentStobbs (Brass Contributor)
Solution
Thank you again Lain for your quick and thorough response.

I have now resolved the issue and I am feeling quite stupid about it. The workstations (both of them) I was testing with were configured for DirectAccess. As the Network Location Server (NLS) is not available on this network, the workstations were trying to connect via DirectAccess and hence using the NRPT to resolve the domain name. DirectAccess was not able to reach the Domain Controllers either, and because it therefore failed to connect, it was causing the DNS issues.

I have removed the DirectAccess configuration for the workstation and things started working as expected.

I am going to mark this response as the best answer, but I do want you to know that it was in no small part due to your assistance. I would not have come to this conclusion without the hints you have provided and your guidance with troubleshooting. Sincerely, thank you very much, I was really struggling.

No worries! It's always a relief when you find a solution, and there aren't any silly ones.

You won't be the last person to be caught out by name resolution policies, and as of Server 2016, there are now also DNS Server policies to add into the mix of "things you can't see that are undermining you." Lots of fun to be had troubleshooting DNS in this day and age.
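
For anyone landing here later, this is roughly why a name resolution policy can break one namespace while everything else resolves. The following is a minimal Python sketch of the idea only, not how the Windows NRPT is actually implemented: the function name pick_dns_servers, the rule and the server names are all hypothetical, and the matching logic (leading "." means suffix rule, longest match wins) is a simplifying assumption.

```python
def pick_dns_servers(name, nrpt_rules, interface_dns):
    """Return the DNS servers a query for name would be sent to.

    nrpt_rules maps a namespace to a list of DNS servers. In this
    simplified model a namespace starting with "." is a suffix rule and
    anything else must match the whole name; the longest match wins.
    If no rule matches, the interface's DNS servers are used.
    """
    fqdn = name.rstrip(".").lower()
    best = None
    for namespace, servers in nrpt_rules.items():
        ns = namespace.lower()
        if ns.startswith("."):
            matched = fqdn == ns[1:] or fqdn.endswith(ns)
        else:
            matched = fqdn == ns
        if matched and (best is None or len(ns) > len(best[0])):
            best = (ns, servers)
    return best[1] if best else interface_dns


# Hypothetical values: a DirectAccess-style rule for the AD namespace,
# and the RODC handed out by DHCP as the interface DNS server.
rules = {".local.domain.com": ["unreachable-directaccess-dns"]}
rodc = ["192.168.200.10"]

print(pick_dns_servers("pdc.local.domain.com", rules, rodc))
print(pick_dns_servers("www.google.com", rules, rodc))
```

In this model only names under local.domain.com are diverted to the unreachable server, while everything else still goes to the RODC. As far as I know, nslookup sends its questions straight to the server and ignores the NRPT, which would also fit nslookup working while ping did not.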

Cheers,
Lain
