Forum Discussion
Windows 2016 Server stops DNS DHCP to Windows 10 PC's randomly
- Nov 14, 2020
I finally ended up solving the problem.
TL;DR - The Comcast Modem (IP: 192.168.1.1 - working perfectly as the gateway, full speeds, no packet loss) ALSO had a completely useless mysterious SECOND IP address at 192.168.1.254. The odd second IP (.254) also has an odd MAC address bound to it: 00:05:04:03:02:01.
I did so by giving up... and then preparing to wipe the system and start over from scratch. I appreciate the tips and recommendations; unfortunately none of them led me to the solution, or they were fixes I didn't feel were likely to solve the problem and would have been a huge investment of time that I would either eat as a loss or pass on to the client as an astronomical bill for their 4-person business.
In doing this careful prep I had to make sure I had a functional backup of the machine. As the client had a hard time identifying the software that they even used, let alone the supporting components, I knew I would be blamed for things being deleted - or spend the next 6 months hearing "but the files were there.." So the plan was to convert the server into a virtual machine. I could then rebuild the server from scratch while still being able to reference (or even use, until the rebuild was complete) the cloned, identical system running on the virtual machine. That way when I was told "These files are missing, they were right here!" on the old server, I could power it up and say "nope, never were there." Further, if I failed to export some critical component, I could go back and still export/copy the item. This was pretty much the only way I could nuke/reload a system with so many unknown aspects; and again, it would have allowed the process to happen with 0 downtime for the client. It's quite a bit of extra work, but it would have solved so many problems.
Once running on the virtual machine - I did a few tests to ensure the virtual machine was running properly. I didn't want to nuke the actual server until I was confident we were all set. This included testing the networking components. I had a persistent ping running from one of the client machines to the server. I happened to be shutting down, or rebooting the server - and out of the corner of my eye, more by accident than anything - I noticed the "server" continued to respond to every ping, even though it should have dropped a few on reboot/shutdown.
Confused - I proceeded to shut down the server entirely....... Pings still continued to respond on the server's IP address.
I said out-loud (at 4AM in the empty office) "What the ....." several times.
I disconnected the network physically from the VM. Pings still responded. Now... I've checked every freaking item in this office multiple times for their IPs. I KNEW nothing in the office was on that IP address..... or so I thought.
After disconnecting every system, every printer, every wireless access point, I was still getting a response. At this point it was the modem, switch and client PC. It wasn't the switch; I had its IP and could log into it. But just to be safe, I bypassed the switch. Modem to PC directly. Still responding. Obviously shutting down the modem stopped the responses (I mean how could it not at this point, everything else was disconnected.)
And sooooooo we have it. The Mother ..... Comcast modem has a ghost IP address. The modem / gateway was indeed 192.168.1.1, as expected. It ALSO had an IP of 192.168.1.254. This is a standard ARRIS modem found in residential locations all the time. I could identify no purpose for this mysterious second IP address. It wasn't serving Comcast's "free xfinity wifi" that they like to run on their modems. This functionality was not active on the modem, wifi was disabled, etc. The MAC address being something odd, 00:05:04:03:02:01, was also a clue something was amiss. Googling around, using the MAC address to form a search that would turn up something useful, led me to other people that had encountered the same mysterious secondary IP address on their modem. They had contacted Comcast - and of course were told "we don't know". Someone thought it had to do with some Cisco firmware and WPS.
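For anyone chasing something similar: a quicker way to catch this kind of ghost is to compare the MAC address a client has cached for an IP against the MAC you expect to own it. Below is a minimal Python sketch that parses text in the format Windows `arp -a` prints and flags mismatches. The sample output, the gateway MAC, the server MAC and the function names are all made up for illustration; only the 00:05:04:03:02:01 address comes from this thread, and I'm assuming the server had been sitting on 192.168.1.254.

```python
import re

# Matches one entry line of Windows `arp -a` output, e.g.
#   192.168.1.254         00-05-04-03-02-01     dynamic
ARP_LINE = re.compile(
    r"^\s*(\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"((?:[0-9A-Fa-f]{2}[-:]){5}[0-9A-Fa-f]{2})\s+(\w+)"
)


def normalize_mac(mac):
    """Lower-case a MAC and use ':' separators so comparisons are uniform."""
    return mac.lower().replace("-", ":")


def parse_arp_table(text):
    """Return {ip: mac} for every entry line found in `arp -a` style text."""
    table = {}
    for line in text.splitlines():
        match = ARP_LINE.match(line)
        if match:
            table[match.group(1)] = normalize_mac(match.group(2))
    return table


def find_mismatches(arp_text, expected):
    """Return {ip: observed_mac} where the cached MAC differs from `expected`.

    `expected` maps each IP to the MAC you believe owns it. IPs that are
    not present in the ARP text are skipped rather than reported.
    """
    observed = parse_arp_table(arp_text)
    mismatches = {}
    for ip, mac in expected.items():
        seen = observed.get(ip)
        if seen is not None and seen != normalize_mac(mac):
            mismatches[ip] = seen
    return mismatches


# Illustrative sample only: the .1 MAC and the expected server MAC are invented.
SAMPLE_ARP_OUTPUT = """
Interface: 192.168.1.50 --- 0x4
  Internet Address      Physical Address      Type
  192.168.1.1           aa-bb-cc-dd-ee-01     dynamic
  192.168.1.254         00-05-04-03-02-01     dynamic
"""

if __name__ == "__main__":
    expected = {"192.168.1.254": "AA-BB-CC-DD-EE-10"}  # hypothetical server NIC
    for ip, mac in find_mismatches(SAMPLE_ARP_OUTPUT, expected).items():
        print(f"{ip} is answering from {mac}, not the expected MAC")
```

In practice you would feed it the real output of `arp -a` captured on a client right after pinging the server, ideally with the server powered off, which is effectively what I stumbled into by accident.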
I couldn't do anything about the STUPID mysterious second IP address on the modem; but I could move the server to a free, different IP address. After that....... everything immediately worked, and continues to work perfectly.
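Picking the new address is worth doing deliberately rather than by guesswork. As a rough sketch (assuming a /24 subnet, which I haven't stated above, and a hypothetical list of addresses already seen on the network), Python's standard `ipaddress` module can list the host addresses that are still free:

```python
import ipaddress


def free_addresses(network, in_use):
    """Return usable host addresses in `network` that are not in `in_use`."""
    net = ipaddress.ip_network(network)
    used = {ipaddress.ip_address(addr) for addr in in_use}
    return [str(host) for host in net.hosts() if host not in used]


if __name__ == "__main__":
    # Hypothetical inventory: the gateway, the modem's ghost address, one client.
    seen = ["192.168.1.1", "192.168.1.254", "192.168.1.50"]
    candidates = free_addresses("192.168.1.0/24", seen)
    print(candidates[:5])
```

The key point is that the "in use" list has to come from what actually answers on the wire (ARP and ping results), not from what the device inventory says, since the ghost address never showed up in any inventory.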
Previous techs/companies couldn't figure it out, and I had spent literally months trying to solve this completely unexplainable issue that provided so many confusing, contradictory results in diagnostics. To this day, I still have no clue what this second mysterious IP address is about. In my over 25 years of working with tech, including years working FOR the cable company, in the modem department when they had just begun rolling out, I have never seen a modem with a 2nd mysterious private IP address on the same subnet, both belonging to the modem. But alas, as we all know, having something else share/fight over the same IP address as the server is never going to go well.
Thanks again to those that offered suggestions - I can say that I am glad I didn't spend a significant amount of time tracking down / resolving every server error, as those didn't logically explain some of the working/not working issues. With this mysterious IP we can at least theorize that going through a switch added a touch of latency, so that sometimes a computer got the faster response from the server, allowing the proper route to be created, and sometimes it didn't; or other things of that nature in networking.
I wanted to post this in detail to hopefully help someone else one day with similar issues. I took the time to include the bit about turning it into a virtual machine, and ensuring all was working properly before nuking the server and reinstalling from scratch - as it may provide an idea or route for others needing to wipe/reload a server they are working on for any reason; to provide more up-time, less disruption and more confidence that all the data exists and is in place. Nothing is worse than thinking you have something backed up only to later find out you didn't, or it was corrupt, or you didn't export a file you needed before wiping the machine.
It was also an epic IT tale; at least for me. I've never had an issue I *couldn't* solve to this point. Sure, there's been many where it would be too costly (in time, or hardware) and it was decided to not solve said problem -- but not having even a clue as to what's going on? That was bothersome.
Cheers and thanks again to everyone for their shots at the issue. I truly appreciate your volunteerism.
Dave Patrick
Also - there is only 1 DC/server here. I don't know what happened before they called me in to replace 3 computers in January - but perhaps (inferring from logs) there was another server that no longer exists.
Thanks again for any input/suggestions you may have.
I didn't see remnants of failed or removed domain controllers but you can easily check for and remove them following along here.
https://docs.microsoft.com/en-us/windows-server/identity/ad-ds/deploy/ad-ds-metadata-cleanup