SOLVED

Weird Printing Issue: Windows Shared Printers accessible/visible via Hostname but not via IP Address

Iron Contributor

Do you like challenges? Because boy do I have a challenge!

 

For the past 8-10 months we've been facing weird printer issues, which culminated this month in a massive number of errors that got most of the company involved. That allowed us to identify the core issue: on some machines (~7-8%), sometimes after a reboot, something happens with the Print Spooler so that printers are not advertised/available via IP address, only via hostname.

 

Specifically the issue presents itself in 3 ways:

 

  • When sending a print job from a Linux server, the server gets an error and the Windows Event Log has the following error: "Automatic The Line Printer Daemon (LPD) service refused a print job from %LINUXSERVERIP% for printer \\%WindowsIP%\%PrinterName% because the specified printer does not exist on this computer."
  • When attempting to map a printer by browsing to \\%WindowsIP%\ in Windows Explorer, the printer is visible, but trying to add it results in the error "Operation could not be completed (error 0x00000709)." This is generally associated with KB5006670, but that's not installed on our machines, and the first instances of the aforementioned error are from December 2021/January 2022, so way before that patch was even released.
  • When running the PowerShell command Get-Printer -ComputerName %WindowsIP%: if the command is run with the hostname of the machine, the result is correct (a list of shared printers); if it's run with the IP address of the machine, it throws the following error:

 

 

+ CategoryInfo : NotSpecified: (MSFT_Printer:ROOT/StandardCimv2/MSFT_Printer) [Get-Printer], CimException
+ FullyQualifiedErrorId : HRESULT 0x8007007b,Get-Printer
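
For reference, the two error values quoted above can be decoded mechanically. The following is a minimal Python sketch (not part of the original troubleshooting) that splits a value into the HRESULT severity, facility, and code fields and looks the code up in a small table. The function names and the table are illustrative assumptions; the two constant names are the standard Win32 definitions for codes 123 and 1801.

```python
# Win32 error codes relevant to this thread (names as defined in winerror.h).
WIN32_ERRORS = {
    123: "ERROR_INVALID_NAME",
    1801: "ERROR_INVALID_PRINTER_NAME",
}

FACILITY_WIN32 = 7  # HRESULT facility used to wrap plain Win32 error codes


def decode_hresult(value):
    """Split a 32-bit HRESULT into (failed, facility, code)."""
    failed = bool((value >> 31) & 1)   # bit 31: severity (1 = failure)
    facility = (value >> 16) & 0x7FF   # bits 16-26: facility
    code = value & 0xFFFF              # bits 0-15: error code
    return failed, facility, code


def describe(value):
    """Return a symbolic name for the value if it maps to a known Win32 error."""
    _failed, facility, code = decode_hresult(value)
    # Facility 7 wraps a Win32 error; a bare value such as 0x709 (facility 0)
    # is treated here as a Win32 error code directly.
    if facility in (0, FACILITY_WIN32):
        return WIN32_ERRORS.get(code, f"Win32 error {code}")
    return f"facility {facility}, code {code}"


if __name__ == "__main__":
    for value in (0x8007007B, 0x00000709):
        print(hex(value), decode_hresult(value), describe(value))
```

Both values resolve to name-validation errors (0x8007007b wraps Win32 error 123, and 0x709 is 1801). That would at least be consistent with the spooler not recognizing the IP address as one of its own names, though this reading is an inference rather than something the codes prove.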

 

 

And the most annoying thing is: if you restart the Spooler Service the problem completely disappears until next reboot...

Research on Google hasn't resulted in much success, except for one lone unanswered message: https://hardforum.com/threads/weird-network-printing-problem.1635293/

There is an XKCD for everything, isn't there? https://xkcd.com/979/

Additional analysis has been performed with Procmon, Wireshark, Process Explorer, WinDbg and xbootmgr with the following results:

 

Procmon
- Analysis of spoolsv.exe during execution of Get-Printer %WindowsIP% from another computer shows no other actions other than the network communication
- Analysis of spoolsv.exe during addition of shared printer through Windows Explorer shows the network connection and some RegQueryKey for HKU\%SIDOFREMOTEACCOUNT% and HKU\.DEFAULT\Printers\Connections\,,%WINDOWSIP%,%PRINTERNAME% with result of "NAME NOT FOUND" but nothing else
- Attempt at analysis of spoolsv.exe during boot through the Enable Boot Logging was successful but useless due to the problem not appearing when booting with that option enabled
- Additional analysis has been attempted through the Stack Summary Function to trace the stack down from spoolsv.exe but the only noticeable difference in the thread that was common between the working and non-working procmon dump was the presence of an additional branch called EatAuthInfoFromPacket on the dump of the working service.

Wireshark
- Superficial analysis of the traffic flow while executing Get-Printer from a remote machine shows winspool_AsyncEnumPrinters request and winspool_AsyncEnumPrinters response with protocol IREMOTEWINSPOOL, but no additional information and the stub data appears encrypted so I'm unable to garner additional information from it

Process Explorer
- Superficial analysis has been done on the spoolsv.exe process and its Threads and Stack. The only interesting point was that the Strings of the spoolsv.exe process contained \\machinehostname and \\machinehostname.domain.com when it was broken and nothing when it wasn't. But I have to admit my knowledge of Windows Internals is insufficient to fully make heads or tails of it. OpenAI has been helping with the explanations though!

WinDbg
- The debugger has been attached to the spoolsv.exe process and testing done with both Get-Printer from a remote machine and attempting to map the printer through Windows Explorer, but in both cases no debug messages were visible during execution. Additionally, I've created a process dump from Process Explorer and fed it to WinDbg to run the !analyze command, but it only returned a breakpoint, no actual error. Same as before, I'm new to this tool, so if you have any suggestions I'll be happy to take them!

xbootmgr
- xbootmgr -trace boot -traceflags dispatcher+latency -stackwalk readythread+threadcreate+profile+cswitch has been run to debug the service during boot but, same as Procmon, when the machine reboots with this tracing on the problem doesn't present itself so the output is fairly useless

 

And thus this is the summary. I'm a bit lost, and neither Google nor OpenAI appear to have any idea of what's happening here, so I'd appreciate any insights you might provide on additional troubleshooting for this issue or perhaps a resolution if you've faced it before.

1 Reply
best response confirmed by raindropsdev (Iron Contributor)
Solution

If someone else has this issue, here is the answer from Microsoft:

 

The Cause (back story) is multi-tiered. This is the basic rundown.

Service Startup
There have been many changes in Windows to improve service startup times at boot. This allows some services to start earlier than they did in the past. A service may fail to start if it has a network dependency and times out before DAD (Duplicate Address Detection) is complete and the interface and IP are ready for use.

Hardware
Improvements in computer hardware are a major factor. All modern processors have multiple cores/threads, allowing parallel processing of operating system tasks. The speed of processors has also increased dramatically. These changes allow an operating system like Windows to perform multiple tasks faster and in parallel, and thus dramatically improve service start time. The amount of available RAM has increased quickly. This means less paging to disk, also improving startup time. The most significant improvement has been to storage. Storage is now primarily flash-based (SSD), and commonly uses a high-speed NVMe interconnect. Even storage backends, like a NAS or SAN, are all-flash based these days. I/O, latency, and throughput are on the order of 150+ times better with NVMe SSDs than with old spinning disks (HDD). This change happened in a time span of less than 10 years.

The Result
Prior to the improvements it took double-digit seconds for services to start on boot. Back when DAD was first added to Windows it could take minutes before all the services were ready. Compensating for DAD was not necessary, so most code simply ignored the IP address state. The combined change of service startup behaviour and recent hardware improvements has allowed service startup to take single-digit seconds, well before an IP address is ready to use based on Windows behavior and RFC requirements. Simply changing the DAD transmit default for IPv4 to 1 is not a long-term solution. As hardware and services continue to improve, it is feasible that even a single second's delay will be enough to cause a service failure at boot. Services experiencing the issue must be changed to monitor the IP address state prior to attempting a network connection or binding to an IP.

Known Issues
This is a list of common issues that CSS may face related to DAD and service startup.

Service Using a Domain Account Fails to Start
Services using a domain account have a special dependency on the network being ready and accessible to perform authentication with the domain controller. The service will not start without being authenticated. When the service starts and times out faster than it takes for the network to be ready, which is typically related to waiting on DAD to complete, then the service will not start. This is commonly seen with SQL Server, but it can happen to any service using a domain account for logon.


This issue can be worked around by reducing the number of DAD transmits, disabling DAD, or setting the Recovery option on the service to restart. See the workaround:

Service Cannot Bind to an IP Address on Start
This issue happens when the service tries to bind to an IP address but times out or errors out before the network is ready. Again, this typically happens because of the DAD wait. The network stack cannot bind a service to an IP address that is not in a Preferred state. This issue is seen often with the spooler (Print Server) service. Issues like this can be worked around by disabling DAD or setting the service startup to "Automatic (Delayed Start)". Other workarounds may not work when the service doesn't fail or stop but simply continues without binding to an IP address.

IPv6 Addresses Disappear from DNS Server on Reboot
The DHCP client may request DNS registration before IPv6 DAD is complete. When this happens the IPv6 address is deleted/disappears from the DNS server during Dynamic DNS updating by the DNS client.
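
The point that services must monitor the IP address state before binding, and the "Service Cannot Bind to an IP Address on Start" case, can be illustrated with a small polling loop. This is a minimal Python sketch, not how the spooler or any Windows service actually implements it: `wait_for_preferred` and its `get_state` callable are hypothetical names, and the state strings `Tentative` and `Preferred` are assumed to match what the Windows `Get-NetIPAddress` cmdlet reports in its `AddressState` property.

```python
import time

# The reply states an address must be in the Preferred state before a service can bind to it.
READY_STATE = "Preferred"


def wait_for_preferred(get_state, timeout=30.0, interval=0.5,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll get_state() until it returns READY_STATE.

    get_state is a hypothetical callable returning the current address state
    as a string. Returns True once the address is ready, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if get_state() == READY_STATE:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Simulated address that stays Tentative for two polls while DAD runs.
    states = iter(["Tentative", "Tentative", "Preferred"])
    if wait_for_preferred(lambda: next(states), timeout=5.0, interval=0.01):
        print("address ready, safe to bind")
    else:
        print("address never became ready")
```

On a real machine `get_state` would have to query the operating system; the simulated iterator only stands in for an address that is still undergoing DAD. The design choice is to wait for the state rather than delay by a fixed time, since the reply argues that fixed delays stop being sufficient as boot times shrink.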

I hope this helps. Please note that at the moment no final solution has been found.
