Semaphore timeout when running a backup where source and destination are on the same Storage Space


Dear Server / Storage Team,

 

I am running a lab with the Windows Server 2022 LTSC preview to get familiar with the new or changed features, but also to help you hunt down issues.

 

For a very long time now I have faced "semaphore timeouts" reported by the Macrium backup solution. Ned Pyle and Vlad helped me narrow this down: it is not an SMB issue but in fact a problem with the storage subsystem.

 

The storage here is a Storage Spaces pool consisting of two mirrored WD Red Plus 6 TB drives, formatted with ReFS 3.6.

In previous releases of Windows Server, the same issue also occurred when clients tried to back up to an SMB share. This improved over the releases.

 

Now only one issue remains. Here is the Feedback Hub item:

https://aka.ms/AAbm84d

 

Scenario:

A Hyper-V host is hosting various VMs.

One VM data disk (VHDX) is located on a Storage Spaces pool using Dedup (Dedup runs outside backup times).

When the backup application runs, it writes the data from this VHDX at block level (tested with and without CBT) to a file share (drive letter A).

The timeout occurs reproducibly and so prevents the backup.

If I do the same but do not use the SMB share located on a volume in the Storage Spaces pool, the backup is fine.

 

 

Source: VM > Data disk (driveletter N) located on Storage Pool 1 volume (driveletter H)

Target: SMB Share (driveletter A) located on Storage Pool 1

Error: semaphore timeout

 

More details from our investigation are in the Feedback Hub item.

Unaffected builds:

Windows Server 2019 LTSC 1809 17763.x

Windows Server vNext LTSC b20215

 

Affected builds:

All other Windows Server vNext / 2022 LTSC builds

 

The issue happens when the amount of backup data is large (20-40 GB works fine).

 

4 Replies
Hi there,
Thank you for reaching out and for being so willing to help resolve this concern. I’m writing on behalf of the Storage Spaces team. They had a look at the information you submitted but didn’t quite have all the logs they were looking for. If you could attach the storage diagnostic cabs to this feedback item, that would help them greatly. Additionally, we would like some stordiag traces from before and after you encounter the issue. These can be generated by running the following command: “stordiag.exe -collectEtW -collectPerf -out C:\Users\wolfpack\Logs\ETWPerf”. You can then upload the containing folder (C:\Users\wolfpack\Logs\ETWPerf, or wherever you decide to output to) to this feedback item for further investigation.
If you have any issues or concerns, please reach out and I’ll do my best to assist where possible.
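
For repeated "before" and "after" captures, the stordiag invocation quoted above could be wrapped in a small script. The following is only a minimal Python sketch, not an official tool: the flags are copied from the command in this reply, while the helper names, the folder naming scheme, and the base directory are assumptions made for illustration.

```python
import subprocess
from datetime import datetime
from pathlib import Path


def build_stordiag_command(out_dir):
    """Return the argument list for the stordiag call quoted in the reply."""
    # Flags are taken verbatim from the thread; nothing else is added.
    return ["stordiag.exe", "-collectEtW", "-collectPerf", "-out", str(out_dir)]


def capture_dir(base_dir, label, now=None):
    """Hypothetical naming scheme: <base_dir>/<label>-<timestamp>."""
    now = now or datetime.now()
    return Path(base_dir) / f"{label}-{now:%Y%m%d-%H%M%S}"


def collect(base_dir, label):
    """Run stordiag into a fresh folder (Windows only, elevated prompt assumed)."""
    out_dir = capture_dir(base_dir, label)
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_stordiag_command(out_dir), check=True)
    return out_dir
```

Calling `collect(r"C:\Users\wolfpack\Logs", "before")` ahead of the backup and `collect(r"C:\Users\wolfpack\Logs", "after")` once the timeout has occurred would leave two separate folders to upload.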

@RahulTK since this is quite a complex topic, would it be possible to have a Teams meeting?

@RahulTK in Windows Server 2022 LTSC build 20313 the issue disappeared, but I would not mark it as solved. There have been a lot of changes, and unfortunately I didn't test enough after each one, sorry.

 

1. Upgraded the VM (backup source) from 20H2 to 21H1.

2. Upgraded the Hyper-V host (which also holds the SMB share) to 20313.

3. Upgraded from Macrium Reflect 7.3.x to 8.0 beta, which does not have the same feature set (CBT and backup file encryption are not in place).

4. Disabled the scheduled jobs for automatic background optimization for Dedup, so only the base scheduled jobs run. The reason was a problem I saw in b20292:
https://techcommunity.microsoft.com/t5/windows-server-insiders/b20292-cpu-spikes-since-b20292-due-de...

Next steps:
I can provide you with the logs via OneDrive.

When MR 8.0 final is out, I will try enabling CBT and later encryption to see if one of these is causing the issue.
Once this scenario has passed, I will enable the background optimization again.

I have enabled background optimization again, and this did not negatively affect anything.
I have reinstalled the server from scratch because of this issue:
https://techcommunity.microsoft.com/t5/windows-server-insiders/issue-b20251-b20257-b20262-b20270-b20...

I will report back when MR 8 is released and I can use CBT and encryption again. In this case, though, it seems to be more of an application issue: the application appears to overload the Storage Space or cause write latencies that lead to the semaphore timeout.
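
To check the write-latency suspicion independently of the backup application, a simple probe could write a large file to the target share and record how long each flushed chunk takes. The following is a minimal Python sketch under stated assumptions: it does not reproduce Macrium Reflect's block-level I/O pattern, the function names are hypothetical, and the target path and sizes are placeholders.

```python
import os
import time


def measure_write_latency(path, total_bytes, chunk_bytes=1024 * 1024):
    """Write total_bytes to path in chunks; return per-chunk latencies in seconds."""
    chunk = os.urandom(chunk_bytes)  # random data, so compression cannot help
    latencies = []
    written = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            size = min(chunk_bytes, total_bytes - written)
            start = time.perf_counter()
            f.write(chunk[:size])
            f.flush()
            os.fsync(f.fileno())  # force the chunk out to the storage stack
            latencies.append(time.perf_counter() - start)
            written += size
    return latencies


def summarize(latencies):
    """Return count, median and worst-case latency of the recorded chunks."""
    ordered = sorted(latencies)
    return {
        "count": len(ordered),
        "median": ordered[len(ordered) // 2],
        "max": ordered[-1],
    }
```

Running it against a file on the share (drive letter A) with a total size well above the 20-40 GB that works fine, while the source VHDX is being read, would show whether individual writes stall for a long time before the timeout appears.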

Here are the logs from the session with Ned Pyle, @RahulTK:
https://1drv.ms/u/s!ApTx3d3fhinPhLQQzXJ847miz7kMQw?e=ry73Ei