[Server 22538] Deduplication gets stuck, refuses to cancel, while fsdmhost.exe spins indefinitely


Feedback Hub link

 

On a brand-new disk (WDC WD60EFZX), I created a storage space, formatted it as ReFS, enabled BitLocker, copied 3 TB of data, and enabled deduplication. The optimization job starts and makes progress, but eventually the threads of the process involved (fsdmhost.exe) appear to start spinning without accomplishing anything. The machine's fans ramp up as CPU is consumed, but memory, I/O, and other standard metrics don't move.

 

I’ve left it running for ~24 hours, and during that time taskmgr/resmon/procmon show that once the I/O stops, it never resumes.

 

After rebooting the server and running 'Start-DedupJob -Type Optimization', dedup appears to start, but it eventually gets stuck again.

 

Having repeated this many times now, it always seems to get stuck at the same file or the same part of the disk, per the stats:

 

PS C:\Users\TReKiE> get-dedupstatus

FreeSpace    SavedSpace   OptimizedFiles     InPolicyFiles      Volume
---------    ----------   --------------     -------------      ------
2.98 TB      689.6 GB     17175              17178              E:
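
For anyone comparing against their own volume, the gap between the two file counts can be read directly from the status object. This is a minimal sketch, assuming the properties behind the OptimizedFiles and InPolicyFiles columns are named OptimizedFilesCount and InPolicyFilesCount (the names are an assumption here, so check with Get-Member on your build):

```powershell
# Sketch: count the in-policy files that have not been optimized yet on E:.
# Assumption: the status object exposes InPolicyFilesCount and OptimizedFilesCount.
$status = Get-DedupStatus -Volume 'E:'

# Files that are in policy but still waiting to be optimized.
$remaining = $status.InPolicyFilesCount - $status.OptimizedFilesCount

'{0} in-policy file(s) not yet optimized' -f $remaining
```

With the figures above, that is 17178 - 17175 = 3 files left, which fits the job stalling at the same point each time.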

 

 

Additionally, the spinning continues after Stop-DedupJob. After the stop attempt, Get-DedupJob requests stop returning and have to be interrupted (Ctrl-C) just to get back to the PowerShell prompt.

 

 

PS C:\Users\TReKiE> Stop-DedupJob -Volume E:
PS C:\Users\TReKiE> Get-DedupJob
 
Type               ScheduleType       StartTime              Progress   State                  Volume
----               ------------       ---------              --------   -----                  ------
Optimization       Manual             1:13 AM                0 %        PendingCancel          E:
PS C:\Users\TReKiE> Get-DedupJob
[stops here, never returns]
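
One possible way to keep the console usable while the query hangs is to run it in a background job with a timeout. This is only a sketch of that idea, not a fix for the underlying problem; the 30-second limit is an arbitrary illustrative value, and whether the background process exits cleanly while the dedup service is wedged is not confirmed:

```powershell
# Sketch: query dedup jobs without letting a hang block the interactive session.
$job = Start-Job -ScriptBlock { Get-DedupJob -Volume 'E:' }

# Wait-Job returns nothing if the timeout (in seconds) elapses first.
if (Wait-Job -Job $job -Timeout 30) {
    # The query finished in time; print whatever it returned.
    Receive-Job -Job $job
} else {
    Write-Warning 'Get-DedupJob did not return within 30 seconds.'
}

# -Force also removes a job that is still running.
Remove-Job -Job $job -Force
```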

 

 

I attached minidumps of fsdmhost.exe and server health ETLs to the bug on Feedback Hub. I was unable to attach the full dumps, as they seem to be too big (but they are available on request!). All the event logs (Application, System, Deduplication Operational and Diagnostic) say everything is fine.

 

This identical hardware has handled deduplication of a 14 TB ReFS Storage Space on previous versions of Windows Server (2019 and 2022) without issue.

 

Of note, I originally wrote this for build 22526, but it still happens on 22538.

2 Replies

I'm having the same issue with a 120 TB RAID; it's just as you described.

Thanks Thomas, I'm glad I'm not alone on this one. As a small addition to what I wrote above: over this past weekend I started all over, copied the same data, and kept all the same settings, but used NTFS instead of ReFS. That seems to have solved the issue. Dedup had no problem this time and is working normally.