ReFS filesystem not responding periodically


(Also posted on Q&A)

  • Storage Spaces, Simple (no parity), thin provisioning
  • Virtual disk larger than 100 TB, tens of TB used
  • ReFS 3.7

The filesystem stops responding for about 1 minute roughly every 25 minutes, according to the Microsoft-Windows-ReFS/Operational event log (Event ID 147). A thread in the System process shows ~100% usage of one core, so I took a stack trace:

ntoskrnl.exe!KiDeliverApc+0x1b6
ntoskrnl.exe!KiApcInterrupt+0x328
ReFS.SYS!CmsRotatingSkipList<_RANGE,SmsAllocationRegionEx,OrderByStartOfRange,RegionLockPolicies>::InlineRebalance+0x98
ReFS.SYS!CmsRotatingSkipList<_RANGE,SmsAllocationRegionEx,OrderByStartOfRange,RegionLockPolicies>::Add+0x1d0
ReFS.SYS!CmsAllocator::SplitRangeOnlyRegion+0x386
ReFS.SYS!CmsAllocator::PinBitmapRegion+0x3176c
ReFS.SYS!CmsAllocator::TryProtectRangeIfDurablyFree+0x9b
ReFS.SYS!CmsThinProvisioning::UnmapWorkItemMethod+0x2e4
ReFS.SYS!<lambda_9a1cd484752ca8e9f6e914453bf80744>::<lambda_invoker_cdecl>+0x15
ReFS.SYS!MspWorkerRoutine+0x46
ntoskrnl.exe!ExpWorkerThread+0x14f
ntoskrnl.exe!PspSystemThreadStartup+0x55
ntoskrnl.exe!KiStartSystemThread+0x34
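
One way to double-check the ~25-minute cadence is to compare the timestamps of consecutive Event ID 147 entries. The following is only a minimal Python sketch: it assumes the event times have already been exported from the log as ISO 8601 strings, and both the helper name stall_intervals_minutes and the sample timestamps are hypothetical, made up for illustration rather than taken from the affected system.

```python
from datetime import datetime
from statistics import median


def stall_intervals_minutes(timestamps):
    """Return the gaps, in minutes, between consecutive event timestamps."""
    # Sort first so the input does not have to be in chronological order.
    times = sorted(datetime.fromisoformat(t) for t in timestamps)
    return [(later - earlier).total_seconds() / 60
            for earlier, later in zip(times, times[1:])]


# Hypothetical Event ID 147 timestamps (illustrative values only).
sample = [
    "2022-01-20T10:00:00",
    "2022-01-20T10:25:10",
    "2022-01-20T10:50:05",
    "2022-01-20T11:15:20",
]

gaps = stall_intervals_minutes(sample)
print("gaps (min):", [round(g, 1) for g in gaps])
print("median gap (min):", round(median(gaps), 1))
```

If the gaps cluster tightly around one value, that points to a timer-driven background task rather than a workload-driven one.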

6 Replies
If the CPUs are at 100% (I don't know how many you have in the system), is there also a lot of network traffic on the system? Is there a task that dumps a lot of data onto the system every half hour, for example? I had issues like this in the past with another (non-Storage Spaces) system while dumping a backup database, which was written at such speed that the disk system just couldn't handle it.
Did you discover what the problem is?
This issue is about the ReFS filesystem itself so I think only Microsoft employees can address it.
OK, I also see your post here: https://docs.microsoft.com/en-us/answers/questions/704549/refs-filesystem-not-responding-periodicall... . But wouldn't creating a ticket with official Microsoft Support be a better idea?
I don't know how to create a support ticket. Via the 'Get Help' app?
I think you can try https://support.serviceshub.microsoft.com/supportforbusiness , but if you have a Microsoft Partner (for licensing, etc.), you could also ask them.