Forum Discussion
26063 deduplication data corruption is still there.
Michael
A little update on this: for the first time in many years I had a blue screen on my actual server, which ran Server 2019 until the middle of last year. The first one ever. BlueScreenView from NirSoft lists "ntfs.sys" as the culprit. My unprofessional suspicion is concurrent writes on the deduped volume, which resides on a tiered mirrored storage space (not S2D). I also unprofessionally suspect that it might be related to the issue with the dedup repro which corrupts data. But only you can tell, not me.
Computer:
i7-4960X, 32 GB RAM (ASRock X79 Fatal1ty). Power supply: a Fujitsu 750 W unit taken from a Primergy server.
C: = normal SSD,
D: = Deduped drive. Tiered Storage Space Mirror, 2*SSD, 2*HDD.
E: = Storage Spaces Parity with four drives.
BitLocker active on D: and E:
Using NetQoS, the incoming speed is limited to 60 MB/s, since the tiered storage space cannot write faster. The blue screen happened during three parallel robocopy jobs over the network, two of them with D: as the destination. According to the event logs, neither a dedup job nor a re-tiering job was running when the blue screen happened. After the crash and the filesystem check I continued copying sequentially, not concurrently, and everything seemed fine.
This is the first ntfs.sys crash I have seen since about, maybe, XP SP0?
Here is the dump + the events around that crash.
https://joumxyzptlk.de/tmp/microsoft/NTFS-maybe-dedup-problem-first-server-2022-crash-ever-on-that-computer.7z
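One way to back up an "everything seemed fine" impression after a copy is to compare checksums of source and destination. The following is only a minimal sketch and not something that was done in this thread: the function names `sha256_of` and `find_mismatches` are made up for illustration, and it assumes both trees are reachable as ordinary paths from the machine running it.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(source_root, dest_root):
    """Return relative paths that are missing or differ in the destination."""
    source_root, dest_root = Path(source_root), Path(dest_root)
    mismatches = []
    for src in sorted(source_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_root)
        dst = dest_root / rel
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            mismatches.append(str(rel))
    return mismatches
```

Calling `find_mismatches(source, destination)` returns an empty list when every source file has an identical copy. Note that a read shortly after a write may be served from cache, so a check after a reboot or after the next dedup optimization run is probably more telling.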
- MSSWRahman, Mar 28, 2024
Microsoft
Hi Joachim,
Thank you for sharing the dump. It shows some allocation was corrupted, but since this is a mini-dump it is difficult to debug further.
Do you think you can reproduce the bugcheck issue? If yes, could you please configure a complete dump by setting the reg key below, reboot the machine, reproduce the bugcheck, and then share the dump?
reg add "hklm\SYSTEM\CurrentControlSet\Control\CrashControl" /v CrashDumpEnabled /t reg_dword /d 1 /f
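To confirm after the reboot that the setting took effect, the value can be read back. Below is a small sketch, assuming Python is available on the server; the helper names are hypothetical, and the value table reflects the documented meanings of `CrashDumpEnabled` (1 being the complete dump requested above, 3 the mini-dump that was shared first).

```python
import sys

# Documented meanings of CrashControl\CrashDumpEnabled.
DUMP_TYPES = {
    0: "None",
    1: "Complete memory dump",
    2: "Kernel memory dump",
    3: "Small memory dump (minidump)",
    7: "Automatic memory dump",
}

KEY_PATH = r"SYSTEM\CurrentControlSet\Control\CrashControl"


def describe_dump_setting(value):
    """Translate a CrashDumpEnabled value into a readable name."""
    return DUMP_TYPES.get(value, f"Unknown value {value}")


def read_crash_dump_enabled():
    """Return the current CrashDumpEnabled value, or None when not on Windows."""
    if sys.platform != "win32":
        return None
    import winreg  # only exists on Windows

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
        value, _value_type = winreg.QueryValueEx(key, "CrashDumpEnabled")
        return value


if __name__ == "__main__":
    current = read_crash_dump_enabled()
    if current is None:
        print("Not running on Windows; nothing to read.")
    else:
        print(f"CrashDumpEnabled = {current}: {describe_dump_setting(current)}")
```

A complete dump also needs a page file large enough to hold the contents of RAM, so with 32 GB installed it is worth checking the page file size before trying to reproduce the crash.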
Did you notice event ID 55 in the system event log in the past, when you originally hit the issue? If you can share the following event log files from the repro machine, that would be great:
%windir%\System32\winevt\Logs\system.evtx
%windir%\System32\winevt\Logs\Microsoft-Windows-Ntfs%4Operational.evtx
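For a quick look at whether event ID 55 is present at all, the logs can be queried with `wevtutil`, either live or from a saved .evtx file. The sketch below only builds and runs such a query; the function name `build_event_query` and its defaults are assumptions for illustration, not part of any Microsoft tooling.

```python
import subprocess
import sys


def build_event_query(event_id, log="System", max_events=20, is_file=False):
    """Build a wevtutil command line that lists the newest matching events."""
    xpath = f"*[System[(EventID={event_id})]]"
    command = [
        "wevtutil", "qe", log,
        f"/q:{xpath}",       # XPath filter on the event ID
        "/f:text",           # human-readable output
        "/rd:true",          # newest events first
        f"/c:{max_events}",  # stop after this many events
    ]
    if is_file:
        command.append("/lf:true")  # treat `log` as a path to an .evtx file
    return command


if __name__ == "__main__":
    if sys.platform == "win32":
        result = subprocess.run(build_event_query(55), capture_output=True, text=True)
        print(result.stdout or "No matching events found.")
    else:
        print(" ".join(build_event_query(55)))
```

Passing a saved file, for example `build_event_query(55, log="system.evtx", is_file=True)`, runs the same filter against an exported log instead of the live one.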
In this case the deduped drive (D:) is a Tiered Storage Space Mirror, 2*SSD, 2*HDD. Is this something you configured recently? Asking because in the original report of the issue, this was not mentioned.
We have tried to reproduce the issue internally but have not seen any corruption, although in our case the dedup drive was not a tiered Storage Spaces mirror volume.
In your video, the chkdsk output shows cleanup of security descriptor and index files, but that does not indicate corruption. If by chance you have saved the chkdsk output from before, please share that as well.
Thank you.
- Joachim_Otahal, Mar 28, 2024
Iron Contributor
PS: If you want a live demonstration with the repro, it can be done as a Teams session with my work account (email address removed for privacy reasons). But be aware that the German Easter holiday starts right now, so my next response there will be on Tuesday, April 2, since the work laptop is at work and not at home.
- Joachim_Otahal, Mar 28, 2024
Iron Contributor
The original report of that issue was completely encapsulated as an easy-to-repro VM, tested on my Ryzen 5950X (SATA SSD), the dual Xeon 6226R (professional RAID controller with 2 SSDs in RAID 1 for that volume), and, with lower probability, on that i7-4960X (the tiered storage space mentioned above). So the underlying storage did not matter for the repro.
I have now activated the full dump (local time 21:59) and will try to reproduce the crash over the next few days. But it may take time to appear. Wish me luck!
As for NTFS event 55: never. I am paranoid about noticing such things, especially since a customer had Exchange database corruption for several years just because that bit was not noticed. This is checked every time the box is booted. Here is that one rare example of an NTFS warning after that crash. I guess you can read the XML filter variant; the rest of the OS is German :D.