mpi-stage: High-Performance File Distribution for HPC Clusters
When running containerized workloads on HPC clusters, one of the first problems you hit is getting container images onto the nodes quickly and repeatably. A .sqsh is a Squashfs image (commonly used by container runtimes on HPC). In some environments you can run a Squashfs image directly from shared storage, but at scale that often turns the shared filesystem into a hot spot. Copying the image to local NVMe keeps startup time predictable and avoids hundreds of nodes hammering the same source during job launch. In this post, I'll introduce mpi-stage, a lightweight tool that uses MPI broadcasts to distribute large files across cluster nodes at speeds that can saturate the backend network.

The Problem: Staging Files at Scale

On an Azure CycleCloud Workspace for Slurm cluster with GB300 GPU nodes, I needed to stage a large Squashfs container image from shared storage onto each node's local NVMe storage before launching training jobs. At small scale you can often get away with ad-hoc copies, but once hundreds of nodes are all trying to read the same source file, the shared source filesystem quickly becomes the bottleneck. I tried several approaches.

Attempt 1: Slurm's sbcast

Slurm's built-in sbcast seemed like the natural choice. In my quick testing it was slower than I wanted, and the overwrite/skip-existing behavior didn't match the "fast no-op if already present" workflow I was after. I didn't spend much time exploring all the configuration options before moving on.

Attempt 2: Shell Script Fan-Out

I wrote a shell script using a tree-based fan-out approach: copy to N nodes, then each of those copies to N more, and so on. This worked and scaled reasonably, but had some drawbacks:

- Multiple stages: The script required orchestrating multiple rounds of copy commands, adding complexity.
- Source filesystem stress: Even with fan-out, the initial copies still hit the source filesystem simultaneously — a fan-out of 4 meant 4 nodes competing for source bandwidth.
- Frontend network: Copies went over the Ethernet network by default — I could have configured IPoIB, but that added more setup.

The Solution: MPI Broadcasts

The key insight was that MPI's broadcast primitive (MPI_Bcast) is specifically optimized for one-to-many data distribution. Modern MPI implementations like HPC-X use tree-based algorithms that efficiently utilize the high-bandwidth, low-latency InfiniBand network. With mpi-stage:

- Single source read: Only one node reads from the source filesystem.
- Backend network utilization: Data flows over InfiniBand using optimized MPI collectives.
- Intelligent skipping: Nodes that already have the file (verified by size or checksum) skip the copy entirely.

Combined, this keeps the shared source (NFS, Lustre, blobfuse, etc.) from being hammered by many concurrent readers while still taking full advantage of the backend fabric.

How It Works

mpi-stage is designed around a simple workflow: the source node reads the file in chunks and streams each chunk via MPI_Bcast, and destination nodes write each chunk to local storage immediately upon receipt. This streaming approach means the entire file never needs to fit in memory — only a small buffer is required.

Key Features

Pre-copy Validation

Before any data is transferred, each node checks if the destination file already exists and matches the source.
You can choose between:

- Size check (default): Fast comparison of file sizes — sufficient for most use cases.
- Checksum: Stronger validation, but requires reading the full file and is therefore slower.

If all nodes already have the correct file, mpi-stage completes in milliseconds with no data transfer.

Double-Buffered Transfers

The implementation uses double-buffered, chunked transfers to overlap network communication with disk I/O. While one buffer is being broadcast, the next chunk is being read from the source.

Post-copy Validation

Optionally verify that all nodes received the file correctly after the copy completes.

Single-Writer Per Node

The tool enforces one MPI rank per node to prevent filesystem contention and ensure predictable performance.

Real-World Performance

In one run using 156 GPU nodes, distributing a container image achieved approximately 3 GB/s effective distribution rate (file_size/time), completing in just over 5 seconds:

[0] Copy required: yes
[0] Starting copy phase (source writes: yes)
[0] Copy complete, Bandwidth: 3007.14 MB/s
[0] Post-validation complete
[0] Timings (s):
    Topology check:   5.22463
    Source metadata:  0.00803746
    Pre-validation:   0.0046786
    Copy phase:       5.21189
    Post-validation:  2.2944e-05
    Total time:       5.2563

Because every node writes the file to its own local NVMe, the cumulative write rate across the cluster is roughly this number times the node count: ~3 GB/s × 156 ≈ ~468 GB/s of total local writes.

Workflow: Container Image Distribution

The primary use case is distributing Squashfs images to local NVMe before launching containerized workloads. Run mpi-stage as a job step before your main application:

#!/bin/bash
#SBATCH --job-name=my-training-job
#SBATCH --ntasks-per-node=1
#SBATCH --exclusive

# Stage the container image
srun --mpi=pmix ./mpi_stage \
    --source /shared/images/pytorch.sqsh \
    --dest /nvme/images/pytorch.sqsh \
    --pre-validate size \
    --verbose

# Run the actual job (from local NVMe - much faster!)
srun --container-image=/nvme/images/pytorch.sqsh ...

mpi-stage will create the destination directory if it doesn't exist. If your container runtime supports running the image directly from shared storage, you may not strictly need this step—but staging to local NVMe tends to be faster and more predictable at large scale. Because of the pre-validation, you can include this step in every job script without penalty—if the image is already present, it completes in milliseconds.

Getting Started

git clone https://github.com/edwardsp/mpi-stage.git
cd mpi-stage
make

For detailed usage and options, see the README.

Summary

mpi-stage started as a solution to a very specific problem—staging large container images efficiently across a large GPU cluster—but the same pattern may be useful in other scenarios where many nodes need the same large file. By using MPI broadcasts, only a single node reads from the source filesystem, while data is distributed over the backend network using optimized collectives. In practice, this can significantly reduce load on shared filesystems and cloud-backed mounts, such as Azure Blob Storage accessed via blobfuse2, where hundreds of concurrent readers can otherwise become a bottleneck. While container images were the initial focus, this approach could also be applied to staging training datasets, distributing model checkpoints or pretrained weights, or copying large binaries to local NVMe before a job starts. Anywhere that a "many nodes, same file" pattern exists is a potential fit.
If you're running large-scale containerized workloads on Azure HPC infrastructure, give it a try. If you use mpi-stage in other workflows, I'd love to hear what worked (and what didn't). Feedback and contributions are welcome. Have questions or feedback? Leave a comment below or open an issue on GitHub.
Announcing Native NVMe in Windows Server 2025: Ushering in a New Era of Storage Performance

We're thrilled to announce the arrival of Native NVMe support in Windows Server 2025—a leap forward in storage innovation that will redefine what's possible for your most demanding workloads. Modern NVMe (Non-Volatile Memory Express) SSDs now operate more efficiently with Windows Server. This improvement comes from a redesigned Windows storage stack that no longer treats all storage devices as SCSI (Small Computer System Interface) devices—a method traditionally used for older, slower drives. By eliminating the need to convert NVMe commands into SCSI commands, Windows Server reduces processing overhead and latency. Additionally, the whole I/O processing workflow has been redesigned for extreme performance.

This release is the result of close collaboration between our engineering teams and hardware partners, and it serves as a cornerstone in modernizing our storage stack. Native NVMe is now generally available (GA) with an opt-in model (disabled by default as of October's latest cumulative update for WS2025). Switch to Native NVMe as soon as possible or you are leaving performance gains on the table! Stay tuned for more updates from our team as we transition to a dramatically faster, more efficient storage future.

Why Native NVMe and why now?

Modern NVMe devices—like PCIe Gen5 enterprise SSDs capable of 3.3 million IOPS, or HBAs delivering over 10 million IOPS on a single disk—are pushing the boundaries of what storage can do. SCSI-based I/O processing can't keep up because it uses a single-queue model, originally designed for rotational disks, where protocols like SATA support just one queue with up to 32 commands. In contrast, NVMe was designed from the ground up for flash storage and supports up to 64,000 queues, with each queue capable of handling up to 64,000 commands simultaneously.

With Native NVMe in Windows Server 2025, the storage stack is purpose-built for modern hardware—eliminating translation layers and legacy constraints. Here's what that means for you:

- Massive IOPS Gains: Direct, multi-queue access to NVMe devices means you can finally reach the true limits of your hardware.
- Lower Latency: Traditional SCSI-based stacks rely on shared locks and synchronization mechanisms in the kernel I/O path to manage resources. Native NVMe enables streamlined, lock-free I/O paths that slash round-trip times for every operation.
- CPU Efficiency: A leaner, optimized stack frees up compute for your workloads instead of storage overhead.
- Future-Ready Features: Native support for advanced NVMe capabilities like multi-queue and direct submission ensures you're ready for next-gen storage innovation.

Performance Data

Using DiskSpd.exe, basic performance testing shows that with Native NVMe enabled, WS2025 systems can deliver up to ~80% more IOPS and ~45% savings in CPU cycles per I/O on 4K random read workloads on NTFS volumes when compared to WS2022. This test ran on a host with a dual-socket Intel CPU (208 logical processors, 128GB RAM) and a Solidigm SB5PH27X038T 3.5TB NVMe device. The test can be recreated by running "diskspd.exe -b4k -r -Su -t8 -L -o32 -W10 -d30 testfile1.dat > output.dat" and modifying the parameters as desired. Results may vary.

Top Use Cases: Where You'll See the Difference

Try Native NVMe on servers running your enterprise applications. These gains are not just for synthetic benchmarks—they translate directly to faster database transactions, quicker VM operations, and more responsive file and analytics workloads.
- SQL Server and OLTP: Shorter transaction times, higher IOPS, and lower tail latency under mixed read/write workloads.
- Hyper-V and virtualization: Faster VM boot, checkpoint operations, and live migration with reduced storage contention.
- High-performance file servers: Faster large-file reads/writes and quicker metadata operations (copy, backup, restore).
- AI/ML and analytics: Low-latency access to large datasets and faster ETL, shuffle, and cache/scratch I/O.

How to Get Started

1. Check your hardware: Ensure you have NVMe-capable devices that are currently using the in-box Windows NVMe driver (StorNVMe.sys). Some NVMe device vendors provide their own drivers; devices that are not using the in-box Windows NVMe driver will not see any difference.

2. Enable Native NVMe: After applying the 2510-B Latest Cumulative Update (or most recent), add the registry key with the following command:

reg add HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Policies\Microsoft\FeatureManagement\Overrides /v 1176759950 /t REG_DWORD /d 1 /f

Alternatively, use this Group Policy MSI to add the policy that controls the feature, then run the local Group Policy Editor to enable the policy (found under Local Computer Policy > Computer Configuration > Administrative Templates > KB5066835 251014_21251 Feature Preview > Windows 11, version 24H2, 25H2). Once Native NVMe is enabled, open Device Manager and ensure that all attached NVMe devices are displayed under the "Storage disks" section.

3. Monitor and Validate: Use Performance Monitor and Windows Admin Center to see the gains for yourself, or try DiskSpd.exe to run microbenchmarks in your own environment. A quick way to measure IOPS in Performance Monitor is to set up a histogram chart and add a counter for Physical Disk > Disk Transfers/sec (where the selected instance is a drive that corresponds to one of your attached NVMe devices), then run a synthetic workload with DiskSpd. Compare the numbers before and after enabling Native NVMe to see the realized difference in your environment (see the PowerShell sketch at the end of this post).

Join the Storage Revolution

This is more than just a feature—it's a new foundation for Windows Server storage, built for the future. We can't wait for you to experience the difference. Share your feedback, ask questions, and join the conversation. Let's build the future of high-performance Windows Server storage together. Send us your feedback or questions at nativenvme@microsoft.com!

— Yash Shekar (and the Windows Server team)
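To go with the Monitor and Validate step above, here is a minimal PowerShell sketch for capturing the same Disk Transfers/sec counter from the command line. It is not part of the announcement; the NVMe name filter and the 30-second sample window are assumptions, so adjust both to your devices and workload, and run it once before and once after enabling Native NVMe while a DiskSpd test is active.

# Confirm which driver the NVMe disks are using (expect stornvme.inf for the in-box driver).
# The '*NVMe*' name filter is a heuristic - adjust it to match your device names.
Get-CimInstance Win32_PnPSignedDriver |
    Where-Object { $_.DeviceName -like '*NVMe*' } |
    Select-Object DeviceName, InfName, DriverVersion

# Sample Disk Transfers/sec (IOPS) for every physical disk for 30 seconds
# while the DiskSpd workload from the article is running.
$samples = Get-Counter -Counter '\PhysicalDisk(*)\Disk Transfers/sec' -SampleInterval 1 -MaxSamples 30

# Average the samples per disk instance; compare the output before and after the change.
$samples.CounterSamples |
    Where-Object { $_.InstanceName -ne '_total' } |
    Group-Object InstanceName |
    ForEach-Object {
        $avg = ($_.Group | Measure-Object CookedValue -Average).Average
        '{0,-12} average IOPS: {1:N0}' -f $_.Name, $avg
    }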
BLOG: Determine and modernize Filesystem Deduplication

Version history
- 1.4 revised script so ReFS volumes with classic dedup will be identified, added more eligibility checks and error handling
- 1.3 added point #4 in migration guidance
- 1.2 revised script
- 1.1 formatting

This blog explains the two Windows deduplication modes: classic Windows Data Deduplication (NTFS or ReFS) and ReFS Deduplication (ReFS only). It covers how they differ, why you should consider upgrading to Windows Server 2025 to leverage the new ReFS dedup engine, and clear warnings about scenarios where ReFS is not recommended. Practical migration guidance and detection commands are included.

Differences between classic dedup and ReFS dedup

- File system: Classic dedup runs on NTFS or ReFS; ReFS dedup runs on ReFS only, and only on Windows Server 2025 or later.
- Implementation: They are separate engines with different metadata formats and management cmdlets.
- Management: Classic dedup uses the Dedup PowerShell module (Get-DedupVolume, Start-DedupJob, Disable-DedupVolume). ReFS dedup uses its own cmdlets (Get-ReFSDedupStatus, Enable-ReFSDedup); a short enable example appears just before the migration guidance below.
- Conversion: There is no in-place conversion between the two; metadata and chunk formats are incompatible.
- Improvements: The new in-line ReFS Deduplication leverages the advantages of the ReFS file system, which makes deduplication more efficient and less CPU intensive. The new ReFS Deduplication can also compress data in-line using the LZ4 algorithm, putting it on par with enterprise solutions often found in SAN storage or Linux appliances. Compression is optional and is configured per volume.

Why upgrade to Windows Server 2025

- Improved version of the ReFS file system.
- Improved ReFS in-line deduplication plus optional LZ4 compression: Server 2025 includes enhancements to ReFS dedup performance, scalability, and integration with modern storage features.
- Support and fixes: Windows Server 2016 and 2019 are past mainstream support, increasing the likelihood of costly support cases and delayed fixes; upgrading reduces operational risk and ensures access to ongoing improvements.
- Future compatibility: Newer OS releases receive optimizations and bug fixes for ReFS and dedup scenarios that older releases will not.
- SMB compression: noticeably faster data transfers at minimal CPU cost when transferring data over the network.
- For feature- and security-related improvements, refer to the available Microsoft Windows Server 2025 Summit content on techcommunity.microsoft.com.

Scenarios where ReFS is not recommended

- ReFS on SAN in clustered CSV environments: Avoid placing ReFS dedup on top of SAN-backed Cluster Shared Volumes (CSVFS) in production clusters; these clustered SAN/CSV scenarios have caused severe performance issues in practice. Please refer to the ReFS documentation.
- Many small, fast-changing files (personal opinion and experience, not endorsed by Microsoft): Workloads with frequent small writes—user profiles, redirected AppData, or applications that churn small config files (for example, Lotus Notes config files)—can cause locks, performance degradation, or unexpected behavior on ReFS. Exclude these paths from dedup or keep them on NTFS. Restrictions such as lockups, high RAM consumption, deadlocks, or BSODs might have been addressed in Windows Server 2025.

Improving reliability and performance is a top goal for ReFS, to improve adoption and feature parity with NTFS. For information about feature parity, please refer to the ReFS documentation.
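As a companion to the cmdlets referenced above, here is a minimal sketch of turning on the new engine for a single volume. The module name and the Enable-ReFSDedup/Get-ReFSDedupStatus cmdlets come from this post; the drive letter, the -Type value for combined dedup plus compression, and the Start-ReFSDedupJob call are assumptions to verify against the documentation for your build.

# Sketch: enable ReFS dedup with in-line compression on an existing ReFS volume (D: assumed).
Import-Module Microsoft.ReFsDedup.Commands

# Enable deduplication together with compression; use -Type Dedup for dedup only (assumed parameter values).
Enable-ReFSDedup -Volume 'D:' -Type DedupAndCompress

# Run an optimization pass now, then review the savings.
Start-ReFSDedupJob -Volume 'D:'
Get-ReFSDedupStatus -Volume 'D:'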
Migration guidance

The following instructions describe a high-level, supported migration path from Windows deduplication on the NTFS file system to native ReFS Deduplication.

Note: Step #3, data migration, is not required when already using ReFS with Data Deduplication; in that case it is enough to execute steps #1 and #2.

Note: Validate on non-production data first. Plan for rehydration time and network/storage throughput. Ensure backups are current before starting, and make sure to have a full backup before upgrading the server OS or making changes.

1. Disable classic dedup on the NTFS source:
Disable-DedupVolume -Volume YourDriveLetter:

2. Rehydrate (un-deduplicate) the data:
Start-DedupJob -Volume YourDriveLetter: -Type Unoptimization

3. Copy or move data to a ReFS volume (new target): For straightforward NTFS→ReFS copies, robocopy is recommended (a worked example appears at the end of this post). Another alternative is the Windows Admin Center File Server Migration feature. For complex scenarios such as open files, long path names, very large datasets (< 5 TB), many small files, restructuring, a GUI (including on Windows Server Core), automation, improved logging, or cloud/hybrid migrations, I recommend GScopy Enterprise by GuruSquad for higher speed (up to 40%) and reliability.

4. Optionally remove the Windows Server feature: When the old deduplication is no longer in use, consider removing the feature. The advantages of doing so:
- removes an unnecessary service
- removes the file system filter driver for dedup, which has a performance impact even when not in use
- removes the PowerShell cmdlets for the old dedup, so they cannot be mistakenly used by existing scripts, unaware admins, etc.

When migrating files over the network:
- SMB compression: consider running Windows Server 2025 on both source and target and leverage SMB compression. SMB compression is available in Microsoft xcopy, Microsoft robocopy and GuruSquad GScopy Enterprise.
- Balancing and teaming with SMB: SMB does not require LBFO or SET teaming. It automatically detects network links and actively balances on its own on Windows Server 2016 and later. Using teaming, depending on the configuration, can negatively affect transfer speed.

Quick detection and diagnostic commands

Check file systems:
Get-Volume | Select DriveLetter, FileSystem

Check classic dedup feature:
Get-WindowsFeature -Name FS-Data-Deduplication
Get-DedupVolume
Get-DedupStatus

Check ReFS dedup:
Get-Command -Module Microsoft.ReFsDedup.Commands
Get-ReFSDedupStatus -Volume YourDriveLetter:

Diagnostic script to detect both:

<#
.SYNOPSIS
Detects classic NTFS Data Deduplication and ReFS Deduplication across local volumes.

.DESCRIPTION
- Reports NTFS volumes with classic Data Dedup enabled.
- Lists ReFS volumes present on the host.
- If the ReFS dedup cmdlet exists AND OS build >= 26100, checks ReFS dedup status per ReFS volume.
- Color coding:
  * Classic dedup enabled → Yellow
  * Classic dedup not enabled → Cyan
  * ReFS dedup enabled → Green
  * ReFS dedup not enabled → Cyan

.NOTES
Version: 1.7
Author: Karl Wester-Ebbinghaus + Copilot
Requirements: Elevated PowerShell session, PowerShell 5.1 or newer
Supported OS: Windows Server 2025, Azure Stack HCI 24H2 or newer
Unsupported OS: Windows 10, Windows 11 (script terminates)
#>

#region Initialization
Write-Verbose "Initializing variables and environment..."
$Volumes = $null
$Volume = $null
$DedupVolumesList = $null
$DedupReFSVolumesList = $null
$DedupReFSVolumesListLetters = $null
$DedupReFSStatus = $null
$refsCmd = $null
$OSBuild = $null
$runReFSDedupChecks = $null
#endregion Initialization

#region Volume Discovery
Clear-Host
Write-Verbose "Querying NTFS and ReFS volumes..."
$Volumes = Get-Volume | Where-Object FileSystem -in 'NTFS','ReFS'
#endregion Volume Discovery

#region ReFS Dedup Cmdlet, OS Build and OS SKU Detection
Write-Verbose "Checking for ReFS deduplication cmdlet..."
$refsCmd = Get-Command -Name Get-ReFSDedupStatus -ErrorAction SilentlyContinue

Write-Verbose "Reading OS build number..."
try {
    $OSBuild = [int](Get-ItemProperty -Path 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion' -Name CurrentBuildNumber).CurrentBuildNumber
} catch {
    Write-Verbose "Registry read for OS build failed. Falling back to Environment OSVersion."
    $OSBuild = [int][Environment]::OSVersion.Version.Build
} # end try/catch for OS build detection

Write-Verbose "Checking OS InstallationType and EditionID..."
$CurrentVersionKey = Get-ItemProperty -Path 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion'
$InstallationType  = $CurrentVersionKey.InstallationType  # "Client" or "Server"
$EditionID         = $CurrentVersionKey.EditionID         # e.g. "AzureStackHCI", "ServerStandard", etc.
Write-Verbose "Detected InstallationType: $InstallationType"
Write-Verbose "Detected EditionID: $EditionID"
Write-Verbose "Detected OSBuild: $OSBuild"

# Block Windows 10/11 (Client OS)
if ($InstallationType -eq 'Client') {
    Write-Error "Unsupported OS detected: Windows Client (Windows 10/11). Only Windows Server or Azure Stack HCI are supported. Script will terminate."
    exit
}

# Allow Azure Stack HCI explicitly
if ($EditionID -eq 'AzureStackHCI') {
    Write-Verbose "Azure Stack HCI detected. Supported platform."
} else {
    # Must be Windows Server
    if ($InstallationType -ne 'Server') {
        Write-Error "Unsupported OS detected. Only Windows Server or Azure Stack HCI are supported. Script will terminate."
        exit
    }
    Write-Verbose "Windows Server detected (EditionID: $EditionID). Supported platform."
}

Write-Verbose "Evaluating ReFS dedup eligibility based on cmdlet presence and build >= 26100..."
$runReFSDedupChecks = $false
if ($refsCmd -and ($OSBuild -ge 26100)) {
    $runReFSDedupChecks = $true
    Write-Verbose "ReFS dedup checks ENABLED (cmdlet present and OS build >= 26100)."
} else {
    Write-Verbose "ReFS dedup checks DISABLED (cmdlet missing or OS build < 26100)."
}
#endregion ReFS Dedup Cmdlet, OS Build and OS SKU Detection

#region Main Loop
foreach ($Volume in $Volumes) { # begin foreach volume loop
    Write-Host "Volume $($Volume.DriveLetter): ($($Volume.FileSystem))"
    Write-Verbose "Processing volume $($Volume.DriveLetter)..."

    #region Classic Dedup + ReFS Volume Listing
    if ($Volume.FileSystem -eq 'NTFS' -or $Volume.FileSystem -eq 'ReFS') {
        Write-Verbose "Checking classic deduplication status for volume $($Volume.DriveLetter)..."
        $DedupVolumesList = Get-DedupVolume -Volume $Volume.DriveLetter -ErrorAction SilentlyContinue
        if ($DedupVolumesList) {
            Write-Host " → Classic Data Dedup ENABLED on $($Volume.DriveLetter), $($Volume.FileSystem)" -ForegroundColor Yellow
        } else {
            Write-Host " → Classic Data Dedup NOT enabled on $($Volume.DriveLetter), $($Volume.FileSystem)" -ForegroundColor Cyan
        } # end if classic dedup enabled

        Write-Verbose "Listing ReFS volumes on host..."
        $DedupReFSVolumesList = Get-Volume | Where-Object FileSystem -eq 'ReFS'
        if ($DedupReFSVolumesList) {
            $DedupReFSVolumesListLetters = ($DedupReFSVolumesList | ForEach-Object { $_.DriveLetter }) -join ','
            Write-Host " → ReFS volumes present on host: $DedupReFSVolumesListLetters"
        } else {
            Write-Host " → No ReFS volumes detected on host"
        } # end if ReFS volumes present
    } # end NTFS/ReFS block
    #endregion Classic Dedup + ReFS Volume Listing

    #region ReFS Dedup Status
    if ($Volume.FileSystem -eq 'ReFS') {
        if ($runReFSDedupChecks) {
            Write-Verbose "Checking ReFS deduplication status for volume $($Volume.DriveLetter)..."
            $DedupReFSStatus = Get-ReFSDedupStatus -Volume $Volume.DriveLetter -ErrorAction SilentlyContinue
            if ($DedupReFSStatus) {
                Write-Host " → ReFS Dedup ENABLED on $($Volume.DriveLetter), $($Volume.FileSystem)" -ForegroundColor Green
            } else {
                Write-Host " → ReFS Dedup NOT enabled on $($Volume.DriveLetter), $($Volume.FileSystem)" -ForegroundColor Cyan
            } # end if ReFS dedup enabled
        } else {
            if (-not $refsCmd) {
                Write-Host " → Skipping ReFS dedup check: Get-ReFSDedupStatus cmdlet not present" -ForegroundColor Cyan
            } else {
                Write-Host " → Skipping ReFS dedup check: OS build $OSBuild < required 26100" -ForegroundColor Cyan
            } # end reason for skipping ReFS dedup check
        } # end if runReFSDedupChecks
    } # end if ReFS filesystem block
    #endregion ReFS Dedup Status

    Write-Host ""
} # end foreach volume loop
#endregion Main Loop

#region End
Write-Verbose "Script completed."
#endregion End

Recommendations and next steps

- Inventory: Identify volumes using NTFS dedup and ReFS dedup, and map workloads that create many small or rapidly changing files.
- Plan: Schedule rehydration and migration windows; test ReFS dedup on representative datasets.
- Upgrade: Prioritize upgrading servers still on 2016/2019 (past end of mainstream support) to reduce support risk and gain the latest ReFS dedup improvements. Kindly consider reading my Windows Server Installation Guidance and Windows Server Upgrade Guidance.
- Exclude: Keep user profiles, AppData, and other high-churn small-file paths off ReFS dedup or on NTFS.
- Consider dedup and compression: Enable compression optionally. Note that ReFS dedup compression is not the same as the compressed-files integration in File Explorer file properties (which dates back to Windows 9x); it is transparent to the application.
- Make smart decisions: Avoid using dedup when the dataset changes fast or your dedup + compression rate is below 20%. Usually you can expect 40% or more savings, and up to 80% in specific use cases like VDI VHDX.
- Plan your dedup jobs: Make use of the scheduling features for dedup jobs through PowerShell or Windows Admin Center (WAC) when using ReFS dedup on more than one volume per server. Otherwise they might all run at the same time and impact your storage performance (especially spinning rust) and consume RAM and CPU.
- Share and educate: Inform your infrastructure team about the changes so they avoid using the traditional dedup on ReFS.
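As referenced in step #3 of the migration guidance, here is a minimal PowerShell sketch of the rehydrate-then-copy flow. The drive letters and paths are placeholders, and the robocopy switches are one reasonable combination rather than a requirement; /COMPRESS requests SMB compression and only helps when the copy actually goes over the network between hosts that support it.

# 1. Disable classic dedup on the NTFS source and rehydrate (un-deduplicate) the data.
Disable-DedupVolume -Volume 'D:'
Start-DedupJob -Volume 'D:' -Type Unoptimization

# Wait for the rehydration job to drain before copying.
while (Get-DedupJob -Volume 'D:' -ErrorAction SilentlyContinue) {
    Start-Sleep -Seconds 60
}

# 2. Copy the data to the ReFS target, preserving timestamps, attributes and ACLs.
#    /MT:32 = 32 copy threads, /COMPRESS = request SMB compression on network copies.
robocopy 'D:\Data' 'E:\Data' /MIR /COPYALL /DCOPY:DAT /R:1 /W:1 /MT:32 /COMPRESS /LOG:C:\Temp\ntfs-to-refs.log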
Windows Backup taking waaaaay too long

While I'm not a heavy user of these MS forums, I have had to resort to them from time to time over the last 15-20 years. Yet I still can't figure out the organizational structure, and it seems I can never find the right forum for my query. Almost every time my post gets moved to the correct forum or message board, or someone gives me a link directly to it. I expect it to be no different this time, and I'm perfectly fine with that. So here we go.

I have Windows Server 2025 installed as a VM using MS's built-in Hyper-V on a Server 2025 computer. The VM is set up as a DC and all that stuff functions exactly as it should. However, the backup has suddenly gone from taking anywhere from 2 hours to a maximum that comes close to but has never exceeded four hours (obviously, it depends on how much there is to actually back up) to much longer. I've already gone through the troubleshooting tips to do things like checking the VSS settings and a bit of other stuff I can't exactly recall at the moment.

I have an external physical 1TB USB hard drive attached to the physical computer, and it's attached as a drive to the Server 2025 VM and shows up in Computer Management/Disk Management as Disk 1, as it should. I have the VM set up to use this Disk 1 as the backup disk with the Windows Server Backup program.

Some things I note and add here in case it matters:
- The size of the VM disk for this Server 2025 VM is 500GB, and the partition size of Drive C shows as 498.91GB, with the remaining space shown as 100MB for the EFI system partition and 1001MB for the recovery partition.
- When backup starts, a new disk labeled Disk 2 appears in the Disk Management window on the VM, and I note it's the same size as Drive C on the VM at 498.91GB.

I'm wondering if this has anything to do with why my backups suddenly went from taking a max of 4 hours to as long as 20 hours to complete. Where is this virtual disk created? I looked on the VM host machine in the C:\programdata\microsoft\windows\Virtual Hard Disks directory, and it's not there. It's not on the VM machine because the virtual hard disk directory doesn't exist in that same location on the VM. The host machine itself has a 2TB hard drive in it with 993GB of free space.

Any advice or suggestions here? I have no idea why backups went from 2-4 hours to taking 20 hours or more to complete. Thanks for any help, advice or suggestions anyone can offer here.

-Carl
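Not an answer to the underlying question, but while waiting for replies it can help to gather a few data points from inside the guest. A quick sketch using the standard Windows Server Backup cmdlets plus vssadmin (adjust the number of jobs to review as needed):

# Check VSS writer health in the guest - writers stuck in a failed state often stretch backup times.
vssadmin list writers | Select-String 'Writer name', 'State', 'Last error'

# Review the last few Windows Server Backup jobs and how long each one took.
Get-WBJob -Previous 5 |
    Select-Object JobType, StartTime, EndTime,
                  @{ n = 'DurationHours'; e = { [math]::Round(($_.EndTime - $_.StartTime).TotalHours, 1) } }

# Summary of the backup target and most recent backup versions.
Get-WBSummary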
Built an Android app for OneDrive duplicate detection (something I wish Microsoft offered natively)

After years of dealing with duplicate photos and videos in OneDrive, I built a solution and wanted to share it with the community.

**The problem:**
- Samsung Gallery sync creates duplicates in both "Camera Roll" and "Samsung Gallery" folders
- WhatsApp media gets backed up twice (original + shared copy)
- App resets/reinstalls trigger re-uploads with "(1)" suffixes
- No native duplicate detection in OneDrive

**What I built:**
OneDrive MediaOps - an Android app that scans for duplicates directly in the cloud. No need to download files to a desktop first.

Key features:
- Cloud-based scanning (no downloads required)
- Algorithm detects visually identical photos even with different filenames
- Preview before deleting
- Batch deletion

**Why cloud-based matters:**
With 50GB+ of photos, downloading everything to run a desktop duplicate finder wasn't practical. This scans media directly via the Graph API.

Available on Google Play: https://play.google.com/store/apps/details?id=com.onedrive.mediaops&pcampaignid=web_share

Would love feedback from the community - especially if you've been dealing with the Samsung/OneDrive sync duplicate issue.
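For anyone curious what a cloud-side scan like this looks like in principle, here is a minimal PowerShell sketch against the Microsoft Graph drive API. It is purely illustrative and not how OneDrive MediaOps is implemented: the access token is a placeholder, it only looks at the top-level folder, and it uses size plus file hash as the duplicate key (quickXorHash is exposed for OneDrive for Business; personal accounts expose sha1Hash instead).

# Placeholder token - acquire a delegated Graph token with the Files.Read scope.
$token   = '<ACCESS_TOKEN>'
$headers = @{ Authorization = "Bearer $token" }

# List top-level drive items (name, size, file facet), following paging.
$uri   = 'https://graph.microsoft.com/v1.0/me/drive/root/children?$select=name,size,file&$top=200'
$items = @()
while ($uri) {
    $page   = Invoke-RestMethod -Uri $uri -Headers $headers
    $items += $page.value
    $uri    = $page.'@odata.nextLink'
}

# Files sharing the same size and hash are duplicate candidates.
$items |
    Where-Object { $_.file } |
    Group-Object { '{0}|{1}' -f $_.size, $_.file.hashes.quickXorHash } |
    Where-Object Count -gt 1 |
    ForEach-Object { $_.Group.name -join '  <->  ' }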
Can OneDrive mobile sync a specific local folder on Android or iOS?

Hi, I would like to understand what is officially supported regarding folder synchronization in the OneDrive mobile app on Android and iOS. Specifically:
- Is it possible to select a specific local folder on a mobile device and configure OneDrive mobile to automatically sync that folder?
- Or does the OneDrive mobile app only support manual uploads and limited automatic features such as camera roll backup?

I am looking for an official clarification on the supported behavior of OneDrive on mobile devices. Thank you.
Move files from Account A to Account B (165 GB of data)

I've been using Microsoft Office 365 Family for several years, sharing it with my family members. We have 6 accounts, each with 1 TB of storage. One of my accounts has now reached 985 GB, so I need to transfer files from Account A (my account) to Account B (my wife's account). These files mainly consist of photos and videos — over 200,000 files in total.

I tried transferring a 6 GB wedding shoot (.ISO format), but it took more than an hour. Re-downloading and uploading all the data isn't feasible, as my internet connectivity is limited to 100 Mbps. I'm looking for the best and most efficient way to move these files directly between OneDrive accounts.
Cache drive reconfiguration in Server 2025 Storage Spaces Direct cluster

We have a three-node S2D cluster running Server 2025, with the storage in a 3-way mirror, running Hyper-V VMs. Each node has 4 x NVMe drives that are currently being used as cache drives, but which are connected to a RAID controller (in HBA mode), so in the S2D configuration they appear as SSD drives rather than NVMe drives.

We've purchased the required cables and drive bays to be able to reconfigure the NVMe drives so that they're attached directly to the PCIe bus, so they'll show up as NVMe drives and hopefully give us a performance boost, so I'm just trying to plan the reconfiguration. I was hoping it would be a relatively simple process of shutting everything down, reconfiguring the storage and bringing everything back online, but ChatGPT suggests things won't be that easy and that a complete reconfiguration of the storage would be required.

So in a nutshell, can the cache drives be reconfigured without a complete rebuild of the S2D storage?

Cheers, Rob
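Not an answer to the rebuild question, but it can be useful to record how S2D currently classifies the drives before recabling and compare afterwards. A small sketch using the standard Storage cmdlets plus the cluster's S2D settings cmdlet, run on one of the cluster nodes:

# How does this node currently see the physical disks?
# Today the cache devices will likely report BusType SAS/RAID and MediaType SSD;
# after attaching them directly to PCIe they should report BusType NVMe.
Get-PhysicalDisk |
    Sort-Object FriendlyName |
    Select-Object FriendlyName, SerialNumber, BusType, MediaType, Usage,
                  @{ n = 'SizeGB'; e = { [math]::Round($_.Size / 1GB) } } |
    Format-Table -AutoSize

# Cluster-wide cache configuration and state (cache drives normally show Usage = Journal above).
Get-ClusterStorageSpacesDirect | Select-Object CacheState, CacheModeHDD, CacheModeSSD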
vNVMe on Hyper-V to unlock PCIe 5.0 NVMe performance

On hosts with NVMe PCIe 5.0 (E3.S/U.2), Hyper-V guests still use virtual SCSI and leave a lot of performance on the table. We are paying for top-tier storage, yet software becomes the limiter. A virtual NVMe device that preserves checkpoints, Replica, and Live Migration would align guest performance with modern hardware without forcing DDA and its operational trade-offs.
26063 deduplication data corruption is still there.

From Server 2022 up to this newest 26063 build, they all have the same problem, as described here: https://techcommunity.microsoft.com/t5/windows-server-insiders/server-vnext-26040-and-server-2022-deduplication-data-corruption/m-p/4047321

I am out of energy and give up for today. It seems to be impossible to get Microsoft to care about actual OS bugs instead of marketing.