Forum Discussion
vNVMe on Hyper-V to unlock PCIe 5.0 NVMe performance
What did you use to benchmark your virtual disk?
In my case I use CrystalDiskMark and it works fine.
On my lab Hyper-V server I have 8 NVMe drives in RAID 0 (RAID 0, not RAID 5, because it's a lab server), and my VMs use the virtual SCSI controller as usual.
This is the result:
- festuc · Aug 27, 2025 · Copper Contributor
Thanks for the reply and for sharing your CDM settings. I reran everything on my side using the exact same CrystalDiskMark 9.0.1 parameters from your screenshot (5 passes, 1 GiB, R70%/W30%, SEQ1M Q8T64 and RND4K Q32T64). Hardware and topology details below for context.
1) Drop in RND4K write after enabling the Hyper-V role (still testing on the host)
- Host (64 cores) on bare-metal vs same host with the Hyper-V role enabled.
- Sequential read/write stayed essentially identical (within ~±2%).
- However, RND4K write (Q32T64) on the host fell by ~44% on both drives after enabling Hyper-V.
- This was reproducible on:
- Intel Optane P5800X 800 GB (U.2, PCIe 4.0 x4)
- Solidigm D7-PS1030 3.2 TB (E3.S, PCIe 5.0 x4)
- Same OS, power plan = High Performance, same CDM build, same test size/data pattern, AV exclusions in place.
In short: just turning on the Hyper-V role didn’t touch sequential, but it did hurt 4K random writes on the host by ~44% for me.
2) Performance inside the VM
- VM configured with 32 vCPU (I do not want to break NUMA; NUMA Spanning = OFF on the host).
- Sequential (SEQ1M Q8T64) inside the VM is almost identical to the host (within ~1–2%) on both drives.
- RND4K read (Q32T64) inside the VM drops significantly versus the host:
- Optane P5800X: about −55% vs host
- D7-PS1030: about −74% vs host
- RND4K write (Q32T64) inside the VM goes up versus the Hyper-V host run (roughly +30%), but it’s still ~25% below bare-metal.
My interpretation: the virtual SCSI + VMBus path and scheduling at T=64 threads inside the VM don’t scale like the host (especially for 4K random reads), while writes may benefit from coalescing/scheduling effects but still can’t match bare-metal.
3) Test setup (for completeness)
- Host: 64 cores.
- VM: 32 vCPU pinned to a single NUMA node (to keep locality); NUMA Spanning OFF at the host level.
- Drives under test (one at a time, direct to CPU, no RAID):
- Intel Optane P5800X 800 GB, U.2, PCIe 4.0 x4
- Solidigm D7-PS1030 3.2 TB, E3.S, PCIe 5.0 x4
- CrystalDiskMark 9.0.1 x64, Admin, same settings as in your screenshot:
- 5 passes, 1 GiB, R70/W30, SEQ1M Q8T64, RND4K Q32T64
- Note: the “RND4K (µs)” row in CDM is the same Q32T64 test expressed in average µs, not QD1 latency.
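For reference, here is a minimal sketch (my own arithmetic, not something CDM exposes) of how that average-µs row relates to the IOPS figure: with Q outstanding I/Os per thread and T threads, Little's law gives average latency ≈ Q × T × 1,000,000 / IOPS, which at Q1T1 reduces to 1,000,000 / IOPS. The IOPS values below are placeholders, not measured results.

```python
def avg_latency_us(iops: float, queue_depth: int, threads: int) -> float:
    """Average per-I/O latency (µs) implied by an IOPS figure at a given
    queue depth and thread count (Little's law: I/Os in flight / throughput)."""
    outstanding = queue_depth * threads        # total I/Os in flight
    return outstanding * 1_000_000 / iops      # convert seconds to microseconds

# Placeholder IOPS numbers, just to show the shape of the conversion:
print(avg_latency_us(1_000_000, 32, 64))  # Q32T64 at 1M IOPS -> ~2048 µs average
print(avg_latency_us(100_000, 1, 1))      # Q1T1 at 100k IOPS -> ~10 µs average
```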
Thanks again!
- L_Youtell_974 · Aug 28, 2025 · Iron Contributor
Hi,
I don't think you can get 100% of the performance in your VM. Don't forget, you are not writing to the physical drive directly; the I/O goes through a virtualization layer to a virtual disk, so each write takes a little longer to land.
- festuc · Aug 28, 2025 · Copper Contributor
Thanks for the follow-up—totally agree that 100% host-class performance in a VM isn’t realistic due to the virtual I/O path. My goal isn’t zero-overhead; it’s to separate what’s expected from what might be avoidable, especially for SQL-like, small-IO latency.
What I’m seeing (CDM 9.0.1, Profile: Default, 5×, 32 GiB, Random; Windows Server 2025):
- Sequential & higher-queue reads are fine in a VM
- SEQ 1MiB Q8T1 (Read): ~parity host ↔ VM on both drives.
- RND4K Q32T1 (Read/Write): ~parity on Optane; small drop on the Solidigm.
- The big gap is low-queue 4K latency (Q1T1)
- Optane P5800X: 8.9 µs → 52.6 µs in VM (~6× slower 😒; 111k → 18.9k IOPS, −83%).
- Solidigm D7-PS1030: 55.1 µs → 119.7 µs in VM (~2.2× slower; 18.1k → 8.3k IOPS, −54%).
That pattern feels heavier than “a little slower”: the vSCSI/VMBus/StorVSP path seems fine for throughput and for moderate queues, but it penalizes Q1T1 latency very strongly, which is precisely the OLTP zone.
A couple of clarifications/questions:
- I’m not using exotic CDM presets (no T=64). This is the Default profile (RND4K Q1T1, RND4K Q32T1, SEQ Q1T1 Write, SEQ Q8T1 Read) to mimic SQL patterns.
- The host shows another oddity: after merely enabling the Hyper-V role, host-side RND4K write dropped ~44% (no VMs running). Is that expected (e.g., storage filter/stack changes when the role is installed)?
If there are recommended tunables (StorVSC/StorVSP queueing, controller settings, best practices for vSCSI) or an ETA/plan for vNVMe / multi-queue improvements, I'm happy to re-run with whatever parameters you suggest (I can also switch to DiskSpd for 8 KiB Q1/T1 with latency percentiles).
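For what it's worth, something along these lines is what I have in mind for the DiskSpd variant. This is only a rough sketch: the diskspd.exe path and the target file are placeholders, and the flags are the usual DiskSpd switches for an 8 KiB, QD1/single-thread, 70/30 random test with latency capture.

```python
import subprocess

# Sketch of the DiskSpd run (paths/target are placeholders):
#   -b8K   8 KiB blocks            -o1 -t1  QD1, single thread
#   -r     random I/O              -w30     70% read / 30% write
#   -Sh    disable OS + HW caching -L       capture latency statistics
#   -d60   60 s duration           -c32G    create a 32 GiB test file
cmd = [
    r"C:\Tools\diskspd.exe",        # placeholder install path
    "-b8K", "-d60", "-o1", "-t1",
    "-r", "-w30", "-Sh", "-L",
    "-c32G",
    r"D:\io_test\testfile.dat",     # placeholder target on the drive under test
]
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
print(result.stdout)                # text report includes average and percentile latency
```

With -L the text report includes per-thread latency percentiles, which should show the Q1T1 tail much better than a single CDM average.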