Forum Discussion
About Limitations of NonPaged Pool Size in Windows 10
It’s a very advanced use case.
At present, Windows 10 and Windows 11 enforce a hard limit of 128GB for the non‑paged pool, regardless of how much physical RAM is installed. This is part of the kernel’s memory management design and cannot be overridden through registry changes or system settings. The 75% rule you mentioned still applies, but only up to that ceiling: the pool can grow to 75% of physical RAM or 128GB, whichever is smaller.
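To make the arithmetic concrete, here is a small sketch that takes the two figures from this reply at face value (75% of physical RAM, capped at 128GB). The function name and constants are purely illustrative; nothing here queries the OS or corresponds to a Windows API.

```python
GIB = 1024 ** 3

# Figures quoted in this thread, hard-coded as assumptions (not read from the OS).
POOL_HARD_CAP = 128 * GIB


def nonpaged_pool_ceiling(physical_ram_bytes: int) -> int:
    """Return min(75% of RAM, 128 GiB) in bytes, per the rule described above."""
    if physical_ram_bytes < 0:
        raise ValueError("physical_ram_bytes must be non-negative")
    by_fraction = physical_ram_bytes * 3 // 4  # the "75% rule"
    return min(by_fraction, POOL_HARD_CAP)


for ram_gib in (64, 256, 1024):
    ceiling = nonpaged_pool_ceiling(ram_gib * GIB)
    print(f"{ram_gib:>5} GiB RAM -> {ceiling // GIB} GiB ceiling")
```

Under these assumptions, a 64 GiB machine is bound by the 75% rule (48 GiB), while anything with roughly 171 GiB of RAM or more runs into the fixed cap instead.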
Because of this, CUDA’s `cudaHostAlloc` and RDMA buffer registration will be constrained to that ceiling. The recommended approach is to split workloads into smaller blocks (≤128GB) and process them sequentially or in parallel streams. This ensures compatibility with the Windows kernel’s memory model.
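As a rough illustration of the splitting step, the sketch below plans `(offset, length)` blocks that cover a dataset without any single block exceeding a chosen size. The name `plan_chunks` and the 300 GiB / 64 GiB figures are hypothetical. Picking a block size well below the cap is my own assumption, since the non‑paged pool is shared with the rest of the kernel and its drivers.

```python
from typing import Iterator, Tuple

GIB = 1024 ** 3


def plan_chunks(total_bytes: int, max_chunk_bytes: int) -> Iterator[Tuple[int, int]]:
    """Yield (offset, length) pairs that cover [0, total_bytes) in order."""
    if total_bytes < 0:
        raise ValueError("total_bytes must be non-negative")
    if max_chunk_bytes <= 0:
        raise ValueError("max_chunk_bytes must be positive")
    offset = 0
    while offset < total_bytes:
        # The last block may be shorter than max_chunk_bytes.
        length = min(max_chunk_bytes, total_bytes - offset)
        yield offset, length
        offset += length


# Hypothetical workload: 300 GiB of data, 64 GiB per pinned block.
chunks = list(plan_chunks(300 * GIB, 64 * GIB))
for offset, length in chunks:
    print(f"offset={offset // GIB:>3} GiB  length={length // GIB:>2} GiB")
```

Each pair would then drive one pinned allocation or registration plus one transfer, either one after another or spread across streams.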
Linux does indeed allow more flexibility in tuning kernel parameters, but on Windows this limit is not user‑configurable. As of now, there are no public plans to remove or extend the 128GB cap.
If your application requires larger contiguous non‑paged allocations, you may want to:
- Architect your pipeline to chunk data into smaller buffers.
- Explore whether your workload can leverage paged memory with pinned allocations (depending on CUDA/driver support).
- Consider hybrid designs where RDMA or CUDA transfers operate on slices of the dataset.
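One way to picture the slice-based designs above is a small, fixed set of staging buffers that is reused for every slice, so the pinned footprint stays constant no matter how large the dataset is. The sketch below only simulates this in plain Python: `bytearray` stands in for a pinned host buffer, and `transfer` is a hypothetical callback standing in for whatever CUDA or RDMA call actually moves the data.

```python
from typing import Callable, List


def stream_through_staging(
    source: bytes,
    slice_bytes: int,
    num_buffers: int,
    transfer: Callable[[memoryview], None],
) -> int:
    """Push `source` through fixed-size staging buffers; return staging bytes used."""
    if slice_bytes <= 0 or num_buffers <= 0:
        raise ValueError("slice_bytes and num_buffers must be positive")
    # Allocated once and reused: a stand-in for pinned host buffers.
    staging: List[bytearray] = [bytearray(slice_bytes) for _ in range(num_buffers)]
    view = memoryview(source)
    for index, offset in enumerate(range(0, len(source), slice_bytes)):
        buf = staging[index % num_buffers]  # round-robin reuse
        piece = view[offset:offset + slice_bytes]
        buf[:len(piece)] = piece  # same-length copy into the staging buffer
        # `transfer` is synchronous here. A real asynchronous pipeline would
        # have to wait for the earlier transfer on `buf` before overwriting it.
        transfer(memoryview(buf)[:len(piece)])
    return slice_bytes * num_buffers


received: List[bytes] = []
data = bytes(range(256)) * 10  # 2560 bytes of sample data
staging_bytes = stream_through_staging(
    data, slice_bytes=1000, num_buffers=2,
    transfer=lambda chunk: received.append(bytes(chunk)),
)
print(len(received), staging_bytes)
```

With two buffers, one slice can be filled while the other is in flight, which is the usual reason for having more than one.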
I understand this feels restrictive on systems with >1TB RAM, but the current design prioritizes system stability and kernel integrity. Feedback like yours is valuable, and I would encourage you to submit it through the Feedback Hub so it reaches the Windows engineering team directly.