Nov 22 2018 01:45 AM
Hello!
I am trying to set up S2D on a two-node cluster for hyper-converged infrastructure. Unfortunately, I observe a significant write performance drop: the S2D storage is slower than the slowest physical drive participating in the cluster.
What could cause this?
How to get better results?
My test environment
OS: Windows Server 2019 Datacenter Build 17723.rs5_release.180720-1452
Both nodes are connected directly using one 10 Gbps link for S2D
Each node has a 1 Gbps link for management
S2D two-node cluster configured with cache disabled
Node 1
System: Supermicro X9SRH-7F/7TF
CPU: Intel Xeon E5-2620 2.00 GHz (6CPUs)
RAM: 32 GB DDR3
Network: Intel X540-AT2 10 Gbps copper
System drive: Samsung SSD 840 PRO 512 GB
Storage drives: Samsung SSD 850 PRO 512 GB, Samsung SSD 840 PRO 512 GB
Node 2
System: Intel S2600WTT
CPU: Genuine Intel CPU 2.30 GHz (ES) (28 CPUs)
RAM: 64 GB DDR4
Network: Intel X540-AT2 10 Gbps copper
System drive: INTEL SSDSC2BB240G7 240 GB
Storage drives: Samsung SSD 850 PRO 512 GB, Samsung SSD 840 PRO 512 GB
Before enabling S2D, I turned off the write cache for each SSD individually and tested write performance by copying a 30 GB VHD file. Results were around 130 - 160 MB/s for the Samsung SSD 840 PRO drives and around 60 - 70 MB/s for the Samsung SSD 850 PRO drives.
After enabling S2D, write performance drops to 40 - 44 MB/s (see attachment)
Dec 14 2018 02:29 PM
Thanks for evaluating S2D Clusters on Server 2019.
This configuration does not meet the fundamental requirements of S2D.
Please go over this blog for more details: https://blogs.technet.microsoft.com/filecab/2016/11/18/dont-do-it-consumer-ssd
Also, please refer to this article on evaluating storage performance:
~Girdhar Beriwal
Dec 18 2018 08:30 AM
Hi,
Your nodes don't comply with the S2D requirements. Additionally, I would not recommend measuring performance with Windows file copying; you'll find the arguments here:
Better to use DiskSpd from Microsoft.
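For example, a pure-write DiskSpd run that bypasses the Windows caches might look like this (the parameters and the test file path are illustrative only; adjust the duration, file size, and target to your environment):

```powershell
# 64 KB sequential writes (-w100 = 100% writes), 60 seconds, 4 threads,
# 8 outstanding I/Os per thread, software and hardware write caching
# disabled (-Sh), against a 10 GB test file created on the target volume
diskspd.exe -b64K -d60 -t4 -o8 -w100 -Sh -c10G X:\test.dat
```

DiskSpd prints per-thread and aggregate throughput, IOPS, and latency, which is far more representative of VM workloads than a single-threaded file copy.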
Regarding the storage solution, you can look at virtual SAN vendors. I have had a good experience using StarWind vSAN for a two-server cluster. The performance is better, and there are no problems with configuration. You can find a guide here:
Jan 16 2019 11:38 AM
Can you please send the output of the following cmdlets:
1. Get-StoragePool
2. Get-PhysicalDisk
~Girdhar
Jan 17 2019 04:25 PM
Hi, Girdhar!
I physically removed from the server the disk that accidentally went into S2D and then got stuck in the Primordial pool. I cleared it on another PC and created a new partition, then put it back into the cluster server and finally had the option to use it without pooling.
But I plan to replace the 512 GB S2D SSD drives with larger ones, so I still need to find out how to correctly remove a disk from the pool.
PS C:\Windows\system32> Get-StoragePool
FriendlyName OperationalStatus HealthStatus IsPrimordial IsReadOnly Size AllocatedSize
------------ ----------------- ------------ ------------ ---------- ---- -------------
Primordial OK Healthy True False 72.9 TB 9.31 TB
S2D on hc-cluster-1 OK Healthy False False 9.31 TB 1.84 TB
Primordial OK Healthy True False 11.53 TB 9.31 TB
PS C:\Windows\system32> Get-PhysicalDisk
DeviceId FriendlyName SerialNumber MediaType CanPool OperationalStatus HealthStatus Usage Size
-------- ------------ ------------ --------- ------- ----------------- ------------ ----- ----
22 ATA INTEL SSDSC2BB24 PHDV7171021B240AGN SSD False OK Healthy Auto-Select 223.57 GB
1004 Samsung SSD 850 PRO 512GB S250NSAG432476E SSD False OK Healthy Auto-Select 476.94 GB
1003 Samsung SSD 840 PRO Series S1AXNSAD800683Y SSD False OK Healthy Auto-Select 476.94 GB
1010 Samsung SSD 850 PRO 1TB S252NWAG304907F SSD False OK Healthy Auto-Select 953.87 GB
2016 Samsung SSD 850 PRO 512GB S250NSAG432479X SSD False OK Healthy Auto-Select 476.94 GB
1009 Samsung SSD 850 PRO 1TB S252NEAG301324Y SSD False OK Healthy Auto-Select 953.87 GB
2015 Samsung SSD 840 PRO Series S1AXNSAF111936H SSD False OK Healthy Auto-Select 476.94 GB
1008 Samsung SSD 850 PRO 1TB S252NWAG304891D SSD False OK Healthy Auto-Select 953.87 GB
2000 ATA Samsung SSD 850 S1SRNWAF913328T SSD False OK Healthy Auto-Select 953.87 GB
1007 Samsung SSD 850 PRO 1TB S252NWAG403194P SSD False OK Healthy Auto-Select 953.87 GB
2019 ATA Samsung SSD 850 S1SRNWAF914370B SSD False OK Healthy Auto-Select 953.87 GB
2020 ATA Samsung SSD 850 S2BBNEAG113774L SSD False OK Healthy Auto-Select 953.87 GB
2021 ATA Samsung SSD 850 S2BBNEAG113775K SSD False OK Healthy Auto-Select 953.87 GB
Jan 23 2019 12:12 PM
Hi Uedgars,
Actually, I was looking for the physical disk and storage pool state from when you got into the bad state, to understand why the System.Storage.PhysicalDisk.AutoPool.Enabled property value was not honored. Let me know if you still face issues with it in the future.
Now that you have fixed it, the better way of removing a disk from the pool is Remove-PhysicalDisk, as you tried earlier, though you have to specify the full FriendlyName of the S2D storage pool rather than S2D*. Once this succeeds, you will see the disk's CanPool value change to True.
Let me know if this doesn't work for you.
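Putting the steps from this thread together, a removal sequence might look like the sketch below (the pool and disk FriendlyNames are taken from the Get-StoragePool and Get-PhysicalDisk output above; verify them on your cluster before running):

```powershell
# Mark the outgoing disk as retired so no new allocations land on it
$disk = Get-PhysicalDisk -SerialNumber "S1AXNSAD800683Y"
Set-PhysicalDisk -InputObject $disk -Usage Retired

# Rebuild the virtual disk extents onto the remaining disks
Get-VirtualDisk | Repair-VirtualDisk

# Remove the disk from the pool, using the full pool FriendlyName
Remove-PhysicalDisk -StoragePoolFriendlyName "S2D on hc-cluster-1" -PhysicalDisk $disk
```

Waiting for the repair job to finish (Get-StorageJob) before the Remove-PhysicalDisk step avoids removing a disk that still holds the only copy of some slabs.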
Also, if you feel the system is not behaving as you expect, please follow this link and share the zip file with us. We try to collect the needed information up front so that there is no back and forth 🙂
Thanks
Girdhar
Jan 23 2019 12:40 PM
Hello!
Remove-PhysicalDisk worked for me even with an asterisk (SSD*), and the SSDs got moved from the S2D pool to the Primordial pool, but the problem is with the next step. I want to get the disks out of the Primordial pool as well, so I can see them in Disk Management and use them as standalone disks in Windows. To do this, I understood I need to use the command Set-ClusterS2DDisk -CanBeClaimed $false, but I got an error. I have provided the command and error message below.
In short: I had 4x 512 GB SSDs (2 in each of my two servers). Then I added 8 more SSD drives of 1 TB each (4 in each server). Then I had the problem that I was unable to use all of this space (the problem I described earlier). As I had no success extending my volume, I decided to remove the 512 GB SSDs from the pool and see what happens. I ran the commands Set-PhysicalDisk -Usage Retired, Repair-VirtualDisk, and Remove-PhysicalDisk. So far everything worked well. Then I finally wanted to get these 512 GB disks out of the Primordial pool using Set-ClusterS2DDisk -CanBeClaimed $false, but it was unsuccessful. I got an error.
! An interesting thing is that after the 512 GB SSD removal, my StorageTier allowed maximum size changed and is now around 2.8 TB. As I already have a 930 GB volume that I want to extend, the tier allows around 3.7 TB in total. This sounds much better, and I believe it is the maximum for 8x 1 TB drives in a mirror. But it is still strange that with 4x 512 GB + 8x 1 TB my tier max size was only around 1.5 TB.
$disk=get-physicaldisk -FriendlyName "*Samsung SSD*" | ? {$_.size -eq "512110190592" -and $_.deviceid -ne 0}
Set-ClusterS2DDisk -CanBeClaimed $false -PhysicalDisk $disk
Set-ClusterS2DDisk : Failed to set cache mode on disks connected to node 'h11'. Run cluster validation, including the Storage Spaces
Direct tests, to verify the configuration
At line:2 char:1
+ Set-ClusterS2DDisk -CanBeClaimed $false -PhysicalDisk $disk
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [Set-ClusterStorageSpacesDirectDisk], CimException
+ FullyQualifiedErrorId : HRESULT 0x8007139f,Microsoft.Management.Infrastructure.CimCmdlets.InvokeCimMethodCommand,Set-ClusterSt
orageSpacesDirectDisk
Physical Disks now looks like this.
PS C:\Windows\system32> get-physicaldisk -FriendlyName "*Samsung SSD*" | ? {$_.size -eq "512110190592" -and $_.deviceid -ne 0}
DeviceId FriendlyName SerialNumber MediaType CanPool OperationalStatus HealthStatus Usage Size
-------- ------------ ------------ --------- ------- ----------------- ------------ ----- ----
1004 Samsung SSD 850 PRO 512GB S250NSAG432476E SSD True OK Healthy Auto-Select 476.94 GB
1003 Samsung SSD 840 PRO Series S1AXNSAD800683Y SSD True OK Healthy Auto-Select 476.94 GB
2016 Samsung SSD 850 PRO 512GB S250NSAG432479X SSD True OK Healthy Auto-Select 476.94 GB
2015 Samsung SSD 840 PRO Series S1AXNSAF111936H SSD True OK Healthy Auto-Select 476.94 GB
PS C:\Windows\system32> get-physicaldisk -FriendlyName "*Samsung SSD*" | ? {$_.size -eq "512110190592" -and $_.deviceid -ne 0} | Get-StoragePool
FriendlyName OperationalStatus HealthStatus IsPrimordial IsReadOnly Size AllocatedSize
------------ ----------------- ------------ ------------ ---------- ---- -------------
Primordial OK Healthy True False 11.53 TB 7.45 TB
Primordial OK Healthy True False 11.53 TB 7.45 TB
Primordial OK Healthy True False 11.53 TB 7.45 TB
Primordial OK Healthy True False 11.53 TB 7.45 TB
Jan 23 2019 12:53 PM
And actually there is one more important question before I start to extend my storage.
At the beginning I had 4x 512 GB SSD drives. I enabled S2D without cache, and it automatically created the pool and tiers. When I started to add drives and faced the problems posted here, I found there is a tier parameter called column count. I figured out that for the default tier template this parameter is set to auto, but for my tiered volume the column count has the value 2. Now that I have 8 SSD drives (4 in each server), it would be better for performance and drive wear equalization to set the column count to 4, so that all 4 drives in each server form one stripe. Is that even possible? I was unable to find detailed specs about S2D operation at this level, and unfortunately some forums say it is not possible to change the column count after a volume is created. Is that true? And if so, are there any technical recommendations for how to choose this value?
Jan 25 2019 02:34 PM
Well, as you figured out, updating the column count after volume creation is not possible.
Re-creating the volume should take the correct column count.
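A possible sketch for re-creating the volume with four columns, assuming the pool name and the mirror tier layout from this thread (the tier and volume names are illustrative; check that your build honors NumberOfColumns before relying on it):

```powershell
# Create a new mirror tier with an explicit column count of 4
New-StorageTier -StoragePoolFriendlyName "S2D on hc-cluster-1" `
    -FriendlyName "MirrorOnSSD-4col" -MediaType SSD `
    -ResiliencySettingName Mirror -NumberOfColumns 4

# Create the replacement volume on top of that tier
New-Volume -StoragePoolFriendlyName "S2D on hc-cluster-1" `
    -FriendlyName "ssd-volume-2" -FileSystem CSVFS_ReFS `
    -StorageTierFriendlyNames "MirrorOnSSD-4col" -StorageTierSizes 1TB
```

Get-VirtualDisk | Select FriendlyName, NumberOfColumns afterwards confirms the column count the volume actually received.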
Jan 25 2019 02:50 PM
Can you try Set-ClusterS2DDisk -CanBeClaimed:$false -PhysicalDisk $disk
Note the colon.
On the error: Set-ClusterS2DDisk : Failed to set cache mode on disks connected to node 'h11'
Have you created cache tiers manually?
Also, after running Set-ClusterS2DDisk, check the Get-Disk output to see the available disks.
Jan 28 2019 01:02 AM
I tried to run the command with the colon, but it still returns the same error.
PS C:\Windows\system32> $disk=get-physicaldisk -FriendlyName "*Samsung SSD*" | ? {$_.size -eq "512110190592" -and $_.deviceid -ne 0}
Set-ClusterS2DDisk -CanBeClaimed:$false -PhysicalDisk $disk
Set-ClusterS2DDisk : Failed to set cache mode on disks connected to node 'h11'. Run cluster validation, including the Storage Spaces Direct tests, to verify the configuration
At line:2 char:1
+ Set-ClusterS2DDisk -CanBeClaimed:$false -PhysicalDisk $disk
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [Set-ClusterStorageSpacesDirectDisk], CimException
+ FullyQualifiedErrorId : HRESULT 0x8007139f,Microsoft.Management.Infrastructure.CimCmdlets.InvokeCimMethodCommand,Set-ClusterStorageSpacesDirectDisk
I enabled S2D with the cache disabled (Enable-ClusterS2D -CacheState Disabled).
I did not even know about cache tiers. How can I check their status? If I use Get-StorageTier, I only see my storage tiers:
PS C:\Windows\system32> get-storagetier
FriendlyName TierClass MediaType ResiliencySettingName FaultDomainRedundancy Size FootprintOnPool StorageEfficiency
------------ --------- --------- --------------------- --------------------- ---- --------------- -----------------
Capacity Unknown SSD Mirror 1 0 B 0 B
MirrorOnSSD Unknown SSD Mirror 1 0 B 0 B
ssd-volume-1-MirrorOnSSD Capacity SSD Mirror 1 930 GB 1.82 TB 50.00%
Oh, I remembered I have turned deduplication on. Might it be disturbing something?
PS C:\Windows\system32> Get-DedupStatus | fl *
ObjectId : \\?\Volume{079e5b9b-7f17-4bea-bbd6-6de7bed066fd}\
Capacity : 998512787456
FreeSpace : 503428747264
InPolicyFilesCount : 13
InPolicyFilesSize : 868141497962
LastGarbageCollectionResult : 0
LastGarbageCollectionResultMessage : The operation completed successfully.
LastGarbageCollectionTime : 1/26/2019 4:39:45 AM
LastOptimizationResult : 0
LastOptimizationResultMessage : The operation completed successfully.
LastOptimizationTime : 1/28/2019 10:41:21 AM
LastScrubbingResult : 0
LastScrubbingResultMessage : The operation completed successfully.
LastScrubbingTime : 1/26/2019 4:40:48 AM
OptimizedFilesCount : 13
OptimizedFilesSavingsRate : 50
OptimizedFilesSize : 868141497962
SavedSpace : 440480093290
SavingsRate : 47
UnoptimizedSize : 935564133482
UsedSpace : 495084040192
Volume : C:\ClusterStorage\ssd-volume-1
VolumeId : \\?\Volume{079e5b9b-7f17-4bea-bbd6-6de7bed066fd}\
PSComputerName :
CimClass : ROOT/Microsoft/Windows/Deduplication:MSFT_DedupVolumeStatus
CimInstanceProperties : {Capacity, FreeSpace, InPolicyFilesCount, InPolicyFilesSize...}
CimSystemProperties : Microsoft.Management.Infrastructure.CimSystemProperties
Jan 28 2019 01:12 AM
So, if I have a column count of 2, it means S2D takes two drives for a stripe per server, and only when these drives are full does it start writing to the remaining pair of drives. Right?
Then what happens if I leave the column count at 2 and create a second tiered volume, also with column count 2? Does S2D understand which drives are less loaded and distribute this volume across the empty ones?
Or, from a performance perspective, is it better to set a column count of 4 for my setup? (I understand that if it is set to 4, I can extend my tiered volume only if I add the appropriate number of drives.)