Written by Jason Yi, PM on the Azure Edge & Platform team at Microsoft.
Acknowledgements: Dan Lovinger
Imagine this: you have an Azure Stack HCI cluster set up and ready to go, but one question lingers: what is your cluster’s storage performance potential? In such cases, you can rely on micro-benchmarking tools such as DiskSpd. In case you are not aware, the tool lets you customize and configure your own synthetic workloads by tweaking built-in parameters. For more information, you can read about it here.
Most folks who already have experience with DiskSpd are likely familiar with the txt output option, which is also displayed in the terminal. The purpose behind this output is to present the data in a human-readable format. We also aggregated some of the finer details to generate practical metrics, which means that we decided which metrics would be considered valuable. But did you know that there is an option to output in XML, which reveals additional, granular data such as the total IOs achieved per second?
Let’s first take a few moments to review the txt output. As you may know, this output is split into four different sections:
Input settings:
CPU utilization details:
Total IO performance metrics:
Latency percentile analysis (-L parameter):
This output gives a detailed view of a couple of performance metrics. That’s great, but what if you are interested in other data insights? If you did not read carefully through the DiskSpd wiki page, you may have missed the fact that there is a “hidden feature”: another output format that generates XML. It is invoked with the -Rxml parameter, and the result can be redirected into an XML file with your preferred file name. But wait, there’s more! If you peep into the XML file, you will notice that there is more data than what was shown in the txt output, such as the total IOs achieved per second. More specifically, the XML output reveals the granular data behind the aggregated figures that the txt output presents for human eyes. If you wish to take a look, be warned: your eyes will burn from the squinting.
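As a rough illustration of the invocation described above, here is a minimal sketch of a DiskSpd run that writes its results to an XML file. The executable location, the target file, and every workload value (thread count, queue depth, durations, file size) are placeholder assumptions, not recommendations. The -D switch is included because, as far as I am aware, the per-interval IOPS data is only captured when it is present.

```powershell
# Placeholder values throughout: adjust the paths, sizes, and durations to your environment.
$diskspd = ".\diskspd.exe"     # assumed location of the DiskSpd executable
$target  = ".\testfile.dat"    # assumed test file (created by -c1G if it does not exist)

$diskspdArgs = @(
    "-t4",     # 4 threads per target
    "-o32",    # 32 outstanding IOs per thread (queue depth)
    "-b4K",    # 4 KiB block size
    "-r",      # random IO
    "-w0",     # 0% writes, i.e. 100% reads
    "-W10",    # 10 second warm up
    "-d60",    # 60 second test duration
    "-Sh",     # disable software caching and hardware write caching
    "-D",      # capture IOPS statistics per interval (1000 ms by default)
    "-L",      # capture latency statistics
    "-c1G",    # create a 1 GiB test file
    "-Rxml",   # emit the results as XML instead of text
    $target
)

if (Test-Path $diskspd) {
    # Send standard output into the XML file of your choice
    & $diskspd @diskspdArgs | Out-File -FilePath ".\output.xml" -Encoding utf8
} else {
    Write-Warning "diskspd.exe not found at $diskspd; nothing was run."
}
```

Keeping the parameters in an array makes it easy to comment on each one and to swap values between runs.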
Before your eyes burn, let’s create a brief table of contents for the XML file.
<System> Under this element, you have some basic information regarding the system itself, such as the server/VM name, DiskSpd version, number of processors, etc.
<Profile> Under this element, you will find your input parameters from when you ran DiskSpd. To name a few, this includes the queue depth, thread count, warm up time, test duration, etc. There are quite a few sub-elements within this section. Luckily, most of them are self-explanatory, and so let us focus on a few of them.
<TimeSpan> This element is not to be confused with the <TimeSpan> element nested under <Profile>. This section contains the results of your DiskSpd test. It is similar to the data presented in the txt file, but with added granular data. More specifically, you can view the CPU usage, IOPS statistics, and latency statistics (average total milliseconds, standard deviation, etc.) in their respective sub-elements.
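To make this table of contents a little more concrete, the sketch below loads a tiny, made-up stand-in for a DiskSpd XML result and walks it from PowerShell. Only the element and attribute names that the script at the end of this post relies on (Results, System, Profile, TimeSpan, Iops, Bucket, SampleMillisecond, Total) are assumed; a real file contains far more, and the numbers here are invented.

```powershell
# A trimmed, invented stand-in for DiskSpd's XML output (not real measurements).
[xml]$doc = @"
<Results>
  <System />
  <Profile />
  <TimeSpan>
    <Iops>
      <Bucket SampleMillisecond="1000" Total="1200" />
      <Bucket SampleMillisecond="2000" Total="1350" />
      <Bucket SampleMillisecond="3000" Total="1280" />
    </Iops>
  </TimeSpan>
</Results>
"@

# Top-level sections under the root element
$sections = $doc.DocumentElement.ChildNodes | ForEach-Object { $_.Name }
$sections

# Per-interval IOPS buckets, summarized with Measure-Object
$totals = $doc.SelectNodes("/Results/TimeSpan/Iops/Bucket") |
    ForEach-Object { [int]$_.Total }
$stats = $totals | Measure-Object -Average -Minimum -Maximum
$stats
```

The same two calls (DocumentElement.ChildNodes and SelectNodes with an XPath) should work on a real output file loaded with Get-Content, which is a gentler way to explore it than squinting at the raw text.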
This may give rise to the question: can you modify the contents of this XML file and feed it back into DiskSpd? Yes, you absolutely can! In fact, there is another parameter precisely for this purpose (-X), which is great for batch testing!
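One possible way to prepare a file for -X is sketched below. It rests on an assumption worth verifying against the DiskSpd wiki: that -X expects the <Profile> block on its own rather than the full results file. Export-DiskSpdProfile is a hypothetical helper name introduced for this example; it is not part of DiskSpd.

```powershell
# Hypothetical helper: copies the <Profile> block of a DiskSpd XML result into
# its own file so that it can be edited and replayed with -X.
function Export-DiskSpdProfile {
    param(
        [Parameter(Mandatory)] [string]$ResultPath,
        [Parameter(Mandatory)] [string]$ProfilePath
    )

    $results = New-Object System.Xml.XmlDocument
    $results.Load((Resolve-Path $ResultPath).Path)

    $profileNode = $results.SelectSingleNode("/Results/Profile")
    if ($null -eq $profileNode) { throw "No <Profile> element found in $ResultPath" }

    # Build a new document whose root element is a deep copy of <Profile>
    $profileDoc = New-Object System.Xml.XmlDocument
    [void]$profileDoc.AppendChild($profileDoc.ImportNode($profileNode, $true))

    # Resolve against PowerShell's location, since .NET may use a different working directory
    $fullPath = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($ProfilePath)
    $profileDoc.Save($fullPath)
    return $fullPath
}

# Usage (assumes output.xml exists in the current directory):
#   Export-DiskSpdProfile -ResultPath .\output.xml -ProfilePath .\profile.xml
#   ...edit profile.xml, for example the test duration or queue depth...
#   .\diskspd.exe -Xprofile.xml
```

For batch testing, you could generate several edited copies of profile.xml and loop over them, calling DiskSpd once per file.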
In case you wanted to start somewhere, I’ve included a short script that takes in a DiskSpd XML output named “output.xml” and extracts the total IOs achieved per second into a neat CSV file for you to view (run the script from the directory that contains the XML file). This might be a good place to start if you want to get more data insights about IOPS. **Foreshadowing**
Hopefully, this provides a solution for those situations where you always wanted a more detailed form of data or to run DiskSpd batch tests. You can also imagine that there are a variety of ways you can manipulate the XML output through PowerShell scripts. Alas, this is for another day.
# Written by Jason Yi, PM
# This script takes the output XML file from DISKSPD, extracts the IOPS and time (seconds), and neatly organizes them into a CSV file.
# Run this script from the directory that contains output.xml; the CSV file is written to the same directory.

# create path and input file variables
$path = (Get-Location).Path
$file = [xml] (Get-Content "$path\output.xml" -Raw)

# select every per-interval IOPS bucket (an XmlNodeList, even when there is only one bucket)
$nodelist = $file.SelectNodes("/Results/TimeSpan/Iops/Bucket")

# select the values you want in the csv file, converting milliseconds to seconds along the way
$nodelist |
    Select-Object @{n='Time (s)';e={[double]$_.SampleMillisecond / 1000}},
                  @{n='Total IOPS';e={[int]$_.Total}} |
    Export-Csv "$path\time_v_iops.csv" -NoTypeInformation -Encoding UTF8 -Force # Encoding has to be UTF8 or the data ends up in one column (UCS-2)