Disk performance

Storage Benchmarking

There are 4 things that you may want to consider

I/O Latency
I/O latency is simply the time it takes to complete a single I/O operation. For a conventional spinning disk, there are 4 sources of latency: command overhead, seek latency, rotational latency and transfer time.

  1. Command Overhead
  2. Seek Latency is how long it takes for the disk head assembly to travel to the track of the disk where the data will be read or written. The fastest high-end server drives today have a seek time of around 4 ms. The average desktop disk is around 9 ms (taken from Wikipedia).
  3. Rotational Latency is the delay while the rotation of the disk brings the required sector under the read-write head. For a 7,200 rpm disk, the average latency is around 4.17 ms (taken from Wikipedia).
  4. Transfer Time is the time it takes to transmit or move data from one place to another. Transfer time equals transfer size divided by data rate.

Typical HDD figures (from Wikipedia)

HDD spindle speed [rpm]    Average rotational latency [ms]
 4,200                     7.14
 5,400                     5.56
 7,200                     4.17
10,000                     3.00
15,000                     2.00
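
The latency figures for each spindle speed follow from simple arithmetic: on average, the platter has to make half a revolution before the wanted sector arrives under the head. A minimal Python sketch of that calculation (the function name is illustrative, not from any library):

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency: the time for half a revolution, in ms."""
    ms_per_revolution = 60_000.0 / rpm  # there are 60,000 ms in a minute
    return ms_per_revolution / 2.0


# Recompute the latency for each spindle speed listed above.
for rpm in (4200, 5400, 7200, 10000, 15000):
    print(f"{rpm:>6} rpm  {avg_rotational_latency_ms(rpm):.2f} ms")
```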

So the simplistic calculation for a single I/O is

overhead + seek + rotational latency + transfer
0.5 ms   + 4 ms + 4.17 ms            + 0.8 ms   = 9.47 ms
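
The same sum can be written as a short Python sketch. The 0.5 ms overhead, 4 ms seek and 4.17 ms rotational latency are the figures used above. The 128 KB transfer size and 160 MB/s data rate are assumptions, chosen only because they give the 0.8 ms transfer time. The function names are illustrative.

```python
def transfer_time_ms(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Transfer time = transfer size / data rate, converted to ms."""
    return size_bytes / rate_bytes_per_s * 1000.0


def io_latency_ms(overhead_ms, seek_ms, rotational_ms, transfer_ms):
    """Simplistic latency of one I/O: the four components added together."""
    return overhead_ms + seek_ms + rotational_ms + transfer_ms


# Assumed example: 128 KB transferred at 160 MB/s (decimal units).
transfer = transfer_time_ms(128_000, 160_000_000)
total = io_latency_ms(0.5, 4.0, 4.17, transfer)
print(f"transfer = {transfer:.2f} ms, total = {total:.2f} ms")
```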

Acceptable I/O

A frequently asked question is what counts as acceptable I/O latency. The Kaminario site states:
The Avg. Disk sec/Read performance counter indicates the average time, in seconds, of a read of data from the disk. The average value of the Avg. Disk sec/Read performance counter should be under 10 milliseconds. The maximum value of the Avg. Disk sec/Read performance counter should not exceed 50 milliseconds.
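
As a rough illustration, the two thresholds quoted above can be applied to a list of latency samples. This is only a sketch: the function name and the sample values are made up, and the samples are assumed to be in seconds, as the counter reports them.

```python
def read_latency_acceptable(samples_s, avg_limit_ms=10.0, max_limit_ms=50.0):
    """Check Avg. Disk sec/Read samples (in seconds) against the quoted limits."""
    if not samples_s:
        raise ValueError("no samples")
    samples_ms = [s * 1000.0 for s in samples_s]
    average_ms = sum(samples_ms) / len(samples_ms)
    # Average must stay under 10 ms and no sample may exceed 50 ms.
    return average_ms < avg_limit_ms and max(samples_ms) <= max_limit_ms


# Made-up samples: the average is about 6 ms and the worst case about 12 ms.
print(read_latency_acceptable([0.004, 0.005, 0.003, 0.012]))
```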



References

  1. What Is an Acceptable I/O Latency?
  2. Disk Performance
  3. Difference between Seek Time and Rotational Latency in Disk Scheduling


Lenovo's new 4-Socket Servers SR860 V2 and SR850 V2

This month, Lenovo launched 2 new mission-critical servers based on new 4-socket-capable third-generation Intel Xeon Scalable processors.

  • ThinkSystem SR860 V2, the new 4U 4-socket server, supporting up to 48x 2.5-inch drive bays and up to 8x NVIDIA T4 GPUs or 4x NVIDIA V100S GPUs.

  • ThinkSystem SR850 V2, the new 2U 4-socket server, supporting up to 24x 2.5-inch drive bays, all of which can be NVMe if desired.



BIOS settings for OEM Servers with EPYC

Taken from Chapter 4 of https://developer.amd.com/wp-content/resources/56827-1-0.pdf


Selected explanations of the settings (see the document for the full explanations)

1. Simultaneous Multi-Threading (SMT) or HyperThreading (HT)

  • In HPC workloads, SMT is usually turned off.

2. x2APIC

  • This option helps the operating system deal with interrupts more efficiently in high core count configurations. It is recommended to enable this option. It must be enabled when using more than 255 threads.

3. NUMA Per Socket (NPS)

  • In many HPC applications, ranks and memory can be pinned to cores and NUMA nodes. The recommended setting is NPS4. However, if the workload is not NUMA-aware or suffers when the NUMA complexity increases, we can experiment with NPS1.
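
To illustrate what pinning to a NUMA node can look like on Linux, here is a minimal Python sketch. It assumes the kernel exposes each node's CPUs in /sys/devices/system/node/nodeN/cpulist and it uses os.sched_setaffinity, which is only available on Linux. The helper names are illustrative. In practice, MPI launchers and tools such as numactl normally handle this, including memory placement, which this sketch does not cover.

```python
import os


def parse_cpulist(text: str) -> set:
    """Parse a kernel cpulist string such as '0-3,8-11' into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def pin_to_numa_node(node: int) -> set:
    """Restrict the current process to the CPUs of one NUMA node (Linux only)."""
    path = f"/sys/devices/system/node/node{node}/cpulist"
    with open(path) as f:
        cpus = parse_cpulist(f.read())
    os.sched_setaffinity(0, cpus)  # pid 0 means the calling process
    return cpus
```

With NPS4 on a dual-socket system there would be eight such nodes to choose from, while NPS1 would expose two.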

4. Memory Frequency, Infinity Fabric Frequency, and coupled vs uncoupled mode

Memory Clock and Infinity Fabric Clock can run at synchronous frequencies (coupled mode) or at asynchronous frequencies (uncoupled mode).

  • If the memory is clocked at 2933 MT/s or lower, the memory and fabric will run in coupled mode, which has the lowest memory latency.
  • If the memory is clocked at 3200 MT/s, the memory and fabric clock will run in asynchronous (uncoupled) mode, which has higher bandwidth but increased memory latency.
  • Make sure APBDIS is set to 1 and fixed SOC Pstate is set to P0.
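
The rule of thumb above can be captured in a few lines of Python. This is a sketch, not AMD's logic: the function name is invented, and it assumes that 2933 MT/s itself still runs coupled and that any higher speed, such as 3200 MT/s, runs uncoupled.

```python
def fabric_mode(memory_mts: int) -> str:
    """Return the expected memory/Infinity Fabric clock mode for a DIMM speed."""
    # Assumption: speeds up to and including 2933 MT/s run coupled.
    if memory_mts <= 2933:
        return "coupled (lowest memory latency)"
    return "uncoupled (higher bandwidth, higher memory latency)"


for speed in (2666, 2933, 3200):
    print(speed, "MT/s ->", fabric_mode(speed))
```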

5. Preferred IO

Preferred IO allows one PCIe device in the system to be configured in a preferred mode. This device gets preferential treatment on the Infinity Fabric.

6. Determinism Slider

  • Recommended to choose the Power option. In this mode, each CPU in the system performs at the maximum capability of its silicon. Because of natural variation in the manufacturing process, performance may vary from one CPU to another, but it will never fall below that of “Performance Determinism mode”.