September 1, 2020 by kittycool only

Disk performance

Storage Benchmarking

There are 4 things that you may want to consider

I/O Latency
I/O latency is defined simply as the time that it takes to complete a single I/O operation. For a conventional spinning disk, there are 3 sources of latency – seek latency, rotational latency and transfer time.

Command Overhead
Seek Latency is how long it takes for the disk head assembly to travel to the track of the disk where the data will be read or written. The fastest high-end server drives today have a seek time of around 4 ms. The average desktop disk is around 9 ms (taken from Wikipedia).
Rotational Latency is the delay while the rotation of the disk brings the required sector under the read-write head. For a 7,200 rpm disk, the average rotational latency is around 4.17 ms (taken from Wikipedia).
Transfer Time is the time it takes to transmit or move the data from one place to another. Transfer time equals transfer size divided by data rate.

Typical HDD figures (From Wikipedia)
HDD spindle speed [rpm]	Average rotational latency [ms]
4,200	7.14
5,400	5.56
7,200	4.17
10,000	3.00
15,000	2.00
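
The figures in the table are consistent with the usual rule that, on average, the platter has to turn half a revolution before the wanted sector arrives under the head. A minimal sketch of that arithmetic (the function name avg_rotational_latency_ms is only illustrative, not from any library):

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in milliseconds.

    Assumption: on average the disk must turn half a revolution
    before the requested sector reaches the read-write head.
    """
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    ms_per_revolution = 60_000.0 / rpm  # 60,000 ms in one minute
    return ms_per_revolution / 2.0


# Print the same spindle speeds as the table above.
for rpm in (4200, 5400, 7200, 10000, 15000):
    print(f"{rpm:>6} rpm  {avg_rotational_latency_ms(rpm):.2f} ms")
```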

So a simplistic calculation of the time for a single I/O is:

overhead + seek + latency + transfer
0.5ms + 4ms + 4.17ms + 0.8ms = 9.47ms
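
The same sum can be sketched in a few lines of Python, which makes it easy to plug in other drives. The helper names are hypothetical, and the I/Os-per-second estimate is an extra assumption (requests served strictly one at a time, with no queuing or caching) rather than something taken from the figures above:

```python
def transfer_time_ms(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Transfer time = transfer size / data rate, converted to ms."""
    if rate_bytes_per_s <= 0:
        raise ValueError("data rate must be positive")
    return size_bytes / rate_bytes_per_s * 1000.0


def io_service_time_ms(overhead_ms: float, seek_ms: float,
                       rotational_ms: float, transfer_ms: float) -> float:
    """Simplistic time for one I/O: overhead + seek + latency + transfer."""
    return overhead_ms + seek_ms + rotational_ms + transfer_ms


# Values from the calculation above.
total = io_service_time_ms(0.5, 4.0, 4.17, 0.8)
print(f"{total:.2f} ms per I/O")

# Assumption: one request at a time, so the rate is just the inverse.
print(f"about {1000.0 / total:.0f} I/Os per second")
```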

Acceptable I/O

A frequently asked question is what counts as acceptable I/O latency. According to the Kaminario site:
The Avg. Disk sec/Read performance counter indicates the average time, in seconds, of a read of data from the disk. The average value of the Avg. Disk sec/Read performance counter should be under 10 milliseconds. The maximum value of the Avg. Disk sec/Read performance counter should not exceed 50 milliseconds.
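
As a rough sketch, the two thresholds quoted above could be applied to a list of collected counter samples like this. How the samples are collected is out of scope here, and the function and constant names are made up for illustration:

```python
from statistics import mean

AVG_LIMIT_MS = 10.0  # average should be under 10 ms (from the quote above)
MAX_LIMIT_MS = 50.0  # maximum should not exceed 50 ms (from the quote above)


def check_read_latency(samples_sec):
    """Check Avg. Disk sec/Read samples (given in seconds) against the limits."""
    if not samples_sec:
        raise ValueError("need at least one sample")
    samples_ms = [s * 1000.0 for s in samples_sec]  # the counter reports seconds
    avg_ms = mean(samples_ms)
    peak_ms = max(samples_ms)
    return {
        "avg_ms": avg_ms,
        "max_ms": peak_ms,
        "avg_ok": avg_ms < AVG_LIMIT_MS,
        "max_ok": peak_ms <= MAX_LIMIT_MS,
    }


# Hypothetical samples: 4 ms, 6 ms and 8 ms reads.
print(check_read_latency([0.004, 0.006, 0.008]))
```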

June 29, 2020 by kittycool only

Lenovo new 4-Socket Servers SR860-V2 and SR850-V2

This month, Lenovo launched 2 new mission-critical servers based on new 4-socket-capable third-generation Intel Xeon Scalable processors.

ThinkSystem SR860 V2, the new 4U 4-socket server, supports up to 48x 2.5-inch drive bays and up to 8x NVIDIA T4 GPUs or 4x NVIDIA V100S GPUs.

ThinkSystem SR850 V2, the new 2U 4-socket server, supports up to 24x 2.5-inch drive bays, all of which can be NVMe if desired.

References:

What’s New – ThinkSystem SR860 V2 and SR850 V2

June 26, 2020 by kittycool only

BIOS settings for OEM Server with EPYC

Taken from Chapter 4 of https://developer.amd.com/wp-content/resources/56827-1-0.pdf

Selected explanations of settings (see the document for the full explanation).

1. Simultaneous Multi-Threading (SMT) or HyperThreading (HT)

In HPC workloads, SMT is usually turned off.

2. x2APIC

This option helps the operating system deal with interrupts more efficiently in high core count configurations. It is recommended to enable this option. It must be enabled if using more than 255 threads.
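
A small sketch of how the 255-thread rule could be checked from a Linux host. It assumes the CPU advertises the feature as an x2apic entry on the flags line of /proc/cpuinfo. That flag shows CPU support only, not whether the BIOS option is actually switched on. The function names are hypothetical:

```python
def x2apic_required(thread_count: int) -> bool:
    """x2APIC must be enabled when using more than 255 threads."""
    return thread_count > 255


def cpu_supports_x2apic(cpuinfo_text: str) -> bool:
    """True if a 'flags' line in /proc/cpuinfo-style text lists x2apic.

    Assumption: this reflects CPU capability only, not the BIOS setting.
    """
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "x2apic" in value.split():
            return True
    return False
```

On a real node, the text would come from reading /proc/cpuinfo and the thread count from os.cpu_count().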

3. NUMA Per Socket (NPS)

In many HPC applications, ranks and memory can be pinned to cores and NUMA nodes. The recommended value is NPS4. However, if the workload is not NUMA-aware or suffers when NUMA complexity increases, we can experiment with NPS1.
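
After changing the NPS setting, one way to sanity-check what the operating system sees is to count the NUMA nodes Linux exposes and divide by the number of sockets. This is only a sketch: it assumes the standard sysfs layout under /sys/devices/system/node, the socket count has to be supplied by the caller, and the function names are illustrative:

```python
import re
from pathlib import Path


def count_numa_nodes(node_root: str = "/sys/devices/system/node") -> int:
    """Count nodeN directories under the sysfs NUMA node directory."""
    root = Path(node_root)
    if not root.is_dir():
        return 0  # not Linux, or sysfs is not mounted
    return sum(
        1 for p in root.iterdir()
        if p.is_dir() and re.fullmatch(r"node\d+", p.name)
    )


def nodes_per_socket(numa_nodes: int, sockets: int) -> int:
    """NUMA nodes per socket, e.g. 8 nodes on 2 sockets gives 4 (NPS4)."""
    if sockets <= 0 or numa_nodes % sockets != 0:
        raise ValueError("node count must be a multiple of the socket count")
    return numa_nodes // sockets
```

For example, a two-socket server set to NPS4 would be expected to show 8 nodes, so nodes_per_socket(count_numa_nodes(), 2) should return 4.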

4. Memory Frequency, Infinity Fabric Frequency, and coupled vs uncoupled mode

The memory clock and the Infinity Fabric clock can run at synchronous frequencies (coupled mode) or at asynchronous frequencies (uncoupled mode).

If the memory is clocked at lower than 2933 MT/s, the memory and fabric will run in coupled mode, which has the lowest memory latency.
If the memory is clocked at 3200 MT/s, the memory and fabric clocks will run in asynchronous mode, which has higher bandwidth but increased memory latency.
Make sure APBDIS is set to 1 and the fixed SOC Pstate is set to P0.

5. Preferred IO

Preferred IO allows one PCIe device in the system to be configured in a preferred mode. This device gets preferential treatment on the Infinity Fabric.

6. Determinism Slider

It is recommended to choose the Power option. In this mode, each CPU in the system performs at the maximum capability of its silicon. Because of natural variation in the manufacturing process, performance may vary from one CPU to another, but it will never fall below that of “Performance Determinism mode”.

The Linux Cluster

Linux Cluster Blog is a collection of how-to and tutorials for Linux Cluster and Enterprise Linux
