Taken from Chapter 4 of https://developer.amd.com/wp-content/resources/56827-1-0.pdf
Selected Explanation of Settings (see the document for the FULL explanation)
1. Simultaneous Multi-Threading (SMT) or Hyper-Threading (HT)
- In HPC workloads, SMT is usually turned off.
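One way to confirm the SMT setting from a running Linux system is to read the kernel's sysfs files. The sketch below assumes a Linux kernel recent enough to expose `/sys/devices/system/cpu/smt/active` and `/sys/devices/system/cpu/smt/control`; the function names are hypothetical helpers, not part of the AMD guide.

```python
from pathlib import Path

# Assumed location of the kernel's SMT status files (Linux only).
SMT_DIR = Path("/sys/devices/system/cpu/smt")


def smt_is_active(active_text: str) -> bool:
    """Interpret the contents of the 'active' file: '1' means SMT is in use."""
    return active_text.strip() == "1"


def read_smt_status(smt_dir: Path = SMT_DIR):
    """Return (active, control) from sysfs, or None if the files are absent."""
    active_file = smt_dir / "active"
    control_file = smt_dir / "control"
    if not (active_file.is_file() and control_file.is_file()):
        return None
    active = smt_is_active(active_file.read_text())
    control = control_file.read_text().strip()  # e.g. "on", "off", "notsupported"
    return active, control
```

Calling `read_smt_status()` on a node where SMT was disabled in the BIOS would be expected to report SMT as inactive; on systems without these files it returns `None` rather than guessing.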
2. x2APIC
- This option helps the operating system handle interrupts more efficiently in high-core-count configurations. Enabling it is recommended, and it must be enabled when using more than 255 threads.
3. NUMA Per Socket (NPS)
- In many HPC applications, ranks and memory can be pinned to cores and NUMA nodes. The recommended setting is NPS4. However, if the workload is not NUMA-aware or suffers as NUMA complexity increases, experiment with NPS1.
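To check which NPS value a Linux node is actually running with, one can compare the number of NUMA nodes the kernel exposes with the number of sockets. This is a minimal sketch, assuming the usual Linux layout (`/sys/devices/system/node/node<N>` directories and `physical id` lines in `/proc/cpuinfo`) and assuming no other firmware option adds NUMA nodes; the helper names are hypothetical.

```python
import re
from pathlib import Path


def count_numa_nodes(entries) -> int:
    """Count names of the form 'node<N>', as listed in /sys/devices/system/node."""
    return sum(1 for name in entries if re.fullmatch(r"node\d+", name))


def count_sockets(cpuinfo_text: str) -> int:
    """Count distinct 'physical id' values in the text of /proc/cpuinfo."""
    ids = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("physical id"):
            ids.add(line.partition(":")[2].strip())
    return len(ids)


def nodes_per_socket(numa_nodes: int, sockets: int) -> float:
    """NUMA nodes divided by sockets; 4.0 would be consistent with NPS4."""
    if sockets <= 0:
        raise ValueError("socket count must be positive")
    return numa_nodes / sockets


def detect_nps():
    """Read the live system; returns None when the Linux files are missing."""
    node_dir = Path("/sys/devices/system/node")
    cpuinfo = Path("/proc/cpuinfo")
    if not (node_dir.is_dir() and cpuinfo.is_file()):
        return None
    nodes = count_numa_nodes(p.name for p in node_dir.iterdir())
    sockets = count_sockets(cpuinfo.read_text())
    return nodes_per_socket(nodes, sockets) if sockets else None
```

On a two-socket system set to NPS4, this would be expected to see eight nodes and two sockets, giving 4.0.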
4. Memory Frequency, Infinity Fabric Frequency, and coupled vs uncoupled mode
The memory clock and the Infinity Fabric clock can run at synchronous frequencies (coupled mode) or at asynchronous frequencies (uncoupled mode).
- If the memory is clocked at lower than 2933 MT/s, the memory and fabric will run in coupled mode, which has the lowest memory latency.
- If the memory is clocked at 3200 MT/s, the memory and fabric clocks will run in asynchronous (uncoupled) mode, which has higher bandwidth but increased memory latency.
- Make sure APBDIS is set to 1 and the fixed SOC P-state is set to P0.
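The two memory-speed rules above can be written down as a small lookup. This is only a sketch of the notes as stated here: the function name is hypothetical, and any speed the notes do not mention is reported as not covered rather than guessed.

```python
def fabric_mode(memory_mts: int) -> str:
    """Map a memory speed in MT/s to the mode described in the notes above."""
    if memory_mts < 2933:
        # Below 2933 MT/s: memory and fabric run synchronously (lowest latency).
        return "coupled"
    if memory_mts == 3200:
        # At 3200 MT/s: asynchronous clocks (more bandwidth, higher latency).
        return "uncoupled"
    # Anything else is not described in these notes; consult the full document.
    return "not covered by these notes"


for speed in (2666, 3200):
    print(speed, "MT/s ->", fabric_mode(speed))
```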
5. Preferred IO
Preferred IO allows one PCIe device in the system to be configured in a preferred mode. This device gets preferential treatment on the Infinity Fabric.
6. Determinism Slider
- It is recommended to choose the Power option. In this mode, each CPU in the system performs at the maximum capability of its silicon. Because of natural variation in the manufacturing process, performance may vary from one CPU to another, but it will never fall below that of “Performance Determinism mode”.