IBM Spectrum Scale Container Native Storage Access (CNSA) allows the deployment of Spectrum Scale in a Red Hat OpenShift cluster. Using a remotely mounted file system, CNSA provides a persistent data store that applications access through the IBM Spectrum Scale Container Storage Interface (CSI) driver using Persistent Volumes (PVs).
Nvidia and IBM ran a complex proof of concept to demonstrate scaling AI workloads with Nvidia DGX, Red Hat OpenShift and IBM Spectrum Scale, using ResNet-50 and image segmentation on the Audi A2D2 dataset as examples. The project team published an IBM Redpaper with all the technical details and will present the key learnings and results.
What is the IO-500 10 Node Challenge?
The IO-500 10 Node Challenge is a ranked list comparing storage systems that work in tandem with the world’s largest supercomputers. By limiting the benchmark to 10 nodes, the test challenges single client performance from the storage system. Each system is evaluated using the IO-500 benchmark, which measures storage performance using read/write bandwidth for large files and read/write/listing performance for small files… (from InsideHPC)
For more information, see WekaIO Beats Big Systems on the IO-500 10 Node Challenge
- NVMe storage via RDMA (E8, Excelero)
Lowest-Latency Distributed Block Storage for IBM Spectrum Scale
Excelero NVMesh, Lowest-Latency Distributed Block Storage for IBM Spectrum Scale
- Community server + Spectrum Scale Erasure coding
IBM Spectrum LSF and IBM Spectrum Scale User Group Erasure Code Edition
- IBM ESS NVMe edition (to be released this Q4)
- Existing IBM ESS
Accelerate with IBM Storage: Building and Deploying Elastic Storage Server (ESS)
- SCA19 – 01 – Spectrum Scale Use Cases (Ulf Troppens, IBM)
- SCA19 – 02 – Optimize your data pipeline for analytics and AI (Par Hettinga, IBM)
- SCA19 – 03 – Data Management for autonomous driving development (Frank Kraemer, IBM)
- SCA19 – 04 – What is new in Spectrum Scale (Wei Gong, IBM)
- SCA19 – 05 – What is new in ESS (Chris Maestas, IBM)
- SCA19 – 06 – What is new in Support (Ravikumar Ramaswamy, IBM)
- SCA19 – 07 – Spectrum Scale on AWS Marketplace (Smita Raut, IBM)
- SCA19 – 09 – Genomics Deployments Enabling Precision Medicine with IBM Spectrum Scale (Sandeep Patil, IBM)
- SCA19 – 10 – Running Spark Hadoop workload on Spectrum Scale (Wei Gong, IBM)
- SCA19 – 11 – Lenovo – HPC Storage Solutions Update (Michael Hennecke, Lenovo)
- SCA19 – 12 – Spectrum Scale and containers (Smita Raut, IBM)
- SCA19 – 13 – Tiering cold data to IBM Spectrum Archive (Khanh Ngo, IBM)
- EuXFEL – online & offline data processing and storage (Martin Gasthuber, Stefan Dietrich, Janusz Malka – DESY/IT; Krzysztof Wrona, Janusz Szuba – EuXFEL; CHEP16 – San Francisco)
- How NFS exports became more dynamic with Spectrum Scale 5.0.2
- HPC storage on AWS (IBM Spectrum Scale)
- Upgrade with Excluding the node(s) using Install-toolkit
- Offline upgrade using Install-toolkit
- IBM Spectrum Scale for Linux on IBM Z – What’s new in IBM Spectrum Scale 5.0.2
- What’s New in IBM Spectrum Scale 5.0.2
- Starting with the IBM Spectrum Scale 5.0.2 release, the installation toolkit supports rerunning an upgrade if a fresh upgrade fails.
- IBM Spectrum Scale installation toolkit enhancements over releases 18.104.22.168
- Announcing HDP 3.0 support with IBM Spectrum Scale
- IBM Spectrum Scale Tuning Overview for Hadoop Workload
- Making the Most of Multicloud Storage
- Disaster Recovery for Transparent Cloud Tiering using SOBAR
- Your Optimal Choice of AI Storage for Today and Tomorrow
- Analyze IBM Spectrum Scale File Access Audit with ELK Stack
- Mellanox SX1710 40G switch MLAG configuration for IBM ESS
- Protocol Problem Determination Guide for IBM Spectrum Scale SMB and NFS Access issues
- Access Control in IBM Spectrum Scale Object
- IBM Spectrum Scale HDFS Transparency Docker support
- Protocol Problem Determination Guide for IBM Spectrum Scale Log Collection
mmlspdisk lists information for one or more GPFS Native RAID pdisks. To check for faulty disks, run the following commands:
# mmlspdisk all --not-ok
mmlspdisk: [I] No disks were found.
# mmlspdisk all --replace
mmlspdisk: [I] No disks were found.
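The two checks above can be wrapped in a small script for routine monitoring. The following is a minimal sketch, not an official tool: the helper names `run_check` and `pdisks_reported` are hypothetical, it assumes `mmlspdisk` is on the PATH of the node where it runs, and it assumes that any output other than the "No disks were found." message shown above means at least one pdisk was listed.

```python
import subprocess

# Message printed by mmlspdisk when no pdisk matches the filter (taken from the output above).
NO_DISKS_MARKER = "No disks were found."


def pdisks_reported(output: str) -> bool:
    """Return True if the command output appears to list at least one pdisk.

    Assumption: empty output, or output containing the marker message,
    means that no pdisk matched the filter.
    """
    text = output.strip()
    return bool(text) and NO_DISKS_MARKER not in text


def run_check(flag: str) -> str:
    """Run `mmlspdisk all <flag>` and return stdout and stderr combined.

    Only the two flags used above (--not-ok, --replace) are intended here.
    """
    result = subprocess.run(
        ["mmlspdisk", "all", flag],
        capture_output=True,
        text=True,
        check=False,  # inspect the output ourselves instead of raising on non-zero exit
    )
    return result.stdout + result.stderr
```

On a node with GPFS Native RAID installed, `pdisks_reported(run_check("--not-ok"))` would then be True only when some pdisk is not in an OK state, and the same call with `"--replace"` would flag pdisks marked for replacement.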
We encountered a situation where a defunct disk kept accepting IO requests but did not return a failure in time. As a result, these IO requests hung until they timed out (default 10 seconds). Typically, when Spectrum Scale/GPFS fails to read or write a disk, the failure is written to the log and IO shifts to other available disks, which should be quick.
Normally such operations should return in 20 milliseconds or less. When an IO request times out, it costs 10 seconds / 20 milliseconds = 500 times the normal duration. Even if Spectrum Scale/GPFS is able to choose a fast disk on the second attempt, the request is much slower than normal.
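To make the arithmetic concrete, here is a small sketch using the two figures from the text (10 second timeout, 20 millisecond normal IO). The function names and the simple model are assumptions for illustration: a timed-out request is modeled as waiting for the full timeout and then succeeding on a retry at normal speed.

```python
TIMEOUT_MS = 10_000  # default IO timeout from the text: 10 seconds
NORMAL_MS = 20       # typical IO completion time from the text: 20 milliseconds


def slowdown_factor(timeout_ms: int = TIMEOUT_MS, normal_ms: int = NORMAL_MS) -> float:
    """How many normal IOs fit into the time lost to one timeout."""
    return timeout_ms / normal_ms


def mean_latency_ms(timeout_fraction: float,
                    timeout_ms: int = TIMEOUT_MS,
                    normal_ms: int = NORMAL_MS) -> float:
    """Average latency when a fraction of requests first hit the timeout.

    Model (an assumption): a timed-out request waits timeout_ms and is then
    retried on a healthy disk, taking another normal_ms.
    """
    return normal_ms + timeout_fraction * timeout_ms


print(slowdown_factor())      # ratio of timeout to normal IO time
print(mean_latency_ms(0.01))  # mean latency if 1% of requests time out
```

Under this model, even 1% of requests hitting the timeout raises the mean latency from 20 ms to roughly 120 ms, which is why a single defunct disk that fails slowly is so costly.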
Because of striping, a bad or slow disk affects IO on many files, far more than it would without striping. IO on a single file involves several disks, and the IO has to wait for the slowest request to return. So a single bad or slow disk may have considerable influence on Spectrum Scale/GPFS performance.
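The effect of striping can be sketched with a simplified model. This is an illustration under stated assumptions, not a description of how Spectrum Scale/GPFS actually places data: each IO is assumed to touch a uniformly random set of distinct disks, the IO latency is the maximum of the per-disk latencies, and all names and numbers below are hypothetical.

```python
from math import comb


def striped_io_latency_ms(per_disk_latencies_ms):
    """A striped IO completes only when its slowest disk request returns."""
    return max(per_disk_latencies_ms)


def fraction_of_ios_hitting_bad_disk(stripe_width: int,
                                     total_disks: int,
                                     bad_disks: int = 1) -> float:
    """Probability that an IO over `stripe_width` random distinct disks
    touches at least one bad disk: 1 - C(N - b, k) / C(N, k)."""
    healthy_only = comb(total_disks - bad_disks, stripe_width)
    return 1 - healthy_only / comb(total_disks, stripe_width)


# Seven healthy disks at 20 ms and one stalled disk at 10 s: the IO takes 10 s.
print(striped_io_latency_ms([20] * 7 + [10_000]))

# Hypothetical pool of 100 disks with one bad disk.
print(fraction_of_ios_hitting_bad_disk(stripe_width=1, total_disks=100))  # no striping
print(fraction_of_ios_hitting_bad_disk(stripe_width=8, total_disks=100))  # 8-wide stripe
```

In this model, one bad disk out of 100 touches about 1% of IOs without striping but about 8% with an 8-wide stripe, and every affected IO runs at the speed of the bad disk.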