Accelerate Your End-to-End AI, Data Analytics and HPC Workflows with the NVIDIA NGC Catalog

The NGC catalog is a hub of GPU-optimized AI, high-performance computing (HPC), and data analytics software that simplifies and accelerates end-to-end workflows

GTC 2021 Keynote with NVIDIA CEO Jensen Huang

NVIDIA CEO Jensen announced NVIDIA’s first data center CPU, Grace, named after Grace Hopper, a U.S. Navy rear admiral and computer programming pioneer. Grace is a highly specialized processor targeting largest data intensive HPC and AI applications as the training of next-generation natural-language processing models that have more than one trillion parameters.

Further accelerating the infrastructure upon which hyperscale data centers, workstations, and supercomputers are built, Huang announced the NVIDIA BlueField-3 DPU.

The next-generation data processing unit will deliver the most powerful software-defined networking, storage and cybersecurity acceleration capabilities.

Where BlueField-2 offloaded the equivalent of 30 CPU cores, it would take 300 CPU cores to secure, offload, and accelerate network traffic at 400 Gbps as BlueField-3— a 10x leap in performance, Huang explained.

Performance Required for Deep Learning

There is this question that I wanted to find out about deep learning. What are essential System, Network, Protocol that will speed up the Training and/or Inferencing. There may not be necessary to employ the same level of requirements from Training to Inferencing and Vice Versa. I have received this information during a Nvidia Presentation

Training:

  1. Scalability requires ultra-fast networking
  2. Same hardware needs as HPC
  3. Extreme network bandwidth
  4. RDMA
  5. SHARP (Mellanox Scalable Hierarchical Aggregation and Reduction Protocol)
  6. GPUDirect (https://developer.nvidia.com/gpudirect)
  7. Fast Access Storage

Influencing

  1. Highly Transactional
  2. Ultra-low Latency
  3. Instant Network Response
  4. RDMA
  5. PeerDirect, GPUDirect