Performance Required for Deep Learning


There is this question that I wanted to find out about deep learning. What are essential System, Network, Protocol that will speed up the Training and/or Inferencing. There may not be necessary to employ the same level of requirements from Training to Inferencing and Vice Versa. I have received this information during a Nvidia Presentation

Training:

  1. Scalability requires ultra-fast networking
  2. Same hardware needs as HPC
  3. Extreme network bandwidth
  4. RDMA
  5. SHARP (Mellanox Scalable Hierarchical Aggregation and Reduction Protocol)
  6. GPUDirect (https://developer.nvidia.com/gpudirect)
  7. Fast Access Storage

Influencing

  1. Highly Transactional
  2. Ultra-low Latency
  3. Instant Network Response
  4. RDMA
  5. PeerDirect, GPUDirect

 

 

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.