Nvidia DGX Data Centre Reference Design

This white paper from Nvidia has useful information for easy deployment of DGX servers for deep learning.

  1. Nvidia DGX POD Reference Design Whitepaper (pdf)

Checking and Modifying Timestamps of a Whole Directory Recursively

Step 1: Show the full date and time (including the year) for each file in the current directory

$ ls -l --full-time
-rwxrwxr-x  1 root root  1109 2018-07-20 12:52:52.587945000 +0800 Allwmake
drwxrwxr-x  5 root root  4096 2018-07-20 12:52:52.602945000 +0800 applications
drwxrwxr-x  3 root root  8192 2018-07-20 12:53:19.536973000 +0800 bin
-rw-rw-r--  1 root root 35646 2018-07-20 12:52:52.592945000 +0800 COPYING
drwxrwxr-x  5 root root  4096 2018-07-20 12:53:19.936974000 +0800 doc
drwxrwxr-x  8 root root  4096 2018-07-20 12:53:20.039974000 +0800 etc
drwxr-xr-x  4 root root  4096 2018-07-20 12:55:17.230101000 +0800 platforms
-rw-rw-r--  1 root root  1620 2018-07-20 12:52:52.597945000 +0800 README.org
drwxrwxr-x 38 root root  4096 2018-07-20 12:53:22.032976000 +0800 src
drwxrwxr-x 17 root root  4096 2018-07-20 12:54:45.114064000 +0800 tutorials
drwxrwxr-x  7 root root  4096 2018-07-20 12:55:15.939099000 +0800 wmake
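
For the recursive case, the same information can be listed for the whole tree rather than one directory. A minimal sketch, assuming GNU find (the -printf action is a GNU extension); the function name list_mtimes is made up for illustration.

```shell
# Hypothetical helper: print the full modification time and path of
# every file and directory under the given directory (default: ".").
# Assumes GNU find; -printf is not available in POSIX find.
list_mtimes() {
    find "${1:-.}" -printf '%TY-%Tm-%Td %TH:%TM:%TS %p\n'
}

# Example usage:
#   list_mtimes /path/to/source/tree
```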

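Step 2: If you wish to update the timestamp of every file and sub-directory to the current time, you can use the command below. find passes the names straight to touch, so names containing spaces are handled correctly.

# find . -exec touch {} +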
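
If you want a specific timestamp rather than the current time, something like the following may help. It is only a sketch, assuming GNU find and GNU touch; the function name touch_tree is made up for illustration.

```shell
# Hypothetical helper: set every file and directory under a tree to
# one timestamp. Assumes GNU touch, whose -d option accepts a date
# string such as "2018-07-20 12:52:52".
touch_tree() {
    dir=$1
    stamp=$2
    # "{} +" hands the names to touch in batches, so names that
    # contain spaces or other odd characters are passed intact.
    find "$dir" -exec touch -d "$stamp" {} +
}

# Example usage:
#   touch_tree . "2018-07-20 12:52:52"
```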

References:

  1. touch – change file timestamps (Unix Tutorial)

 

Resolving Orphaned Objects in Centrify Access Manager

In Centrify Access Manager, when we search for the userid, it is not found.
But when we try to add the userid to the system, it reports that the userid is duplicated. It seems that the userid has been cached and orphaned somewhere in Centrify.

Step 1: To find duplicated users or objects, use the Analyze feature in Access Manager. See Pix 1.

Step 2: Review the Analyze results

You will notice:

– Duplicate users in zones
– Orphan zone data objects and invalid data links

Step 3: Right-click the reported items to fix the issues

You should now be able to add the user.

 

Formatting NVME Partition on CentOS 7

Step 1: Create a partition:

# sudo fdisk /dev/nvme0n1
Choose "n" to create a new partition
Then "p" and "1" for a new primary partition numbered 1
Accept the default parameters, then "w" to write the partition table to disk

Step 2: Create a file system on it:

# sudo mkfs -t ext4 /dev/nvme0n1p1

Step 3: Create a mount point somewhere convenient:

# sudo mkdir /media/nvme

Step 4: Mount the new partition on that mount point:

# sudo mount /dev/nvme0n1p1 /media/nvme

Step 5: Permanently Mount the Device
Step 5a: Find the UUID first:

# sudo blkid

Step 5b: To get it to mount every time, add a line to /etc/fstab:

UUID=nvme_UUID /media/nvme ext4 defaults 0 0

(where nvme_UUID is the value taken from “sudo blkid”)
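
Steps 5a and 5b can be combined into a small helper. This is only a sketch: the function name fstab_line is made up for illustration, and the usage lines assume the partition is /dev/nvme0n1p1 and the mount point is /media/nvme, as in the steps above.

```shell
# Hypothetical helper: print an /etc/fstab entry for a given UUID.
# Arguments: UUID, mount point, file system type.
fstab_line() {
    printf 'UUID=%s %s %s defaults 0 0\n' "$1" "$2" "$3"
}

# Example usage (run as root; review the line before appending it):
#   uuid=$(blkid -s UUID -o value /dev/nvme0n1p1)
#   fstab_line "$uuid" /media/nvme ext4 >> /etc/fstab
#   mount -a    # fails loudly if the new entry cannot be mounted
```

Running "mount -a" straight after editing /etc/fstab is a cheap way to catch a typo before the next reboot does.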

Step 6 (Optional): At this point, the whole file system belongs to 'root'

To change the ownership to a specific user (with the partition mounted):

# sudo chown -R user:usergroup /media/nvme
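
To confirm the ownership change reached everything, you can list whatever is still owned by someone else. A minimal sketch; the function name not_owned_by is made up for illustration, and "user" stands for the same placeholder user name as in the chown command.

```shell
# Hypothetical helper: list anything under a directory that is NOT
# owned by the given user (name or numeric ID).
# Empty output means the recursive chown covered the whole tree.
not_owned_by() {
    find "$2" ! -user "$1"
}

# Example usage:
#   not_owned_by user /media/nvme
```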

Nvidia Tesla versus Nvidia GTX Cards

References

  1. Performance Comparison between NVIDIA’s GeForce GTX 1080 and Tesla P100 for Deep Learning
  2. Comparison of NVIDIA Tesla/Quadro and NVIDIA GeForce GPUs

 

Nvidia EULA

The key clause is 2.1.3, which states no data centre (DC) deployment, commercial hosting or broadcast services:
http://www.nvidia.com/content/DriverDownload-March2009/licence.php?lang=us&type=GeForce

 

FP64 64-bit (Double Precision) Floating Point Calculation


Pix taken from Comparison of NVIDIA Tesla/Quadro and NVIDIA GeForce GPUs

FP16 16-bit (Half Precision) Floating Point Calculation


Pix taken from Comparison of NVIDIA Tesla/Quadro and NVIDIA GeForce GPUs