Installing ALPS 2.0 from source on CentOS 5

What is ALPS Project?

The ALPS project (Algorithms and Libraries for Physics Simulations) is an open source effort aiming at providing high-end simulation codes for strongly correlated quantum mechanical systems as well as C++ libraries for simplifying the development of such code. ALPS strives to increase software reuse in the physics community.

Good information on installing ALPS can be found in the ALPS wiki's "Download and install ALPS" guides for Ubuntu 9.10, Ubuntu 10.04, Ubuntu 10.10, Debian and MacOS.

Installing ALPS with Boost

# wget http://alps.comp-phys.org/static/software/releases/alps-2.0.2-r5790-src-with-boost.tar.gz

You will need either gfortran or the Intel Fortran Compiler. If you are installing using gfortran:

# yum install gcc-c++ gcc-gfortran

If you want to use the evaluation tools, you will need to install a newer version of Python than the provided 2.4. You can install it from source or use an unofficial repository for binary RPMs. A newer Python is not required if you just want to run your compiled simulations (C++ applications) and specify -DALPS_BUILD_PYTHON=OFF when invoking cmake, but make sure you still have the Python headers:

# yum install python-devel

BLAS/LAPACK is necessary. Make sure you have the EPEL repository enabled. For more information, see Red Hat Enterprise Linux / CentOS Linux Enable EPEL (Extra Packages for Enterprise Linux) Repository.

# yum install blas-devel lapack-devel

CMake 2.8.0 and HDF5 1.8 need to be installed. There are wonderful scripts that come with ALPS which help to compile CMake 2.8 and HDF5 1.8 on CentOS 5:

$ $HOME/src/alps2/script/cmake.sh $HOME/opt $HOME/tmp
$ $HOME/src/alps2/script/hdf5.sh $HOME/opt $HOME/tmp
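
Once the scripts finish, make sure the freshly built cmake and HDF5 are picked up ahead of the older system versions. A minimal sketch, assuming the scripts installed everything under $HOME/opt as in the commands above (check the actual layout under $HOME/opt first):

$ export PATH=$HOME/opt/bin:$PATH
$ export LD_LIBRARY_PATH=$HOME/opt/lib:$LD_LIBRARY_PATH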

Build ALPS

Create a build directory (anywhere you have write access) and execute cmake giving the path to the alps and to the boost directory:

# cmake -D Boost_ROOT_DIR:PATH=/path/to/boost/directory /path/to/alps/directory

For example, if the unpacked source directory is /root/alps-2.0.2:

# cmake -D Boost_ROOT_DIR:PATH=/root/alps-2.0.2/boost /root/alps-2.0.2/alps

To install in another directory, set the variable CMAKE_INSTALL_PREFIX:

# cmake -DCMAKE_INSTALL_PREFIX=/path/to/install/directory /path/to/alps/directory

For example:

# cmake -DCMAKE_INSTALL_PREFIX=/usr/local/alps-2.0.2 /root/alps-2.0.2/alps

Build and test ALPS

$ make -j 8
$ make test
$ make install
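
Putting the steps together, a complete out-of-source build might look like the sketch below. The paths are taken from the examples above; adjust them to your own layout:

# mkdir -p $HOME/build/alps
# cd $HOME/build/alps
# cmake -D Boost_ROOT_DIR:PATH=/root/alps-2.0.2/boost -D CMAKE_INSTALL_PREFIX=/usr/local/alps-2.0.2 /root/alps-2.0.2/alps
# make -j 8
# make test
# make install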

* The HDF5 1.8 binaries and libraries are useful not only for compiling ALPS; other applications also require HDF5 1.8. You may want to consider moving its binaries and libraries to the /usr/local/ directories.
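
A sketch of how that might be done, assuming the hdf5.sh script installed into $HOME/opt as above (verify the actual file layout before copying):

# cp -a $HOME/opt/include/H5* $HOME/opt/include/hdf5* /usr/local/include/
# cp -a $HOME/opt/lib/libhdf5* /usr/local/lib/
# cp -a $HOME/opt/bin/h5* /usr/local/bin/
# ldconfig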

Compiling GotoBLAS2 on Nehalem and newer CPUs

GotoBLAS2 uses new algorithms and memory techniques for optimal performance of the BLAS routines. The download site can be found at GotoBLAS2 download

# wget http://cms.tacc.utexas.edu/fileadmin/images/GotoBLAS2-1.13_bsd.tar.gz
# tar -zxvf GotoBLAS2-1.13_bsd.tar.gz
# cd GotoBLAS2
# gmake clean
# gmake TARGET=NEHALEM

You will get output like:

GotoBLAS build complete.

  OS               ... Linux
  Architecture     ... x86_64
  BINARY           ... 64bit
  C compiler       ... GCC  (command line : gcc)
  Fortran compiler ... INTEL  (command line : ifort)
  Library Name     ... libgoto2_nehalemp-r1.13.a (Multi threaded; Max num-threads is 8)

You will see the resulting libraries and softlinks:

libgoto2.a -> libgoto2_nehalemp-r1.13.a
libgoto2_nehalemp-r1.13.a
libgoto2_nehalemp-r1.13.so
libgoto2.so -> libgoto2_nehalemp-r1.13.so

You can create /usr/local/GotoBLAS2, copy the files there, and set up the paths accordingly.
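
As a rough sketch (the target directory, ld.so.conf entry and link flags are illustrative assumptions, not taken from the GotoBLAS2 documentation), installing the library system-wide and linking an application against it could look like:

# mkdir -p /usr/local/GotoBLAS2
# cp -a libgoto2* /usr/local/GotoBLAS2/
# echo "/usr/local/GotoBLAS2" > /etc/ld.so.conf.d/gotoblas2.conf
# ldconfig
$ gcc myapp.c -L/usr/local/GotoBLAS2 -lgoto2 -lpthread -o myapp

Here myapp.c is just a hypothetical program that calls BLAS routines; GotoBLAS2 is multi-threaded, so -lpthread is needed at link time.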

If you are having issues, do take a look at Error in Compiling GotoBLAS2 in Westmere Chipsets

Taxonomy of File System (Part 2)

This writeup is a condensed subset of the presentation “How to Build a Petabyte Sized Storage System” by Dr. Ray Paden, as given at LISA’09. This information is critical for administrators making decisions on file systems. For the full information, see “How to Build a Petabyte Sized Storage System”.

Taxonomy of File System (Part 1) dealt with three file system categories: Conventional I/O, Networked File Systems, and Network Attached Storage.

4. Basic Clustered File Systems

  1. File access is parallel
    • supports POSIX API, but provides safe parallel file access semantics
  2. File system overhead operations
    • File system overhead operations are distributed and done in parallel
    • No single server bottlenecks, i.e. no metadata servers
  3. Common component architecture
    • commonly configured using separate file clients and file servers (costs too much to have a separate storage controller for every node)
    • some file systems allow a single component architecture where file clients and file servers are combined (i.e. no distinction between client and server -> yields very good scaling for async applications)
  4. File clients access file data through file servers via the LAN
  5. Examples: GPFS, GFS, IBRIX Fusion

5. SAN File Systems

  1. File access in parallel
    • supports POSIX API, but provides parallel file access semantics
  2. File System overhead operations
    • Not done in parallel
    • single metadata server with a backup metadata server
    • metadata server is accessed via LAN
    • metadata server is a potential bottleneck, but this is not considered a limitation since these file systems are generally used for smaller clusters
  3. Dual Component Architecture
    • file client/server and metadata server
  4. All disks connected to all file client/server nodes via the SAN, not the LAN
    • file data accessed via the SAN, not the LAN
    • inhibits scaling due to cost of FC SAN
  5. Examples: StorNext, CXFS, QFS

6. Multi-Component File Systems

  1. File access in parallel
    • Supports POSIX API
  2. File System overhead operations
    • Lustre: metadata server per file system (with backup) accessed via LAN
    • Lustre: potential bottleneck (deploy multiple file systems to avoid the bottleneck)
    • Panasas: Director Blade manages protocol
    • Panasas: contains a director blade and 10 disks accessible via Ethernet
    • Panasas: this provides multiple metadata servers, reducing contention
  3. Multi-Component Architecture
    • Lustre: file clients, file servers, metadata servers
    • Panasas: file clients, director blade
    • Panasas: Director Blade encapsulates file service, metadata service, and storage controller operations
  4. File clients access file data through file servers or director blades via the LAN
  5. Examples: Lustre, Panasas

Taxonomy of File System (Part 1)

This writeup is a condensed subset of the presentation “How to Build a Petabyte Sized Storage System” by Dr. Ray Paden, as given at LISA’09. This information is critical for administrators making decisions on file systems. For the full information, see “How to Build a Petabyte Sized Storage System”.

1. Conventional I/O

  1. Used generally for “Local File Systems”
  2. Supports POSIX I/O model
  3. Limited form of parallelism
    • Disk level parallelism possible via striping
    • Intra-Node process parallelism (within the node)
  4. Journaled, extent-based semantics
    • Journalling (AKA logging). Log information about operations performed on the file system meta-data as atomic transactions. In the event of a system failure, a file system is restored to a consistent state by replaying the log for the appropriate transactions.
  5. Caching is done via virtual memory, which is slow
  6. Example: ext3, NTFS, ReiserFS

2. Networked File Systems

  1. Disk access from remote nodes via network access
    • Generally based on TCP/IP over Ethernet
    • Useful for in-line interactive access (e.g. home directories)
  2. NFS is ubiquitous in UNIX/Linux environments
    • Does not provide a genuinely parallel model of I/O
      • Not cache coherent
      • Parallel write requires the O_SYNC and noac options to be safe (see the mount sketch after this list)
    • Poorer performance for HPC jobs especially parallel I/O
      • write: only 90MB/s on system capable of 400MB/s (4 tasks)
      • read: only 381 MB/s on a system capable of 40MB/s (16 tasks)
    • Uses the POSIX I/O API, but not its semantics
    • Traditional NFS is limited by the “single server” bottleneck
    • NFS is not designed for parallel file access, but by placing restrictions on file access and/or serving files from a single non-parallel file server, the performance may be good enough.
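
A sketch of the safer NFS mount mentioned in the list above; the server name and export path are hypothetical, and the sync and noac options trade performance for correctness:

# mount -t nfs -o sync,noac nfsserver:/export/home /mnt/home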

3. Network Attached Storage (AKA: Appliances)

  1. Appliance Concept
    • Focused on CIFS and/or NFS protocols
    • Integrated HW/SW storage product
      • Integrate servers, storage controllers, disks, networks, file system, protocol all into single product
      • Not intended for high performance storage
      • “black box” design
    • Provides an NFS server and/or CIFS/Samba solution
      • Server-based product; they do not improve client access or operation
      • Generally based on Ethernet LANs
    • Examples:
      • NetApp, Scale-out File System (SoFS)

Which File System Blocksize is suitable for my system?

Taken from IBM Developer Network “File System Blocksize”

Although the article refers to the General Parallel File System (GPFS), there are many good pointers that system administrators can take note of.

Here are some excerpts from the article.

This is one question that many system administrators ask before preparing a system: how do you choose a blocksize for your file system? IBM Developer Network (File System Blocksize) recommends the following block sizes for various types of application.

IO Type                Application Examples                                                   Blocksize
Large Sequential IO    Scientific Computing, Digital Media                                    1MB to 4MB
Relational Database    DB2, Oracle                                                            512KB
Small Sequential IO    General File Service, File-based Analytics, Email, Web Applications    256KB
Special*               Special                                                                16KB-64KB

What if I do not know my application IO profile?

Often you do not have good information on the nature of the IO profile, or the applications are so diverse that it is difficult to optimize for one or the other. There are generally two approaches to designing for this type of situation: separation or compromise.

Separation

In this model you create two file systems, one with a large file system blocksize for sequential applications and one with a smaller blocksize for small file applications. You can gain benefits from having file systems of two different blocksizes even on a single type of storage, or you can use different types of storage for each file system to further optimize for the workload. In either case the idea is that you provide two file systems to your end users, for scratch space on a compute cluster for example. The end users can then run tests themselves, pointing the application at one file system or the other and determining by direct testing which is best for their workload. In this situation you may have one file system optimized for sequential IO with a 1MB blocksize and one for more random workloads at a 256KB blocksize.
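
As a rough sketch of the separation approach on GPFS (the device names and NSD descriptor files below are hypothetical, and the exact mmcrfs syntax differs between GPFS releases, so check the mmcrfs man page for your version):

# mmcrfs /gpfs/seq /dev/seqfs -F /tmp/nsd.seq -B 1M
# mmcrfs /gpfs/data /dev/datafs -F /tmp/nsd.data -B 256K

The first file system would serve the large sequential IO workloads and the second the small-file or more random workloads; mmlsfs can be used afterwards to confirm the block size of each.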

Compromise

In this situation you either do not have sufficient information on workloads (i.e. end users won't think about IO performance) or do not have enough storage for multiple file systems. In this case it is generally recommended to go with a blocksize of 256KB or 512KB, depending on the general workloads and the storage model used. With a 256KB blocksize you will still get good sequential performance (though not necessarily peak marketing numbers) and you will get good performance and space utilization with small files (256KB has a minimum allocation of 8KB to a file). This is a good configuration for multi-purpose research workloads where the application developers are focusing on their algorithms more than IO optimization.
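
The 8KB figure comes from GPFS allocating space to small files in subblocks of 1/32 of the blocksize (assuming the traditional 1/32 subblock rule); the arithmetic is easy to check:

$ echo $((256 * 1024 / 32))
8192

In other words, a 256KB blocksize gives an 8KB minimum allocation per file, while a 1MB blocksize would raise that minimum to 32KB.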

Using mpstat to display SMP CPU statistics

mpstat is a command-line utility that reports CPU-related statistics.

For CentOS, to install mpstat, you have to install the sysstat package (http://sebastien.godard.pagesperso-orange.fr/)

# yum install sysstat

1. mpstat is very straightforward. Use the command below. On my 32-core machine:

# mpstat -P ALL
11:10:11 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
11:10:13 PM  all   40.75    0.00    0.03    0.00    0.00    0.00    0.00   59.22   1027.50
11:10:13 PM    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00   1000.50
11:10:13 PM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM    2  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
11:10:13 PM    3    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM    4  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
11:10:13 PM    5  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
11:10:13 PM    6    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM    7  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
11:10:13 PM    8    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     16.50
11:10:13 PM    9    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM   10    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM   11    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM   12  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00     10.50
11:10:13 PM   13    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM   14   99.50    0.00    0.50    0.00    0.00    0.00    0.00    0.00      0.00
11:10:13 PM   15  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
11:10:13 PM   16    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM   17    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM   18    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM   19  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
11:10:13 PM   20  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
11:10:13 PM   21    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM   22    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM   23    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM   24  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
11:10:13 PM   25  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
11:10:13 PM   26    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM   27  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
11:10:13 PM   28    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00

where

CPU – Processor number. The keyword all indicates that statistics are calculated as averages among all processors.

%user – Show the percentage of CPU utilization that occurred while executing at the user level (application).

%nice – Show the percentage of CPU utilization that occurred while executing at the user level with nice priority.

%sys  – Show the percentage of CPU utilization that occurred while executing at the system level (kernel). Note that this does not include time spent servicing interrupts or softirqs.

%iowait – Show the percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request.

%irq – Show the percentage of time spent by the CPU or CPUs to service interrupts.

%soft – Show the percentage of time spent by the CPU or CPUs to service softirqs. A softirq (software interrupt) is one of up to 32 enumerated software interrupts which can run on multiple CPUs at once.

%steal – Show the percentage of time spent in involuntary wait by the virtual CPU or CPUs while the hypervisor was servicing another virtual processor.

%idle – Show the percentage of time that the CPU or CPUs were idle and the system did not have an outstanding disk I/O request.

intr/s – Show the total number of interrupts received per second by the CPU or CPUs.

2. Getting averages from mpstat

To get an average, invoke mpstat with the interval and count arguments. In the example below, the interval is 2 seconds and the count is 5.

# mpstat -P ALL 2 5

At the end of the statistics report, you will see an average

Average:     CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
Average:     all   40.76    0.00    0.03    0.00    0.00    0.00    0.00   59.21   1047.50
Average:       0    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00   1000.60
Average:       1    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:       2  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
Average:       3    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:       4   99.90    0.00    0.10    0.00    0.00    0.00    0.00    0.00      0.00
Average:       5  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
Average:       6    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:       7  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
Average:       8    0.00    0.00    0.10    0.00    0.00    0.00    0.00   99.90     17.30
Average:       9    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      10    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      11    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      12   99.90    0.00    0.00    0.00    0.00    0.10    0.00    0.00     29.70
Average:      13    0.00    0.00    0.10    0.00    0.00    0.00    0.00   99.90      0.00
Average:      14   99.50    0.00    0.50    0.00    0.00    0.00    0.00    0.00      0.00
Average:      15  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
Average:      16    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      17    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      18    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      19  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
Average:      20  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
Average:      21    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      22    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      23    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      24  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
Average:      25   99.90    0.00    0.10    0.00    0.00    0.00    0.00    0.00      0.00
Average:      26    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      27  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
Average:      28    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      29    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      30  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
Average:      31    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
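
If you only care about one core, mpstat also accepts a specific processor number with -P, and the averages can be filtered out of a full report with grep. A quick sketch (the core number is arbitrary):

$ mpstat -P 14 2 5
$ mpstat -P ALL 2 5 | grep '^Average'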