Hadoop and MPI

At the point of writing, I am reading Hadoop: The Definitive Guide, 3rd Edition from O'Reilly. I thought I would capture some of what I have learned.

Here is a summary of the differences between MPI and MapReduce:

Location of Data
- MPI: Shared storage
- MapReduce: Data locality (data is processed on the node where it resides)

Complexity
- MPI: The programmer has to explicitly handle the mechanics of the data flow, exposed via low-level C routines and constructs.
- MapReduce: The programmer works in terms of functions of key and value pairs, and the data flow is implicit.

Compute Characteristics
- MPI: Large-scale distributed computation. A failed process will slow or halt the progress of the computation; MPI has to explicitly manage checkpointing and recovery.
- MapReduce: Shared-nothing architecture. The implementation detects failed map or reduce tasks and reschedules replacements on machines that are healthy.
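The "functions of key and value pairs" style can be felt even in a toy sketch. The classic Unix pipeline below mimics the map -> shuffle -> reduce flow of a word count: the programmer supplies only the per-record transformation, while sort brings identical keys together implicitly, much as the MapReduce framework does. This is just an illustration of the programming model, not Hadoop itself, and the sample text is made up.

```shell
# "Map": emit one word per line          (tr splits records into keys)
# "Shuffle": group identical keys        (sort)
# "Reduce": aggregate a count per key    (uniq -c)
printf 'the cat sat\nthe cat ran\n' \
  | tr -s ' ' '\n' \
  | sort \
  | uniq -c
```

In real MapReduce you would write the map and reduce steps as functions over key/value pairs, and the framework would handle the shuffle, distribution, and failure recovery between them.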

Hadoop and Traditional RDBMS


Here is a summary of the differences between a traditional RDBMS and MapReduce:

             Traditional RDBMS            MapReduce
Data size    Gigabytes                    Petabytes
Access       Interactive and batch        Batch
Updates      Read and write many times    Write once, read many times
Structure    Static schema                Dynamic schema
Integrity    High                         Low
Scaling      Nonlinear                    Linear

PBS scripts for mpirun parameters for Chelsio / Infiniband Cards

If you are running Chelsio cards, you may want to specify the mpirun parameters explicitly to ensure that the InfiniBand interconnect is used:

/usr/mpi/intel/openmpi-1.4.3/bin/mpirun \
  -mca btl openib,sm,self --bind-to-core \
  --report-bindings -np $NCPUS -machinefile $PBS_NODEFILE $PBS_O_WORKDIR/$file

--bind-to-core: bind each MPI process to a core
-mca btl openib,sm,self: use the openib (InfiniBand), sm (shared memory), and self (loopback) byte transfer layers
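For context, the mpirun line above would normally sit inside a PBS script submitted with qsub. A minimal sketch is below; the queue name, node count, walltime, and the $file variable (passed in with qsub -v) are assumptions to adjust for your own site.

```shell
#!/bin/bash
#PBS -q clusterqueue          # assumed queue name
#PBS -l nodes=2:ppn=8         # 2 nodes, 8 cores each (adjust to your cluster)
#PBS -l walltime=24:00:00
#PBS -j oe                    # merge stdout and stderr

cd $PBS_O_WORKDIR

# Derive the total process count from the node file PBS hands us
NCPUS=$(wc -l < $PBS_NODEFILE)

/usr/mpi/intel/openmpi-1.4.3/bin/mpirun \
  -mca btl openib,sm,self --bind-to-core \
  --report-bindings -np $NCPUS \
  -machinefile $PBS_NODEFILE $PBS_O_WORKDIR/$file
```

You would then submit it with something like qsub -v file=mybinaryfile script.sh, matching the qsub usage shown later in this post.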

For more information on interprocess communication with shared memory, see Speaking UNIX: Interprocess communication with shared memory.

Sequential execution of parallel or serial jobs on OpenPBS / Torque

If you have a requirement to execute jobs in sequence, where the 2nd job can launch only after the 1st job has completed, you can use the -W option.

For example, suppose you have a running job with job ID 12345, and you want the next job to run only after job 12345 has completed:

$ qsub -q clusterqueue -l nodes=1:ppn=8 -W depend=afterany:12345 parallel.sh -v file=mybinaryfile

You will notice that the second job will hold until the 1st job has completed:

.....
.....
24328 kittycool Hold 2 10:00:00:00 Sat Mar 2 02:20:12
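Since qsub prints the new job's ID on submission, you can also script a chain of dependent jobs by capturing each ID rather than typing it by hand. A sketch, with the same assumed queue and script names as above:

```shell
# Submit the first job and capture its ID (qsub prints it, e.g. "12345.headnode")
first=$(qsub -q clusterqueue -l nodes=1:ppn=8 parallel.sh -v file=mybinaryfile)

# Second job starts only after the first finishes, regardless of exit status
second=$(qsub -q clusterqueue -l nodes=1:ppn=8 \
         -W depend=afterany:$first parallel.sh -v file=mybinaryfile)

# Use depend=afterok:$first instead if the second job should run
# only when the first job exits successfully
echo "Chained $second after $first"
```

The afterany dependency matches the example above; afterok and afternotok let you branch on whether the first job succeeded or failed.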