Home Exam

Here it is.

IMPORTANT: The data files are now at /users/courses/368-4064/data.

Syllabus

Parallel programming:

Shared-memory programming with Cilk

Shared-memory programming with threads

Message passing with MPI

Parallel and high-performance computer architectures:

Implementations of shared-memory

Memory consistency

Synchronization primitives

Interconnection networks

Introduction to parallel algorithms:

Sorting

Dense matrix algorithms (multiplication, solution of linear systems of equations)

The fast-Fourier transform

Prerequisites

C programming, Linear Algebra, Computer Structure

Grading

I plan to give a home exam during a 14-day period that will be decided upon together with the students. The grade will be based on this exam, which will be part programming and experimentation and part theory. Students will have 14 days to complete the exam, and the exam is individual.

There will also be exercises during the course, probably 4 programming exercises. They are also individual and mandatory.

Exercises

Running Cilk programs:

  • cd to /users/courses/368-4064 to make sure it's accessible
  • The example programs are in the examples subdirectory
  • To compile (in your home directory), use /users/courses/368-4064/bin/cilk (you can define an alias to type less)
  • To compile sum.cilk without any options: cilk sum.cilk (a minimal example program appears after this list)
  • Now you can run a.out
  • Use --nproc 1 to run on 1 processor, --nproc 2 to run on 2
  • The command uptime gives indication as to whether the machine is busy
  • To be able to make measurements, compile with -O2 -cilk-profile (the -O2 is the usual gcc optimization flag; don't make measurements without it)
  • To be able to measure the critical path, compile with -cilk-critical-path, but be aware that measurements are then less reliable
  • To generate running time statistics, run with --stats 1
  • For more details, see the Cilk Reference Manual below
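
For orientation, here is what a small Cilk program looks like. This is only an illustrative sketch (not the course's sum.cilk): it sums an array by divide and conquer, spawning the two halves so that they can run in parallel.

  #include <stdio.h>
  #include <stdlib.h>

  /* Sum a[lo..hi-1] by divide and conquer; the two halves may run in parallel. */
  cilk int sum(int *a, int lo, int hi)
  {
    if (hi - lo <= 1000) {             /* small range: sum it serially      */
      int i, s = 0;
      for (i = lo; i < hi; i++) s += a[i];
      return s;
    } else {
      int left, right, mid = (lo + hi) / 2;
      left  = spawn sum(a, lo, mid);   /* the child may run on another CPU  */
      right = spawn sum(a, mid, hi);
      sync;                            /* wait for both spawned children    */
      return left + right;
    }
  }

  cilk int main(int argc, char *argv[])
  {
    int i, total, n = 1000000;
    int *a = (int *) malloc(n * sizeof(int));
    for (i = 0; i < n; i++) a[i] = 1;
    total = spawn sum(a, 0, n);
    sync;
    printf("total = %d\n", total);     /* should print 1000000              */
    free(a);
    return 0;
  }

You would compile such a file with cilk -O2 sum.cilk (using the cilk command described above) and run the resulting a.out with, for example, --nproc 2.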

Exercise 1: sort.cilk (instructions inside)

Exercise 2: instructions, ex-aio.c. Due date: December 24.

Exercise 3: Parallel matrix-vector multiplication. You also need a few files: lab3.c, lab3-orig, airfoil.grid, airfoil_map.p16, pwt.grid, pwt_map.p16. See below on how to run MPI programs under PBS.

Course Materials

Lecture notes for October 22.

Lecture notes for October 29 (and probably for the next lecture)

Chapters 1 and 2 of the lecture notes (these are somewhat old, so they focus mostly on MPI; see the next items for material on Cilk).

Chapter 3 of the lecture notes (together with the MPI material in chapter 1, this is the material for the December 17 lecture).

Chapter 4 of the lecture notes (for the January 7 lecture).

A Minicourse on Multithreaded Programming, by Charles E. Leiserson and Harald Prokop.

Cilk 5.3.1 Reference Manual.

Running MPI Programs under PBS

We run MPI programs using a job-submission system called PBS. The programs can be compiled on nova or on any of the plab computers (plab-02 up to plab-34; some may be down). The plab computers also serve as the Linux workstations in the lab in room 004. Jobs are submitted from nova only (not from the plabs), and they run on the plab computers. Here are instructions that explain how to compile MPI programs and how to use PBS.

I recommend that you download the sample files below on a Unix or Linux machine, as opposed to downloading them on a Windows machine and transferring them to your account. In one case that I have seen, downloading the samples on a Windows machine caused hidden control characters to be inserted into the files, which prevented PBS from running them properly.

To compile an MPI program, use the command mpicc. This command accepts the same arguments as gcc.

To run a program, you submit a script to PBS and wait until it completes. You can only use PBS on nova!

Before you try to use PBS, make sure that you can use rsh to/from any of the plabs without typing your password; otherwise PBS and MPI won't work. To ensure that rsh works, copy this file to your home directory under the name .rhosts (don't forget the period), change username in the file to your user name, and give it permissions 600 (use chmod 600 ~/.rhosts). Check that it works by running rsh from nova to one of the plabs, and from one of the plabs to another, and making sure you are not asked for your password.

Here is a sample PBS script called script.pbs. The MPI program that you want to run, and its arguments, are specified in the last line. In our case, the program is hello.c, which we compile using mpicc -O3 -o hello hello.c
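
For reference, a minimal MPI program along these lines might look as follows (this is only a sketch; the linked hello.c may differ). Each process reports its rank and the total number of processes:

  #include <stdio.h>
  #include <mpi.h>

  int main(int argc, char *argv[])
  {
    int rank, size;

    MPI_Init(&argc, &argv);                  /* start up MPI                    */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* my process number (0..size-1)   */
    MPI_Comm_size(MPI_COMM_WORLD, &size);    /* how many processes were started */

    printf("Hello from process %d of %d\n", rank, size);

    MPI_Finalize();                          /* shut down MPI                   */
    return 0;
  }

You compile it with mpicc as shown above and run it by submitting the PBS script described next.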

We run the program using the command qsub -l nodes=3:m256MB script.pbs (3 is the number of computers that will run the program in parallel, and the string m256MB tells PBS that we only want to use the machines in room 004, which are all connected to a single Ethernet switch). qsub returns the job identifier; in my case it was 11.nova.math.tau.ac.il.

We can find out whether the job is running, and what other jobs are running, using the command qstat. The important states that a job can be in are queued (Q; waiting), running (R), and exiting (E).

You can find out more details about your job using the command qstat -f 11 (here 11 is the job id).

You can cancel the job, whether it is still waiting or already running, using the command qdel 11 (here 11 is the job id).

When the job completes, its standard output and standard error are copied into the directory that contains the script, under the script's name with the suffixes .o (output) and .e (error) followed by the job id. In my case, these files were called script.pbs.o11 and script.pbs.e11.

PBS will send you mail when the job starts running, exits, or is aborted (this is due to the line #PBS -m abe in the script).

The script requests 5 minutes of CPU time (total over all processors) using the line #PBS -l cput=05:00. PBS will kill the job if it exceeds its CPU-time limit. We allow running times of up to 1 hour of CPU time; if you need more, talk to me.

The graphical tool xpbsmon is useful for determining the status of the entire system. Here is a screenshot (click to zoom in).

You can see that I am running two jobs, one on 3 nodes (brown; id 12) and another on 18 nodes (blue; id 13). There are 7 free nodes (green), 4 that are inaccessible (black), and 1 that is down (red). To set the tool up, click on its Pref button and add nova as a server (delete the other ones). To update the view, click Pref and then Redisplay (or set up the auto-update feature).

 

Books and Links

The following books are in the library:

Parallel Computer Architecture: A Hardware/Software Approach, by Culler and Singh.

Using MPI: Portable Parallel Programming with the Message-Passing Interface, by Gropp et al.

Parallel Programming with MPI, by Pacheco.

Introduction to Parallel Computing: Design and Analysis of Algorithms, by Kumar et al.

Operating Systems, by Sivan Toledo, in Hebrew (only the chapter on programming with threads)

The following links should prove useful:

The Cilk project

The MPI homepage

The MPICH homepage (a widely-used implementation of the standard)

HPCU, Israel's supercomputing center

The Top 500 computers in the world

Last updated on Sunday, March 03, 2002