CMSC 714 – High Performance Computing
Fall 2018 - OpenMP Programming Assignment
Due Monday, October 8, 2018 @ 6:00 PM
The purpose of this programming assignment is to gain experience in writing OpenMP programs. You will start with a working serial program (quake.c) that models an earthquake and add OpenMP directives to create a parallel program.
HINTS
The goal is to be systematic in figuring out how to parallelize this program. Start by using the gprof command to find which parts of the program take the most time (to use gprof, you will need to compile your program with the -pg switch). From there, examine the loops in the most important subroutines and figure out how to add OpenMP directives.
The programs will be run on a single compute node of the deepthought2 machine (deepthought2.umd.edu). You already have account information and should know how to run jobs on the machine from the MPI project.
WHAT TO TURN IN
You should submit your program and the times to run it on the input file quake.in (for 1, 2, 4, 8 and 16 threads). Since quake runs for a while on this input dataset with small numbers of threads, another input file that runs for much less time, quake.in.short, is available for your testing.
So that you don't have to make copies of the somewhat large input data files, they are available on deepthought2 in ~asussman/public/714/OpenMP/data. A copy of the serial quake.c is also available in ~asussman/public/714/OpenMP/src.
You also must submit a short report about the results (1-2 pages) that explains:
USING OPENMP
To compile OpenMP programs we will be using gcc version 4.8.1 (the default version on deepthought2, which you can get by running module load gcc on the deepthought2 login node), which has OpenMP support built in. In general, you can compile this assignment with:
$ gcc -fopenmp -pg -o quake quake.c -lm
The -fopenmp switch tells the compiler to recognize OpenMP directives. -lm is required because our program uses the math library. -pg needs to be added to collect profiling data when the program is run; you can remove this option before you do final performance testing.
The environment variable OMP_NUM_THREADS sets the number of threads (and presumably processors) that will run the program. Set the value of this environment variable in the script you submit the job from. If it is not set, the program defaults to using all available cores, and on a deepthought2 node that means 20.
RUNNING THE PROGRAM
Quake reads its input file from standard input and produces its output on standard output. Quake prints an output message periodically (every 30 of its simulation time steps), so you should be able to tell whether it is making progress.
GRADING
The project will be graded as follows:
Item | Pct |
Correctly runs with 1 thread | 10% |
Correctly runs with 16 threads | 40% |
Performance with 1 thread | 10% |
Speedup of parallel version | 20% |
Writeup | 20% |