For this assignment, you will implement a hybrid MPI+OpenMP version of Assignment 2. There are three parts/steps to this assignment.
First, you will implement a 2D decomposition in MPI using non-blocking routines. You will add two arguments at the end of the command line that give the X and Y dimensions of the MPI virtual grid. For example, when running with 64 MPI processes, the program should accept an 8x8, 16x4, or 4x16 virtual grid of processes. The modified command line will look like this:
mpirun -np <# of processes> ./life <data-file-name> <# of generations> <X_limit> <Y_limit> <# of processes in X> <# of processes in Y>
You can assume that X_limit and Y_limit will be powers of 2, as will the number of processes you run on. You can also assume that the program will run on at least 16 processes and that X_limit and Y_limit are much larger than the number of processes in each dimension.
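The index arithmetic behind the 2D decomposition can be sketched as follows. This is only an illustration under stated assumptions: the `Block` type and the `block_for_rank` and `neighbor_rank` helpers are hypothetical names, ranks are assumed to be laid out row-major over the virtual grid (rank = px * procs_y + py), and the board is assumed not to wrap around at its edges. In an actual MPI program, `MPI_Cart_create` and `MPI_Cart_coords` could provide the same mapping.

```c
#include <assert.h>

/* Hypothetical helper type: the sub-block of the board owned by one rank. */
typedef struct {
    int px, py;   /* coordinates of the process in the virtual grid */
    int x0, y0;   /* global index of the first owned cell in X and Y */
    int nx, ny;   /* number of owned cells in X and Y */
} Block;

/* Map a rank to its block. Assumes row-major rank ordering and that the
 * board dimensions divide evenly (true when all values are powers of 2
 * and the board is larger than the process grid). */
Block block_for_rank(int rank, int x_limit, int y_limit,
                     int procs_x, int procs_y)
{
    assert(procs_x > 0 && procs_y > 0);
    assert(rank >= 0 && rank < procs_x * procs_y);
    assert(x_limit % procs_x == 0 && y_limit % procs_y == 0);

    Block b;
    b.px = rank / procs_y;
    b.py = rank % procs_y;
    b.nx = x_limit / procs_x;
    b.ny = y_limit / procs_y;
    b.x0 = b.px * b.nx;
    b.y0 = b.py * b.ny;
    return b;
}

/* Rank of the neighbor at offset (dx, dy), or -1 if that neighbor would
 * fall outside the process grid (non-periodic edges are an assumption). */
int neighbor_rank(Block b, int dx, int dy, int procs_x, int procs_y)
{
    int qx = b.px + dx;
    int qy = b.py + dy;
    if (qx < 0 || qx >= procs_x || qy < 0 || qy >= procs_y)
        return -1;
    return qx * procs_y + qy;
}
```

With a 16x4 grid on a 512x1024 board, for instance, each process owns a 32x256 block. In a 2D decomposition each block has up to eight neighbors (four edges and four corners), which is why the helper accepts diagonal offsets as well.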
Next you will create a hybrid MPI+OpenMP version of the program implemented in Part I by adding support for OpenMP in the sequential compute region of the program. This is the code region that is executed sequentially by each MPI process in the Part I implementation.
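One possible shape for the OpenMP part is sketched below. It is not the required implementation: the function name `life_step`, the one-byte-per-cell storage, and the ring of ghost cells around the local block are all assumptions, and the update uses the standard Game of Life rules. The sketch assumes the ghost cells have already been filled by the halo exchange before the function is called.

```c
#include <assert.h>
#include <stddef.h>

/* Advance the local block by one generation.
 * cur and next are (nx + 2) x (ny + 2) arrays stored row by row. The outer
 * ring holds ghost cells (neighbor data, or zeros at the edge of the board).
 * Only the interior cells 1..nx, 1..ny are written. */
void life_step(const unsigned char *cur, unsigned char *next, int nx, int ny)
{
    assert(nx > 0 && ny > 0);
    const int stride = ny + 2;

    /* Rows are independent: each thread reads cur and writes a disjoint
     * set of rows in next, so no synchronization is needed in the loop. */
    #pragma omp parallel for schedule(static)
    for (int i = 1; i <= nx; i++) {
        for (int j = 1; j <= ny; j++) {
            int n = 0;  /* live neighbors of cell (i, j) */
            for (int di = -1; di <= 1; di++)
                for (int dj = -1; dj <= 1; dj++)
                    if (di != 0 || dj != 0)
                        n += cur[(size_t)(i + di) * stride + (j + dj)];

            unsigned char alive = cur[(size_t)i * stride + j];
            next[(size_t)i * stride + j] =
                (unsigned char)(n == 3 || (alive && n == 2));
        }
    }
}
```

Reading from one buffer and writing to another keeps the loop free of data races. If the code is compiled without OpenMP support, the pragma is ignored and the loop runs sequentially, which is convenient for checking correctness against the Part I output.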
Finally, once you have a correctly working implementation from Part II, you will study how performance changes as you vary the number of processes vs. threads. You will use 2 nodes of zaratan for these studies. At one extreme, you can use 256 MPI processes (with 1 thread per process) on 2 nodes; at the other extreme, you can create 16 MPI processes, 8 on each node, with 16 OpenMP threads per MPI process. You can also use any configuration in between.
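The configurations between the two extremes can be listed with a small helper, sketched here under the assumption that the 2 nodes provide 256 cores in total (as the two extremes imply) and that only power-of-2 process counts are of interest. The function names `threads_per_process` and `enumerate_configs` are hypothetical.

```c
#include <assert.h>

/* Threads per MPI process so that processes * threads uses every core.
 * Returns 0 if total_cores is not a multiple of processes. */
int threads_per_process(int total_cores, int processes)
{
    assert(total_cores > 0 && processes > 0);
    return (total_cores % processes == 0) ? total_cores / processes : 0;
}

/* Fill procs[] and threads[] with every split obtained by doubling the
 * process count from min_procs up to total_cores. Returns the number of
 * configurations written (at most capacity). */
int enumerate_configs(int total_cores, int min_procs,
                      int procs[], int threads[], int capacity)
{
    assert(min_procs > 0);
    int count = 0;
    for (int p = min_procs; p <= total_cores && count < capacity; p *= 2) {
        int t = threads_per_process(total_cores, p);
        if (t == 0)
            continue;  /* skip process counts that do not divide the cores */
        procs[count] = p;
        threads[count] = t;
        count++;
    }
    return count;
}
```

For 256 cores and a minimum of 16 processes this gives five configurations: 16x16, 32x8, 64x4, 128x2, and 256x1 (processes x threads). The thread count for a run is typically passed to the program through the `OMP_NUM_THREADS` environment variable.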
You must submit the following files and no other files:
: parallel version using non-blocking Isend/Irecv routines, where the file extension depends on the language used for the implementation
: parallel version using non-blocking Isend/Irecv routines and OpenMP, where the file extension depends on the language used for the implementation
that will compile your programs successfully on zaratan when using mpicc or mpicxx. Make sure that the executable names are life-nonblocking-2d and life-nonblocking-hybrid, and do not include the executable in the tarball.
), compress it to .tar.gz ) and upload that to gradescope.
The project will be graded as follows:
Component | Percentage |
2D MPI only version runs correctly | 40 |
Hybrid version runs correctly | 30 |
Performance evaluation and writeup | 30 |