The purpose of this programming assignment is to gain experience in parallel programming on a cluster and MPI. For this assignment, you have to write a parallel implementation of a program to simulate the Game of Life.
The game of life simulates simple cellular automata. The game is played on a rectangular board containing cells. At the start, some of the cells are occupied, the rest are empty. The game consists of constructing successive generations of the board. The rules for constructing the next generation from the previous one are:
You can use the provided serial code as a baseline to develop your parallel implementation. It provides some basic functionality such as parsing the input file and exporting the final board state to a CSV file. Your task is to implement the parallel version using C, C++, or Fortran, and MPI. You can adapt this Makefile and batch script for your needs.
Your program should read in a data file containing the coordinates of cells that are initially alive. Sample files are located here: life.1.256x256.data and life.2.256x256.data (256x256 board). Each line in this file represents the coordinates of a cell on the board that is live. For instance, the following entry:
1,3
means that the cell at position [1, 3] is live in the initial state. You can
also find many other sample patterns on the web (use your favorite search
engine on "game of life" and/or "Conway").
Your program should take four command line arguments: the name of the data file, the number of generations to run before stopping the program, X_limit, and Y_limit (representing the size of the board). To be more specific, the command line of your program should be:
./life <data-file-name> <# of generations> <X_limit> <Y_limit>
The number of processes used to launch the program is specified as part of the
mpirun command with the -np
argument.
mpirun -np <# of processes> ./life <data-file-name> <# of generations> <X_limit> <Y_limit>
Your program should write a single file called
<data-file-name>.<no-of-generations>.csv
(from one
designated rank) that contains comma separated values representing the coordinates of the live cells on the board.
There should be one line (containing the x coordinate, a comma, and then the y
coordinate) for each occupied cell at the end of the last generation/iteration.
Sample output files are available:
The only print from your program to standard output should be from process 0 that looks like this:
TIME: Min: 25.389 s Avg: 27.452 s Max: 41.672 s
where Min, Avg and Max time (in seconds) are calculated using MPI reduction
operations over the individual time measurements of the "main" loop (sum of all
generations) on different processes for the sample life.512x512.data input
file.
// C++:
cout << "TIME: Min: " << local_time << " s Avg: " << local_time << " s Max: " << local_time << " s\n";
// C:
printf("TIME: Min: %f s Avg: %f s Max: %f s\n", local_time, local_time, local_time);
Figure out how you will decompose the problem for parallel execution. Remember that MPI (at least the OpenMPI implementation) does not always have great communication performance and so you will want to make message passing infrequent. Also, you will need to be concerned about static load balancing during data distribution/domain decomposition. You should use the 1D decomposition (over rows) and implement two different versions: one using blocking Send/Recv and other using non-blocking Isend/Irecv.
You can assume that X_limit and Y_limit will be powers of 2 as will be the number of processes you will be running on. You can also assume that you will be running the program on a minimum of 4 processes and X_limit and Y_limit are much larger than the number of processes.
When you need to distribute the initial state of the board to different processes, you can read the entire file on one process (say rank 0) and then send messages from rank 0 to everyone else.
You must submit the following files and no other files:
life-blocking.[c,C,f77,f90]
: parallel version using blocking Send/Recv routines, where the file extension depends on the language used for the
implementation
life-nonblocking.[c,C,f77,f90]
: parallel version using non-blocking Isend/Irecv routines
Makefile
that will compile your code successfully on
zaratan when using mpicc or mpicxx, and generate these executables: life-blocking
, life-nonblocking
.
report-assign1.pdf
) with performance
results (two line plots). The line plots should present the execution times to run
the two parallel versions respectively on the input file life.512x512.data (for
4, 8, 16, 32, 64, and 128 processes), running on a 512x512 board for 500 iterations.
In total, you will be running each program 6 times.
In the report, you should mention:
LastName-FirstName-assign1
), compress it to .tar.gz
(LastName-FirstName-assign1.tar.gz
) and upload that to gradescope.
The project will be graded as follows:
Component | Percentage |
---|---|
Runs correctly with 4 processes | 30 (15% each version) |
Runs correctly with 16 processes | 40 (20% each version) |
Performance with 4 processes | 10 (blocking version) |
Performane with 16 processes | 10 (blocking version) |
Writeup | 10 |