The purpose of this programming assignment is to gain experience with Charm++ and with parallel programming on a cluster. For this assignment you have to write a parallel implementation of the prefix sum algorithm.
Your program should read in a file containing the values that will be used to initialize the 1D array of integers. A sample file is available here. You can generate other sample input files using this Python code.
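As a rough illustration of the reading step, here is a minimal sketch in plain C++. It assumes the input file holds whitespace-separated integers (for example, one per line); the exact layout of the sample file is not stated here, so treat that as an assumption. The function name readValues is hypothetical.

```cpp
#include <istream>
#include <sstream>
#include <vector>

// Reads every whitespace-separated integer from `in` into a vector.
// ASSUMPTION: the input is plain text with one integer per token.
// Reading stops at end of input or at the first token that is not an integer.
std::vector<int> readValues(std::istream& in) {
    std::vector<int> values;
    int v = 0;
    while (in >> v) {
        values.push_back(v);
    }
    return values;
}
```

Taking a std::istream rather than a file name keeps the sketch easy to test; in the real program you would pass a std::ifstream opened on the input file name.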
Your program should take three command line arguments: the name of the input data file, the number of chares, and the name of the output data file. To be more specific, the command line of your program should be:
./prefix <input filename> <# of chares> <output filename>
The number of processes the program will run on is specified as part of the mpirun command with the -np argument.
mpirun -np <# of processes> ./prefix <input filename> <# of chares> <output filename>
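One possible way to validate the three arguments is sketched below in plain C++. The Options struct and the parseArgs function are hypothetical names introduced for illustration. In a Charm++ main chare the same argc and argv values would come from the CkArgMsg passed to the Main constructor.

```cpp
#include <climits>
#include <cstdlib>
#include <string>

// Hypothetical container for the three command line arguments.
struct Options {
    std::string inputFile;
    int numChares = 0;
    std::string outputFile;
};

// Returns true and fills `out` when exactly three arguments follow the
// program name and the chare count is a positive integer; otherwise false.
bool parseArgs(int argc, const char* const argv[], Options& out) {
    if (argc != 4) return false;
    char* end = nullptr;
    long n = std::strtol(argv[2], &end, 10);
    // Reject empty input, trailing garbage, and out-of-range counts.
    if (end == argv[2] || *end != '\0' || n <= 0 || n > INT_MAX) return false;
    out.inputFile = argv[1];
    out.numChares = static_cast<int>(n);
    out.outputFile = argv[3];
    return true;
}
```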
Your program should write a single file (name set using the command line parameter) that contains the values of the 1D sequence after the prefix sum computation. Each line should contain one number. This is the correct output file for the sample input file above. The only output your program prints to standard output should come from the main chare and look like this:
TIME: 41.672 s
where time is measured for the prefix sum calculation and excludes the time for file reading/writing. Make sure that you use "-O2" as a compiler flag for fast timings.
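The timing and output format can be sketched as follows. This is plain C++ using std::chrono so that it runs on its own; in a Charm++ program you would more likely take wall-clock readings with CkWallTimer() in the main chare and print with CkPrintf. The names formatTime and timeSeconds are hypothetical.

```cpp
#include <chrono>
#include <cstdio>
#include <string>

// Formats an elapsed time in seconds as "TIME: <seconds> s" with three decimals.
std::string formatTime(double seconds) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "TIME: %.3f s", seconds);
    return std::string(buf);
}

// Runs `work` once and returns the elapsed wall-clock time in seconds.
// Only the computation should be wrapped here, not file reading or writing.
template <typename Work>
double timeSeconds(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

The point of the sketch is the placement of the timer: start it after the input has been read and stop it before the output file is written.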
You can use the parallel prefix sum algorithm discussed in class, which is referred to as prefix sum with recursive doubling. You can assume that the number of input values is much larger than the number of chares. The CI file will look something like this:
mainmodule prefix {
  readonly CProxy_Main mainProxy;
  readonly int numChares;
  readonly CProxy_Prefix prefixArray;

  mainchare Main {
    entry Main(CkArgMsg *);
    entry void done();
  };

  array [1D] Prefix {
    entry Prefix();
    entry void phase(int);
    entry void passValue(int phase, int value);
  };
};
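To make the recursive doubling idea concrete, the following is a sequential simulation in plain C++, not a Charm++ implementation. Each block stands in for one chare, and the inner loop of step 2 stands in for the passValue messages of one phase. The function name blockPrefixSum, the use of long long for sums, and the way the data is split into blocks are all assumptions made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential simulation of a block-wise prefix sum with recursive doubling.
// `numBlocks` plays the role of the number of chares.
std::vector<long long> blockPrefixSum(const std::vector<int>& data,
                                      std::size_t numBlocks) {
    assert(numBlocks > 0);
    const std::size_t n = data.size();
    // Block b owns indices [blockStart(b), blockStart(b + 1)).
    auto blockStart = [n, numBlocks](std::size_t b) { return b * n / numBlocks; };

    std::vector<long long> result(n);
    std::vector<long long> total(numBlocks, 0);

    // Step 1: every block computes an inclusive scan of its own slice.
    for (std::size_t b = 0; b < numBlocks; ++b) {
        long long running = 0;
        for (std::size_t i = blockStart(b); i < blockStart(b + 1); ++i) {
            running += data[i];
            result[i] = running;
        }
        total[b] = running;
    }

    // Step 2: recursive doubling over the block totals. In the phase with
    // stride d, block b - d "sends" its current value to block b.
    std::vector<long long> value = total;
    for (std::size_t d = 1; d < numBlocks; d *= 2) {
        std::vector<long long> next = value;
        for (std::size_t b = d; b < numBlocks; ++b) {
            next[b] += value[b - d];
        }
        value = next;
    }

    // Step 3: value[b] is the sum of blocks 0..b, so subtracting the block's
    // own total gives the offset contributed by all preceding blocks.
    for (std::size_t b = 0; b < numBlocks; ++b) {
        const long long offset = value[b] - total[b];
        for (std::size_t i = blockStart(b); i < blockStart(b + 1); ++i) {
            result[i] += offset;
        }
    }
    return result;
}
```

Step 2 takes about log2(numBlocks) phases, which is why only one value per chare needs to be exchanged in each phase even though the array itself is much larger than the number of chares.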
You must submit the following files and no other files:
- prefix.ci, prefix.C (and an optional header file): your parallel implementation
- Makefile that will compile your code successfully on deepthought2 when using charmc. You can see a sample Makefile here.
Make sure that the executable name is prefix and do not include the executable in the tarball. NOTE: assignments without a Makefile will not be graded. You can load the deepthought2 charm module using:
module load charmpp
Put the files in a single directory (named LastName-FirstName-assign5), compress it to .tar.gz (LastName-FirstName-assign5.tar.gz) and upload that to ELMS.
The project will be graded as follows:
Component | Percentage |
---|---|
Runs correctly on 1 process, 4 chares | 20 |
Runs correctly on 16 processes, 64 chares | 20 |
Runs correctly on 20 processes, 70 chares | 30 |
Speedup on 16 processes, 64 chares | 20 |
Writeup | 10 |