Parallel Computing (CMSC416/CMSC616)

Assignment 3: Performance Tools

Due: October 18, 2024 @ 11:59 PM Eastern Time

The purpose of this programming assignment is to gain experience in using performance analysis tools for parallel programs. There are two parts to this assignment. In Part I, you will run an existing parallel code, LULESH, and collect performance data using HPCToolkit. In Part II, you will be given performance data gathered using another tool, Caliper, and you will analyze this data using Hatchet.

Part I: Recording performance data

Downloading and building LULESH

You can get LULESH by cloning its git repository as follows:
git clone https://github.com/LLNL/LULESH.git

For this assignment, we will use an older version of gcc (9.4.0) and Open MPI. You can load these by running: module load openmpi/gcc/9.4.0. If you get an error, first unload the openmpi/gcc/11.3.0 module that you may have loaded for Assignment 2.
You can use CMake to build LULESH on zaratan by following these steps:


        mkdir build
        cd build
        cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_CXX_FLAGS="-g -O3" -DMPI_CXX_COMPILER=`which mpicxx` -DWITH_OPENMP=Off -DWITH_SILO=Off ..
        make
This should produce an executable lulesh2.0 in the build directory.

Running LULESH

Let's say you want to run LULESH on 8 processes for 10 iterations/timesteps. This would be the mpirun command:
mpirun -np 8 ./lulesh2.0 -i 10 -p

Using HPCToolkit

HPCToolkit can be loaded on zaratan via the hpctoolkit/gcc/9.4.0 module. You can use HPCToolkit to collect profiling data for a parallel program in three steps.

  1. Step I: Running the code (LULESH) with hpcrun (in a batch job):
    mpirun -np <num_ranks> hpcrun ./exe <args>
    This will generate a measurements directory.
  2. Step II: First post-processing step on the measurements directory (this can be run on the login node):
    hpcstruct <measurements-directory>
  3. Step III: Second post-processing step on the measurements directory (this can be run on the login node):
    hpcprof <measurements-directory>
    This will generate a database directory.
You can use hpcviewer or Hatchet with from_hpctoolkit to analyze the database directory generated in Step III.
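For example, here is a minimal Python sketch of reading the resulting database with Hatchet. The database directory name below is a placeholder; yours will depend on the executable name and the hpcprof output.

        import hatchet as ht

        # Placeholder name: use the database directory produced by hpcprof in Step III.
        gf = ht.GraphFrame.from_hpctoolkit("hpctoolkit-lulesh2.0-database")

        # Print the call tree annotated with the collected metric.
        print(gf.tree())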

Part II: Analyzing performance data using Hatchet

Installing Hatchet

You can install hatchet on zaratan or your local computer using pip:
pip3 install hatchet
If this command completes without errors, hatchet is installed. On zaratan, you may have to add --user to the pip command.

If this does not work, or if you want to install from source, you can follow the steps below. First, clone the hatchet git repository, check out v1.4.0, and add the path of the directory where you cloned hatchet to your PYTHONPATH:


        git clone https://github.com/hatchet/hatchet
        cd hatchet
        git checkout v1.4.0
        export PYTHONPATH=<path-to-hatchet>:$PYTHONPATH
Next, install the requirements using pip and then run install.sh from inside the hatchet directory:

        pip install -r requirements.txt
        source install.sh
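To verify the installation, you can try importing hatchet in Python. A minimal check (assuming a recent hatchet release, which exposes a __version__ attribute):

        import hatchet

        # Print the installed hatchet version to confirm the import works.
        print(hatchet.__version__)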

Datasets

You can find four datasets from four different runs of LULESH in the following files: lulesh-1core.json, lulesh-8cores.json, lulesh-27cores.json, and lulesh-64cores.json. These were gathered by running LULESH with 1, 8, 27, and 64 processes, respectively. The profiles were collected with Caliper, so you can use from_caliper in the Hatchet API to read them. You will use these profiles/datasets for all the tasks below.
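As a minimal sketch (assuming the JSON files are in your working directory), you can load one of these profiles into a Hatchet GraphFrame like this:

        import hatchet as ht

        # Read a Caliper-generated profile into a GraphFrame.
        gf = ht.GraphFrame.from_caliper("lulesh-1core.json")

        # Inspect the available metric columns (e.g., exclusive vs. inclusive time).
        print(gf.dataframe.columns)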

Analysis Tasks

  1. Problem 1: Analyze lulesh-1core.json and identify the top N functions where the code spends the largest amounts of (exclusive) time. You should set the value of N in your Python program from the first command line argument to your script. The only output from the Python script should be N lines, where each line prints the function name, followed by a space, and then the (exclusive) time spent in that function. (A minimal sketch of this kind of analysis is shown after this list.) What we will run to check correctness (an example):
    ./problem1.py 3
    Sample output below (this is what the output should look like, not the actual output/correct answer):

    func1_name 0.574
    func2_name 0.522
    func3_name 0.374
  2. Problem 2: Use the load_imbalance() function on the lulesh-64cores.json dataset. Given a command line parameter X (where X ≥ 1), identify the function that is X from the top (the top starts at 1, not 0) when the functions are sorted in decreasing order of load imbalance. You should set the value of X in your Python program from the first command line argument to your script. The only output from the script should be the list of processes that hatchet generates for that function, i.e., the processes with the most imbalance. What we will run to check correctness (an example):
    ./problem2.py 6
    Sample output below (this is what the output should look like, not the actual output/correct answer):

    [60 15  0 51  3]
  3. Problem 3: Create two graphframes, one for the 8-process case and one for the 64-process case, use drop_index_levels() on both, and then subtract the 8-process graphframe from the 64-process one. Identify the N functions with the largest time differences (NOT the absolute magnitude) between the two runs, where N is a command line argument. You should set the value of N in your Python program from the first command line argument to your script. The only output from the Python script should be N lines, where each line prints the function name, followed by a space, and then the difference in (exclusive) time between the two runs for that function. What we will run to check correctness (an example):
    ./problem3.py 4
    Sample output below (this is what the output should look like, not the actual output/correct answer):

    func1_name 0.574
    func2_name 0.552
    func3_name 0.522
    func4_name 0.374
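As a starting point, here is a rough sketch of the general pattern for these analyses, using Problem 1 as the example. It assumes the exclusive time metric in these Caliper profiles is stored in a dataframe column named "time" (check the column names in your data); it is a sketch, not the required solution.

        #!/usr/bin/env python3
        import sys

        import hatchet as ht

        # N comes from the first command line argument.
        n = int(sys.argv[1])

        # Read the Caliper profile into a GraphFrame.
        gf = ht.GraphFrame.from_caliper("lulesh-1core.json")

        # Assumption: the exclusive time metric is in a column named "time".
        # Sort functions by exclusive time in decreasing order and keep the top N.
        df = gf.dataframe.sort_values(by="time", ascending=False).head(n)

        # Print one "name time" pair per line.
        for name, time in zip(df["name"], df["time"]):
            print(name, time)

The same dataframe-based pattern applies to Problems 2 and 3; for Problem 3 you would additionally call drop_index_levels() on each graphframe and subtract one graphframe from the other (e.g., gf64 - gf8) before inspecting the resulting dataframe.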

What to Submit

You must submit the following files and no other files:

  • Database directory generated on 8 processes for Part I, renamed to lulesh-8processes.
  • Python scripts that use hatchet for the analyses: problem1.py, problem2.py, and problem3.py.
  • A report (called report-assign3.pdf) that describes what you did and which hatchet functions you found to be the most useful.
You should put the database directory, the three Python scripts, and the report in a single directory (named LastName-FirstName-assign3), compress it to a .tar.gz archive (LastName-FirstName-assign3.tar.gz), and upload that to Gradescope. Replace LastName and FirstName with your last and first name, respectively.

Grading

The project will be graded as follows:

Component                      Percentage
Successful data collection     30
Problem 1 correctness          20
Problem 2 correctness          20
Problem 3 correctness          20
Writeup                        10