CMSC 714
Note: for each class (after the intro material), 4 students
will be responsible for emailing me (als@cs.umd.edu)
with ~4 discussion questions on the reading(s) for that day by 6 PM the day
before the class, and should be prepared to ask those questions and help explain
the paper to the rest of the class.
Introduction - What and Why?
1/26 Parallel Computing and Parallel Computers
1/31 Applications of Parallel Computing
2/2 No class
Programming Models
2/7-9 Expressing Parallelism (Explicit Control)
-
V.S. Sunderam, G.A. Geist, J. Dongarra, and R.
Manchek, "The PVM Concurrent Computing System: Evolution,
Experiences, and Trends", Parallel Computing, 20(4), 1994. [PDF]
-
J. J. Dongarra, S. W. Otto, M. Snir, and D.
Walker, "A message passing standard for MPP and workstations," Communications
of the ACM, 39(7),
1996, pp. 84-90. [PDF]
2/14 Introduction to Debugging Parallel Programs
2/16 Expressing
Parallelism (Implicit Control) - Joseph Barrow, Timothy Dunlap, Janani
Gururam, Somay Jain
-
William W. Carlson et al., "Introduction to UPC
and Language Specification," CCS-TR-99-157. [PDF]
-
L. Dagum and R. Menon, "OpenMP: An Industry-Standard
API for Shared-Memory Programming," IEEE Computational Science & Engineering,
5(1), 1998. [PDF]
2/21-23 Expressing Parallelism (Hybrids) - Keerthi Kalyanam, Sri
Kankanahalli, Tara Larrue, Kyunghun Lee
-
Steve W. Bova et al., "Parallel Programming with Message Passing and Directives",
Computing in Science & Engineering, 3(5), 2001. [PDF]
-
Brent Leback, Michael Wolfe, and Douglas Miles,
"The PGI Fortran and C99 OpenACC Compilers", Proceedings of Cray User Group (CUG) meeting, 2012. [PDF]
2/28 Expressing Parallelism (Frameworks) - Sai Koppuravuri, Honglei
Li, Chirag Majithia
-
S. Balay,
W. D. Gropp, L. C. McInnes, and B. F. Smith, "Efficient
Management of Parallelism in Object Oriented Numerical Software
Libraries", In E. Arge, A. M. Bruaset, and H. P. Langtangen,
editors, Modern Software Tools in Scientific Computing, pages
163--202, Birkhäuser Press, 1997. [PDF]
-
T. Goodale, G. Allen, G. Lanfermann,
J. Massó, T. Radke, E. Seidel, and J. Shalf, "The Cactus
Framework and Toolkit: Design and Applications", In
Proceedings of Vector
and Parallel Processing - VECPAR 2002, Springer, 2003. [PDF]
Architectures
3/2 Shared Memory - Koutilya PNVR, Tejo
Pothuraju, Michael Saugstad, Han-Chin Shing
-
J. Laudon and D. Lenoski, "The SGI Origin: a ccNUMA
highly scalable server," In Proceedings of 1997 International Symposium on
Computer Architecture (ISCA '97), May 1997. [PDF]
-
SGI, "Technical Advances in the SGI®
UV Architecture™," SGI White paper, 2012. [PDF]
3/7 Message Passing and Communication - Ashwin
Lakshmanaswamy, Virinchi Srinivas, Guowei Sun, Ahmed Taha
-
Robert M. Metcalfe and David R. Boggs, "Ethernet:
distributed packet switching for local computer networks," Communications
of the ACM, 19(7), 1976. [PDF]
-
Mellanox Technologies white paper,
"Introduction to InfiniBand". [PDF]
3/9 Custom Machines - Jerry Tan, Shuhao Tan, Rui Wang, Jiahao Wu
-
S.R. Alam, J.A. Kuehn, R.F. Barrett,
J.M. Larkin, M.R. Fahey, R. Sankaran, P.H. Worley, "Cray
XT4: An Early Evaluation for Petascale Scientific Simulation",
In Proceedings of SC'07, Nov. 2007. [PDF]
-
A. Gara et al., "Overview of the Blue Gene/L
system architecture", IBM Journal of Research and Development, 49(2/3), Fall
2005. [PDF]
3/14 Class canceled - snow!
3/16 Stream Processing and GPUs - Yuntao Liu, Roozbeh
Yousefzadeh, Joseph Barrow, Timothy Dunlap
-
A. E. Eichenberger et al., "Using advanced
compiler technology to exploit the performance of the Cell Broadband Engine
architecture", IBM Systems Journal, 45(1),
Jan. 2006. [PDF]
-
V. W. Lee et al., "Debunking the 100X GPU vs. CPU myth:
an evaluation of throughput computing on CPU and GPU",
In Proceedings of 2010 International Symposium on Computer
Architecture (ISCA), May 2010. [PDF]
3/21-23 No class - spring break
3/28 Computational Grids - Janani Gururam,
Somay Jain, Keerthi Kalyanam, Sri Kankanahalli
-
I. Foster and C. Kesselman, "Computational Grids",
Chapter 2 of The Grid: Blueprint for a New Computing Infrastructure,
Morgan Kaufmann, 1999. [PDF]
-
A. Chervenak, I. Foster, C. Kesselman, C.
Salisbury, S. Tuecke, "The Data Grid: Towards an Architecture for the
Distributed Management and Analysis of Large Scientific Datasets",
Journal of Network and Computer Applications, 23:187-200, 2001. [PDF]
3/30 Clouds - Sai Koppuravuri, Tara Larrue,
Kyunghun Lee, Honglei Li
-
Jeffrey Dean and Sanjay Ghemawat, "MapReduce:
Simplified Data Processing on Large Clusters", In Proceedings of OSDI'04, pp.
137-150. [PDF]
-
Michael Stonebraker, Daniel Abadi, David J.
DeWitt, Sam Madden, Erik Paulson, Andrew Pavlo, Alexander Rasin, "MapReduce
and Parallel DBMSs: Friends or Foes?", Communications of the ACM,
53(1), Jan. 2010, pp. 64-71. [PDF]
4/4 Clouds, cont. - Yuntao Liu, Chirag
Majithia, Koutilya PNVR, Tejo Pothuraju
-
M. Zaharia, R. Xin, P. Wendell, T. Das,
M. Armbrust, A. Dave, X. Meng, J. Rosen, S. Venkataraman,
M. Franklin, A. Ghodsi, J. Gonzalez, S. Shenker, and I. Stoica,
"Apache Spark: A Unified Engine for Big Data
Processing", Communications of the ACM, 59(11), Nov. 2016. [PDF]
-
B. Hindman, A. Konwinski, M. Zaharia,
A. Ghodsi, A.D. Joseph, R. Katz, S. Shenker, and I. Stoica,
"Mesos: A Platform for Fine-Grained Resource Sharing in the Data
Center", In Proceedings of 8th USENIX Symposium on
Networked Systems Design and Implementation (NSDI), USENIX,
March 2011. [PDF]
Tools
4/6 Event Ordering and Race Detection - Michael Saugstad, Han-Chin
Shing, Ashwin Lakshmanaswamy, Virinchi Srinivas
-
L. Lamport, "Time, Clocks, and the Ordering of Events
in a Distributed System", Communications of the ACM, 21(7), 1978, pp. 558-564.
[PDF]
-
S. Savage, M. Burrows, G. Nelson, P. Sobalvarro, and
T. Anderson, "Eraser: A Dynamic Data Race Detector for Multi-Threaded Programs",
In Proceedings of the 16th Symposium on Operating Systems Principles, ACM Press,
Oct. 1997. [PDF]
4/11 Data Collection and Instrumentation -
Guowei Sun, Ahmed Taha, Jerry Tan, Shuhao Tan
-
Nicholas Nethercote and Julian Seward, "Valgrind:
A Framework for Heavyweight Dynamic Binary Instrumentation",
In Proceedings of the 2007 ACM/SIGPLAN Conference on
Programming Language Design and Implementation (PLDI), June 2007. [PDF]
-
B. R. Buck and J.K. Hollingsworth, "An API for Runtime Code Patching," International Journal of High Performance Computing Applications, 14(4), Winter 2000, pp. 317-329. [PDF]
4/13 Cache Tools - Rui Wang, Jiahao Wu, Roozbeh Yousefzadeh, Joseph Barrow
-
J. Mellor-Crummey, D. Whalley, and K. Kennedy,
"Improving Memory Hierarchy Performance for Irregular Applications Using
Data and Computation Reorderings", International Journal of Parallel
Programming, 29(3), June 2001.
[PDF]
-
Margaret Martonosi, Anoop Gupta, Thomas Anderson,
"MemSpy: analyzing memory system bottlenecks in programs", ACM SIGMETRICS
Performance Evaluation Review, 20(1), 1992. [PDF]
4/18 Runtime Parallelization - Timothy
Dunlap, Janani Gururam, Somay Jain, Keerthi Kalyanam
-
S.J. Fink, S.R. Kohn, and S.B. Baden, "Efficient
Run-time Support for Irregular Block-Structured Applications", Journal of
Parallel and Distributed Computing, 50(1), 1998. [PDF]
-
G. Agrawal, A. Sussman, and J. Saltz, "An
Integrated Runtime and Compile-time Approach for Parallelizing Structured
and Block Structured Applications", IEEE Transactions on
Parallel and Distributed Systems, 6(7), 1995. [PDF]
Systems Issues
4/20 Finding Idle Cycles - Sri
Kankanahalli, Sai Koppuravuri, Tara Larrue, Kyunghun Lee
-
M. Litzkow, M. Livny, and M. Mutka, "Condor - A Hunter of Idle Workstations", In Proceedings of International Conference on Distributed Computing Systems, June 1988, pp. 104-111. [PDF]
For a more up-to-date detailed history of the Condor project, see:
D. Thain, T. Tannenbaum, and M. Livny, "Distributed
Computing in Practice: The Condor Experience", Concurrency
and Computation: Practice and Experience, Vol. 17, Nos. 2-4,
2005. [PDF]
-
David P. Anderson, Carl
Christensen and Bruce Allen, "Designing a Runtime System for Volunteer
Computing", In Proceedings of SC'06, November 2006. [PDF]
4/25 Parallel I/O - Honglei Li, Chirag
Majithia, Koutilya PNVR, Tejo Pothuraju, Michael Saugstad, Shuhao Tan, Rui Wang
-
Terry Jones, Alice Koniges, and R. Kim Yates, "Performance of the IBM General Parallel File System", In Proceedings of 14th International Parallel and Distributed Processing Symposium (IPDPS'00), April 2000. [PDF]
-
A. Acharya, M. Uysal, and J. Saltz,
"Active Disks: Programming Model, Algorithms and Evaluation", In Proceedings of Eighth International Conference on Architectural Support for Programming Languages and Operating Systems, October 1998. [PDF]
4/27 Midterm Exam
Applications
5/2 Applications - Han-Chin Shing, Ashwin
Lakshmanaswamy, Virinchi Srinivas, Guowei Sun, Ahmed Taha, Jiahao Wu,
Roozbeh Yousefzadeh
-
U. Catalyurek, M. Beynon, C. Chang, T. Kurc, A. Sussman, and
J. Saltz, "The Virtual Microscope",
IEEE Transactions on Information Technology in Biomedicine, Vol. 7, No. 4, 2003.
[PDF]
-
David E. Shaw et al., "Millisecond-scale molecular dynamics simulations on Anton", Proceedings of SC'09, November 2009. [PDF]
5/4 Project Demos
-
Boggle Puzzle - Dunlap, Shing, Kankanahalli, Dunlap
-
Parallel Hybrid Framework for Graph Processing - Jain, Majithia,
Shivapuram, Srinivas
-
Patch Matching for Image Segmentation - Taha, Tan
-
Auto-tuning for scalable parallel 3-D FFT - Tan, Yousefzadeh
5/9 Project Demos
-
Model Selection for SVM - Koppuravuri, Pothuraju, Kalyanam, PNVR
-
Parallel Graph-based Semi-supervised Learning - Sun, Wang
-
Load Balancer - Wu, Lee, Li
-
Parallelized Landsat Data Processing - Larrue, Saugstad, Gururam
5/11 SC16 Gordon Bell award finalist
-
Peter Vincent, Freddie Witherden,
Brian Vermeire, Jin Seok Park, and Arvind Iyer, "Towards Green Aviation with Python at Petascale",
In Proceedings of SC16, November 2016.
[PDF]