Systems for Machine Learning (CMSC828G)

Lecture Paper Readings

There will be two assigned readings for each lecture, starting on Feb 25. All students are expected to read both assigned papers before the lecture. Your class participation grade for the course will be determined by:

  • Questions about the reading: Every student will submit 2-3 questions or discussion topics on one of the readings (if your last name starts with A-N: Reading 1; if your last name starts with P-Z: Reading 2). These are due at 9 AM on the date of the lecture. Name the PDF that you submit on Gradescope as follows: MMDD-LastName-FirstName.pdf, where MM and DD are the month and day of the lecture.
  • Short presentation on the reading: Once in the semester, each student (paired with another student) will give a short presentation on one of the assigned readings. Upload a PDF of the presentation by 9 AM on the date of the lecture. Name the PDF that you submit on Gradescope as follows: MMDD-LastName-FirstName.pdf, where MM and DD are the month and day of the lecture. These presentations should be 4-5 minutes (in total, including both students' parts) and follow the provided format (PDF, PPTX).
Resource on how to read a scientific paper.
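
The MMDD-LastName-FirstName.pdf naming convention above can be generated programmatically to avoid formatting mistakes such as a missing leading zero. The following is a minimal sketch; the function name `submission_filename`, the sample student name, and the year used to build the date are assumptions for illustration and are not part of the course materials.

```python
from datetime import date


def submission_filename(lecture_date: date, last_name: str, first_name: str) -> str:
    """Return MMDD-LastName-FirstName.pdf for the given lecture date."""
    # %m and %d are zero-padded, so Mar 4 becomes "0304".
    return f"{lecture_date:%m%d}-{last_name}-{first_name}.pdf"


# Hypothetical student; the year is an assumption and does not affect the name.
print(submission_filename(date(2025, 2, 25), "Doe", "Jane"))
```

Only the month and day appear in the file name, so the year passed to `date` is irrelevant to the result.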

Lecture Slides

| No. | Date | Topic and Slides | Reading 1 | Presenters | Reading 2 | Presenters |
| 1 | Jan 28 | Course Introduction and Overview | | | | |
| 2 | Jan 30 | Introduction to HPC / Systems | | | | |
| | Feb 4 | (contd.) | | | | |
| 3 | Feb 6 | Introduction to GPU Programming | | | | |
| 4 | Feb 11 | Introduction to Triton Programming | | | | |
| 5 | Feb 13 | Introduction to Deep Learning | | | | |
| 6 | Feb 18 | Transformers and Performance Modeling | Attention 2017 | | | |
| | Feb 20 | No class | | | | |
| 7 | Feb 25 | Challenges in High Performance DL | COTS HPC 2013 | | Extra-Deep 2023 | |
| 8 | Feb 27 | Parallel Training | PyTorch DDP 2020 | UB, KB | PyTorch FSDP 2023 | AB, ZCh |
| | Mar 4 | (contd.) | Megatron-LM 2019 | PD, CDz | AxoNN 2024 | Guest - Siddharth Singh |
| | Mar 6 | Pipeline and Hybrid Parallel Training | GPipe 2018 | DE, AH | Hybrid Parallelism 2021 | Guest - Deepak Narayanan [video] |
| 10 | Mar 11 | Optimizing GPU Kernels | Sputnik 2020 | HH, LH | Flash Attention 2022 | Guest - Tri Dao [video] |
| | Mar 13 | Deep Learning Compilers | TVM 2018 | DJ, DK | TorchDynamo 2024 | Guest - Jason Ansel [video] |
| | Mar 18 | Spring Break | | | | |
| | Mar 20 | Spring Break | | | | |
| | Mar 25 | Optimizers | HyLo 2022 | BM, AN | Distributed Shampoo 2023 | Guest - Shi & Iwasaki [video] |
| | Mar 27 | Sparsity in Training | MoE 2017 | SP, SS | MegaBlocks 2022 | Guest - Trevor Gale [video] |
| | Apr 1 | Memory offload | vDNN 2016 | IR, MS | ZeRO-Infinity 2021 | Guest |
| | Apr 3 | Introduction to Inference | Transformers 2022 | XT, PU | vLLM 2023 | Guest |
| | Apr 8 | Approximating Attention | Top-k 2021 | CU, WW | H2O 2023 | Guest |
| | Apr 10 | Midterm Exam (during class) | | | | |
| | Apr 15 | Long context optimizations | | | | |
| | Apr 17 | Quantization | | | | |
| | Apr 22 | Data Movement | | | | |
| | Apr 24 | Data Movement | | | | |
| | Apr 29 | Specific Models | | | | |
| | May 1 | No class? TBD | | | | |
| | May 6 | Project Presentations | | | | |
| | May 8 | Project Presentations | | | | |
| | May 13 | Project Presentations | | | | |
| | May 15 | Final Project Due | | | | |