CMSC498Y meets on Mondays and Wednesdays. A tentative course schedule is below. Slides will be posted to ELMS before each lecture and then linked here afterward. If a lecture is recorded, the recording will typically be available under the Zoom tab on ELMS (sometimes it will instead be uploaded to the Files tab, in the recordings folder).
Week | Day | Date | Topic | Materials | Assigned Reading |
---|---|---|---|---|---|
1 | Wed | Jan 24 | Course Overview & Background | [1_overview_and_bio.pdf] | |
2 | Mon | Jan 29 | Background |
Protein sequences (translation, codon, popular databases), Statistics review (models, data, likelihood, MLE, KL divergence), Random sequence model
[2_prob_review.pdf] [2_stats_review.pdf] |
BSA Sections 1.3, 11.1 (through multinomial), 11.2 (relative entropy only), 11.3 (ML only), 11.5 (ML only) |
Wed | Jan 31 | Markov Models | Markov Models, Model Discrimination, Hidden Markov Models (definition only)
[3_markov_models.pdf] |
BSA Sections 3.1, 3.2 (through formal definition) | |
3 | Mon | Feb 5 | Decoding HMMs |
Viterbi algorithm, Posterior decoding (Forward algorithm, Backward algorithm)
[4_hmm_decoding.pdf] |
BSA Section 3.2 |
Wed | Feb 7 | Training HMMs |
Supervised learning (Maximum Likelihood), Unsupervised learning (Expectation–maximization algorithm, Baum-Welch)
[5_hmm_training.pdf] |
BSA Section 3.3 | |
4 | Mon | Feb 12 | Multiple Sequence Alignment (MSA) |
Edit distance, Sum-of-pairs (SOP) MSA problem, Star Alignment, Consistency, Progressive Alignment
[6_msa.pdf] |
BSA Chapter 2, Chapter 6 |
Wed | Feb 14 | Profile HMMs |
Profile HMMs, Supervised training from MSAs, Alignment with Viterbi, Classification
[6_msa.pdf] Assignment #1 released; due Thurs Feb 29 |
BSA Chapter 5 (through 5.4) | |
5 | Mon | Feb 19 | Lab Day 1 |
Lab on MSAs, HMMs, and Viterbi
Download data: [hmm-msa-lab.zip] |
|
Wed | Feb 21 | Lab Day 2 |
Lab on MSAs, HMMs, and Viterbi
Download code: [hmm-msa-lab.zip] |
||
6 | Mon | Feb 26 | RNA Secondary Structure |
Debrief on lab; Intro to RNA Secondary Structure
[8_rna_secondary_structure.pdf] |
BSA Section 10.1 |
Wed | Feb 28 | Grammars |
Stochastic Context Free Grammars (SCFG)
[9_scfg.pdf] Assignment #1 DUE TOMORROW! |
BSA Chapter 9 (skip Section 9.4) | |
7 | Mon | Mar 4 | Optimization | Maximum Base Pairs (Nussinov's Algorithm) and Minimum Energy
[10_rna_opt.pdf] |
BSA Section 10.2 through first sub-section on Energy minimization (skip SCFG sub-section) |
Wed | Mar 6 | UFold |
UFold Input and Output Construction for CNN (UNet)
[11_ufold_part1.pdf] |
[cdpfold paper]
[ufold paper] |
|
8 | Mon | Mar 11 | UFold Encoder |
UFold Contraction Path and Related Operations (e.g. Convolution, Max Pool)
[12_ufold_part2.pdf] |
[Chapter 9 - Convolutional Networks] |
Wed | Mar 13 | UFold Decoder |
UFold Expansion Path and Related Operations (e.g. Convolution, Upsampling)
[13_ufold_part3.pdf] Assignment #2 released last week (Mar 8); due Mar 29! |
||
9 | Mon | Mar 18 | No class | Spring break | |
Wed | Mar 20 | No class | Spring break | ||
10 | Mon | Mar 25 | Lab Day 3 | UFold Lab: training, testing, and modifying UFold
[Download] |
|
Wed | Mar 27 | Lab Day 4 | UFold Lab: training, testing, and modifying UFold
[Download] Assignment #2 DUE THIS FRIDAY (Mar 29)! |
||
11 | Mon | Apr 1 | Midterm Review | See the midterm exam study guide | |
Wed | Apr 3 | Midterm Exam | |||
12 | Mon | Apr 8 | No class | Rescheduled to Thursday at 1pm for eclipse viewing | |
Wed | Apr 10 | Intro to Protein Structure |
Recap of class; Intro to protein structure: amino acids, backbone, side chain, alpha-helix, beta-strand, beta-sheet, phi/psi angles, Ramachandran Principle
[14_protein_structure.pdf] |
[1 Intro to Protein Structure] | |
Thu | Apr 11 | Structure Comparison and Databases |
PDB vs UniProt, Structure File Formats (PDB), Superposition, RMSD, Structure Alignment, TM-score, TM-align, GDT, lDDT
[15_structure_comparison_and_db.pdf] Assignment #3 released tomorrow |
[3 Structure Alignment]
[4 Data Resources for Structural Bioinformatics] [6 Introduction to Protein Structure Prediction] |
|
13 | Mon | Apr 15 | Protein Secondary Structure Prediction |
Multi-class classification, Accuracy metrics for multiple classes (micro vs. macro vs. weighted average), Segment Overlap Score (SOV), PHD method
[16_protein_secondary_structure_prediction.key] |
[Mona Singh's Lecture Notes (skip 1.1 and 1.2)] |
Wed | Apr 17 | AlphaFold2 |
AlphaFold2: Overview, Inputs, and Featurization
[17_alphafold2_overview.pdf] |
[alphafold2.pdf] [alphafold2-supp.pdf] | |
14 | Mon | Apr 22 | Evoformer part 1 |
AlphaFold2: Initialization of MSA and pair representations, Evoformer, Self-Attention
[18_alphafold2_evoformer.pdf] Assignment #3 DUE TODAY! Assignment #4 released tomorrow |
|
Wed | Apr 24 | Evoformer part 2 |
AlphaFold2: Outer product mean, Triangle updates (with and without self-attention)
[18_alphafold2_evoformer.pdf] |
||
15 | Mon | Apr 29 | No class | Canceled for NSF CISE workshop | |
Wed | May 1 | Structure Module |
ESMFold overview
[19_protein_language_models.pdf] |
||
16 | Mon | May 6 | Protein language models |
Training protein language models with a masked language modeling task; other applications of vector representations (e.g. clustering by family and contact prediction)
[19_protein_language_models.pdf]
Assignment #4 DUE TODAY! |
|
Wed | May 8 | Last class |
Overview of take-home final exam and get started on data analysis
Take-home FINAL EXAM is released |
||
Wed | May 15 | Exam DUE | Take-home FINAL EXAM DUE TODAY |