CMSC 828L Deep Learning
Staff
Professor David Jacobs, AV Williams 4421
Office Hours: Tuesday, 11-12
djacobs-at-cs
TAs: Chengxi Ye (yechengxi-at-gmail)
Angjoo Kanazawa (firstname.lastname-at-gmail)
Soumyadip Sengupta (senguptajuetce-at-gmail)
Jin Sun (firstnamelastname-at-cs)
Hao Zhou (zhhoper-at-gmail)
Readings
Much of the reading for class will come from two books available on-line:
Neural Networks and Deep Learning, by Michael Nielsen
Deep Learning, by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
Other reading material appears in the schedule below.
Requirements
Students registered for this class must complete the following assignments:
Presentation: Students will form eight groups of four students each. Each group will be responsible for one class. They will present papers and lead a discussion on one of the discussion topics listed on the schedule. Discussion topics are marked in blue (applications) and red (more theoretical material). Professor Jacobs will lead the discussion for topics not selected by the students. Note that there is room on the schedule for some groups to suggest their own topics. Presentations will be graded according to the following rubric.
Paper Summaries: For eight of the discussion classes, students must turn in a one-page summary of one of the papers to be discussed that day. Summaries should contain one paragraph that summarizes the paper and one paragraph that provides some analysis of the work, including suggestions for possible questions to discuss. Summaries must be handed in before the start of class, and students must attend class on the days on which they hand in summaries.
Problem Sets: There will be three problem sets assigned during the course. These will include programming projects and may also include written exercises.
Final Project: Students will undertake a final project for the class. These may be done alone or in teams. Students should discuss their topic with the professor.
Assignments
Assignment | Assigned | Due
Problem Set 1 | 9/20/16 | 10/11/16
Problem Set 2 | 10/18/16 | 11/8/16
Final Project | | 12/8/16
Tentative Schedule
Class | Date | Topic | Presenters | Reading
Class 1 | 8/30 | Introduction | |
Class 2 | 9/1 | Intro to Machine Learning | | Deep Learning, Chapter 5
Class 3 | 9/6 | Intro to Machine Learning: Linear models (SVMs and perceptrons, logistic regression) | | For logistic regression, see this chapter from Cosma Shalizi
Class 4 | 9/8 | Intro to Neural Nets: What a shallow network computes | | Deep Learning, Chapter 6; Neural Networks and Deep Learning, Chapter 2
Class 5 | 9/13 | Training a network: loss functions, backpropagation, and stochastic gradient descent | | A Tutorial on Energy-Based Learning, by LeCun et al.; Neural Networks and Deep Learning, Chapter 3
Class 6 | 9/15 | Neural networks as universal function approximators | | Approximation by Superpositions of a Sigmoidal Function, by George Cybenko (1989); Multilayer Feedforward Networks are Universal Approximators, by Kurt Hornik, Maxwell Stinchcombe, and Halbert White (1989); Neural Networks and Deep Learning, Chapter 4
Class 7 | 9/20 | Deep Networks: Backpropagation and regularization, batch normalization | | Deep Learning, Chapter 7
Class 8 | 9/22 | VC Dimension and Neural Nets | David | VC Dimension of Neural Networks, by Sontag
Class 9 | 9/27 | Why are deep networks better than shallow? | David | On the Number of Linear Regions of Deep Neural Networks, by Montufar, Pascanu, Cho, and Bengio (NIPS 2014, pp. 2924–2932); The Power of Depth for Feedforward Neural Networks, by Eldan and Shamir
Class 10 | 9/29 | Why are deep networks better than shallow? | David | Benefits of Depth in Neural Networks, by Matus Telgarsky
Class 11 | 10/4 | Convolutional Networks | | Deep Learning, Chapter 9
Class 12 | 10/6 | Applications: ImageNet | David | ImageNet Classification with Deep Convolutional Neural Networks, by Krizhevsky et al.; Very Deep Convolutional Networks for Large-Scale Image Recognition, by Simonyan and Zisserman; Deep Residual Learning for Image Recognition, by He et al.; Residual Networks are Exponential Ensembles of Relatively Shallow Networks, by Veit et al. Also of interest: Neural Networks and Deep Learning, Chapter 5; On the Difficulty of Training Recurrent Neural Networks, by Pascanu et al.
Class 13 | 10/11 (ECCV) | Applications: Detection | Ankan, Upal, Amit, Weian | Rich feature hierarchies for accurate object detection and semantic segmentation, by Girshick et al.
Class 14 | 10/13 (ECCV) | Audio | Jiao, Philip | WaveNet: A Generative Model for Raw Audio, by van den Oord et al. See also the WaveNet blog post
Class 15 | 10/18 | What does a neuron compute? | Nitin, Kiran | Visualizing and Understanding Convolutional Networks, by Zeiler and Fergus
Class 16 | 10/20 | Dimensionality reduction, linear (PCA, LDA) and manifolds, metric learning | | PCA (slides from Olga Veksler); LDA (slides from Olga Veksler); Metric Learning: A Survey, by Brian Kulis; An Elementary Proof of the Johnson-Lindenstrauss Lemma, by Dasgupta and Gupta
Class 17 | 10/25 | Autoencoders and dimensionality reduction in networks | | Deep Learning, Chapter 14
Class 18 | 10/27 | Applications: Natural Language Processing (e.g., Word2vec) | Amr, Prudhui, Sanghyun, Faez | Efficient Estimation of Word Representations in Vector Space, by Mikolov et al.
Class 19 | 11/1 | Applications: Joint Detection | Chinmaya, Huaijen, Ahmed, Spandan | Convolutional Pose Machines, by Wei et al.; Stacked Hourglass Networks for Human Pose Estimation, by Newell et al.; Recurrent Network Models for Human Dynamics, by Fragkiadaki et al.
Class 20 | 11/3 | Neuroscience: What does a neuron do? | David | Spiking Neuron Models (Cambridge Univ. Press), Chapter 1 and Sections 10.1 and 10.2
Class 21 | 11/8 | Applications: Bioinformatics | Somay, Jay, Varun, Ashwin | Predicting effects of noncoding variants with deep learning–based sequence model, by Zhou and Troyanskaya
Class 22 | 11/10 | Optimization in Deep Networks | Zheng | The Loss Surfaces of Multilayer Networks, by Choromanska et al.; No Bad Local Minima: Data Independent Training Error Guarantees for Multi-layer Neural Networks, by Soudry and Carmon
Class 23 | 11/15 | Generalization in Neural Networks | David | Generative Adversarial Networks, by Goodfellow et al.; Margin Preservation of Deep Neural Networks, by Sokolic et al.
Class 24 | 11/17 | Applications: Face recognition | Hui, Huijing, Mustafa | DeepFace: Closing the Gap to Human-Level Performance in Face Verification, by Taigman et al.; FaceNet: A Unified Embedding for Face Recognition and Clustering, by Schroff et al.; Deep Face Recognition, by Parkhi et al.
Class 25 | 11/22 | Spatial Transformer Networks | Angjoo | Spatial Transformer Networks, by Jaderberg et al.; WarpNet: Weakly Supervised Matching for Single-view Reconstruction, by Kanazawa et al.
Class 26 | 11/29 | Recurrent networks, LSTM | |
Class 27 | 12/1 | Applications: Scene Understanding | Abhay, Rajeev, Palabi | Attend, Infer, Repeat: Fast Scene Understanding with Generative Models, by Eslami et al.
Class 28 | 12/6 (NIPS) | Applications: Generating Image Captions | Mingze, Chirag, Wei, Yanzhou | Deep Fragment Embeddings for Bidirectional Image Sentence Mapping, by Karpathy et al.; Deep Visual-Semantic Alignments for Generating Image Descriptions, by Karpathy et al.; DenseCap: Fully Convolutional Localization Networks for Dense Captioning, by Johnson et al.
Class 29 | 12/8 (NIPS) | Overview discussion | David | Building Machines That Learn and Think Like People, by Brenden M. Lake, Tomer D. Ullman, Joshua B. Tenenbaum, and Samuel J. Gershman