Instructor: Sanjeev Arora (arora AT cs.princeton.edu). Office hours: Wed 4:30-5:30, COS407
Teaching assistant: Gon Buzaglo (gon.buzaglo AT princeton.edu). Office hours: Mon 4:30-5:30, COS431
Lectures: Monday/Wednesday 3:00-4:30 pm
Location: Friend Center 008

This course is a mathematical and conceptual introduction to Deep Learning: basic concepts, model classes, paradigms, and attempts at analysis. We will cover some ML theory (learning rate, SGD, generalization, etc.) and then some advanced topics: Normalization, Implicit Bias, Generative Models, Recurrent Nets, Contrastive Learning, Self-Supervised Learning, Transformers, Diffusion Models, Private Learning, Interpretability, Fine-tuning of Large Pretrained Models, etc.

Prerequisites: The course is appropriate for graduate students who were very comfortable with undergraduate coursework involving mathematical proofs (e.g., CS theory, Optimization/Applied Math, etc.). Undergraduates need the instructor's permission to enroll.

Course Materials: We will mainly use the Theory of Deep Learning book draft, which can be found here.

Assignments and Projects

  • Assignments (50%)
  • Projects (50%)
    • Description: The course project is a research project suggested by you. Projects may be either theoretical or practical in nature, as long as they are closely relevant to the course topics. See detailed guidelines here, and past projects here.
    • Project grading: 40% for the project paper, 10% for the final project presentation.
    • Collaboration is recommended, in teams of 2-3 (talk to us about deviations; larger teams should take on more complicated projects).
    • Project Milestones
      • Project and team selection: November 1
      • Project idea presentations: November 19 (3 min talk per team)
      • Final project presentations: December 10 (after classes end, whole day with two sessions)
      • Final submission: Morning of December 15
Please note that all project presentations are in person.

Tentative schedule

| Date | Topic | Materials |
| --- | --- | --- |
| Sept 3 | Optimization theory | Lecture PDF |
| Sept 8 | Optimization contd. | Lecture PDF |
| Sept 10 | Generalization theory | Lecture PDF |
| Sept 15 | Generalization (lecture by Noam Razin) | Lecture PDF |
| Sept 17 | Role of training algorithm in generalization (Implicit bias) (lecture by Noam Razin) | Lecture PDF |
| Sept 22 | Credit Attribution/influence functions | Lecture PDF |
| Sept 24 | Credit Attribution/Shapley value | Lecture PDF |
| Sept 29 | KL divergence and Distribution learning | Lecture PDF |
| Oct 1 | Diffusion Models | Lecture PDF |
| Oct 6 | Deep learning architectures: convolutions + theory | Lecture PDF |
| Oct 8 | Deep learning architectures: normalization + theory | Lecture PDF |
| Oct 19 | Deep learning architectures: normalization + theory | Lecture PDF |
| Oct 21 | Language modeling, cross-entropy, notions of generalization | Lecture PDF |
| Oct 27 | Post-training: Making LLMs useful | Lecture PDF |
| Oct 29 | Post-training: Making LLMs useful | Lecture PDF |
| Nov 3 | Learning to be robust against adversaries | Lecture PDF |
| Nov 5 | Guest Lecture by Tri Dao | |
| Nov 10 | Generative Adversarial Networks | Lecture PDF |
| Nov 13 | Generative Adversarial Networks | Lecture PDF |
| Nov 17 | Deep Learning Optimization | Lecture PDF |
| Nov 19 | Project Ideas | |
| Nov 24 | The Physics of Representations: Superposition | Lecture PDF |
| Dec 1 | AI for math | Lecture PDF |
| Dec 3 | Guest Lecture by Tengyu Ma | |