Instructor Sanjeev Arora (arora AT cs.princeton.edu). Office hour: Wed 4:30-5:30, COS407
Teaching assistant Gon Buzaglo (gon.buzaglo AT princeton.edu). Office hour: Mon 4:30-5:30, COS431
Lectures Monday/Wednesday 3:00-4:30 pm
Location Friend Center 008

This course is a mathematical and conceptual introduction to Deep Learning: basic concepts, model classes, paradigms, and attempts at analysis. We will cover some ML theory (learning rate, SGD, generalization, etc.) and then some advanced topics: Normalization, Implicit Bias, Generative Models, Recurrent Nets, Contrastive Learning, Self-Supervised Learning, Transformers, Diffusion Models, Private Learning, Interpretability, Fine-tuning of Large Pretrained Models, etc.

Prerequisites: The course is appropriate for graduate students who were very comfortable with undergraduate coursework involving mathematical proofs (e.g., CS theory, Optimization/Applied Math, etc.). Undergraduates need the instructor's permission to enroll.

Course Materials: We will mainly use the Theory of Deep Learning book draft, which can be found here.

Assignments and Projects

  • Assignments (50%)
    • Collaboration in groups of 2-3 is encouraged.
    • Assignment 1: Optimization
      • Released: September 10
      • Due: September 22
    • Assignment 2: Generalization
      • Released: September 24
      • Due: October 6
    • Assignment 3
      • Released: October 8
      • Due: After fall break (October 22)
    • Assignment 4
      • Released: October 22
      • Due: November 3
    • Assignment 5 (tentative, shorter one)
  • Projects (50%)
    • Description: The course project is a research project that you propose. Projects may be either theoretical or practical in nature, as long as they are closely relevant to the course topics. We will try to release some examples from past years soon.
    • Project grading: 40% for the project paper, 10% for the final project presentation.
    • Collaboration in teams of 2-3 is recommended (talk to us about deviations; larger teams should take on more complicated projects).
    • Project Milestones
      • Project and team selection: November 1
      • Project idea presentations: November 19 (3 min talk per team)
      • Final project presentations: December 10 (after classes end, whole day with two sessions)
      • Final submission: Morning of December 15
Please note that all project presentations are in person.

Tentative schedule

Date | Topic | Materials
Sept 3 | Optimization theory | Lecture PDF
Sept 8 | Optimization contd. | Lecture PDF
Sept 10 | Generalization theory | Lecture PDF
Sept 15 | Generalization contd. | Lecture PDF
Sept 17 | Role of training algorithm in generalization (Implicit bias) |
Sept 22 | Credit Attribution/influence functions |
Sept 24 | Linear data models |
Sept 29 | KL divergence and Distribution learning |
Oct 1 | Diffusion Models |
Oct 6 | Diffusion Models (contd.) |
Oct 8 | Deep learning architectures: convolutions + theory |
Oct 19 | Deep learning architectures: normalization + theory |
Oct 21 | Language modeling, cross-entropy, notions of generalization |
Oct 26 | Transformers. Scaling laws |
Oct 28 | Emergence of complex skills from scaling. |
Nov 1 | LLM Alignment |
Nov 3 | Project ideas |
Nov 8 | LLM Alignment 2 |
Nov 15 | Recurrent architectures for language modeling: State space models |