Schedule

Reading sources: PDSH refers to the Python Data Science Handbook (Vanderplas 2016); BHN refers to Fairness and Machine Learning (Barocas, Hardt, and Narayanan 2023).

Week 1

Mon
Feb. 10
Welcome!
We introduce our topic and discuss how the course works.
Learning Objectives
Getting Oriented
Reading
Course syllabus
Notes
Welcome slides
Data, Patterns, and Models
Warmup
Set up your software.
Assignments
Math pre-assessment.
Wed
Feb. 12
The Classification Workflow in Python
We work through a simple, complete example of training and evaluating a classification model on a small data set. A short code sketch follows this entry.
Learning Objectives
Navigation
Experimentation
Reading
PDSH: Data Manipulation with Pandas (through "Aggregation and Grouping")
Notes
Lecture notes
Live notes
Warmup
Manual linear prediction
Assignments
Blog Post: Penguins
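As an illustration of the workflow, a minimal scikit-learn sketch; the iris data and logistic regression model here are illustrative assumptions, not the exact in-class example:

    # Minimal classification workflow: load data, split, train, evaluate.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, y_train)

    print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))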

Week 2

Mon
Feb. 17
Linear Score-Based Classification
We study a fundamental method for binary classification in which data points are assigned scores. Scores above a certain threshold are assigned to one class; scores below are assigned to another. A small code sketch of this idea follows this entry.
Learning Objectives
Theory
Experimentation
Reading
Linear Classifiers from MITx.
Notes
Lecture notes
Live notes
Warmup
Decision Boundaries
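A minimal numpy sketch of the score-and-threshold idea; the weights, bias, and threshold below are made-up illustrative values:

    import numpy as np

    # Each row of X is a data point; the score is a linear function of the features.
    X = np.array([[1.0, 2.0], [3.0, 0.5], [-1.0, 1.0]])
    w = np.array([0.8, -0.5])   # weight vector (illustrative)
    b = 0.1                     # bias (illustrative)

    scores = X @ w + b          # one score per data point
    threshold = 0.0
    y_pred = (scores >= threshold).astype(int)  # 1 above the threshold, 0 below
    print(scores, y_pred)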
Wed
Feb. 19
Statistical Decision Theory and Automated Decision-Making
We discuss the theory of making automated decisions based on a score function. We go into detail on thresholding, error rates, and cost-based optimization. A code sketch of cost-based thresholding follows this entry.
Learning Objectives
Theory
Experimentation
Reading
PDSH: Introduction to Numpy
Notes
Lecture notes
Live notes
Warmup
Choosing a Threshold
Assignments
Blog Post: Design and Impact of Automated Decision Systems
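A sketch of choosing a threshold by minimizing total cost; the scores, labels, and error costs below are invented for illustration:

    import numpy as np

    scores = np.array([0.2, 0.4, 0.55, 0.7, 0.9])   # model scores (illustrative)
    y      = np.array([0,   0,   1,    0,   1])     # true labels (illustrative)
    c_fp, c_fn = 1.0, 5.0                            # assumed costs of false positives / negatives

    best = min(
        (c_fp * np.sum((scores >= t) & (y == 0)) +
         c_fn * np.sum((scores <  t) & (y == 1)), t)
        for t in np.linspace(0, 1, 101)
    )
    print("minimum total cost and best threshold:", best)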

Week 3

Mon
Feb. 24
Auditing Fairness
We introduce the topics of fairness and disparity in automated decision systems using a famous case study.
Learning Objectives
Social Responsibility
Experimentation
Reading
BHN: Introduction
Machine Bias by Julia Angwin et al. for ProPublica.
Notes
Lecture notes
Live notes
Warmup
Experiencing (Un)Fairness
Wed
Feb. 26
Statistical Definitions of Fairness in Automated Decision-Making
We offer formal mathematical definitions of several natural intuitions of fairness, review how to assess them empirically on data in Python, and prove that two major definitions are incompatible with each other. A short code sketch follows this entry.
Learning Objectives
Social Responsibility
Theory
Reading
BHN: Classification (ok to skip "Relationships between criteria" and below)
Notes
Lecture notes
Live notes
Warmup
BHN Reading Check
Assignments
Blog Post: Auditing Bias
OR
Blog Post: Bias Replication Study
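One way to check such definitions empirically is to compare decision and error rates across groups. A sketch with invented predictions, outcomes, and group labels:

    import numpy as np

    y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0])   # model decisions (illustrative)
    y_true = np.array([1, 0, 0, 1, 0, 1, 1, 0])   # true outcomes (illustrative)
    group  = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])

    for g in np.unique(group):
        m = group == g
        ppr = y_pred[m].mean()                         # positive prediction rate
        fpr = y_pred[m][y_true[m] == 0].mean()         # false positive rate
        fnr = 1 - y_pred[m][y_true[m] == 1].mean()     # false negative rate
        print(g, ppr, fpr, fnr)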

Week 4

Mon
Mar. 03
Normative Theory of Fairness
We discuss some of the broad philosophical and political positions that underlie the theory of fairness, and connect these positions to statistical definitions.
Learning Objectives
Social Responsibility
Reading
BHN: Relative Notions of Fairness
Notes
Discussion guide shared on Canvas
Warmup
COMPAS and Equality of Opportunity
Wed
Mar. 05
Critical Perspectives
We discuss several critical views that seek to move our attention beyond the fairness of algorithms and towards their role in sociotechnical systems. We center two questions: Who benefits from a given data science task? And what tasks could we approach instead if our aims were to uplift the oppressed?
Learning Objectives
Social Responsibility
Reading
Data Feminism: The Power Chapter by Catherine D'Ignazio and Lauren Klein
"The Digital Poorhouse" by Virginia Eubanks
"Studying Up: Reorienting the study of algorithmic fairness around issues of power" by Barabas et al.
Notes
Discussion guide shared on Canvas
Warmup
Power, Data, and Studying Up
Assignments
Blog Post: Limitations of the Quantitative Approach

Week 5

Mon
Mar. 10
No class
Phil is giving a talk at Michigan State
Wed
Mar. 12
Introduction to Model Training: The Perceptron
We study the perceptron as an example of a linear model with a training algorithm. Our understanding of this algorithm and its shortcomings will form the foundation of our future explorations in empirical risk minimization. A sketch of the perceptron update follows this entry.
Learning Objectives
Theory
Reading
No reading today, but please be ready to put some extra time into the warmup. It may be useful to review our lecture notes on score-based classification and decision theory when completing the warmup.
Notes
Lecture notes
Live notes
Warmup
Linear Models, Perceptron, and Torch
Assignments
Blog Post: Implementing Perceptron
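A compact numpy sketch of the classical perceptron update rule on a toy linearly separable data set; the blog post asks for a fuller implementation:

    import numpy as np

    # Toy linearly separable data with labels in {-1, +1}.
    X = np.array([[2.0, 1.0], [1.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]])
    y = np.array([1, 1, -1, -1])
    X_ = np.hstack([X, np.ones((X.shape[0], 1))])   # append a constant feature for the bias

    w = np.zeros(X_.shape[1])
    for _ in range(100):
        for xi, yi in zip(X_, y):
            if yi * (xi @ w) <= 0:      # point is misclassified (or on the boundary)
                w = w + yi * xi         # perceptron update
    print("learned weights:", w)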

Break

Mon
Mar. 17
Spring Break!
Wed
Mar. 19
Spring Break!

Week 6

Mon
Mar. 24
Convex Empirical Risk Minimization
We introduce the framework of convex empirical risk minimization, which offers a principled approach to overcoming the many limitations of the perceptron algorithm. A worked example follows this entry.
Learning Objectives
Theory
Reading
Convexity Examples by Stephen D. Boyles, pages 1 - 7 (ok to stop when we start talking about gradients and Hessians).
Notes
Lecture notes
Live notes
Warmup
Practice with Convex Functions
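As an illustrative worked example (not the notation used in lecture), the empirical risk under the logistic loss is a convex function of the weight vector; a quick numpy check of the midpoint inequality:

    import numpy as np

    def logistic_loss(s, y):
        # Convex loss for labels y in {-1, +1} and scores s = X @ w.
        return np.log(1 + np.exp(-y * s))

    def empirical_risk(w, X, y):
        return logistic_loss(X @ w, y).mean()

    X = np.array([[1.0, 2.0], [2.0, -1.0], [-1.0, 1.0]])
    y = np.array([1, -1, 1])

    w1, w2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
    mid = empirical_risk((w1 + w2) / 2, X, y)
    avg = (empirical_risk(w1, X, y) + empirical_risk(w2, X, y)) / 2
    print(mid <= avg)   # convexity implies True for any w1, w2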
Wed
Mar. 26
Gradient Descent
We study a method for finding the minima of convex functions using techniques from calculus and linear algebra. A minimal code sketch follows this entry.
Learning Objectives
Theory
Reading
No reading today, but please budget some extra time for the warmup.
Notes
Lecture notes
Live notes
Warmup
A First Look at Gradient Descent
Assignments
Blog Post: Implementing Logistic Regression
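A minimal gradient descent loop on a simple convex function; the objective, step size, and iteration count are illustrative choices:

    import numpy as np

    def f(w):
        return ((w - np.array([3.0, -1.0])) ** 2).sum()   # convex, minimized at (3, -1)

    def grad_f(w):
        return 2 * (w - np.array([3.0, -1.0]))

    w = np.zeros(2)
    alpha = 0.1                      # learning rate (illustrative)
    for _ in range(200):
        w = w - alpha * grad_f(w)    # gradient descent step
    print("approximate minimizer:", w)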

Week 7

Mon
Mar. 31
Feature Maps and Regularization
We re-introduce feature maps as a method for learning nonlinear decision boundaries, and add regularization to the empirical risk minimization problem in order to control the complexity of our learned models. A brief code sketch follows this entry.
Learning Objectives
Theory
Experimentation
Reading
No reading today; please think hard about your project pitches!
Notes
Lecture notes
Live notes
Warmup
Project Pitches
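A sketch of both ingredients using scikit-learn: a polynomial feature map for nonlinear boundaries plus an L2 penalty to control complexity. The data set and hyperparameters are illustrative:

    from sklearn.datasets import make_moons
    from sklearn.preprocessing import PolynomialFeatures
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

    # Degree-3 feature map followed by a regularized linear classifier.
    # In scikit-learn, a smaller C means stronger L2 regularization.
    model = make_pipeline(PolynomialFeatures(degree=3), LogisticRegression(C=0.5, max_iter=1000))
    model.fit(X, y)
    print("training accuracy:", model.score(X, y))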
Wed
Apr. 02
Linear Regression
We introduce regression (prediction of numerical outcomes) and study the ridge regression model. A short code sketch of the closed-form ridge solution follows this entry.
Learning Objectives
Theory
Experimentation
Reading
No reading today.
Notes
Lecture notes
Live notes
Warmup
Ordinary Least-Squares Linear Regression
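A numpy sketch of the standard closed-form ridge solution w = (X^T X + lambda I)^{-1} X^T y on synthetic data; the regularization strength lambda is an illustrative choice:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 3))
    y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=50)

    lam = 1.0                                     # regularization strength (illustrative)
    w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)
    print("ridge coefficients:", w_ridge)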

Week 8

Mon
Apr. 07
Bias-Variance Tradeoff
We explore the bias-variance tradeoff in regression and connect it to the phenomenon of overfitting. A small simulation sketch follows this entry.
Learning Objectives
Theory
Experimentation
Notes
Lecture notes
Live notes
Warmup
Variance of a Random Variable and Prediction
Assignments
Blog Post: Double Descent
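One way to see the tradeoff: fit polynomials of increasing degree to noisy data and compare test error. All settings below are illustrative:

    import numpy as np

    rng = np.random.default_rng(1)

    def make_data(n=30):
        x = rng.uniform(-1, 1, n)
        return x, np.sin(3 * x) + 0.3 * rng.normal(size=n)

    x_test, y_test = make_data(200)
    for degree in [1, 3, 10]:
        x, y = make_data()
        coef = np.polyfit(x, y, degree)                     # fit a polynomial of this degree
        test_mse = np.mean((np.polyval(coef, x_test) - y_test) ** 2)
        print(degree, round(float(test_mse), 3))            # high degree tends to overfit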
Wed
Apr. 09
Vectorization and Feature Engineering
We illustrate the interplay of vectorization and feature engineering on image data. A code sketch of an image kernel follows this entry.
Learning Objectives
Experimentation
Implementation
Reading
Image Kernels Explained Visually by Victor Powell
Notes
Lecture notes
Live notes
Warmup
Small break today: no warmup.
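A hand-rolled numpy sketch of sliding a small kernel over an image, in the spirit of the reading; the Sobel-style kernel is just one example:

    import numpy as np

    def convolve2d(image, kernel):
        # Valid-mode 2D cross-correlation: slide the kernel over the image.
        kh, kw = kernel.shape
        H, W = image.shape
        out = np.zeros((H - kh + 1, W - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
        return out

    image = np.random.default_rng(0).random((8, 8))
    edge_kernel = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])   # Sobel-style kernel
    print(convolve2d(image, edge_kernel).shape)                    # (6, 6)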

Week 9

Mon
Apr. 14
Kernel Methods
We introduce kernel methods for using high-dimensional feature maps in linear empirical risk minimization without the need to explicitly form feature vectors. A short code sketch follows this entry.
Learning Objectives
Theory
Experimentation
Notes
Lecture notes
Live notes
Warmup
Project Update
Assignments
Blog Post: Kernelized Logistic Regression
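A sketch of the central object in kernel methods, the kernel matrix, using an RBF kernel; the bandwidth and data are illustrative:

    import numpy as np

    def rbf_kernel(X1, X2, gamma=1.0):
        # K[i, j] = exp(-gamma * ||x_i - z_j||^2), computed without explicit feature vectors.
        sq_dists = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-gamma * sq_dists)

    X = np.random.default_rng(0).normal(size=(5, 2))
    K = rbf_kernel(X, X)
    print(K.shape, np.allclose(K, K.T))   # (5, 5), symmetric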
Wed
Apr. 16
The Problem of Features and Deep Learning
We motivate deep learning as an approach to the problem of learning complex nonlinear features in data.
Learning Objectives
Theory
Experimentation
Notes
Lecture notes
Live notes
Warmup
Nonlinear Fitting and Convexity

Week 10

Mon
Apr. 21
Contemporary Optimization
We briefly introduce two concepts in optimization that have enabled large-scale deep learning: stochastic first-order optimization techniques and automatic differentiation. A short code sketch combining the two follows this entry.
Learning Objectives
Theory
Experimentation
Notes
Lecture notes
Live notes
Warmup
Project Update
Assignments
Blog Post: Advanced Optimization
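A minimal torch sketch combining the two ideas: autograd computes gradients, and minibatch SGD updates the parameters. The model, data, and hyperparameters are illustrative:

    import torch

    # Toy regression data.
    X = torch.randn(100, 3)
    y = X @ torch.tensor([1.0, -2.0, 0.5]) + 0.1 * torch.randn(100)

    w = torch.zeros(3, requires_grad=True)
    opt = torch.optim.SGD([w], lr=0.1)

    for epoch in range(50):
        for i in range(0, 100, 10):                 # minibatches of size 10
            xb, yb = X[i:i+10], y[i:i+10]
            loss = ((xb @ w - yb) ** 2).mean()      # minibatch loss
            loss.backward()                         # automatic differentiation
            opt.step()                              # stochastic gradient step
            opt.zero_grad()
    print(w.detach())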
Wed
Apr. 23
Deep Image Classification
We return to the image classification problem, using deep learning and large-scale optimization to learn convolutional kernels as part of the training process. A small code sketch follows this entry.
Learning Objectives
Theory
Experimentation
Reading
Convolutional Neural Networks from MIT's course 6.036: Introduction to Machine Learning.
Notes
Lecture notes
Live notes
Warmup
What needs to be learned?
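A small torch sketch of the key point that convolutional kernels are themselves learned parameters; the architecture below is an arbitrary illustration, not the one from lecture:

    import torch
    from torch import nn

    # The Conv2d weights play the role of image kernels, but are learned from data.
    model = nn.Sequential(
        nn.Conv2d(1, 8, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Flatten(),
        nn.Linear(8 * 14 * 14, 10),
    )

    x = torch.randn(16, 1, 28, 28)    # a batch of fake 28x28 grayscale images
    print(model(x).shape)             # torch.Size([16, 10])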

Week 11

Mon
Apr. 28
Text Classification and Word Embedding
We briefly study the use of word embeddings for text classification. A toy code sketch follows this entry.
Learning Objectives
Theory
Experimentation
Reading
Efficient Estimation of Word Representations in Vector Space by Mikolov et al. (sections 1, 4, 5)
Notes
Lecture notes
Live notes
Warmup
Project Update
Assignments
Deep Music Classification
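A toy sketch of turning a sentence into a feature vector by averaging word embeddings, which could then feed a linear classifier; the vocabulary and (randomly initialized) embeddings are invented:

    import numpy as np

    # Toy vocabulary and embedding matrix. In practice these vectors would be
    # pretrained, e.g. in the style of word2vec from the reading.
    vocab = {"the": 0, "movie": 1, "was": 2, "great": 3, "terrible": 4}
    rng = np.random.default_rng(0)
    E = rng.normal(size=(len(vocab), 8))      # one 8-dimensional vector per word

    def embed(sentence):
        idx = [vocab[w] for w in sentence.split() if w in vocab]
        return E[idx].mean(axis=0)            # average word vectors -> sentence vector

    print(embed("the movie was great").shape)   # (8,)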
Wed
Apr. 30
Unsupervised Learning and Autoencoders
We introduce unsupervised learning through the framework of autoencoders. A short code sketch follows this entry.
Learning Objectives
Theory
Experimentation
Reading
K-Means Clustering from PDSH
Notes
Lecture notes
Live notes
Warmup
Compression factor of k-means
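Viewed through the autoencoder lens, k-means "encodes" each point as a cluster label and "decodes" that label back to the cluster centroid. A short scikit-learn sketch on made-up data:

    import numpy as np
    from sklearn.cluster import KMeans

    X = np.random.default_rng(0).normal(size=(200, 2))

    km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
    codes = km.labels_                           # the "encoding": one integer per point
    reconstruction = km.cluster_centers_[codes]  # the "decoding": each point becomes its centroid
    print(np.mean((X - reconstruction) ** 2))    # reconstruction error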

Week 12

Mon
May. 05
Neural Autoencoders and Dimensionality Reduction
We use neural autoencoders to learn low-dimensional structure in more complex data sets. A compact code sketch follows this entry.
Learning Objectives
Theory
Experimentation
Notes
Lecture notes
Live notes
Warmup
Project Update
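A compact torch sketch of a neural autoencoder with a 2-dimensional bottleneck; the architecture and training details are illustrative:

    import torch
    from torch import nn

    # Encoder compresses 20-dimensional inputs to 2 dimensions; decoder reconstructs.
    encoder = nn.Sequential(nn.Linear(20, 8), nn.ReLU(), nn.Linear(8, 2))
    decoder = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 20))

    X = torch.randn(256, 20)                        # fake data
    opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-2)

    for epoch in range(200):
        X_hat = decoder(encoder(X))                 # reconstruct through the bottleneck
        loss = ((X_hat - X) ** 2).mean()            # reconstruction error
        opt.zero_grad(); loss.backward(); opt.step()

    Z = encoder(X).detach()                         # learned 2-d representation
    print(Z.shape)                                  # torch.Size([256, 2])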
Wed
May. 07
Project presentations!
We celebrate your projects and learn about what you've done!
Learning Objectives
Project



© Phil Chodrow, 2025

References

Barocas, Solon, Moritz Hardt, and Arvind Narayanan. 2023. Fairness and Machine Learning: Limitations and Opportunities. Cambridge, Massachusetts: The MIT Press.
Vanderplas, Jacob T. 2016. Python Data Science Handbook: Essential Tools for Working with Data. First edition. Sebastopol, CA: O’Reilly Media, Inc.