Schedule
- Readings in normal font should be completed and annotated ahead of lecture.
- Readings in italic provide optional additional depth on the material.
- Assignments are listed on the day when I suggest you begin working on them.
Reading sources:
- PSC: Lecture notes I’ve written for this course, hosted here.
- PDSH: The Python Data Science Handbook by Vanderplas (2016).
- BHN: Fairness and Machine Learning: Limitations and Opportunities by Barocas, Hardt, and Narayanan (2023).
Week 1
Mon Feb. 10: Welcome!
We introduce our topic and discuss how the course works.
- Learning Objectives: Getting Oriented
- Reading: Course syllabus
- Notes: Welcome slides; Data, Patterns, and Models
- Warmup: Set up your software.
- Assignments: Math pre-assessment.
Wed Feb. 12: The Classification Workflow in Python
We work through a simple, complete example of training and evaluating a classification model on a small data set.
- Learning Objectives: Navigation, Experimentation
- Reading: PDSH: Data Manipulation with Pandas (through "Aggregation and Grouping")
- Notes: Lecture notes; Live notes
- Warmup: Manual linear prediction
- Assignments: Blog Post: Penguins
Week 2
Mon Feb. 17: Linear Score-Based Classification
We study a fundamental method for binary classification in which data points are assigned scores. Scores above a certain threshold are assigned to one class; scores below are assigned to the other.
- Learning Objectives: Theory, Experimentation
- Reading: Linear Classifiers from MITx
- Notes: Lecture notes; Live notes
- Warmup: Decision Boundaries
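A minimal sketch of this scheme, with invented weights, threshold, and data points:

```python
# Score-based binary classification: a linear score s(x) = w.x + b,
# thresholded at t, assigns class 1 when s(x) >= t and class 0 otherwise.

def score(x, w, b):
    """Linear score of a feature vector x under weights w and bias b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(x, w, b, t=0.0):
    """Classify x by comparing its score to the threshold t."""
    return 1 if score(x, w, b) >= t else 0

# Toy example with made-up weights:
w, b = [2.0, -1.0], 0.5
print(predict([1.0, 1.0], w, b))   # score = 1.5 >= 0, so class 1
print(predict([-1.0, 2.0], w, b))  # score = -3.5 < 0, so class 0
```

The threshold t is a free choice; moving it trades one kind of error for another, which is the subject of the next lecture.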
Wed Feb. 19: Statistical Decision Theory and Automated Decision-Making
We discuss the theory of making automated decisions based on a score function, going into detail on thresholding, error rates, and cost-based optimization.
- Learning Objectives: Theory, Experimentation
- Reading: PDSH: Introduction to NumPy
- Notes: Lecture notes; Live notes
- Warmup: Choosing a Threshold
- Assignments: Blog Post: Design and Impact of Automated Decision Systems
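A hedged sketch of cost-based threshold selection, with made-up scores, labels, and error costs: sweep candidate thresholds, compute false positive and false negative rates, and pick the threshold that minimizes a weighted cost.

```python
# Choosing a threshold by minimizing expected cost. All data and the
# relative costs c_fp, c_fn below are invented for illustration.

def error_rates(scores, labels, t):
    """(False positive rate, false negative rate) at threshold t."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
    return fp / labels.count(0), fn / labels.count(1)

def best_threshold(scores, labels, c_fp=1.0, c_fn=5.0):
    """Candidate threshold with the lowest weighted error cost."""
    def cost(t):
        fpr, fnr = error_rates(scores, labels, t)
        return c_fp * fpr + c_fn * fnr
    return min(sorted(set(scores)), key=cost)

scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]
labels = [0,   0,   1,    1,   1,   0  ]
t = best_threshold(scores, labels)  # here, 0.35: no false negatives, one false positive
```

Because false negatives are five times as costly here, the chosen threshold tolerates a false positive to avoid missing any positives.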
Week 3
Mon Feb. 24: Auditing Fairness
We introduce the topics of fairness and disparity in automated decision systems using a famous case study.
- Learning Objectives: Social Responsibility, Experimentation
- Reading: BHN: Introduction; Machine Bias by Julia Angwin et al. for ProPublica
- Notes: Lecture notes; Live notes
- Warmup: Experiencing (Un)Fairness
Wed Feb. 26: Statistical Definitions of Fairness in Automated Decision-Making
We offer formal mathematical definitions of several natural intuitions of fairness, review how to assess them empirically on data in Python, and prove that two major definitions are incompatible with each other.
- Learning Objectives: Social Responsibility, Theory
- Reading: BHN: Classification (ok to skip "Relationships between criteria" and below)
- Notes: Lecture notes; Live notes
- Warmup: BHN Reading Check
- Assignments: Blog Post: Auditing Bias OR Blog Post: Bias Replication Study
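One family of such definitions compares error rates across groups. A toy sketch of the empirical check, with invented predictions, labels, and group memberships:

```python
# Auditing error-rate parity: compute the false positive rate separately
# for each group and compare. All data below is made up for illustration.

def fpr_by_group(preds, labels, groups):
    """False positive rate of binary predictions, per group label."""
    rates = {}
    for g in set(groups):
        fp = sum(1 for p, y, gr in zip(preds, labels, groups)
                 if gr == g and p == 1 and y == 0)
        neg = sum(1 for y, gr in zip(labels, groups) if gr == g and y == 0)
        rates[g] = fp / neg
    return rates

preds  = [1, 0, 1, 1, 0, 0, 1, 0]
labels = [1, 0, 0, 1, 0, 1, 0, 0]
groups = ["a", "a", "a", "b", "b", "b", "b", "b"]
print(fpr_by_group(preds, labels, groups))  # unequal rates across groups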
Week 4
Mon Mar. 03: Normative Theory of Fairness
We discuss some of the broad philosophical and political positions that underlie the theory of fairness, and connect these positions to statistical definitions.
- Learning Objectives: Social Responsibility
- Reading: BHN: Relative Notions of Fairness
- Notes: Discussion guide shared on Canvas
- Warmup: COMPAS and Equality of Opportunity
Wed Mar. 05: Critical Perspectives
We discuss several critical views that seek to move our attention beyond the fairness of algorithms and toward their role in sociotechnical systems. We center two questions: who benefits from a given data science task, and what tasks could we approach instead if our aims were to uplift the oppressed?
- Learning Objectives: Social Responsibility
- Reading: Data Feminism: The Power Chapter by Catherine D'Ignazio and Lauren Klein; "The Digital Poorhouse" by Virginia Eubanks; "Studying Up: Reorienting the study of algorithmic fairness around issues of power" by Barabas et al.
- Notes: Discussion guide shared on Canvas
- Warmup: Power, Data, and Studying Up
- Assignments: Blog Post: Limitations of the Quantitative Approach
Week 5
Mon Mar. 10: No class
Phil is giving a talk at Michigan State.
Wed Mar. 12: Introduction to Model Training: The Perceptron
We study the perceptron as an example of a linear model with a training algorithm. Our understanding of this algorithm and its shortcomings will form the foundation of our future explorations in empirical risk minimization.
- Learning Objectives: Theory
- Reading: No reading today, but please be ready to put some extra time into the warmup. It may be useful to review our lecture notes on score-based classification and decision theory.
- Notes: Lecture notes; Live notes
- Warmup: Linear Models, Perceptron, and Torch
- Assignments: Blog Post: Implementing Perceptron
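A minimal sketch of the perceptron training loop in plain Python (toy data; labels in {-1, +1}): when a point is misclassified, nudge the weights toward or away from it.

```python
# Perceptron training: repeat passes over the data, updating the weights
# whenever a point is misclassified. Toy data invented for illustration.

def perceptron(X, y, epochs=100):
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        updated = False
        for x, yi in zip(X, y):
            s = sum(wi * xi for wi, xi in zip(w, x)) + b
            if yi * s <= 0:                       # misclassified (or on boundary)
                w = [wi + yi * xi for wi, xi in zip(w, x)]
                b += yi
                updated = True
        if not updated:                           # all points correct: converged
            break
    return w, b

# Linearly separable toy data:
X = [[1.0, 1.0], [2.0, 2.0], [-1.0, -1.0], [-2.0, -1.0]]
y = [1, 1, -1, -1]
w, b = perceptron(X, y)
```

On linearly separable data like this, the loop terminates with every point correctly classified; its failure on non-separable data motivates empirical risk minimization.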
Break
Mon Mar. 17: Spring Break!
Wed Mar. 19: Spring Break!
Week 6
Mon Mar. 24: Convex Empirical Risk Minimization
We introduce the framework of convex empirical risk minimization, which offers a principled approach to overcoming the many limitations of the perceptron algorithm.
- Learning Objectives: Theory
- Reading: Convexity Examples by Stephen D. Boyles, pages 1-7 (ok to stop when we start talking about gradients and Hessians)
- Notes: Lecture notes; Live notes
- Warmup: Practice with Convex Functions
Wed Mar. 26: Gradient Descent
We study a method for finding the minima of convex functions using techniques from calculus and linear algebra.
- Learning Objectives: Theory
- Reading: No reading today, but please budget some extra time for the warmup.
- Notes: Lecture notes; Live notes
- Warmup: A First Look at Gradient Descent
- Assignments: Blog Post: Implementing Logistic Regression
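A minimal sketch of the method on a one-dimensional convex function (learning rate and step count chosen arbitrarily): repeatedly step opposite the gradient.

```python
# Gradient descent on the convex function f(w) = (w - 3)^2,
# whose gradient is f'(w) = 2(w - 3) and whose minimizer is w = 3.

def gradient_descent(grad, w0, lr=0.1, steps=200):
    """Iterate w <- w - lr * grad(w) and return the final iterate."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w

w_star = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
print(w_star)  # approaches the minimizer w = 3
```

For convex functions, a suitably small learning rate guarantees convergence toward a global minimizer; too large a rate can overshoot and diverge.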
Week 7
Mon Mar. 31: Feature Maps and Regularization
We re-introduce feature maps as a method for learning nonlinear decision boundaries, and add regularization to the empirical risk minimization problem in order to control the complexity of our learned models.
- Learning Objectives: Theory, Experimentation
- Reading: No reading today; please think hard about your project pitches!
- Notes: Lecture notes; Live notes
- Warmup: Project Pitches
Wed Apr. 02: Linear Regression
We introduce regression (prediction of numerical outcomes) and study ridge regression, a regularized approach to linear regression.
- Learning Objectives: Theory, Experimentation
- Reading: No reading today.
- Notes: Lecture notes; Live notes
- Warmup: Ordinary Least-Squares Linear Regression
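A hedged one-dimensional sketch of ridge regression (no intercept, toy data): the penalized least-squares problem has a simple closed form.

```python
# Ridge regression in one dimension (no intercept): minimizing
#   sum_i (y_i - w x_i)^2 + lam * w^2
# over w gives the closed form  w = (sum_i x_i y_i) / (sum_i x_i^2 + lam).

def ridge_1d(x, y, lam=1.0):
    """Closed-form ridge solution for a single feature with no intercept."""
    return sum(xi * yi for xi, yi in zip(x, y)) / (sum(xi ** 2 for xi in x) + lam)

x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]              # exactly y = 2x
print(ridge_1d(x, y, lam=0.0))   # ordinary least squares recovers the slope 2
print(ridge_1d(x, y, lam=1.0))   # regularization shrinks the slope toward 0
```

Setting lam = 0 recovers ordinary least squares; larger lam trades fit for smaller (lower-complexity) coefficients, previewing the bias-variance discussion.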
Week 8
Mon Apr. 07: Bias-Variance Tradeoff
We explore the bias-variance tradeoff in regression and connect it to the phenomenon of overfitting.
- Learning Objectives: Theory, Experimentation
- Notes: Lecture notes; Live notes
- Warmup: Variance of a Random Variable and Prediction
- Assignments: Blog Post: Double Descent
Wed Apr. 09: Vectorization and Feature Engineering
We illustrate the interplay of vectorization and feature engineering on image data.
- Learning Objectives: Experimentation, Implementation
- Reading: Image Kernels Explained Visually by Victor Powell
- Notes: Lecture notes; Live notes
- Warmup: Small break today: no warmup.
Week 9
Mon Apr. 14: Kernel Methods
We introduce kernel methods for using high-dimensional feature maps in linear empirical risk minimization without the need to explicitly form feature vectors.
- Learning Objectives: Theory, Experimentation
- Notes: Lecture notes; Live notes
- Warmup: Project Update
- Assignments: Blog Post: Kernelized Logistic Regression
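A toy illustration of the idea with an RBF kernel (the choice of kernel and data here is purely illustrative): the kernel evaluates an inner product in a high-dimensional feature space without ever constructing the feature vectors.

```python
# The kernel trick: k(x, x') = exp(-gamma * ||x - x'||^2) acts as an inner
# product in an (infinite-dimensional) feature space, computed directly
# from the original coordinates.
import math

def rbf_kernel(x1, x2, gamma=1.0):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-gamma * sq_dist)

def kernel_matrix(X, gamma=1.0):
    """Gram matrix K[i][j] = k(X[i], X[j]) used in kernelized ERM."""
    return [[rbf_kernel(xi, xj, gamma) for xj in X] for xi in X]

X = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
K = kernel_matrix(X)
# K is symmetric with ones on the diagonal, since k(x, x) = exp(0) = 1.
```

Kernelized methods work entirely with this Gram matrix, which is why they scale with the number of data points rather than the dimension of the feature space.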
Wed Apr. 16: The Problem of Features and Deep Learning
We motivate deep learning as an approach to the problem of learning complex nonlinear features in data.
- Learning Objectives: Theory, Experimentation
- Notes: Lecture notes; Live notes
- Warmup: Nonlinear Fitting and Convexity
Week 10
Mon Apr. 21: Contemporary Optimization
We briefly introduce two concepts in optimization that have enabled large-scale deep learning: stochastic first-order optimization techniques and automatic differentiation.
- Learning Objectives: Theory, Experimentation
- Notes: Lecture notes; Live notes
- Warmup: Project Update
- Assignments: Blog Post: Advanced Optimization
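As a small taste of the second idea, here is a toy forward-mode automatic differentiation sketch using dual numbers (libraries such as PyTorch primarily use reverse mode, but both propagate exact derivatives through arithmetic rules rather than using symbolic math or finite differences):

```python
# Forward-mode automatic differentiation with dual numbers: each value
# carries (value, derivative), and the arithmetic rules below propagate
# derivatives exactly through any composition of + and *.

class Dual:
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__
    def __mul__(self, other):  # product rule
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)
    __rmul__ = __mul__

def derivative(f, x):
    """Evaluate f'(x) by seeding the dual part of the input with 1."""
    return f(Dual(x, 1.0)).dot

# f(w) = w^2 + 3w has f'(w) = 2w + 3, so f'(2) = 7:
print(derivative(lambda w: w * w + 3 * w, 2.0))  # 7.0
```

Because the derivative is computed alongside the value, no separate gradient code is needed; this is the core convenience that makes large-scale model training practical.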
Wed Apr. 23: Deep Image Classification
We return to the image classification problem, using deep learning and large-scale optimization to optimize convolutional kernels as part of the training process.
- Learning Objectives: Theory, Experimentation
- Reading: Convolutional Neural Networks from MIT's course 6.036: Introduction to Machine Learning
- Notes: Lecture notes; Live notes
- Warmup: What needs to be learned?
Week 11
Mon Apr. 28: Text Classification and Word Embedding
We briefly study the use of word embeddings for text classification.
- Learning Objectives: Theory, Experimentation
- Reading: Efficient Estimation of Word Representations in Vector Space by Mikolov et al. (sections 1, 4, 5)
- Notes: Lecture notes; Live notes
- Warmup: Project Update
- Assignments: Deep Music Classification
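A toy illustration of how embedding geometry can encode similarity (the vectors below are invented; real embeddings such as word2vec are learned from large corpora):

```python
# Word embeddings map words to vectors so that geometric similarity
# (here, cosine similarity) tracks semantic similarity.
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Made-up 3-dimensional "embeddings" for illustration only:
emb = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.85, 0.75, 0.2],
    "car": [0.1, 0.2, 0.9],
}
# "cat" should be closer to "dog" than to "car":
print(cosine_similarity(emb["cat"], emb["dog"]) >
      cosine_similarity(emb["cat"], emb["car"]))  # True
```

A text classifier can then average or pool the embeddings of a document's words and feed the result to any of the linear models from earlier in the course.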
Wed Apr. 30: Unsupervised Learning and Autoencoders
We introduce unsupervised learning through the framework of autoencoders.
- Learning Objectives: Theory, Experimentation
- Reading: K-Means Clustering from PDSH
- Notes: Lecture notes; Live notes
- Warmup: Compression factor of k-means
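One way to see k-means as compression: store one center per cluster plus a cluster index per point, instead of every coordinate of every point. A minimal sketch of Lloyd's algorithm on one-dimensional toy data:

```python
# Lloyd's algorithm for k-means: alternate between assigning each point
# to its nearest center and moving each center to its cluster's mean.
# Data and initial centers below are invented for illustration.

def kmeans_1d(xs, centers, iters=20):
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for x in xs:
            i = min(range(len(centers)), key=lambda j: (x - centers[j]) ** 2)
            clusters[i].append(x)
        # Update step: each center moves to the mean of its cluster.
        centers = [sum(c) / len(c) if c else m
                   for c, m in zip(clusters, centers)]
    return centers

xs = [0.1, 0.2, 0.15, 5.0, 5.1, 4.9]
print(sorted(kmeans_1d(xs, centers=[0.0, 1.0])))  # roughly [0.15, 5.0]
```

The six stored numbers compress to two centers plus six small cluster indices; k-means is in this sense a (lossy) encoder, which is the bridge to autoencoders.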
Week 12
Mon May. 05: Neural Autoencoders and Dimensionality Reduction
We use neural autoencoders to learn low-dimensional structure in more complex data sets.
- Learning Objectives: Theory, Experimentation
- Notes: Lecture notes; Live notes
- Warmup: Project Update
Wed May. 07: Project presentations!
We celebrate your projects and learn about what you've done!
- Learning Objectives: Project
© Phil Chodrow, 2025
References
Barocas, Solon, Moritz Hardt, and Arvind Narayanan. 2023. Fairness and Machine Learning: Limitations and Opportunities. Cambridge, Massachusetts: The MIT Press.
Vanderplas, Jacob T. 2016. Python Data Science Handbook: Essential Tools for Working with Data. First edition. Sebastopol, CA: O’Reilly Media, Inc.