
Spring 2026
I’m Prof. Phil Chodrow. Please call me “Phil” or “Prof. Phil.” “Prof. Chodrow” is ok if that’s what makes you comfortable.
Math, social networks, dynamical systems, data science, applied statistics, information theory, effective teaching
♟️ Chess (let’s play!), 🎿 nordic skiing, 🍵 tea, 🥾 hiking, 🥘 cooking, 📖 reading, 🖖🏻 Star Trek: Deep Space Nine.
I have two cats who also love math.
I practice aikido (and used to have short hair).

Hello! My name is Phil, and if I were a vegetable, I would be garlic. Garlic can be either a quiet helper or the charismatic center of a dish, which reflects my personality: part introvert, part performer.
Culinary vegetables like tomatoes, zucchini, and eggplant are ok!
Machine learning is the theory and practice of designing automated systems for prediction and decision-making that adapt their behavior based on data.

As technical specialists, we’ll be especially interested in understanding what is happening in the training process and how to measure how well it worked.
Theory: math of models and training.
Experimentation: evaluating the performance of models on real and synthetic data.
Social Responsibility: thinking critically about the social impacts of the models we build.
Implementation: creating models in structured code.
Navigation: working with the Python ecosystem, especially PyTorch.
This includes:
\[ \begin{aligned} f(\mathbf{x}; \mathbf{w}) &= \sigma(\mathbf{w}^\top \mathbf{x}) &\text{(model)} \\ \ell(y_i, s) &= - \left[ y_i \log(s) + (1 - y_i) \log(1 - s) \right] &\text{(loss function)}\\ L(\mathbf{w}) &= \sum_{i=1}^n \ell(y_i, f(\mathbf{x}_i; \mathbf{w})) &\text{(empirical risk)} \\ \nabla_{\mathbf{w}} L(\mathbf{w}) &= \sum_{i=1}^n (f(\mathbf{x}_i; \mathbf{w}) - y_i) \mathbf{x}_i &\text{(gradient)} \\ \mathbf{w} &\leftarrow \mathbf{w} - \eta \nabla_{\mathbf{w}} L(\mathbf{w}) &\text{(training loop)} \end{aligned} \]
You don’t need to understand this right now, but you will by the end of the course.
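If you're curious, here is a minimal sketch (not course code) of what these equations look like in PyTorch, run on a small synthetic binary-classification data set that is purely illustrative:

```python
# Sketch only: gradient descent for logistic regression, matching the
# equations above. The data set and hyperparameters are illustrative.
import torch

torch.manual_seed(0)
n, p = 100, 2
X = torch.randn(n, p)                           # features x_i
w_true = torch.tensor([1.0, -2.0])              # "true" weights (assumed)
y = torch.bernoulli(torch.sigmoid(X @ w_true))  # labels y_i in {0, 1}

w = torch.zeros(p, requires_grad=True)          # weights w, start at zero
eta = 0.01                                      # learning rate η

for _ in range(500):
    s = torch.sigmoid(X @ w)                    # model: σ(wᵀx)
    L = -(y * torch.log(s) + (1 - y) * torch.log(1 - s)).sum()  # L(w)
    L.backward()                                # computes ∇_w L(w)
    with torch.no_grad():
        w -= eta * w.grad                       # w ← w − η ∇_w L(w)
        w.grad.zero_()

print(w)  # should land near w_true
```

Autograd computes the gradient for us here, but it agrees with the closed-form gradient expression above.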
| When | What |
|------|------|
| Before | Complete the warmup problem ahead of class and submit it on Gradescope. |
| During | Present the warmup problem to your group (~15 mins). Lecture: theory (~25 mins) + implementation and experiments (~25 mins). |
| After | Work on the week’s homework assignment or miniproject. Study lecture notes to prep for quizzes and the exam. |
Formative assessments are for you to deepen your understanding outside of class.
Summative assessments are to hold you accountable to good learning practices and evaluate your growth in the material.
Closed-book, timed, proctored assessments that draw on a problem bank similar to the homework’s.
A ~4-week-long project, completed as a group, in which you apply methods from class to a topic of your choice.
Suppose we have the following data set which we’d like to model:

Perfectly fitting the data is called interpolation:

This approach will perform poorly on similar new data:

This is also called “failure to generalize” or “overfitting.”
A theoretical, schematic view of the data:

\[ \begin{aligned} y_i = f(x_i) + \epsilon_i \end{aligned} \]
A successful machine learning model learns the signal \(f(x_i)\) while ignoring the noise \(\epsilon_i\).
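As a concrete (made-up) example, here is how we might simulate data from this model, assuming the signal is \(f(x) = \sin(x)\) and the noise is Gaussian:

```python
# Sketch only: simulate y_i = f(x_i) + ε_i with an assumed signal and noise.
import numpy as np

rng = np.random.default_rng(451)
n = 30
x = rng.uniform(0, 2 * np.pi, n)   # inputs x_i
f = np.sin                         # the "true" signal f (an assumption)
eps = rng.normal(0.0, 0.3, n)      # noise ε_i ~ N(0, 0.3²)
y = f(x) + eps                     # observed data y_i
```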
Learning the true signal allows us to generalize better to new data:

\[ \begin{aligned} \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^n (y_{i,\mathrm{actual}} - \hat{y}_{i,\mathrm{predicted}})^2 \end{aligned} \]
Lower MSE on the test set \(\implies\) better predictive performance.
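To make this concrete, here is a hedged sketch (the degrees and split sizes are illustrative) that compares test-set MSE across polynomial fits to the simulated data above; as the degree approaches the number of training points, the fit interpolates the training data and test MSE typically gets worse:

```python
# Sketch only: test-set MSE for polynomial fits of increasing degree.
import numpy as np

rng = np.random.default_rng(451)
x = rng.uniform(0, 2 * np.pi, 30)
y = np.sin(x) + rng.normal(0.0, 0.3, 30)
x_train, y_train = x[:20], y[:20]   # 20 training points
x_test,  y_test  = x[20:], y[20:]   # 10 held-out test points

for degree in (1, 3, 15):
    coeffs = np.polyfit(x_train, y_train, degree)  # least-squares fit
    y_hat = np.polyval(coeffs, x_test)             # predictions ŷ_i
    mse = np.mean((y_test - y_hat) ** 2)           # the MSE formula above
    print(f"degree {degree:2d}: test MSE = {mse:.3f}")
```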
Modeling signal and noise with linear trends and Gaussian probability distributions.
\[ \begin{aligned} p(\epsilon;\mu, \sigma^2) &= \frac{1}{\sigma \sqrt{2 \pi}} \exp\left( -\frac{(\epsilon - \mu)^2}{2 \sigma^2} \right) \end{aligned} \]
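This density translates directly into code; here is a sketch (scipy.stats.norm.pdf computes the same quantity):

```python
# Sketch only: the Gaussian density p(ε; μ, σ²) from the formula above.
import numpy as np

def gaussian_pdf(eps, mu=0.0, sigma=1.0):
    """Evaluate the normal density at eps."""
    coef = 1.0 / (sigma * np.sqrt(2.0 * np.pi))
    return coef * np.exp(-((eps - mu) ** 2) / (2.0 * sigma ** 2))

print(gaussian_pdf(0.0))   # ≈ 0.3989, the peak of the standard normal
```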
Read the syllabus and post questions on Campuswire.
Complete the warmup problem (go/cs-451) and submit on Gradescope.
Email me your LOAs (letters of accommodation).
Join Campuswire.