Syllabus

Welcome! CSCI 451: Machine Learning is an advanced elective on the topic of algorithms that learn patterns from data. Artificial intelligence, predictive analytics, computational science, pattern recognition, signal processing, and data science are all disciplines that draw heavily on techniques from machine learning. The course focuses almost entirely on predictive models, and places special emphasis on the interrelationship between mathematical and algorithmic descriptions of machine learning models.

1 Learning Objectives

A learning objective is a primary goal for your learning by the end of the course. I measure your success (and mine) by the extent to which you achieve these objectives during your time in the course.

Theory	You will describe the mathematical structure of modern machine learning algorithms and the mathematical details of several core examples.
Implementation	You will implement classification and regression algorithms in efficient, usable Python programs.
Navigation	You will navigate the package ecosystem for machine learning in Python.
Experimentation	You will experiment with machine learning models, audit their performance, and communicate about your findings.
Social Responsibility	You will interrogate sources of bias, harm, and disparity in machine learning models, especially with regard to gender, race, and class.
Project	You will complete a long-term project that involves significant implementation or experimentation with machine learning tools.

2 Rough Sequence of Topics

Numbers correspond roughly to weeks in the semester.

1. Introduction. Signal and noise. Data generating models. Likelihood of the data.
2. Linear regression. Gradients, matrix-vector notation.
3. More linear regression. Feature engineering. Model selection. Bias-variance tradeoff.
4. Regularization; Ridge and Lasso regression.
5. Classification. Generalized empirical risk minimization. Logistic regression.
6. Decision theory, expected utility. Bias and fairness in classification.
7. Spring break.
8. Advanced optimization. Automatic differentiation, stochastic gradient and related methods.
9. More gradient methods.
10. Introducing deep learning. Convolutional neural networks.
11. Word embedding. Sequence data: tokenization and Markov models.
12. Sequence data: transformers and modern large language models.
13. Projects.

3 Key Logistics and Policies

Lecture

Mondays and Wednesdays
75 Shannon Street, Room 206
Section A: 12:45pm-2:00pm
Section B: 2:15pm-3:30pm

Instructor

Phil Chodrow

	75 Shannon Street, Room 218
	pchodrow@middlebury.edu, though please don’t email me about class stuff – use Campuswire instead.
	Student Hours: Mondays, 3:30pm-4:30pm; Tuesdays, 2pm-3pm; Fridays, 9am-10am

Important Policies

I encourage you to call me Phil or Prof. Phil. “Professor Chodrow” is fine if that’s what’s most comfortable for you.

You need a laptop and an internet connection for this course, but you don’t need to buy books or other supplies.

Generally speaking, you should only email me if you need to talk about something personal or sensitive. We’ll use Campuswire for all standard course communications.

Student Hours are your time to come chat with me about course content. I want to see you in Student Hours.

4 How You’ll Demonstrate Learning

4.1 Foundational Components

Warmup Problems

The primary way in which you’ll prepare for class each day is by completing a warmup problem. These problems typically ask you to apply a technique or idea from the previous lecture, and often lead into a topic that we’ll cover in the next lecture.

Each day, a few students will be randomly selected to present their solutions for the warmup problem to their classmates. Presenting students should be prepared to explain their reasoning and answer questions about their solutions. Regardless of whether or not they present, all students submit the warmup problem as part of the problem sets completed throughout the course.

Homework Problem Sets

Each week during the first part of the course, we’ll have a problem set due. The problem set will typically include 5-6 problems, two of which will be the warmup problems from that week.

Typically, problem sets are primarily for practicing mathematical concepts, with some programming problems included as well.

Quizzes

We’ll have three quizzes during the first half of the course, spaced two weeks apart. Quizzes will be timed for 15 minutes, and will include problems similar to those you’ve seen on problem sets and in class. Care on homework and warmup problems will prepare you well for quizzes.

Midterm Exam

The midterm exam will take place approximately in Week 7 or 8 of the course. It will cover all material from the first half of the course, and will be similar in style to the problems you’ve seen on problem sets and quizzes.

Oral Exam

The oral exam will be a 15-minute one-on-one conversation with me in which you’ll tackle one of several possible prompts which I’ll distribute ahead of time. The oral exam will be primarily theoretical; you may be asked to show mathematical or conceptual reasoning on the oral exam but will not be required to write code.

The oral exam will take place the week after the midterm exam.

4.2 Building on the Foundation

Miniprojects

In the second part of the course, problem sets will be replaced by miniprojects that encourage you to synthesize multiple ideas from the course. Miniprojects will usually be programming-oriented and will often involve model implementation, experimentation, and discussion.

Miniprojects will be completed in Jupyter notebooks and submitted via Gradescope.

Final Project

Your final project will address a topic of your choosing related to machine learning. The project may involve theory, implementation, experimentation, or a combination of these elements. Although solo projects are permitted, group projects of 2-3 students are strongly encouraged.

Your project deliverables will include:

A project proposal.
Multiple short project milestone reports.
A final project poster.
A shared reflection in which you and your group will discuss what you achieved and learned during the project.
A GitHub repository containing your project code.

5 Grading

Your average in the course is a weighted sum of the following components:

20% Quizzes
20% Midterm Exam
10% Oral Exam
10% Homework Problem Sets
15% Miniprojects
25% Final Project

Revision 2/12: In computing averages for the homework sets, I’ll drop the lowest \(N\) problems, where \(N\) is the number of homework problem sets released. For example, if there are 5 homework problem sets, I’ll drop the lowest 5 problems across all of those sets.

Revision 2/12: The lowest of the three quiz grades will be dropped when calculating your final quiz average.

Each of these components will be graded according to the policies described below. After calculating your average, letter grades will be assigned on a straight scale:

A: 93-100%
A-: 90-92.9%
B+: 87-89.9%
B: 83-86.9%
B-: 80-82.9%
C+: 77-79.9%
C: 73-76.9%
C-: 70-72.9%
D: 60-69.9%
F: 0-59.9%
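
As a rough illustration of how the weights, the quiz drop, and the straight scale fit together, here is a minimal Python sketch. The function names and the example scores are hypothetical, and the handling of borderline averages (each cutoff is treated as a lower bound) is an assumption rather than a statement of official policy.

```python
# Weights from the grade breakdown above, as fractions of the course average.
WEIGHTS = {
    "quizzes": 0.20,
    "midterm": 0.20,
    "oral_exam": 0.10,
    "homework": 0.10,
    "miniprojects": 0.15,
    "final_project": 0.25,
}

# Lower bound (in percent) of each letter grade on the straight scale.
CUTOFFS = [
    (93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"), (60, "D"), (0, "F"),
]


def quiz_average(quiz_scores):
    """Mean of the quiz scores (percent) after dropping the single lowest.

    Assumes at least two quiz scores are supplied.
    """
    kept = sorted(quiz_scores)[1:]
    return sum(kept) / len(kept)


def course_average(component_scores):
    """Weighted sum of the component scores, each given in percent."""
    return sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS)


def letter_grade(average):
    """Map a course average (percent) to a letter on the straight scale."""
    for cutoff, letter in CUTOFFS:
        if average >= cutoff:
            return letter
    return "F"


# Hypothetical student: every number below is made up for illustration.
scores = {
    "quizzes": quiz_average([70, 90, 100]),  # the 70 is dropped
    "midterm": 85,
    "oral_exam": 85,
    "homework": 90,
    "miniprojects": 100,
    "final_project": 90,
}
average = course_average(scores)
print(round(average, 1), letter_grade(average))  # prints: 91.0 A-
```

Any discretionary adjustment is applied separately and is not modeled here.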

Instructor Discretion

At my discretion, I may adjust a final grade upward by 1/3 of a letter grade (e.g., B- to B) for students who have demonstrated learning or achievement beyond the categories captured in the grade breakdown above. Some ways to demonstrate such learning or achievement include:

Exceptional improvement during the course of the semester.
Exceptional engagement in class discussions or Student Hours.
Exceptional creativity or insight in assignments or projects.

These adjustments are fully discretionary and never subject to negotiation.

5.1 Warmup Problems

Warmup problems are not graded (other than in the homework problem set). It’s to your advantage to do your best on the warmups because:

They’ll help you prepare for class.
You may be called on to present them in class.
They form part of the homework problem sets, so you’ll need to turn them in anyway.

Warmup problems are turned in on Gradescope ahead of the day that they are due. The initial submission is graded on effort rather than correctness and is factored in as part of the homework grade.

There is no penalty beyond social discomfort for being called on to present a warmup problem in which you don’t feel confident. When presenting a warmup problem, it’s completely fine and even encouraged to get some help from your groupmates.

5.2 Homework Problem Sets

Homework problem sets will be submitted on Gradescope, where you’ll also receive your grades and feedback.

Each problem will be its own “assignment” on Gradescope, so in a typical problem set you’ll make multiple submissions. This is annoying but necessary so that we can handle homework resubmissions.

Each homework problem will be graded on an EMRN scale:

E (Excellent, 100%): The solution is essentially perfect.
M (Meets Expectations, 85%): The solution is mostly correct, with only minor errors.
R (Revision Needed, 60%): The solution displays some learning but has significant errors or omissions.
N (No Credit, 0%): The solution is missing or otherwise does not demonstrate learning.
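
To make the arithmetic concrete, the following Python sketch converts EMRN marks to percentages and applies the drop policy from the Grading section (the lowest \(N\) problems are dropped, where \(N\) is the number of problem sets). It assumes every remaining problem counts equally, which the syllabus does not state explicitly; the function name and the example marks are hypothetical.

```python
# Percentages attached to each EMRN mark, as listed above.
EMRN_PERCENT = {"E": 100, "M": 85, "R": 60, "N": 0}


def homework_average(marks, num_problem_sets):
    """Average homework score after dropping the lowest problems.

    marks: one EMRN letter per homework problem, across all problem sets.
    num_problem_sets: the number of problems to drop (one per set released).
    Assumption: all kept problems are weighted equally.
    """
    scores = sorted(EMRN_PERCENT[mark] for mark in marks)
    kept = scores[num_problem_sets:]  # discard the lowest scores
    if not kept:
        raise ValueError("no problems remain after dropping")
    return sum(kept) / len(kept)


# Hypothetical example: 2 problem sets with 5 problems each.
marks = ["E", "E", "M", "M", "R", "N", "E", "M", "E", "M"]
print(homework_average(marks, num_problem_sets=2))  # prints: 92.5
```

In this example the N and the R are dropped, so only the four E marks and four M marks count toward the average.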

Revisions and Lateness

All problems submitted before the due date will receive feedback and an opportunity to revise and resubmit the problem. Revised problems are eligible for up to full credit. Revisions are due 1 week after feedback is returned.

Problems are accepted up to 1 week late without penalty; however, late problems are not eligible for revisions.

(Almost) No Extensions

Revision 2/12: Please keep in mind that if you miss the initial deadline, you can still submit the problem without penalty. You’ll miss out on the opportunity to revise, but an excellent solution will still receive full credit regardless.

Extensions to the initial submission deadline or revision submission deadline are never given on homework assignments without support from a dean or a protocol outlined in a Letter of Accommodation (LOA). Extensions permitted by an LOA must be requested by email at least 24 hours in advance of the given deadline. LOA-permitted extensions requested less than 24 hours in advance may be considered only if also supported by a dean.

5.3 Quizzes, Midterm Exam

Quizzes and the midterm exam will be graded on a numerical scale with partial credit.

There will be three quizzes throughout the course; the lowest of the three will be dropped when calculating your final grade.

5.4 Oral Exam

The oral exam will be graded on multiple EMRN scales related to the depth and fluency of the learning you demonstrate.

5.5 Miniprojects

Miniprojects will be graded on an EMRN scale:

E (Excellent, 100%): The project is essentially perfect, including all required components and demonstrating a deep understanding of the material.
M (Meets Expectations, 85%): The project is mostly correct, with only minor errors.
R (Revision Needed, 60%): The project displays some learning but has significant errors or omissions.
N (No Credit, 0%): The project is missing or otherwise does not demonstrate learning.

Like homework problem sets, miniprojects submitted before their due date will receive feedback and an opportunity to revise and resubmit the project. Revised projects are eligible for up to full credit. Revisions are due 1 week after feedback is returned.

5.6 Final Project

The final project will be graded on multiple EMRN scales according to different project components and the learning they demonstrate.

6 Collaboration and Academic Integrity

6.1 Collaboration is Encouraged!

Collaboration with human beings is highly encouraged in this class. I’d love it if you:

Studied lecture notes and readings together.
Worked together on warmup problems, homework assignments, and miniprojects.
Discussed project ideas and strategies with your classmates.

6.2 Solo, Unassisted Assessments

Quizzes, the midterm exam, and the oral exam are completed by yourself and without any devices, notes, or aids. Use of external resources on these assessments is considered an honor code violation and will be treated accordingly.

6.3 Intellectual Integrity

My guiding principle regarding academic integrity is simple:

You are intellectually responsible for all work that you submit for assessment.

The primary way in which I enforce this principle is by expecting you to demonstrate your learning in settings without access to external aids. In general, there’s nothing stopping you from AI’ing your way through homeworks and miniprojects, but this is very likely to turn out poorly for you on quizzes, exams, and oral exams.

7 Generative AI

Generative artificial intelligence, especially in the form of large language models (LLMs), has rapidly transformed the ways that we write, research, explore, play, and solve problems. The purpose of this course is to support you in deeply learning a topic. Can using LLMs help? We are about three years into the “LLM era,” and there is now an emerging body of evidence about the impact of LLM usage on learning. While the story is incomplete and changing, the evidence available suggests that use of LLMs can be double-edged.

If you are curious about how students at Middlebury are using LLMs, some faculty in our Department of Economics recently conducted a survey analysis with many interesting findings.

7.1 LLMs can (maybe? sometimes?) help you learn

LLMs enable many processes related to learning at an unprecedented level of scale and personalization. With appropriate prompting, LLMs can generate practice problems, offer multiple explanations of concepts, identify and explain errors in your own reasoning, and offer personalized tutor-like experiences. Companies like Khan Academy are racing to develop LLM-powered learning platforms, and some schools are designed with LLMs at the center of the learning experience. It is useful to remember that the primary incentive for many of these efforts is to make money for shareholders rather than to follow the best available evidence or practices in education.

That said, there is very little scholarly evidence that the use of LLMs is ever actually beneficial to learning. It may be that LLMs tend to enable learning the same material while saving time, but evidence here is also lacking.

7.2 LLMs can stop you from learning

We all know that a student using an LLM can generate solutions to math problems faster than a student working purely by hand. However, a recent article suggests that the first student is going to retain less mathematical understanding and ultimately perform worse on tests and other tasks that require them to show their skills unassisted by AI (Bastani et al. 2025). Worse still, that student is unlikely to understand this impact: students who used LLMs and were free to request solutions to problems in the study didn’t believe that they performed worse on assessments, even when they did.

The authors of this article offer a simple hypothesis:

Asking LLMs for the answer to a problem deprives you of the learning that the problem is intended to offer you.

Students who instead use LLMs in ways that emphasize working through steps, answering questions along the way, and avoiding revealing the full solution appear to have retained mathematics understanding comparably to those who used no LLMs at all. Even here, however, students were misled: students using the LLM with “tutoring guardrails” thought they performed much better than they actually did. So, in this study, even best-case usage of LLMs was neutral for learning and harmful for self-awareness.

Similarly, there is some preliminary evidence that LLMs can make you feel faster at completing tasks (like coding) even when they are actually making you slower (Becker et al. 2025). In one notable and very recent study, two researchers at Anthropic, the maker of the Claude family of LLMs, conducted randomized experiments in which developers were assigned to learn a new Python library (Shen and Tamkin 2026). The researchers write:

We find that AI use impairs conceptual understanding, code reading, and debugging abilities, without delivering significant efficiency gains on average.

Altogether, the question of how AI impacts learning is an active area of exploration, with much more research likely ahead. That said, existing evidence suggests that relying heavily on LLMs in unfamiliar settings or for learning new material is likely to be detrimental to your learning.

7.3 In This Class

My thinking about generative AI in this course stems from a simple goal:

I want you to grow in your ability to think theoretically and practically about machine learning models.

This means that I will be encouraging learning practices which help you learn, and giving you feedback and grades based primarily on your learning.

I’m fully supportive of uses of generative AI which promote your practice and learning in critical reasoning. I am not supportive of uses of generative AI which outsource your practice and learning to the model; that’s counterproductive to your growth.

7.4 More on LLMs

How Do They Work?

LLMs use next-token prediction to predict the next entries in a sequence of text by using the previous entries. This enables them to mimic human-produced text. So, if you saw the sequence “I love Math Foundations of _____”, you might guess that the next word might be “Computing” and that it’s probably not “Pineapples.”
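
To make next-token prediction concrete, here is a toy Python sketch that predicts the next word using only the single previous word (a bigram model, in the spirit of the Markov models listed in the topic sequence). This is a deliberate simplification: real LLMs use neural networks that condition on much longer contexts and operate on subword tokens. The tiny corpus and the function names are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count, for each word, how often each other word follows it."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()  # crude whitespace tokenization
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, prev_token):
    """Return the most frequent follower of prev_token, or None if unseen."""
    followers = counts.get(prev_token.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical training text.
corpus = [
    "i love math foundations of computing",
    "math foundations of computing is fun",
    "i love slices of pineapples",
]
model = train_bigram_model(corpus)
print(predict_next(model, "of"))  # prints: computing
```

In this corpus "computing" follows "of" twice and "pineapples" follows it once, so the model prefers "computing", mirroring the guess in the example above.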

After an initial stage of next-token training, LLMs are further trained using reinforcement learning from human feedback (RLHF) to produce text that is not just realistic but also helpful, pleasant, nonoffensive, or accurate. This multistage training process involves humans rating the quality of candidate texts. RLHF helps explain the tendency of many LLMs to be highly positive, even to the point of sycophancy, in how they interact with us.

8 Inclusion, Access, and Participation

I commit to an inclusive, accessible, participatory, and safe classroom for CSCI 0451.

8.1 Disabilities and Clearing Barriers

For legal reasons, I am prohibited from offering accommodations to students who do not present letters of accommodation from the DRC. So, get a letter if you need it!

If any aspect of this course raises barriers to your full and equitable participation, it is my job to clear those barriers. A common way in which barriers arise is from unintentional failure to design for all students, including students with disabilities. I’ve done my best, but still may have fallen short! If you have a documented disability, please send me your letter of accommodation from the Disability Resource Center as soon as possible. You do not need to describe your disability or justify your accommodations. I will incorporate your accommodations and work to clear learning barriers to the best of my ability.

If you believe that you may have a disability, please contact the Disability Resource Center as soon as possible. The DRC works with students confidentially and never discloses disability-related information to faculty without your permission.

8.2 Academic Support

No matter who you are, it’s normal to feel challenged by your courses. We have many academic resources to support your success in CSCI 0451. I strongly encourage everyone to make use of all of these resources.

My Student Hours are time for you to come talk with me about any aspect of the course. We’ll usually hold Student Hours in the big lounge outside Room 224 (the one with the windows).
Experienced CS students hold Peer Help hours some weekday evenings. These sessions are great times for you to work on Warmup problems (before class) and lab reports. You can also ask general questions about course content and connect with other students.

8.3 Classroom Environment

Inclusion, access, and participation are collective projects. I expect all students to contribute to a healthy course environment.

We embrace diversity of age, background, beliefs, race, ethnicity, gender, gender identity, gender expression, national origin, religious affiliation, sexual orientation, and other apparent and non-apparent axes of identity. Discrimination is not tolerated in CSCI 0451. Discriminatory speech or acts may lead to engagement with the Community Standards Office.

You deserve to be addressed in the manner that reflects who you are. I welcome you to tell me your pronouns and/or chosen name at any time, either in person or via email. I expect all students to address each other according to their expressed gender markers, and commit to doing the same.

You deserve to fully and equitably participate in our learning environment. I commit to ensuring that the materials and assessments in this course are accessible to all students, and I welcome feedback on where I can do better. Middlebury’s Disability Resource Center can help you remove barriers to learning in this and other courses.

You deserve a learning environment free from gender-based discrimination, sexual harassment, sexual assault, domestic violence, dating violence, and stalking. If you experience these behaviors or otherwise know of a Title IX violation, you have many options for support and/or reporting. Middlebury’s Civil Rights and Title IX Office (CRTIX) can help you navigate your options. Please be aware that I am a Responsible Employee, which means that I am required by the College to report incidents of sexual harassment or sexual violence to CRTIX. There are resources for emotional and mental health care, advocacy, and academic support listed here, some of which are confidential.

References

Bastani, Hamsa, Osbert Bastani, Alp Sungu, Haosen Ge, Özge Kabakcı, and Rei Mariman. 2025. “Generative AI Without Guardrails Can Harm Learning: Evidence from High School Mathematics.” Proceedings of the National Academy of Sciences 122 (26): e2422633122. https://doi.org/10.1073/pnas.2422633122.

Becker, Joel, Nate Rush, Elizabeth Barnes, and David Rein. 2025. “Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity.” arXiv. https://doi.org/10.48550/arXiv.2507.09089.

Shen, Judy Hanwen, and Alex Tamkin. 2026. “How AI Impacts Skill Formation.” arXiv. https://doi.org/10.48550/arXiv.2601.20245.