Project Report

Your project report is your final deliverable for your project in CS 451 and your opportunity to show off your learning and achievement in this course!

1 Your “Report” is a GitHub Repository

Your project report is housed in your project’s GitHub repository, and the entire repository is part of the deliverable. There is a specific file (a Jupyter notebook) housed in your repository that serves as the “report.” This file is called report.ipynb and should be located in the root directory of your repository. Think of this file as a demonstration or tutorial for a user who wants to understand what you did and how to use your code.

2 Required Sections of Your Report

Please include the following sections in your report. Additional sections may be included but are not required.

Some projects may be exempted from parts of this required structure by request based on the nature of the project – please contact me to request an exemption and state which of the requirements you propose to modify.

2.1 Frontmatter

The beginning of your report must include:

The cool title of your project.
The names of your team members.
The URL to your GitHub repository (and please ensure that the repository is public).

2.2 Abstract

Your abstract is a one-paragraph summary of the problem you addressed, the approach(es) that you used to address it, and the big-picture results that you obtained.

2.3 Introduction

Your introduction should describe the big-picture problem that you aimed to address in your project. What’s the problem to be solved, and why is it important? As part of your introduction, please include a brief literature review in which you discuss 3 scholarly sources (journal articles or similar) that have attempted similar tasks. Any consistent citation format for the literature review is acceptable.

2.4 Values Statement

Your values statement should be a few paragraphs that address the following set of questions:

Who are the potential users of your project? Who, other than your users, might still be affected by your project?
Who benefits from technology that solves the problem you address?
Who could be harmed by technology that solves the problem you address?
What is your personal reason for working on this problem?
Based on your reflection, would the world be a more equitable, just, joyful, peaceful, or sustainable place because of the technology that you implemented?

2.5 Data

In this section, you should describe the data set(s) that you used for your project.

Say where the data came from, how it was collected, and how you gained access to it.
Explain which components of the data are to be used as features, and which as targets (if applicable).
Load the data into Python and describe its structure for the reader.
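
As a minimal sketch of the last two items, the cell below loads a small table, separates features from the target, and prints a short description of the structure. The inline CSV text and the column names (`height_cm`, `weight_kg`, `label`) are hypothetical stand-ins for a file in your `data` directory; your own loading code may well use a different library.

```python
import csv
import io

# Hypothetical stand-in for the contents of a file in your data directory.
RAW_CSV = """height_cm,weight_kg,label
170,65,0
182,80,1
158,52,0
"""

# Assumed split of columns into features and target (illustrative names only).
FEATURE_COLUMNS = ["height_cm", "weight_kg"]
TARGET_COLUMN = "label"


def load_rows(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


rows = load_rows(RAW_CSV)
X = [[float(row[col]) for col in FEATURE_COLUMNS] for row in rows]
y = [int(row[TARGET_COLUMN]) for row in rows]

# Describe the structure for the reader.
print(f"{len(rows)} rows, columns: {list(rows[0].keys())}")
print(f"features: {FEATURE_COLUMNS}, target: {TARGET_COLUMN}")
```

For a real file, the same `csv.DictReader` call can be given an open file handle instead of the `io.StringIO` wrapper.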

If you performed any data cleaning prior to loading the data into your notebook, describe it here.

In some cases, you may have generated your own synthetic data set; if so, please describe this process here. If your project involves no data, then you can instead substitute a section in which you describe the task (e.g. the reinforcement learning environment) that your model attempts to learn.

Exploratory data visualization of your data set is highly encouraged. Please ensure that all plots or tables have appropriate axis labels, titles, and surrounding discussion.

2.6 Your Approach

This is the primary section where you should describe what you did. Carefully describe:

What features of your data you used as predictors for your models, and what features (if any) you used as targets.
Whether you subset your data in any way, and for what reasons.
What model(s) you trained on your data, and how you chose them.
How you trained your models, and on what hardware.
How you evaluated your models (loss, accuracy, etc.), and the size of your test set.

This section is required to contain a runnable demo that includes an example of your model performing its designated task (likely making a prediction on a test set and being scored on the prediction).

You are not required to include model training in this section (training the model in a script and then loading a pickled version here is fine), but if your training process is short then it’s fine to include it.
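
One possible shape for such a demo is sketched below: a training script saves a pickled model, and the notebook loads it, predicts on a test set, and reports accuracy. To keep the sketch self-contained, the "model" here is a plain dict of hand-picked parameters rather than anything trained, and `predict`, `accuracy`, and the file name `model.pkl` are all hypothetical. In a real project the model would typically be an instance of a class imported from `src`.

```python
import pickle
import tempfile
from pathlib import Path


def predict(model, x):
    """Threshold a linear score; `model` is a dict with 'weights' and 'bias'."""
    score = sum(w * xi for w, xi in zip(model["weights"], x)) + model["bias"]
    return 1 if score > 0 else 0


def accuracy(model, X, y):
    """Fraction of examples for which the prediction matches the label."""
    correct = sum(predict(model, xi) == yi for xi, yi in zip(X, y))
    return correct / len(y)


# What a training script might do. These parameters are hand-picked for
# illustration and were NOT produced by any training process.
model_path = Path(tempfile.mkdtemp()) / "model.pkl"
with open(model_path, "wb") as f:
    pickle.dump({"weights": [1.0, -1.0], "bias": 0.0}, f)

# What the notebook would do: load the pickled model and score it.
with open(model_path, "rb") as f:
    model = pickle.load(f)

# Tiny made-up test set, for demonstration only.
X_test = [[2.0, 1.0], [0.0, 3.0], [1.0, 0.5], [0.5, 2.0]]
y_test = [1, 0, 1, 1]

test_accuracy = accuracy(model, X_test, y_test)
print(f"test set size: {len(y_test)}, accuracy: {test_accuracy:.2f}")
```

Note that unpickling an instance of a custom class requires that class to be importable in the notebook, which is one more reason to keep class definitions in `src`.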

As described in Section 3, your notebook should not contain any of the class definitions for your model or data preparation pipelines.

2.7 Results

This is the section in which you describe the main findings or achievements of your model. You can report things like accuracies on train/test data, loss scores, confusion matrices, comparisons between models, etc. Your code in this section should generate the figures and tables that you use to report your results.

Please remember: your results do not speak for themselves. While figures and tables are highly effective forms of communication, your prose is necessary to tell your story.
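
As an illustration of code that generates a results table, here is a minimal sketch that computes and prints a confusion matrix in plain Python. The labels and predictions are made up for the example, and `confusion_matrix` is a hypothetical helper written here, not a library function.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


# Made-up labels and predictions, for demonstration only.
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
labels = [0, 1]

matrix = confusion_matrix(y_true, y_pred, labels)

# Print a small table with a header row of predicted labels.
print("true/pred " + " ".join(f"{p:>4}" for p in labels))
for t, row in zip(labels, matrix):
    print(f"{t:>9} " + " ".join(f"{c:>4}" for c in row))
```

A table like this still needs a sentence or two of discussion, for example which kind of error is more common and why that matters for your task.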

2.8 Concluding Discussion

Your conclusion is the right time to assess:

In what ways did our project work?
Did we meet the goals that we set at the beginning of the project?
How do our results compare to the results of others who have also studied similar problems?
If we had more time, data, or computational resources, what might we do differently in order to improve further?

2.9 Group Contributions Statement

When writing your group contributions statement, please keep in mind that everyone’s contributions are visible in the commit history of your GitHub repository.

In your group contributions statement, please include a short paragraph for each group member describing how they contributed to the project:

Who worked on which parts of the source code?
Who performed or visualized which experiments?
Who led the writing of which parts of the report?
Etc.

3 Repository Modularization

In addition to your report.ipynb file, your repository should be organized to separate your source files from your report. Here’s how this should look:

3.1 `data`

Please place files corresponding to your dataset in a data directory. This includes any raw data files, as well as any processed intermediate data files that you create.

3.2 `src`

Your src directory contains your Python source code, especially including any class definitions for your model and data preparation pipeline. These should then be imported into your report.ipynb file.

3.3 `scripts`

If you used any Python scripts to acquire your data, prepare your data, or train an intensive model, they should go in this folder.

3.4 Comments

Your scripts and src files should be commented. Use of AI for comment creation is permitted, but you must ensure that the comments are accurate and concise.
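
The layout described in this section can be checked with a short script. The sketch below is one hypothetical way to do it: `missing_entries` is a name invented here, and it treats `scripts` as expected even though that folder only applies if you actually used scripts.

```python
import tempfile
from pathlib import Path

# Names taken from the layout described above.
EXPECTED_FILES = ["report.ipynb"]
EXPECTED_DIRS = ["data", "src", "scripts"]


def missing_entries(repo_root):
    """Return the expected files and directories not found under repo_root."""
    root = Path(repo_root)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing


# Demonstration on a throwaway directory that deliberately lacks scripts/.
demo_root = Path(tempfile.mkdtemp())
(demo_root / "report.ipynb").write_text("{}")
(demo_root / "data").mkdir()
(demo_root / "src").mkdir()

demo_missing = missing_entries(demo_root)
print("missing:", demo_missing)
```

Calling `missing_entries(".")` from the root of your own repository returns an empty list when every expected entry is present.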

4 `report.ipynb` must run

One of the requirements of your project is that report.ipynb must run without error, in order, when opened as a Jupyter notebook. This corresponds to hitting the “Run All” button in the Jupyter notebook interface.

Part of my assessment will be:

Cloning your repository.
Opening report.ipynb in Jupyter.
Hitting “Run All” and seeing what happens.

You can ensure this by checking that your local copy of your repository runs without error prior to submitting your project report.

To achieve this, you are likely to need to place a pickled version of your model under version control; this is not necessarily ideal version control practice but is fine in this case.
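
As a rough sanity check before submitting, the sketch below inspects a saved notebook and flags code cells whose execution counts are not 1, 2, 3, ... in order, or that contain an error output. It relies on the notebook file being JSON with a `cells` list, `execution_count`, and `outputs` fields; `run_all_problems` is a hypothetical helper, and skipping empty code cells is an assumption about how Jupyter numbers them. This only inspects what was saved, so it does not replace actually hitting “Run All” on a clean copy of your repository.

```python
import json


def run_all_problems(notebook_json):
    """List signs that a saved notebook was not run cleanly from top to bottom."""
    notebook = json.loads(notebook_json)
    problems = []
    expected = 1
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):
            source = "".join(source)
        if not source.strip():
            continue  # assumption: empty code cells receive no execution count
        count = cell.get("execution_count")
        if count != expected:
            problems.append(
                f"cell {index}: execution_count is {count}, expected {expected}"
            )
        if any(out.get("output_type") == "error" for out in cell.get("outputs", [])):
            problems.append(f"cell {index}: has an error output")
        expected += 1
    return problems


# Made-up notebook: the second code cell was run out of order and raised an error.
demo_notebook = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": "# Title"},
        {"cell_type": "code", "source": "x = 1", "execution_count": 1, "outputs": []},
        {"cell_type": "code", "source": "1 / 0", "execution_count": 3,
         "outputs": [{"output_type": "error", "ename": "ZeroDivisionError"}]},
    ]
})

demo_problems = run_all_problems(demo_notebook)
for problem in demo_problems:
    print(problem)
```

For a real notebook, pass the text of the file, for example `run_all_problems(open("report.ipynb", encoding="utf-8").read())`.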

4.1 Packages

If your repository requires certain packages to run beyond what we’ve used in class, please note them in README.md.
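
One way to double-check the list you put in README.md is to test whether each package can be found in a clean environment. The sketch below uses the standard library's `importlib.util.find_spec`; the package list is a hypothetical placeholder, and `missing_packages` is a helper name invented here. It expects top-level import names (such as `numpy`), not dotted submodule names.

```python
import importlib.util

# Hypothetical list: replace with the top-level import names from your README.md.
REQUIRED_PACKAGES = ["json", "csv", "pickle"]


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED_PACKAGES)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All listed packages were found.")
```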

5 Submit Your Work

5.1 Gradescope

To submit your work, one group member must submit your repository on Gradescope. Please make sure to add the names of all group members to the submission.

5.2 Individual Reflection (Optional)

If there is something that you would like to share with me about the experience of working on your project (things you learned, things you struggled with, team dynamics, ways in which your project may not fully represent your learning, etc.), then you can submit an optional individual reflection on the corresponding Gradescope assignment.