In this warmup, we’ll get some intuition for weighted averages that will also show up when we study the friendship paradox in graphs.

Middlebury College proudly advertises an average class size of approximately 16 students.

Part A

Think back. Would you say that the sizes of the courses that you’ve taken at Middlebury average out to roughly 16? More? Less?

Part B

Let’s consider two ideas of “average class size.”

Suppose that there are \(n\) classes, with class \(i\) of size \(s_i\). One “average class size” is just the average of these \(n\) numbers. We’ll call this \(\langle s \rangle\).

\[ \begin{aligned} \langle s \rangle = \frac{1}{n} \sum_{i=1}^{n} s_i\;. \end{aligned} \]

Now instead consider the following weighted average. We’ll survey each student and ask them to report the sizes of all of their classes. We’ll then collect all of these reports into a single list of numbers.

If every student is enrolled in exactly \(h\) courses, then there will be \(mh\) entries in this list, where \(m\) is the number of students.
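To make the survey concrete, here is a minimal sketch on a made-up roster. The student names, class names, and the `enrollment` dictionary are all hypothetical and chosen only for illustration; every student takes exactly \(h = 2\) classes and there are \(m = 6\) students.

```python
from collections import Counter

# Hypothetical toy roster (not real data): student -> classes taken.
enrollment = {
    "ana": ["math", "art"],
    "ben": ["math", "art"],
    "cat": ["math", "bio"],
    "dev": ["math", "bio"],
    "eli": ["math", "art"],
    "fay": ["math", "bio"],
}

# The size of a class is the number of students who list it.
class_size = Counter(c for classes in enrollment.values() for c in classes)

# Survey: each student reports the size of every class they take,
# and all reports are pooled into a single list.
reports = [class_size[c] for classes in enrollment.values() for c in classes]

# Plain average over classes versus the average of the pooled reports.
plain_mean = sum(class_size.values()) / len(class_size)
survey_mean = sum(reports) / len(reports)

print(len(reports))  # m * h = 6 * 2 = 12 entries
print(plain_mean)
print(survey_mean)
```

Printing `reports` and counting how often each class size appears is one way to start on the formula requested in this part.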

The weighted average \(\langle s \rangle_w\) is the mean of the entries in this list. Give a formula for \(\langle s \rangle_w\) in terms of the class sizes \(s_i\). Your formula should include the quantity

\[ \begin{aligned} \langle s^2 \rangle = \frac{1}{n} \sum_{i=1}^{n} s_i^2\;. \end{aligned} \]

Your formula may also include quantities like the total number of classes and the mean class size. Please do not include an explicit dependence on the total number of students. Your formula should give the correct answer even in the case that not all students are enrolled in the same number of classes.
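
As a quick check on the notation, consider a hypothetical set of \(n = 3\) classes with sizes \(6\), \(3\), and \(3\) (numbers invented for illustration). The two moments defined above would be

\[ \begin{aligned} \langle s \rangle = \frac{6 + 3 + 3}{3} = 4\;, \qquad \langle s^2 \rangle = \frac{6^2 + 3^2 + 3^2}{3} = 18\;. \end{aligned} \]

Note that \(\langle s^2 \rangle\) is not the same as \(\langle s \rangle^2 = 16\).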

Hint: How many times does a class of size \(s\) appear in the list?

Part C

You can use the code below to generate some sample data:

import numpy as np
# 100 class sizes, each an integer from 5 to 29 (the upper bound 30 is exclusive)
class_sizes = np.random.randint(5, 30, 100)

Implement functions to compute the average class size and the weighted average class size according to your responses in Part B, and compare them. Which one is larger?

Part D

Is the average class size that a student experiences likely to be larger or smaller than the average class size that Middlebury advertises?