Data = Signal + Noise
Reading
There’s no designated reading today, but you will likely find it helpful to review some differential calculus to complete the warmup problem below.
Warmup Problem
Recall that, if \(f:\mathbb{R}\rightarrow \mathbb{R}\) is a differentiable function, then \(x^*\) is a critical point of \(f\) if \(\frac{df}{dx}(x^*) = f'(x^*) = 0\).
Definition 1 A function \(g:\mathbb{R}\rightarrow \mathbb{R}\) is monotonic increasing on an interval \(I \subseteq \mathbb{R}\) if for all \(x_1, x_2 \in I\) such that \(x_1 < x_2\), we have \(g(x_1) \leq g(x_2)\). Function \(g\) is strictly monotonic increasing if \(g(x_1) < g(x_2)\) for all such \(x_1, x_2\).
An important mathematical property we’ll use in class is the following theorem:
Theorem 1 Let \(f:\mathbb{R}\rightarrow \mathbb{R}\) be a differentiable function such that \(f(x) > 0\) for all \(x \in \mathbb{R}\). Then, \(x^*\) is a critical point of \(f\) if and only if \(x^*\) is also a critical point of the function \(h(x) = \log f(x)\).
(You may assume that \(\log\) denotes the natural logarithm, base \(e\), which is sometimes also written \(\ln f(x)\).)
Informally, this theorem says that if we want to find a critical point of \(f\), it’s ok to take the logarithm of \(f\) first and then find the critical points of \(\log f\) instead. In many machine learning contexts it’s much easier to work with \(\log f\): for example, a likelihood that is a product of many probabilities becomes a sum of log-probabilities, which is simpler to differentiate and numerically better behaved.
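Theorem 1 can also be sanity-checked numerically. The sketch below (the particular choice of \(f\) and the tolerances are illustrative assumptions, not part of the problem) compares central-difference derivatives of \(f\) and \(\log f\) at and away from a critical point:

```python
import math

def f(x):
    # An illustrative everywhere-positive function (an assumption, not part
    # of the problem); its only critical point is at x = 2.
    return math.exp(-(x - 2) ** 2) + 1

def h(x):
    # h = log f, as in Theorem 1.
    return math.log(f(x))

def deriv(g, x, eps=1e-6):
    # Central-difference approximation of g'(x).
    return (g(x + eps) - g(x - eps)) / (2 * eps)

# At the critical point x* = 2, both derivatives vanish (up to numerical
# error), while away from it both are non-zero, as Theorem 1 predicts.
assert abs(deriv(f, 2.0)) < 1e-6 and abs(deriv(h, 2.0)) < 1e-6
assert abs(deriv(f, 1.0)) > 1e-3 and abs(deriv(h, 1.0)) > 1e-3
```

Such a check is of course no substitute for the proof you’ll write below, but it can catch sign errors in your calculus.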
Part A
Prove that the function \(k(x) = \log x\) is strictly monotonic increasing on the interval \((0, \infty)\). It’s sufficient to evaluate the derivative of \(k\) and apply the mean value theorem.
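Before writing the proof, you can convince yourself of the claim empirically. This sketch (not a proof; the grid is an arbitrary assumption) checks that \(\log\) is strictly increasing on a sample of points in \((0, \infty)\) and that its derivative \(1/x\) is positive there:

```python
import math

# Sample points in (0, infinity); the grid spacing is an arbitrary choice.
xs = [0.01 * i for i in range(1, 1000)]
vals = [math.log(x) for x in xs]

# Strictly increasing on the grid: each value is smaller than the next.
assert all(a < b for a, b in zip(vals, vals[1:]))

# The derivative d/dx log(x) = 1/x is positive at every sampled point.
assert all(1.0 / x > 0 for x in xs)
```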
Part B
Use Part A to prove Theorem 1. It’s sufficient to calculate \(\frac{dh}{dx}\) in terms of \(\frac{df}{dx}\) (use the chain rule!). Can it be true that one of \(\frac{df}{dx}(x^*)\) or \(\frac{dh}{dx}(x^*)\) is zero while the other is non-zero?
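Once you have your chain-rule expression for \(\frac{dh}{dx}\), you can sanity-check the conclusion of Theorem 1 numerically: because \(f(x) > 0\), the derivatives of \(f\) and \(h = \log f\) should agree in sign everywhere. A minimal sketch (the choice of \(f\) and the sample points are assumptions for illustration):

```python
import math

def f(x):
    # An illustrative positive function (an assumption); its only critical
    # point is at x = 1, which we deliberately avoid sampling below.
    return math.cosh(x - 1)

def h(x):
    return math.log(f(x))

def deriv(g, x, eps=1e-6):
    # Central-difference approximation of g'(x).
    return (g(x + eps) - g(x - eps)) / (2 * eps)

# Since f > 0, df/dx and dh/dx share a sign at every sampled point, so f
# and h share exactly the same critical points.
for x in [-1.0, 0.0, 0.5, 1.5, 2.0]:
    assert math.copysign(1, deriv(f, x)) == math.copysign(1, deriv(h, x))
```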