A First Look at Gradient Descent
In this problem, we’re doing to give a computational introduction to two fundamental ideas of modern computational machine learning:
- Minimizing a function is often a useful thing to do and
- This can often be done by moving in directions guided by the derivative of the function.
For our first example, we’ll see how minimizing a function by iteratively moving in the direction of the derivatives can be used to solve a familiar math problem.
Let \(\mathbb{R}^+\) be the set of strictly positive real numbers. Let \(a \in \mathbb{R}^+\). Consider the function \(g_a:\mathbb{R}^+ \rightarrow \mathbb{R}\) with formula
\[ g_a(x) = \frac{1}{2}x^2 - a \log x\;. \]
Part A
A critical point of \(g_a\) is a point \(x_0\) where \(\frac{dg_a(x_0)}{dx} = 0\). Find the critical point of this function in the set \(\mathbb{R}^+\). We’ll call this point \(x_a\).
Part B
Use the second derivative test to show that \(g_a\) is strictly convex on the set \(\mathbb{R}^+\). It follows that the critical point \(x_a\) is a global minimum of \(g_a\).
Part C
Implement the following algorithm as a Python function:
Inputs: \(a \in \mathbb{R}^+\), tolerance \(\epsilon > 0\), learning rate \(\alpha\), \(\mathrm{maxsteps} = 100\).
- Start with an initial guess \(x = a\).
- Set \(x' \gets 2a\) and \(j \gets 0\).
- While \(|x' - x| > \epsilon\):
- if \(j > \mathrm{maxsteps}\)
- Break
- \(x \gets x'\)
- \(x' \gets x - \alpha \frac{dg_a(x)}{dx}\);.
- \(j \gets j + 1\).
- if \(j > \mathrm{maxsteps}\)
- Return \(x\).
You should use the formula for \(\frac{dg_a(x)}{dx}\) that you found in Part A.
Test your function like this:
= 9, epsilon = 1e-8, alpha = 0.2) mystery_fun(a
Please show:
- One setting of \(\alpha\) for which your function returns a real number very close to the exact value of \(x_a\).
- One setting of \(\alpha\) for which your function fails to return a real number close to the exact value of \(x_a\) within the maximum number of steps.
Part D
Is it possible to compute the positive square root of a positive real number using only the operations of addition, subtraction, multiplication, and division?
© Phil Chodrow, 2025