GSoC #3: A Slight Detour


My PR’s almost done - I promise!!

After a meeting yesterday with my mentor Rob, we were looking to add the final quality-of-life touches and fix one last bug with pytensor.graph.replace.graph_replace. However, at the last minute, whilst discussing what the function should return, I noticed that I had been minimising the wrong thing all along!

Let the model’s log likelihood be \(f(x) = \log(p(y \mid x, \theta))\). Whilst we are indeed evaluating the gradient \(f'(x)\) and Hessian \(f''(x)\) at the mode \(x_0\), \(x_0\) isn’t the mode of \(f(x)\), but rather the mode of the posterior \(\log(p(x \mid y, \theta))\):

\begin{equation} \log(p(x \mid y, \theta)) = \log(p(y \mid x, \theta)) + \log(p(x \mid \theta)) + const \end{equation}

\begin{equation} = f(x) + \left(-\frac{1}{2}(x - \mu)^T Q (x - \mu) + \frac{1}{2} \text{logdet}(Q) \right) + const \end{equation}

\begin{equation} \approx f(x_0) + (x - x_0)^T f'(x_0) + \frac{1}{2}(x - x_0)^T f''(x_0) (x - x_0) - \frac{1}{2}(x - \mu)^T Q (x - \mu) + \frac{1}{2} \text{logdet}(Q) + const \end{equation}

\begin{equation}\label{eqn:log_posterior} = -\frac{1}{2}x^T(Q - f''(x_0))x + x^T(Q\mu + f'(x_0) - f''(x_0)\,x_0) + \frac{1}{2}\text{logdet}(Q) + const \tag{1} \end{equation}

Note that whilst we evaluate the derivatives of \(f\) at the mode, it is \eqref{eqn:log_posterior} that we maximise (or minimise the negative of) to find \(x_0\). Gah!
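
For what it’s worth, setting the gradient of \eqref{eqn:log_posterior} with respect to \(x\) to zero gives the maximiser of the quadratic approximation explicitly (this is just algebra on \eqref{eqn:log_posterior}, not necessarily how the optimisation is implemented in the PR):

\begin{equation} x^\star = \left(Q - f''(x_0)\right)^{-1}\left(Q\mu + f'(x_0) - f''(x_0)\,x_0\right) \end{equation}

Iterating this update until \(x^\star = x_0\) would be one Newton-style way to locate the posterior mode.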

Fortunately, this should be a fairly straightforward refactor: just change the equation to be minimised (it is simply \(f(x)\) from before with a few extra terms). It also has the positive side-effect of pushing me to think further ahead in INLA, since I now need to work out how to specify \(Q\) and \(\mu\) (they’ll simply be extra args, methinks).
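
To make the fix concrete, here’s a minimal NumPy/SciPy sketch of the corrected objective on a toy Gaussian model. The names (`f`, `mu`, `Q`) are illustrative placeholders, not the PR’s actual API, and the \(\frac{1}{2}\text{logdet}(Q)\) term is dropped since it’s constant in \(x\):

```python
import numpy as np
from scipy import optimize

rng = np.random.default_rng(0)
n = 3
mu = np.zeros(n)   # prior mean
Q = np.eye(n)      # prior precision, i.e. x ~ N(mu, Q^{-1})
y = rng.normal(size=n)

def f(x):
    # Toy Gaussian log-likelihood log p(y | x, theta) with unit noise
    return -0.5 * np.sum((y - x) ** 2)

def neg_log_posterior(x):
    # The thing to actually minimise: -(f(x) + log p(x | theta)), up to const.
    # Previously only -f(x) was being minimised, which finds the wrong mode.
    log_prior = -0.5 * (x - mu) @ Q @ (x - mu)
    return -(f(x) + log_prior)

res = optimize.minimize(neg_log_posterior, x0=np.zeros(n), method="BFGS")
print("posterior mode:", res.x)
# Closed form for this toy case: (Q + I)^{-1} (Q mu + y)
print("closed form:  ", np.linalg.solve(Q + np.eye(n), Q @ mu + y))
```

The two printed vectors should agree, which is a handy sanity check that the new objective really does recover the posterior mode rather than the likelihood mode.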

Second time’s the charm hopefully!