What is a Hamiltonian Monte-Carlo method?

(Editor Note: These notes are not my own text; they are copied from MacKay and the astrophysics source below. They will be gradually edited over the next few months. I provide them because writing them up was useful for me.)

Hamiltonian Monte Carlo is a Metropolis method, applicable to continuous state spaces, that makes use of gradient information to reduce random walk behaviour.
These notes are based on a combination of http://python4mpia.github.io/index.html and David MacKay's book Information Theory, Inference, and Learning Algorithms.

For many systems whose probability P(x) can be written in the form
P(x) = \frac{\exp{[-E(x)]}}{Z}
not only -E(x) but also its gradient with respect to x
can be readily evaluated. It seems wasteful to use a simple random-walk
Metropolis method when this gradient is available – the gradient indicates
which direction one should go in to find states that have higher probability!
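As a concrete illustration (a standard Gaussian target, chosen here for simplicity and not part of the original example), both the energy and its gradient are cheap to evaluate:

```python
import numpy as np

def energy(x):
    # E(x) = -log P(x) up to an additive constant;
    # for a standard Gaussian, E(x) = x.x / 2
    return 0.5 * np.dot(x, x)

def grad_energy(x):
    # Gradient of E with respect to x; for the Gaussian it is just x
    return x

x = np.array([1.0, -2.0])
print(energy(x))       # 2.5
print(grad_energy(x))  # [ 1. -2.]
```

The gradient -grad_energy(x) points towards states of higher probability, which is exactly the information a plain random-walk Metropolis proposal throws away.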

In the Hamiltonian Monte Carlo method, the state space x is augmented by
momentum variables p, and there is an alternation of two types of proposal.
The first proposal randomizes the momentum variable, leaving the state x unchanged. The second proposal changes both x and p using simulated Hamiltonian dynamics as defined by the Hamiltonian
H(\mathbf{x}, \mathbf{p}) = E(\mathbf{x}) + K(\mathbf{p}),
where K(\mathbf{p}) is a ‘kinetic energy’ such as K(\mathbf{p}) = \mathbf{p^{\dagger}}\mathbf{p} / 2. These two proposals are used to create
(asymptotically) samples from the joint density
P_{H}(\mathbf{x}, \mathbf{p}) = \frac{1}{Z_{H}} \exp{[-H(\mathbf{x}, \mathbf{p})]} = \frac{1}{Z_{H}}\exp{[-E(\mathbf{x})]} \exp{[-K(\mathbf{p})]}

This density is separable, so the marginal distribution of \mathbf{x} is the
desired distribution \exp{[-E(\mathbf{x})]}/Z. So, by simply discarding the momentum variables, we obtain a sequence of samples \lbrace\mathbf{x}^{(t)}\rbrace that asymptotically come from P(\mathbf{x}).

An algorithm for Hamiltonian Monte-Carlo
First, we need to compute the gradient of our objective function.
Hamiltonian Monte-Carlo makes use of the fact that we can write our likelihood as
\mathcal{L} = \exp{(\log \mathcal{L})} = \exp{(-E)}
where E = -\log\mathcal{L} is the “energy”. The algorithm then uses
Hamiltonian dynamics to modify how candidates are proposed:
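The tutorial's own code is not reproduced here, so the following is a minimal sketch of the algorithm (function and parameter names are my own), using the common leapfrog discretization of the Hamiltonian dynamics and the two alternating proposals described above:

```python
import numpy as np

def hmc_sample(x0, energy, grad_energy, n_samples=2000,
               eps=0.1, n_leapfrog=20, rng=None):
    """Basic HMC: leapfrog dynamics plus a Metropolis accept/reject step."""
    rng = np.random.default_rng(rng)
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    samples, accepted = [], 0
    for _ in range(n_samples):
        # Proposal 1: randomize the momentum p, leaving x unchanged
        p = rng.standard_normal(x.shape)
        H_old = energy(x) + 0.5 * np.dot(p, p)
        # Proposal 2: simulate Hamiltonian dynamics with leapfrog steps
        x_new, p_new = x.copy(), p.copy()
        p_new = p_new - 0.5 * eps * grad_energy(x_new)   # initial half-step
        for _ in range(n_leapfrog):
            x_new = x_new + eps * p_new                  # full position step
            p_new = p_new - eps * grad_energy(x_new)     # full momentum step
        p_new = p_new + 0.5 * eps * grad_energy(x_new)   # correct to half-step
        H_new = energy(x_new) + 0.5 * np.dot(p_new, p_new)
        # Metropolis accept/reject on the change in total energy H
        if rng.random() < np.exp(H_old - H_new):
            x = x_new
            accepted += 1
        samples.append(x.copy())  # the momentum is simply discarded
    return np.array(samples), accepted / n_samples

# Usage on the standard Gaussian, E(x) = x.x / 2, grad E(x) = x
samples, rate = hmc_sample(np.zeros(1),
                           lambda x: 0.5 * np.dot(x, x),
                           lambda x: x,
                           rng=0)
```

Because the dynamics approximately conserve H, the acceptance rate is typically much higher than for a random-walk proposal of comparable step size; discarding the momenta leaves samples whose marginal distribution is the target P(x).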

The true value we used to generate the data was \alpha = 2.35.
The Monte-Carlo estimate is \hat{\alpha} = 2.3507. The run reported:

Mean: 2.35130302322
Sigma: 0.00147375480435
Acceptance rate = 0.645735426457

