Tuesday, March 20, 2012

The Joys of MCMC

Much of my PhD work has been about making various MCMC methods (Markov Chain Monte Carlo for the uninitiated) work. Usually the process looks something like this:

  • Come up with a model.
  • Write MCMC code to sample parameters of the model.
  • Run MCMC code with available data.
  • Notice that it doesn't work.
  • Spend several months getting the MCMC to work.
Sometimes the sampling is not at fault: occasionally the problem has been in the model specification itself, which meant going back and fixing the model. But around 75% of the time, the problem is lack of convergence of the MCMC chains. They are supposed to gradually converge to the posterior distribution of the model's parameters given the data, which then lets you draw parameter samples that are representative of this distribution. In practice, except for very simple models, this only happens reliably and quickly after you have tuned a lot of settings for your MCMC.
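To make the tuning problem concrete, here is a minimal sketch of a random-walk Metropolis sampler (one member of the MCMC family, not necessarily the variant used in my own work) targeting a standard normal distribution. The function names and the choice of target are illustrative, not from any particular library. The one setting exposed here, the proposal step size, already shows the failure modes: too small and the chain crawls and accepts nearly everything, too large and almost every proposal is rejected. Either way the chain takes far longer to converge to the posterior.

```python
import math
import random

def metropolis(log_target, x0, step, n_samples, seed=0):
    """Random-walk Metropolis: propose x' = x + Normal(0, step^2),
    accept with probability min(1, target(x') / target(x))."""
    rng = random.Random(seed)
    x = x0
    samples = []
    accepted = 0
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        # Compare log densities; accept if the ratio beats a uniform draw.
        if math.log(rng.random()) < log_target(proposal) - log_target(x):
            x = proposal
            accepted += 1
        samples.append(x)
    return samples, accepted / n_samples

# Target: standard normal, log density up to an additive constant.
log_target = lambda x: -0.5 * x * x

# The step size is exactly the kind of setting that needs tuning:
# watch how the acceptance rate collapses or saturates at the extremes.
for step in (0.01, 2.5, 50.0):
    _, rate = metropolis(log_target, x0=0.0, step=step, n_samples=5000)
    print(f"step={step:6.2f}  acceptance rate={rate:.2f}")
```

For this toy target, only the middle step size gives a chain that both moves and accepts often enough to explore the distribution; for real, high-dimensional models the same trade-off exists but is much harder to diagnose, which is where the months of tuning go.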

I recently came across an old discussion in The American Statistician (found here; it might be behind a paywall depending on where you are), in which three experienced practitioners discuss best MCMC practices. I wish I had read this discussion a few years ago, for two reasons: 1) It contains a lot of useful tips that I had to learn the hard way. 2) It makes it clear that MCMC practices vary a lot, depending on your personal preference, domain of application, degree of complexity of your model, and so on.

What I take away from this discussion is that in order to be successful at sampling with MCMC, you need to be tenacious, resourceful, and rigorous. Whenever I have failed at an attempt, it was usually because I took a "choose two out of three" approach to these qualities. Here's hoping success will come easier in the future.