I (Bob) am going to give a talk at the next NYC Machine Learning Meetup, on 19 January 2012 at 7 PM:

There’s an abstract on the meetup site. The short story is that Stan’s a directed graphical model compiler (like BUGS) that uses adaptive Hamiltonian Monte Carlo sampling to estimate posterior distributions for Bayesian models.

The official version 1 release is coming up soon, but until then, you can check out our work in progress at:

- Google Code: Stan.

January 7, 2012 at 7:01 am

Hi Bob,

Wow, this looks cool — I wish I could come to your presentation! I really must learn more about Hamiltonian MC. But until then, perhaps you can help me with a simple question: can Hamiltonian MC be used with discrete variables (e.g., can you calculate the derivatives you need)?

Mark

January 10, 2012 at 5:56 pm

Nope. HMC is just for continuous parameters.

Wherever practical, we just marginalize out discrete parameters. The Stan extension to BUGS makes it possible to do that in the model itself (examples to come in the manual, which I haven’t written yet).
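Marginalizing a discrete parameter just means summing it out of the joint density, usually on the log scale for numerical stability. Here's a minimal sketch in plain Python (not Stan code — the two-component mixture setup is my own illustration) of summing out a mixture's discrete component indicator:

```python
import math

def log_sum_exp(xs):
    # Numerically stable log(sum(exp(x) for x in xs)).
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def log_normal(y, mu, sigma):
    # Log density of Normal(mu, sigma) at y.
    return (-0.5 * math.log(2 * math.pi) - math.log(sigma)
            - 0.5 * ((y - mu) / sigma) ** 2)

def mixture_log_likelihood(y, weights, mus, sigmas):
    # Marginalize the discrete indicator z:
    #   log p(y) = log sum_k w_k * Normal(y | mu_k, sigma_k)
    return log_sum_exp([
        math.log(w) + log_normal(y, mu, s)
        for w, mu, s in zip(weights, mus, sigmas)
    ])
```

Because the indicator never appears as a sampled parameter, the remaining (continuous) parameters can be handled by HMC directly.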

Otherwise, we use exact Gibbs for discrete parameters with few outcomes and slice sampling for discrete parameters with a large or unbounded number of outcomes. But we haven’t thought too hard about the discrete case. In particular, we haven’t computed the Markov blanket in such a way as to make either of these operations at all efficient.

HMC also works best for unbounded parameters with tails that are not too light (i.e., no lighter than Gaussian). So we’ve done a whole lot of work transforming things like positive variables (variance/precision/deviation), bounded variables (probability, correlation), simplexes, and covariance matrices. It’s been calc and matrices 101 around here computing all the Jacobian determinants!
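The positive-parameter case is the simplest instance of this: sample on the unconstrained scale x = log σ and add the log of the Jacobian of the inverse transform to the log density. A minimal sketch (the exponential(1) prior is my own choice, purely for illustration):

```python
import math

def log_target_constrained(sigma):
    # Exponential(1) prior on sigma > 0 (illustrative assumption):
    # log p(sigma) = -sigma, up to a constant.
    return -sigma

def log_target_unconstrained(x):
    # x = log(sigma) is unconstrained; sigma = exp(x).
    # Change of variables adds log |d sigma / d x| = log(exp(x)) = x.
    sigma = math.exp(x)
    return log_target_constrained(sigma) + x
```

HMC then runs on x over the whole real line, and the Jacobian term keeps the implied distribution on σ correct.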

The best thing to read is Radford Neal’s chapter in the new Handbook of MCMC; it’s one of the sample chapters. We’re using an adaptive version developed by Matt Hoffman, called the no-U-turn sampler (NUTS). There’s an arXiv paper.
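For intuition, plain (non-adaptive) HMC is easy to sketch. This is a minimal Python version for a one-dimensional standard-normal target, with a hand-picked step size and path length rather than NUTS-style automatic adaptation:

```python
import math
import random

def log_p(q):
    # Log density of a standard-normal target, up to a constant.
    return -0.5 * q * q

def grad_log_p(q):
    return -q

def hmc_step(q, step_size, n_leapfrog):
    # One HMC transition: resample momentum, simulate Hamiltonian
    # dynamics with the leapfrog integrator, then accept/reject.
    p = random.gauss(0.0, 1.0)
    current_h = 0.5 * p * p - log_p(q)
    q_new, p_new = q, p
    for _ in range(n_leapfrog):
        p_new += 0.5 * step_size * grad_log_p(q_new)
        q_new += step_size * p_new
        p_new += 0.5 * step_size * grad_log_p(q_new)
    proposed_h = 0.5 * p_new * p_new - log_p(q_new)
    if random.random() < math.exp(min(0.0, current_h - proposed_h)):
        return q_new
    return q
```

The gradient is where the derivative requirement comes from: the leapfrog updates need grad_log_p at every step, which is why discrete parameters don't fit.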

January 11, 2012 at 2:03 am

I am starting anew on Graphical Models. However, ever since I came to know about Graphical Models, I have always wondered whether something like Stan was possible or not :-). This is encouraging! :-)

January 11, 2012 at 4:37 pm

BUGS, OpenBUGS and JAGS all have roughly the same functionality as Stan. They use Gibbs sampling instead of Hamiltonian Monte Carlo (technically, BUGS uses adaptive rejection sampling within Gibbs and JAGS uses slice sampling within Gibbs). These can actually be faster than (adaptive) Hamiltonian MC in some cases, such as models where all the priors are conjugate. They’re all interpreted.
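The conjugate case is why Gibbs can win: the conditional posteriors have closed forms, so each update is an exact draw with no step-size tuning at all. A hedged sketch of one such update (my own Normal-Normal example, not code from any of these packages):

```python
import math
import random

def draw_posterior_mean(ys, sigma, mu0, tau0):
    # Conjugate Normal-Normal update: with a Normal(mu0, tau0) prior
    # on mu and a Normal(mu, sigma) likelihood, the posterior for mu
    # is Normal with closed-form mean and variance, so a Gibbs sweep
    # can sample it exactly.
    n = len(ys)
    post_precision = 1.0 / tau0 ** 2 + n / sigma ** 2
    post_var = 1.0 / post_precision
    post_mean = post_var * (mu0 / tau0 ** 2 + sum(ys) / sigma ** 2)
    return random.gauss(post_mean, math.sqrt(post_var))
```

When every conditional looks like this, each Gibbs sweep is cheap and exact; HMC's advantage shows up when the conditionals are awkward or the parameters are strongly correlated.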

There are some other compiled versions. HBC and Passage, in particular, are both written in Haskell but compile to C++. Both have more limited forms of models than BUGS and their ilk. Stan’s a bit more expressive than BUGS in terms of what can be in a model.

Koller and Friedman’s book on graphical models goes over the kinds of algorithms that can be automated, including structural models we’re not considering. Bishop’s machine learning book also has a chapter on automatic graphical model algorithms.

If you want to see some slick automatic compilation for undirected graphical models with structure, check out McCallum et al.’s Factorie.

January 13, 2012 at 12:11 am

Thanks for the pointers! I am sure I would enjoy playing with some of these now :-).

January 18, 2012 at 12:34 am

You might also want to check Hal Daume’s HBC (Hierarchical Bayes Compiler).

January 23, 2012 at 2:53 pm

We know HBC pretty well. I spoke to Hal about it at length before starting the Stan project.

We’re pretty up on all the competition. Closer to what we’re doing are the recent Church extensions that do HMC. There’s an original paper, but I think there were more recent things at NIPS this year.

I also know the PyMC folks were working on similar HMC + auto-dif approaches.

Also, Theano in Python is trying to do some of the same things:

http://deeplearning.net/software/theano/