Bayesian Posterior Entropy as a Measure of Uncertainty


Bob and Mitzi's chalkboard

Mark Johnson’s comment on my last post, about a Bayesian proposal for identifying known unknowns, sent Mitzi and me to the chalkboard after dinner (yes, we’re nerdy enough to have a chalkboard in our breakfast nook) to see if at least estimating normal means works the same way. Yes, it does. Once you've seen any data, the posterior variance of the mean parameter is always lower than its prior variance. Of course it is, since the precisions (inverse variances) are non-negative and additive: the posterior precision is the prior precision plus the precision contributed by the data. I’ve read all of this many times, but it takes a real example to make things stick.
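Here's a minimal sketch of that chalkboard calculation, assuming the usual conjugate setup of a normal prior on the mean with a known observation variance. The function name `normal_posterior` and the numbers are made up for illustration; nothing here is from the original board.

```python
def normal_posterior(prior_mean, prior_var, obs, noise_var):
    """Conjugate update for a normal mean with known observation variance.

    Precisions (inverse variances) add: posterior precision is the prior
    precision plus n / noise_var for n observations.
    """
    prior_prec = 1.0 / prior_var
    data_prec = len(obs) / noise_var
    post_prec = prior_prec + data_prec
    # Posterior mean is the precision-weighted average of prior mean and data.
    post_mean = (prior_prec * prior_mean + sum(obs) / noise_var) / post_prec
    return post_mean, 1.0 / post_prec


# Hypothetical example: vague prior N(0, 4), three observations with unit noise.
post_mean, post_var = normal_posterior(0.0, 4.0, [1.2, 0.8, 1.0], 1.0)
print(post_mean, post_var)
```

Because `data_prec` is never negative, `post_var` can never exceed `prior_var`, no matter what values the observations take.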

While I was writing various beta distributions on the board, it dawned on me that uncertainty in our predictive inferences, not uncertainty in our parameters, is what matters. For that, entropy is a better measure. For a Bernoulli distribution with parameter theta, the entropy H(theta), measured in bits, is defined by:

H(\theta) = -\theta \log_2 \theta - (1 - \theta) \log_2 (1 - \theta)


Suppose we have a posterior distribution p(theta|x) over the Bernoulli parameter theta, which represents the chance that someone has congestive heart failure given their report x. To measure uncertainty, look at the Bayesian estimate of the posterior mean of H(theta), which is just the average of the entropy weighted by the posterior distribution of theta, which is:

\mathbb{E}[H(\theta) \mid x] = \int_0^1 H(\theta) \, p(\theta \mid x) \, d\theta

This now has the right properties. The highest possible posterior entropy here is with all probability mass concentrated at theta = 0.5, leading to an entropy of 1 bit. A posterior for theta that is tighter but shifted more centrally can still result in increased entropy.
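If the posterior happens to be a beta distribution, which is what a beta prior with Bernoulli observations gives you, the average has a closed form. This is my own working rather than something from the board that night; the parameters \alpha and \beta are assumed, \psi is the digamma function, and the result is in nats (divide by \ln 2 for bits).

```latex
\begin{align*}
\mathbb{E}[\theta \ln \theta]
  &= \frac{\alpha}{\alpha+\beta}\,\bigl(\psi(\alpha+1) - \psi(\alpha+\beta+1)\bigr),
  \qquad \theta \sim \mathrm{Beta}(\alpha, \beta) \\
\mathbb{E}[H(\theta)]
  &= -\,\mathbb{E}[\theta \ln \theta] - \mathbb{E}[(1-\theta)\ln(1-\theta)] \\
  &= \psi(\alpha+\beta+1)
     - \frac{\alpha}{\alpha+\beta}\,\psi(\alpha+1)
     - \frac{\beta}{\alpha+\beta}\,\psi(\beta+1)
\end{align*}
```

As a sanity check, the uniform posterior Beta(1, 1) gives \psi(3) - \psi(2) = 1/2 nat, which matches integrating H(\theta) directly over [0, 1].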

The lowest possible entropy will result with all of the probability mass centered at either 0 or 1, leading (in the limit of a delta function [cool animation on Wikipedia]) to an entropy of 0.
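To see both extremes numerically, here's a rough sketch that approximates the posterior mean of H(theta) by midpoint-rule integration. It assumes a beta posterior with shape parameters of at least 1; the function names and the three example posteriors are hypothetical, chosen only to illustrate the point.

```python
import math


def bernoulli_entropy(theta):
    """Entropy in bits of a Bernoulli(theta); taken to be 0 at the endpoints."""
    if theta <= 0.0 or theta >= 1.0:
        return 0.0
    return -(theta * math.log2(theta) + (1 - theta) * math.log2(1 - theta))


def beta_pdf(theta, a, b):
    """Density of Beta(a, b) at theta in (0, 1), computed on the log scale."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm
                    + (a - 1) * math.log(theta)
                    + (b - 1) * math.log(1 - theta))


def expected_entropy(a, b, n_grid=20000):
    """Midpoint-rule approximation of the integral of H(theta) * Beta(theta | a, b)."""
    h = 1.0 / n_grid
    total = 0.0
    for i in range(n_grid):
        theta = (i + 0.5) * h  # midpoints never hit 0 or 1 exactly
        total += bernoulli_entropy(theta) * beta_pdf(theta, a, b)
    return total * h


wide_offcenter = expected_entropy(2, 8)    # broad posterior, mean 0.2
tight_central = expected_entropy(50, 50)   # narrow posterior, mean 0.5
tight_extreme = expected_entropy(1, 99)    # narrow posterior, mean 0.01
print(wide_offcenter, tight_central, tight_extreme)
```

The narrow posterior around 0.5 comes out close to 1 bit, higher than the broader posterior around 0.2, while the posterior piled up near 0 comes out close to 0 bits.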

PS: is a very sweet little app for adding LaTeX-derived images to blogs, which is especially easy with their permanent links (I don’t quite trust that) and WordPress’s image insertion button. It reminds me of a Mac app I used in the 90s that tried to render LaTeX WYSIWYG. But it’s still a major pain compared to the joys of directly using LaTeX or even of just rendering something that looks ugly in HTML.
