[**Update: 22 July 2009** Please see Russ’s comments and my responses for clarification of my confusion about what he meant by “validation”.]

Russ Altman was recently quoted as saying:

“You should get your prior, put it in an envelope, and get it time-stamped by the post office,” says Russ Altman. “Otherwise you are going to have a rough idea of what genes are involved, and that is going to poison your prior.”

in the article:

- Shekhar, Chandra. 2009. From SNPs to Prescriptions: Can Genes Predict Drug Response? *Biomedical Computation Review* **5**(3):10–19.

Altman’s NIH Center Simbios publishes the glossy magazine *Biomedical Computation Review*, so I doubt it’s a misquote.

To give you some notion of context, Altman’s group is using prior information for gene-drug, drug-target, and gene-gene interactions derived from public databases as the basis for analyzing a genome-wide association study. He’s paraphrased in the article as saying the key is to “avoid infusing any bias into the analysis of the association data”, for which he offers the advice quoted above.

**Noooooo!** It’s especially distressing to hear this kind of FUD coming from someone using informative priors. What Altman’s afraid of is that strong enough priors will “bias” our estimates. In the limit, set the prior mean to the desired answer and prior variance to zero, and your posterior is the same as your prior. Adjust the prior variance to get whatever posterior you want.
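To make the degenerate limit concrete, here's a minimal sketch (my illustration, not Altman's; the conjugate normal-normal model is just the simplest case where the posterior is in closed form). As the prior variance goes to zero, the posterior collapses onto the prior mean no matter what the data say:

```python
def posterior(prior_mean, prior_var, data_mean, data_var, n):
    """Posterior mean and variance for a normal mean with known data variance,
    under a normal(prior_mean, prior_var) prior."""
    if prior_var == 0.0:
        return prior_mean, 0.0  # dogmatic prior: the data are ignored entirely
    precision = 1.0 / prior_var + n / data_var
    post_var = 1.0 / precision
    post_mean = post_var * (prior_mean / prior_var + n * data_mean / data_var)
    return post_mean, post_var

# 100 observations averaging 5.0, but a near-zero-variance prior centered at 0:
print(posterior(0.0, 1e-12, 5.0, 1.0, 100))  # posterior mean ~ 0: prior wins
print(posterior(0.0, 1.0, 5.0, 1.0, 100))    # posterior mean ~ 4.95: data win
```

With a reasonable prior variance the data dominate after 100 observations; with variance near zero you get whatever answer you built in.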

The mistake he’s making is that we’re not playing the hypothesis testing game here, as we might do in a clinical drug trial. Instead, we’re doing general statistical modeling. Our performance is measured on held-out data, not on fit to distribution as in classical statistics. The empirical bottom line is: **tuning your prior can help with prediction on held-out data**.
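Here's a hypothetical sketch of that empirical bottom line (toy data, made-up names, not from anyone's paper): ridge regression is the MAP estimate under a normal prior on the coefficients, so choosing the regularization strength by held-out error is literally fitting the prior scale along with the rest of the model.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 60, 30
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:5] = 2.0                          # only a few "genes" actually matter
y = X @ w_true + rng.normal(scale=2.0, size=n)

X_tr, y_tr = X[:40], y[:40]               # training split
X_ho, y_ho = X[40:], y[40:]               # held-out split

def ridge_fit(X, y, lam):
    """MAP estimate under a normal(0, sigma^2/lam) prior on coefficients."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Tune the prior scale like any other parameter, by held-out prediction error:
best = min((np.mean((X_ho @ ridge_fit(X_tr, y_tr, lam) - y_ho) ** 2), lam)
           for lam in [0.0, 0.1, 1.0, 10.0, 100.0])
print("held-out MSE %.2f at lambda=%g" % best)
```

Nothing privileged is happening here: the prior scale is selected by exactly the same held-out criterion we'd use for any other modeling choice.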

Every application of a probabilistic model depends on subjective assumptions about model structure and its fit to some real-world process. For a Bayesian, the prior’s no different in kind than other modeling assumptions, such as linearity of regression predictors or exchangeability of samples (i.e. repeatable trials, on which all of classical statistics is based). If you’re going to use informative priors, as Altman et al. did, then there’s no reason, either empirical or conceptual, not to fit them along with every other aspect of the model.

Yes, priors are subject to overfitting. So are ordinary regression coefficients. The fact that we want to use our posterior predictions to guide our research helps guard against overfitting priors. This would be different if we wanted to use our posterior predictions to get our drug past the FDA. Gelman et al.’s *Bayesian Data Analysis* has extensive advice on testing Bayesian model (over)fit, with lots of biology-derived examples (dosage response, survival rates, etc.).

How about we rename “prior” to “hierarchical parameter”, or better yet, “multilevel parameter”, to account for non-nested “priors”, and just get on with modeling?

July 21, 2009 at 6:05 pm |

NO! You misinterpreted what I was saying. I said the quote to stress that you can’t build the answer into your prior *during the validation phase*. In order to have a credible validation, you need to set your prior, seal it, and THEN apply the method to make predictions. Once your method is validated and you are making real predictions, go crazy and use everything you know; but during validation, don’t cheat!

July 22, 2009 at 12:54 pm |

Russ: Thanks for the response. I figured I might be misinterpreting something. Is there a paper with the statistical details? Were you talking about the M-BISON paper on micro-array association? It looks like a very interesting use of informative priors.

I was trying to emphasize that from a Bayesian perspective, the prior is just another parameter in the model, and shouldn’t get any special treatment.

I see model-building as a continuous loop of tuning and testing fit, so I’m confused by what you mean by “validation phase”. Do you mean looking to see that an interaction you expect to see is actually found by the algorithm? With fine-grained enough priors you could percolate instances to the top, which would definitely fall into the realm of overfitting the model.

July 22, 2009 at 12:59 pm |

As you know, for the purposes of academic publication, one needs to prove the performance of any new classifier and a common way to do this is to emulate how it will do in the world when it is let loose on real, unsolved problems. In that case, we save some known stuff that is not used in training, so we can see how the algorithm would do on these “gold standard” known results, as a proxy for how it will do on true things that we don’t know later. That’s what I mean by the validation phase.

July 22, 2009 at 1:43 pm |

I see the semantic confusion — what you call “validation” is what I’d call “testing” or “evaluation”, not what I’d call “development”. During development, the prior’s in play for tuning. During evaluation, it really can’t be, because you could get any result you wanted.

This brings up an issue with reporting cross-validation results rather than true held-out test results. I think of cross-validation as part of development, not testing. The problem often found in papers is conflating performance at optimally tuned cross-validation parameters with true test performance. It’s too easy to overfit cross-validation, so despite the name, tuned cross-validation results aren’t to be trusted for what Russ is calling “validation”.
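A hypothetical sketch of that selection bias (pure-noise toy data, not from any study under discussion): pick the best of many one-feature classifiers by its score on the tuning data, and the selected score looks better than chance even though there is nothing to learn; a genuinely held-out test set exposes it.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 100, 200
X = rng.normal(size=(n, d))
y = rng.integers(0, 2, size=n)          # labels are pure noise

def accuracy(scores, labels):
    """Accuracy of the threshold classifier sign(score) against 0/1 labels."""
    return float(np.mean((scores > 0).astype(int) == labels))

# "Tune" by keeping the best of 200 single-feature classifiers on the same data:
j_best = max(range(d), key=lambda j: accuracy(X[:, j], y))
tuned_score = accuracy(X[:, j_best], y)

# A genuinely held-out test set exposes the selection bias:
X_test, y_test = rng.normal(size=(n, d)), rng.integers(0, 2, size=n)
test_score = accuracy(X_test[:, j_best], y_test)
print("tuned accuracy %.2f vs held-out test accuracy %.2f"
      % (tuned_score, test_score))
```

The tuned score is inflated purely by the max over many candidates; the held-out score hovers around chance, which is the honest answer here.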

Of course, the real evaluation we care about is prediction of things we don’t yet know. Too bad that’s so much work in biology.
