Finally! The BellKor Pragmatic Chaos team, winner of the last two Netflix Progress Prizes, just topped the 10% improvement on the development set required to be evaluated for the grand prize of US$ 1M. Check out the Netflix Prize Leaderboard.
Given what the public has said about this contest before, I can’t say I’m too shocked at the comments on the NY Times’ Bits Blog post And the Winner of the $1 Million Netflix Prize (Probably) Is …. They were overwhelmingly mocking and negative. Commenters thought the academics were doing free work for Netflix, and that recommendations 10% better than lousy ones were still lousy.
I’ve said this before, but what the public doesn’t realize is just how valuable this kind of data is. They’d probably be horrified at how much time we “waste” on bakeoffs like this one where there is no reward. A lot of groups out there probably would’ve paid Netflix for this kind of data, just like we pay the partly publicly funded Linguistic Data Consortium.
Plain and simple, this is a relatively huge regression problem with nonrandomly missing data. The top teams achieved a fairly amazing reduction in variance by pooling multiple predictors. Given the root-mean-square-error (RMSE) evaluation metric and the bias-variance decomposition of mean squared error, it’s not surprising the best systems were all ensembles of predictors.
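To see why pooling predictors pays off under an RMSE metric, here’s a minimal simulation (not the BellKor method; all names and the noise model are illustrative assumptions): several unbiased predictors with independent errors are averaged, and the ensemble’s RMSE drops roughly by a factor of the square root of the number of models, since averaging shrinks the variance term of squared error while leaving bias alone.

```python
# Toy illustration of variance reduction by ensembling, assuming
# unbiased predictors with independent Gaussian errors. This is a
# sketch, not a reconstruction of any Netflix Prize system.
import math
import random

random.seed(0)

n_items = 1000
true_ratings = [random.uniform(1.0, 5.0) for _ in range(n_items)]

def noisy_predictor(noise_sd):
    # Each hypothetical "model" predicts the true rating plus
    # independent zero-mean noise (i.e., unbiased but high variance).
    return [r + random.gauss(0.0, noise_sd) for r in true_ratings]

def rmse(preds):
    return math.sqrt(
        sum((p - t) ** 2 for p, t in zip(preds, true_ratings)) / n_items
    )

# Ten predictors, each with noise standard deviation 1.0.
predictors = [noisy_predictor(1.0) for _ in range(10)]
individual_rmses = [rmse(p) for p in predictors]

# The ensemble simply averages the ten predictions per item.
ensemble = [sum(ps) / len(ps) for ps in zip(*predictors)]

print(f"mean individual RMSE: {sum(individual_rmses) / len(individual_rmses):.3f}")
print(f"ensemble RMSE:        {rmse(ensemble):.3f}")
```

With independent errors, the ensemble’s error standard deviation is the single-model value divided by the square root of the ensemble size; the real prize-winning blends used weighted combinations of very different model families rather than a plain average, but the variance-reduction logic is the same.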
Fitting these models and dreaming up their structures and parameterizations was a Herculean effort. Check out the details up to last year in:
- Bell, Robert M., Yehuda Koren, and Chris Volinsky. 2008. The BellKor 2008 Solution to the Netflix Prize.