<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: Yahoo!&#8217;s Learning to Rank Challenge</title>
	<atom:link href="http://lingpipe-blog.com/2010/03/10/yahoos-learning-to-rank-challenge/feed/" rel="self" type="application/rss+xml" />
	<link>http://lingpipe-blog.com/2010/03/10/yahoos-learning-to-rank-challenge/</link>
	<description>Natural Language Processing and Text Analytics</description>
	<lastBuildDate>Sat, 04 Feb 2012 20:56:48 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: lingpipe</title>
		<link>http://lingpipe-blog.com/2010/03/10/yahoos-learning-to-rank-challenge/#comment-6565</link>
		<dc:creator><![CDATA[lingpipe]]></dc:creator>
		<pubDate>Wed, 17 Mar 2010 17:58:41 +0000</pubDate>
		<guid isPermaLink="false">http://lingpipe-blog.com/?p=3837#comment-6565</guid>
		<description><![CDATA[The vectors also contain slightly different non-zero dimensions in the main and adaptation versions of the bakeoff.

Because ERR depends only on editorial grades, if you can predict the editorial grade of each document/query vector pair, you can optimize the ranking by sorting documents on their predicted grades.

I&#039;d guess Yahoo! was just playing it safe on privacy after the AOL and Netflix debacles.]]></description>
		<content:encoded><![CDATA[<p>The vectors also contain slightly different non-zero dimensions in the main and adaptation versions of the bakeoff.</p>
<p>Because ERR depends only on editorial grades, if you can predict the editorial grade of each document/query vector pair, you can optimize the ranking by sorting documents on their predicted grades.</p>
<p>I&#8217;d guess Yahoo! was just playing it safe on privacy after the AOL and Netflix debacles.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mathieu</title>
		<link>http://lingpipe-blog.com/2010/03/10/yahoos-learning-to-rank-challenge/#comment-6558</link>
		<dc:creator><![CDATA[Mathieu]]></dc:creator>
		<pubDate>Wed, 17 Mar 2010 03:48:06 +0000</pubDate>
		<guid isPermaLink="false">http://lingpipe-blog.com/?p=3837#comment-6558</guid>
		<description><![CDATA[Just a minor detail, but the vectors are 700-dimensional.

The problem with treating learning to rank as a mere classification or regression problem is that you don&#039;t use the *relative* positions of documents with respect to each other, and classification and regression are more general problems than ranking requires you to solve.

Something worth mentioning is that queries in the test set don&#039;t exist in the training set. This means that the feature vectors necessarily contain information about both the document d and the query q, not d alone.

I agree that the absence of the raw data is a bit disappointing. I guess two possible explanations are that Yahoo is unwilling to disclose this data and that the data is too large. This means that the learning method and possibly feature selection/dimensionality reduction are the main things that will distinguish the teams.

Interestingly, the current top 20 teams are all within a 0.01 range of each other in ERR.]]></description>
		<content:encoded><![CDATA[<p>Just a minor detail, but the vectors are 700-dimensional.</p>
<p>The problem with treating learning to rank as a mere classification or regression problem is that you don&#8217;t use the *relative* positions of documents with respect to each other, and classification and regression are more general problems than ranking requires you to solve.</p>
<p>Something worth mentioning is that queries in the test set don&#8217;t exist in the training set. This means that the feature vectors necessarily contain information about both the document d and the query q, not d alone.</p>
<p>I agree that the absence of the raw data is a bit disappointing. I guess two possible explanations are that Yahoo is unwilling to disclose this data and that the data is too large. This means that the learning method and possibly feature selection/dimensionality reduction are the main things that will distinguish the teams.</p>
<p>Interestingly, the current top 20 teams are all within a 0.01 range of each other in ERR.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: lingpipe</title>
		<link>http://lingpipe-blog.com/2010/03/10/yahoos-learning-to-rank-challenge/#comment-6537</link>
		<dc:creator><![CDATA[lingpipe]]></dc:creator>
		<pubDate>Thu, 11 Mar 2010 21:08:56 +0000</pubDate>
		<guid isPermaLink="false">http://lingpipe-blog.com/?p=3837#comment-6537</guid>
		<description><![CDATA[Thanks for the feedback; I updated the post accordingly. The rules were pretty clear when I actually read them! I should&#039;ve sent you (and the other organizers) a draft first.

It&#039;s really hard for us poor (in the cash sense, not the &quot;woe is me&quot; sense) startups to interpret all the legalese.  My own eyes sort of glaze over at the language, which always seems vaguely menacing.]]></description>
		<content:encoded><![CDATA[<p>Thanks for the feedback; I updated the post accordingly. The rules were pretty clear when I actually read them! I should&#8217;ve sent you (and the other organizers) a draft first.</p>
<p>It&#8217;s really hard for us poor (in the cash sense, not the &#8220;woe is me&#8221; sense) startups to interpret all the legalese.  My own eyes sort of glaze over at the language, which always seems vaguely menacing.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Olivier Chapelle</title>
		<link>http://lingpipe-blog.com/2010/03/10/yahoos-learning-to-rank-challenge/#comment-6535</link>
		<dc:creator><![CDATA[Olivier Chapelle]]></dc:creator>
		<pubDate>Thu, 11 Mar 2010 19:51:44 +0000</pubDate>
		<guid isPermaLink="false">http://lingpipe-blog.com/?p=3837#comment-6535</guid>
		<description><![CDATA[Regarding the prize requirement: in fact, one of the rules states that &quot;each winning Team will be required to create and submit to Sponsor a presentation&quot;. There is no need to go to Haifa if you can&#039;t make it.

As for clause 4a: I&#039;m not a lawyer, but my understanding is that this clause is meant to prevent an entanglement resulting from simultaneous participation in two challenges with conflicting rules.]]></description>
		<content:encoded><![CDATA[<p>Regarding the prize requirement: in fact, one of the rules states that &#8220;each winning Team will be required to create and submit to Sponsor a presentation&#8221;. There is no need to go to Haifa if you can&#8217;t make it.</p>
<p>As for clause 4a: I&#8217;m not a lawyer, but my understanding is that this clause is meant to prevent an entanglement resulting from simultaneous participation in two challenges with conflicting rules.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

