<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: Dolores Labs&#8217; Text Entailment Data from Amazon Mechanical Turk</title>
	<atom:link href="http://lingpipe-blog.com/2008/09/15/dolores-labs-text-entailment-data-from-amazon-mechanical-turk/feed/" rel="self" type="application/rss+xml" />
	<link>http://lingpipe-blog.com/2008/09/15/dolores-labs-text-entailment-data-from-amazon-mechanical-turk/</link>
	<description>Natural Language Processing and Text Analytics</description>
	<lastBuildDate>Wed, 08 Feb 2012 17:47:08 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: lingpipe</title>
		<link>http://lingpipe-blog.com/2008/09/15/dolores-labs-text-entailment-data-from-amazon-mechanical-turk/#comment-2818</link>
		<dc:creator><![CDATA[lingpipe]]></dc:creator>
		<pubDate>Fri, 19 Sep 2008 16:30:13 +0000</pubDate>
		<guid isPermaLink="false">http://lingpipe.wordpress.com/?p=171#comment-2818</guid>
		<description><![CDATA[Yes, that&#039;s exactly what happens.  I just ran the data after removing any annotator with an estimated sensitivity or specificity below 50%.  It makes almost no difference for the model-based approach, but it makes a huge difference for majority voting, which winds up halfway between the model-based results and the unfiltered voting results once you remove the outliers.  I was surprised the results were this robust.  I&#039;ll have to add another blog entry.  I&#039;m on the road now, so I might not get this stuff posted for a few days.

There&#039;s a minor identifiability problem in the model once bad annotators are allowed, in that you can get the same results with 20% accurate annotators being wrong a lot and 80% accurate annotators being right a lot.  So when you open up the possibility of less than 50% sensitivity or specificity, you have to run until you don&#039;t get these degenerate solutions (that is, until the chains mix the way the Bayesians like to see them during Gibbs sampling).]]></description>
		<content:encoded><![CDATA[<p>Yes, that&#8217;s exactly what happens.  I just ran the data after removing any annotator with an estimated sensitivity or specificity below 50%.  It makes almost no difference for the model-based approach, but it makes a huge difference for majority voting, which winds up halfway between the model-based results and the unfiltered voting results once you remove the outliers.  I was surprised the results were this robust.  I&#8217;ll have to add another blog entry.  I&#8217;m on the road now, so I might not get this stuff posted for a few days.</p>
<p>There&#8217;s a minor identifiability problem in the model once bad annotators are allowed, in that you can get the same results with 20% accurate annotators being wrong a lot and 80% accurate annotators being right a lot.  So when you open up the possibility of less than 50% sensitivity or specificity, you have to run until you don&#8217;t get these degenerate solutions (that is, until the chains mix the way the Bayesians like to see them during Gibbs sampling).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Brendan O'Connor</title>
		<link>http://lingpipe-blog.com/2008/09/15/dolores-labs-text-entailment-data-from-amazon-mechanical-turk/#comment-2816</link>
		<dc:creator><![CDATA[Brendan O'Connor]]></dc:creator>
		<pubDate>Fri, 19 Sep 2008 08:52:39 +0000</pubDate>
		<guid isPermaLink="false">http://lingpipe.wordpress.com/?p=171#comment-2816</guid>
		<description><![CDATA[This is great.  I&#039;m struck by how small the worker model residuals are.

I have the exact numbers on my end to round out your comparisons of the techniques.  I put them back &lt;a href=&quot;http://blog.doloreslabs.com/2008/09/amt-fast-cheap-good-machine-learning/#comment-583&quot; rel=&quot;nofollow&quot;&gt;on the other post&lt;/a&gt;.

On removing junk annotations: the model is figuring out that junk annotators are actually junk, so they should already be exerting very weak influence on the posterior of the labels, and therefore a weak influence on the sens/spec estimates of other workers.  So if I had to bet, I&#039;d say taking out junk annotators won&#039;t change the model&#039;s inferences by very much.  Junk annotators do matter a lot for voting, though; in my comment on the other post I found that throwing out the junk makes naive voting perform as well as anything we&#039;ve seen so far...

Hooray for open data/software/etc.  This all sounds like a good kind of science to me.]]></description>
		<content:encoded><![CDATA[<p>This is great.  I&#8217;m struck by how small the worker model residuals are.</p>
<p>I have the exact numbers on my end to round out your comparisons of the techniques.  I put them back <a href="http://blog.doloreslabs.com/2008/09/amt-fast-cheap-good-machine-learning/#comment-583" rel="nofollow">on the other post</a>.</p>
<p>On removing junk annotations: the model is figuring out that junk annotators are actually junk, so they should already be exerting very weak influence on the posterior of the labels, and therefore a weak influence on the sens/spec estimates of other workers.  So if I had to bet, I&#8217;d say taking out junk annotators won&#8217;t change the model&#8217;s inferences by very much.  Junk annotators do matter a lot for voting, though; in my comment on the other post I found that throwing out the junk makes naive voting perform as well as anything we&#8217;ve seen so far&#8230;</p>
<p>Hooray for open data/software/etc.  This all sounds like a good kind of science to me.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Opening up Academic Research on IR and Machine Learning</title>
		<link>http://lingpipe-blog.com/2008/09/15/dolores-labs-text-entailment-data-from-amazon-mechanical-turk/#comment-2814</link>
		<dc:creator><![CDATA[Opening up Academic Research on IR and Machine Learning]]></dc:creator>
		<pubDate>Thu, 18 Sep 2008 17:09:13 +0000</pubDate>
		<guid isPermaLink="false">http://lingpipe.wordpress.com/?p=171#comment-2814</guid>
		<description><![CDATA[[...] Pedersen for finally saying out loud (in the latest issue of Computational Linguistics, thanks to Bob Carpenter for the pointer) what I&#8217;ve long thought about academic publications on topics like [...]]]></description>
		<content:encoded><![CDATA[<p>[...] Pedersen for finally saying out loud (in the latest issue of Computational Linguistics, thanks to Bob Carpenter for the pointer) what I&#8217;ve long thought about academic publications on topics like [...]</p>
]]></content:encoded>
	</item>
</channel>
</rss>

