<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: API Design: Should I Reify Taggings for CRFs and HMMs?</title>
	<atom:link href="http://lingpipe-blog.com/2009/10/06/api-design-reify-taggings-crf-hmm/feed/" rel="self" type="application/rss+xml" />
	<link>http://lingpipe-blog.com/2009/10/06/api-design-reify-taggings-crf-hmm/</link>
	<description>Natural Language Processing and Text Analytics</description>
	<lastBuildDate>Sat, 04 Feb 2012 20:56:48 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: lingpipe</title>
		<link>http://lingpipe-blog.com/2009/10/06/api-design-reify-taggings-crf-hmm/#comment-5616</link>
		<dc:creator><![CDATA[lingpipe]]></dc:creator>
		<pubDate>Wed, 07 Oct 2009 17:52:24 +0000</pubDate>
		<guid isPermaLink="false">http://lingpipe-blog.com/?p=2654#comment-5616</guid>
		<description><![CDATA[Taggings add two classes, &lt;code&gt;Tagging&lt;/code&gt; and &lt;code&gt;ScoredTagging&lt;/code&gt;, along with their object allocations and memory, which put some extra pressure on garbage collection (relatively speaking, not much of an issue here).  I originally went with arrays of strings, which are even more economical than lists of strings.

If I add &lt;code&gt;Tagging&lt;/code&gt;, I can eliminate the interface &lt;code&gt;corpus.TagHandler&lt;/code&gt; in favor of &lt;code&gt;ObjectHandler&lt;Tagging&gt;&lt;/code&gt;.  One longer-term benefit is that it becomes easy to write generic cross-validating corpora (one case where the pairs of inputs and outputs need to be iterated).  A downside is that a class can only implement &lt;code&gt;ObjectHandler&lt;/code&gt; for one generic type, so how would I handle both taggings and chunkings?

If I kept string arrays as outputs instead of &lt;code&gt;Tagging&lt;/code&gt;, it&#039;d certainly mean less fiddling with the existing HMM classes.   But there&#039;s no way to generify the handlers because &lt;code&gt;TagHandler&lt;/code&gt; is really tied to char sequence inputs.]]></description>
		<content:encoded><![CDATA[<p>Taggings add two classes, <code>Tagging</code> and <code>ScoredTagging</code>, along with their object allocations and memory, which put some extra pressure on garbage collection (relatively speaking, not much of an issue here).  I originally went with arrays of strings, which are even more economical than lists of strings.</p>
<p>If I add <code>Tagging</code>, I can eliminate the interface <code>corpus.TagHandler</code> in favor of <code>ObjectHandler&lt;Tagging&gt;</code>.  One longer-term benefit is that it becomes easy to write generic cross-validating corpora (one case where the pairs of inputs and outputs need to be iterated).  A downside is that a class can only implement <code>ObjectHandler</code> for one generic type, so how would I handle both taggings and chunkings?</p>
<p>If I kept string arrays as outputs instead of <code>Tagging</code>, it&#8217;d certainly mean less fiddling with the existing HMM classes.   But there&#8217;s no way to generify the handlers because <code>TagHandler</code> is really tied to char sequence inputs.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Paraba</title>
		<link>http://lingpipe-blog.com/2009/10/06/api-design-reify-taggings-crf-hmm/#comment-5614</link>
		<dc:creator><![CDATA[Paraba]]></dc:creator>
		<pubDate>Wed, 07 Oct 2009 07:10:09 +0000</pubDate>
		<guid isPermaLink="false">http://lingpipe-blog.com/?p=2654#comment-5614</guid>
		<description><![CDATA[Question 1:

I would say: go with the list. The Tagging interface doesn&#039;t seem to provide anything interesting, unless you plan to add something to it in the future. I guess one thing that would make life easier in some cases is a way to access pairs easily, i.e. if the Tagging interface had a method that returned List&lt;Pair&gt;, but this also seems stupid.

Question 2:
I would prefer ScoredObject. Again, unless you plan to add something to ScoredTagging in the future.

I&#039;m not actually using LingPipe, which is why I can&#039;t really answer Questions 3 and 4, so don&#039;t take my recommendations too seriously.]]></description>
		<content:encoded><![CDATA[<p>Question 1:</p>
<p>I would say: go with the list. The Tagging interface doesn&#8217;t seem to provide anything interesting, unless you plan to add something to it in the future. I guess one thing that would make life easier in some cases is a way to access pairs easily, i.e. if the Tagging interface had a method that returned List&lt;Pair&gt;, but this also seems stupid.</p>
<p>Question 2:<br />
I would prefer ScoredObject. Again, unless you plan to add something to ScoredTagging in the future.</p>
<p>I&#8217;m not actually using LingPipe, which is why I can&#8217;t really answer Questions 3 and 4, so don&#8217;t take my recommendations too seriously.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

