<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: Custom Java Map for Binary Features</title>
	<atom:link href="http://lingpipe-blog.com/2010/02/09/custom-java-map-for-binary-features/feed/" rel="self" type="application/rss+xml" />
	<link>http://lingpipe-blog.com/2010/02/09/custom-java-map-for-binary-features/</link>
	<description>Natural Language Processing and Text Analytics</description>
	<lastBuildDate>Sat, 04 Feb 2012 20:56:48 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: David R. MacIver</title>
		<link>http://lingpipe-blog.com/2010/02/09/custom-java-map-for-binary-features/#comment-6380</link>
		<dc:creator><![CDATA[David R. MacIver]]></dc:creator>
		<pubDate>Thu, 11 Feb 2010 14:29:51 +0000</pubDate>
		<guid isPermaLink="false">http://lingpipe-blog.com/?p=3632#comment-6380</guid>
		<description><![CDATA[Thanks, it&#039;s appreciated.]]></description>
		<content:encoded><![CDATA[<p>Thanks, it&#8217;s appreciated.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: David R. MacIver</title>
		<link>http://lingpipe-blog.com/2010/02/09/custom-java-map-for-binary-features/#comment-6379</link>
		<dc:creator><![CDATA[David R. MacIver]]></dc:creator>
		<pubDate>Thu, 11 Feb 2010 14:16:55 +0000</pubDate>
		<guid isPermaLink="false">http://lingpipe-blog.com/?p=3632#comment-6379</guid>
		<description><![CDATA[I&#039;m not entirely sure why it was done this way, but there&#039;s a whole pile of weird behaviours around the core collection implementations. Another example of weirdness is the way the implementations of LinkedHashSet/HashSet interact (there&#039;s a package-private constructor for HashSet which takes an extra boolean argument. The value is ignored, but calling that constructor backs the set with a LinkedHashMap instead of a plain HashMap, which is how LinkedHashSet is built).

I ended up writing my own HashMap implementation for the Scala standard library and was surprised at how easy it was to beat java.util.HashMap in performance. 

The avoidance of dependencies makes sense. I figured that was probably the case, just thought I&#039;d mention it as prior art rather than jump right in and tell you to write your own. :-)]]></description>
		<content:encoded><![CDATA[<p>I&#8217;m not entirely sure why it was done this way, but there&#8217;s a whole pile of weird behaviours around the core collection implementations. Another example of weirdness is the way the implementations of LinkedHashSet/HashSet interact (there&#8217;s a package-private constructor for HashSet which takes an extra boolean argument. The value is ignored, but calling that constructor backs the set with a LinkedHashMap instead of a plain HashMap, which is how LinkedHashSet is built).</p>
<p>I ended up writing my own HashMap implementation for the Scala standard library and was surprised at how easy it was to beat java.util.HashMap in performance. </p>
<p>The avoidance of dependencies makes sense. I figured that was probably the case, just thought I&#8217;d mention it as prior art rather than jump right in and tell you to write your own. :-)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: lingpipe</title>
		<link>http://lingpipe-blog.com/2010/02/09/custom-java-map-for-binary-features/#comment-6372</link>
		<dc:creator><![CDATA[lingpipe]]></dc:creator>
		<pubDate>Wed, 10 Feb 2010 17:58:25 +0000</pubDate>
		<guid isPermaLink="false">http://lingpipe-blog.com/?p=3632#comment-6372</guid>
		<description><![CDATA[P.S.  I added &lt;a href=&quot;http://www.drmaciver.com/blog/&quot; rel=&quot;nofollow&quot;&gt;David R. MacIver&#039;s blog&lt;/a&gt; to our blogroll.  It has the same kind of programming and NLP focus as this blog.]]></description>
		<content:encoded><![CDATA[<p>P.S.  I added <a href="http://www.drmaciver.com/blog/" rel="nofollow">David R. MacIver&#8217;s blog</a> to our blogroll.  It has the same kind of programming and NLP focus as this blog.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: lingpipe</title>
		<link>http://lingpipe-blog.com/2010/02/09/custom-java-map-for-binary-features/#comment-6371</link>
		<dc:creator><![CDATA[lingpipe]]></dc:creator>
		<pubDate>Wed, 10 Feb 2010 17:56:46 +0000</pubDate>
		<guid isPermaLink="false">http://lingpipe-blog.com/?p=3632#comment-6371</guid>
		<description><![CDATA[Great catch!   It made writing the whole post worthwhile.  

Does anyone know why the @author(s) Bloch and Gafter did it this way?  It seems so wrongheaded.  The only motivation I see is to have a single implementation of hashing.  Their approach also makes having null entries simpler, but come on, that could&#039;ve been handled with a single boolean, though it&#039;d sure complicate all the implementations.

Somewhere in the back of my mind, I knew that Java&#039;s hash sets were implemented on top of their maps.  If I&#039;d profiled my new implementation versus the default one, it would&#039;ve also become obvious. 

We try to stay away from any dependencies other than Java.  Paying customers seem to be allergic to GPL-like dependencies (though tolerant of Apache and often excepting MySQL and Linux from their GPL aversion).  

I have written a bunch of other collection extensions, mainly for small collections to save space.  Next up, a space-efficient resizable hash set.

I&#039;ve been meaning to write primitive-specific collections for a while, but I so rarely use Java collections for primitives in a tight loop that they&#039;ve never been a bottleneck, so I&#039;ve never gotten around to it.]]></description>
		<content:encoded><![CDATA[<p>Great catch!   It made writing the whole post worthwhile.  </p>
<p>Does anyone know why the @author(s) Bloch and Gafter did it this way?  It seems so wrongheaded.  The only motivation I see is to have a single implementation of hashing.  Their approach also makes having null entries simpler, but come on, that could&#8217;ve been handled with a single boolean, though it&#8217;d sure complicate all the implementations.</p>
<p>Somewhere in the back of my mind, I knew that Java&#8217;s hash sets were implemented on top of their maps.  If I&#8217;d profiled my new implementation versus the default one, it would&#8217;ve also become obvious. </p>
<p>We try to stay away from any dependencies other than Java.  Paying customers seem to be allergic to GPL-like dependencies (though tolerant of Apache and often excepting MySQL and Linux from their GPL aversion).  </p>
<p>I have written a bunch of other collection extensions, mainly for small collections to save space.  Next up, a space-efficient resizable hash set.</p>
<p>I&#8217;ve been meaning to write primitive-specific collections for a while, but I so rarely use Java collections for primitives in a tight loop that they&#8217;ve never been a bottleneck, so I&#8217;ve never gotten around to it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: David R. MacIver</title>
		<link>http://lingpipe-blog.com/2010/02/09/custom-java-map-for-binary-features/#comment-6365</link>
		<dc:creator><![CDATA[David R. MacIver]]></dc:creator>
		<pubDate>Wed, 10 Feb 2010 10:12:09 +0000</pubDate>
		<guid isPermaLink="false">http://lingpipe-blog.com/?p=3632#comment-6365</guid>
		<description><![CDATA[This would probably be a better idea if java.util.HashSet weren&#039;t internally implemented on top of a HashMap...

In the meantime, you might be better off using GNU Trove (it&#039;s LGPL, so the license is tolerable) or, if you don&#039;t want the dependency, implementing your own HashSet based on open addressing (it&#039;s not overly hard).]]></description>
		<content:encoded><![CDATA[<p>This would probably be a better idea if java.util.HashSet weren&#8217;t internally implemented on top of a HashMap&#8230;</p>
<p>In the meantime, you might be better off using GNU Trove (it&#8217;s LGPL, so the license is tolerable) or, if you don&#8217;t want the dependency, implementing your own HashSet based on open addressing (it&#8217;s not overly hard).</p>
]]></content:encoded>
	</item>
</channel>
</rss>

