The Danger of Overloading Generic Methods


We ran into a problem with generic methods in logistic regression. LingPipe’s logistic regression classifier uses a feature extractor to map objects to feature vectors, which are maps from strings to numbers. There used to be a single public classification method, so the class looked like:

    ConditionalClassification classify(E);

We thought it’d be convenient to add a second classification method that classifies feature vectors (mappings from string-valued features to numerical values) directly:

    ConditionalClassification classify(E);
    ConditionalClassification classify(Map<String,
                                           ? extends Number>);

It compiles just fine, and I think it’s pretty clear what the difference is between the overloaded methods. Now consider creating an instance of this type:

    LogisticRegressionClassifier<Map<String,? extends Number>>

All together now: “Doh!”

An instance of this type can be constructed, but I can’t call either classify() method with a Map<String,? extends Number> argument, because the argument matches both methods, which turn out to have identical signatures from a client perspective. Java method resolution, which happens statically at compile time, picks the most specific signature that matches the argument type. You can see why it’s stuck when the two signatures are identical: neither one is more specific than the other.
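Here is a minimal sketch of the situation. OverloadedClassifier and AmbiguityDemo are hypothetical stand-ins, not LingPipe classes, and the methods return a label string rather than a ConditionalClassification so the example stays self-contained. With E bound to String, each call matches exactly one overload; with E bound to the feature-vector type, the commented-out call is the one I'd expect javac to reject as ambiguous.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the classifier; not the LingPipe class.
class OverloadedClassifier<E> {
    // Classify an object of the generic input type.
    String classify(E input) {
        return "classify(E)";
    }

    // Classify a feature vector directly.
    String classify(Map<String, ? extends Number> features) {
        return "classify(Map)";
    }
}

class AmbiguityDemo {
    // With E = String, only classify(E) accepts a String argument.
    static String classifyText() {
        OverloadedClassifier<String> c = new OverloadedClassifier<>();
        return c.classify("some text");
    }

    // With E = String, only classify(Map) accepts a Map argument.
    static String classifyFeatureVector() {
        OverloadedClassifier<String> c = new OverloadedClassifier<>();
        Map<String, Integer> features = new HashMap<>();
        features.put("bias", 1);
        return c.classify(features);
    }

    // With E = Map<String,? extends Number>, both overloads accept the
    // same argument type, so the call below is left commented out.
    static void ambiguousCase() {
        OverloadedClassifier<Map<String, ? extends Number>> c =
            new OverloadedClassifier<>();
        Map<String, Integer> features = new HashMap<>();
        features.put("bias", 1);
        // c.classify(features);  // expected: "reference to classify is ambiguous"
    }

    public static void main(String[] args) {
        System.out.println(classifyText());
        System.out.println(classifyFeatureVector());
        ambiguousCase();
    }
}
```

Note that the class declaration itself compiles either way; the trouble only shows up at the call site for one particular choice of E.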

But hang on, the compiler shouldn’t let us compile a class with two methods with the same signature. So why did it work? It all goes back to the semantics of generics in Java, which are based on type erasure. With erasure, these two methods have the following signatures:

    ConditionalClassification classify(Object)
    ConditionalClassification classify(Map)

The problem is that this isn’t the view presented to client code. To clients of this particular instantiation, there are two methods with identical signatures, so code won’t be able to call either of them, because the compiler won’t be able to resolve the call to one method.
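One way to see the erased view is to ask reflection for the raw parameter types. This is only a sketch: ErasedClassifier and ErasureDemo are hypothetical names, and the methods return strings to keep the example self-contained. Method.getParameterTypes() reports the erased types, so the type variable E shows up as Object and the parameterized map shows up as plain Map.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in with the same pair of overloads.
class ErasedClassifier<E> {
    String classify(E input) {
        return "classify(E)";
    }

    String classify(Map<String, ? extends Number> features) {
        return "classify(Map)";
    }
}

class ErasureDemo {
    // Collect the erased parameter type of every classify() overload,
    // sorted by name so the result does not depend on reflection order.
    static Set<String> erasedParameterTypes() {
        Set<String> names = new TreeSet<>();
        for (Method m : ErasedClassifier.class.getDeclaredMethods()) {
            if (m.getName().equals("classify")) {
                names.add(m.getParameterTypes()[0].getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(erasedParameterTypes());
    }
}
```

The two erased types differ, which is why the class declaration is accepted even though one instantiation makes the overloads indistinguishable to callers.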

And that, kids, is how the method

    classifyFeatures(Map<String,? extends Number>)

got its name.
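A sketch of the renamed version, again with hypothetical stand-in classes (RenamedClassifier, RenameDemo) and string return values in place of ConditionalClassification. Because the two methods now have different names, overload resolution never has to choose between them, even when E is itself the feature-vector type.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in; only the method name classifyFeatures
// comes from the post.
class RenamedClassifier<E> {
    // Classify an object of the generic input type.
    String classify(E input) {
        return "classify";
    }

    // Classify a feature vector directly, under a distinct name.
    String classifyFeatures(Map<String, ? extends Number> features) {
        return "classifyFeatures";
    }
}

class RenameDemo {
    // Calls both methods on the instantiation that used to be a problem.
    static String[] callBoth() {
        RenamedClassifier<Map<String, ? extends Number>> c =
            new RenamedClassifier<>();
        Map<String, Double> features = new HashMap<>();
        features.put("bias", 1.0);
        return new String[] {
            c.classify(features),
            c.classifyFeatures(features)
        };
    }

    public static void main(String[] args) {
        for (String label : callBoth()) {
            System.out.println(label);
        }
    }
}
```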

2 Responses to “The Danger of Overloading Generic Methods”

  1. Santi Says:

    The usual debate, “generics yes or generics no”. Erasure and lack of covariance render Java generics cumbersome.

    In this case, aren’t you just trading off the benefits of polymorphism for nothing? In the end, a feature extractor is used for getting the features; couldn’t all that logic be encapsulated there? That would make the design a bit more orthogonal. I guess another question is whether the Classifier interface really needs to be generic.

  2. lingpipe Says:

    As impoverished as they are, I’m all for generics.

    In particular, I believe the classifier interface needs to be generic. We write classifiers for all sorts of different objects. A feature extractor is then used to map the object being classified to a feature vector (a map from features to numerical values), and the heavy lifting’s usually done downstream from that when the feature vectors are converted to plain old sparse vectors using a symbol table.

    Speaking of genericity, I made the features strings. I could’ve made all of that generic, too, but it leads to lots of parameters and gets real cumbersome really quickly.

    The problem came about by wanting to bypass feature extraction and feed in known feature vectors. This could be done to speed up evals, or to probe the behavior of the classifier.

    The other design idea I was playing with was adding a no-arg method to LogisticRegressionClassifier that returns a classifier over feature vectors. It’s a more orthogonal design that way, if I understand what Santi means by “orthogonal”. The reason we didn’t do it that way is opacity.
