We ran into a problem with generic methods in logistic regression. LingPipe’s logistic regression classifier uses a feature extractor to map objects to feature vectors, which are maps from strings to numbers. There used to be a single public classification method, so the class looked like:
LogisticRegressionClassifier<E>
    ConditionalClassification classify(E);
We thought it’d be convenient to have a second classification method that could classify feature vectors directly:
LogisticRegressionClassifier<E>
    ConditionalClassification classify(E);
    ConditionalClassification classify(Map<String, ? extends Number>);
It compiles just fine, and I think it’s pretty clear what the difference is between the overloaded methods. Now consider creating an instance of this type:
LogisticRegressionClassifier<Map<String,? extends Number>>
All together now: “Doh!”
The new class can be constructed, but I can’t call either classify() method with a Map<String,? extends Number> argument, because the argument matches both methods, which turn out to have identical signatures from a client perspective. Java method resolution, which happens statically, picks the most specific signature that matches the argument type. You can see why it’s stuck when the two signatures are identical.
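Here’s a minimal sketch of the situation. OverloadSketch is a hypothetical stand-in, not LingPipe’s actual class, and it returns strings rather than ConditionalClassification so that it’s self-contained. With E bound to String, the two overloads resolve without trouble; with E bound to the map type, the commented-out call is the one the compiler should reject as ambiguous.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical stand-in for the classifier; only the overload shape mirrors the post.
class OverloadSketch<E> {
    String classify(E input) {
        return "classify(E)";
    }

    String classify(Map<String, ? extends Number> features) {
        return "classify(Map)";
    }

    public static void main(String[] args) {
        Map<String, Double> features = Collections.singletonMap("bias", 1.0);

        // With E = String the two overloads take unrelated parameter types,
        // so each call matches exactly one method.
        OverloadSketch<String> byText = new OverloadSketch<>();
        System.out.println(byText.classify("some text")); // prints classify(E)
        System.out.println(byText.classify(features));    // prints classify(Map)

        // With E = Map<String, ? extends Number> the type still instantiates...
        OverloadSketch<Map<String, ? extends Number>> byMap = new OverloadSketch<>();
        // ...but uncommenting the next line should fail to compile, because
        // both overloads match and neither is more specific than the other:
        // byMap.classify(features);
    }
}
```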
But hang on, the compiler shouldn’t let us compile a class with two methods with the same signature. So why did it work? It all goes back to the semantics of generics in Java, which are based on type erasure. With erasure, these two methods have the following signatures:
ConditionalClassification classify(Object)
ConditionalClassification classify(Map)
The problem is that this isn’t the view presented to client code; to clients, there are two identical methods, so client code won’t be able to call either of them, because the compiler won’t be able to resolve which one is meant.
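One way to see the erased view for yourself is through reflection, which reports parameter types after erasure. This is only a sketch: the nested Classifier class and the erasedParam helper are hypothetical names made up for illustration. The type variable E shows up as Object, and the parameterized map shows up as the raw Map.

```java
import java.lang.reflect.Method;
import java.util.Map;

class ErasureSketch {
    // Hypothetical stand-in with the same overload shape as in the post.
    static class Classifier<E> {
        String classify(E input) {
            return "E";
        }

        String classify(Map<String, ? extends Number> features) {
            return "Map";
        }
    }

    // Looks up the classify overload whose erased parameter type is paramType
    // and returns the name of that parameter type as reflection reports it.
    static String erasedParam(Class<?> paramType) {
        try {
            Method m = Classifier.class.getDeclaredMethod("classify", paramType);
            return m.getParameterTypes()[0].getName();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(erasedParam(Object.class)); // prints java.lang.Object
        System.out.println(erasedParam(Map.class));    // prints java.util.Map
    }
}
```

Both lookups succeed, which is why the class itself compiles: after erasure the two methods really are distinct.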
And that, kids, is how the method classifyFeatures(Map<String,? extends Number>) got its name.
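For completeness, here’s a sketch of why the rename helps. Again, RenamedSketch is a hypothetical stand-in that returns strings; only the two method names are taken from the post. Because the methods no longer share a name, there’s no overload to resolve, even when E is itself the map type.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical stand-in; only the method names mirror the post.
class RenamedSketch<E> {
    String classify(E input) {
        return "classify";
    }

    String classifyFeatures(Map<String, ? extends Number> features) {
        return "classifyFeatures";
    }

    public static void main(String[] args) {
        Map<String, Double> features = Collections.singletonMap("bias", 1.0);

        // The instantiation that was unusable with overloaded names.
        RenamedSketch<Map<String, ? extends Number>> byMap = new RenamedSketch<>();
        System.out.println(byMap.classify(features));         // prints classify
        System.out.println(byMap.classifyFeatures(features)); // prints classifyFeatures
    }
}
```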