(Abstract) Base Classes vs. Interfaces

When defining a low-level abstraction as part of a framework (vectors and matrices, to make this concrete), there’s a choice between coding to interfaces and specifying a (possibly abstract) base class.

There are two key differences between the base class and interface approaches. First, in Java (and .NET from what I gather), a class extends exactly one other class (Object is the default), but can implement zero or more interfaces. Second, interfaces can’t specify any implementations of methods (though they may define static constants).

But as Kirill Osenkov’s blog entry Choosing: Interface vs. Abstract Class points out, there are serious downstream consequences of this choice.

Most notably, once an interface is released, any change to it has profound backward-compatibility implications. In particular, adding a method to the interface breaks backward compatibility for everyone who has implemented it. With a base class there’s no such problem, as long as the base class itself implements the added method.
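To make the compatibility point concrete, here is a minimal sketch. Every name in it (VectorLike, BaseVector, length() and the two array-backed classes) is made up for illustration and is not part of any released API. It puts the same abstraction side by side as an interface and as an abstract base class, and shows a later method being added only on the base class side.

```java
// All names here are hypothetical, for illustration only.

// Interface approach: every implementer must supply every abstract method.
interface VectorLike {
    int numDimensions();
    double value(int dimension);
    // Adding "double length();" here in a later release would stop
    // ArrayVectorLike below from compiling until its author implements it.
}

// Client code written against the first release of the interface.
class ArrayVectorLike implements VectorLike {
    private final double[] values;
    ArrayVectorLike(double[] values) { this.values = values.clone(); }
    @Override public int numDimensions() { return values.length; }
    @Override public double value(int dimension) { return values[dimension]; }
}

// Base class approach: the first release has the same two abstract methods.
abstract class BaseVector {
    public abstract int numDimensions();
    public abstract double value(int dimension);

    // Added in a later release. It is written in terms of the existing
    // abstract methods, so subclasses compiled against the first release
    // inherit it without any changes.
    public double length() {
        double sumOfSquares = 0.0;
        for (int i = 0; i < numDimensions(); ++i)
            sumOfSquares += value(i) * value(i);
        return Math.sqrt(sumOfSquares);
    }
}

// Client code written against the first release of the base class.
class ArrayBaseVector extends BaseVector {
    private final double[] values;
    ArrayBaseVector(double[] values) { this.values = values.clone(); }
    @Override public int numDimensions() { return values.length; }
    @Override public double value(int dimension) { return values[dimension]; }
}
```

The price is the one noted above: ArrayBaseVector has used up its single superclass slot, whereas ArrayVectorLike is still free to extend something else.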

The conclusion I’ve come to after several years of coding to interfaces (following the Java collections framework for guidance) is that almost all of them should’ve been (abstract) base classes. The exceptions are util.Compilable, util.Scored, and other lightweight marker-type interfaces, most notably corpus.Handler.

So what do you do if you forgot a critical method in an interface? I’m in that position right now with our matrix.Vector interface. I’ve just implemented multinomial logistic regression classification (aka max entropy, aka soft max, aka log linear classifiers) and found that I need two new vector operations to make the inner loop efficient. One, I need to be able to find the non-zero dimensions of a vector, and two, I need to add a scaled sparse vector to a vector.

I’ve added the methods to the interface and to the low-level abstract implementation matrix.AbstractVector.
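As a rough sketch of what such a change can look like, the following uses stand-in types rather than the real matrix package. SketchVector, AbstractSketchVector and the two concrete classes are hypothetical, and the method names nonZeroDimensions and increment are assumptions made for this example. The abstract class supplies defaults for both new operations, so subclasses of it keep compiling, while a dense vector overrides increment so the inner loop only touches the sparse argument’s non-zero dimensions.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a vector interface after the two additions.
interface SketchVector {
    int numDimensions();
    double value(int dimension);
    int[] nonZeroDimensions();                      // new operation 1
    void increment(double scale, SketchVector v);   // new operation 2
}

// Abstract base supplying defaults for the new methods.
abstract class AbstractSketchVector implements SketchVector {
    // Default: scan every dimension and collect the non-zero ones.
    @Override public int[] nonZeroDimensions() {
        int count = 0;
        for (int i = 0; i < numDimensions(); ++i)
            if (value(i) != 0.0) ++count;
        int[] dims = new int[count];
        int next = 0;
        for (int i = 0; i < numDimensions(); ++i)
            if (value(i) != 0.0) dims[next++] = i;
        return dims;
    }

    // Default: treat the vector as immutable.
    @Override public void increment(double scale, SketchVector v) {
        throw new UnsupportedOperationException("vector is not mutable");
    }
}

// Mutable dense vector: this += scale * v, visiting only v's non-zeros.
class DenseSketchVector extends AbstractSketchVector {
    private final double[] values;
    DenseSketchVector(double[] values) { this.values = values.clone(); }
    @Override public int numDimensions() { return values.length; }
    @Override public double value(int dimension) { return values[dimension]; }
    @Override public void increment(double scale, SketchVector v) {
        if (v.numDimensions() != values.length)
            throw new IllegalArgumentException("dimension mismatch");
        for (int d : v.nonZeroDimensions())
            values[d] += scale * v.value(d);
    }
}

// Immutable sparse vector that stores only its non-zero entries.
class SparseSketchVector extends AbstractSketchVector {
    private final int numDimensions;
    private final TreeMap<Integer, Double> entries = new TreeMap<>();
    SparseSketchVector(int numDimensions, Map<Integer, Double> nonZeros) {
        this.numDimensions = numDimensions;
        for (Map.Entry<Integer, Double> e : nonZeros.entrySet())
            if (e.getValue() != 0.0) entries.put(e.getKey(), e.getValue());
    }
    @Override public int numDimensions() { return numDimensions; }
    @Override public double value(int dimension) {
        Double v = entries.get(dimension);
        return v == null ? 0.0 : v;
    }
    // Override the default scan: the stored keys are the answer, in order.
    @Override public int[] nonZeroDimensions() {
        int[] dims = new int[entries.size()];
        int next = 0;
        for (int d : entries.keySet()) dims[next++] = d;
        return dims;
    }
}
```

With this shape, incrementing a dense vector by a scaled sparse one costs time proportional to the number of non-zero entries in the sparse vector rather than to the full dimensionality.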

I’m banking on no one having implemented Vector, because there’s really no reason to do it, so changing the interface won’t be so terrible.

I’d dearly love to add a method to the tokenizer.TokenizerFactory interface to create tokenizers from character sequences as well as slices. But alas, I fear too many users have coded to the existing interface, so I’m not going to do that.
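One way to get that convenience without touching a released interface is a static helper that adapts a character sequence to the existing slice-based method. The sketch below is only an illustration of the idea: SliceTokenizerFactory, TokenizerFactories and WhitespaceTokenizerFactory are hypothetical names, and for brevity the factory method returns a list of token strings rather than a tokenizer object.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a released interface that only accepts slices.
interface SliceTokenizerFactory {
    // Tokenizes the characters cs[start] .. cs[start + length - 1].
    List<String> tokenize(char[] cs, int start, int length);
}

// Static helper living outside the interface, so existing implementations
// are untouched and keep compiling.
final class TokenizerFactories {
    private TokenizerFactories() { }

    static List<String> tokenize(SliceTokenizerFactory factory, CharSequence text) {
        char[] cs = text.toString().toCharArray();
        return factory.tokenize(cs, 0, cs.length);
    }
}

// Example implementation written against the slice-only interface.
class WhitespaceTokenizerFactory implements SliceTokenizerFactory {
    @Override public List<String> tokenize(char[] cs, int start, int length) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = start; i < start + length; ++i) {
            if (Character.isWhitespace(cs[i])) {
                // A whitespace character ends the current token, if any.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(cs[i]);
            }
        }
        if (current.length() > 0) tokens.add(current.toString());
        return tokens;
    }
}
```

The helper can’t be overridden per implementation, which is the trade-off against a real interface method, but it adds the new entry point without breaking anyone.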

One Response to “(Abstract) Base Classes vs. Interfaces”

  1. Kirill Osenkov Says:

    C# has extension methods – you can add behavior to existing types without changing them :) Even without extension methods, you could probably use static methods to achieve what you want.

    Yes, I think a vector should be a class – you don’t expect two different entities to act like a vector – there’s only one clearly defined entity, so making it a base class is a good idea.

    One more interesting thing that I found: usually, if a type name ends with “able”, it’s probably better as an interface. It’s not always true, but works pretty well in general.
