Decorator and Adapter Patterns Everywhere

[Figure: UML diagram of the object adapter pattern, from Wikipedia]

Now that we’ve been gearing up some applications involving logistic regression and feature-based clustering, I’ve been getting lots of requests for how to do things in LingPipe. The answers all point to two general design patterns: adapter and decorator.

Really, I’m not an architecture astronaut. No more pattern talk, I promise, just case studies.

Case Study 1: Single Evaluation Spanning Cross-Validations

Now that we have a cross-validating corpus built in for classifiers, it’s getting heavy use. The typical use is to evaluate each fold (perhaps collecting cross-fold stats for summaries):

XValidatingClassifierCorpus corpus = ...;
for (int fold = 0; fold < numFolds; ++fold) {
    corpus.setFold(fold);
    Classifier classifier = trainClassifier(corpus, fold);
    ClassifierEvaluator eval = new ClassifierEvaluator(classifier);
    corpus.visitTest(eval);
}

You can print out stats in the loop, or collect up stats for the whole corpus. But what if we want to combine the evaluations? The trick is to write a simple mutable classifier filter (more patterns):

class Filter implements Classifier {
    Classifier mC;
    public Classification classify(Object x) {
        return mC.classify(x);
    }
}

Now, we pass the filter into the evaluator and set the classifier in it before each round:

XValidatingClassifierCorpus corpus = ...;
Filter filter = new Filter();
ClassifierEvaluator eval = new ClassifierEvaluator(filter);
for (int fold = 0; fold < numFolds; ++fold) {
    corpus.setFold(fold);
    Classifier classifier = trainClassifier(corpus, fold);
    filter.mC = classifier;
    corpus.visitTest(eval);
}

That’s it. After the loop’s done, the single evaluator holds the combined eval under cross-validation. We got around the immutability of the single classifier held by the evaluator by writing a simple filter that has a mutable object.
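Here’s a self-contained sketch of the same trick that runs without LingPipe. The SimpleClassifier and SimpleEvaluator types, the two toy "folds", and their test items are all made up for illustration; they are not LingPipe’s API. The only point is that one evaluator accumulates counts across folds while the filter’s delegate gets swapped underneath it.

```java
// Hypothetical stand-in for a classifier interface; not LingPipe's Classifier.
interface SimpleClassifier {
    String classify(String input);
}

// Mutable filter: forwards to whichever classifier is currently set.
class SwappableClassifier implements SimpleClassifier {
    SimpleClassifier delegate;
    public String classify(String input) {
        return delegate.classify(input);
    }
}

// Hypothetical evaluator that holds one fixed classifier reference.
class SimpleEvaluator {
    private final SimpleClassifier classifier;
    int numCases = 0;
    int numCorrect = 0;
    SimpleEvaluator(SimpleClassifier classifier) {
        this.classifier = classifier;
    }
    void handle(String input, String expected) {
        ++numCases;
        if (expected.equals(classifier.classify(input)))
            ++numCorrect;
    }
}

class SharedEvaluatorDemo {
    static SimpleEvaluator run() {
        SwappableClassifier filter = new SwappableClassifier();
        SimpleEvaluator eval = new SimpleEvaluator(filter);
        // One toy classifier per fold (invented for this sketch).
        SimpleClassifier[] foldClassifiers = {
            in -> "short",
            in -> in.length() > 3 ? "long" : "short"
        };
        // Test items per fold as {input, expected category} pairs.
        String[][][] foldTests = {
            { {"ab", "short"}, {"abcdef", "long"} },
            { {"xy", "short"}, {"wxyz", "long"} }
        };
        for (int fold = 0; fold < foldClassifiers.length; ++fold) {
            filter.delegate = foldClassifiers[fold]; // swap before each round
            for (String[] item : foldTests[fold])
                eval.handle(item[0], item[1]);
        }
        return eval;
    }

    public static void main(String[] args) {
        SimpleEvaluator eval = run();
        System.out.println(eval.numCorrect + " of " + eval.numCases + " correct");
    }
}
```

The evaluator never knows the classifier changed; it just keeps counting against the same filter reference.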

Case Study 2: Nullary Tokenizer Factory Constructors

This case just came up on our mailing list. I made a regrettable design decision in writing only the fully qualified class name of a tokenizer factory during serialization, so that it gets reconstituted using reflection over the nullary (no-arg) constructor. But what if you want a regular-expression-based tokenizer factory, which has no nullary constructor? Simple: write an adapter.

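// Subclass the regex tokenizer factory and fix the regex,
// so that a nullary constructor is available for reflection.
class MyRegexTokenizerFactory extends RegExTokenizerFactory {
    static final String MY_REGEX = ...;
    public MyRegexTokenizerFactory() { super(MY_REGEX); }
}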

That’s it. Same behavior, only now we have the necessary nullary constructor for my brain-damaged serializer.
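To see why the nullary constructor matters, here’s a small sketch of the reflection side. PatternSplitter, WhitespaceSplitter, and reconstitute() are hypothetical names invented for this example, and the code only imitates the kind of reflective reconstruction described above; it is not LingPipe’s actual serialization code.

```java
// Hypothetical class that can only be built with a regex argument.
class PatternSplitter {
    private final String regex;
    PatternSplitter(String regex) {
        this.regex = regex;
    }
    String[] split(String text) {
        return text.split(regex);
    }
}

// Adapter: pins the regex so a no-arg constructor exists.
class WhitespaceSplitter extends PatternSplitter {
    static final String MY_REGEX = "\\s+";
    public WhitespaceSplitter() {
        super(MY_REGEX);
    }
}

class NullaryAdapterDemo {
    // Rebuild an instance from nothing but a class name, via the nullary constructor.
    static PatternSplitter reconstitute(String className) {
        try {
            return (PatternSplitter) Class.forName(className)
                .getDeclaredConstructor()
                .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not reconstitute " + className, e);
        }
    }

    public static void main(String[] args) {
        String name = WhitespaceSplitter.class.getName();
        PatternSplitter splitter = reconstitute(name);
        for (String token : splitter.split("a  b c"))
            System.out.println(token);
    }
}
```

Calling reconstitute() on PatternSplitter itself would fail, because there is no no-arg constructor to find; the adapter subclass is what makes the class name alone sufficient.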

Case Study 3: Decorators

Let’s say you have a corpus and it’s being passed into some program that takes a long time to chew on it but doesn’t give any feedback. We can instrument the corpus with a decorator to give us a little feedback:

final Corpus corpus = ...;
Corpus decoratedCorpus = new Corpus() {
    public void visitTest(Handler h) {
        System.out.println("visiting test");
        corpus.visitTest(h);
    }
    ...
};

Yes, that’s it. Well, actually we need to fill in the ellipsis with the same thing for visitTrain().
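For the record, here’s roughly what the decorator looks like with both methods filled in, as a standalone sketch. MiniCorpus is a hypothetical two-method interface standing in for the real corpus class, handlers are plain java.util.function.Consumer instances, and the messages go into a list rather than System.out so they’re easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical minimal corpus interface for illustration; LingPipe's Corpus differs.
interface MiniCorpus {
    void visitTrain(Consumer<String> handler);
    void visitTest(Consumer<String> handler);
}

// Decorator: same interface, records a message, then delegates unchanged.
class LoggingCorpus implements MiniCorpus {
    private final MiniCorpus wrapped;
    final List<String> log = new ArrayList<>();

    LoggingCorpus(MiniCorpus wrapped) {
        this.wrapped = wrapped;
    }
    public void visitTrain(Consumer<String> handler) {
        log.add("visiting train");
        wrapped.visitTrain(handler);
    }
    public void visitTest(Consumer<String> handler) {
        log.add("visiting test");
        wrapped.visitTest(handler);
    }
}

class LoggingCorpusDemo {
    static LoggingCorpus decorate() {
        // Toy corpus with one training text and one test text.
        MiniCorpus plain = new MiniCorpus() {
            public void visitTrain(Consumer<String> h) { h.accept("train text"); }
            public void visitTest(Consumer<String> h) { h.accept("test text"); }
        };
        return new LoggingCorpus(plain);
    }

    public static void main(String[] args) {
        LoggingCorpus corpus = decorate();
        corpus.visitTrain(System.out::println);
        corpus.visitTest(System.out::println);
        System.out.println(corpus.log);
    }
}
```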

On the same topic, suppose we have a text corpus and we want to restrict it to only texts of length 30 to 50 (yes, that just came up this week in a client project). We just apply the same trick twice, filtering the corpus by filtering the handler:

final Corpus<TextHandler> corpus = ...;
Corpus<TextHandler> boundedCorpus
    = new Corpus<TextHandler>() {
        public void visitTest(final TextHandler handler) {
            // wrap the caller's handler so only texts of length 30 to 50 get through
            corpus.visitTest(new TextHandler() {
                    public void handle(String in) {
                        if (in.length() >= 30 && in.length() <= 50)
                            handler.handle(in);
                    }
                });
        }
    };
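The same double wrapping can be sketched in a form that compiles on its own. TextSource and LengthBoundedSource are hypothetical names, the handler is a plain Consumer<String>, and the bounds are treated as inclusive constructor arguments; none of this is LingPipe’s API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical one-method source of texts, standing in for a corpus.
interface TextSource {
    void visit(Consumer<String> handler);
}

// Filters the source by filtering the handler it passes along.
class LengthBoundedSource implements TextSource {
    private final TextSource source;
    private final int minLength;
    private final int maxLength;

    LengthBoundedSource(TextSource source, int minLength, int maxLength) {
        this.source = source;
        this.minLength = minLength;
        this.maxLength = maxLength;
    }
    public void visit(Consumer<String> handler) {
        // Wrap the caller's handler; only in-bounds texts reach it.
        source.visit(text -> {
            if (text.length() >= minLength && text.length() <= maxLength)
                handler.accept(text);
        });
    }
}

class LengthFilterDemo {
    static List<String> run() {
        // Toy source emitting texts of length 2, 4, and 8.
        TextSource source = h -> {
            h.accept("ab");
            h.accept("abcd");
            h.accept("abcdefgh");
        };
        List<String> kept = new ArrayList<>();
        new LengthBoundedSource(source, 3, 5).visit(kept::add);
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

With bounds of 3 and 5, only the four-character text makes it through to the caller’s handler.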

Basically, these approaches all put something in between the implementation you have and the implementation that’s needed, usually acting as a filter. While it’s not quite Lisp, it’s getting close in terms of parenthesis nesting, and is a handy tool for your Java toolbox.
