Upgrading Java Classes with Backward-Compatible Serialization

When using Java’s default serialization, any change to the member variables or method signatures of a class breaks backward compatibility, in the sense that anything serialized from the old class won’t deserialize in the new class. So does a similar change to a superclass, or a change of superclass type.

After a bit of overview, I’ll show you the trick I’ve been using to maintain backward compatibility when upgrading classes with new meaningful parameters.

Serial Version ID

The problem stems from the fact that Java uses reflection to compute a serial version ID for each class. (You can calculate it yourself using the serialver command that ships with Sun/Oracle’s JDK.) When methods or member variables change, the serial version changes. To deserialize, the serialized class’s ID must match the current class’s ID.

As long as the (non-transient) member variables don’t change, this problem can be tackled by explicitly declaring the serial version ID. (See the Serializable interface javadoc for how to declare). With an explicit serial version ID declared, the serialization mechanism uses the declared value rather than computing it from the class signature via reflection.

If there are serializable objects out there you need to deserialize, use serialver to calculate what the version ID used to be, then declare it in the class. If you’re starting from scratch, the ID can be anything.

So always use an explicit serial version when you first create a class. It can’t hurt, and it’s likely to help with backward compatibility.
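
To make the difference concrete, here’s a minimal sketch that asks the runtime which ID it associates with a class, with and without a declaration. The names SerialVersionDemo, Computed, Declared and idOf are made up for illustration; ObjectStreamClass.lookup() is the standard java.io call.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical demo class; the names below are not from the post.
class SerialVersionDemo {

    // No declared ID: the runtime computes one from the class's signature.
    static class Computed implements Serializable {
        String name;
    }

    // Declared ID: the serialization mechanism uses this value as-is.
    static class Declared implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
    }

    // Returns the serial version ID that serialization associates with cls.
    static long idOf(Class<?> cls) {
        return ObjectStreamClass.lookup(cls).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("computed: " + idOf(Computed.class));
        System.out.println("declared: " + idOf(Declared.class)); // declared: 1
    }
}
```

The computed value depends on the class’s signature, so I haven’t shown it; the declared one stays at 1 no matter what methods get added later.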

Taking Control with the Externalizable Interface

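But what if the (non-transient) member variables (non-static class variables) change? A version ID alone gives you little control. The default read and write are implemented through reflection and match variables by name: a variable that’s missing from the old data is silently left at its default value (null, zero or false), and a primitive variable whose type has changed throws an InvalidClassException.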

You need to take control using the Externalizable interface, which extends Serializable. Now you control what gets read and written.

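Adding a member variable to an Externalizable class doesn’t by itself break anything. Deserialization first calls the public nullary (no-arg) constructor, which Externalizable requires, and then readExternal(), so serialization and deserialization still work as long as the new member variable is either not final or is set in that constructor.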
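
Here’s a minimal sketch of that constructor-then-read order, run in memory rather than through a file. The wrapper class ConstructorDefaultDemo, the roundTrip() helper and the "default" value are assumptions of mine for illustration; Foo follows the shape of the example in the next section, with a second variable that is never written.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical demo wrapper; only the shape of Foo follows the post.
class ConstructorDefaultDemo {

    public static class Foo implements Externalizable {
        private static final long serialVersionUID = 42L;
        public String mArg1;
        public String mArg2; // new variable, never written to the stream

        // Deserialization calls this constructor before readExternal().
        public Foo() { mArg2 = "default"; }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeObject(mArg1);
        }

        @Override
        public void readExternal(ObjectInput in)
            throws IOException, ClassNotFoundException {
            mArg1 = (String) in.readObject();
        }
    }

    // Serializes foo to an in-memory buffer and reads it back.
    static Foo roundTrip(Foo foo) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(foo);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                 new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Foo) in.readObject();
        }
    }

    public static void main(String[] args)
        throws IOException, ClassNotFoundException {
        Foo foo = new Foo();
        foo.mArg1 = "abc";
        foo.mArg2 = "changed"; // not serialized, so the copy won't see it
        Foo copy = roundTrip(foo);
        System.out.println(copy.mArg1 + " " + copy.mArg2); // abc default
    }
}
```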

The Simplest Case of The Problem

Suppose we have the class

import java.io.*;
public class Foo implements Externalizable {
    // explicit ID, so later changes to the class don't change it
    static final long serialVersionUID = 42L;
    public String mArg1;
    // Externalizable requires a public no-arg constructor
    public Foo() { }
    public void setArg1(String arg1) { mArg1 = arg1; }
    // reads back exactly what writeExternal() wrote, in the same order
    public void readExternal(ObjectInput in)
        throws IOException, ClassNotFoundException {
        mArg1 = (String) in.readObject();
    }
    public void writeExternal(ObjectOutput out)
        throws IOException {
        out.writeObject(mArg1);
    }
}

So far, so good. If I add the following main(),

    public static void main(String[] args) 
        throws IOException, ClassNotFoundException {
        Foo foo = new Foo();
        foo.mArg1 = "abc";
        System.out.println("1: " + foo.mArg1);

        FileOutputStream out = new FileOutputStream("temp.Foo");
        ObjectOutput objOut = new ObjectOutputStream(out);
        objOut.writeObject(foo);
        objOut.close();

        FileInputStream in = new FileInputStream("temp.Foo");
        ObjectInput objIn = new ObjectInputStream(in);
        Foo foo2 = (Foo) objIn.readObject();
        System.out.println("2: " + foo2.mArg1);
        objIn.close();
    }

and compile and run, I get

c:\carp>javac Foo.java
c:\carp>java Foo
1: abc
2: abc
c:\carp>ls -l temp.Foo
----------+ 1 Bob Carpenter None 31 2010-05-04 15:14 temp.Foo

So far, so good.

But now suppose I add a second variable to Foo,

public class Foo ...
...
    public String mArg2;
...
    public void setArg2(String arg2) { mArg2 = arg2; }
...

And try to run just the deserialization with the existing file,

    public static void main(String[] args) 
        throws IOException, ClassNotFoundException {

        FileInputStream in = new FileInputStream("temp.Foo");
        ObjectInput objIn = new ObjectInputStream(in);
        Foo foo2 = (Foo) objIn.readObject();
        System.out.println("2: " + foo2.mArg1);
        objIn.close();
    }

Because I haven’t changed readExternal(), we still get the same answer.

But what if I want to serialize the second argument, and give it a default value of null for instances serialized before it was added? The obvious changes to read and write don’t work,

    public void readExternal(ObjectInput in) 
        throws IOException, ClassNotFoundException {
        mArg1 = (String) in.readObject();
        mArg2 = (String) in.readObject();
    }
    public void writeExternal(ObjectOutput out) 
        throws IOException {
        out.writeObject(mArg1);
        out.writeObject(mArg2);
    }

It’ll throw an exception, as in

c:\carp>javac Foo.java
c:\carp>java Foo
Exception in thread "main" java.io.OptionalDataException
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1349)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
        at Foo.readExternal(Foo.java:12)
        at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1792)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1751)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
        at Foo.main(Foo.java:37)

Now what?

The Trick: Marker Objects

Here’s what I’ve been doing (well, actually, I’m doing this through a serialization proxy, but the idea’s the same),

    public void readExternal(ObjectInput in) 
        throws IOException, ClassNotFoundException {
        Object obj = in.readObject();
        if (Boolean.TRUE.equals(obj)) {
            mArg1 = (String) in.readObject();
            mArg2 = (String) in.readObject();
        } else {
            mArg1 = (String) obj;
        }
    }
    public void writeExternal(ObjectOutput out) 
        throws IOException {
        out.writeObject(Boolean.TRUE);
        out.writeObject(mArg1);
        out.writeObject(mArg2);
    }

The new implementation always writes Boolean.TRUE as a marker (any small, easily identifiable object would work), followed by the values for the first and second arguments. (In practice, these might be null in this class, so we’d have to be a bit more careful.) The trick is that the external read method reads one object, then checks whether it’s Boolean.TRUE. If it’s not, we have an instance serialized by the previous version, so we cast that object to a string, assign it to the first argument, and return. If it is, we read two args and assign them.

Now the result of deserializing our previous instance works again:

c:\carp>javac Foo.java
c:\carp>java Foo
2: abc

And there you have it, backward compatibility with new non-transient member variables.
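
If you’d like to check both directions without juggling class files, here’s a self-contained sketch of the marker trick run in memory. As an assumption for the demo, a hypothetical static switch, writeOldFormat, makes Foo write the way the original one-variable version did; MarkerDemo and roundTrip() are also my own names. The read method is the one shown above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical demo wrapper; only the shape of Foo follows the post.
class MarkerDemo {

    // Assumption: a switch that makes Foo write like the old version did.
    static boolean writeOldFormat = false;

    public static class Foo implements Externalizable {
        private static final long serialVersionUID = 42L;
        public String mArg1;
        public String mArg2;

        public Foo() { }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            if (writeOldFormat) {
                out.writeObject(mArg1); // old layout: first argument only
                return;
            }
            out.writeObject(Boolean.TRUE); // marker for the new layout
            out.writeObject(mArg1);
            out.writeObject(mArg2);
        }

        @Override
        public void readExternal(ObjectInput in)
            throws IOException, ClassNotFoundException {
            Object obj = in.readObject();
            if (Boolean.TRUE.equals(obj)) {
                mArg1 = (String) in.readObject();
                mArg2 = (String) in.readObject();
            } else {
                mArg1 = (String) obj; // old layout; mArg2 stays null
            }
        }
    }

    // Writes a Foo in the chosen layout, then reads it with the new reader.
    static Foo roundTrip(String arg1, String arg2, boolean oldFormat)
        throws IOException, ClassNotFoundException {
        Foo foo = new Foo();
        foo.mArg1 = arg1;
        foo.mArg2 = arg2;
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        writeOldFormat = oldFormat;
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(foo);
        } finally {
            writeOldFormat = false;
        }
        try (ObjectInputStream in = new ObjectInputStream(
                 new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Foo) in.readObject();
        }
    }

    public static void main(String[] args)
        throws IOException, ClassNotFoundException {
        Foo fromOld = roundTrip("abc", "xyz", true);
        Foo fromNew = roundTrip("abc", "xyz", false);
        System.out.println(fromOld.mArg1 + " " + fromOld.mArg2); // abc null
        System.out.println(fromNew.mArg1 + " " + fromNew.mArg2); // abc xyz
    }
}
```

Note that the check uses equals() rather than ==, so it doesn’t depend on the deserialized Boolean being the same instance as Boolean.TRUE. In this sketch a null first argument in old data also goes down the old-layout branch, because Boolean.TRUE.equals(null) is false.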

The Fine Print

The trick only works if there’s an object that’s being serialized (or a primitive with a known restricted range of possible values). It doesn’t have to be the first variable, but there has to be something you can test to tell which version of the class you have.
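
For the primitive case, here’s a sketch under a stated assumption: the old version wrote a single int count that is never negative, so the new version can write -1 first as its marker. The Counter class, its variables and the writeOldFormat switch (standing in for the old class version) are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical demo; none of these names come from the post.
class PrimitiveMarkerDemo {

    // Assumption: a switch that makes Counter write like the old version did.
    static boolean writeOldFormat = false;

    static final int NEW_FORMAT_MARKER = -1; // never a legal count

    public static class Counter implements Externalizable {
        private static final long serialVersionUID = 1L;
        public int mCount;    // assumed to be non-negative
        public String mLabel; // variable added in the new version

        public Counter() { }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            if (writeOldFormat) {
                out.writeInt(mCount); // old layout: the count only
                return;
            }
            out.writeInt(NEW_FORMAT_MARKER);
            out.writeInt(mCount);
            out.writeObject(mLabel);
        }

        @Override
        public void readExternal(ObjectInput in)
            throws IOException, ClassNotFoundException {
            int first = in.readInt();
            if (first == NEW_FORMAT_MARKER) {
                mCount = in.readInt();
                mLabel = (String) in.readObject();
            } else {
                mCount = first; // old layout; mLabel stays null
            }
        }
    }

    // Writes a Counter in the chosen layout, then reads it with the new reader.
    static Counter roundTrip(int count, String label, boolean oldFormat)
        throws IOException, ClassNotFoundException {
        Counter counter = new Counter();
        counter.mCount = count;
        counter.mLabel = label;
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        writeOldFormat = oldFormat;
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(counter);
        } finally {
            writeOldFormat = false;
        }
        try (ObjectInputStream in = new ObjectInputStream(
                 new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Counter) in.readObject();
        }
    }

    public static void main(String[] args)
        throws IOException, ClassNotFoundException {
        Counter fromOld = roundTrip(7, "seven", true);
        Counter fromNew = roundTrip(7, "seven", false);
        System.out.println(fromOld.mCount + " " + fromOld.mLabel); // 7 null
        System.out.println(fromNew.mCount + " " + fromNew.mLabel); // 7 seven
    }
}
```

If the old value could legitimately be any int, there’s no spare value left to use as a marker, which is exactly the limitation described above.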
