This is another entry in the list of JavaProblems.
Ok... imagine I have a Java program that defines a data structure as a set of Serializable classes. I load/save instances of the data structure to/from files.
Now I create a new version of my code, and the serialised data structures of the two versions are not compatible. How do I convert between versions of serialised files?
Do I have to load the class for each version from a different class loader and then use reflection to create instances of the new version from data stored in objects of the old version? But what about private fields?
I'm not sure about how this works in Java, but in C++, I'd have an object that understands the old serialised format, and the new interface/functionality - not necessarily the same as the new object (think compatibility layer). Then, when this intermediate object is reserialised, it saves itself in the newer format. Eventually, it obsoletes itself.
I've approached this problem with custom serialization. The first value written is a version number (an int in the code below), and it is used like this:
 private void writeObject(ObjectOutputStream out) throws IOException {
     out.writeInt(2); // format version of the data that follows
     // use the various out.writeX methods to write the instance variables of
     // the latest version of the class
 }

 private void readObject(ObjectInputStream in) throws IOException {
     int version = in.readInt();
     switch (version) {
         case 1: readObjectVersion1(in); break;
         case 2: readObjectVersion2(in); break;
         // fail loudly rather than leave the object half-initialised
         default: throw new IOException("unknown format version " + version);
     }
 }

 private void readObjectVersion1(ObjectInputStream in) throws IOException {
     // use in.readX methods and assign to instance variables
 }

 private void readObjectVersion2(ObjectInputStream in) throws IOException {
     // this usually calls readObjectVersion1 then reads the extra instance
     // variables here, but could be completely different
 }
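To make the idea concrete, here is a minimal runnable sketch of the same pattern. Everything in it is an assumption for illustration: the Customer class, its two format versions (version 1 stored only a name, version 2 added an email), the empty-string default, and the asLegacyV1 hook, which exists only so the demo can produce old-format data without keeping the old class around.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class: format version 1 stored only `name`, version 2 added `email`.
class Customer implements Serializable {
    private static final long serialVersionUID = 1L;
    private static final int CURRENT_VERSION = 2;

    // All state is transient because writeObject/readObject handle it by hand.
    private transient String name;
    private transient String email;
    // Demo-only hook so old-format data can be produced without the old class.
    private transient int writeVersion = CURRENT_VERSION;

    Customer(String name, String email) {
        this.name = name;
        this.email = email;
    }

    // Demo only: make this instance save itself the way the version 1 code did.
    Customer asLegacyV1() {
        writeVersion = 1;
        return this;
    }

    String name() { return name; }
    String email() { return email; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject(); // writes no fields here; called before the custom data
        out.writeInt(writeVersion);
        out.writeUTF(name);
        if (writeVersion >= 2) {
            out.writeUTF(email);
        }
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        int version = in.readInt();
        switch (version) {
            case 1:
                name = in.readUTF();
                email = ""; // assumed default for data that predates the field
                break;
            case 2:
                name = in.readUTF();
                email = in.readUTF();
                break;
            default:
                throw new InvalidObjectException("unknown Customer format version " + version);
        }
        // Field initializers do not run during deserialization, so reset this by hand.
        writeVersion = CURRENT_VERSION;
    }
}

class VersionedSerializationDemo {
    static byte[] save(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    static Object load(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] oldFile = save(new Customer("Ann", "ignored").asLegacyV1());
        Customer upgraded = (Customer) load(oldFile);       // read through the version 1 branch
        Customer resaved = (Customer) load(save(upgraded)); // written and read as version 2
        System.out.println(upgraded.name() + " / [" + resaved.email() + "]");
    }
}
```

Note that an object read from version 1 data is written back in the current format the next time it is saved, so old files upgrade themselves as they are touched.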
That approach is great, as long as you think of it in advance. Is there a need for a JavaIdiom such as NeverUseDefaultSerialisation? -- NatPryce
I think "never" may be too strong, but maybe I'm just being wishy-washy. In order to support the principles of refactoring though, a way to control changes to serialized versions is required, and yes, it only works if you plan for it in advance (or during a beta period when you can make incompatible changes to data formats).
Now I create a new version of my code, and the serialised data structures of the two versions are not compatible. How do I convert between versions of serialised files?
Generally there are two approaches:
Lazy conversion: When you read an object, you detect that it is in the old format and convert it on the fly.
Eager conversion: Some process runs over your saved objects and converts them en masse.
I think both require that some object in the system has knowledge about the before and after properties of the object. It's hard to believe that a general purpose converter would be successful often enough. On the other hand, it's easy enough to envision a framework where you register converters by class name and serial version ID and a manager which looks up and applies the conversions as needed.
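A rough sketch of what such a registry might look like follows. It is only an illustration under stated assumptions: ConverterRegistry, Person, the "example.Person" class name and the idea that the old field values have already been pulled out of the stream into a Map are all hypothetical, and how that map gets filled is left open.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical registry: converters are looked up by class name plus serialVersionUID.
class ConverterRegistry {
    private final Map<String, Function<Map<String, Object>, Object>> converters = new HashMap<>();

    private static String key(String className, long serialVersionUID) {
        return className + "#" + serialVersionUID;
    }

    void register(String className, long serialVersionUID,
                  Function<Map<String, Object>, Object> converter) {
        converters.put(key(className, serialVersionUID), converter);
    }

    // Applies the matching converter, or throws if nobody knows this old version.
    Object convert(String className, long serialVersionUID, Map<String, Object> oldFields) {
        Function<Map<String, Object>, Object> converter =
                converters.get(key(className, serialVersionUID));
        if (converter == null) {
            throw new IllegalArgumentException(
                    "no converter for " + key(className, serialVersionUID));
        }
        return converter.apply(oldFields);
    }
}

// Hypothetical current version of the class; the old one stored a single "name" field.
class Person {
    final String firstName;
    final String lastName;

    Person(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

class ConverterRegistryDemo {
    static ConverterRegistry defaultRegistry() {
        ConverterRegistry registry = new ConverterRegistry();
        // This converter holds the "before and after" knowledge for version 1 of Person.
        registry.register("example.Person", 1L, old -> {
            String[] parts = ((String) old.get("name")).split(" ", 2);
            return new Person(parts[0], parts.length > 1 ? parts[1] : "");
        });
        return registry;
    }

    public static void main(String[] args) {
        Map<String, Object> oldFields = new HashMap<>();
        oldFields.put("name", "Ada Lovelace");
        Person p = (Person) defaultRegistry().convert("example.Person", 1L, oldFields);
        System.out.println(p.firstName + " | " + p.lastName);
    }
}
```

The same manager could serve either approach: call it from the reading code for lazy conversion, or from a batch job for eager conversion.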
I've not done this kind of conversion, but I've noticed that ObjectInputStream has some new methods in the 1.2 JDK which would help. See the ObjectInputStream.readFields method and see where that takes you.
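For what it's worth, a minimal sketch of readFields in use might look like this. The Settings class and its defaults are made up for illustration. The default passed to GetField.get is only used when the stream does not contain that field at all (for example, data written before the field was added, with the serialVersionUID kept the same); the round trip in main therefore only exercises the normal path.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class: imagine `title` was added after some files had already been saved.
class Settings implements Serializable {
    private static final long serialVersionUID = 1L; // kept fixed across versions

    private int width;
    private String title;

    Settings(int width, String title) {
        this.width = width;
        this.title = title;
    }

    int width() { return width; }
    String title() { return title; }

    // Default serialization is still used for writing; only reading is customised.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        ObjectInputStream.GetField fields = in.readFields();
        // The second argument is returned only if the stream has no such field.
        width = fields.get("width", 80);
        title = (String) fields.get("title", "untitled");
        if (fields.defaulted("title")) {
            // Old data: this is where version-specific fix-ups would go.
        }
    }
}

class ReadFieldsDemo {
    static Settings roundTrip(Settings s) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(s);
        }
        try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Settings) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Settings copy = roundTrip(new Settings(120, "report"));
        System.out.println(copy.width() + " " + copy.title());
    }
}
```

Because the fields are fetched by name rather than by position, the reading code can look at what the stream actually held before deciding how to populate the object.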
I also have a library, which works with JDK 1.1, that parses the serialized stream format and lets you examine (and potentially modify) the serialized data without having to load the object's class. I'd intended to write a batch serialized data updater for a project of my own, but was sidetracked some time ago. If you are interested, I'd be glad to mail you the code.
-- mkrajnak@execpc.com