State Object

Yet another MicroArchitecture production.

The basic StateObject is an object which has nothing but state in it, and all elements public.

see ValueObject, DataTransferObject, EssenceObject (hippy name :)), transferObject (Sun's new name, anyone else?).

Why do we need one of these, and what makes it (micro)architectural?

Consider serialization of an object. When we serialize we usually want to create an object which contains only the data of interest. In most systems this is the instance data, without the class that wraps it.

By encapsulating this data explicitly in an object, it is easy to serialize. We simply grab the StateObject. It is also easy to deserialize, simply pass it as an argument to an appropriate constructor/setter. This is very useful whenever we need to deal with persistent or distributed architectures i.e. bridge referential boundaries or process boundaries.

So, say you need to export a class to a database; if you have (or can get) a state object from it then any further mappings are made very easy. Say you want to express a class in two different languages. A StateObject can indicate the common data elements. Say you want a fast delta communication channel, a StateObject is easy to difference against some "basis" object. Say you want to decorate an object with different sets of functions, then a StateObject is very handy.

Some modern systems like Java provide generic mechanisms to do Serialization/Deserialization for you. The problem with these is that they are opaque, architecturally significant (so perhaps better not to be hidden), and limited to JVM-to-JVM communication. On the other hand they are very convenient, as black box systems often are. Under the covers, however, they are generating StateObject s, usually with some dynamic type information added to the native object state.

See: TreeInSql for a fairly complicated mapping from a tree.

KentBeck calls the StatePattern "State Object". The StatePattern is quite different from this, since the "state" in StatePattern is an object that has no data, that encapsulates the behavior that changes depending on the state of the object. He can't have both names. That's just greedy :).

The StateObject pattern is called "Value Object" by Floyd Marinescu at http://www.theserverside.com/resources/patterns_review.jsp but ValueObject means something different for most people. The Sun J2EE patterns also have a pattern called ValueObject (see http://developer.java.sun.com/developer/restricted/patterns/ValueObject.html) but it seems to be the intersection of these two patterns, i.e. they use a real ValueObject to represent a StateObject.

ValueObject and StateObject are the same I think. Its interesting to note that in the paper referenced it is described as improving performance and portability. These are architectural attributes. I am wrestling with using the same name. What do people here think? --RichardHenderson

Okay, I decided to keep my name. ValueObject is getting a sense of being immutable which is not relevant here. Its just an object of state, the state of an object, yeah, its a StateObject.

Enter Number 3, roll of thunder, flash of lightning etc :)

I am not convinced. When you serialize an object, you are translating it to be a bit-stream. Why is serializing a StateObject any better than serializing the full object? It seems to me that it would be just as much work, and all you are doing it making two classes where one would do.

Firstly the objects can be pretty heavyweight, so stripping out the functional metadata is an obvious efficiency. Then there's the clients of the StateObject. They may not be identical therefore the object may need to be manipulated in flight as in a relational database mapping. This implies the StateObject should have its own identity. Finally, this way of looking at things maps directly to what computer systems actually physically do. The JNI libraries uses StateObject (as I understand it) though they call them native peers. No point in sending Java functional stuff to a C library. Indeed it is incorrect since it then limits the C code to being a consumer. Not much use in an O/S binding.

It is not obvious to me. -RalphJohnson

What is your alternative Ralph? I need classes of objects which correspond to the world I program. My programming world has state objects in it. I can map them to objects in databases and track their movements, decoration with functions, transformations etc. The real world doesn't understand Java or Smalltalk.

It is funny that you claim that serializing objects in Java has a bad smell. [refers to an earlier version which was rude about Java Serialization] I have called this pattern "Record Object", and have always labeled it a bad smell. Serialization in Smalltalk is simple and I would never want to replace it with dumb records. Why is it so much worse in Java? I do not find your arguments at all convincing. Perhaps you should give some code examples. -RalphJohnson

If you are talking to another Java system, or can do with mementos, then you are correct. Non-dumb records are not portable or robust or transparent. Dumb ones are.

Whats the difference between a StateObject and the MementoPattern?

I think it is a matter of intent. Specifically the very first clause of Memento in GoF is: Without violating encapsulation. My version is fully public i.e. the object has its own identity and structure as data. The MementoPattern implies that the object has no structure outside the class which generates it. When passed around to other classes it is a black-box lump of bits. This is great for ensuring a persistence manager (Caretaker) can be made generic, but no good for an object designed to solve architectural problems such as those in the original explanation.

In summary, a MementoPattern has no meaning outside its parent class, just identity. A StateObject is an independent object which can be used to pass information across architectural boundaries. As such it is not owned by any one class. --RichardHenderson.

Is there any performance hit by implementing the Serializable interface in java? I was under the idea it was just a MarkerInterface. Is this correct? Then if there isnt any performance hit why doesnt object implement Serializable to save us the hassle? Just wondering. -CaseyHelbling

It seems that there would only be a performance hit for a class that implements Serializable when you actually serialized an instance (hence unavoidable) or when you loaded the class (due to the additional bytecode needed if you did custom serialization). Performance impact seems negligible.

The real issue with Serializable is that you often have to customize it, and doing so raises many issues and becomes quite hairy. The result is usually a CodeSmell. There are definite hard limitations to your freedom of customization. Overall, serialization is very useful for a few specific situations, and very hard to make use of otherwise.

At least, that is what I have read in my various Java books, seen online, and learned from my peers.

By the way, I recently used a StateObject to implement a persistence scheme that successfully abstracted storage to a relational database, the file system, and via HTTP (and nearly anything else).

--RobWilliams

Yes, that's a good explanation. Performance becomes a factor with a lot of small-grained objects over a network (e.g. a stock ticker) but often isn't a big deal. Another cheesy one is Cloneable. The default (shallow copy) just isn't very useful. In the real world most objects are not shallow and tend to have nastiness like relationships and aliasing. The standard OO model is poverty-stricken in this regard. A StateObject can help here too. I commonly use them as arguments to factories, or their little cousins, constructor methods. ---RH

The heart of any serialization system is the serialized data, because readers and writers must agree on its format and be able to use it. From this point of view, creating special intermediate objects is an implementation detail: whether some StateObject is used or not, the same objects are converted to and from the same representation.

I think this discussion fails to agree on the merits of some solutions (JavaSerialization, SmalltalkSerialization?, DIY options, etc.) because it confuses some problem domain aspects:

serializing to a stream or persisting to a database

storing and reconstructing single objects or object graphs with pointers

communicating with other "platforms", with related programs, with the same program at a later time

allowing an exact reconstruction of the objects or preserving only the important part of their state

Java Serialization applies to stream serialization, takes care of complex object graphs transparently,is interoperable between Java programs only, stores all instance variables unless you spend significant effort; other solutions have other properties and tradeoffs. --LorenzoGatti

Yes, there are a few levels going on. I have refactored slightly to separate the definition from the value judgements. If we agree that StateObject is a reasonable name for the beast, then we can look at the interesting stuff.

Part of the problem is the sense that a pattern can only be a pattern if it is good. Or it is an AntiPattern. This confuses the semantics somewhat by implying a value. I use "pattern" to mean a definable class of things (in their original meanings). Particularly a pattern may be good in one context and bad in another, therefore its characterization as independently good or bad is spurious. So, apart from the philosophising :). For non-persisted JVM-JVM serialization, use Java Serialization. Everywhere else, an appreciation of the underlying structures and why they are there is worth having. Wherever Serialization is, a StateObject or two hide under there, somewhere. --RichardHenderson.

CategoryArchtecture?