Java Autoboxing Is Not

Java 5's "autoboxing" is really a form of coercion, just like the language coerces ints to longs. The type of a value should not depend on where it happens to be stored. If the language moves a value from TheStack to TheHeap, it shouldn't suddenly change type.


I guess the question becomes: just what is AutoBoxing, then? Or are you saying AutoBoxing is a fallacious term in its entirety? Or just that Java should treat all ints as Integers and decide where the data is stored based on context (i.e. the VM should be clever enough to figure it out for you)? --JamesHollidge

Java did not invent the concept nor terminology "box"/"boxing"/"autoboxing", so the term itself is not at issue.

But other languages that do boxing, Lisp and Haskell for instance, do not change the type/class of something depending on whether it is boxed or unboxed, whereas Java int is not a Java Integer. Thus the above complaint. What Java should do instead is a different question.
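To make the "int is not an Integer" complaint concrete, here is a minimal Java sketch. The class and method names (BoxedTypeDemo, describe, removeWithInt, removeWithInteger) are made up for illustration. It only shows that the compiler picks different overloads depending on whether the same number is held as an int or as an Integer; it says nothing about what Java should do instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BoxedTypeDemo {
    // Two hypothetical overloads. An int argument widens to long before
    // boxing is even considered; an Integer argument is already an Object.
    static String describe(long value)   { return "long"; }
    static String describe(Object value) { return "Object"; }

    static List<Integer> removeWithInt() {
        List<Integer> list = new ArrayList<>(Arrays.asList(10, 20, 30));
        list.remove(1);                   // List.remove(int index): drops the element at index 1
        return list;                      // [10, 30]
    }

    static List<Integer> removeWithInteger() {
        List<Integer> list = new ArrayList<>(Arrays.asList(10, 20, 30));
        list.remove(Integer.valueOf(10)); // List.remove(Object o): drops the element equal to 10
        return list;                      // [20, 30]
    }

    public static void main(String[] args) {
        int primitive = 5;
        Integer boxed = primitive;        // autoboxing: same number, different static type

        System.out.println(describe(primitive)); // long
        System.out.println(describe(boxed));     // Object

        // The literal 1 boxes to an Integer, and Long.equals rejects non-Long arguments.
        System.out.println(Long.valueOf(1L).equals(1)); // false

        System.out.println(removeWithInt());
        System.out.println(removeWithInteger());
    }
}
```

So moving the "same" value between an int and an Integer changes which method gets called, which is exactly the sense in which the type changes.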

I don't know about Lisp, but Haskell (the language) doesn't do boxing/unboxing. Compilers may certainly do so as an optimization technique, and the main compilers (including Hugs and GHC) do allow a programmer to specify an intent to 'unbox' certain types of values (e.g. unboxed arrays for space/time optimization purposes), but these aren't part of the core language. Further, when the programmer does explicitly specify use of unboxed values, they do possess a great many different properties (especially constraints on use) from the boxed values. In Haskell, an unboxed 'UArray' is NOT an 'Array' any more than a Java Integer is a Java 'int'.

I don't imagine Lisp is any different in this aspect. The only reason to ever 'unbox' a value is because the 'box' is expensive, but when changing from boxed to unboxed you necessarily lose whatever benefits the 'box' gave you in the first place. In Haskell, what you lose is lazy and corecursive evaluations and the ability to store arbitrary-sized input (UArray and its cousins can only be used with Int, Double, and other machine-precision types). In Java, what you lose is garbage collection, referencing, and access to synchronization primitives. And, quite assuredly, you lose something by any explicit 'unboxing' the Lisp language might provide - enough that you won't be able to say that the "type/class" never changed.

Responding to James: Implicit 'unboxing' performed ONLY by the compiler when it is a provably valid optimization seems a fine idea in theory, but (unfortunately) this often would not be provable due to the common presence of inheritance and modular code. E.g. in Java it requires proving that nobody else ever holds a reference to the 'Integer' without also holding a reference to the object holding the 'Integer'... very easy for inner-loops on a stack (which is, admittedly, one of the places it most matters) but is a bit difficult for 'Integer' objects found within an object or collection (e.g. akin to the 'UArray' in Haskell). If programmers really need the speed and space, they'll need also the ability to specify the boxing or unboxing explicitly.
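As a rough sketch of what "specifying the boxing or unboxing explicitly" looks like in today's Java: the programmer chooses between an Integer-based and an int-based representation by hand. The names below (ExplicitUnboxingDemo, sumBoxed, sumPrimitive) are invented for this example, and no timings are claimed; whether a given VM optimizes the boxed version is an assumption left open.

```java
import java.util.Arrays;
import java.util.List;

final class ExplicitUnboxingDemo {
    // Boxed version: the elements are Integer objects that other code may
    // also reference, so the compiler cannot simply treat them as raw ints.
    static int sumBoxed(List<Integer> values) {
        Integer total = 0;
        for (Integer v : values) {
            total += v; // unboxes both operands, adds, then boxes the result again
        }
        return total;   // auto-unboxed on return
    }

    // Primitive version: the programmer states the unboxed form explicitly.
    static int sumPrimitive(int[] values) {
        int total = 0;
        for (int v : values) {
            total += v; // plain int arithmetic, no wrapper objects involved
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(Arrays.asList(1, 2, 3)));   // 6
        System.out.println(sumPrimitive(new int[] {1, 2, 3}));  // 6
    }
}
```

Both methods compute the same result; the difference is only in who decided on the representation, the programmer or (hypothetically) the compiler.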

OTOH, the speed and space costs of boxed vs. unboxed values usually differ only by a constant factor, one that does not grow as hardware keeps improving in step with Moore's Law. Even if a compiler or linker -cannot- perform the optimization due to other language constraints (e.g. involving modularity of code, reflection, etc.), it may be worth rejecting explicit 'unboxed' types in order to reduce the complexity of the language and diminish the temptation towards premature optimization. The only cost is alienating those programmers in your audience who really, truly do need that last bit of non-algorithmic speed or space (and are willing to jump through hoops and suffer for it). Even for these people, you could provide compiler options that push the optimization under the much weaker condition that it isn't provably -unsafe-.


Technical nit: 'int's are stored on the heap -- when they are fields in a class.

Technical nit nit: 'int's are only stored on the heap when they are fields in an object.

Technical nit of a nit of an int nit: 'int's are not stored on the heap. They do not have heap data structures associated with their storage. The object is stored on the heap. A part of the object is used to store an int. You cannot refer to a field of the object separately from the object itself.


One irksome facet of Java's auto-unboxing is that it sometimes results in NullPointerExceptions. This is particularly annoying because I and many other Java developers have trained ourselves to look only at uses of the . operator as sources of these exceptions, so if the line the exception is thrown from looks like:

int foo = bar.method(baz.othermethod());

Then bafflement frequently ensues when bar and baz have both been checked and found not to be null. Of course, the other part of this problem is that NullPointerException doesn't say what's null.
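A minimal sketch of how that can happen, assuming othermethod returns an Integer and method takes an int. The names here (UnboxingNpeDemo, lookup, twice) are hypothetical stand-ins for the bar/baz line above, not anyone's real code.

```java
import java.util.HashMap;
import java.util.Map;

final class UnboxingNpeDemo {
    // Hypothetical stand-in for baz.othermethod(): Map.get returns null
    // when the key is absent, so this can return a null Integer.
    static Integer lookup(Map<String, Integer> counts, String key) {
        return counts.get(key);
    }

    // Hypothetical stand-in for bar.method(...): takes a primitive int.
    static int twice(int n) {
        return 2 * n;
    }

    static boolean throwsOnMissingKey() {
        Map<String, Integer> counts = new HashMap<>();
        try {
            // The compiler inserts an intValue() call on lookup's result here.
            // Nothing on this line is null except the Integer being unboxed.
            int foo = twice(lookup(counts, "missing"));
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsOnMissingKey()); // true
    }
}
```

The map and both methods are non-null, yet the call still throws, because the hidden unboxing step dereferences the null Integer.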


Last edited August 17, 2010