Tuesday, May 27, 2008

Java should have value types

One of the features Java lacks is the notion of immutability (similar to some variants of constness in C++). Immutability is a very important design tool in my opinion, and I would put it at the top of the list of features I would like to see in a new Java. This one is actually quite suitable for a point release, and I really hope that JDK 1.7 will at least get an annotation for immutable types (such as the one proposed with the JCIP book). It deserves better than an annotation (or, for that matter, a tagging interface), but I don't have any hopes of Sun adding keywords to the language anytime soon. You can follow the RFE in Sun's bug parade if you are interested.

Others have made good summaries of why you want to use immutable objects, so I won't list the reasons again. In short, it all comes down to easier and safer code. Immutability also allows for performance optimizations that are not possible without it (hash code caching and String.intern() to start with, but even more in distributed computing).
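To make the hash code caching point concrete, here is a minimal sketch of the idea. The Version class is a made-up example, not anything from the JDK. Since the fields can never change, the hash only has to be computed once and can be remembered afterwards.

```java
// Hypothetical example type, used only to illustrate hash code caching.
final class Version {
    private final int major;
    private final int minor;
    // Cached hash; 0 means "not computed yet". Caching is only safe
    // because major and minor can never change after construction.
    private int hash;

    Version(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Version)) return false;
        Version other = (Version) o;
        return major == other.major && minor == other.minor;
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0) {
            // Computed lazily on first use. If the result happens to be 0
            // it is simply recomputed on every call, which is still correct.
            h = 31 * (31 * 17 + major) + minor;
            hash = h;
        }
        return h;
    }
}
```

With a mutable class the cached value could silently go stale as soon as a field is changed, which is why this trick is reserved for immutable types.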

What I would like to see is to go a step further and introduce value types. By this I mean types whose objects are not only immutable but are treated as values in all regards. Most importantly, this adds the notion that they use value equality, i.e. two instances of a type are considered the same whenever all their members are the same. The members of value objects are restricted to other value types.
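For comparison, this is roughly what you have to write by hand today to get something that behaves like a value type. It is only a sketch: Money is a hypothetical example class, and it uses java.util.Objects, which arrived in Java versions later than the one current as I write this.

```java
import java.util.Objects;

// Hypothetical example type. Java has no value-type keyword, so the
// properties (immutable, value members only, value equality) are
// enforced by convention and boilerplate.
final class Money {
    private final String currency; // String is itself immutable
    private final long cents;      // a primitive, hence already a value

    Money(String currency, long cents) {
        this.currency = Objects.requireNonNull(currency);
        this.cents = cents;
    }

    String currency() { return currency; }
    long cents() { return cents; }

    // "Changing" a value means creating a new one; this object is untouched.
    Money plus(Money other) {
        if (!currency.equals(other.currency)) {
            throw new IllegalArgumentException("currency mismatch");
        }
        return new Money(currency, cents + other.cents);
    }

    // Value equality: two instances are the same when all members are the same.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Money)) return false;
        Money other = (Money) o;
        return cents == other.cents && currency.equals(other.currency);
    }

    @Override
    public int hashCode() {
        return Objects.hash(currency, cents);
    }
}
```

Two separately created instances such as new Money("EUR", 150) compare equal with equals(..) even though == still reports them as different objects. With real value types, everything in this class except the two field declarations and the plus method could be generated by the compiler.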

Some existing languages use this notion, most notably C# (in the form of structs and enums) or Lava (a language unrelated to Java). Java itself has a number of value types, such as all primitives and some object classes such as String. But the notion of a value type is not well established, and in fact primitives are treated quite differently from String, which in turn has its own specific hacks in the Java compiler.

Instead of having primitives and targeted optimizations for specific classes such as String, Java should have a proper notion of value types that:
  • are immutable
  • reference only other value types
  • use value identity
All value types would be first class objects (no more special primitives) and all the optimizations done for primitives and String could be extended to include not only further JDK classes (there are plenty of candidates) but also classes defined by a normal Java programmer. Imagine having the equivalent of String.intern() for all your data.
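A rough approximation of "String.intern() for all your data" can be built as a library today. The following is only a sketch under some loud assumptions: Interner is a hypothetical helper class, it is only correct for immutable types with value-based equals(..) and hashCode(), and it keeps every interned object alive forever, which a real implementation would have to avoid.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper, not part of the JDK. Keeps one canonical
// instance per distinct value.
final class Interner<T> {
    // Maps each value to its canonical instance. Entries are never
    // removed in this sketch, so the pool only grows.
    private final ConcurrentMap<T, T> pool = new ConcurrentHashMap<T, T>();

    // Returns the canonical instance equal to the given value,
    // registering the value itself if it is the first of its kind.
    T intern(T value) {
        T existing = pool.putIfAbsent(value, value);
        return existing != null ? existing : value;
    }
}
```

Given an Interner<String> pool, the calls pool.intern(new String("x")) and pool.intern(new String("x")) return the very same object, so later comparisons can fall back to cheap reference checks. With language-level value types the runtime could decide to do this on its own, without anybody having to pass a pool around.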

So my fictional version of Java that I call Java3k would have value types so we:
  • can write safer code
  • don't have to worry about writing hashCode() and equals(..)
  • can rely on a lot of performance optimization by the compiler
  • can forget about primitives (and all the boxing/unboxing annoyances)
  • have an easier time writing multi-threaded and distributed applications
I would even consider disallowing anything but reference identity for types that are not value types, but that would most likely cause too many portability issues when migrating from existing Java code to Java3k.

I think a separation of your types into value types and the rest (which can then further be separated) is a very valuable approach and personally I prefer taking it as far as I can in the sense that I try to make as many types value types as I can and disallow value identity outside that group. It makes for much cleaner code even if you don't have language support.

The other effect it has is that it leads you towards writing more and more code as pure functions. Instead of mutating state on objects you let them create new objects without changing themselves. The substring method on Java's String class is such an example: it doesn't change the String instance it is called upon but creates a new object. That might be confusing for beginners, but it means that it will never cause unwanted side effects (and it is only confusing because Java lets you ignore return values, which is a feature that falls under the category of Pretty Bad Ideas). And not having side effects is extremely valuable -- as with many other good ideas, you have to get used to it (unless you have been raised on a functional programming language), but once you get the idea you don't want to miss it.
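The substring behaviour is easy to see in a few lines. The class name SubstringDemo and the sample text are made up for this illustration.

```java
// Hypothetical demo class showing that substring never changes its receiver.
final class SubstringDemo {
    public static void main(String[] args) {
        String greeting = "Hello, world";

        // The beginner's mistake: the result is silently discarded,
        // and greeting is exactly what it was before.
        greeting.substring(7);

        // The new value only exists if you keep the return value.
        String tail = greeting.substring(7);

        System.out.println(greeting); // prints "Hello, world"
        System.out.println(tail);     // prints "world"
    }
}
```

The first call compiles without complaint and does nothing useful, which is exactly the problem with being allowed to ignore return values.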

So that is my first feature for my fictitious Java3k: explicit value types instead of primitives and mystery hacks for String and the like. It would not only add a lot of value in terms of safer and faster code but also get rid of enough cruft to significantly reduce the complexity of the language.

PS: once you have value types you could also add derivation by restriction such as used in XML Schema, but maybe that's a Java4k feature.

1 comment:

yecril said...

Me too.
And value types should be aggregatable, especially into arrays.
And arrays of predefined size should be value types.
I can hardly imagine that these features have never been discussed, and I would appreciate a link to such a discussion.