Sets Break Collection Contract

In the javadoc documentation for Collections, it says

public boolean contains(Object o)
Returns true if this collection contains the specified element. More formally, returns true if and only if this collection contains at least one element e such that (o==null ? e==null : o.equals(e)).

Unfortunately, this must not be true for Sets. As it says in the javadoc documentation for java.util.Set:

Note:
Great care must be exercised if mutable objects are used as set elements. The behavior of a set is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is an element in the set. A special case of this prohibition is that it is not permissible for a set to contain itself as an element.

What is the problem with that? First, assume we have a class MutableInteger, which is similar to integer but mutable. It implements equals() and hashCode() in the expected way -- such that new MutableInteger(i).equals(new MutableInteger(j)) equals to i==j.

Now we have the following method:

 public static boolean f(Collection c) {
     MutableInteger i = new MutableInteger(0);
     c.add(i);                // line 3
     i.setValue(2);
     return c.contains(i);    // Always returns true.
 }

According to the Collection interface, this method always returns true, and it certainly does for a Vector, LinkedList or an ArrayList, because the method 'containes(i)' compares all elements to i using equals. As we already have the reference in c, this must return true.

But look what happens if we call it with a set:

 boolean b = f(new HashSet());

Because the hashValue of c changes in line 3, it is possible that, when looking up the object in the HashSet, it won't be found.

Thus, sets break the contract built by collections.


Aren't you assuming that the Collection contract includes an implicit rule that you can change an element at any time? I don't think it does. Changing the identity or key of an element after putting it in a set, hash table, or a number of other structures would, in most implementations, be an error. It's not Java's Set class that's in error; it's this usage.

I agree. This is a problem only for ValueTypes? that are not immutable. The equality of mutable ReferenceTypes? should be the same as testing their object identity. Value types should be immutable. That way, sets do not break the collection contract.

And this actually should work, since the default implementation of equals() delegates to ==. So, arguably, the example above with MutableInteger will either work if MutableInteger does not implement equals(), or fail because it does, but in this latter case MutableInteger is the one breaking the contract. Striking this from JavaDesignFlaws might be a good idea.

The problem is maybe just the documentation. The Collection page does not mention at all that you should not insert mutable elements to a collection. The Set page does, however. This is, IMHO, breaking the LSP (LiskovSubstitutionPrinciple)


On the other hand, this can be seen as a good example of when it is appropriate to "break the rules". A small violation of LSP enables us to use Set as a Collection, which has lots of benefits.


Let's consider this situation:

  public static void f(Set s) {
     MutableInteger i = new MutableInteger(0);
     MutableInteger j = new MutableInteger(1);
     s.add(i); s.add(j);
     i.setValue(1); // Uh oh.
  }

In the last line, if we want to consider MutableIntegers equal if the values are equal, then we cannot just do this - suddenly our Set is not a set in the mathematical sense, so it's perfectly reasonable that the Set can no longer obey its general contract. Even if we could somehow notify the Set every time one of its elements was changed, how would we resolve the situation? Which ought we to remove in order to maintain the set nature - i or j?

It simply doesn't make sense to use a Set in this way. Maybe what we really need is an ImmutableObject subclass of Object, which defines #equals according to the equality of all data members, and is checked by the compiler to really be immutable (no data members are public; and all are modified only in constructors). Then a Set could refuse to #add a mutable object. But that would make it much less useful...

KarlKnechtel


Comments on page:


EditText of this page (last edited October 5, 2014) or FindPage with title or text search