Association Class

In some circles, an AssociationClass is a class that contains a map (dictionary) from objects of one class to objects of another class. In others, it encapsulates a single tuple from that dictionary.

To understand the difference between the two definitions, consider a Person/Job/Employment scenario.

In the AssociationClass-as-tuple camp, basically the UML crowd, there is at most one link (of AssociationClass Employment) between any person and job. Each instance of the AssociationClass is a link. The attributes of the link are properties of the relationship between the specific connected objects. In the case of a binary association, say "employment", between classes Person and Job, we could make the association into an AssociationClass named Employment and add a hireDate attribute.

In the AssociationClass-as-map camp, there may be many links between person and job; the attributes of each link are usually moved into the associated classes, although it is also possible to model them on a per-link basis in the implementation.

People in the tuple camp wonder why such a thing would be useful outside the database world. People in the map camp see AssociationClass as a more explicit way to diagram a QualifiedAssociation.
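The two readings can be put side by side in code. This is only a sketch: the names Employment, EmploymentMap, assign and jobs_of are made up for illustration and come from no UML tool or library.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List


@dataclass(frozen=True)
class Person:
    name: str


@dataclass(frozen=True)
class Job:
    title: str


# Tuple reading: one instance is one link and carries the link's attributes.
@dataclass
class Employment:
    person: Person
    job: Job
    hire_date: date


# Map reading: one instance holds the whole dictionary from persons to jobs.
@dataclass
class EmploymentMap:
    jobs_by_person: Dict[Person, List[Job]] = field(default_factory=dict)

    def assign(self, person: Person, job: Job) -> None:
        self.jobs_by_person.setdefault(person, []).append(job)

    def jobs_of(self, person: Person) -> List[Job]:
        # Return a copy so callers cannot modify the internal list.
        return list(self.jobs_by_person.get(person, []))


alice = Person("Alice")
editor = Job("Editor")

link = Employment(alice, editor, date(2001, 5, 1))  # a single tuple
registry = EmploymentMap()                          # the whole mapping
registry.assign(alice, editor)
```

In the first form the hire date has an obvious home. In the second it has to move into Person or Job, or the dictionary values have to become richer objects.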

Contributors: MichaelFeathers, PeterMerel

In UML, an association class is "A modeling element that has both association and class properties. An association class can be seen as an association that also has class properties, or as a class that also has association properties" (UML 1.1 Semantics, Appendix B: Glossary).

Also found: "association class symbol designates an association that has class-like properties, such as attributes, operations, and other associations".

It follows that, in UML, an AssociationClass is a Class, a tuple. As such it is an unnecessary, redundant feature. --MarcoScheurer

I'm afraid neither of your quotes leads me to your conclusion. An association can have plurality, ergo an AssociationClass can have more than one tuple. That makes it handy and far from redundant. --PeterMerel

I don't think this follows. An association is a tuple which relates classifiers. It defines a set of tuples which relate instances. This parallels the fact that a class defines a set of instances. So, an instance of an AssociationClass is exactly one link object in the same way that an instance of a class is exactly one object of the class (not counting other associations). Or, should we start to say that an instance of a class is a set of objects?

An association defines a set of tuples which relate instances, sure. The rest is your surmise. Now perhaps there is some point in promoting this surmise, and I'm sure we could engage in some debate about it, but frankly I find this whole thing quite Swiftian. Does your class encapsulate an associative array, hash, dictionary, or whatchamacallit? Then I'll be happy to diagram it as an AssociationClass. You agree that you won't be using an AssociationClass for anything else, so kindly accept that I do and stop arguing about which end of the UML egg to open at breakfast.

As to your comments about AssociationClass as some kind of "bad pattern", we are speaking English here, and English adopts meanings by their use, not by fiat. I call a class that encapsulates an associative array (map/hash/dictionary/whatchamacallit) an AssociationClass. You may happily call it anything else you like. But when I speak of AssociationClass(es) I'm content that you know now what I mean, and that you agree that any other meaning is quite useless. Beyond that, please breakfast at any end of the egg you prefer. --PeterMerel

The following figure, from Rational UML documentation, shows an example of AssociationClass:

You can get the same "design" without the concept of an AssociationClass, if you make Job a (regular) class:

                1        1-n     0-n      0-1
        Company <----------> Job <----------> Person

I find this simpler, more precise, and even closer to implementation (which is not bad, even if one believes that "you should not think about implementation details at this level"). The only reason I can see for using the AssociationClass notation here is that "Job" is considered "2nd class" (maybe because it was not discovered at the same time as Company and Person) and therefore not considered part of the so called "real world" model. In my opinion, this is poor practice, and AssociationClass does not deserve a name: it is not a useful pattern. --MarcoScheurer
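As a rough sketch of the "regular class" version, here is Job as an ordinary object sitting between Company and Person. It assumes each Job belongs to exactly one Company and is held by at most one Person. The attribute names and the employees helper are invented for the example.

```python
from typing import List, Optional


class Company:
    def __init__(self, name: str) -> None:
        self.name = name
        self.jobs: List["Job"] = []


class Person:
    def __init__(self, name: str) -> None:
        self.name = name
        self.jobs: List["Job"] = []


class Job:
    """An ordinary class with one association to each neighbour."""

    def __init__(self, title: str, company: Company,
                 holder: Optional[Person] = None) -> None:
        self.title = title
        self.company = company   # exactly one Company per Job
        self.holder = holder     # at most one Person; None means unfilled
        company.jobs.append(self)
        if holder is not None:
            holder.jobs.append(self)


def employees(company: Company) -> List[Person]:
    # The Company reaches its people by going through its Jobs.
    return [job.holder for job in company.jobs if job.holder is not None]


acme = Company("Acme")
bob = Person("Bob")
clerk = Job("Clerk", acme, bob)
vacancy = Job("Auditor", acme)   # a Job with no Person attached
```

Nothing here stops a second Job being created for the same Company and Person. If that matters, it has to be stated as a separate rule.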

In general, I do not find them useful, but as Alistair corrected (and I botched) below, AssociationClass(es) have the additional constraint that (using this example) there is at most one instance of Job for every pair of Person and Company. If someone were interested in showing that, they'd have to add it as an additional constraint on your diagram. -- MichaelFeathers

AFAIK within UML, if you specify that one (or both) of the association ends is {non-unique}, then you can say a Person may have more than one Job at a Company. By default, association ends are {unique}. -- JevonWright

Yes, but in UML this is a restriction rather than a constraint. In fact, I'm pretty sure it comes from the "data modeling" practice of using an "intermediate table" to store "relationships" between "entities", with a compound primary key: this is where the uniqueness of the Company / Person pairs comes from.

I'd rather have a rule added to the relevant class description in addition to the traditional attributes and operations. You then have the liberty to state that "each person only participates in at most two employments with one employer at a time" if that is what you want. -- MarcoScheurer

Also, reflecting, I've thought of one instance where I did use an AssociationClass of the kind Mike suggested (extra behaviour/attributes per tuple). But I didn't model this as an AssociationClass; I modeled it as a class with two associations to the associated classes, plus extra stuff. --PeterMerel

Yes, that is what they usually end up being. The only advantage to using a UML association class is that there is an implicit constraint that there is at most one instance per link. So, using the example above, each person may participate in at most one employment at a time. Not terribly realistic. -- MichaelFeathers

No, it would mean that each person only participates in at most one employment with one employer at a time, which is realistic. AssociationClass is an example of ClassUnfolding, which is useful at times, and not at others.
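The "at most one link per pair" reading amounts to keying the links by the (person, company) pair. A small sketch, with the class EmploymentLinks and its methods made up for the purpose:

```python
from typing import Dict, List, Tuple


class EmploymentLinks:
    """Holds at most one employment link per (person, company) pair."""

    def __init__(self) -> None:
        self._links: Dict[Tuple[str, str], dict] = {}

    def hire(self, person: str, company: str, **attributes) -> None:
        key = (person, company)
        if key in self._links:
            # A second link for the same pair breaks the constraint.
            raise ValueError(f"{person} already has a link to {company}")
        self._links[key] = dict(attributes)

    def employers_of(self, person: str) -> List[str]:
        return sorted(c for (p, c) in self._links if p == person)


links = EmploymentLinks()
links.hire("Ann", "Acme", title="Clerk")
links.hire("Ann", "Globex", title="Auditor")   # a different employer is fine
```

A further call to links.hire("Ann", "Acme", ...) raises ValueError, while the same person can still be linked to any number of different employers.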

There are several uses for an AssociationClass. One is to store data about the link. If Students take Courses, then we might care that this student took that course in that semester and got that grade. Students retake courses, so you can maintain a history of that student and that course.
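A sketch of the Students/Courses case, assuming the semester is treated as part of what identifies a link so that a retake becomes a new record rather than overwriting the old one. Enrollment, Transcript and the sample data are illustrative only.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Enrollment:
    """One link between a student and a course, with data about the link."""
    student: str
    course: str
    semester: str
    grade: str


class Transcript:
    def __init__(self) -> None:
        # Keyed by (student, course, semester): a retake is a separate link.
        self._records: Dict[Tuple[str, str, str], Enrollment] = {}

    def record(self, enrollment: Enrollment) -> None:
        key = (enrollment.student, enrollment.course, enrollment.semester)
        self._records[key] = enrollment

    def history(self, student: str, course: str) -> List[Enrollment]:
        # Entries come back in the order they were recorded.
        return [e for e in self._records.values()
                if e.student == student and e.course == course]


transcript = Transcript()
transcript.record(Enrollment("Sam", "Algebra", "2001-Fall", "D"))
transcript.record(Enrollment("Sam", "Algebra", "2002-Spring", "B"))
grades = [e.grade for e in transcript.history("Sam", "Algebra")]
```

Keyed by (student, course) alone, the second record would replace the first and the history would be lost.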

The other use is to control use of pointers. One experienced developer said he shies away from paired pointers, too hard to keep in sync. So he always made an association class and had it point to each side of the relationship - see ClassUnfolding. --AlistairCockburn
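The pointer-control use might look roughly like this: neither side stores a reference to the other, and the link object is the only place the relationship is recorded. All names below are hypothetical.

```python
from typing import List


class Employer:
    def __init__(self, name: str) -> None:
        self.name = name


class Employee:
    def __init__(self, name: str) -> None:
        self.name = name


class Employment:
    """The link object: it points at both sides, nothing points back."""

    def __init__(self, employer: Employer, employee: Employee) -> None:
        self.employer = employer
        self.employee = employee


class EmploymentRegistry:
    def __init__(self) -> None:
        self._links: List[Employment] = []

    def connect(self, employer: Employer, employee: Employee) -> Employment:
        link = Employment(employer, employee)
        self._links.append(link)
        return link

    def disconnect(self, link: Employment) -> None:
        # Removing the single link updates both directions at once.
        self._links.remove(link)

    def employees_of(self, employer: Employer) -> List[Employee]:
        return [l.employee for l in self._links if l.employer is employer]

    def employers_of(self, employee: Employee) -> List[Employer]:
        return [l.employer for l in self._links if l.employee is employee]


registry = EmploymentRegistry()
acme = Employer("Acme")
dana = Employee("Dana")
link = registry.connect(acme, dana)
```

Because each link is stored once, there is no pair of back pointers that can drift out of sync. The price is a lookup through the registry for every navigation.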

To my eyes, the textual diagram above is the same as the UML diagram. The text diagram shows the result of an unfolding idiom that reifies a link into an object... and is often called an association class. I would be interested in knowing why Marco thinks it is different. I see these differences:

The UML diagram shows the intent better, i.e., that we have taken this link and reified it. The textual diagram must be seen as an idiom, and hence presupposes knowledge of the idiom in the reader.
Showing an association class brings along the implication that the Employer can refer directly to the Employee, without passing through the Job. The unfolding idiom used in the text diagram implies that Employer must talk through Job to get to Employee. I happen to want to know which is true when I look at a diagram.
The UML diagram is more cluttered. Since I am familiar with ClassUnfolding, I can turn almost any model into a few base classes, with gazillions of association classes hanging off links, and more association classes hanging off those links. Doing so would be valid modeling, but the diagram gets so cluttered by even the third AssociationClass, that I don't. I find one AssociationClass on a page is all I can cognitively handle.

--AlistairCockburn

Yes, the textual diagram is essentially the same as the UML diagram, and I used the words "same design" to introduce it. I agree that the "Job" class may have been found by an unfolding idiom (a kind of refactoring?).

I disagree with Alistair's first point. To me, it is not intent but a trace of the process that led to the discovery of the "Job" class that is displayed in the UML diagram. Why should this reification be traceable? I do not think that the reader should see the textual diagram as an idiom and recognize "Job" as a link: it is a class. This must be the fundamental difference between our views.

More differences between these diagrams:

The model implied from the textual diagram is more flexible: you can exchange the limitation of association class in favor of rules encapsulated in the relevant classes. It is also possible to model that a Company can have unfilled positions: Jobs with no associated Person. If a job can be shared by several persons you would probably add another class, "Employment", between "Job" and "Person" ("Position" might be a better name than "Job" then). How do you do this with "Job" an association class?
The textual diagram is closer to the implementation of this model in (all?) object-oriented programming languages. It could be created by ReverseEngineering object-oriented code. On the other side, the UML diagram could (only?) be created by ReverseEngineering RDBMS tables.
The textual diagram is simpler: it only uses boxes (for classes) and arrows (for associations). I may draw such a diagram with a user. The less obfuscation the better. What benefit do you gain for the additional notation of association class? Uniqueness of Company/Person pairs: as explained before I consider this an artificial restriction, a stigma of entity relationship modeling and RDBMS storage rather than a useful concept in need of a specific notation. Direct reference from Company to Person: you can show this by adding an arrow from Company to Person in the textual diagram to represent the "employees" association. Is the extra cost justified? I think not.

--MarcoScheurer

You might have fewer concepts, but you now have 3 relationships instead of one. What is the cost of increasing the number of interconnections in the model?

You also seem to suggest that you can simply throw extra relationships at a problem when two objects need to communicate directly. That might be called model-hacking. The "Job" object and "Employee" relationships both realise the same problem-domain concept. What is the cost of this repetition? The alternative - that the two relationships really are different, would require the extra relationship on both models. But then you couldn't use one relationship as a shortcut for navigating the other two.

You suggest that the 1-pair restriction is a result of data modeling theory. I don't buy this. That restriction does not apply to data modeling. In the ShlaerMellorMethod, associative objects are used to formalize relationships (same as data modeling), but additional members in the compound identifier can be used when multiple instances of a relationship are needed for a given pair of peers.

The fact that code can be reverse engineered into one model, but not the other, suggests that one model contains more information than the code, and the other doesn't. I favour the model that contains more information.

Extra concepts in the meta model generally aid understanding. Imagine a world without named-numbers. Instead of saying "my car has 4 wheels, 3 pedals, 2 doors", you might say "my car has succ(succ(succ(succ(null)))) wheels, succ(succ(succ(null))) pedals, succ(succ(null)) doors". Fewer concepts, less easy to understand.

-- DaveWhipp