Like composition and aggregation, association is a constructive relationship. Like aggregation, C++ implements association with pointers. These similarities make association a general and flexible relationship attractive to analysts because they can use it wherever they use aggregation or (with some compromise) composition. However, the same generality that makes it attractive to analysts also makes it more difficult to program. The relationship's flexibility is a product of its semantics, and its difficulty is a consequence of implementing those semantics.
UML Classes | Association Classes | Abstract Representation |
---|---|---|
(a) | (b) | (c) |

class project; // forward declaration
class contractor
{
private:
project* theProject;
};
class project
{
private:
contractor* theContractor;
};
Although we can use association in place of aggregation and composition, I believe that aggregation and composition model some application domain (i.e., "real world") situations better than association does, and I recommend reserving association for situations that require its symmetry and bidirectional capabilities. Situations requiring objects to send messages in both directions justify the added burdens, major and minor, that association places on programmers.
When we create a whole-part relationship with composition or aggregation, it's clear which class is the whole and which is the part. This distinction is often sufficient, and we can name the relationship when it isn't. However, the classes in an association relationship are peers, and how they interact is often less clear. Consequently, it's more common for class designers to name or label associations than other relationships. However, association's bidirectionality can confuse how we read the label.
The C++ compiler component is the second or middle stage in the C++ compiler system. It reads and translates each preprocessed source code file individually, once, from beginning to end. Consequently, the only program information available to the compiler component comes from the #included header files and the single source code file. However, association is bidirectional, implying that the peers "know about" each other: each class specification references the other class. Since we can't specify both classes first, we need another mechanism to solve the cross-reference problem. That mechanism is a forward declaration.
contractor.h | project.h |
---|---|
class project; // forward declaration
class contractor
{
private:
project* theProject;
. . .
};
|
class contractor; // forward declaration
class project
{
private:
contractor* theContractor;
. . .
};
|
A forward declaration consists of the class keyword and an identifier, the name of a class. A forward declaration is a "promise" that a programmer will provide more detail about the class in the future. The "promise" allows the compiler to continue processing a file but limits what programmers can put in a class specification.
In this example, when the compiler processes the contractor class specification, the forward declaration tells it that project names a class, which is enough for it to accept the pointer variable theProject; the compiler connects theProject to the project class once it sees the full project specification. Similarly, when it processes the project class specification, the forward declaration of contractor lets it accept theContractor. Association is the only relationship that requires a forward declaration because it is the only bidirectional class relationship.
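The following is a minimal sketch of the contractor and project peers collapsed into a single file so that it compiles on its own. The name and title members, the accessor functions, and the associate helper are assumptions added for illustration; they do not appear in the original header files. Because contractor is fully specified before project in this one-file layout, only project needs a forward declaration here.

```cpp
#include <string>
#include <utility>

class project;                          // forward declaration: "project is a class"

class contractor
{
private:
    std::string name;                   // hypothetical member, for illustration
    project* theProject = nullptr;      // a pointer to an incomplete type is allowed

public:
    explicit contractor(std::string n) : name(std::move(n)) {}
    void setProject(project* p) { theProject = p; }   // only stores the pointer
    project* getProject() const { return theProject; }
    const std::string& getName() const { return name; }
};

class project
{
private:
    std::string title;                  // hypothetical member, for illustration
    contractor* theContractor = nullptr;

public:
    explicit project(std::string t) : title(std::move(t)) {}
    void setContractor(contractor* c) { theContractor = c; }
    contractor* getContractor() const { return theContractor; }
    const std::string& getTitle() const { return title; }
};

// Hypothetical helper: links the two peers in both directions.
inline void associate(contractor& c, project& p)
{
    c.setProject(&p);
    p.setContractor(&c);
}
```

After calling associate on a contractor and a project, each object can reach its peer through its pointer, which is the bidirectional navigation that distinguishes association from aggregation.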
Forward declarations only work with pointer (or reference) variables. For reasons explained below, they do not work with inheritance, composition, or any other non-pointer variable. Fortunately, correctly structured programs do not require forward declarations for these relationships or for aggregation. Forward declarations are necessary to deal with association's bidirectionality, but they cannot circumvent all the problems bidirectionality causes.
Association, in conjunction with the C++ compiler system, restricts what programmers can put in a class specification more than the other class relationships do [2]. Whenever we define a variable in a program, the compiler uses its type to determine its size (i.e., how much memory to allocate to store it). Objects are variables, so the compiler sums the sizes of an object's member variables (plus any alignment padding) to determine the object's overall size. The summing process depends on the compiler "seeing" the full class specification.
Inheritance and composition entail embedding one object in another (see Instantiating a subclass and Building a whole-part, respectively). The compiler can complete the embedding because the required class organization allows it to "see" and process the superclass specification before the subclass and the part class specifications before the whole class. But this organization isn't possible when we implement association.
Forward declarations solve the class ordering problem because we implement association with pointer variables, and a pointer's size is independent of the size of the data it references (i.e., points to). Forward declarations are unnecessary and unhelpful when programming either inheritance or composition; we could use them with aggregation, but to no advantage. Association also restricts the functions programmers can inline in a class specification.
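The size argument above can be sketched in a few lines. The class names Part, Peer, and Whole and the 1024-byte array are hypothetical, chosen only to make the size difference obvious. The sketch records the size of Peer while Part is still incomplete, which works because Peer holds only a pointer.

```cpp
#include <cstddef>

class Part;                     // forward declaration only: Part's size is unknown here

class Peer
{
public:
    Part* p = nullptr;          // OK: a pointer's size does not depend on Part's size
    // Part embedded;           // would not compile here: Part is an incomplete type
};

// Computed before the compiler has "seen" the full Part specification.
constexpr std::size_t peerSizeBefore = sizeof(Peer);

class Part
{
public:
    char data[1024];            // hypothetical payload that makes Part large
};

class Whole
{
public:
    Part part;                  // composition: requires the complete Part specification
};

// Embedding a Part makes Whole at least as large as Part.
static_assert(sizeof(Whole) >= sizeof(Part), "Whole embeds a complete Part");
// Completing Part does not change the size of the pointer-holding Peer.
static_assert(sizeof(Peer) == peerSizeBefore, "Peer's size never depended on Part");
```

Uncommenting the embedded member in Peer produces an incomplete-type error, which is the restriction the forward declaration "promise" imposes.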
Peer1.h | Peer1.cpp | Peer2.h |
---|---|---|
class Peer2; class Peer1 { private: Peer2* p2; public: void bar(); }; | void Peer1::bar() { p2->foo(); } | class Peer1; class Peer2 { private: Peer1* p1; public: void foo() { ... } }; |
Programmers can inline a function either with the inline keyword or by putting the function body in the class specification. In this example, the Peer1 function bar (defined in Peer1.cpp) sends the foo message to Peer2. Although bar is a small function, programmers can only prototype it in the class specification (in Peer1.h), not inline it, because Peer1.h contains only a forward declaration of Peer2 and the compiler has not yet seen foo. (UML class diagrams typically don't include the member variables implementing a relationship, but the example includes them to clarify the bar function.)
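Function prototypes allow the compiler to validate function calls: it checks the function's name, the number and types of its arguments, and how the caller uses its return value. A forward declaration provides no prototypes, so the compiler cannot validate a message sent to a peer until it has processed that peer's full class specification.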
Message-passing code becomes increasingly fragile (difficult to use and maintain) with each added inline function. So, a helpful rule of thumb is to minimize inline functions in classes related by association.
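The inlining restriction can be sketched by placing the three files from the Peer1 and Peer2 example in one translation unit. This is an illustration under stated assumptions, not the original code: the setPeer functions and the calls counter are hypothetical additions, and foo and bar return int instead of void so that the effect of the message is observable.

```cpp
class Peer2;                            // forward declaration

class Peer1
{
private:
    Peer2* p2 = nullptr;

public:
    void setPeer(Peer2* p) { p2 = p; }  // inline is fine: it only stores the pointer
    int bar();                          // prototype only: the body needs a complete Peer2
    // int bar() { return p2->foo(); }  // would not compile here: Peer2 is incomplete
};

class Peer2
{
private:
    Peer1* p1 = nullptr;
    int calls = 0;                      // hypothetical counter, for illustration

public:
    void setPeer(Peer1* p) { p1 = p; }
    int foo() { return ++calls; }       // inline is fine: it sends no message to Peer1
};

// Corresponds to Peer1.cpp: defined only after the compiler has seen
// the full Peer2 specification, so the call to foo can be validated.
int Peer1::bar() { return p2->foo(); }
```

Functions that merely store or return the peer pointer can stay inline; any function that sends a message through the pointer must move below (or outside) both class specifications.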
Association is a constructive relationship conveniently characterized by the phrase "has a," which reads well in both directions. It has many property values in common with composition and aggregation, but it is bidirectional; it is UML's only bidirectional relationship. The following figure summarizes association's property values; use it to check and complete your entries in one of the blank Class Relationship Tables located at the end of the chapter.
(a) | (b) |
---|---|
(c) | (d) |
class Peer2; class Peer1 { Peer2* p2; }; | class Peer1; class Peer2 { Peer1* p1; }; |
[2] Java is not as complex as C++ and doesn't experience these limitations. C++ supports stack and heap objects, allows fundamental-type data throughout a program, and utilizes a one-pass compiler followed by a separate linker or loader process. In contrast, Java only supports heap objects, limits where programmers use fundamental-type data, and utilizes a two-pass compiler with dynamic class loaders. The Java compiler builds its symbol table during the first pass and generates (virtual) machine code, called byte code, during the second pass. These differences simplify the Java compiler and the organization of Java programs while also creating situations where the Java compiler needlessly recompiles files where the C++ compiler does not. The unnecessary compilations are irrelevant for small programs with small files but can significantly increase the development time for large programs.