10.5.1. Building Aggregation


Review

Pointers are notoriously error-prone. Nevertheless, their power and flexibility make them an indispensable part of programming. We can alleviate some anticipated problems by embedding pointers in classes and accessing them with member functions. Once the functions are debugged and validated, pointers are no longer a "monster in the closet." Indeed, smart pointers, a topic covered in CS 2420, are an abstract data type created with classes that further automate C++'s "raw" pointers and insulate programmers from some of their most troublesome behaviors. We can take a similar approach by using pointers to implement aggregation.

Two classes, a Car and an Engine, joined by an aggregation symbol.

**Aggregation: a whole-part or "Has A" relationship**. Aggregation consists of one or more part objects bound to a whole object by pointers. In this way, complex objects, like a `Car`, can be built from simpler objects, like an `Engine` or a `Transmission`. So, if we read the relationship from the whole to the part, we can say a `Car` *has an* `Engine`. Or, we can read from the part to the whole, saying an `Engine` is *part of* a `Car`. Pointers make aggregation more flexible than composition.

The UML forms the aggregation connector as a line with a hollow or outlined diamond attached to the whole class and the plain or undecorated end attached to the part class.

An abstract representation of an aggregation relationship: a pointer member variable in the whole object stores the address of the part.

Two objects, instances of Car and Engine. The Car class has a pointer member variable that points to an Engine object.

A Basic Aggregation Example

C++ does not automatically initialize local and class-scope variables¹. Nevertheless, memory always contains some bit-pattern - a sequence of 1's and 0's. If a program doesn't explicitly initialize the value saved in a variable, the pattern is essentially random (often colloquially called "garbage") and invalid. Consider what happens when a program attempts to use the arrow operator with an uninitialized pointer: `object->member;`. The program cannot locate `member` (neither data nor a function) if `object` is invalid, which typically causes a catastrophic failure (i.e., the program "crashes").

Even if the program initializes object to nullptr, the arrow operator still can't find member, causing a runtime error. However, if we carefully initialize a pointer to nullptr, the program can test for this condition and avoid misusing an invalid pointer. Writing safe, secure, and correct code requires rigorously initializing and testing the values saved in pointer variables. We extend the Person class introduced in the previous section to demonstrate simple aggregation and three ways of initializing its pointer variable. The next section builds on the example, showing how to use aggregation.
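The test described above might look like the following minimal sketch. The `Widget` struct and the `describe` function are hypothetical names invented for this illustration; they are not part of the Person example.

```cpp
#include <string>
using std::string;

// Hypothetical part type, used only for this illustration.
struct Widget
{
	string	label;
};

// Returns the label when the pointer refers to an object and a
// placeholder otherwise. The test keeps the arrow operator away
// from a null pointer.
string describe(const Widget* w)
{
	if (w == nullptr)
		return "(no widget)";
	return w->label;
}
```

The test only works because the pointer holds a known value: an uninitialized pointer cannot be distinguished from a valid one.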

UML Person class:
Person
--
- name : string* = nullptr
- weight : int = 0
- height : int = 0
--
+ Person()
+ Person(n : string, w : int, h : double)
+ Person(w : int, h : int)
+ setName(n : string*) : void

**Aggregation with the `Person` class**. The example creates an aggregation relationship between the `Person` and `string` classes. One of `Person`'s member variables is a pointer, making it the whole class and `string` the part. Aggregation is a weak relationship that programmers can establish and change at any convenient time by changing the address saved in the pointer. If the whole class constructor doesn't build an aggregation relationship, the class must initialize the pointer to `nullptr`.

As of the C++11 standard, programmers can choose to initialize class members in the class specification or with constructors. Which option they choose dictates which constructors they must implement.

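Together, three conditions require a class to provide a "dummy" default constructor:

it *needs* a default constructor for *some* reason,

it initializes its members in the class specification, and

it has one or more non-default or parameterized constructors.

Declaring any constructor suppresses the default constructor the compiler would otherwise generate, so the class must supply an empty one explicitly.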

The last two parameters are fundamental types, and we covered how to initialize them in the previous chapter. The first parameter is a `string`, which has a constructor that *copies* the argument, building a new `string`, forming an aggregated part (see Modeling with library classes (b)). It isn't *necessary* for a class specification to initialize a pointer if all the constructors do.

If the program doesn't form a whole-part relationship when it creates the whole object, then any member pointers are initialized to `nullptr`, which, before the ANSI C++11 standard, could only be done in the constructor as illustrated.
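The three initialization options can be sketched as follows, based on the UML diagram above. Several details are assumptions rather than the original example: the parameterized constructor takes the height as an `int` to match the attribute type, the destructor assumes `Person` owns its `name`, copying is disabled to keep the sketch safe, and the accessors are hypothetical.

```cpp
#include <string>
using std::string;

class Person
{
	private:
		string*	name = nullptr;		// initialized in the class specification
		int	weight = 0;
		int	height = 0;

	public:
		Person() {}				// "dummy" default constructor

		Person(string n, int w, int h)		// builds the aggregation
			: name(new string(n)), weight(w), height(h) {}

		Person(int w, int h)			// no part yet: name stays nullptr
			: weight(w), height(h) {}

		~Person() { delete name; }		// assumption: Person owns its name

		Person(const Person&) = delete;		// assumption: no copying in this sketch
		Person& operator=(const Person&) = delete;

		// Hypothetical accessors, added only so the sketch can be exercised.
		bool has_name() const { return name != nullptr; }
		string get_name() const { return name != nullptr ? *name : string(); }
		int get_weight() const { return weight; }
		int get_height() const { return height; }
};
```

Because the class specification initializes every member, the default constructor's body can stay empty, and the two-parameter constructor only needs to list the members it changes.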

Programmers may establish or change the aggregation relationship anytime with a setter function. To avoid a memory leak, the setter must `delete` (i.e., deallocate) an existing part object before it installs a new one. Deleting a null pointer should be safe, but as recently as 2019, I witnessed a program fail when it did. So, I still test before deleting, but the test is *probably* no longer needed. If the whole already has a part, the setter destroys it before installing a new one.
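A setter along these lines might look like the sketch below. It assumes `Person` owns the `string` it points to; the guard against installing the same address twice, the disabled copy operations, and the `getName` accessor are additions for illustration.

```cpp
#include <string>
using std::string;

class Person
{
	private:
		string*	name = nullptr;

	public:
		Person() {}
		~Person() { delete name; }		// assumption: Person owns its name

		Person(const Person&) = delete;		// assumption: no copying in this sketch
		Person& operator=(const Person&) = delete;

		void setName(string* n)
		{
			if (n == name)			// same part (or both null): nothing to do
				return;
			if (name != nullptr)		// optional test: deleting nullptr is a no-op
				delete name;
			name = n;			// install the new part
		}

		// Hypothetical accessor, added only so the sketch can be exercised.
		const string* getName() const { return name; }
};
```

Passing `nullptr` to `setName` destroys the current part and leaves the whole without one, which the sketch treats as a valid state.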

Constructor Initialization Options

Although the previous example demonstrated three constructors, only one created an aggregation relationship. That example created a new part object from its constituent or "raw" ingredients (i.e., data), passed in as the constructor's parameters. Alternatively, the constructor's parameter may be the address of an existing object created elsewhere in the program. Classes may have both constructors but typically only need one. Which constructor programmers choose to implement depends on the problem the program solves.

class Engine								// the part class
{
	private:
		double	size;
		int	cylinders;

	public:
		Engine(double s, int c) : size(s), cylinders(c) {}	// (a)
};

class Car								// the whole class
{
	private:
		Engine*	motor;						// (b)
		string	model;

	public:
		Car(string m, double s, int c)				// (c)
			: motor(new Engine(s, c)), model(m) {}

		Car(string m, Engine* e)				// (d)
			: motor(e), model(m) {}
};

Constructor options and examples. The example builds an aggregation relationship between two classes: Engine (the part) and Car (the whole).

(a) Engine has a constructor requiring two parameters. The constructor initializes the two member variables with the parameters through an initializer list.
(b) One Car class member variable, motor, is an Engine pointer. Once we establish a variable's name, we must use it consistently throughout the class, including in the initializer lists below.
(c) The first Car constructor receives the "raw material" (i.e., the data) to build a new Engine through its parameters. The initializer list creates a new Engine object with the raw data and stores the object's address in motor. Notice that new Engine(s, c) is a function call to the Engine constructor, so the number and type of arguments in the call must match the number and type of parameters in the Engine constructor. The argument names in the call (s and c) must match the names in the Car constructor's parameter list.
(d) The second Car constructor receives the address of an existing Engine object through its parameter, and the initializer list stores it in motor.

The first constructor is appropriate when Car doesn't share its Engine, and the second may be appropriate when it does.
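To make the difference concrete, the sketch below separates the two options into two hypothetical classes, `OwningCar` and `SharingCar`, which do not appear in the original figure. It assumes the first class destroys the Engine it creates and the second never does; the accessors are also additions for illustration.

```cpp
#include <string>
using std::string;

class Engine						// the part class
{
	private:
		double	size;
		int	cylinders;

	public:
		Engine(double s, int c) : size(s), cylinders(c) {}

		// Hypothetical accessors, added only so the sketch can be exercised.
		double get_size() const { return size; }
		int get_cylinders() const { return cylinders; }
};

class OwningCar						// option (c): builds its own Engine
{
	private:
		Engine*	motor;
		string	model;

	public:
		OwningCar(string m, double s, int c)
			: motor(new Engine(s, c)), model(m) {}
		~OwningCar() { delete motor; }		// assumption: the creator destroys the part

		OwningCar(const OwningCar&) = delete;	// assumption: no copying in this sketch
		OwningCar& operator=(const OwningCar&) = delete;

		const Engine* get_motor() const { return motor; }
};

class SharingCar					// option (d): uses an existing Engine
{
	private:
		Engine*	motor;
		string	model;

	public:
		SharingCar(string m, Engine* e) : motor(e), model(m) {}

		const Engine* get_motor() const { return motor; }
};
```

Two `SharingCar` objects built from the same Engine address point at one Engine object, while each `OwningCar` has an Engine that no other object can reach.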

Setter Options

It may not always be practical to build an aggregation relationship (i.e., initialize a pointer member) when constructing a whole object. At other times, programmers may need to change an existing part by updating a pointer. The solution in both cases is a setter (or mutator) function. But there is one subtle difference between a constructor and a setter. A constructor creates a new object, so initially, the pointer member cannot point to an existing part. In contrast, a setter function may establish the whole's first part or replace an existing one. Unlike a constructor, a setter must detect which situation applies and behave accordingly.

class Car
{
	private:
		Engine*	motor;
		string	model;

	public:
		Car(string s) : motor(nullptr), model(s) {}		// (a)

		void set_motor(double s, int c);
		void set_motor(Engine* e);
};

void Car::set_motor(double s, int c)				// (b)
{
	delete motor;
	motor = new Engine(s, c);
}

void Car::set_motor(Engine* e)					// (c)
{
	delete motor;
	motor = e;
}

Managing an aggregation relationship with a setter function. The previous figure demonstrated two versions of the Car constructor: one receiving the "raw" data to create and set a new part and another receiving a pointer to an existing part. Setters follow the same pattern, but if the whole "owns" the part, the setter must destroy it before overwriting its address.

(a) Uninitialized member variables contain unknown, random values, often called "garbage." If the whole (Car) constructor doesn't have the data necessary to build the aggregation relationship, it should initialize the pointers to nullptr.
(b) The data needed to build a new part (Engine) object are passed to the setter as function parameters. This kind of setter function is common when the whole (Car) class owns the part (Engine) and is responsible for its destruction before replacing it.
(c) A part (Engine) object is passed by pointer to the setter. We can create this kind of setter function regardless of which object "owns" the part, but it's most common when another object is responsible for destroying it. We'll see an example of this situation below.

If a program overwrites the only pointer to a part object - e.g., the address of the Engine saved in motor - it cannot use or deallocate that object later. The delete motor; statement (printed in red in the figure) shows how each setter deletes the part if it exists. As I discussed with the Person class above, I often put the delete operation in an if-statement, but modern compilers should generate correct code without it. Remove the delete statement if the whole doesn't own the part and isn't responsible for destroying it.
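One way to check that an owning setter does not leak is to count the Engine objects that currently exist. The sketch below adds a hypothetical static counter, `Engine::live`, for that purpose. The counter, the destructor, the disabled copy operations, and the accessors are assumptions for illustration, not part of the figure.

```cpp
#include <string>
using std::string;

class Engine
{
	private:
		double	size;
		int	cylinders;

	public:
		static int live;			// hypothetical count of existing Engines

		Engine(double s, int c) : size(s), cylinders(c) { ++live; }
		~Engine() { --live; }

		double get_size() const { return size; }
		int get_cylinders() const { return cylinders; }
};

int Engine::live = 0;

class Car
{
	private:
		Engine*	motor;
		string	model;

	public:
		Car(string s) : motor(nullptr), model(s) {}
		~Car() { delete motor; }		// assumption: Car owns its Engine

		Car(const Car&) = delete;		// assumption: no copying in this sketch
		Car& operator=(const Car&) = delete;

		void set_motor(double s, int c)
		{
			if (motor != nullptr)		// optional test: deleting nullptr is a no-op
				delete motor;
			motor = new Engine(s, c);
		}

		bool has_motor() const { return motor != nullptr; }
		const Engine* get_motor() const { return motor; }
};
```

If the setter skipped the `delete`, the counter would rise with every call to `set_motor`, which is one way to observe the leak.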

Who "Owns" The Part?

Pointers are a small, fixed-size data type whose size is independent of the data (object) to which they point, and it's entirely possible to have multiple pointers pointing to the same data or object. Programmers can use multiple pointers to organize data in numerous ways without incurring the expense of duplicating that data. In this way, two or more whole objects can share a part. What the sharing means depends on the problem that the program solves. For the following example, imagine the Car illustrated in Figure 1 is a sponsored racing car. Racing cars are notoriously hard on engines, so it's not unreasonable to further imagine that the racing team has several spare engines. Finally, imagine that the team tracks the spare engines with a database class named Warehouse. Now, it should be easier to see how a Car can share its Engine with another program object.

The 'Warehouse' class diagram with a single private attribute: 'spares : Engine*[10]'

**Sharing objects through pointers**.

The partial `Warehouse` class diagram illustrates how a class can have multiple aggregation relationships with a single class. We'll explore the corresponding UML notation later in this chapter's "multiplicity" section. We typically use connector symbols rather than attributes to show relationships in a class diagram. Ideally, class diagrams should be language-neutral, precluding the C++-specific pointer array. We briefly violate both customs to help us understand part sharing.

An abstract representation of objects bound together with pointers. Examples in the next section demonstrate how we can use the pointers to send messages from the wholes to the parts.

It seems reasonable in this example to have the `Warehouse` "own" the `Engines` and make it responsible for creating and destroying them. The `Car` builds and uses an aggregation relationship with an `Engine` as needed but otherwise does not create or destroy it.

Objects instantiated from three classes: Car, Engine, and Warehouse. The Car object has a pointer that points to an Engine object. The Warehouse object has an array of pointers pointing at several Engine objects. It shares one of those objects with a Car object.

When two or more whole classes share a part, programmers must establish a protocol specifying which whole is responsible for managing the parts. Typically, we designate one class to create new part objects and destroy them when they are no longer needed. Alternatively, any whole can create a part and share it as needed. Finally, while we can devise an arbitrarily complex algorithm to determine which whole will destroy a part, I highly recommend assigning that responsibility to a single class.
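The single-owner protocol could be sketched as follows. Only the `spares` array comes from the class diagram; the member functions `add_spare` and `get_spare`, the accessors, and the decision to disable copying are hypothetical. The `Warehouse` creates and destroys every Engine, and the `Car` setter stores an address without ever deleting.

```cpp
#include <string>
using std::string;

class Engine
{
	private:
		double	size;
		int	cylinders;

	public:
		Engine(double s, int c) : size(s), cylinders(c) {}
		double get_size() const { return size; }
		int get_cylinders() const { return cylinders; }
};

class Warehouse						// owns every Engine
{
	private:
		Engine*	spares[10] = {};		// all slots start as nullptr

	public:
		Warehouse() {}
		~Warehouse()
		{
			for (Engine* e : spares)	// the owner destroys the parts
				delete e;
		}

		Warehouse(const Warehouse&) = delete;	// assumption: no copying in this sketch
		Warehouse& operator=(const Warehouse&) = delete;

		// Creates an Engine in the first empty slot; returns nullptr when full.
		Engine* add_spare(double s, int c)
		{
			for (Engine*& slot : spares)
				if (slot == nullptr)
				{
					slot = new Engine(s, c);
					return slot;
				}
			return nullptr;
		}

		// Returns the Engine in slot i, or nullptr for an empty or invalid slot.
		Engine* get_spare(int i) const
		{
			return (i >= 0 && i < 10) ? spares[i] : nullptr;
		}
};

class Car						// shares an Engine but never deletes it
{
	private:
		Engine*	motor = nullptr;
		string	model;

	public:
		Car(string m) : model(m) {}
		void set_motor(Engine* e) { motor = e; }	// no delete: Warehouse is the owner
		Engine* get_motor() const { return motor; }
};
```

With this protocol, the `Warehouse` must outlive any `Car` that still points at one of its Engines; otherwise the `Car` is left holding a dangling pointer.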

Aggregation and Inheritance Examples

We expect the number and variety of class relationships to grow as the size and complexity of programs increase. Programs build objects and the relationships that bind them together by creating constructor-call chains. The chains execute when the program instantiates an object from one class. The following examples demonstrate how we form the constructor chains. Aside from the class names, the two examples are similar, differing only in the aggregation's location relative to the inheritance relationship. We'll see how to use the chains in the next section.

The picture shows three classes connected by two relationships as follows:
Address
--
- city : string
- state : string
--
+ Address(c : string, s : string)

Person
--
- name : string
--
+ Person(n : string, c : string, s : string)

Student
--
- gpa : double
--
+ Student(n : string, g : double, c : string, s : string)

Student is a subclass of Person.

**Inheritance and aggregation example 1**. The relationships are straightforward: a `Student` is a `Person`, and a `Person` has an `Address`. Each class has one or two attributes. None of the classes have a default constructor, so the program must initialize the attributes via constructor calls. The chain runs when the program instantiates a `Student` object:
Student valedictorian("Alice", 4.00, "Ogden", "Utah");
The program provides the data to populate the three related objects as constructor arguments.

The `Student` constructor retains one value to initialize `gpa` and passes the remaining values to `Person` by calling its constructor (blue). A C++ class calls its superclass constructor by the class name, and the call is always the first element in the initializer list.

The `Person` constructor retains the first `string` to initialize `name` and passes the remaining two to `Address` by calling its constructor (coral). C++ implements aggregation with a class-scope member variable in the whole class. The constructor uses that variable name to call the aggregated class's constructor.

The `Address` constructor uses the two strings to initialize `city` and `state`, ending the constructor-call chain. The example highlights the syntax forming the relationships in yellow.
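The constructor-call chain described above might be written as in the following sketch, which follows the UML diagram. The pointer member's name `addr`, the accessors, the `Person` destructor, and the disabled copy operations are assumptions added for illustration.

```cpp
#include <string>
using std::string;

class Address						// the part class
{
	private:
		string	city;
		string	state;

	public:
		Address(string c, string s) : city(c), state(s) {}	// ends the chain

		string get_city() const { return city; }
		string get_state() const { return state; }
};

class Person						// the whole class
{
	private:
		string		name;
		Address*	addr;			// hypothetical member name

	public:
		Person(string n, string c, string s)
			: name(n), addr(new Address(c, s)) {}	// calls the Address constructor
		~Person() { delete addr; }		// assumption: Person owns its Address

		Person(const Person&) = delete;		// assumption: no copying in this sketch
		Person& operator=(const Person&) = delete;

		string get_name() const { return name; }
		const Address* get_address() const { return addr; }
};

class Student : public Person
{
	private:
		double	gpa;

	public:
		Student(string n, double g, string c, string s)
			: Person(n, c, s), gpa(g) {}	// superclass call comes first

		double get_gpa() const { return gpa; }
};
```

Instantiating `Student valedictorian("Alice", 4.00, "Ogden", "Utah");` runs all three constructors: `Student` passes three strings up to `Person`, which keeps the name and forwards the city and state to `Address`.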

The picture shows three classes connected by two relationships as follows:
Pet
--
- name : string
- vaccinations : string
--
+ Pet(pn : string, v : string)

Person
--
- name : string
--
+ Person(n : string)

Owner
--
- account : int
--
+ Owner(n : string, a : int, pn : string, v : string)

Aggregation connects the Owner (the whole) with the Pet (the part).

**Inheritance and aggregation example 2**. Imagine that the classes in the diagram are from a program used in a veterinary clinic. They represent a `Pet` and its `Owner`. The `Owner` is a `Person` and has a `Pet`. As in the previous example, each class has one or two attributes, and the program initializes them with a chain of constructor calls. The program starts the chain when it instantiates an `Owner` object:
Owner client("McCartney", 123456, "Martha", "June 1, 1980");
The figure highlights the syntax forming the relationships in yellow, while the `Owner` constructor calls the `Person` and `Pet` constructors.
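In this second arrangement, the subclass is also the whole, so one initializer list makes both calls. The sketch below follows the UML diagram; the pointer member's name `pet`, the accessors, the `Owner` destructor, and the disabled copy operations are assumptions added for illustration.

```cpp
#include <string>
using std::string;

class Pet						// the part class
{
	private:
		string	name;
		string	vaccinations;

	public:
		Pet(string pn, string v) : name(pn), vaccinations(v) {}

		string get_name() const { return name; }
		string get_vaccinations() const { return vaccinations; }
};

class Person						// the superclass
{
	private:
		string	name;

	public:
		Person(string n) : name(n) {}

		string get_name() const { return name; }
};

class Owner : public Person				// subclass and whole class
{
	private:
		int	account;
		Pet*	pet;				// hypothetical member name

	public:
		Owner(string n, int a, string pn, string v)
			: Person(n), account(a), pet(new Pet(pn, v)) {}
		~Owner() { delete pet; }		// assumption: Owner owns its Pet

		Owner(const Owner&) = delete;		// assumption: no copying in this sketch
		Owner& operator=(const Owner&) = delete;

		int get_account() const { return account; }
		const Pet* get_pet() const { return pet; }
};
```

Unlike the first example, where the calls form a single line from `Student` to `Person` to `Address`, here the `Owner` constructor calls both the `Person` and `Pet` constructors directly.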

¹ Some compilers do automatically initialize local and member variables to their zero-equivalent values, which for pointers is nullptr. However, the ANSI C++ standard does not require initializing these variables, and you should not rely on that behavior. Any program dependence on non-standard behaviors reduces its portability.