9.2. UML Class Diagrams


"The Unified Modeling Language (UML) is a graphical language for visualizing, specifying, constructing, and documenting the artifacts of a software-intensive system" (Booch, Rumbaugh, & Jacobson, 2005, p. xiii)¹. Three object-oriented design methodologies, the Booch Method, Rumbaugh's Object Modeling Technique (OMT), and Jacobson's Object-Oriented Software Engineering (OOSE), are combined or unified to form the UML. The UML currently consists of thirteen different diagrams capturing various aspects of the structure and behavior of an object-oriented software system. The diagrams consist of abstract symbols with well-defined rules specifying each symbol's precise meaning. UML class diagrams, the only UML diagram covered this semester, consist of class symbols connected by one of five possible class relationships (covered in the following chapter).

The UML Class Symbol

The UML class symbol is formed by a rectangle divided into three sections. The class name appears in the top section, the attributes in the middle section, and the operations in the bottom section. The individual class symbols are "semantically rich," meaning they encode much information about each class. For example, the class name seems straightforward, but the other two sections include many arcane² symbols. Fortunately, it is easy to decode the symbols. The following UML class diagram illustrates the UML class diagram symbols.

The picture of a UML class diagram is a rectangle sectioned into three smaller rectangles. The top section has the class name ('Person'). The middle section lists the attributes: -name:string, -height:double, -weight:int, and -instance:int; 'instance' is underlined. The bottom section lists the operations, further divided by stereotype labels. The 'constructor' section contains +Person(a_name:string, a_height:double, a_weight:int). The 'process' section has three operations: +pay_taxes():bool; +catch_bus(direction:int):void; and +get_instance():int. 'get_instance' is underlined. The final section, 'helper,' has one operation: -get_address():Address.
**The UML class symbol**. A UML class symbol is a rectangle divided into three sections. The « and » symbols define an optional *stereotype*, which is a kind of label or short comment. Standard stereotypes label different kinds of operations. Underlined attributes or operations belong to the class as a whole rather than to individual instances or objects; C++ overloads the `static` keyword to distinguish features that belong to the class from those that belong to objects.
The arguments passed into constructors and setter functions are often used to initialize attributes or member variables. Programmers often use two similar naming conventions to differentiate between the attribute and argument names. The first convention is adding an "a_" prefix to the attribute name to form the argument name. For example, `a_name` denotes an argument the function uses to initialize the member variable `name`. The second convention, common in C++ and Java programs, is similar. This convention adds an "a" prefix but also makes the first letter of the attribute name upper-case. Following this convention, `aName` is an argument that initializes `name`.
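As a minimal sketch of the first naming convention, the constructor and setter below take `a_`-prefixed arguments and use them to initialize the member variables of the same base name. The three attributes and the constructor signature come from the diagram above; `set_name` and the three getters are hypothetical additions, included only so the effect can be observed.

```cpp
#include <string>
using namespace std;

class Person
{
    private:
        string  name;
        double  height;
        int     weight;

    public:
        // Each a_ argument initializes the member variable with the same base name.
        Person(string a_name, double a_height, int a_weight)
            : name(a_name), height(a_height), weight(a_weight) {}

        // Hypothetical setter: a_name is the argument, name is the member.
        void    set_name(string a_name) { name = a_name; }

        // Hypothetical getters (not in the diagram).
        string  get_name() const { return name; }
        double  get_height() const { return height; }
        int     get_weight() const { return weight; }
};
```

Under the second convention, the same constructor would be written `Person(string aName, double aHeight, int aWeight)`.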

Within a UML class symbol, the elements in the sections describing the attributes and operations follow a rigid but straightforward notation. Although the notation's syntax isn't the same as C++'s, it is nevertheless well-formed, and programmers can easily translate it to C++.

An annotated picture of the syntax patterns for an attribute and an operation. For the attribute '-counter:int,' the first symbol, '-' in this example, is the visibility. Use + for public, - for private, and # for protected. 'counter' is the attribute name. ':int' is the data type. For the operation, '+pow(base:double, exponent:double):double,' the first symbol is the visibility. 'pow' is the operation name. The parentheses enclose the parameter list; each parameter follows the same name:type pattern as an attribute. The ':double' appearing at the end is the return type.
**UML class notation annotated**. The UML class notation specifying attributes and operations is a well-defined language. That means that, minor details aside, there is only one way to translate a UML diagram into a C++ class. (Recall that the UML doesn't describe the body of a function, only its interface or prototype.) Similarly, minor details aside, there is only one way to translate C++ into a UML class. Going from UML to code is called *forward engineering*, while going from code to UML is called *reverse engineering*.
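The two annotated lines can be forward engineered as in the following sketch. The class name `Calculator`, the accessor `get_counter`, and the body of `pow` are assumptions made for illustration; the figure specifies only the attribute `-counter:int` and the operation `+pow(base:double, exponent:double):double`.

```cpp
#include <cmath>

// Hypothetical class name; the figure shows only the two member lines.
class Calculator
{
    private:
        int     counter = 0;            // -counter : int

    public:
        // +pow(base : double, exponent : double) : double
        // The UML gives only this interface; the body is an assumption.
        double  pow(double base, double exponent)
        {
            counter++;                          // count the calls
            return std::pow(base, exponent);    // the library function, not this member
        }

        // Hypothetical accessor (not in the figure).
        int     get_counter() const { return counter; }
};
```

Reverse engineering runs the other way: reading `double pow(double base, double exponent)` in a `public` section yields `+pow(base:double, exponent:double):double`.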

Translating UML Attributes To C++ Code

Translating UML attributes to C++ code is quite simple. A step-by-step algorithm and an example follow. The example is probably the quickest and easiest way to see how to do the translation, but the algorithm contains some details that warrant at least one read-through.

UML	C++
-name : string	private: string name;
-instances : int	private: static int instances;

Mapping UML attributes to C++ member fields (aka member variables).

The visibility symbols appearing in a UML class diagram correspond to the C++ keywords:
- "-" → "private:"
- "+" → "public:"
- "#" → "protected:"
A "real" C++ class has one "private" section that contains both member variables.
Create one labeled visibility section in the C++ class for each visibility symbol in the UML class attribute section. Each label consists of one of the three keywords above followed by a ":".
Move the data type forward and, if necessary, convert the type into a C++ data type or class name. (UML diagrams are not language specific, so an attribute might be specified as "String" rather than "string" or as "integer" rather than "int".)
Write the name, discard the ":", and append a ";".
An underlined attribute is static; add the keyword `static` at the beginning of the line (the text covers static features later in the chapter).
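The steps above, applied to the two attributes in the table, might produce the following sketch. The constructor and the two getters are hypothetical members added only to exercise the attributes. The definition of `instances` after the class is ordinary C++ for a non-inline static member variable; it is a detail the UML does not show.

```cpp
#include <string>
using namespace std;

class Person
{
    private:
        string      name;           // -name : string
        static int  instances;      // -instances : int (underlined in the UML)

    public:
        // Hypothetical members, not part of the attribute table.
        Person(string a_name) : name(a_name) { instances++; }
        string      get_name() const { return name; }
        static int  get_instances() { return instances; }
};

// One definition outside the class allocates the single, shared variable.
int Person::instances = 0;
```

Every `Person` object has its own `name`, but all of them share the one `instances` counter.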

Translating UML Operations To C++ Code

Before reviewing the short process of converting UML operations into C++ member functions, recall that programmers may define short functions in the class but generally only prototype long ones. In either case, the UML does not specify the contents of the function body.

Translating UML operations to C++ code is similar to translating attributes but does entail a few more steps. The following figure illustrates the connection between the elements of a UML class diagram and a C++ class.

UML	C++
+Person(a_name : string, a_height : double, a_weight : int)	public: Person(string a_name, double a_height, int a_weight);
+pay_taxes() : bool	public: bool pay_taxes();
+catch_bus(direction : int) : void	public: void catch_bus(int direction);
-get_address() : Address	private: Address get_address();
	Programmers may place this prototype in its own `private` section, keeping variables and functions separate, or in the `private` section with the member variables.

Mapping UML operations into C++ member functions. Although the C++ column illustrates each member in a separate public or private section, programmers typically collect them into two or three sections. See the example in the next figure.

The visibility symbols appearing in a UML class diagram correspond to the C++ keywords:
- "-" → private:
- "+" → public:
- "#" → protected:
Similar to the previous figure, a "real" C++ class places all of the public member functions in one "public" section. You may create a new "private" section for get_address or place it with the member variables.
Create a "public" section if the class doesn't have one. Typically, all class variables are "private" and all functions are "public." So, you may not have created a "public" section when translating a diagram's attributes to C++ member variables. You may need to add a "protected" section or "private" functions for more complex classes. Only very rarely will you need "public" variables.
As explained above, the UML is independent of any programming language (to the extent possible), which means that you can convert the UML diagram into any object-oriented language like C++, Java, or C#. If necessary, convert any types (return or argument) into an appropriate C++ data type or class name.
Move the return type from the right end of the UML operation to the beginning of the function prototype or definition, and discard the ":" that preceded it. When a non-constructor operation omits the return type, as print() and read() do in the Time example, use void.
Copy the operation name to the function name.
The function argument list is enclosed within parentheses and is a comma-separated list of data type and argument name pairs. Convert each UML parameter into a C++ function argument as described for attributes above.
Either end the prototype with a ";" or define the body of a (short) C++ function between an opening "{" and a closing "}".
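The following sketch applies these steps to the four operations in the table. The signatures come from the UML; every function body, the `Address` stand-in, and the `last_direction` member are assumptions, since the UML describes only the interface. The short functions are defined inside the class, while `catch_bus` is prototyped in the class and defined afterward, as a longer function would be.

```cpp
#include <string>
using namespace std;

// Hypothetical stand-in for the Address class named in the diagram.
struct Address
{
    string  street;
};

class Person
{
    private:
        string  name;
        double  height;
        int     weight;
        int     last_direction = 0;     // hypothetical; gives catch_bus something to record

        // -get_address() : Address   (placeholder body)
        Address get_address() { return Address{"unknown"}; }

    public:
        // +Person(a_name : string, a_height : double, a_weight : int)
        Person(string a_name, double a_height, int a_weight)
            : name(a_name), height(a_height), weight(a_weight) {}

        // +pay_taxes() : bool   (short, so defined in the class; placeholder body)
        bool    pay_taxes() { return !get_address().street.empty(); }

        // +catch_bus(direction : int) : void   (prototype only)
        void    catch_bus(int direction);
};

// The definition of a function that was only prototyped in the class.
void Person::catch_bus(int direction)
{
    last_direction = direction;
}
```

Here `get_address` sits in the `private` section with the member variables; placing it in a second `private` section after the public functions would work equally well.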

UML To C++ Example

The following figure repeats the Time example. It illustrates both the UML class diagram and the corresponding C++ class specification.

The class Time UML diagram.
Time
Attributes
-hours:int
-minutes:int
-seconds:int
Operations
+Time()
+Time(h:int, m:int, s:int)
+Time(s:int)
+add(t2:Time):Time
+add(t2:Time*):Time*
+print()
+read()
**A UML class diagram and corresponding C++ class**. The UML divides the class symbol into three sections (top to bottom): the class name, class attributes, and class operations. The C++ class is formatted to match the UML diagram as closely as possible.

C++
class Time
{
	private:
		int	hours;
		int	minutes;
		int	seconds;

	public:
		Time();
		Time(int h, int m, int s);
		Time(int s);
		Time	add(Time t2);
		Time*	add(Time* t2);
		void	print();
		void	read();
};

Comments On C++ Class Style

The formatting of the C++ class in the above figure illustrates my programming style. C++ does not require or enforce this style. I'll explain my reasoning, encouraging you to develop your own style.

In a UML class diagram, attributes are always placed in the middle section and operations in the bottom section. Attributes are most often private, while most operations are public. So I order the "private" section before the "public" section so that the C++ code matches a UML class diagram better.
The members of a C++ class have private visibility by default, technically making the "private" label unnecessary. Nevertheless, most programmers explicitly label the private section, making the class specification easier to understand and avoiding confusion.
I use the TAB key to indent, and the editor I prefer sets tab stops at eight characters. Many IDEs (Visual Studio, for example) set tab stops at four characters by default (but you can change that). I like big indentations.
I indent the private and public keywords; Visual Studio aligns both keywords with the braces (but you can reconfigure this behavior). Which you choose is just a matter of visual preference.
I tab between the data type and the names of variables and functions, believing it makes finding and reading the various features easier.
I place the constructors ahead of the other member functions. Although not required, this also seems like a pretty common style among C++ programmers.
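The point about default visibility can be seen in a small sketch. Both classes below are hypothetical and behave identically; the only difference is whether the `private` label is written out.

```cpp
// No label: value is private because class members are private by default.
class Implicit
{
        int     value = 7;

    public:
        int     get_value() const { return value; }
};

// Same class with the label written out, which is easier to read.
class Explicit
{
    private:
        int     value = 7;

    public:
        int     get_value() const { return value; }
};
```

In both cases, code outside the class cannot access `value` directly and must call `get_value()`.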

¹ Booch, G., Rumbaugh, J., & Jacobson, I. (2005). The unified modeling language user guide (2nd ed.). Upper Saddle River, NJ: Addison-Wesley.
² As a joke (maybe not a good one), I deliberately used the word "arcane" to describe something arcane because the word "arcane" is itself arcane. Here's another wordplay: "eschew obfuscation!"