9.2. UML Class Diagrams


"The Unified Modeling Language (UML) is a graphical language for visualizing, specifying, constructing, and documenting the artifacts of a software-intensive system" (Booch, Rumbaugh, & Jacobson, 2005, p. xiii)1. Three object-oriented design methodologies, the Booch Method, Rumbaugh's Object Modeling Technique (OMT), and Jacobson's Object-Oriented Software Engineering (OOSE), are combined or unified to form the UML. The UML currently consists of thirteen different diagrams capturing various aspects of the structure and behavior of an object-oriented software system. The diagrams consist of abstract symbols with a set of well-defined rules specifying the precise meaning of each symbol. UML class diagrams, the only UML diagram covered this semester, consist of class symbols connected by one of five possible class relationships (covered in the following chapter).

The UML Class Symbol

The UML class symbol is formed by a rectangle divided into three sections. The class name appears in the top section, the attributes in the middle section, and the operations in the bottom section. The individual class symbols are "semantically rich," meaning they encode much information about each class. The class name is straightforward, but the other two sections include many arcane² symbols. Fortunately, the symbols are easy to decode. The following diagram illustrates the parts of a UML class symbol.

The picture of a UML class diagram is a rectangle sectioned into three smaller rectangles. The top section has the class name, 'Person' in this example. The middle lists the attributes: -name:string, -height:double, -weight:int, and -instance:int; instance is underlined. The bottom section lists the operations, further divided by stereotype labels. The <<constructor>> section contains +Person(a_name:string, a_height:double, a_weight:int). The <<process>> section has three operations: +pay_taxes():bool; +catch_bus(direction:int):void; and +get_instance():int. get_instance is underlined. The final, <<helper>> section has one operation: -get_address():Address.
The UML class symbol. A UML class symbol is a rectangle divided into three sections. The « and » symbols define an optional stereotype, which is a kind of label or short comment. Standard stereotypes label different kinds of operations. Underlined attributes or operations belong to the class as a whole rather than to individual instances or objects; C++ overloads the static keyword to distinguish features that belong to the class from those that belong to objects.

The arguments passed into constructors and setter functions are often used to initialize attributes or member variables. Programmers often use two similar naming conventions to differentiate between the attribute and argument names. The first convention is adding an "a_" prefix to the attribute name to form the argument name. For example, a_name denotes an argument the function uses to initialize the member variable name. The second convention, common in C++ and Java programs, is similar. This convention adds an "a" prefix but also makes the first letter of the attribute name upper-case. Following this convention, aName is an argument that initializes name.
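The two naming conventions might look like this in a small class. This is only a sketch: the reduced Person class, the set_name function, and both getters are hypothetical and appear here just to show where the argument names go.

```cpp
#include <cassert>
#include <string>

class Person
{
    private:
        std::string name;       // attribute (member variable)
        double      height;     // attribute (member variable)

    public:
        // "a_" convention: a_name and a_height initialize name and height
        Person(std::string a_name, double a_height)
            : name(a_name), height(a_height) {}

        // "a" + upper-case first letter convention: aName updates name
        void set_name(std::string aName) { name = aName; }

        // Hypothetical getters, present only so the result can be inspected
        std::string get_name() const { return name; }
        double      get_height() const { return height; }
};

// Usage sketch: each argument ends up in the matching member variable
void naming_demo()
{
    Person p("Dilbert", 1.8);
    assert(p.get_name() == "Dilbert");
    assert(p.get_height() == 1.8);

    p.set_name("Alice");
    assert(p.get_name() == "Alice");
}
```

Either convention works; the point is only that the argument and the member variable have different names, so neither hides the other inside the function.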

Within a UML class symbol, the elements in the sections describing the attributes and operations follow a rigid but straightforward notation. Although the notation's syntax isn't the same as C++'s, it is nevertheless well-formed, and programmers can easily translate it to C++.

An annotated picture of the syntax patterns for an attribute and operation. For the attribute '-counter:int', the first symbol, '-' in this example, is the visibility. Use + for public, - for private, and # for protected. 'counter' is the attribute name. ':int' is the data type. For the operation, '+pow(base:double, exponent:double):double', the first symbol is the visibility. 'pow' is the operation name. The parentheses enclose the parameter list; each parameter follows the same name:type pattern used for attributes. The :double at the end is the return type.
UML class notation annotated. The UML class notation specifying attributes and operations is a well-defined language. That means that, minor details aside, there is only one way to translate a UML diagram into a C++ class. (Recall that the UML doesn't describe the body of a function, only the interface or prototype.) Similarly, minor details aside, there is only one way to translate C++ into a UML class. Going from UML to code is called forward engineering, while going from code to UML is called reverse engineering.
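As a small forward-engineering sketch, the two annotated members can be dropped into a C++ class. The figure names only the attribute and the operation, so the class name Calculator, the function body, and the get_counter helper are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical class; the figure specifies only the two members
class Calculator
{
    private:
        int counter = 0;                        // -counter : int

    public:
        // +pow(base : double, exponent : double) : double
        double pow(double base, double exponent)
        {
            counter++;                          // assumed body: count the calls
            return std::pow(base, exponent);
        }

        // Hypothetical helper, not part of the figure
        int get_counter() const { return counter; }
};

// Usage sketch
void notation_demo()
{
    Calculator c;
    double result = c.pow(2.0, 3.0);
    assert(std::fabs(result - 8.0) < 1e-9);
    assert(c.get_counter() == 1);
}
```

Reverse engineering runs the other way: reading "double pow(double base, double exponent)" in a public section gives back "+pow(base:double, exponent:double):double".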

Translating UML Attributes To C++ Code

Translating UML attributes to C++ code is actually quite simple. A step-by-step algorithm and an example follow. The example is probably the quickest and easiest way to see how to do the translation, but the algorithm contains some details that warrant at least one read-through.

-name : string
	string name;
-instances : int
	static int instances;
Mapping UML attributes to C++ member fields (aka member variables):
  1. The visibility symbols appearing in a UML class diagram correspond to the C++ keywords:
    • "-" → "private:"
    • "+" → "public:"
    • "#" → "protected:"
    A "real" C++ class has one "private" section that contains both member variables.
  2. Create one labeled visibility section in the C++ class for each visibility symbol appearing in the UML class attribute section. Each label consists of one of the three keywords above followed by a ":"
  3. Move the data type forward, and, if necessary, convert the type into a C++ data type or class name (UML diagrams are not language specific, so, for example, an attribute might be specified as "String" rather than "string" or as "integer" rather than "int")
  4. Write the name, discard the ":", and append a ";"
  5. An underlined attribute is static; add the keyword static at the beginning of the line (static features are covered later)
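Putting the five steps together, the two attributes from the example might land in a class like the sketch below. The constructor and the get_instances function are hypothetical additions so that the static variable has something to do; the definition of instances outside the class is a C++ requirement for this kind of static member variable, not something the UML shows.

```cpp
#include <cassert>
#include <string>

class Person
{
    private:                            // "-" becomes one "private:" section
        std::string name;               // -name : string
        static int  instances;          // -instances : int (underlined)

    public:
        // Hypothetical members, used here only to exercise the attributes
        Person(std::string a_name) : name(a_name) { instances++; }
        static int get_instances() { return instances; }
};

// A static member variable is defined once, outside the class
int Person::instances = 0;

// Usage sketch: one counter is shared by every Person object
void attribute_demo()
{
    int before = Person::get_instances();
    Person a("Dilbert");
    Person b("Alice");
    assert(Person::get_instances() == before + 2);
}
```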

Translating UML Operations To C++ Code

Before reviewing the short process of converting UML operations into C++ member functions, recall that programmers may define short functions in the class but generally only prototype long ones. In either case, the UML does not specify the contents of the function body.

Translating UML operations to C++ code is similar to translating attributes but does entail a few more steps. Following the approach above, a step-by-step algorithm and an example both follow. Although the example is probably still the quickest and easiest way to see how to do the translation, the algorithm does contain some details that you should know.

+Person(a_name : string, a_height : double, a_weight : int)
    Person(string a_name, double a_height, int a_weight);
+pay_taxes() : bool
    bool pay_taxes();
+catch_bus(direction : int) : void
    void catch_bus(int direction);
-get_address() : Address
    Address get_address();
This prototype may be placed in a separate "private" section (keeping variables and functions separate) or placed in the private section with the member variables.
Mapping UML operations into C++ member functions:
  1. The visibility symbols appearing in a UML class diagram correspond to the C++ keywords:
    • "-" → "private:"
    • "+" → "public:"
    • "#" → "protected:"
    Similar to the previous figure, a "real" C++ class will place all of the public member functions in the same "public" section. You may create a new "private" section for get_address or place it with the member variables.
  2. Create a "public" section if one was not created previously. Frequently, all class variables are "private" and all functions are "public." So, you may not have created a "public" section when translating a diagram's attributes to C++ member variables. For more complex classes, you may need to add a "protected" section, or you may need "private" functions. Only very rarely will you need "public" variables.
  3. As explained above, the UML is independent of any programming language (to the extent possible), which means that you can convert the UML diagram into any object-oriented language, such as C++, Java, or C#. If necessary, convert any types (return or argument) into an appropriate C++ data type or class name
  4. Move the return type from the right end of the UML operation to the beginning of the function prototype/definition and discard the ":" that preceded it
  5. Copy the operation name to the function name
  6. The function argument list is enclosed within parentheses and is a comma-separated list of data type and argument name pairs. Convert each operation argument into a C++ function argument as described for attributes above
  7. Either end the prototype with a ";" or define the body of a (short) C++ function between an opening "{" and a closing "}"
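The steps above can be sketched as a complete class. The UML gives only the prototypes, so every function body here is a placeholder assumption, and the Address class is a hypothetical stand-in with a single made-up member. Note that get_address is not underlined in the diagram, so it is an ordinary (non-static) private function.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in; the diagram names the type but not its contents
class Address
{
    public:
        std::string street;
};

class Person
{
    private:
        std::string name;
        double      height;
        int         weight;

        // -get_address() : Address
        Address get_address() { return Address{"unknown"}; }

    public:
        // +Person(a_name : string, a_height : double, a_weight : int)
        Person(std::string a_name, double a_height, int a_weight)
            : name(a_name), height(a_height), weight(a_weight) {}

        // +pay_taxes() : bool  (short function defined in the class)
        bool pay_taxes() { return !get_address().street.empty(); }

        // +catch_bus(direction : int) : void  (prototype only)
        void catch_bus(int direction);
};

// A longer function is prototyped in the class and defined outside it
void Person::catch_bus(int direction)
{
    (void) direction;   // placeholder body; the UML does not specify one
}

// Usage sketch: only the public operations are callable from outside
void operation_demo()
{
    Person p("Dilbert", 1.8, 75);
    assert(p.pay_taxes());
    p.catch_bus(1);
}
```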


The following figure repeats the Time example. It illustrates both the UML class diagram and the corresponding C++ class specification.

The class Time UML diagram.
+Time(h:int, m:int, s:int)
class Time
{
	private:
		int	hours;
		int	minutes;
		int	seconds;

	public:
		Time(int h, int m, int s);
		Time(int s);

		Time	add(Time t2);
		Time*	add(Time* t2);

		void	print();
		void	read();
};
A UML class diagram and corresponding C++ class. The UML divides the class symbol into three sections (top to bottom): the class name, class attributes, and class operations. The C++ class is formatted to match the UML diagram as closely as possible.
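Since the UML stops at the prototypes, the function bodies are up to the programmer. The following is one possible sketch of how the Time functions might be filled in; the bodies, the output format, and the total_seconds helper are all assumptions and are not part of the diagram.

```cpp
#include <cassert>
#include <iostream>

class Time
{
    private:
        int hours;
        int minutes;
        int seconds;

    public:
        Time(int h, int m, int s) : hours(h), minutes(m), seconds(s) {}

        // Assumed behavior: split a count of seconds into hours, minutes, seconds
        Time(int s) : hours(s / 3600), minutes(s % 3600 / 60), seconds(s % 60) {}

        // Hypothetical helper, not part of the diagram
        int total_seconds() const { return hours * 3600 + minutes * 60 + seconds; }

        Time add(Time t2) { return Time(total_seconds() + t2.total_seconds()); }

        // Returns a heap object; the caller is responsible for deleting it
        Time* add(Time* t2) { return new Time(total_seconds() + t2->total_seconds()); }

        void print() { std::cout << hours << ":" << minutes << ":" << seconds << std::endl; }
        void read()  { std::cin >> hours >> minutes >> seconds; }
};

// Usage sketch: the argument type selects which add function runs
void time_demo()
{
    Time a(1, 30, 0);
    Time b(45 * 60);

    Time c = a.add(b);                  // calls add(Time)
    assert(c.total_seconds() == 8100);

    Time* d = a.add(&b);                // calls add(Time*)
    assert(d->total_seconds() == 8100);
    delete d;
}
```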

Comments On Style

The C++ class in the example above illustrates many of my personal stylistic elements. C++ does not require these, and the compiler does not enforce them. I'll explain my reasoning, such as it is, and let you adopt or reject each element to match your personal style.

  1. In a UML class diagram, attributes are always placed in the middle section and operations in the bottom section. Attributes are most often private, while most operations are public. So I order the "private" section before the "public" section so that the C++ code matches a UML class diagram better.
  2. The members of a C++ class have private visibility by default, technically making the "private" label unnecessary. Nevertheless, most programmers explicitly label the private section, making the class specification easier to understand and avoiding confusion.
  3. I use the TAB key to indent, and the editor I prefer sets tab stops at eight characters. Many IDEs (Visual Studio, for example) set tab stops at four characters by default (but you can change that). I like big indentations.
  4. I indent the private and public keywords; Visual Studio aligns both keywords with the braces (but you can reconfigure this behavior). Which you choose is just a matter of visual preference.
  5. I tab between the data type and the names of variables and functions. I think it makes it easier to find and read the various features.
  6. I place the constructors at the top of the functions. Although not required, this also seems like a pretty common style among C++ programmers.

1 Booch, G., Rumbaugh, J., & Jacobson, I. (2005). The unified modeling language user guide (2nd ed.). Upper Saddle River, NJ: Addison-Wesley.

2 As a joke (maybe not a good one), I deliberately used the word "arcane" to describe something arcane because the word "arcane" is arcane. Here's another wordplay: "eschew obfuscation!"