5.2. Enumerations (enum)


The basic enumeration syntax forms a list of simple named integers or symbolic constants. The compiler can set default values for each constant, or programmers can establish the values. Programmers can name enumerations, creating a new program type specifier. Enumerations further extend the number of ways that programmers can create symbolic constants, each with relative advantages and disadvantages.

Symbolic, Named, or Manifest Constants

The terms symbolic constant, named constant, and manifest constant are synonyms for naming a constant value. As the term implies, a constant is a value that does not change while the program runs. For example, 5 is an integer constant, 5.0 is a double constant, '5' is a character constant, and "5" and "Hello World" are string constants (also known as string literals). C++ provides three ways of naming constants:

  1. #define NUMBER_OF_STUDENTS 100
  2. const int NUMBER_OF_STUDENTS = 100;
  3. enum { NUMBER_OF_STUDENTS = 100 };
Methods of creating symbolic constants. An advantage of techniques 1 and 2 is that they support constants of any type, whereas technique 3 supports only integer constants. Since the C++11 standard, programmers can select the integer size but can't replace the integer with another type. The advantage of technique 3 is that it allows programmers to create multiple constants with a single statement, unlike techniques 1 and 2. Regardless of how programmers create them, naming constants is beneficial in at least three ways.
  1. Naming constants can provide shorthand names for long values. For example, we can write M_PI rather than 3.141592653589793.
  2. Names can help clarify the meaning or purpose of a value in a program, making the program more self-documenting. For example, the value 100 appearing in a program doesn't tell the reader much about what 100 means or represents in the program. But if the programmer names that value NUMBER_OF_STUDENTS, the name conveys more meaning when it appears in context:
    for (int i = 0; i < NUMBER_OF_STUDENTS; i++)
    	. . .
  3. Naming constants can make maintaining or modifying a program easier. Suppose that 100 appears many times in a program: Sometimes, 100 refers to the number of students in a class, but at other times, it refers to something else entirely. Further, suppose that the programmer needs to change the number of students, but must not change any other reference to 100. The programmer has no alternative to combing through the program, examining each occurrence of 100, and changing only those that represent the number of students, a tedious and error-prone process. The chance of making an error is even greater if the program spans many files. If, on the other hand, we have the symbolic constant created as in technique 2 above, then there is only one place in only one file for the programmer to make the change!
Benefits of symbolic constants.
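The three techniques can be compared side by side. The following is only a sketch: apart from NUMBER_OF_STUDENTS, every name in it (MAX_NAME_LENGTH, PASSING_GRADE, NUMBER_OF_SECTIONS, and the helper functions) is hypothetical and chosen so that the three definitions don't collide.

```cpp
// All names other than NUMBER_OF_STUDENTS are hypothetical.
#define MAX_NAME_LENGTH 64              // 1. macro: the preprocessor substitutes the text 64
const double PASSING_GRADE = 59.5;      // 2. typed constant: any type is allowed
enum { NUMBER_OF_STUDENTS = 100,        // 3. enumeration: integer values only, but
       NUMBER_OF_SECTIONS = 4 };        //    one statement can name several constants

// An enumerator is a compile-time constant, so it can size an array.
int scores[NUMBER_OF_STUDENTS];

// A macro works the same way once the preprocessor has replaced it.
char student_name[MAX_NAME_LENGTH];

// Returns true when a grade meets the passing threshold.
bool has_passed(double grade)
{
    return grade >= PASSING_GRADE;
}

// Returns how many students each section holds if they are split evenly.
int students_per_section()
{
    return NUMBER_OF_STUDENTS / NUMBER_OF_SECTIONS;
}
```

Changing the class size now means editing the single value 100 in the enumeration; the array size and the per-section calculation follow automatically.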

The enum Syntax

The original C-style enumerations were simple and unscoped, but they have evolved with C++, growing to support the object-oriented paradigm and additional data types. Although the text continues to focus on simple, unscoped enumerations, the following figure presents a more complete description of the modern syntax. By now, we are familiar with the idea that a program consists of a pattern of keywords, symbols, and programmer-supplied names. The specification of an enumeration follows this same basic approach. Some parts of an enumeration are required, while others are optional.

This is a complex diagram illustrating most of the syntax of the enum statement. It is complex because it shows optional and repetitive parts. An enumeration begins with one of three keyword phrases: enum, enum class, or enum struct. A name is optional. A colon followed by a kind of integer is also optional. An opening brace is required. The body consists of a comma-separated list of elements or constant names. Optionally, each name may be followed by = and a value. A closing brace is required. A comma-separated list of variables may follow. The terminating semicolon is required.
| Element | Role | Required |
|---|---|---|
| enum, enum class, or enum struct | Keyword | yes (1 of 3) |
| name | Also called a tag, this is the name or identifier a programmer gives to the enumeration, forming a new data type. The name is optional for traditional enumerations but required for scoped ones | no |
| : type | An integer type: char, short, int, etc. For traditional enumerations, the C++03 standard lets the implementation choose the integer size; this feature, added in C++11, overrides that choice | no |
| { | Delimiter | yes |
| element list | Comma-separated list of names (these are the names of the symbolic constants). Strictly speaking, this list is not required, but we're going to treat it as if it were | yes - one or more |
| = value | If a value is specified, it overrides the default value. Default values begin at 0 for the first element and then count sequentially for subsequent elements. The value can be a constant expression; e.g., if "circle" appears to the left of "square," then square = circle + 4 is legal | no |
| } | Delimiter | yes |
| variable list | Comma-separated list of enumeration variables | no |
| ; | Statement terminator | yes |
The enumeration syntax. Although the complete enumeration syntax is quite complex, only a few elements are necessary in practice. The illustration colors the traditional enumeration elements in blue and colors the elements specific to scoped enumerations in green. The textbook focuses on the traditional (blue) enumeration elements and briefly discusses scoped enumerations later.
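To make the syntax elements concrete, here is a minimal sketch that uses the optional parts in one place. It assumes a C++11 or later compiler. The color enumeration and the color_value function are hypothetical names introduced only for illustration.

```cpp
// A traditional (unscoped) enumeration using every optional part:
// a tag (shape), an integer type (short), an explicit value, and a variable list.
enum shape : short { circle, square = circle + 4, triangle } my_shape, your_shape;

// A scoped enumeration; color and its elements are hypothetical names.
enum class color : unsigned char { red, green, blue };

static_assert(sizeof(shape) == sizeof(short), "shape is stored as a short");
static_assert(sizeof(color) == 1, "color is stored in one byte");
static_assert(triangle == 5, "triangle follows square = 4");

// Scoped elements are written with the enumeration's name (color::blue)
// and need an explicit cast to become integers.
int color_value(color c)
{
    return static_cast<int>(c);
}
```

In practice, most enumerations use far fewer of these parts, often just the keyword, the braces, and the element list.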

The many varied paths through the syntax diagram notwithstanding, the enumerations programmers use most frequently in practice are generally quite simple. The following examples illustrate common and a few less common uses.

enum { circle, square, triangle };
Basic enumerations. This example demonstrates the most fundamental and frequently used enumeration syntax. It creates three symbolic constants: circle, square, and triangle. When the enumeration doesn't assign an explicit value to a symbolic constant, it assigns default values from left to right, beginning with 0 and counting by 1. Therefore, circle = 0, square = 1, and triangle = 2. This approach is useful when the values are arbitrary but must be consistent throughout a program. Arbitrary values may not seem very useful, but Programming Example 1 below presents a situation where they can make code much easier to read and understand.
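One way to confirm the default values is to let the compiler check them. The sketch below assumes a C++11 or later compiler (for static_assert); print_shape_values is a hypothetical helper, not part of the original example.

```cpp
#include <iostream>

enum { circle, square, triangle };   // no explicit values, so 0, 1, 2

// The compiler verifies the defaults; a wrong value stops compilation.
static_assert(circle == 0 && square == 1 && triangle == 2,
              "defaults start at 0 and count by 1");

// Prints each constant's name and integer value (hypothetical helper).
void print_shape_values()
{
    std::cout << "circle = "   << circle   << '\n'
              << "square = "   << square   << '\n'
              << "triangle = " << triangle << '\n';
}
```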
(a) enum { NUMBER_OF_STUDENTS = 100 };
(b) enum { alpha = 3, beta = 10, gamma = 15 };
(c) enum { alpha = 3, beta, gamma };
(d) enum { alpha = 3, beta = alpha + 2, gamma = alpha + beta };
The enumeration list. The enumeration list is essential to the enumeration, where it does the "real work." It is very flexible, providing many different ways of initializing symbolic constants.
  1. A single element/value pair creates a single symbolic constant, replacing #define or const.
  2. Enumerations can assign unique values to each element in the list, replacing a series of #define directives or const statements.
  3. Whenever an enumeration does not explicitly assign a value to an element, it automatically assigns the previous element's value plus one. So, alpha = 3, beta = 4, and gamma = 5.
  4. Enumerations support simple constant arithmetic. Element assignment occurs left to right, and each part of the expression to the right of the assignment operator must be an established constant value when assignment occurs. So, alpha = 3, beta = 5, and gamma = 8.
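The values claimed for lists (b) through (d) can be checked at compile time. In this sketch, each list sits in its own namespace so that the names alpha, beta, and gamma can be reused; the namespace names are assumptions made for the illustration, and static_assert requires C++11 or later.

```cpp
// Each namespace repeats one of the lists (b)-(d) from the text.
namespace explicit_values { enum { alpha = 3, beta = 10, gamma = 15 }; }
namespace implicit_values { enum { alpha = 3, beta, gamma }; }
namespace computed_values { enum { alpha = 3, beta = alpha + 2, gamma = alpha + beta }; }

// (c) Elements without a value take the previous value plus one.
static_assert(implicit_values::beta == 4 && implicit_values::gamma == 5,
              "beta and gamma count up from alpha");

// (d) Each expression uses only elements already defined to its left.
static_assert(computed_values::beta == 5 && computed_values::gamma == 8,
              "beta = 3 + 2 and gamma = 3 + 5");
```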
(a) enum shape { circle, square, triangle };

    shape my_shape;
    shape your_shape;

(b) my_shape = square;      // all compilers
    my_shape = 1;           // some compilers
The enumeration tag (name). The enumeration tag or name is an optional name a programmer gives to an enumeration. Unlike the other data structures this chapter covers, (unscoped) enumerations are typically unnamed. However, when named, the name forms a new data type or type specifier.
  1. Programmers use type specifiers to define variables. The tag name, shape, specifies the variable type, while my_shape and your_shape are the defined variables' names.
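  2. Once my_shape is defined, it is possible to save shape values to it. The shape type is a sub-range of the integers, but standard C++ does not implicitly convert an integer to an enumeration type. As a result, only some compilers accept the integer assignment (C compilers do, as do some C++ compilers in a permissive mode), while conforming C++ compilers reject it unless the program casts the value explicitly, as in my_shape = static_cast<shape>(1);.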
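Because compilers differ on the integer assignment shown above, a portable program can convert integers explicitly. The following sketch shows a named enumeration used as a parameter and return type; side_count and shape_from_int are hypothetical helpers introduced for this illustration.

```cpp
enum shape { circle, square, triangle };

// Returns the number of sides; a circle is treated as having none.
int side_count(shape s)
{
    switch (s)
    {
        case circle:   return 0;
        case square:   return 4;
        case triangle: return 3;
    }
    return -1;   // reached only if s holds a value outside the list
}

// Converts an integer to a shape with an explicit cast, which a
// conforming C++ compiler accepts. The caller must pass 0, 1, or 2.
shape shape_from_int(int value)
{
    return static_cast<shape>(value);
}
```

Using shape as the parameter type lets the compiler reject a call such as side_count(7), which would compile silently if the parameter were a plain int.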
enum { circle, square, triangle } my_shape, your_shape;
The variable list. The optional variable list consists of one or more variable names following the enumeration list, constituting a variable definition. The example illustrates an alternate way to define my_shape and your_shape.

As the size and complexity of programs increase, programmers typically break them into multiple files. Programmers frequently put enumerations in header files (files ending with a .h extension) and #include them in source code files (files ending with a .cpp extension). We'll explore this organization in detail later in the chapter. With enumerations in header files and variable definitions in source code files, programmers use variable lists infrequently in practice.

Enumeration Examples

It's easier to understand enumerations by seeing them in a concrete example demonstrating how and why programmers use them.

Programming Example 1: Eliminating Magic Numbers

A "magic number" is an arbitrary value the program uses to pass information about events, notifications, or other simply-encoded information around a program. The exact value is insignificant, but it is essential that the value is unique within the given context and used consistently throughout the program. The following example illustrates magic numbers and the importance of their uniqueness and consistency.

"Magic Number" Version Enumeration Version
(a) Magic-number version

int command = get_command();

switch(command)
{
    case 0:
        exit(0);
    case 1:
        search();
        break;
    case 2:
        input();
        break;
    case 3:
        import();
        break;
    case 4:
        help();
        break;
    default:
        cerr << "Unknown command\n";
        break;
}

(b) Enumeration version

enum { EXIT, SEARCH, INPUT, IMPORT, HELP };

int command = get_command();

switch(command)
{
    case EXIT:
        exit(0);
    case SEARCH:
        search();
        break;
    case INPUT:
        input();
        break;
    case IMPORT:
        import();
        break;
    case HELP:
        help();
        break;
    default:
        cerr << "Unknown command\n";
        break;
}

(c) Magic-number version

if (command == 2)
    do_one_thing();
else if (command == 3)
    do_another_thing();

(d) Enumeration version

if (command == INPUT)
    do_one_thing();
else if (command == IMPORT)
    do_another_thing();
Eliminating "magic numbers". The example excerpts the switch statement from a simple database program. The program displays a menu, and the user enters a single command, such as "search." The function get_command reads the command from the console as a string and maps it to a unique integer, which it returns. The switch statement interprets the command by calling an appropriate function.
  1. Imagine that the user enters the "search" command and get_command returns 1. If the program doesn't use "1" to denote any other command and always uses "1" to represent the search operation, then the program will run the search operation correctly. Even when there are no problems, case 1: does little to inform a reader what that particular case does - 1 is a "magic number" that doesn't have a natural connection to the operation it represents.
  2. The second example is like the first but creates five symbolic constants with an enumeration. get_command still returns the same value but encoded with a statement like return SEARCH;. Each case now has a symbolic constant as its target, clarifying the case's purpose to a program reader: case SEARCH: conveys more meaning than case 1:.
  3. A program reader can surmise the meaning of each case in (a) from the function it calls. But it's easy to imagine situations denying the reader even that meager information.
  4. Symbolic constants don't replace appropriate program comments, but they do make the program more self-documenting.
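The text doesn't show get_command itself, so the following is only a guess at how its string-to-constant mapping might look. The function parse_command and the extra constant UNKNOWN are assumptions made for this sketch. Reading the text from the console is left out so that the mapping can be examined on its own.

```cpp
#include <string>

// UNKNOWN is an assumed extra constant for text that matches no command.
enum { EXIT, SEARCH, INPUT, IMPORT, HELP, UNKNOWN };

// Maps the text of a command to its symbolic constant (hypothetical helper).
int parse_command(const std::string& text)
{
    if (text == "exit")   return EXIT;
    if (text == "search") return SEARCH;
    if (text == "input")  return INPUT;
    if (text == "import") return IMPORT;
    if (text == "help")   return HELP;
    return UNKNOWN;       // a switch on the result would reach its default case
}
```

Because both the mapping and the switch statement use the same names, reordering or extending the enumeration changes the underlying integers everywhere at once, which is the consistency that magic numbers can't guarantee.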

Programming Example 2: Naming Special Values

Unix and early versions of Linux managed file and directory access based on three users, each with three distinct permissions. This technique classifies the users as the file owner (called the user), group, and others and names the three permissions as read, write, and execute. The technique can represent each permission with a single bit: 1 indicates that the user has that permission, and 0 indicates that the user does not. That means that a sequence of nine bits summarizes a given file's access permissions.

enum {	uread = 1,	// 000 000 001	user
	uwrite = 2,	// 000 000 010
	uexe = 4,	// 000 000 100
	gread = 8,	// 000 001 000	group
	gwrite = 16,	// 000 010 000
	gexe = 32,	// 000 100 000
	oread = 64,	// 001 000 000	others
	owrite = 128,	// 010 000 000
	oexe = 256	// 100 000 000
};
Special value names. Working with strings of binary digits is often tedious and error-prone. One way that programmers solve this problem is to create a set of symbolic constants (i.e., bit masks) that represent each permission. The enumeration explicitly assigns a value to each enumeration element. Each value is twice the previous one, which moves the 1-bit one place to the left.
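Bit masks become useful once they are combined with the bitwise operators. The sketch below repeats the enumeration and adds three hypothetical helpers (has_permission, grant, and revoke) that are not part of the original example. It assumes a file's permissions are stored in a plain int.

```cpp
enum { uread = 1,   uwrite = 2,   uexe = 4,     // user
       gread = 8,   gwrite = 16,  gexe = 32,    // group
       oread = 64,  owrite = 128, oexe = 256 }; // others

// Returns true when every bit in mask is also set in permissions.
bool has_permission(int permissions, int mask)
{
    return (permissions & mask) == mask;
}

// Returns permissions with the bits in mask turned on.
int grant(int permissions, int mask)
{
    return permissions | mask;
}

// Returns permissions with the bits in mask turned off.
int revoke(int permissions, int mask)
{
    return permissions & ~mask;
}
```

For example, grant(0, uread | uwrite | gread) produces 11 (binary 000 001 011), and has_permission(11, uwrite) is true while has_permission(11, oexe) is false.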