9.5. Member Functions and Program Organization

Review

One of the benefits of classes is that they provide an intermediate scope, which is more restricted than global scope but less restricted than local scope. An intermediate, class-level scope enables programmers to allow some functions to access member data while denying all other functions access to it. Specifically, C++ controls access to member variables with three keywords: public, private, and protected. In this way, classes provide a mechanism for allowing some functions to share member variables without suffering from the problems that inevitably occur when all functions have access to all variables. Unlike the functions appearing in previous chapters, programs must attach or bind member functions to objects when calling them.

(a) The UML class diagram for class Person:
Person
------------------
-name : string
-height : double
-weight : int
------------------
+print() : void

(b) A picture of three instances of the Person class. Each object stores values for the three member variables.
object p0: Dilbert, 5.9, 175
object p1: Alice, 5.02, 120
object p2: Wally, 5.6, 190
Member function calls are bound to objects. We can clearly see the binding in the function calls, but the function's code only implies the binding. We'll see how C++ implements the implied binding later in the chapter.
  1. Class Person defines a member function called print. Notice that no arguments appear inside the parentheses. It is an error to call the function independent of an object: print(); If the program called the function this way, what would the function print?
  2. It is possible to instantiate many objects from a single class. (Referring to the cookie cutter metaphor presented at the beginning of the chapter, it's possible to stamp out many cookies with one cookie cutter.) If p0, p1, and p2 are instances of Person, then the following calls are valid:
    • p0.print();
    • p1.print();
    • p2.print();
    The dot operator temporarily binds the calling object (the operator's left-hand operand) to the called function (the right-hand operand). The bound object becomes the default target of the function. So, during the first function call, the print function is bound to p0 and prints Dilbert, 5.9, and 175. The binding between an object and a member function ends when the call returns.

The current chapter focuses on single-class programs; this section emphasizes how member functions access member variables and the concepts surrounding that access.

Member Selection: Revisiting the dot and arrow operators

Chapter 5 introduced structures and the dot and arrow field selection operators. C++ also uses these operators to select an object's data members (synonymous with structure fields) and extends them to select member functions. The previous figure demonstrated the dot operator binding an instance of the Person class to its print member function. Most client code (code using an object) follows this pattern because member functions are generally public, permitting program-wide access. However, member variables are typically private, preventing direct client code access. Consequently, programs generally access member variables from within member functions, as demonstrated by the following figure.

Programs use the dot and arrow operators similarly, differing only in how they reference objects. We can demonstrate the arrow operator by extending the previous example to include a pointer to a Person object:

Member Functions and Variable Access

It is easiest to explore the relationship between member functions and member variables with an example. To better understand the differences between "normal" and member functions, we begin by reviewing the add function as it appears in the original structure version of the Time example, first introduced in chapters 5 and 6.

Function Call:
z = add(x, y);

Function Definition:
Time add(Time t1, Time t2)	// struct version
{
	int	i1 = t1.hours * 3600 + t1.minutes * 60 + t1.seconds;
	int	i2 = t2.hours * 3600 + t2.minutes * 60 + t2.seconds;

	return make_time(i1 + i2);
}
The structure version of the add function. The function call has two arguments, x and y, which the program passes to the function's two parameters, t1 and t2. Any expression that accesses a structure field (e.g., hours) requires three parts: a structure variable name (e.g., t1), the dot operator, and a field name.

The next step continues converting Time from a structure to a class. Specifically, it converts the add function into a member of the Time class.

(a) Function defined inside the class:
class Time
{
    private:
        int    hours;
        int    minutes;
        int    seconds;

    public:
        Time add(Time t2)
        {
            int    i1 = hours * 3600 + minutes * 60 + seconds;
            int    i2 = t2.hours * 3600 + t2.minutes * 60 + t2.seconds;

            return Time(i1 + i2);
        }
};

(b) Function prototyped inside the class:
class Time
{
    private:
        int    hours;
        int    minutes;
        int    seconds;

    public:
        Time add(Time t2);
};
Member function options. The Time structure was previously converted into a C++ class. The two partial class specifications presented here demonstrate two options:
  1. Placing the body of a function inside the class (as is always done in a Java program) is appropriate for small functions (generally fewer than three or four statements). Defining a function inside a class makes the function inline by default without using the inline keyword.
  2. Placing just a function prototype inside the class is common when the function contains more than three or four lines of code or a loop. When we prototype a function in a class, we write the complete function in a separate source code file (see "Program Organization" below).

The class version of the add function seems to be missing the first argument, t1, and all references to it in the body of the function! How can you add one of anything? This puzzle is an object-oriented optical illusion - the class version of the add function does have two Time arguments. Whenever a program calls a member function, the function is called through or bound to an instance of the class that defines the function, and that object is passed to the function where it becomes the default target of the function's operations. The next figure illustrates this important concept.

(a) Function Call:
z = x.add(y);

(b) Function Definition:
Time Time::add(Time t2)
{
    int    i1 = hours * 3600 + minutes * 60 + seconds;
    int    i2 = t2.hours * 3600 + t2.minutes * 60 + t2.seconds;

    return Time(i1 + i2);
}
Member variable access. In contrast to the structure version of the add function (Figure 2), the class version seems to have only one argument, which appears inside the parentheses. However, the function still adds two Time class objects. The second, seemingly invisible, Time argument is the object to which the add function is bound. The selection operator (the dot or the arrow) temporarily binds member functions to objects.
  1. Only one argument, y, is explicitly passed to the add function in the argument list (i.e., inside the parentheses). The other argument, x (highlighted in green), is also passed to add, but only implicitly: the object-oriented function call syntax implies passing x to the add function. It is easier to see both objects, x and y, in the function call than in the definition.
  2. Member functions prototyped in class (Figure 3(b)) are typically defined in a separate source code (i.e., a .cpp) file. In this case, the function definition must include syntax (highlighted in yellow), tying it to the class. Instances of the Time class have three member variables or fields (Figure 3(a)), and the add function can access all member variables belonging to both arguments. In the expression
    t2.hours * 3600 + t2.minutes * 60 + t2.seconds
    the dot operator binds the explicit argument t2 to the member variables hours, minutes, and seconds. But in the expression
    hours * 3600 + minutes * 60 + seconds
    the member variables seem unattached to any object. Whenever a member variable is referenced without being explicitly tied to an object, the reference is made to the object bound to the function by default. So, in the calculation of i1, the member variables belong to the object named x, which is the object bound to add by the function call (a).

The this pointer

Whenever a program passes an argument to a function, the function must define a parameter (i.e., a local variable) to hold the argument value. In the case of member functions, the function is temporarily bound to the calling object, and the calling object's address is passed by pointer to the function. The compiler automatically creates a local pointer variable named this to store the calling object's address. So, we can rewrite the first statement of the add function as
int  i1 = this->hours * 3600 + this->minutes * 60 + this->seconds;
It is common practice to call the object bound to a member function "this object," "this argument," or "the implicit argument."

Both objects are clearly visible in the member function call but only one object is explicitly visible in the function definition. Nevertheless, both objects are present in the member function definition, but one object is implied or implicit in how member functions work. From this point forward, it will be convenient to label objects as either implicit or explicit. Doing this warrants a bit of explanation.

Implicit vs. Explicit

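These two terms are often confusing, so let's begin by defining them in general before we use them to describe features of an object-oriented program. Something explicit is stated directly and is clearly visible. Something implicit is implied or understood without being stated directly.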

Now, we are better positioned to use the terms to help us understand member functions and arguments.

Object Labels (Implicit vs. Explicit Arguments)

Next, let's explore how we use the terms "implicit" and "explicit" to label objects in a member function.

We label the objects involved in member functions based on their role or position in the function call so we can more easily talk about them. This chapter uses two names: implicit and explicit argument (or implicit and explicit object). The following figure illustrates how these arguments are related to a function call.

(a) Bar foo();
    Bar foo(Bar t2);
    Bar foo(Bar t2, Bar t3);
(b) x.foo();
(c) x.foo(y);
(d) x.foo(y, z);
Implicit and explicit arguments. To help illustrate the terms implicit and explicit, assume that:
  1. Prototypes for three overloaded functions
  2. x is the implicit argument and there are no explicit arguments
  3. x is the implicit argument and y is an explicit argument
  4. x is the implicit argument and y and z are both explicit arguments
The implicit argument is also called the implicit object, the default argument, or the default object. Java programmers call the implicit argument "this object," and experienced C++ programmers also understand the term. Notice that "default" as used here is different and distinct from the default arguments introduced in chapter 6.

Summary

The program temporarily binds function calls to the implicit argument (or object), which is always on the left side of the dot or arrow operator. If there are explicit arguments, the program passes them inside the parentheses. We can see the explicit parameters in the function definition and the corresponding arguments in the function call. We can also see the parameter's name in each function statement using the parameter. However, the implicit argument seems to be missing; nevertheless, it's there, its presence implied by the member function rules. C++ programs pass the implicit object's address to the automatically defined pointer variable named this.

A picture clarifying the difference between 'implicit' and 'explicit' objects or arguments: in the function call o1.function(o2, o3, ..., on), object o1 is implicit while the objects o2 through on are explicit. The function definition makes it easier to see why we use these terms to describe the objects participating in a function call. The program can access the members of the implicit object with only the member's name, but it must use both the parameter and member name to access any parameter's members. The rules of object-oriented functions imply access to the implicit object's members. In contrast, the program must use the names of the explicit objects, the arguments in the parentheses, o2 through on, to access their members. For example, o2.member.

In a sense, we are right back where we began our discussion of implicit and explicit, but in a better position to add some critical detail. All the objects or arguments are visible in a member function call but not in a function definition. The compiler builds member function calls to bind the call to an object automatically, but the mechanism doesn't use the name of the calling object. Nevertheless, the member function may access the member variables of every object involved in the function call, including the implicit and all of the explicit objects. But what about member variables that are private? The keyword private secures member variables at the class level but not at the object level. An example is the easiest way to illustrate and understand these concepts, so we return to the Time class and the add function introduced above.

Time y;
Time z;
	. . .
Time x = y.add(z);
(a)
Time add(Time t2)
{
	int	i1 = hours * 3600 + minutes * 60 + seconds;
	int	i2 = t2.hours * 3600 + t2.minutes * 60 + t2.seconds;

	return Time(i1 + i2);
}
(b)
Objects and member functions.
  1. Calling the Time add function. The add function is bound to y while it runs. y becomes an implicit object or implicit argument in the add function.
  2. The definition of the Time add function (as it appears inside of the class).

The function call copies object z to the parameter t2, which is explicit (i.e., clearly visible) in the function definition. It passes the address of object y to the implicit add parameter, this. Object-oriented member functions automatically define this, making its presence in the function generally invisible and otherwise unnamed. When the add function needs to access a member variable that belongs to t2 (which is a copy of z), it explicitly names t2 and then uses the dot operator to select the specific member variable: t2.hours. When the add function needs to access a member variable that belongs to the implicit object (y itself, reached through this), it does so by simply naming the member variable; the access doesn't need an object name or a selection operator: hours. (If desired, programmers can use an extended notation, this->hours, but it isn't required.)

Program Organization

As we saw with structures in Chapter 4, we can make it easier to reuse a data structure if we separate the specification (the structure specification and the function prototypes) from the implementation (i.e., the function definitions). This organization is also routine with classes. Take, for example, the string class. If we write a program that needs to use the string class, we add #include <string> to the top of our program, which includes the class specification and the string function prototypes, in our program. The C++ compiler system places the string class function definitions in a separate source-code file, compiles them to machine code, and stores the machine code in a library. The linker (Windows) or the loader (Unix/Linux/macOS) extracts the string function machine code from the library and incorporates it with our programs when we compile them.

We followed this organization in the chapter 6 version of the Time example, which consisted of three files: Time.h, Time.cpp, and driver.cpp. Converting the Time example from a structure-based to a class-based program requires

  1. converting the struct to a class,
  2. converting the functions to class member functions, and
  3. updating the driver to reflect the different syntax used to call the member functions.

Finally, there is one last, fortunately small, change that we need to make when we define a member function outside of the class:

Time Time::add(Time t2)
{
	int	i1 = hours * 3600 + minutes * 60 + seconds;
	int	i2 = t2.hours * 3600 + t2.minutes * 60 + t2.seconds;

	return Time(i1 + i2);
}
The add function defined outside the class. The two colons appearing together, ::, form the scope resolution operator. In essence, the highlighted code above says that "the Time class defines the add function in its scope." When we define a member function outside of its class, we must still prototype it in the class, which means that it is not possible to add Time:: to just any function and thereby gain access to the private member variables. Furthermore, imagine you neglect the class name and the scope resolution operator when writing a member function outside the class. In that case, it is not a class member, so the compiler flags any attempt to access private member variables as a compile-time error.

If a function is small, we can include its body in the class. Doing this creates an inline function but without the inline keyword. The question is, "What is a small function?" There is no absolute size, but a good rule of thumb is that one to three statements constitute a small function, especially if they are short. More than seven statements are too many for a function to be considered small. (Attempting to inline large functions causes the final executable to become unnecessarily large; see Figure 2(b).) Furthermore, any attempt to create an inline function is merely a suggestion that the compiler may silently ignore.

The examples at the end of the chapter include the complete class version of the Time example.

Program Organization: Inline vs. Non-inline Functions

It's appropriate to define short member functions inside a class, but it is better to only prototype longer functions inside the class and define them outside the class. There are two reasons for following this practice: First, it makes class specifications shorter and easier to read, and second, C++ automatically implements functions defined inside the class as inline functions (compare Figures 1 and 2). (However, even when we use the inline keyword, inlining is only a suggestion that the compiler may choose to ignore.) No clear rule distinguishes a "short" function from a "long" one, but a reasonable rule of thumb is that functions with one to three statements are short, and functions with more than six are long. The size and complexity of the statements themselves govern in the gray area of four to six statements. Finally, we must include the class name and the scope resolution operator to define a member function outside its class.

(a.i) Defined as an inline function in the class.
Header (.h):
class Foo
{
    public:

        void function()
        {
            . . . .
        }
};

(a.ii) Defined as an inline function with the inline keyword.
Header (.h):
class Foo
{
    public:

        void function();
};

inline void Foo::function()
{
    . . . .
}

(b) Defined out of class.
Header (.h):
class Foo
{
    public:

        void function();
};

Source (.cpp):
void Foo::function()
{
    . . . .
}
Member function definitions: inside vs. outside the class. Programmers can appropriately define small functions in a header file, but they should define larger functions in separately compilable source code files (template functions, Chapter 13, are an exception). Programmers typically name header and source code files for the class they implement.
  1. C++ offers programmers two options for specifying inline functions. Inline functions are only accessible in source code files that #include the defining header file. Both options are only appropriate for small functions.
    1. The compiler inlines functions defined inside a class by default.
    2. Alternatively, programmers can move member function definitions below the class specification and add the inline keyword. Although the function definition is in the header file with the class specification, the class name and scope resolution operator are still necessary.
  2. Programmers can prototype functions in the class specification but define them outside. The C++ syntax requires programmers to bind the definition to the class with the class name and the scope resolution operator (highlighted in yellow).