6.3.5. Function Return Part 2

Review

auto and static (see especially the block in yellow)
Three characteristics of a variable (Figure 3, especially c.i)
An abstract representation of a pointer (Figure 4)
new and delete
The stack and heap
Dereferencing a pointer
Structure assignment (Figure 4)
Structure function return value (Figure 6)

Just as C++ has three ways of passing data into functions, it has three ways for functions to return their results with the return statement. The statement can return a value formed by an expression, a pointer, or a reference. However, returning a local variable (i.e., a variable defined inside the function) by pointer or reference is problematic. The following pointer and reference sections devote considerable space to the problem and its solutions.

Each example consists of two functions: a supplier and a client. The supplier represents a function completing a useful but otherwise abstract task. It creates and returns a part object, simulating the behavior of an authentic function. The client has a single concrete statement calling the supplier and assigning the returned data. The following figures refer to the same part structure the previous passing sections used:

struct part
{
	char	type;
	int	id;
};

Return-By-Value

Picture: the function 'supplier' creates an object named 'a' and initializes its fields to 'd' and 10. 'supplier' returns the object to the function named 'client.' The return statement and the assignment operation copy 'a' from 'supplier' to 'x' in 'client.'

**Return-by-value**. Return-by-value works similarly to pass-by-value: the value in the local `supplier` variable, `a`, is *copied* as the function's return value, and `a` is deallocated when the function ends. The assignment operator saves the returned copy in the `client` variable `x`.

**Returned:** data *copy*

**Notes:**

The default return mechanism

The function returns a *copy* of the data value

Any valid data type, including structures and classes, may be returned
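The code for the return-by-value figure does not appear in this section, so the following is only a minimal sketch of what the caption describes, assuming the `part` structure shown earlier. Having `client` return an `int` is an assumption made for illustration so that the copy semantics can be checked.

```cpp
#include <cassert>

struct part
{
	char	type;
	int	id;
};

// Return-by-value: the local object a is copied out as the return value.
part supplier()
{
	part a = { 'd', 10 };
	return a;
}

// Returns the id stored in the client's own copy, x.
int client()
{
	part x = supplier();	// x holds a copy of supplier's a
	x.id = 99;		// changes only the client's copy
	part y = supplier();	// a fresh copy, unaffected by the change to x
	assert(y.id == 10);
	return x.id;
}
```

Because each call hands back an independent copy, changing `x` has no effect on what later calls to `supplier` return.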

Return-By-Pointer

Picture: two squares labeled 'x' and 'a' represent two variables. The square for the pointer variable, 'x,' has an arrow pointing to the square representing the structure variable (i.e., object) 'a.' The square for 'a' is drawn with dashed lines, suggesting that the program deallocates its memory when the function ends.

**Incorrect return-by-pointer**. Functions can return data by pointer, but programmers must be careful with *what* they return. The function is *syntactically* correct and compiles. Nevertheless, this version has a *logical error*. Some, but not all, compilers issue a non-fatal warning for the `return &a;` statement.

The function returns the *address* of a `part` object, so the return type, `part*`, is correct. But the local variable `a` is deallocated when the function ends, so `x` points to deallocated memory (a dangling pointer), and dereferencing it is undefined behavior. Once the program deallocates memory, it becomes available for reallocation, and later allocations may overwrite any previously saved data.

The arrow represents a pointer: `x` points to a deallocated `part` object (represented by the dashed-line box).

The picture represents three data values with boxes. The supplier function creates a part object with the new operator and saves its address in variable a. The client defines variable x, saving the address returned by the supplier.

**Correct return-by-pointer**. Both example programs are syntactically and *logically* correct.

Memory allocated on the heap with the `new` operator is not deallocated when the function ends. The local variable `a` goes out of scope when the function returns, but the `client` saves the address in `x`. Programmers must remember to `delete` the heap memory when the program no longer needs it; otherwise, the memory leaks.

The small boxes represent the local variables `a` and `x`, the large box represents the new `part` object, and the solid arrows represent the pointers to the object. The `supplier` returns the address saved in `a` to the `client`, which saves it in `x` - indicated by the dotted arrow. The dashed-line box indicates that the program deallocates `a` when the function ends, but the object remains allocated and usable.

This example adds the `static` keyword to the code presented in Figure 2, creating a solution without a logical error. C++ does not allocate `static` variables on the stack or the heap; it allocates their memory when the operating system loads the program into memory, and they remain allocated until the program ends. The local variable `a` goes out of scope when the function ends, but its memory is *not* deallocated and retains the saved data. `supplier` returns the object's address to `client`, which saves it in the pointer `x`. Although `a` is not in scope, it remains in memory, accessible indirectly through `x`.

Compare to Figure 2(b). `x` points to `a`, which the solid box suggests remains allocated.

**Returned:** the *address* of data

**Notes:**

Return-by-pointer is a special case of return-by-value - the value returned is an address

The returned address must point to `static` or dynamic (allocated with `new`) data
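The two correct pointer versions can be sketched side by side. The function names `heap_supplier` and `static_supplier` are hypothetical, introduced only so both versions fit in one example, and `client` returns an `int` purely for illustration.

```cpp
#include <cassert>

struct part
{
	char	type;
	int	id;
};

// Heap version: the object outlives the call; the caller must delete it.
part* heap_supplier()
{
	part* a = new part { 'd', 10 };
	return a;
}

// Static version: one object that stays allocated for the whole program run.
part* static_supplier()
{
	static part a = { 'd', 10 };
	return &a;
}

int client()
{
	part* x = heap_supplier();
	int id = x->id;		// the object is still allocated and usable
	delete x;		// release the heap memory to avoid a leak

	part* s1 = static_supplier();
	part* s2 = static_supplier();
	assert(s1 == s2);	// every call returns the address of the same object
	return id;
}
```

Note the difference in ownership: the client must `delete` the heap object, but it must never `delete` the address of a `static` variable.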

Return-By-Reference

Picture: a square representing a variable returned by reference. The single square is labeled both a and x, suggesting that one variable has two names. The square is drawn with a dashed line, denoting that the program deallocates the object when the function ends.

**Incorrect return-by-reference**. Functions can also return data by reference, but programmers must again exercise caution. This version is *syntactically* correct and compiles without error, but it has a *logical* error. Also, as with return-by-pointer, some, but not all, compilers issue a non-fatal warning at the `return a;` statement.

The compiler makes references by mapping two variable names, each in a different scope, to the same memory address. In this example, it maps the variable `x` in the `client` to the memory location for the variable `a` in the `supplier`. But `a` is deallocated when the `supplier` ends, leaving `x` to refer to deallocated memory.

The `supplier` defines the local variable `a`, which the program deallocates when the function ends (denoted by the dashed-line box), leaving `x` naming or mapped to deallocated memory.

Picture: a square representing an object returned by reference. The single square is labeled a and x, suggesting the object has two names. The program defines the variables in different scopes, 'a' in 'supplier' and 'x' in 'client.' Making variable 'a' static prevents the program from deallocating it when the function returns.

**Correct return-by-reference**. Return-by-reference is difficult to understand and challenging to illustrate. We can often replace it with return-by-value with little effort. However, return-by-reference has one peculiar advantage that return-by-value does not have. A function performing a return-by-value can only act as an r-value expression, but one returning a reference can also act as an l-value. We'll be better able to take advantage of this feature when we study object-oriented programming and overloaded operators.

Carefully compare this `supplier` function with Figure 4(a), noticing the addition of the `static` keyword, preventing the memory deallocation when the function ends and retaining the saved data between function calls. Although the program defines the *names* `a` and `x` in different scopes, they refer to the same memory location, which remains allocated throughout the program execution.

The solid-line box in this figure replaces the dashed-line box of Figure 4(b), suggesting that object `a` remains allocated when function `supplier` ends.
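The l-value advantage mentioned above can be sketched with the `static` version of `supplier`. This is an illustrative example only; `client` returning an `int` is an assumption made so the result can be checked.

```cpp
#include <cassert>

struct part
{
	char	type;
	int	id;
};

// Returns a reference to a static object that is never deallocated.
part& supplier()
{
	static part a = { 'd', 10 };
	return a;
}

int client()
{
	part& x = supplier();		// x is a second name for the static object a
	supplier().id = 42;		// the call acts as an l-value: assigns to a.id
	assert(&x == &supplier());	// both names refer to one memory location
	return x.id;			// the change is visible through x
}
```

Because `a` is `static`, the value assigned through the function call persists between calls, so a later call to `supplier` still sees the `id` 42.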

(Please initially ignore the highlighted statements.) The `new` operator allocates memory on the heap, which remains allocated until the program explicitly deallocates it with the `delete` operator. The dereference operator, `*`, and variable `a` form an expression whose value is the new object. In conjunction with returning a reference and the dereference operation, the assignment operation in `client` maps the local variable `x` to the new object.
The highlighted statements find and print the object's address. The dereference operator, `*`, and the address-of operator, `&`, are unary operators with the same precedence that group right to left, so in `&*a` the dereference operates first. The expression `*a` produces an object, and `&` gets its address. In `client`, `&x` gets the address of `x`. The two `cout` statements print the same address, demonstrating that the object resulting from the dereference operation in `supplier` is the same object named `x` in `client`.

Two views of the new object and the pointer and reference variables. The first illustrates pointer `a` pointing at the new object. The second, following the function return and assignment operation, illustrates that the program has deallocated `a` and mapped `x` to the address of the object formed by the dereferencing operation.
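The address comparison that the highlighted statements perform by printing can also be sketched as a direct check. The global `last_allocated` is a hypothetical helper, not part of the original example; it records the address formed by `&*a` so that `client` can compare it with `&x`.

```cpp
#include <cassert>

struct part
{
	char	type;
	int	id;
};

// Hypothetical helper: remembers the address of the most recent heap object.
part* last_allocated = nullptr;

part& supplier()
{
	part* a = new part { 'd', 10 };
	last_allocated = &*a;	// dereference first, then take the address
	return *a;		// return the object itself, by reference
}

// Returns true when x names the object created in supplier.
bool client()
{
	part& x = supplier();
	bool same = (&x == last_allocated);
	delete &x;		// the heap object must still be deleted
	return same;
}
```

The expression `delete &x` looks unusual, but it is how a program releases a heap object that it knows only by reference.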

Returning Non-Local Data

The previous examples defined a variable in the supplier function and returned it to the client. When returning the variable by pointer or reference, the challenge was avoiding pointing or referring to deallocated memory. Programmers can combine pass and return by pointer or reference, forming another solution to this problem.

(a) By-Pointer

part* supplier(part* a)
{
	a->type = 'd';
	a->id = 10;
	return a;
}

void client()
{
	part y;
	part* x = supplier(&y);
}

(a) By-Reference

part& supplier(part& a)
{
	a.type = 'd';
	a.id = 10;
	return a;
}

void client()
{
	part y;
	part& x = supplier(y);
}

(b) By-Pointer

void my_function(part* p) { ... }

my_function(supplier(&y));

(b) By-Reference

void my_function(part& p) { ... }

my_function(supplier(y));

(c) By-Pointer

*supplier(&p) = r;

(c) By-Reference

supplier(p) = r;

Returning non-local data. The client function defines the data, variable y, and passes it as an argument to the supplier. The supplier completes its task, saving the result in y and returning it to the client. The return operation is superfluous because pass-by-pointer and by-reference are INOUT passing methods, allowing data to flow into and out of a function through the argument/parameter pair. Nevertheless, returning a non-local variable is a convenient and frequently used technique.

(a) Any changes that supplier makes to parameter a are automatically reflected in argument y in the client. So, after supplier returns, x and y refer to the same data. The variable name a goes out of scope when the supplier ends, but the memory allocated in client remains allocated.
(b) A function call to supplier is an expression whose value is a pointer or reference to a part object. This example demonstrates how programmers can embed one function call inside another, effectively chaining the function calls and illustrating the benefit of returning non-local data.
(c) Returning by-pointer or by-reference allows programmers to use the function call as an l-value (an expression that can appear on the left side of an assignment operator); the pointer version must first dereference the returned address. While useful cases of the pointer version are rare, the reference version is quite useful, as we'll see in Chapter 10.
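The three by-reference uses described above can be combined into one sketch. The body of `my_function` is hypothetical (the original elides it with `...`), and the variables `y` and `r` and the `int` return type of `client` are assumptions for illustration.

```cpp
#include <cassert>

struct part
{
	char	type;
	int	id;
};

// Fills in the caller's object and returns a reference to that same object.
part& supplier(part& a)
{
	a.type = 'd';
	a.id = 10;
	return a;
}

// Hypothetical stand-in for my_function: reports the id it receives.
int my_function(part& p)
{
	return p.id;
}

int client()
{
	part y;
	part& x = supplier(y);			// (a) x and y name the same object
	assert(&x == &y);

	int id = my_function(supplier(y));	// (b) chaining the function calls

	part r = { 'e', 20 };
	supplier(y) = r;			// (c) the call is an l-value: y becomes a copy of r
	assert(y.type == 'e' && y.id == 20);
	return id;
}
```

In (c), `supplier` first stores 'd' and 10 in `y`, and the assignment then overwrites `y` with the contents of `r`.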

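(a) Incorrect return-by-pointer

part* supplier()
{
	part a = { 'd', 10 };
	return &a;	// logical error: a is deallocated when supplier ends
}

void client()
{
	part* x = supplier();
}

(b) Picture: `x` points to the deallocated object `a`.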

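(a) Correct return-by-pointer with a heap object

part* supplier()
{
	part* a = new part { 'd', 10 };
	return a;
}

void client()
{
	part* x = supplier();
}

(b) Picture: `a` and `x` both point to the new `part` object.

(c) Correct return-by-pointer with a static variable

part* supplier()
{
	static part a = { 'd', 10 };
	return &a;
}

void client()
{
	part* x = supplier();
}

(d) Picture: `x` points to `a`, which remains allocated.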

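(a) Incorrect return-by-reference

part& supplier()
{
	part a = { 'd', 10 };
	return a;	// logical error: a is deallocated when supplier ends
}

void client()
{
	part& x = supplier();
}

(b) Picture: `a` and `x` name the same deallocated object.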

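(a) Correct return-by-reference with a static variable

part& supplier()
{
	static part a = { 'd', 10 };
	return a;
}

void client()
{
	part& x = supplier();
}

(b) Picture: `a` and `x` name the same object, which remains allocated.

(c) Correct return-by-reference with a heap object

#include <iostream>
using namespace std;

part& supplier()
{
	part* a = new part { 'd', 10 };
	cout << &*a << endl;	// highlighted: address of the new object
	return *a;
}

void client()
{
	part& x = supplier();
	cout << &x << endl;	// highlighted: prints the same address
}

(d) Picture: pointer `a` points to the new object; after the return, `x` names that object.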