During my career as a software engineer, I periodically interviewed for new positions. Interviews, especially second interviews, often include engineers who ask technical questions. During one such interview, a software engineer wrote the following code fragment on the blackboard. He wrote a few distracting statements in place of the ellipses and then asked, "What's wrong with this code?"
Give up yet? Here's the problem: the function get_name returns the address of the local variable name, which the program deallocates when the function returns. What becomes of that deallocated memory depends on the program and the computer running it. The program may reallocate the memory and overwrite it very quickly. However, the program that calls get_name is likely still using (or, more accurately, trying to use) the data stored in name. If the program runs correctly once, there's no guarantee that the next call will succeed. Even if the program consistently runs correctly on one computer, there's no guarantee it will work on another. The following figures illustrate three solutions.
Although they are sometimes tightly coupled, scope and memory allocation are distinct concepts and mechanisms. Programs allocate and deallocate memory for local or automatic variables when they come into and go out of scope, masking the distinction between the two mechanisms and making the above appear to be a scoping problem. The distinction between scope and memory allocation is easier to see if we make name a static variable, which is the first of three possible solutions. Study each solution carefully to understand the interplay between scope and memory allocation.
There is no single right solution to the problem of a function returning a local C-string. There are several possible solutions, each with advantages and disadvantages, so we can only choose an appropriate one in the context of a specific problem. As a practicing computer scientist, one of your tasks is to weigh the advantages against the disadvantages so that your chosen solution is the best for a given problem.
The static data solution is straightforward to use, and because it only allocates data once, it's also fast. Nevertheless, static data does have some disadvantages, albeit mostly minor ones. Early programming languages like FORTRAN didn't have automatic variables; the program allocated the memory for all variables at load time. Modern programming languages that support automatic variables run the same programs using far less memory than those older languages. Making a few variables static won't greatly increase the program's memory requirements, but making all variables static certainly would.
Functions that keep their working data in static variables generally cannot be safely recursive or reentrant because every active call shares the same copy of each static variable. We briefly covered recursive functions in Chapter 6; reentrancy is a property needed by specialized functions most often seen in operating systems and is beyond this course's scope. Both issues only concern a few specialized functions and are easily ignored. But there is one aspect of static data that we must understand.
(a)
    void client()
    {
        char* data;
        data = get_name(data);
        . . .
        data = get_name(data);
        . . .
    }

(b)
    void client()
    {
        char* data[10];
        data[0] = get_name(data);
        data[1] = get_name(data);
        . . .
    }
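Figure 3. Clients calling the static version of get_name. The function has only one static C-string, name. Therefore, each call to get_name overwrites the data stored in name with new data. Depending on how the client (the program that calls get_name) uses the data, this version of get_name may work correctly, or it may have a logical error.
(a) The client reuses a single pointer. Assuming the elided statements finish with each string before the next call, this is a typical use of static variables and is correct.
(b) The client saves the pointers returned by two calls, but both hold the address of the same static array, so data[0] and data[1] refer to the same, most recently read, string. Whether static data is a good choice for solving the initial memory deallocation problem therefore depends on how the client uses the returned C-string.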
    char* get_name(int size)
    {
        char* name = new char[size];
        cin.getline(name, size);
        . . .
        return name;
    }
The new operator allocates memory from the heap rather than from the stack, and the memory remains allocated until the program explicitly deallocates it with the delete operator. In this version of get_name, the memory allocated by new is not deallocated when the function ends and remains available to the client calling get_name.
Dynamically allocating memory with the new operator allows a more general solution (at least for one-dimensional arrays). The function allocates memory based on the size parameter, providing a more "customized" fit for the anticipated data whenever the program calls get_name.
Dynamic data, just like static data, has advantages and disadvantages. The new operator allocates a new block of memory for each C-string that get_name reads and returns. Therefore, each returned C-string has a unique memory location.
(a)
    void client()
    {
        char* data;
        data = get_name(100);
        . . .
        data = get_name(100);
        . . .
    }

(b)
    void client()
    {
        char* data[10];
        data[0] = get_name(100);
        data[1] = get_name(100);
        . . .
    }
Each call to get_name allocates a separate block of memory with the new operator. Therefore, subsequent calls to get_name do not overwrite the data returned by the previous call. This behavior solves the problem with static data presented above but creates a different opportunity for a logical error. The code fragments of parts (a) and (b) parallel those of Figure 3, but the picture illustrates the difference between the static and the dynamic implementations of get_name.
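(a) The client must either delete the first C-string or save its address before calling get_name again; otherwise, the second assignment overwrites the only pointer to it.
(b) The client saves each address in the array but must eventually delete the returned C-strings to prevent them from becoming unrecoverable garbage.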
(a)
    void client()
    {
        char data[100];
        . .
        get_name(100, data);
        // use data
    }

(b)
    char* get_name(int size, char* name)
    {
        cin.getline(name, size);
        . . .
        return name;
    }
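Because get_name also returns name, the client can nest the call inside another statement, for example:

    cout << get_name(100, data) << endl;

We can find examples of this convenience technique in the C-string API (see the standard versions of strcpy and strcat).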
(a)
    char* get_name(int size, char* name = nullptr)
    {
        if (name == nullptr)
            name = new char[size];
        cin.getline(name, size);
        . . .
        return name;
    }

(b)
    void client()
    {
        char* data = get_name(100);
        // use data
        . . . .
    }
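When the client omits the second argument, name defaults to nullptr, and get_name allocates the C-string with new. The client is then responsible for eventually deleting it.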