During my career as a software engineer, I periodically interviewed for new positions. Interviews, especially second interviews, often included engineers who asked technical questions. During two interviews, at different times and with different companies, I was given the following function, with various distracting statements in place of the ellipses, and asked, "What's wrong with this code?" 1 Can you identify the problem?
char* get_name()
{
    char name[100];
    cin.getline(name, 100);
    .
    .
    .
    return name;
}
char name[100]; is a valid character array definition.
The getline adds the null-termination character, making the array a valid C-string.
The return type, char*, matches the returned expression, return name;, because an array name without brackets converts to a pointer to its first character.

The problem with the above function is the scope of the C-string, name. It is defined and its memory allocated in get_name, making it a local variable in the function's scope. The function returns the variable's address, but the program deallocates the memory when the function returns. What happens to the deallocated memory depends on the program and the computer running it. The computer may reallocate the memory and overwrite its contents very quickly. However, the program that calls get_name is likely still using (or, more accurately, trying to use) the data stored in name. If the program runs correctly once, there's no guarantee that the next call will succeed. Even if the program consistently runs correctly on one computer, there's no guarantee it will work on another. The following figures illustrate three solutions.
Understand: scope != allocation
Although they are sometimes tightly coupled, scope and memory allocation are distinct concepts and mechanisms. Programs allocate and deallocate memory for local or automatic variables when they come into and go out of scope, respectively. In the case of local variables, the tight coupling of scope and memory management blurs the boundaries distinguishing them, making them appear as one. However, static variables, the first solution presented, make the distinction more apparent. Study each solution carefully to understand the interplay between scope and memory allocation.
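The distinction can be seen in a small sketch. The functions next_id, always_one, and scope_demo are hypothetical names introduced only for this illustration. In both functions, count is in scope only inside the function body, yet only the static version keeps its memory, and therefore its value, between calls.

```cpp
#include <cassert>

// Hypothetical example: 'count' is visible only inside next_id,
// but its memory stays allocated for the whole run of the program.
int next_id()
{
    static int count = 0;   // initialized once, not on every call
    return ++count;
}

// Contrast: this 'count' is automatic, so it is allocated and
// initialized again on every call.
int always_one()
{
    int count = 0;
    return ++count;
}

// Hypothetical self-check of the behavior described above.
void scope_demo()
{
    int a = next_id();
    int b = next_id();
    assert(b == a + 1);          // the static value persisted between calls
    assert(always_one() == 1);
    assert(always_one() == 1);   // the automatic value did not
}
```

Neither variable can be named outside its function, so the two have the same scope. Only their allocation differs.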
There isn't a single or even a best solution for the scoping problem. Each described solution has a set of unique outcomes that, if misunderstood, can lead to further failures. However, we shouldn't consider the outcomes as programming errors or language deficiencies, but as the result of our design choices. Various data structures, such as stacks, lists, and trees, are useful examples: some support efficient insertion, while others prioritize efficient searching. A common software development task is aligning a structure's behaviors with the needs of a given problem. As practicing computer scientists, our task is to understand the ramifications of each approach, enabling us to identify a satisfactory match for a given problem.
char* get_name()
{
    static char name[100];
    cin.getline(name, 100);
    .
    .
    .
    return name;
}
Programs allocate memory for static variables when the operating system loads them into memory, and that memory remains allocated until the program ends. The variable name does go out of scope when the function ends, but the memory it names remains allocated and dedicated to the program. Since the function returns the address of name, the data at that address is accessible indirectly through it. For example:
char* line = get_name();
cout << line[3] << endl;
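A runnable sketch of the same idea follows. It assumes one change that is not in the original function: get_name takes a std::istream& parameter so that it can read from a string stream instead of the keyboard. The helper static_demo and the sample input are also assumptions made for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <istream>
#include <sstream>

// Sketch of the static solution. The istream parameter is an assumption
// added so the function can be exercised without keyboard input.
char* get_name(std::istream& in)
{
    static char name[100];   // allocated for the life of the program
    in.getline(name, 100);   // stores at most 99 characters plus the null terminator
    return name;             // the address remains valid after the return
}

// Hypothetical client that uses the returned address.
void static_demo()
{
    std::istringstream input("Grace Hopper\n");
    char* line = get_name(input);
    assert(std::strcmp(line, "Grace Hopper") == 0);
    assert(line[3] == 'c');  // indirect access through the returned address
}
```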
The static data solution is relatively easy to understand and straightforward to use. Although rarely a liability, it does have some limitations. First, it uses more memory than non-static variables. Early programming languages like FORTRAN didn't have automatic variables; instead, programs allocated memory for all variables at load time. ALGOL introduced the concept and term "automatic variable," allowing equivalent programs written in ALGOL-derived languages to use less memory than those written in older languages. Making a variable static does increase the program's memory requirements, but not significantly.
Second, functions that have static variables cannot be recursive or reentrant. We briefly covered recursive functions in Chapter 6, and reentrancy is a property only needed by specialized functions, most often seen in operating systems, and beyond the scope of this course. Nevertheless, the final limitation is more concerning.
[Figure: two client() listings, (a) and (b), each beginning with a char* declaration; the remainder of each listing is not recovered.]
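A limitation of static variables. Every call to get_name stores its input in the same static array, so each call overwrites the data returned by the previous call. When the client needs only one name at a time, that behavior is harmless, making static data a good choice for solving the initial memory deallocation problem.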
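The following sketch shows how a second call to a static version of get_name affects the data returned by the first. The names get_shared_name and overwrite_demo are hypothetical, and the std::istream& parameter is an assumption that lets the example run without keyboard input.

```cpp
#include <cassert>
#include <cstring>
#include <istream>
#include <sstream>

// Hypothetical stand-in for the static get_name; it reads from any
// istream (an assumption) instead of cin.
char* get_shared_name(std::istream& in)
{
    static char name[100];   // one array shared by every call
    in.getline(name, 100);
    return name;
}

// Hypothetical client that calls the supplier twice.
void overwrite_demo()
{
    std::istringstream input("Ada\nGrace\n");
    char* first = get_shared_name(input);
    char* second = get_shared_name(input);

    assert(first == second);                   // both point at the one static array
    assert(std::strcmp(first, "Grace") == 0);  // "Ada" has been overwritten
}
```

A client that needs both names at once would have to copy the first one into its own array, for example with strcpy, before making the second call.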
[Figure: (a) listing of char* get_name(int size), not recovered past its first declaration; (b) illustration.]
The new operator (green) allocates memory from the heap rather than the stack. Heap memory offers two distinct advantages: First, the program doesn't deallocate heap memory when the allocating function, get_name in this example, returns. Second, heap memory is more flexible than stack memory. Whereas programs must specify the size of stack memory with a compile-time constant, programs can specify the size of heap data at runtime. This version of get_name capitalizes on this property by allowing the client program to specify the length of the allocated C-string.
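A minimal sketch of such a function follows. It is a reconstruction under stated assumptions, not the original listing: the name get_name_heap, the std::istream& parameter, and the helper heap_demo are all introduced here for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <istream>
#include <sstream>

// Sketch of the dynamic solution. The function name and the istream
// parameter are assumptions made for this example.
char* get_name_heap(int size, std::istream& in)
{
    char* name = new char[size];  // heap memory is not released on return
    in.getline(name, size);       // the size is a runtime value
    return name;                  // the caller now owns the array
}

// Hypothetical client that chooses the size at runtime.
void heap_demo()
{
    std::istringstream input("Alan Turing\n");
    int size = 32;                           // need not be a compile-time constant
    char* data = get_name_heap(size, input);
    assert(std::strcmp(data, "Alan Turing") == 0);
    delete[] data;                           // release the heap memory when done
}
```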
void client()
{
    char* data;
    data = get_name(100);
    . . .
}
[Figure: (a) the client listing above; (b) a second client listing, not recovered.]
Each call to get_name allocates a separate block of memory with the new operator. Therefore, subsequent calls to get_name do not overwrite the data returned by the previous call. This behavior replaces the problem with static data presented above with the potential for creating a logical error.
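The sketch below illustrates both halves of that trade-off. It assumes that the logical error in question is a memory leak, which the text does not state explicitly. The names get_name_dynamic and ownership_demo, and the std::istream& parameter, are hypothetical.

```cpp
#include <cassert>
#include <cstring>
#include <istream>
#include <sstream>

// Hypothetical heap-allocating supplier; reads from any istream (an assumption).
char* get_name_dynamic(int size, std::istream& in)
{
    char* name = new char[size];
    in.getline(name, size);
    return name;
}

// Hypothetical client that keeps two names at the same time.
void ownership_demo()
{
    std::istringstream input("Ada\nGrace\n");
    char* first = get_name_dynamic(100, input);
    char* second = get_name_dynamic(100, input);

    assert(first != second);                    // two separate heap arrays
    assert(std::strcmp(first, "Ada") == 0);     // the first name is intact
    assert(std::strcmp(second, "Grace") == 0);

    // The client is responsible for each array. Omitting either delete[]
    // would leak that array's memory (assumed to be the logical error meant).
    delete[] first;
    delete[] second;
}
```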
The final solution solves the scoping problem by moving the data's definition, and therefore its scope, from the supplier to the client (i.e., the function calling the supplier). The client calls the supplier, passing the data by pointer. We can visualize a sequence of chained function calls as a path through a program. In this visualization, while one function calls another, the first remains "active" or running, and consequently, its local or automatic data remain allocated. The supplier accesses the client's data indirectly through its parameter.
[Figure: (a) client / calling scope: void client() listing; (b) get_name / supplier: listing beginning char* get_name(int size, ; (c) illustration. The listings are not recovered.]
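Returning the pointer is a convenience that lets the client nest the call inside another expression, for example:
cout << get_name(data) << endl;
The C-string API provides examples of this convenience technique: see the standard versions of strcpy and strcat.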
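The pass-by-pointer approach can be sketched as follows. The parameter list is an assumption, since the original listing is incomplete: get_name_into, client_demo, and the std::istream& parameter are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstring>
#include <istream>
#include <sstream>

// Sketch of the pass-by-pointer solution: the client owns the array and
// the supplier only fills it. The istream parameter is an assumption.
char* get_name_into(int size, char* name, std::istream& in)
{
    in.getline(name, size);  // writes into the client's array
    return name;             // returned only as a convenience, as strcpy does
}

// Hypothetical client: 'data' is local to the client, so it stays
// allocated for as long as the client is active, including during the call.
void client_demo()
{
    char data[100];
    std::istringstream input("Linus\n");
    char* result = get_name_into(100, data, input);

    assert(result == data);                   // same array, no new allocation
    assert(std::strcmp(data, "Linus") == 0);
}
```

Because the array belongs to the client, nothing is deallocated when the supplier returns, and the client has nothing to delete.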
(a)
char* get_name(int size, char* name = nullptr)
{
    if (name == nullptr)
        name = new char[size];
    cin.getline(name, size);
    . . .
    return name;
}

(b)
void client()
{
    char* data = get_name(100);
    // use data
    .
    .
    .
    .
}

(c)
void client()
{
    char* data = new char[200];
    get_name(200, data);
    // use data
    .
    .
    .
}