4.7. Dynamic Memory: `new` And `delete`


Review

Some concepts below build on terminology introduced in previous chapters. Please review the following as needed:

Definition (especially Figures 4 and 5)
Variable initialization (especially Figure 9)

Automatic variables were introduced in Chapter 1 and are so named because the memory storing them is automatically allocated when they come into scope (i.e., when a function or method runs) and is automatically deallocated when they go out of scope (when the function or method ends). Functions and methods return in the reverse order of their calls, implying that programs deallocate automatic variables in the reverse order of their allocation. This behavior matches a stack's operation exactly, so C++ programs allocate memory for automatic variables on the runtime stack.

Computer scientists often use the term "dynamic" to describe something that happens while the program runs. So, programs allocate dynamic memory while they run. They also allocate and deallocate memory for automatic variables when they run, but all the information needed for automatic allocation is available when the program is written and compiled. For example, suppose a program needs a certain number of objects to perform its task. If the required number is known when we write the program, we can include that number in the code. But suppose the user inputs the number, or the program calculates it while running. In that case, the number is unknown until the program runs, so the program must first determine how many objects it needs and then dynamically allocate the memory for them. Dynamic memory is managed on the heap with two operators: new and delete.
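As a minimal sketch of the situation described above, the hypothetical function below receives a count that is only known at run time, allocates that many doubles on the heap, and releases them before returning. The name averageOfRamp and the sample values it stores are assumptions made for illustration; the new and delete[] operators it relies on are described in the sections that follow.

```cpp
#include <cstddef>

// Hypothetical helper: fills a heap array with 1, 2, ..., count and returns
// the average. `count` stands in for a value that the user enters or that
// the program computes while it runs.
double averageOfRamp(std::size_t count)
{
    if (count == 0)
        return 0.0;                               // nothing to allocate or average

    double* scores = new double[count];           // array size chosen at run time

    for (std::size_t i = 0; i < count; i++)
        scores[i] = static_cast<double>(i + 1);   // sample data: 1, 2, ..., count

    double sum = 0.0;
    for (std::size_t i = 0; i < count; i++)
        sum += scores[i];

    delete[] scores;                              // return the array to the heap
    return sum / static_cast<double>(count);
}
```

Calling averageOfRamp(4) allocates four doubles, so the array size never has to appear as a constant in the source code.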

`new`

Dynamic memory is allocated from the heap with the new operator, which returns the address of the memory to the requesting program. The new operator can allocate memory for any data type a program needs, but it is rarely useful for dynamically creating individual chars, ints, or doubles. In contrast, it is frequently useful for creating larger data items such as arrays, structures, and objects (covered in detail in subsequent chapters). The new operator does three important things when it runs:

It allocates memory on the heap
If the newly created data item is an object, it calls the constructor to initialize the memory
It returns the address of the data or object

For example:

(a) C++

char*    c = new char;                 // uncommon
double*  scores = new double[size];    // common
Person*  p = new Person;               // common

(b) Java

Character  c = new Character('a');     // Character has no no-argument constructor
double[]   scores = new double[size];
Person     p = new Person();

Allocating memory with the new operator. The new operator allocates a memory block on the heap. In C++, the block size may be as small as one byte (the size of a non-wide character). However, it's not common to allocate a single primitive variable with new because the pointer needed to reach the data typically occupies as much memory as the data itself, or more. In either C++ or Java, the block may be arbitrarily large: not without limits but, practically speaking, very large. Both examples assume that "Person" names a class as described at the beginning of the chapter.

In a C++ program, it's possible to have a pointer to any kind of data. Notice that all of the variable definitions (on the left side of the assignment operator) include an asterisk, which makes the variables pointers. Furthermore, each statement performs two distinct operations: a variable definition (on the left side of the assignment operator) and a variable initialization. The initialization includes a variable name, the assignment operator, and the new expression.
Java only allows pointers to objects - not to simple data types like char or double. An array in Java is an instance (i.e., an object) of an unnamed class.
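The three things that new does (allocate, construct, return an address) can be observed with a small sketch. The Widget class, its constructed counter, and the function name below are hypothetical and exist only to make the constructor call visible; they are not part of the chapter's examples.

```cpp
// Hypothetical class used only to make the constructor call visible.
struct Widget
{
    static int constructed;    // counts how many times the constructor has run
    int value;

    Widget() : value(42) { constructed++; }
};

int Widget::constructed = 0;

// Allocates one Widget, reads the value set by its constructor, and frees it.
int makeAndReadWidget()
{
    Widget* w = new Widget;    // allocate on the heap, run Widget(), return the address
    int result = w->value;     // the constructor has already initialized value
    delete w;                  // return the memory to the heap
    return result;
}
```

Each call to makeAndReadWidget increases Widget::constructed by one, which suggests that new ran the constructor before handing back the address.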

The stack and the heap come together in the above examples. Each variable appearing on the left side of the assignment operator is an automatic variable allocated on the stack. The new operator allocates memory on the heap. Interestingly, the abstract representation of the stack and heap illustrated in Figure 2 is appropriate for C++ and Java alike (excepting, perhaps, that an instance of class Character in Java is larger than a single char in C++).

The picture illustrates how the statements in Figure 1 (a) and (b) affect memory. A rectangle containing smaller rectangles depicts the stack; the smaller rectangles represent the variables c, scores, and p. Another large rectangle with smaller rectangles inside depicts the heap; those smaller rectangles represent the data allocated by the new operator. Arrows lead from the variable boxes in the stack to the allocated memory boxes in the heap.

**Automatic variables on the stack pointing to dynamically allocated data on the heap**. It's easier to understand the interplay between pointers and dynamic memory if we recognize that there are *two* memory locations involved: the data memory, allocated on the heap, and the pointer, often (but not always) allocated on the stack. The pointer serves as a name for the dynamically allocated heap memory.

`c` points to simple data like a char, int, or double

`scores` points to an array of doubles (C-strings, covered in Chapter 8, are character arrays that can be allocated the same way)

`p` points to a structure (Chapter 5) or an instance of a class (Chapter 9)
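The idea that two memory locations are involved can be checked directly. The sketch below is illustrative only, and the function and variable names are assumptions: it compares the address of the pointer variable itself with the address stored inside it.

```cpp
// Returns true when the pointer variable and the data it points to
// occupy different memory locations.
bool pointerAndDataAreSeparate()
{
    int* p = new int(7);              // p is automatic; the int lives on the heap

    void* addressOfPointer = &p;      // where the pointer variable itself is stored
    void* addressOfData = p;          // the heap address stored inside p

    bool separate = (addressOfPointer != addressOfData) && (*p == 7);

    delete p;                         // free the heap data; p itself is automatic
    return separate;
}
```

When the function returns, the automatic variable p disappears on its own, while the heap data is released only because of the explicit delete.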

`delete`

The C++ and Java statements appearing in Figure 1 are similar and perform the same tasks. One important feature in both languages is that memory allocated with the new operator is independent of the allocating scope. So, unlike automatic variables on the stack, memory allocated with new is not automatically deallocated when the allocating function returns. But C++ and Java programs deal with that memory differently when it is no longer needed. In a C++ program, dynamic memory remains available to a program until the program explicitly deallocates it. To deallocate dynamic memory, C++ provides an operator not found in Java: delete.

delete c;		// deallocates a single char
delete[] scores;	// deallocates an array
delete p;		// deallocates one object

Returning allocated memory to the heap. The example statements deallocating dynamic or heap memory assume the variables defined in Figure 1(a).

To deallocate memory, it is generally sufficient to follow the delete operator with a pointer variable pointing to the memory that we want to return to the heap.
To deallocate an array, we must modify the operator by adding square brackets at the end: delete[]. We'll cover the reason for the extra symbols when we discuss classes in detail in Chapter 9.
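The following sketch illustrates both points made in this section: heap memory survives the function that allocates it, and the caller later releases it with the matching form of delete. The Person structure here is a hypothetical stand-in for the chapter's Person class, and the function names are assumptions.

```cpp
#include <cstddef>

struct Person { int age; };    // hypothetical stand-in for the chapter's Person class

// The array outlives this function because it is allocated with new.
double* makeScores(std::size_t size, double initial)
{
    double* scores = new double[size];
    for (std::size_t i = 0; i < size; i++)
        scores[i] = initial;
    return scores;             // only the pointer variable goes out of scope
}

// The Person object also remains on the heap after this function returns.
Person* makePerson(int age)
{
    Person* p = new Person;
    p->age = age;
    return p;
}

// The caller uses the heap data and is then responsible for releasing it.
double useAndRelease()
{
    double* scores = makeScores(3, 1.5);
    Person* p = makePerson(30);

    double total = scores[0] + scores[1] + scores[2] + p->age;

    delete[] scores;           // array form matches new double[size]
    delete p;                  // single-object form matches new Person
    return total;
}
```

Pairing new[] with delete[] and new with delete is what keeps the sketch correct; mixing the two forms is undefined behavior in C++.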

`delete` vs. automatic garbage collection

Java doesn't require a delete operator because it has an automatic garbage collector that returns discarded memory to the heap. C++ has no garbage collector, so programmers must explicitly deallocate memory with delete. Suppose a C++ program loses the address of heap memory before deleting it. In that case, that memory becomes garbage and is unusable until the program ends and the operating system reclaims all the program's memory. When a program loses access to dynamic memory (i.e., it creates garbage), programmers say that it has a memory leak. It is generally easier to write a correct program in Java than in C++, but a correct C++ program typically runs faster than the same program written in Java. Java's garbage collector is only one reason for the performance difference, but consider all the garbage collector must do.

Java's garbage collector is based on a "mark and sweep" algorithm involving three distinct steps. It is important to recognize that while the garbage collector runs (i.e., throughout all three steps), the program must suspend all of its problem-solving operations.

The garbage collector marks as garbage all program data allocated with new
The garbage collector follows every pointer in the program and erases the "garbage" mark in all reachable data
Finally, the garbage collector sweeps up all of the allocated memory still marked as garbage, returning it to the heap
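The three steps above can be sketched with a toy model written in C++. This is only an illustration of the idea, not how a real Java virtual machine is implemented: every name below (ToyObject, unmarkReachable, collect) is hypothetical, each "object" is a slot in a vector, and its "pointers" are the indices of other slots.

```cpp
#include <cstddef>
#include <vector>

// Toy model: an object holds indices of the objects it points to.
struct ToyObject
{
    std::vector<std::size_t> refs;   // indices of objects this one points to
    bool garbage = false;            // the "garbage" mark
    bool freed = false;              // true once the object has been swept
};

// Step 2 helper: erase the mark on object i and on everything reachable from it.
// Assumes every index stored in refs is a valid position in heap.
void unmarkReachable(std::vector<ToyObject>& heap, std::size_t i)
{
    if (heap[i].freed || !heap[i].garbage)
        return;                      // already swept or already visited
    heap[i].garbage = false;         // clearing first also stops cycles
    for (std::size_t next : heap[i].refs)
        unmarkReachable(heap, next);
}

// Runs the three steps and returns the number of objects swept.
// `roots` models the pointers held directly by the program.
std::size_t collect(std::vector<ToyObject>& heap,
                    const std::vector<std::size_t>& roots)
{
    for (ToyObject& obj : heap)      // step 1: mark everything as garbage
        if (!obj.freed)
            obj.garbage = true;

    for (std::size_t r : roots)      // step 2: unmark all reachable data
        unmarkReachable(heap, r);

    std::size_t swept = 0;
    for (ToyObject& obj : heap)      // step 3: sweep what is still marked
    {
        if (!obj.freed && obj.garbage)
        {
            obj.freed = true;
            obj.garbage = false;
            obj.refs.clear();
            swept++;
        }
    }
    return swept;
}
```

Even in this toy form, every call to collect visits every object at least once, which hints at why the work costs more than a single delete.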

The garbage collector is elegant and makes writing correct code without memory leaks much easier than putting the programmer in charge of memory management. But it also takes much longer to run than simply returning memory to the heap with the delete operator.
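Because C++ leaves memory management to the programmer, avoiding a leak comes down to deleting heap memory while its address is still known. The sketch below uses a hypothetical Tracked class whose live counter (an assumption added for illustration) records how many objects currently exist; a comment shows where a leak would occur.

```cpp
// Hypothetical class that counts how many of its objects currently exist.
struct Tracked
{
    static int live;
    Tracked()  { live++; }
    ~Tracked() { live--; }
};

int Tracked::live = 0;

// Replaces one heap object with another and returns the number of
// Tracked objects still alive afterward.
int replaceWithoutLeak()
{
    Tracked* t = new Tracked;

    // Writing "t = new Tracked;" at this point would overwrite the only copy
    // of the first object's address, turning that object into garbage (a leak).
    delete t;                  // release the first object while its address is known
    t = new Tracked;           // now it is safe to reuse the pointer

    delete t;                  // release the second object as well
    return Tracked::live;
}
```

If the first delete were omitted, one Tracked object would remain alive after the function returned, with no pointer left to delete it.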

Common Programming Errors

It is an error to use delete on a pointer that does not point to memory allocated on the heap with the new operator. Both of the following represent common programming errors:

Person* p1;
delete p1;	// p1 not initialized

Person	p;
Person*	p2 = &p;
delete p2;	// p2 points to p, which is allocated on the stack
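For contrast, here is a sketch of how the same pointers might be handled safely. It assumes a minimal, hypothetical Person structure and function name: the uninitialized pointer is set to nullptr (deleting a null pointer does nothing), the pointer to the automatic variable is never deleted, and only memory obtained from new is passed to delete.

```cpp
struct Person { int age = 0; };    // hypothetical stand-in for the chapter's Person class

// Demonstrates deleting only what was allocated with new.
bool deleteOnlyHeapMemory()
{
    Person* p1 = nullptr;      // initialized, so the delete below is well defined
    delete p1;                 // deleting a null pointer does nothing

    Person  p;                 // automatic variable on the stack
    Person* p2 = &p;           // p2 must NOT be deleted; p is released automatically
    p2->age = 21;

    Person* p3 = new Person;   // heap memory: this is what delete is for
    p3->age = p2->age;
    int copied = p3->age;
    delete p3;
    p3 = nullptr;              // avoids accidentally deleting the same memory twice

    return copied == 21 && p1 == nullptr && p3 == nullptr;
}
```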
