4.6. Memory Management: Stack And Heap


The chapter previously asserted that C++ can create objects in two distinct ways. It demonstrated the syntax for both ways and abstractly illustrated the differences in memory (see Figure 4). The two demonstrated statements allocate the object's memory from different regions of the program's memory. To appreciate where and how programs allocate object memory, we must understand how they organize and manage it.

Memory Allocation

"To allocate" means to assign, allot, distribute, or "set apart for a particular purpose." Programs manage their memory by partitioning or dividing it into separate regions that perform specific tasks. Two of those regions are the stack and the heap. When a program needs memory for data or variables, it allocates it from the stack or heap. It deallocates the memory when it's no longer needed, returning it to the allocating region where it is again available for allocation.

When an operating system (OS) runs a program, it begins by loading it into memory. Memory is used for the program's machine instructions and for the data that the program uses. When I first created Figure 1, computers typically used a memory allocation technique called segmented memory. When the OS loaded and ran a program on a segmented-memory computer, it allocated a contiguous block or segment of memory to the program. The program divided its memory into regions that performed specific program functions. Although this memory management technique is now obsolete, having been replaced by paged memory, programs continue to organize their memory based on the functional units illustrated here.

Paged memory computers manage memory dynamically, so the amount of memory allocated to a program can increase and decrease as the program's needs change. Memory is allocated to the program and reclaimed by the OS in fixed-size "chunks" or blocks called pages. When the OS loads a program on a paged-memory computer, it initially allocates a minimal number of pages to the program and allocates additional memory pages as needed. Machine code and data that are not immediately needed are not loaded, and the OS can reclaim pages storing machine code and data not recently used. Although Figure 1 no longer represents the physical layout of memory, it accurately represents the functional or logical organization.

An abstract representation of the functional units of memory managed by a running program. The units include the heap, the stack, memory to hold any global or static variables, and the text segment that holds the machine instructions.
The functional memory organization of a running program.
  1. The text area contains the program's machine instructions (i.e., the executable code).
  2. Global variables are defined in global scope outside of any function or object; static variables have the keyword static included as part of their definition. The OS typically allocates the memory holding global and static variables at program startup.
  3. Segmented memory systems allocate the illustrated memory en bloc, where it remains unused until the stack or heap grows into it. Paged memory systems allocate the space in page-sized blocks when the stack or heap grows.
  4. The stack (sometimes called the runtime stack) contains the automatic variables: function parameters and local variables not declared static.
  5. Memory is allocated from and returned to the heap with the new and delete operators, respectively.
A program maintains data in three logical memory regions. The OS allocates the first region for global and static data when it loads the program into memory and doesn't deallocate it until the program terminates. The other two regions, the stack and heap, are more dynamic. The program maintains some free space to satisfy requests to grow these regions and can request more memory from the OS when needed. The differences between these two regions are the algorithms that manage the memory and how they behave.

Memory allocation occurs at multiple levels, requiring us to use the term "allocate" in two ways. First, operating systems manage all computer resources, including main memory. So, the OS allocates physical memory to a running program in pages, but this operation is completely transparent to and beyond the control of programmers.

Second, running programs allocate memory for program data. The C++ runtime system (see Figure 4, item 3) includes a memory management subsystem in every C++ program. It allocates memory for automatic variables on the stack and dynamic variables, created with the new operator, on the heap. When the program deallocates memory, the memory management system returns it to the region it came from, either the stack or the heap.


A stack is a simple last-in, first-out (LIFO) data structure. Imagining a stack of plates in a cafeteria is a common way to introduce stacks to new computer scientists. In the cafeteria example, plates are added to or removed from the stack only at the top. Upon entering the cafeteria, one of our first actions is to pop a plate off the top of the stack before going through the serving line. Perhaps unrealistically, we require the dishwashers to push each clean plate on top of the stack, one at a time. In this way, the last clean plate pushed on the stack is the first plate that a customer pops off. Inserting or removing a plate from the middle of the stack is not permitted. Stacks must support at least two operations: push and pop; other operations are possible but are not required.

An animation illustrating items pushed onto and popped off a stack. Four colored rectangles physically move, one at a time, to the top of the stack (that is, they pile on top of each other), and then the rectangles move, one at a time, from the stack's top.
An animation in which nothing moves physically; instead, memory values change, represented by changing the colors of rectangles within the stack area (the colors change from bottom to top to represent the push operation and then from top to bottom to represent the pop operation).
Demonstrations of stack behavior.
  1. Items (e.g., plates) being pushed onto and popped off a stack. The items are popped off in the reverse order in which they were pushed onto the stack.
  2. Nothing moves physically when data items (variables or objects) are pushed onto or popped off of a stack in memory. Only the values stored in the memory managed by the stack are changed, as illustrated here. It's possible and common to intermix push and pop operations.
Both animations oversimplify stacks by showing that all the elements stored on the stack are the same size. The data items pushed on the runtime stack may be any convenient size. Nevertheless, stacks are somewhat rigid, only allowing access at the top. But this rigidity also makes stacks easy to implement, making the push and pop operations fast and efficient.


The heap is more flexible than the stack. Whereas the stack only allows allocation and deallocation at the top, programs can allocate or deallocate memory anywhere in a heap. Furthermore, programs must return memory to the stack in the opposite order of its allocation, but they can return memory to the heap in any order. That means it's possible to have "holes" in the middle of the heap: unallocated memory surrounded by allocated memory. To see the difference, compare Figures 2 and 3.

The heap abstractly represented as a rectangle. The animation inserts colored rectangles to simulate memory allocation. The animation removes the colored rectangle to simulate memory deallocation. Two adjacent colored blocks are initially still separated by a line; coalescing the blocks into a single block is demonstrated by removing the separating line.
Demonstration of heap behavior. The heap allocates memory by finding and returning the first memory block large enough to satisfy the request. Memory is returned or freed in any convenient order. When the program deallocates or releases two adjacent memory blocks, the heap merges them to form a single block. Doing this allows the heap to meet future demands for large memory blocks. The cross-hatched block illustrates a request for a large block (twice the size of the colored blocks) of memory.

The heap imposes only one restriction on the memory it allocates to the program: each request must be satisfied by a single contiguous block large enough to hold it. This restriction increases a heap's complexity in at least two ways. First, the code carrying out the allocation operation must scan the heap until it finds a contiguous block of memory large enough to satisfy the request. Second, when the program returns memory to the heap, the memory manager must coalesce it with any adjacent free blocks to better accommodate future requests for large blocks of memory. The heap's increased complexity means managing memory with a heap is generally slower than with a stack. But a heap also has advantages that justify the increased overhead. The next section explores those advantages and illustrates how the stack and the heap work together to manage complex data in a running program.