4.6 Memory Management: Stack And Heap

Memory Allocation

"To allocate" means to assign, allot, distribute, or "set apart for a particular purpose." Programs manage their memory by partitioning or dividing it into different units that perform specific tasks. Two of those units are the stack and the heap, which manage the program's unused memory and allocate it for different kinds of data or variables. When the memory is no longer needed, it may be deallocated. Deallocated memory is returned to its original source, either the stack or the heap, and is available to be reused for different data.

When an operating system (OS) runs a program, it must first load the program into main memory. Memory is used both for the program's machine instructions and for the data that the program uses. At the time that Figure 1 was created, computers typically used a memory allocation technique called segmented memory. When the OS loads a program on a segmented-memory computer, it allocates to the program a contiguous block or segment of memory and the program can use no more memory than this. As the program's memory was limited, it was necessary to divide it into regions that performed specific functions within the program. Although this memory management technique is now obsolete, programs continue to organize their memory based on the functional units illustrated here.

Contemporary systems are based on a more flexible memory allocation technique called paged memory. These systems manage memory dynamically; that is, the amount of memory allocated to a program is allowed to increase and decrease as the program's needs change. Memory is allocated to the program and reclaimed by the OS in fixed-size chunks called pages. When the OS loads a program on a paged-memory computer, it initially allocates a minimal number of pages to the program, but the OS will supply additional memory as it is needed, and that memory is added to either the heap, the stack, or the machine instructions area. Any machine code or data that are not immediately needed are not initially loaded, and pages of memory containing machine instructions or data that have not been used recently may be returned to the OS to be reallocated to other programs that currently need more memory. Although Figure 1 no longer represents the physical layout of memory, it accurately represents the functional or logical organization of program memory.

Image description: An abstract representation of the functional units of memory managed by a running program. The units include the heap, the stack, memory to hold any global or static variables, and the text segment that holds the machine instructions.
Figure 1. The functional memory organization of a running program.
  1. The text area contains the program's machine instructions (i.e., the executable code).
  2. Global variables are defined in global scope outside of any function or object; static variables have the keyword static included as part of their definition. The memory that holds global and static variables is typically allocated at program startup.
  3. The space illustrated here was actually allocated but unused on a segmented-memory system, and provided space into which the stack and heap could grow. On a paged-memory system this space is not actually allocated but signifies that there is space available for stack and heap growth.
  4. The stack (sometimes called the runtime stack) contains all of the automatic (i.e., local, non-static) variables.
  5. Memory is allocated from and returned to the heap with the new and delete operators, respectively.
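
The regions listed above can be connected to ordinary C++ definitions. The following is a minimal sketch, not taken from the figure: the names global_limit and describe_regions are hypothetical, and the exact placement of each variable is ultimately up to the compiler and runtime.

```cpp
int global_limit = 100;            // global: data area, allocated at program startup

// Hypothetical function used only to show where each kind of variable lives.
int describe_regions()
{
    static int base = 1;           // static: same data area as the globals
    int local = 5;                 // automatic: allocated on the stack when the function is entered
    int* dynamic = new int(37);    // the int lives on the heap; the pointer itself is automatic

    int result = global_limit + base + local + *dynamic;   // 100 + 1 + 5 + 37

    delete dynamic;                // return the heap memory explicitly
    return result;                 // local and dynamic are released automatically on return
}
```

Calling describe_regions() returns 143. The machine instructions for the function itself reside in the text area.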

It's easy to get confused by the term "allocate" when we talk about a C++ program in the context of hardware and the operating system. The operating system, which is tasked with managing all of a computer's resources - including main memory - allocates physical memory to a running program in pages, but this operation is completely transparent to and beyond the control of programmers.

On the other hand, programmers can request additional memory for program data. The OS provides a host environment in which a program runs. Among the services that the OS provides, in the form of runtime code that is linked with our C++ code, are the memory managers that implement the stack and the heap. When a programmer requests additional memory, that memory is allocated either from the stack (by defining an automatic variable) or from the heap (using the "new" operator).

For our discussions, memory allocated to a running program comes from either the stack or the heap, and deallocated memory is returned to the structure from which it was originally allocated.

The important concept that we can draw from Figure 1 is that a running program maintains its data in one of three logical regions of memory. The memory for global and static variables is allocated when the program is first loaded in memory for execution and is not deallocated until the program terminates. Values stored in these variables do not change unless the programmer explicitly changes them (barring changes made by systems outside of the program). The other two regions, the stack and the heap, are more dynamic: the memory is allocated to the program and deallocated as needed. The differences between these two regions are the algorithms that manage the memory and therefore how they behave.
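
The difference in lifetime between static and automatic variables can be observed directly. This is a small sketch with hypothetical function names; it assumes nothing beyond standard C++.

```cpp
// Hypothetical counter using static storage.
int count_with_static()
{
    static int count = 0;   // memory set aside once; the value persists between calls
    return ++count;
}

// Hypothetical counter using automatic storage.
int count_with_automatic()
{
    int count = 0;          // allocated on the stack at every call, released at return
    return ++count;
}
```

Successive calls to count_with_static() return 1, 2, 3, and so on, because the variable is never deallocated while the program runs. Every call to count_with_automatic() returns 1, because a fresh variable is allocated and initialized each time.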


A stack is a simple last-in, first-out (LIFO) data structure that is often presented to new computer scientists as a stack of plates in a cafeteria. In the cafeteria example, plates are removed from or added to the stack only at the top. One of our first actions upon entering the cafeteria is to pop a plate off the top of the stack of plates before we go through the line. Similarly, the dishwashers push each clean plate on top of the stack one at a time. In this way, the last clean plate pushed on the stack is the first plate that a customer pops off of the top. Inserting or removing a plate from the middle of the stack is not permitted. Stacks must support at least two operations: push and pop; other operations are possible but are not required.
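
The cafeteria example can be sketched with the standard library's std::stack. The helper push_then_pop is a hypothetical name chosen for illustration; it pushes every plate and then pops them all, recording the order in which they come off.

```cpp
#include <stack>
#include <string>
#include <vector>

// Hypothetical helper: pushes the plates in order, then pops them all,
// returning the order in which they leave the stack.
std::vector<std::string> push_then_pop(const std::vector<std::string>& plates)
{
    std::stack<std::string> pile;
    for (const std::string& plate : plates)
        pile.push(plate);               // push: add at the top

    std::vector<std::string> removed;
    while (!pile.empty()) {
        removed.push_back(pile.top());  // top: look at the most recently pushed item
        pile.pop();                     // pop: remove it from the top
    }
    return removed;
}
```

Pushing "red", "green", and "blue" in that order yields "blue", "green", "red" when popped: the last plate pushed is the first one removed.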

Image description (first animation): Items being pushed on to and popped off of a stack. Four colored rectangles are physically moved, one at a time, to the top of the stack (that is, they are piled on top of each other), and then the rectangles are removed one at a time from the top of the stack.
Image description (second animation): Nothing moves physically; instead, memory values change, which is represented by changing the colors of rectangles within the stack area (the colors change from bottom to top to represent the push operation and then from top to bottom to represent the pop operation).
Figure 2. Demonstrations of stack behavior.
  1. Items (e.g., plates) being pushed on to and popped off of a stack. The items are popped off in the reverse of the order in which they were pushed on to the stack.
  2. Nothing really moves physically when data items (variables or objects) are pushed on to or popped off of a stack in memory. Only the values stored in the memory managed by the stack are changed, which this version of a stack attempts to illustrate. It's possible and common to intermix push and pop operations.

Both Figure 2 animations oversimplify stacks by showing that all of the elements stored on the stack are the same size. The data items pushed on the runtime stack may be any convenient size. Nevertheless, stacks are somewhat rigid in that they only allow access at the top. But this rigidity also makes stacks easy to implement, and makes the push and pop operations fast and efficient.
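
The runtime stack's last-in, first-out behavior is visible in C++ itself: automatic objects are destroyed in the reverse of the order in which they were defined. The sketch below uses a hypothetical Tracer type that records when it is created and destroyed. It shows the ordering the language guarantees, not the actual memory layout, which is left to the compiler.

```cpp
#include <string>
#include <vector>

// Hypothetical tracer: appends a message to a log when constructed and destroyed.
struct Tracer {
    std::string name;
    std::vector<std::string>& log;

    Tracer(const std::string& n, std::vector<std::string>& l) : name(n), log(l)
    {
        log.push_back("push " + name);
    }
    ~Tracer() { log.push_back("pop " + name); }
};

// Defines three automatic objects in an inner scope and returns the log.
std::vector<std::string> trace_scope()
{
    std::vector<std::string> log;
    {
        Tracer a("a", log);
        Tracer b("b", log);
        Tracer c("c", log);
    }   // leaving the scope destroys c, then b, then a
    return log;
}
```

The returned log reads: push a, push b, push c, pop c, pop b, pop a.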


In contrast to the stack, the heap is much more flexible: access is not restricted to a single location within the heap. The memory managed by the heap may be allocated to the program from anywhere in the heap, even if that memory is in the middle of the heap (i.e., in the middle of previously allocated memory). How is it possible to have a "hole" (i.e., unallocated memory) in the middle of the heap? The program can return memory to the heap whenever it is convenient to do so. So, unlike a stack, which is required to deallocate memory in the reverse order in which it was allocated, the program may return memory to the heap without regard to the original order of allocation. The easiest way to see the difference between the stack and the heap behaviors is to compare the second animation in Figure 2 with Figure 3.
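
A short sketch of returning heap memory out of order follows. The function name free_in_any_order is hypothetical, and whether the three allocations are actually adjacent in memory depends on the heap manager, so the "hole" here is conceptual.

```cpp
// Hypothetical demonstration: three heap allocations released out of order.
int free_in_any_order()
{
    int* first  = new int(1);
    int* second = new int(2);
    int* third  = new int(3);

    delete second;               // the middle allocation is returned first
    second = nullptr;            // the pointer no longer refers to usable memory

    int sum = *first + *third;   // the other two allocations are unaffected

    delete third;                // the remaining memory is returned in no particular order
    delete first;
    return sum;
}
```

The function returns 4. Nothing comparable is possible with automatic variables, whose memory is always released in the reverse order of allocation.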

Image description: The heap is abstractly represented as a rectangle. Colored rectangles are inserted to simulate memory being allocated on the heap. Deallocation is simulated by removing a colored block. Two adjacent colored blocks are initially still separated by a line; coalescing the blocks into a single block is demonstrated by removing the separating line.
Figure 3. Demonstration of heap behavior. Memory is allocated by returning the first block large enough to satisfy the request. Memory is returned or freed in any convenient order. When two adjacent blocks of allocated memory are freed, they are coalesced to form a single block to better meet demand for larger blocks of memory. A request for a large block (twice the size of the colored blocks) is illustrated with the cross-hatched block.

There is only one restriction on the memory that is allocated to the program from the heap: it must form a contiguous block large enough to satisfy the request with a single chunk of memory. This one restriction increases the complexity of a heap in at least two ways: First, the code that carries out the allocation operation must scan the heap until it finds a contiguous block of memory that is large enough to satisfy the request. Second, when memory is returned to the heap, adjacent freed blocks must be coalesced to better accommodate future requests for large blocks of memory. The heap's increased complexity means that managing memory with a heap is slower and less efficient than doing so with a stack. But a heap also has advantages that justify the increased overhead. The next section explores some of those advantages and illustrates how the stack and the heap work together to manage complex data in a running program.
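
The scan-and-coalesce behavior described above can be modeled with a toy heap. This is only a sketch under simplifying assumptions: ToyHeap, Block, allocate, and release are hypothetical names, the heap is a list of blocks measured in abstract units, and no real runtime is claimed to implement new and delete this way.

```cpp
#include <cstddef>
#include <vector>

// One region of the toy heap (hypothetical type).
struct Block {
    std::size_t offset;   // where the block starts within the heap
    std::size_t size;     // how many units it covers
    bool used;            // true while allocated to the program
};

class ToyHeap {
public:
    explicit ToyHeap(std::size_t total) { blocks.push_back({0, total, false}); }

    // First fit: scan from the start for a free block that is large enough.
    // Returns the block's offset, or -1 if no single contiguous block fits.
    long allocate(std::size_t size)
    {
        for (std::size_t i = 0; i < blocks.size(); ++i) {
            if (blocks[i].used || blocks[i].size < size)
                continue;
            if (blocks[i].size > size) {
                // Split: the remainder stays free, immediately after the new block.
                Block rest{blocks[i].offset + size, blocks[i].size - size, false};
                blocks[i].size = size;
                blocks.insert(blocks.begin() + i + 1, rest);
            }
            blocks[i].used = true;
            return static_cast<long>(blocks[i].offset);
        }
        return -1;
    }

    // Free the block at the given offset and coalesce it with free neighbors.
    void release(long offset)
    {
        for (std::size_t i = 0; i < blocks.size(); ++i) {
            if (!blocks[i].used || static_cast<long>(blocks[i].offset) != offset)
                continue;
            blocks[i].used = false;
            if (i + 1 < blocks.size() && !blocks[i + 1].used) {   // merge with the next block
                blocks[i].size += blocks[i + 1].size;
                blocks.erase(blocks.begin() + i + 1);
            }
            if (i > 0 && !blocks[i - 1].used) {                   // merge with the previous block
                blocks[i - 1].size += blocks[i].size;
                blocks.erase(blocks.begin() + i);
            }
            return;
        }
    }

private:
    std::vector<Block> blocks;   // kept in address order
};
```

With a ToyHeap of 8 units filled by four 2-unit allocations, a request for 4 units fails. It still fails after one 2-unit block is released, because no contiguous 4-unit block exists. Once the adjacent block is also released, the two free blocks are coalesced and the 4-unit request succeeds. Both the scan in allocate and the merging in release are work that a stack never has to do.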