"To allocate" means to assign, allot, distribute, or "set apart for a particular purpose." Programs manage their memory by partitioning or dividing it into different units that perform specific tasks. Two of those units are the stack and the heap, which manage the program's unused memory and allocate it for different kinds of data or variables. When the memory is no longer needed, it may be deallocated. Deallocated memory is returned to its original source, either the stack or the heap, and is available to be reused for different data.
When an operating system (OS) runs a program, it must first load the program into main memory. Memory is used both for the program's machine instructions and for the data that the program uses. At the time that Figure 1 was created, computers typically used a memory allocation technique called segmented memory. When the OS loaded a program on a segmented-memory computer, it allocated to the program a contiguous block or segment of memory, and the program could use no more memory than this. Because the program's memory was limited, it was necessary to divide it into regions that performed specific functions within the program. Although this memory management technique is now obsolete, programs continue to organize their memory based on the functional units illustrated here.
Contemporary systems are based on a more flexible memory allocation technique called paged memory. These systems manage memory dynamically; that is, the amount of memory allocated to a program is allowed to increase and decrease as the program's needs change. Memory is allocated to the program and reclaimed by the OS in fixed-size chunks called pages. When the OS loads a program on a paged-memory computer, it initially allocates a minimal number of pages to the program, but the OS will supply additional memory as it is needed, and that memory is added to either the heap, the stack, or the machine instructions area. Any machine code or data that are not immediately needed are not initially loaded, and pages of memory containing machine instructions or data that have not been used recently may be returned to the OS to be reallocated to other programs that currently need more memory. Although Figure 1 no longer represents the physical layout of memory, it accurately represents the functional or logical organization of program memory.
It's easy to get confused by the term "allocate" when we talk about a C++ program in the context of hardware and the operating system. The operating system, which is tasked with managing all of a computer's resources - including main memory - allocates physical memory to a running program in pages, but this operation is completely transparent to and beyond the control of programmers.
On the other hand, programmers can request additional memory for program data. The OS provides a host environment in which a program runs. One of the services that the OS provides, in the form of runtime code that is linked with our C++ code, is the pair of memory managers that implement the stack and the heap. When a programmer requests additional memory, that memory is allocated either from the stack (by defining an automatic variable) or from the heap (using the "new" operator).
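The two kinds of allocation requests described above look like this in C++ (the function name `stack_and_heap` is just illustrative):

```cpp
// Contrasting the two ways a program requests memory for data.
int stack_and_heap() {
    int automatic = 10;         // allocated from the stack; reclaimed
                                // automatically when the function returns
    int* dynamic = new int(20); // allocated from the heap by "new"; must be
                                // returned explicitly with "delete"
    int sum = automatic + *dynamic;
    delete dynamic;             // deallocation: memory goes back to the heap
    return sum;                 // 30
}
```

Notice the asymmetry: the stack allocation is deallocated for us, but the heap allocation remains ours to manage until we delete it.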
For our discussions, memory allocated to a running program comes from either the stack or the heap, and deallocated memory is returned to the structure from which it was originally allocated.
The important concept that we can draw from Figure 1 is that a running program maintains its data in one of three logical regions of memory. The memory for global and static variables is allocated when the program is first loaded in memory for execution and is not deallocated until the program terminates. Values stored in these variables do not change unless the programmer explicitly changes them (barring changes made by external - outside of the program - systems). The other two regions, the stack and the heap, are more dynamic: memory is allocated to the program and deallocated as needed. The two regions differ in the algorithms that manage their memory and, consequently, in how they behave.
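A short sketch of the lifetimes involved (the names `global_count` and `next_id` are illustrative): the global and the static local live for the whole run of the program, while the automatic variable is allocated on the stack anew for each call.

```cpp
int global_count = 0;      // global: allocated at program load, lives until exit

int next_id() {
    static int id = 100;   // static local: same lifetime as a global,
                           // initialized only once
    int offset = 1;        // automatic: allocated on the stack at each call,
                           // deallocated when the function returns
    id += offset;
    ++global_count;
    return id;
}
```

Calling `next_id()` twice returns 101 and then 102, because `id` keeps its value between calls while `offset` is created and destroyed each time.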
A stack is a simple last-in, first-out (LIFO) data structure that is often presented to new computer scientists as a stack of plates in a cafeteria. In the cafeteria example, plates are removed from or added to the stack only at the top. One of our first actions upon entering the cafeteria is to pop a plate off the top of the stack of plates before we go through the line. Similarly, the dishwashers push each clean plate on top of the stack one at a time. In this way, the last clean plate pushed on the stack is the first plate that a customer pops off of the top. Inserting or removing a plate from the middle of the stack is not permitted. Stacks must support at least two operations: push and pop; other operations are possible but are not required.
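The cafeteria analogy maps directly onto the standard library's stack adapter; this small sketch (the function name `pop_order` is illustrative) pushes three "plates" and pops them to show the LIFO order:

```cpp
#include <stack>
#include <string>

// Push plates A, B, C, then pop until empty: the last plate pushed
// is the first plate popped.
std::string pop_order() {
    std::stack<char> plates;
    plates.push('A');
    plates.push('B');
    plates.push('C');

    std::string order;
    while (!plates.empty()) {
        order += plates.top(); // only the top is accessible
        plates.pop();
    }
    return order;              // "CBA": last in, first out
}
```

As the text notes, push and pop are the only operations a stack must support; `std::stack` deliberately offers little more than these plus `top` and `empty`.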
Both Figure 2 animations oversimplify stacks by showing that all of the elements stored on the stack are the same size. The data items pushed on the runtime stack may be any convenient size. Nevertheless, stacks are somewhat rigid in that they only allow access at the top. But this rigidity also makes stacks easy to implement, and makes the push and pop operations fast and efficient.
In contrast to the stack, the heap is much more flexible: access is not restricted to a single location within the heap. The memory managed by the heap may be allocated to the program from anywhere in the heap, even from the middle (i.e., from the middle of previously allocated memory). How is it possible to have a "hole" (i.e., unallocated memory) in the middle of the heap? The program can return memory to the heap whenever it is convenient to do so. So, unlike a stack, which is required to deallocate memory in the reverse order in which it was allocated, the program may return memory to the heap without regard to the original order of allocation. The easiest way to see the difference between the stack and the heap behaviors is to compare Figure 2 (b) with Figure 3.
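Out-of-order deallocation is easy to demonstrate. In this sketch (the function name is illustrative), `b` is freed while `a` and `c` are still live, leaving a hole between them; whether a later, smaller allocation such as `d` actually reuses that hole is up to the heap manager:

```cpp
// Heap memory may be returned in any order, unlike stack memory.
double out_of_order_demo() {
    double* a = new double[1000];
    double* b = new double[1000];
    double* c = new double[1000];
    a[0] = 1.0; b[0] = 2.0; c[0] = 3.0;

    delete[] b;                  // freed first, out of allocation order:
                                 // this leaves a "hole" between a and c

    double* d = new double[500]; // a smaller request the hole could satisfy
    d[0] = 4.0;

    double sum = a[0] + c[0] + d[0];
    delete[] a;                  // the remaining blocks may also be freed
    delete[] d;                  // in any convenient order
    delete[] c;
    return sum;                  // 8.0
}
```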
There is only one restriction on the memory that is allocated to the program from the heap: it must be a single contiguous block large enough to satisfy the request. This one restriction increases the complexity of a heap in at least two ways: First, the code that carries out the allocation operation must scan the heap until it finds a contiguous block of memory that is large enough to satisfy the request. Second, when memory is returned to the heap, adjacent freed blocks must be coalesced to better accommodate future requests for large blocks of memory. The heap's increased complexity means that managing memory with a heap is slower and less efficient than doing so with a stack. But a heap also has advantages that justify the increased overhead. The next section explores some of those advantages and illustrates how the stack and the heap work together to manage complex data in a running program.
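The scan-and-coalesce behavior can be modeled with a toy free list. This sketch assumes a first-fit policy; the names `ToyHeap`, `alloc`, and `release`, and the representation of the heap as a list of blocks, are illustrative simplifications, not how a real heap manager is implemented:

```cpp
#include <vector>
#include <cstddef>

struct Block { std::size_t offset, size; bool free; };

struct ToyHeap {
    std::vector<Block> blocks;
    explicit ToyHeap(std::size_t total) { blocks.push_back({0, total, true}); }

    // First fit: scan for the first free block large enough to satisfy
    // the request, splitting it if it is larger than needed.
    long alloc(std::size_t n) {
        for (std::size_t i = 0; i < blocks.size(); ++i) {
            if (blocks[i].free && blocks[i].size >= n) {
                if (blocks[i].size > n) {   // split off the unused remainder
                    blocks.insert(blocks.begin() + i + 1,
                        {blocks[i].offset + n, blocks[i].size - n, true});
                    blocks[i].size = n;
                }
                blocks[i].free = false;
                return static_cast<long>(blocks[i].offset);
            }
        }
        return -1; // no single contiguous block is large enough
    }

    // Mark a block free and coalesce it with any adjacent free blocks,
    // so that future large requests can be satisfied.
    void release(std::size_t offset) {
        for (std::size_t i = 0; i < blocks.size(); ++i) {
            if (blocks[i].offset == offset) {
                blocks[i].free = true;
                if (i + 1 < blocks.size() && blocks[i + 1].free) {
                    blocks[i].size += blocks[i + 1].size;   // merge right
                    blocks.erase(blocks.begin() + i + 1);
                }
                if (i > 0 && blocks[i - 1].free) {
                    blocks[i - 1].size += blocks[i].size;   // merge left
                    blocks.erase(blocks.begin() + i);
                }
                return;
            }
        }
    }
};
```

For example, in a 100-unit heap with two 40-unit allocations live, a third 40-unit request fails even though 20 free units remain, because they are not contiguous with a large enough block; after releasing both allocations, coalescing restores a single 100-unit block.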