4.2 Variables And Memory Addresses


Pointers are powerful, useful, and ubiquitous, so although they can be conceptually challenging, learning them, their associated syntax, and how to use them to solve problems is well worth the time and effort. To understand pointers and the operations that can take place with them, we must first understand a little about computer memory and how programs locate variables in memory. Contemporary computer systems typically have many gigabytes of memory, and each byte of memory has a unique address. We can think of main memory or RAM as an extensive array of bytes and the addresses as the indexes of each array element or byte within the array. Most variables stored in the array (i.e., in main memory) are larger than one byte, so the address of each variable is the index of the first byte of that variable.

An abstract illustration of main memory as an array of bytes indexed from 0 to n-1, where n is the number of bytes of RAM installed in the computer. Four variables viewed as adjacent bytes of memory. Each variable spans four bytes, so the addresses of the variables are four bytes apart. The first variable is at index location 316, the second at 320, the third at 324, and the last at 328.
(a)(b)
Viewing main memory as an array of bytes.
  1. Main memory, often called RAM, can be visualized as a contiguous array of bytes. An address is equivalent to an index in the memory array.
  2. Most C++ data types span multiple bytes of memory. Illustrated here is a sequence of 4-byte variables; depending on the compiler and the underlying hardware, they might be ints, longs, or floats. The address of multi-byte variables is the address or array index of the first byte. (This simplification ignores the big-endian vs. little-endian problem.)

The amount of memory in a computer and how the computer addresses it are physical properties of the hardware. In contrast, the variables residing in memory are an abstraction wholly defined by software. Previously, we used houses along a street as a metaphor for the arrangement of variables in memory. Every house has a unique street address that people can use to find and distinguish it from the others on the same street. In much the same way, variables have addresses in main memory. Memory addresses are a hardware property and cannot change, but the data stored in memory - stored in a variable - can change or vary over time.

Four houses along a street. Each house has an address: 316 Elm, 320 Elm, 324 Elm, and 328 Elm. An animation of a house flying over a rainbow.
(a)(b)
Houses as a metaphor for variables.
  1. Like houses, each variable has a unique address, and the addresses increase as you move along the street or through memory.
  2. The contents of a variable, like the occupants of a house, can change over time. But the address of a variable, like the address of a house, is fixed and does not change (unless a tornado picks it up and carries it over a rainbow, which almost never happens).

When we write a program, we name the variables, and when we reason about how the program works, we think in terms of the variable names. But when a program executes, the computer accesses and manipulates all data by their memory addresses. The compiler maps every variable name to a unique memory address and incorporates the address into the machine code. Together, the compiler and the operating system determine the location in memory of each variable. So, a variable is a named location in main memory with three characteristics: a name, contents, and a memory address. How the compiler interprets a variable's name - either as its address or its contents - is determined by where the name appears in a given statement.

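int counter = 123;             // define and initialize counter
counter = 5;                   // l-value: store 5 at counter's address
int balance = counter * 10;    // r-value: load counter's contents
cout << counter << endl;       // r-value: print counter's contents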
(a)(b)(c)
Three characteristics of a variable.
  1. The C++ code that defines and initializes a variable named counter
  2. An abstract representation of how the variable counter appears in main memory:
    1. the variable's name - the compiler maps the name to a location in main memory
    2. the variable's content - the data stored in main memory
    3. the variable's address in main memory (it is customary to write addresses using hexadecimal notation, e.g., 0xff456e)
  3. A variable's name can refer to either its content or its address depending on where the name appears in a statement:
    1. On the left side of the assignment operator, the name counter represents the address of the variable. When a program uses a variable's name as an address, it's called an l-value (so named because l-values often appear on the left side of the assignment operator, which means "store the value at this address").
    2. On the right side of the assignment operator or in an expression (e.g., an output statement), the name counter represents the variable's contents. When a variable name represents the contents of memory, it's called an r-value (named because it often appears on the right side of the assignment operator, which means "load the value stored at this address").
    Whenever the compiler must determine the meaning of a symbol (a variable name, operator, or keyword) based on where it appears in a program (i.e., the symbol's context within a program), it is said to be context sensitive.

Fortunately, when we write a program, we generally use the variable's name, ignoring that it has an address. But this dynamic changes when we work with pointers. A pointer is a variable that stores or contains the address of data, possibly another variable. Pointers allow programmers to access other variables by their addresses. Any programming element (variable, object, function, etc.) stored in memory has an address that can be found and saved in a pointer. We leave unanswered, for now, the question of why we might want to access or save an element's address.

The relation between a pointer, represented by a rectangle, and data, represented by a larger rectangle. An arrow from the pointer to the data suggests that the pointer is pointing to the data.
An abstract representation of a pointer. Programmers often use pictures to help them solve problems. We often represent the addresses saved in pointers as arrows in pictures. The picture illustrates a pointer pointing to some data in memory, which is a graphical way of saying that the pointer's content is the data's address.

The terms pointer and address are sometimes used interchangeably, but while they are closely related, there is an important difference between them. An address is a location in main memory and cannot change. In contrast, a pointer is a kind of variable that holds or stores an address, and the content of a pointer can change.

C++ provides several pointer operators, which allow programmers to define pointer variables, find the addresses of variables, and access data through a pointer. These operators are the subject of the next sections of the text.