7.13. Two-Dimensional And Higher Arrays


C++ doesn't limit the number of array dimensions programs can create and use. Creating arrays with two or more dimensions as automatic or local variables on the stack is straightforward. However, it is surprisingly more difficult to create them dynamically on the heap with the new operator. Some seemingly natural ways of dynamic array allocation do not compile.

(a) One dimension, automatic (local, stack):
int scores[15];

(b) One dimension, dynamic (heap):
int* scores = new int[15];

(c) Two dimensions, automatic (local, stack):
int scores[15][10];

(d) Two dimensions, dynamic (heap) - syntax errors:
int* scores = new int[15][10];		// (1)
int** scores = new int[15][10];		// (2)
int* scores[50] = new int[15][10];	// (3)

(e) Two dimensions, dynamic (heap) - runtime errors:
int** scores = new int* [nrows] { new int[ncols] };			// array initialization syntax
int** scores = (int **) new int[nrows][ncols];				// typecast
int** scores = reinterpret_cast<int **>(new int[nrows][ncols]);		// typecast

Automatic vs. dynamic array allocation. Programs must specify the size of each dimension with a compile-time constant when creating an array automatically on the stack. However, they can use variables when creating them dynamically on the heap. The benefits of this flexibility motivate our exploration of dynamic multi-dimensional array syntax.
  a. The syntax for creating a one-dimensional array automatically as a local or stack variable.
  b. The syntax for creating a one-dimensional array dynamically with the new operator as a heap variable.
  c. The syntax for creating a two-dimensional array automatically as a local or stack variable.
  d. Incorrect statements that fail to create a dynamic two-dimensional array.
    1. The new operator returns a single pointer, seemingly justifying defining scores as a pointer with one level of indirection, but compilation fails with the diagnostic "cannot convert from 'int (*)[10]' to 'int *'."
    2. The diagnostic suggests that scores needs another level of indirection, making it an int**, but compiling the modified statement fails with a similar diagnostic "cannot convert from 'int (*)[10]' to 'int **'."
    3. The final version also fails with "cannot convert from 'int (*)[10]' to 'int *[10]'."
  e. Programmers can force the dynamic allocation statements to compile with "creative" initialization syntax or typecasts, only to see them fail with runtime errors.
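
The diagnostics in (d) name the type that the new expression actually produces: int (*)[10], a pointer to an array of 10 integers. As a minimal sketch, assuming the second dimension remains a compile-time constant, a program can define scores with exactly that type. The function make_scores, the alias Row, and the fill pattern are illustrative names, not part of the figure.

```cpp
#include <cassert>

const int ncols = 10;           // second dimension: a compile-time constant

using Row = int[ncols];         // one row: an array of ncols ints

// Allocates nrows rows of ncols ints and sets element [i][j] to i * j.
// The caller releases the whole block with a single delete[].
Row* make_scores(int nrows)
{
    assert(nrows > 0);

    // The parentheses matter: scores is a pointer to an array of ncols ints,
    // not an array of ncols pointers.
    int (*scores)[ncols] = new int[nrows][ncols];

    for (int i = 0; i < nrows; i++)
        for (int j = 0; j < ncols; j++)
            scores[i][j] = i * j;

    return scores;
}
```

A caller might write Row* scores = make_scores(15); and later delete[] scores;. Only the first dimension can be a variable in this form.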

The failures of (d) and (e) notwithstanding, programs can create multi-dimensional arrays dynamically. The following discussion presents three solutions:

Automatic Type Deduction

The C++11 standard's redefinition of the auto keyword affords a partial solution to the multi-dimensional dynamic array problem. While the solution is limited, its simplicity makes it the favored approach if the program can function within the limitation.

#include <iostream>
#include <iomanip>
using namespace std;

int main()
{
	int nrows = 15;
	const int ncols = 10;

	auto scores = new int[nrows][ncols];

	for (int i = 0; i < nrows; i++)
		for (int j = 0; j < ncols; j++)
			scores[i][j] = i * j;

	for (int i = 0; i < nrows; i++)
	{
		for (int j = 0; j < ncols; j++)
			cout << setw(5) << scores[i][j];
		cout << endl;
	}

	delete[] scores;		// one delete[] releases the whole array

	return 0;
}
auto detecting a two-dimensional array type. When a program defines a variable with the auto keyword, the compiler deduces the variable's type from its initializer, circumventing the need for a programmer to determine the appropriate type. Here, the deduced type is int (*)[10], a pointer to an array of 10 integers, the same type named in the diagnostics of Figure 1 (d). This technique resolves some of the problems presented in Figure 1 (d) and (e). The solution allows programmers to use a variable for the first array dimension but still requires compile-time constants for the second and subsequent dimensions.
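
To check what auto deduces, the following sketch compares the deduced type with the pointer-to-array type by using std::is_same from <type_traits>. The function name row_bytes is hypothetical and exists only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

const int ncols = 10;           // must remain a compile-time constant

// Creates the array with auto, verifies the deduced type at compile time,
// and returns the size in bytes of one row.
std::size_t row_bytes(int nrows)
{
    assert(nrows > 0);

    auto scores = new int[nrows][ncols];    // nrows may be a variable

    // auto deduces "pointer to an array of ncols ints," not int* or int**.
    static_assert(std::is_same<decltype(scores), int (*)[ncols]>::value,
        "scores is a pointer to an array of ncols ints");

    // The row length is part of the type, so scores[0] is a whole row.
    std::size_t bytes = sizeof(scores[0]);

    delete[] scores;                        // one delete[] releases every row
    return bytes;
}
```

Because the row length is part of the deduced type, the compiler must know ncols at compile time, which is the limitation described above.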

Creating Two-Dimensional Arrays As An Array Of Arrays

Java takes a different approach to creating multi-dimensional arrays: it creates arrays of arrays. Specifically, creating a two-dimensional array begins by creating a one-dimensional array whose elements are themselves one-dimensional arrays. C++ programs can take the same approach, which has two advantages: each dimension can be a variable, and the element access notation, table[row][col], is straightforward. However, it has two disadvantages: creating the arrays is a multi-step process, and the program must deallocate every array to avoid a memory leak.

The picture consists of a pointer variable named 'table,' a vertical array with 'nrows' elements, and several horizontal arrays denoting rows, each with 'ncols' elements. An arrow points from 'table' to the vertical array, and each element of the vertical array points to a row.
#include <iostream>
#include <iomanip>
using namespace std;

int main()
{
    int    nrows;
    int    ncols;

    cout << "Number of rows: ";
    cin >> nrows;
    cout << "Number of columns: ";
    cin >> ncols;

    int**    table = new int* [nrows];		  // (1)

    for (int i = 0; i < nrows; i++)		  // (2)
        table[i] = new int[ncols];

    for (int row = 1; row <= nrows; row++)
        for (int col = 1; col <= ncols; col++)
            table[row-1][col-1] = row * col;	  // (3)


    for (int row = 0; row < nrows; row++)
    {
        for (int col = 0; col < ncols; col++)
            cout << setw(4) << table[row][col];  // (3)
        cout << endl;
    }
    
    for (int i = 0; i < nrows; i++)		  // (4)
        delete[] table[i];
    delete[] table;

    return 0;
}
(a)(b)
The multtab example implemented as an array of arrays.
  1. table is a pointer to a pointer (i.e., it's a pointer with two levels of indirection, as described in item 1 below). table points to an array of row pointers. Each row pointer points to an array that serves as one table row.
    1. The program defines table as int**, meaning that it is a pointer to a pointer. The data type int* in the new expression means that each element of the array is a pointer to an integer, here the first integer of a row.
    2. The program must create each row of the table one at a time.
    3. One of the advantages of creating a two-dimensional array as an array of arrays is that the "client" or main-logic code can continue to use the two-index notation: table[row][col].
    4. The program must delete each row one at a time. After deleting the rows, the program deletes the array of pointers. The square brackets, [], indicate that delete is operating on an array.
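
Steps (1), (2), and (4) must always stay in step with one another, so one option is to package them in a pair of helper functions. The following is a minimal sketch under that assumption; the names create_table and destroy_table are hypothetical and do not appear in the example above.

```cpp
#include <cassert>

// Steps (1) and (2): allocate the array of row pointers, then each row.
int** create_table(int nrows, int ncols)
{
    assert(nrows > 0 && ncols > 0);

    int** table = new int*[nrows];
    for (int i = 0; i < nrows; i++)
        table[i] = new int[ncols]();    // () sets every element to 0
    return table;
}

// Step (4): release each row first, then the array of row pointers.
void destroy_table(int** table, int nrows)
{
    for (int i = 0; i < nrows; i++)
        delete[] table[i];
    delete[] table;
}
```

Client code keeps the two-index notation, for example table[row][col] = row * col;, and calls destroy_table(table, nrows) once it no longer needs the array.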

Creating Two-Dimensional Arrays With Row-Major Ordering

This technique reverses the advantages and disadvantages of the array-of-arrays solution. Creating the array takes a single new expression, and destroying it takes a single delete[], but using the array is more awkward because the program must map every row and column pair to a single index.

Now that we have an idea of how row-major ordering works - how it converts two indexes into one index - we can use it to create a one-dimensional array that behaves like a two-dimensional array. The "trick" is a simple expression that consistently combines the row and column to produce the same, unique single array index value. For clarity, the following example evaluates the expression in an inline function.

#include <iostream>
#include <iomanip>
using namespace std;


inline int index(int row, int col, int ncols) { return row * ncols + col; }	// (a)

int main()
{
	int	nrows;
	int	ncols;

	cout << "Number of rows: ";
	cin >> nrows;
	cout << "Number of columns: ";
	cin >> ncols;

	int*	table = new int[nrows * ncols];					// (b)

	for (int row = 1; row <= nrows; row++)
		for (int col = 1; col <= ncols; col++)
			table[index(row - 1, col - 1, ncols)] = row * col;	// (c)


	for (int row = 0; row < nrows; row++)
	{
		for (int col = 0; col < ncols; col++)
			cout << setw(4) << table[index(row, col, ncols)];	// (c)
		cout << endl;
	}

	delete[] table;								// release the heap array

	return 0;
}
The multtab example implemented with row-major mapping. This example uses row-major mapping to simulate a two-dimensional array with a one-dimensional array. To create arrays of other types, replace the element type int in statement (b) with another type name.
  a. The index function evaluates the row-major expression to map a row × column address into a single array index. I believe that wrapping the row-major calculation in a function has two advantages. First, the function is less error-prone than the bare expression when the indexes are themselves complex expressions, and the advantage grows with their complexity. Second, it provides a general solution in programs with multiple simulated two-dimensional arrays. index is an ideal candidate for inlining: an authentic program calls it frequently, inlining avoids the overhead of a call and return, and, for a body this small, the generated machine code is typically no larger than a call to a "regular" or non-inline function.
  b. Creates a one-dimensional array large enough to hold all the rows and columns of the simulated two-dimensional array.
  c. Whenever an element of the simulated two-dimensional array is needed, the index function maps the row and column indexes into the single value needed to index into the actual one-dimensional array. However, the index function is not necessary - it just reflects my personal preference. We can easily embed the mapping operation directly in each array access:
    table[row * ncols + col]
Ideally, we would like our code to "hide" the row × column ⇒ index mapping, (c), from the client code. We can easily implement the desired hiding by wrapping the array logic in a class.
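
One way such a class might look is sketched below. The class name Table, its member names, and the choice of operator() for element access are assumptions made for illustration. Copying is disabled only to keep the sketch short.

```cpp
#include <cassert>

// Hides the row-major mapping behind a two-argument access operator.
class Table
{
    public:
        Table(int r, int c) : nrows(r), ncols(c), data(new int[r * c]()) {}
        ~Table() { delete[] data; }     // the destructor releases the array

        // Two copies would delete the same array, so copying is disabled.
        Table(const Table&) = delete;
        Table& operator=(const Table&) = delete;

        // Client code writes t(row, col) in place of table[row * ncols + col].
        int& operator()(int row, int col)
        {
            assert(row >= 0 && row < nrows && col >= 0 && col < ncols);
            return data[row * ncols + col];
        }

        int rows() const { return nrows; }
        int cols() const { return ncols; }

    private:
        int     nrows;
        int     ncols;
        int*    data;       // one-dimensional array of nrows * ncols elements
};
```

With this sketch, client code creates the table with Table t(nrows, ncols); and fills it with t(row, col) = (row + 1) * (col + 1);. The mapping expression appears in one place only.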

Although creating a simulated two-dimensional array is easy, and destroying it takes only a single delete[] (the program makes it on the heap with new), using the array is a little awkward and unnatural. Whenever we want to access an array element, we must translate the row×column address into a single array index using the row-major ordering. The array-of-arrays technique presented earlier avoids this translation at the cost of a more complex setup.