7.14.2. Binary Search

array, search, binary search, run order, Big O, key

The binary search algorithm is easy to understand and to implement. Nevertheless, with a run order of O(log n), it is an efficient algorithm for searching arrays for a given element. The goal is to locate a specific value quickly (called the key) in an array or determine that it isn't there. A run order of O(log n) means that a large increase in the data size results in a relatively small increase in the runtime. The one drawback of binary search is that it only operates on a sorted array, making it dependent on the sorting algorithm. Coupled with selection sort, binary search is useful for small arrays or arrays whose contents change infrequently. The next chapter introduces a more efficient sorting algorithm that greatly extends the usefulness of binary search.

A picture of a character array containing the sequence of characters SEETHEQUICKREDFOX Three variables, top, mid, and bottom, index into the array. top is at element C, mid is at I, and bottom is at X. After one iteration of the binary search algorithm, top is now at K, and mid is at R; bottom is unchanged at X. After the second iteration, top is still at K, but mid moves to O and bottom to Q. After the last iteration, top, mid, and bottom are all at Q.
(a)(b)(c)(d)(e)
Binary search. The binary search algorithm divides the array into two equal-length subarrays. It determines whether the key is in the upper or lower subarray, ignores the subarray without the key, and then repeats the division and determination on the subarray with the key. The algorithm continues until it locates the key or determines that it isn't in the array. Significantly, the algorithm doesn't change the array; instead, it manages the subarray with three index variables: The steps illustrate the search takes to locate 'Q' (the key or target) in a simple array of characters:
  1. The original, unsorted array.
  2. The array is sorted, top and bottom are initialized, and the first value of mid is calculated.
  3. The key is in the bottom half of the subarray, so the algorithm updates top and calculates a new value for mid.
  4. The key is in the top half of the subarray, so the algorithm updates bottom and calculates a new value for mid.
  5. The indexes are equal in this example, but that isn't a requirement. The algorithm ends when top and bottom are equal or have crossed each other. If the array has an object matching the key, mid will point to it.

Binary Search With Fundamental Data

array, search, binary search, key
DriverBinary Search
#include <iostream>
using namespace std;


void ssort(char* data, int size);
int  binsearch(char key, char* data, int size);


int main()
{
    char data[] = { 'S','E','E','T','H','E','Q','U',
            'I','C','K','R','E','D','F','O','X' };
    int	size = sizeof(data) / sizeof(char);

    ssort(data, size);
    int index = binsearch('Q', data, size);

    if (index >= 0);
        cout << "key found at index = " << index << endl;
    else
        cout << "key not in array" << endl;

    return 0;
}
int binsearch(char key, char* data, int size)
{
    int top = 0;				// (a)
    int bottom = size - 1;

    do
    {
        int mid = (top + bottom) / 2;		// (b)

        if (key < data[mid])			// (c)
            bottom = mid - 1;
        else if (key > data[mid])
            top = mid + 1;
        else
            return mid;
    } while (top <= bottom);			// (d)

    return -1;					// (e)
}
The binary search function and driver. The driver defines an array of test data and counts its elements. Programs counting the array elements with sizeof must do so in the same scope that defines the array. The driver sorts the array (using the ssort function from the previous section) before calling the binary search function.
  1. Initialize the top and bottom indexes.
  2. Calculate the index of the middle element; integer division truncates in the case of a subarray with an odd size.
  3. Determine which subarray contains the key and update the appropriate index. If it finds the key, the function returns its address.
  4. The search ends without finding the key if the top and bottom indexes meet or cross.
  5. If it doesn't find the key, it returns -1 (an invalid array index).

Binary Search With Object Data

array, search, binary search, associative search, key
Review

Searching for a particular value in an array of fundamental data (e.g., characters or integers) has limited value, but it becomes more useful when the array elements are objects. A common practice is to use one or more fields as the key to sort the array. A program searches the array for a given key and, if it finds a matching object, can access all the data associated with that key (i.e., stored in the object). This operation is called an associative search.

#include <iostream>
using namespace std;

struct student
{
    const  char*  name;
           int    id;
           double gpa;
};

void ssort(student* data, int size);
int  binsearch(int key, student* data, int size);

int main()
{
    student students[] = {
            { "Dilbert", 123, 3.5 },
            { "Wally", 456, 2.0 },
            { "Alice", 987, 3.9 },
            { "Asok", 730, 3.8 },
            { "Catbert", 501, 3.0 },
            { "Pointy Haired Boss", 666, 1.0 },
            { "Dogbert", 111, 4.0 }
        };
    int size = sizeof(students) / sizeof(student);

    ssort(students, size);
    int i = binsearch(501, students, size);

    if (i >= 0)
        cout << i << ": " << students[i].name << " " <<
          students[i].id << " " << students[i].gpa << endl;
    else
        cout << "key not in array" << endl;

    return 0;
}

int binsearch(int key, student* data, int size)
{
    int top = 0;
    int bottom = size - 1;

    do
    {
        int mid = (top + bottom) / 2;
        if (key < data[mid].id)
            bottom = mid - 1;
        else if (key > data[mid].id)
            top = mid + 1;
        else
            return mid;
    } while (top <= bottom);

    return -1;
}
#include <iostream>
#include <cstring>
using namespace std;

struct student
{
    const  char* name;
           int    id;
           double gpa;
};

void ssort(student* data, int size);
int  binsearch(char* key, student* data, int size);

int main()
{
    student students[] = {
            { "Dilbert", 123, 3.5 },
            { "Wally", 456, 2.0 },
            { "Alice", 987, 3.9 },
            { "Asok", 730, 3.8 },
            { "Catbert", 501, 3.0 },
            { "Pointy Haired Boss", 666, 1.0 },
            { "Dogbert", 111, 4.0 }
        };
    int size = sizeof(students) / sizeof(student);

    ssort(students, size);
    int i = binsearch("Dilbert", students, size);

    if (i >= 0)
        cout << i << ": " << students[i].name << " " <<
          students[i].id << " " << students[i].gpa << endl;
    else
        cout << "key not in array" << endl;

    return 0;
}

int binsearch(char* key, student* data, int size)
{
    int top = 0;
    int bottom = size - 1;

    do
    {
        int mid = (top + bottom) / 2;
        if (strcmp(key, data[mid].name) < 0)
            bottom = mid - 1;
        else if (strcmp(key, data[mid].name) > 0)
            top = mid + 1;
        else
            return mid;
    } while (top <= bottom);

    return -1;
}
Searching an array of objects. The examples specify a student structure with three fields or members: (1) the student's name, (2) identification number (ID), and (3) grade point average (GPA). They create an array of student objects, fill each object with test data, sort the array by the key (name or ID), and search for a specific key value. The examples sort the array with the object version of the ssort function described in the previous section. The second example modifies the ssort function to use strcmp as illustrated in the figure.

Programmers must tailor each version of the ssort and binsearch function for a specific key data type. Programmers can generalize the functions with void pointers, but it is awkward. C++ provides a more flexible, easier-to-use solution based on templates. The next chapter describes void pointers and Chapter 13 covers templates.