Large data processing applications often include two related sub-tasks: sorting the elements of an array (e.g., putting an array of names in alphabetical order) and searching for an element in an array (e.g., searching for a specific name in the array). These tasks are common but not so common that programming languages support them directly. However, most languages do provide library functions that perform these tasks. The selection sort and binary search algorithms presented in the last chapter demonstrated searching and sorting with character arrays. But the library functions must be more general - they must work with all data types.
We often study searching and sorting at the same time because many searching algorithms only work when the data is in sorted order and because both operations generally require an ordering function. For example, how the selection sort moves the array elements around to arrange them in sorted order is independent of the element's type. However, it chooses which elements to move by selecting the smallest value and swapping it with the element at the top, and this step is dependent on the element's type. When the array contains fundamental data (e.g., integers and doubles), programmers can use C++ relational operators (<, <=, >, and >=) to select the smallest value. But how do programmers implement that step when the array contains complex objects - instances of structures or classes?
Library programmers create the searching and sorting functions before application programmers create the programs that use them. Applications can specify new data types as structures or classes, but C++ doesn't include native operators that can directly compare objects (instances of structures or classes). How can library programmers make the searching and sorting functions sufficiently general to work with any object? Library programmers solve the problem by requiring application programmers to provide an ordering function.
![]() |
![]() |
![]() |
(a) | (b) | (c) |
Application programs pass ordering functions to the searching and sorting functions by pointer. As a consequence, the library functions control the signature and the overall behavior of the ordering functions. So, the library functions "expect" to pass arguments to the ordering functions by pointer. The ordering function's name is unimportant, but the arguments must be void pointers.
A void pointer, void*
, is C++'s most general data type. The compiler uses data types to allocate memory for data and to interpret its bits. As stored in pointers, addresses are always the same size, but the compiler still uses their type to interpret the data they point to. But the keyword void
makes the data typeless - it has no specific data type. So, a void pointer is a bare address that can point to anything, but it must be cast to a known data type before the data can be extracted and used.
#include <iostream> using namespace std; struct student // (a) { char* name; int id; }; void function1(void* arg) { int data = * (int *)arg; // (b) cout << data << endl; } void function2(void* arg) { char* str = (char *)arg; // (c) cout << str << endl; } |
void function3(void* arg) { student* s = (student *)arg; // (d) cout << s->name << " " << s->id << endl; } int main() { int x = 5; char* c = "Hello, World!"; // (e) student s = { "Alice", 123 }; // (f) function1(&x); function2(c); function3(&s); return 0; } |
(int *)
, and then (2) dereference the pointer, *
. You can edit this function to work with any fundamental data type.(char *)
. This function is a special case used only with C-strings(student *)
, and then each field is accessed with the arrow operator. Edit this function to work with any object, be it an instance of a structure or a class.The last chapter introduced to us the selection sort and the binary search algorithms. While selection sort works, it has a run order of O(n2), which is not very efficient - a small increase in the data size results in a large increase in run time. Quicksort is an efficient and well known algorithm. It usually has a run order of O(n log n), which means that it is usually quite efficient - a small increase in the data size results in a small increase in run time. Binary search is efficient with a run order of O(log n). Both algorithms are ideal candidates for inclusion as library functions, and C implemented them as the qsort and bsearch functions. Although the functions are not object-oriented, C++ inherits them, and C++ programmers may use both.
void qsort(void* data, size_t num, size_t size, int (*order)(const void* e1, const void* e2));
void* bsearch(void* key, void* data, size_t num, size_t size, int (*order)(const void* e1, const void* e2));
cstdlib
(C++) and stdlib.h
(C) header files.
*
. int
and const void* e1, const void* e2
require the ordering function to return an integer and accept two void pointer arguments. Programmers may replace order, e1, and e2 with appropriate names.bsearch only
The complexity and arcane syntax of the library functions are due to a combination of two requirements. The first is the need to make the functions sufficiently general to operate with whatever kind of data the programmer needs to search or sort. Void pointers provide a general data type, but programmers must cast them to an established type before accessing data through them. Furthermore, the general nature of void pointers and the library functions require programmers to provide an ordering function.
The second requirement impacting the complexity and syntax of the searching and sorting functions is the need to pass and return data by pointer. qsort changes the data array by moving the elements around in the array, so the program must pass the data with an INOUT mechanism. C++ inherits qsort and bsearch from C, which does not have pass-by-reference, making pass-by-pointer the only available INOUT mechanism. The library functions pass elements from the data array to the ordering functions, which must dereference their pointer arguments to access the data. bsearch uses the same data arrays and ordering functions as qsort, so it too uses pass-by-pointer. The following examples demonstrate how to write and use ordering functions for different kinds of data.
Each example follows the same basic pattern. It creates an array of example data. It sorts the data and prints it, demonstrating the sorting. Lastly, it creates a key and searches the data array for it. You can modify one of the examples to satisfy most of your searching and sorting needs.
Working with fundamental or primitive data is relatively straightforward. The following example is based on an integer array but will work with any fundamental data type (e.g., char, double, long, etc.).
/* * Sorts and then searches an array of integers. * Demonstrates how to write an ordering function * that compares two integers. */ #include <iostream> #include <cstdlib> using namespace std; int order_int(const void* e1, const void* e2); int main() { int data[10] = { 2, 5, 3, 9, 1, 4, 6, 8, 7, 0 }; qsort(data, 10, sizeof(int), order_int); // (a) for (int i = 0; i < 10; i++) cout << data[i] << endl; int key = 6; // (b) int* found = (int *)bsearch(&key, data, 10, sizeof(int), order_int); // (c) if (found != nullptr) cout << "key found " << *found << endl; else cout << "key not found\n"; return 0; } int order_int(const void* e1, const void* e2) // (d) { return * (int *) e1 - * (int *) e2; }
While the difference between float and double values indicates their relative order, ordering functions must return an int value. Casting the difference to an int is insufficient because casting truncates and discards fractional values. For example, imagine that e1 = 0.8 and e2 = 0.4. So, e1 - e2 = 0.8 - 0.4 = 0.4. But (int)0.4 = 0, indicating that the elements are equal or at least that they have the same order. Please see Figure 13 for a working example.
The C-string examples are special cases because we can't easily generalize them to other data types. Two similar representations are possible; the pictures help us understand their differences. The pictures form bridges that help us span the gulf between a problem and the details of the final program.
![]() |
![]() |
(a) | (b) |
\0
. The name of an array represents the array's address, so it's convenient to move C-strings around as character pointers.
char* data[]
. So, when the ordering function casts the void pointers, it must cast them to double character pointers (i.e., pointers to pointers). demo2.cpp (Figure 7) demonstrates how to build and use this array. Although this is a special case, it demonstrates concepts applicable to objects that have C-string fields or members.char data[11][8]
, but the program initializes each row as a C-string. The array is a good example of a situation where it is not possible to reverse the rows and columns: [8][11] will NOT work. demo3.cpp (Figure 8) demonstrates how to build this array./* * Sorts an array of C-strings. * Demonstrates how to write an ordering function * that compares two C-strings. */ #include <iostream> #include <cstdlib> #include <cstring> using namespace std; int order_string(const void* e1, const void* e2); int main() { char* data[] = { "see", "the", "quick", "red", "fox", "jump", "over", "the", "lazy", "brown", "dog" }; qsort(data, 11, sizeof(char*), order_string); // (a) for (int i = 0; i < 11; i++) cout << data[i] << endl; char* key = "jump"; // (b) char** found = (char **)bsearch(&key, data, 11, sizeof(char*), order_string); // (c) if (*found != nullptr) cout << "key found: " << *found << endl; else cout << "key not found\n"; return 0; } int order_string(const void* e1, const void* e2) // (d) { return strcmp(*(char **) e1, *(char **) e2); }
/* * Sorts a two-dimensional array of characters. * Demonstrates how to write an ordering function * that compares two C-strings formed from a 2D array. */ #include <iostream> #include <stdlib.h> #include <cstring> using namespace std; int order_char(const void* e1, const void* e2); int main() { char data[11][8] = { "see", "the", "quick", "red", "fox", "jump", "over", "the", "lazy", "brown", "dog" }; qsort(data, 11, 8, order_char); // (a) for (int i = 0; i < 11; i++) cout << data[i] << endl; char* key = "lazy"; // (b) char* found = (char *)bsearch(key, data, 11, 8, order_char); // (c) if (found != nullptr) cout << "key found: " << found << endl; else cout << "key not found\n"; return 0; } int order_char(const void* e1, const void* e2) // (d) { return strcmp((char *) e1, (char *) e2); }
Searching for a given element in an array of integers or strings only tells us that the value is or isn't in the array. However, searching for a specific object in an array is more helpful. For example, suppose the array contains objects like those specified in the following figure. If the application searches for an object with the id 123, the search function will return a pointer. It returns nullptr if the array doesn't have an object with that id or a pointer to the object if it does. Instances of the student class have three fields, and if we find the object with the matching id, we retrieve all the information saved in that object. Searching for part of the data in an object to find all the data is called an associative search. Furthermore, we can sort and search the array using any field.
The following examples illustrate how to search for objects based on different fields or members. The figure presents demo4.cpp
as a series of figures to simplify the discussion. The complete program, with all parts in context, is available at the bottom of the page.
struct student { char* name; int id; double gpa; }; |
struct student students[] = { { "Dilbert", 123, 3.5 }, { "Wally", 456, 2.0 }, { "Alice", 987, 3.9 }, { "Asok", 730, 3.8 }, { "Catbert", 501, 3.0 }, { "Pointy Haired Boss", 666, 1.0 }, { "Dogbert", 111, 4.0 } }; |
(a) | (b) |
![]() |
|
(c) |
void print(student* data); // Prints a single student structure to the console. void print(int number, student* data); // Prints the whole array of structures to the console. int order_name(const void* e1, const void* e2); // Orders the objects by name. int order_id(const void* e1, const void* e2); // Orders the objects by id. int order_gpa(const void* e1, const void* e2); // Orders the objects by gpa.
qsort(students, 7, sizeof(student), order_name); // (a) print(7, students); student key1 = { "Catbert", 0, 0 }; // (b) student* found1 = (student *)bsearch(&key1, students, 7, sizeof(student), order_name); // (c) if (found1 != nullptr) // (d) print(found1); else cout << "key not found\n";
int order_name(const void* e1, const void* e2) // (e) { return strcmp(((student *)e1)->name, ((student *)e2)->name); // (f) }
qsort(students, 7, sizeof(student), order_id); // (a) print(7, students); student key2 = { "", 730, 0 }; // (b) student* found2 = (student *)bsearch(&key2, students, 7, sizeof(student), order_id); // (c) if (found2 != nullptr) // (d) print(found2); else cout << "key not found\n";
int order_id(const void* e1, const void* e2) // (e) { return ((student *)e1)->id - ((student *)e2)->id; // (f) }
qsort(students, 7, sizeof(student), order_gpa); // (a) print(7, students); student key3 = { "", 0, 3.9 }; // (b) student* found3 = (student *)bsearch(&key3, students, 7, sizeof(student), order_gpa); // (c) if (found3 != nullptr) // (d) print(found3); else cout << "key not found\n";
int order_gpa(const void* e1, const void* e2) // (e) { double diff = ((student *)e1)->gpa - ((student *)e2)->gpa; // (f) if (diff < 0) return -1; // (g) if (diff > 0) return 1; return 0; }