Suppose that we are given an array of structures where each structure contains student information: name, id, address, etc. We may wish to sort the array into alphabetical order (Alice, ..., Wally), or we may want the array ordered by student ID (00000000, ..., 99999999). We were introduced to the selection sort algorithm in the last chapter. While selection sort works, it has a run order of O(n^2), which is not very efficient: doubling the data size roughly quadruples the runtime. One efficient and well-known algorithm used to sort large amounts of data is quicksort, which usually has a run order of O(n log n), so it is usually more efficient than selection sort: doubling the data size only slightly more than doubles the runtime. Unfortunately, there are some situations where quicksort doesn't perform well (its worst case is O(n^2)), and a good implementation of the algorithm must detect these situations and alter its behavior accordingly, which naturally increases the complexity of the code. Together, the complexity and the utility of quicksort make it an ideal candidate for being implemented as a library function.
We were also introduced to the binary search algorithm in the last chapter. Unlike quicksort, binary search is a small, simple algorithm, and unlike selection sort, binary search is suitable for larger amounts of data. Binary search only works with sorted data, but once the data is sorted, binary search has a run order of O(log n). The standard library header, cstdlib, provides two functions, qsort and bsearch, that are demonstrated below.
void qsort(void* data, size_t num, size_t size, int (*order)(const void* e1, const void* e2));
void* bsearch(const void* key, const void* data, size_t num, size_t size, int (*order)(const void* e1, const void* e2));
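The header cstdlib, or stdlib.h, includes functions named qsort and bsearch that implement the quicksort and binary search algorithms respectively (strictly speaking, the language standard does not mandate a particular sorting algorithm for qsort). Note that the last parameter in each prototype, int (*order)(const void* e1, const void* e2), represents a single function argument: a pointer to an ordering function. Ordering functions and void pointers were described on the previous page.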
int: "order" points to a function whose return type is an int.
const void* e1, const void* e2: "order" requires two arguments; each argument points to one element in the array. The "const" keyword signifies that the elements pointed to are not modified.
The complexity and arcane syntax of the library functions are largely due to one requirement: the need to make the functions sufficiently general that they will operate with whatever kind of data the programmer wishes to search or sort. Solving this problem comes at a cost: the user must provide an ordering or comparison function. Given two elements from the array, this function determines which element comes first: it returns a negative value if the first element belongs before the second, zero if the two are equal, and a positive value if the first belongs after the second. The ordering function arguments are passed as void pointers, which requires the function to (1) cast the void pointer to a known data type and (2) extract the part of the data used to sort the array. Accessing the data "hidden" in the void pointers is the hardest part of using the library functions and is demonstrated in the following four examples. Each example demonstrates a different kind of data, but you should be able to modify one of the examples to satisfy most common searching and sorting tasks.
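Working with simple or primitive data is fairly straightforward. The following example is based on an array of type int, but the same pattern works with any primitive data type (e.g., char, double, long, etc.), provided the ordering function is adapted to the type: the subtraction used here is reliable only for integers small enough that the difference cannot overflow.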
/*
 * Sorts and then searches an array of integers.
 * Demonstrates how to write an ordering function
 * that compares two integers.
 */
#include <iostream>
#include <cstdlib>
using namespace std;

int order_int(const void* e1, const void* e2);                  // (a)

int main()
{
    int int_data[10] = { 2, 5, 3, 9, 1, 4, 6, 8, 7, 0 };        // (b)

    qsort(int_data, 10, sizeof(int), order_int);                // (c)

    for (int i = 0; i < 10; i++)                                // (d)
        cout << int_data[i] << endl;

    int key = 6;                                                // (e)
    int* found = (int *)bsearch(&key, int_data, 10, sizeof(int), order_int);    // (f)

    if (found != nullptr)                                       // (g)
        cout << "key found " << *found << endl;
    else
        cout << "key not found\n";

    return 0;
}

int order_int(const void* e1, const void* e2)                   // (h)
{
    return * (const int *) e1 - * (const int *) e2;
}
As the sorted data becomes more complex, it also becomes more important to understand the data's structure. It is often convenient to visualize or represent the structure graphically. A picture or abstract representation forms a bridge that helps programmers span the gulf between a problem and the details of the final program.
/*
 * Sorts an array of C-strings.
 * Demonstrates how to write an ordering function
 * that compares two C-strings.
 */
#include <iostream>
#include <cstdlib>
#include <cstring>
using namespace std;

int order_string(const void* e1, const void* e2);                       // (a)

int main()
{
    // String literals are constant, so the array holds pointers to const char.
    const char* string_data[] = { "see", "the", "quick", "red", "fox", "jump",    // (b)
                                  "over", "the", "lazy", "brown", "dog" };

    qsort(string_data, 11, sizeof(char*), order_string);                // (c)

    for (int i = 0; i < 11; i++)                                        // (d)
        cout << string_data[i] << endl;

    const char* key = "jump";                                           // (e)
    const char** found = (const char **)bsearch(&key, string_data, 11,
                                        sizeof(char*), order_string);   // (f)

    if (found != nullptr)       // test the returned pointer before dereferencing it    (g)
        cout << "key found: " << *found << endl;
    else
        cout << "key not found\n";

    return 0;
}

int order_string(const void* e1, const void* e2)                        // (h)
{
    return strcmp(*(const char* const*) e1, *(const char* const*) e2);  // (i)
}
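The strcmp function requires two C-string (i.e., character pointer) arguments, which it compares. Each element of the array is itself a character pointer, so the ordering function receives a pointer to a pointer: each void pointer is cast to a pointer to a character pointer and dereferenced once to obtain the C-string passed to strcmp.
The next example sorts a two-dimensional character array, char char_array[10][15], and is a good example of a situation where it is not possible to reverse the rows and columns: [15][10] will NOT work. Each row in the array is treated as a C-string. Although the rows are only partially filled, the elements past the null termination characters are ignored.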
/*
 * Sorts a two-dimensional array of characters.
 * Demonstrates how to write an ordering function
 * that compares two C-strings formed from a 2D array.
 */
#include <iostream>
#include <stdlib.h>
#include <cstring>
using namespace std;

int order_char(const void* e1, const void* e2);                         // (a)

int main()
{
    char char_array[10][15] = { "four", "score", "and", "seven", "years",   // (b)
                                "ago", "our", "fathers", "brought", "forth" };

    qsort(char_array, 10, 15, order_char);                              // (c)

    for (int i = 0; i < 10; i++)                                        // (d)
        cout << char_array[i] << endl;

    const char* key = "score";      // string literals are constant        (e)
    char* found = (char *)bsearch(key, char_array, 10, 15, order_char); // (f)

    if (found != nullptr)                                               // (g)
        cout << "key found: " << found << endl;
    else
        cout << "key not found\n";

    return 0;
}

int order_char(const void* e1, const void* e2)                          // (h)
{
    return strcmp((const char *) e1, (const char *) e2);                // (i)
}
The strcmp function requires two C-string (i.e., character pointer) arguments, which it compares. Each row of the array is itself a C-string, so each void pointer is cast directly to a character pointer and passed to strcmp.
Searching for a given element in an array that contains integers or strings only serves to answer the question, "Is the element in the array or not?" But searching an array of objects can be very useful. For example, suppose that the array contains objects as illustrated in Figure 8. If we search for id 123, the search function will return a pointer to an object that has three fields, and we get all of the information associated with the search key. Searching for part of the data in an object or record to find all of the data is called an associative search. Furthermore, it's possible to sort and search the array based on any of the structure fields.
To simplify the discussion, demo4.cpp is presented as a series of figures. The complete program, with all parts in context, is available at the bottom of the page.
(a)
struct student
{
    const char* name;       // initialized from string literals, so const
    int id;
    double gpa;
};
(b)
struct student students[] = {
    { "Dilbert", 123, 3.5 },
    { "Wally", 456, 2.0 },
    { "Alice", 987, 3.9 },
    { "Asok", 730, 3.8 },
    { "Catbert", 501, 3.0 },
    { "Pointy Haired Boss", 666, 1.0 },
    { "Dogbert", 111, 4.0 }
};
(c) [image not available]
void print(student* data);                          // Prints a single student structure to the console.
void print(int number, student* data);              // Prints the whole array of structures to the console.
int order_name(const void* e1, const void* e2);     // Orders the objects by name.
int order_id(const void* e1, const void* e2);       // Orders the objects by id.
int order_gpa(const void* e1, const void* e2);      // Orders the objects by gpa.
qsort(students, 7, sizeof(student), order_name);                        // (a)
print(7, students);

student key1 = { "Catbert", 0, 0 };                                     // (b)
student* found1 = (student *)bsearch(&key1, students, 7,
                             sizeof(student), order_name);              // (c)

if (found1 != nullptr)                                                  // (d)
    print(found1);
else
    cout << "key not found\n";
int order_name(const void* e1, const void* e2)                          // (e)
{
    return strcmp(((const student *)e1)->name, ((const student *)e2)->name);    // (f)
}
The student structure is as illustrated in Figure 8(b). The names are compared with the strcmp library function. Two operations are needed to extract the name from the structure: (1) the void pointer is cast to a student pointer and (2) the name is accessed using the arrow operator. Note that the arrow operator has a higher precedence than the casting operator; the correct order of operation is achieved with the grouping parentheses around the cast, which appear before the arrow operator.
qsort(students, 7, sizeof(student), order_id);                          // (a)
print(7, students);

student key2 = { "", 730, 0 };                                          // (b)
student* found2 = (student *)bsearch(&key2, students, 7,
                             sizeof(student), order_id);                // (c)

if (found2 != nullptr)                                                  // (d)
    print(found2);
else
    cout << "key not found\n";
int order_id(const void* e1, const void* e2)                            // (e)
{
    return ((const student *)e1)->id - ((const student *)e2)->id;       // (f)
}
The student structure is as illustrated in Figure 8(b). Two operations are needed to extract the id from the structure: (1) the void pointer is cast to a student pointer and (2) the id is accessed using the arrow operator. Note that the arrow operator has a higher precedence than the casting operator; the correct order of operation is achieved with the grouping parentheses around the cast, which appear before the arrow operator.
qsort(students, 7, sizeof(student), order_gpa);                         // (a)
print(7, students);

student key3 = { "", 0, 3.9 };                                          // (b)
student* found3 = (student *)bsearch(&key3, students, 7,
                             sizeof(student), order_gpa);               // (c)

if (found3 != nullptr)                                                  // (d)
    print(found3);
else
    cout << "key not found\n";
int order_gpa(const void* e1, const void* e2)                           // (e)
{
    double diff = ((const student *)e1)->gpa - ((const student *)e2)->gpa;  // (f)
    if (diff < 0)                                                       // (g)
        return -1;
    if (diff > 0)
        return 1;
    return 0;
}
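The student structure is as illustrated in Figure 8(b). Two operations are needed to extract the gpa from the structure: (1) the void pointer is cast to a student pointer and (2) the gpa is accessed using the arrow operator. Note that the arrow operator has a higher precedence than the casting operator; the correct order of operation is achieved with the grouping parentheses around the cast, which appear before the arrow operator. Because the difference between two gpa values is a double, it cannot be returned directly as an int without losing its fractional part, so the function returns -1, 0, or 1 according to the sign of the difference.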