Library Functions: qsort And bsearch
Review
Please review the following as needed:
Suppose that we are given an array of structures where each structure contains student information: name, id, address, etc. We may wish to sort the array into alphabetical order (Alice, ..., Wally), or maybe we want the array ordered by student ID (00000000, ..., 99999999). We were introduced to the selection sort algorithm in the last chapter. While selection sort works, it has a run order of O(n2 ) , which is not very efficient - a small increase in the data size results in a large increase in the runtime. One efficient and well known algorithm used to sort large amounts of data is named quicksort , which usually has a run order of O(n log n) , which means that it is usually more efficient than selection sort - a small increase in the data size results in a small increase in the runtime. Unfortunately, there are some situations where quicksort doesn't perform well and a good implementation of the algorithm must detect these situations and alter its behavior accordingly, which naturally increases the complexity of the code. Together, the complexity and the utility of quicksort make it an ideal candidate for being implemented as a library function.
We were also introduced to the binary search algorithm in the last chapter. Unlike qsort, binary search is a small, simple algorithm, and unlike selection sort, binary search is suitable for larger amounts of data. Binary search only works with sorted data, but once the data is sorted, binary search has a run order of O(log n) . The standard library, cstdlib
, provides two functions: qsort
and bsearch
that are demonstrated below.
void qsort(void* data, size_t num, size_t size, int (*order)(const void* e1, const void* e2) );
void* bsearch(void* key, void* data, size_t num, size_t size, int (*order)(const void* e1, const void* e2) );
Prototypes for the qsort and bsearch library functions . The standard C/C++ library, cstdlib
or stdlib.h
, includes functions named qsort
and bsearch
that implement the quicksort and binary search algorithms respectively. Note that the code highlighted in yellow represents a single function argument: a pointer to an ordering function. Ordering functions and void pointers were described in the previous page.
void* key
The object or data that is being searched for. The key usually takes the form of a partially filled object - just the field or fields needed for the search contain data.
void* data
The data array that will be sorted.
size_t num
The number of elements in the data array.
size_t size
The size, measured in bytes, of each element in the array.
int (*order)(const void* e1, const void* e2)
Defines "order" as function pointer (this is a function argument, so you may choose any appropriate variable name):
int
: "order" points to a function whose return type is an int
const void* e1, const void* e2
: "order" requires two arguments; each argument points to one element in the array. The "const" keyword signifies that the arguments are not modified.
bsearch return value
If "key" is found in "data," bsearch returns a pointer to the matching array element; otherwise it returns nullptr.
The complexity and arcane syntax of the library functions is largely due to one requirement: the need to make the function sufficiently general that it will operate with whatever kind of data the programmer wishes to search or sort. Solving this problem comes at cost: the user must provide an ordering or comparison function. Given two elements from the array, this function determines which element comes first. The ordering function arguments are passed as void pointers, which requires the function to (1) cast the void pointer to a known data type and (2) extract the part of the data used to sort the array. Accessing the data "hidden" in the void pointers is the hardest part of using the library functions and is demonstrated in the following four examples. Each example demonstrates a different kind of data, but you should be able to modify one of the examples to satisfy most common searching and sorting tasks.
Searching and Sorting An Array Of Integers
Working with simple or primitive data is fairly straightforward. The following example is based on an array of type int
but will work with any primitive data type (e.g., char
, double
, long
, etc.).
An array of integers . An array of any simple or primitive data will look similar.
/*
* Sorts and then searches an array of integers.
* Demonstrates how to write an ordering function
* that compares two integers.
*/
#include <iostream>
#include <cstdlib>
using namespace std;
int order_int (const void* e1, const void* e2); // (a)
int main()
{
int int_data[10] = { 2, 5, 3, 9, 1, 4, 6, 8, 7, 0 }; // (b)
qsort(int_data, 10, sizeof(int), order_int ); // (c)
for (int i = 0; i < 10; i++) // (d)
cout << int_data[i] << endl;
int key = 6; // (e)
int* found = (int *)bsearch(&key, int_data, 10, sizeof(int), order_int ); // (f)
if (found != nullptr) // (g)
cout << "key found " << *found << endl;
else
cout << "key not found\n";
return 0;
}
int order_int (const void* e1, const void* e2) // (h)
{
return * (int *) e1 - * (int *) e2;
}
demo1.cpp . Using the qsort and bsearch library functions to sort and search an array of integers. It's rarely useful to search an array of numbers.
Ordering function prototype.
Test data.
The call to qsort has 4 parameters: (1) array to sort, (2) number of elements in the array, (3) the size (in bytes) of each element, and (4) a pointer to the ordering function.
Prints the data array to demonstrate that it is sorted.
Creates a key for the binary search function.
The call to bsearch has 5 parameters: (1) the address of the key (the value being searched for), (2) the array to search, (3) number of elements in the array, (4) the size (in bytes) of each element, and (5) a pointer to the ordering function. bsearch returns a void pointer, which is cast to an integer pointer.
bsearch returns the pointer to the array element if a match is found or nullptr if a matching element is not found. Notice that the pointer variable "found" must be dereferenced to access the array element returned by bsearch.
The ordering function:
Each argument (e1 and e2) is cast from a void pointer to an integer pointer.
Each integer pointer is dereferenced (the asterisk outside the parentheses), which results in the integer values stored at one location in the original data array.
The difference between the two inters indicates their relative order, e.g., 5 - 10 < 0, 5 - 5 == 0, and 10 - 5 > 0.
The difference is the return value of the ordering function.
Searching and Sorting An Array Of C-Strings
As the sorted data becomes more complex, it also becomes more important to understand the data's structure. It is often convenient to visualize or represent the structure graphically. A picture or abstract representation forms a bridge that helps programmers the span the gulf between a problem and the details of the final program.
An array of C-strings . The figure illustrates the most common way of creating an array of C-strings. The key to understanding all the bizarre casting syntax is to remember that (1) arrays are always passed to functions as pointers and (b) that C-strings are defined as character pointers - so, wherever a pointer is needed, it becomes a double pointer.
/*
* Sorts an array of C-strings.
* Demonstrates how to write an ordering function
* that compares two C-strings.
*/
#include <iostream>
#include <cstdlib>
#include <cstring>
using namespace std;
int order_string (const void* e1, const void* e2); // (a)
int main()
{
char* string_data[] = { "see", "the", "quick", "red", "fox", "jump", // (b)
"over", "the", "lazy", "brown", "dog" };
qsort(string_data, 11, sizeof(char*), order_string ); // (c)
for (int i = 0; i < 11; i++) // (d)
cout << string_data[i] << endl;
char* key = "jump"; // (e)
char** found = (char **)bsearch(&key, string_data, 11, sizeof(char*), order_string ); // (f)
if (*found != nullptr) // (g)
cout << "key found: " << *found << endl;
else
cout << "key not found\n";
return 0;
}
int order_string (const void* e1, const void* e2) // (h)
{
return strcmp(*(char **) e1, *(char **) e2); // (i)
}
demo2.cpp . Using the qsort library function to sort an array of C-strings.
Ordering function prototype.
Test data.
Call to qsort library function has 4 parameters: (1) array to sort, (2) number of elements in the array, (3) the size (in bytes) of each element, and (4) a pointer to the ordering function.
Prints the data array to demonstrate that it is sorted.
Creates a key for the binary search function. A C-string is a character pointer.
The call to bsearch has 5 parameters: (1) the address of the key (the value being searched for), (2) the array to search, (3) number of elements in the array, (4) the size (in bytes) of each element, and (5) a pointer to the ordering function. bsearch returns a void pointer, which is cast to an char* pointer - a pointer to a C-string or char**.
bsearch returns a pointer to the array element if a match is found or nullptr if a matching element is not found. Notice that the pointer variable "found" must be dereferenced to access the array element returned by bsearch.
The ordering function:
Each argument (e1 and e2) is cast from a void pointer to a pointer to a C-string. However, a C-string is already a character pointer, so the cast uses two asterisks denoting a double pointer or two levels of indirection - one level for the C-string and one level caused by passing the array element as a pointer.
Each pointer to a C-string is dereferenced (the asterisk outside the parentheses), which produces a C-string or single character pointer. Each C-string is one of the strings stored in the original data array, or, in the case of bsearch, the key.
The strcmp
function requires two C-string (i.e., character pointer) arguments, which it compares.
The ordering function returns the value returned by strcmp
.
Searching and Sorting A Two-Dimensional Array Of Characters
A two-dimensional array of characters . The array is defined as char char_array[10][15]
and is a good example of a situation where is not possible to reverse the rows and columns: [15][10] will NOT work. Each row in the array is treated as a C-string. Although the rows are only partially filled, the elements past the null termination characters are ignored.
/*
* Sorts a two-dimensional array of characters.
* Demonstrates how to write an ordering function
* that compares two C-strings formed from a 2D array.
*/
#include <iostream>
#include <stdlib.h>
#include <cstring>
using namespace std;
int order_char (const void* e1, const void* e2); // (a)
int main()
{
char char_array[10][15] = { "four", "score", "and", "seven", "years", // (b)
"ago", "our", "fathers", "brought", "forth" };
qsort(char_array, 10, 15, order_char ); // (c)
for (int i = 0; i < 10; i++) // (d)
cout << char_array[i] << endl;
char* key = "score"; // (e)
char* found = (char *)bsearch(key, char_array, 10, 15, order_char ); // (f)
if (found != nullptr) // (g)
cout << "key found: " << found << endl;
else
cout << "key not found\n";
return 0;
}
int order_char (const void* e1, const void* e2) // (h)
{
return strcmp((char *) e1, (char *) e2); // (i)
}
demo3.cpp . Using the qsort library function to sort C-strings that are rows in a two-dimensional character array. Carefully compare demo2 and demo3; see if you can spot the differences - when you can explain the differences, you will truly understand pointers and C-strings.
Ordering function prototype.
Test data.
Call to qsort library function has 4 parameters: (1) array to sort (which may be multi-dimensional but is examined by rows), (2) number of elements in the array (the number of rows in this example), (3) the size (in bytes) of each element (the maximum length of the C-string including the null termination character), and (4) a pointer to the ordering function.
Prints the data array, one row at a time, to demonstrate that it is sorted.
The key or C-string that is searched for in the array.
The call to bsearch has 5 parameters: (1) the key, which is a pointer and so is an address, (2) the array to search, (3) number of elements in the array, (4) the size (in bytes) of each element, and (5) a pointer to the ordering function. bsearch returns a void pointer, which is cast to a char* or C-string.
bsearch returns a pointer to the array element if a match is found or nullptr if a matching element is not found. Notice that unlike demo2, the variable "found" is NOT dereferenced.
The ordering function:
Each argument (e1 and e2) is cast from a void pointer to a pointer to a C-string, which is stored as one null-terminated row of characters in a two-dimensional array of characters.
The strcmp
function requires two C-string (i.e., character pointer) arguments, which it compares.
The ordering function returns the value returned by strcmp
.
Searching and Sorting An Array Of Objects
Searching for a given element in an array that contains integers or strings only serves to answer the question, "Is the element in the array or not?" But searching an array of objects can be very useful. For example, suppose that the array contains objects as illustrated in Figure 8. If we search for id 123, the search function will return an object that has three fields and we get all of the information associated with the search key. Searching for part of the data in an object or record to find all of the data is called an associative search . Furthermore, it's possible to sort and search the array based any of the structure fields.
To simplify the discussion, demo4.cpp
is presented as a series of figures. The complete program, with all parts in context, is available at the bottom of the page.
struct student
{ char* name;
int id;
double gpa;
};
struct student students[] = {
{ "Dilbert", 123, 3.5 },
{ "Wally", 456, 2.0 },
{ "Alice", 987, 3.9 },
{ "Asok", 730, 3.8 },
{ "Catbert", 501, 3.0 },
{ "Pointy Haired Boss", 666, 1.0 },
{ "Dogbert", 111, 4.0 }
};
(a) (b)
(c)
An array of structure objects .
The structure specification.
Defining and initializing the array of structures.
An abstract representation of the array in memory.
void print(student* data); // Prints a single student structure to the console.
void print(int number, student* data); // Prints the whole array of structures to the console.
int order_name(const void* e1, const void* e2); // Orders the objects by name.
int order_id(const void* e1, const void* e2); // Orders the objects by id.
int order_gpa(const void* e1, const void* e2); // Orders the objects by gpa.
demo4.cpp prototypes .
qsort(students, 7, sizeof(student), order_name ); // (a)
print(7, students);
student key1 = { "Catbert", 0, 0 }; // (b)
student* found1 = (student *)bsearch(&key1 , students, 7, sizeof(student), order_name ); // (c)
if (found1 != nullptr) // (d)
print(found1);
else
cout << "key not found\n";
int order_name (const void* e1, const void* e2) // (e)
{
return strcmp(((student *)e1)->name, ((student *)e2)->name); // (f)
}
Searching and sorting by name .
Sorts the array alphabetically by name. The student
structure is as illustrated in Figure 8(b).
Creates a search key, which is a partially filled structure - only the name field is used.
Searches the sorted array for the key (i.e., the name).
bsearch returns nullptr if the key is not found, or a pointer to the matching object in the array if it is found.
An ordering function based on the student's name.
The ordering function must first access or extract the necessary fields; once the fields are available, it then compares them - for the student names it uses the strcmp
library function. Two operations are needed to extract the name from the structure: (1) the void pointer is cast to a student
pointer and (2) the name is accessed using the arrow operator. Note that the arrow operator has a higher precedence than does the the casting operator - the correct order of operation is achieved with the grouping parentheses appearing before the arrow operator.
qsort(students, 7, sizeof(student), order_id) ; // (a)
print(7, students);
student key2 = { "", 730, 0 }; // (b)
student* found2 = (student *)bsearch(&key2 , students, 7, sizeof(student), order_id ); // (c)
if (found2 != nullptr) // (d)
print(found2);
else
cout << "key not found\n";
int order_id (const void* e1, const void* e2) // (e)
{
return ((student *)e1)->id - ((student *)e2)->id; // (f)
}
Searching and sorting by id .
Sorts the array numerically by id, low to high. The student
structure is as illustrated in Figure 8(b).
Creates a search key, which is a partially filled structure - only the id field is used.
Searches the sorted array for the key (i.e., the id).
bsearch returns nullptr if the key is not found, or a pointer to the matching object in the array if it is found.
An ordering function based on the student's id.
The ordering function must first access or extract the necessary fields; once the fields are available, it then compares them - for student ids the difference is fast and efficient. Two operations are needed to extract the id from the structure: (1) the void pointer is cast to a student
pointer and (2) the id is accessed using the arrow operator. Note that the arrow operator has a higher precedence than does the the casting operator - the correct order of operation is achieved with the grouping parentheses appearing before the arrow operator.
qsort(students, 7, sizeof(student), order_gpa ); // (a)
print(7, students);
student key3 = { "", 0, 3.9 }; // (b)
student* found3 = (student *)bsearch(&key3 , students, 7, sizeof(student), order_gpa ); // (c)
if (found3 != nullptr) // (d)
print(found3);
else
cout << "key not found\n";
int order_gpa (const void* e1, const void* e2) // (e)
{
double diff = ((student *)e1)->gpa - ((student *)e2)->gpa; // (f)
if (diff < 0) return -1; // (g)
if (diff > 0) return 1;
return 0;
}
Searching and sorting by gpa .
Sorts the array numerically by gpa. The student
structure is as illustrated in Figure 8(b).
Creates a search key, which is a partially filled structure - only the gpa field is used.
Searches the sorted array for the key (i.e., the gpa). Data like the gpa is likely not unique and binary search may return any object with a value matching the key.
bsearch returns nullptr if the key is not found, or a pointer to the matching object in the array if it is found.
An ordering function based on the student's gpa.
The ordering function must first access or extract the necessary fields; once the fields are available, it then compares them. Two operations are needed to extract the name from the structure: (1) the void pointer is cast to a student
pointer and (2) the name is accessed using the arrow operator. Note that the arrow operator has a higher precedence than does the the casting operator - the correct order of operation is achieved with the grouping parentheses appearing before the arrow operator.
The gpa is type double, so the difference between the two gpa fields is a double-valued expression - but the function returns an integer, so, to avoid truncation errors, the difference to mapped to an appropriate integer value.
Downloadable Code
Back |
Chapter TOC |
Next