At their most fundamental level, computers operate on integers, and that includes textual or character-based operations. Computers encode characters, that is, store and manipulate them, as integers of various sizes. The ASCII character set was an early, 7-bit encoding appropriate for the limited capabilities of non-graphic hardware such as teletypes and CRT terminals (see The Computer Console). However, ASCII's 128 codes cannot support the numerous alphabets used today. Modern, graphic-capable computer systems use Unicode, whose code points range from 0 to 0x10FFFF and are stored in encoding forms such as UTF-8, UTF-16, and UTF-32. Unicode can represent many alphabets as well as emojis.
Programs performing textual operations often must classify or categorize characters based on a common characteristic. For example, a program may validate numerical input by verifying that each character read represents a digit. Both character encodings represent the digit characters '0' through '9' as consecutive integers, allowing programmers to write simple range tests. Using the Hindu-Arabic numerals for simplicity and assuming that the character is stored in the variable c, we can write the test as follows:
if (c >= '0' && c <= '9') ...
Including hexadecimal digits adds two more ranges to the if-statement:
if (c >= '0' && c <= '9' || c >= 'A' && c <= 'F' || c >= 'a' && c <= 'f') ...
Testing for punctuation characters is still more cumbersome because they are scattered throughout the encodings.
While we can always use if-statements and switches to classify characters, these operations occur so often that C++ provides a library of classification functions, declared in the <cctype> header. Library implementations typically build the functions on table lookups and bitwise operations, which makes them very fast, so we are unlikely to outperform them with hand-written if-statements or switches.
#include <cctype> | Header file required to use the cctype library |
int isalnum(int c) | Returns true if the character c is alphanumeric (isalpha(c) || isdigit(c)); otherwise returns false |
int isalpha(int c) | Returns true if the character c is an alphabetic letter; otherwise returns false. Which characters the function considers to be alphabetic letters depends on the current locale: in the default locale, isalpha(c) is true when isupper(c) || islower(c) is true |
int isascii(int c) | Returns true if the character c is in the range 0 to 0x7F (a non-standard POSIX extension, deprecated) |
int isblank(int c) | Returns true if the character c is a blank (a space or a horizontal tab in the default locale); otherwise returns false |
int iscntrl(int c) | Returns true if the character c is a control character; otherwise returns false |
int isdigit(int c) | Returns true if the character c is a decimal digit; otherwise returns false. isdigit(c) returns true whenever c is in the range '0' to '9' |
int isgraph(int c) | Returns true if the character c has a graphical representation (any printable character except the space); otherwise returns false |
int islower(int c) | Returns true if the character c is a lowercase letter; otherwise returns false. Which characters the function considers to be letters depends on the current locale: in U.S. English, islower(c) returns true whenever c is in the range 'a' to 'z' |
int isprint(int c) | Returns true if the character c is printable (including the space); otherwise returns false |
int ispunct(int c) | Returns true if the character c is a punctuation character; otherwise returns false |
int isspace(int c) | Returns true if the character c is a white-space character; otherwise returns false |
int isupper(int c) | Returns true if the character c is an uppercase letter; otherwise returns false. Which characters the function considers to be letters depends on the current locale: in U.S. English, isupper(c) returns true whenever c is in the range 'A' to 'Z' |
int isxdigit(int c) | Returns true if the character c is a hexadecimal digit; otherwise returns false. isxdigit(c) returns true whenever c is in one of the ranges '0' to '9', 'A' to 'F', or 'a' to 'f' |
int tolower(int c) | Converts an uppercase letter to lowercase and returns the result; returns c unchanged if it is not an uppercase letter |
int toupper(int c) | Converts a lowercase letter to uppercase and returns the result; returns c unchanged if it is not a lowercase letter |
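(a) Printing only the alphabetic characters:
#include <iostream>
#include <cctype>
#include <string>
using namespace std;

int main()
{
    // A string object excludes the terminating null character that a string literal carries
    for (char c : string("Hello, World!"))
        if (isalpha(static_cast<unsigned char>(c)))   // the cast guards against negative char values
            cout << c;
    cout << endl;
    return 0;
}
Output: HelloWorld

(b) Converting every character to lowercase:
#include <iostream>
#include <cctype>
#include <string>
using namespace std;

int main()
{
    // A string object excludes the terminating null character that a string literal carries
    for (char c : string("Hello, World!"))
        cout << char(tolower(static_cast<unsigned char>(c)));   // tolower returns an int, so convert back to char
    cout << endl;
    return 0;
}
Output: hello, world!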