By this point in our study of programming generally and C++ specifically, we know that programs often require the user to input data. Furthermore, we know that there are many kinds of data and that entering the wrong kind can lead to strange results or program failure (infinite loops, crashes, etc.). How a program responds when the end-user enters the wrong kind of data depends on what kind of data the program expects versus what kind of data the user entered, the programming language, and the host system (hardware and operating system). The following program illustrates some of these variables: the two runs shown with it, (a) and (b), print different meaningless values for the same invalid input. (With a C++11 or later standard library, a failed extraction stores 0 in input, so a current compiler would typically print 0 instead.)
#include <iostream>
using namespace std;

int main()
{
    int input;
    cout << "Please enter an integer: ";
    cin >> input;
    cout << input;
    return 0;
} |
D:\>bulletproof
Please enter an integer: hello
4194048
D:\> |
D:\>bulletproof
Please enter an integer: hello
2096640
D:\> |
(a) | (b) |
Bulletproof code is code that catches or detects input errors, reports the error to the user, and "gracefully" deals with the error. Gracefully dealing with an error typically means that after detecting the error and printing a diagnostic (i.e., an error message), the program either terminates or allows the user to re-enter the data. Detecting the error is usually the hardest part of this process, so we approach it by breaking it into two smaller problems.
First, entering the wrong kind of data can cause a program to fail. We need a way to input the data that is guaranteed not to crash the program. A program can represent all data as a string (either a C-string or an instance of the string class). We use strings as our "universal" data type and initially read the data as a string. Once the data is validated, there are various ways that the data can be re-read from the string. We'll explore some of these techniques in the Streams chapter, but for simplicity, we'll restrict this discussion to just numbers.
Once the data is in the program as a string, we need to verify that each character in the string represents the kind of data that the program needs. For example, the string "123" looks like an integer, but the string "hello" does not. We need some way of formalizing and programming the concept of "looks like." Regular expressions (RE) are a compact and efficient way of verifying that strings match a given pattern. RE can differentiate between a string of digit characters and non-digit characters or a string with one non-digit character amongst digits. RE can detect and validate that data are in the correct format. For example, imagine a program prompts a user to enter a date. An RE can distinguish between Jan 1, 2025 and 1/1/2025, and branch to the correct code to process the input. Unfortunately, REs are a bit advanced for us right now and beyond the scope of this course, but you will learn about them later. Although not as compact or powerful as an RE, we can use loops and the cctype library to detect non-numeric input.
string class version | C-string version |
---|---|
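string input;
getline(cin, input);
if (input.empty())                  // an empty line is not an integer
{
    cerr << "Invalid integer: " << input << endl;
    exit(1);
}
for (size_t i = 0; i < input.length(); i++)
    if (! isdigit(static_cast<unsigned char>(input[i])))
    {
        cerr << "Invalid integer: " << input << endl;
        exit(1);
    }
|
char input[100];
cin.getline(input, 100);
if (input[0] == '\0')               // an empty line is not an integer
{
    cerr << "Invalid integer: " << input << endl;
    exit(1);
}
for (size_t i = 0; i < strlen(input); i++)
    if (! isdigit(static_cast<unsigned char>(input[i])))
    {
        cerr << "Invalid integer: " << input << endl;
        exit(1);
    } |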
If the string is not empty and contains only digits, we can safely convert it to an integer, provided the value is small enough to fit in an int. Validating a floating-point number (either a float or a double) is more difficult because it may have a decimal point or an exponent. Nevertheless, converting a string to a floating-point number is almost as easy as converting a string to an integer.
We can use the casting operators to convert from one numeric data type to another, but casting works only when the source and destination data types are "close" (e.g., an integer and a double are "close" in the sense that they are both numbers). Strings (either C-strings or string objects) and numbers are too different - not "close" enough - to cast between them. Nevertheless, we know from the solutions developed for the palindrome-number problem (cpalnumber.cpp and palnumber.cpp) that it is possible to convert an integer into a string. Converting a correctly formed string into a number is equally easy.
string Class | C-String |
---|---|
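#include <iostream>
#include <string>
#include <cctype>
#include <cstdlib>      // exit
using namespace std;

int main()
{
    string input;
    cout << "Please enter an integer: ";
    getline(cin, input);

    // An empty line is not an integer, and stoi cannot convert it
    if (input.empty())
    {
        cerr << "Invalid integer: " << input << endl;
        exit(1);
    }

    // Every character must be a decimal digit
    for (size_t i = 0; i < input.length(); i++)
        if (! isdigit(static_cast<unsigned char>(input[i])))
        {
            cerr << "Invalid integer: " << input << endl;
            exit(1);
        }

    cout << stoi(input) << endl;
    return 0;
}
|
#include <iostream>
#include <cstring>
#include <cctype>
#include <cstdlib>      // exit, atoi
using namespace std;

int main()
{
    char input[100];
    cout << "Please enter an integer: ";
    cin.getline(input, 100);

    // An empty line is not an integer
    if (input[0] == '\0')
    {
        cerr << "Invalid integer: " << input << endl;
        exit(1);
    }

    // Every character must be a decimal digit
    for (size_t i = 0; i < strlen(input); i++)
        if (! isdigit(static_cast<unsigned char>(input[i])))
        {
            cerr << "Invalid integer: " << input << endl;
            exit(1);
        }

    cout << atoi(input) << endl;
    return 0;
} |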
(a) | (b) |
D:\>bulletproof
Please enter an integer: 123
123 |
D:\>bulletproof
Please enter an integer: hello
Invalid integer: hello |
(c) | (d) |
The documentation for the string class conversion functions shows function prototypes with multiple parameters. However, only the first parameter lacks a default value, so the functions work as illustrated here. The impact of the default arguments is easier to see in the following two figures.
string to int | int stoi(const string& s, size_t* index = nullptr, int base = 10); |
---|---|
string to double | double stod(const string& s, size_t* index = nullptr); |
s
is the string object the function converts to a number. The program passes it by reference for efficiency.
index
is a pointer to a size_t (i.e., it is an argument passed-by-pointer). The conversion functions save the index location of the first unconvertible character (i.e., where the conversion ended) in the size_t variable that index points to. The following figure illustrates the feature's syntax and behavior. Setting index to nullptr disables the feature.
base
is the base of the digit-characters in the string, for example, 2 (binary), 8 (octal), 10 (decimal), or 16 (hexadecimal) - other bases also work but may not be named. The default assumes base 10 or decimal.
string to int | string to double |
---|---|
#include <iostream>
#include <string>
using namespace std;

int main()
{
    string s = "1234Hello World";
    size_t index = 0;
    cout << stoi(s) << endl;            // a
    cout << stoi(s, nullptr) << endl;   // b
    cout << stoi(s, &index) << endl;    // c
    cout << index << endl;              // d
    return 0;
}
|
#include <iostream>
#include <string>
using namespace std;

int main()
{
    string s = "3.14Hello World";
    size_t index = 0;
    cout << stod(s) << endl;            // a
    cout << stod(s, nullptr) << endl;   // b
    cout << stod(s, &index) << endl;    // c
    cout << index << endl;              // d
    return 0;
}
|
Output | Output |
1234
1234
1234
4 |
3.14
3.14
3.14
4 |
For the C-string conversion functions that take multiple arguments, notice that the second parameter points to a character pointer - not to an integer as in the string class functions. The character pointer and the integer serve the same purpose: notifying the program where the string-to-number conversion ended in the string. Whereas the string class functions provide an index into the string, the C-string functions provide a sub-string (a character pointer to the beginning of the unconverted part of the original string). C++ inherits the C-string conversion functions from the C Programming Language, which does not support default arguments. So, programmers must provide arguments for all the conversion functions' parameters. However, the following figures demonstrate that nullptr is a valid second argument for the multi-argument functions.
C-string to int & long int | int atoi(const char* s); long strtol(const char* s, char** end, int base); |
---|---|
C-string to double | double atof(const char* s); double strtod(const char* s, char** end); |
s
is the C-string that the function converts to a number.
end
is a pointer to a C-string, which is already a character pointer (thus the two asterisks). After the function call, end points to the first character in the original string that the function could not convert to a number. Passing nullptr switches off this behavior.
base
is the base of the digit-characters in the string, for example, 2 (binary), 8 (octal), 10 (decimal), or 16 (hexadecimal) - other bases also work but may not be named.
int / long | double |
---|---|
#include <iostream>
#include <cstdlib>      // atoi, strtol
using namespace std;

int main()
{
    // A string literal is constant, so the pointer must be const char*
    const char* input = "1234Hello World";
    cout << atoi(input) << endl;                    // a
    char* end;
    cout << strtol(input, nullptr, 10) << endl;     // b
    cout << strtol(input, &end, 10) << endl;        // c
    cout << end << endl;                            // d
    return 0;
} |
#include <iostream>
#include <cstdlib>      // atof, strtod
using namespace std;

int main()
{
    // A string literal is constant, so the pointer must be const char*
    const char* input = "3.14Hello World";
    cout << atof(input) << endl;                // a
    char* end;
    cout << strtod(input, nullptr) << endl;     // b
    cout << strtod(input, &end) << endl;        // c
    cout << end << endl;                        // d
    return 0;
} |
Output | Output |
1234
1234
1234
Hello World |
3.14
3.14
3.14
Hello World |