8.2.1. C-string I/O

Review

To this point, the console has been our primary source and target for input and output, and the extractor ( >> ) and the inserter ( << ) operators our primary means for performing the I/O operations. These operations also work with C-strings, but not always as we might expect. To better understand C-string I/O, we must consider the two operations separately.

C-string Output

The inserter operator works predictably, regardless of how the program created the C-string (as an array or as a pointer). C++ behaves this way because an array name used in an expression converts to a pointer to the array's first element, so the inserter operator sees no difference between the two representations. Furthermore, it makes no difference whether the program creates the C-string as an automatic (local) variable on the stack or as a dynamic variable on the heap with new.

Definition         Output statement         Output
char s1[]          cout << s1 << endl;      Hello world
char s2[]          cout << s2 << endl;      Hello world
char* s3           cout << s3 << endl;      Hello world
const char* s4     cout << s4 << endl;      Hello world
char* s5           cout << s5 << endl;      Hello world
char* s6           cout << s6 << endl;      Hello world
char s7[15]        cout << s7 << endl;      Example
char s8[15]        cout << s8 << endl;      Example
C-string output to the console. The inserter operator prints the C-strings defined and initialized in the previous section to the console. These observations allow us to conclude that C-string output is independent of how the program created and initialized the C-string.
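The same observation can be checked in a small sketch. The helper names inserted_text and same_output_for_all are hypothetical, and the initializers are assumptions because the definitions from the previous section appear here only in abbreviated form. The sketch writes to a string stream rather than to cout so that the result can be compared, but the inserter behaves the same way with either stream.

```cpp
#include <cstring>
#include <sstream>
#include <string>

// Hypothetical helper: sends one C-string through the inserter and
// returns the characters that reached the stream.
std::string inserted_text(const char* s)
{
	std::ostringstream out;
	out << s;
	return out.str();
}

// Builds the same text three ways (assumed initializers) and reports
// whether the inserter produces identical output for all of them.
bool same_output_for_all()
{
	char		as_array[] = "Hello world";	// automatic array on the stack
	const char*	as_literal = "Hello world";	// pointer to a string literal
	char*		on_heap = new char[12];		// 11 characters + null terminator
	std::strcpy(on_heap, "Hello world");

	bool same = inserted_text(as_array) == "Hello world"
		&& inserted_text(as_literal) == "Hello world"
		&& inserted_text(on_heap) == "Hello world";

	delete[] on_heap;				// release the heap memory
	return same;
}
```

Calling same_output_for_all() from main should return true: the inserter receives a pointer to the first character in every case and cannot tell where the characters are stored.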

Furthermore, we can reexamine a familiar operation in the light of the previous discussion:

cout << "Please enter the value for x:" << endl;
Printing string literals to the console. The string literals in our output statements (prompts and output labels) are C-strings. The compiler stores them in memory and adds the null terminator to the end of the string. The inserter uses the literal's address to send the string's characters to the console, up to but not including the null terminator.
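A short sketch makes the hidden null terminator visible. The three function names are hypothetical and exist only for this illustration; the literal "Hello" stands in for any prompt or label.

```cpp
#include <cstddef>
#include <cstring>

// Bytes the compiler stores for the literal, null terminator included.
std::size_t stored_bytes()
{
	return sizeof("Hello");			// 5 characters + 1 terminator
}

// Characters that come before the null terminator; these are the
// characters the inserter would send to the console.
std::size_t visible_chars()
{
	return std::strlen("Hello");
}

// The byte stored immediately after the last visible character.
char byte_after_text()
{
	return "Hello"[5];
}
```

Here stored_bytes() is 6, visible_chars() is 5, and byte_after_text() is the null character '\0', which marks where output stops.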

While C-string output is predictable, input is not.

C-string Input

The first step in reading data into a C-string is to define the C-string variable. We can define a C-string as a pointer or as an array, so which definition is appropriate? If we define a C-string as a pointer, the pointer must point to memory allocated with the new operator or to an automatic array. Alternatively, we can create the C-string as an array from the start. The latter choice is common, especially for simple input.
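The three storage choices can be sketched side by side. All of the function names below are hypothetical, the size of 100 is an arbitrary assumption, and the data is copied in from a parameter rather than read from the console so that the sketch focuses only on where the characters are stored.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: copies text into a buffer of the given size,
// truncating if necessary and always leaving room for the null terminator.
void fill(char* buffer, std::size_t size, const char* text)
{
	std::strncpy(buffer, text, size - 1);
	buffer[size - 1] = '\0';
}

std::string via_array(const char* text)
{
	char	s[100];				// the array itself is the storage
	fill(s, sizeof s, text);
	return s;
}

std::string via_pointer_to_array(const char* text)
{
	char	storage[100];
	char*	s = storage;			// pointer to an automatic array
	fill(s, sizeof storage, text);
	return s;
}

std::string via_pointer_to_heap(const char* text)
{
	char*	s = new char[100];		// pointer to memory allocated with new
	fill(s, 100, text);
	std::string result = s;			// copy the characters before releasing them
	delete[] s;
	return result;
}
```

In every case the pointer or array refers to real storage before any characters are written. A pointer that has not been made to point at such storage gives the program nowhere to put the input.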

The next consideration is how big we should make the array. If we're lucky, the associated problem will inform us, but we're left on our own most of the time. If we make the C-string too small, the program may not have enough space to store all the data, but if it is too big, the program may waste space. Memory is typically plentiful unless working in a constrained environment (like a satellite or phone), so we try to identify a worst-case scenario and add a generous safety margin. We take that approach with the following simple program:

#include <iostream>
using namespace std;

int main()
{
	char	input[100];			// C-string variable

	cout << "Enter a string" << endl;
	cin >> input;				// does not read spaces
	cout << input << endl;

	return 0;
}
C-string input: failed version. The character array input is large enough to hold a C-string ninety-nine characters long plus the null termination character - long enough for a simple demonstration. If we enter the string Hello at the prompt, the program will naturally print Hello on the console. But what happens if we enter Hello world? The program again prints Hello! Unfortunately, the extractor operator reads up to but not including the first white-space character and then stops. The extractor reads strings reliably only when the input contains no spaces or tabs, and real input is rarely that restricted. Paraphrasing a popular 1975 film, "[We're] gonna need a bigger boat."
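The behavior can be reproduced without typing at the console. The sketch below is an illustration under assumptions: the function names are hypothetical, and a string stream stands in for cin so that the "typed" text is supplied as a parameter.

```cpp
#include <sstream>
#include <string>

// Performs one extraction, as cin >> input does, and returns what was stored.
std::string first_extraction(const std::string& typed)
{
	std::istringstream in(typed);		// stands in for cin
	char	input[100];
	in >> input;				// stops at the first white-space character
	return input;
}

// Performs two extractions and returns what the second one stored.
std::string second_extraction(const std::string& typed)
{
	std::istringstream in(typed);
	char	input[100];
	in >> input;				// reads the first word
	in >> input;				// skips the space, then reads the next word
	return input;
}
```

With the text Hello world, first_extraction returns Hello and second_extraction returns world. The rest of the line is not lost; it stays in the stream, but a single extraction never delivers it.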

The "bigger boat" we need is another way to read C-strings. cin is an instance (i.e., an object) of a class named istream (short for input stream). The extractor operator, >>, is a function that operates on istream objects, and it is this function that stops at white-space characters. Fortunately, istream also defines a member function named getline. getline reliably reads all characters, including spaces and tabs. It reads an entire line - everything up to and including the new-line character. Furthermore, it discards the new-line character, which saves us from any errors that might arise from a leftover new-line character remaining in the input stream, and it spares us from having to discard the new-line ourselves with a call to ignore. The next version of the program works the way most of us would expect:

#include <iostream>
using namespace std;

int main()
{
	char	input[100];

	cout << "Enter a string" << endl;
	cin.getline(input, 100);		// reads spaces
	cout << input << endl;

	return 0;
}
C-string input: correct version. The first getline argument is the name of the C-string where the function saves the input data. input is an array, so the name is the address of the C-string, meaning that the program passes input by pointer, which is an INOUT passing technique. The second argument is the size of the array. getline will read at most one less than this value - it always reserves space to hold the null termination character. So, in this example, getline will read at most 99 characters. Regardless of how many characters it reads, getline always appends a null termination character at the end of the input data.
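Two details of getline are easier to see in a sketch: the size limit, and the leftover new-line that the extractor leaves behind. The names LineResult, read_short_line, and line_after_number are hypothetical, the buffer of 10 characters is deliberately small for the demonstration, and a string stream again stands in for cin.

```cpp
#include <limits>
#include <sstream>
#include <string>

struct LineResult
{
	std::string	text;		// characters getline stored in the buffer
	bool		failed;		// true if getline left the stream in a failed state
};

// Reads one line into a 10-character buffer.
LineResult read_short_line(const std::string& typed)
{
	std::istringstream in(typed);		// stands in for cin
	char	input[10];
	in.getline(input, 10);			// stores at most 9 characters + '\0'
	return { input, in.fail() };
}

// Reads a number with the extractor and then a line with getline.
std::string line_after_number(const std::string& typed, bool discard_newline)
{
	std::istringstream in(typed);
	int	number = 0;
	char	input[100];
	in >> number;				// leaves the new-line in the stream
	if (discard_newline)			// skip the rest of the current line
		in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
	in.getline(input, 100);
	return input;
}
```

Given Hi there followed by a new-line, read_short_line stores the whole line. Given Hello world followed by a new-line, it stores only Hello wor and reports a failed stream, because the line does not fit in the buffer. Given 42, a new-line, and then Hello world, line_after_number returns an empty string unless the leftover new-line is discarded first; a program that reads every line with getline does not run into this problem.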