8.3.2. `string` Functions: Documentation And Examples

Review

A Review of Object-Oriented Programming: C++ vs. Java
- Calling a member function defined in a class
l-value and r-value (c)
Argument passing
- By-Pointer
- By-Reference
Return-By-Reference
Default arguments
size_t

The C++ string library begins with the string class and its member functions, but it also includes many non-member functions with at least one string argument. Member functions are defined in and belong to a class, making them class members. The string class constructors introduced previously are one kind of member function. The <string> header file also includes numerous functions using string objects without being class members.

C++ allows programmers to overload operators like + and =, and the string library overloads several, giving them new meanings when applied to string objects. Chapter 11 describes how programmers overload operators, but we need one detail now to understand the documentation for some frequently used string operators. "Overloading" means that a program reuses a name, so a program with overloaded functions has two or more functions with the same name. In the case of overloaded operators, the name consists of the keyword "operator" and the characters forming the operator. For example, the function name for the + operator is operator+, and the name for the = operator is operator=. The main difference between a "regular" function and an operator function is that operators support a novel calling syntax.

Although the text doesn't formally introduce classes and objects until the next chapter, one goal of the object-oriented paradigm is making it (relatively) easy for programmers to use them. We can usually understand the string functions and operators through their respective behaviors supported by abstract representations of their effect on the objects. There are too many functions and operators to remember (or cover) in detail. So, this section presents a few basic string functions and operators and directs you to the online documentation for details and more operations.

Basic `string` Operations

A depiction of a string as an array or series of boxes. The string holds some characters indicated as used or filled array elements. Some array elements are empty or unused, indicated as empty boxes. The string's capacity is the sum of the filled and empty array elements.

**`string` length and capacity**. A fundamental tenet of the object-oriented paradigm is that objects should hide their implementation details from the programs using them. Furthermore, some variation exists between compilers. For example, many create short strings with a capacity greater than their initial size, allowing them to add a few characters before reallocating memory. Nevertheless, we can confidently say `string` objects are based on arrays.

An abstract representation of a `string` illustrating the relationship between its length (or size) and capacity.

The statement creates and initializes a `string` object for the following examples.

The number of characters currently saved in the `string`. Both functions return the same value.

Programs call member functions by binding them to an object. The dot operator, `.`, binds object `s` to functions `length` and `size`.

The total number of characters the `string` can hold before it must grow.

The statement prints `11 15` when compiled with Visual Studio and g++, demonstrating the initial size is less than the capacity.

Documentation Prototypes	Example Function Calls
	string s = "Hello world";
(a)	(b)
size_t length(); size_t size();	for (size_t i = 0; i < s.length(); i++) . . . for (size_t j = 0; j < s.size(); j++) . . .
(c)	(d)
size_t capacity();	cout << s.length() << " " << s.capacity() << endl;
(e)	(f)

Documentation Prototypes	Example Function Calls
char& operator[](size_t pos);	for (size_t i = 0; i < s.length(); i++) cout << s[i] << endl;
(a)	(b)
	s[0] = 'X';
	(c)
char& at(size_t pos);	for (size_t i = 0; i < s.length(); i++) cout << s.at(i) << endl;
(d)	(e)
	s.at(0) = 'Z';
	(f)
char& front(); char& back();	cout << s.front() << endl; s.front() = 'X'; cout << s.back() << endl; s.back() = 'Z';
(g)	(h)

Character access. C++ programs often need to access the individual characters in a string object, and the class provides programmers with two options: an operator and a function. Both can appear on either side of the assignment operator because they return a character reference. The examples assume that s is a string object. Like arrays and C-strings, instances of the string class are zero-indexed, constraining legal index values to 0 to length-1.

The index operator, [], returns the character at position pos in the string. The operator does not validate the index, so an out-of-range index has undefined behavior: it may "crash" the program or return meaningless results.
The at function returns the character at position pos in the string. Unlike the index operator, it does validate the index, and an out-of-range index causes the function to throw an exception (a runtime error).
The index operator returns a reference, allowing it to operate as an l- or r-value (appearing on the left or right side of an assignment operator).
It may look odd to have a function call on the left-hand side of the assignment operator, but it is allowed when a function returns a reference. This statement stores 'Z' at the zeroth position in the string.
Two convenience functions performing the same operations as s[0] and s[s.length()-1], respectively.
The functions may operate as l- or r-values.

`string` Construction Operators

Documentation Prototypes	Example Function Calls
string& operator=(const string& str); string& operator=(const char* s); string& operator=(char c);	string s1("hello"); string s2; s2 = s1; s2 = "hello"; s2 = 'X';
(a)
string operator+(const string& lhs, const string& rhs); string operator+(const string& lhs, const char* rhs); string operator+(const string& lhs, char rhs);	string s1("hello"); string s2(" world"); string s3; s3 = s1 + s2; s3 = s1 + " world"; s3 = s1 + s2 + '!';
(b)
string& operator+=(const string& str); string& operator+=(const char* s); string& operator+=(char c);	string s1("hello"); string s2(" world"); s1 += s2; s1 += " world"; s1 += ' '; s1 += "world";
(c)

The string assignment and concatenation operators. A previous section listed all string operators. This figure focuses on utilizing documentation to write usable function calls. The examples demonstrate the operator's syntax, so some statements in an example may duplicate the actions of previous statements.

The assignment operator copies (i.e., duplicates) the right-hand operand to the left-hand string. The assignment operator is a member of the string class, so its left-hand operand must be a string object.
The + operator is overloaded, forming a string concatenation operator. The concatenation operator is not a string class member; at least one operand must be a string object, but it can appear on the operator's left or right side. The documentation only shows a string object on the operator's left side to save space.
The concatenation with assignment operator is a shorthand for s1 = s1 + right-hand-operand. The operator is a member of the string class, so its left-hand operand must be a string object.

Chapter 11 explains the left versus right side operand requirements in greater detail.

Searching In `strings`

The string class also defines a symbolic constant named npos, which is the largest possible value for the data type size_t. String objects may contain at most npos characters, which means that when accessing individual characters within a string, legal index values, i, are in the range 0 ≤ i < npos. This arrangement allows any string function that returns an index into a string to return npos to indicate failure.

A string object whose length and capacity is npos, the maximum capacity allowed. The first element is at index location 0, and the last is at location npos-1.

**The `string` constant `npos`**. Programs that process text often need to search for a sub-string (either a `string` object or a C-string) or a character in a `string`. When found, the searching functions return the index of the character or of the sub-string's first character. They return a non-index value to signal when they don't find the sub-string or character. The `string` class defines a symbolic constant named `npos`, which is the largest possible value for the data type `size_t`. Instances of the `string` class are zero-indexed, meaning that legal index values, `i`, are in the range `0 ≤ i < npos`. The searching functions return `npos` when the character or sub-string is not in the target.

An abstract representation of a maximum-length `string` illustrating the index values and the `npos` constant.

When a program uses `npos` it must bind it to the `string` class with the scope resolution operator, `::`. Chapter 9 elaborates on this syntax, but please use it as illustrated for now.

Documentation Prototypes	Example Function Calls
size_t find(const string& str, size_t pos = 0); size_t find(const char* str, size_t pos = 0); size_t find(const char c, size_t pos = 0);	string s = "Hello, World!"; size_t index = s.find("World"); if (index != string::npos) cout << "World found at " << index << endl; else cout << "World not found." << endl;
(a)
size_t rfind(const string& str, size_t pos = npos); size_t rfind(const char* str, size_t pos = npos); size_t rfind(const char c, size_t pos = npos);	string s = "Hello, World!"; size_t index = s.rfind('l'); if (index != string::npos) cout << "'l' found at " << index << endl; else cout << "'l' not found." << endl; cout << s.rfind('l', 5) << endl;
(b)

The find and rfind functions. The various versions of the find and rfind functions search a string object for a sub-string or a character. For find, pos is the index where the search begins, and the default, pos = 0, searches the entire string. For rfind, pos is the last index (counting from 0 to the right) where a match may begin, and the default, pos = npos, also searches the entire string. (Like arrays, instances of the string class are zero-indexed.)

The find function searches left-to-right (English reading order), beginning at the pos index location, for a string sub-string, a C-string sub-string, or a single character.
The rfind function operates similarly to find but searches right-to-left (reverse English reading order). The first example (calculating index) locates the 'l' in "World". The effect of pos on rfind is, to me, counterintuitive. rfind searches the characters from 0 to pos (inclusive) but from right to left. So, the second example, embedded in the cout statement, finds the second 'l' in "Hello".

Conversions: `string` To Number and Number To `string`

The console only "knows about" textual or character data. Whenever a program reads numeric data from or writes it to the console, it must convert between numbers and text. Library functions, like >> and <<, typically perform the conversions as part of the I/O process. But sometimes, a program must perform the conversion without performing any I/O (the palindrome-number problem, for example). The following figures illustrate some of the conversion functions.

Number To `string` Conversions: Documentation And Examples

Documentation Prototypes	Example Function Calls
	int counter, i, j; double x, y;
string to_string(int val);	string s1 = to_string(123); string s2 = to_string(counter); string s3 = to_string(i + j);
string to_string(double val);	string s4 = to_string(3.14159); string s5 = to_string(x + y); string s6 = to_string(pow(x, y));

Converting numbers to strings. C++ provides a family of overloaded functions named to_string that convert numbers to string objects. The numeric arguments are passed-by-value. This passing method is appropriate as the largest numeric types only require a few bytes of memory, and it allows programmers to form the arguments with a variety of expressions. The C++ library provides one overloaded function for each fundamental numeric data type and prototypes them in the <string> header file - part of the std namespace. Please see to_string for a complete list and additional details.

`string` To Number Conversions: Documentation And Examples

Documentation Prototypes	Example Function Calls
	string s1 = "123"; string s2 = "0XAF27"; string s3 = "3.14159"; size_t index;
int stoi(const string& str, size_t* index = 0, int base = 10);	cout << stoi(s1, &index) << endl; cout << stoi(s2, &index, 16) << endl;
double stod(const string& str, size_t* index = 0);	cout << stod(s3, &index) << endl;

Converting string objects to numbers. Overloading works well for the number-to-string functions because their arguments have different data types. It does not work when going the other way because the primary argument is always a string, so the library instead provides several differently named functions, each converting a well-formed string into one kind of number. (The additional arguments deal with the number base of the digits in the string or other details.) A well-formed string contains digits and other characters appropriate for the base and number type.

"stoi" and "stod" are shortenings of "string to integer" and "string to double," respectively. There are similar functions for all numeric types. Please see cplusplus.com, Functions for a complete list and details.
The string argument is passed by reference but the const keyword prevents the function changing it.
The second, index, argument is passed by pointer, making it an INOUT argument. It is unused if set to nullptr. The next figure describes it in more detail.
The base parameter (integer conversion functions only) specifies the number base of the digits in the string (i.e., the characters the string may use): 10 or decimal (the default), 16 or hexadecimal, 2 or binary, etc.

A string named s represented as a sequence of adjacent characters: 123hello. A pointer named index points to the first character, '1', which has an index location of 0.

**The `index` argument**. When converting a number to a `string`, the compiler can validate the argument expression, detecting and reporting errors. However, the primary argument to functions doing the opposite conversion is always a `string`, and the compiler can't "see" its contents - it can't tell if the `string` contains a well-formed number or not. When the program supplies it, the `index` argument (passed-by-pointer, making it an INOUT argument) records where the `string`-to-number conversion stops: the index of the first character the function did not convert.

Example statements demonstrating the `index` argument's behavior when the function can only convert the first part of the `string` to a number. The function returns 123 and sets `index` as illustrated.

An abstract representation of the initial condition before the program calls the `stoi` function.

An abstract representation of the final condition following the `stoi` call.

Contrasting the behavior of `index` with the similar `endptr` in the corresponding C-string functions, I believe this mechanism could be more robust. Adding an 'x' to the beginning, `"x123hello"`, leaves `stoi` with nothing to convert, so it throws an exception. If the program doesn't catch the exception, it "crashes" by calling the `abort` function.

The same string or sequence of characters, but now index points to 'h', which has an index location of 3.

Additional Resources

This section describes many of the most useful and frequently used string functions but does not represent a complete list. The following links present additional functions and greater detail. I recommend creating bookmarks for them in your web browser.

string s = "123hello"; size_t index = 0; ... int counter = stoi(s, &index);
(a)	(b)	(c)
