8.2.2.6. C-Strings And Number Conversions


Converting a number to a corresponding string or sequence of characters is a surprisingly common task. Converting a string of digits and related characters (e.g., '.' or '-') to a number is just as frequent. When the operating system (OS) runs a program, it provides a host environment separating the program from the hardware. Programs exchange data with the OS as C-strings through system calls. The system calls that read from and write to the console describe the data with only two arguments: a C-string and the number of characters to transfer. So, the OS exchanges data with the program and the console as C-strings.

Programs are responsible for converting between strings and numbers. Most programming languages provide services, as part of the language or as library functions, to perform the conversions and formatting. For example, C++ provides the inserter and extractor operators (<< and >>), and a variety of manipulators. Nevertheless, the services ultimately make system calls to the OS, exchanging data only as strings. Subsequent examples demonstrate that it is sometimes convenient for programs to make conversions outside of I/O operations.

C-strings To Numbers

Documentation Prototypes          Example Function Calls
int atoi(const char* str);        cout << atoi(s1) << endl;
                                  cout << atoi("123") << endl;
long atol(const char* str);       cout << atol(s1) << endl;
                                  cout << atol("123") << endl;
double atof(const char* str);     cout << atof(s2) << endl;
                                  cout << atof("3.14159") << endl;
ASCII to number conversion functions. The function names use "ASCII" as a synonym for a string of ASCII characters - a C-string. The single argument for each function is a C-string in any of its valid representations. The example calls assume
Documentation Prototypes                                     Example Function Calls
long strtol(const char* index, char** endptr, int base);     cout << strtol("123", nullptr, 10) << endl;
                                                             cout << strtol("0xafcd", nullptr, 16) << endl;
double strtod(const char* index, char** endptr);             cout << strtod("3.14159", nullptr) << endl;
Advanced C-string to number conversion functions. The functions' increased flexibility requires additional arguments configuring the conversion process. For example, the base argument (called "radix" in some documentation) specifies the base of the digits in the C-string: 10 or decimal, 16 or hexadecimal, 2 or binary, etc. The following figure describes the most perplexing part of the prototypes: the double-pointer endptr. As illustrated here, setting it to nullptr disables the feature it controls.
(a) A C-string named s represented as a sequence of adjacent characters: 123 456 789. A character pointer named end points to the first character, and a double character pointer named endptr points to end.
(b) The same string, but now end points to the space between 3 and 4; endptr still points to end.
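(c)
#include <cstdlib>		// declares strtol
#include <iostream>
using namespace std;

int main()
{
	char s[] = "123 456 789";
	char* end = nullptr;				// assigned by the first strtol call
	cout << strtol(s, &end, 10) << endl;		// starts at the beginning of s
	cout << strtol(end, &end, 10) << endl;		// resumes where the last call stopped
	cout << strtol(end, &end, 10) << endl;
	return 0;
}

(d)
#include <cstdlib>		// declares strtol
#include <iostream>
using namespace std;

int main()
{
	char s[] = "123 456 789";
	char* end = s;					// start at the first character
	while (*end != '\0')				// stop at the null terminator
		cout << strtol(end, &end, 10) << endl;

	return 0;
}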
The endptr parameter. C-string s is a null-terminated character array having three numeric sequences separated by delimiters (the empty boxes). The delimiters consist of one or more space, tab, or new-line characters. The char** endptr parameter in the function prototypes above defines a C-string passed-by-pointer. Adding parentheses clarifies the separate parts: (char*)*. Passing the local pointer variable end by pointer makes it an INOUT argument, saving the function's current processing location in the string. Each call to strtol or strtod stores, through endptr, the address of the first character it did not convert, which in these examples is the delimiter following the number.
  1. The example programs pass the address of end to the strtol function's endptr parameter, making endptr point to end. Program (c) initializes end to nullptr and relies on the first strtol call to give it a meaningful value; program (d) initially sets end to the first character in the C-string s (the dashed line) so that the while-loop can test *end before the first call.
  2. The first strtol call leaves end pointing to the first delimiter. The next call skips all delimiters and begins processing the string at the next digit character.
  3. The client program defines the character pointer char* end, initializes it to nullptr, and passes it by pointer (&end) to strtol. The initialization and pass do not have the same effect as passing nullptr as the second argument, as illustrated in Figure 2. In this figure, end is null, but in Figure 2, endptr is null.
  4. The while-loop uses the variable end in three places, but the associated operator (or lack of an operator) yields three distinct meanings:
    1. end (without an operator) is a character pointer pointing to C-string s or a sub-string in s.
    2. *end dereferences end, yielding the single character it points to. The loop ends when that character becomes the null terminator at the end of the C-string.
    3. The expression &end calculates the address of end, which is passed to the strtol function.

Numbers To C-strings

Strangely, C++ does not include a standard counterpart to atoi for the opposite conversion: there is no standard itoa. (Although more complicated, it can do the conversions with stream objects.) The C runtime library (CRT) does have some conversion functions, but the ANSI C++ standard does not include them. Therefore, their support and implementation vary between compilers. You can often find source code for these functions on the Internet, and we'll write a simple version in the cpalnumber example program.

Documentation Prototypes                                        Example Function Calls
                                                                char s[25];
char* itoa(int num, char* str, int base);                       itoa(123, s, 10);
                                                                itoa(0xaf48, s, 16);
char* _itoa(int num, char* str, int base);                      _itoa(123, s, 10);
                                                                _itoa(0xaf48, s, 16);
errno_t _itoa_s(int num, char* str, size_t size, int base);     _itoa_s(123, s, 25, 10);
                                                                _itoa_s(0xaf48, s, 25, 16);
Numbers to C-string functions. These functions have limited compiler support. Visual Studio supports all of them but may issue warning messages depending on the compiler's settings; g++, on Linux, supports none. Aside from the underscore as part of the name, the first two functions are identical; the third function is part of Microsoft's secure function set (see https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/itoa-s-itow-s?view=msvc-170 for a complete list of secure number conversion functions).