C-strings are a fundamental or primitive data type that the compiler processes without additional information. That means that programmers can define C-strings and use the basic operators like indexing ([]), the inserter (<<), and argument passing without #including a C-string header file. However, a program must have prototypes for the C-string functions before using them. C++ inherits most C-string functions from the C programming language, so their prototypes are available in two header files.
#include <cstring> (preferred for C++)
#include <string.h> (usable with both C and C++)

The header files contain the function prototypes for all the C-string functions. C++ documentation typically includes the same prototypes at the top of the function's description. The functions are a standard part of C++'s API, meaning the compiler system stores each function's machine instructions in a library, and the linker (or loader) extracts the instructions and incorporates them into a program as part of the compilation process. New C++ programmers need to understand four prerequisite concepts when learning to use the C-string documentation:
Internally, computers store textual data in a coded format. Unicode is common today (Java's char type, for example, holds a 2-byte Unicode code unit). A single 2-byte unit can represent 64K (65,536) distinct characters directly, although Unicode as a whole defines more than a million code points. C++ also provides a wide character type, wchar_t, whose size depends on the implementation (2 bytes on Windows, typically 4 bytes elsewhere), but we'll use the older, 1-byte character type char for our examples. Our examples assume that char data uses the older American Standard Code for Information Interchange (ASCII) encoding scheme. ASCII is a 7-bit code that defines 128 characters, while a 1-byte (or 8-bit) char can store 256 different values. The first 128 values are standardized by ASCII, but the last 128 (the extended ASCII code) are not always interpreted in the same way by all devices.
We can find the ASCII encoding of a specific character in any ASCII table (a web search for "ASCII table" will provide numerous examples). The first 32 characters (codes 0-31) are control characters used to control connected hardware devices. The remaining characters encode the symbols, punctuation marks, digits, and alphabetic characters that comprise most of the text on a computer screen. Interestingly, the digits '0' through '9' are not represented by the numbers 0-9. The digit '0' is encoded as the numeric value of 48 (decimal or base-10), while the character '9' is encoded as 57. ASCII encodes all the digits in the contiguous range 48-57. The alphabetic characters also occupy contiguous ranges, but the uppercase letters (65-90) are separate from and precede the lowercase letters (97-122).
nullptr, introduced in Chapter 4, is a special pointer value that indicates "pointing to nothing." Chapter 4 also suggested that we can test a pointer's value to see whether it is nullptr. But for this test to be meaningful, we must initialize the pointer before running it. A pointer variable definition, such as char* p;, allocates the memory needed to store a pointer but does not always initialize the value stored in the pointer. An uninitialized pointer variable does not store nullptr or any valid address.
(a)
    char* p = nullptr;
    . . .
    if (p == nullptr) . . .
    if (p != nullptr) . . .

(b)
    char s[100] = "";        // (i)

    char s[100];             // (ii)
    s[0] = '\0';

    if (s != nullptr) . . .  // is true

(c) [Figure not available]
\0 marks the end of the string (i.e., the end of the data). So, while memory for index values ≥ 1 may contain characters from previous operations, they do not contain useful data now, and s is logically empty.

In the C-string library functions, nullptr may appear as either an argument or as a return value. But what it means depends on the specific function. For example, when a function returns nullptr, it may indicate an error has occurred; if the function is processing data, it may mean that all data is processed; or in the case of a searching function, it may indicate the search didn't find what it was looking for. nullptr can also be used as a function argument, but how a function interprets it varies from one function to another. For example, the strtok function treats a nullptr argument as a request to continue searching a previously provided C-string, and a non-null argument as a request to start searching a new C-string.
The parameters and return types of the C-string functions generally reflect the behavior their name (cryptically) suggests. However, the function prototypes in their documentation often use peculiar and unexpected datatypes. Furthermore, the datatypes can vary between document versions. Type aliases account for some of the peculiarities, while others are "normal" C++ types not yet covered. Four example functions introduce some of the datatypes and demonstrate the importance of carefully reading multiple documentation sections, as each contributes information necessary to use the functions correctly.
char* strcpy(char* dest, const char* src);
char* strncpy(char* dest, const char* src, size_t num);
void* memcpy(void* dest, const void* src, size_t num);
void* memmove(void* dest, const void* src, size_t num);
char* strcpy(char* dest, const char* src);
size_t type alias
void* (i.e., void pointers). C++ converts any pointer argument passed to a void pointer parameter to a void pointer; programs must explicitly cast it back to a known type to access data through it. The text covers void pointers in detail later in the chapter.
(a)
    char* dest;    // Wrong!
    . . .
    strcpy(dest, src);

(b)
    char dest[100];
    . . .
    strcpy(dest, src);

(c)
    char* dest = new char[100];
    . . .
    strcpy(dest, src);

(d)
    char dest[100];
    strcpy(dest, "Hello world");
destination = source. Programmers can create the parameters in various ways, but they must ensure the destination has sufficient space to store the results.
    char s1[100];
    const char* s2 = "Hello, world!";    // string literals are const in C++
    cout << strcpy(s1, s2) << endl;
Microsoft describes the Security Features in the CRT, saying that "Many old CRT functions have newer, more secure versions. If a secure function exists, the older, less secure version is marked as deprecated and the new version has the _s ('secure') suffix."
| (a) | #define _CRT_SECURE_NO_WARNINGS |
| (b) | errno_t strcpy_s(char *strDestination, size_t numberOfElements, const char *strSource); |
There are many C-string functions, too many to explore each in detail, and too many to remember all their parameters and return values. It's more productive to learn the general operations the C-string library provides and how to use the library documentation. The following sections demonstrate the use of the C-string documentation for some of the most used functions and elaborate on their implementation to help users understand how C-strings work. Follow the links in the highlighted box to excellent C-string documentation.
I recommend creating a bookmark for these pages in your web browser.