Throughout my first career as a software engineer, some of my tasks always involved writing and maintaining programs written in C. Midway through that career, I began writing and maintaining object-oriented programs written in C++. That experience allowed me to teach C++ to my colleagues within and without my main employment. At that time, experienced C programmers were the most frequent C++ students. So, creating a string class based on C-strings was a natural object-oriented programming example.
Creating a string class based on C-strings is still instructionally beneficial. It gives new C++ programmers additional experience using the C-string functions and demonstrates many object-oriented features presented in the current chapter. The previous example emphasized pictures for solving problems and converting the solutions to working C++ functions. This example encourages you to use the C-string functions' documentation and practice your elaboration skills. As you study each function, try to explain to yourself what each statement does. You can put yourself in an authentic mindset by imagining that you are in a formal code review and must explain to your peers how each function operates and why you have implemented it as it appears in the example. Working in small groups, taking turns elaborating successive functions to each other, is a proven learning technique.
As we transition our string class from length-prefixed to C-style strings, please consider the following issues:
Programmers initially created C-style and length-prefixed strings as fundamental, non-object-oriented string representations. Although classes built on these representations can be ungainly or inefficient, they are good examples of C++ class development.
Our length-prefixed string class has a 255-character fixed capacity. This implementation wastes space for short strings and prevents creating strings longer than 255 characters. Nevertheless, it also frees us from explicitly maintaining a string's capacity and simplifies functions creating or reading strings.
For example, the LPString readln function created a string object and added characters to it one at a time. The function stopped adding characters when it read a new-line character or the string reached capacity. Provided that a program can allocate sufficient memory, C-strings do not have a maximum capacity, which makes the readln function challenging. To implement it, we must modify its signature, constrain it, or accept a much more complex solution. The first option preserves the function's flexibility and is easily programmed.
Creating a string class based on C-style strings alleviates an inherent limitation of C-strings. C++ arrays do not maintain information about their size or capacity. Likewise, C-strings, based on arrays, don't maintain their capacity. Whenever a program passes an array or C-string to a function, it must pass the capacity as a separate value. Recall that functions always pass arrays (and therefore C-strings) as pointers, so we can't use sizeof to find the size of an array argument. We solve this problem by adding a capacity member variable to our C-string-based string class.
Finally, an object-oriented string can be dynamic - increasing and decreasing its capacity as needed. We'll allocate and deallocate memory with the new and delete operators. Using dynamic memory forces us to add a destructor function to the class.
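The sizeof limitation mentioned in the list above can be seen in a short sketch. The two helper functions are hypothetical and exist only for illustration; they are not part of the CString class.

```cpp
#include <cstddef>

// Hypothetical helper: the parameter is a pointer, which is what any array
// argument decays to, so sizeof measures the pointer and not the array.
std::size_t size_seen_by_function(const char* s)
{
    return sizeof(s);
}

// Hypothetical helper: the caller supplies the capacity as a separate value.
std::size_t capacity_seen_by_function(const char* s, std::size_t capacity)
{
    (void)s;            // the text itself is not needed for this illustration
    return capacity;
}
```

Given char buffer[100], sizeof(buffer) is 100 where the array is defined, but size_seen_by_function(buffer) reports only the size of a pointer. Storing the capacity in a member variable removes the need for the second argument.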
We now have sufficient background to outline our class, which we name CString to emphasize its reliance on C-strings.
CString Class
Member Variables
class CString
{
    private:
        char*   text = nullptr;
        size_t  capacity = 0;
};
text - character pointer pointing to an array allocated on the heap.
capacity - size type (an unsigned integer) that saves the number of elements in the allocated array.
Empty String
String With Content
CString: A C-string based class. Embedding a C-string in a class shifts the responsibility for the error-prone memory management from the application to the class programmer. It also "hides" the logic and operations, making the string dynamic (able to change its capacity as needed). Once the class programmer writes and verifies the constructors, destructor, and algorithmic member functions, application programmers can use them without regard to their complexity.
Member functions that create new CString objects or modify this object must update both member variables.
Throughout the text, I've referred to C-style strings as C-strings. I named the class CString to distinguish it from the other string examples in this chapter. Although the two names are similar and related, they denote different string representations.
inline CString::CString(char c)
{
    capacity = 2;
    text = new char[2];
    text[0] = c;
    text[1] = '\0';
}
(a)
(b)
(c)
inline CString::CString(const char* s)
{
    capacity = strlen(s) + 1;
    text = new char[capacity];
    strcpy(text, s);
}
inline CString::CString(const CString& cs)
{
    capacity = cs.capacity;
    text = new char[capacity];
    strcpy(text, cs.text);
}
inline CString::~CString()
{
    if (text != nullptr)
        delete[] text;
}
(d)
(e)
(f)
CString constructors and destructor: building and destroying objects.
Programmers put inline functions in the class header file below the class specification. This figure and those following provide complete function implementations without elaborating how or what the individual function statements do.
Many C++ default string constructors create an empty string with an initial capacity of 15 characters. We can understand the value of this approach by imagining a program that creates a new string and adds characters to it one at a time. Enlarging a string requires three steps. First, the program must allocate memory for a new array; next, it copies the contents of the old array to the new one, and finally, it deallocates the old array. It's inefficient to do this for each character added to the array, so the default constructor gives the new string a little "growing room." We copy this behavior with the CString default constructor.
Although this constructor has a single, non-reference parameter, it isn't a conversion constructor - a fact that can cause some trouble. C++ automatically converts integers to characters; if the integer is too large to fit in a character, it only uses the least-significant 8 bits. So, CString cs(25); may be a call to (b) or to (c). Client code must disambiguate the call with a typecast: CString cs((size_t)25);. The C++ string class avoids the problem by not defining a similar constructor.
A conversion constructor that converts a single character to a CString object.
A conversion constructor that converts a C-string to a CString object.
The copy constructor that makes a new CString object by copying an existing one.
The CString destructor destroys an object by deallocating heap memory. Recall that destructors do not have parameters, implying they cannot be overloaded.
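The overload problem described for the capacity constructor can be sketched with a small stand-in class. OverloadDemo and its two flags are hypothetical names used only for this illustration; the sketch assumes a class that declares both a size_t constructor and a char constructor, as the notes above describe.

```cpp
#include <cstddef>

// Hypothetical stand-in for a class with both a capacity constructor and a
// single-character constructor; it only records which overload ran.
struct OverloadDemo
{
    bool built_from_size = false;
    bool built_from_char = false;

    OverloadDemo(std::size_t) { built_from_size = true; }
    OverloadDemo(char)        { built_from_char = true; }
};

// OverloadDemo d(25);  // rejected as ambiguous: the int literal 25 converts
//                      // equally well to size_t and to char
```

With the cast, OverloadDemo d((size_t)25); selects the capacity constructor, while OverloadDemo e('x'); selects the character constructor because a character literal matches it exactly.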
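char& CString::at(int index)
{
    // reject negative indexes and indexes at or beyond the end of the text;
    // ">=" also handles an empty string, where strlen(text) - 1 would wrap around
    if (index < 0 || (size_t)index >= strlen(text))
        throw "index out of bounds";
    return text[index];
}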
(c)
CString access functions. "Access function" is a collective term for both getter and setter functions. As illustrated here, access functions are often small and relatively straightforward.
As a getter function, we could have named it get_length or getLength, but we chose to follow the C++ string example.
An example of a typical getter.
The function verifies the index is inside this string and throws an exception if it isn't. The function returns a reference, enabling it to serve as both a getter and a setter:
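A minimal sketch of how a reference-returning at function serves both roles follows. TinyString is a hypothetical, fixed-size stand-in used only to keep the example self-contained; it is not the CString class.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical fixed-size stand-in for CString, just large enough to show at().
struct TinyString
{
    char text[16] = "cat";

    // Returns a reference to the selected character, so the caller can read
    // the character or assign a new one through the same call.
    char& at(int index)
    {
        if (index < 0 || static_cast<std::size_t>(index) >= std::strlen(text))
            throw "index out of bounds";
        return text[index];
    }
};
```

Given TinyString s;, the expression s.at(0) reads the first character (a getter), while s.at(0) = 'b'; replaces it (a setter), changing the text from "cat" to "bat".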
print(char*) is static, but the "static" keyword only appears with the function prototype in the class specification. Static functions do not have a this pointer and are not bound to an object when called. See client.cpp at the bottom of the page for examples.
Prints a CString object to the console but leaves the cursor at the end of the text.
Calls (b) and then prints an endl to move the cursor to the beginning of the next line.
The general string ADT that we outlined previously included a parameterless readln() function. We could implement it in the CString class (see the next figure), but it isn't practical. So, for simplicity, we follow the C-string getline example and define a parameter that is the maximum number of characters the function will read.
void CString::readln()
{
    const int NBLOCKS = 1024;               // maximum number of blocks
    const int BLKSIZE = 512;                // bytes per block, including the terminator
    char** blocks = new char*[NBLOCKS];
    int count = 0;                          // index of the block being filled
    for (; count < NBLOCKS; count++)
    {
        blocks[count] = new char[BLKSIZE];
        for (int i = 0; i < BLKSIZE - 1; i++)
        {
            int c = cin.get();
            if (c != '\n' && c != EOF)
                blocks[count][i] = c;
            else
            {
                blocks[count][i] = '\0';
                goto done;
            }
        }
        blocks[count][BLKSIZE - 1] = '\0';
    }
    count--;                                // every block is full; step back to the last one
done:
    delete[] text;
    // each full block holds BLKSIZE - 1 characters
    capacity = count * (BLKSIZE - 1) + strlen(blocks[count]) + 1;
    text = new char[capacity];
    text[0] = '\0';
    for (int i = 0; i <= count; i++)
    {
        strcat(text, blocks[i]);
        delete[] blocks[i];
    }
    delete[] blocks;
}
The CString readln() function. I didn't include the parameterless readln() function in the class specification because it's not a practical function. But it is a good example of "nuts and bolts" programming - solving a programming problem even when there isn't an elegant or efficient solution.
The amount of memory the operating system makes available to a program is the one constraint on a CString's capacity. A program can allocate what is needed with the new operator if there is sufficient memory. But the program doesn't "know" how much memory to allocate until the full CString contents are in memory. We solve the problem by allocating and filling blocks of memory as needed and assembling the final CString after the program reads all the text. The get function, first introduced in the wc.cpp example, reads characters from the console one at a time.
This solution still fails to read an arbitrarily long string. Nevertheless, we can adjust the symbolic constants to make the function read strings as long as we need. If we replace the blocks array with a linked list, the function can continue reading text until it exhausts its available memory. While this version is a good overall solution, it's a more appropriate topic for a Data Structures And Algorithms course.
CString Process Functions
Process functions perform general operations on an object's member variables. They include non-private member functions that don't fit the other labeled categories. See The UML Class Symbol.
Functions Modifying this Object
Three CString functions deliver the results of their operations to the client program by updating this object - the object bound to the function at call time.
The CString append and clear functions. The first two functions altering this string are small and simple.
Checks this string's capacity and increases or grows it if necessary. The C-string strcat function completes the append operation.
Makes the string logically empty by placing a null-termination character at the beginning of the text array. The function does not alter the string's capacity.
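The two descriptions above can be sketched as follows. MiniString is a hypothetical stand-in with the same two member variables as CString; the growth code is written inline here only to keep the sketch self-contained, and the exact new capacity chosen is an assumption.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for CString with the same two member variables.
struct MiniString
{
    char*       text;
    std::size_t capacity;

    explicit MiniString(const char* s)
        : text(new char[std::strlen(s) + 1]), capacity(std::strlen(s) + 1)
    {
        std::strcpy(text, s);
    }
    ~MiniString() { delete[] text; }
    MiniString(const MiniString&) = delete;             // copying omitted for brevity
    MiniString& operator=(const MiniString&) = delete;

    // Grow if the combined text and its terminator will not fit,
    // then let strcat finish the operation.
    void append(const char* s)
    {
        std::size_t needed = std::strlen(text) + std::strlen(s) + 1;
        if (needed > capacity)
        {
            char* temp = new char[needed];
            std::strcpy(temp, text);
            delete[] text;
            text = temp;
            capacity = needed;
        }
        std::strcat(text, s);
    }

    // Logically empty the string; the allocated capacity is unchanged.
    void clear() { text[0] = '\0'; }
};
```

Both functions leave the object consistent: append updates text and capacity together, and clear changes only the first character.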
Functions Returning A CString Object
The three functions in this category return a new CString representing the results of their operations. Each function defines a local CString object, named local, operates on it, and returns it when its work is complete.
The CString copy and concat functions. Three functions, the CString(size_t) constructor and the C-string strcpy and strcat functions, do all the work.
Defines a local variable whose capacity is the same as this string's, copies this text to the local variable, and then returns it.
Defines a local variable whose capacity is the sum of this and s. The C-string functions copy this to local and then concatenate local and s before returning local.
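The copy-then-concatenate pattern described above can be sketched with raw C-strings. The function concat_cstrings is hypothetical and works on character arrays rather than CString objects; it sizes the new array from the two text lengths, which is an assumption rather than the class's own capacity rule.

```cpp
#include <cstring>

// Hypothetical helper working on raw C-strings rather than CString objects.
// The caller owns the returned array and must release it with delete[].
char* concat_cstrings(const char* a, const char* b)
{
    // room for both texts plus one terminator
    char* local = new char[std::strlen(a) + std::strlen(b) + 1];
    std::strcpy(local, a);      // copy the first string into the new array
    std::strcat(local, b);      // then append the second
    return local;
}
```

Neither argument is modified, which mirrors the way the functions in this category leave this string unchanged and return a new object.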
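CString CString::substring(int index, size_t length) const
{
    // the starting position must not be negative or past the end of this string
    if (index < 0 || index > strlen(text))
        throw "index location is too large";
    // the substring must end at or before the end of this string
    if (index + length > strlen(text))
        throw "\"length\" is too long";
    CString local(length + 1);                      // room for the characters and the terminator
    strncpy(local.text, &text[index], length);
    local.text[length] = '\0';
    return local;
}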
(a)
(b)
The CString substring function. The substring function is more complex than most of the CString functions, so we again turn to a picture to help establish the relationships between the two objects and the two integer parameters. The function verifies that the beginning and end of the substring, index and index + length respectively, are inside this string. It throws an exception if either endpoint is out of bounds. The strncpy function copies length characters from this string to local; substring null-terminates and returns the copy.
CString Comparison Functions
The two functions in this category compare two CString objects: this and the parameter s.
int CString::order(const CString& s) const
{
    return strcmp(text, s.text);
}
(a)
(b)
The CString equals and order functions. The C-string library has a function, strcmp, that compares two C-strings. The library documentation typically describes it as an ordering function: Given two C-strings, it determines their relative alphabetical (or more accurately, their ASCIIbetical) ordering. But the library does not define an equals function.
When its two C-string parameters are identical (contain the same characters), the strcmp function returns 0, indicating that the strings have the same ordering. C++ treats 0 as false, which the negation operator, !, converts to true.
Expressing the relative order of two strings as a negative value, 0, or a positive value is a common encoding used in many programming languages (for example, Java's Comparable.compareTo method). So, the CString order function only needs to return the value produced by the strcmp function.
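A minimal sketch of the two comparisons, written as hypothetical free functions on raw C-strings rather than as CString members, shows how the negation operator turns the ordering result into an equality test.

```cpp
#include <cstring>

// Hypothetical free-function versions of the two comparisons.
bool equals(const char* a, const char* b)
{
    // strcmp returns 0 for identical strings; ! converts 0 to true
    // and any nonzero result to false
    return !std::strcmp(a, b);
}

int order(const char* a, const char* b)
{
    // negative: a sorts before b; 0: same; positive: a sorts after b
    return std::strcmp(a, b);
}
```

Because the ordering follows character codes, order("Zebra", "apple") is negative: the uppercase 'Z' has a smaller code than the lowercase 'a'.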
The CString Helper Function
In chapter 6, I claimed that functions influence how software developers think about and solve problems. Further, developers decompose large complex functions into smaller, more manageable ones. These observations remain true for member functions. Developers often decompose large member functions into smaller helper functions and make them private because they help the other member functions rather than forming a complete service.
void CString::grow(size_t new_capacity)
{
    char* temp = new char[new_capacity];    // allocate the larger array
    strcpy(temp, text);                     // copy the old text into it
    delete[] text;                          // release the old array
    text = temp;
    capacity = new_capacity;
}
The CString grow function. Helper functions embody code used by one or more member functions. Class designers exclude them from the class's public interface because they don't represent a complete service that a client program should access directly.
Several of the member functions described above can need more room than a CString object's array currently provides. When this happens, the member functions call the grow function to increase the object's capacity. Putting this code in the grow function eliminates duplicate code from the calling functions.
Imagine that the class designer wishes to make the function a setter rather than a helper. A setter can truncate or shrink an object's capacity or increase or grow it. It takes four modest steps to complete the conversion:
Name the function appropriately - give it a more general name
Replace strcpy with strncpy, which copies one C-string to another or a specified number of characters - whichever is shorter
Insert a null-termination character in the copied string
Move the function from a class's private section to a public one
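The steps above can be sketched as a free function. The name set_capacity and its reference parameters are hypothetical stand-ins for the member function and the two member variables; the sketch assumes new_capacity is at least 1 so the terminator always fits.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical free-function version of the converted setter. It assumes
// new_capacity is at least 1 so there is room for the terminator.
void set_capacity(char*& text, std::size_t& capacity, std::size_t new_capacity)
{
    char* temp = new char[new_capacity];
    // copy the whole text or new_capacity - 1 characters, whichever is shorter
    std::strncpy(temp, text, new_capacity - 1);
    // strncpy does not terminate a truncated copy, so do it explicitly
    temp[new_capacity - 1] = '\0';
    delete[] text;
    text = temp;
    capacity = new_capacity;
}
```

Growing a string holding "hello" to a capacity of 16 keeps the text intact, while shrinking it to a capacity of 4 truncates the text to "hel".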
Both functions demonstrate a problem shared by many dynamic data structures - C++ strings and vectors and Java ArrayLists and Vectors. Whenever a program changes the capacity of an array-based data structure, it must allocate a new array with the desired capacity, copy the old array to the new one, and deallocate the old array. The copy step becomes time-expensive for large structures.