9.15.4.1. Length-Prefixed String Example

Review

Non-object-oriented programming languages must represent strings without relying on classes. Structures or their equivalent are a logical replacement, and we'll explore such an implementation in the next section. But we can encode a string as a single array of characters if we're clever. We have seen that the C programming language accomplished this by adding a sentinel or null-termination character at the end of the textual data. This section explores another single-array implementation called length-prefixed or size-prefixed strings. An early implementation of the Pascal programming language from the University of California San Diego, UCSD Pascal, based its string type on this scheme. Consequently, many programmers also call length-prefixed strings UCSD strings.

Programmers implement length-prefixed strings as fixed-length arrays 256 characters long and store the string's length in the first or 0-th character. For simplicity, we use an 8-bit or 1-byte unsigned character for our character array. An 8-bit character is an integer that can store 256 distinct values. The character type is frequently signed on modern computer hardware, giving it a range of [-128 - 127]. Alternatively, an unsigned character has a range of [0 ‑ 255] and is how we'll implement the class. As we reserve the first array character for the string's length, 255 characters remain to store the string's textual data. This approach has some disadvantages: it wastes space when strings are short and doesn't allow strings longer than 255 characters, but it is simple enough that programming languages can support it as a fundamental or built-in data type.

An Empty String String With Content
A picture of an empty length-prefixed string implemented as an array of 256 characters. The length of the string, 0, is stored in the first or 0-th character. The string's capacity is 255 - it can save 255 characters. A picture of a length-prefixed string implemented as a 256-character array. In the example, the string's length is 5 and is stored in text[0]. The string stores the word 'Hello': text[1] = H, text[2] = e, text[3] = l, text[4] = l, and text[5] = o. Characters 6 through 255 are empty.
(a)(b)
Length-prefixed string. The length-prefixed implementation of the general string API captures the three string elements:
  1. The textual data is stored in the elements of an array, named text, allocated automatically on the stack
  2. The string's length is stored in text[0], coming before or prefixing the textual data - hence the name length-prefixed string
  3. The string's capacity is implicit in the implementation as a fixed-length array
  1. An empty string has a length of 0, which the string saves in the first array element: text[0]. The remaining array elements, text[1] through text[255] are logically empty. Physically, every memory location always has some content - a random value leftover from the computer startup or the last program - but the string functions ignore the values in the elements beyond the string's length. Creating the string establishes its capacity of 255, and it never changes. Notice that the string's capacity is one less than the array's capacity of 256.
  2. In this example, element 0 is 5, the string's length. The string's content, saved in elements 1 through 5, is "Hello." Elements 6 through 255 are empty.
These are not C-strings, so they are not zero-indexed or null-terminated. Furthermore, they are not dynamic - their length can vary, but their capacity cannot.
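The layout in the figure can be sketched with a plain array before we wrap it in a class. This is a minimal illustration, not part of the LPString class: the names CAPACITY, storeHello, and toStdString are hypothetical and exist only for this example.

```cpp
#include <string>

// Hypothetical names for illustration only; the layout follows the figure above.
const int CAPACITY = 256;              // array size; the string capacity is CAPACITY - 1

// Store "Hello" in a length-prefixed array by hand.
void storeHello(unsigned char text[CAPACITY])
{
    const char word[] = "Hello";
    text[0] = 5;                       // element 0 holds the length
    for (int i = 0; i < 5; i++)
        text[i + 1] = word[i];         // the characters start at element 1
}

// Convert the used part of the array to a std::string for inspection.
// Elements beyond the stored length are ignored, whatever they contain.
std::string toStdString(const unsigned char text[CAPACITY])
{
    std::string result;
    for (int i = 1; i <= text[0]; i++)
        result += static_cast<char>(text[i]);
    return result;
}
```

Setting text[0] back to 0 makes toStdString return an empty string even though the characters of "Hello" are still physically in the array.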

Length-prefixed strings are simple enough to implement without making them a class. Nevertheless, they are a good class example, demonstrating how pictures can help us see and program problem details. The following figures detail the class and each function. The complete source code for the example is available for download at the bottom of the page.

#pragma once

class LPString
{
	public:
		const static int LENGTH = 256;

	private:
		unsigned char text[LENGTH];

	public:
		// constructors
		LPString() { text[0] = 0; }		// default constructor
		LPString(const char* s);		// conversion constructor: C-string to LPString
		LPString(char c);			// conversion constructor: char to LPString
		LPString(const LPString& s);		// copy constructor

		// access
		int length() const { return text[0]; }
		unsigned char& at(int index);

		// i/o
		static void print(const char*);
		void print() const;
		void println() const;
		void readln();

		// modify "this"
		void append(const LPString& s);
		void insert(const LPString& s, int index);
		void clear() { text[0] = 0; }

		// new LPString
		LPString copy() const;
		LPString concat(const LPString& s) const;
		LPString substring(int index, int length) const;

		// ordering
		bool equals(const LPString& s) const;
		int order(const LPString& s) const;
};
The LPString class. Following the program organization introduced earlier, we write the LPString class specification in the LPString.h header file. The functions included in the specification reflect the abstract operations identified in the previous section. However, the class specification adds C++ implementation details such as the "const" keyword and the ampersand denoting pass-by-reference. And it replaces the generic name "string" with the more specific "LPString" class name.

A previous chapter demonstrated that arrays are always passed to and returned from functions by pointer. However, wrapping the array in a class, even though the array is the class's only data member, changes the basic passing mechanism. Programs can pass objects by value, reference, or pointer regardless of their contents.
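A short sketch may make the difference concrete. The Wrapped structure and the three functions below are hypothetical and serve only to contrast passing a bare array with passing an object that contains one.

```cpp
// Hypothetical wrapper used only for this illustration.
struct Wrapped
{
    unsigned char text[256];
};

// The array parameter decays to a pointer, so this changes the caller's array.
void clearRaw(unsigned char text[])
{
    text[0] = 0;
}

// The object is passed by value, so this changes only the function's local copy.
void clearCopy(Wrapped w)
{
    w.text[0] = 0;
}

// The object is passed by reference, so this changes the caller's object.
void clearRef(Wrapped& w)
{
    w.text[0] = 0;
}
```

After calling clearCopy, the caller's object still has its original length because the function received a complete copy of the 256-element array.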

Even a brief examination of a modern programming language's string API will reveal essential string operations that LPString omits. Its inability to print numerical values or directly print characters will hamper our efforts to test and validate the member functions. Furthermore, an authentic class typically uses low-level operating system services to complete the I/O operations. For simplicity, LPString will instead use the <iostream> functions. Nevertheless, the class is sufficient for our instructional needs.

I recommend a stepwise or cyclic approach to class implementation. After programming each function, or at most each small group of related functions, pause to test and verify the additions. Verifying the code this way makes finding and correcting syntax errors easier, making the task less frustrating. It will also help make the overall debugging and validation process more manageable. And finally, some member functions often depend on other members; if the independent functions are validated, any errors are more likely in the dependent functions. Testing and validation typically require a certain "critical mass" of functions. Specifically, we need a constructor and a display or print function, so we begin with these.

Constructors

Illustrates an empty LPString as having a 0 in the first or 0-th element of its 'text' array, and the remaining elements are blank.
LPString() { text[0] = 0; }
LPString lps1;
cout << lps1.length() << endl;
LPString::print("lps1 = ");
lps1.println();
(a)(b)(c)
The LPString default constructor. The default constructor creates a logically "empty" LPString object.
  1. The constructor creates the array automatically on the stack with a predetermined capacity. The picture clearly shows the constructor's only task: initializing the array's length to 0. The random values in elements 1 through 255 are irrelevant.
  2. An initializer list can initialize a member variable but not part of a variable (i.e., not one array element). So, the constructor cannot use a list, and we write the function with a "regular" body. The complete function consists of one statement, making it an ideal candidate for implementation as an inline function in the class specification.
  3. Following the cyclic approach described above, the initial test creates an object with the most basic constructor, the default. The test also uses the length, println, and print(char*) functions, which are detailed below.
The picture represents an LPString object and a C-string as rectangles denoting arrays. The LPString rectangle is the 'text' member variable. The picture shows that i, the loop control variable, ranges from 1 to 5 while copying 'Hello' from the C-string to the LPString.
LPString::LPString(const char* s)
{
    text[0] = 0;
    for (int i = 0; s[i] && i < LENGTH - 1; i++)
    {
        text[0]++;
        text[i + 1] = s[i];
    }
}
0
1 H
2 He
3 Hel
4 Hell
5 Hello
 
 
 
(a)(b)(c)
LPString::print("\n*** Testing LPString(char*), length, and println function: ***\n");
LPString lps2("See the quick red fox jump over the lazy brown dog. "
"See the quick red fox jump over the lazy brown dog. "
"See the quick red fox jump over the lazy brown dog.");
cout << lps2.length() << endl;
LPString::print("lps2 = ");
lps2.println();
(d)
The LPString(char*) conversion constructor. The constructor converts a C-string, s, to an LPString object by copying the characters one at a time. Correctly indexing the arrays and controlling the for-loop are challenging sub-problems, and this example demonstrates how a picture can help us solve them.
  1. The LPString is initially empty as illustrated in Figure 1(a). The function copies the characters from s to the LPString's text array with a for-loop. But where does the loop begin and end, and how do we index into the arrays? (Equivalently, what values does the loop-control variable take, and how do we use the variable to index the arrays?)
  2. Try mapping parts of the picture to corresponding parts of the C++ code:
    • The function uses the LPString's length, text[0], as an accumulator to count the characters as it copies them. The function must initialize the length to 0 before looping and must increment the count during each iteration.
    • C-strings are zero-indexed and the copy operation begins at s[0], so we initialize the loop control variable to 0. However, text[0] is the string's length, and the characters begin at text[1]. This organization makes the indexes off by one throughout the copy operation. The assignment operation accounts for the offset by adding 1 to the loop control variable when indexing text.
    • Two situations can end the loop. The null termination character is a character 0, which C++ treats as false. If s is short, < 255 characters, the sub-expression s[i] ends the loop (when the loop reaches s[5] in this example). If s is long, >= 255 characters, the sub-expression i < LENGTH - 1 ends the loop. The -1 is necessary to prevent indexing text out of bounds.
  3. Pictures don't need to be elaborate to be helpful - simple characters are often sufficient. This picture shows how text begins and changes with each loop iteration.
  4. It's necessary to test strings with a length greater than 127. The test-and-validation code uses a "trick" inherited from C to create a long string: the compiler automatically concatenates adjacent C-strings to form a single string. The test code prints the newly created LPString's length and content, verifying that the class works with long strings.
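The two loop-ending conditions, and the reason for testing lengths above 127, can be checked with a free-function version of the same loop. The function name fromCString is hypothetical. On a platform where char is a signed 8-bit type, a length above 127 would not fit in the first element, which is why the array uses unsigned char.

```cpp
// Hypothetical free-function version of the conversion constructor's loop.
const int LENGTH = 256;

// Copies at most LENGTH - 1 characters from s into text and returns the stored length.
int fromCString(unsigned char text[LENGTH], const char* s)
{
    text[0] = 0;                                   // start with an empty string
    for (int i = 0; s[i] && i < LENGTH - 1; i++)   // stop at '\0' or when text is full
    {
        text[0]++;                                 // count the character
        text[i + 1] = s[i];                        // characters are offset by one
    }
    return text[0];
}
```

Calling the function with "Hello" returns 5, while a 300-character C-string is truncated and the function returns 255.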
An LPString object represented as a rectangle denoting an array named 'text.' A single character, c, is represented as a square containing the character 'X.' The constructor copies 'X' from the character to text[1] and initializes the LPString's length, text[0], to 1.
LPString::LPString(char c)
{
	text[0] = 1;
	text[1] = c;
}
LPString lps3('X');
cout << lps3.length() << endl;
LPString::print("lps3 = ");
lps3.println();
 
(a)(b)(c)
The LPString(char) conversion constructor. The constructor converts a single character, c, to an LPString object.
  1. The picture illustrates making an LPString string by copying a character to it and setting its length to 1.
  2. The constructor converts a character to an LPString by copying the character, c, to text[1] and initializing the string's length to 1: text[0] = 1.
  3. The test validates the construction by printing the string's length and content.
Two LPString objects represented as rectangles denoting the LPString member variable 'text.' The picture shows that i, the loop control variable, ranges from 0 to 5 while copying the length, 5, and the content, 'Hello' from the existing LPString, s, to the new LPString.
LPString::LPString(const LPString& s)
{
    for (int i = 0; i <= s.text[0]; i++)
        text[i] = s.text[i];
}
LPString lps4(lps2);
LPString::print("lps4 = ");
lps4.println();
 
 
(a)(b)(c)
The LPString copy constructor. The copy constructor creates a new LPString object by copying an existing one. The picture of the problem and the function code are similar to the char* conversion constructor (Figure 4).
  1. The picture helps us identify details leading to a compact and efficient solution. The function must copy length+1 elements (the length itself plus the characters) from the original or parameter LPString to the new one. So, the loop must begin at 0 and iterate length+1 times.
  2. A single for-loop copies the used elements of the existing LPString (the length in element 0 and the characters in elements 1 through 5) to the new string. The contents of the unused elements are irrelevant, so the function does not copy them. So, the for-loop begins at 0 and uses <= for control.
  3. The test and validation code uses lps2 created in Figure 4.

LPString Access Functions

We could choose to name the LPString access functions with the "get" and "set" prefixes like other access functions. However, looking at the C++ and Java string libraries or APIs, these functions don't typically follow that naming convention. So, we choose instead to follow the conventions of the other languages.

A fully populated LPString containing the characters 'Hello world' in elements 1 through 11. The string's length, 11, is maintained in element 0 and indicated by an arrow.
int length() const { return text[0]; }
(a)(b)
The LPString length function.
  1. The picture illustrates the relationship between the saved textual data, "Hello world," and the string's length, 11. It also emphasizes that the string's length - the number of characters currently stored in the string - is always saved in the first array element: text[0].
  2. The length function is a "getter," but most string classes name it either length or size, and many provide both functions. The function is short, so we inline it in the class specification. We validated it above in conjunction with the constructors.
A fully populated LPString containing the characters 'Hello world' in elements 1 through 11 and the length, 11, in element 0. The variable 'index,' corresponding to the parameter in the at function, points to index location 7 in the text array.
unsigned char& LPString::at(int index)
{
	if (index < 1 || index > text[0])
		throw "index out of bounds";
	return text[index];
}
(a)(b)
LPString::print("\n*** Testing the at function: ***\n");
LPString lps1("Hello world");
try
{
	// 2 ways of printing a character - "at" as an r-value or getter
	char c = lps1.at(7);
	cout << c << endl;
	LPString(lps1.at(7)).println();		// obscure conversion constructor call

	lps1.at(7) = 'X';			// "at" as an l-value or setter
	LPString::print("lps1 after changing the character at index 7: \n");
	lps1.println();
	lps1.at(12);				// out of bounds - throws an exception
}
catch (const char* error)
{
	cerr << "Error: " << error << endl;
}
(c)
The LPString at function. Surprisingly, the at function implements both "getter" and "setter" operations - it can get or set the character at the index location. Returning a reference (the red ampersand) allows programs to use the function as an l- and an r-value, performing both operations. (The compiler treats it as a value or address, depending on where the program uses it.)
  1. A picture helps us see the relationship between the text saved in the string and each character's index location. We need to clarify the character indexing because we sometimes use a zero-indexed organization, and sometimes we don't. The arrow points to the 'w' at index location 7, which we use in the test and validation code.
  2. The at function returns a reference to one element, a variable, in text. The if-statement verifies that the index is valid or in-bounds (i.e., within the string) and throws an exception if it is not.
  3. The test and validation code for the at function demonstrates some vital syntax and one obscure conversion.
    • The function call lps1.at(7) gets one character element or variable from lps1. As used in the two illustrated statements, the compiler treats the element as an r-value or the character stored in the variable. The second example, with the obscure conversion, might be confusing. None of the overloaded print functions can print a single character, but we get around the limitation by calling a conversion constructor. lps1.at(7) returns a character, which is passed to the LPString(char) constructor. The constructor call creates a new, anonymous object, and the object calls println, which can print an LPString.
    • In the statement lps1.at(7) = 'X' the at function call again gets the element or variable from lps1 at index location 7. But in this statement, the call is on the left side of the assignment operator, so the compiler treats it as an address and saves the character 'X' in that memory location.
    • The statement lps1.at(12) indexes the string out of bounds - that is, one position beyond the last character - and causes the function to throw an exception.
    • Together, the try and catch blocks detect and handle the index-out-of-bounds exception.
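The dual getter and setter behavior does not depend on the rest of the class, so we can isolate it in a small sketch. The Mini structure and the makeAbc helper are hypothetical stand-ins that keep only what the example needs.

```cpp
// Hypothetical stand-in for LPString with only the members this example needs.
struct Mini
{
    unsigned char text[256];

    Mini() { text[0] = 0; }

    // Returning a reference lets callers either read or assign the element.
    unsigned char& at(int index)
    {
        if (index < 1 || index > text[0])
            throw "index out of bounds";
        return text[index];
    }
};

// Builds the three-character string "abc" for the demonstration.
Mini makeAbc()
{
    Mini m;
    m.text[0] = 3;
    m.text[1] = 'a';
    m.text[2] = 'b';
    m.text[3] = 'c';
    return m;
}
```

With Mini m = makeAbc(), the expression m.at(2) reads 'b', the statement m.at(2) = 'X' replaces it, and m.at(4) throws because index 4 is beyond the string's length of 3.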

I/O Functions

A rectangle, divided into boxes, representing a C-string. The boxes, left to right, save the characters 'Testing\0' where 'T' is in line[0] and '\0' is in line[7].
static void print(const char* line);
(a)(b)
void LPString::print(const char* line)
{
	cout << line;
}
LPString::print("\nTesting the copy constructor:\n");
(c)(d)
The LPString static print function. The static version of the print function is a special case: it allows us to print C-strings with the LPString class. We could continue using the <iostream> functions to complete this task, but including it in LPString provides us with another opportunity to demonstrate static or class functions.
  1. The picture reminds us that line is a C-string, so it is zero-indexed and null-terminated.
  2. To demonstrate the placement of the "static" keyword, we prototype the print function in the class specification, which is in the LPString.h header file.
  3. Continuing the demonstration, we place the function definition in the LPString.cpp source code file. Notice that we don't need the "static" keyword here.
  4. The static print function "belongs" to the class rather than to an object or instance of the class. So, when a program calls the function, it must use the class name and the scope resolution operator, ::.
An LPString with the text 'Hello world' in text[1] through text[11] and the string's length in text[0].
void LPString::print() const
{
	for (int i = 1; i <= text[0]; i++)
		cout << text[i];
}
(a)(b)
void LPString::println() const
{
	print();
	cout << '\n';
}
LPString lps1("Hello world");

lps1.print();
lps1.println();
 
(c)(d)
The LPString print and println member functions. The print and println functions are named the same as the corresponding Pascal and Java functions. The print function prints a string to the console without a trailing new-line character, while println prints the string followed by a new-line character. To prevent duplicating code, println calls print and then adds the new-line character.
  1. Admittedly, the picture would be more useful if we followed a more authentic implementation based on lower-level operations or system calls. Still, the function uses a for-loop to print the characters one at a time, and the picture helps us configure the loop.
  2. The text array is not null-terminated, so we can't print it with a single C-string operation. Using the information organized in the picture, we configure the for-loop controls: the loop starts at 1, uses less than or equals for the test, and compares the loop-control variable with the value saved in text[0].
  3. It is generally good practice to avoid duplicating code whenever feasible, so println calls print and then adds the new-line character.
  4. Function validation is straightforward.
Two pictures of an LPString. The first string is logically empty because text[0] is 0. The LPString functions ignore the characters in text[1] through text[5]. The second picture illustrates characters read from cin, saved in variable c, and then copied into the string's text array. The function discards the new-line character, i.e., it doesn't save the character in the text array.
 0
 1 H
 2 He
 3 Hel
 4 Hell
 5 Hello
 6 Hello 
 7 Hello w
 8 Hello wo
 9 Hello wor
10 Hello worl
11 Hello world
(a)(b)
void LPString::readln()
{
    int c;
    text[0] = 0;
    while ((c = cin.get()) != '\n' && text[0] < LENGTH - 1)
        text[++text[0]] = c;
}
LPString::print("\n*** Testing the readln function: ***\n");
LPString lps1;
cout << "Please enter a string: ";
lps1.readln();
lps1.println();
 
 
(c)(d)
The LPString readln function. String-input functions typically allow users to backspace and reenter characters before signaling the program to read the string by pressing the Enter key. Pressing the Enter key also inserts a new-line character into the input stream. The Java and Pascal readln functions read the new-line character but discard it (i.e., they do not include it in the string). We'll read the string one character at a time, allowing us to locate the new-line character. We'll use the get function (see the wc.cpp example), in place of lower-level operations, to read the characters.
  1. String input functions typically discard or overwrite a string's contents. Accordingly, the LPString must be empty before the reading operation begins. If the string is new, as in Figure 1(a), it's ready for the operation. However, if the string contains text, as in Figure 1(b), the function must discard the character data before reading. The top string (1) shows what Figure 1(b) looks like after the function empties it - the length is 0, and the function ignores the remaining characters: "Hello." The second string (2) illustrates characters as the get function reads them from cin and saves them in c. The while-loop copies the characters to text.
  2. A simple picture illustrates how the string changes during each loop iteration. The brown box represents the space character.
  3. Although the readln function is short, it involves several intricate steps:
    • The statement text[0] = 0 empties the string - array (1). Now, the function can use text[0] as an accumulator to count the characters as the loop adds them to the string.
    • The get function reads characters one at a time from cin and temporarily saves them in c (the pair of red parentheses force the get function call and the assignment operation to take place first).
    • The loop runs while the input is not the new-line character, and there is space in text for additional characters. The "-1" is necessary because although the array has 256 elements, the string only uses 255 to store characters.
    • The expression ++text[0] first increments the string's length and then uses it as an index into the string.

Algorithmic Functions

Algorithmic functions manipulate, modify, and otherwise use LPStrings to solve client program problems. For organizational convenience, we'll group these functions into three sub-categories:

  1. Functions that modify this object. The functions in this category follow the general pattern: void a.function(b), where a is an LPString object and b represents 0 or more parameters of various types. The functions change a, reflecting the function's results.
  2. Functions that create a new LPString object. These functions have the general pattern: LPString a.function(b), where a is an LPString object and b is 0 or more parameters. The functions in this group do not alter a or b but return a new LPString object representing the function's operation.
  3. Functions that compare two LPStrings. The final group follows two similar patterns: bool a.function(b) or int a.function(b), where a and b are LPStrings.

Functions Modifying this Object

A picture of the LPString 'Hello' saved as text[0] = 5, and the characters in text[1] through text[5]. A picture of the same LPString after being cleared: text[0] = 0 but text[1] through text[5] still contain 'Hello,' and the functions ignore the characters.
(a)(b)
void clear() { text[0] = 0; }
lps2.clear();
(c)(d)
The LPString clear function. The clear function is trivial, and the text explains the concepts justifying its operation above.
  1. An LPString before the clear function operation.
  2. The string after the clear function operation.
  3. The clear function inlined in the class specification.
  4. A simple test statement. See the append function below for the full context of the test.
A picture showing two LPStrings, this and s as follows:
this[0] = 5
this[1] = 'H'
this[2] = 'e'
this[3] = 'l'
this[4] = 'l'
this[5] = 'o'
and
s.text[0] = 6
s.text[1] = ' '
s.text[2] = 'w'
s.text[3] = 'o'
s.text[4] = 'r'
s.text[5] = 'l'
s.text[6] = 'd'
The function must copy s.text[1] to this[6], s.text[2] to this[7], and so on to s.text[6] to this[11].
 5 Hello
 6 Hello 
 7 Hello w
 8 Hello wo
 9 Hello wor
10 Hello worl
11 Hello world
A picture showing the completed append operation: text[0] = 11, and text[1] through text[11] = 'Hello world.'
(a)(b)
void LPString::append(const LPString& s)
{
	if (text[0] + s.text[0] >= LENGTH)
		throw "strings too long to append";

	for (int i = 1; i <= s.text[0]; i++)
		text[i + text[0]] = s.text[i];

	text[0] += s.text[0];
}
LPString::print("\n*** Testing the append function: ***\n");
LPString lps1("Hello");
LPString lps2(" world");
lps1.append(lps2);
lps1.println();

LPString lps3("Hell");
lps3.append('o');		// append a single character
lps3.append(" world");		// append a C-string
lps3.println();
(c)(d)
The LPString append function. The append function adds or appends characters at the end of this LPString. The function uses a for-loop to copy each character, and correctly indexing each string with the loop-control variable is the most challenging part of the function. An ancillary problem is distinguishing the strings' lengths, which is necessary to control the for-loop and index the strings. The picture helps us see how the function must index the strings and drive the loop.
  1. Appends the parameter s to the end of this LPString by copying the parameter characters one at a time. We can use the same variable to index both strings if we use a constant offset when indexing this string. The offset is the length of this string. The loop copies the characters from s to this string. The for-loop runs from 1, the index location of the first character in s, to the length of s, saved in s.text[0].
  2. A dynamic, step-by-step picture of the copy operation. The final picture details the this string after the function finishes. The brown boxes represent the space character.
  3. The function begins by verifying that there is enough space in this string to complete the append operation and throws an exception if there isn't. The for-loop carries out the copy operation outlined in the picture. When the loop finishes, the function updates the length of this string by adding the length of s. Notice that the function does not increment the length inside the loop because doing so would "break" the constant offset used to index this string.
  4. We divide the test and validation code into two groups. The first group is straightforward: it appends the function argument to this string. However, the second group relies on an unexpected C++ operation. The at function test-and-validation code (Figure 8) employed an obscure - in the sense that it's hard to see - conversion operation. This example goes a step further and uses two "hidden" conversions. While the LPString class does not have overloaded append functions that accept a character or a C-string, it does have constructors that do. So, the C++ compiler automatically converts 'o' and " world" into anonymous LPString objects and then uses them to complete the append operations. The compiler will only perform one level of conversion: it won't automatically convert x to y and then convert y to z.
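The hidden conversions are easier to see in a cut-down class. Tiny is a hypothetical class that keeps only the constructors and the append function; charAt is an unchecked accessor added for this demonstration only.

```cpp
// Hypothetical cut-down class with just enough members to show the conversions.
class Tiny
{
    public:
        static const int LENGTH = 256;

        Tiny() { text[0] = 0; }
        Tiny(char c) { text[0] = 1; text[1] = c; }        // char to Tiny
        Tiny(const char* s)                               // C-string to Tiny
        {
            text[0] = 0;
            for (int i = 0; s[i] && i < LENGTH - 1; i++)
                text[++text[0]] = s[i];
        }

        int length() const { return text[0]; }
        unsigned char charAt(int i) const { return text[i]; }  // unchecked, demo only

        void append(const Tiny& s)
        {
            if (text[0] + s.text[0] >= LENGTH)
                throw "strings too long to append";
            for (int i = 1; i <= s.text[0]; i++)
                text[i + text[0]] = s.text[i];   // offset by the current length
            text[0] += s.text[0];                // update the length once, after the loop
        }

    private:
        unsigned char text[LENGTH];
};
```

A call such as t.append('o') compiles because the compiler builds a temporary Tiny from the character and binds it to the const reference parameter; t.append(" world") works the same way through the C-string constructor.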

Functions Creating A New LPString Object

Two LPString objects, 'this' and 'local,' are represented as rectangles denoting their 'text' member variable arrays. The 'this' text array saves the string 'Hello' in elements 1 through 5 and the string's length, 5, in element 0. The picture shows that i, the loop control variable, ranges from 0 to 5 while copying 'this' object to the local variable named 'local.' The for-loop copies the elements 'this' to 'local' one at a time.
LPString LPString::copy() const
{
    LPString local;

    for (int i = 0; i <= text[0]; i++)
        local.text[i] = text[i];

    return local;
}
LPString lps1("Hello");

LPString lps2 = lps1.copy();
lps2.println();
(a)(b)(c)
The LPString copy function. The copy function is very similar to the copy constructor, and you could argue that the copy constructor makes the copy function redundant. Nevertheless, the class includes it as a simple example of a function that returns an object.
  1. The picture suggests that the function must copy the elements of this object to another object. Unlike the previous functions, the function's signature or prototype doesn't provide another object. So, the function creates a temporary, local object and copies this object to it. Following the copy operation, the function returns the local object.
  2. The function creates a local, and initially empty, LPString object named local with the default constructor. A single for-loop copies the elements of this string to the local string. The return operator returns local by value (i.e., by copy).
  3. Calling the copy function and validating the returned value is straightforward.
The picture shows 'this' string. The string stores its length, 11, in [0]. The characters 'Hello world' occupy [1] through [11].
LPString LPString::substring(int index, int length) const
{
	if (index < 1 || index > text[0])
		throw "index is out of bounds";
	if (length < 0 || index + length - 1 > text[0])
		throw "\"length\" is too long";

	LPString local;
	local.text[0] = length;
	for (int i = 0; i < length; i++)
		local.text[i + 1] = text[index + i];
	return local;
}
(a)
The picture shows the 'text' array of the 'local' string. The string's length, 5, is saved in [0] and the sub-string 'world' in [1] through [5].
(b)(c)
i  index+i  text[index+i]  i+1  local.text[i+1]
0  7        w              1    w
1  8        o              2    o
2  9        r              3    r
3  10       l              4    l
4  11       d              5    d
(d)
The LPString substring function. The substring function extracts and copies part of an LPString object, creating a new LPString that stores the extracted substring. The function has two arguments: index is the starting location of the copy, and length is the substring's length. The example assumes that the substring function is called with index = 7 and length = 5. The function creates a local temporary variable, local, to hold the copy until the function returns it.
  1. The relationships between this string and the function parameters index and length.
  2. The string the substring function returns.
  3. The substring function verifies that the starting location, index, is valid (i.e., in bounds). It also verifies that the sum of the substring's starting location and length doesn't index this string out of bounds. If either test fails, the function throws an exception using the throw keyword. The function creates an empty LPString named local, initializes its length to the substring's length, and copies the substring characters from this string to local one at a time. When the for-loop finishes copying the characters, the function returns local, containing the extracted characters.
  4. The tables can help us understand how the program uses the loop control variable to index into the string arrays. Unlike many for-loops in the previous problems, we begin this loop at 0 and use a strict less-than test to drive it (highlighted in yellow). We adjust the range of the loop control variable by adding 1 to it when we index into the local string's text array (highlighted in light blue). We use the sum of the loop control variable and index, the substring starting location, to index into this string (highlighted in coral). Alternatively, we could start the for-loop at 1, use <=, simplify the indexing into local, and compensate by changing the indexing into this string: text[index + i - 1].
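The 1-based alternative mentioned in the last item might look like the following minimal sketch. It is not the code from LPString.cpp: MiniLP, make, str, and substring1 are hypothetical names invented for illustration, and the sketch assumes the length test compares against the source string's length.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for LPString (not the class from LPString.h):
// text[0] holds the length and text[1..255] hold the characters.
struct MiniLP
{
	unsigned char text[256] = { 0 };
};

// Helper invented for this sketch: build a MiniLP from a C-string.
MiniLP make(const char* s)
{
	MiniLP r;
	int n = 0;
	for (; s[n] != '\0' && n < 255; n++)
		r.text[n + 1] = static_cast<unsigned char>(s[n]);
	r.text[0] = static_cast<unsigned char>(n);
	return r;
}

// Helper invented for this sketch: convert back to std::string for checking.
std::string str(const MiniLP& s)
{
	return std::string(s.text + 1, s.text + 1 + s.text[0]);
}

// substring with the loop running 1..length instead of 0..length-1.
MiniLP substring1(const MiniLP& src, int index, int length)
{
	if (index < 1 || index > src.text[0])
		throw "index is out of bounds";
	if (length < 0 || index + length - 1 > src.text[0])
		throw "\"length\" is out of bounds";

	MiniLP local;
	local.text[0] = static_cast<unsigned char>(length);
	for (int i = 1; i <= length; i++)			// 1-based loop, <= test
		local.text[i] = src.text[index + i - 1];	// compensate in the source index
	return local;
}
```

Both loop styles visit the same pairs of elements. The choice only moves the "off by one" adjustment from the destination index to the source index.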

String Comparison Functions

The picture shows two LPStrings, this and s, and suggests that the function must compare pairs of characters. The picture shows two LPStrings, this and s. The function compares the characters in pairs and ends when it finds the first unequal pair.
(a)(b)
The picture shows two LPStrings, this and s, with different lengths. The function returns when it detects the length difference.
bool LPString::equals(const LPString& s)
{
	for (int i = 0; i <= text[0]; i++)
		if (text[i] != s.text[i])
			return false;
	return true;
}
(c)(d)
LPString::print("\n*** Testing the equals function: ***\n");
LPString lps1("hello world");
LPString lps2("hello world");
if (lps1.equals(lps2))
	cout << "equals" << endl;
else
	cout << "not equals" << endl;

LPString lps3("hello world");
LPString lps4("hello Alice");
if (lps3.equals(lps4))
	cout << "equals" << endl;
else
	cout << "not equals" << endl;
LPString lps5("hello world");
LPString lps6;
if (lps5.equals(lps6))
	cout << "equals" << endl;
else
	cout << "not equals" << endl;

LPString lps7("apple");
LPString lps8("zebra");
if (lps7.equals(lps8))
	cout << "equals" << endl;
else
	cout << "not equals" << endl;
 
(e)
The LPString equals function. The equals function compares the characters of two LPStrings, left to right, one pair of characters at a time - including the "characters" storing the strings' lengths. The function returns false when it detects the first unequal pair; it returns true only after comparing all pairs and verifying that they are equal. The comparison is case-sensitive, meaning that A is not equal to a.
  1. The picture suggests that the equals function compares the elements of two LPStrings by pairs, including the elements storing the strings' lengths. The function returns true after comparing the characters in locations 0 through 11 without detecting a mismatch.
  2. Characters at index locations 0 through 6 are equal, but the characters at index location 7 are not, causing the function to return false without comparing additional characters.
  3. The function determines, with a single comparison, that the strings have different lengths and returns immediately.
  4. The equals function is small and straightforward. Beginning the loop at 0 includes the strings' lengths, so strings of unequal lengths are rejected quickly. This logic allows us to drive the loop with one string's length without the risk of (logically) indexing the other string out of bounds.
  5. A set of tests validating the equals function and demonstrating how to call it.
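To see why starting the loop at 0 is safe, it may help to compare it with a version that tests the lengths explicitly. The sketch below is illustrative only: MiniLP, make, equalsFromZero, and equalsExplicit are hypothetical names, not part of the LPString class.

```cpp
#include <cassert>

// Hypothetical stand-in for LPString (not the class from LPString.h):
// text[0] holds the length and text[1..255] hold the characters.
struct MiniLP
{
	unsigned char text[256] = { 0 };
};

// Helper invented for this sketch: build a MiniLP from a C-string.
MiniLP make(const char* s)
{
	MiniLP r;
	int n = 0;
	for (; s[n] != '\0' && n < 255; n++)
		r.text[n + 1] = static_cast<unsigned char>(s[n]);
	r.text[0] = static_cast<unsigned char>(n);
	return r;
}

// The loop from the figure: it starts at 0, so the lengths are compared first.
bool equalsFromZero(const MiniLP& a, const MiniLP& b)
{
	for (int i = 0; i <= a.text[0]; i++)
		if (a.text[i] != b.text[i])
			return false;
	return true;
}

// The same test with the length comparison written out separately.
bool equalsExplicit(const MiniLP& a, const MiniLP& b)
{
	if (a.text[0] != b.text[0])		// different lengths: cannot be equal
		return false;
	for (int i = 1; i <= a.text[0]; i++)	// lengths match, so either one can drive the loop
		if (a.text[i] != b.text[i])
			return false;
	return true;
}
```

The two functions should agree on every pair of strings. The first simply folds the length test into the loop's first iteration.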
The picture shows two LPStrings, both containing the characters 'apple.' The order function compares the strings one character at a time. All the corresponding characters match, and the function returns 0. The picture has two strings. As pictured, the first string contains 'apple' and the second 'zebra.' The function must only compare the first character from each string to detect the mismatch. 'a' comes before 'z,' so the function returns -1.
(a)(b)
The picture shows two strings, 'apple' and 'appl'. The function compares the first four characters of each string and then reaches the end of the shorter one. 'Nothing comes before something,' so the shorter string comes before the longer one. As pictured, the function returns 1.
int LPString::order(const LPString& s)
{
    for (int i = 1; i <= text[0] && i <= s.text[0]; i++)
        if (text[i] < s.text[i])
            return -1;
        else if (text[i] > s.text[i])
            return 1;

    if (text[0] == s.text[0])
        return 0;
    else if (text[0] < s.text[0])
        return -1;
    else
        return 1;
}
(c)
The picture shows two strings, 'apple' and 'applx.' The function compares all five characters before detecting the difference between the strings. As pictured, 'e' comes before 'x,' and the function returns -1.
(d)(e)
LPString::print("\n*** Testing the order function: ***\n");
LPString lps1 = "apple";
LPString lps2 = "apple";
cout << lps1.order(lps2) << endl;	// 0 (a)

LPString lps3 = "apple";
LPString lps4 = "zebra";
cout << lps3.order(lps4) << endl;	// -1 (b)
cout << lps4.order(lps3) << endl;	// 1
LPString lps5 = "apple";
LPString lps6 = "appl";
cout << lps5.order(lps6) << endl;	// 1 (c)
cout << lps6.order(lps5) << endl;	// -1

LPString lps7 = "apple";
LPString lps8 = "applx";
cout << lps7.order(lps8) << endl;	// -1 (d)
cout << lps8.order(lps7) << endl;	// 1
(f)
The LPString order function. Ordering functions compare two strings and determine their relative order, that is, which one comes first. Determining two strings' relative order is an important step in, among other operations, sorting strings - for example, listing them in alphabetical order. Given two strings, X and Y, and the operation order(X,Y) or X.order(Y), ordering functions typically return a negative value if X comes before Y, a positive value if X comes after Y, and 0 if X and Y have the same order. The magnitude of the positive and negative values is unimportant, and modern functions typically return -1, 0, and 1. Like equals, the order comparisons are case-sensitive. Furthermore, upper-case letters come before lower-case.
  1. The strings are the same length, and their characters are all the same, so the strings have the same order, indicated when the function returns 0.
  2. The strings are the same length, but their first characters differ. The nested if-statement ends the for-loop early. As pictured, the function returns -1, but the validation code tests both orders.
  3. The loop runs four times before the mismatched string lengths end it. The if-else ladder determines the order by applying the rule "nothing comes before something." As pictured, the function returns 1, but the validation code, (f), tests both orders.
  4. The strings are the same length but differ at the last character. The nested if-statement detects the difference on the loop's final iteration and returns, ending the function call.
  5. The for-loop stops when it reaches the end of the shorter string. The nested if-statement determines the strings' order if the function finds mismatched characters before reaching the end of the shorter string; otherwise, the if-else ladder makes the determination.
    If execution reaches the ladder, the loop didn't find mismatched characters, and the strings' lengths determine the order based on the "nothing comes before something" rule.
  6. A minimal set of validating tests. This function is "tricky," and we must test it thoroughly.
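The caption notes that ordering functions are a key step in sorting. As a hedged illustration, the sketch below pairs the same ordering logic with std::sort from the standard library. MiniLP, make, str, and sortStrings are hypothetical names invented for this example, and order is written as a free function rather than a member.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for LPString (not the class from LPString.h):
// text[0] holds the length and text[1..255] hold the characters.
struct MiniLP
{
	unsigned char text[256] = { 0 };
};

// Helper invented for this sketch: build a MiniLP from a C-string.
MiniLP make(const char* s)
{
	MiniLP r;
	int n = 0;
	for (; s[n] != '\0' && n < 255; n++)
		r.text[n + 1] = static_cast<unsigned char>(s[n]);
	r.text[0] = static_cast<unsigned char>(n);
	return r;
}

// Helper invented for this sketch: convert back to std::string for checking.
std::string str(const MiniLP& s)
{
	return std::string(s.text + 1, s.text + 1 + s.text[0]);
}

// Free-function version of the order logic shown in the figure.
int order(const MiniLP& a, const MiniLP& b)
{
	for (int i = 1; i <= a.text[0] && i <= b.text[0]; i++)
		if (a.text[i] < b.text[i])
			return -1;
		else if (a.text[i] > b.text[i])
			return 1;

	if (a.text[0] == b.text[0])
		return 0;
	return (a.text[0] < b.text[0]) ? -1 : 1;	// "nothing comes before something"
}

// Sort ascending: a goes before b when order(a, b) is negative.
void sortStrings(std::vector<MiniLP>& v)
{
	std::sort(v.begin(), v.end(),
		[](const MiniLP& a, const MiniLP& b) { return order(a, b) < 0; });
}
```

Sorting "zebra", "apple", "Zoo", and "appl" this way should place "Zoo" first, because upper-case letters come before lower-case, followed by "appl", "apple", and "zebra".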

Try It Yourself

Learning to draw and use pictures to help solve problems takes practice. Two LPString functions, concat and insert, remain unimplemented. Writing these functions will give us some practice using pictures, more experience solving basic programming problems, and help us review array and member function syntax. Once you have implemented the functions, design and write an appropriate set of validation tests.

The picture illustrates three LPStrings represented as rectangles: 'this,' 's,' and 'local.' The function must copy this[1] to local.text[1] through this[5] to local.text[5]. Then, it copies s.text[1] to local.text[6] through s.text[6] to local.text[11].
The LPString concat function. We can write this function in two fundamentally different ways. First, we can write it using existing functions. This approach is relatively easy. Second, we can write it using fundamental operations like loops, if-statements, and arrays - just as we have done in the previous examples. See if you can write the function both ways, as each approach can teach us a valuable lesson.
LPString concat(const LPString& s) const;
The concat function concatenates two strings, this and s, to form a new string, named local in the illustration, that it returns. The picture suggests the function has five main parts:
  1. Validate that the concatenated strings, this and s, fit (i.e., do not overflow) an LPString's capacity.
  2. Create a new LPString in local or function scope.
  3. Copy the characters from this to local.
  4. Append the characters from s to the end of local.
  5. Set local's length.
There are several ways of writing this function - see the copy and append functions for ideas. Two possible solutions are presented here.
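For readers who want to check their work, here is one possible sketch of the loop-based approach, following the five parts listed above. It is not the linked solution: MiniLP, make, and str are hypothetical helpers, and concat is written as a free function taking both strings as parameters.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for LPString (not the class from LPString.h):
// text[0] holds the length and text[1..255] hold the characters.
struct MiniLP
{
	unsigned char text[256] = { 0 };
};

// Helper invented for this sketch: build a MiniLP from a C-string.
MiniLP make(const char* s)
{
	MiniLP r;
	int n = 0;
	for (; s[n] != '\0' && n < 255; n++)
		r.text[n + 1] = static_cast<unsigned char>(s[n]);
	r.text[0] = static_cast<unsigned char>(n);
	return r;
}

// Helper invented for this sketch: convert back to std::string for checking.
std::string str(const MiniLP& s)
{
	return std::string(s.text + 1, s.text + 1 + s.text[0]);
}

// Concatenate a and s into a new string; a plays the role of "this".
MiniLP concat(const MiniLP& a, const MiniLP& s)
{
	int total = a.text[0] + s.text[0];
	if (total > 255)					// 1. validate the combined length
		throw "concatenated string is too long";

	MiniLP local;						// 2. create the result
	for (int i = 1; i <= a.text[0]; i++)			// 3. copy the characters of a
		local.text[i] = a.text[i];
	for (int i = 1; i <= s.text[0]; i++)			// 4. append the characters of s
		local.text[a.text[0] + i] = s.text[i];
	local.text[0] = static_cast<unsigned char>(total);	// 5. set the length
	return local;
}
```

Computing the total as an int before storing it matters here: adding two lengths in an 8-bit character could wrap around and hide the overflow.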
Before and after pictures showing the parameter string's insertion into the target or this string. The target contains 'Hello world!' while the parameter contains 'new ' (note the space at the end, making the parameter four characters long). The insertion occurs at this[7], the location of the 'w' in the target. The picture illustrates shifting the characters in the target to the right four spaces.
The LPString insert function. The initial picture gives us an overview of how the insert function operates. But this is one of the most complex functions in the class, so it is helpful to draw some intermediate pictures showing more detail before programming the function.
void LPString::insert(const LPString& s, int index);
The LPString insert function requires two parameters: a string that the function will insert into the target or this string and the location in the target where the insertion will take place. Some parts of the insert function are similar to parts of the copy and append functions (Figures 6 and 13), and you may wish to review them before continuing. It's often easier to write complex functions like insert by breaking them down into logical steps:
  1. Verify that the total length of the two strings will not overflow the target; throw an exception if the total is too long.
  2. Verify that the index parameter is valid (i.e., it's inside the target string).
  3. The picture suggests we must make room in the target string before inserting additional characters. We make the space by shifting some target characters to the right, beginning at the location indicated by index. We shift the target characters to the right by the length of the parameter string. It's vital to recognize that the shift operation must begin with the rightmost character and proceed right-to-left.
  4. The next step, copying the characters from the parameter string to the target, is similar to the append function.
  5. Finally, update the length of the target or this string; this step is also like the append function.
When finished, please study the solution presented here.
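As a point of comparison, the five steps above can be sketched as follows. This is not the linked solution: MiniLP, make, and str are hypothetical helpers, insert is written as a free function that modifies its first parameter, and the sketch assumes index must name an existing character of the target.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for LPString (not the class from LPString.h):
// text[0] holds the length and text[1..255] hold the characters.
struct MiniLP
{
	unsigned char text[256] = { 0 };
};

// Helper invented for this sketch: build a MiniLP from a C-string.
MiniLP make(const char* s)
{
	MiniLP r;
	int n = 0;
	for (; s[n] != '\0' && n < 255; n++)
		r.text[n + 1] = static_cast<unsigned char>(s[n]);
	r.text[0] = static_cast<unsigned char>(n);
	return r;
}

// Helper invented for this sketch: convert back to std::string for checking.
std::string str(const MiniLP& s)
{
	return std::string(s.text + 1, s.text + 1 + s.text[0]);
}

// Insert s into target at index; target plays the role of "this".
void insert(MiniLP& target, const MiniLP& s, int index)
{
	int total = target.text[0] + s.text[0];
	if (total > 255)						// 1. overflow check
		throw "inserted string is too long";
	if (index < 1 || index > target.text[0])			// 2. index check
		throw "index is out of bounds";

	for (int i = target.text[0]; i >= index; i--)			// 3. shift right, working right-to-left
		target.text[i + s.text[0]] = target.text[i];
	for (int i = 1; i <= s.text[0]; i++)				// 4. copy s into the gap
		target.text[index + i - 1] = s.text[i];
	target.text[0] = static_cast<unsigned char>(total);		// 5. update the length
}
```

Running the shift loop left-to-right instead would overwrite characters before they are moved whenever the gap is shorter than the tail being shifted, which is why step 3 starts at the rightmost character.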

Downloadable Code

The example programs are formatted with tab stops set at 8 spaces.

View (1)        Download
LPString.h      LPString.h
LPString.cpp    LPString.cpp
client.cpp      client.cpp
(1) The behavior of these links depends on your browser and desktop configuration.