13.5.1. STL Examples: map, list, pair, and iterator

Review

In this section, we continue the practice of using the word count program to explore containers. However, rather than creating a container, we greatly simplify the programs by using existing STL containers. The basic program logic remains unchanged from the KVTree version: the program reads the input file one character at a time, converts the alphabetic characters to lower case, and appends them to a word. When the word is complete, the program searches a container for the word. If the word isn't in the container, the program adds it; otherwise, it increments the word's count. Please see the earlier example for details omitted for brevity.

The STL examples contrast a map implementation of the word-count program with two list implementations. The STL typically implements a map with a balanced binary search tree (commonly a red-black tree), providing fast insert and search operations. Both operations slow as the tree becomes larger. Fortunately, the slowdown is gradual: because the tree stays balanced, the runtime is proportional to log n, where n is the number of tree nodes. Conversely, lists are linear data structures whose insert and search operations degrade more quickly than a tree's as the data size increases. The time to search a list or find an insertion point in it is proportional to n. We begin by introducing some STL support classes.
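To make the difference between log n and n concrete, the following sketch counts worst-case steps under two simplifying assumptions: the tree is perfectly balanced, and a list search examines every element. The function names linear_steps and tree_steps are hypothetical, introduced only for this illustration; they are not part of the STL.

```cpp
#include <cstddef>

// Assumption: a worst-case list search examines all n elements.
std::size_t linear_steps(std::size_t n)
{
    return n;
}

// Assumption: the tree is perfectly balanced, so each comparison
// discards about half of the remaining nodes. The loop counts how
// many times n can be halved before nothing is left, which is
// floor(log2(n)) + 1 for n > 0.
std::size_t tree_steps(std::size_t n)
{
    std::size_t steps = 0;
    while (n > 0)
    {
        n /= 2;
        steps++;
    }
    return steps;
}
```

Under these assumptions, searching a list of 1,000,000 words may take up to 1,000,000 comparisons, while tree_steps(1000000) reports 20.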

STL Iterators

Reiterating (did you see what I did there?) our previous definition: "iterators are objects that sequentially access the elements stored in a container object," which allows a program to loop, or iterate, through them. The rudimentary iterator we created for the KVTree is quite limited compared with the STL iterators. Although the iterators used by the following programs are all named iterator, their full names, formed with a container class name and the scope resolution operator (for example, map<string,int>::iterator and list<w_count>::iterator), are unique: they are not the same class.

While programs instantiate the STL iterators from distinct classes, they share several crucial features:
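As a minimal sketch of that shared interface, the two function templates below rely only on operations that the following programs also use: begin() and end() to mark the range, != to test for the end, ++ to advance, and * to reach the current element. The helper names count_elements and total are hypothetical, chosen for this illustration.

```cpp
#include <list>
#include <map>
#include <string>
#include <vector>

// Counts the elements of any STL container by walking it with the
// container's own iterator type (here, its read-only const_iterator).
template <class Container>
int count_elements(const Container& c)
{
    int n = 0;
    for (typename Container::const_iterator i = c.begin(); i != c.end(); ++i)
        n++;
    return n;
}

// Adds the elements of a container of int values; *i dereferences the
// iterator to reach the element it currently refers to.
template <class Container>
int total(const Container& c)
{
    int sum = 0;
    for (typename Container::const_iterator i = c.begin(); i != c.end(); ++i)
        sum += *i;
    return sum;
}
```

The same loop compiles unchanged for a list, a vector, or a map, even though each container supplies a different iterator class.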

The pair class

The pair structure builds an association between two values whose types it represents with template variables. In Chapter 9, I claimed, "The only real difference between a struct and a class is the default visibility: structures have public visibility, and classes have private visibility. Nevertheless, C++ programmers generally only use structures to represent packaged data and reserve classes for truly object-oriented situations (i.e., when an entity should have attributes and operations packaged together)." pair is an exception: programmers often access its two fields, or members, directly, but giving it functions eases its use.

template <class T1, class T2>
struct pair
{
    T1 first;
    T2 second;
};
The pair structure specification. A pair object associates two general template values. However, the meaning of the values and their association only exists in a problem and the program solving it, making it impossible to derive member names reflecting either the association or the problem. Therefore, the pair structure generically names its members first and second. Several containers transparently use the pair structure, importing the specification themselves, but programmers can use it independently of other containers by including the <utility> header.
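The following sketch shows a pair used on its own, outside any container. It assumes a word paired with a count, as in the word-count problem; the helper names make_entry and bump are hypothetical, while std::pair, std::make_pair, first, and second come from the <utility> header.

```cpp
#include <string>
#include <utility>

// Builds a pair associating a word with an initial count of 1.
std::pair<std::string, int> make_entry(const std::string& word)
{
    return std::make_pair(word, 1);
}

// Increments the count, which the pair stores in its second member.
void bump(std::pair<std::string, int>& entry)
{
    entry.second++;
}
```

After calling make_entry, a program reads the word through the member first and the count through the member second, exactly as it would with a client-defined structure.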

WordCount: map and iterator

#include <iostream>
#include <fstream>
#include <iomanip>
#include <string>
#include <cctype>
#include <map>											// (a)
using namespace std;

int main()
{
    map<string, int>	words;									// (b)
    ifstream		file("alice.txt");
    int			c;
    string		word;

    while ((c = file.get()) != EOF)
    {
        if (isalpha(c))
            word += tolower(c);
        else if (word.length() > 0)
        {
            words[word]++;									// (c)
            word.clear();
        }
    }

    for (map<string,int>::iterator i = words.begin(); i != words.end(); i++)			// (d)
        cout << left << setw(20) << i->first << right << setw(3) << i->second << endl;		// (e)

    return 0;
}
map container WordCount solution. A map container's features closely match the word-count problem's requirements, making it the ideal container for a programmed solution. The map is implemented as a binary tree with two template variables, making inserting or finding words fast operations. The template implementation allows programmers to use any appropriate key and value types, specifically string and int in the word-count problem. Finally, the map container's functions and iterators are well-engineered and highly optimized, allowing us to focus on the problem while (mostly) ignoring the container. The map "hides" one significant implementation detail: it stores each key-value pair in an instance of the pair class.
  (a) The map class specification and function prototypes.
  (b) Instantiates a map container with string keys and int values.
  (c) If we can single out one statement in the program as pivotal, it is this one. If word isn't in the map, words[word] inserts it with a default 0 count. The expression also returns a reference to the value (i.e., the count) associated with word, which ++ increments.
  (d) The first for-loop expression defines a map iterator and initializes it to the first map node. The middle expression loops until the iterator has passed the last node. The last expression, i++, increments the iterator to the next node.
  (e) The cout statement formats the key and value for output. The iterator, i, points to a pair object, so the program accesses the key (word) and value (count) with the member names first and second.
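Because words[word] inserts a missing key as a side effect, it helps to see it next to a lookup that does not insert. The sketch below isolates both behaviors in two hypothetical helper functions, count_word and lookup; only operator[], find, and end are actual map members.

```cpp
#include <map>
#include <string>

// Counts one occurrence of word. If word is absent, operator[] first
// inserts it with a value-initialized count of 0; ++ then increments
// the count through the returned reference.
int count_word(std::map<std::string, int>& words, const std::string& word)
{
    return ++words[word];
}

// Reports a word's count without changing the map: find returns end()
// when the key is absent, so nothing is inserted.
int lookup(const std::map<std::string, int>& words, const std::string& word)
{
    std::map<std::string, int>::const_iterator i = words.find(word);
    return i == words.end() ? 0 : i->second;
}
```

Using find is one way to query a map when inserting a zero-count entry for every unknown word would be undesirable.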

WordCount 2: list, iterator, and Client Structure

#include <iostream>
#include <fstream>
#include <iomanip>
#include <string>
#include <cctype>
#include <list>											// (a)
using namespace std;

struct w_count											// (b)
{
    string	word;
    int		count;
};

int main()
{
    list<w_count>	words;									// (c)
    ifstream		file("alice.txt");
    int			c;
    string		word;

    while ((c = file.get()) != EOF)
    {
        if (isalpha(c))
            word += tolower(c);
        else if (word.length() > 0)
        {
            list<w_count>::iterator i = words.begin();						// (d)
            while (i != words.end() && word > i->word)
                i++;
            
            if (i != words.end() && word == i->word)				// (e)
                i->count++;
            else
            {
                w_count node;
                node.word = word;
                node.count = 1;
                words.insert(i, node);
            }

            word.clear();
        }
    }

    for (list<w_count>::iterator i = words.begin(); i != words.end(); i++)
        cout << left << setw(20) << i->word << right << setw(3) << i->count << endl;

    return 0;
}
A list storing a client structure. Programs typically access list data by position rather than content. The STL list class allows programs to insert and access data at the beginning or end, but they must use iterators to insert or access interior data. Because the program scans the list from the beginning for every word it reads, there is a noticeable delay between the program's start and its first output.
  (a) The list class specification and function prototypes.
  (b) The list container doesn't automatically create a K-V pair associating a word and its count, so the client program creates a structure to make the association.
  (c) Instantiates a list container storing w_count objects.
  (d) Without a search function, the program must scan the list from the beginning to find the word or its insertion point. It begins by creating a list iterator named i and looping through the words list. The expression word > i->word maintains the list in alphabetical order.
  (e) If the word is in the list, increment its count; otherwise, insert it with a count of 1.
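The scan-and-insert step is easier to examine in isolation. The sketch below moves it into a hypothetical helper function, add_word, that is not part of the original program. Note one deliberate choice: it tests i != words.end() before reading i->word, because the scan can stop at the end of the list (for example, when the list is empty), and an end iterator must not be dereferenced.

```cpp
#include <list>
#include <string>

// The client structure from the example: a word and its count.
struct w_count
{
    std::string word;
    int         count;
};

// Hypothetical helper: records one occurrence of word in an
// alphabetically ordered list.
void add_word(std::list<w_count>& words, const std::string& word)
{
    // Scan from the beginning until reaching the word or the first
    // entry that sorts after it (the insertion point).
    std::list<w_count>::iterator i = words.begin();
    while (i != words.end() && word > i->word)
        ++i;

    if (i != words.end() && word == i->word)
        i->count++;                 // word already present
    else
    {
        w_count node;
        node.word = word;
        node.count = 1;
        words.insert(i, node);      // inserts before position i
    }
}
```

Calling add_word with "b", "a", and "b" again leaves a two-element list holding "a" with a count of 1 followed by "b" with a count of 2.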

WordCount 3: list, pair, and iterator

#include <iostream>
#include <fstream>
#include <iomanip>
#include <string>
#include <cctype>
#include <list>											// (a)
#include <utility>										// pair and make_pair
using namespace std;


int main()
{
    list<pair<string,int>>	words;								// (b)
    ifstream			file("alice.txt");
    int				c;
    string			word;

    while ((c = file.get()) != EOF)
    {
        if (isalpha(c))
            word += tolower(c);
        else if (word.length() > 0)
        {
            list<pair<string,int>>::iterator i = words.begin();
            while (i != words.end() && word > i->first)
                i++;
            
            if (i != words.end() && word == i->first)
                i->second++;
            else
                words.insert(i, make_pair(word,1));						// (c)

            word.clear();
        }
    }

    for (list<pair<string,int>>::iterator i = words.begin(); i != words.end(); i++)
        cout << left << setw(20) << i->first << right << setw(3) << i->second << endl;

    return 0;
}
A list and pair solution. This list version is logically identical to the previous example, differing only by replacing the client w_count structure with the STL pair class.
  (a) The list and pair class specifications and function prototypes.
  (b) Instantiates a list storing instances of the pair class, which the program configures to hold a string and an int (notice the nested angle brackets).
  (c) The make_pair function creates a pair object, storing word with a count of 1. The insert function inserts the pair in the words list at the location indicated by the iterator i.
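The nested template names make the iterator declarations long. As an optional alternative, and assuming a compiler that supports C++11 or later, a range-based for loop with auto lets the compiler deduce the element type. The function total_count is a hypothetical helper that sums the counts stored in each pair's second member.

```cpp
#include <list>
#include <string>
#include <utility>

// Sums the counts of every word in a list of (word, count) pairs.
int total_count(const std::list<std::pair<std::string, int>>& words)
{
    int total = 0;
    for (const auto& entry : words)     // entry is one pair<string,int>
        total += entry.second;
    return total;
}
```

The loop visits the same elements as the explicit iterator version; it only hides the begin(), end(), and ++ bookkeeping.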

Downloadable Code

View             Download         Comments
WordCount.cpp    WordCount.cpp    A program counting the unique words in a book using the STL template classes map and iterator
WordCount2.cpp   WordCount2.cpp   A program counting the unique words in a book using the STL template classes list and iterator
WordCount3.cpp   WordCount3.cpp   A program counting the unique words in a book using the STL template classes list, iterator, and pair
alice.txt        alice.txt        The WordCount input file