14.1. Introduction To Files and I/O Streams

Review
A picture depicting a file as a long rectangle divided into smaller rectangles. Each small rectangle is numbered from 0 to n-1, where n is the number of small rectangles. The picture represents a file as a contiguous sequence of bytes. The large rectangle is the file, and each small rectangle is a byte in the file.
A file of n bytes. We can view a file as an array of bytes. The OS maintains the current position or location in open files with a position pointer. The pointer is an instance of the streampos class but works like an index into an array. The next I/O operation, read or write, occurs at the current position, and the operation advances the pointer by the number of bytes read or written.
A file is a named collection of related information recorded on secondary storage. From a user's perspective, a file is the smallest allotment of logical secondary storage; the OS can only store data in secondary storage as a file. Commonly, files represent programs (both source and object form) and data. Data files may be numeric, alphabetic, alphanumeric, or binary. Files may be free form, such as text files, or formatted rigidly. In general, a file is a sequence of bits, bytes, lines, or records, the meaning of which is defined by the file's creator and user. The concept of a file is thus extremely general. (p. 384).
Silberschatz, A.; Galvin, P. B.; & Gagne, G.
Operating System Concepts Essentials
John Wiley & Sons, Inc., 2011

Data, as it moves between a program and a file, is just a stream of bytes – think of marbles rolling down a hose. C++ represents and manages the streams as objects instantiated from a family of classes. The names of the classes and the associated header files reflect the specific purpose of the streams: Some streams perform input operations, while others perform output operations; some streams only read from and write to the console, while others read from and write to stored files. When C++ was first released, the I/O streams were included as an example of a class, while today, they are a fully integrated part of the ANSI definition of the language.

We already know quite a bit about streams and have considerable experience using them. cin and cout are instances of two classes named istream and ostream respectively. These two stream objects read from and write to the console, which is just a special file. The figure below illustrates I/O streams as a class hierarchy. Different classes in the hierarchy allow us to instantiate new objects that can access different kinds of files (such as those on a disk or a flash drive) in various ways. Although we can access more kinds of files with these stream objects, the syntax for reading from or writing to them is the same as for reading from or writing to the console with cin and cout.

A complex UML class diagram of the fundamental I/O stream classes. The top class is named 'ios' for input/output system. It has an aggregated 'streambuf' class, which is the superclass of 'filebuf'. 'istream' (input stream) and 'ostream' (output stream) are subclasses of 'ios'. 'istream' has two subclasses: 'ifstream' (input file stream) and 'iostream' (input/output stream). 'ostream' also has two subclasses: 'ofstream' (output file stream) and 'iostream' (input/output stream). So, 'iostream' has two superclasses: 'istream' and 'ostream'. 'iostream' has one subclass: 'fstream' (file stream).
UML class diagram of the C++ I/O stream classes. The classes that support console I/O are colored blue in the UML diagram, while the more general file I/O classes are colored green. The <iostream> header file contains the class specifications and function prototypes necessary for console I/O, while the <fstream> header file contains the class specifications and prototypes needed for file I/O. (C++ spreads the classes over more header files than this, but these two #include the others, making it unnecessary to include them explicitly.)

The ios class is at the top of the input/output system. It defines several basic functions and some symbolic constants. The constants allow us to change or configure the behavior of the input and output objects. Letters appearing in the names of the classes indicate the class's primary features:

i input - read data into the program
o output - write data from the program
f file - the target of the operation is a file
One class, iostream, utilizes multiple inheritance, which means that it inherits features from both istream and ostream. Curiously, including fstream in a program automatically includes some but not all of the features of iostream. (This behavior was observed on Windows and Linux.) For clarity and portability, I recommend including both header files in programs performing console and file I/O. See <ios> for more detail and a different view of the classes.

From the UML diagram above, we can see that ifstream is a subclass of istream and that ofstream is a subclass of ostream. This means that once we make instances of the file streams, it is just as easy to use them as it is to use cin and cout (instances of istream and ostream respectively). Thus, we easily extend our previous experience to perform familiar console operations with files.

(a)
int i;
double d;
. . .
cin >> i;
cout << d << endl;

(b)
ifstream input("data1.txt");
ofstream output("data2.txt");
. . .
input >> i;
output << d << endl;
File I/O is similar to console I/O.
  (a) Reading data from and writing data to the console with the extractor ( >> ) and the inserter ( << ) operators are now familiar operations.
  (b) This simplified code fragment illustrates that using the extractor and the inserter operators with file stream objects is similar to using them with console streams.

The formal definition of a file presented above states that "a file is a sequence of bits, bytes, lines, or records." Although files may contain bitwise data, most operating systems don't provide a way of accessing individual bits, which leaves us with three ways of accessing file data: (a) bytes or characters, (b) lines, and (c) records or blocks. We'll organize much of our discussion of file I/O around these three access techniques. But before we can read from or write to a file, we must first open the file, and there are a few prerequisite concepts that we must understand before we can open a file. The next section presents these concepts.