14.10. Random and Direct Access

Review

Understanding random or direct file access relies on concepts introduced previously. Please review the following as needed:

The current chapter began with two figures. The first suggests that a file looks like an array of bytes with a position pointer indicating where the program is currently reading or writing data. The second illustrates the primary C++ input/output system classes. Significantly, the iostream class has two superclasses, istream and ostream, and inherits a position pointer from each. So, fstream, a subclass of iostream, has two position pointers, one for input and one for output. For file streams, the underlying file buffer maintains a single joint file position, so moving one pointer of an fstream also moves the other.

When it is important to do so, we distinguish the two pointers as "get" (read or input) and "put" (write or output). In preparation for accessing a specific data item, the program moves the pointers within a file with one of four overloaded "seek" functions. The next I/O operation occurs at the new position in the file. Additionally, two "tell" functions report a stream's current position within a file. The name of each function ends with either a "p" or a "g" to denote a specific pointer.

Function Description
istream& seekg(streampos pos);
ostream& seekp(streampos pos);
Moves the "get" or "put" stream position to an absolute location, pos, in the file. Absolute positions are measured in bytes from the beginning of the file; pos must be ≥ 0.
istream& seekg(streamoff off, ios::seekdir loc);
ostream& seekp(streamoff off, ios::seekdir loc);
Moves the "get" or "put" stream position to a location relative to one of three file locations. The offset, off, is the number of bytes added to (positive) or subtracted from (negative) the specified location, loc, as denoted by one of the symbolic constants:
  • ios::beg beginning of the file; off must be ≥ 0
  • ios::cur current position in the file; see note
  • ios::end end of the file; see note
Note: It is not possible to seek before the beginning of the file, so loc + off must be ≥ 0. Seeking beyond the file's end may be possible but is rarely useful, so programs normally keep loc + off ≤ file size; with ios::end, that means off ≤ 0.
streampos tellg();
streampos tellp();
Returns the current position of the "get" or "put" position pointer within the file. The returned position is measured in bytes from the beginning of the file. Both functions return -1 on failure.
seekg and seekp. The file positioning functions allow programs to access data in any desired order, not just sequentially. As the order isn't predetermined, it's often described as "random." A program can call the read or write functions repeatedly, resulting in sequential access. However, switching between reading and writing requires an intervening "seek" function call.
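The positioning functions can be sketched in a short example. The helper names stream_size and swap_byte are hypothetical, chosen only for illustration. The first combines a relative seek with tellg to measure a file, and the second shows the seek that must separate a read from a write on the same fstream. This is a minimal sketch that assumes the stream was opened in binary mode for both input and output.

```cpp
#include <cassert>
#include <fstream>
#include <string>
using namespace std;

// Hypothetical helper: returns the size of the stream in bytes, or -1 if
// the current position cannot be determined. The original "get" position
// is restored before returning.
streamoff stream_size(istream& in)
{
    streampos saved = in.tellg();          // remember the current position
    if (saved == streampos(-1))
        return -1;                         // tellg reports failure as -1
    in.seekg(0, ios::end);                 // relative seek: 0 bytes from the end
    streampos end = in.tellg();            // the end position equals the size
    in.seekg(saved);                       // absolute seek back to where we were
    return static_cast<streamoff>(end);
}

// Hypothetical helper: reads the byte at absolute offset pos, overwrites it
// with replacement, and returns the byte that was there before.
char swap_byte(fstream& f, streamoff pos, char replacement)
{
    assert(f.is_open());
    char old = '\0';
    f.seekg(pos, ios::beg);                // position the "get" pointer
    f.get(old);                            // read one byte
    f.seekp(pos, ios::beg);                // intervening seek before writing
    f.put(replacement);                    // overwrite the same byte
    f.flush();
    return old;
}
```

A caller would open the file with ios::in | ios::out | ios::binary before passing the stream to either function.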

Direct Access

In the Block I/O section, we learned that we can group the bytes together to form blocks or records, and that we can read and write block-oriented files sequentially. We can implement direct access by combining block I/O, the file positioning functions, and records. The previous discussion specified the seek positions in bytes. But a file of records looks and behaves very much like an array of objects. If we index into an array of objects, the compiler automatically converts the array index into a byte address in memory. To implement direct access, we need a way to map a record number (aka block number), the analog of an array index, to a byte address within the file. Unfortunately, C++ doesn't have a way to complete the mapping automatically, so we must do it manually.

byte address = record number × size of a record
record number = byte address / size of a record

(a)
struct chunk { . . . };

byte = record * sizeof(chunk);
record = byte / sizeof(chunk);
(b)
byte = 7 * 10 = 70
record = 70 / 10 = 7
(c)
A picture demonstrating the relationship between byte addresses and record numbers. Assuming that each record is ten bytes long, record number 5 is located at byte address 50.
(d)
The relation between byte addresses and record numbers. Byte addresses and record numbers begin with zero and continue through the end of the file. The last valid byte address is the file's size minus one, while the last valid record number is the number of records minus one. That is, both addressing schemes are zero-indexed. A simple mapping function converts a record or block number into a byte address that the positioning functions can use. A similar mapping function converts from a byte address to a record number.
  (a) Pseudo code for the mapping functions.
  (b) C++ code for the mapping functions. Replace chunk with a specific structure or class from the problem.
  (c) A simple example of the mapping functions assuming, for simplicity, that the size of a chunk is 10 bytes.
  (d) An abstract representation of the example. Each rectangle is a record, that is, an instance of the chunk structure. For the purpose of illustrating the problem, we assume that the size of each record is 10 bytes. The bottom rectangle or last record is located at byte address 70 in the file; it spans the bytes 70 through 79, and its record number is 7.
One of the hallmarks of direct access is that a program can seek to a specific record, read it, modify it, seek to the same record number, and write the modified record back into the file without affecting any other record.
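The mapping functions and the read-modify-write cycle described above might be combined as follows. This is a sketch under stated assumptions: the fields of chunk and the helper names to_byte, to_record, read_record, write_record, and update_id are hypothetical, and the stream is assumed to be open in binary mode for both reading and writing.

```cpp
#include <cassert>
#include <fstream>
#include <type_traits>
using namespace std;

// Hypothetical fixed-size record standing in for the section's "chunk".
struct chunk
{
    int  id;
    char name[12];
};
static_assert(is_trivially_copyable<chunk>::value,
              "block I/O requires a trivially copyable record");

// Map a record number to a byte address.
streamoff to_byte(streamoff record)
{
    assert(record >= 0);
    return record * static_cast<streamoff>(sizeof(chunk));
}

// Map a byte address to a record number (integer division).
streamoff to_record(streamoff byte)
{
    return byte / static_cast<streamoff>(sizeof(chunk));
}

// Seek to the record and read it into c; returns false if the read fails.
bool read_record(fstream& data, streamoff record, chunk& c)
{
    data.seekg(to_byte(record), ios::beg);
    return static_cast<bool>(
        data.read(reinterpret_cast<char*>(&c), sizeof(chunk)));
}

// Seek to the record and overwrite it with c; returns false on failure.
bool write_record(fstream& data, streamoff record, const chunk& c)
{
    data.seekp(to_byte(record), ios::beg);
    return static_cast<bool>(
        data.write(reinterpret_cast<const char*>(&c), sizeof(chunk)));
}

// Read-modify-write: change one field of one record, leaving all
// other records in the file untouched.
bool update_id(fstream& data, streamoff record, int new_id)
{
    chunk c;
    if (!read_record(data, record, c))
        return false;                      // e.g., record is past the end
    c.id = new_id;
    return write_record(data, record, c);  // seeks back to the same record
}
```

If a read fails because the record number lies beyond the end of the file, the stream's error flags remain set, so a caller would need data.clear() before using the stream again.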

Common Programming Patterns

struct chunk { . . . };
chunk c;
fstream data;
Specification and definitions for the examples
data.seekp(0, ios::end);
data.write((char *) &c, sizeof(chunk));
Write a new record at the end of the file
data.seekg(0);
data.read((char *) &c, sizeof(chunk));
Rewind to the beginning of the file and read the first record.
data.seekg(0);
while (data.read((char *) &c, sizeof(chunk)))
	....
Rewind to the beginning of the file and read all records in sequence
data.seekp(data.tellg() - (streamoff)sizeof(chunk));
data.write((char *) &c, sizeof(chunk));
Searching and replacing - seek to the position after the last read, back up one record, and overwrite the record that's already there
data.seekp(-(streamoff)sizeof(chunk), ios::cur);
data.write((char *) &c, sizeof(chunk));
Replace or overwrite the last record written
data.seekp(0, ios::end);		  // (a)
streampos pos = data.tellp();
data.write((char *) &c, sizeof(chunk));
		. . .
data.seekg(pos);			  // (b)
data.read((char *) &c, sizeof(chunk));
		. . .
data.seekp(pos);			  // (c)
data.write((char *) &c, sizeof(chunk));
Simple database operations. Groups of code are separated by time. The variable pos may be a field in another object.
  (a) Move to the end of the data file, save the position, and write a new record
  (b) Move to the saved position and read the stored record
  (c) Move to the saved position and replace (overwrite) the existing record
Using the positioning functions to implement direct access.
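Several of the patterns above can be assembled into two small functions. The following is only a sketch: the fields of chunk and the function names append and replace_by_key are hypothetical, and the stream is assumed to be open in binary mode for both input and output. The first function appends a record and returns its saved position; the second searches sequentially, backs up one record after a match, and overwrites the record in place.

```cpp
#include <cassert>
#include <fstream>
using namespace std;

// Hypothetical record layout, used for illustration only.
struct chunk
{
    int    key;
    double value;
};

// Append a record at the end of the file and return the byte position
// where it was stored, so that the caller can seek back to it later.
streampos append(fstream& data, const chunk& c)
{
    assert(data.is_open());
    data.seekp(0, ios::end);
    streampos pos = data.tellp();          // save the position before writing
    data.write(reinterpret_cast<const char*>(&c), sizeof(chunk));
    return pos;
}

// Search for the first record whose key matches and overwrite it with
// replacement. Returns the record number, or -1 if no record matches.
streamoff replace_by_key(fstream& data, int key, const chunk& replacement)
{
    assert(data.is_open());
    chunk cur;
    data.clear();                          // reset any earlier eof/fail state
    data.seekg(0, ios::beg);               // rewind
    while (data.read(reinterpret_cast<char*>(&cur), sizeof(chunk)))
    {
        if (cur.key == key)
        {
            // The get position is just past the match; back up one record.
            streampos pos = data.tellg() - static_cast<streamoff>(sizeof(chunk));
            data.seekp(pos);
            data.write(reinterpret_cast<const char*>(&replacement), sizeof(chunk));
            return static_cast<streamoff>(pos) / static_cast<streamoff>(sizeof(chunk));
        }
    }
    data.clear();                          // the loop ended at end of file
    return -1;
}
```

The position returned by append plays the role of pos in the database example; a program might store it in an index or in a field of another object.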