14.3. File Buffers

Our previous discussions involving variables, functions, pointers, and other data-related features all centered around main memory. Main memory is fast but requires constant power to "remember" the stored data and programs. Secondary memory is slower than main memory, but it can persistently store large amounts of data and programs. The "persistent" nature of secondary memory means that it doesn't lose its contents when the computer is powered off. Modern computers use many physical devices for secondary memory, but we'll restrict our discussion to disk drives.

A picture illustrating one magnetic disk drive platter.
A magnetic hard disk drive (HDD). An HDD consists of one or more disks or platters covered in a magnetic material that can store digital data. Drives organize data on both sides of a platter as concentric tracks and further divide each track into small arcs called sectors. Each sector consists of a header, a data area, and a trailer. "The header and trailer contain information used by the disk controller" (Silberschatz, Galvin, & Gagne, 2011, p. 468). An arm moves a read/write head from one track to another. The drive reads or writes data from sectors as they pass under the read/write head.

In a typical HDD, the disks spin at 7200 rotations per minute (RPM), or 120 rotations per second. So, a disk takes 1/120th of a second, a little more than 8 milliseconds, to make one complete rotation, and a needed sector can take up to that long to come around to the read/write head. The time that a program waits while a needed sector rotates around to the read/write head is called the rotational latency. The time it takes for the arm to move from its current position to a needed track is called the seek time. Rotational latency and seek time seem small in terms of human perception, but they are very long compared to the speeds of a modern CPU and main memory.
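The arithmetic behind these figures can be sketched in a few lines of C++. The function names and the clock rate passed to cycles_during are illustrative assumptions, and the average-latency function assumes the needed sector is equally likely to be anywhere on the track.

```cpp
#include <cassert>

// Time for one full rotation of the disk, in milliseconds.
// rpm is the spindle speed in rotations per minute.
double rotation_time_ms(double rpm) {
    assert(rpm > 0.0);
    return 60.0 * 1000.0 / rpm;   // 60,000 ms per minute / rotations per minute
}

// Average rotational latency, assuming the needed sector is, on average,
// half a rotation away from the read/write head.
double average_rotational_latency_ms(double rpm) {
    return rotation_time_ms(rpm) / 2.0;
}

// Rough number of CPU clock cycles that elapse during a delay.
// clock_ghz is a hypothetical CPU clock rate in GHz.
double cycles_during(double delay_ms, double clock_ghz) {
    return delay_ms * 1e-3 * clock_ghz * 1e9;
}
```

Calling rotation_time_ms(7200.0) gives about 8.33 milliseconds. Passing that delay to cycles_during with an assumed 3 GHz clock gives roughly 25 million cycles, which suggests how much work a CPU could otherwise do while waiting for a single rotation.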

Solid state drives (SSDs) use various integrated circuits to store data. SSDs do not have moving parts, so they are faster and quieter than HDDs, use less power, and generate less heat, but they are also more expensive. The construction differences between HDDs and SSDs notwithstanding, SSDs still operate more slowly than CPUs and main memory.

The speed difference between main and secondary memory makes it impractical to read or write a file one byte at a time. "To improve I/O efficiency, I/O transfers between memory and disk are performed in units of blocks. Each block has one or more sectors. Depending on the disk drive, sector size varies from 32 bytes to 4,096 bytes; the usual size is 512 bytes" (Silberschatz, Galvin, & Gagne, 2011, p. 424). The drive reads an entire block whenever a program requests to read data. Similarly, the drive writes an entire block when a program requests to write data. But what happens when the program needs to read or write less than a full block of data?
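To make block-sized transfers concrete, the following sketch computes which whole blocks a drive must transfer to satisfy a request for a few bytes. It assumes, purely for illustration, that a file's bytes map linearly onto fixed-size blocks of one 512-byte sector each; kBlockSize, BlockRange, and blocks_for_request are hypothetical names, not part of any library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical block size for illustration: one 512-byte sector per block.
constexpr std::size_t kBlockSize = 512;

struct BlockRange {
    std::size_t first_block;  // index of the first block touched
    std::size_t block_count;  // number of whole blocks transferred
};

// Which whole blocks must be transferred to satisfy a request for
// `length` bytes starting at byte `offset`?
BlockRange blocks_for_request(std::size_t offset, std::size_t length) {
    assert(length > 0);
    std::size_t first = offset / kBlockSize;
    std::size_t last  = (offset + length - 1) / kBlockSize;
    return BlockRange{first, last - first + 1};
}
```

Under these assumptions, a request for a single byte at offset 0 still transfers one whole block, and a four-byte request starting at offset 510 straddles a block boundary and transfers two.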

The UML class diagram appearing at the beginning of the chapter suggests that all stream classes inherit a buffer named filebuf from the ios class at the top of the hierarchy. In general, a buffer is something that sits between two or more entities and cushions them as they bump together; buffers also control what happens as two or more things mix or come together. A computer buffer is a block of memory that temporarily holds data as the computer moves it from one place to another. A buffer is useful when the data source has a different size or capacity than the destination or when the source and destination operate at different speeds. Both buffer properties are useful when transferring data between a program and a file on a disk drive.
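In standard C++, a program can reach a file stream's buffer through the stream's rdbuf() member function, which for std::ofstream returns a pointer to a std::filebuf. The sketch below is only an illustration: write_flush_read is a hypothetical helper, and exactly when buffered characters reach the disk without an explicit flush depends on the implementation.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Writes `text` to the file at `path`, flushes the stream's buffer, reads
// the first line back through a second stream, and deletes the file.
std::string write_flush_read(const std::string& path, const std::string& text) {
    std::ofstream out(path);
    assert(out.is_open());

    std::filebuf* buffer = out.rdbuf();   // the stream's file buffer
    assert(buffer != nullptr && buffer->is_open());

    out << text << '\n';   // the characters go into the buffer first
    out.flush();           // hand the buffered characters to the operating system

    std::string line;
    {
        std::ifstream in(path);   // a second stream with its own buffer
        std::getline(in, line);
    }

    out.close();
    std::remove(path.c_str());    // clean up the demonstration file
    return line;
}
```

The program never touches the disk directly here. It inserts characters into the output stream, and the stream's buffer sits between the program and the file.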

When a program reads data from a disk file, it reads an entire block of data from the disk and stores it in the stream's file buffer. The program takes the data from the buffer in "chunks" that are appropriate for its needs. For example, it takes one byte when it needs a char, typically four bytes when it needs an int, and typically eight bytes when it needs a double. When the buffer is empty, the program reads a new block. When a program writes data to a disk file, it stores the data in an instance of filebuf and writes the buffer to the disk only when the buffer is full or when the stream is closed or flushed.
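The reading behavior described above can be imitated with a toy model. This is not how filebuf is implemented; ToyDisk, ToyBufferedReader, and kToyBlockSize are hypothetical names, and the "disk" is just a vector that hands out whole 512-byte blocks and counts how many it was asked for.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>
#include <vector>

constexpr std::size_t kToyBlockSize = 512;   // assumed block size

// A stand-in for a disk file: it only hands out whole blocks.
class ToyDisk {
public:
    explicit ToyDisk(std::vector<unsigned char> bytes) : bytes_(std::move(bytes)) {}

    // Copies block number `index` into `dest`; returns the bytes copied.
    std::size_t read_block(std::size_t index, unsigned char* dest) {
        ++block_reads_;
        std::size_t start = index * kToyBlockSize;
        if (start >= bytes_.size()) return 0;
        std::size_t n = std::min(kToyBlockSize, bytes_.size() - start);
        std::memcpy(dest, bytes_.data() + start, n);
        return n;
    }

    std::size_t block_reads() const { return block_reads_; }

private:
    std::vector<unsigned char> bytes_;
    std::size_t block_reads_ = 0;
};

// Serves small requests from an in-memory buffer and goes back to the
// disk only when the buffer is empty.
class ToyBufferedReader {
public:
    explicit ToyBufferedReader(ToyDisk& disk) : disk_(disk) {}

    // Copies `count` bytes into `dest`; returns false if the data runs out.
    bool read(void* dest, std::size_t count) {
        assert(dest != nullptr);
        unsigned char* out = static_cast<unsigned char*>(dest);
        while (count > 0) {
            if (pos_ == filled_) {   // buffer is empty: fetch the next block
                filled_ = disk_.read_block(next_block_++, buffer_);
                pos_ = 0;
                if (filled_ == 0) return false;
            }
            std::size_t n = std::min(count, filled_ - pos_);
            std::memcpy(out, buffer_ + pos_, n);
            out += n;
            pos_ += n;
            count -= n;
        }
        return true;
    }

private:
    ToyDisk& disk_;
    unsigned char buffer_[kToyBlockSize];
    std::size_t filled_ = 0;
    std::size_t pos_ = 0;
    std::size_t next_block_ = 0;
};
```

A caller might read a char with reader.read(&c, 1) and an int with reader.read(&value, sizeof value). In this model, every request that falls within the first 512 bytes is served by a single block read, and the disk is consulted again only when a request reaches past the end of the buffered block.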


Silberschatz, A., Galvin, P. B., & Gagne, G. (2011). Operating System Concepts Essentials. John Wiley & Sons, Inc.