General-purpose operating systems (OS) provide a host environment for running programs. A significant component of the host environment is a file system consisting of files and directories stored in persistent secondary memory. Two variations between systems affect how C++ programs use the file system. The first is how the system organizes and names the files. The second is how the system separates the lines in a line-oriented text file, which ultimately determines whether the OS distinguishes between text and binary files. Understanding these differences is a prerequisite for writing effective programs utilizing file I/O.
From the formal definition of a file presented at the beginning of the chapter, we know that files have names, which programs use to access their contents. Most contemporary operating systems use one of two organizational conventions.
[Figure: the two file-system organizations, panels (a) and (b).]
The root directory is named with a back-slash, \, on Windows systems and a forward-slash, /, on POSIX-compliant systems. A sub-directory may have any number of files and sub-directories, but every name within a directory must be unique. Standard directories in both organizations have the same or similar names based on their common use.
Example | Windows | POSIX |
---|---|---|
(a) | \Users\dilbert\Music\Neil Diamond\Classics\Shilo.mp3 | /home/dilbert/Music/Neil Diamond/Classics/Shilo.mp3 |
(b) | Music\Neil Diamond\Classics\Shilo.mp3 | Music/Neil Diamond/Classics/Shilo.mp3 |
(c) | ..\dilbert\Music\Neil Diamond\Classics\Shilo.mp3 | ../dilbert/Music/Neil Diamond/Classics/Shilo.mp3 |
(d) | Shilo.mp3 | Shilo.mp3 |

POSIX systems use the forward-slash, /, as the separator, while Windows traditionally used the back-slash, \. However, while Windows still uses the back-slash character to report pathnames, it does accept the forward-slash for input. Each example row pairs a Windows pathname with its POSIX equivalent.
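The pathname examples above can be explored in code. The following is a minimal sketch, assuming a C++17 compiler and the standard `<filesystem>` library; the helper names `joinPath`, `baseName`, and `tidy` are hypothetical and chosen only for illustration.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: appends a name to a directory. operator/ inserts
// the separator preferred by the host system (\ on Windows, / on POSIX).
fs::path joinPath(const fs::path& directory, const std::string& name)
{
    return directory / name;
}

// Hypothetical helper: returns the last component of a pathname,
// for example "Shilo.mp3".
std::string baseName(const fs::path& p)
{
    return p.filename().string();
}

// Hypothetical helper: removes "." components and cancels each "name/.."
// pair purely lexically, without consulting the file system.
fs::path tidy(const fs::path& p)
{
    return p.lexically_normal();
}
```

Because the examples use forward slashes, which both Windows and POSIX accept for input, a call such as `baseName("Music/Neil Diamond/Classics/Shilo.mp3")` behaves the same on either system.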
Windows uses the \ character as the root's name and as the file separator. POSIX systems use the / character for the same two purposes. In a pathname, .. represents the parent of, or one level up from, the current working directory, and . represents the current working directory.

The formal definition of a file presented in the previous section asserts that "Data files may be numeric, alphabetic, alphanumeric, or binary." Numeric data is just a special case of binary data, and together, alphabetic and alphanumeric data form a more general class called textual data. These generalizations allow us to simplify one aspect of file I/O by focusing on two file types: textual and binary. We won't deal with textual and binary data combinations, but when a combination of data occurs in practice, the program typically treats it as a binary file. The distinction between binary and textual data centers around how a file system marks the separation between the lines in a line-oriented text file.
POSIX | Windows |
---|---|
See the quick\n red fox jump\n over the lazy\n brown dog.\n | See the quick\r\n red fox jump\r\n over the lazy\r\n brown dog.\r\n |
(a) | (b) |
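Line separators. (a) POSIX systems end each line with a single newline, \n. (b) Windows systems use two characters, \r\n: a carriage return followed by a newline. Researchers originally developed the C and C++ programming languages on Unix systems, which use a single newline as the line separator. (Classic Mac OS, before Apple adopted a Unix kernel, used a single \r as the line separator.)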
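The three conventions can be told apart by inspecting the first line break in a buffer. The function below is only an illustrative sketch: the name `lineSeparatorOf` and its string return values are assumptions made for this example, and it presumes the buffer holds the file's raw bytes with no mapping applied.

```cpp
#include <string>

// Hypothetical helper: reports the line-separator convention used by the
// first line break in a buffer of raw (unmapped) file contents.
// Returns "CRLF" (Windows), "LF" (POSIX), "CR" (classic Mac OS), or
// "none" when the buffer contains no line break at all.
std::string lineSeparatorOf(const std::string& text)
{
    const std::size_t pos = text.find_first_of("\r\n");
    if (pos == std::string::npos)
        return "none";
    if (text[pos] == '\n')
        return "LF";                      // a newline with no preceding \r
    if (pos + 1 < text.size() && text[pos + 1] == '\n')
        return "CRLF";                    // a \r immediately followed by \n
    return "CR";                          // a \r on its own
}
```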
Programs that process files by searching for the line separator are difficult to port between systems utilizing different line-separator conventions. To alleviate the problem, C and C++ programs running on Windows systems map the \r\n characters to a single \n when reading text files, and perform the reverse mapping, \n to \r\n, when writing them. Unfortunately, the mapping causes another problem: the bytes in a binary file may have any one of 256 values, including values corresponding to the ASCII newline and carriage return characters. If a file contains binary data (e.g., an image or audio file), discarding or inserting a byte corrupts the data. To circumvent this problem, C and C++ allow programmers to specify a file's mode, text or binary, when opening it. POSIX systems don't distinguish between text and binary files and don't perform the mapping.
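The mapping can be imitated by hand to see why it is harmless for text but destructive for binary data. The two functions below are a sketch of the idea only, not the runtime library's actual implementation; the names `crlfToLf` and `lfToCrlf` are hypothetical.

```cpp
#include <string>

// Hypothetical helper: imitates the read-side mapping by turning each
// \r\n pair into a single \n.
std::string crlfToLf(const std::string& raw)
{
    std::string out;
    out.reserve(raw.size());
    for (std::size_t i = 0; i < raw.size(); ++i) {
        if (raw[i] == '\r' && i + 1 < raw.size() && raw[i + 1] == '\n')
            continue;               // drop the \r; the \n is copied next
        out.push_back(raw[i]);
    }
    return out;
}

// Hypothetical helper: imitates the write-side mapping by placing a \r
// in front of every \n.
std::string lfToCrlf(const std::string& text)
{
    std::string out;
    for (char c : text) {
        if (c == '\n')
            out.push_back('\r');
        out.push_back(c);
    }
    return out;
}
```

Applied to text, the two functions undo each other. Applied to binary data that happens to contain the byte values 13 and 10 side by side, `crlfToLf` silently shortens the data by one byte, which is the corruption described above.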
ifstream in("filename", ios::binary); | ifstream in; in.open("filename", ios::binary); |
(a) | (b) |
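Opening a file in binary mode. A program requests binary mode by passing the flag ios::binary to (a) the stream's constructor or (b) its open function. Files not explicitly opened in binary mode are opened in text mode by default.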
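One way to observe the effect of the mode is to write the same characters in each mode and compare the number of bytes that reach the disk. The following is a minimal sketch under stated assumptions: the function name `storedSize` is hypothetical, and it assumes the program may create and delete a scratch file in the current working directory.

```cpp
#include <cstdio>
#include <fstream>
#include <string>

using namespace std;

// Hypothetical helper: writes text to a scratch file in text or binary
// mode, then reports how many bytes the file occupies on disk.
streamoff storedSize(const string& filename, const string& text, bool binary)
{
    ios::openmode mode = ios::out;
    if (binary)
        mode |= ios::binary;        // suppress the \n to \r\n mapping

    {
        ofstream out(filename, mode);
        out << text;
    }                               // leaving the block closes the file

    // Reopen in binary mode, positioned at the end, so that tellg()
    // reports the unmapped size of the file.
    ifstream in(filename, ios::binary | ios::ate);
    const streamoff size = in.tellg();
    in.close();

    std::remove(filename.c_str());  // delete the scratch file
    return size;
}
```

For the four characters `"a\nb\n"`, binary mode stores four bytes on every system. In text mode, a POSIX system also stores four bytes, while a Windows system stores six because each \n becomes \r\n.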
The end of the last section suggested three common ways of accessing a file's contents. It's now appropriate to revisit these three ways in the context of text versus binary files: