9.4. Class Development

Review
Analysis, design, and programming visualized as three circles. The developer's attention can alternate between analysis and design; between design and programming; and from programming to analysis, but it doesn't typically go directly from analysis to programming.
Three phases of software development. Developers focus their attention on one activity at a time, switching quickly between them as they refine a software component. The process typically forms a cycle: analysis to design, design to programming, and programming to analysis. Smaller internal cycles between analysis and design and between design and programming also occur.

Class development spans three distinct phases of the software development process: analysis, design, and implementation. First, analysts view a problem through an object-oriented lens, identify the objects they find, and abstract them into domain classes. They also identify the data the system maintains and the operations it performs, initially assigning the features to classes according to their "natural responsibilities." UML class diagrams are a convenient way to capture and organize these observations.

Developers refine the domain classes during the design phase. They discard extraneous classes and add implementation classes (classes that don't exist in the problem but are necessary for a program). They also discard, add, modify, and relocate features during the design phase.

In the final phase, programmers express the classes as working program components. One of the benefits of an object-oriented development lifecycle is that it maintains a consistent "vocabulary" of classes throughout the process. The domain classes identified during analysis and the implementation classes added during design are the same classes that appear in the final program.

Contemporary software development methodologies include these phases in extensive development processes. Some are suitable for rapidly designing software, while others are better suited for designing large systems that must function for many years. Although many methodologies are in use today, there isn't an algorithm for selecting classes, attributes, or operations. Consequently, developers generally apply the phases in cycles, initially focusing on small components (e.g., individual classes) and refining them until the component is functional. You will formally study some methodologies in subsequent courses. Here, we take a more informal approach focused on individual classes.

The Two-Hat Technique

Another object-oriented approach to software development is to view a class from two different perspectives: that of the class developer and that of the class user. The developer focuses on organizing the class's data and implementing the services it supplies. In contrast, the class user treats the class as a black box, focusing on how it can help solve a problem while ignoring its internal details. The distinction is greatest for general, reusable classes, where the developer imagines all the services the class should supply. The string class is a good example: developers can't foresee all the programs that will use it, but they can imagine the services it should supply. Conversely, programmers using the string class don't need to know how it works, only what it does.

The two-hat technique is a helpful way to differentiate between the two roles. Although it may be the same person or group creating and using a class, it's beneficial to adopt the appropriate perspective. You may switch between the developer and user roles quickly and frequently, but pausing briefly and intentionally shifting your perspective results in better code.

Server / Supplier
Picture of a person wearing a 'Developer' hat.
  • Make the best possible class without worrying about how the class will be used
  • What services or operations should the class naturally provide?
  • What data must the class maintain to provide the desired services?
  • "Shopping list" features (explained below) for general, low-level classes; narrow features for specialized classes
Client
Picture of a person wearing a 'User' hat.
  • Ignore how the class works or what data it maintains
  • Focus on the public interface (i.e., what the class does; detailed below)
Designing classes with the two-hat technique. The class developer creates classes that provide services, while the class user creates an application or client code that uses those services.
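
The two hats can be illustrated with a minimal sketch. The Fraction class and the half_plus_third function are hypothetical names invented for this illustration, not part of any library, and the sketch assumes a C++17 compiler for std::gcd. The class is written wearing the developer hat; the function is written wearing the user hat and touches only public operations.

```cpp
#include <cassert>
#include <numeric>   // std::gcd (C++17)

// Developer hat: decide what data the class keeps and what services it offers.
// Fraction is a hypothetical class used only for illustration.
class Fraction {
public:
    Fraction(int n, int d) : num(n), den(d) {
        assert(d != 0);                 // a fraction needs a nonzero denominator
        int g = std::gcd(num, den);     // reduce to lowest terms
        num /= g;
        den /= g;
        if (den < 0) {                  // keep the sign in the numerator
            num = -num;
            den = -den;
        }
    }

    // Service: return the sum of this fraction and another.
    Fraction add(const Fraction& other) const {
        return Fraction(num * other.den + other.num * den, den * other.den);
    }

    int numerator() const { return num; }
    int denominator() const { return den; }

private:
    int num;    // hidden data the services depend on
    int den;
};

// User hat: solve a problem with the class while ignoring how it works.
Fraction half_plus_third() {
    Fraction half(1, 2);
    Fraction third(1, 3);
    return half.add(third);
}
```

The user function never mentions num, den, or the reduction step; it depends only on what the class does.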

What Do Classes "Know"

A program visualized as a collection of individual classes. The picture shows the classes organized in a tree structure. The class at the top is labeled 'Application,' and the classes at the bottom, the tree's leaves, are simple classes like 'string,' 'istream,' and 'ostream.' 'Semantic awareness' is how much each class knows about the program, increasing upward in the tree.
Program classes conceptualized as a hierarchy. The class at the top "knows" the high-level problem solution; the classes in the middle "know" how to coordinate low-level classes to solve one aspect of the problem, while the classes at the bottom are more specialized, having detailed "knowledge" of a specific part of the program.

It is often insightful to treat classes as living entities that can do things. For example, imagine organizing the classes in an object-oriented program like a company's organizational chart. The director at the top knows the company's long-range goals but not how to design a program or an electric circuit. Mid-level managers know the details of a specific project but not the company strategy or detailed system design. Engineers know how to design software and circuits, but not how to run a company or manage a project. Similarly, the classes at the top know what the program does but not how it does it. Classes near the bottom of the hierarchy have detailed knowledge of solving a small part of the problem, but they don't know how their contributions work with the other classes in the program.

Software developers typically create classes near the top of the hierarchy to solve a specific problem, often using them in only one program. As such, developers specialize the top-level class features to meet the particular needs of that program. Classes become more general and reusable as you go down the hierarchy, with developers frequently using many general-purpose library classes at the bottom. These classes provide a broad set of general services focused on the class rather than a specific problem.

Choosing what operations to put in a class is similar to making a shopping list. Sometimes you make the list with a specific meal in mind, and other times with general "pantry" items you might need. The top classes are like the "specific meal" lists, and the bottom classes are like the "pantry items." The previous string example demonstrates a "pantry" list: its numerous operations might help some program.

A Class's Public Interface

In general, an interface is a "place where two things come together and affect each other," a place where they "touch" and where they can interact. A class's interface is how a client connects to, interacts with, or uses an instance of the class. A class's interface includes all non-private features (i.e., attributes and operations) that are visible and accessible to other classes. C++ and Java have in common three keywords that control feature accessibility: public, protected, and private (Java also has a default accessibility that C++ does not). private is the only level of accessibility that completely excludes a feature from a class's interface, and public features are part of the class's public interface. Lego™ blocks are a second metaphor (besides the cookie-cutter mentioned previously) for objects. The objects fit together to form structures by matching posts of a precise size, orientation, and spacing on one Lego with compatible sockets or holes on a second Lego. The posts and holes are the Lego's interface.

Similarly, a class's features form its interface, and the public features it exposes form its public interface. Typically, a class's public interface consists of its public member functions but can occasionally include public member variables as well. Figure 4 is an abstract representation of an object. Objects encapsulate or hide their data and their operations' algorithms (i.e., the bodies of the member functions) but expose the functions' signatures (the function name, return type, and argument list). A client can use the object by just "knowing" the public interface; it does not need to know the private features or how the functions perform their tasks. The public interface effectively separates the how from the what.

A picture of Lego building blocks labeled with their precise dimensions.
Lego blocks as a metaphor for a public interface. Lego building blocks have studs on one side and matching sockets on the other. The sizes and spacing of the studs and sockets are precisely specified so that the blocks fit together to form smooth structures. We can think of the studs and sockets as forming the block's interface, joining the blocks, and forming complete structures.

"Lego dimensions" by Cmglee - Own work. Licensed under CC BY-SA 3.0 via Wikimedia Commons.

An abstract representation of an object as a rectangle. The illustration depicts the hidden data as another rectangle within the object, with no external access. The illustration represents functions as two connected rectangles. It depicts the function's body as a rectangular box with a solid border, suggesting that its contents are inaccessible outside the object. It depicts the function header or signature as a rectangle drawn with a dashed border protruding from the object, suggesting its exposure to other parts of the program.
An abstract representation of an object's public interface. Objects "hide" their data and the bodies of their functions from other parts of the program. However, they expose their function signatures (header) to the full program. The exposed signatures form the object's public interface.
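
The separation between exposed signatures and hidden bodies can be sketched in C++ as follows. Counter and count_to are hypothetical names chosen for this illustration. The class specification lists the signatures a client may call, while the function bodies appear separately (in practice, often in a different source file).

```cpp
#include <cassert>

// The class specification: the "what" that clients see.
// Counter is a hypothetical class used only for illustration.
class Counter {
public:
    void increment();       // signature: name, return type, argument list
    int value() const;

private:
    int count = 0;          // hidden data: client code cannot access it
};

// The function bodies: the "how," hidden from clients.
void Counter::increment() { ++count; }
int Counter::value() const { return count; }

// Client code: uses the object through its public interface only.
int count_to(int n) {
    assert(n >= 0);         // this sketch assumes a non-negative count
    Counter c;
    for (int i = 0; i < n; ++i) {
        c.increment();
    }
    return c.value();
}
```

A statement such as c.count = 5; inside count_to would not compile because count is private, which is how the language enforces the boundary shown in the figure.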

Class Design and Implementation Considerations

As a general rule of thumb, programmers usually make attributes private and operations public, but the UML and C++ both support private operations and public attributes. Any non-private feature becomes part of the class's interface, and any public feature becomes part of its public interface. Making an operation private is appropriate whenever the class uses it internally but it does not represent an external service. Typical examples include operations that hold code shared by two or more other operations and the smaller, simpler sub-operations that result when a programmer decomposes a large, complex operation. These "helper" functions are appropriately private in a UML diagram and a corresponding program class (regardless of the implementing language).
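
A private helper function might look like the following minimal sketch. The Account class and its member names are hypothetical, and amounts are stored as whole cents to keep the arithmetic exact. The helper valid holds a check shared by two public operations but is not a service offered to clients.

```cpp
#include <cassert>

// Account is a hypothetical class used only for illustration.
class Account {
public:
    // Public service: add money; returns false if the amount is rejected.
    bool deposit(long amount) {
        if (!valid(amount)) return false;
        cents += amount;
        return true;
    }

    // Public service: remove money; returns false if the amount is rejected
    // or exceeds the current balance.
    bool withdraw(long amount) {
        if (!valid(amount) || amount > cents) return false;
        cents -= amount;
        assert(cents >= 0);     // invariant: the balance never goes negative
        return true;
    }

    long balance() const { return cents; }

private:
    // Helper: code shared by deposit and withdraw, not an external service.
    bool valid(long amount) const { return amount > 0; }

    long cents = 0;             // balance in cents
};
```

Because valid is private, the developer can later rename it, change its rule, or remove it without affecting any client code.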

Conversely, non-private data makes maintaining a stable public interface much more difficult. A stable public interface is one in which features (operations and attributes) are not removed or changed once they become available to client programs. Programmers may add new features to a public interface without adversely affecting its stability. However, once client code can use a public interface feature, that feature cannot be withdrawn or modified without potentially affecting existing client code. So, you should have a compelling reason to make an attribute non-private, and articulating your reason is a prerequisite for making that design decision.

Class Benefits

Classes and objects help software engineers manage the complexity of increasingly large programs by providing concrete implementations of several abstract programming constructs.

Stable Public Interface
Once a class is in use, the designer must ensure its public interface remains stable by not removing or altering any public features as the class evolves (adding new features does not affect the interface's stability). If the public interface remains stable, programmers can change a class's data and the logic of its member functions without disrupting existing client code. The client cannot "see" a class's hidden features and therefore cannot depend on them. For example, one of our classes may need a sort operation, and, being in a hurry, we write a slow selection sort: void sort();. Later, when we have more time, we rewrite the function based on the quick-sort algorithm, but as long as the function signature, void sort();, does not change, the class's interface remains stable and any existing client code is not adversely affected.
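
The sort example can be sketched as follows. NumberList and its members other than sort are hypothetical names added for illustration. The body shown is the first, slow version; a later rewrite would replace only this body with a faster algorithm, leaving the declaration void sort(); untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>   // std::swap
#include <vector>

// NumberList is a hypothetical class used only for illustration.
class NumberList {
public:
    void add(int value) { data.push_back(value); }
    std::size_t size() const { return data.size(); }
    int at(std::size_t i) const {
        assert(i < data.size());    // index must be in range
        return data[i];
    }
    void sort();                    // the signature client code depends on

private:
    std::vector<int> data;          // hidden data
};

// First version: a simple selection sort. Replacing this body later does
// not change the public interface, so client code keeps working.
void NumberList::sort() {
    for (std::size_t i = 0; i + 1 < data.size(); ++i) {
        std::size_t smallest = i;
        for (std::size_t j = i + 1; j < data.size(); ++j) {
            if (data[j] < data[smallest]) smallest = j;
        }
        std::swap(data[i], data[smallest]);
    }
}
```

Client code that calls list.sort() compiles and behaves the same way before and after the rewrite; only the running time changes.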
Abstract Data Type (ADT)
An abstract data type is any new data type that a programmer creates (i.e., it is any data type that is not a primitive data type defined by the programming language itself). Once specified, an ADT becomes a type specifier, which means that the name of the ADT can appear wherever the name of a data type would be legal. For example, "int" is a type specifier in the following variable definition statement:
int counter;

Similarly, Person (which we will assume is the name of a class) can be used to specify the type of a variable:

Person manager;
Computer scientists characterize ADTs by the operations they support (i.e., their public interface).
Data Hiding
Data hiding is a software design and implementation technique that separates the data from the software logic and insulates that data from the rest of the program. Therefore, the only way to manipulate or operate on the hidden data is through the operations specified by the ADT's public interface. Programming languages that do not support objects may implement data hiding to various degrees with other programming constructs (e.g., modules in C). Currently, objects are the most efficient and aesthetic way to implement data hiding. In conjunction with maintaining a stable public interface, data hiding helps make the software more change-resilient.
Encapsulation
Encapsulation is a programming technique for building data objects that bundle or encapsulate data with the functions that operate on it, restricting and controlling access to the data. In this sense, "encapsulation" and "object" are synonymous. Objects also help control complexity by specifying an intermediate variable scope. It is a good programming technique to restrict a variable to the narrowest or tightest possible scope. Restricting a variable's scope reduces namespace exhaustion (running out of meaningful variable names in the problem's context) and inter-functional coupling. Coupling occurs when two or more functions share data. Sometimes sharing doesn't cause problems, but at other times, one function may alter the data, inadvertently harming the other function(s) that share it. However, there are times when functions must share data, implying that local scope is too narrow, but defining the data in global scope invites other functions to access it. We can't limit access to global data through a stable, public interface. So, the number of unintentional collisions, in which one function changes the data in a harmful way for another, increases with the program's size. Class scope fills the gap between local and global scope: member variables are shared by the class's member functions but remain inaccessible to the rest of the program.
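
As a minimal sketch of class scope, consider two functions that must share a running total. RunningAverage is a hypothetical class invented for this illustration. The shared variables are wider than local scope, so both member functions can reach them, but narrower than global scope, so no other function can change them.

```cpp
#include <cassert>

// RunningAverage is a hypothetical class used only for illustration.
class RunningAverage {
public:
    // Record one more value.
    void add(double x) {
        sum += x;
        ++count;
    }

    // Return the average of the values recorded so far.
    double mean() const {
        assert(count > 0);      // at least one value must have been added
        return sum / count;
    }

private:
    // Class scope: shared by add and mean, inaccessible everywhere else.
    double sum = 0.0;
    int count = 0;
};
```

Had sum and count been global variables, any function in the program could modify them; as private members, only add and mean can. Each RunningAverage object also keeps its own copy, so two objects never interfere with each other.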