9.4. Class Development

Class development spans three distinct phases of the software development process: analysis, design, and implementation. Initially, analysts look at the problem or "real world" through Rumbaugh's object-oriented glasses and identify and describe the objects they find. Those objects are abstracted or generalized into classes. Analysts will also identify the data that the system must maintain and the operations that it must perform. The data and operations are often naturally associated with one of the identified classes. Other times, the connection is less clear. Thinking about "responsibilities" can sometimes help clarify which class is the best candidate for hosting these ambiguous features. Think of the classes as living entities that can do things and ask, "Who should be responsible for maintaining this data or for carrying out this operation?" If more than one class can naturally accept the responsibility, the one you choose likely doesn't matter. And remember, you can always change your mind later.

Software developers refine classes during the design phase. They may discard unneeded classes or add classes that do not appear in the problem but that a computer program needs. They may also discard, add, or update operations during the design phase. Changes made during analysis and design are relatively inexpensive - you change some UML diagrams and perhaps some documentation - compared to changes made after programming or deployment.

Unfortunately, there isn't an algorithm or recipe for choosing classes, attributes, or operations. But by now, you have enough experience to understand that two different programs can look very different and still be reasonable and correct solutions to the same problem. The same is true of the classes found in an object-oriented program. There are many software development processes - some suitable for designing software rapidly, some ideal for designing extensive software systems that must function for many years - but each process aims to produce a satisfactory system and not necessarily the best system.

In subsequent courses, you will formally study some of the more common software development processes. Here, we take a more informal approach focused more closely on individual classes than complete software systems.

The Two-Hat Technique

One helpful way to conceptualize the relation between two classes is to see one class as a client and the other as a server or supplier. The server can supply some useful services that can help the client somehow. The two-hat technique takes advantage of the client-server perspective by guiding the software developer to keep the two aspects of a class separate and distinct. While wearing the class designer's hat, don't worry about how someone might use the class; focus instead on what services a general class of the kind you are making should provide. Similarly, while wearing the class user's hat, don't worry about how the class works or what data it may hide inside; focus on what the class can do for you as you use it to solve a sub-problem in your program.

It's common for you to be both the class designer and the class user (i.e., it's common for a programmer to wear both hats). It's also common for you to design the class while writing the code that uses it. Nevertheless, take a moment and mentally switch hats as you shift from one task to the next - the result is a better-designed class and better-organized client code. The table below summarizes some of the most important two-hat ideas.

Server / Supplier (class designer)
Picture of a person wearing a 'Designer' hat.
  • Make the best possible class without worrying about how the class will be used
  • What services or operations should the class naturally provide?
  • What data must the class maintain to provide the desired services?
  • "Shopping list" features (explained below) for general, low-level classes; narrow features for specialized classes

Client (class user)
Picture of a person wearing a 'User' hat.
  • Don't worry about how the class works or what data it maintains
  • Focus on the public interface (i.e., what the class does)
Designing classes with the two-hat technique. The class designer creates classes that supply services, while the class user creates an application or client code that uses the class services.
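
The two hats can be sketched in a few lines of C++. This is only an illustration: the Rectangle class and the paint_cost function are hypothetical names chosen for the example, not part of any library. While writing Rectangle, we wear the designer's hat and ask what a general rectangle should offer; while writing paint_cost, we wear the user's hat and rely only on what Rectangle can do.

```cpp
#include <cassert>

// Designer's hat: what services should a general rectangle provide?
// Nothing here knows about walls or paint.
class Rectangle {
public:
    Rectangle(double width, double height) : width_(width), height_(height) {
        assert(width >= 0.0 && height >= 0.0);  // assumed precondition for this sketch
    }

    double width() const { return width_; }
    double height() const { return height_; }
    double area() const { return width_ * height_; }
    double perimeter() const { return 2.0 * (width_ + height_); }

private:
    double width_;   // hidden data the class maintains to provide its services
    double height_;
};

// User's hat: solve one sub-problem (the cost of painting a wall) using
// only the public interface, without caring how the area is computed.
double paint_cost(const Rectangle& wall, double price_per_square_unit) {
    return wall.area() * price_per_square_unit;
}
```

A client would call paint_cost(Rectangle(3.0, 4.0), 2.0) and never touch width_ or height_ directly.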

What Do Classes "Know"?

It is often insightful to treat classes as living entities that can do things. Building on that idea, we can organize the classes in an object-oriented program like a company's organizational chart. The boss at the top knows the company's long-range plans but not how to design a program or an electric circuit. The mid-level managers know the details about a specific project, not company strategy or system design. Engineers at the bottom know how to design software and circuits but not how to run a company or manage a project. Similarly, the classes at the top know what the program does but not how it does it. Classes near the bottom of the hierarchy have a detailed knowledge of solving one small part of the problem, but they don't know how their contribution works with the other classes in the program.

A program visualized as a collection of individual classes. The picture shows the classes organized in a tree structure. The class at the top is labeled 'Application,' and the classes at the bottom, the tree's leaves, are simple classes like 'string,' 'istream,' and 'ostream.' 'Semantic awareness' is how much each class knows about the program, increasing upward in the tree.
The classes in an object-oriented program are conceptualized as a hierarchy or tree. The classes at the top "know" more about what the program does than those at the bottom. Classes at the bottom have detailed "knowledge" of a small part of the program. Classes at the top may only coordinate the operations of lower-level classes.

Software developers typically create classes near the top of the hierarchy to solve a specific problem and often only use them in one program. As such, developers specialize the top-level class features to meet the particular needs of that program. Classes become more general and reusable as we go from the top to the bottom. Developers often use general-purpose library classes at the bottom of the hierarchy. These classes provide a broad set of general services focused on the class rather than a specific problem. The string class is a good example: It provides numerous operations that we expect of strings without "knowing" how a program ultimately uses them.

Choosing what features, especially operations, to put in a class is similar to making a shopping list. Sometimes, you make a shopping list with a specific meal in mind. Classes closer to the top are like these lists - we tailor them to meet particular needs. Sometimes, we make a general shopping list - we don't have any specific requirements but add items to the list that we like to keep on hand. Classes closer to the bottom are like these lists - we stock our pantry with common, frequently used items.

A Class's Public Interface

In general, an interface is a "place where two things come together and affect each other," a place where they "touch" and where they can interact. A class's interface is how a client connects to, interacts with, or uses an instance of the class. A class's interface includes all non-private features (i.e., attributes and operations) visible and accessible by other classes. C++ and Java have in common three keywords that control feature accessibility: public, protected, and private (Java also has a default accessibility that C++ does not). private is the only level of accessibility that completely excludes a feature from a class's interface, while everyone agrees that public features are part of the class's public interface. Lego™ blocks are a second metaphor (besides the cookie-cutter mentioned previously) for objects. The objects fit together to form structures by matching posts of a precise size, orientation, and spacing on one Lego with compatible sockets or holes on a second Lego. The posts and holes are the Lego's interface.

Similarly, a class's non-private features form its interface; the public features it exposes are its public interface. Typically, a class's public interface consists of its public member functions but can occasionally include public member variables as well. Figure 4 is an abstract representation of an object. Objects encapsulate or hide their data and their operations' algorithms (i.e., the bodies of the member functions) but expose the functions' signatures (the function name, return type, and argument list). A client can use the object by just "knowing" the public interface - it does not need to know the private features or how the functions perform their tasks. The public interface effectively separates the how from the what.

A picture of Lego building blocks labeled with their precise dimensions.
Lego blocks as a metaphor for a public interface. Lego building blocks have studs on one side and matching sockets on the other. The sizes and spacing of the studs and sockets are precisely specified so that the blocks fit together to form smooth structures. We can think of the studs and sockets as forming the block's interface, joining the blocks and forming complete structures.

"Lego dimensions" by Cmglee - Own work. Licensed under CC BY-SA 3.0 via Wikimedia Commons.

An abstract representation of an object as a rectangle. The illustration represents the hidden data as another rectangle inside the object without outside access. The illustration represents functions as two connected rectangles. It depicts the function's body as a rectangle drawn with a solid border, suggesting that the contents are inaccessible outside the object. It depicts the function header or signature as a rectangle drawn with a dashed border protruding from the object, suggesting its exposure to other parts of the program.
An abstract representation of an object's public interface. Objects "hide" their data and the bodies of their functions from other parts of the program. However, they expose their function signatures (header) to the full program. The exposed signatures form the object's public interface.
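
The separation of the what from the how can be sketched in C++ by writing the class declaration apart from the member function bodies. The Person class below is a hypothetical example, and its members (name, rename) and the greeting function are assumptions made for illustration only.

```cpp
#include <cassert>
#include <string>

// The "what": the class declaration exposes only function signatures.
class Person {
public:
    explicit Person(const std::string& name);
    std::string name() const;
    void rename(const std::string& new_name);

private:
    std::string name_;  // hidden data; clients cannot reach it directly
};

// The "how": the function bodies, which a client never needs to read.
Person::Person(const std::string& name) : name_(name) {}

std::string Person::name() const { return name_; }

void Person::rename(const std::string& new_name) {
    assert(!new_name.empty());  // assumed rule for this sketch: names are non-empty
    name_ = new_name;
}

// Client code: written against the signatures alone.
std::string greeting(const Person& p) {
    return "Hello, " + p.name();
}
```

In a larger program, the declaration would typically live in a header file and the bodies in a source file, so clients include only the signatures.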

Class Design and Implementation Considerations

As a general rule of thumb, programmers usually make attributes private and operations public, but the UML and C++ both support private operations and public attributes. Any non-private feature becomes part of the class's interface. Making an operation private is appropriate whenever the class uses the operation internally but the operation does not directly represent a service to clients. Typical examples include operations that hold code common to two or more other operations and the sub-operations that result when a programmer decomposes a large, complex operation into smaller, simpler pieces. These "helper" functions are appropriately private in a UML diagram and a corresponding program class (regardless of the implementing language).
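
A private helper might look like the following sketch. The Clock class is hypothetical, and the sketch assumes a 24-hour clock that only accepts non-negative values. The helper normalize holds code common to the constructor and both public add operations, so it is private: the class needs it, but it is not a service a client would request.

```cpp
#include <cassert>

class Clock {
public:
    Clock(int hours, int minutes) : hours_(hours), minutes_(minutes) {
        assert(hours >= 0 && minutes >= 0);  // sketch handles non-negative values only
        normalize();
    }

    // Public services offered to clients.
    void add_minutes(int m) { assert(m >= 0); minutes_ += m; normalize(); }
    void add_hours(int h)   { assert(h >= 0); hours_ += h;   normalize(); }

    int hours() const { return hours_; }
    int minutes() const { return minutes_; }

private:
    // Helper shared by the constructor and both add operations:
    // carries extra minutes into hours and wraps hours at 24.
    void normalize() {
        hours_ += minutes_ / 60;
        minutes_ %= 60;
        hours_ %= 24;
    }

    int hours_;
    int minutes_;
};
```

Because normalize is private, the designer is free to rename, split, or remove it later without affecting any client.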

Conversely, non-private data makes maintaining a stable public interface much more difficult. A stable public interface is one in which features (operations and attributes) are not removed or changed once they become available to client programs. Programmers may add new features to a public interface without adversely affecting its stability. However, once client code can use a public interface feature, that feature cannot be withdrawn or modified without potentially affecting existing client code. So, you should have a compelling reason to make an attribute non-private, and articulating your reason is a prerequisite for making that design decision.
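
The cost of a public attribute can be sketched by comparing two hypothetical account types (OpenAccount, Account, and their members are names invented for this illustration). With OpenAccount, the attribute's name and type are part of the interface, and any client may change it in any way. With Account, clients see only operations, so the designer keeps control of the data and its representation.

```cpp
#include <cassert>

// Risky: the attribute itself is part of the interface. Its name and type
// cannot change without touching every client that uses it, and nothing
// prevents a client from storing a nonsensical value.
struct OpenAccount {
    double balance = 0.0;
};

// Safer: the representation stays private; clients use operations only.
class Account {
public:
    void deposit(long cents) {
        assert(cents >= 0);
        cents_ += cents;
    }

    // Returns false, leaving the balance unchanged, if funds are insufficient.
    bool withdraw(long cents) {
        assert(cents >= 0);
        if (cents > cents_) return false;
        cents_ -= cents;
        return true;
    }

    long balance_cents() const { return cents_; }

private:
    long cents_ = 0;  // how the balance is stored is the designer's choice
};
```

The designer of Account could later change how the balance is stored, and existing client code would still compile and run as long as the three public signatures stay the same.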

Class Benefits

Classes and objects help software engineers manage the complexity of increasingly large programs by providing concrete implementations of several abstract programming constructs.

Stable Public Interface
Once a class is in use, the designer must ensure its public interface remains stable by not removing or altering any public features as the class evolves (adding new features does not affect the interface's stability). If the public interface remains stable, programmers may change the class's data and the member functions' logic without disrupting existing client code. The client cannot "see" the hidden features in a class and, therefore, cannot depend on them. For example, one of our classes may need a sort operation, and, being in a hurry, we implement void sort() as a slow selection sort. Later, when we have more time, we rewrite the function based on the quick-sort algorithm. As long as the function signature, void sort(), does not change, the class's interface remains stable, and existing client code is not adversely affected.
Abstract Data Type (ADT)
An abstract data type is any new data type that a programmer creates (i.e., it is any data type that is not a primitive data type defined by the programming language itself). Once specified, an ADT becomes a type specifier, which means that the name of the ADT can appear wherever the name of a data type would be legal. For example, "int" is a type specifier in the following variable definition statement:
int counter;

Similarly, Person (which we will assume is the name of a class) can be used to specify the type of a variable:

Person manager;
Computer scientists characterize ADTs by the operations they support (i.e., their public interface).
Data Hiding
Data hiding is a software design and implementation technique that separates the data from the software logic and insulates that data from the rest of the program. Therefore, the only way to manipulate or operate on the hidden data is through the operations specified by the ADT's public interface. Programming languages that do not support objects may implement data hiding to various degrees with other programming constructs (e.g., modules in C). Currently, objects are the most efficient and aesthetic way to implement data hiding. In conjunction with maintaining a stable public interface, data hiding helps make the software more change-resilient.
Encapsulation
Encapsulation is a programming technique for building data objects that bundle or encapsulate data with the functions that operate on that data, restricting and controlling access to the data. In this sense, "encapsulation" and "object" are synonymous. Objects also help control complexity by specifying an intermediate variable scope. It is a good programming technique to restrict a variable to the narrowest or tightest possible scope. Restricting a variable's scope reduces namespace exhaustion (running out of good variable names that have meaning in the problem's context) and inter-functional coupling. Coupling occurs when two or more functions share data. Sometimes, the sharing doesn't cause problems, but other times, one function may alter the data, inadvertently harming the other function(s) that share the data. However, there are times when functions must share data, implying that local scope is too narrow, but defining the data in global scope invites other functions to access it. We can't limit access to global data through a stable, public interface. So, the number of unintentional collisions, where one function changes the data in a way that harms another, increases with the program's size.
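
Several of the benefits above can be seen together in one small sketch. The ScoreList class and the lowest function are hypothetical names used only for illustration. The vector is shared by the member functions at class scope (neither global nor local), it is reachable only through the public operations, and the body of sort is a simple selection sort that could later be replaced, for example by a call to the standard library's std::sort, without changing the signature that clients depend on.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

class ScoreList {
public:
    void add(int score) { scores_.push_back(score); }
    std::size_t size() const { return scores_.size(); }
    int at(std::size_t i) const { return scores_.at(i); }

    // The signature clients depend on; only the body is free to change.
    void sort();

private:
    std::vector<int> scores_;  // class scope: shared by members, hidden from clients
};

// First version of the body: a slow but simple selection sort.
void ScoreList::sort() {
    for (std::size_t i = 0; i + 1 < scores_.size(); ++i) {
        std::size_t smallest = i;
        for (std::size_t j = i + 1; j < scores_.size(); ++j) {
            if (scores_[j] < scores_[smallest]) smallest = j;
        }
        std::swap(scores_[i], scores_[smallest]);
    }
}

// Client code: unaffected by whichever algorithm sort() uses internally.
// Takes its argument by value, so the caller's list is left untouched.
int lowest(ScoreList list) {
    assert(list.size() > 0);
    list.sort();
    return list.at(0);
}
```

If the selection sort is later swapped for a faster algorithm, lowest needs no changes because it relies only on the public interface.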