6.2.1. Function Definitions and Declarations


We first explored the distinction between the terms declaration and definition in the context of variables. A variable definition allocates memory to hold the variable's contents, while a declaration does not. A declaration "introduces" the variable's name to the compiler, which enters the name into its symbol table. Variable definitions are essential programming elements, while variable declarations are generally only needed when programmers use global variables in a multi-file program. But functions are a different matter.

Both function definitions and declarations are essential. Function definitions, like variable definitions, use memory. A function definition's key feature, a feature it must always have, is a body. The compiler translates the body into machine code, and the operating system stores the machine code in memory when the program runs. Like a variable declaration, a function declaration introduces the function name to the compiler, which enters the name into its symbol table. But unlike variable declarations, function declarations are essential, so they have a distinct name. Function declarations are typically called function prototypes or just prototypes for short. Before we study how to use prototypes, let's see what they look like and how they relate to function definitions.

(a)
double foo(int x, double y, char z)
{
	.
	.
	.
}
(b)
double foo(int x, double y, char z);
double foo(int a, double b, char c);
double foo(int, double, char);
The relationship between a function definition and a prototype.
  1. Function Definition. Definitions are characterized by having both a header and a body.
  2. Function Prototype. A prototype is a function header followed by a semicolon; the terminating semicolon completes the declaration. A prototype does not have a body, so the compiler cannot generate machine instructions from it.
    • The parameter names may be the same as those used in the definition.
    • The parameter names may differ from those used in the definition.
    • But parameter names are not even required.
Function definition and prototype details

Function Definitions

A function definition requires all parts of the function: a header (which includes the function return type, name, and an argument list) and a function body. The function body contains the code that carries out the function's tasks. The compiler generates machine code from the C++ code, and those machine code instructions use memory. A function name effectively names a block of memory that contains the function's machine instructions.

The distinguishing feature of a function definition is the body formed by the opening and closing braces: { and }. (At this point in our studies, the function body will always contain at least one statement, but in later chapters, we will discover some cases where the body can be empty.) We distinguish a function definition from a prototype, which does not have a body, and distinguish it from a function call, which does not have the data type of the arguments or the return value.

// function to square a number

double  sqr(double  x)
{
	return  x * x;
}
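// the Newton-Raphson algorithm
// (f, f_prime, and loopmax are assumed to be declared elsewhere in the program)

double newton(double x0)
{
	double	x_new;
	double	epsilon = 1.0e-4;	// maximum error

	for (int i = 0; i < loopmax; i++)
	{
		x_new = x0 - f(x0) / f_prime(x0);	// one Newton-Raphson step

		if (std::abs(x_new - x0) < epsilon)	// std::abs for doubles requires <cmath>
			return x_new;			// converged
		else
			x0 = x_new;
	}

	return x0;	// best estimate if the loop ends without converging
}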
Examples of simple and complex function definitions. Functions span a wide spectrum of complexity. Even modestly-sized functions, like newton, are difficult to understand if you're unfamiliar with the underlying algorithm.

Function Prototypes

For those transitioning from Java to C++, function prototypes may seem burdensome because Java programs don't require them. The Java compiler effectively processes the source code in two passes: it builds the symbol table during the first pass and generates code during the second. A C++ compiler, on the other hand, processes each source file in a single pass from top to bottom, so it must build the symbol table and generate code during the same pass. C++ is strict and will not compile a call to a function that has not been declared, but the older, pre-ANSI C programming language was less strict. If the old C compiler encountered a call to an undeclared function, it would "guess" about the function's return type and the number and type of its arguments.

But there are at least two problems with making guesses about a function. First, lacking a solid description of the function's interface or signature, the compiler can't provide any diagnostics to help debug a program. Second, its guesses are often incorrect, causing the compiler to generate incorrect machine code! Additionally, many desirable features, such as overloaded functions and polymorphism, require that functions be precisely identified and distinguished before being compiled.

Version 1
int  main()
{
	double	y;
	y = sqr(2);
}

double sqr(double  x)
{
	return  x * x;
}

Version 2
double sqr(double  x)
{
	return  x * x;
}

int  main()
{
	double	y;
	y = sqr(2);
}

Version 3
double sqr(double x);

int  main()
{
	double	y;
	y = sqr(2);
}

double sqr(double  x)
{
	return  x * x;
}
Function definition, declaration, and prototype examples. Three versions of the same program illustrate how function declarations and prototypes affect how the compiler translates a function call into machine code. The examples also illustrate two potentially confusing aspects of definitions, declarations, and prototypes. In each case, the compiler reads the code once from top to bottom.
Version 1
This version will not compile in C++, but it will in pre-ANSI C. The compiler encounters the sqr function call before the function definition, so it "assumes" that the argument and return types are int. The compiler-generated code passes 2 into x, corrupting the value in two ways. First, a double is generally twice the size of an int. So, when the call passes 2 to x, it only passes half the needed data, leaving the rest filled with garbage. Second, the bit patterns for the two data types are quite different even when the numbers are numerically the same (i.e., internally, an int 2 is different from a double 2). So, the function misinterprets the incomplete data stored in x, resulting in more garbage. The function returns a double value, but the compiler assumes that the returned value is an int, so it converts that value to a double (another misinterpretation) before storing it in y. There is little hope that y stores any valid information.
Version 2
The function has a body, making it a definition. But the function is also the first time the compiler "sees" the name sqr, making it a declaration. In this example, the function serves as both a definition and a declaration, but it isn't a prototype. When the compiler processes the function call, it has sufficient information to promote the argument, 2, from an int to a double. It does not change the returned value's data type, so the program saves it without promotion.
Version 3
With a prototype in view, the code compiles and runs correctly even though the function definition follows the function call. The prototype provides enough information that the compiler promotes the 2 to a double and does not change the returned value's type before the assignment.

Rather than providing prototypes, wouldn't it be easier to organize our code to define every function before calling it as in version 2? Imagine that our program has hundreds of functions. Trying to maintain the order would be a nightmare! Furthermore, there are situations where we can't order the functions so that they are always defined before we call them.

(a)
File1.cpp
int f(int arg)
{
	.
	.
	.
	.
	.
	return . . .;
}

File2.cpp
int f(int);

int g()
{
	.
	.
	.
	int x = f(42);
}

(b)
void a()
{
	b();
}

void b()
{
	a();
}
Situations requiring prototypes.
  1. A multi-file program must have exactly one definition of each function, but it may have many calls to any function. When a function is defined in one file but called in another, the calling file must have a prototype.
  2. The second example illustrates indirect recursion, which happens when two or more function calls form a cycle. The cycle makes it impossible to order the recursive function definitions in a way that ensures that the definitions always precede calls to them. Recursive functions, explored later in the chapter, must be carefully crafted, so treat this illustration as pseudo-code only.

In the case of multi-file programs, prototypes are typically located in a header file, which is then #included in whatever source file needs to use the functions. Our programs have been following this pattern all along: << and >> are just specialized functions called overloaded operators. The iostream header file contains prototypes for both functions. Putting prototypes in a header file reduces the tedium of copying them repeatedly and, more importantly, ensures that they are written consistently throughout the program.

Function Calls

Modern computers operate on the so-called Von Neumann cycle: fetch an instruction, decode the instruction, execute the instruction. When a program calls a function, it saves the address of the next instruction to be fetched (the instruction following the call) and jumps to the address of the function's instructions in memory. The cycle repeats throughout the instructions of the function until the function ends. When the function ends, the program restores the address of the instruction following the function call (i.e., the computer jumps back to where it was originally executing instructions), and the cycle resumes fetching the instruction following the call. Fortunately, the compiler hides all that detail from programmers, and we need only to be aware of the basic behavior of a function call: jump to the function, execute the code, and return.

Function Name-Use Summary

  1. Definition
    1. Has typing information
    2. Has a body
  2. Prototype
    1. Is a declaration
    2. Has typing information
    3. Ends with a semicolon (NO body)
    4. Often placed in a header file
  3. Call
    1. Does NOT have typing information
    2. The program jumps to the machine instructions' address
    3. The program returns to the instruction following the call when the function finishes