In contrast to the set of functions which handle memory allocation in
C (i.e., malloc() etc.), the operators new and
delete are specifically meant to be used with the features that
C++ offers. Important differences between malloc() and
new are:
malloc() doesn't `know' what the allocated
memory will be used for. E.g., when memory for ints is allocated,
the programmer must supply the correct expression using a multiplication by
sizeof(int). In contrast, new requires the use of a
type; the sizeof expression is implicitly handled by the
compiler.
The only way to obtain initialized memory from the malloc() family is to use calloc(), which allocates memory
and sets it to zero. In contrast, new can call the
constructor of an allocated object where initial actions are defined. This
constructor may be supplied with arguments.
The comparison between free() and delete is
analogous: delete makes sure that when an object is deallocated, a
corresponding destructor is called.
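The differences listed above can be made visible with a small sketch. The struct Counter and the two functions are hypothetical names introduced only for this illustration; the sketch assumes a standard C++ compiler with the headers cstdlib and cassert.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper type: counts how often its constructor and destructor run.
struct Counter
{
    static int constructed;
    static int destroyed;
    int value;

    Counter (int v = 42) : value (v) { ++constructed; }
    ~Counter ()                      { ++destroyed; }
};

int Counter::constructed = 0;
int Counter::destroyed = 0;

// C style: raw bytes, size computed by hand, no constructor is run.
int sumWithCalloc (int n)
{
    assert (n > 0);
    int *ip = static_cast<int *> (
                std::calloc (static_cast<std::size_t> (n), sizeof (int)));
    if (ip == 0)
        return -1;
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += ip [i];              // calloc() zeroed every element
    std::free (ip);                 // no destructor involved
    return sum;
}

// C++ style: the type determines the size; constructor and destructor run.
int valueWithNew ()
{
    Counter *cp = new Counter (7);  // constructor called with argument 7
    int v = cp->value;
    delete cp;                      // destructor called before the memory is released
    return v;
}
```

After one call of valueWithNew() both counters of Counter have been incremented once, whereas sumWithCalloc() never touches a constructor or destructor.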
The calling of constructors and destructors when objects are created or destroyed has a number of consequences which shall be discussed in this chapter. Many problems in program development in C are caused by incorrect memory allocation or memory leaks: memory is not allocated, not freed, not initialized, boundaries are overwritten, etc. C++ does not `magically' solve these problems, but it does provide a number of handy tools. In this chapter the following topics are discussed:
the overloaded assignment operator,
the this pointer,
the copy constructor.
In this section we shall again use the class Person as
example:
class Person
{
public:
// constructors and destructor
Person ();
Person (char const *n, char const *a,
char const *p);
~Person ();
// interface functions
void setname (char const *n);
void setaddress (char const *a);
void setphone (char const *p);
char const *getname (void) const;
char const *getaddress (void) const;
char const *getphone (void) const;
private:
// data fields
char *name;
char *address;
char *phone;
};
In this class the destructor is necessary to prevent memory, once
allocated for the fields name, address and
phone, from becoming unreachable when an object ceases to exist. In the
following example a Person object is created, after which the data
fields are printed. After this the main() function stops, which
leads to the deallocation of memory. The destructor of the class is also shown
for illustration purposes.
Note that in this example an object of the class Person is also
created and destroyed using a pointer variable; using the operators
new and delete.
Person::~Person ()
{
    // strdup() obtains its memory with malloc(), so the
    // strings are released with free() (see stdlib.h)
    free (name);
    free (address);
    free (phone);
}
int main ()
{
    Person
        kk ("Karel", "Rietveldlaan",
            "050-426044"),
        *bill = new Person ("Bill Clinton",
                            "White House",
                            "09-1-202-142-3045");
    printf("%s, %s, %s\n",
        kk.getname (), kk.getaddress (), kk.getphone ());
    printf("%s, %s, %s\n",
        bill->getname (), bill->getaddress (), bill->getphone ());
    delete bill;
    return (0);
}
The memory which is occupied by the object kk is released
automatically when main() terminates: the C++ compiler makes
sure that the destructor is called. Note however the object pointed to by
bill is handled differently. The variable bill is a
pointer; and a pointer variable is, even in C++, in itself no
Person. Therefore, before main() terminates, the
memory occupied by the object pointed to by bill must be explicitly
released; hence the statement delete bill. The
operator delete will make sure that the destructor is called,
thereby releasing the three strings of the object.
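The difference between an automatic object and an object reached through a pointer can be observed with a counter of live objects. The class Tracked below is a hypothetical stand-in for Person; it is only a sketch of the principle, not part of the Person class.

```cpp
#include <cassert>

// Hypothetical stand-in for Person: it only counts live objects.
class Tracked
{
public:
    static int alive;
    Tracked ()  { ++alive; }
    ~Tracked () { --alive; }
};

int Tracked::alive = 0;

// Returns how many of the two objects created inside the inner scope
// still exist after that scope has ended.
int survivorsAfterScope ()
{
    int before = Tracked::alive;
    Tracked *heap = 0;
    {
        Tracked local;                  // destroyed automatically at the closing brace
        heap = new Tracked;             // only destroyed by an explicit delete
        assert (Tracked::alive == before + 2);
    }
    int survivors = Tracked::alive - before;
    delete heap;                        // the explicit release, as with "delete bill"
    return survivors;
}
```

The function reports one survivor: the object created with new outlives the scope in which its pointer was assigned, and disappears only at the delete statement.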
Variables which are structs or classes can be
directly assigned in C++ in the same way that structs can be
assigned in C. The default action of such an assignment is a byte-by-byte
copying from one compound type to the other.
Let us now consider the consequences of this default action in a function such as the following:
void printperson (Person const &p)
{
Person
tmp;
tmp = p;
printf ("Name: %s\n"
"Address: %s\n"
"Phone: %s\n",
tmp.getname (), tmp.getaddress (), tmp.getphone ());
}
We shall follow the execution of this function step by step.
The function printperson() expects a reference to a
Person as its parameter p. So far, nothing
extraordinary is happening.
The function defines a local object tmp. This means that the
default constructor of Person is called, which -if defined
properly- resets the pointer fields name, address
and phone of the tmp object to zero.
Next, the object referenced by p is copied to
tmp. By default this means that sizeof(Person) bytes
from p are copied to tmp.
Now a potentially dangerous situation has arisen. Note that the actual
values in p are pointers, pointing to allocated memory.
Following the assignment this memory is addressed by two objects:
p and tmp.
Finally, printperson() terminates: the object
tmp is destroyed. The destructor of the class Person
releases the memory pointed to by the fields name,
address and phone: unfortunately, this memory is
also in use by p!
The incorrect assignment is illustrated in the following figure.
After the execution of printperson(), the object which was
referenced by p still contains pointers to strings, but these
pointers now address deallocated memory. This is undoubtedly not a
desired effect of a function like the above. The deallocated memory will likely
be reused by subsequent allocations, so that the previously
held strings are lost.
In general it can be concluded that every class which contains a constructor and a destructor, and which contains pointer fields to address allocated memory, is a potential candidate for trouble. There is of course a possibility to intervene: this possibility will be discussed in the next section.
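The sharing of memory after a default copy can be observed without triggering the actual error. The sketch below uses a hypothetical struct RawName that deliberately has no destructor, and a helper duplicate() that mimics strdup(); both names are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical struct WITHOUT a destructor, so that the default
// member-by-member copy can be observed safely.
struct RawName
{
    char *name;
};

// Duplicate a C string with malloc(), like strdup() would.
char *duplicate (char const *s)
{
    char *copy = static_cast<char *> (std::malloc (std::strlen (s) + 1));
    assert (copy != 0);
    return std::strcpy (copy, s);
}

// Returns true if a default copy leaves both objects pointing at one buffer.
bool defaultCopySharesBuffer ()
{
    RawName p;
    p.name = duplicate ("Karel");
    RawName tmp = p;                        // default copy: only the pointer is copied
    bool shared = (tmp.name == p.name);
    tmp.name [0] = 'C';                     // a change made through tmp ...
    shared = shared && p.name [0] == 'C';   // ... is visible through p
    std::free (p.name);                     // release the single buffer exactly once
    return shared;
}

// Returns true if an explicit duplicate gives the second object its own buffer.
bool deepCopyOwnsBuffer ()
{
    RawName p;
    p.name = duplicate ("Karel");
    RawName tmp;
    tmp.name = duplicate (p.name);
    bool separate = tmp.name != p.name
                    && std::strcmp (tmp.name, p.name) == 0;
    std::free (tmp.name);
    std::free (p.name);
    return separate;
}
```

If RawName had a destructor releasing name, the first function would release the same buffer twice, which is exactly the situation described for Person.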
Obviously, the right way to assign one Person object to another,
is not to copy the contents of the object byte by byte. A better way is
to make an equivalent object; one with its own allocated memory, but which
contains the same strings.
The `right' way to duplicate a Person object is illustrated in
the following figure.
There are several ways to achieve this. One solution consists of
the definition of a special function to handle assignments of objects of the
class Person. The purpose of this function would be to create a
copy of an object, but one with its own name, address
and phone strings. Such a member function might be:
void Person::assign (Person const &other)
{
// delete our own previously used memory
// (strdup() memory comes from malloc(): release it with free())
free (name);
free (address);
free (phone);
// now copy the other's data
name = strdup (other.name);
address = strdup (other.address);
phone = strdup (other.phone);
}
Using this tool we could rewrite the offending function
printperson():
void printperson (Person const &p)
{
Person
tmp;
// make tmp a copy of p, but with its own allocated
// strings
tmp.assign (p);
printf ("Name: %s\n"
"Address: %s\n"
"Phone: %s\n",
tmp.getname (), tmp.getaddress (), tmp.getphone ());
// now it doesn't matter that tmp gets destroyed..
}
In itself this solution is valid, although it merely treats the symptoms. This
solution requires that the programmer uses a specific member function instead of
the operator =; the problem remains if this rule is not
strictly adhered to. Our experience shows that errare humanum est; a
solution which does not depend on the programmer remembering such a rule is therefore preferable.
The problem of the assignment operator is solved by using operator
overloading: the syntactic possibility of C++ to redefine the actions
of an operator in a given context. Operator overloading was discussed earlier,
when the operators << and >> were
redefined for the usage with streams as cin, cout and
cerr (see the section on cout, cin and cerr).
Overloading the assignment operator is probably the most common form of operator overloading. However, a word of warning is appropriate: the fact that C++ allows operator overloading does not mean that this feature should be used at all times. A few rules are:
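Operator overloading should be used when an operator has a default action which is not desired because of its side effects. A typical example is the assignment operator in the context of the class Person.
Operator overloading can be used when the usage of the operator is common and its meaning remains unambiguous. An example is the operator
+ for a class which represents a complex number. The
meaning of a + between two complex numbers is quite clear and
unambiguous.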
Using these rules, operator overloading is minimized which helps keep source
files readable. An operator simply does what it is designed to do. Therefore, in
our vision, the operators << and >> in the
context of streams are misleading: the stream operations do not have anything in
common with the bitwise shift operations.
To achieve operator overloading in a context of a class, the class is simply
expanded with a public function which states the operator. A
corresponding function is then defined.
For example, to overload the addition operator + a function
operator+() would be defined. The function name consists of the
keyword operator and the operator itself.
In our case we define a new function operator=() to redefine the
actions of the assignment operator. A possible extension to the class
Person could therefore be:
// new declaration of the class
class Person
{
public:
.
.
void operator= (Person const &other);
.
.
private:
.
.
};
// definition of the function
void Person::operator= (Person const &other)
{
// deallocate old data
// (strdup() memory comes from malloc(): release it with free())
free (name);
free (address);
free (phone);
// make duplicates of other's data
name = strdup (other.name);
address = strdup (other.address);
phone = strdup (other.phone);
}
The function operator=() which is presented above is the first
version of the overloaded assignment. We shall present better and less bug-prone
versions later.
The actions of this member function are similar to those of the previously
proposed function assign(), but the name makes sure that this
function is also activated when the assignment operator = is used.
In fact there are two ways to call this function, which are illustrated
below:
Person
pers ("Frank", "Oostumerweg 23", "2223"),
copy;
// first possibility
copy = pers;
// second possibility
copy.operator= (pers);
It is obvious that the second possibility, in which operator=()
is explicitly stated, is not used often. The code fragment however illustrates
the similarity of the two methods of calling the function.
As we have seen, a member function of a given class is always called in the
context of some object of the class; there is always an implicit `substrate' for
the function to act on. C++ defines a keyword, this, to
address this substrate (this is not available in the not yet
discussed static member functions). The this
keyword is a pointer variable, which always contains the address of the object
in question. The this pointer is implicitly declared in each member
function (whether public or private); therefore, it is
as if each member function of the class Person contained the
following declaration:
extern Person *const this;
A member function like setname(), which sets a name
field of a Person to a given string, could therefore be implemented
in two ways: with or without the this pointer:
// alternative 1: implicit usage of this
void Person::setname (char const *n)
{
    free (name);                // memory from strdup() is released with free()
    name = strdup (n);
}
// alternative 2: explicit usage of this
void Person::setname (char const *n)
{
    free (this->name);
    this->name = strdup (n);
}
Explicit usage of the this pointer is not very frequent. There
are, however, a number of situations where the this pointer is
needed.
As we have seen, the operator = can be redefined for the class
Person in such a way that two objects of the class can be assigned,
leading to two copies of the same object.
As long as the two variables are different ones, the previously presented
version of the function operator=() will function properly: the
memory of the assigned object is released, after which it is allocated again to
hold new strings. However, when an object is assigned to itself (which is called
auto-assignment), a problem occurs: the allocated strings of the assigned object are
first released, but this also releases the strings of the
right-hand side variable! An example of this situation is illustrated below:
void fubar (Person &p)      // not const: p is assigned to
{
    p = p;                  // auto-assignment!
}
In this example it is perfectly clear that something unnecessary, possibly even wrong, is happening. Auto-assignment can however occur in more hidden forms:
Person
one,
two,
*pp;
pp = &one;
.
.
*pp = two;
.
.
one = *pp;
The problem of the auto-assignment can be solved by using the
this pointer. In the overloaded assignment operator function we
simply test whether the address of the right-hand side object is the same as the
address of the current object: if so, no action needs to be taken. The
definition of the function operator=() then becomes:
void Person::operator= (Person const &other)
{
// only take action if address of current object
// (this) is NOT equal to address of other
// object (&other):
if (this != &other)
{
// (strdup() memory comes from malloc(): release it with free())
free (name);
free (address);
free (phone);
name = strdup (other.name);
address = strdup (other.address);
phone = strdup (other.phone);
}
}
This is the second version of the overloaded assignment function. One still better version remains to be discussed.
Note the usage of the address operator in the statement
if (this != &other)
The variable this is a pointer to the `current' object, while
other is a reference; which is an `alias' to an actual
Person object. The address of the other object is therefore
&other, while the address of the current object is
this.
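The effect of the test can be tried out on a small class. The class Name below is a hypothetical single-string class, written only to exercise the comparison of this with &other; it releases its string with free() because the helper dup() allocates with malloc().

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical single-string class used to exercise the auto-assignment test.
class Name
{
public:
    Name (char const *s) : str (dup (s)) {}
    Name (Name const &other) : str (dup (other.str)) {}
    ~Name () { std::free (str); }

    void operator= (Name const &other)
    {
        if (this != &other)             // skip everything on auto-assignment
        {
            std::free (str);
            str = dup (other.str);
        }
    }
    char const *get () const { return str; }

private:
    // malloc()-based duplicate, comparable to strdup()
    static char *dup (char const *s)
    {
        char *copy = static_cast<char *> (std::malloc (std::strlen (s) + 1));
        assert (copy != 0);
        return std::strcpy (copy, s);
    }
    char *str;
};

// Hidden auto-assignment: pp points at one, so "one = *pp" assigns one to itself.
bool survivesHiddenSelfAssignment ()
{
    Name one ("Frank");
    Name *pp = &one;
    one = *pp;
    return std::strcmp (one.get (), "Frank") == 0;
}
```

Without the test, the string would be released before it is duplicated, and dup() would read memory that no longer belongs to the object.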
The syntax of C++ states that the associativity of the assignment operator is to the right-hand side; i.e., in a statement as
a = b = c;
the expression b = c is evaluated first, and the result is
assigned to a.
The implementation of the overloaded assignment operator so far does not
permit such constructions, as an assignment using the member function returns
nothing (void). We can therefore conclude that the previous
implementation does circumvent an allocation problem, but is not quite
syntactically right.
The syntactical problem can be illustrated as follows. When we rewrite the
expression a = b =
c to the form which explicitly mentions member functions, we
get:
a.operator= (b.operator= (c));
This is syntactically wrong, since the sub-expression
b.operator=(c) yields void; and the class
Person contains no member functions with the prototype
operator=(void).
This problem can also be remedied by using the this pointer. The
overloaded assignment function expects as its argument a reference to a
Person object; in the same way it can return a reference to such an
object. This reference can then be used as an argument for a nested
assignment.
It is customary to let the overloaded assignment return a reference to the
current object (i.e., *this), as a const reference.
The (final) version of the overloaded assignment operator for the class
Person thus becomes:
// declaration in the class
class Person
{
public:
.
.
Person const &operator= (Person const &other);
.
.
};
// definition of the function
Person const &Person::operator= (Person const &other)
{
// only take action when no auto-assignment occurs
if (this != &other)
{
// deallocate own data
// (strdup() memory comes from malloc(): release it with free())
free (address);
free (name);
free (phone);
// duplicate other's data
address = strdup (other.address);
name = strdup (other.name);
phone = strdup (other.phone);
}
// return current object, compiler will make sure
// that a const reference is returned
return (*this);
}
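That a returned reference is all that is needed for chained assignments can be checked with a minimal class. Value is a hypothetical class holding one int; only the return type of its operator=() matters for this sketch.

```cpp
#include <cassert>

// Hypothetical minimal class; only the return type of operator= matters here.
class Value
{
public:
    explicit Value (int v = 0) : val (v) {}

    Value const &operator= (Value const &other)
    {
        if (this != &other)
            val = other.val;
        return *this;                   // the result of "b = c" feeds "a = ..."
    }
    int get () const { return val; }

private:
    int val;
};

// a = b = c is handled as a.operator= (b.operator= (c)).
int chainAssign (int source)
{
    Value a, b, c (source);
    assert (a.get () == 0 && b.get () == 0);
    a = b = c;
    return a.get () + b.get ();
}
```

Since b.operator=(c) now yields a Value const &, it is an acceptable argument for a.operator=().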
In the following sections we shall look closer at another usage of the
operator =. We shall use a class String as an example.
This class is meant to handle allocated strings and is defined as follows:
class String
{
public:
// constructors, destructor
String ();
String (char const *s);
~String ();
// overloaded assignment
String const &operator= (String const &other);
// interface functions
void set (char const *data);
char const *get (void);
private:
// one data field: ptr to allocated string
char *str;
};
Concerning this definition we remark the following:
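The class contains a pointer char *str to
address allocated memory. For this reason, the class has a constructor and
destructor.
A typical action of the constructor would be to set the str
pointer to 0. A typical action of the destructor would be to release the
allocated memory.
The class has an overloaded assignment operator, implemented along the lines of the previous sections: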
String const &String::operator= (String const &other)
{
if (this != &other)
{
free (str);     // strdup() memory is released with free()
str = strdup (other.str);
}
return (*this);
}
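Besides the default constructor, the class has a constructor which expects one string argument. This argument is used to set the string to a given value, as in:
String
a ("Hello World!\n");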
Now let's consider the following code fragment. The numbered statements are discussed below the example:
String
a ("Hello World\n"), // see (1)
b, // see (2)
c = a; // see (3)
int main ()
{
b = c; // see (4)
return (0);
}
In statement (1) the object a is
initialized with the string ``Hello World''. This construction of the object
a therefore uses the constructor which expects one string
argument.
It should be noted here that this form is identical to
String
a = "Hello World\n";
Even though this code fragment uses the operator =, this is no
assignment: rather an initialization, and hence,
construction.
In statement (2) another String object is created. Again a
constructor is called; but since no special arguments are present, this is the
default constructor.
In statement (3) an object c is created. Again, a
constructor is therefore called. The new object is also initialized with (the
data of) object a.
This form of initialization has not been discussed yet. Since we can rewrite this statement in the form
String
c (a);
this initialization suggests that a constructor is called, with as argument
a (reference to a) String object. Such constructors are quite
common in C++ and are called copy constructors. More properties
of these constructors are discussed below.
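The simple rule which applies here is that whenever an object is created, a constructor is needed. The form of the constructor is still the following:
The constructor has no return value.
The constructor has the same name as the class.
The argument of the constructor is stated either between parentheses or following a =.
In statement (4) one existing object is assigned to another. No object is created here; hence this is an assignment, which uses the overloaded assignment operator.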
We conclude therefore that, given the above code statement (3), the class
String must be rewritten to define a copy constructor:
// class definition
class String
{
public:
.
.
String (String const &other);
.
.
};
// constructor definition
String::String (String const &other)
{
str = strdup (other.str);
}
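Which of the two functions is used in which statement can be verified with a class that merely counts the calls. Probe is a hypothetical class without any data; the sketch only shows the distinction between initialization and assignment.

```cpp
#include <cassert>

// Hypothetical class that records which special member function was used.
class Probe
{
public:
    static int copyConstructed;
    static int assigned;

    Probe () {}
    Probe (Probe const &) { ++copyConstructed; }
    Probe const &operator= (Probe const &other)
    {
        if (this != &other)
            ++assigned;
        return *this;
    }
};

int Probe::copyConstructed = 0;
int Probe::assigned = 0;

void initializeAndAssign ()
{
    Probe a;            // default constructor
    Probe b;            // default constructor
    Probe c = a;        // '=' in a definition: copy constructor
    Probe d (a);        // the same initialization, written with parentheses
    b = c;              // both objects already exist: overloaded assignment
    assert (Probe::copyConstructed >= 2);
}
```

After one call the copy constructor has been used twice (for c and d) and the assignment operator once (for b = c).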
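The actions of the copy constructor are similar to those of the overloaded assignment operator function: an object is duplicated, so that it contains its own allocated data. The copy constructor function is however simpler in the following respects:
A copy constructor does not need to deallocate previously allocated memory: the object has just been created, so it cannot yet hold allocated data of its own.
A copy constructor never needs to check for auto-duplication: no variable can be initialized with itself.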
Besides the above mentioned quite obvious usage of the copy constructor, this constructor has other important tasks. All of these tasks are related to the fact that the copy constructor is always called when an object is created and initialized with another object; even when this new object is a hidden or temporary variable:
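When a function takes an object as its argument, rather than a pointer or a reference, the copy constructor is called to pass a copy of the object as the argument. This is illustrated in the following code fragment: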
void func (String s) // no pointer, no reference
{ // but the String itself
puts (s.get ());
}
int main ()
{
String
hi ("hello world");
func (hi);
return (0);
}
In this code fragment hi itself is not passed as an argument,
but instead a temporary (stack) variable is created using the copy
constructor. This temporary variable is known within func() as
s. Note that by defining func() with a reference
argument, extra stack usage and a call to the copy constructor would have been
avoided.
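The extra call to the copy constructor for a value parameter can be counted directly. The struct Token and the three functions below are hypothetical names used only for this sketch.

```cpp
#include <cassert>

// Hypothetical type that counts how often it is copied.
struct Token
{
    static int copies;
    Token () {}
    Token (Token const &) { ++copies; }
};

int Token::copies = 0;

void byValue (Token) {}                 // parameter is a new object: copy constructor
void byReference (Token const &) {}     // parameter is an alias: nothing is copied

// Returns the number of copy constructor calls caused by one function call.
int copiesMadeBy (bool passByValue)
{
    Token hi;
    int before = Token::copies;
    if (passByValue)
        byValue (hi);
    else
        byReference (hi);
    assert (Token::copies >= before);
    return Token::copies - before;
}
```

Passing by value costs one copy per call, passing by (const) reference none.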
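The copy constructor is also called implicitly when a function returns an object. As an example, the following function reads a line of input and returns it in String format: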
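String getline ()
{
    char
        buf [100];                          // buffer for kbd input
    // read buffer; fgets() limits the input to the buffer size
    // (and keeps the trailing newline), which gets() cannot do
    if (! fgets (buf, sizeof (buf), stdin))
        buf [0] = 0;                        // nothing read: empty string
    String
        ret = buf;                          // convert to String
    return (ret);                           // and return it
}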
A hidden String object is here
initialized with the return value ret (using the copy
constructor) and is returned by the function. The local variable
ret itself ceases to exist when getline()
terminates. To demonstrate that copy constructors are not called in all situations,
consider the following. We could rewrite the above function
getline() to the following form:
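String getline ()
{
    char
        buf [100];                          // buffer for kbd input
    // read buffer; fgets() limits the input to the buffer size
    if (! fgets (buf, sizeof (buf), stdin))
        buf [0] = 0;                        // nothing read: empty string
    return (buf);                           // and return it
}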
This code fragment is quite valid, even though the return value
char* doesn't match the prototype String. In this
situation, C++ will try to convert the char* to a
String: this is indeed possible, given a constructor which expects
a char* argument. This means that the copy constructor is
not used in this version of getline(). Instead, the
constructor expecting a char* argument is used.
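The similarities between the copy constructor on the one hand and the overloaded assignment operator on the other are reinvestigated in this section. We present here two primitive functions which often occur in `our' code, and which we think are quite useful. We remark that:
The duplication of (private) data occurs in the copy constructor and in the overloaded assignment function.
The deallocation of used memory occurs in the overloaded assignment function and in the destructor.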
The two above actions (duplication and deallocation) can be coded in two
primitive functions, say copy() and destroy(), which
are used in the overloaded assignment operator, the copy constructor, and the
destructor. When we apply this method to the class Person, we can
rewrite the code as follows.
First, the class definition is expanded with two private
functions copy() and destroy(). The purpose of these
functions is to unconditionally copy the data of another object or to
deallocate the memory of the current object. Hence these functions implement
`primitive' functionality:
// class definition, only relevant functions are shown here
class Person
{
public:
// constructors, destructor
Person (Person const &other);
~Person ();
// overloaded assignment
Person const &operator= (Person const &other);
.
.
private:
// data fields
char *name, *address, *phone;
// the two primitives
void copy (Person const &other);
void destroy (void);
};
Next, we present the implementation of the functions copy() and
destroy():
// copy(): unconditionally copy other object's data
void Person::copy (Person const &other)
{
name = strdup (other.name);
address = strdup (other.address);
phone = strdup (other.phone);
}
// destroy(): unconditionally deallocate data
void Person::destroy ()
{
// (strdup() memory comes from malloc(): release it with free())
free (name);
free (address);
free (phone);
}
Finally the three public functions in which another object's
memory is copied or in which memory is deallocated are rewritten:
// copy constructor
Person::Person (Person const &other)
{
// unconditionally copy other's data
copy (other);
}
// destructor
Person::~Person ()
{
// unconditionally deallocate
destroy ();
}
// overloaded assignment
Person const &Person::operator= (Person const &other)
{
// only take action if no auto-assignment
if (this != &other)
{
destroy ();
copy (other);
}
// return (reference to) current object for
// chain-assignments
return (*this);
}
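The same scheme can be applied to any class with allocated data. As a complete sketch, the hypothetical class Text below (a one-string class comparable to String) builds its copy constructor, destructor and assignment operator from copy() and destroy(); the helper duplicate() is an assumption that stands in for strdup().

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Sketch of a String-like class; Text is a hypothetical name.
class Text
{
public:
    Text (char const *s = "") : str (0) { set (s); }
    Text (Text const &other) : str (0)  { copy (other); }
    ~Text ()                            { destroy (); }

    Text const &operator= (Text const &other)
    {
        if (this != &other)
        {
            destroy ();
            copy (other);
        }
        return *this;
    }
    void set (char const *s)
    {
        destroy ();
        str = duplicate (s);
    }
    char const *get () const { return str; }

private:
    // malloc()-based duplicate, comparable to strdup()
    static char *duplicate (char const *s)
    {
        char *d = static_cast<char *> (std::malloc (std::strlen (s) + 1));
        assert (d != 0);
        return std::strcpy (d, s);
    }
    // the two primitives
    void copy (Text const &other) { str = duplicate (other.str); }
    void destroy ()               { std::free (str); str = 0; }

    char *str;
};

// Returns true if copies keep their own data when the original changes.
bool copiesAreIndependent ()
{
    Text a ("hello");
    Text b (a);             // copy constructor: copy()
    Text c;
    c = a;                  // assignment: destroy(), then copy()
    a.set ("changed");
    return std::strcmp (b.get (), "hello") == 0
        && std::strcmp (c.get (), "hello") == 0
        && std::strcmp (a.get (), "changed") == 0;
}
```

A design note: destroy() resets the pointer to 0 after releasing it, so that calling it twice in a row is harmless.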
The following sections present more examples of operator overloading.
As one more example of operator overloading, we present here a class which is
meant to represent an array of ints. Indexing the array elements
occurs with the standard array operator [], but additionally the
class checks for boundary overflow.
An example of the usage of the class is given below:
int main ()
{
    Intarray
        x (20);                         // 20 ints
    int i;                              // declared once, used by both loops
    for (i = 0; i < 20; i++)
        x [i] = i * 2;                  // assign the elements
    for (i = 0; i <= 20; i++)           // i == 20 is deliberately out of bounds
        printf ("At index %d: value %d\n",
                i, x [i]);
    return (0);
}
This example shows how an array is created to hold 20 ints. The
elements of the array can be assigned or retrieved. The above example should
produce a run-time error, which is generated by the class Intarray:
the last for loop causes a boundary overflow, since
x[20] is addressed while legal indices range from 0 to 19.
The definition of the class is given below:
class Intarray
{
public:
// constructors, destructor etc.
Intarray (int sz = 1); // default size: 1 int
Intarray (Intarray const &other);
~Intarray ();
Intarray const &operator= (Intarray const &other);
// the interface
int &operator[] (int index);
private:
// data
int *data, size;
};
Concerning this class definition we remark:
The class has a constructor with a default int argument,
specifying the array size. This function serves also as the default
constructor, since the compiler will substitute 1 for the argument when none
is given.
The overloaded index operator returns a reference to an
int. This allows an expression like x[10] to be used
on the left-hand side and on the right-hand side of an assignment. We
can therefore use the same function to retrieve and to set data of the class.
The member functions of the class are given below.
// constructor
Intarray::Intarray (int sz)
{
// check for legal size specification
if (sz < 1)
{
printf ("Intarray: size of array must be >= 1, not %d!\n", sz);
exit (1);
}
// remember size, create array
size = sz;
data = new int [sz];
}
// copy constructor
Intarray::Intarray (Intarray const &other)
{
// set size
size = other.size;
// create array
data = new int [size];
// copy other's values
for (int i = 0; i < size; i++)
data [i] = other.data [i];
}
// overloaded assignment
Intarray const &Intarray::operator= (Intarray const &other)
{
// take action only when no auto-assignment
if (this != &other)
{
// set size
size = other.size;
// remove previous memory, create new array
delete [] data;
data = new int [size];
// copy other's data
for (int i = 0; i < size; i++)
data [i] = other.data [i];
}
return (*this);
}
// here is the interface function
int &Intarray::operator[] (int index)
{
// check for array boundary over/underflow
if (index < 0 || index >= size)
{
printf ("Intarray: boundary overflow or underflow, "
"index=%d, should range from 0 to %d\n",
index, size - 1);
exit (1);
}
// emit the reference
return (data [index]);
}
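Since Intarray ends the program with exit(), its boundary check is awkward to try out from a test. The sketch below is a hypothetical variant, CheckedInts, which performs the same checks but throws std::out_of_range instead; this swap is our assumption and not part of Intarray. Copying is left out to keep the sketch short.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical variant of Intarray: same checks, but throws instead of exit().
class CheckedInts
{
public:
    explicit CheckedInts (int sz) : data (0), size (sz)
    {
        if (sz < 1)
            throw std::invalid_argument ("CheckedInts: size must be >= 1");
        data = new int [sz] ();         // () zero-initializes the elements
    }
    ~CheckedInts () { delete [] data; }

    int &operator[] (int index)
    {
        if (index < 0 || index >= size)
            throw std::out_of_range ("CheckedInts: index out of range");
        return data [index];            // a reference: usable on both sides of =
    }

private:
    CheckedInts (CheckedInts const &);              // copying omitted in this sketch
    CheckedInts &operator= (CheckedInts const &);

    int *data;
    int size;
};

// Fills an array of n ints with i * 2, then reads index probe;
// returns -1 when the class rejects the index.
int readDoubled (int n, int probe)
{
    assert (n >= 1);
    CheckedInts x (n);
    for (int i = 0; i < n; i++)
        x [i] = i * 2;                  // operator[] on the left-hand side
    try
    {
        return x [probe];               // operator[] on the right-hand side
    }
    catch (std::out_of_range const &)
    {
        return -1;
    }
}
```

With 20 elements, index 19 yields 38, while the indices 20 and -1 are rejected.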
This section describes how a class can be adapted for the usage with the
C++ streams cout and cerr and the operator
<<. Adaptation of a class for the usage with cin
and its operator >> occurs in a similar way and is not
illustrated here.
The implementation of an overloaded operator << in the
context of cout or cerr involves the base class of
cout or cerr, which is ostream. This
class is declared in the header file iostream.h and defines only
overloaded operator functions for `basic' types, such as int,
char*, etc. The purpose of this section is to show how an operator
function can be defined which processes a new class, say Person
(see the section on the class Person), so that constructions as the following one become possible:
Person
kr ("Kernighan and Ritchie", "unknown", "unknown");
cout << "Name, address and phone number of Person kr:\n"
<< kr
<< '\n';
The statement cout << kr involves the operator
<< and its two operands: an ostream& and a
Person&. The proposed action is defined in a class-less
operator function operator<<() expecting two arguments:
// declaration in, say, person.h
extern ostream &operator<< (ostream &, Person const &);
// definition in some source file
ostream &operator<< (ostream &stream, Person const &pers)
{
return (stream << "Name: " << pers.getname ()
<< "Address: " << pers.getaddress ()
<< "Phone: " << pers.getphone ()
);
}
Concerning this function we remark the following:
The function must return a (reference to an) ostream object, to
enable `chaining' of the operator.
The two operands of the operator << are stated as the
two arguments of the overloading function.
The class ostream provides the member function
opfx(), which flushes any other ostream streams tied
with the current stream. opfx() returns 0 when an error has been
encountered.
An improved form of the above function would therefore be:
ostream &operator<< (ostream &stream, Person const &pers)
{
if (! stream.opfx ())
return (stream);
.
.
}
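The chaining of the overloaded operator can be tried out with a string stream. The sketch below assumes the standard headers ostream and sstream rather than iostream.h, leaves out the opfx() test, and uses a hypothetical two-field class Contact (with std::string fields) in place of Person.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical two-field class standing in for Person.
class Contact
{
public:
    Contact (std::string const &n, std::string const &p)
        : name (n), phone (p) {}
    std::string const &getname () const  { return name; }
    std::string const &getphone () const { return phone; }

private:
    std::string name, phone;
};

// Free function: the left operand is the stream, the right operand the object.
std::ostream &operator<< (std::ostream &stream, Contact const &c)
{
    return stream << "Name: " << c.getname ()
                  << ", Phone: " << c.getphone ();
}

// Formats a Contact through the overloaded operator.
std::string describe (Contact const &c)
{
    std::ostringstream out;
    out << c << '\n';       // chaining works because operator<< returns the stream
    assert (out.good ());
    return out.str ();
}
```

Because the operator is written for std::ostream, the same function serves cout, cerr and string streams alike.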
Two important extensions to classes have been discussed in this chapter: the overloaded assignment operator and the copy constructor. As we have seen, classes with pointer data which address allocated memory are potential sources of semantic errors. The two introduced extensions are the only measures against unintentional loss of allocated data.
The conclusion is therefore: as soon as a class is defined where pointer data are used, an overloaded assignment function and a copy constructor should be implemented.