Archive-name: C++-faq/part4
Posting-Frequency: monthly
Last-modified: Sep 8, 1997
URL: http://www.cerfnet.com/~mpcline/c++-faq-lite/

AUTHOR: Marshall Cline / cline@parashift.com

COPYRIGHT: This posting is part of "C++ FAQ Lite." The entire "C++ FAQ Lite" document is Copyright(C) 1991-96 Marshall P. Cline, Ph.D., cline@parashift.com. All rights reserved. Copying is permitted only under designated situations. For details, see section [1].

NO WARRANTY: THIS WORK IS PROVIDED ON AN "AS IS" BASIS. THE AUTHOR PROVIDES NO WARRANTY WHATSOEVER, EITHER EXPRESS OR IMPLIED, REGARDING THE WORK, INCLUDING WARRANTIES WITH RESPECT TO ITS MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE.

C++-FAQ-Lite != C++-FAQ-Book: This document, C++ FAQ Lite, is not the same as the C++ FAQ Book. The book (C++ FAQs, Cline and Lomow, Addison-Wesley) is 500% larger than this document, and is available in bookstores. For details, see section [3].

==============================================================================

SECTION [13]: Operator overloading

[13.1] What's the deal with operator overloading?

It allows you to provide an intuitive interface to users of your class.

Operator overloading allows C/C++ operators to have user-defined meanings on user-defined types (classes). Overloaded operators are syntactic sugar for function calls:

    class Fred {
    public:
        // ...
    };

    #if 0

        // Without operator overloading:
        Fred add(Fred, Fred);
        Fred mul(Fred, Fred);

        Fred f(Fred a, Fred b, Fred c)
        {
            return add(add(mul(a,b), mul(b,c)), mul(c,a));    // Yuk...
        }

    #else

        // With operator overloading:
        Fred operator+ (Fred, Fred);
        Fred operator* (Fred, Fred);

        Fred f(Fred a, Fred b, Fred c)
        {
            return a*b + b*c + c*a;
        }

    #endif

==============================================================================

[13.2] What are the benefits of operator overloading?

By overloading standard operators on a class, you can exploit the intuition of the users of that class.
This lets users program in the language of the problem domain rather than in the language of the machine.

The ultimate goal is to reduce both the learning curve and the defect rate.

==============================================================================

[13.3] What are some examples of operator overloading?

Here are a few of the many examples of operator overloading:
 * myString + yourString might concatenate two string objects
 * myDate++ might increment a Date object
 * a * b might multiply two Number objects
 * a[i] might access an element of an Array object
 * x = *p might dereference a "smart pointer" that actually "points" to a disk record -- it could actually seek to the location on disk where p "points" and return the appropriate record into x

==============================================================================

[13.4] But operator overloading makes my class look ugly; isn't it supposed to make my code clearer?

Operator overloading makes life easier for the users of a class[13.2], not for the developer of the class!

Consider the following example.

    class Array {
    public:
        int& operator[] (unsigned i);      // Some people don't like this syntax
        // ...
    };

    inline
    int& Array::operator[] (unsigned i)    // Some people don't like this syntax
    {
        // ...
    }

Some people don't like the keyword operator or the somewhat funny syntax that goes with it in the body of the class itself. But the operator overloading syntax isn't supposed to make life easier for the developer of a class. It's supposed to make life easier for the users of the class:

    int main()
    {
        Array a;
        a[3] = 4;   // User code should be obvious and easy to understand...
    }

Remember: in a reuse-oriented world, there will usually be many people who use your class, but there is only one person who builds it (yourself); therefore you should do things that favor the many rather than the few.

==============================================================================

[13.5] What operators can/cannot be overloaded?

Most can be overloaded. The only C operators that can't be are . and ?: (and sizeof, which is technically an operator). C++ adds a few of its own operators, most of which can be overloaded except :: and .*.

Here's an example of the subscript operator (it returns a reference). The #if 0 branches show the code without operator overloading; the #else branches show the same code with operator overloading:

    class Array {
    public:
    #if 0
        int& elem(unsigned i)        { if (i > 99) error(); return data[i]; }
    #else
        int& operator[] (unsigned i) { if (i > 99) error(); return data[i]; }
    #endif
    private:
        int data[100];
    };

    int main()
    {
        Array a;

    #if 0
        a.elem(10) = 42;
        a.elem(12) += a.elem(13);
    #else
        a[10] = 42;
        a[12] += a[13];
    #endif
    }

==============================================================================

[13.6] Can I overload operator== so it lets me compare two char[] using a string comparison?

No: at least one operand of any overloaded operator must be of some class type.

But even if C++ allowed you to do this, which it doesn't, you wouldn't want to do it anyway. You really should be using a string-like class rather than an array of char in the first place[17.3], since arrays are evil[21.5].

==============================================================================

[13.7] Can I create an operator** for "to-the-power-of" operations?

Nope.

The names of, precedence of, associativity of, and arity of operators are fixed by the language. There is no operator** in C++, so you cannot create one for a class type.

If you're in doubt, consider that x ** y is the same as x * (*y) (in other words, the compiler assumes y is a pointer). Besides, operator overloading is just syntactic sugar for function calls. Although this particular syntactic sugar can be very sweet, it doesn't add anything fundamental. I suggest you overload pow(base,exponent) (a double precision version is in <math.h>).

By the way, operator^ can work for to-the-power-of, except it has the wrong precedence and associativity.
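
As a rough illustration of the pow(base,exponent) suggestion in [13.7], here is a minimal sketch. The Fraction class and its members are hypothetical names invented for this example, and the overload assumes a non-negative integer exponent; it is one possible shape, not a recommended numeric design.

```cpp
#include <cassert>

// Hypothetical value type, used only for illustration.
class Fraction {
public:
    Fraction(long num, long den) : num_(num), den_(den) { assert(den != 0); }
    long num() const { return num_; }
    long den() const { return den_; }
private:
    long num_, den_;
};

// An overload of pow() for Fraction, taking a non-negative integer exponent.
// It uses repeated multiplication and does not reduce to lowest terms.
Fraction pow(const Fraction& base, unsigned exponent)
{
    long n = 1, d = 1;
    for (unsigned k = 0; k < exponent; ++k) {
        n *= base.num();
        d *= base.den();
    }
    return Fraction(n, d);
}
```

User code then writes pow(x, 3) where it might have wished for x ** 3; the call reads almost as well and has no precedence surprises.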
==============================================================================

[13.8] How do I create a subscript operator for a Matrix class?

Use operator() rather than operator[].

When you have multiple subscripts, the cleanest way to do it is with operator() rather than with operator[]. The reason is that operator[] always takes exactly one parameter, but operator() can take any number of parameters (in the case of a rectangular matrix, two parameters are needed).

For example:

    class Matrix {
    public:
        Matrix(unsigned rows, unsigned cols);
        double& operator() (unsigned row, unsigned col);
        double  operator() (unsigned row, unsigned col) const;
        // ...
        ~Matrix();                              // Destructor
        Matrix(const Matrix& m);                // Copy constructor
        Matrix& operator= (const Matrix& m);    // Assignment operator
        // ...
    private:
        unsigned rows_, cols_;
        double* data_;
    };

    inline
    Matrix::Matrix(unsigned rows, unsigned cols)
        : rows_ (rows),
          cols_ (cols),
          data_ (new double[rows * cols])
    {
        if (rows == 0 || cols == 0)
            throw BadIndex("Matrix constructor has 0 size");
    }

    inline
    Matrix::~Matrix()
    {
        delete[] data_;
    }

    inline
    double& Matrix::operator() (unsigned row, unsigned col)
    {
        if (row >= rows_ || col >= cols_)
            throw BadIndex("Matrix subscript out of bounds");
        return data_[cols_*row + col];
    }

    inline
    double Matrix::operator() (unsigned row, unsigned col) const
    {
        if (row >= rows_ || col >= cols_)
            throw BadIndex("const Matrix subscript out of bounds");
        return data_[cols_*row + col];
    }

Then you can access an element of Matrix m using m(i,j) rather than m[i][j]:

    int main()
    {
        Matrix m(10,10);    // Matrix has no default constructor
        m(5,8) = 106.15;
        cout << m(5,8);
        // ...
    }

==============================================================================

[13.9] Should I design my classes from the outside (interfaces first) or from the inside (data first)?

From the outside!

A good interface provides a simplified view that is expressed in the vocabulary of a user[7.3]. In the case of OO software, the interface is normally to a class or a tight group of classes[14.2].

First think about what the object logically represents, not how you intend to physically build it. For example, suppose you have a Stack class that will be built by containing a LinkedList:

    class Stack {
    public:
        // ...
    private:
        LinkedList list_;
    };

Should the Stack have a get() method that returns the LinkedList? Or a set() method that takes a LinkedList? Or a constructor that takes a LinkedList? Obviously the answer is No, since you should design your interfaces from the outside-in. I.e., users of Stack objects don't care about LinkedLists; they care about pushing and popping.

Now for another example that is a bit more subtle. Suppose class LinkedList is built using a linked list of Node objects, where each Node object has a pointer to the next Node:

    class Node { /*...*/ };

    class LinkedList {
    public:
        // ...
    private:
        Node* first_;
    };

Should the LinkedList class have a get() method that will let users access the first Node? Should the Node object have a get() method that will let users follow that Node to the next Node in the chain? In other words, what should a LinkedList look like from the outside? Is a LinkedList really a chain of Node objects? Or is that just an implementation detail? And if it is just an implementation detail, how will the LinkedList let users access each of the elements in the LinkedList one at a time?

One man's answer: A LinkedList is not a chain of Nodes. That may be how it is built, but that is not what it is. What it is is a sequence of elements. Therefore the LinkedList abstraction should provide a "LinkedListIterator" class as well, and that "LinkedListIterator" might have an operator++ to go to the next element, and it might have a get()/set() pair to access its value stored in the Node (the value in the Node element is solely the responsibility of the LinkedList user, which is why there is a get()/set() pair that allows the user to freely manipulate that value).

Starting from the user's perspective, we might want our LinkedList class to support operations that look similar to accessing an array using pointer arithmetic:

    void userCode(LinkedList& a)
    {
        for (LinkedListIterator p = a.begin(); p != a.end(); ++p)
            cout << *p << '\n';
    }

To implement this interface, LinkedList will need a begin() method and an end() method. These return a "LinkedListIterator" object. The "LinkedListIterator" will need a method to go forward, ++p; a method to access the current element, *p; and a comparison operator, p != a.end().

The code follows. The key insight is that the LinkedList class does not have any methods that let users access the Nodes. Nodes are an implementation technique that is completely buried. The LinkedList class could have its internals replaced with a doubly linked list, or even an array, and the only difference would be some performance differences with the prepend(elem) and append(elem) methods.

    #include <assert.h>   // Poor man's exception handling

    typedef int bool;     // Someday we won't have to do this

    class LinkedListIterator;
    class LinkedList;

    class Node {
        // No public members; this is a "private class"
        friend class LinkedListIterator;   // A friend class[14]
        friend class LinkedList;
        Node* next_;
        int elem_;
    };

    class LinkedListIterator {
    public:
        bool operator== (LinkedListIterator i) const;
        bool operator!= (LinkedListIterator i) const;
        void operator++ ();   // Go to the next element
        int& operator*  ();   // Access the current element
    private:
        friend class LinkedList;   // So begin() and end() can construct iterators
        LinkedListIterator(Node* p);
        Node* p_;
    };

    class LinkedList {
    public:
        void append(int elem);    // Adds elem after the end
        void prepend(int elem);   // Adds elem before the beginning
        // ...
        LinkedListIterator begin();
        LinkedListIterator end();
        // ...
    private:
        Node* first_;
    };

Here are the methods that are obviously inlinable (probably in the same header file):

    inline bool LinkedListIterator::operator== (LinkedListIterator i) const
    {
        return p_ == i.p_;
    }

    inline bool LinkedListIterator::operator!= (LinkedListIterator i) const
    {
        return p_ != i.p_;
    }

    inline void LinkedListIterator::operator++()
    {
        assert(p_ != NULL);  // or if (p_==NULL) throw ...
        p_ = p_->next_;
    }

    inline int& LinkedListIterator::operator*()
    {
        assert(p_ != NULL);  // or if (p_==NULL) throw ...
        return p_->elem_;
    }

    inline LinkedListIterator::LinkedListIterator(Node* p)
        : p_(p)
    { }

    inline LinkedListIterator LinkedList::begin()
    {
        return first_;
    }

    inline LinkedListIterator LinkedList::end()
    {
        return NULL;
    }

Conclusion: The linked list had two different kinds of data. The values of the elements stored in the linked list are the responsibility of the user of the linked list (and only the user; the linked list itself makes no attempt to prohibit users from changing the third element to 5). The values of the linked list's infrastructure data (next pointers, etc.) are the responsibility of the linked list (and only the linked list; e.g., the linked list does not let users change (or even look at!) the various next pointers).

Thus the only get()/set() methods were to get and set the elements of the linked list, but not the infrastructure of the linked list. Since the linked list hides the infrastructure pointers/etc., it is able to make very strong promises regarding that infrastructure (e.g., if it was a doubly linked list, it might guarantee that every forward pointer was matched by a backwards pointer from the next Node).

So, we see here an example of where the values of some of a class's data are the responsibility of users (in which case the class needs to have get()/set() methods for that data) but the data that the class wants to control does not necessarily have get()/set() methods.
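
To see the pieces of [13.9] working together, the following is a self-contained sketch of the same design that compiles with a current standard compiler. Several things here are assumptions added for the example and are not part of the FAQ's code: the Node constructor, the bodies of append() and prepend(), the destructor, the disabled copy operations, and the helper function sum().

```cpp
#include <cassert>
#include <cstddef>

class LinkedList;
class LinkedListIterator;

class Node {
    // No public members; only the two friends can touch a Node
    friend class LinkedListIterator;
    friend class LinkedList;
    Node(int elem, Node* next) : next_(next), elem_(elem) { }
    Node* next_;
    int   elem_;
};

class LinkedListIterator {
public:
    bool operator== (LinkedListIterator i) const { return p_ == i.p_; }
    bool operator!= (LinkedListIterator i) const { return p_ != i.p_; }
    void operator++ () { assert(p_ != NULL); p_ = p_->next_; }
    int& operator*  () { assert(p_ != NULL); return p_->elem_; }
private:
    friend class LinkedList;              // Only LinkedList creates iterators
    LinkedListIterator(Node* p) : p_(p) { }
    Node* p_;
};

class LinkedList {
public:
    LinkedList() : first_(NULL) { }
    ~LinkedList();
    void prepend(int elem) { first_ = new Node(elem, first_); }
    void append(int elem);
    LinkedListIterator begin() { return LinkedListIterator(first_); }
    LinkedListIterator end()   { return LinkedListIterator(NULL); }
private:
    LinkedList(const LinkedList&);              // Copying is not supported
    LinkedList& operator= (const LinkedList&);  // in this sketch
    Node* first_;
};

LinkedList::~LinkedList()
{
    while (first_ != NULL) {
        Node* next = first_->next_;
        delete first_;
        first_ = next;
    }
}

void LinkedList::append(int elem)
{
    Node** link = &first_;            // Walk to the final NULL link
    while (*link != NULL)
        link = &(*link)->next_;
    *link = new Node(elem, NULL);
}

// User code: sums the elements through the public iterator interface only
int sum(LinkedList& a)
{
    int total = 0;
    for (LinkedListIterator p = a.begin(); p != a.end(); ++p)
        total += *p;
    return total;
}
```

Note that sum() never mentions Node or next_. Users can read and overwrite element values through *p, but the chain itself stays entirely under the list's control, which is the split of responsibilities described in the conclusion above.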
==============================================================================

SECTION [14]: Friends

[14.1] What is a friend?

Something to allow your class to grant access to another class or function.

Friends can be either functions or other classes. A class grants access privileges to its friends. Normally a developer has political and technical control over both the friend and member functions of a class (else you may need to get permission from the owner of the other pieces when you want to update your own class).

==============================================================================

[14.2] Do friends violate encapsulation?

If they're used properly, they actually enhance encapsulation.

You often need to split a class in half when the two halves will have different numbers of instances or different lifetimes. In these cases, the two halves usually need direct access to each other (the two halves used to be in the same class, so you haven't increased the amount of code that needs direct access to a data structure; you've simply reshuffled the code into two classes instead of one). The safest way to implement this is to make the two halves friends of each other.

If you use friends as just described, you'll keep private things private. People who don't understand this often make naive efforts to avoid using friendship in situations like the above, and often they actually destroy encapsulation. They either use public data (grotesque!), or they make the data accessible between the halves via public get() and set() member functions. Having a public get() and set() member function for a private datum is OK only when the private datum "makes sense" from outside the class (from a user's perspective). In many cases, these get()/set() member functions are almost as bad as public data: they hide (only) the name of the private datum, but they don't hide the existence of the private datum.

Similarly, if you use friend functions as a syntactic variant of a class's public: access functions, they don't violate encapsulation any more than a member function violates encapsulation. In other words, a class's friends don't violate the encapsulation barrier: along with the class's member functions, they are the encapsulation barrier.

==============================================================================

[14.3] What are some advantages/disadvantages of using friend functions?

They provide a degree of freedom in the interface design options.

Member functions and friend functions are equally privileged (100% vested). The major difference is that a friend function is called like f(x), while a member function is called like x.f(). Thus the ability to choose between member functions (x.f()) and friend functions (f(x)) allows a designer to select the syntax that is deemed most readable, which lowers maintenance costs.

The major disadvantage of friend functions is that they require an extra line of code when you want dynamic binding. To get the effect of a virtual friend, the friend function should call a hidden (usually protected:) virtual[20] member function. This is called the Virtual Friend Function Idiom[15.8]. For example:

    class Base {
    public:
        friend void f(Base& b);
        // ...
    protected:
        virtual void do_f();
        // ...
    };

    inline void f(Base& b)
    {
        b.do_f();
    }

    class Derived : public Base {
    public:
        // ...
    protected:
        virtual void do_f();  // "Override" the behavior of f(Base& b)
        // ...
    };

    void userCode(Base& b)
    {
        f(b);
    }

The statement f(b) in userCode(Base&) will invoke b.do_f(), which is virtual[20]. This means that Derived::do_f() will get control if b is actually an object of class Derived. Note that Derived overrides the behavior of the protected: virtual[20] member function do_f(); it does not have its own variation of the friend function, f(Base&).
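
For readers who want to watch the dispatch happen, here is a small compilable variation of the idiom from [14.3]. The names describe() and do_describe() are hypothetical stand-ins for f() and do_f(), chosen so the functions can return a string that shows which override ran; the virtual destructor is an addition that the FAQ's fragment leaves out.

```cpp
#include <string>

class Base {
public:
    virtual ~Base() { }
    friend std::string describe(const Base& b);
protected:
    virtual std::string do_describe() const { return "Base"; }
};

// The friend itself is an ordinary, non-virtual function;
// the virtual member it calls is what does the dispatching.
inline std::string describe(const Base& b)
{
    return b.do_describe();
}

class Derived : public Base {
protected:
    // Overrides the behavior of describe(const Base&); no second friend needed
    virtual std::string do_describe() const { return "Derived"; }
};

std::string userCode(const Base& b)
{
    return describe(b);
}
```

Passing a Derived object to userCode() yields "Derived" even though userCode() only knows about Base, which is the "virtual friend" effect the text describes.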
==============================================================================

[14.4] What does it mean that "friendship is neither inherited nor transitive"?

I may declare you as my friend, but that doesn't mean I necessarily trust either your kids or your friends.
 * I don't necessarily trust the kids of my friends. The privileges of friendship aren't inherited. Derived classes of a friend aren't necessarily friends. If class Fred declares that class Base is a friend, classes derived from Base don't have any automatic special access rights to Fred objects.
 * I don't necessarily trust the friends of my friends. The privileges of friendship aren't transitive. A friend of a friend isn't necessarily a friend. If class Fred declares class Wilma as a friend, and class Wilma declares class Betty as a friend, class Betty doesn't necessarily have any special access rights to Fred objects.

==============================================================================

[14.5] Should my class declare a member function or a friend function?

Use a member when you can, and a friend when you have to.

Sometimes friends are syntactically better (e.g., in class Fred, friend functions allow the Fred parameter to be second, while members require it to be first). Another good use of friend functions is the binary infix arithmetic operators. E.g., aComplex + aComplex should be defined as a friend rather than a member if you want to allow aFloat + aComplex as well (member functions don't allow promotion of the left hand argument, since that would change the class of the object that is the recipient of the member function invocation).

In other cases, choose a member function over a friend function.

==============================================================================

SECTION [15]: Input/output via <iostream.h> and <stdio.h>

[15.1] Why should I use <iostream.h> instead of the traditional <stdio.h>?

Increase type safety, reduce errors, improve performance, allow extensibility, and provide subclassability.

printf() is arguably not broken, and scanf() is perhaps livable despite being error prone; however, both are limited with respect to what C++ I/O can do. C++ I/O (using << and >>) is, relative to C (using printf() and scanf()):
 * Better type safety: With <iostream.h>, the type of object being I/O'd is known statically by the compiler. In contrast, <stdio.h> uses "%" fields to figure out the types dynamically.
 * Less error prone: With <iostream.h>, there are no redundant "%" tokens that have to be consistent with the actual objects being I/O'd. Removing redundancy removes a class of errors.
 * Extensible: The C++ <iostream.h> mechanism allows new user-defined types to be I/O'd without breaking existing code. (Imagine the chaos if everyone was simultaneously adding new incompatible "%" fields to printf() and scanf()?!)
 * Subclassable: The C++ <iostream.h> mechanism is built from real classes such as ostream and istream. Unlike <stdio.h>'s FILE*, these are real classes and hence subclassable. This means you can have other user-defined things that look and act like streams, yet that do whatever strange and wonderful things you want. You automatically get to use the zillions of lines of I/O code written by users you don't even know, and they don't need to know about your "extended stream" class.

==============================================================================

[15.2] Why does my program go into an infinite loop when someone enters an invalid input character? [NEW!]

[Recently created (on 1/97).]

For example, suppose you have the following code that reads integers from cin:

    #include <iostream.h>

    int main()
    {
        cout << "Enter numbers separated by whitespace (use -1 to quit): ";
        int i = 0;
        while (i != -1) {
            cin >> i;    // BAD FORM -- See comments below
            cout << "You entered " << i << '\n';
        }
    }

The problem with this code is that it lacks any checking to see if someone entered an invalid input character.
In particular, if someone enters something that doesn't look like an integer (such as an 'x'), the stream cin goes into a "failed state," and all subsequent input attempts return immediately without doing anything. In other words, the program enters an infinite loop; if 42 was the last number that was successfully read, the program will print the message You entered 42 over and over.

An easy way to check for invalid input is to move the input request from the body of the while loop into the control-expression of the while loop. E.g.,

    #include <iostream.h>

    int main()
    {
        cout << "Enter a number, or -1 to quit: ";
        int i = 0;
        while (cin >> i) {    // GOOD FORM
            if (i == -1) break;
            cout << "You entered " << i << '\n';
        }
    }

This will cause the while loop to exit either when you hit end-of-file, or when you enter a bad integer, or when you enter -1.

(Naturally you can eliminate the break by changing the while loop expression from while (cin >> i) to while ((cin >> i) && (i != -1)), but that's not really the point of this FAQ since this FAQ has to do with iostreams rather than generic structured programming guidelines.)

==============================================================================

[15.3] How does that funky while (cin >> foo) syntax work? [NEW!]

[Recently created (on 1/97).]

See the previous FAQ[15.2] for an example of the "funky while (cin >> foo) syntax."

The expression (cin >> foo) calls the appropriate operator>> (for example, it calls the operator>> that takes an istream on the left and, if foo is of type int, an int& on the right). The istream operator>> functions return their left argument by convention, which in this case means it will return cin. Next the compiler notices that the returned istream is in a boolean context, so it calls the "cast" operator istream::operator bool(). I.e., in this case, it calls cin.operator bool(), just as if you had cast it explicitly such as (bool)cin or bool(cin).
(Note: if your compiler doesn't yet support the bool type, istream::operator void*() will be called instead.)

The operator bool() cast operator returns true if the stream is in a good state, or false if it's in a failed state (in the void* case, the return values will be some non-NULL pointer or the NULL pointer, respectively). For example, if you read one too many times (e.g., if you're already at end-of-file), or if the actual info on the input stream isn't valid for the type of foo (e.g., if foo is an int and the data is an 'x' character), the stream will go into a failed state and the cast operator will return false.

The reason operator>> doesn't simply return a bool indicating whether it succeeded or failed is to support the "cascading" syntax:

    cin >> foo >> bar;

The operator>> is left-associative, which means the above is parsed as:

    (cin >> foo) >> bar;

In other words, if we replace operator>> with a normal function name such as readFrom(), this becomes the expression:

    readFrom( readFrom(cin, foo), bar);

As always, we begin evaluating at the innermost expression. Because of the left-associativity of operator>>, this happens to be the left-most expression, cin >> foo. This expression returns cin (more precisely, it returns a reference to its left-hand argument) to the next expression. The next expression also returns (a reference to) cin, but this second reference is ignored since it's the outermost expression in this "expression statement."

==============================================================================

[15.4] Why does my input seem to process past the end of file?

Because the eof state is not set until after a read is attempted past the end of file. That is, reading the last byte from a file does not set the eof state.

For example, the following code has an off-by-one error with the count i:

    int i = 0;
    while (! cin.eof()) {   // WRONG!
        cin >> x;
        ++i;
        // Work with x ...
    }

What you really need is:

    int i = 0;
    while (cin >> x) {      // RIGHT!
        ++i;
        // Work with x ...
    }

==============================================================================

[15.5] Why is my program ignoring my input request after the first iteration?

Because the numerical extractor leaves non-digits behind in the input buffer.

If your code looks like this:

    char name[1000];
    int age;

    for (;;) {
        cout << "Name: ";
        cin >> name;
        cout << "Age: ";
        cin >> age;
    }

then anything non-numeric typed after the age (for example, the x in 25x) stays in the input buffer and is consumed by the next cin >> name without waiting for the user. What you really want is to discard the rest of the line after reading the age (INT_MAX comes from <limits.h>):

    for (;;) {
        cout << "Name: ";
        cin >> name;
        cout << "Age: ";
        cin >> age;
        cin.ignore(INT_MAX, '\n');
    }

==============================================================================

[15.6] How can I provide printing for my class Fred? [UPDATED!]

[Recently added note about cascading operator<< calls (on 1/97).]

Use operator overloading[13] to provide a friend[14] left-shift operator, operator<<.

    #include <iostream.h>

    class Fred {
    public:
        friend ostream& operator<< (ostream& o, const Fred& fred);
        // ...
    private:
        int i_;    // Just for illustration
    };

    ostream& operator<< (ostream& o, const Fred& fred)
    {
        return o << fred.i_;
    }

    int main()
    {
        Fred f;
        cout << "My Fred object: " << f << "\n";
    }

We use a friend[14] rather than a member since the Fred parameter is second rather than first.

Note that operator<< returns the stream. This is so the output operations can be cascaded[15.3].

==============================================================================

[15.7] How can I provide input for my class Fred? [UPDATED!]

[Recently added note about cascading operator<< calls (on 1/97).]

Use operator overloading[13] to provide a friend[14] right-shift operator, operator>>. This is similar to the output operator[15.6], except the parameter doesn't have a const[18]: "Fred&" rather than "const Fred&".

    #include <iostream.h>

    class Fred {
    public:
        friend istream& operator>> (istream& i, Fred& fred);
        // ...
    private:
        int i_;    // Just for illustration
    };

    istream& operator>> (istream& i, Fred& fred)
    {
        return i >> fred.i_;
    }

    int main()
    {
        Fred f;
        cout << "Enter a Fred object: ";
        cin >> f;
        // ...
    }

Note that operator>> returns the stream. This is so the input operations can be cascaded and/or used in a while loop or if statement[15.3].

==============================================================================

[15.8] How can I provide printing for an entire hierarchy of classes?

Provide a friend[14] operator<<[15.6] that calls a protected virtual[20] function:

    class Base {
    public:
        friend ostream& operator<< (ostream& o, const Base& b);
        // ...
    protected:
        virtual void print(ostream& o) const;
    };

    inline ostream& operator<< (ostream& o, const Base& b)
    {
        b.print(o);
        return o;
    }

    class Derived : public Base {
    protected:
        virtual void print(ostream& o) const;
    };

The end result is that operator<< acts as if it was dynamically bound, even though it's a friend[14] function. This is called the Virtual Friend Function Idiom.

Note that derived classes override print(ostream&) const. In particular, they do not provide their own operator<<.

Naturally if Base is an ABC[22.3], Base::print(ostream&) const can be declared pure virtual[22.4] using the "= 0" syntax.

==============================================================================

[15.9] How can I "reopen" cin and cout in binary mode under DOS and/or OS/2?

This is implementation dependent. Check with your compiler's documentation.

For example, suppose you want to do binary I/O using cin and cout. Suppose further that your operating system (such as DOS or OS/2) insists on translating "\r\n" into "\n" on input from cin, and "\n" to "\r\n" on output to cout or cerr.

Unfortunately there is no standard way to cause cin, cout, and/or cerr to be opened in binary mode. Closing the streams and attempting to reopen them in binary mode might have unexpected or undesirable results.

On systems where it makes a difference, the implementation might provide a way to make them binary streams, but you would have to check the manuals to find out.
==============================================================================

[15.10] Why can't I open a file in a different directory such as "..\test.dat"?

Because "\t" is a tab character.

You should use forward slashes in your filenames, even on an operating system that uses backslashes such as DOS, Windows, OS/2, etc. For example:

    #include <iostream.h>
    #include <fstream.h>

    int main()
    {
    #if 1
        ifstream file("../test.dat");     // RIGHT!
    #else
        ifstream file("..\test.dat");     // WRONG!
    #endif

        // ...
    }

Remember, the backslash ("\") is used in string literals to create special characters: "\n" is a newline, "\b" is a backspace, "\t" is a tab, "\a" is an "alert", "\v" is a vertical-tab, etc. Therefore the file name "\version\next\alpha\beta\test.dat" is interpreted as a bunch of very funny characters; use "/version/next/alpha/beta/test.dat" instead, even on systems that use a "\" as the directory separator such as DOS, Windows, OS/2, etc.

==============================================================================
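
The effect described in [15.10] can be checked directly by inspecting the string literals, without opening any file. In the sketch below, the variable names and the helper containsTab() are hypothetical, introduced only for this illustration. The doubled-backslash spelling is shown for comparison; the FAQ's own advice is still to use forward slashes.

```cpp
#include <cstring>

// "\t" inside a literal is a single tab character, so this name contains
// no backslash and no letter 't' right after the dots.
const char* const wrongName   = "..\test.dat";

// The form recommended in the text above.
const char* const forwardName = "../test.dat";

// A literal backslash has to be written as "\\".
const char* const escapedName = "..\\test.dat";

// Reports whether the string contains a tab character
bool containsTab(const char* s)
{
    return std::strchr(s, '\t') != NULL;
}
```

Here wrongName is one character shorter than the other two names, because the compiler has already folded the backslash and the t into one tab before the program ever runs.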