References
The Concept Behind Reference
In modern programming languages there are two important concepts defining
the behaviour of variables:
Scope: Defines the region of the program source where a certain identifier
is bound to a memory location. It also defines visibility: where the
identifier is valid to use and denotes the memory location mentioned before.
Lifetime: Defines the time span at run time during which the memory location
is safe to store our values. After that time span the memory location is not
safe to access.
In most cases we define scope and lifetime in the same declaration:
int i;
Sometimes we can define a memory location without binding a name to it:
new int;
And sometimes we can bind a new name to an existing memory location, whether
or not a name has already been bound to it:
int &j = i;
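The declarations above can be combined into a small sketch. The function names same_location and write_through_alias are hypothetical and exist only for this illustration; the point is that a reference introduces a new name, not new storage.

```cpp
#include <cassert>

// Hypothetical helper: true if both names denote the same memory location.
bool same_location(const int &a, const int &b)
{
    return &a == &b;
}

// Hypothetical demo: binds a second name to i and writes through it.
int write_through_alias()
{
    int i = 1;     // one declaration gives i both scope and lifetime
    int &j = i;    // j gets scope only: it is another name for i's storage
    j = 7;         // modifies i
    assert(same_location(i, j));
    return i;
}
```

Calling write_through_alias() yields 7, because the write through j lands in i.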
Reference in earlier programming languages
References have a history going back to Simula-67 and Algol-68.
Algol-68 distinguishes between binding a name to a certain memory
location and binding it to a variable. A reference is converted to a value
of the referred type when needed; this is called dereferencing the variable.
This is the way to distinguish between the left and right side of an
assignment.
int i = 5;
ref int j := i;
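A key difference between pointers and references: the null pointer
A pointer can be null but a reference cannot, so a failed dynamic_cast is
reported differently in the two cases: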
Base *bp = ...;
if ( Derived *dp = dynamic_cast<Derived*>(bp) )
{
    // dp is non-null here: the cast succeeded
}
Base &br = ...;
try
{
    Derived &dr = dynamic_cast<Derived&>(br);
}
catch( const std::bad_cast& ) { ... }   // thrown when the cast fails
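A runnable version of the two casts might look like the following sketch. The class hierarchy (Base, Derived, Other) and the function names are assumptions made up for the illustration; only dynamic_cast and std::bad_cast come from the language and the standard library.

```cpp
#include <cassert>
#include <typeinfo>   // std::bad_cast

// Hypothetical polymorphic hierarchy, used only for illustration.
struct Base    { virtual ~Base() {} };
struct Derived : Base {};
struct Other   : Base {};

// Pointer form: a failed cast is reported by a null pointer.
bool is_derived_ptr(Base *bp)
{
    return dynamic_cast<Derived*>(bp) != nullptr;
}

// Reference form: there is no null reference, so a failed cast throws.
bool is_derived_ref(Base &br)
{
    try
    {
        Derived &dr = dynamic_cast<Derived&>(br);
        (void)dr;   // silence the unused-variable warning
        return true;
    }
    catch (const std::bad_cast &)
    {
        return false;
    }
}
```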
Scope and Life Rules
A normal variable has both scope and lifetime; a reference has only scope.
It is valid only as long as the object it refers to is alive.
void f()
{
    int i;
    int &ir = i;
    ir = 5;
}
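And this can lead to problems when the referred object dies before the
reference goes out of scope:
void f()
{
    int *ip = new int;
    int &ir = *ip;
    delete ip;    // the int no longer exists, but ir is still in scope
    ir = 5;       // undefined behaviour: ir is a dangling reference
}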
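One way to stay safe is to finish every use of the reference before the referred object is destroyed. The following is a minimal sketch of that ordering; the function name use_before_delete is hypothetical.

```cpp
#include <cassert>

// Hypothetical demo: every use of ir happens while *ip is still alive.
int use_before_delete()
{
    int *ip = new int(0);
    int &ir = *ip;
    ir = 5;             // fine: the storage is still alive
    int copy = ir;      // copy the value out while the reference is valid
    assert(copy == 5);
    delete ip;          // from here on ir dangles and is not touched again
    return copy;
}
```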
Parameter passing
Historically there are languages with parameter passing by address
(FORTRAN, PASCAL with the var keyword), and by value (PASCAL without var, C).
The former just uses the same memory location for the formal and the actual
parameter, but the latter copies the value of the actual parameter into a
local variable in the area of the subprogram.
In C++, parameter passing is derived from initialization semantics.
There is an important difference between initialization and assignment:
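int i = 3;
int j = i;    // initialization: j is created with the value of i
j = i;        // assignment: the already existing j is overwritten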
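For class types the difference is observable, because initialization calls a constructor while assignment calls operator=. The Tracer type below is a hypothetical helper that merely counts which of the two ran; it is a sketch, not part of any library.

```cpp
#include <cassert>

// Hypothetical tracer: counts copy constructions and copy assignments.
struct Tracer
{
    static int copies;        // number of copy constructor calls
    static int assignments;   // number of copy assignment calls
    Tracer() {}
    Tracer(const Tracer &) { ++copies; }
    Tracer &operator=(const Tracer &) { ++assignments; return *this; }
};
int Tracer::copies = 0;
int Tracer::assignments = 0;

void init_then_assign()
{
    Tracer a;
    Tracer b = a;   // initialization: copy constructor, despite the '='
    b = a;          // assignment: operator=
}
```

After one call to init_then_assign() each counter has been incremented exactly once.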
Parameter passing follows initialization semantics:
int i = 5;
int j = 6;
void f1( int x, int y)
{
}
void f2( int &x, int &y)
{
}
f1(i,j);   ==>   int x = i;
                 int y = j;
f2(i,j);   ==>   int &x = i;
                 int &y = j;
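The effect of the two initialization forms can be observed from the caller's side. In this sketch the function names are hypothetical; by_value corresponds to f1 and by_reference to f2.

```cpp
#include <cassert>

void by_value(int x)      { x = 100; }   // like: int x = arg;   changes a copy
void by_reference(int &x) { x = 100; }   // like: int &x = arg;  changes arg itself

int after_by_value()
{
    int i = 5;
    by_value(i);
    return i;       // the caller's variable is untouched
}

int after_by_reference()
{
    int i = 5;
    by_reference(i);
    return i;       // the caller's variable was modified
}
```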
Reference as Return type
Normally when a C/C++ function returns by value, the returned object
is copied from the function to the target.
int f1()
{
    int i = 0;
    return i;     // the caller receives a copy of i
}
It is also possible to define a function with a reference return type.
This binds the function call expression to the returned object itself.
int& f2()
{
    static int i = 0;   // static: i outlives the call, so the reference stays valid
    return i;
}
The usage is different. If the returned object will not survive the
function call expression, then we must copy it. Otherwise, returning
a reference is allowed.
int j = f1();
int& k = f2();
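The difference becomes visible when the caller modifies what it received. This sketch assumes the returned object is a global variable (shared_value, a hypothetical name), so that it certainly survives the call.

```cpp
#include <cassert>

// Hypothetical global: its lifetime spans the whole program run.
int shared_value = 0;

int  get_copy() { return shared_value; }   // caller receives a copy
int &get_ref()  { return shared_value; }   // caller receives shared_value itself

int demo_return()
{
    shared_value = 0;
    int j = get_copy();
    j = 10;                 // changes only the local copy
    assert(shared_value == 0);
    int &k = get_ref();
    k = 20;                 // changes shared_value
    get_ref() += 1;         // the call expression itself is an lvalue
    return shared_value;
}
```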
Typical usage of swap:
void swap( int &x, int &y)
{
    int tmp = x;
    x = y;
    y = tmp;
}
int i = 5;
int j = 6;
swap( i, j);
assert(i==6 && j==5);
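But a non-const reference can be bound only to an lvalue:
swap(i,7);             // error, because it would mean:
int &y = 7;            // error: 7 is not an lvalue
swap(i,3.14);          // error, because it would mean:
int &y = int(3.14);    // error: the converted value is a temporary
const int &z1 = 7;     // ok: a const reference can bind to a temporary
const int &z2 = 3.14;  // ok: bound to a temporary int with value 3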
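That a const reference initialized from a value of another type refers to a temporary, not to the original object, can be checked directly. The following sketch uses hypothetical names (read_only, demo_binding) and a double variable instead of a literal, so that the original can be changed afterwards.

```cpp
#include <cassert>

// Accepts lvalues and temporaries alike, because the parameter is const.
int read_only(const int &x) { return x; }

int demo_binding()
{
    double d = 3.14;
    const int &z1 = 7;    // bound to a temporary int holding 7
    const int &z2 = d;    // bound to a temporary int holding 3, not to d
    d = 9.99;             // does not affect z2
    assert(z2 == 3);
    return z1 * 10 + z2;
}
```

The temporaries live as long as the references z1 and z2, so both reads are safe.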
Usage examples:
class date
{
public:
    date& setYear(int y) { _year = y; return *this; }
    date& setMonth(int m) { _month = m; return *this; }
    date& setDay(int d) { _day = d; return *this; }
    // prefix: returns the modified object itself
    date& operator++() { ++_day; return *this; }
    // postfix: returns the old value, which is a local copy, so by value
    date operator++(int) { date curr(*this); ++_day; return curr; }
private:
    int _year;
    int _month;
    int _day;
};
date d;
++d.setYear(2011).setMonth(11).setDay(11);
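Chaining works on the original object only because each member function returns a reference. The sketch below contrasts this with a variant returning by value; both counter types are hypothetical and serve only to make the difference measurable.

```cpp
#include <cassert>

// Hypothetical counter whose modifiers return a reference to the object.
struct ref_counter
{
    int v;
    ref_counter() : v(0) {}
    ref_counter &set(int x) { v = x; return *this; }
    ref_counter &inc()      { ++v;   return *this; }
};

// Same interface, but the modifiers return a copy of the object.
struct val_counter
{
    int v;
    val_counter() : v(0) {}
    val_counter set(int x) { v = x; return *this; }
    val_counter inc()      { ++v;   return *this; }
};

int chained_by_reference()
{
    ref_counter c;
    c.set(1).inc();     // inc() is applied to c itself
    return c.v;
}

int chained_by_value()
{
    val_counter c;
    c.set(1).inc();     // inc() is applied to a temporary copy of c
    return c.v;
}
```

With reference returns the chain leaves 2 in the counter; with value returns the increment is lost and the counter stays at 1.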
Optimization
One typical use of references is returning from a function by
reference rather than by value. Returning a reference is often more
efficient than copying. In some cases, however, copying is a must.
template <typename T>
class matrix
{
public:
    T& operator()(int i, int j) { return v[i*cols+j]; }
    const T& operator()(int i, int j) const { return v[i*cols+j]; }
    matrix& operator+=( const matrix& other)
    {
        for (int i = 0; i < cols*rows; ++i)
            v[i] += other.v[i];
        return *this;
    }
private:
    int rows;
    int cols;
    T* v;
};
template <typename T>
matrix<T> operator+( const matrix<T>& left, const matrix<T>& right)
{
    matrix<T> result(left);
    result += right;
    return result;
}
Let us examine the return types one by one:
T& operator()(int i, int j) { return v[i*cols+j]; }
This function returns a reference to the selected value, allowing clients
to modify the appropriate element of the matrix:
matrix<double> dm(10,20);
dm(2,3) = 3.14;
cout << dm(2,3);
dm(2,3) += 1.1;
double& dr = dm(2,3);
dr += 1.1;
A second version of operator() is provided to read constant matrices.
const T& operator()(int i, int j) const { return v[i*cols+j]; }
This const member function must return a const reference, otherwise
const-correctness would leak:
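const matrix<double> cm = dm;
cm(2,3) = 3.14;               // error: assignment through a const reference
cout << cm(2,3);              // ok: read only
cm(2,3) += 1.1;               // error: modification through a const reference
double& dr = cm(2,3);         // error: would drop the const qualifier
const double& cdr = cm(2,3);  // ok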
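Overload selection between the two versions can be tried on a much smaller class. The grid type below is a hypothetical fixed-size 2x2 stand-in for the matrix template, kept short so that it is complete and compilable.

```cpp
#include <cassert>

// Hypothetical fixed-size 2x2 grid with the same pair of operator() overloads.
class grid
{
public:
    grid() : v() {}   // value-initializes all four elements to 0.0
    double &operator()(int i, int j) { return v[i * 2 + j]; }
    const double &operator()(int i, int j) const { return v[i * 2 + j]; }
private:
    double v[4];
};

// g is const here, so the const overload is selected: reading only.
double read(const grid &g) { return g(1, 0); }

double demo_grid()
{
    grid g;
    g(1, 0) = 1.5;     // non-const overload: returns double&
    g(1, 0) += 2.0;
    assert(g(0, 0) == 0.0);
    return read(g);
}
```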
Most assignment operators are defined with a (non-const) reference type as
return value:
matrix& operator=(const matrix& other)
{
    if ( &other == this ) return *this;   // self-assignment: nothing to do
    delete [] v;
    cols = other.cols;
    rows = other.rows;
    v = new T[cols*rows];
    for (int i = 0; i < cols*rows; ++i)
        v[i] = other.v[i];
    return *this;
}
matrix& operator+=( const matrix& other)
{
    for (int i = 0; i < cols*rows; ++i)
        v[i] += other.v[i];
    return *this;
}
This is more efficient than returning by value, and it also avoids
unnecessary restrictions:
dm1 = dm2 = dm3;
++( dm1 += dm2 );
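But there is another solution, taking the parameter by value and swapping:
matrix& operator=(matrix other)
{
    // other is already a copy of the argument, so no self-assignment check is needed
    cols = other.cols;
    rows = other.rows;
    T* t = v;
    v = other.v;
    other.v = t;    // the old array now belongs to other and is released by its destructor
    return *this;
}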
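The same idea can be shown on a complete class. The buffer type below is a hypothetical, minimal example that assumes the usual copy constructor and destructor are present; it uses std::swap instead of the manual three-step exchange.

```cpp
#include <algorithm>  // std::copy
#include <cassert>
#include <utility>    // std::swap

// Hypothetical owning buffer, used only to illustrate assignment by copy and swap.
class buffer
{
public:
    explicit buffer(int n) : size(n), v(new int[n]()) {}
    buffer(const buffer &other) : size(other.size), v(new int[other.size])
    {
        std::copy(other.v, other.v + size, v);
    }
    ~buffer() { delete [] v; }

    // The parameter is taken by value: the copy is made before the body runs.
    buffer &operator=(buffer other)
    {
        std::swap(size, other.size);
        std::swap(v, other.v);
        return *this;   // other's destructor releases the old array
    }

    int &operator[](int i) { return v[i]; }
    int length() const { return size; }
private:
    int size;
    int *v;
};

int demo_assign()
{
    buffer a(2), b(3);
    b[2] = 42;
    a = b;                  // b is copied into the parameter, then swapped in
    buffer &alias = a;
    a = alias;              // self-assignment is safe: the copy is made first
    assert(b[2] == 42);     // the source is unchanged
    return a.length() * 100 + a[2];
}
```

A side effect of this design is that if the copy fails with an exception, the left-hand object has not been modified yet.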
In the previous examples returning a reference (or constant reference) was
safe, because the referred memory area survived the operation.
In the next example, the addition, returning a reference (or constant
reference) would be a major fault:
template <typename T>
matrix<T> operator+( const matrix<T>& left, const matrix<T>& right)
{
    matrix<T> result(left);   // local object: destroyed when the function returns
    result += right;
    return result;            // so it must be returned by value
}
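The same split between operator+= (returning a reference) and operator+ (returning a value) can be tried on a small type. vec2 is a hypothetical two-component vector introduced only for this sketch.

```cpp
#include <cassert>

// Hypothetical two-component vector.
struct vec2
{
    int x;
    int y;
    // Modifies the left operand, which outlives the call: reference return is safe.
    vec2 &operator+=(const vec2 &other)
    {
        x += other.x;
        y += other.y;
        return *this;
    }
};

// Creates a new local object: it must be returned by value.
vec2 operator+(const vec2 &left, const vec2 &right)
{
    vec2 result(left);
    result += right;
    return result;
}

int demo_sum()
{
    vec2 a = {1, 2};
    vec2 b = {3, 4};
    vec2 c = a + b;
    assert(a.x == 1 && a.y == 2);   // the operands are unchanged
    return c.x * 10 + c.y;
}
```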