Move semantics and rvalue references
====================================



Based mostly on Thomas Becker's article
http://thbecker.net/articles/rvalue_references/section_01.html ...


What is an rvalue reference and why do we need it?


In many earlier languages, assignment works in the following way:

  <variable> := <expression>

like

  x := y + 5;


In C/C++, however, an expression may appear on the left-hand side of =:

  <expression> = <expression>

like

  *++p = *++q;  // likely ok if p is a pointer, unless it points to const


But not every kind of expression may appear on the left-hand side:

  y + 5 = x;  // likely an error, unless there is some funny operator+


The term "lvalue" (left value) originates from the C language: an lvalue is
an expression that may appear on the left-hand side of an assignment.
An "rvalue" (right value) is anything else.

Basically, an lvalue identified a writable memory location in C.
In C++ this is a bit more complex, but here is a widely accepted
definition:


An "lvalue" is an expression that refers to a memory location and allows
us to take the address of that memory location via the & operator.
An "rvalue" is an expression that is not an "lvalue".



These are lvalues

int  i = 42;
int &j = i;
int *p = &i;

 i = 99;
 j = 88;
*p = 77;

int *fp() { return &i; } // returns pointer to i: lvalue
int &fr() { return i; }  // returns reference to i: lvalue

*fp() = 66;  // i = 66
 fr() = 55;  // i = 55 


These are rvalues

int f() { int k = i; return k; } // returns rvalue

  i = f();  // ok
  p = &f(); // bad: can't take address of rvalue
f() = i;    // bad: can't use rvalue on lefthand side

A rigorous definition of lvalue and rvalue:
http://accu.org/index.php/journals/227
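
The classification above can also be checked at compile time. The sketch
below relies on the rule that decltype((e)) is an lvalue reference type
exactly when the expression e is an lvalue. The names i, p, fr and f mirror
the examples above; the macro IS_LVALUE is a helper invented here for
illustration only.

```cpp
#include <type_traits>

int  i = 42;
int *p = &i;

int &fr() { return i; }                  // returns a reference to i: lvalue
int  f()  { int k = i; return k; }       // returns by value: rvalue

// Hypothetical helper: decltype((e)) is an lvalue reference type
// exactly when the expression e is an lvalue.
#define IS_LVALUE(e) (std::is_lvalue_reference<decltype((e))>::value)

static_assert( IS_LVALUE(i),     "a named variable is an lvalue");
static_assert( IS_LVALUE(*p),    "a dereferenced pointer is an lvalue");
static_assert( IS_LVALUE(fr()),  "a call returning int& is an lvalue");
static_assert(!IS_LVALUE(f()),   "a call returning int is an rvalue");
static_assert(!IS_LVALUE(i + 5), "an arithmetic result is an rvalue");
```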






Performance problems
====================



C and C++ have value semantics: when we use assignment, we copy by default.


Array a, b, c, d, e;

a = b + c + d + e;


This will generate code equivalent to the following pseudocode:


// pseudocode for a = b + c + d + e

double* _t1 = new double[N];

for ( int i = 0; i < N; ++i)
    _t1[i] = b[i] + c[i];

double* _t2 = new double[N];

for ( int i = 0; i < N; ++i)
    _t2[i] = _t1[i] + d[i];

double* _t3 = new double[N];

for ( int i = 0; i < N; ++i)
    _t3[i] = _t2[i] + e[i];

// possible delete[] and new[] for a.operator=()

for ( int i = 0; i < N; ++i)
    a[i] = _t3[i];

delete [] _t3;
delete [] _t2;
delete [] _t1;



At the same time, in (FORTRAN-like) C we can write the following:


for ( int i = 0; i < N; ++i)
{
    a[i] = b[i] + c[i] + d[i] + e[i];
}



This has been investigated by Todd Veldhuizen and has led to C++ template
metaprogramming and expression templates.

- For small arrays, new and delete result in poor performance: 1/10 of C.
- For medium arrays, the overhead of the extra loops and memory accesses adds +50%.
- For large arrays, the cost of the temporaries is the limiting factor.
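
To make the number of temporaries visible, here is a minimal sketch. The
Array type, the counter temporaries and the function add4 are assumptions
made up for this illustration; std::vector stands in for the new[]/delete[]
buffer of the pseudocode above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the Array class above; std::vector plays
// the role of the new[]/delete[] buffer.
struct Array {
    std::vector<double> data;
    explicit Array(std::size_t n, double v = 0.0) : data(n, v) {}
};

// Counts the result arrays that operator+ has to create.
int temporaries = 0;

Array operator+(const Array& lhs, const Array& rhs) {
    ++temporaries;                       // one fresh array per '+'
    Array result(lhs.data.size());
    for (std::size_t i = 0; i < result.data.size(); ++i)
        result.data[i] = lhs.data[i] + rhs.data[i];
    return result;
}

// The FORTRAN-like alternative: one loop, no intermediate arrays.
void add4(Array& a, const Array& b, const Array& c,
          const Array& d, const Array& e) {
    for (std::size_t i = 0; i < a.data.size(); ++i)
        a.data[i] = b.data[i] + c.data[i] + d.data[i] + e.data[i];
}
```

With these definitions, a = b + c + d + e increments temporaries three
times, once per '+', while add4(a, b, c, d, e) leaves it unchanged.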



It would be nice not to create the temporaries, but to steal the resources of
the arguments of operator+().

But!
We can destroy only the temporaries: we must keep the original resources
of the b, c, d, e variables. How can we distinguish between variables and
unnamed temporaries? --> overloading

Overloading --> We need a separate type!
What requirements do we have for this type?

1. should be a reference type - otherwise we gain nothing

2. if there is an overload between ordinary reference and this new type
   then rvalues should prefer the new type and lvalues the ordinary reference





Rvalue reference
================



X &&  (X is not a template parameter)



Universal reference: T&& where T is a deduced template parameter:

template <class T>
T &&                 (discussed later)



The old kind of references are now called lvalue references:

X &



void f(X&  arg_);  // lvalue reference parameter
void f(X&& arg_);  // rvalue reference parameter


X x;
X g();

f(x);   // lvalue argument --> f(X&)      
f(g()); // rvalue argument --> f(X&&)
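
The overload selection can be observed directly. In this small sketch each
overload returns the name of the overload that was chosen; the empty struct
X and the string return values are assumptions made for the illustration.

```cpp
#include <string>

struct X {};

X g() { return X(); }                    // returns by value: g() is an rvalue

// Each overload reports which one was selected.
std::string f(X&)  { return "f(X&)";  }  // lvalue reference parameter
std::string f(X&&) { return "f(X&&)"; }  // rvalue reference parameter
```

For a variable X x, the call f(x) returns "f(X&)", while f(g()) and f(X())
return "f(X&&)".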



We can overload the copy constructor and the assignment operator:


class X
{
public:
  X(const X& rhs);
  X(X&& rhs);

  X& operator=(const X& rhs);
  X& operator=(X&& rhs);
private:
  // ...  

};


X& X::operator=(const X& rhs)
{
  // free old resources, then copy resources from rhs
  return *this;
}

X& X::operator=(X&& rhs)  // draft version, will be revised later
{
  // free old resources, then move resources from rhs
  // leave rhs in a destructible state
  return *this;
}
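
To make the comments above concrete, here is one possible sketch for a class
that owns a heap buffer. The class Buffer and its members are hypothetical;
this is a draft in the same spirit as the operator= above, not a definitive
implementation.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical resource-owning class, for illustration only.
class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new double[n]()) {}
    ~Buffer() { delete[] data_; }

    // Copy: allocate a new buffer and duplicate the elements.
    Buffer(const Buffer& rhs) : size_(rhs.size_), data_(new double[rhs.size_]) {
        std::copy(rhs.data_, rhs.data_ + size_, data_);
    }
    // Move: steal the buffer, leave rhs empty but destructible.
    Buffer(Buffer&& rhs) : size_(rhs.size_), data_(rhs.data_) {
        rhs.size_ = 0;
        rhs.data_ = nullptr;
    }
    Buffer& operator=(const Buffer& rhs) {
        if (this != &rhs) {
            double* fresh = new double[rhs.size_];
            std::copy(rhs.data_, rhs.data_ + rhs.size_, fresh);
            delete[] data_;              // free old resources
            data_ = fresh;
            size_ = rhs.size_;
        }
        return *this;
    }
    Buffer& operator=(Buffer&& rhs) {
        if (this != &rhs) {
            delete[] data_;              // free old resources
            data_ = rhs.data_;           // take over the buffer of rhs
            size_ = rhs.size_;
            rhs.data_ = nullptr;         // rhs stays destructible
            rhs.size_ = 0;
        }
        return *this;
    }

    std::size_t size() const { return size_; }
    double*     data()       { return data_; }

private:
    std::size_t size_;
    double*     data_;
};
```

An rvalue argument, such as the result of a function returning Buffer by
value, selects the move overloads; a named Buffer variable selects the copy
overloads.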



Backward compatibility


If we implement the old-style member functions with lvalue reference parameters
but do not implement the rvalue reference overloads, we keep the
old behaviour -> we can gradually migrate to move semantics.


However, if we implement "only" the rvalue operations, then we cannot call
them on lvalues -> no default lvalue copy constructor or operator= will
be generated.
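
Both cases can be checked with the standard type traits. The classes
CopyOnly and MoveOnly below are hypothetical examples written for this
sketch.

```cpp
#include <type_traits>

// C++98-style class: only the copy operations are declared.
struct CopyOnly {
    CopyOnly() {}
    CopyOnly(const CopyOnly&) {}
    CopyOnly& operator=(const CopyOnly&) { return *this; }
};

// Only the move operations are declared.
struct MoveOnly {
    MoveOnly() {}
    MoveOnly(MoveOnly&&) {}
    MoveOnly& operator=(MoveOnly&&) { return *this; }
};

// Rvalues still work with CopyOnly: const CopyOnly& binds to them.
static_assert(std::is_constructible<CopyOnly, CopyOnly&&>::value,
              "an rvalue can be copied");
static_assert(std::is_assignable<CopyOnly&, CopyOnly&&>::value,
              "an rvalue can be copy-assigned");

// Lvalues do not work with MoveOnly: no usable copy operations exist.
static_assert(!std::is_copy_constructible<MoveOnly>::value,
              "no copy constructor");
static_assert(!std::is_copy_assignable<MoveOnly>::value,
              "no copy assignment");
static_assert(std::is_move_constructible<MoveOnly>::value,
              "move construction is available");
```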





Special member function generation
==================================



Move operations are generated only if they are needed.
If they are generated, they perform memberwise moves.

The move constructor moves the base class parts as well as the non-static members.


Move operations are "move requests": if we want to move something
that does not support move semantics (like C++98 classes), then it will
silently "move by copy".



The exact rules for generating move operations
are a bit different from those for copy operations.




1. The two copy operations (copy constructor and copy assignment)
   are independent. Declaring the copy constructor does not prevent
   the compiler from generating the copy assignment (and vice versa).
   (same as in C++98)


2. Move operations are not independent. Declaring either one prevents
   the compiler from generating the other.


3. If any of the copy operations is declared, then none of the move
   operations will be generated.


4. If any of the move operations is declared, then none of the copy
   operations will be generated. This is the opposite of rule (3).


5. If a destructor is declared, then none of the move operations
   will be generated. Copy operations are still generated for
   backward compatibility with C++98.


6. The default constructor is generated only if no constructor is declared.
   (same as in C++98)




Move operations are generated only when all of these 3 are true:

- no copy operations are declared.
- no move operations are declared.
- no destructor is declared.

Extra: no copy operations are generated when any move operation is declared.


(It would be nice to restrict the automatic generation of copy operations too,
but backward compatibility prevents that.)
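
The effect of a user-declared destructor can be observed with a small
experiment. The types Tracer, Plain and WithDtor are hypothetical helpers
invented for this sketch; Tracer simply counts how it gets constructed.

```cpp
// Hypothetical member type that counts how it gets copied or moved.
struct Tracer {
    static int copies;
    static int moves;
    Tracer() {}
    Tracer(const Tracer&) { ++copies; }
    Tracer(Tracer&&)      { ++moves;  }
};
int Tracer::copies = 0;
int Tracer::moves  = 0;

// Nothing declared: the compiler generates a memberwise move constructor.
struct Plain {
    Tracer t;
};

// User-declared destructor: no move constructor is generated,
// so a "move request" silently falls back to the copy constructor.
struct WithDtor {
    Tracer t;
    ~WithDtor() {}
};
```

Constructing a Plain from an rvalue of type Plain increments Tracer::moves;
doing the same with WithDtor increments Tracer::copies instead.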




Member templates play no role here:




class X
{
public:
  template <typename T>
  X(const T& rhs);            // does not prevent generating the copy/move constructors

  template <typename T>
  X& operator=(const T& rhs); // does not prevent generating the copy/move assignments
private:
  // ...  

};
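
A small experiment illustrates this. The counter template_calls is an
assumption added for this sketch; it records every use of the member
templates.

```cpp
class X {
public:
    static int template_calls;

    // A constructor template is never a copy constructor.
    template <typename T>
    X(const T&) { ++template_calls; }

    // An assignment template is never a copy assignment operator.
    template <typename T>
    X& operator=(const T&) { ++template_calls; return *this; }
};
int X::template_calls = 0;
```

Constructing or assigning an X from an int goes through the templates and
increments the counter. Copying an X from another X does not: overload
resolution prefers the compiler-generated, non-template copy operations.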




If we want the compiler to generate copy or move operations, we can request them explicitly with = default:



class X
{
public:
  ~X(); // user-declared destructor -> suppresses generation of move operations

  X(const X& rhs) = default; // to generate copy constructor
  X(X&& rhs) = default;      // to generate move constructor

  X& operator=(const X& rhs) = default; // to generate copy assignment
  X& operator=(X&& rhs) = default;      // to generate move assignment 
private:
  // ...  

};
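
That the defaulted operations really move can be checked with a counting
member. This is only a sketch: the Tracer type, the member t_ and the
defaulted default constructor are assumptions added for the demonstration.

```cpp
// Hypothetical member type that counts how it gets copied or moved.
struct Tracer {
    static int copies;
    static int moves;
    Tracer() {}
    Tracer(const Tracer&)            { ++copies; }
    Tracer(Tracer&&)                 { ++moves;  }
    Tracer& operator=(const Tracer&) { ++copies; return *this; }
    Tracer& operator=(Tracer&&)      { ++moves;  return *this; }
};
int Tracer::copies = 0;
int Tracer::moves  = 0;

class X
{
public:
    X() = default;                        // added only to create objects
    ~X() {}                               // would suppress the move operations

    X(const X& rhs) = default;            // memberwise copy
    X(X&& rhs) = default;                 // memberwise move

    X& operator=(const X& rhs) = default; // memberwise copy assignment
    X& operator=(X&& rhs) = default;      // memberwise move assignment
private:
    Tracer t_;
};
```

Despite the user-declared destructor, constructing or assigning an X from an
rvalue now increments Tracer::moves, not Tracer::copies.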







std::move
=========



std::move()     (read it as "rvalue_cast")


With std::move() we can use move semantics on an lvalue.


// this will copy 
template<class T>
void swap(T& a, T& b)
{
  T tmp(a);
  a = b;
  b = tmp;
}

X a, b;
:
swap(a, b);


By default, this swap implementation uses copy semantics.
To enforce move semantics, we should use std::move().


// this will move
template<class T>
void swap(T& a, T& b)
{
  T tmp(std::move(a));
  a = std::move(b);
  b = std::move(tmp);
}

X a, b;
:
swap(a, b);
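
The difference between the two versions can be counted. In the sketch below
the two swaps get distinct names (swap_by_copy and swap_by_move) so that
they can coexist; these names and the Tracer type are assumptions made for
the illustration.

```cpp
#include <utility>

// Hypothetical type that counts how it gets copied or moved.
struct Tracer {
    static int copies;
    static int moves;
    int value;

    explicit Tracer(int v) : value(v) {}
    Tracer(const Tracer& rhs) : value(rhs.value) { ++copies; }
    Tracer(Tracer&& rhs)      : value(rhs.value) { ++moves;  }
    Tracer& operator=(const Tracer& rhs) { value = rhs.value; ++copies; return *this; }
    Tracer& operator=(Tracer&& rhs)      { value = rhs.value; ++moves;  return *this; }
};
int Tracer::copies = 0;
int Tracer::moves  = 0;

// One copy construction and two copy assignments.
template <class T>
void swap_by_copy(T& a, T& b)
{
    T tmp(a);
    a = b;
    b = tmp;
}

// One move construction and two move assignments.
template <class T>
void swap_by_move(T& a, T& b)
{
    T tmp(std::move(a));
    a = std::move(b);
    b = std::move(tmp);
}
```

Swapping two Tracer objects with swap_by_copy adds 3 to Tracer::copies and
nothing to Tracer::moves; swap_by_move does the opposite.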



std::move() converts its argument to an rvalue reference and does not do
anything else. In particular, std::move() does nothing at run time.

We should think of std::move as an "rvalue reference cast".
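
The cast nature of std::move() can be seen in a short sketch. The function
move_is_only_a_cast is a hypothetical helper written for this illustration:
it binds references to the result of the cast, but never constructs or
assigns anything, so the source string stays untouched.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// std::move applied to a std::string lvalue has the type std::string&&.
static_assert(std::is_same<decltype(std::move(std::declval<std::string&>())),
                           std::string&&>::value,
              "std::move yields an rvalue reference");

// Hypothetical helper: only casts, never constructs or assigns.
bool move_is_only_a_cast()
{
    std::string s = "hello";
    std::string&& r1 = std::move(s);                   // just a cast
    std::string&& r2 = static_cast<std::string&&>(s);  // same thing, spelled out
    // Both references still refer to s, and s is unchanged.
    return &r1 == &s && &r2 == &s && s == "hello";
}
```

The actual move happens only when such an rvalue is passed to a move
constructor or a move assignment operator.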