Lesson #5: Move Semantics

Move semantics provide a way for the contents of objects to be 'moved' between objects, rather than copied, thus significantly changing the way we design+code C++ by allowing things like return by value to be used a lot more often.

Move semantics solve a couple of common issues with old C++ …

Returning large objects from functions

Old C++

Before, if we wanted to return a large object which was too expensive to copy from a function, we had two alternatives:

Return by pointer

This required an extra memory allocation, and the caller to be responsible for memory management (deleting the returned pointer):

vector<int>* makeBigVector() {
   vector<int>* result = new vector<int>(1024); // heap allocation that the caller must delete
   for(int i=0; i<1024; i++) { 
      (*result)[i] = rand(); 
   }
   return result;
}
:::
vector<int>* v = makeBigVector();

Pass-out by reference

Instead of returning the object as the return value, another common way was to modify a parameter, thus requiring the caller to first have created a named object to pass:

void makeBigVector( vector<int>& out ) {
   out.resize(1024);
   for(int i=0; i<1024; i++) { 
      out[i] = rand(); 
   }
}
:::
vector<int> v;
makeBigVector(v);

C++11

With C++11 move semantics, we can simply return by value. As all of the STL collection classes have been extended to support move semantics, we know that for STL collections, as well as for all other classes for which move semantics are defined, returning by value will be efficient.

vector<int> makeBigVector() {
   vector<int> result(1024); // size the vector up front, so that result[i] is valid
   for(int i=0; i<1024; i++) { 
      result[i] = rand(); 
   }
   return result;
}
:::
auto v = makeBigVector(); // guaranteed not to copy the vector
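
If you want to convince yourself that no copy happens, one way is to count copies. The following is only a sketch: Counted is a hypothetical helper type (it is not part of the example above) that wraps a vector and increments a counter whenever its copy constructor runs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <utility>
#include <vector>

// Hypothetical helper type: counts how often it is copied.
struct Counted {
    static int copies;
    std::vector<int> data;
    explicit Counted(std::size_t n) : data(n) {}
    Counted(const Counted& other) : data(other.data) { ++copies; }
    Counted(Counted&& other) noexcept : data(std::move(other.data)) {}
};
int Counted::copies = 0;

Counted makeBigCounted() {
    Counted result(1024);
    for (std::size_t i = 0; i < result.data.size(); i++) {
        result.data[i] = std::rand();
    }
    return result; // moved, or elided entirely, but never copied
}

// Returns the number of copies made while fetching the result by value.
int copiesWhenReturningByValue() {
    Counted::copies = 0;
    Counted v = makeBigCounted();
    assert(v.data.size() == 1024);
    return Counted::copies;
}
```

Calling copiesWhenReturningByValue() should give 0, whether the compiler moves the object or elides the move altogether.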

Passing objects to functions efficiently

Consider the case where we have a vector of value objects. In old C++, adding objects to that vector always involved the object being copied, which could be costly. In C++11 this copy is generally avoided, for example:

vector<string> v;
string s { "hello world" };

v.push_back(string("hello world")); // move possible, as string is an rvalue (and 
                                    // std::string implements move semantics)
v.emplace_back("hello world"); // just illustrating how we do the same the less-verbose C++11 way
v.push_back(s);                // move can't be done, as parameter is an lvalue
v.push_back(std::move(s));     // move possible, as we're explicitly moving the data out of s, 
                               // leaving s in a valid but unspecified (typically empty) state
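
The difference between these calls can be made visible with a small counting type. This is a sketch under my own assumptions: Tracked and countPushBacks are hypothetical names, and the reserve() call is there only so that reallocation doesn't add extra moves to the totals.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper type that records how it was constructed.
struct Tracked {
    static int copies;
    static int moves;
    std::string text;
    explicit Tracked(std::string t) : text(std::move(t)) {}
    Tracked(const Tracked& o) : text(o.text) { ++copies; }
    Tracked(Tracked&& o) noexcept : text(std::move(o.text)) { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

struct Counts { int copies; int moves; };

Counts countPushBacks() {
    Tracked::copies = Tracked::moves = 0;
    std::vector<Tracked> v;
    v.reserve(3);                        // avoid reallocation, which would add moves
    Tracked s("hello world");

    v.push_back(Tracked("hello world")); // rvalue: one move
    v.push_back(s);                      // lvalue: one copy
    v.push_back(std::move(s));           // explicit std::move: one move
    assert(v.size() == 3);
    return { Tracked::copies, Tracked::moves };
}
```

With these three calls the function reports one copy and two moves.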

Understanding move semantics

rvalue, lvalue, and &&

To understand how move semantics work, you need to understand the concepts of rvalues and lvalues:

  • an lvalue is an expression whose address can be taken, a locator value. Anything you can make assignments to is an lvalue
  • an rvalue is an unnamed value that exists only during the evaluation of an expression, e.g. a literal, or the temporary returned by value from a function call
  • the && declarator is new in C++11 and declares an rvalue reference. It is like the reference declarator (&), but whereas a non-const & reference can only bind to lvalues, a && reference can only bind to rvalues.

To illustrate this, consider the following examples:

int a = 1; // here, a is an lvalue (and the '1' is an rvalue)
// as this function returns by value, the call expression getName() is an rvalue
string getName() { 
   string s = "Hello world"; return s; 
} 
:::

// as getName() returns an rvalue, assigning getName()'s result to an rvalue reference is possible
string&& name1 = getName(); 

// you can also assign getName()'s result to a value object, without any copying of string data being required
string name2 = getName();   

class A {
public:
   static A instance;

   // this is returning a reference to a static var, hence it's returning an lvalue
   static A& getInst() { return instance; } 

   // but this is not returning a reference to instance, but rather a temporary copy 
   // of instance, hence it's returning an rvalue
   static A getInstCopy() { return instance; } 
};
:::
A& inst1  = A::getInst(); // ok - we've fetched a reference to the static instance variable
A&& inst2 = A::getInst(); // COMPILE ERR: an rvalue reference can't bind to an lvalue

A::getInst() = A(); // proving getInst() returns an lvalue, we assign a new value to it 
                    // (probably getInst() should be returning const A& to prevent this)

printf("%p", (void*)&A::getInst()); // similarly, as it is an lvalue, we can get+print its address

A  inst3 = A::getInstCopy(); // ok - we've fetched a copy of the instance
A& inst5 = A::getInstCopy(); // COMPILE ERR: a non-const lvalue reference can't bind to a temporary (an rvalue)

// ok - we've assigned an rvalue reference to the temporary copy that was made of the instance
A&& inst4 = A::getInstCopy();
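
A handy way to check whether an expression is an lvalue or an rvalue is to pass it to a pair of overloads and see which one gets picked. The sketch below uses a hypothetical function kind() for this. It also shows a point that often surprises people: a named rvalue reference is itself an lvalue, because it has a name.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical overload pair: reports which kind of argument it was given.
std::string kind(const std::string&) { return "lvalue"; }
std::string kind(std::string&&)      { return "rvalue"; }

std::string getName() { return "Hello world"; }

std::string kindOfNamedVariable() { std::string s = "x"; return kind(s); }
std::string kindOfTemporary()     { return kind(getName()); }
std::string kindOfStdMove()       { std::string s = "x"; return kind(std::move(s)); }

// r is declared as an rvalue reference, but the expression r is an lvalue.
std::string kindOfNamedRvalueRef() { std::string&& r = getName(); return kind(r); }

void checkKinds() {
    assert(kindOfNamedVariable()  == "lvalue");
    assert(kindOfTemporary()      == "rvalue");
    assert(kindOfStdMove()        == "rvalue");
    assert(kindOfNamedRvalueRef() == "lvalue");
}
```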

Understanding when a move is possible

When passing an object to a function, or returning an object from a function, it's possible to do a move (rather than a copy) if both of the following hold:

  • the object is an rvalue
  • the object's class defines the special member move functions (see below)

The logic behind this is that, when a 'move' occurs, data is removed from the old object and placed into a new object. Thus a move can't be automatically done if the old object is going to be around for future use and still expected to have its data, i.e. the compiler can only do a move when:

  • the old object is a temporary – it will be no longer needed following the move, such as:
    • in the v.push_back(string("hello world")) case above – the string that is created here to pass to the vector is no longer used following this call, and the compiler is clever enough to know this.
    • when returning a local variable, as in the return by value example earlier. Again, the variable is anyway going out of scope following the return.
  • when we explicitly call std::move on an object, making clear that although the object will still be around after the move, we're happy to empty it in this way.
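
Note that being an rvalue is not enough on its own: the class must also provide a move constructor. The following sketch uses a hypothetical type CopyOnly, which has a user-written copy constructor and therefore gets no compiler-generated move constructor. Even with std::move, constructing from it falls back to a copy.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical type with a user-written copy constructor and no move
// constructor: the compiler does not generate a move constructor for it.
struct CopyOnly {
    static int copies;
    std::string text;
    CopyOnly() : text("payload") {}
    CopyOnly(const CopyOnly& o) : text(o.text) { ++copies; }
};
int CopyOnly::copies = 0;

// Returns how many copies an attempted "move" costs.
int copiesForStdMove() {
    CopyOnly::copies = 0;
    CopyOnly a;
    CopyOnly b(std::move(a));  // binds to const CopyOnly&, so this is a copy
    assert(a.text == b.text);  // a still holds its data, as nothing was moved
    return CopyOnly::copies;
}
```

Here copiesForStdMove() reports one copy, and the source object keeps its contents.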

Ensuring move is used, rather than copy

If you want to be sure that moves are being done, you can disable copy semantics in your classes (and create wrappers for STL classes with copying disabled), so that when you are moving objects of the class around, you know for sure that they're either being cheaply moved, or that the compiler will complain if any bit of code tries to copy them.

For example:

#include <initializer_list>
#include <utility>
#include <vector>

using namespace std;

// make our own version of std::vector with copying disabled
template<class T> class MyVector: public vector<T> {
public:

   // implement the same constructors as vector defines
    MyVector():vector<T>() {}
    MyVector(int size):vector<T>(size) {}
    MyVector(std::initializer_list<T> l):vector<T>(l) {}
    // the parameter must be non-const, and must be passed on with std::move, for a real move to occur
    MyVector(MyVector&& v):vector<T>(std::move(v)) {} // don't forget the move constructor!

private:

   // disable copying
    MyVector(const MyVector&);
    MyVector& operator=(const MyVector&);
};

class MoveSemanticsTest {
public:
   static void doTests() {
      vector<vector<int>>   vv1;
      vector<MyVector<int>> vv2;

      vv1.push_back( { 1,2,3,4 } ); // this works with std::vector, using move constructor
      vv2.push_back( { 1,2,3,4 } ); // this works with MyVector, using move constructor

      vector<int>   v1  { 1,2,3,4 };
      MyVector<int> v2  { 1,2,3,4 };

      vv1.push_back(v1); // this works with std::vector, but will be doing a full copy!
      vv2.push_back(v2); // this gives a compile error with MyVector, as we've disabled copying!
   }
};

The above implementation of MyVector (as a wrapper of std::vector with copy construction disabled) would be simpler+cleaner if we could use the new Inheriting Constructors feature in C++11. At the time of writing, however, this was not yet supported in clang or MSVC, and only in gcc since version 4.8 (current releases of all three compilers do support it), so it's not something I'm going to be using in this blog at this stage.
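
For reference, here is a sketch of what the inheriting-constructors version might look like on a compiler that supports the feature. It also uses = delete rather than private declarations to disable copying. demoMoveOnlyVector is a hypothetical helper added just to exercise the class.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <type_traits>
#include <utility>
#include <vector>

// Sketch of the same wrapper using inheriting constructors.
template<class T> class MyVector : public std::vector<T> {
public:
    using std::vector<T>::vector;   // inherit vector's constructors

    // default, copy and move constructors are never inherited, so state them
    MyVector() = default;
    MyVector(MyVector&&) = default;
    MyVector& operator=(MyVector&&) = default;
    MyVector(const MyVector&) = delete;
    MyVector& operator=(const MyVector&) = delete;
};

static_assert(std::is_move_constructible<MyVector<int>>::value, "moving is enabled");
static_assert(!std::is_copy_constructible<MyVector<int>>::value, "copying is disabled");

// Builds a nested vector using only moves and returns the total element count.
std::size_t demoMoveOnlyVector() {
    std::vector<MyVector<int>> vv;
    vv.push_back({ 1, 2, 3, 4 });   // temporary MyVector, moved in
    MyVector<int> v { 5, 6 };
    vv.push_back(std::move(v));     // explicit move; vv.push_back(v) would not compile
    assert(vv.size() == 2);
    return vv[0].size() + vv[1].size();
}
```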

Special member function rules

Special member functions

In old C++, there were four special member functions. Now, with C++11's two move semantics functions, there are six:

  • Default constructor
  • Destructor
  • The two copy special member functions
    • Copy constructor
    • Copy assignment operator
  • The two move special member functions
    • Move constructor
    • Move assignment operator

The logic for whether the compiler automatically declares/defines/deletes the move special member functions is complex and somewhat along the lines of the logic with the copy special member functions (which in turn gain some new clauses relating to the move semantics in C++11).

Understanding all the logic is rather complicated. For example, it's covered here if you're interested in just how complex it gets:

Basic guidelines

First, keep things simple through good practice …

Before talking about when+how you need to implement the move special member functions, I believe the simplest approach is to start by following these basic rules when creating your classes in the first place:

  • when dealing with pointers, always use smart pointers – unique_ptr + shared_ptr (see my post on smart pointers) rather than plain pointers for storing class members
  • prefer standard C++11 STL classes which have move semantics (and copy semantics) well defined already, to legacy alternatives that may be missing correct move+copy semantics (e.g. they might have a costly copy constructor and no move constructor, or simply have neither defined when they could easily have them, thus causing either more work for any class of yours that uses them, or causing your class to also lack move+copy support when it should be able to support it)

By so composing your class from members that are either PODs (int, double, bool etc) or whose classes all have correct copy+move semantics in place,

  • you should virtually never need to worry about move+copy semantics
  • you should correspondingly never find yourself needing to write destructors to release resources in your classes
    • however, be aware that whenever you do add a destructor (say you just want to log that "class X destroyed"), then by doing so you will be disabling the automatic generation of the move special member functions (the copy ones are still generated, although relying on that is deprecated), so "moves" of your class will silently become copies. You will thus have to define them, though at least this should just be a matter of enabling the compiler-generated default implementations (see below …)
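
More precisely, declaring a destructor suppresses the implicit move constructor and move assignment operator, so a "move" quietly turns into a copy. The following sketch shows the effect and the fix. All names in it (Member, WithDestructor, WithDefaults, countForMove) are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <utility>

// Hypothetical member type that counts copies and moves.
struct Member {
    static int copies;
    static int moves;
    Member() = default;
    Member(const Member&) { ++copies; }
    Member(Member&&) noexcept { ++moves; }
    Member& operator=(const Member&) = default;
    Member& operator=(Member&&) = default;
};
int Member::copies = 0;
int Member::moves = 0;

// The destructor suppresses the implicit move constructor, so "moving"
// this class copies its members instead.
struct WithDestructor {
    Member m;
    ~WithDestructor() {}
};

// Explicitly defaulting the special members restores real moves.
struct WithDefaults {
    Member m;
    WithDefaults() = default;
    ~WithDefaults() {}
    WithDefaults(const WithDefaults&) = default;
    WithDefaults(WithDefaults&&) = default;
    WithDefaults& operator=(const WithDefaults&) = default;
    WithDefaults& operator=(WithDefaults&&) = default;
};

struct Counts { int copies; int moves; };

// Move-constructs a T and reports what happened to its Member.
template<class T> Counts countForMove() {
    Member::copies = Member::moves = 0;
    T a;
    T b(std::move(a));
    (void)b;
    assert(Member::copies + Member::moves == 1);
    return { Member::copies, Member::moves };
}
```

countForMove<WithDestructor>() reports one copy and no moves, while countForMove<WithDefaults>() reports one move and no copies.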

The only times you should find yourself breaking the above rules are:

  • when forced to wrap some legacy code, e.g. writing a file reading class that adds RAII to FILE*
    • now you have to worry about destruction, moving and copying in this one class … but at least as long as this class has implemented these correctly, all classes that then use this class can again not worry about these things
  • when your class contains a large non-dynamically allocated object, such as a static array
    • in this case you'll probably want to simply delete both the copy+move special member functions, in order to prevent any moving/copying which would otherwise occur and be costly
    • you could consider writing a special clone() function that has to be explicitly called when you really do want+need to make a copy of the object, if needed
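
As an illustration of the FILE* case, here is a minimal sketch of such a wrapper. File, LogWriter and moveTransfersOwnership are hypothetical names of mine. The point is that all of the destruction/moving/copying logic lives in File, and a class such as LogWriter that merely contains a File needs none of it.

```cpp
#include <cassert>
#include <cstdio>
#include <type_traits>
#include <utility>

// Hypothetical RAII wrapper around a C FILE* handle: movable, not copyable.
class File {
    std::FILE* handle = nullptr;

    void close() {
        if (handle) { std::fclose(handle); handle = nullptr; }
    }

public:
    File() = default;
    explicit File(std::FILE* f) : handle(f) {}   // takes ownership of f

    File(const File&) = delete;
    File& operator=(const File&) = delete;

    File(File&& other) noexcept : handle(other.handle) { other.handle = nullptr; }

    File& operator=(File&& other) noexcept {
        if (this != &other) {
            close();
            handle = other.handle;
            other.handle = nullptr;
        }
        return *this;
    }

    ~File() { close(); }

    bool isOpen() const { return handle != nullptr; }
};

// A class built from well-behaved members needs no special member functions.
struct LogWriter {
    File file;
};

static_assert(std::is_move_constructible<LogWriter>::value, "moving is inherited from File");
static_assert(!std::is_copy_constructible<LogWriter>::value, "copying is disabled by File");

// Moving hands the handle over to the new object and leaves the source closed.
bool moveTransfersOwnership() {
    File a(std::tmpfile());          // may be closed if no temp file could be created
    bool wasOpen = a.isOpen();
    File b(std::move(a));
    assert(!a.isOpen());
    return b.isOpen() == wasOpen;
}
```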

How to best define+delete special member functions

In such cases where you thus do have to define/delete special member functions, follow the following guidelines:

  • if defining one of the move special member functions, always define the other as well, so either both or neither are defined (and apply the same logic to the copy special member functions)
  • for classes for which all the members have safe copy+move semantics (i.e., they are not plain pointers, rather they are POD objects [int, double, bool, etc], or they are classes which can either be safely copied/moved or disable copying/moving), either:

    • don't define any destructor or move/copy special member functions – the compiler will implement/disable them according to whether all the members support moving/copying (with an automatic implementation that calls the copy/move constructors of all of the individual members), or not ….
    • or, if you want to prevent any copying of such a class, then you can delete the copy special member functions, and declare the move special member functions to be default, i.e.:

      class A {
      public:
         A(const A&) = delete;
         A& operator=(const A&) = delete;
      
         A(A&& x) = default;
         A& operator=(A&& a) = default;
      };
      
    • or, if you for some reason had to write a destructor and so lost the automatic generation of the move special member functions, simply define all of the move+copy special member functions as either default, or private/delete, depending upon whether you want them or not

  • for more complex classes (i.e. classes that contain pointers and would thus need a destructor, and would cause problems with the default move+copy):

    • define a destructor, and either define or delete (but don't simply forget to do either!) both copy special member functions, and similarly either define or delete both move special member functions, as desired.

Note that for classes that inherit from other classes (especially classes you're not familiar with), you need to be extra careful, and thus may want to generally be more explicit in defining/deleting the special member functions, as the rules of whether the compiler automatically defines/deletes these functions get complicated.

As a general rule I also recommend disabling copy semantics on any classes where copying is likely to incur a cost (and enabling move semantics if cheap moving is possible, otherwise deleting it also).

  • this will allow you to confidently make use of move semantics throughout your code – using your classes as values when passing them to and returning them from functions – without fear that expensive copies may in some cases be occurring, as now the compiler will always report an error when any copying is attempted.

Supporting move semantics in your classes

To support moving, your class needs:

  • a Move constructor of the form C::C(C&& other);
  • a Move assignment operator of the form C& C::operator=(C&& other);

These will often be automatically defined+deleted by the compiler (as explained above), and even when you do need to define/delete them (maybe you're just defining them so you can be sure they are defined), often it's sufficient to just define them to use the default implementation or delete them, again as explained in the previous section.

This example illustrates how to implement them in what should be the very rare times when actually you do need to implement them yourself:

class MyClass {
   int* buffer = nullptr;

private:

   // disable copying
   MyClass(const MyClass&);
   MyClass& operator=(const MyClass&);

public:

   // move constructor: take over other's buffer, and leave other empty
   MyClass(MyClass&& other) noexcept {
      buffer = other.buffer;
      other.buffer = nullptr;
   }

   // move assignment operator: release our buffer, then take over other's
   MyClass& operator=(MyClass&& other) noexcept {
      if(this != &other) {
         delete buffer; // deleting a nullptr is a no-op, so no check is needed
         buffer = other.buffer;
         other.buffer = nullptr;
      }
      return *this;
   }

   ~MyClass() {
      delete buffer;
   }
};

The disabling of copying in the above example is optional – you could alternatively implement the Copy constructor and Copy assignment operator. What you don't want to do is leave the default copy semantics.

Also note that the only reason it's necessary to define the move+copy semantics in the above example is that this class is breaking the guidelines explained earlier. If we simply store buffer in a unique_ptr rather than a plain pointer, then we will get the exact same behaviour (moving will be enabled and work correctly, and copying will be disabled) without any of the boilerplate (and without any of the risk of messing up the boilerplate), i.e. the following gives all the functionality of the above example and thus should be strongly preferred:

class MyClass {
   unique_ptr<int> buffer;
};
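
You can check this claim at compile time with the standard type traits. In the sketch below, the constructor and the two accessors are hypothetical additions of mine, there only so that the class can be exercised.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

class MyClass {
   std::unique_ptr<int> buffer;

public:
   // hypothetical constructor and accessors, added only for this demonstration
   explicit MyClass(int value) : buffer(new int(value)) {}
   bool hasBuffer() const { return buffer != nullptr; }
   int value() const { assert(buffer); return *buffer; }
};

// moving is enabled and copying is disabled, with no boilerplate written
static_assert(std::is_move_constructible<MyClass>::value, "move constructor is generated");
static_assert(std::is_move_assignable<MyClass>::value, "move assignment is generated");
static_assert(!std::is_copy_constructible<MyClass>::value, "copy constructor is disabled");
static_assert(!std::is_copy_assignable<MyClass>::value, "copy assignment is disabled");
```

After MyClass b(std::move(a)), the buffer belongs to b, and a.hasBuffer() is false, since a moved-from unique_ptr is guaranteed to be null.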

Using std::move

Say in the above example – where you need to implement move semantics because of the pointer you're using there – you also have a bunch of other class members which can all be moved safely, then to move them within the move constructor + move assignment operator you can simply use std::move, i.e. if MyClass above also had a member string myName, then the move constructor could be:

   MyClass(MyClass&& other) {
      buffer = other.buffer;
      other.buffer = nullptr;

      myName = std::move(other.myName);
   }

   MyClass& operator=(MyClass&& other) {
      if(this != &other) {
         delete buffer; // deleting a nullptr is a no-op, so no check is needed
         buffer = other.buffer;
         other.buffer = nullptr;

         myName = std::move(other.myName);
      }
      return *this;
   }

You would similarly use std::move to do an efficient implementation of a method like say push_back from std::vector, which can take an rvalue reference (&&) to avoid unnecessary copying, i.e.:

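template<class T> class MyVector {
   unique_ptr<T[]> buffer; // array form of unique_ptr, so that buffer[index] is available
   size_t index = 0;       // position of the next free slot
   :::   
   void push_back(T&& value) {
      buffer[index++] = std::move(value); // capacity checks and growth omitted
   }
};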
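
In practice you would normally provide both overloads, as std::vector does: one taking const T& that copies, and one taking T&& that moves. Here is a minimal sketch under simplifying assumptions. TinyVector is a hypothetical fixed-capacity container which requires T to be default-constructible, and which reports failure rather than growing.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical fixed-capacity container; growth is deliberately left out.
template<class T> class TinyVector {
    std::unique_ptr<T[]> buffer;
    std::size_t count = 0;
    std::size_t capacity;

public:
    explicit TinyVector(std::size_t cap) : buffer(new T[cap]), capacity(cap) {}

    // lvalue overload: the caller keeps its object, so we must copy
    bool push_back(const T& value) {
        if (count == capacity) return false;
        buffer[count++] = value;
        return true;
    }

    // rvalue overload: the caller has given the object up, so we may move
    bool push_back(T&& value) {
        if (count == capacity) return false;
        buffer[count++] = std::move(value);
        return true;
    }

    const T& operator[](std::size_t i) const { assert(i < count); return buffer[i]; }
    std::size_t size() const { return count; }
};
```

Passing a named string selects the copying overload and leaves the caller's string intact, while passing a temporary or std::move(s) selects the moving overload.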

14 comments

  1. Thomas


    vector<int> makeBigVector() {
    vector<int> result;
    for(int i=0; i<1024; i++) {
    result[i] = rand();
    }
    return result;
    }

    This code will result in a segmentation fault since you have forgotten to reserve memory for the vector

  2. Hi. Nice guide about constructors.

    BTW I have one question. What do you think about such construction:
    Foo&& f1 = createFoo();
    Foo f2(f1); (1)
    Foo f3((Foo&&)f1); (2)

    In (1) copy constructor will be called. In (2) move constructor.
    So, f1 declared as rvalue reference but because it has name f1 treated like lvalue.

  3. what is the compilation flags to make it work in gcc or g++ .

  4. I think that In section “Passing objects to functions efficiently” there is a typo:

    ==============
    vector<string> v;
    string s { “hello world” };

    v.push_back(string(“hello world”)); // move possible, as string is an rvalue (and
    // std::string implements move semantics)
    v.push_back(s); // move can’t be done, as parameter is an **rvalue**
    ===================

    It should be lvalue, or am getting it wrong?

  5. Hi!
    Great series of articles! It helped me a lot in understanding some of the gory details of the new C++11. I will follow your blog in the future when looking for some wisdom!!
    Cheers!

  6. “v.push_back(s); // move can’t be done, as parameter is an rvalue”
    mistype. supposed to be “as parameter is an lvalue”

  7. There may be a typo in the section entitled “Ensuring Move is used, rather than Copy”:

    MyVector(const MyVector&& v):vector<T>(v) {} // don’t forget the move constructor!

    –> I think you need to remove “const”?

  9. awijesurendra

    Thanks for a great blog with some very lucid explanations of some of the more useful new C++11 features. I hope you find the time / motivation to add new posts to this when you can. Cheers

  10. MyClass& MyClass::operator=(MyClass&& other)
    probaly needs a return.

  11. Excellent blog, I bookmarked it already! Please do keep up the good work of simplifying C++’s concepts and keep us infromed.

    Another thing…is there an rss feed so I can be notified for new posts?

  12. Also, what do you know about C++ functors?

  13. Nice examples! thanks.
    The operator method
    MyClass& MyClass::operator=(MyClass&& other) { … }
    is missing a final:
    return *this;
