Lesson #2: Range-based for

The 'range-based for' loop gives C++11 a simple for-each style loop syntax. It works with the STL collection classes (hiding the complexity of using the STL iterators manually) and with plain C arrays, and it can be made to work with any custom class as well (see Using 'range-based for' on your own collection classes below).

As such, 'range-based for' is a feature with which C++11 reduces unnecessary verbosity while simplifying the language. It's well supported in all the main C++ compilers already – clang (3.0+above), gcc (4.6+above), MSVC (since VC11).

The following examples illustrate how it works (note that the C++11 examples also make use of the new auto keyword; check out my previous post if you're not familiar with this yet):

Old C++:

for(std::vector<int>::iterator it = myVector.begin(); it != myVector.end(); ++it) { *it *= 2; }

for(std::map<int, std::set<std::string> >::iterator it = myMap.begin(); it != myMap.end(); ++it) { ... }

for(std::set<MyClass>::const_iterator it = mySet.begin(); it != mySet.end(); ++it) {
   const MyClass& value = *it;
   ...
}

int arr[] = { 1, 2, 3, 4};
for(int i=0; i < sizeof(arr) / sizeof(int); i++) {
   int& value = arr[i];
   ...
}

// iterate all elements in a multi-map for which the key matches "hello" ...
std::pair<std::multimap<std::string,int>::iterator, std::multimap<std::string,int>::iterator> range = 
   myMultiMap.equal_range("hello");

for(std::multimap<std::string,int>::iterator it=range.first; it!=range.second; ++it) { ... }

C++11:

for(auto& value: myVector) { value *= 2; }

for(auto& entry: myMap) { ... }

for(const auto& value: mySet) { ... }

int arr[] { 1, 2, 3, 4 };
for(int& value: arr) {
   ...
}

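// iterate all elements in a multi-map for which the key matches "hello" ...
// (the std::pair of iterators returned by equal_range() is not itself a range,
// so here the only gain is from auto)
auto range = myMultiMap.equal_range("hello");
for(auto it = range.first; it != range.second; ++it) { ... }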
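
If you do want to use 'range-based for' on the result of equal_range(), one option is a small wrapper around the iterator pair. The following is only a sketch: IteratorRange, asRange and sumForKey are hypothetical names of my own, not standard library facilities.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper (not part of the standard library): wraps a pair of
// iterators so that the range-based for loop can call begin() and end() on it.
template<class It>
class IteratorRange {
   It first_;
   It last_;
public:
   explicit IteratorRange(std::pair<It, It> p) : first_(p.first), last_(p.second) {}
   It begin() const { return first_; }
   It end()   const { return last_; }
};

// Hypothetical convenience function so that the iterator type is deduced.
template<class It>
IteratorRange<It> asRange(std::pair<It, It> p) { return IteratorRange<It>(p); }

// Sums the values stored under the given key.
int sumForKey(const std::multimap<std::string, int>& m, const std::string& key) {
   int total = 0;
   for(const auto& entry: asRange(m.equal_range(key))) {
      total += entry.second;   // entry is a key/value pair, not an iterator
   }
   return total;
}
```

The wrapper stores copies of the two iterators, so it stays valid for as long as the multimap itself is not modified.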

Use of the 'auto' keyword

Note that all of the above examples make use of the auto keyword (described in the previous post).

Without the auto keyword, you'd have to specify the full type, which in some cases you don't really want to have to do (e.g. std::pair<const int, std::set<std::string> > when iterating the map). This is really an implementation detail I don't want to care about. Where with old C++ all I needed to know was that I could use loopVariable->first and loopVariable->second to get the key and value, here all I need to know is that I can use loopVariable.first and loopVariable.second.

Thus, the use of the auto keyword in this situation should generally be preferred.

The type of the loop variable?

Previously, when iterating the old C++ way, your loop variable was an iterator, which then had to be dereferenced (as if it were a pointer) to get the value. This can be a potentially confusing use of C++ operator overloading, because the iterator really is more than just a pointer to the value.

Now when using the new 'range-based for' loop, what you get is just the value – you're actually getting the dereferenced iterator, that is, the following two loops are basically equivalent:

for(std::vector<int>::iterator it = v.begin(); it != v.end(); ++it) {
   auto value = *it;
   ...
}
for(auto value: v) {
   ...
} 

Thus, for example, when iterating a std::map<KEY,VALUE>, note that this loop variable is going to be a std::pair<const KEY, VALUE> (just as dereferencing the iterator gives you a pair when iterating a map in old C++).
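
As a small illustration of the loop variable being a pair, here is a sketch that walks a std::map and reads the key and value through .first and .second. The function name describe is just a placeholder of mine.

```cpp
#include <cassert>
#include <map>
#include <string>

// Builds "key=value;" for every entry in the map.
// Each entry is a std::pair<const int, std::string>.
std::string describe(const std::map<int, std::string>& m) {
   std::string out;
   for(const auto& entry: m) {
      out += std::to_string(entry.first) + "=" + entry.second + ";";
   }
   return out;
}
```

Because std::map keeps its entries sorted by key, the entries are visited in key order regardless of insertion order.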

By value or reference?

Just like you would do when dereferencing the iterator in the above example, you can choose whether you want to work with a value, a reference, or a const reference, i.e.:

// loop-variable is a value
for(auto i: arr) { printf("%d\n", i); }

// loop-variable is a reference, so can change it - this will give a compile error if arr is const
for(auto& i: arr) { i += 1; } 

// loop-variable is a const reference - can't change it
for(const auto& i: arr) { printf("%d\n", i); } 

Generally, the following guidelines apply:

  • use a reference – auto& – when you want to make changes.
  • prefer to use a const reference – const auto& – if the object incurs any copying penalty.
  • if neither of the above apply, whether you use by value (auto) or by const reference is a matter of preference. If in doubt, const reference should be the safer choice.
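
To make the difference concrete, here is a minimal sketch of the three forms side by side. The function names are made up for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// By value: v is a copy, so doubling it leaves the (const) vector untouched.
int sumOfDoubledCopies(const std::vector<int>& values) {
   int total = 0;
   for(auto v: values) { v *= 2; total += v; }
   return total;
}

// By reference: v refers to the element itself, so the vector is modified.
void doubleInPlace(std::vector<int>& values) {
   for(auto& v: values) { v *= 2; }
}

// By const reference: no std::string is copied, and no changes are possible.
std::size_t totalLength(const std::vector<std::string>& words) {
   std::size_t length = 0;
   for(const auto& w: words) { length += w.size(); }
   return length;
}
```

Writing auto& in sumOfDoubledCopies would not compile, since the vector there is const.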

Limitations

There are some limitations when using the range-based for-loop:

  • If you're used to iterating arrays by index, you no longer have that index if you need it for something (e.g. avoiding the trailing comma when outputting a comma-separated list, or just outputting the first N elements). You can of course track a count separately, but then you lose some of the conciseness gains.
  • If you want to iterate a collection backwards, there's no simple standard way in C++11 to do that with the 'range-based for'. You can use boost, which provides an adaptor so that you can do for (int y : boost::adaptors::reverse(x)), or you can create such an adaptor yourself. Or you can just use the old syntax with rbegin() and rend() in this case and at least still benefit from using auto.
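
Both workarounds can be sketched in a few lines. The adaptor below is a home-made one, assuming only that the container has rbegin() and rend(); the names Reversed, reversed and joinReversed are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical adaptor: holds a reference to a container and exposes its
// reverse iterators through begin() and end().
template<class Container>
class Reversed {
   Container& c_;
public:
   explicit Reversed(Container& c) : c_(c) {}
   auto begin() const -> decltype(std::declval<Container&>().rbegin()) { return c_.rbegin(); }
   auto end()   const -> decltype(std::declval<Container&>().rend())   { return c_.rend(); }
};

// Hypothetical helper so that the container type is deduced.
template<class Container>
Reversed<Container> reversed(Container& c) { return Reversed<Container>(c); }

// Joins the values back-to-front, tracking an index by hand to avoid a
// trailing comma.
std::string joinReversed(const std::vector<int>& values) {
   std::string out;
   std::size_t index = 0;
   for(int v: reversed(values)) {
      if(index++ > 0) out += ",";
      out += std::to_string(v);
   }
   return out;
}
```

The adaptor only stores a reference, so the container has to outlive the loop. Taking the argument as Container& means a temporary container is rejected at compile time.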

Using 'range-based for' on your own collection classes

One area where things do get a little complex is in adapting your own custom classes to work with the new 'range-based for' loop. While this is not at all a new issue in C++11, in my case at least – in implementing my own collection classes – I had not bothered to implement iteration support very often, or at least not entirely properly conformant iteration support.

This was because, firstly, often using the iterators with the old syntax was simply not worth the hassle, i.e. if I have to choose between:

for(MyNamespace::MyVectorType<int>::iterator it = myVector.begin(); it != myVector.end(); ++it) {
   total += *it;
}

and

for(int i=0; i<myVector.size(); i++) {
   total += myVector[i];
}

I would typically choose the latter anyway for conciseness reasons.

Even if I had some collection which was not so easily iterable (e.g. it doesn't contain an underlying array and so can't be iterated by index), there was no need to work out how to make it conform exactly to the specification, as long as there was some way defined to loop over the collection, which ideally but not necessarily looked similar to the way the standard STL classes did it.

However, with C++11, implementing proper iteration support becomes much more worthwhile, so it's worth diving into the various ways in which classes can be extended to support iteration in this style.

Adding begin() + end() class member functions

The simplest way is to define begin() + end() member functions. The variable returned by these needs to:

  • be incrementable, such that incrementing the value returned by begin() will eventually result in a value that matches the value returned by end()
  • return a sensible value when operator* is applied to it
  • if you create a custom iterator class (called Iterator here) for this, the following three operators need to be defined:
    • the prefix increment operator – Iterator& operator++()
    • the != comparison operator – bool operator!=(const Iterator& other) const
    • the dereference operator, to return the value – T& operator*()

One thing to note – begin() and end() will be called just once at the start of the loop, rather than every loop iteration.
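
A quick way to convince yourself of this is to count the calls. The struct below is a throwaway example of mine (CountingRange and sum are made-up names) whose begin() and end() simply increment counters before returning pointers.

```cpp
#include <cassert>

// Hypothetical fixed-size collection that counts how often begin() and end()
// are called.
struct CountingRange {
   int data[3];
   int beginCalls;
   int endCalls;

   int* begin() { ++beginCalls; return data; }
   int* end()   { ++endCalls;   return data + 3; }   // one past the last element
};

// Iterates the whole range and returns the sum of its elements.
int sum(CountingRange& r) {
   int total = 0;
   for(int v: r) { total += v; }
   return total;
}
```

After a call to sum(), both counters stand at one: the loop behaves roughly like the old-style loop with the results of begin() and end() stored in local variables up front.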

For something like an array, it may be as simple as returning pointers, i.e. the following code works:

template<class T>
class MyArrayWrapper {
   T* data;
   int size;

public:
   T* begin() { return data; }
   T* end()   { return data + size; }   // one past the last element

   ...
};

MyArrayWrapper<int> arr;
...
for(int i: arr) {
   ...
}
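
For reference, here is a compilable sketch of such a wrapper. The constructor and the countVisited helper are my own additions for the sake of the example. The important detail is that end() has to point one past the last element, otherwise the last element is never visited.

```cpp
#include <cassert>
#include <cstddef>

template<class T>
class MyArrayWrapper {
   T* data;
   std::size_t size;

public:
   // Assumed constructor: wraps an existing buffer without taking ownership.
   MyArrayWrapper(T* d, std::size_t n) : data(d), size(n) {}

   T* begin() { return data; }
   T* end()   { return data + size; }   // one past the last element
};

// Counts how many elements the range-based for loop actually visits.
template<class T>
int countVisited(MyArrayWrapper<T>& arr) {
   int visited = 0;
   for(T& element: arr) { (void)element; ++visited; }
   return visited;
}
```

With a one-element buffer the loop body runs exactly once, and with a size of zero begin() and end() are equal, so it doesn't run at all.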

For classes where simply returning a pointer (like in the above case) won't work, you generally need to define an iterator class, which fits the same needs mentioned above, i.e.:

template<class T, int SIZE>
class MyCircularBuffer {
   T* data;
   int beginPosition;
   int endPosition;

   class Iterator {
      T* data;
      int position;
   public:
      Iterator(T* _data, int _position):data(_data),position(_position) {}

      T& operator*() { return data[position]; }
      Iterator& operator++() { if(++position == SIZE) position = 0; return *this; }
      bool operator!=(const Iterator& it) const { return position != it.position; }
   };

public:
   Iterator begin() { return { data, beginPosition }; }
   Iterator end()   { return { data, endPosition }; }
};

MyCircularBuffer<int, 256> buf;
for(int i: buf) {
   ...
}

Just one problem ….

That would be the end of the story, except that the above code can't be iterated when the variable is const, i.e. the following code won't compile:

   MyCircularBuffer<int, 256> buf;
   const auto& constBuf = buf;
   for(int i: constBuf) { // FAILS to compile - can't call non-const member function begin() on const object! 
      printf("%d", i); 
   }

One easy (but not correct) way to 'fix' this problem is to make the begin() and end() methods const, i.e. in MyCircularBuffer alter the method definitions to be:

   Iterator begin() const { return { data, beginPosition }; }
   Iterator end()   const { return { data, endPosition }; }

You will now be able to happily iterate both const and non-const variables. And many people (and most examples on the web) would happily stop there, as the above code is sufficient for a perfectly usable solution. However there is still one problem if we want to do things properly and use const. Take the following code example:

   MyCircularBuffer<int, 256> buf;
   const auto& constBuf = buf;

   for(int i: constBuf)        { printf("%d", i); }   // this will now run ok, which is good
   for(const int& i: constBuf) { printf("%d", i); }   // this will also now run ok, which is good
   for(int& i: constBuf)       { printf("%d", i++); } // this will compile+run, but it shouldn't!!

   std::vector<int> v;
   const auto& constV = v;

   for(int i: constV)        { printf("%d", i); }   // this is ok
   for(const int& i: constV) { printf("%d", i); }   // this is also ok
   for(int& i: constV)       { printf("%d", i++); } // FAILS to compile - can't iterate a const 
                                                    // collection with non-const ref!

The above code demonstrates that while we can now iterate our collection when it's being referenced by a const variable, it's also now possible to modify the contents of a MyCircularBuffer const object. We confirm above that this really shouldn't be possible by trying the same with std::vector, with which we get a compile error.

The full solution, with both const + non-const iteration

To fix the problem we first remove that naughty const that we just applied to our begin() and end() functions' signatures, and now we separately define a const iterator. The final result has a little more boilerplate, but can now handle iteration over both const + non-const objects correctly:

template<class T, int SIZE>
class MyCircularBuffer {
   T* data;
   int beginPosition;
   int endPosition;

   class Iterator {
      T* data;
      int position;
   public:
      Iterator(T* _data, int _position):data(_data),position(_position) {}

      T& operator*() { return data[position]; }
      Iterator& operator++() { if(++position == SIZE) position = 0; return *this; }
      bool operator!=(const Iterator& it) const { return position != it.position; }
   };
   class ConstIterator {
      T* data;
      int position;
   public:
      ConstIterator(T* _data, int _position):data(_data),position(_position) {}

      const T& operator*() const { return data[position]; }
      ConstIterator& operator++() { if(++position == SIZE) position = 0; return *this; }
      bool operator!=(const ConstIterator& it) const { return position != it.position; }
   };

public:
   Iterator begin() { return { data, beginPosition }; }
   Iterator end()   { return { data, endPosition }; }

   ConstIterator begin()const { return { data, beginPosition }; }
   ConstIterator end()  const { return { data, endPosition }; }

};

The only difference in the const iterator is that we define const T& operator*() const instead of T& operator*().
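
The effect of the const and non-const overload pair can be checked at compile time. The sketch below uses a deliberately tiny stand-in class (Samples, a name I made up) that returns plain pointers instead of Iterator and ConstIterator objects, but the overload selection works in the same way.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical minimal container, only here to show the overload pair.
class Samples {
   int data[4];
public:
   Samples() : data{1, 2, 3, 4} {}

   int*       begin()       { return data; }
   int*       end()         { return data + 4; }
   const int* begin() const { return data; }
   const int* end()   const { return data + 4; }
};

// A non-const object yields modifiable elements, a const object does not.
static_assert(std::is_same<decltype(*std::declval<Samples&>().begin()), int&>::value,
              "non-const iteration should yield int&");
static_assert(std::is_same<decltype(*std::declval<const Samples&>().begin()), const int&>::value,
              "const iteration should yield const int&");

void incrementAll(Samples& s) { for(int& v: s) { ++v; } }

int total(const Samples& s) {
   int sum = 0;
   for(const int& v: s) { sum += v; }
   return sum;
}
```

Changing the loop in total() to for(int& v: s) fails to compile, which is the same behaviour std::vector gives.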

Supporting existing classes with global begin() + end()

It's also possible to support existing classes by creating these begin() and end() functions not as member functions, but rather as free functions in the same namespace as the class, where the 'range-based for' loop finds them through argument-dependent lookup. (C-style arrays need neither: the language handles them directly, using the array's known size.)

For example, if MyCircularBuffer didn't define any begin() or end(), then the following would work (at least, if the member variables are made accessible):

template<class T, int SIZE>
class Iterator {
    T* data;
    int position;

public:
    Iterator(T* _data, int _position):data(_data),position(_position) {}

    T& operator*() { return data[position]; }
    Iterator& operator++() { if(++position == SIZE) position = 0; return *this; }
    bool operator!=(const Iterator& it) const { return position != it.position; }
};

template<class T, int SIZE> Iterator<T, SIZE> begin(MyCircularBuffer<T,SIZE>& buf) { 
   return { buf.data, buf.beginPosition }; 
}
template<class T, int SIZE> Iterator<T, SIZE> end(MyCircularBuffer<T,SIZE>& buf) { 
   return { buf.data, buf.endPosition }; 
}

template<class T, int SIZE>
class ConstIterator {
    T* data;
    int position;
public:
    ConstIterator(T* _data, int _position):data(_data),position(_position) {}

    const T& operator*() const { return data[position]; }
    ConstIterator& operator++() { if(++position == SIZE) position = 0; return *this; }
    bool operator!=(const ConstIterator& it) const { return position != it.position; }
};

template<class T, int SIZE> ConstIterator<T, SIZE> begin(const MyCircularBuffer<T,SIZE>& buf) {
   return { buf.data, buf.beginPosition };
}
template<class T, int SIZE> ConstIterator<T, SIZE> end(const MyCircularBuffer<T,SIZE>& buf) {
   return { buf.data, buf.endPosition };
}


MyCircularBuffer<int, 256> buf;
for(int i: buf) {
   ...
}
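
The same idea in a form small enough to compile on its own: a class that can't be modified, sitting in its own namespace, with free begin() and end() functions added alongside it. The namespace legacy, the struct Grid and the two helper functions are invented for this sketch.

```cpp
#include <cassert>

namespace legacy {   // hypothetical namespace holding a class we can't change
   struct Grid {
      int cells[6];
   };

   // Free functions in the class's own namespace, found by
   // argument-dependent lookup when a Grid is used in a range-based for loop.
   inline int*       begin(Grid& g)       { return g.cells; }
   inline int*       end(Grid& g)         { return g.cells + 6; }
   inline const int* begin(const Grid& g) { return g.cells; }
   inline const int* end(const Grid& g)   { return g.cells + 6; }
}

int sumCells(const legacy::Grid& g) {
   int total = 0;
   for(int v: g) { total += v; }
   return total;
}

void fillCells(legacy::Grid& g, int value) {
   for(int& c: g) { c = value; }
}
```

Note that sumCells and fillCells live outside the legacy namespace and need no using declaration: the lookup is driven by the type of the object being iterated.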

One last possibility

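Note that you may also see a third option suggested: specializing std::begin() and std::end() for your class. I'd steer clear of it: function templates can't be partially specialized (so this can't work for a class template like MyCircularBuffer), adding your own overloads to namespace std is not allowed, and free functions in the class's own namespace do the job anyway.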

6 comments

  1. David

    Thanks for this blog post, it was really helpful!

  2. Nice post, but shouldn’t the definition of the end() point to an element “one past the last element of the container”? You have defined it as the last element:
    int* end() { return size>0 ? &data[size-1] : nullptr; }
    If, for instance you have an container with one element, begin() and end() will refer to the same element if your definition is used.

  3. Robert Basham

    I have an matrix class that tracks changes, and the distinction between const and nonconst iterator is important; I don’t want a nonconst iterator called unless I am actually changing matrix cell values (because that sets a changed flag). Your example of using const and nonconst iterators with range-based for under “The full solution…” is very valuable in this regard. Unfortunately, it doesn’t work with Visual Studio 2013. In the sample code given, “for (auto iter : buf)” and “for (const auto &iter : buf)” both call the begin() method returning an Iterator, not a ConstIterator. I assume this is a limitation of the VS compiler. A workaround is to use a cast, like this: “for (auto iter : static_cast<MyCircularBuffer const>(buf))”, but that’s not very pretty, and undermines the simplicity of the ranged-base for.

    1. Robert Basham

      Correction: although a static_cast works with the sample code, the cast to get a ConstIterator in VS2013 should really be a const_cast, as:
      “for (auto iter : const_cast(buf))”

  4. Grant Rostig

    Nice article!
