Simplifying Loops with C++11 in Qt Ways

Recently, I looked through the code base of a medium-sized project to see how I could simplify handwritten for-loops by using C++11’s new range-based for and STL algorithms with lambda expressions. The results in short: Range-based for makes loops simpler, easier to understand and often faster. STL algorithms are often a bit harder to read and write than range-based for loops, because lambda expressions are pretty clumsy, algorithm naming is inconsistent, and algorithm interfaces are inconvenient. But, they are still better than handwritten for-loops.

Contents

  • Using Range-Based For – Introduces the new C++11 range-based for-loop and compares it to index-based, iterator-based and foreach loops.
  • Using STL Algorithms with Lambdas – Gives typical examples of how to rewrite handwritten for-loops with STL algorithms and range-based for. Discusses the pros and cons of the different implementations.
  • Conclusion

Using Range-Based For

When I looked through the 900+ for-loops written in a medium-sized project, I spotted three main ways how to iterate over a container: index-based loops, iterator-based loops and loops using Qt’s foreach macro. Index-based loops are often inefficient. Iterator-based loops are efficient but hard to read. Qt’s foreach macro only works with Qt containers. We can get rid of all these problems by using C++11’s new range-based for. Let me illustrate this with an example.

The function toggleTimers() below iterates over the list m_timerColl by iterating over the admissible indices from 0 to m_timerColl.size() - 1.

// Example 1a: Index-Based Loop (old C++)

// thing.h
class Thing {
public:
    void toggleTimers();

private:
    QList<QTimer*> m_timerColl;
};

// thing.cpp
void Thing::toggleTimers() {
    for (int i = 0; i < m_timerColl.size(); ++i) {
        if (m_timerColl[i]->isActive())
            m_timerColl[i]->stop();
        else
            m_timerColl[i]->start();
    }
}

The implementation suffers from two inefficiencies. First, the body of the for-loop accesses the element at index i three times to call some functions on the timers. Calling the index operator three times (m_timerColl[i]) is not only code duplication, but also inefficient. The index operator uses some arithmetical operations to calculate the pointer to the location where the element with index i resides in memory. Second, the expression m_timerColl.size() is evaluated in every iteration of the loop. We can fix these two inefficiencies as follows.

// Example 1b: Improved Index-Based Loop (old C++)

// thing.cpp
void Thing::toggleTimers() {
    QTimer *timer = 0;
    const int timerCount = m_timerColl.size();
    for (int i = 0; i < timerCount; ++i) {
        timer = m_timerColl[i];
        if (timer->isActive())
            timer->stop();
        else
            timer->start();
    }
}

As Qt guarantees that the index operator works in constant time, index-based for-loops run in linear time. Note that index-based iteration over std::list is not possible, because the standard C++ list container does not provide an index operator. As the C++ standard must not prescribe a special implementation of lists (as done by Qt), the best guarantee for the index operator would be linear time using std::advance(). Hence, index-based loops would always have quadratic time complexity. This is why STL lists don’t have an index operator.

Let us admit it: Qt made us lazy! We were too lazy to write the for-loop with STL iterators.

// Example 1c: Iterator-Based Loop  (old C++)

// thing.cpp
void Thing::toggleTimers() {
    QList::iterator end = m_timerColl.end();
    for (QList::iterator timer = m_timerColl.begin(); timer != end; ++timer) 
    {
        if ((*timer)->isActive())
            (*timer)->stop();
        else
            (*timer)->start();
    }
}

Not excactly a pretty sight, but at least more efficient than index-based loops. The iterator timer is incremented in each step, which is more efficient than incrementing the index i and performing some arithmetical operations for the index operator.

Qt has – of course – a solution for the problems with index-based and iterator-based for-loops: the foreach macro.

// Example 1d: Foreach Macro (Qt/C++)

// thing.cpp (Qt/C++)
void Thing::toggleTimers() {
    foreach (QTimer *timer, m_timerColl) {
        if (timer->isActive())
            timer->stop();
        else
            timer->start();
    }
}

This version is as efficient as the iterator-based loop (see Example 2) and even simpler to read than the index-based loops (see Examples 1a and 1b). But it only works for Qt containers. It does not work for STL containers. Moreover, it’s a macro. If the type of the “iterator” variable contains a comma like QPair, the foreach gets confused. We would have to declare the “iterator” variable before the foreach loop.

We can conclude that all of the above for-loops have problems: performance, simplicity and generality. The new range-based for statement of C++11 solves all of these problems.

// Example 1e: Range-Based For-Loop (C++11)

// thing.cpp
void Thing::toggleTimers() {
    for (auto timer : m_timerColl) {
        if (timer->isActive())
            timer->stop();
        else
            timer->start();
    }
}

If we looked at the implementation of range-based for, we would see that it is implemented very similar to the iterator-based version. The range-based for combined with auto is a nifty way to avoid clumsy STL iterators without sacrificing performance.

When we move to C++11 in our projects, we should consistently use range-based for instead of index-based loops, iterator-based loops or foreach macros. Based on my experience, we should be able to do this for 95% of the for-loops. For-loops that require indices are pretty rare. This is fully in line with Swift – the new iOS programming language – dropping the increment and decrement operators in version 2.2.

Using STL Algorithms with Lambdas

Rule #84 of [CodingRules] advises us to “prefer algorithm calls to handwritten loops”, because algorithm calls are “more expressive and maintainable, less error-prone, and as efficient”. So, let us roll up our sleeves, rewrite some typical hand-written loops with STL or Qt algorithms, and discuss how the results compare to the originals.

Using the any_of Algorithm

Let us start with a simple example that was implemented in a pretty complicated way.

// Example 2a (old C++)

// In thing.h
QList<int> m_groups;

// In thing.cpp
bool Thing::containsGroup(int groupId) const
{
    bool didFind = false;
    for (int i = 0; i < m_groups.count(); ++i) {
        if (m_groups[i] == groupId) {
            didFind = true;
            break;
        }
    }
    return didFind;
}

When I looked at the code for the first time, I was puzzled for a couple of seconds and then started chuckling. In Qt, we could simply rewrite this function as follows.

// Example 2b (Qt)

// In thing.h
QList<int> m_groups;

// In thing.cpp
bool Thing::containsGroup(int groupId) const
{
    return m_groups.contains(groupId);
}

Now, it is more than obvious what this function does. It checks whether the groupId is contained in the container m_groups. The writer of the original code (Example 2a) wasted quite a few “brain cycles” to reinvent the wheel. The original solution is also inefficient, because it uses an index-based loop.

If we cannot use Qt containers, we can improve the original solution – even before the advent of C++11. We simply use the find algorithm of the STL.

// Example 2c (old C++)

// In thing.h
std::list<int> m_groups;

// In thing.cpp
#include <algorithm>
using namespace std;

bool Thing::containsGroup(int groupId) const
{
    return find(m_groups.begin(), m_groups.end(), groupId) != m_groups.end();
}

The function find() returns an iterator to the element in m_groups that is equal to groupId. If this iterator is not equal to the end iterator, m_groups contains an element with groupId and the function returns true. Otherwise, it returns false.

C++11 comes with a new function any_of, which returns true if the container m_groups contains at least one occurrence of groupId. This allows us to drop the comparison with the end iterator. C++11 also introduces constant versions of the begin and end iterator. As we do not modify the container, these constant versions are more appropriate than the non-constant ones. Using these two features, we can change the function containsGroup() as follows.

// Example 2d (C++11)

bool Thing::containsGroup(int groupId) const
{
    return any_of(m_groups.cbegin(), m_groups.cend(),
                  [groupId](int id) { return groupId == id; } );
}

The highlighted code is a lambda function, which returns true if the id of the current element is equal to groupId. The capture [groupId] says that the lambda expression may only use the variable groupId from the outside. The expression (int id) is the parameter list of the lambda function. While iterating over the container, any_of will pass each element of the container – with type int – to the lambda function as the function’s argument.

Preferring algorithm calls to handwritten loops – as stated in rule #84 of [CodingRules] – is certainly good advice. All refactored versions (Examples 2b, 2c and 2d) are much easier to understand than the original version (Example 2a). We can look at them and know immediately what they do. They are certainly more efficient and less error-prone.

I think that the STL could do better than with the any_of algorithm. The STL provides any_of, which is equivalent to std::find_if and which can be implemented using find_if. However, the STL does not provide any equivalent to find. Unfortunately, such an algorithm is neither part of C++14 nor C++17.

Fortunately, we can easily write such a function. We call the new function contains, because overloading of any_of does not work and because it is a more telling name than any_of.

// Example 2e (C++)
template<class InputIt, class T>
bool contains(InputIt first, InputIt last, const T& value)
{
    return std::find(first, last, value) != last;
}

Then, we can use our brand new contains algorithm to simplify our containsGroup function.

// Example 2f (C++11)
bool Thing::containsGroup(int groupId) const
{
    return contains(m_groups.cbegin(), m_groups.cend(), groupId); 
}

I would prefer this version over the Qt version (Example 2b), because it is more general in supporting both Qt and STL containers, that is, it is more robust when things change. More developers should know this solution than the Qt solution. It may be a candidate for QtAlgorithms. By the way, any_of, which actually behaves like any_of_if, should be made available as contains_if.

Using the find_if Algorithm

What does the for-loop in the function findConfigElementById() do?

// Example 3a (old C++)

// thing.h
struct ConfigElement {
    ConfigElement() : id(-1), value(0) {}
    ConfigElement(int cid, const QString &cname, int cvalue)
        : id(cid), name(cname), value(cvalue) {}
    int id;
    QString name;
    int value;
};

QList<ConfigElement> m_configElements;

// thing.cpp
bool Thing::findConfigElementById(int id, ConfigElement *element) const
{
    bool didFind = false;
    ConfigElement currentElement;
    int numberOfConfigElements = m_configElements.size();
    for (int i = 0; i < numberOfConfigElements; ++i) {
        currentElement = m_configElements.at(i);
        if (currentElement.id == id) {
            didFind = true;
            *element = currentElement;
            break;
        }
    }
    return didFind;
}

// client.cpp
ConfigElement element;
if (aThing.findConfigElementById(5, &element))
    // do something with element
else
    // do something else with element

Well, I guess it took all of us some time to figure out that this function actually performs a find_if. Then, let us simply write down the intent of the for-loop directly.

// Example 3b (C++11)

// thing.cpp
bool Thing::findConfigElementById(int id, ConfigElement *element) const
{
    auto pos = find_if(m_configElements.cbegin(), m_configElements.cend(),
                       [id](const ConfigElement &elem) { return elem.id == id; });
    if (pos != m_configElements.cend()) {
        *element = *pos;
        return true;
    }
    return false;
}

Using the algorithm find_if states the intent of the function very clearly. The return values depend on whether an element was found in the container or not. Hence, the if-statement is absolutely normal. Calling find_if and computing return values depending on the result of find_if becomes second nature, once we start using STL algorithms regularly.

The new implementation is also more efficient. It got rid of the index-based loop with direct access to the container elements. It also replaces three temporary variables by one. Finally, the new implementation makes less assumptions about the container used. It can use any STL or Qt container. The container does not have to provide direct access to its element. We can easier switch to a different container. All in all, the advice of [CodingRules] to “prefer algorithm calls to handwritten loops” is good advice.

This time Qt cannot play the role of the white knight and save us with a super-simple solution. We can, however, come up with a pretty simple solution using a range-based for-loop.

// Example 3c (C++11)

// thing.cpp
bool Thing::findConfigElementById(int id, ConfigElement *element) const
{
    for (const auto &elem : m_configElements) {
        if (elem.id == id) {
            *element = elem;
            return true;
        }
    }
    return false;
}

It is important to write const auto &elem instead of auto elem in the for-loop. The latter form would copy the current element into the loop variable elem in each iteration (see the implementation of the range-based for-loop for details). The former form with the constant reference avoids these copies – and is more efficient.

The solution shown in Example 3c is as efficient and as general as the solution with the find_if algorithm (Example 3b). On top of that, it is simpler to read and write.

This result is typical for almost all of the 900+ for-loops: The range-based for yields simpler code than the STL algorithms. The reason is that STL algorithms with lambda expressions tend to be pretty lengthy and hence feel a bit clumsy. However, we don’t have to say good-bye to the advice of “preferring algorithm calls to handwritten loops”, as Jossutis points out in [StandardLibrary]:

Historically, one of the most important algorithms was for_each(). […] Note, however, that since C++11, the range-based for loop provides this behavior more conveniently and more naturally […]. Thus, for_each() might lose its importance over time.

This settles it: range-based for is an algorithm in disguise!

I have bad news. We are not yet done with improving our function. The function violates the most important principle of object-oriented design: the single-responsibility principle! Its first responsibility is to check whether there is an element matching a given criterion. Its second responsibility is to return the element itself through its second parameter element.

We can fix this by making findConfigElementById() return an object of type ConfigElement. By checking the validity of the returned object, clients can figure out whether the container holds an element satisfying the given criterion. Let us put this in code.

// Example 3d (C++11)

// thing.h
struct ConfigElement {
    ...
    bool isValid() const { return id != -1; }
};

// thing.cpp
ConfigElement Thing::findConfigElementById(int id) const
{
    for (const auto &elem : m_configElements) {
        if (elem.id == id) {
            return elem;
        }
    }
    return ConfigElement{};
}

// client.cpp
const auto element = aThing.findConfigElementById(5);
if (element.isValid())
    // do something with element
else
    // do something else with element

Changing the interface of findConfigElementById() improves the client code. We can define element as constant, which isn’t possible in the original solution. We can express exactly whether we want to modify element later on or not.

Changing the interface has another less visible advantage. It is slightly more efficient than the previous version (Example 3c). For the previous version, client code always creates a ConfigElement object before it calls the function findConfigElementById() – no matter whether the function finds a matching element or not. The latest version creates such a default object only if the function doesn’t find anything. Return-value optimisation makes sure that this object is constructed directly into the element variable of the client code – without any copying taken place.

But that’s not all. When the function finds a value, the previous version copy assigns the found element to the return value (*element = elem), whereas the latest version copy constructs its return value (return elem). This is more efficient because the copy assignment must call the destructors on every member variable of the target of the assignment. Copy construction doesn’t incur this overhead.

So, following coding rules and principles pays off. We have turned the original implementation of findConfigElementById() into a much simpler and slightly more efficient implementation. Good job!

Using the replace_if Algorithm

We have looked at non-modifying for-loops so far. This is going to change now.

// Example 4a (old C++)

// thing.h
struct Parameter {
    int id;
    int value;
};
QList<Parameter> m_group;

// thing.cpp
void ForLoopsTest::updateParameterInGroup(const Parameter &param)
{
    Parameter current;
    for (int i = 0; i < m_group.size(); ++i) {
        current = m_group[i];
        if (current.id == param.id) {
            m_group.replace(i, param);
        }
    }
}

This function replaces every element current in the list m_group by the new element param, if the IDs of current and param are equal. The function performs a replace_if. We can easily rewrite this function with a range-based for.

// Example 4b (C++11)

// thing.cpp
void ForLoopsTest::updateParameterInGroup(const Parameter &param)
{
    for (auto &current : m_group) {
        if (current.id == param.id) {
            current = param;
        }
    }
}

Note that the loop variable current is declared as a non-constant variable, because the assignment current = param changes the object in the list m_group referred to by current.

The following solution using the replace_if makes it slightly more obvious than the previous solution what’s going on. But, as usual, it’s a bit longer.

// Example 4c (C++11)

// thing.cpp
void ForLoopsTest::updateParameterInGroup(const Parameter &param)
{
    replace_if(m_group.begin(), m_group.end(),
               [param](const Parameter &current) { return current.id == param.id; },
               param);
}

Note that we must use the non-constant begin and end iterator, because we change the underlying container by the replacements.

Using the accumulate Algorithm

This is the first example, where the STL algorithm is less efficient than a range-based for-loop. The index-based for-loop runs through a list of string pairs, concatenates the string pairs in a certain format, and appends the concatenated string to the result.

// Example 5a (old C++)

// thing.h
QList< QPair<QString, QString> > m_versions;

// thing.cpp
QString Thing::getModuleVersions() const
{
    QString result;
    int versionSize = m_versions.size();
    for (int i = 0; i < versionSize; ++i) {
        result += QString("%1: %2\n").arg(m_versions[i].first).arg(m_versions[i].second);
    }
    return result;
}

Rewriting this function with a range-based for-loop yields the following result.

// Example 5b (C++11)

// thing.cpp
QString Thing::getModuleVersions() const
{
    QString result;
    for (const auto &version : m_versions) {
        result += QString("%1: %2\n").arg(version.first).arg(version.second);
    }
    return result;
}

The for-loop accumulates the result line by line. Hence, the accumulate algorithm from the numeric header should be a good match.

// Example 5c (C++11)

// thing.cpp
#include <numeric>
using namespace std;

QString Thing::getModuleVersions() const
{
    return accumulate(m_versions.begin(), m_versions.end(), QString{},
               [](const QString &result, const QPair<QString, QString> &version) {
                   return result + QString("%1: %2\n").arg(version.first).arg(version.second);
               });
}

The third argument QString{} in the accumulate call is the initial value of the accumulated result. In each iteration, the algorithm concatenates the result accumulated so far with a new value and assigns this concatenated string to the accumulated result. The range-based for-loop does not need to compute this concatenated string but can simply append the new value to the accumulated result. Furthermore, if the accumulated type (here QString) does not support move semantics or does not use implicit sharing like QString or Qt’s container types, assigning the concatenation to the accumulated result performs a deep copy.

Hence, the range-based for solution is more efficient than the accumulate algorithm and should be the preferred solution.

Using the copy Algorithm on a C-Style Array

The following example copies data from one C-style array – starting at the third position – to another C-style array.

// Example 6a (old C++)

// thing.h
static const quint8 UDS_CAN_OFFSET = 2;
static const quint8 SIZE_CAN_MESSAGE = 8;
static const quint8 SIZE_UDS_MESSAGE = 200;
quint8 m_udsMessage[SIZE_UDS_MESSAGE];


// thing.cpp
void Thing::convertCanToUdsMessage(const quint8 *canMessage)
{
    for (quint8 i = UDS_CAN_OFFSET; i < SIZE_CAN_MESSAGE; ++i) {
        m_udsMessage[i - UDS_CAN_OFFSET] = canMessage[i];
    }
}

Even before C++11, we were able to replace the handwritten loop by the copy algorithm. The for-loop shrinks to a single line.

// Example 6b (old C++)

// thing.cpp
#include <algorithm>
using namespace std;

void Thing::convertCanToUdsMessage(const quint8 *canMessage)
{
    copy(canMessage + UDS_CAN_OFFSET, canMessage + SIZE_CAN_MESSAGE, m_udsMessage);
}

The copy call copies the elements from index 2 to index SIZE_CAN_MESSAGE - 1 in the array canMessage to the array m_udsMessage. The destination array m_udsMessage must be big enough to store all the copied elements.

The big problem with the above implementations is that they use C-style arrays. CAN messages can be longer than 8 bytes. The above functions would simply ignore all additional bytes. UDS messages are typically much shorter than 200 bytes. So, the functions waste memory. It would be a good idea to replace the C-style array by std::vector or QVector, which is actually what rule #77 of [CodingRules] (“use vector instead of arrays”) suggests.

Before we switch from C-style arrays to QVector, we should rewrite the above function such that it works both with C-style arrays and QVector. We would need the iterator functions begin() and end(), which are provided by a class like QVector but not by a built-in type like a C-style array. C++11 comes to our rescue with the global versions of begin() and end() taking a container as their argument.

There is one little problem left. We can apply the global begin() and end() functions only to C-style arrays whose size is known. By passing the canMessage as const quint8 *canMessage or const quint8 canMessage[], we strip off the size information. We can fix this by passing a reference to the array canMessage of fixed size SIZE_CAN_MESSAGE. This is written in a slightly capricious way as const quint8 (&canMessage)[SIZE_CAN_MESSAGE].

// Example 6c (C++11)

// thing.cpp
void Thing::convertCanToUdsMessage(const quint8 (&canMessage)[SIZE_CAN_MESSAGE])
{
    copy(begin(canMessage) + UDS_CAN_OFFSET, end(canMessage), begin(m_udsMessage));
}

Now, we can simply change the type of the parameter canMessage to const QVector<quint8> & – without changing the implementation of our function.

// Example 6d (C++11)

// thing.h
static const quint8 SIZE_UDS_MESSAGE = 200;
quint8 m_udsMessage[SIZE_UDS_MESSAGE];

// thing.cpp
void Thing::convertCanToUdsMessage(const QVector<quint8> &canMessage)
{
    copy(begin(canMessage) + UDS_CAN_OFFSET, end(canMessage), begin(m_udsMessage));
}

If we change the type of m_udsMessage to QVector<quint8> as well, we must make sure that QVector reserves enough space to store all the elements from canMessage. Otherwise, the copy algorithm will crash, because it accesses non-existing indices of the target vector m_udsMessage.

Reserving enough space for the vector m_udsMessage undoes the advantage of using a vector over a C-style array. The remedy is to use back_inserter(m_udsMessage) instead of begin(m_udsMessage). The back_inserter() makes sure that the vector is big enough, before it copies an element to a non-existing position in m_udsMessage.

// Example 6e (C++11)

// thing.h
QVector<quint8> m_udsMessage;

// thing.cpp
void Thing::convertCanToUdsMessage(const QVector<quint8> &canMessage)
{
    copy(begin(canMessage) + UDS_CAN_OFFSET, end(canMessage), back_inserter(m_udsMessage));
}

The downside is that this version does not work with C-style arrays any more. It crashes for C-style arrays. Well, we cannot have everything.

This last version (Example 6e) would typically be the final step in a big refactoring, where we replaced C-style arrays by vectors. It is good proof of rule #77 from [CodingRules] that we should use vectors instead of C-style arrays. We eliminated quite a few opportunities to shoot ourselves in the foot.

Note that this example is one of the rare cases, where we cannot replace an index-based for-loop by a range-based for-loop. However, we can get rid of the indices by using the copy algorithm.

Conclusion

C++11’s new range-based for turns out to be a pretty powerful feature. When we rewrite handwritten for-loops with the new range-based for, the result is almost always simpler and often more efficient. Range-based for-loops are always simpler than iterator-based for-loops. Well, that was the main reason why range-based for was introduced in C++11, wasn’t it?

With the introduction of lambda expressions in C++11, we can consider using STL algorithms seriously for the first time – instead of handwritten for-loops. Lambda expressions are more flexible, more powerful and simpler to write and read than the function objects from the pre-C++11 era. As the examples above show, STL algorithms with lambda expressions tend to be lengthier than range-based for-loops. However, STL algorithms always state explicitly what they do. STL algorithms and range-based for have very similar performance most of the times.

The advice of rule #84 from [CodingRules] to “prefer algorithm calls to handwritten loops” is very good advice. And don’t forget that the range-based for is a powerful, yet simple replacement of the for_each algorithm. The rule should be updated to: “prefer algorithm calls (including range-based for) to handwritten loops”. Range-based for will often be your first choice, before you fall back to STL algorithms with lambdas.

References

  • [EffectiveCpp11] Scott Meyers. Effective Modern C++: 42 Specific Ways to Improve your Use of C++11 and C++14. O’Reilly, 2015
  • [CodingRules] Herb Sutter, Andrei Alexandrescu. C++ Coding Standards: 101 Rules, Guidelines, and Best Practices. Pearson Education, 2005
  • [StandardLibrary] Nicolai M. Jossutis. The C++ Standard Library, 2nd edition. Pearson Education, 2012