Iterators aren’t the problem, they just enable it

The post presenting the failure case for iterators contrasted this simple Python list comprehension:

res = [i*i for i in a if i > 0]

with its C++ equivalent using the STL:

vector<int> a {10, 11, -1, 12};

vector<int> t1 {};
copy_if (a.cbegin(), a.cend(), back_inserter(t1),
              [](auto i) { return i > 0; });

vector<int> res {};
transform (t1.cbegin(), t1.cend(), back_inserter(res),
               [](auto i) { return i * i; });

The list comprehension reduces the notation to the essentials of the algorithm: The source a and sink res are each mentioned once, while the code presents the transformations of an individual value i on its way from source to sink.

By contrast, the STL version obscures the transformation of i, reducing it to the parameter of two lambda expressions. The bulk of the C++ code is concerned with the source, sink, and the intermediate result t1. For each step, the programmer must specify the start and end of the range, decide whether the range is immutable (cbegin()/cend()) or mutable (begin()/end()), the class of transformation (copy_if or transform), the specifics of the transformation (via lambda expressions for the filter and the square), and, for each transformation, the extension strategy for the sink (in this case, a back_inserter). That’s a lot of details. Indeed, the actual transformations performed, so elegantly phrased in the list comprehension, are secondary in the STL version, for which the traversal mechanics are the primary focus.
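
To make those decision points visible, here is a sketch of the same STL code with each one annotated; the wrapper function and its name, positive_squares, are mine, added only so the fragment is self-contained:

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Each comment marks a decision the programmer must make explicitly.
std::vector<int> positive_squares(const std::vector<int>& a) {
    std::vector<int> t1;                        // intermediate result: container chosen by us
    std::copy_if(a.cbegin(), a.cend(),          // range bounds, immutable iterators
                 std::back_inserter(t1),        // sink extension strategy
                 [](int i) { return i > 0; });  // the filter itself
    std::vector<int> res;
    std::transform(t1.cbegin(), t1.cend(),      // again: bounds, constness
                   std::back_inserter(res),     // again: extension strategy
                   [](int i) { return i * i; }); // the actual transformation
    return res;
}
```

Five of the seven lines in the function body are devoted to traversal mechanics; only the two lambdas express the algorithm itself.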

The classic defence of designs requiring detailed specification of the mechanics is that they permit more efficient code. Certainly the STL version permits gcc 6.2 to generate great code (see the post linked above). It is much harder to determine whether the list comprehension will inevitably generate less efficient code. In statically-typed languages such as Haskell, the compiler has the same type information about the source and sink as C++. In just-in-time compiled languages such as JavaScript, the compiler can generate code for the source and sink types actually encountered in program runs. In both cases, the compiler has greater flexibility in how intermediate results are stored than the C++ compiler because the intermediate representation is left entirely to the compiler rather than specified by the programmer. I wouldn’t bet against a modern compiler generating code for the list comprehension as efficient as what the C++ compiler generates for the STL version.
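
For instance, a compiler free to choose the intermediate representation could fuse the filter and the square into a single pass and drop the intermediate container entirely. Sketched by hand in C++ (the function and its name, positive_squares_fused, are mine for illustration):

```cpp
#include <vector>

// One-pass fusion: the filter and the transformation run together,
// and no intermediate container like t1 exists at all.
std::vector<int> positive_squares_fused(const std::vector<int>& a) {
    std::vector<int> res;
    for (int i : a)
        if (i > 0)
            res.push_back(i * i);
    return res;
}
```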

Even for environments in which list comprehensions do not generate comparable code, for small lists the inefficiency is irrelevant outside tight inner loops. If a programmer can whip off simple code to transform small lists in the less-frequent sections of code, they have more time to optimize the code that genuinely slows the system. Execution efficiency is not an absolute virtue but only a requirement for code that contributes substantively to execution time.

The STL code is especially inefficient in terms of programmer time. Each decision requires at least momentary attention from the programmer, and each feature the programmer is required to specify incurs a cognitive load. Some of those decisions can be subsumed into idioms, such as always specifying const iterators (via cbegin()/cend()) for the source range of a copy() or copy_if() algorithm, but these idioms are highly local: you must specify regular, mutable iterators (via begin()/end()) for the range passed to generate(). Idioms only mitigate the load incurred by all these decisions; they do not eliminate it.
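
The generate() case illustrates how local the idiom is: because generate() writes through its iterators, the cbegin()/cend() habit that works for copy_if() would not even compile here. A small sketch (the function and its name, first_n_squares, are mine):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// generate() assigns through its iterators, so the range must be
// mutable: begin()/end(), not the cbegin()/cend() idiom used for sources.
std::vector<int> first_n_squares(std::size_t n) {
    std::vector<int> v(n);
    int k = 0;
    std::generate(v.begin(), v.end(),
                  [&k] { ++k; return k * k; });
    return v;
}
```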

The decisions required of a programmer writing the code are mirrored by the questions that must be answered by any programmer reading it. Although the process is somewhat different, the result is similar: code that specifies that many choices is harder to read than code that merely implies them.

The complexity arises from designing around iterators

I have framed this discussion in terms of iterators, but iterators themselves are not causing the complexity. The successful case showed that, with the right syntactic sugar, an iterator-based approach could be as clear and terse as the list comprehension approach. But the syntactic sugar of range-based for only addresses a small set of requirements. The moment we move outside that set, we have to specify traversal rules and containers for intermediate results.
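
For concreteness, a plain traversal is the kind of case the sugar does cover: the iterators are still driving the loop, but the programmer never names them. A minimal sketch (the helper and its name, sum, are mine):

```cpp
#include <vector>

// Range-based for hides begin()/end() entirely for a simple traversal.
int sum(const std::vector<int>& a) {
    int s = 0;
    for (int i : a)  // the iterators are implicit in the sugar
        s += i;
    return s;
}
```

As soon as the task becomes filter-and-transform into a fresh container, no comparable sugar applies and the full machinery returns.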

The complexity of the STL approach arises from the very core of the STL design: by separating containers from algorithms and linking them via the category of iterators supported by a given container, the design forces the programmer to consider all three parts. The designer can simplify around the edges, but the core complexity remains irreducible. Iterators are a cool idea, but designing collections and algorithms around them seems inevitably to make the resulting design complex to use.

There is one particular piece of the puzzle that is complex enough to warrant its own post. I’ve alluded several times to the choice of how to actually add results to the sinks. This is a use case where C++’s insistence on explicit allocation and construction of values goes from a strength to a liability. I’ll tackle this point in the next post.