In this post, we are continuing to explore lambdas and comparing them to function objects. In the previous post, Under the covers of C++ lambdas - Part 1: The static invoker, we looked at the static invoker. Part 2 takes a closer look at captures.
This post is once again all about what happens under the covers of lambdas and not about how and where to apply them. For those of you who would like to know how they work and where to use them, I recommend Bartłomiej Filipek's book C++ Lambda Story.
Bartek is also the one who made me look deeper into this post's topic, lambda captures. Capturing variables or objects is probably the most compelling thing about lambdas. A few weeks ago, Bartłomiej Filipek approached me with the example below, which also led to a C++ Insights issue (see issue #347). It was initially raised to Bartek by Dawid Pilarski during the review of Bartek's C++ Lambda Story book.
The code C++ Insights created for it was the following (yes, the past tense is intentional here):
Bartek's observation was that, the way C++ Insights shows the transformation, we get more copies than we should and want. Look at the constructor of __lambda_5_12. It takes a std::string object by copy. Then, in the constructor's member initializer list, _str is copied into str. That makes two copies. As a mental model, once again, think of str as being an expensive type. Bartek also checked what compilers do with a hand-crafted struct that leaves a bread-crumb for each special-member function called. I assume you are not surprised, but with real lambdas, there is no additional copy. So how does the compiler do this?
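A minimal sketch of that kind of experiment, assuming a bread-crumb type that simply counts copy constructions. The names Tracker, HandWrittenClosure, copy_count and the two helper functions are hypothetical. The hand-written class only mimics the transformation described above (constructor parameter by copy, then a copy into the member); it is not the actual C++ Insights output.

```cpp
// Counts how often the bread-crumb type below is copy-constructed.
static int copy_count = 0;

// Hypothetical bread-crumb type standing in for an expensive std::string.
struct Tracker
{
  Tracker() = default;
  Tracker(const Tracker&) { ++copy_count; }
};

// Hand-written closure type: takes the argument by copy (first copy)
// and copies it into the data member (second copy).
class HandWrittenClosure
{
  Tracker str;

public:
  HandWrittenClosure(Tracker _str) : str{_str} {}
  void operator()() const {}
};

// Copies made when a real lambda captures the object by copy.
int copies_with_lambda()
{
  Tracker str{};
  copy_count = 0;
  auto lambda = [str] { (void)str; };  // (void)str only silences warnings
  lambda();
  return copy_count;
}

// Copies made when the hand-written closure type is used instead.
int copies_with_hand_written_closure()
{
  Tracker str{};
  copy_count = 0;
  HandWrittenClosure closure{str};
  closure();
  return copy_count;
}
```

Calling both helpers from a test program should report one copy for the lambda and two for the hand-written class, which matches the observation in the text.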
First, let's see what the Standard says. N4861 [expr.prim.lambda.closure] p1 says the closure type is a class type. Then in p2:
The closure type is not an aggregate type.
Now, one thing that I think is key is the definition of an aggregate in [dcl.init.aggr] p1.2:
no private or protected direct non-static data members
To my reading, this is some kind of double negation. As the closure type is a class but not an aggregate, the data members must be private. All the other restrictions for aggregates are met anyway.
Then, back in [expr.prim.lambda.closure] p3:
The closure type for a lambda-expression has a public inline function call operator...
Here, public is explicitly mentioned. I read this as meaning that we use class rather than struct to define the closure type.
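Some of these properties can be checked with type traits. The following is a sketch, assuming a C++17 compiler; make_closure, ClosureType, PublicMembers and PrivateMember are illustrative names. The traits cannot inspect the access of the closure's data members directly. They only show that the closure type is a class and not an aggregate, and that a private data member alone is enough to make a hand-written class a non-aggregate.

```cpp
#include <string>
#include <type_traits>

// Illustrative helper: returns a closure object that captures a
// std::string by copy.
inline auto make_closure()
{
  std::string str{"captured"};
  return [str] { return str.size(); };
}

using ClosureType = decltype(make_closure());

// The closure type is a class type ...
static_assert(std::is_class_v<ClosureType>);
// ... but it is not an aggregate.
static_assert(!std::is_aggregate_v<ClosureType>);

// For contrast: a struct with only public data members is an aggregate.
struct PublicMembers
{
  std::string str;
};
static_assert(std::is_aggregate_v<PublicMembers>);

// Making the data member private is enough to lose aggregate status.
class PrivateMember
{
  std::string str;
};
static_assert(!std::is_aggregate_v<PrivateMember>);
```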
What does the Standard say about captures? The relevant part for this discussion is [expr.prim.lambda.capture] p15:
When the lambda-expression is evaluated, the entities that are captured by copy are used to direct-initialize each corresponding non-static data member of the resulting closure object
The data members are direct-initialized! Remember, we have a class, and the data members are private.
Captures Fact Check
The AST C++ Insights uses from Clang says that the closure type is defined with class. It also says that the data members are private. So far, the interpretation of the Standard seems fine. I don't tweak or interfere at this point. But Clang doesn't provide a constructor for the closure type! This is the part that C++ Insights makes up. This is why it can be wrong. And this is why the C++ Insights transformation was wrong for Bartek's initial example. But wait, the data members are private, and there is no constructor. How are they initialized? Especially with direct-init?
Do capturing lambdas have a constructor?
I discussed this with Jason, I think at last year's code::dive. He also pointed out that C++ Insights shows a constructor while it is unclear whether there really is one. [expr.prim.lambda.closure] p13 says the following:
The closure type associated with a lambda-expression has no default constructor if the lambda-expression has a lambda-capture and a defaulted default constructor otherwise. It has a defaulted copy constructor and a defaulted move constructor ([class.copy.ctor]). It has a deleted copy assignment operator if the lambda-expression has a lambda-capture and defaulted copy and move assignment operators otherwise...
There is no explicit mention of a constructor to initialize the data members. But even with a constructor, we cannot get direct-init. How does it work efficiently?
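What a capturing lambda exposes to us can be probed from the outside. This is a sketch, assuming C++17 for the _v traits and the message-less static_assert; make_capturing_lambda and Capturing are hypothetical names. It only shows what user code can observe, not how the compiler initializes the members internally.

```cpp
#include <string>
#include <type_traits>

// Illustrative helper: returns a lambda that captures a std::string by copy.
inline auto make_capturing_lambda()
{
  std::string str{"captured"};
  return [str] { return str; };
}

using Capturing = decltype(make_capturing_lambda());

// No default constructor when there is a lambda-capture.
static_assert(!std::is_default_constructible_v<Capturing>);

// Copy and move constructors are available.
static_assert(std::is_copy_constructible_v<Capturing>);
static_assert(std::is_move_constructible_v<Capturing>);

// The copy assignment operator is deleted when there is a lambda-capture.
static_assert(!std::is_copy_assignable_v<Capturing>);

// There is no constructor that user code could call with the captured value.
static_assert(!std::is_constructible_v<Capturing, std::string>);
static_assert(!std::is_constructible_v<Capturing, const std::string&>);
```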
Suppose we have a class with a private data member. In that case, we can get direct-init behavior by using in-class member initialization (or default member initializer, as it is called in the Standard).
Here we define a variable in an outer scope (A) and use it later (B) to initialize a private member of Closure. That works, but note that inside Closure, it is _x now. We cannot use the same name for the data member as the one from the outer scope. The data member would shadow the outer definition and be initialized with itself. For C++ Insights, I cannot show it that way if I don't replace all captures in the call operator with a prefixed or suffixed version.
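A minimal sketch of this idea, assuming a namespace-scope variable as the outer x. The call operator and the call_closure helper are additions for illustration, so that the private member can be observed from outside.

```cpp
// Sketch only: x, _x, Closure and call_closure are illustrative names.
int x = 42;  // (A) variable in an outer scope

class Closure
{
  int _x{x};  // (B) private member, direct-initialized from the outer x

public:
  // Added for illustration so the private member can be read.
  int operator()() const { return _x; }
};

// Creates a Closure and returns the value its member was initialized with.
int call_closure()
{
  Closure closure{};
  return closure();
}
```

Renaming the member from _x to x would make the initializer refer to the member itself rather than to the outer variable, which is the shadowing problem mentioned above.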
Once again, we are in compiler-land. Here is my view. All the restrictions like private and a constructor are just firewalls between C++ developers and the compiler. It is an API, if you like. Whatever the compiler does internally is up to the compiler, as long as the result is as specified by the Standard. Roughly, Clang does exactly what we as users are not allowed to do: to some extent, it uses in-class member initialization. In the case of a lambda, the compiler creates the closure type for us. Variable names are only important to the compiler while parsing our code. After that, the compiler thinks and works with the AST. Names are less important in that representation. What the compiler has to do is remember that the closure type's x was initialized with an outer scope x. Believe me, that is a power the compiler has.
C++ Insights and lambda's constructors
Thanks to Bartek's idea, the constructors of lambdas take their arguments by const reference now. This helps in most cases to make the code behave close to what the compiler does. However, when a variable is moved into a lambda, the C++ Insights version is still slightly less efficient than what the compiler generates. Here is an example:
If you run this on your command-line or in Compiler Explorer, you get the following output:
This is the transformed version from C++ Insights:
Here is the output which you can see on Compiler Explorer:
Notice the second move-ctor? This is because there is still no direct-init. I need a second move in the lambda's constructor to keep the move'ness. The compiler still beats me (or C++ Insights).
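The difference can be reproduced with a move-counting type. The sketch below rests on assumptions: Movable, HandWrittenMoveClosure, move_count and the helper functions are hypothetical, and the hand-written class (a constructor taking an rvalue reference and moving into the member) only approximates a transformation with a constructor. It is not the actual C++ Insights output.

```cpp
#include <utility>

// Counts how often Movable's move constructor runs.
static int move_count = 0;

// Hypothetical move-only bread-crumb type.
struct Movable
{
  Movable() = default;
  Movable(const Movable&) = delete;
  Movable(Movable&&) noexcept { ++move_count; }
};

// Hand-written closure type: the constructor has to move once more
// to get the argument into the data member.
class HandWrittenMoveClosure
{
  Movable x;

public:
  HandWrittenMoveClosure(Movable&& _x) : x{std::move(_x)} {}
  void operator()() const {}
};

// Moves performed when a real lambda uses an init-capture with std::move.
int moves_with_lambda()
{
  Movable m{};
  move_count = 0;
  auto lambda = [x = std::move(m)] { (void)x; };
  lambda();
  return move_count;
}

// Moves performed with the hand-written closure type: one to create the
// temporary, one more inside the constructor.
int moves_with_hand_written_closure()
{
  Movable m{};
  move_count = 0;
  HandWrittenMoveClosure closure{Movable{std::move(m)}};
  closure();
  return move_count;
}
```

With this setup, the lambda needs a single move, while the hand-written variant needs two: one for the temporary and one in the constructor.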
Lambdas: 2, Function objects: 0
In the next part of the lambda series, I will go into details about generic lambdas. We will continue to compare lambdas to function objects and see which, in the end, scores better.
I’m grateful to Bartłomiej Filipek for reviewing a draft of this post.