Write more C++ code thanks to constexpr
Since the keyword constexpr and its behavior were added to C++, it has been improved in every new standard of the language.
I'm a big fan of constexpr, and I'm not alone. Jason Turner is also very vocal about it, having coined the term "constexpr all the things".
Still, demonstrating the power of constexpr is difficult. I know that from my training classes and various consulting contracts. Today, I'd like to share a story from a while back, when a customer hired me as a consultant. They were developing an embedded system and ran out of memory. Not at run-time, but before: the features they wanted to put on the chip were too big in code size and, to some extent, in RAM.
Initial constexpr-free example
They used a class I've seen a couple of times in embedded systems, with some variations: a string that brings its memory piggyback. Here is a reduced implementation.
[Listing: reduced implementation of FixedStringBase and FixedString]
Disclaimer: this touches undefined behavior (UB). You must ensure there is no padding between the base class and the derived class. Let's assume they took care of that, which they did.
What you're looking at first is the base class FixedStringBase. Its purpose is to provide an interface. In the original, it offers a few more functions. The class used to create such a string is FixedString. It is a class template taking a non-type template parameter N, the size of the storage it reserves as a char array.
The constructor of FixedString determines the length of the C-string literal and copies the data into the char array.
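The description above can be sketched roughly as follows. This is a minimal sketch, not the customer's code: the member names mLength and mData, the data() accessor, the helper bounded(), the truncation of over-long input, and the use of std::strlen and std::memcpy are all assumptions made for illustration. To stay clear of UB, the sketch exposes the characters from the derived class instead of reaching from the base class into the storage behind it.

```cpp
#include <cstddef>
#include <cstring>

// Base class: the common interface. Only the length lives here.
class FixedStringBase {
public:
  explicit FixedStringBase(std::size_t len) : mLength{len} {}

  std::size_t length() const { return mLength; }

private:
  std::size_t mLength;
};

// N is the number of chars reserved as storage, terminator included.
template<std::size_t N>
class FixedString : public FixedStringBase {
public:
  // Determines the length of the C-string and copies the characters.
  // Assumption: input that does not fit is truncated to N - 1 chars.
  explicit FixedString(const char* str)
  : FixedStringBase{bounded(std::strlen(str))}
  {
    std::memcpy(mData, str, length());
  }

  const char* data() const { return mData; }

private:
  // Hypothetical helper: keeps one char free for the terminating '\0'.
  static std::size_t bounded(std::size_t len) { return (len < N) ? len : N - 1; }

  char mData[N]{};  // zero-initialized, so the copy stays null-terminated
};
```

Note that both constructors do their work at run-time here, which becomes relevant later in the story.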
Below is an example of how such a string is used. There are more use cases, but this is the most interesting one.
[Listing: usage example with the two variables A and B]
The first variable, A, is created by hand. As you can see, the person who created that string didn't commit to its length and over-allocated.
The second variable, B, uses the better approach, with the help of make_fixed_string. This function uses a trick to determine the length of the string at compile-time. Using that length, the function creates an ideally sized FixedString. Better.
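The two variables could look roughly like the sketch below. The variable names x and y, the literal "Hello", and the size 50 are made up for illustration. The sketch also assumes that the compile-time "trick" is the common one of taking the literal by reference to an array, so that template argument deduction yields N, including the terminating '\0'.

```cpp
#include <cstddef>
#include <cstring>

class FixedStringBase {
public:
  explicit FixedStringBase(std::size_t len) : mLength{len} {}
  std::size_t length() const { return mLength; }

private:
  std::size_t mLength;
};

template<std::size_t N>
class FixedString : public FixedStringBase {
public:
  // Assumption: str is null-terminated and fits into N chars.
  explicit FixedString(const char* str) : FixedStringBase{std::strlen(str)}
  {
    std::memcpy(mData, str, length());
  }
  const char* data() const { return mData; }

private:
  char mData[N]{};
};

// Assumed trick: N is deduced from the array type of the literal,
// so "Hello" yields N == 6 (five characters plus '\0').
template<std::size_t N>
FixedString<N> make_fixed_string(const char (&str)[N])
{
  return FixedString<N>{str};
}

const static FixedString<50> x{"Hello"};               // A: sized by hand
const static auto y = make_fixed_string("Hello");      // B: sized by the compiler
```

Both objects hold the same five characters, but x reserves 50 chars of storage while y reserves only the 6 it needs.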
How are things going
But how good are both versions? The answer is: terrible. This implementation spends too much memory. The application stores both the C-string literal and the storage for the FixedString. Additionally, since both variables are static and const, in an optimal world you would see neither a call to the constructor nor its code. But even with -O3, there is a lot of code for, well, nothing.
Once I saw this, I suggested sprinkling a bit of constexpr into that code. I was told that it would be a waste of time since, after all, a lot of the use cases involved altering a FixedString at run-time. I pointed out that the sizeable number of static strings still consumed more memory than I liked. We decided to do the experiment.
I added constexpr to the two constructors, the member function length, and the make_fixed_string function. Additionally, I replaced the const with constexpr for the two variables.
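Applied to a reduced class, those changes might look like the sketch below. Again, this is an illustration under assumptions, not the customer's code: the member names, the data() accessor, the helper bounded_length(), and the variables x and y are hypothetical. Because std::strlen and std::memcpy are not usable in constant expressions in standard C++, the sketch replaces them with plain loops, which C++14 and later allow in constexpr functions.

```cpp
#include <cstddef>

class FixedStringBase {
public:
  constexpr explicit FixedStringBase(std::size_t len) : mLength{len} {}
  constexpr std::size_t length() const { return mLength; }

private:
  std::size_t mLength;
};

template<std::size_t N>
class FixedString : public FixedStringBase {
public:
  // Length computation and copy are plain loops, so the whole
  // constructor can be evaluated at compile-time.
  constexpr explicit FixedString(const char* str)
  : FixedStringBase{bounded_length(str)}
  {
    for(std::size_t i = 0; i < length(); ++i) { mData[i] = str[i]; }
  }

  constexpr const char* data() const { return mData; }

private:
  // Hypothetical helper: counts chars up to '\0', at most N - 1.
  static constexpr std::size_t bounded_length(const char* str)
  {
    std::size_t len = 0;
    while((len < N - 1) && (str[len] != '\0')) { ++len; }
    return len;
  }

  char mData[N]{};
};

template<std::size_t N>
constexpr FixedString<N> make_fixed_string(const char (&str)[N])
{
  return FixedString<N>{str};
}

constexpr static FixedString<50> x{"Hello"};              // A
constexpr static auto y = make_fixed_string("Hello");     // B

// These only compile if the objects really are built at compile-time.
static_assert(x.length() == 5, "length of x is known at compile-time");
static_assert(y.length() == 5, "length of y is known at compile-time");
static_assert(y.data()[0] == 'H', "contents of y are known at compile-time");
```

Declaring the variables constexpr forces the compiler to evaluate the constructors during compilation and rejects the code otherwise, which is a stronger promise than const gives.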
Well, what should I say...
The big bang
Big surprise! Less so on my side :-)
The code Clang generated for this brief example differed significantly. The constexpr solution used 31 lines of assembly, while the original code required 42. But, more importantly, the memory consumption went down. In the constexpr solution, the compiler understands what we are trying to achieve and does not include the C-string literal at all. This effect cannot be measured by counting assembly instructions because it affects the data segment.
The compiler also understands the layout of FixedString: you can see the original string stored directly in the place where it belongs. Needless to say, there is no code for the constructor present in the binary in this example.
Peeking into the assembly code
These are the relevant lines of assembly Clang produced for the constexpr-less version:
[Listing: assembly Clang produced for the constexpr-less version]
Here you see the assembly for the constexpr-all-the-things version (a bit of a lie, more could be constexpr):
[Listing: assembly Clang produced for the constexpr-all-the-things version]
Here, you can see that the compiler understands the FixedString layout. You can see the length of the string and the string itself. It's like a C-string, which keeps things easy even for debugging or analyzing the binary via objdump and friends.
Want to see for yourself? Sure, here is a link to Compiler Explorer: compiler-explorer.com/z/4hx1KdGf8.
Takeaways
Of course, measure, measure, measure. But also, please understand that constexpr can really impact your code, and not just its size. The above also gives your application a speed-up because there is no code to initialize a static string anymore.
Andreas
P.S.: Sorry for the title. But if you think about it, it's true. Thanks to constexpr, you can write more code that still fits onto the device!