C++20 ranges benefits: avoid dangling pointers
In my last monthly post, C++20 benefits: consistency with ranges, we looked at what ranges do for us when it comes to consistency and how we can get the same level of consistency for our code.
Today I'd like to continue with the last example and see how ranges protect us from dangling pointers, another property that is great to have in our codebase.
Dangling pointers are bad
Okay, I assume that you already know that dangling pointers are bad. Just to be on the same page, let's recap what dangling pointers are and how quickly we can accidentally create one.
In the previous post, we finished by creating our custom begin function, which works for both free and member functions:
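A minimal sketch of how such a function could look is shown here. The types Container and OtherContainer, the detection trait has_member_begin, and the helper begin_impl are assumptions made for illustration, not the listing from the previous post. With C++20, a requires-expression could replace the trait.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical container that offers begin() as a member function.
struct Container {
  int data[3]{2, 3, 4};
  constexpr int* begin() { return data; }
};

// Hypothetical container that offers begin() as a free function.
struct OtherContainer {
  int data[3]{5, 6, 7};
};
constexpr int* begin(OtherContainer& c) { return c.data; }

namespace custom {
  namespace details {
    // Assumed helper trait: true if T has a member begin().
    template<class T, class = void>
    struct has_member_begin : std::false_type {};

    template<class T>
    struct has_member_begin<T, std::void_t<decltype(std::declval<T&>().begin())>>
    : std::true_type {};

    template<class R>
    constexpr auto begin_impl(R& rng)
    {
      if constexpr(has_member_begin<R>::value) {
        return rng.begin();  // member function
      } else {
        return begin(rng);   // free function, found via ADL
      }
    }
  }  // namespace details

  template<class R>
  constexpr auto begin(R&& rng)
  {
    return details::begin_impl(rng);
  }
}  // namespace custom

// Both kinds of container go through the same call.
constexpr int first_of_member()
{
  Container c{};
  return *custom::begin(c);
}

constexpr int first_of_free()
{
  OtherContainer oc{};
  return *custom::begin(oc);
}
```

The dispatch lives in details so that the unqualified call to begin inside begin_impl does not find custom::begin itself.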
We used it by simply calling begin and ignoring what it returned. That example does not really excel, as the result of begin is never used. Let's change that. Suppose we have the following code:
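A minimal sketch of such code could look like this. Container and the reduced custom::begin, which only handles the member-function case, are assumptions for illustration. The comments A and B mark the two steps.

```cpp
// Hypothetical container with a member begin(), assumed for illustration.
struct Container {
  int data[3]{2, 3, 4};
  constexpr int* begin() { return data; }
};

namespace custom {
  // Reduced stand-in for the begin function: member-function case only.
  template<class R>
  constexpr auto begin(R&& rng)
  {
    return rng.begin();
  }
}  // namespace custom

constexpr int use_lvalue()
{
  Container c{};                 // c lives until the end of the function
  auto iter = custom::begin(c);  // A: retrieve the begin iterator
  return *iter;                  // B: dereference it
}
```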
In A, we use begin to retrieve the begin iterator, and in B, we dereference it. This is, for example, what a range-based for loop does. Now the code does something, at least a bit of something, and that is enough for today's purpose.
This code works and is perfectly fine. But what happens if we change it slightly, like this:
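Sketched under the same assumptions as before (a hypothetical Container and a reduced custom::begin), the changed code might look like this. D stays commented out because executing it would be undefined behavior.

```cpp
#include <type_traits>

// Hypothetical container with a member begin(), assumed for illustration.
struct Container {
  int data[3]{2, 3, 4};
  constexpr int* begin() { return data; }
};

namespace custom {
  // Reduced stand-in for the begin function: member-function case only.
  template<class R>
  constexpr auto begin(R&& rng)
  {
    return rng.begin();
  }
}  // namespace custom

// The call with a temporary compiles and yields an ordinary int*,
// so nothing in the type warns us about the problem.
static_assert(std::is_same_v<decltype(custom::begin(Container{})), int*>);

inline void use_temporary()
{
  auto iter2 = custom::begin(Container{});  // C: the temporary dies at this semicolon
  // *iter2;                                // D: would read through a dangling pointer
  (void)iter2;  // only silences the unused-variable warning
}
```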
In this case, we pass a temporary object to custom::begin. This changes everything. The call to custom::begin is fine. Dereferencing iter2 isn't. We have a dangling pointer. The temporary object is destroyed at the end of the full expression, at the semicolon in C.
Once we start using iter2, we are looking at undefined behavior.
Avoid dangling pointers - Strategy 1
One approach to avoid a dangling pointer is to have custom::begin reject temporaries. Only lvalue references make sense here. A simple approach comes to mind: let's ban all other types with a static_assert:
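One way this could look is sketched below. Container and the reduced custom::begin are assumptions for illustration, and so is the message text of the static_assert. The check works because a forwarding reference deduces R as an lvalue reference only when an lvalue is passed.

```cpp
#include <type_traits>

// Hypothetical container with a member begin(), assumed for illustration.
struct Container {
  int data[3]{2, 3, 4};
  constexpr int* begin() { return data; }
};

namespace custom {
  template<class R>
  constexpr auto begin(R&& rng)
  {
    // A: reject everything that is not an lvalue
    static_assert(std::is_lvalue_reference_v<R>,
                  "custom::begin must not be called with a temporary");

    return rng.begin();
  }
}  // namespace custom

constexpr int use_lvalue()
{
  Container c{};
  return *custom::begin(c);  // fine, c is an lvalue
}

// custom::begin(Container{});  // would not compile: the static_assert fires
```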
A slight change, here in A, and we effectively prevent our custom::begin from being called with temporaries.
This is good, and in some cases exactly what you want. However, this is not what ranges do. The static_assert solution has one drawback: we cannot pass a temporary to it at all. Ah, wasn't that the purpose of this exercise? Yes, but passing the temporary isn't the issue. As long as we do not dereference the result, it doesn't matter.
Let's look at an alternative solution.
Avoid dangling pointers - Strategy 2
static_assert is the limiting factor here. The std::is_lvalue_reference_v<R> part is fine.
We know the type at compile-time, so let's use another constexpr if to guard the good case and return a special type dangling in all other cases:
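A minimal sketch of that idea follows. Container and the reduced custom::begin are again assumptions, and dangling is only an empty stand-in here, since its name is all this sketch needs.

```cpp
#include <type_traits>

// Hypothetical container with a member begin(), assumed for illustration.
struct Container {
  int data[3]{2, 3, 4};
  constexpr int* begin() { return data; }
};

namespace custom {
  // Empty stand-in for the special return type.
  struct dangling {};

  template<class R>
  constexpr auto begin(R&& rng)
  {
    if constexpr(std::is_lvalue_reference_v<R>) {
      return rng.begin();  // the good case: a real iterator
    } else {
      return dangling{};   // A: a temporary was passed
    }
  }
}  // namespace custom

constexpr int use_lvalue()
{
  Container c{};
  return *custom::begin(c);
}

// Passing a temporary now compiles; only the result type changes.
static_assert(std::is_same_v<decltype(custom::begin(Container{})), custom::dangling>);
```

Because the branch that is not taken is discarded, the deduced return type is either the iterator type or dangling, never a mix of both.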
With this change, we delay the error to the point where it really occurs. The interesting construct is dangling, which is returned in A when begin is invoked with a temporary. Below, you see a possible implementation of dangling:
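Here is a sketch of such a type. The trait is_dereferenceable is a hypothetical helper added only to show that the type offers nothing to dereference.

```cpp
#include <type_traits>
#include <utility>

namespace custom {
  struct dangling {
    constexpr dangling() noexcept = default;

    // Accepts and ignores any arguments.
    template<class... Args>
    constexpr dangling(Args&&...) noexcept {}
  };
}  // namespace custom

// Hypothetical helper trait: true if *t compiles for an lvalue of type T.
template<class T, class = void>
struct is_dereferenceable : std::false_type {};

template<class T>
struct is_dereferenceable<T, std::void_t<decltype(*std::declval<T&>())>>
: std::true_type {};

// dangling can be created from anything ...
static_assert(std::is_default_constructible_v<custom::dangling>);
static_assert(std::is_constructible_v<custom::dangling, int, double>);

// ... but unlike a pointer it cannot be dereferenced.
static_assert(is_dereferenceable<int*>::value);
static_assert(!is_dereferenceable<custom::dangling>::value);
```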
We can see that dangling is a struct with a default constructor and a second constructor, which is a variadic template. There is no real implementation. All this type should do is give users a helpful error message. Assume we uncomment D and compile the code with Clang. The compiler rejects the dereference of iter2, and the error message names the type dangling.
It is the name that should draw the user's attention to the fact that something is wrong here. We delayed this error until the variable is really used. This is the approach ranges take with std::ranges::dangling, because under some circumstances this behavior is desired.
For your own codebase, you can decide case by case which option is better.
I hope you learned something. I appreciate your feedback. Please reach out to me on Twitter or via email.