C++20 ranges benefits: avoid dangling pointers

This post is a short version of Chapter 3, Ranges, from my latest book, Programming with C++20. The book contains a more detailed explanation and more information about this topic.

In my last monthly post, C++20 benefits: consistency with ranges, we looked at what ranges do for us when it comes to consistency and how we can get the same level of consistency for our code.

Today I'd like to continue with the last example and see how ranges protect us from dangling pointers, another property that is great to have in our codebase.

Dangling pointers are bad

Okay, I assume that you already know that dangling pointers are bad. Just to be on the same page, let's recap what dangling pointers are and how quickly we can accidentally create one.

In the previous post, we finished by creating our custom begin function, which works for both free and member functions:

#include <utility>  // std::forward

namespace custom {
  namespace details {
    struct begin_fn {  // #A Callable
      template<class R>
      constexpr auto operator()(R&& rng) const
      {
        // #B Free-function
        if constexpr(requires(R rng) { begin(std::forward<R>(rng)); }) {
          return begin(std::forward<R>(rng));

          // #C Same as above for containers
        } else if constexpr(requires(R rng) {
                              std::forward<R>(rng).begin();
                            }) {
          return std::forward<R>(rng).begin();
        }
      }
    };
  }  // namespace details

  // #D Callable variable named begin
  inline constexpr details::begin_fn begin{};
}  // namespace custom

We used it like this:

void Use(auto& c)
{
  custom::begin(c);
}

The Use example is not very exciting, as the result of begin is never used. Let's change that. Suppose we have the following code:

Container c{};

auto iter1 = custom::begin(c);  // #A Get the begin

auto value = *iter1;  // #B Get the value

In A, we use begin to retrieve the begin iterator, and in B, we dereference it. This is, for example, what a range-based for loop does. Now the code does something, at least a bit of something, but it is enough for today's purpose.

This code works and is perfectly fine. But what happens if we change it slightly, like this:

auto iter2 = custom::begin(Container{});  // #C Get the begin

auto value2 = *iter2;  // #D Get the value

In this case, we pass a temporary object of type Container to custom::begin. This changes everything. The call to custom::begin itself is fine; dereferencing iter2 isn't. The temporary object is destroyed at the end of the full expression, that is, at the semicolon in C, so iter2 is left dangling: it refers to an object that no longer exists.

Once we start using iter2, we are looking at undefined behavior.

Avoid dangling pointers - Strategy 1

One approach to avoid a dangling pointer is for custom::begin to reject temporaries. Only lvalue references make sense here. A simple approach comes to mind: let's ban all other types with a static_assert.

struct begin_fn {
  template<class R>
  constexpr auto operator()(R&& rng) const
  {
    // #A Ban all others
    static_assert(std::is_lvalue_reference_v<R>);

    if constexpr(requires(R rng) { begin(std::forward<R>(rng)); }) {
      return begin(std::forward<R>(rng));

    } else if constexpr(requires(R rng) {
                          std::forward<R>(rng).begin();
                        }) {
      return std::forward<R>(rng).begin();
    }
  }
};

A slight change, here in A, and we effectively prevent custom::begin from being called with temporaries.

This is good, and in some cases exactly what you want. However, this is not what ranges do. The static_assert solution has one drawback: we cannot pass a temporary to it. Ah, wasn't that the purpose of this exercise? Yes, but passing the temporary isn't the issue. As long as we do not dereference the result, it doesn't matter.

Let's look at an alternative solution.

Avoid dangling pointers - Strategy 2

Obviously, the static_assert is the limiting factor here. The std::is_lvalue_reference_v<R> part is fine.

We know the type at compile-time, so let's use another constexpr if to guard the good case and return a special type dangling in all other cases:

struct begin_fn {
  template<class R>
  constexpr auto operator()(R&& rng) const
  {
    if constexpr(std::is_lvalue_reference_v<R>) {
      if constexpr(requires(R rng) { begin(std::forward<R>(rng)); }) {
        return begin(std::forward<R>(rng));

      } else if constexpr(requires(R rng) {
                            std::forward<R>(rng).begin();
                          }) {
        return std::forward<R>(rng).begin();
      }
    } else {
      return dangling{rng};  // #A Catch temporaries
    }
  }
};

With this change, we delay the error to the point where it really occurs. The interesting construct is dangling, which is returned when begin is invoked with a temporary (A). Below, you see a possible implementation of dangling:

struct dangling {
  constexpr dangling() = default;
  template<class... Args>
  constexpr dangling(Args&&...) noexcept
  {}
};

We can see that dangling is a struct with a default constructor and a variadic constructor template that accepts any arguments. Neither constructor does anything, and the type offers no operations such as operator*. All this type should do is give users a helpful error message. Assume we uncomment D:

auto iter2 = custom::begin(Container{});  // #C Get the begin

// auto value2 = *iter2;  // #D Get the value

Once we compile it, we get the following error message with Clang:

<source>:62:18: error: indirection requires pointer operand ('dangling' invalid)
   auto value2 = *iter2;  // #D Get the value
                 ^~~~~~
1 error generated.

It is the name that should draw the user's attention to the fact that something is wrong here. We delayed this error until the variable is really used. This is the version ranges use because, under some circumstances, this behavior is desired.

For your own codebase, you can decide case by case which is the better option.

I hope you learned something. I appreciate your feedback. Please reach out to me on Twitter or via email.

Andreas