Can I touch a moved-from object

In this blog post, I try to bring a topic closer to you that has already been discussed and written about many times: move semantics. Just to give you two references:

Herb's article says that it is a 9-minute read. Whether you manage to consume the 260 pages of Nico's book in 9 minutes depends on your reading speed. But even then, Herb's article would still be the faster read, right? :-)

Both are excellent sources. One tries to keep it basic, while the other brings you up to speed with every detail you need to know if you care deeply about this topic. That it took Nico 260 pages to explain a single feature of C++ speaks for itself.

My aim for this blog post is to simplify a lot of things and break them down to the basics, a bit like Herb did.

Let me let you in on a secret I sometimes share in my classes. When I first heard about move semantics over ten years ago, I only heard that things are movable now and that this is so much faster than copying. For some time, I wondered which assembly instruction managed to move an entire C++ object. Or was there some way to change the address of the two objects? Of course, neither of these is the case, but you probably already know that.

Copy vs. Move

When teaching move semantics, I start with this example:

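#include <cstddef>  // std::size_t
#include <cstring>  // std::memcpy

void Copy(char** dst, char** src, std::size_t size)
{
  *dst = new char[size];          // allocate a second buffer of the same size
  std::memcpy(*dst, *src, size);  // duplicate the contents byte by byte
}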

We all know this is what we used for so many years: a plain, simple copy of data. It is absolutely free of C++, let alone modern C++. Yet the key points are there. Allocating new memory is costly. Even if you say that speed is not the factor you need to optimize for, memory consumption increases at this point. Then there is the memcpy. Sure, you can use an STL algorithm for this job, but this doesn't change the fact that, in the end, the data needs to be copied. Whether this impacts your performance depends on your system and the data. The larger the array is, the more time is consumed by duplicating it.

Nothing is wrong with the code above, aside from you saying that it is not very C++-ish. Whenever we really need to duplicate data, we have to pay the price, which is fine. But in all the cases where we no longer need the src-object, for example, because it is a temporary object, copying the data puts unnecessary pressure on our system. It is comparable to renting a second apartment and ensuring that the furniture is the same, as well as the size of the apartment. Some of you might have two apartments for a good reason. I highly doubt that anyone has two that are identical. Now imagine the time you need to spend in a furniture store to purchase your couch again. Only a few people do this. Why? Because we normally move!

This brings me to this piece of code:

void Move(char** dst, char** src)
{
  *dst = *src;
  *src = nullptr;
}

This models the situation where we no longer need the source object. Like with our old apartment, we can take its contents and transfer them to the destination. In code, this means copying one pointer and resetting the other, and we are done. The benefit? This operation takes constant time, no matter how many Lord of the Rings pages are stored in the source object. There is no allocation and, with that, no increase in memory usage. Whenever we no longer need the source object, this is the most efficient way to transfer the data.
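The difference between the two functions can be sketched as follows. This is only an illustration: Copy and Move are repeated so the sketch is self-contained, and the helper demo_copy_and_move as well as the buffer contents are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Same helpers as above, repeated so the sketch is self-contained.
void Copy(char** dst, char** src, std::size_t size)
{
  *dst = new char[size];
  std::memcpy(*dst, *src, size);
}

void Move(char** dst, char** src)
{
  *dst = *src;
  *src = nullptr;
}

// Hypothetical helper: returns true if Copy and Move behave as described.
bool demo_copy_and_move()
{
  const std::size_t size = 6;
  char* original = new char[size];
  std::memcpy(original, "hello", size);  // 5 characters plus '\0'

  char* copied = nullptr;
  Copy(&copied, &original, size);
  // After a copy there are two independent buffers with equal contents.
  const bool copyOk = (copied != original) && (std::strcmp(copied, original) == 0);

  char* const addressBefore = original;
  char* moved = nullptr;
  Move(&moved, &original);
  // After a move the destination owns the very same buffer and the source is empty.
  const bool moveOk = (moved == addressBefore) && (original == nullptr);
  assert(std::strcmp(moved, "hello") == 0);

  delete[] copied;
  delete[] moved;
  return copyOk && moveOk;
}
```

Note that after the copy we have to release two buffers, while after the move only one buffer ever existed.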

What does this mean for classes?

Have a look at the following class Test:

class Test {
public:
  Test() = default;

  Test(Test&);  // A This is a copy constructor
};

I assume some of you know that we do not need to make the copy constructor's argument const, as you can see in A above. Back in the day, this form of copy constructor allowed us to write a copy constructor that swapped the data, much like Move above. The issue was that it was impossible to express the difference between a copy and a swap. This is where move semantics came in, with the new notation && for rvalue references and the move operations. We can now direct lvalues to the copy constructor and rvalues to the move constructor.

Basically, what we do in the move members of a class is still exactly what I showed above in Move. It is just that we can express the intent much better, and thanks to rvalue references, the compiler can optimize our code by calling the move operations instead of the copy operations. I know clients who told me that enabling -std=c++11 led to a noticeable speed-up of their application. They were heavy STL users, and my guess is that they worked with a lot of temporary objects. Move semantics is the perfect tool to turn copies into moves. Because the STL has supported move semantics since C++11, this worked immediately for all containers.
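How lvalues end up in the copy constructor and rvalues in the move constructor can be sketched like this. The class Tracer and the two helper functions are hypothetical and exist only to make the selected constructor visible.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class that records which constructor created the object.
class Tracer {
public:
  std::string origin{"default"};

  Tracer() = default;
  Tracer(const Tracer&) : origin{"copy"} {}
  Tracer(Tracer&&) noexcept : origin{"move"} {}
};

std::string from_lvalue()
{
  Tracer a;
  Tracer b{a};  // a is an lvalue: the copy constructor is selected
  assert(a.origin == "default");
  return b.origin;
}

std::string from_moved_lvalue()
{
  Tracer a;
  Tracer b{std::move(a)};  // std::move casts a to an rvalue: the move constructor is selected
  return b.origin;
}
```

Here, from_lvalue() yields "copy" and from_moved_lvalue() yields "move". std::move itself moves nothing; it only changes which overload the compiler picks.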

Can I touch a moved-from object?

This is the question of this post: can I touch a moved-from object? The answer is: it depends. Have a look at this minified Vector implementation:

#include <cstddef>    // std::size_t
#include <stdexcept>  // std::out_of_range

struct size_type {
  std::size_t sz;
};

class Vector {
  std::size_t mSize{};
  int*        mData{};

public:
  Vector(size_type size)  // A
  : mSize{size.sz}
  , mData{new int[size.sz]{}}
  {}

  ~Vector()
  {
    delete[] mData;
    mData = nullptr;
  }

  Vector(Vector&& rhs) noexcept  // B
  : mSize{rhs.mSize}             // C
  , mData{rhs.mData}             // D
  {
    rhs.mData = nullptr;  // E
  }

  int& at(std::size_t idx)
  {
    if(mSize <= idx) {  // F
      throw std::out_of_range{"ups"};
    }

    return mData[idx];  // G
  }
};

Much is left out to focus on the important parts of Vector. In A, we have a constructor that allocates the given number of elements in our Vector. It sets the member mSize and uses new to allocate the memory for mData. Next, in B, we have the move constructor. The first thing we do there, in C, is to obtain the size from the moved-from object rhs. I decided not to use std::move here to illustrate even more that it degrades to a copy. After C, mSize and rhs.mSize have the same value. After that, the actual data is moved in D. Here, I also don't use std::move because the pointer isn't moved anyway. E is required to prevent a double free.

Now, let's go down to F. Here we are looking at the implementation of at, which, as in std::vector, provides a range check. Should this check determine that the provided index is in range, we return mData at position idx in G. Let's execute a couple of object creations and assignments with Vector:

Vector v1{size_type{5}};   // A
Vector v2{std::move(v1)};  // B

int x = v1.at(2);  // C

First, we create v1, a Vector containing five elements, in A. Then, in B, we move v1 into the freshly created v2. After that, in C, we access element 2 of v1. Note that this access is in range. Go back to the initial question, "Can I touch a moved-from object?". Obviously, you can touch it! It is still there, not giving a single clue that it is a moved-from object! We need syntax highlighting and a search for std::move to even see that v1 is in a moved-from state. Now that we have established that you can touch it, the better question is either:

  • can I touch a moved-from object safely, or
  • should I touch a moved-from object

The standard specifies for STL types in [lib.types.movedfrom] that:

Unless otherwise specified, such moved-from objects shall be placed in a valid but unspecified state.

The word unspecified is the troublemaker here. Look at Vector as a black box. Then you don't know what happens inside the move constructor. In our case, I did not set rhs.mSize to zero above. Why? Simply because there is no immediate need. The destructor still works. It doesn't care about mSize at all. From the cleanup perspective, the object is in a valid state. All temporaries will work perfectly with it. I also saved a few CPU cycles by not assigning zero to rhs.mSize. But of course, once you try to access an element with at, it fails badly. The range check passes because mSize is still 5, so at goes on to dereference a null pointer, which is undefined behavior. The out-of-range check doesn't protect against this nullptr access. The issue is easy to fix: we just need to set rhs.mSize to zero, and at will throw instead. But with a black-box view, we don't know whether this has or hasn't been done. This illustrates why the question "can I touch a moved-from object safely" is so hard to answer.

One way to go is the mantra never touch a moved-from object. I think this is a good way of dealing with this situation. Let's face it, in a lot of cases, access to a moved-from object is unwanted. Even with a defined result, the overall behavior of our program may be wrong.
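A minimal sketch of the fix mentioned above: the only change compared to the Vector shown earlier is that the move constructor also resets rhs.mSize. The helper moved_from_at_throws is hypothetical and only there to observe the effect.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>

struct size_type {
  std::size_t sz;
};

class Vector {
  std::size_t mSize{};
  int*        mData{};

public:
  Vector(size_type size)
  : mSize{size.sz}
  , mData{new int[size.sz]{}}
  {}

  ~Vector() { delete[] mData; }

  Vector(Vector&& rhs) noexcept
  : mSize{rhs.mSize}
  , mData{rhs.mData}
  {
    rhs.mData = nullptr;
    rhs.mSize = 0;  // the fix: the moved-from object now reports zero elements
  }

  int& at(std::size_t idx)
  {
    if(mSize <= idx) {
      throw std::out_of_range{"ups"};
    }

    return mData[idx];
  }
};

// Hypothetical check: does at() on the moved-from object throw now?
bool moved_from_at_throws()
{
  Vector v1{size_type{5}};
  Vector v2{std::move(v1)};

  assert(v2.at(2) == 0);  // v2 owns the five zero-initialized elements

  try {
    v1.at(2);
  } catch(const std::out_of_range&) {
    return true;  // the range check catches the access
  }

  return false;
}
```

With this change, the access from the example above ends in an exception instead of a null-pointer dereference. A user of the class still cannot know this without looking into the move constructor.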

The standard gives an example of this issue for std::vector in [defns.valid]:

If an object x of type std::vector<int> is in a valid but unspecified state, x.empty() can be called unconditionally, and x.front() can be called only if x.empty() returns false.
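Applied to code, the rule from the standard's example might look like the following sketch. The helper front_or and its fallback parameter are made up for this illustration.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: calls front() only if empty() says this is allowed.
int front_or(const std::vector<int>& v, int fallback)
{
  return v.empty() ? fallback : v.front();
}

int read_front_after_move()
{
  std::vector<int> source{1, 2, 3};
  std::vector<int> target{std::move(source)};

  assert(target.size() == 3);

  // source is valid but unspecified: empty() has no precondition, front() does.
  return front_or(source, -1);
}
```

The sketch deliberately makes no claim about what read_front_after_move returns. The point is that it is well-defined either way, because front() is only reached after empty() returned false.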

Now, sometimes we need to touch a moved-from object because we need to re-use it. In the STL, there is unique_ptr, for example. We have the specification for its move constructor in [unique.ptr.single.ctor], which specifies a postcondition:

Postconditions: get() yields the value u.get() yielded before the construction. u.get() == nullptr. ...

This postcondition is what you are looking for if you need to figure out whether you can safely re-use a moved-from object (at least when it comes to the STL). What unique_ptr does here is behave as if it were freshly constructed. After all, we can construct an empty unique_ptr.
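What this postcondition gives us can be sketched as follows, assuming C++14 for std::make_unique. The function name and the values are made up for illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical demo: re-uses a moved-from unique_ptr.
int reuse_moved_from_unique_ptr()
{
  std::unique_ptr<int> first = std::make_unique<int>(42);
  std::unique_ptr<int> second{std::move(first)};

  // Guaranteed by the postcondition: the moved-from pointer is empty.
  assert(first == nullptr);
  assert(*second == 42);

  // Re-using it works as with a freshly constructed, empty unique_ptr.
  first = std::make_unique<int>(7);

  return *first + *second;
}
```

Because the state of first after the move is specified, the first assert can never fire, and assigning a new object to first is perfectly fine.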

Summary

You can do anything with a moved-from object that you would do with any object that you get passed without knowing its state, i.e., you would not call v[5] on a vector without checking that it contains at least six elements.

You can touch a moved-from object safely, but you need to call a function without a precondition. In a lot of cases, it is simpler to follow the rule never touch a moved-from object.

I hope this post helps you to understand the moved-from state better, allowing you to make a precise decision on what to do with a moved-from object in the future.
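A sketch of what calling a function without a precondition can look like, using std::string as an example. The function name and the string contents are made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical demo: brings a moved-from string back into a known state.
std::string reuse_moved_from_string()
{
  std::string source{"a string that is long enough to live on the heap"};
  std::string target{std::move(source)};

  // clear() has no precondition, so it is fine in a valid but unspecified state.
  source.clear();
  assert(source.empty());  // from here on, the state is known

  // Assignment has no precondition either.
  source = "fresh";

  // front() has a precondition (non-empty), which we now know is satisfied.
  assert(source.front() == 'f');

  return source;
}
```

Once a precondition-free function like clear() or an assignment has put the object into a known state, the object is as good as any other.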

Andreas