push_back vs emplace_back: When to use what

Apr 04, 2023

In today's post, I'd like to address a topic that others have already discussed. However, in classes and code reviews, I still see confusion about it, so I want to highlight it once more. What am I talking about? The difference between push_back and emplace_back, or rather, when to use which of these two std::vector member functions.

`emplace_back` for better performance

When this topic comes up, people often tell me they use emplace_back for performance reasons: it is faster than push_back because it creates the object in place. That can be true. However, when I look at the code, I often see that the statement does not apply to the code as written. Let's use an example and test your knowledge.

We start with a bit of helper code:

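#include <cstdio>

// Helper type that logs every special member function call.
struct Widget {
  Widget() { printf("Widget()\n"); }
  ~Widget() { printf("~Widget()\n"); }

  // Converting constructor used by the examples below.
  Widget(int) { printf("Widget(int)\n"); }

  // Copy operations
  Widget(const Widget&) { printf("Widget(const Widget&)\n"); }
  Widget& operator=(const Widget&)
  {
    printf("Widget& operator=(const Widget&)\n");
    return *this;
  }

  // Move operations; noexcept lets std::vector move instead of copy when it grows.
  Widget(Widget&&) noexcept { printf("Widget(Widget&&)\n"); }
  Widget& operator=(Widget&&) noexcept
  {
    printf("Widget& operator=(Widget&&)\n");
    return *this;
  }
};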

I will use Widget later with std::vector. The purpose of Widget is to see which special member function gets invoked.

Testing your knowledge

Equipped with this helper class, I make three attempts to add a new value to a std::vector efficiently:

#include <vector>

int main()
{
  // A: Reserve capacity up front to avoid seeing reallocations.
  std::vector<Widget> v{};
  v.reserve(5);

  printf("- push_back\n");

  // B: Using push_back with a temporary object.
  v.push_back(Widget{3});

  printf("- emplace_back\n");

  // C: Using emplace_back with a temporary object.
  v.emplace_back(Widget{3});

  printf("- emplace_back\n");

  // D: Using emplace_back to create a new object.
  v.emplace_back(3);

  printf("-------\n");
}

A keeps the output focused on the interesting part. By reserving capacity up front, we don't see the allocations and moves that happen when the vector has to grow.

The interesting part starts in B, where I use push_back to add a newly created temporary object to my vector v. In C, I use a different approach with emplace_back to add a newly created temporary object to v. Last, in D, I use a variation of C, emplace_back, together with just a value that Widget has a constructor for.

The question for you is, which version is the best regarding efficiency and performance?

Looking at the reality

For the example given, the answer is D. This is the purpose of emplace_back. Let's have a look at the output of this small program:

- push_back
Widget(int)
Widget(Widget&&)
~Widget()
- emplace_back
Widget(int)
Widget(Widget&&)
~Widget()
- emplace_back
Widget(int)
-------
~Widget()
~Widget()
~Widget()

Here we can see that the first emplace_back leads to the same invocations of Widget's special member functions as push_back. Both first create a temporary Widget by calling the constructor that takes an int. After that, both move this freshly created temporary object into v and finally destroy the now moved-from temporary object. Yes, I know, the data was only moved, not copied, so this may look perfect.

But look at the second invocation of emplace_back, the one from D. All we see here is the call to Widget's constructor taking an int. No additional move, no destructor call. Why? Because we did not create an intermediate temporary object.

Constructing an object in-place

The purpose of emplace_back is to construct an object in place. Whenever you see the container's element type spelled out in an emplace_back call, as in C, the code is simply wrong. Either use push_back, or get rid of the temporary object.

The strength of emplace_back is creating the object directly in the vector's storage. As we can already see from the output, we can't beat that with push_back or std::move.

There is more. Some people tell me that B and C aren't that bad because the compiler optimizes things away. Well, optimizations are great, but not having to rely on them is even better. Say we have a std::vector<std::string>. The internal implementation of std::string is not just a pointer and a length. A std::string comes with some additional bytes for the Small String Optimization (SSO), often 16 bytes or a bit more, depending on the standard library. For a short string, these bytes must be copied in the case of push_back and also in the emplace_back attempt from C. Even though we see a call to Widget(Widget&&) in our output, that doesn't mean all the data in Widget can be moved cheaply. In the case of std::string, a string that lives in the SSO buffer is data that can only be copied. Be aware of that.

Summary

Use push_back when you have an existing temporary object that you want to move into your std::vector. Or, more generally, use push_back when you want to move an existing object into your std::vector.

Use emplace_back when you create a new temporary object. Instead of creating that temporary object, pass the object's constructor arguments directly to emplace_back.

Andreas
