Insert, Push, and Emplace

Standard C++ containers (or collections) are essential tools. Some, like vector, queue, deque, and stack, are list-like: elements are kept in a sequence and accessed by position or from the ends. Others, such as map or set, are associative in nature: elements are accessed by a key.

To add an object to a vector, you can call insert or push_back. Stacks and queues both allow you to add elements using push. Map allows insertions with insert or using the [ ] operator.
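As a quick illustration of those calls, here is a minimal sketch. The Containers struct and the FillContainers function are names I made up for this example; only the container member functions come from the standard library.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <stack>
#include <string>
#include <vector>

// Hypothetical bundle of containers, used only for this illustration.
struct Containers {
  std::vector<int> numbers;
  std::stack<int> undo;
  std::queue<int> jobs;
  std::map<std::string, int> ages;
};

Containers FillContainers() {
  Containers c;
  c.numbers.push_back(2);                  // Append to the end: {2}
  c.numbers.insert(c.numbers.begin(), 1);  // Insert before the first element: {1, 2}
  c.undo.push(10);                         // Add to the top of the stack
  c.jobs.push(20);                         // Add to the back of the queue
  c.ages.insert({"alice", 30});            // Insert a key/value pair
  c.ages["bob"] = 25;                      // operator[] creates the entry, then assigns to it
  return c;
}
```

After calling FillContainers(), numbers holds 1 and 2 in that order, and ages has an entry for each of the two keys.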

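In C++11 and beyond, these containers also have new functions that seem to behave similarly to the above methods: emplace, emplace_back, and emplace_front. Not every container has all three: vector has emplace and emplace_back, deque has all three, and stack, queue, and map offer emplace.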

Which raises the question: what’s the difference between these methods of adding items to collections?

The Terminology

The definitions for “push” and “emplace” are somewhat ambiguous, but the terms are used consistently enough that we can derive their meanings. “Insert” already has a pretty clear meaning, but we’ll discuss it for completeness.

Push

“Pushing” in programming usually means “to add to” or “to append to”.

Containers like queue and stack only allow adding elements at one spot (either at the end of the queue or top of the stack), so they only have a single push method. Deque (double-ended queue) allows adding elements onto either end, so it has push_front and push_back. Vector can only efficiently add elements to the end, so push_back is provided.

In all those cases, “push” means to either add an item to the front or back of a list of existing items.
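Here is a small sketch of pushing onto both ends of a deque and onto a stack. BuildDeque and TopAfterPushes are hypothetical helper names for this example.

```cpp
#include <cassert>
#include <deque>
#include <stack>

std::deque<int> BuildDeque() {
  std::deque<int> d;
  d.push_back(2);   // {2}
  d.push_back(3);   // {2, 3}
  d.push_front(1);  // {1, 2, 3}
  return d;
}

int TopAfterPushes() {
  std::stack<int> s;
  s.push(1);
  s.push(2);  // The most recently pushed element sits on top.
  return s.top();
}
```

BuildDeque() ends up with 1, 2, 3 in order, and TopAfterPushes() returns the last value pushed.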

Insert

You’ll notice that any time you can add an item in the middle of a list, the term is instead “insert”. This is the case for vector, which allows adding items at any position (specified with an iterator), and map/unordered_map, which structurally don’t really have concepts of “front” or “back”.

“Insert” means to add an item in the middle of a list of items, or to add an item to a collection that is not really list-like.
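Both flavors of insert might look something like the sketch below. InsertInMiddle and InsertIfAbsent are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

std::vector<int> InsertInMiddle() {
  std::vector<int> v{1, 2, 4};
  v.insert(v.begin() + 2, 3);  // Insert before index 2: {1, 2, 3, 4}
  return v;
}

// For a map, insert returns a pair<iterator, bool>. The bool is false
// when the key was already present, in which case nothing is changed.
bool InsertIfAbsent(std::map<std::string, int>& m, const std::string& key, int value) {
  return m.insert({key, value}).second;
}
```

Note that the vector version takes a position, while the map version decides where the element goes based on its key.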

Emplace

The definition of emplace is “to put into place.” Within the context of C++ containers, the definition seems to be “to construct in place”.

Reviewing the C++ containers, you’ll notice a pattern emerge:

  • Any container with insert will have an emplace as well.
  • Any container with push_front will have an emplace_front as well.
  • Any container with push_back will have an emplace_back as well.
  • The adaptors with a plain push (stack and queue) have a plain emplace as well.
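To make that pattern concrete, here is a sketch using a deque, since it has all three emplace variants. Point, EmplaceEverywhere, and EmplaceIntoMap are names invented for this example.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <string>
#include <utility>

using Point = std::pair<int, int>;  // Hypothetical element type for the example.

std::deque<Point> EmplaceEverywhere() {
  std::deque<Point> d;
  d.emplace_back(2, 2);            // Counterpart of push_back
  d.emplace_front(0, 0);           // Counterpart of push_front
  d.emplace(d.begin() + 1, 1, 1);  // Counterpart of insert: position first, then constructor arguments
  return d;
}

std::map<std::string, int> EmplaceIntoMap() {
  std::map<std::string, int> ages;
  ages.emplace("alice", 30);  // Arguments are forwarded to the pair's constructor
  return ages;
}
```

In each call, the arguments are what you would pass to the element's constructor rather than a finished element.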

I believe the difference meant to be conveyed between “push/insert” and “emplace” is the difference between “copying or moving an existing object to a location” and “constructing an object at a location.”

Let’s dig deeper into the distinction between those two ideas with a real-world analogy!

Copy, Move, or Construct

Imagine a scenario where you are contracted to build a roller coaster at a theme park. There are three strategies you might choose from:

  1. Construct the roller coaster at a testing facility. Once happy with the roller coaster, build an exact copy of it at the theme park.
  2. Construct the roller coaster at a testing facility. Once happy with the roller coaster, move it from the testing facility to the theme park.
  3. Construct the roller coaster at the theme park.

Any real-life company would certainly go out of business if they chose options 1 or 2. Building an exact copy is a waste of time and resources. Moving doesn’t duplicate the coaster, but the transportation aspect could be costly. In reality, once a design was decided upon, you’d just construct the coaster at the desired final location.

What About C++ Again?

In C++, options 1 or 2 are commonly used. When you initialize one object from another, or assign one object to another, the values are either copied or, in some circumstances, moved. You could imagine that, for very large objects, this could be quite expensive.

For example, let’s say you create an object and push it onto the end of a vector:

void Foo::Bar() {
  MyObj obj("test", 5);   // Construct object on stack.
  myVector.push_back(obj);  // Push onto back of vector.
}

The object is constructed on the stack as a local variable. When you push an object onto a vector, one of two things will happen:

  1. A new instance of MyObj is constructed at the end of the vector as a copy of the pushed object.
  2. A new instance of MyObj is constructed at the end of the vector and the data from the pushed object is moved into it.

In the snippet above, obj is a named variable, so it is copied (option 1). Option 2 applies when you pass a temporary or explicitly write push_back(std::move(obj)).
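One way to see which of the two happens is to count constructor calls. The sketch below uses TrackedObj, a hypothetical stand-in for MyObj that I made up for this purpose, along with a hypothetical PushThreeWays helper.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for MyObj that counts copies and moves.
struct TrackedObj {
  static int copies;
  static int moves;

  std::string name;
  int value;

  TrackedObj(std::string n, int v) : name(std::move(n)), value(v) {}
  TrackedObj(const TrackedObj& o) : name(o.name), value(o.value) { ++copies; }
  TrackedObj(TrackedObj&& o) noexcept : name(std::move(o.name)), value(o.value) { ++moves; }
};
int TrackedObj::copies = 0;
int TrackedObj::moves = 0;

void PushThreeWays(std::vector<TrackedObj>& v) {
  v.reserve(3);  // Avoid reallocation, which would add extra moves to the count.

  TrackedObj obj("test", 5);
  v.push_back(obj);                    // Named variable (lvalue): copied
  v.push_back(std::move(obj));         // Explicit std::move: moved
  v.push_back(TrackedObj("temp", 1));  // Temporary: moved
}
```

Starting from zeroed counters and an empty vector, PushThreeWays performs one copy and two moves.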

Copying vs. Moving

Before C++11, the language only supported copying: the concept of a copy constructor was used for this. When you initialized one object from another, the copy constructor was used (and when you assigned to an existing object, the copy assignment operator was used). When you copy an object, both the original object and its copy are usable.

In C++11, “moving” data was introduced, along with the move constructor. When you move an object, the moved-to instance is usable, but you should no longer rely on the contents of the moved-from instance. Moving is generally more efficient, but for standard library types it leaves the moved-from instance in a valid but “unspecified” state: it can still be destroyed or assigned a new value, and that’s about it.

Imagine an object that contains a lot of data. Creating a copy could be quite costly because memory has to be allocated to hold a copy of the data, and then the data itself must be copied from one memory location to another. When you move, however, a new instance of the object is still created, but that new instance just “takes over” control of the data the old instance used. The old instance no longer holds that data, since it gave it all to the new instance.

How is it decided whether to copy or move? You want to copy an object if you intend to continue using the copied-from object as a distinct, separate entity. You can move an object if the moved-from object will no longer be used. Sometimes, the compiler knows this without you telling it; other times, you have to tell the compiler explicitly (using std::move).
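The difference is easy to observe with a vector, which owns a heap buffer. In this sketch, MakeBig, MoveReusesBuffer, and CopyAllocatesNewBuffer are hypothetical helpers. The buffer-pointer comparison reflects how standard library implementations behave in practice, so treat it as an illustration rather than a guarantee to build on.

```cpp
#include <cassert>
#include <utility>
#include <vector>

std::vector<int> MakeBig() { return std::vector<int>(1000, 7); }

// Moving hands the existing buffer to the new vector.
bool MoveReusesBuffer() {
  std::vector<int> source = MakeBig();
  const int* buffer = source.data();
  std::vector<int> target = std::move(source);  // std::move marks source as safe to move from
  return target.data() == buffer;
}

// Copying allocates a second buffer and duplicates every element.
bool CopyAllocatesNewBuffer() {
  std::vector<int> source = MakeBig();
  std::vector<int> target = source;  // source stays fully usable afterwards
  return target.data() != source.data() && target == source;
}
```

Both functions return true on the implementations I know of: the move reuses the original allocation, while the copy produces a distinct buffer with equal contents.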

Back to Emplacing…

So, “insert” and “push” both take an existing object and either copy it into the container or move it into the container. However, as with our roller coaster example: why copy or move when you can just build it at the final spot in the first place?

This is the key to emplacement: rather than copying or moving a pre-existing object into a container, we will instead provide the appropriate constructor arguments to allow the container to allocate and construct the object in the container’s memory directly.

void Foo::Bar() {
  MyObj obj("test", 5);
  myVector.push_back(obj); // Copy construct (obj is a named variable).

  myVector.emplace_back("test", 5); // Pass constructor arguments directly
  myVector.emplace_back(obj); // This also works - why?
}

Emplacement allows us to allocate a piece of memory and construct our object directly at that location in memory. We can avoid any temporary object creation or unnecessary copy/move operations.

Note that last line: emplace_back(obj). Why does that work? The reason is: we are providing valid constructor arguments! But in that case, we are just invoking the copy constructor. As a result, the performance benefits of emplacement are somewhat defeated.

There will be times where you already have an instance of an object, and so “push” or “insert” are fine options. Emplace is most useful when you would otherwise have to construct a new object instance just so you can pass it to the container. Instead, you can let the container construct the object for you, directly in the container’s memory.
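To check these claims, you can count constructor calls for each approach. CountedObj below is a hypothetical stand-in for MyObj, and the three helper functions are names invented for this sketch.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for MyObj that counts constructor calls.
struct CountedObj {
  static int constructed;  // Calls to the (name, value) constructor
  static int copies;       // Calls to the copy constructor
  static int moves;        // Calls to the move constructor

  std::string name;
  int value;

  CountedObj(std::string n, int v) : name(std::move(n)), value(v) { ++constructed; }
  CountedObj(const CountedObj& o) : name(o.name), value(o.value) { ++copies; }
  CountedObj(CountedObj&& o) noexcept : name(std::move(o.name)), value(o.value) { ++moves; }

  static void ResetCounts() { constructed = copies = moves = 0; }
};
int CountedObj::constructed = 0;
int CountedObj::copies = 0;
int CountedObj::moves = 0;

void PushTemporary(std::vector<CountedObj>& v) {
  v.push_back(CountedObj("test", 5));  // Construct a temporary, then move it into the vector
}

void EmplaceArgs(std::vector<CountedObj>& v) {
  v.emplace_back("test", 5);  // Construct directly in the vector's storage
}

void EmplaceExisting(std::vector<CountedObj>& v) {
  CountedObj obj("test", 5);
  v.emplace_back(obj);  // The "constructor argument" is obj itself, so this copy constructs
}
```

Called on an empty vector with fresh counters, PushTemporary records one construction plus one move, EmplaceArgs records a single construction, and EmplaceExisting records one construction plus one copy.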

Issues with Emplacement

It’s worth noting that any performance gains from using “emplace” over “push” will likely be minimal unless the container is holding very large objects. Emplacing vs. pushing int should not be something you are concerned about. Only when larger objects lead to performance concerns is it worth thinking about.

Some have also pointed out that emplacement can cause readability issues. Take the following code snippet:

void Foo::Bar() {
  cities.emplace_back("Los Angeles", "California");
}

What kind of object is the “cities” collection holding? It’s a bit unclear. The name of the vector itself provides a clue that it might contain objects of type City…but a less helpfully named variable would make it even harder to parse. And I guess the City class has a constructor that takes (string name, string state)? You hope so!

Arguably, the readability is better with something like this:

void Foo::Bar() {
  City losAngeles("Los Angeles");
  losAngeles.SetState("California");
  cities.push_back(losAngeles);
}

But that’s for you to judge!
