Wednesday, August 29, 2012

ZSH Macros

Today, I'll present a module for zsh that I wrote a few days ago. The aim of this module is to provide an easy way to create temporary shell scripts, and to save your favorites. If you are familiar with Emacs, and you are thinking about its macros, you're right! I designed this with the macro concept in mind.

I had the idea while working with people who aren't familiar with shell scripts, and who don't want to try them even when scripting could help them. The original example was a work-flow with a TODO file to update regularly. And yet the commands involved were not so difficult:

$ git add TODO
$ git commit -m "Update the TODO."
$ git stash
$ git pull --rebase
$ git stash pop
$ git push

In reality, my coworker didn't plan to stash or pull, but since this is my story, I can change it a little! :)

I thought it would be easy to write a script to do that: I just have to copy these lines and paste them into a script. But on one hand I find this boring, and on the other hand someone who is not interested in scripts will never do that. So I had to find a way, transparent to users, to get the same result without making them feel they are playing with scripts. That's where the idea of mimicking the behavior of Emacs macros came from.

Notice that even though Emacs macros work on text, and the title of this post might be confusing, I don't want to write a tool that brings Emacs-style macros to the Zsh Line Editor (zle); I want a way to create simple scripts in an easy and fast way.

1 Zsh macros!

I first tried to create a Perl script that generates scripts, but I realized that it has too many drawbacks (no history, no completion…). So I found an alternative fully integrated into zsh: hooks. For people who don't know what hooks are: a hook is a function called at a specific time, which lets users run their own code. It is useful for letting users personalize a piece of software. As an example, I wrote a git hook to check log messages (I talked about it in a previous post). For this module, I use the preexec hook to achieve my goal. In this post, I'll present my module, why it can be useful day to day, and how to use it. In some later posts, I'll show some useful tricks I needed to make it work.
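
To give an idea of the mechanism, here is a simplified sketch of how a preexec hook can record commands. This is hypothetical illustration code, not the module's actual implementation; the function and variable names are made up:

```zsh
# Hypothetical recording sketch -- the real module does more bookkeeping.
typeset -g ZMACRO_FILE= ZMACRO_LAST=

macro-record() {
  ZMACRO_FILE=$(mktemp /tmp/zmacro.XXXXXX)
}

# zsh calls preexec just before running each command line;
# $1 holds the command as typed.
preexec() {
  [[ -n $ZMACRO_FILE ]] && print -r -- "$1" >> "$ZMACRO_FILE"
}

macro-end() {
  ZMACRO_LAST=$ZMACRO_FILE
  ZMACRO_FILE=
}

macro-execute() {
  source "${1:-$ZMACRO_LAST}"
}
```

A real implementation must also avoid recording the macro-end call itself; the sketch glosses over that.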

1.1 Why use it?

Because it saves you time. It is easy to install, easy to learn and easy to use. I realized that I sometimes repeat the same sequence of lines several times. It ends up with these lines concatenated into one line, with a && between them. Pretty ugly, right? But because the sequence is repeated fewer than ten times, I don't want to write a script for it: it is faster to just reuse my zsh history. With this module, you type your sequence of commands once, and then you just run macro-execute to repeat it. Personally, I have aliased this command to e. It is the fastest and cleanest way I know to repeat your work properly.

1.2 How to install it

Glad to see you here! You'll see that it is a good choice :) The first step is to get it from my github (directory zsh_macros). Once that is done, you just have to source the file, in your shell to try it, or in your configuration file if you're sure it fits your needs.

One thing to check: if you have already bound either <ctrl-x><(> or <ctrl-x><)>, I don't bind anything (I don't want to mess up your configuration!). These bindings are the same as in Emacs. Feel free to adapt them to your own wishes!

You then have to add something in your configuration file:

setopt prompt_subst
RPROMPT='$(zmacros_info_wrapper)'

The first line enables command substitution in the prompt. The second sets your RPROMPT to the output of zmacros_info_wrapper, which tells you the status of your macro. If you have already assigned something to your RPROMPT, you can simply append this to it.

Once this is done, everything should work fine. If it doesn't, you can either send me a bug report or send me a patch. Now, let's see how to use this module.

1.3 How to use zsh macros?

In this part, I assume the bindings are the ones originally provided. I think a screenshot may help to show how it looks, so I first run it in the shell, to show what you have to do and what the results are. The keystrokes are shown between <>; do not type them out :).

$ echo foo
foo
$ <ctrl-x><(> echo bar
bar
$ echo baz
baz
$ <ctrl-x><)> e
bar
baz

This screenshot shows how it appears for the user:

The flag on the right (<R$id>) appears right after you type <ctrl-x><(>, and disappears right after you type <ctrl-x><)>. Pretty easy, right?

Note that if you don't like key bindings (are you a Vim user?), you can call macro-record and macro-end and you'll get the same effect.

Let's go a little deeper: you can have several macros. This module doesn't support nested macros in the same shell, but you can make as many macros as you want. Each macro is associated with an id; this is what is printed on the flag after the R in the prompt. You can run macro-execute with an optional argument corresponding to the id of the macro you want to run; by default it's the last one recorded. Notice that each script has its own file, and there is a master file that tracks all of them. To add and execute macros, we read and write this file in /tmp. This approach has its advantages and its drawbacks. We have concurrency problems, but since a real user can't do several things at the same time, that should not be a real problem.

The advantages are that a macro recorded in one shell can be used in another, and you can record two macros at the same time, because the only access to the master file happens when you call macro-record. So recording two macros in two different shells is fine.

All your scripts live as long as your /tmp is not cleaned. If you want to keep a macro longer, it is possible: just call macro-name, which takes the macro you asked for (if you give no id, the last one is used) and copies it into the directory ZMACROS_DIRECTORY. You can set this directory at the top of the file macros.zsh. It may be a good idea to add this directory to your path, so you can call these new functions directly.
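
For instance, the zsh way to do that in your configuration file would be something like this (the directory value is just an example; use whatever you set in macros.zsh):

```zsh
ZMACROS_DIRECTORY=~/.zsh_macros
path+=("$ZMACROS_DIRECTORY")  # zsh keeps the $path array and $PATH in sync
```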

These are the features available in this first version of the zsh macros. I plan to add some new ones, but if you have any request, comment, or anything else, feel free to comment on this post! I'd like to know what would be helpful and what you think about this module.

Wednesday, August 22, 2012

C++11: A generic Singleton

1 Context

Yesterday, I found the solution to a problem that I was unable to solve one year ago. The problem was simple: creating a generic Singleton. If you don't know what a Singleton is, here is the definition given in the book "Design Patterns" by Gamma et al.: "Ensure a class only has one instance, and provide a global point of access to it".

I wanted a generic singleton because I had to code several singletons across several projects and I hate doing the same thing several times. My attempts failed because of the initialization problem: I knew no method of handling initialization that works for every possible constructor.

A perfect way to change a class into a Singleton would be, in my opinion, something like simply inheriting from a base class Singleton. That would be awesome! But I don't think anyone has ever succeeded. The best discussion about a generic Singleton I have read is in Andrei Alexandrescu's book (if you know better material about it, please let me know!). It notably covers the deletion problem, which is not addressed in this article; I focus only on the construction, which can be improved thanks to C++11.

The common version uses the Curiously Recurring Template Pattern (CRTP), so let's start by introducing it. The concept is simple: we inherit from a template class that takes our own class as its parameter. It allows things like static polymorphism or instance counters… Wikipedia may help with these examples.

I have to say that when I discovered this I was amazed. I am passionate about templates, a language within the language. One year ago I tried to solve the Singleton with a CRTP (the solution used in the Modern C++ Design book); my specification was the original one, plus a destroySingleton method to call at the end of my programs. So all I had to find was a generic construction that works for all of my Singleton classes, and the work would be done.

2 First Try

At the beginning, when I had only two classes and two constructors, I thought of exploiting the fact that template member functions that are not used are not instantiated, so they may be incorrect (no type checking, no name binding… they just have to be syntactically valid). Here is an example:

#include <string>

template <class T>
class Singleton
{
public:
  static
  T* get_instance(int v)
  {
    if (!instance_)
      instance_ = new T(v);

    return instance_;
  }

  static
  T* get_instance(std::string s)
  {
    if (!instance_)
      instance_ = new T(s);

    return instance_;
  } // Sounds familiar...

  static
  void destroy_instance()
  {
    delete instance_;
    instance_ = nullptr;
  }

private:
  static T* instance_;
};

template <class T> T* Singleton<T>::instance_ = nullptr;

// The classes that will be singletons:

class Single: public Singleton<Single>
{
public:
  Single(std::string s): s_(s) {}

private:
  std::string s_;
};

class Map: public Singleton<Map>
{
public:
  Map(int scale): scale_(scale) {}

private:
  int scale_;
};

// How to create and destroy them:

int main()
{
  Single* s = Single::get_instance("Forever Alone");
  Map* m = Map::get_instance(42);

  // Use these singletons.

  Map::destroy_instance();
  Single::destroy_instance();
}

When we instantiate Singleton<Single>, there is no constructor that takes an int (so the first get_instance is incorrect), yet the program still compiles successfully. This is because the function is not called, so it is not instantiated, and g++ has no reason to complain.

This version fulfills one half of the contract (the global point of access), and it is totally non-intrusive for the classes turned into singletons. To fulfill the second half ("only one instance"), we have to tweak things a little: make the constructors of the derived classes private, and grant friendship to the base class. As a reminder, the friend keyword gives the friend class access to all protected/private members. This way, no user can directly create an object of the derived class, and we respect the specification.

Now we respect both parts, but we are still unhappy with this. With only two classes, we already had to write two get_instance overloads, because we have two different constructors. Since the number of possible argument combinations for a constructor is infinite, our solution isn't suitable. And that's where I was stuck.

3 The Solution

Yesterday, I finally realized that the solution was in one of my previous posts, about the C++11 mechanisms that make emplace_back possible. The solution was perfect forwarding! We just have to create a get_instance that forwards its arguments to the constructor. No repetition, fully generic. That's what I wanted! Here is the final version of the Singleton class:

#include <iostream>
#include <utility> // for std::forward

template <class T>
class Singleton
{
public:
  template <typename... Args>
  static
  T* get_instance(Args&&... args)
  {
    if (!instance_)
      {
        instance_ = new T(std::forward<Args>(args)...);
      }

    return instance_;
  }

  static
  void destroy_instance()
  {
    delete instance_;
    instance_ = nullptr;
  }

private:
  static T* instance_;
};

template <class T> T* Singleton<T>::instance_ = nullptr;

class Map: public Singleton<Map>
{
  friend class Singleton<Map>;
private:
  Map(int size_x, int size_y): size_x_{size_x}, size_y_{size_y} {}

public:
  int size_x_;
  int size_y_;
};

int main()
{
  Map* m = Map::get_instance(4, 5);

  std::cout << m->size_y_ << std::endl; // Outputs 5.

  Map::destroy_instance();
}

As said above, we are not interested in the destruction problem; it is well covered in Alexandrescu's book, which I recommend if you want to read more on the subject. We create a destroy_instance method so that we do not leak our instance; the user must call it at the end of the program.

The real novelty here is the use of std::forward when creating the object: we can give get_instance any arguments, and it will work. I took the example of a Map (for a game, say) whose constructor takes two arguments, but I hope it is clear that it works with any constructor.

Note that I didn't make the copy and move constructors private, but they should be; I omitted them only to save some space in the post. For the same reason, I didn't rewrite the class Single in the second example, but it clearly works too.

Before concluding, I just want to show that the metaprogramming used here doesn't lead to a hideous compiler error message when the class is used incorrectly. Let's assume we replace the call to get_instance taking two arguments with a call taking none. What happens? Here is the answer from g++ (4.6.1):

singleton.cc: In static member function 'static T* \
Singleton<T>::get_instance(Args ...) [with Args = {}, T = Map]':
singleton.cc:53:30:   instantiated from here
singleton.cc:16:9: error: no matching function for call to 'Map::Map()'
singleton.cc:16:9: note: candidates are:
singleton.cc:40:3: note: Map::Map(int, int)
singleton.cc:40:3: note:   candidate expects 2 arguments, 0 provided
singleton.cc:35:7: note: constexpr Map::Map(const Map&)
singleton.cc:35:7: note:   candidate expects 1 argument, 0 provided
singleton.cc:35:7: note: constexpr Map::Map(Map&&)
singleton.cc:35:7: note:   candidate expects 1 argument, 0 provided

The message is clear, and it contains all the information needed to understand and fix our mistake.

In this post I proposed the use of std::forward to build a generic singleton. I didn't discuss the deletion problem because, for my personal use, I accept having to call the destroy method.

But this is just one example showing how C++11 can help programmers build new things easily. I hope my article motivates you to experiment with new things! Feel free to share your opinion in the comments :-)

PS: I just realized that my current solution isn't perfect. We can't get the instance without giving a valid argument list (valid meaning there is a constructor that takes this list). So a call to get_instance without arguments, which should return the instance, leads to an error if there is no constructor taking no arguments. This is not really what we want. A fix would be to separate the initialization from get_instance. But that doesn't invalidate what I wanted to demonstrate, so it's okay :)

Thursday, August 16, 2012

A wonderful internship

Today I'll talk about the internship I did from mid-October 2011 to mid-January 2012. I'll start by presenting the context a little, and what exactly I did.

I worked at Aldebaran Robotics on NAO. The goal of my internship was to refactor a blob of code that allows NAO to recharge itself by going to its charging station. The behavior was developed in Python, and my goal was to build the same thing in C++. Here is the video that shows this behavior:

Pretty cool, right? You can guess how excited I was to attack a project like this. I started by analyzing the existing program to understand the keys to making it work in C++. Then we (my supervisor and I) realized that we could build a better framework to track one or several objects. We designed an architecture (the modules and their responsibilities), and (because my supervisor was great) I was responsible for the whole design: what the objects are, how they interact, etc. I developed it all by myself, building a set of libraries that lets the user build a tracker. The aim of the whole project was to be able to offer a set of trackers (a red-ball tracker, an Aldebaran-logo tracker, …) with as little duplication as possible, and we also wanted to let users create their own kind of tracker by adding their own little brick of software. This was a success, since one of my supervisors developed a face tracker in 20 minutes and 160 lines of code.

I created a tool able to track one or two objects. With one object, NAO was able to follow it with its head or its whole body, staying at a given distance from it. With two objects, it was able to go to an exact position in a coordinate frame defined by the center of the two objects. A good tracker must also detect when it loses its target; ours starts by looking at the last place it saw it.

I think it is better if I don't go into too much detail about this project, since I don't own its rights. But here is what it looks like on video (yes, I did ask for the rights to that :P).

In the first part of the video, I show the search for a red ball, and how NAO finds and follows it.

In the second part, it tracks the (kind of) pie charts on the box, and goes to the left red cross on the floor. It doesn't look at the floor; the red cross is just there for us to check that it goes where we asked!

And the last part is a reproduction of the charging station, just to show that we can reproduce it with my work. You'll see that NAO seems to stop. In fact it does, because it reached the first target and I had to run across the room to my computer to tell it to go to the second!

No more suspense, let's look at the video!

I feel like some cartoon characters: you see my hand, my foot, my body, but never my head :-P

Feel free to ask if you have any questions about that, I'll be happy to answer.

Tuesday, August 14, 2012

Debugging C++ (Part 4): A print method

And here is the last post of this series about debugging C++. In this one, I present a good way to use prints for debugging. The main drawback of prints is that they are hard to maintain; the method presented here doesn't have this problem.

I sometimes use prints when valgrind tells me nothing and I can't run gdb for some reason (for example, when I use opaque data structures and all the information gdb can give is "root node = 2"… Nice! This happens when I work with BDDs (Binary Decision Diagrams; I think I'll talk about them in the future)). This method works in C++, and is inspired by three things. I'll start by describing these things, and then I'll show the whole result.

Akim Demaille, one of my teachers, gave me the basis of this method. It consists in using macros to print. Here is the idea:

#define ECHO(content) std::cerr << content << std::endl
#define VAR(v) "`" #v "': " << v

int main(int argc, char *argv[])
{
  int i = 42;
  int j = 51;

  ECHO(VAR(i) << " - " << VAR(j)); // Outputs "`i': 42 - `j': 51"
}

I add the quotes (`') to see what happens when I want to print the result of an operation (i + j, for example). Now, let's see how it works. In C and C++, adjacent string literals are concatenated. As an example, this snippet

std::cout << "42 " "is " "the " "answer." << std::endl;

outputs "42 is the answer.". The method also relies on stringification: the # operator, put in front of a macro parameter, quotes it. That is what makes this possible. The macro VAR takes a variable and produces its name, quoted, followed by its value, ready to be printed to std::cerr. And ECHO just prints it followed by a newline.

It looks simple once you see it, but I found it ingenious. I remember doing this job by hand so many times… I think this was very cool, and a sane base on which to build something better!

A friend of mine (Guillaume Delahodde) and I discussed a way to decide whether to print things, depending on whether we are debugging, without any run-time overhead. The result of this discussion was a macro that decides whether to execute the code or not:

#ifndef NDEBUG
# define LINE(content) content
#else
# define LINE(content)
#endif

I already talked about NDEBUG in the first post of this series, about simple methods to avoid creating stupid bugs, so I assume it is clear why it appears here. Usage is simple: put your print code inside LINE and it will run only in debug mode. So we just have to change the definition of ECHO a little, and it will only print in debug mode.

#define ECHO(content) LINE(std::cerr << content << std::endl)

Now our prints exist only in debug mode, which is better than before. We could have stopped here, but we can still enhance the user's comfort. I find writing VAR(i) << " - " << VAR(j) very annoying: that is 25 characters, and I think it is too much. I had the occasion to read Andrei Alexandrescu's book (Modern C++ Design), which uses a set of macros to add syntactic sugar for the user: hand-written macros that treat their first argument and call the macro treating n-1 arguments, down to 1. I call these macros LVN, with N the number of arguments. I only wrote them for 2, 3 and 4 arguments, since those are the most common. Let's see how they look:

#define LV2(first, second) VAR(first) << " - " << VAR(second)
#define LV3(first, second, third) VAR(first) << " - " << LV2(second, third)
#define LV4(first, second, third, fourth) \
  VAR(first) << " - " << LV3(second, third, fourth)

And now we can replace our 25 characters for printing 2 variables with LV2(i, j). Much better, right? Let's see the whole thing in an example:

#include <iostream>

#ifndef NDEBUG
# define LINE(line) line
#else
# define LINE(line)
#endif

#define ECHO(content) LINE(std::cerr << content << std::endl)
#define VAR(v) "`" #v "': " << v

#define LV2(first, second) VAR(first) << " - " << VAR(second)
#define LV3(first, second, third) VAR(first) << " - " << LV2(second, third)
#define LV4(first, second, third, fourth) \
  VAR(first) << " - " << LV3(second, third, fourth)

int main(int argc, char *argv[])
{
  int i = 42;
  int j = 51;

  std::cerr << "`i': " << i << " - " << "`j': " << j << std::endl;
      // Outputs the same thing as the following methods, in a worse way.
  ECHO(VAR(i) << " - " << VAR(j)); // Still outputs "`i': 42 - `j': 51"
  ECHO(LV2(i, j));                 // Also outputs "`i': 42 - `j': 51"
}

The first print is boring to maintain: if you switch to release mode you have to track down all these prints and remove them; if you rename a variable you have to change it in two places (assuming you're not using a perfect refactoring tool); and it is longer to write. The last version is simple to write, you don't have to hunt the prints down to keep them from showing in release builds, and they introduce no overhead in release mode. We are far from the dumb printf method, and that is cool.

The moral of this series is that there are a lot of debugging methods, and you have to know them to be able to adapt your method to the situation. For example, dmesg is very specific, but the other methods would have been very hard to use in that situation.

I invite you to share yours in the comments. I'm sure there are a lot of other methods out there, just waiting to be learned!

Sunday, August 12, 2012

Debugging C++ (Part 3): dmesg

Welcome to the third post of this series about debugging C++. Here, I will talk about something less usual, because it allows debugging after the program has crashed; this method uses dmesg. I present the case where you work with several libraries and your program crashes without any clue about which library is responsible, nor how to reproduce the behavior.

1 dmesg

I heard about dmesg while reading tsuna's blog. Unfortunately I don't (yet) have the skills to read assembler, but the fact that we can discover the name of the faulty function is helpful. I had to use this when I worked on a robot during my internship (a later post will present that). We worked with libraries, and that is the case this post shows.

On a robot there are a lot of parameters coming from the miscellaneous sensors, and the execution of the same program depends on many of them, so it is really hard to reproduce a bug. If the nice "segmentation fault" message appears, how can you debug it, considering that you can't run valgrind and that running gdb is painful?

dmesg was my solution. I wrote a shell script to do the computation for me. Let's start by creating a dummy library that exports one function that segfaults if a null pointer is given; and let's be sadistic: we will call it with nullptr.

// file: libprint.hh
#ifndef TMP_LIBPRINT_HH_
# define TMP_LIBPRINT_HH_

int dereference(int* t);

#endif // !TMP_LIBPRINT_HH_


// file: libprint.cc
#include "libprint.hh"

int dereference(int* t)
{
  return *t;
}

// file: main.cc
#include "libprint.hh"

int main()
{
  return dereference(nullptr);
}

We build a libprint.so that contains the dereference function, and we compile main.cc into a binary named print, linked against this library. And, oh, surprise! Segmentation fault. Let's start the hunt. We call dmesg and look at the last line:

[184608.332284] print[31332]: segfault at 0 ip b772e422 sp bf8ad218 error 4 in libprint.so[b772e000+1000]

We need two pieces of information: the name of the library that contains the bug, and the address of the faulty instruction within that library. To get the name of the library, we take the last field and remove the part between []. To get the address of the faulty instruction, we take the value of the instruction pointer (ip) and the value before the + in the last field, and subtract the second from the first. If you are wondering why subtracting these two values gives the address of the ip inside the library, a drawing may help.

I hope the picture helped; the subtraction simply removes the offset corresponding to the address where the library is loaded.

The question is: how do we make this process automatic? First, we can assume that we always run dmesg right after the error, so a call to tail keeps only the last line. But sometimes this assumption isn't correct, so our solution must also accept a value given as an argument. Here we use the shell's default-value assignment. As a little reminder:

output=$1
output=${output:="default value"}

If an argument is given, output will take its value; otherwise it will be equal to "default value". So we can use it to decide whether to use the script's first argument or to call dmesg directly.

The part of the message before the colon is useless, so we remove it. Then we take the fifth field, which holds the value of ip, and the last field.

The name of the library and the address where it is mapped in memory both lie in the last field, so we cut it in two to get the information we need.

All these operations can be done using only awk and sed.

Once we have the two addresses, we just have to do the subtraction, using the shell's builtin arithmetic. Beware: the values are in hexadecimal, so we must prefix them with 0x to tell the shell the base. The result comes out in decimal, and we want it in hexadecimal, so we use bc, a tool for numeric computation. Thankfully, there is a way to make it convert a number from one base to another: the syntax is simple, you set the variable obase to 16 (the default is 10). And that's all; remember to prepend the 0x to the address yourself, because bc won't.
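
Putting those pieces together on the dmesg line above, with the values taken from the example output:

```shell
first=b772e422    # value of ip from the dmesg line
second=b772e000   # load address of libprint.so (the part before the +)

# Shell arithmetic handles hexadecimal once the values are 0x-prefixed;
# the result is printed in decimal.
address=$(( 0x$first - 0x$second ))
echo "$address"   # 1058

# bc converts it back to hexadecimal when obase is set to 16.
echo "obase=16; $address" | bc   # 422
```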

Here is the complete script:

#! /bin/sh

output=$1
output=${output:=`dmesg | tail -1`}
output=`echo $output | sed -e 's/.*: //'`

first=`echo $output | awk '{ print $5; }'`
second=`echo $output | awk '{print $11; }'`

library=`echo $second | sed -e 's/\[.*//'`
second=`echo $second | sed -e 's/.*\[//' -e 's/\+.*//'`

address=`echo $((0x$first - 0x$second))`
address=`echo "obase=16; $address" | bc`

echo "Segmentation fault in $library at: 0x$address."

Using it is simple: run it right after a segmentation fault when working with a library. Here is what it says about our case.

$ ./dmesg.sh
Segmentation fault in libprint.so at: 0x422.

And now, just run gdb like this (this is what I get with my libprint.so example):

$ gdb libprint.so
...
(gdb) disass 0x422
Dump of assembler code for function _Z11dereferencePi:
   0x0000041c <+0>:     push   %ebp
   0x0000041d <+1>:     mov    %esp,%ebp
   0x0000041f <+3>:     mov    0x8(%ebp),%eax
   0x00000422 <+6>:     mov    (%eax),%eax
   0x00000424 <+8>:     pop    %ebp
   0x00000425 <+9>:     ret
End of assembler dump.
(gdb) ...

If you are fluent in assembler you can read it, or use the metadata given by gdb: "_Z11dereferencePi". Oops, I realize I forgot to use "-g" when compiling. Not important: we have a mangled symbol, and we can use one of the methods presented in one of my previous posts. And voilà: we know that our mistake is in the function dereference(int*). Pretty good, when without this method I was unable to know where it failed or why, with no possibility of reproducing it since there are too many parameters. I don't know how I would have managed without it.
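
For the record, one such method is the c++filt tool shipped with binutils, which demangles the symbol directly:

```shell
# Demangle the symbol found in the gdb output.
c++filt _Z11dereferencePi
# dereference(int*)
```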

I put this script on my github account, so if you want to fork it and enhance it, you can.

Hope you liked it!

Friday, August 10, 2012

Debugging C++ (Part 2): Valgrind and gdb

This article is the second post of the "Debugging C++" series, an introduction to Valgrind and gdb. It is intended for people who don't know these tools: it starts from the beginning and gives some links to the documentation. In the valgrind part, I present the notions of "definitely lost", "indirectly lost", etc. If you are already familiar with these notions, I invite you to skip to the second part of this post, about gdb. In that part I start by briefly presenting what gdb is and what it is useful for, but its main interest is the notion of reverse debugging, which lets you run your program backward. I also present a useful trick for when you want to debug code you wrote (and can modify) inside an unknown environment (for example, imagine you want to debug your (buggy) malloc implementation and you test it with ls).

Before starting tools like Valgrind or gdb, think about compiling your code in debug mode, without optimizations; otherwise you'll have trouble debugging it. For example, with gdb, your cursor will jump from one line to another without following the order of the source code. This can be confusing, and it is caused by optimization.
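
For instance, with g++ that means something like the following (the file names are just examples):

```shell
# Create a trivial source file so the commands below are runnable as-is.
cat > /tmp/dbg_example.cc <<'EOF'
int main() { return 0; }
EOF

# Debug build: -g embeds debug symbols, -O0 disables optimizations,
# so gdb follows the source linearly.
g++ -g -O0 -o /tmp/dbg_example /tmp/dbg_example.cc

# Release build: optimizations on, NDEBUG defined (disables assert).
g++ -O2 -DNDEBUG -o /tmp/dbg_example /tmp/dbg_example.cc
```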

1 Valgrind

Valgrind is a set of tools that contains a memory checker (Memcheck), a profiler (two modules: Callgrind and Cachegrind), a tool for measuring memory consumption (Massif), and a synchronization checker (Helgrind). The tool I use the most is Memcheck, and it is all I will talk about in this post; I'm sure there will be other posts on this set of tools in the future. This is just a brief introduction, so if you already know valgrind, you can skip this part.

Memcheck helps you detect any memory corruption. What does that look like? Let's assume you have a program like this:

#include <vector>

int main(int argc, char *argv[])
{
  std::vector<int> v;
  v.reserve(2);

  v[2] = 4;
}

The call to reserve allocates the space for two integers, but we then try to write to an unallocated address. It is a programmer mistake, and even if in this example the mistake is obvious, sometimes it is not, because it is hidden in a lot of code. So detecting it is not that simple; that's where Memcheck is helpful. Here is what it says about this snippet:

==4641== Invalid write of size 4
==4641==    at 0x40099A: main (test.cc:8)
==4641==  Address 0x5955048 is 0 bytes after a block of size 8 alloc'd
==4641==    at 0x4C286E7: operator new(unsigned long)\
 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4641==    by 0x400FB3: __gnu_cxx::new_allocator<int>::\
 allocate(unsigned long, void const*) (new_allocator.h:94)
==4641==    by 0x400ECA: std::_Vector_base<int, std::allocator<int> >::\
 _M_allocate(unsigned long) (in /tmp/aout)
==4641==    by 0x400D3B: int* std::vector<int, std::allocator<int> >::\
 _M_allocate_and_copy<std::move_iterator<int*> >(unsigned long, \
 std::move_iterator<int*>, std::move_iterator<int*>) (stl_vector.h:1109)
==4641==    by 0x400AC8: std::vector<int, std::allocator<int> >::\
 reserve(unsigned long) (vector.tcc:76)
==4641==    by 0x400988: main (test.cc:6)

4641 is the PID of the process; every line coming from Valgrind is formatted as ==PID==. Lines beginning with a space are just continuations of the line above, wrapped so that the Valgrind output fits in the page.

Remember to use "-g" when you compile your program; otherwise you'll get the name of the binary instead of the file name and line number.

It starts by naming the violation, "Invalid write", and tells us how many bytes are involved. Here it is 4 (sizeof(int)). Then it tells us where we made the mistake, and shows what the mistake is: we have written past an allocated area. Finally it shows the call stack that led to the allocation of this area: it starts with a call to reserve in main, line 6, and so on.

As you can see, the message is clear and helps to find this kind of mistake easily. This is a valuable tool for writing bug-free software. Every one of your programs should run without any error under Valgrind, because even if an error doesn't cause visible trouble directly, it can lead to very weird errors later, which is called (in my school at least^^) a mystical error: there is no clue that could help you find it (without Valgrind).

This is a simple example, just to show how powerful this tool is. Another common use of Memcheck is its ability to detect memory leaks. As an example:

int main()
{
  int* p = new int;

  p = nullptr;
}

This is a trivial case of a memory leak: after the assignment of p to nullptr, there is no longer any pointer to the area allocated with new. It is lost. Here it is trivial to detect, but it could be less so. Once again, we can rely on Memcheck to warn us. Let's see what it has to say:

==4736== HEAP SUMMARY:
==4736==     in use at exit: 4 bytes in 1 blocks
==4736==   total heap usage: 1 allocs, 0 frees, 4 bytes allocated
==4736==
==4736== 4 bytes in 1 blocks are definitely lost in loss record 1 of 1
==4736==    at 0x4C286E7: operator new(unsigned long) (in \
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4736==    by 0x40060D: main (tests.cc:3)
==4736==
==4736== LEAK SUMMARY:
==4736==    definitely lost: 4 bytes in 1 blocks
==4736==    indirectly lost: 0 bytes in 0 blocks
==4736==      possibly lost: 0 bytes in 0 blocks
==4736==    still reachable: 0 bytes in 0 blocks
==4736==         suppressed: 0 bytes in 0 blocks

I show only the interesting part of the Valgrind output. To get it, I ran valgrind --leak-check=full <program_name>. We start by reading the heap summary, which says that at exit, 4 bytes are still in use. It indicates the number of allocations and the number of deallocations. The two numbers are not equal, so there is a problem: more allocations than frees means a leak.

At the bottom of the message we can see the kind of leak. There are four kinds of leaks. The full explanation can be found in the documentation, in section 4.2.7. A short explanation follows.

AA and BB are heap blocks, and C is the set of pointers in your program (in reality it is more than that; see section 4.2.7 cited above). Consider that an arrow represents at least one pointer to the first byte of the allocated area, and an arrow with a `?' represents a pointer to a byte inside the allocated area other than the first (this is called an interior-pointer in the link given above). No arrow means that the heap block is unreachable. These are simplified examples to understand the concept behind these terms.

Once we have understood what these names represent, we can look at the message in the middle, which gives the location of each memory block that leaks. It says "definitely lost": we simply overwrote the only pointer to this area, and the allocation appears at line 3. Once we know the likely reason for the leak and its position in our source file, it is easy to fix.

This is the power of Memcheck. It is the first thing I run when I have a bug, because even if the errors it reveals are not responsible for the bug at hand, they will be annoying in the future.

An important note about Valgrind: since it uses its own allocator and deallocator, some bugs may disappear when running under Memcheck. It happened to me (at least) once, when the output of my program depended on the state of the computer (the state of the heap, in fact), which led my algorithm to give different results when run several times consecutively. This is generally due to a memory corruption or something like that, so I used Memcheck. And under Memcheck, the program started to act normally.

An interesting post about Valgrind that talks about this can be found here.

2 gdb

A second tool I use when I still have a bug after using valgrind is gdb (GNU Debugger).

gdb is a debugger that allows you to walk through your code at run-time and see how the lines you (or your colleagues) wrote influence the program and detect bugs.

I don't want to write a basic tutorial on gdb, since there are tons of them on the Internet. I'd rather talk about two things that can be useful and are not known by everyone. The first one is backward debugging. The second one is a trick to debug a library when you can't run gdb on the program that uses it. Baptiste Afsa, one of my teachers at Epita, gave me this trick.

2.1 Backward debugging

Why backward debugging? Simply because I facepalmed too many times after going too far in my debugging process: stepping past the critical point after slowly walking through trivial code for 5 minutes, and being forced to restart the whole thing.

This feature was introduced in 2009 with gdb version 7.0. Documentation is here. To enable this feature, you need to tell gdb to record the state of your program. The official tutorial is here, so I don't have to explain it in detail (no, I'm not lazy! :) but I don't like duplication and I don't like duplication). I recommend reading the tutorial!
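For the impatient, a minimal session might look like this (commands only; the program being debugged is whatever you attached gdb to):

```
(gdb) break main
(gdb) run
(gdb) record            # start recording execution from this point
(gdb) next              # step forward as usual
(gdb) next
(gdb) reverse-next      # oops, too far: step one line backward
(gdb) reverse-continue  # or run backward to the previous breakpoint
```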

The catch is that recording introduces an overhead when you use next, continue, and so on. This can slow down the debugging process, but I think it can really help the programmer.

Give it a try, it is helpful.

2.2 The infinite loop trick

This tip allows you to debug the code of a library called by a program that you have difficulties running under gdb. The case where I used it is when we had to code malloc (free, realloc and calloc too) at school. To check that it worked, we used LD_PRELOAD to make binaries use our version of malloc instead of the standard one.

Let's assume we run ls with our malloc, how would you debug it?

$ gdb ls
...
Reading symbols from /bin/ls...(no debugging symbols found)...done.
(gdb) break main
Function "main" not defined.

Well… Seems hard, right? The solution given by my teacher was to use an infinite loop. Here is how it works:

#include <iostream>

int main()
{
  bool cond = true;

  while (cond)
    continue;

  std::cout << "Out of the infinite loop" << std::endl;
  // Stuff
}

The trick is to add this snippet at the beginning of the function you want to debug, and to run the program. For example, compile the code above (don't forget -ggdb). This is an infinite loop, yay!

gdb can attach to a running process given its PID, and it is able to set the value of a variable while debugging. So the trick is just:

$ pgrep <program_name>
<pid>
$ gdb attach <pid>
...
main () at tests.cc:8
8           continue;
(gdb) set var cond = false
(gdb) n
7         while (cond)
(gdb) n
10        std::cout << "Out of the infinite loop" << std::endl;
(gdb)

By setting the variable cond to false, we escape the infinite loop, and we have control of the program to debug as we want. Impressive, right?

I think this trick is useful for desperate situations like the one described above.

Those were two great features I wanted to share with you. Do you have features that you want to share?

Thursday, August 9, 2012

Debugging C++ (Part 1): How to write less bugs

Hi all! I planned to make a post about different methods to debug a C++ program, and I realized it would be a very long post, so I split it into 4 parts. The first part is about methods to avoid dummy bugs, notably using two new C++11 features. The second is a presentation of Valgrind and gdb; in the gdb part I include an introduction to reverse-debugging, which consists in running the program backward. The third part explains how to use dmesg to find the instruction in a library that leads to a segmentation fault; one of its advantages is that it works after the crash and doesn't need the program to be restarted. The fourth and last part is about the print method: I present a way to make this method pleasant to use and easy to maintain.

This is far from being an exhaustive list of debugging methods, just some of my favorites. You are invited to share yours in comments! :-)

This first post presents the importance of using warnings when compiling, assert to verify the coherence of the program, and miscellaneous things introduced by C++11.

1 Warnings

The first thing to do to avoid stupid bugs is to think before writing any piece of code. It can be hard sometimes, but it's totally worth it.

My global philosophy about programming is that I want my computer to insult me whenever it can. I want a compiler able to detect as many errors as possible.

So, to make the compiler as strict as possible, enable warnings. Personally, on g++ I always use -W, -Wall, -Wextra and -Werror, the latter turning all warnings into errors. This can save some hours of debugging. Let's see an example of buggy code that compiles silently, but with warnings enabled it won't, and that's great!

int i = -42;
unsigned int j = 51;

if (i > j)
  {
     // Bug found.
  }

It can be surprising that i > j is evaluated as true. It is due to an implicit conversion: once compared with an unsigned int, i is converted to an unsigned int equal to UINT_MAX - 41. It is really easy to make this error when the types are declared far away and you forget what the types of i and j are. Warnings are just mandatory! I hope this little example is enough to convince you. There are several hundred examples like this one; you just have to search the net to find others.

2 Assert

A good practice is to use the assert macro available in the header cassert. This macro evaluates its argument and stops the program if the result is false. If you define NDEBUG (the common way is to pass the -DNDEBUG option to g++; -D defines a macro), the code inside the parentheses of the macro isn't evaluated at all.

The main interest of assert is that it is a good checker for preconditions and postconditions. Beware: you must not use it to manage runtime errors. It is there to verify, all along your development, that you are not receiving something weird. Used well, assert stops the flow of your program before it starts acting crazily. This ensures you look at the right spot to find the source of the problem, and not at a side effect that occurs 20 functions later. It can reduce debugging time considerably.

As said above, the code between the parentheses isn't evaluated in release mode. So a bad use of assert is to put real code in it: once released, that code will never run. But this is also its advantage: checking all these preconditions can introduce an overhead, but you don't have to worry about it in release mode, since it is as if the code never existed.

As a little conclusion: if you don't already use assert, start now! :) It can change a lot of things, and it has already saved me a lot of debugging hours. I hope it will do the same for you!

3 Miscellaneous

3.1 Preventing Narrowing

Now I will give some little tips which can help. There are a lot of tips like this. Once again, I invite you to leave your own tips in the comments!

A common problem in C and C++ is narrowing. Preventing it is an addition of C++11 which can prevent a lot of bugs. As an example:

void doit(int);

int main()
{
  float i = 4.2;

  doit(i); // Hmm... A bug that could be hard to find.
  doit({i}); // warning: narrowing conversion of 'i' from
             // 'float' to 'int' inside { } [-Wnarrowing]
}

This example shows how it can help avoid some kinds of bugs. I recommend using braces around all the variables you want to protect. These situations happen, so why not use the language to avoid wasting your time?

3.2 nullptr

It is also important to use strongly typed variables: it helps the compiler help you! Once again, C++11 comes with a strongly typed null pointer, nullptr. NULL is just 0 (see Stroustrup's FAQ), and it can lead to bugs related to the dispatch on overloaded functions.

void print(long int i) {  std::cout << "long int: " << i << std::endl; }

void print(int* i)     {  std::cout << "pointer" << std::endl;         }

int main()
{
  long int i = 51;

  print(i);       // prints "long int: 51"
  print(NULL);    // Raises a compile-time warning and prints "long int: 0"
  print(nullptr); // prints "pointer"
}

The warning is "passing NULL to non-pointer argument 1 of 'void print(long int)'". Fortunately there is a warning here, because this is not the wanted behavior. The introduction of nullptr makes it possible to represent the concept of a null pointer and to have it strongly and correctly typed. I think it is a good idea to use it instead of NULL or 0.

3.3 Yoda Condition

I use this name after reading this very funny post about new programming jargon. This technique is for people who make typos when they write, like writing = instead of ==. I have to admit I rarely have something like this in my code, since I don't use magic numbers (constant values written directly in the middle of the code). But sometimes it can help. Here is an example:

int main()
{
  int i = 51;

  if (i = 51)
    std::cout << "Oops" << std::endl;

  if (51 = i)
    std::cout << "Thanks g++!" << std::endl;
}

Since some people deliberately write assignments in their conditions, the compiler can't reject this by itself. So you have to make it scream by helping it: with the second if we get the error "error: lvalue required as left operand of assignment".

That's all for the little tips. I hope you see why being strict with yourself can help you. Writing these asserts takes longer than not writing them, because you have to think of all the needed preconditions, etc. But I can assure you that you are very happy when you see your program crash because of an assert, and not with a segmentation fault or some crappy thing like that. As for the warnings, at first glance it seems annoying to be warned about everything, but programming is made of little details too. So use them! :)

As for the miscellaneous tips, these are just little habits to adopt that can improve your work flow by reducing little mistakes. The last one is more a funny thing than a strong guideline, unlike using {} to prevent narrowing and nullptr to tell the compiler that we mean a pointer.

Don't hesitate to post your own tips in comments ;)

Sunday, August 5, 2012

C++11: Vector Improved, How It Works?

C++11 comes with several interesting things, and in this post we will talk a little about a new method of std::vector that comes with the new standard. This method can improve the performance of your programs. The post is divided into two parts: the first explains why it is good to have this new method (emplace_back) and how to use it; the second tries to make the way it works clear. To achieve that, we go through several new features (variadic templates, constexpr, argument forwarding), each explained a little, so the whole thing can be understood.

With the previous standard, you sometimes had to create an array of, let's say, Point3D (or whatever) in a loop. Let's suppose we know the number of elements to put in the vector. We first show the slowest version I can think of, and then a more optimized version thanks to the C++11 standard.

1 A Case Study

struct Point3D
{
  Point3D(int xx, int yy, int zz)
    : x {xx}
    , y {yy}
    , z {zz}
  {
    std::cout << "Cstor" << std::endl;
  }

  Point3D(const Point3D& p)
    : x {p.x}
    , y {p.y}
    , z {p.z}
  {
    std::cout << "Copy Cstor" << std::endl;
  }

  int x;
  int y;
  int z;
};

int main()
{
  std::vector<Point3D> v;

  for (int i = 0; i < 10; ++i)
    {
      v.push_back(Point3D {i, i - 69, i + 42});
    }

  for (int i = 0; i < 10; ++i)
    {
      std::cout << v[i].x << std::endl;
    }
}

What we want in this program is to have only 10 calls to the constructor, because we just want ten objects; our requirement seems legit, right? But if we run this program, we see 10 calls to the first constructor and 25 to the copy constructor. This is huge! There are two reasons for these copies:

  • Vectors are dynamic arrays: each time one reaches its limit, it doubles its size, and all the elements are copied at each reallocation. It starts with a size of 1, then doubles, and the same for 2, 4 and 8. If we add these numbers, we get 15 copies. They can be eliminated by using the reserve method, since we know the number of elements: this call avoids the reallocations and their copies.
  • The question of who is responsible for the destruction of which object is a complex one. The STL handles it by copying the objects it is given into its containers and destroying them when the container is destroyed. This is why, even after reserve removes the reallocations, 10 copies remain. But C++11 comes with an emplace_back method (emplace exists too, but we focus on the push_back counterpart here). This method removes the copy done by push_back. For a user, the only change to make is to replace the call to push_back with a call to emplace_back, passing the arguments you would give to the constructor:
v.emplace_back(i, i - 69, i + 42);

2 How It Works

Now, let's see what changes were needed to implement this emplace_back method. For this post I used glibcxx version 4.7.1 to explore the code.

Before going into the code, we have to present variadic templates and their power, and the concept of forwarding parameters and how to use it. Then we show the difference between push_back and emplace_back.

2.1 Variadic Template and Forwarding

Forwarding parameters means taking all the arguments received by a function and resending them exactly as they came. This works together with variadic templates, which allow passing an undefined number of arguments to a function/method, in a style that looks functional (a head and a rest). The code that follows combines two new features of C++11, constexpr and variadic parameters. A word about constexpr: it allows computations to be done at compile time when possible (there are several conditions to respect, but for now we just assume they are met). In this example, we find the min of a list of arguments at compile time (thanks to constexpr). This version is generic in the number of arguments (2, 3, 4, … n), though it works only with integers.

constexpr
int min(int n, int p)
{
  return n < p ? n : p;
}

template<typename... Args>
constexpr
int min(int n, Args... args)
{
    return min(n, min(args...));
}

int main(int argc, char *argv[])
{
  static_assert(min(4, 5, 6, 42, 7, 3, 6) == 3, "min is incorrect");
}

The recursion mechanism is classical, but what I would have implemented with a vector in the past (in a run-time version) or with template recursion (in a compile-time version) is fully expressible with variadic templates, and it is enjoyable, because it is kind of beautiful. The result can be computed at compile time if all the arguments are known then, or simply computed at run time. This is also invisible for the user.

It is important to note that the arguments are passed by copy. If we wanted to pass them by reference, the game would be harder… and to be honest, at this time, I don't know how I would do it. It seems more complex to handle the recursion because it implies implementing different combinations of the arguments (int&, int&&; int&&, int&; …). There was a proposal to make this part of the standard library (proposal n772), or to use an initializer_list; the proposal shows the second method is easier to implement and faster. We keep this min implementation as a simple example to understand variadic templates. Any proposal of a fully generic working version (using constexpr and variadic templates) is encouraged! A solution might be std::forward, since it seems made for handling forwarding, but my first tries were not successful.

Now that you have seen why I think variadic templates are good to play with, let's talk about forwarding arguments. Forwarding is done by the std::forward function from the `utility' header (the code can be found in `bits/move.h'). It forwards the arguments exactly as they are received (more information in the documentation). Here is a little example of what std::forward does:

#include <iostream>
#include <utility>

struct Item
{
  Item(): value{0} {}

  Item(const Item& p): value{p.value} { std::cout << "Copy" << std::endl; }

  int value;
};

void take_args(int& a, int&& b, int c, Item& f)
{
  std::cout << a << " - " << b << " - " << c
            << " - " <<  f.value << std::endl;
}

template <typename... Args>
void call_take_args(Args&&... args)
{
  take_args(std::forward<Args>(args)...);
}

int main()
{
  Item f;
  int i = 2;
  call_take_args(i, 4, 5, f);
}
// The program outputs "2 - 4 - 5 - 0".

If we remove the reference on the last take_args parameter, "Copy" is also printed. And the program won't compile if we remove a `&' from the second parameter, or add an extra one to the first. This is because std::forward keeps the r/l-valueness of the arguments it receives. How does it know? Let's look at the source of this function (version 4.7.1 of glibcxx):

template<typename _Tp>
constexpr _Tp&&
forward(typename std::remove_reference<_Tp>::type& __t) noexcept
{ return static_cast<_Tp&&>(__t); }

template<typename _Tp>
constexpr _Tp&&
forward(typename std::remove_reference<_Tp>::type&& __t) noexcept
{
  static_assert(!std::is_lvalue_reference<_Tp>::value, "template argument"
                " substituting _Tp is an lvalue reference type");
  return static_cast<_Tp&&>(__t);
}

The first version of the function takes an lvalue and the second an rvalue. The static_assert just checks that the specialization works as expected. This works because std::remove_reference takes a type T (possibly U, U& or U&&), and its member `type' is a typedef for U, stripped of any reference. So the cases where we pass T, T& or T&& are handled the same way, and we end up with two overloads whose signatures are forward(T& __t) and forward(T&& __t): if the argument is an lvalue it goes to the first one, otherwise to the second. noexcept just tells the compiler that the function will not throw an exception, or that the program should be stopped if an exception tries to escape from it.

Now let's look at what they return. The fact that both functions seem to return the same thing might look weird: we have the feeling they both return `T&&'. But in fact `T' can be a `U&' or a `U&&', and C++11 comes with a rule named the reference collapsing rules (see this article which talks about it). It is simple:

  • U& & => U&
  • U&& & => U&
  • U& && => U&
  • U&& && => U&&

This looks like the `and' truth table where & is 0 and && is 1, which is a good way to remember it, I think. So: if _Tp is a U& (the first function), the returned type is U& too, and if it was U&&, it will be U&&, which follows the rules of perfect forwarding. Now that we understand the concept of forwarding and the way it is done, we can look at the code of push_back and emplace_back to see what makes the difference.

2.2 Emplace_back and Push_back

Now that we have all the tools in our hands to understand the difference between these two methods, we can just show the source code:

// Taken from bits/stl_vector.h
void
push_back(const value_type& __x)
{
   if (this->_M_impl._M_finish != this->_M_impl._M_end_of_storage)
     {
        _Alloc_traits::construct(this->_M_impl, this->_M_impl._M_finish,
                                 __x);
        ++this->_M_impl._M_finish;
     }
    else
#ifdef __GXX_EXPERIMENTAL_CXX0X__
      _M_emplace_back_aux(__x);
#else
      _M_insert_aux(end(), __x);
#endif
}

// Taken from bits/vector.tcc
template<typename... _Args>
void
vector<_Tp, _Alloc>::
emplace_back(_Args&&... __args)
{
   if (this->_M_impl._M_finish != this->_M_impl._M_end_of_storage)
     {
        _Alloc_traits::construct(this->_M_impl, this->_M_impl._M_finish,
                                 std::forward<_Args>(__args)...);
        ++this->_M_impl._M_finish;
     }
   else
       _M_emplace_back_aux(std::forward<_Args>(__args)...);
}

The main difference is the use of std::forward. The macro __GXX_EXPERIMENTAL_CXX0X__ is defined when an option activating C++11 is set. _M_emplace_back_aux and _M_insert_aux are responsible for the reallocation when the number of available positions (available meaning allocated and not already taken) reaches 0.

Since there is no other difference between the two methods, the interesting part must be in the construct method (we leave the *_aux methods aside for this post; analyzing them would take too long).

Before that, we have to briefly present the meaning of putting `::' before a function/method name, and the concept of placement new. If you are familiar with these concepts, you can skip these paragraphs and go directly to the code of the construct method.

It is possible for a class to override its operator new. This (static) method is responsible for the allocation of the object. Putting `::' before the operator new means that we want the one from the global namespace and not the overridden one. The code below shows the difference between these two notations:

struct Foo
{
  Foo(int a)
    : value{a}
  {
  }

  static void* operator new(size_t)
  {
    return (void*)42; // g++ is able to detect if
                      // we return 0 and raises a warning.
  }

  int value;
};

int main(int argc, char *argv[])
{
  Foo* bad = new Foo(4); // segmentation fault.
  Foo* good = ::new Foo(4); // ok.
  return 0;
}

This operator new returns a pointer to an unallocated area, which leads to a segmentation fault when the constructor is called and tries to write the value of `a'. If we put `::' in front of new, this buggy operator new is not called, and there is no memory corruption.

Now, let's talk about placement new. It is just a way to construct an object at a specific position in memory, which can be useful when working with memory pools, or when performance matters: allocation and deallocation are expensive, and avoiding them can speed up the program. The syntax is just new(position), where position is a pointer to where the object must be built. As simple as that.

Now let's close the parenthesis and look at the code of construct (this is not the construct that is called directly, but the intermediate functions are not interesting, since the same path is followed by emplace_back and push_back):

// Taken from ext/new_allocator.h
template<typename _Up, typename... _Args>
void
construct(_Up* __p, _Args&&... __args)
{ ::new((void *)__p) _Up(std::forward<_Args>(__args)...); }

This is the C++11 version, which handles both the case where we copy and the case where we construct. Let's start with this line: it uses placement new because it is not up to `new' to decide where to put the object, since it must be put at the end of the vector.

There is a call to the constructor, and the arguments received are forwarded; these arguments are the ones given to emplace_back. Then it is up to overload resolution to make it work: if we are making a copy (we arrived here through push_back), the copy constructor is called; otherwise, another constructor is called if it exists (if not, we get a compile-time error).

And that's all! To get here, we had to know about variadic templates, argument forwarding, constexpr, placement new, operator new overriding and the reference collapsing rules. But at least we now understand the changes needed to make this emplace_back method possible, and how C++11 makes it work. Thanks to the new standard for this improvement! :-)

PS: Thanks to Christopher Chedeau and Gregoire Bouchetoun for their comments and corrections about this article.