Hooking a C++ Constructor

NOTE: This post mentions injecting DLLs into a game. Doing this kind of stuff in single-player games for your own amusement is fun. In multiplayer games, where you might be harming other people's enjoyment, it is simply gross. Don't ruin someone's free time.

I've been playing around with modding a game by injecting a DLL into the process and hooking native functions. Most C or C++ functions can be easily hooked: write a new function that uses the same calling convention, takes the same arguments, and returns a compatible type. Install the hook using some library and you're basically done.

However, C++ constructors are not that easy. What calling convention do they use? Do they return anything, and if so, what?

__thiscall

In 32-bit x86 code compiled with VC++, constructors are basically member functions, and as such they use the __thiscall calling convention. Although C++ gives constructors no return type, the compiled constructor must return the this pointer.

VC++ does not allow you to specify the __thiscall calling convention on non-member functions, but that's pretty easy to work around. Write a class and add a member function to it. This member function can then be used to replace the original constructor.

Imagine that our imaginary game uses the following class:

class Player {
    int health_points;
    int weapon_rank;

public:
    // Constructor
    Player(int level)
    {
        health_points = 20 * level;
        weapon_rank = level;
    }

    // ... rest ...
};

The health points and weapon rank correspond directly to the player's level. We want to have some fun and change this, so we hook the constructor by writing a compatible class in our DLL with a member function that mirrors the constructor's interface:

struct HackedPlayer {
    int health_points;
    int weapon_rank;

    HackedPlayer* constructor(int level)
    {
        health_points = 100'000'000;
        weapon_rank = 9001;

        return this;
    }
};

void install_hooks()
{
    MH_CreateHook(
        (void*)0x0059cf20, // We know the original constructor is located here
        memfn_voidp(&HackedPlayer::constructor),
        nullptr
    );
}

Now when we run the game, inject the DLL, and install our hooks, the player is no longer initialised by its own constructor. Instead, the game calls our HackedPlayer::constructor method. We've changed the rules of the game 😎.
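
If you want to sanity-check a hook body without launching the game, here is a minimal sketch of what the game effectively does once the hook is installed: it allocates raw storage and calls whatever lives at the constructor's address with that storage as this. The names simulate_game_spawn and SpawnResult are hypothetical and only exist for this illustration; the real call site is somewhere inside the game.

```cpp
#include <cassert>
#include <cstdlib>

struct HackedPlayer {
    int health_points;
    int weapon_rank;

    HackedPlayer* constructor(int level)
    {
        (void)level; // the hook deliberately ignores the level
        health_points = 100'000'000;
        weapon_rank = 9001;
        return this;
    }
};

// Hypothetical summary of what the simulated call site observed.
struct SpawnResult {
    bool returned_this;
    int health_points;
    int weapon_rank;
};

// Hypothetical stand-in for the game's call site: raw storage first,
// then a call to the (hooked) constructor with that storage as `this`.
SpawnResult simulate_game_spawn(int level)
{
    void* storage = std::malloc(sizeof(HackedPlayer));
    assert(storage != nullptr);

    HackedPlayer* self = static_cast<HackedPlayer*>(storage);
    HackedPlayer* result = self->constructor(level);

    SpawnResult outcome{result == self, result->health_points, result->weapon_rank};
    std::free(storage);
    return outcome;
}
```

This only works cleanly because HackedPlayer contains nothing but ints, so there is nothing that needs constructing before the fields are assigned.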

Problems...

The example I gave was very simplified. It actually works, but in most games classes are much more complicated than that. Imagine if the class contains a std::string:

 class Player {
     int health_points;
     int weapon_rank;
+    std::string name;

 public:
     // Constructor
+    Player(int level, std::string given_name)
     {
         health_points = 20 * level;
         weapon_rank = level;
+        name = given_name;
     }
 };

We add the std::string to our HackedPlayer and assign to it just like the original constructor does:

struct HackedPlayer {
    int health_points;
    int weapon_rank;
    std::string name;

    HackedPlayer* constructor(int level, std::string given_name)
    {
        health_points = 100'000'000;
        weapon_rank = 9001;
        name = given_name;  // We added this line, what could go wrong? :^)

        return this;
    }
};

It looks like not much has changed, but if you try this new example you'll find that the program now crashes! Somehow the function we replaced the constructor with is no longer compatible. What gives?

The crash occurs at the name = given_name assignment. The problem is that std::string, unlike int, is a very complicated type: it manages a dynamic buffer of memory containing the text. During assignment, it checks whether the current buffer is large enough to hold the new text. If it is, the text is copied into the buffer. If it's too small, the current buffer is deallocated and a new one of the right size is allocated.

None of this makes sense inside of HackedPlayer::constructor. The name variable isn't an actual std::string because no std::string has been constructed. Once we use the std::string assignment operator, everything goes off the rails: it'll try to write into a buffer that doesn't exist, or deallocate one that was never allocated. We're breaking the rules of C++ and sooner or later the program will crash.
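
To make that dependency on existing state concrete, here is a heavily simplified sketch of a string class. TinyString is a hypothetical stand-in I'm using for illustration, not how any real std::string is implemented (real ones have small-string optimisation, allocators, and more). The point is only that assign reads data_ and capacity_ before it writes anything.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical, heavily simplified stand-in for std::string.
class TinyString {
    char*       data_ = nullptr;  // set up by the constructor
    std::size_t size_ = 0;
    std::size_t capacity_ = 0;

public:
    TinyString() = default;
    TinyString(const TinyString&) = delete;
    TinyString& operator=(const TinyString&) = delete;
    ~TinyString() { delete[] data_; }

    void assign(const char* text)
    {
        assert(text != nullptr);
        const std::size_t needed = std::strlen(text);

        // Assignment READS the current state first. If no constructor
        // ever ran, data_ and capacity_ are whatever bytes happened to
        // be in memory, and both branches below go wrong.
        if (needed > capacity_) {
            delete[] data_;       // frees a garbage pointer if unconstructed
            data_ = nullptr;
            capacity_ = 0;
            size_ = 0;
            data_ = new char[needed + 1];
            capacity_ = needed;
        }
        std::memcpy(data_, text, needed + 1); // writes through data_
        size_ = needed;
    }

    const char* c_str() const { return data_ ? data_ : ""; }
    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }
};
```

On a properly constructed TinyString this is fine: assigning "hello" allocates, and a later assignment of "hi" reuses the same buffer. On raw memory, the very first comparison already depends on garbage.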

What's in a name?

It works with the real constructor but not with our hook because the real constructor has some 'hidden' code that constructs the name member variable before the body of the constructor runs. Our hook doesn't do this because it's not a constructor. It's a member function that tries to act like one.
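
You can watch that 'hidden' code run with a small, portable sketch. TracedName and construction_log are hypothetical helpers for this illustration, and this Player is a cut-down version of the one above: the member logs when it is constructed and when it is assigned, and the constructor body logs when it starts.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical global log used only for this illustration.
inline std::vector<std::string>& construction_log()
{
    static std::vector<std::string> log;
    return log;
}

// Hypothetical member type that records what happens to it.
struct TracedName {
    TracedName() { construction_log().push_back("member constructed"); }

    TracedName& operator=(const char*)
    {
        construction_log().push_back("member assigned");
        return *this;
    }
};

struct Player {
    int health_points;
    TracedName name;

    explicit Player(int level)
    {
        // By the time the body starts, the compiler-generated code has
        // already constructed `name`.
        assert(!construction_log().empty());
        construction_log().push_back("constructor body entered");
        health_points = 20 * level;
        name = "hero";
    }
};

// Constructs one Player and returns the order in which things happened.
std::vector<std::string> trace_player_construction()
{
    construction_log().clear();
    Player player(1);
    (void)player;
    return construction_log();
}
```

The log comes back as "member constructed", "constructor body entered", "member assigned". A plain member function used as a hook only gets the last two steps.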

There are several ways to solve this problem and I'm not going to present the easiest one (use placement new on the std::string). Instead I want to look at why we started by hooking a constructor using a member function, and how we might be able to use a real constructor instead.
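
For completeness, here is roughly what that easier route could look like. This is a sketch, assuming C++17 (for std::destroy_at); simulate_game_spawn is a hypothetical stand-in for the game's call site. Like the hook itself, it calls a member function on storage where no HackedPlayer has been constructed, which the C++ standard does not bless even though it matches what happens in the hooked process.

```cpp
#include <cassert>
#include <memory>   // std::destroy_at
#include <new>      // placement new
#include <string>
#include <utility>  // std::move

struct HackedPlayer {
    int health_points;
    int weapon_rank;
    std::string name;

    HackedPlayer* constructor(int level, std::string given_name)
    {
        (void)level; // the hook deliberately ignores the level
        health_points = 100'000'000;
        weapon_rank = 9001;

        // Construct the member in place instead of assigning to it,
        // so no uninitialised std::string state is ever read.
        new (&name) std::string(std::move(given_name));

        return this;
    }
};

// Hypothetical stand-in for the game's call site: raw storage, then a
// call to the hooked "constructor". Returns the name that was stored.
std::string simulate_game_spawn(std::string given_name)
{
    alignas(HackedPlayer) unsigned char storage[sizeof(HackedPlayer)];
    HackedPlayer* self = reinterpret_cast<HackedPlayer*>(storage);

    HackedPlayer* player = self->constructor(1, std::move(given_name));
    assert(player == self);

    std::string stored = player->name;
    std::destroy_at(&player->name); // the game's destructor would do this
    return stored;
}
```

The downside is that every non-trivial member needs its own placement new, in declaration order, which is exactly the kind of constructor emulation I wanted to avoid.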

To hook a function using MH_CreateHook we must pass in two void pointers: one to the original function (in this case we hard-code the address) and one to the replacement function. With normal functions this is pretty easy, but here we're using a member function (for the __thiscall calling convention), which makes it slightly more difficult. Unlike a pointer to a free function, a pointer to a member function can't be cast to void* directly. Instead you must use a helper function such as memfn_voidp:

template<class MemFn>
void* memfn_voidp(MemFn memfn)
{
    return *(void**)&memfn;
}

This is just a trick you have to know about and use. I got this trick from a comment in the Microsoft Detours library :^).
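
If the raw cast makes you nervous, here is a slightly more defensive sketch of the same trick. memfn_voidp_checked and Widget are hypothetical names of mine, not part of Detours or MinHook. It still relies on the same assumption as the original: the member function is non-virtual, and the ABI stores the code address in the first pointer-sized word of the member function pointer.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical variant of memfn_voidp with compile-time checks.
template<class MemFn>
void* memfn_voidp_checked(MemFn memfn)
{
    static_assert(std::is_member_function_pointer<MemFn>::value,
                  "expected a pointer to member function");
    static_assert(sizeof(MemFn) >= sizeof(void*),
                  "member function pointer is smaller than void*");

    // Copy the first pointer-sized word instead of dereferencing a
    // casted pointer; same bytes, but without the aliasing cast.
    void* address = nullptr;
    std::memcpy(&address, &memfn, sizeof(address));

    assert(address != nullptr); // no real function lives at address zero
    return address;
}

// Hypothetical class used only to have something to point at.
struct Widget {
    int poke() { return 42; }
};
```

Calling memfn_voidp_checked(&Widget::poke) yields a pointer you can hand to a hooking library, while passing something that isn't a member function pointer fails to compile.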

Alright, so it's slightly more difficult to hook using a member function as opposed to a free function. What about using a constructor? Well... it's pretty much impossible:

MH_CreateHook(
    (void*)0x0059cf20,
    memfn_voidp(&HackedPlayer::HackedPlayer),
    nullptr
);
// Results in compiler error:
// error C2277: 'HackedPlayer::{ctor}': cannot take address of this member function

We can't refer to a constructor directly. It's not allowed by the C++ standard, nor as a VC++ extension. So how can we work around this restriction? We'll have to write a tiny bit of assembly.

Hook Shim

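Here's where we're at: the function that replaces the constructor must use the __thiscall calling convention, VC++ only lets us specify __thiscall on member functions, and the one member function we actually want to use, the constructor, can't have its address taken.

We must work around or break one of these rules.

This is the plan: write a free function that handles the __thiscall calling convention itself, and have it invoke a real HackedPlayer constructor.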

Earlier I said that VC++ does not allow you to specify __thiscall on free functions, so how can we write a free function that 'handles' the __thiscall calling convention? We will need to write some assembly code. That way we are not held back by this VC++ restriction. I'm not all that familiar with writing assembly, so I want to make this as easy as possible for myself: the x86 assembly code should call back into C++ as soon as possible.

Let's actually start with that C++ code:

HackedPlayer* __stdcall
construct_HackedPlayer(void* self, int level, std::string given_name)
{
    return new (self) HackedPlayer(level, given_name);
}
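
I haven't shown what HackedPlayer looks like now that it has a real constructor, so here is a portable sketch of the idea. The constructor body is my assumption of how the earlier hook would translate, and construct_HackedPlayer_portable is a hypothetical name: it drops __stdcall so it compiles anywhere, which means it is for experimenting, not for installing as a hook.

```cpp
#include <cassert>
#include <cstdint>
#include <new>      // placement new
#include <string>
#include <utility>  // std::move

struct HackedPlayer {
    int health_points;
    int weapon_rank;
    std::string name;

    // A real constructor: the compiler generates the code that
    // constructs `name` before the body runs.
    HackedPlayer(int level, std::string given_name)
        : health_points(100'000'000)
        , weapon_rank(9001)
        , name(std::move(given_name))
    {
        (void)level; // the hook deliberately ignores the level
    }
};

// Hypothetical portable twin of construct_HackedPlayer (no __stdcall).
HackedPlayer* construct_HackedPlayer_portable(void* self, int level, std::string given_name)
{
    // The caller owns the storage; we only check it is usable.
    assert(self != nullptr);
    assert(reinterpret_cast<std::uintptr_t>(self) % alignof(HackedPlayer) == 0);

    // Placement new runs the constructor on storage we did not allocate
    // and evaluates to a pointer to the new object in that storage.
    return new (self) HackedPlayer(level, std::move(given_name));
}
```

Given a suitably aligned buffer, the returned pointer is the buffer's address, which is exactly the "return this" behaviour the game expects from a constructor.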

Pretty simple: we use placement new to immediately invoke the constructor. We return the this pointer, which follows how the constructor is supposed to work. But wait, we're not using the right calling convention! We can't specify __thiscall, but why are we specifying __stdcall instead?

Well, __thiscall and __stdcall are quite similar. In both, the remaining arguments go on the stack and the callee cleans them up. The only difference is that the this/self pointer is passed via the ecx register with __thiscall, but as the first argument on the stack with __stdcall. This means that our assembly code (which will be invoked using the __thiscall convention) only has to convert the call into __stdcall and invoke this C++ function.

We do this by putting the ecx register onto the stack, being careful to preserve the return address:

__declspec(naked)
__declspec(noinline)
HackedPlayer* thiscall_shim(void* self, int level, std::string given_name) noexcept
{
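    __asm {
        xchg ecx, [esp]             ; ecx = return address, [esp] = this
        push ecx                    ; return address goes back on top
        jmp construct_HackedPlayer  ; tail jump, callee returns to the game
    }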
}
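
If the two instructions feel like magic, it can help to trace them on a toy model. The sketch below is not real x86: ToyCpu and run_thiscall_shim are hypothetical, the stack is a vector whose back() plays the role of [esp], and the by-value std::string argument is shrunk to a single placeholder word.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical toy model of the relevant machine state.
struct ToyCpu {
    std::uint32_t ecx;
    std::vector<std::uint32_t> stack; // stack.back() stands in for [esp]
};

// Mirrors the two data-moving instructions of the shim.
void run_thiscall_shim(ToyCpu& cpu)
{
    // A call always leaves a return address on top of the stack.
    assert(!cpu.stack.empty());

    std::swap(cpu.ecx, cpu.stack.back()); // xchg ecx, [esp]
    cpu.stack.push_back(cpu.ecx);         // push ecx
}
```

Start with the __thiscall entry state: ecx holds this (say 0x1000) and the stack, bottom to top, is given_name placeholder, level, return address. After the shim, the stack is given_name placeholder, level, this, return address. That is the layout a __stdcall function taking (self, level, given_name) expects on entry, so the jmp can go straight into construct_HackedPlayer.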

Now we install this shim, and hopefully everything works:

MH_CreateHook(
    (void*)0x0059cf20,
    (void*)thiscall_shim,
    nullptr
);

And yes, it works! Despite being at level one, I have one hundred million health points and a weapon rank over 9000!

I'm happy I was able to figure this out. It took some time, but I now have a solution that should work for any constructor, no matter how many arguments it takes or what kind they are. There are easier ways to solve this problem, but for my use case I really wanted a matching real constructor in my DLL, not a member function with a bunch of extra stuff added to it so that it emulates one.