C++ supports developers in object-oriented programming (OOP) and takes away from the developer the responsibility of dealing with many of the problems of the OOP paradigm. But these problems do not magically disappear. Rather, it is the compiler that aims to provide a solution to many of the complexities that arise from C++ objects, virtual methods, inheritance, etc. At its best the solution is almost transparent for developers. But beware of assuming or relying on ‘under-the-hood’ behavior. This is what I want to share in this post: some aspects of how compilers deal with C++ objects, virtual methods, inheritance, etc. At the end I want to describe a real-world problem that I analyzed recently, which I called a “pointer casting vulnerability”.

Pointers C vs C++

C++ introduces classes supported by the C++ language standard, which is a big change. Compilers need to take care of many problems, e.g. constructors, destructors, separating fields, method calling etc.

In C we are able to create a function pointer, so why shouldn't we be able to create a pointer-to-member-function in C++? What does it mean? If we have a class that implements some method, then from a C developer's point of view this is just a function declared inside the class. C++ should allow us to create a pointer to exactly this method. It is called a pointer-to-member-function.

How can you create one? It is more complex than a function pointer in C. Let's see an example:

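#include <cstdio>

class Whatever {
public:
	int func(int p_test) {
		printf("I'm method \"func()\" !\n");
		return p_test;	// a non-void function must return a value
	}
};

int main() {
	typedef int (Whatever::*func_ptr)(int);	// pointer-to-member-function type
	func_ptr p_func = &Whatever::func;
	Whatever *base = new Whatever();
	int ret = (base->*p_func)(0x29a);	// the instance supplies 'this'
	delete base;
	return ret == 0x29a ? 0 : 1;
}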

The definition of the function pointer is not much different compared to C. But the way to call the function through the pointer obviously is, because of the implied this pointer. At this point some magic happens. Why do we need to have an instance of the class and why are we using it as a base to call the pointer? The answer requires us to analyze what these “pointers” look like in memory.

Normal pointers (known from the C language) have the size of a CPU word. If the CPU operates on 32-bit registers, a C-like pointer will be 32 bits long and will contain just a memory address. But the C++ pointers-to-member-function mentioned above are sometimes also called “fat pointers”. This name gives us a hint that they keep more information, not just a memory address. The fat pointer implementation is compiler-dependent, but its size is typically bigger than that of a function pointer. In Microsoft Visual C++ a pointer-to-member-function (fat pointer) can be 4, 8, 12 or even 16 bytes in size! Why so many options and why is so much memory needed? It all depends on the nature of the class it's associated with.

Classes and inheritance

The nature of a “pointer to member function” is driven by the layout of the class whose member function we want to point to.

There are some excellent references on the details of C++ object layout – see [1,2] for example. We give just one example class and associated layout: consider two unrelated classes that derive from the same base class:

class Tcpip {
public:
    short ip_id;
    virtual short chksum();
};

class Protocol_1 : virtual public Tcpip {
public:
	int value_1;
	virtual int proto_1_init();
};

class Protocol_2 : virtual public Tcpip {
public:
	int value_2;
	virtual int proto_2_init();
};

These two classes could be written by completely different developers or even companies. They don't need to be aware of each other. Now we imagine the situation that a third company wants to write a wrapper for these two protocols and export APIs that are independent of the specification of either. The new class could look like this:

class Proto_wrap 
	: public Protocol_1, public Protocol_2 {
public:
	int value_3;
	int parse_something();
};

Note that having declared Protocol_1 and Protocol_2 with virtual inheritance means that there is a single version of ip_id (and chksum()) in the memory layout and the statement

	pProtoWrap->chksum(); 

is unambiguous. The layout of a Proto_wrap object is:

(Without the virtual inheritance, each of Protocol_1 and Protocol_2 would have its own copy of the ip_id member, leading to ambiguity if we were to try something like: int a = pProto_wrap->ip_id )

Pointer-to-member-function (fat pointers)

Now that we have recalled some relevant background we can return to the original problem. Why is a pointer-to-member-function bigger than a C-style function pointer? The Microsoft VC++ compiler can generate pointers-to-member-function (fat pointers) that are 4, 8, 12 or even 16 bytes long [3,4]. Why are there so many options and why do they need so much memory? Hopefully thinking about the object layout example above provides some hints...

If we take the address of a static member function, we get a normal (C-like) function pointer that is 4 bytes long (on a 32-bit architecture; otherwise the CPU word size). Why? Because static functions have a fixed address and no ‘this’ pointer, so they are unrelated to any specific object instance.

In the single inheritance case, any member function can be called through a single ‘this’ pointer that needs no adjustment, so in Visual C++ the pointer-to-member-function can stay one CPU word long.

In the multiple inheritance case, however, given a derived object (e.g. Proto_wrap), its ‘this’ pointer is not valid for every base class. Rather, ‘this’ needs to be adjusted depending on which base class is being referred to. In this case the “fat pointer” will be 2 CPU words long, holding the function address and the adjustment to apply to ‘this’:

				| function address | adjustment to “this” |

See [5] for a more detailed walkthrough.

Additionally, if our object uses virtual inheritance (the layout example given in the previous section), then we need to know not only which of the vtables is relevant (Protocol_1’s or Protocol_2’s) but also the offset within it that corresponds to the member function that we want to point to. In this case the pointer-to-member-function will be 12 bytes long (3 CPU words).

This is not the end… You can also forward declare a class, in which case the compiler has no idea about its memory layout and will allocate a 16-byte structure for the pointer-to-member-function, unless you specify the kind of inheritance associated with the class via special compiler switches/pragmas [3,4].

So now I will try to explain some interesting security-related behavior which I came across during my work…

C++ pointer casting vulnerability

Let's analyze the following skeleton example. We have a base class, UnknownBase, which is virtually inherited by two further classes, Manage and ProcessHelper. RealData holds some data; Manage can process specific types of data; ‘BYTE *_ip’ is used as the means to direct which of Manage’s processing methods should be called.

class UnknownBase;

class RealData {
   friend class Manage;
   public:
...
      ULONG_PTR  _lcurr;	// some real data...
      int   _flags;
      int   _flags2;
      int   _flags3;

};

class Manage : public virtual UnknownBase {   
      friend class ProcessHelper;
public:
      BYTE *   _ip;
      RealData *    _curr;
...
};

class ProcessHelper  : virtual UnknownBase {
   public:
      typedef LONG_PTR (Manage::*ManageFunc)();

      struct DummyStruct {
         ManageFunc _executeMe;  // pointer to member function!
      };
...
};

LONG_PTR Manage::frame() {
   LONG_PTR offset = (this->*(((ProcessHelper::DummyStruct *) _ip)->_executeMe))();
   return offset;
}

The key to this vulnerability is the rather convoluted cast in Manage::frame(). Note the types involved:

  • _ip is of type BYTE *
  • ProcessHelper::DummyStruct is a struct with a single member of pointer-to-member-function type ManageFunc

So the ‘BYTE *’ data is actually being cast (in a roundabout way via a struct) to a ManageFunc pointer-to-member-function type, i.e. the instruction is really equivalent to:

LONG_PTR offset = ((ManageFunc)_ip)();

However the compiler errors out on such a statement, flagging that ‘BYTE *’ and ‘ManageFunc‘ are incompatible types (different sizes in particular!) to be casting to and from. It appears here that the developer worked round the compiler error by introducing the ‘struct DummyStruct’ subterfuge: they assumed that under the hood the ManageFunc really was just a standard pointer, and were able to indirectly achieve the incorrect cast… C/C++ will always allow the persistent developer to eventually do the wrong thing.

Let’s run through how this breaks in practice. We create the following instances:

  Manage *temp_manage = new Manage;
  RealData *temp_real_block = new RealData;

To illustrate the issue we might set up the ‘flags’ members as follows:

  temp_real_block->_flags = 0x41414141;
  temp_real_block->_flags2 = 0x41414141;
  temp_real_block->_flags3 = 0x41414141;

And let’s suppose the code does something like the following:

  temp_manage->_ip = (BYTE *)&temp_real_block->_lcurr;
  temp_manage->frame(); // does the (ManageFunc)_ip cast

This leads to a crash. After the cast to the DummyStruct structure containing a pointer-to-member-function, the call expects a fat pointer memory layout associated with virtual inheritance. Specifically, it expects to find vbtable offset information at the 3rd CPU word: this value is read and added to the pointer. In our case, _ip was pointing at _lcurr and so we have the following adjacent data:

      ULONG_PTR  _lcurr;
      int   _flags;
      int   _flags2;
      int   _flags3;

So here, the arbitrary _flags data will be added to the memory address _lcurr in an attempt to form the address of the member function.

Note that RTTI (run-time type information) does not help here; the incorrect cast is directly computing an incorrect memory address to call.

The security consequences are potentially severe - full remote code execution (RCE). In such an incorrect ‘standard pointer’ to ‘pointer-to-member-function’ cast scenario, the data adjacent to the standard pointer will be used to calculate the address of the member function. If the attacker controls this then by choosing suitable values here, he can cause that address calculation to result in a value of his choosing, thus gaining control of execution.

In the real vulnerability we didn’t have direct control over what would be written to the “_flags” field. But we were able to execute a code path that set “_flags” to a non-zero value: the number 2. So we were able to set “_flags” to 2 and then execute the vulnerable code. Because the pointer was miscalculated (2 was added to it), the memory that was cast to the structure held bad values. Inside the structure were function pointers, and because they were shifted by 2 bytes they pointed to memory that was always somewhere in the heap range. An attacker could spray the heap and thus control this [6].

Summary

The higher level a language, the more problems the associated compilers must solve. But the developer is ultimately responsible for writing correct code. Typically C/C++ compilers will ultimately allow you to cast to and from unrelated types (C pointers to pointers-to-member-functions for example) and back again. Developers should avoid such illegal activity, take careful note of compiler warnings that occur when they break the rules, and be aware that if they persist they’re on their own… The Microsoft Visual C++ compiler detects the situation described above and informs developers by printing an appropriate error message:

error C2440: 'type cast' : cannot convert from 'BYTE *' to 'XXXyyyZZZ'. 
  There is no context in which this conversion is possible.

By the way, I would like to thank the following people for their help with my work:

  • Tim Burrell (MSEC)
  • Greg Wroblewski (MSEC PENTEST)
  • Suha Can

Best regards,
Adam Zabrocki

References

[1] Reversing Microsoft Visual C++ Part II: Classes, Methods and RTTI

[2] C++: under the hood

[3] MSDN: Inheritance keywords

[4] MSDN: pointers-to-members pragma

[5] Pointers to member functions are very strange animals

[6] http://technet.microsoft.com/en-us/security/bulletin/ms13-002