In this series of tutorials I am going to discuss some of the inner workings of ATL and the techniques that ATL uses.
Let's start the discussion by talking about the memory layout of an object. Let's write a simple program with a class that has no data members and take a look at its memory structure.
#include <iostream>
using namespace std;
class Class {
};
int main() {
Class objClass;
cout << "Size of object is = " << sizeof(objClass) << endl;
cout << "Address of object is = " << &objClass << endl;
return 0;
}
The output of this program is
Size of object is = 1
Address of object is = 0012FF7C
If we add some data members, the size of the class becomes the sum of the storage of the individual member variables (plus any padding the compiler inserts for alignment). The same is true for templates. Let's take a look at a template class, CPoint.
#include <iostream>
using namespace std;
template <typename T>
class CPoint {
public:
T m_x;
T m_y;
};
int main() {
CPoint<int> objPoint;
cout << "Size of object is = " << sizeof(objPoint) << endl;
cout << "Address of object is = " << &objPoint << endl;
return 0;
}
Now the output of the program is
Size of object is = 8
Address of object is = 0012FF78
Now let's add inheritance to the program. We are going to derive the class CPoint3D from CPoint and look at the memory structure of this program.
#include <iostream>
using namespace std;
template <typename T>
class CPoint {
public:
T m_x;
T m_y;
};
template <typename T>
class CPoint3D : public CPoint<T> {
public:
T m_z;
};
int main() {
CPoint<int> objPoint;
cout << "Size of object Point is = " << sizeof(objPoint) << endl;
cout << "Address of object Point is = " << &objPoint << endl;
CPoint3D<int> objPoint3D;
cout << "Size of object Point3D is = " << sizeof(objPoint3D) << endl;
cout << "Address of object Point3D is = " << &objPoint3D << endl;
return 0;
}
The output of this program is
Size of object Point is = 8
Address of object Point is = 0012FF78
Size of object Point3D is = 12
Address of object Point3D is = 0012FF6C
This program shows the memory structure of the derived class. The memory occupied by the object is the sum of its own data members plus those of its base class.
Things become interesting when a virtual function joins the party. Take a look at the following program
#include <iostream>
using namespace std;
class Class {
public:
virtual void fun() { cout << "Class::fun" << endl; }
};
int main() {
Class objClass;
cout << "Size of Class = " << sizeof(objClass) << endl;
cout << "Address of Class = " << &objClass << endl;
return 0;
}
The output of the program is
Size of Class = 4
Address of Class = 0012FF7C
The situation becomes more interesting when we add more than one virtual function.
#include <iostream>
using namespace std;
class Class {
public:
virtual void fun1() { cout << "Class::fun1" << endl; }
virtual void fun2() { cout << "Class::fun2" << endl; }
virtual void fun3() { cout << "Class::fun3" << endl; }
};
int main() {
Class objClass;
cout << "Size of Class = " << sizeof(objClass) << endl;
cout << "Address of Class = " << &objClass << endl;
return 0;
}
The output of this program is the same as that of the previous program. Let's do one more experiment to understand this better.
#include <iostream>
using namespace std;
class CPoint {
public:
int m_ix;
int m_iy;
virtual ~CPoint() { };
};
int main() {
CPoint objPoint;
cout << "Size of Class = " << sizeof(objPoint) << endl;
cout << "Address of Class = " << &objPoint << endl;
return 0;
}
The output of the program is
Size of Class = 12
Address of Class = 0012FF68
The output of these programs shows that adding any virtual function to a class increases its size by the size of one pointer, which in 32-bit Visual C++ is the same as the size of an int, i.e. 4 bytes. Adding more virtual functions does not increase the size any further. This means there are three int-sized slots in this class: one for x, one for y, and one to handle virtual functions, which is called the virtual pointer. First let's take a look at this new slot, the virtual pointer, which sits at the start (or end) of the object. To do this we are going to directly access the memory occupied by the object, by storing the address of the object in an int pointer and using the magic of pointer arithmetic.
#include <iostream>
using namespace std;
class CPoint {
public:
int m_ix;
int m_iy;
CPoint(const int p_ix = 0, const int p_iy = 0) :
m_ix(p_ix), m_iy(p_iy) {
}
int getX() const {
return m_ix;
}
int getY() const {
return m_iy;
}
virtual ~CPoint() { };
};
int main() {
CPoint objPoint(5, 10);
int* pInt = (int*)&objPoint;
*(pInt+0) = 100; // want to change the value of x
*(pInt+1) = 200; // want to change the value of y
cout << "X = " << objPoint.getX() << endl;
cout << "Y = " << objPoint.getY() << endl;
return 0;
}
The important thing in this program is
int* pInt = (int*)&objPoint;
*(pInt+0) = 100; // want to change the value of x
*(pInt+1) = 200; // want to change the value of y
in which we store the address of the object in an int pointer and then treat the object as an array of integers. The output of this program is
X = 200
Y = 10
Of course this is not the required result. It shows that 200 was stored in the location where the m_ix data member resides. This means that m_ix, the first member variable, starts at the second position of the object's memory, not the first. In other words, the first slot is the virtual pointer and the rest are the data members of the object. Just change the following two lines
int* pInt = (int*)&objPoint;
*(pInt+1) = 100; // want to change the value of x
*(pInt+2) = 200; // want to change the value of y
And we get the required result. Here is the complete program
#include <iostream>
using namespace std;
class CPoint {
public:
int m_ix;
int m_iy;
CPoint(const int p_ix = 0, const int p_iy = 0) :
m_ix(p_ix), m_iy(p_iy) {
}
int getX() const {
return m_ix;
}
int getY() const {
return m_iy;
}
virtual ~CPoint() { };
};
int main() {
CPoint objPoint(5, 10);
int* pInt = (int*)&objPoint;
*(pInt+1) = 100; // want to change the value of x
*(pInt+2) = 200; // want to change the value of y
cout << "X = " << objPoint.getX() << endl;
cout << "Y = " << objPoint.getY() << endl;
return 0;
}
And output of the program is
X = 100
Y = 200
This clearly shows that whenever we add a virtual function to a class, the virtual pointer is placed at the first location of the object's memory structure.
#include <iostream>
using namespace std;
class Class {
virtual void fun() { cout << "Class::fun" << endl; }
};
int main() {
Class objClass;
cout << "Address of virtual pointer " << (int*)(&objClass+0) << endl;
cout << "Value at virtual pointer " << (int*)*(int*)(&objClass+0) << endl;
return 0;
}
The output of this program is
Address of virtual pointer 0012FF7C
Value at virtual pointer 0046C060
The virtual pointer stores the address of a table called the virtual table, and the virtual table stores the addresses of all the virtual functions of that class. In other words, the virtual table is an array of addresses of virtual functions. Let's take a look at the following program to get an idea of it.
#include <iostream>
using namespace std;
class Class {
virtual void fun() { cout << "Class::fun" << endl; }
};
typedef void (*Fun)(void);
int main() {
Class objClass;
cout << "Address of virtual pointer " << (int*)(&objClass+0) << endl;
cout << "Value at virtual pointer i.e. Address of virtual table "
<< (int*)*(int*)(&objClass+0) << endl;
cout << "Value at first entry of virtual table "
<< (int*)*(int*)*(int*)(&objClass+0) << endl;
cout << endl << "Executing virtual function" << endl << endl;
Fun pFun = (Fun)*(int*)*(int*)(&objClass+0);
pFun();
return 0;
}
This program uses some unusual indirection combined with typecasts. The most important line of the program is
Fun pFun = (Fun)*(int*)*(int*)(&objClass+0);
Here Fun is a typedef'd function pointer.
typedef void (*Fun)(void);
Let's dissect the lengthy indirection. (int*)(&objClass+0) gives the address of the virtual pointer of the class, which is the first entry in the object, typecast to int*. To get the value at this address we use the indirection operator (i.e. *) and then typecast again to int*, i.e. (int*)*(int*)(&objClass+0). This gives the address of the first entry of the virtual table. To get the value at this location, i.e. the address of the first virtual function of the class, we use the indirection operator once more and typecast to the appropriate function pointer type. So
Fun pFun = (Fun)*(int*)*(int*)(&objClass+0);
means: get the value from the first entry of the virtual table and store it in pFun after typecasting it to the Fun type.
#include <iostream>
using namespace std;
class Class {
virtual void f() { cout << "Class::f" << endl; }
virtual void g() { cout << "Class::g" << endl; }
};
int main() {
Class objClass;
cout << "Address of virtual pointer " << (int*)(&objClass+0) << endl;
cout << "Value at virtual pointer i.e. Address of virtual table "
<< (int*)*(int*)(&objClass+0) << endl;
cout << endl << "Information about VTable" << endl << endl;
cout << "Value at 1st entry of VTable "
<< (int*)*((int*)*(int*)(&objClass+0)+0) << endl;
cout << "Value at 2nd entry of VTable "
<< (int*)*((int*)*(int*)(&objClass+0)+1) << endl;
return 0;
}
The output of this program is
Address of virtual pointer 0012FF7C
Value at virtual pointer i.e. Address of virtual table 0046C0EC
Information about VTable
Value at 1st entry of VTable 0040100A
Value at 2nd entry of VTable 0040129E
#include <iostream>
using namespace std;
class Class {
virtual void f() { cout << "Class::f" << endl; }
virtual void g() { cout << "Class::g" << endl; }
};
int main() {
Class objClass;
cout << "Address of virtual pointer " << (int*)(&objClass+0) << endl;
cout << "Value at virtual pointer i.e. Address of virtual table "
<< (int*)*(int*)(&objClass+0) << endl;
cout << endl << "Information about VTable" << endl << endl;
cout << "Value at 1st entry of VTable "
<< (int*)*((int*)*(int*)(&objClass+0)+0) << endl;
cout << "Value at 2nd entry of VTable "
<< (int*)*((int*)*(int*)(&objClass+0)+1) << endl;
cout << "Value at 3rd entry of VTable "
<< (int*)*((int*)*(int*)(&objClass+0)+2) << endl;
cout << "Value at 4th entry of VTable "
<< (int*)*((int*)*(int*)(&objClass+0)+3) << endl;
return 0;
}
The output of this program is
Address of virtual pointer 0012FF7C
Value at virtual pointer i.e. Address of virtual table 0046C134
Information about VTable
Value at 1st entry of VTable 0040100A
Value at 2nd entry of VTable 0040129E
Value at 3rd entry of VTable 00000000
Value at 4th entry of VTable 73616C43
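The output of this program shows that, in this build, the entry after the last virtual function in the vtable is NULL; whatever lies beyond it is unrelated data. Let's call the virtual functions using the knowledge we have.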
#include <iostream>
using namespace std;
class Class {
virtual void f() { cout << "Class::f" << endl; }
virtual void g() { cout << "Class::g" << endl; }
};
typedef void(*Fun)(void);
int main() {
Class objClass;
Fun pFun = NULL;
// calling 1st virtual function
pFun = (Fun)*((int*)*(int*)(&objClass+0)+0);
pFun();
// calling 2nd virtual function
pFun = (Fun)*((int*)*(int*)(&objClass+0)+1);
pFun();
return 0;
}
The output of this program is
Class::f
Class::g
Now let's look at multiple inheritance, starting with a simple case.
#include <iostream>
using namespace std;
class Base1 {
public:
virtual void f() { }
};
class Base2 {
public:
virtual void f() { }
};
class Base3 {
public:
virtual void f() { }
};
class Drive : public Base1, public Base2, public Base3 {
};
int main() {
Drive objDrive;
cout << "Size is = " << sizeof(objDrive) << endl;
return 0;
}
The output of this program is
Size is = 12
This program shows that when you derive a class from more than one base class, the derived class has a virtual pointer for each of its base classes.
#include <iostream>
using namespace std;
class Base1 {
virtual void f() { cout << "Base1::f" << endl; }
virtual void g() { cout << "Base1::g" << endl; }
};
class Base2 {
virtual void f() { cout << "Base2::f" << endl; }
virtual void g() { cout << "Base2::g" << endl; }
};
class Base3 {
virtual void f() { cout << "Base3::f" << endl; }
virtual void g() { cout << "Base3::g" << endl; }
};
class Drive : public Base1, public Base2, public Base3 {
public:
virtual void fd() { cout << "Drive::fd" << endl; }
virtual void gd() { cout << "Drive::gd" << endl; }
};
typedef void(*Fun)(void);
int main() {
Drive objDrive;
Fun pFun = NULL;
// calling 1st virtual function of Base1
pFun = (Fun)*((int*)*(int*)((int*)&objDrive+0)+0);
pFun();
// calling 2nd virtual function of Base1
pFun = (Fun)*((int*)*(int*)((int*)&objDrive+0)+1);
pFun();
// calling 1st virtual function of Base2
pFun = (Fun)*((int*)*(int*)((int*)&objDrive+1)+0);
pFun();
// calling 2nd virtual function of Base2
pFun = (Fun)*((int*)*(int*)((int*)&objDrive+1)+1);
pFun();
// calling 1st virtual function of Base3
pFun = (Fun)*((int*)*(int*)((int*)&objDrive+2)+0);
pFun();
// calling 2nd virtual function of Base3
pFun = (Fun)*((int*)*(int*)((int*)&objDrive+2)+1);
pFun();
// calling 1st virtual function of Drive
pFun = (Fun)*((int*)*(int*)((int*)&objDrive+0)+2);
pFun();
// calling 2nd virtual function of Drive
pFun = (Fun)*((int*)*(int*)((int*)&objDrive+0)+3);
pFun();
return 0;
}
The output of this program is
Base1::f
Base1::g
Base2::f
Base2::g
Base3::f
Base3::g
Drive::fd
Drive::gd
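This program shows that the virtual functions of the derived class are stored in the vtable of the first vptr. We can get the offset of each base class's vptr within the derived object with the help of static_cast. Let's take a look at the following program to better understand it.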
#include <windows.h> // for DWORD
#include <iostream>
using namespace std;
class Base1 {
public:
virtual void f() { }
};
class Base2 {
public:
virtual void f() { }
};
class Base3 {
public:
virtual void f() { }
};
class Drive : public Base1, public Base2, public Base3 {
};
// any nonzero value: static_cast leaves a null (zero) pointer unadjusted
#define SOME_VALUE 1
int main() {
cout << (DWORD)static_cast<Base1*>((Drive*)SOME_VALUE)-SOME_VALUE << endl;
cout << (DWORD)static_cast<Base2*>((Drive*)SOME_VALUE)-SOME_VALUE << endl;
cout << (DWORD)static_cast<Base3*>((Drive*)SOME_VALUE)-SOME_VALUE << endl;
return 0;
}
ATL uses a macro named offsetofclass, defined in ATLDEF.H, to do this. The macro is defined as
#define offsetofclass(base, derived) \
((DWORD)(static_cast<base*>((derived*)_ATL_PACKING))-_ATL_PACKING)
This macro returns the offset of the base class vptr in the derived class object model. Let's see an example to get an idea of this
#include <windows.h>
#include <iostream>
using namespace std;
class Base1 {
public:
virtual void f() { }
};
class Base2 {
public:
virtual void f() { }
};
class Base3 {
public:
virtual void f() { }
};
class Drive : public Base1, public Base2, public Base3 {
};
#define _ATL_PACKING 8
#define offsetofclass(base, derived) \
((DWORD)(static_cast<base*>((derived*)_ATL_PACKING))-_ATL_PACKING)
int main() {
cout << offsetofclass(Base1, Drive) << endl;
cout << offsetofclass(Base2, Drive) << endl;
cout << offsetofclass(Base3, Drive) << endl;
return 0;
}
The memory layout of the derived class places the three base class vptrs at the following offsets, which is what the program prints
0
4
8
The output of this program shows that the macro returns the offset of the vptr of the requested base class. Don Box uses a similar macro in Essential COM. Let's change the program a little and replace the ATL macro with Box's macro.
#include <windows.h>
#include <iostream>
using namespace std;
class Base1 {
public:
virtual void f() { }
};
class Base2 {
public:
virtual void f() { }
};
class Base3 {
public:
virtual void f() { }
};
class Drive : public Base1, public Base2, public Base3 {
};
#define BASE_OFFSET(ClassName, BaseName) \
(DWORD(static_cast<BaseName*>(reinterpret_cast<ClassName*>\
(0x10000000))) - 0x10000000)
int main() {
cout << BASE_OFFSET(Drive, Base1) << endl;
cout << BASE_OFFSET(Drive, Base2) << endl;
cout << BASE_OFFSET(Drive, Base3) << endl;
return 0;
}
The output and purpose of this program is the same as the previous program.
Let's do something practical and use the offsetofclass macro in a program. We can call the virtual function of any base class we want by getting the offset of that base class's vptr in the derived object's memory structure.
#include <windows.h>
#include <iostream>
using namespace std;
class Base1 {
public:
virtual void f() { cout << "Base1::f()" << endl; }
};
class Base2 {
public:
virtual void f() { cout << "Base2::f()" << endl; }
};
class Base3 {
public:
virtual void f() { cout << "Base3::f()" << endl; }
};
class Drive : public Base1, public Base2, public Base3 {
};
#define _ATL_PACKING 8
#define offsetofclass(base, derived) \
((DWORD)(static_cast<base*>((derived*)_ATL_PACKING))-_ATL_PACKING)
int main() {
Drive d;
void* pVoid = NULL;
// call function of Base1
pVoid = (char*)&d + offsetofclass(Base1, Drive);
((Base1*)(pVoid))->f();
// call function of Base2
pVoid = (char*)&d + offsetofclass(Base2, Drive);
((Base2*)(pVoid))->f();
// call function of Base3
pVoid = (char*)&d + offsetofclass(Base3, Drive);
((Base3*)(pVoid))->f();
return 0;
}
The output of the program is
Base1::f()
Base2::f()
Base3::f()
I have tried to explain the workings of ATL's offsetofclass macro in this tutorial. I hope to explore other mysteries of ATL in the next article.