In this article, I’m going to describe how to implement COM interface hooks. COM hooks have something in common with user-mode API hooks (both in goals and in methods), but there are also significant differences that stem from the way COM works. I’m going to show two of the most frequently used approaches, emphasizing the advantages and disadvantages of each. The code sample is simplified as much as possible so that we can concentrate on the most important parts of the problem.
Before we start with intercepting calls to COM objects, I’d like to mention some underlying concepts of COM technology. If you know this stuff well, you can just skip this boring theory and move straight to the practical part.
All COM classes implement one or more interfaces. Every interface must be derived from IUnknown, which is used for reference counting and for obtaining pointers to other interfaces implemented by the object. Every interface has a globally unique interface identifier, the IID. Clients call all methods of a COM object through interface pointers.
This feature makes COM components independent at the binary level: if a COM server changes, its clients don’t have to be recompiled (as long as the new version of the server provides the same interfaces). It is even possible to replace a COM server with your own implementation.
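To make the IUnknown contract more tangible, here is a minimal portable sketch of reference counting and interface lookup. It is not real COM: IUnknownLike, IGreeter, Greeter, the string-based Iid and the kOk/kNoInterface codes are all hypothetical stand-ins for IUnknown, a custom interface, a coclass, IID and HRESULT.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-ins for IID and HRESULT values.
using Iid = const char*;
constexpr long kOk = 0;
constexpr long kNoInterface = -1;

// Mimics the three IUnknown methods.
struct IUnknownLike {
    virtual long QueryInterface(Iid iid, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    ~IUnknownLike() = default;   // objects are destroyed only through Release()
};

// A custom interface derived from the base interface.
struct IGreeter : IUnknownLike {
    virtual int Greet() = 0;
};

class Greeter final : public IGreeter {
    unsigned long refs_ = 1;     // the creator owns the first reference
public:
    long QueryInterface(Iid iid, void** out) override {
        if (std::strcmp(iid, "IUnknownLike") == 0 || std::strcmp(iid, "IGreeter") == 0) {
            *out = static_cast<IGreeter*>(this);
            AddRef();            // every pointer handed out is a counted reference
            return kOk;
        }
        *out = nullptr;
        return kNoInterface;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long left = --refs_;
        if (left == 0) delete this;
        return left;
    }
    int Greet() override { return 1; }
};

// Walks through the typical client sequence; returns true if every step behaved as expected.
bool DemoQueryInterface() {
    IUnknownLike* unk = new Greeter();                       // refs == 1
    void* p = nullptr;
    bool ok = unk->QueryInterface("IGreeter", &p) == kOk;    // refs == 2
    assert(p != nullptr);
    IGreeter* greeter = static_cast<IGreeter*>(p);
    ok = ok && greeter->Greet() == 1;
    ok = ok && greeter->Release() == 1;                      // back to one reference
    void* q = nullptr;
    ok = ok && unk->QueryInterface("IMissing", &q) == kNoInterface && q == nullptr;
    ok = ok && unk->Release() == 0;                          // object deletes itself
    return ok;
}
```

The client never sees the Greeter class itself, only interface pointers, which is what makes it possible to swap the implementation behind them.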
All calls to COM interface methods go through a virtual method table (or simply vtable). A pointer to the vtable is always the first field of every COM object. The table is, in brief, an array of pointers to the class methods (in the order of their declaration). When the client invokes a method, it makes the call through the corresponding pointer.
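The layout described above can be sketched in portable C++ by spelling out the vtable by hand, the way COM interfaces look when viewed from C. Counter, CounterVtbl and the method names are hypothetical; the point is only the shape: the object starts with a pointer to a table of function pointers, and each method receives the object as an explicit first argument.

```cpp
#include <cassert>

struct Counter;   // the "object"

// The vtable: one slot per method, in declaration order.
struct CounterVtbl {
    long (*AddRef)(Counter* self);
    long (*Release)(Counter* self);
    long (*GetValue)(Counter* self);
};

struct Counter {
    const CounterVtbl* vtbl;   // first field: pointer to the vtable
    long refs;
    long value;
};

static long Counter_AddRef(Counter* self)   { return ++self->refs; }
static long Counter_Release(Counter* self)  { return --self->refs; }
static long Counter_GetValue(Counter* self) { return self->value; }

// One table shared by every Counter instance.
static const CounterVtbl g_counterVtbl = {
    Counter_AddRef, Counter_Release, Counter_GetValue
};

// What a client-side call such as obj->GetValue() boils down to:
// load the vtable pointer, pick the slot, call it with the object itself.
long CallGetValue(Counter* obj) {
    return obj->vtbl->GetValue(obj);
}

long DemoVtableCall() {
    Counter c = { &g_counterVtbl, 1, 7 };
    assert(c.vtbl == &g_counterVtbl);
    return CallGetValue(&c);
}
```

Because every call is dispatched through the table, changing a slot changes which function the client ends up calling.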
COM servers work either in the context of the client process or in the context of another process. In the first case, the server is a DLL that is loaded into the client process. In the second case, the server runs as a separate process (possibly even on another computer). To communicate with such a server, the client loads a so-called proxy/stub DLL, which redirects calls from the client to the server.
To be easily accessible, a COM server should be registered in the system registry. There are several functions that clients can use to create an instance of a COM object, but usually it is CoGetClassObject, CoCreateInstanceEx or (the most common) CoCreateInstance.
If you want to get more detailed information, you can use MSDN or one of the sources in the References section.
Let’s see how we can intercept calls to a COM interface. There are several different approaches to this problem. For instance, we can modify the registry or use the CoTreatAsClass or CoGetInterceptor functions. In this article, two of the most commonly used approaches are covered: using a proxy object and patching the virtual method table. Each has its own advantages and disadvantages, so the choice depends on the task.
The sample code for this article contains the implementation of the simplest COM server DLL, a client application and two samples of COM hooking that demonstrate the approaches I’m going to describe.
Let’s run the client application without hooks installed. First, we register the COM server by invoking the command regsvr32 ComSample.dll. Then we run ComSampleClient.exe or ScriptClient.js to see how the client of the sample server works. Now it’s time to set some hooks.
COM is pretty much about binary encapsulation. A client uses a COM server only through its interfaces, so one implementation of the server can be replaced with another without rebuilding the client. This feature can be used to intercept calls to a COM server.
The main idea of this method is to intercept the COM object creation request and substitute the newly created instance with our own proxy object. This proxy object is a COM object with the same interface as the original object. The client code interacts with it as if it were the original object. The proxy object usually stores a pointer to the original object, so it can call the original object’s methods.
As I mentioned, the proxy object has to implement all interfaces of the target object. In our sample, it will be just one interface: ISampleObject. The proxy class is implemented with ATL:
class ATL_NO_VTABLE CSampleObjectProxy :
    public ATL::CComObjectRootEx<ATL::CComMultiThreadModel>,
    public ATL::CComCoClass<CSampleObjectProxy, &CLSID_SampleObject>,
    public ATL::IDispatchImpl<ISampleObject, &IID_ISampleObject,
                              &LIBID_ComSampleLib, 1, 0>
{
public:
    CSampleObjectProxy();
    DECLARE_NO_REGISTRY()
    BEGIN_COM_MAP(CSampleObjectProxy)
        COM_INTERFACE_ENTRY(ISampleObject)
        COM_INTERFACE_ENTRY(IDispatch)
    END_COM_MAP()
    DECLARE_PROTECT_FINAL_CONSTRUCT()
public:
    HRESULT FinalConstruct();
    void FinalRelease();
public:
    static HRESULT CreateInstance(IUnknown* original, REFIID riid, void** ppvObject);
public:
    STDMETHOD(get_ObjectName)(BSTR* pVal);
    STDMETHOD(put_ObjectName)(BSTR newVal);
    STDMETHOD(DoWork)(LONG arg1, LONG arg2, LONG* result);
    ...
};
STDMETHODIMP CSampleObjectProxy::get_ObjectName(BSTR* pVal)
{
    return m_Name.CopyTo(pVal);
}
STDMETHODIMP CSampleObjectProxy::DoWork(LONG arg1, LONG arg2, LONG* result)
{
    *result = 42;
    return S_OK;
}
STDMETHODIMP CSampleObjectProxy::put_ObjectName(BSTR newVal)
{
    return m_OriginalObject->put_ObjectName(newVal);
}
Notice that even the methods you’re not interested in (for example, put_ObjectName) have to be implemented in the proxy; here such a method simply forwards the call to the original object.
Now we have to intercept the creation of the target object in order to replace it with our proxy. Several Windows API functions are capable of creating COM objects, but usually CoCreateInstance is used.
To intercept the creation of the target object, I’ve used the mhook library to hook CoCreateInstance and CoGetClassObject. The technique of setting API hooks is a widely covered topic. If you want more detailed information about it, see, for example, the article “Easy way to set up global API hooks” by Sergey Podobry.
Here is the implementation of the CoCreateInstance hook function:
HRESULT WINAPI Hook::CoCreateInstance
    (REFCLSID rclsid, LPUNKNOWN pUnkOuter, DWORD dwClsContext, REFIID riid, LPVOID* ppv)
{
    if (rclsid == CLSID_SampleObject)
    {
        // The proxy does not support aggregation
        if (pUnkOuter)
            return CLASS_E_NOAGGREGATION;
        // Create the real object first...
        ATL::CComPtr<IUnknown> originalObject;
        HRESULT hr = Original::CoCreateInstance(rclsid, pUnkOuter,
            dwClsContext, riid, (void**)&originalObject);
        if (FAILED(hr))
            return hr;
        // ...and hand the client a proxy that wraps it
        return CSampleObjectProxy::CreateInstance(originalObject, riid, ppv);
    }
    // Any other class is created as usual
    return Original::CoCreateInstance(rclsid, pUnkOuter, dwClsContext, riid, ppv);
}
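The interplay between the creation hook and the proxy can be sketched without COM or ATL. In this portable sketch everything is hypothetical: ISampleLike stands in for ISampleObject, RealSample for the target object, and the replaceable g_create function pointer for the hooked CoCreateInstance. It only illustrates the idea that the client asks for an object and receives a proxy that wraps the real one.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical interface shared by the real object and the proxy.
struct ISampleLike {
    virtual ~ISampleLike() = default;
    virtual std::string GetName() const = 0;
    virtual void SetName(const std::string& name) = 0;
    virtual long DoWork(long a, long b) = 0;
};

class RealSample : public ISampleLike {
    std::string name_ = "real";
public:
    std::string GetName() const override { return name_; }
    void SetName(const std::string& name) override { name_ = name; }
    long DoWork(long a, long b) override { return a + b; }
};

// The proxy implements the whole interface, even the methods it only forwards.
class SampleProxy : public ISampleLike {
    std::unique_ptr<ISampleLike> original_;
public:
    explicit SampleProxy(std::unique_ptr<ISampleLike> original)
        : original_(std::move(original)) {}
    std::string GetName() const override { return "proxy"; }                    // intercepted
    void SetName(const std::string& name) override { original_->SetName(name); } // forwarded
    long DoWork(long, long) override { return 42; }                             // intercepted
};

using Factory = std::unique_ptr<ISampleLike> (*)();

std::unique_ptr<ISampleLike> OriginalCreate() {
    return std::unique_ptr<ISampleLike>(new RealSample());
}

// The hook calls the original factory and wraps the result.
std::unique_ptr<ISampleLike> HookedCreate() {
    return std::unique_ptr<ISampleLike>(new SampleProxy(OriginalCreate()));
}

Factory g_create = OriginalCreate;   // stand-in for the hookable creation function

long DemoProxy() {
    assert(g_create()->DoWork(1, 2) == 3);   // before the hook: the real object answers
    g_create = HookedCreate;                 // "install" the hook
    std::unique_ptr<ISampleLike> obj = g_create();
    long result = obj->DoWork(1, 2);         // the proxy answers instead
    g_create = OriginalCreate;               // restore the original factory
    return result;
}
```

The client code in DemoProxy is identical before and after the hook is installed; only the factory behind g_create changes.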
To see how the proxy object sample works, specify the full path of ComInterceptProxyObj.dll in the AppInit_DLLs registry value (HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\Windows). Now you can run ComSampleClient.exe or ScriptClient.js and see that calls to the target object’s methods are intercepted.
The other way of intercepting calls to a COM object is to modify the object’s virtual method table. It contains pointers to all public methods of the COM object, so they can be replaced with pointers to hook functions.
Unlike the previous one, this approach doesn’t require hooks to be set before the client gets a pointer to the target object. They can be set anywhere a pointer to the object is accessible.
Here is the code of the HookMethod function, which sets a hook for a COM method:
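HRESULT HookMethod(IUnknown* original, PVOID proxyMethod,
    PVOID* originalMethod, DWORD vtableOffset)
{
    // The first field of a COM object is the pointer to its vtable
    PVOID* originalVtable = *(PVOID**)original;
    // Already hooked: keep the previously saved original pointer
    if (originalVtable[vtableOffset] == proxyMethod)
        return S_OK;
    // Save the original method and put the hook into the slot
    // (vtableOffset is the slot index, not a byte offset)
    *originalMethod = originalVtable[vtableOffset];
    originalVtable[vtableOffset] = proxyMethod;
    return S_OK;
}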
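The same mechanics can be tried out in a portable simulation that builds the vtable by hand, so no Windows headers or memory-protection calls are needed. All names here (Object, Slot, HookMethodSketch and so on) are hypothetical, and the table is an ordinary writable array rather than a real compiler-generated vtable.

```cpp
#include <cassert>
#include <cstddef>

using Slot = void (*)();   // generic function-pointer slot

struct Object {
    Slot* vtbl;            // first field: pointer to the (writable) vtable
    long  id;
};
using DoWorkFn = long (*)(Object* self, long a, long b);

static long Real_DoWork(Object*, long a, long b) { return a + b; }
static long Hook_DoWork(Object*, long, long)     { return 42; }

// One table shared by all objects of this hypothetical class.
static Slot g_vtbl[] = { reinterpret_cast<Slot>(Real_DoWork) };

// Same shape as HookMethod, minus HRESULT and memory protection.
bool HookMethodSketch(Object* obj, Slot hook, Slot* original, std::size_t index) {
    Slot* vtbl = obj->vtbl;
    if (vtbl[index] == hook)
        return true;               // already hooked
    *original = vtbl[index];       // remember the original method
    vtbl[index] = hook;            // redirect the slot
    return true;
}

// A client call: fetch slot 0 from the vtable and invoke it.
long CallDoWork(Object* obj, long a, long b) {
    return reinterpret_cast<DoWorkFn>(obj->vtbl[0])(obj, a, b);
}

bool DemoVtablePatch() {
    Object first  = { g_vtbl, 1 };
    Object second = { g_vtbl, 2 };
    long before = CallDoWork(&first, 2, 3);            // real method: 5

    Slot original = nullptr;
    HookMethodSketch(&first, reinterpret_cast<Slot>(Hook_DoWork), &original, 0);
    assert(original != nullptr);

    long afterFirst  = CallDoWork(&first, 2, 3);       // hook: 42
    long afterSecond = CallDoWork(&second, 2, 3);      // shares the table: 42
    long viaOriginal = reinterpret_cast<DoWorkFn>(original)(&first, 2, 3);  // 5

    g_vtbl[0] = original;                              // unhook
    return before == 5 && afterFirst == 42 && afterSecond == 42 && viaOriginal == 5;
}
```

Note that the hook was installed through the first object only, yet the second object is affected as well, because both point to the same table.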
To set hooks for the ISampleObject interface methods, the InstallComInterfaceHooks function is used:
HRESULT InstallComInterfaceHooks(IUnknown* originalInterface)
{
    // Only a single instance of the target object is supported in the sample
    if (g_Context.get())
        return E_FAIL;
    ATL::CComPtr<ISampleObject> so;
    HRESULT hr = originalInterface->QueryInterface(IID_ISampleObject, (void**)&so);
    if (FAILED(hr))
        return hr; // we need this interface to be present
    // Remove write protection from the vtable that is actually patched below;
    // the highest hooked slot index is 9, so cover 10 entries
    DWORD dwOld = 0;
    if (!::VirtualProtect(*(PVOID**)so.p, 10 * sizeof(PVOID),
                          PAGE_EXECUTE_READWRITE, &dwOld))
        return E_FAIL;
    // Hook interface methods; the last argument is the vtable slot index
    g_Context.reset(new Context);
    HookMethod(so, (PVOID)Hook::QueryInterface, &g_Context->m_OriginalQueryInterface, 0);
    HookMethod(so, (PVOID)Hook::get_ObjectName, &g_Context->m_OriginalGetObjectName, 7);
    HookMethod(so, (PVOID)Hook::DoWork, &g_Context->m_OriginalDoWork, 9);
    return S_OK;
}
The virtual method table may reside in a write-protected memory area, so we have to remove the protection with VirtualProtect before setting hooks.
The g_Context variable is a structure that contains data associated with the target object. I’ve made g_Context global to simplify the sample, although this means that only one target object can exist at a time.
Here is the code of the hook functions:
typedef HRESULT (WINAPI *QueryInterface_T)
    (IUnknown* This, REFIID riid, void** ppvObject);
STDMETHODIMP Hook::QueryInterface(IUnknown* This, REFIID riid, void** ppvObject)
{
    QueryInterface_T qi = (QueryInterface_T)g_Context->m_OriginalQueryInterface;
    HRESULT hr = qi(This, riid, ppvObject);
    return hr;
}
STDMETHODIMP Hook::get_ObjectName(IUnknown* This, BSTR* pVal)
{
    return g_Context->m_Name.CopyTo(pVal);
}
STDMETHODIMP Hook::DoWork(IUnknown* This, LONG arg1, LONG arg2, LONG* result)
{
    *result = 42;
    return S_OK;
}
Take a look at the definitions of the hook functions. Their prototypes are exactly the same as the prototypes of the target interface methods, except that they are free functions (not class methods) and take one extra parameter, the this pointer. That’s because COM methods are usually declared as stdcall, so this is passed as an implicit first parameter (on the stack in 32-bit code).
When using this approach, you have several things to remember. First of all, a method hook works not only for the current instance of the COM object. It works for all objects of the same class, because they share one vtable (but not for all the classes that implement the hooked interface). If several classes implement the same interface and you want to intercept calls for all instances of this interface, you will need to patch the vtables of all these classes.
If you want to store data that is specific to each object, you have to keep a collection of contexts in static memory, keyed by the target object’s pointer value. You also have to track the target object’s lifetime so that stale contexts are removed. And if you expect multithreaded access to the target object, you have to synchronize access to the static collection.
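One possible shape for such a collection is sketched below in portable C++. ContextRegistry, HookContext and their members are hypothetical names, not part of the sample; the sketch assumes that the hook functions can call Attach when an object is first seen and Detach when its last reference is released.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Per-object data kept on behalf of the hooks (hypothetical fields).
struct HookContext {
    std::string name;
    long calls = 0;
};

// A static collection of contexts keyed by the target object's address.
class ContextRegistry {
public:
    void Attach(const void* object, std::string name) {
        std::lock_guard<std::mutex> lock(mutex_);
        HookContext ctx;
        ctx.name = std::move(name);
        contexts_[object] = std::move(ctx);
    }
    // Must be called when the target object dies, otherwise a new object
    // allocated at the same address would inherit a stale context.
    void Detach(const void* object) {
        std::lock_guard<std::mutex> lock(mutex_);
        contexts_.erase(object);
    }
    // Returns the updated call count, or -1 if the object is not tracked.
    long RecordCall(const void* object) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = contexts_.find(object);
        if (it == contexts_.end())
            return -1;
        return ++it->second.calls;
    }
    std::size_t Size() {
        std::lock_guard<std::mutex> lock(mutex_);
        return contexts_.size();
    }
private:
    std::mutex mutex_;   // guards contexts_ against concurrent hook calls
    std::unordered_map<const void*, HookContext> contexts_;
};

static ContextRegistry g_contexts;

bool DemoContexts() {
    int a = 0, b = 0;   // stand-ins for two target objects
    g_contexts.Attach(&a, "first");
    g_contexts.Attach(&b, "second");
    bool ok = g_contexts.RecordCall(&a) == 1
           && g_contexts.RecordCall(&a) == 2
           && g_contexts.RecordCall(&b) == 1;
    g_contexts.Detach(&a);
    ok = ok && g_contexts.RecordCall(&a) == -1 && g_contexts.Size() == 1;
    g_contexts.Detach(&b);
    assert(g_contexts.Size() == 0);
    return ok;
}
```

A single mutex is the simplest choice here; whether it is sufficient depends on how often the hooked methods are called.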
If you need to call the target object’s method from your hook function, you have to be careful. You can’t just call a hooked method through an interface pointer, because the call goes through the vtable and ends up in the hook function again (which is not what you want). So you have to save the pointer to the original method and use it directly to call the method.
Here is another tricky thing. When you set a hook, be careful not to hook the same method twice. On the second attempt the vtable slot already contains your hook, so the saved pointer to the original method is overwritten with the address of the hook itself, and the original method is lost.
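Both pitfalls can be reproduced in a small portable simulation with a hand-built vtable. The names (Object, Slot, NaiveHook, GuardedHook) are hypothetical; the sketch shows a hook that reaches the original method through the saved pointer, and what an unguarded second hook attempt does to that saved pointer.

```cpp
#include <cassert>
#include <cstddef>

using Slot = void (*)();                 // generic function-pointer slot

struct Object { Slot* vtbl; };           // hypothetical object: vtable pointer first
using DoWorkFn = long (*)(Object* self, long a, long b);

static long Real_DoWork(Object*, long a, long b) { return a + b; }

static Slot g_vtbl[] = { reinterpret_cast<Slot>(Real_DoWork) };
static Slot g_savedDoWork = nullptr;     // pointer to the original method

// The hook calls the original through the saved pointer, never through
// self->vtbl, because that slot now points back at the hook itself.
static long Hook_DoWork(Object* self, long a, long b) {
    long real = reinterpret_cast<DoWorkFn>(g_savedDoWork)(self, a, b);
    return real + 1000;
}

// Unguarded: a second call stores the hook's own address as the "original".
void NaiveHook(Object* obj, Slot hook, Slot* saved, std::size_t index) {
    *saved = obj->vtbl[index];
    obj->vtbl[index] = hook;
}

// Guarded: does nothing if the slot already contains the hook.
void GuardedHook(Object* obj, Slot hook, Slot* saved, std::size_t index) {
    if (obj->vtbl[index] == hook)
        return;
    NaiveHook(obj, hook, saved, index);
}

bool DemoDoubleHook() {
    Object obj = { g_vtbl };
    Slot hook = reinterpret_cast<Slot>(Hook_DoWork);

    GuardedHook(&obj, hook, &g_savedDoWork, 0);
    GuardedHook(&obj, hook, &g_savedDoWork, 0);      // second attempt is ignored
    bool guardedOk = g_savedDoWork == reinterpret_cast<Slot>(Real_DoWork);
    assert(guardedOk);

    // The call goes through the vtable to the hook, which calls the original.
    long viaVtable = reinterpret_cast<DoWorkFn>(obj.vtbl[0])(&obj, 2, 3);

    // An unguarded second hook saves the hook's own address instead.
    Slot scratch = nullptr;
    NaiveHook(&obj, hook, &scratch, 0);
    bool naiveLostOriginal = scratch == hook;

    return guardedOk && viaVtable == 1005 && naiveLostOriginal;
}
```

If Hook_DoWork used the pointer saved by the unguarded second attempt, it would call itself until the stack overflowed, which is why the check at the top of HookMethod matters.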
The good news is that with this approach you don’t have to implement hooks for the methods you don’t need to intercept. Intercepting the object’s creation is not necessary either.
To see how this part of the sample works, specify the full path of ComInterceptVtablePatch.dll in the AppInit_DLLs registry value, just like you did before, and run the client.
Both approaches described in the article have advantages and disadvantages. The proxy object approach is much easier to implement, especially if you need sophisticated logic in your proxy. But you have to replace the target object with your proxy before the client gets a pointer to the original object, which may be difficult or simply impossible in some cases. You also have to provide the same interface in your proxy as the target object has, even if you have only a couple of methods to intercept, and real-world COM interfaces can be really large. If the target object has several interfaces, you’ll probably need to implement them all.
The vtable patching approach demands a much more careful implementation and requires the developer to keep a lot of things in mind. It needs extra code for handling several target object instances of the same class and for calling the target’s original methods. But it doesn’t require hooks to be set right after the target’s creation; they can be set at any moment. It also lets you implement hooks only for the methods you actually need to intercept.
Which approach is handier at the moment usually depends on the situation.