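Understanding COM in Depth (Part 2)
Method parameters of type CRuntimeClass*, void*, and the like
The first part of this article explained that a class object is really just an instance of a structure, and that passing a pointer to a class object between processes (so that the object can be referenced remotely) requires writing a dedicated proxy class and instantiating it at the time of the transfer to obtain a proxy object. Such a proxy class necessarily has two parts: one set of member functions is called only in the client process, and the other is called only in the component process. Together they move data between the processes so that the client's calls reach the component object unchanged, which lets the client operate on the component object.
This is exactly what writing custom marshaling involves, except that a few additional COM rules must be followed, such as implementing the IMarshal interface. This part explains how to write a standard proxy/stub component, generated with MIDL, that passes a pointer to a class object across processes.
To build a proxy object on the client side, some information has to be sent over, and the proxy is then constructed in the client from that information. IDL has no notion of a class among its types, so an interface method cannot declare a parameter as a pointer to a user-defined class. When this is nevertheless needed, the class pointer has to be converted to a type that IDL does recognize. The best candidate is void*, with MIDL-generated code then used to carry the information needed to build the proxy object.
void* carries no semantics; it denotes nothing but an address. Passing a bare void* in IDL is therefore an error, because MIDL cannot tell from it how to marshal the memory it points to. MIDL does, however, offer several ways around the problem. Only the two most commonly used are described below: the [call_as()] attribute and the [wire_marshal()] attribute.
[local] and [call_as()]
[local] An interface or an interface method can carry the [local] attribute to indicate that no marshaling code should be generated for the method, or for any method of the interface. This sidesteps the problem above: since no marshaling code is generated, nothing needs to know what a void* points to, and the parameters of such methods may be void*. A method or interface with this attribute is called a local method or local interface, because without marshaling code it cannot be called remotely. The attribute is widely used in the standard COM interfaces. IUnknown, for example, is a local interface, as its IDL shows. Another example is the IDL definition of IClassFactory: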
[
object,
uuid(00000001-0000-0000-C000-000000000046),
pointer_default(unique)
]
interface IClassFactory : IUnknown
{
typedef [unique] IClassFactory * LPCLASSFACTORY;
[local]
HRESULT CreateInstance(
[in, unique] IUnknown * pUnkOuter,
[in] REFIID riid,
[out, iid_is(riid)] void **ppvObject);
[call_as(CreateInstance)]
HRESULT RemoteCreateInstance(
[in] REFIID riid,
[out, iid_is(riid)] IUnknown ** ppvObject);
[local]
HRESULT LockServer(
[in] BOOL fLock);
[call_as(LockServer)]
HRESULT __stdcall RemoteLockServer(
[in] BOOL fLock);
}
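Here CreateInstance and LockServer are local functions, so MIDL generates no marshaling code (that is, no proxy/stub code) for them. Such code normally takes the form of a pair of functions with prototypes like these: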
HRESULT STDMETHODCALLTYPE IClassFactory_LockServer_Proxy( IClassFactory * This,
BOOL fLock );
HRESULT STDMETHODCALLTYPE IClassFactory_LockServer_Stub( IClassFactory * This,
BOOL fLock );
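In other words, whenever MIDL encounters an interface method definition in an .idl file, it generates two additional functions for it, named <InterfaceName>_<MethodName>_Proxy and <InterfaceName>_<MethodName>_Stub, which serve as the proxy and stub code respectively. For RemoteCreateInstance above, it generates the declarations and definitions of IClassFactory_RemoteCreateInstance_Proxy and IClassFactory_RemoteCreateInstance_Stub.
When a method is marked [local], however, neither function is declared or defined: the method is assumed to be called directly, never needs marshaling, and so has no marshaling code. That is why it is called a local method. It still takes its slot in the interface's array of function pointers, so its declaration still appears in the generated interface header (though not in the type library, which can be regarded as a MIDL bug, albeit one that can be worked around).
[call_as()] An interface method can carry the [call_as()] attribute, which designates it as the stand-in for the local method named in the parentheses, that is, the method it is "called as". Unlike a [local] method, it does get marshaling code, but it does not appear in the interface: the generated header contains no declaration for it (it does show up in the type library, however, which is a bug that can be worked around with a predefined macro). This mechanism is called method aliasing because it ties two methods together: the [local] one is the alias of the [call_as] one and is the one actually used.
RemoteLockServer above, for example, carries [call_as(LockServer)], meaning that it is the function invoked when a client calls LockServer and the call has to be marshaled. If we call the [local] method the local version and the [call_as()] method the remote version, then the remote version solves the problem that the local version has no marshaling code, which may be the case because the local version has some special requirement (such as a void* parameter) that prevents marshaling code from being generated.
Since [call_as()] creates an alias that associates two functions, some mechanism has to implement the association. MIDL does this by requiring the developer to write the marshaling code for the local version. For LockServer above, MIDL generates two function prototypes: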
HRESULT STDMETHODCALLTYPE IClassFactory_LockServer_Proxy( IClassFactory * This,
BOOL fLock );
HRESULT __stdcall IClassFactory_LockServer_Stub( IClassFactory * This,
BOOL fLock );
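These are prototypes only, that is, declarations without definitions, so the developer has to write the definitions of both functions. Note that despite its name, IClassFactory_LockServer_Stub takes the parameter list of RemoteLockServer rather than that of LockServer (the roles are swapped), so that it can convert the parameters delivered by the remote version back into those of the local version.
The association therefore works as follows. The client calls IClassFactory_LockServer_Proxy, which the developer writes. It converts the incoming parameters that MIDL cannot or should not process into the parameter form of IClassFactory_RemoteLockServer_Proxy and calls that function to send them. On the component side, the COM runtime calls the developer-written IClassFactory_LockServer_Stub (note again that its prototype is that of RemoteLockServer, not LockServer). It converts the parameters that arrived over the network back into the original form that MIDL could not or should not process, calls the LockServer method on the IClassFactory* it was given, thereby invoking the component object's method, and returns. Here is a simple example.
Consider a user-defined class CA: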
class CA
{
long m_a, m_b;
public:
long GetA();
void SetA( long a );
};
A pointer to an object of this class is to be passed through the following interface:
///////////////////////abc.idl/////////////////////////
import "oaidl.idl";
import "ocidl.idl";
[
object,
uuid(1A201ABC-A669-4ac7-9E02-2DA772E927FC),
pointer_default(unique)
]
interface IAbc : IUnknown
{
[local] HRESULT GetA( [out] void* pA );
[call_as( GetA )] HRESULT RemoteGetA( [out] long *pA, [out] long *pB );
};
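Create a new DLL project, turn off the precompiled headers compiler option, and add the generated abc_i.c, abc_p.c, dlldata.c and abc.h to the project. Then create an abc.def file and add it to the project to export the functions needed for registration: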
;;;;;;;;;;;;;;;;;;;;;;;;abc.def;;;;;;;;;;;;;;;;;;;;;;;;;
LIBRARY "abc"
EXPORTS
DllCanUnloadNow PRIVATE
DllGetClassObject PRIVATE
DllRegisterServer PRIVATE
DllUnregisterServer PRIVATE
Then add a new file, abc.cpp:
///////////////////////abc.cpp/////////////////////////
#include "abc.h"
#include <new>
class CA
{
public:
long m_a, m_b;
long GetA();
void SetA( long a );
};
HRESULT STDMETHODCALLTYPE IAbc_GetA_Proxy( IAbc *This, void *pA )
{
if( !pA )
return E_INVALIDARG;
CA *pAA = reinterpret_cast< CA* >( pA );
// Call the remote version's proxy function (generated by MIDL) to send the parameters
return IAbc_RemoteGetA_Proxy( This, &pAA->m_a, &pAA->m_b );
}
HRESULT STDMETHODCALLTYPE IAbc_GetA_Stub( IAbc *This, long *pA, long *pB )
{
void *p = CoTaskMemAlloc( sizeof( CA ) );
if( !p )
return E_OUTOFMEMORY;
CA *pAA = new( p ) CA; // construct a CA object in the allocated memory
// Call the object's local method
HRESULT hr = This->GetA( pAA );
if( SUCCEEDED( hr ) )
{
*pA = pAA->m_a;
*pB = pAA->m_b;
}
// Release resources
pAA->~CA();
CoTaskMemFree( p );
return hr;
}
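Finally, add the preprocessor definitions REGISTER_PROXY_DLL and _WIN32_WINNT=0x500, link against rpcrt4.lib, and make sure neither the /TC nor the /TP compiler option is set, so that abc.cpp above is compiled as C++ while the MIDL-generated .c files are compiled as C.
It is used as follows:
IAbc *pA; // assume this has already been initialized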
CA a;
pA->GetA( reinterpret_cast< void* >( &a ) );
The component's implementation looks like this:
STDMETHODIMP CAbc::GetA( void *pA )
{
if( !pA )
return E_INVALIDARG;
*reinterpret_cast< CA* >( pA ) = m_A;
return S_OK;
}
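This passes an object of class CA by value, not by reference. As explained earlier, passing by reference requires writing a proxy class: first transfer the necessary information with the technique above, then use it to initialize a proxy object for CA and establish the connection. Unless there is no alternative, it is best not to write proxy objects by hand. Recast the class as an interface instead and let MIDL help generate the proxy/stub component, which achieves the same thing indirectly.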
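The data flow of the [local]/[call_as()] pair can be imitated without COM at all. The following is a minimal portable sketch, not the MIDL-generated code: Component, GetA_Proxy, GetA_Stub and RemoteGetA_Proxy are hypothetical stand-ins for CAbc, IAbc_GetA_Proxy, IAbc_GetA_Stub and IAbc_RemoteGetA_Proxy, and a byte vector plays the role of the RPC channel. Return value 0 stands in for S_OK and -1 for a failure HRESULT.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-in for class CA (members public for brevity).
struct CA { std::int32_t m_a, m_b; };

// Hypothetical component: it implements only the local signature.
struct Component {
    CA state = { 12, 34 };
    long GetA(void* pA) {                 // local version: opaque pointer
        if (!pA) return -1;
        *static_cast<CA*>(pA) = state;
        return 0;
    }
};

// Server side, role of IAbc_GetA_Stub: it has the REMOTE parameter list
// and converts back to the local form before calling the component.
long GetA_Stub(Component& c, std::int32_t* pA, std::int32_t* pB) {
    CA tmp = { 0, 0 };
    long hr = c.GetA(&tmp);
    if (hr == 0) { *pA = tmp.m_a; *pB = tmp.m_b; }
    return hr;
}

// Role of the generated IAbc_RemoteGetA_Proxy plus the channel:
// only plain longs cross the "wire", never the void*.
long RemoteGetA_Proxy(Component& c, std::int32_t* pA, std::int32_t* pB) {
    std::int32_t a = 0, b = 0;
    long hr = GetA_Stub(c, &a, &b);                     // "server process"
    std::vector<unsigned char> wire(sizeof a + sizeof b);
    std::memcpy(wire.data(), &a, sizeof a);             // pack the reply
    std::memcpy(wire.data() + sizeof a, &b, sizeof b);
    std::memcpy(pA, wire.data(), sizeof a);             // "client process"
    std::memcpy(pB, wire.data() + sizeof a, sizeof b);  // unpacks it
    return hr;
}

// Client side, role of IAbc_GetA_Proxy: it has the LOCAL parameter list
// and converts the void* into something that can be marshaled.
long GetA_Proxy(Component& c, void* pA) {
    if (!pA) return -1;
    CA* p = static_cast<CA*>(pA);
    return RemoteGetA_Proxy(c, &p->m_a, &p->m_b);
}
```

Calling GetA_Proxy with the address of a CA fills it with the component's values, while the only data that ever crosses the simulated channel is the pair of integers.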
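The next section uses the [wire_marshal()] attribute to pass an object by value.
[wire_marshal()]
The method-aliasing mechanism above can pass a user-defined data type, but it works one method at a time. When a type such as CA* is used in many methods, repeating the procedure for every one of them is clearly inefficient. For this case MIDL provides the [wire_marshal()] attribute (among others).
[wire_marshal()] can be applied only to a type definition, that is, in a typedef, with the following syntax:
typedef [wire_marshal(wire_type)] type-specifier userm-type;
It associates a wire type (wire-type, a type MIDL can handle directly) with a presented type (type-specifier, a special data type that MIDL cannot or should not handle) and gives the result a recognizable name (userm-type). It resembles the [transmit_as()] attribute in that both associate two types, much as [local] and [call_as()] associate two methods. The difference is that [wire_marshal()] marshals the presented type straight into the supplied buffer in the IDL wire format (NDR, Network Data Representation), whereas [transmit_as()] needs the marshaling code to convert once more in between. [wire_marshal()] is therefore more efficient, but because you write the marshaling code yourself you must understand the NDR format and deal with issues such as data alignment, which makes it more tedious and complicated. Its most common use is in the definition of handles: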
typedef union _RemotableHandle switch( long fContext ) u
{
case WDT_INPROC_CALL: long hInproc;
case WDT_REMOTE_CALL: long hRemote;
} RemotableHandle;
typedef [unique] RemotableHandle * wireHWND;
#define DECLARE_WIREM_HANDLE(name) \
typedef [wire_marshal(wire ## name)] void * name
DECLARE_WIREM_HANDLE( HWND );
So the familiar HWND type is really:
typedef [wire_marshal( wireHWND )] void* HWND;
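In the application (the client or the component, that is, whoever uses the proxy/stub) it is a void*. When it has to be transmitted, what actually travels is an instance of the RemotableHandle structure, a union discriminated by fContext that is 8 bytes long.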
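The 8-byte figure follows from how an encapsulated IDL union is laid out: a 4-byte discriminant followed by a union of 4-byte arms. The sketch below mirrors that layout in portable C++ under stated assumptions: std::int32_t stands in for the 4-byte long, and kInprocCall/kRemoteCall are hypothetical placeholder tags, not the real WDT_INPROC_CALL/WDT_REMOTE_CALL values from the Windows headers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tag values used only for this illustration.
enum : std::int32_t { kInprocCall = 1, kRemoteCall = 2 };

// Layout sketch of an encapsulated union "switch( long fContext ) u".
struct RemotableHandleSketch {
    std::int32_t fContext;      // discriminant: selects the active arm
    union {
        std::int32_t hInproc;   // valid when fContext == kInprocCall
        std::int32_t hRemote;   // valid when fContext == kRemoteCall
    } u;
};

// 4 bytes of discriminant + 4 bytes for the largest arm.
static_assert(sizeof(RemotableHandleSketch) == 8,
              "discriminant plus one 4-byte arm");

// Reads whichever arm the discriminant says is active.
std::int32_t handleValue(const RemotableHandleSketch& h) {
    return h.fContext == kInprocCall ? h.u.hInproc : h.u.hRemote;
}
```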
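To implement the association between void* and RemotableHandle* described above, the developer must provide definitions for the following four functions:
unsigned long __RPC_USER < userm-type >_UserSize( // returns the buffer size to request
unsigned long __RPC_FAR *pFlags, // a flags parameter, described later
// the buffer size requested so far; the returned size must start from this value
unsigned long StartingSize,
< userm-type > __RPC_FAR * pUser_typeObject ); // the presented-type instance to be passed
unsigned char __RPC_FAR * __RPC_USER < userm-type >_UserMarshal( // marshal
unsigned long __RPC_FAR * pFlags, // flags parameter
unsigned char __RPC_FAR * Buffer, // valid pointer into the allocated buffer
< userm-type > __RPC_FAR * pUser_typeObject ); // the presented-type instance to marshal
unsigned char __RPC_FAR * __RPC_USER < userm-type >_UserUnmarshal( // unmarshal
unsigned long __RPC_FAR * pFlags, // flags parameter
unsigned char __RPC_FAR * Buffer, // pointer to the buffer holding the marshaled data
// pointer to a presented-type instance; the value unmarshaled from the
// buffer is stored in the memory this pointer refers to
< userm-type > __RPC_FAR * pUser_typeObject );
void __RPC_USER < userm-type >_UserFree( // frees the memory allocated in UserUnmarshal
unsigned long __RPC_FAR * pFlags, // flags parameter
// the pUser_typeObject parameter of UserUnmarshal, a pointer to a presented-type instance
< userm-type > __RPC_FAR * pUser_typeObject );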
For HWND above, the developer would have to define the following four functions (Microsoft already provides them, of course):
unsigned long __RPC_USER
HWND_UserSize( unsigned long*, unsigned long, HWND* );
unsigned char* __RPC_USER
HWND_UserMarshal( unsigned long*, unsigned char*, HWND* );
unsigned char* __RPC_USER
HWND_UserUnmarshal( unsigned long*, unsigned char*, HWND* );
void __RPC_USER
HWND_UserFree( unsigned long*, HWND* );
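When the MIDL-generated marshaling code encounters a method parameter of type HWND, the following happens:
1. HWND_UserSize is called with the HWND instance supplied by the application (the client or the component, depending on whether the HWND is an in or an out parameter) to obtain the buffer size needed to transmit that instance.
2. A memory block of that size is allocated on the RPC channel.
3. HWND_UserMarshal is called with the same HWND instance and a pointer to the allocated buffer, to marshal the HWND instance into the buffer.
4. The buffer contents are sent through the RPC channel into the other process's address space.
5. HWND_UserUnmarshal is called with a pointer to the marshaled data received through the RPC channel and a pointer to a temporary HWND instance that receives the unmarshaled HWND.
6. The application's method is called with the unmarshaled HWND instance as its argument.
7. HWND_UserFree is called with the pointer to the temporary HWND instance created for HWND_UserUnmarshal, to release any memory allocated for it.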
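The seven steps can be traced in a small portable simulation. This is a sketch under stated assumptions rather than what the RPC runtime really does: Point and PPoint are hypothetical types, the four Point_User* functions omit the pFlags parameter and the __RPC_USER calling convention, no NDR alignment is performed, and a std::vector copy stands in for the RPC channel.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

struct Point { std::int32_t x, y; };   // hypothetical application data
typedef Point* PPoint;                 // the type seen by the method

// Step 1: report how many bytes this parameter adds to the buffer.
unsigned long Point_UserSize(unsigned long startingSize, PPoint*) {
    return startingSize + sizeof(Point);
}
// Step 3: copy the object into the buffer; return the next free position.
unsigned char* Point_UserMarshal(unsigned char* buffer, PPoint* pp) {
    std::memcpy(buffer, *pp, sizeof(Point));
    return buffer + sizeof(Point);
}
// Step 5: rebuild a temporary object from the received bytes.
unsigned char* Point_UserUnmarshal(unsigned char* buffer, PPoint* pp) {
    *pp = new Point;
    std::memcpy(*pp, buffer, sizeof(Point));
    return buffer + sizeof(Point);
}
// Step 7: release what UserUnmarshal allocated.
void Point_UserFree(PPoint* pp) {
    delete *pp;
    *pp = nullptr;
}

// Hypothetical server-side method that receives the parameter (step 6).
std::int32_t SumPoint(PPoint p) { return p->x + p->y; }

// Plays the role of the generated proxy/stub pair for one call.
std::int32_t CallSumRemotely(PPoint arg) {
    unsigned long size = Point_UserSize(0, &arg);      // step 1
    std::vector<unsigned char> channel(size);          // step 2
    Point_UserMarshal(channel.data(), &arg);           // step 3
    std::vector<unsigned char> received = channel;     // step 4 (transfer)
    PPoint tmp = nullptr;
    Point_UserUnmarshal(received.data(), &tmp);        // step 5
    std::int32_t result = SumPoint(tmp);               // step 6
    Point_UserFree(&tmp);                              // step 7
    return result;
}
```

The server-side method never sees the client's pointer; it works on the temporary object that the unmarshal step built and the free step destroys.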
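That is how [wire_marshal()] binds the wire type to the presented type. One thing has been left out, though: the flags parameter pFlags. It points to a 4-byte value. The high 16 bits carry encoding rules for the NDR format, so that the NDR engine (the code that lays out the filled buffer according to the NDR format string for transmission over the network) can convert the data correctly. The low 16 bits hold an MSHCTX enumeration value describing the calling context: in-process or cross-process, remote or local (see MSDN for details). The four functions can use this value to optimize accordingly.
The following implements [wire_marshal()] for the CA* type used above.
As already explained, CA* has no counterpart in IDL and should be passed as void*. Add the following to abc.idl: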
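Splitting the flags value into its two halves is simple bit arithmetic. The sketch below is portable and uses a hypothetical MarshalContext enumeration whose values mirror the documented MSHCTX constants (treat the mapping as an assumption and check the SDK headers). It also shows why the context has to be compared for equality: the values are consecutive integers, not bit flags, so a bitwise AND against the in-process value also matches other contexts.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the MSHCTX enumeration values.
enum MarshalContext : std::uint32_t {
    Local = 0, NoSharedMem = 1, DifferentMachine = 2, Inproc = 3, CrossCtx = 4
};

// Low 16 bits: the calling context.
inline std::uint32_t contextOf(std::uint32_t flags) { return flags & 0xFFFFu; }

// High 16 bits: the NDR data-representation information.
inline std::uint32_t dataRepOf(std::uint32_t flags) { return flags >> 16; }

// Correct test: equality on the low half, not a bitwise AND.
inline bool isInproc(std::uint32_t flags) { return contextOf(flags) == Inproc; }
```

With a flags value of 0x00100001 (context 1), isInproc returns false, whereas the expression flags & Inproc would be nonzero and wrongly take the in-process path.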
typedef struct _SA
{
long a, b;
} *PSA;
typedef [wire_marshal( PSA )] void* PA;
and add a method to the IAbc interface:
HRESULT SetA( [in] PA a );
Then add the following code to abc.cpp:
unsigned long __RPC_USER PA_UserSize( unsigned long* /* pFlags */,
unsigned long StartingSize,
PA* /* ppA */ )
{
// StartingSize exists because this parameter may not be the first one
// marshaled, for example: HRESULT SetA( [in] long tem1, [in] char tem2, [in] PA a );
// In that case StartingSize would be sizeof( long ) + sizeof( char ).
// It is passed in so that the function can apply any alignment it needs.
// No alignment is done here because _SA is a simple structure holding
// just two long members and is assumed to need no special alignment.
return StartingSize + sizeof( _SA );
}
unsigned char* __RPC_USER PA_UserMarshal( unsigned long *pFlags,
unsigned char *Buffer,
PA *ppA )
{
// Fill the buffer according to the definition of the wire type (the
// structure _SA). The data must follow the NDR transfer format; _SA is so
// simple that a plain copy is enough, with no alignment or conformance
// issues. For details of the NDR transfer format see
// http://www.opengroup.org/onlinepubs/9629399/chap14.htm
// MSHCTX is an enumeration, not a bit mask, so the low 16 bits of the
// flags are compared for equality.
if( ( *pFlags & 0xFFFF ) == MSHCTX_INPROC )
{
// In-process call: pass the CA* itself instead of copying the object
*reinterpret_cast< void** >( Buffer ) = *ppA;
}
else
{
CA *pA = reinterpret_cast< CA* >( *ppA );
PSA pSA = reinterpret_cast< PSA >( Buffer );
pSA->a = pA->m_a;
pSA->b = pA->m_b;
}
// Return the next valid buffer position, sizeof( _SA ) bytes further on
return Buffer + sizeof( _SA );
}
unsigned char* __RPC_USER PA_UserUnmarshal( unsigned long *pFlags,
unsigned char *Buffer,
PA *ppA )
{
if( ( *pFlags & 0xFFFF ) == MSHCTX_INPROC )
{
// In-process call: take the CA* itself instead of copying the object
*ppA = *reinterpret_cast< void** >( Buffer );
}
else
{
void *p = CoTaskMemAlloc( sizeof( CA ) );
if( !p )
return Buffer + sizeof( _SA );
CA *pAA = new( p ) CA; // construct a CA object in the allocated memory
PSA pSA = reinterpret_cast< PSA >( Buffer );
pAA->m_a = pSA->a;
pAA->m_b = pSA->b;
*ppA = p;
}
// Return the next valid buffer position, sizeof( _SA ) bytes further on
return Buffer + sizeof( _SA );
}
void __RPC_USER PA_UserFree( unsigned long *pFlags,
PA *ppA )
{
if( ( *pFlags & 0xFFFF ) != MSHCTX_INPROC )
{
// Not an in-process call: memory was allocated, so release it
CA *pAA = reinterpret_cast< CA* >( *ppA );
pAA->~CA();
CoTaskMemFree( pAA );
}
}
In use:
IAbc *pA; // assume this has already been initialized
CA a;
a.SetA( 654 );
PA pAA = &a;
pA->SetA( pAA ); // or simply pA->SetA( &a );
pA->GetA( &a );
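MIDL naturally provides more attributes for passing user-defined types than the ones above, such as [transmit_as()] and [handle]. This article is only meant as a starting point; see MSDN for the other attributes MIDL offers. The techniques shown here not only provide a channel for marshaling user-defined data types but also offer room for optimization (such as the pFlags parameter above). When writing a proxy/stub component, consider applying attributes like these at critical points to optimize the generated code.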
Using User-Defined Types in COM & ATL
Preface
The reason I got into this is that I've rarely used any help from newsgroups or similar communities. On the other hand since I've used code provided by other developers/programmers on CodeProject and CodeGuru it seemed reasonable to join a couple of them and just have a look.
Early in May 2000 I noticed several posts about UDTs and their interaction with VB and ATL. At that point I did not have any real experience with the subject. As a matter of fact I have never developed professionally in COM with C++ or ATL. In addition, I have learned the hard way that one cannot apply the coding techniques of C or C++ to VB. I still consider myself a novice in the COM environment.
It is true that there is very little help in implementing UDTs in COM and even less in implementing arrays of UDTs. In the past it was not even thinkable to use UDTs in COM. Nowadays there is support for UDTs in COM but there are no real example projects on how to use this feature. So a personal mail by a fellow developer inspired me to go onto this.
I am going to present a step-by-step approach to creating an ATL project that uses UDTs to communicate with a VB client. Using it with a C++ client will be easy as well.
This document proceeds along with the project. I assume you are familiar with ATL, COM and VB. Along the way I may present practices I use myself that are not strictly relevant to this example; you may already use them as well, and beginners may benefit from them.
Create the ATL project.
As a starting point create an ATL DLL project using the wizard. Set the name of the project to UDTDemo and accept the defaults. Now let's have a look at the generated "IDL" file.
//UDTDemo.IDL
import "oaidl.idl";
import "ocidl.idl";
[
uuid(C21871A1-33EB-11D4-A13A-BE2573A1120F),
version(1.0),
helpstring("UDTDemo 1.0 Type Library")
]
library UDTDEMOLib
{
importlib("stdole32.tlb");
importlib("stdole2.tlb");
};
Modify the type library name
As you would expect, there is nothing unknown in this file so far. The fact is that I do not really like the "Lib" portion added to the names of the projects I create, and I always change it before any object is inserted into the project. This is very easy.
As a first step, edit the "IDL" file and set the library name to whatever you like. You only have to remember that the name is case sensitive when the MIDL-generated code is used. The modified file is shown below.
//UDTDemo.IDL
import "oaidl.idl";
import "ocidl.idl";
[
uuid(C21871A1-33EB-11D4-A13A-BE2573A1120F),
version(1.0),
helpstring("UDTDemo 1.0 Type Library")
]
library UDTDemo //UDTDEMOLib
{
importlib("stdole32.tlb");
importlib("stdole2.tlb");
};
The second step is to replace every occurrence of the previous library name with the new one. Apart from the "IDL" file, the only file where the library name appears is the main project implementation file, "UDTDemo.cpp", where DllMain is defined and _Module is initialized. You may also use the "Find in Files" command from the toolbar and search for the "UDTDEMOLib" string.
Whichever way we use, we have to replace the "LIBID_UDTDEMOLib" string with "LIBID_UDTDemo". Mind the case of the strings; identifiers are case sensitive.
Now the type library has the name you really like. Again, keep in mind that this change is only trivial if it is done before any object is added to the project and before any early compilation of the project.
Below is the modified DllMain function of our project.
//UDTDemo.cpp
extern "C"
BOOL WINAPI DllMain(HINSTANCE hInstance, DWORD dwReason, LPVOID /*lpReserved*/)
{
if (dwReason == DLL_PROCESS_ATTACH)
{
//_Module.Init(ObjectMap, hInstance, &LIBID_UDTDEMOLib);
_Module.Init(ObjectMap, hInstance, &LIBID_UDTDemo);
DisableThreadLibraryCalls(hInstance);
}
else if (dwReason == DLL_PROCESS_DETACH)
_Module.Term();
return TRUE; // ok
}
You may Compile the project now. Be sure everything is done right. In case something goes wrong you should make sure all occurrences of "UDTDEMOLib" are replaced with "UDTDemo".
Defining the structure.
An empty project is not of any use. Our purpose is to define a UDT, or struct respectively, and this is what I am going to do right now.
The demo project will handle named variables. This means we need a structure for holding the Name, and the Value of a variable. Although I haven't tested it yet, we may add a VARIANT to hold some other Special data.
The above types were chosen so that you can see the whole story and not be left with any homework at all. :)
So open the UDTDemo.idl file and add these lines before the library block.
//UDTDemo.idl
typedef
[
uuid(C21871A0-33EB-11D4-A13A-BE2573A1120F),
version(1.0),
helpstring("A Demo UDT variable for VB projects")
]
struct UDTVariable {
[helpstring("Special case variant")] VARIANT Special;
[helpstring("Name of the variable")] BSTR Name;
[helpstring("Value of the variable")] long Value;
} UDTVariable;
Save and build again. Everything should compile without any problems. Well you have to follow this pace in this demo project. :)
User Defined data Types. The Theory.
Whenever a structure is created in IDL we need to specify a UUID for it so that the type library manager interfaces can get information about the UDT and access it. (I also realized why during this project :) ).
UUIDs
How is the UUID for the structure acquired? No, we do not run the guidgen utility. My next free hack is this. It may not be approved, but it works. Go to the library section, copy the UUID of the library, and paste it after the typedef keyword of the structure, inside the square brackets. Then go to the 8th digit and subtract one (1).
The library     uuid(C21871A1-33EB-11D4-A13A-BE2573A1120F)
The UDTVariable uuid(C21871A0-33EB-11D4-A13A-BE2573A1120F)
As documented, UUIDs are created from the current date, the current time and the unique number of the computer's network card. The time and date part resides in the first eight-digit group of the UUID. The next four-digit group is not known to me so far. The rest is unique and, we may say, identifies your PC. So subtracting one (1) from the time part takes us into the past, and the resulting UUID should still be unique!
As a rule of thumb, after the library UUID is created, I add one (1) to the time part of the UUID for every interface and coclass I insert into the project, and subtract one (1) for every structure or enumeration I use. Replacing the interface UUIDs is demonstrated later.
The only reason I go to this kind of trouble is that Windows is said to handle consecutive UUIDs in the registry faster!
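The add-one/subtract-one rule is plain hexadecimal arithmetic on the first eight-digit group of the UUID string. The following is a minimal sketch of that arithmetic only; bumpFirstGroup is a hypothetical helper, it assumes the usual 36-character textual form, and it makes no claim about whether the resulting value is actually unique.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <string>

// Returns a copy of `uuid` whose first 8-hex-digit group is changed by
// `delta` (negative for structures and enumerations, positive for
// interfaces and coclasses, following the rule of thumb in the text).
std::string bumpFirstGroup(const std::string& uuid, long delta) {
    assert(uuid.size() == 36 && uuid[8] == '-');
    std::uint32_t first = static_cast<std::uint32_t>(
        std::strtoul(uuid.substr(0, 8).c_str(), nullptr, 16));
    first += static_cast<std::uint32_t>(delta);   // wraps modulo 2^32
    char buf[9];
    std::snprintf(buf, sizeof buf, "%08X", static_cast<unsigned>(first));
    return std::string(buf) + uuid.substr(8);     // keep the other groups
}
```

Starting from the library UUID C21871A1-33EB-11D4-A13A-BE2573A1120F, a delta of -1 gives the value used for UDTVariable and a delta of -2 gives the one used for UDTArray.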
More type attributes.
After the definition of the UUID for our structure we define its version number. This is a hack discovered after checking the type library of a VB created object. VB adds a version number to everything it adds to a type library. This will never be used in this project but why not use it?
Then add a help string. This brief description is very useful to everyone using our library. I recommend using it all the time.
We could also add the public keyword to make the structure visible from the library clients. This is not necessary as it will finally be implicitly visible to the clients. Clients should not be able to create any structure which might not be used in the interfaces exposed by our object.
The UDT Data members.
Let's proceed to the data members now. First, every data member of our UDT must be an automation-compatible type. In the simplest terms, as I have concluded, in a UDT we are allowed to use only the types defined in the VARIANT union (if you have checked the code), or whatever VB allows us to put inside a Variant.
This is only for our own sake, to avoid writing marshaling code for our structure. Otherwise you are free to pass even a bit array within a structure :).
The data types of our UDT members were chosen so that we can expect some difficulty and make the demonstration as complete as possible.
Arrays as structure members.
At this point you should know how to declare a structure of simple types in IDL. Now that you know how to declare a UDT to be used in VB, let's take on exercise 1 and create a UDT that holds an array of UDTs. The reason is that arrays are also a special case, and since we haven't put an array in our first structure, let's make a case with one. Using an array of longs or any other type would be the same at this point of the demonstration.
//UDTDemo.idl
typedef
[
uuid(C218719F-33EB-11D4-A13A-BE2573A1120F),
version(1.0),
helpstring("A Demo UDT Holding an Array of Named Variables")
]
struct UDTArray {
[helpstring("array of named variables")]
SAFEARRAY(UDTVariable) NamedVars;
} UDTArray;
As you have noticed, the only difference is that we used the SAFEARRAY declaration and included in it the name of our newly declared UDT. This is the right way to tell the COM mechanism what the array will be holding. At this point we have declared a UDT holding a typed array.
Declaring an array of longs would be as simple as the following.
SAFEARRAY(long)
Performing a test.
We may compile our project once more. At this point it would be nice to create a test VB project and add our library to this client project through the References item in the Project menu. Now press F2 to check what VB can see in our library. Well, nothing but the globals appears in the window.
This is because we have declared our UDTs outside the library block in the IDL file. If an item (enum, UDT, interface) declared outside the library block is not explicitly or implicitly imported into the library block, it is unknown to the clients of the type library.
Let's make a simple test. Save the VB project and then close it; otherwise the UDTDemo project will not pass the link step. In the "UDTDemo.idl" file, go inside the library block and add the following lines.
//UDTDemo.IDL
library UDTDemo //UDTDEMOLib
{
importlib("stdole32.tlb");
importlib("stdole2.tlb");
struct UDTVariable;
struct UDTArray;
};
Build the UDTDemo project once more and open the VB demo project. Open the References dialog, uncheck the UDTDemo line, and close it; then add UDTDemo to the VB project again through the references.
Opening the object browser now will show both of the UDTs we have defined in our library. Close the VB project and comment out the previous lines in the "UDTDemo.idl" file. These structures will be added implicitly to the library through the interfaces we are going to define.
End of the test.
The big secret of our UDT is that the MIDL compiler attaches enough information about it to the library that it can be described through the IRecordInfo interface. So OLE Automation marshaling knows how to handle our UDT as a VT_RECORD type. Items identified as records can be marshaled, and so can arrays of records.
One more thing. The SAFEARRAY(UDTVariable) declaration is a typedef for LPSAFEARRAY. This means that the structure is really declared as
struct UDTArray
{
LPSAFEARRAY NamedVars;
}
This leads us to the conclusion that, inside our code, no information is available about the type of data the array holds. Only type-library-aware clients know the type information.
The Demo UDT Object
So far we have some rather useless structures. We cannot use them anywhere, except internally in VB, and only if we change the "UDTDemo.idl" file.
So to make our demo project a bit more useful, let's add an object to it. Use the hopefully well-known "New ATL Object" item of the Insert menu. In the ATL Object Wizard select "Simple Object" and press "Next".
Then type "UDTDemoOb" as the short name in the "ATL Object Wizard Properties" dialog. We may use whatever name we like, but we have to avoid "UDTDemo" because it collides with the library name.
Then, as I always suggest, in the Attributes tab check the "Support ISupportErrorInfo" option and leave the threading model as Apartment. As it dawned on me just now, check "Support Connection Points" as well.
After pressing "OK" the wizard creates two interfaces and a coclass for us in the IDL file, and a class that implements the IUDTDemoOb interface.
We checked the support for connection points because the code produced by the proxy generator for connection point interfaces is far from right when any of the parameters is of array type. It gives a slight warning about booleans and then compiles the wrong code. So we have to look at that as well.
At this point, as mentioned earlier in this document, I am going to replace the wizard-generated UUIDs. The rest of you may compile the project or follow along with me.
You may skip this if you like
Do not compile the project yet.
First copy the library UUID and paste it above every UUID defined for a) the IUDTDemoOb interface, b) the _IUDTDemoObEvents events interface and c) the UDTDemoOb coclass. While you copy the UUID, you may comment out the wizard-generated ones. Then, in the order just stated, increase the first part of the pasted library UUID by one for each new occurrence. Parts of the code will look like this.
//UDTDemo.idl
[
object,
//uuid(9117A521-34C3-11D4-A13A-AAA07458B90F), //previous one
uuid(C21871A2-33EB-11D4-A13A-BE2573A1120F), //library one, modified
dual,
helpstring("IUDTDemoOb Interface"),
pointer_default(unique)
]
interface IUDTDemoOb : IDispatch
{
};
[
//uuid(9117A523-34C3-11D4-A13A-AAA07458B90F), //previous one
uuid(C21871A3-33EB-11D4-A13A-BE2573A1120F), //library one, modified
helpstring("_IUDTDemoObEvents Interface")
]
dispinterface _IUDTDemoObEvents
{
properties:
methods:
};
[
//uuid(9117A522-34C3-11D4-A13A-AAA07458B90F), //previous one
uuid(C21871A4-33EB-11D4-A13A-BE2573A1120F), //library one, modified
helpstring("UDTDemoOb Class")
]
coclass UDTDemoOb
{
[default] interface IUDTDemoOb;
[default, source] dispinterface _IUDTDemoObEvents;
};
In the above items you may notice that the wizard-generated UUIDs differ from one another only in the first part, and that they are consecutive. They differ from the UUID of the library, however, in both the first and the second part. The fact is that these UUIDs were created one day later than the one created for the library.
Since the wizard-generated UUIDs are consecutive, we know we are not mistaken in replacing them with others that are also consecutive and that could have been created in the past.
At this moment there are three more occurrences of the UUID of the UDTDemoOb coclass. These are in the "UDTDemoOb.rgs" file. So copy the new UUID of the object, open the ".rgs" file in the editor, and replace the old UUID with the new one.
The above steps are performed for every object created by the wizard.
// UDTDemoOb.rgs
HKCR
{
UDTDemo.UDTDemoOb.1 = s 'UDTDemoOb Class'
{
CLSID = s '{C21871A4-33EB-11D4-A13A-BE2573A1120F}'
}
UDTDemo.UDTDemoOb = s 'UDTDemoOb Class'
{
CLSID = s '{C21871A4-33EB-11D4-A13A-BE2573A1120F}'
CurVer = s 'UDTDemo.UDTDemoOb.1'
}
NoRemove CLSID
{
ForceRemove {C21871A4-33EB-11D4-A13A-BE2573A1120F} = s 'UDTDemoOb Class'
{
ProgID = s 'UDTDemo.UDTDemoOb.1'
VersionIndependentProgID = s 'UDTDemo.UDTDemoOb'
ForceRemove 'Programmable'
InprocServer32 = s '%MODULE%'
{
val ThreadingModel = s 'Apartment'
}
'TypeLib' = s '{C21871A1-33EB-11D4-A13A-BE2573A1120F}'
}
}
}
End of skip area
Compile the project. Make sure everything is OK. If we check the project with the VB client at this point, we will only see the UDTDemoOb object appear in the object browser. This is correct.
So let's go on and add a property to our object. Using the property wizard, add a property named UdtVar that accepts a pointer to a UDTVariable. We'll get to the pointer thing later. UDTVariable is not in the type list of the dialog, so we have to type it in manually. Check the picture below.
This is what our interface looks like after pressing the [Ok] button.
[
object,
//uuid(9117A521-34C3-11D4-A13A-AAA07458B90F),
uuid(C21871A2-33EB-11D4-A13A-BE2573A1120F),
dual,
helpstring("IUDTDemoOb Interface"),
pointer_default(unique)
]
interface IUDTDemoOb : IDispatch
{
[propget, id(1), helpstring("Returns a UDT Variable in the UDTObject")]
HRESULT UdtVar([out, retval] UDTVariable * *pVal);
[propput, id(1), helpstring("Sets a UDT Variable in the UDTObject")]
HRESULT UdtVar([in] UDTVariable * newVal);
};
Let's check the put property first. Most of us know that a put property normally takes its argument "by value". Here we defined an [in] argument as a pointer to UDTVariable, so we pass the variable "by reference". In C and C++ we know that this is faster. The same applies to VB and COM. In VB, when dealing with types and structures, we are forced to use the ByRef declaration no matter which direction the data goes. It is up to the callee to preserve the integrity of the incoming data, so that when the method returns the input parameter is unchanged.
The get property, on the other hand, takes an argument of type pointer to pointer. At first this looks right, since a "pointer to pointer" is a reference to a "pointer", and the get property argument type is always declared as a pointer to the put property argument type.
As always when the argument is an out one, the callee is responsible for allocating the memory. This means that we would have to call "new UDTVariable" in our get_ method. But VB does not understand pointers, does it?
The above VB error says that VB cannot accept a pointer to pointer in a get method returning a UDT. So we have to alter the get property of our object to accept only a pointer to UDTVariable. Our method still handles the memory allocation for the extra string in the UDT. Let's see it.
VB dimensions a UDTVariable
VB allocates memory for the UDT.
The memory is sizeof( UDTVariable ).
VB passes the variable's address to the object.
Object allocates memory for UDT.Name
Object initializes the string
If Object.Special is not an integral type
Object allocates memory for the type
Object sets the value of Object.Special.
So our get method is still responsible for allocating memory for the members of the UDTVariable. It just does not allocate the memory for the UDTVariable body itself.
So after this we may go to the get method of our interface and remove one of the "*" characters. Along with this modification, change the argument names from pVal and newVal to "pUDT". This is a bit clearer for the VB or C++ client developer now that the studio environment offers autocompletion.
We also want this property to be the default one. Go and replace the id(1) with id(0) in both methods. Our interface now looks like this.
[
object,
//uuid(9117A521-34C3-11D4-A13A-AAA07458B90F),
uuid(C21871A2-33EB-11D4-A13A-BE2573A1120F),
dual,
helpstring("IUDTDemoOb Interface"),
pointer_default(unique)
]
interface IUDTDemoOb : IDispatch
{
[propget, id(0), helpstring("Returns a UDT Variable in the UDTObject")]
HRESULT UdtVar([out, retval]
UDTVariable *pUDT);
[propput, id(0), helpstring("Sets a UDT Variable in the UDTObject")]
HRESULT UdtVar([in]
UDTVariable *pUDT);
};
This is not enough, though. We have to inform the CUDTDemoOb class of the change in the interface. So go to the header file, remove one "*" from the get_UdtVar method and, since we are there, change the name of the argument to "pUDT". Do the same in the ".cpp" file.
Here are the modifications in the CUDTDemoOb class
//CUDTDemoOb.h
STDMETHOD(get_UdtVar)(/*[out, retval]*/ UDTVariable *pUDT);
STDMETHOD(put_UdtVar)(/*[in]*/ UDTVariable *pUDT);
//CUDTDemoOb.cpp
STDMETHODIMP CUDTDemoOb::get_UdtVar(UDTVariable *pUDT)
STDMETHODIMP CUDTDemoOb::put_UdtVar(UDTVariable *pUDT)
We are now ready to compile the project.
So what are these warnings about an incompatible automation interface ("warning MIDL2039 : interface does not conform to [oleautomation] attribute")? You may safely ignore this warning, as stated in several articles. When the MIDL compiler is upgraded the warning will go away (although this may not hold if the interface is defined inside the library block).
We may open the VB project again, and check the object browser. The property is there, declared for our object. There is also a reference to the UDTVariable. This is correct, since now the UDT is implicitly inserted into the UDTDemo library through the IUDTDemoOb interface.
Using the UDTVariable
Lets go back to the UDTDemo library and make it do something useful. First we need a UDTVariable member in the CUDTDemoOb class. So open the header file and add a declaration for a variable.
//CUDTDemoOb.h
protected:
UDTVariable m_pUDT;
We also have to modify the constructor of our class to initialize the m_pUDT structure. We also need to add a destructor to the class.
//CUDTDemoOb.h
CUDTDemoOb()
{
CComBSTR str = _T("Unnamed");
m_pUDT.Value = 0; //default value zero (0)
m_pUDT.Name = ::SysAllocString( str ); //default name "Unnamed"
::VariantInit( &m_pUDT.Special ); //default special value "Empty"
}
virtual ~CUDTDemoOb()
{
m_pUDT.Value = 0; //just in case
::SysFreeString( m_pUDT.Name ); //free the string memory
::VariantClear( &m_pUDT.Special ); // free the variant memory
}
Now it is time we added some functionality into the properties of our class.
Always check for an incoming NULL pointer, when there is a pointer involved. So go into both the get_ and put_ properties implementation and add the following.
//CUDTDemoOb.cpp
if( !pUDT )
return( E_POINTER );
Now get into the put_UdtVar property method. What we have to do, is assign the members of the incoming variable into the one our object holds. This is easy for the Value member but for the other two, we have to free their allocated memory before assigning the new values. That is why we have selected a string and a variant. So the code will now look like the following.
//CUDTDemoOb.cpp
STDMETHODIMP CUDTDemoOb::put_UdtVar(UDTVariable *pUDT)
{
if( !pUDT )
return( E_POINTER );
if( !pUDT->Name )
return( E_POINTER );
m_pUDT.Value = pUDT->Value; //easy assignment
::SysFreeString( m_pUDT.Name ); //free the previous string first
m_pUDT.Name = ::SysAllocString( pUDT->Name ); //make a copy of the incoming
::VariantClear( &m_pUDT.Special ); //free the previous variant first
::VariantCopy( &m_pUDT.Special, &pUDT->Special ); //make a copy
return S_OK;
}
As every great writer says, we remove error checks for clarity :).
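You may have noticed that we also check the string Name for a null value. BSTRs are declared as pointers, so this field might be NULL. This object treats a NULL pointer as different from an empty COM string, which is one with zero length. Be aware that this is stricter than the usual Automation convention, under which a NULL BSTR is treated as equivalent to a zero-length string (SysStringLen(NULL) returns 0). A client that never assigns Name, or passes a NULL string for an empty name, gets E_POINTER here.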
After the method returns, our object has a copy of the incoming structure, and that is what we wanted to do.
Now forward to the get_UdtVar method. This is the opposite of the previous one. We have to fill in the incoming structure with the values of the internal UDT structure of the object.
We may check the code.
//CUDTDemoOb.cpp
STDMETHODIMP CUDTDemoOb::get_UdtVar(UDTVariable *pUDT)
{
if( !pUDT )
return( E_POINTER );
pUDT->Value = m_pUDT.Value; //return value
::SysFreeString( pUDT->Name ); //free old (previous) name
pUDT->Name = ::SysAllocString( m_pUDT.Name ); //copy new name
::VariantClear( &pUDT->Special ); //free old special value
::VariantCopy( &pUDT->Special, &m_pUDT.Special ); //copy new special value
return S_OK;
}
The main difference is that the Name and Special members of the incoming UDT may now be NULL and Empty respectively. This is allowed because our object is obliged to fill in the structure. The caller is responsible only for allocating the memory for the UDT itself, not for its members.
Why do we free the incoming string? Because the caller may pass in an already initialized UDT. SysFreeString and VariantClear handle NULL string pointers and empty variants respectively. Freeing the string is still risky: if the method is not called from VB, the Name BSTR may hold a non-NULL but invalid pointer (garbage). We cannot test for that, because SysFreeString returns void, so a check like the commented one below does not compile. Callers other than VB must zero-initialize the structure before the call.
//HRESULT hr = ::SysFreeString( pUDT->Name ); //does not compile: no return value
//if( FAILED( hr ) )
//return( hr );
::SysFreeString( pUDT->Name ); //free old (previous) name; NULL is fine, garbage is not
Compile the project, open the VB client project, add a button to the form, and do some checks with assignments there.
Private Sub cmdFirstTest_Click()
Dim a_udt As UDTVariable ''define a couple UDTVariables
Dim b_udt As UDTVariable
Dim ob_udt As New UDTDemoOb ''declare and create a UDTDemoOb object
a_udt.Name = "Ioannis" ''initialize one of the UDTS
a_udt.Value = 10
a_udt.Special = 15.5
ob_udt.UdtVar = a_udt ''assign the initialized UDT to the object
b_udt = ob_udt.UdtVar ''assign the UDT of the object to the second UDT
''put a breakpoint here and check the result in the immediate window
End Sub
Now try this.
b_udt = ob_udt.UdtVar ''assign the UDT of the object to the second UDT
''put a breakpoint here and check the result in the immediate window
b_udt.Special = b_udt ''it actually makes a full copy of the b_udt
b_udt.Special.Special.Name = "kostas" ''vb does not use references
ARRAYS OF UDTs
So, you may say, we have not seen any arrays so far. That is our next step. We are going to add a method to our interface that returns an array of UDTs. It takes two numbers as input, start and length, and returns an array of length UDTVariables holding consecutive values.
So go to the UDTDemo project, right click on the IUDTDemoOb interface, and select "add method".
In the dialog, type "UDTSequence" as the name of the method, and add the following as the parameters: "[in] long start, [in] long length, [out, retval] SAFEARRAY(UDTVariable) *SequenceArr". Press [Ok] and let's see what the wizard added to the project for us.
Do not compile yet!
Well the definition of the new method has been inserted into the IUDTDemoOb interface.
//udtdemo.idl
[
object,
//uuid(9117A521-34C3-11D4-A13A-AAA07458B90F),
uuid(C21871A2-33EB-11D4-A13A-BE2573A1120F),
dual,
helpstring("IUDTDemoOb Interface"),
pointer_default(unique)
]
interface IUDTDemoOb : IDispatch
{
[propget, id(0), helpstring("Returns a UDT Variable in the UDTObject")]
HRESULT UdtVar([out, retval] UDTVariable *pUDT);
[propput, id(0), helpstring("Sets a UDT Variable in the UDTObject")]
HRESULT UdtVar([in] UDTVariable *pUDT);
[id(1), helpstring("Return successive named values")]
HRESULT UDTSequence([in] long start,
[in] long length,
[out, retval] SAFEARRAY(UDTVariable) *SequenceArr);
};
The above is edited a bit so it may be visible here at once. There should not be something we do not know so far. We saw earlier what SAFEARRAY(UDTVariable) is. This is the declaration of a pointer to a SAFEARRAY structure holding UDTVariables. So SequenceArr is really a reference to a SAFEARRAY pointer. Everything is fine so far.
Now let's check the header file of the CUDTDemoOb class.
//udtdemoob.h
public:
STDMETHOD(UDTSequence)(/*[in]*/ long start,
/*[in]*/ long length,
/*[out, retval]*/ SAFEARRAY(UDTVariable) *SequenceArr);
STDMETHOD(get_UdtVar)(/*[out, retval]*/ UDTVariable *pUDT);
STDMETHOD(put_UdtVar)(/*[in]*/ UDTVariable *pUDT);
At first it looks right. It is not. There is not any macro or something visible to the compiler to understand the SAFEARRAY(UDTVariable) declaration. As we said at the beginning of this document, our code will never have enough type information about the SAFEARRAY structure. The type information for arrays should be checked at run time. So we have to modify the code. Replace SAFEARRAY(UDTVariable) with SAFEARRAY *.
This is how the code should look.
//udtdemoob.h
public:
STDMETHOD(UDTSequence)(/*[in]*/ long start,
/*[in]*/ long length,
/*[out, retval]*/ SAFEARRAY **SequenceArr);
STDMETHOD(get_UdtVar)(/*[out, retval]*/ UDTVariable *pUDT);
STDMETHOD(put_UdtVar)(/*[in]*/ UDTVariable *pUDT);
You've probably realized that we have to modify the implementation file of CUDTDemoOb class to correct this problem. Well I was surprised to see that for the first time, the wizard had not even added the declaration of the SequenceArr.
//udtdemoob.cpp
STDMETHODIMP CUDTDemoOb::UDTSequence(long start, long length ) //Where is the SafeArray ?
{
return S_OK;
}
As you see, we have to add the SAFEARRAY **SequenceArr declaration. If, on the other hand, SequenceArr was declared, just replace it as we did in the header.
//udtdemoob.cpp
STDMETHODIMP CUDTDemoOb::UDTSequence(long start, long length, SAFEARRAY **SequenceArr )
{
return S_OK;
}
Now we may compile the project. Check the VB client project again: the object browser shows the new method, and that it returns an array of UDTVariables.
So return to the implementation of UDTSequence to start adding checking code. First we have to test that the outgoing array variable is not null. The second check is the length variable. It may not be less than or equal to zero.
//udtdemoob.cpp
STDMETHODIMP CUDTDemoOb::UDTSequence(long start, long length,
SAFEARRAY **SequenceArr)
{
if( !SequenceArr )
return( E_POINTER );
if( length <= 0 ) {
HRESULT hr=Error(_T("Length must be greater than zero") );
return( hr );
}
return S_OK;
}
You may notice the use of the Error method. It is provided by ATL and makes it very easy to notify clients of errors without much trouble.
The next step is to check the actual array pointer, the dereferenced one. This is "*SequenceArr". There are two possibilities at this point. Either it is NULL, which is fine since we return the array, or it holds some non-zero value, in which case we suppose it is an array, clear it, and create a new one.
So the method goes on.
//udtdemoob.cpp
STDMETHODIMP CUDTDemoOb::UDTSequence(long start, long length,
SAFEARRAY **SequenceArr)
{
if( !SequenceArr )
return( E_POINTER );
if( length <= 0 ) {
HRESULT hr = Error( _T("Length must be greater than zero") );
return( hr );
}
if( *SequenceArr != NULL ) {
::SafeArrayDestroy( *SequenceArr );
*SequenceArr = NULL;
}
return S_OK;
}
Create The Array
Now we may create a new array to hold the sequence of named variables. Our first thought here is to use the ::SafeArrayCreate method, since we do not know what we exactly need. Search the MSDN library and in the documentation we find nothing about UDTs. On the other hand the ::SafeArrayCreateEx method implies it may create an array of Records (UDTs).
As with the normal version, this method needs a SAFEARRAYBOUND structure, the number of dimensions, the data type, and a pointer to an IRecordInfo interface. So, go by the book: a) we need an array of "records", so use VT_RECORD; b) we need only one (1) dimension; c) we need a zero-based array (lLbound) with length elements (cElements). This is what we have so far.
SAFEARRAYBOUND rgsabound[1];
rgsabound[0].lLbound = 0;
rgsabound[0].cElements = length;
*SequenceArr = ::SafeArrayCreateEx(VT_RECORD, 1, rgsabound, /*what next ?*/ );
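Searching MSDN once more reveals the "::GetRecordInfoFromGuids" method. Actually there are two such methods, but this one seemed easier to use for this tutorial. Its arguments are the GUID of the type library, the library's major and minor version numbers, the locale ID, the GUID of the UDT's type information, and the address of the IRecordInfo pointer to fill in.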
So, go into the IDL file, copy the uuid of the UDTVariable structure and paste it at the beginning of the implementation file. Then make it a formal UUID structure.
So this "C21871A0-33EB-11D4-A13A-BE2573A1120F" becomes
//udtdemoob.cpp
const IID UDTVariable_IID = { 0xC21871A0,
0x33EB,
0x11D4, {
0xA1,
0x3A,
0xBE,
0x25,
0x73,
0xA1,
0x12,
0x0F
}
};
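The hand conversion above is easy to get wrong (see the "Element not found" note further down), so here is a minimal portable sketch of the same mapping. The `Guid` struct and the `ParseGuid` helper are hypothetical names for this illustration only; `Guid` merely mirrors the field layout of the Windows IID structure so the sketch builds without the Windows headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in that mirrors the field layout of the Windows GUID/IID.
struct Guid {
    std::uint32_t Data1;
    std::uint16_t Data2;
    std::uint16_t Data3;
    std::uint8_t  Data4[8];
};

// Returns the value of one hex digit, or -1 if c is not a hex digit.
static int HexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Parses "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX" (36 characters, no braces).
// Returns false if the text does not have exactly that shape.
bool ParseGuid(const std::string& text, Guid& out) {
    if (text.size() != 36) return false;
    std::uint8_t bytes[16];
    std::size_t n = 0;                       // bytes decoded so far
    for (std::size_t i = 0; i < text.size(); ) {
        if (i == 8 || i == 13 || i == 18 || i == 23) {
            if (text[i] != '-') return false;
            ++i;
            continue;
        }
        int hi = HexValue(text[i]);
        int lo = HexValue(text[i + 1]);
        if (hi < 0 || lo < 0) return false;
        bytes[n++] = static_cast<std::uint8_t>(hi * 16 + lo);
        i += 2;
    }
    // The first three groups are numbers; the last two groups are raw bytes.
    out.Data1 = (static_cast<std::uint32_t>(bytes[0]) << 24) |
                (static_cast<std::uint32_t>(bytes[1]) << 16) |
                (static_cast<std::uint32_t>(bytes[2]) << 8) |
                 static_cast<std::uint32_t>(bytes[3]);
    out.Data2 = static_cast<std::uint16_t>((bytes[4] << 8) | bytes[5]);
    out.Data3 = static_cast<std::uint16_t>((bytes[6] << 8) | bytes[7]);
    for (int k = 0; k < 8; ++k) out.Data4[k] = bytes[8 + k];
    return true;
}
```

Feeding it the uuid string of UDTVariable gives the same Data1, Data2, Data3 and eight Data4 bytes as the initializer typed by hand, which makes it a handy cross-check.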
Now we are ready to create an uninitialized array of UDTVariable structures inside the UDTSequence function.
//////////////////////////////////////////////////
//here starts the actual creation of the array
//////////////////////////////////////////////////
IRecordInfo *pUdtRecordInfo = NULL;
HRESULT hr = GetRecordInfoFromGuids( LIBID_UDTDemo,
1, 0,
0,
UDTVariable_IID,
&pUdtRecordInfo );
if( FAILED( hr ) ) {
HRESULT hr2 = Error( _T("Can not create RecordInfo interface for "
"UDTVariable") ); //note the space: adjacent literals are concatenated
return( hr ); //Return original HRESULT hr2 is for debug only
}
SAFEARRAYBOUND rgsabound[1];
rgsabound[0].lLbound = 0;
rgsabound[0].cElements =length;
*SequenceArr = ::SafeArrayCreateEx( VT_RECORD, 1, rgsabound, pUdtRecordInfo );
pUdtRecordInfo->Release(); //do not forget to release the interface
if( *SequenceArr == NULL ) {
HRESULT hr = Error( _T("Can not create array of UDTVariable "
"structures") );
return( hr );
}
//////////////////////////////////////////////////
//the array has been created
//////////////////////////////////////////////////
Now we have created an uninitialized array, and have to put data on it. You may also make tests with VB at this point, to check that the method returns arrays with the expected size. Even without data.
If you get the HRESULT error code "Element not found", make sure you have typed the UDTVariable_IID correctly.
At this point you should also know that the memory which has been allocated by the system for the array is zero initialized. This means that the Value and Name members are initialized to zero (0) and the Special member is initialized to VT_EMPTY. This is helpful in case we'd like to distinguish between an initialized or not slot in the array.
Add Data into the Array
There are two ways to fill in an array with data. One is to add it one by one, using the ::SafeArrayPutElement method, and the other is to use the ::SafeArrayAccessData to manipulate the data a bit faster. In my experience we are going to use the first one when we want to access a single element and the second one when we need to perform calculation in the whole range of the data the array holds.
Safe arrays of structures appear in memory as normal arrays of structures. At first one might think that the SAFEARRAY keeps record information with every item in the array. This is not true. There is only one IRecordInfo or ITypeInfo pointer for the whole array. SAFEARRAYs use a simple old trick: when the memory for the SAFEARRAY structure is allocated, some extra memory is allocated in front of it to hold the extra pointer if one is needed. This is stated in the MSDN library.
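To make the "old trick" concrete, here is a minimal portable sketch of a descriptor with one hidden pointer slot in front of it. It is only a model: `ArrayDescriptor`, `AllocDescriptor`, `HiddenTypeInfo` and `FreeDescriptor` are hypothetical names, the descriptor is a cut-down stand-in for SAFEARRAY, and the exact layout the system uses is not something this sketch claims to reproduce.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical, simplified descriptor; the real SAFEARRAY has more fields.
struct ArrayDescriptor {
    unsigned short cDims;
    unsigned short fFeatures;
    unsigned long  cbElements;
    void*          pvData;
};

// The descriptor starts one pointer past the raw block, so it stays aligned
// only if the pointer size is a multiple of the descriptor's alignment.
static_assert(sizeof(void*) % alignof(ArrayDescriptor) == 0,
              "hidden slot would misalign the descriptor");

// Allocates one pointer-sized slot directly in front of the descriptor and
// stores the shared type-info pointer there. Returns nullptr on failure.
ArrayDescriptor* AllocDescriptor(void* sharedTypeInfo) {
    void* raw = std::calloc(1, sizeof(void*) + sizeof(ArrayDescriptor));
    if (!raw) return nullptr;
    std::memcpy(raw, &sharedTypeInfo, sizeof(void*));        // the hidden slot
    void* visible = static_cast<unsigned char*>(raw) + sizeof(void*);
    return new (visible) ArrayDescriptor();                  // zero-initialized
}

// Reads the pointer back from the slot in front of the descriptor.
void* HiddenTypeInfo(const ArrayDescriptor* d) {
    void* p = nullptr;
    std::memcpy(&p,
                reinterpret_cast<const unsigned char*>(d) - sizeof(void*),
                sizeof(void*));
    return p;
}

// Frees the whole block, starting at the hidden slot.
void FreeDescriptor(ArrayDescriptor* d) {
    if (d) std::free(reinterpret_cast<unsigned char*>(d) - sizeof(void*));
}
```

Callers only ever see the `ArrayDescriptor*`, so every element can share the one pointer without the visible structure growing a field for it.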
So now we are going to create two internal methods for demonstrating both ways of entering data into the array.
First we'll use the ::SafeArrayPutElement method. In the CUDTDemoOb class declaration, insert the declaration of this method. This method should be declared protected, since it will only be called internally by the class itself.
//udtdemoob.h
protected:
HRESULT SequenceByElement(long start, long length, SAFEARRAY *SequenceArr);
The only difference from the UDTSequence method is that this one accepts a plain pointer to a SAFEARRAY, not the pointer to pointer used in UDTSequence.
The algorithm to fill the array is really simple. For every UDTVariable in the array, we set successive values starting from start into the Value member of our structure, convert this numerical to BSTR and assign it to the Name member of the structure. Finally set the value of the Special member to be either of type long or double and assign to it the same numeric value, except that when we use the double version add "0.5" to have different data there.
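Before going through the SAFEARRAY calls, it may help to see the fill algorithm on its own. The following is a minimal portable sketch, not the code of the accompanying project: `UdtModel` is a hypothetical stand-in for UDTVariable (a wide string instead of a BSTR, a tagged pair instead of a VARIANT), the "Named_" prefix is an assumption, and so is the choice to alternate between the long and the double form on even and odd positions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class SpecialKind { Long, Double };

// Hypothetical portable stand-in for UDTVariable (long, BSTR, VARIANT).
struct UdtModel {
    long         Value;
    std::wstring Name;
    SpecialKind  Kind;      // tells which of the two fields below is meaningful
    long         AsLong;
    double       AsDouble;
};

// Builds `length` items holding successive values that begin at `start`.
// A length of zero or less yields an empty sequence.
std::vector<UdtModel> MakeSequence(long start, long length) {
    std::vector<UdtModel> out;
    for (long i = 0; i < length; ++i) {
        UdtModel u;
        u.Value = start + i;
        u.Name = L"Named_" + std::to_wstring(u.Value);   // assumed prefix
        if (i % 2 == 0) {            // assumed rule: even positions hold a long
            u.Kind = SpecialKind::Long;
            u.AsLong = u.Value;
            u.AsDouble = 0.0;
        } else {                     // odd positions hold the value plus 0.5
            u.Kind = SpecialKind::Double;
            u.AsLong = 0;
            u.AsDouble = static_cast<double>(u.Value) + 0.5;
        }
        out.push_back(u);
    }
    return out;
}
```

The COM version has the same shape; the difference is that each string and variant it creates has an owner that must free it.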
In the implementation file of our class add the method definition.
//udtdemoob.cpp
HRESULT CUDTDemoOb::SequenceByElement(long start,
long length,
SAFEARRAY *SequenceArr)
{
return( S_OK );
}
We may skip checking the incoming variables in this method, since it is supposed to be called only inside the class, after the initial checks have been made.
//udtdemoob.cpp
HRESULT CUDTDemoOb::SequenceByElement(long start,
long length,
SAFEARRAY *SequenceArr)
{
long lBound = 0; //receives the lower bound of the first dimension
HRESULT hr = SafeArrayGetLBound( SequenceArr, 1, &lBound );
if( FAILED( hr ) ) return( hr );
return( S_OK );
}
The first check to be performed is the lower bound of the array. Although we state that we handle zero-based arrays, one may pass a special bounded array. In VB it is easy to get one-based arrays. It is also a way to know we have a valid SAFEARRAY pointer.
The following code makes the conversion from numeric to string, and assigns the string value to the Name member of the a_udt structure.
//udtdemoob.cpp
HRESULT CUDTDemoOb::SequenceByElement(long start,
long length,
SAFEARRAY *SequenceArr)
{
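long lBound = 0; //receives the lower bound of the first dimension
HRESULT hr = SafeArrayGetLBound( SequenceArr, 1, &lBound );
if( FAILED( hr ) )
return( hr );
//a_udt, a_variant and strDefPart are locals declared in the accompanying project
hr = ::VariantChangeType( &a_variant, &a_udt.Special, 0, VT_BSTR );
hr = ::VarBstrCat( strDefPart, a_variant.bstrVal, &a_udt.Name );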
return( S_OK );
}
You may see the code in the accompanying project, so we are going to explain the big picture. Inside the loop this line is executed.
//udtdemoob.cpp
HRESULT CUDTDemoOb::SequenceByElement(long start,
long length,
SAFEARRAY *SequenceArr)
{
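long lBound = 0; //receives the lower bound of the first dimension
HRESULT hr = SafeArrayGetLBound( SequenceArr, 1, &lBound );
if( FAILED( hr ) )
return( hr );
//a_udt, a_variant, strDefPart and the loop index i are declared in the accompanying project
hr = ::VariantChangeType( &a_variant, &a_udt.Special, 0, VT_BSTR );
hr = ::VarBstrCat( strDefPart, a_variant.bstrVal, &a_udt.Name );
hr = ::SafeArrayPutElement( SequenceArr, &i, (void*)&a_udt ); //copies a_udt into slot i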
return( S_OK );
}
This line adds a_udt at the ith position of the array. What we have to know is that the system makes a full copy of the structure we pass in. It can do so because of the IRecordInfo interface we supplied when creating the array. As a result, we still own, and have to release, the memory held by any BSTR or VARIANT we used. Here that means a_variant, which holds the converted string, and also a_udt.Name, because VarBstrCat allocates a new BSTR for its result.
Let's move to the ::SafeArrayAccessData method and check out the differences. The first change is that we now use a pointer to UDTVariable, p_udt. The second big difference is that inside the loop there is only code to set the members of the structure through the pointer. The only code that touches the array is outside the loop: the calls that lock and unlock the memory the data resides in. There is also one more check inside the loop:
//udtdemoob.cpp
//....
if( p_udt->Name )
::SysFreeString( p_udt->Name );
//....
This is to demonstrate that since we access the data without any other interference we have to release any memory allocated for a BSTR string, a VARIANT or even an interface pointer before assigning data to it. As it was pointed before, checking for the NULL value might be adequate for this simple demonstration.
I hope it is obvious that the second method, ::SafeArrayAccessData, is the better choice when you need to access all or most of the data in the array, while the ::SafeArrayGetElement and ::SafeArrayPutElement pair may be more appropriate if you want to modify one or two elements at a time.
As a final step insert the following lines at the end of the body of the UDTSequence method, and test it with the VB client project. You may comment out which ever you like to see how it works, and that they both give the same results.
// hr = SequenceByElement( start, length, *SequenceArr );
hr = SequenceByData( start, length, *SequenceArr );
Static Arrays
Our method presents a fault in the design. It may only return a dynamically created array. This means the array is created on the heap. Try adding the following lines in VB and check this out.
Dim a_udt_arr(5) As UDTVariable
Dim a_udt_ob As New UDTDemoOb
a_udt_arr = a_udt_ob.UDTSequence(15, 5) ''Error here
Well, conformant arrays, I think this is what they call them, are only available as [in] arguments in this demo. So for the moment add one more check to our UDTSequence method. The other problem is that arrays are always passed as doubly referenced pointers.
So let's try a "modify the array" approach.
Add one more property to the interface
Call it Item like in collections. The signature will be
//udtdemoob.idl
[propput, .....]
HRESULT Item( [in] long index,
[in] SAFEARRAY(UDTVariable) *pUDTArr,
[in] UDTVariable *pUDT );
[propget, .....]
HRESULT Item( [in] long index,
[in] SAFEARRAY(UDTVariable) *pUDTArr,
[out, retval] UDTVariable *pUDT );
The reason we add this, is to demonstrate some checks for the incoming arrays. As you may have guessed by the method definition, arrays although defined as [in] are still modifiable in every way. Our first check is to see if it is an array of UDTVariable structures. Since this check is performed in at least two methods, we may put it in its own protected function inside the object implementation class.
As you have noticed, our object still does not keep any state about the incoming arrays.
HRESULT IsUDTVariableArray( SAFEARRAY *pUDTArr, bool &isDynamic )
The only difference in what you might expect is the bool reference at the end of the declaration. Well, this check function will be able to inform us if a) we may actually modify the array, (append or remove items by reallocating the memory, or even destroy and recreate the array), b) we may only modify individual UDTVariable structures inserted in the array. The former feature will not be implemented in the demonstrating project.
Our first check is the number of dimensions of the incoming array. We want this to be one dimensioned. After reading the tutorial you may expand this to multidimensional arrays although there is a slight issue.
long dims = SafeArrayGetDim( pUDTArr );
if( dims != 1 ) {
hr = Error( _T("Not Implemented for multidimensional arrays") );
return( hr );
}
The next step is to check that the array was created to hold structures. This is easily done by checking that the features flag of the incoming array indicates record support.
unsigned short feats = pUDTArr->fFeatures; //== 0x0020;
if( (feats & FADF_RECORD) != FADF_RECORD ) {
hr = Error( _T("Array is expected to hold structures") );
return( hr );
}
Final check is to compare the name of the structure the array holds with ours. To do this we have to get access to the IRecordInfo interface pointer the array holds.
IRecordInfo *pUDTRecInfo = NULL;
hr = ::SafeArrayGetRecordInfo( pUDTArr, &pUDTRecInfo );
if( FAILED( hr ) )
return( hr );
if( !pUDTRecInfo ) //success code but no interface: treat as failure
return( E_FAIL );
Now do the comparing.
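BSTR udtName = ::SysAllocString( L"UDTVariable" );
BSTR bstrUDTName = NULL; //must be NULL before GetName fills it in
hr = pUDTRecInfo->GetName( &bstrUDTName );
//VarBstrCmp takes the locale third and the flags last
bool sameName = SUCCEEDED( hr ) &&
VarBstrCmp( udtName, bstrUDTName, GetUserDefaultLCID(), 0 ) == VARCMP_EQ;
::SysFreeString( bstrUDTName ); //free on every path, not only on mismatch
::SysFreeString( udtName );
pUDTRecInfo->Release(); //we are done with the record info
if( !sameName ) {
hr = Error(_T("Object Does Only support [UDTVariable] Structures") );
return( hr );
}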
In the accompanying project there are also some more checks as demonstration, which are available only through the debugger. Implementing the Item property is straightforward after this.
Using VARIANTS
I do not think this is enough so far, as we have not discussed using our structure with variants. So let's add one more property to our object. Add the following definition to our interface.
HRESULT VarItem([in] long items, [out, retval]
LPVARIANT pUdtData );
Now go to the definition of the new property in the implementation file of the CUDTDemoOb class and let's do something.
First some checks. The usual check for the null pointer, and then check if the VARIANT contains any data. If it is not empty we should clear it.
if( !pUdtData )
return( E_POINTER );
if( pUdtData->vt != VT_EMPTY )
::VariantClear( pUdtData );
The next step is to implement the algorithm, which returns a) a single UDTVariable structure if the items argument is less than or equal to one (1), or b) an array of structures if items is larger than one (1).
In both situations we have to set the type of the outgoing VARIANT to VT_RECORD, and this is the only similarity in accessing the VARIANT pUdtData variable. For the single UDTVariable structure, we have to set the pRecInfo member of the VARIANT to a valid IRecordInfo interface pointer. This has been demonstrated earlier. Then assign the new structure to the pvRecord member of the variant. Returning an array on the other hand, we must update the type of the outgoing VARIANT to be of type VT_ARRAY as well. Then we just assign an already constructed array to the parray member of the variant. Both the assignments are easily done, since we have already implemented appropriate properties and methods in our object.
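if( items <= 1 ) {
IRecordInfo *pUdtRecordInfo = NULL;
hr = ::GetRecordInfoFromGuids( LIBID_UDTDemo,
1, 0,
0,
UDTVariable_IID,
&pUdtRecordInfo );
if( FAILED( hr ) ) {
HRESULT hr2= Error( _T("Can not create RecordInfo "
"interface for UDTVariable") );
return( hr ); //return the original HRESULT; hr2 is for debugging only
}
//allocate storage for one record: pvRecord must point at a real
//UDTVariable, the pointer field itself cannot serve as the storage
pUdtData->pvRecord = pUdtRecordInfo->RecordCreate();
if( !pUdtData->pvRecord ) {
pUdtRecordInfo->Release();
return( E_OUTOFMEMORY );
}
//assign record information on the variant
pUdtData->pRecInfo = pUdtRecordInfo;
pUdtRecordInfo = NULL; //MIND. we do not release the interface.
//VariantClear should
pUdtData->vt = VT_RECORD;
hr = get_UdtVar( (UDTVariable*) pUdtData->pvRecord );
} else {
//here the valid pointer of the union is the array.
//so the array holds the record info.
pUdtData->vt = VT_RECORD | VT_ARRAY;
pUdtData->parray = NULL; //UDTSequence destroys any non-NULL incoming array
hr = UDTSequence(1, items, &(pUdtData->parray) );
}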
I think this is enough for a basic tutorial on UDT's with COM. There is no interface defined to access the second type UDTArray defined in the type library, but this should be straightforward at this moment (I tricked you :) ). In the demo project, I've explicitly added the structure in the library body, so you can play with this in VB.
"Safe Arrays" in EVENTS
I've also said that there is a flaw in the code the wizard generates for interfaces that pass any kind of array back. This has partially been taken care of by the implementation of the VarItem method. An event method is demonstrated in the project. Here is what has been changed in the generated method.
Supposing that not many of us have used events in the controls, I am going to be a bit more specific on this.
Let's begin the journey to ConnectionPoints. First we have to add a method to the IUDTDemoObEvents interface. Here is the signature of this method. So far you have the knowledge to understand the signature of this method. Additionally only the UDTDemo.idl has changed so far.
[id(1), helpstring
("Informs about changes in an array of named vars")]
HRESULT ChangedVars(SAFEARRAY(UDTVariable) *pVars);
Now compile once more the project, and check the Object Browser in the VB client. You may see the event declared in the object.
Now that the project is compiled and the UDTDemo type library is updated, we may update the CUDTDemoOb class to use the IUDTDemoObEvents interface. In the project window, right click on the CUDTDemoOb class, and from the popup menu select "Implement connection point".
In the following dialog box, select the (check on it) _IUDTDemoObEvents interface and press [ok].
The wizard has now added one more file into the project. "UDTDemoCP.h" in which the CProxy_IUDTDemoEvents< class T > template class is implemented, and handles the event interface of the UDTDemoOb coclass object. The CUDTDemoOb class is now deriving from the newly generated proxy class.
The proxy class holds the Fire_ChangedVars method, which is implemented and we can call it from any point of our class to fire the event.
So let's go to the implementation of the UDTSequence method just for the demonstration and fire the event.
//UDTDemoOb.cpp - UDTSequence method
//hr = SequenceByElement( start, length, *SequenceArr );
hr = SequenceByData( start, length, *SequenceArr);
return Fire_ChangedVars( SequenceArr );
//<<---- changed here //return S_OK;
Now compile the project, and watch the output.
warning C4800:
'struct tagSAFEARRAY ** ' : forcing value to bool 'true' or 'false'
(performance warning)
This is not really a warning. This is an implementation error and causes runtime problems. Let's see just for the demonstration of it. Open the VB Client again and add the following in the declarations of the demo form. I hope you know what the WithEvents keyword means.
Dim WithEvents main_UDT_ob As UDTDemoOb
Update the following as well
Private Sub Form_Load()
Set main_UDT_ob = New UDTDemoOb
End Sub
Private Sub Form_Unload(Cancel As Integer)
Set main_UDT_ob = Nothing
End Sub
Private Sub main_UDT_ob_ChangedVars(pVars() As UDTDemo.UDTVariable)
Debug.Print pVars(1).Name, pVars(1).Special, pVars(1).Value
End Sub
Set a breakpoint in the debug statement of the event handler and run the client. See what we get.
[Figure: result when running inside the VB IDE]
And in stand-alone execution we get
[Figure: result in stand-alone execution]
The actual error is the following, and it is the one we should expect given the warning. It was found in the VC++ debugger as the HRESULT returned by the Invoke method.
0x80020005 == Type Mismatch
It's time we checked the code the wizards generated for us.
HRESULT Fire_ChangedVars(SAFEARRAY * * pVars)
{
CComVariant varResult;
T* pT = static_cast<T*>(this);
int nConnectionIndex;
CComVariant* pvars = new CComVariant[1];
int nConnections = m_vec.GetSize();
for( nConnectionIndex = 0; nConnectionIndex < nConnections; nConnectionIndex++) {
pT->Lock();
CComPtr<IUnknown> sp = m_vec.GetAt(nConnectionIndex);
pT->Unlock();
IDispatch* pDispatch = reinterpret_cast<IDispatch*>(sp.p);
if (pDispatch != NULL)
{
VariantClear(&varResult);
pvars[0] = pVars;
DISPPARAMS disp = { pvars, NULL, 1, 0 };
pDispatch->Invoke( 0x1,
IID_NULL,
LOCALE_USER_DEFAULT,
DISPATCH_METHOD,
&disp,
&varResult,
NULL, NULL);
}
}
delete[] pvars;
return varResult.scode;
}
Let's check the trouble lines.
CComVariant* pvars = new CComVariant[1];
int nConnections = m_vec.GetSize();
This logically assumes that more than one client might be connected to the instance of our object. But the missing check means that at least one client is expected to be connected. This is wizard code, so it should perform some checks; we are not expected to know every detail of the IConnectionPointImpl ATL class.
int nConnections = m_vec.GetSize();
if( !nConnections )
return S_OK;
CComVariant* pvars = new CComVariant[1];
Of course I'm exaggerating, but this is my way of doing such things.
The final lines, shown below, incorrectly assume that only one client is connected to our object. Each time Invoke is called inside the loop, varResult is set to the return value of the method being invoked. Neither varResult nor the return value of Invoke itself (which in our project carried the real error) is checked. So, as written, firing the event succeeds or fails depending only on the last client notified. Consider a single-instance EXE server with several clients connected to it!
pDispatch->Invoke( 0x1, .. .
return varResult.scode;
This is not to blame anyone: if we want per-connection error handling we have to write it ourselves. Just remember that you have to take care of it, depending on the project.
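One possible policy, sketched here in portable C++ rather than as a patch to the wizard code, is to notify every connected client and return the first failure instead of whatever the last client happened to return. `HResult`, `kOk`, `kTypeMismatch`, `Failed`, `SinkList` and `FireToAll` are hypothetical stand-ins for this illustration; each sink plays the role of one `pDispatch->Invoke` call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-ins so the sketch builds without the Windows headers.
typedef std::int32_t HResult;
const HResult kOk = 0;
const HResult kTypeMismatch = -2147352571;   // 0x80020005 as a signed 32-bit value
inline bool Failed(HResult hr) { return hr < 0; }

typedef std::vector<std::function<HResult()> > SinkList;

// Calls every connected sink. Returns kOk when there are no sinks or all of
// them succeed, otherwise the first failure that was seen.
HResult FireToAll(const SinkList& sinks) {
    HResult firstFailure = kOk;
    for (std::size_t i = 0; i < sinks.size(); ++i) {
        if (!sinks[i]) continue;             // empty slot, like a NULL pDispatch
        HResult hr = sinks[i]();
        if (Failed(hr) && !Failed(firstFailure))
            firstFailure = hr;               // remember it, but keep notifying
    }
    return firstFailure;
}
```

Whether the first failure, the last one, or a per-client report is the right answer depends on the project; the point is only that the choice should be made deliberately.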
The Actual Problem
pvars[0] = pVars;
CComVariant does not handle arrays of any kind. But since it derives directly from the VARIANT structure it is easy to modify the code to do the right thing for us. We used VARIANTs earlier so you may try it yourselves first.
//pvars[0] = pVars;
pvars[0].vt = VT_ARRAY | VT_BYREF | VT_RECORD;
pvars[0].pparray = pVars;
To pass any kind of array with a VARIANT you just have to define the VT_Type of the array, or'd with the VT_ARRAY type. The only difference from our previous example is that here we use the VT_BYREF argument as well. This is necessary since we have a pointer to pointer argument. Of course byref in VB means we use the "pparray" member of the variant union. For an array holding strings it would be
pvars[0].vt = VT_ARRAY | VT_BSTR; //array to strings
pvars[0].parray = ...
pvars[0].vt = VT_ARRAY | VT_BYREF | VT_BSTR; //pointer to array to strings
pvars[0].pparray = ...
Again, although we deal with an array holding UDT structures we do not have to set an IRecordInfo interface inside the variant.
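The flag combinations above follow one simple rule, which this small portable sketch spells out. The numeric constants are the VT_BSTR, VT_RECORD, VT_ARRAY, VT_BYREF and VT_TYPEMASK values from the VARENUM enumeration, copied here so the code builds without the Windows headers; `VtInfo`, `UnionMember` and `DescribeVt` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the VARENUM enumeration in the Windows SDK headers.
const std::uint16_t kVtBstr     = 8;
const std::uint16_t kVtRecord   = 36;
const std::uint16_t kVtArray    = 0x2000;
const std::uint16_t kVtByRef    = 0x4000;
const std::uint16_t kVtTypeMask = 0x0FFF;

// Which kind of member of the variant union the flags select.
enum class UnionMember { Scalar, ByRefScalar, PArray, PPArray };

struct VtInfo {
    std::uint16_t baseType;   // element type, e.g. kVtBstr or kVtRecord
    UnionMember   member;
};

// Splits a vt value into its element type and the union member to use.
VtInfo DescribeVt(std::uint16_t vt) {
    VtInfo info;
    info.baseType = static_cast<std::uint16_t>(vt & kVtTypeMask);
    const bool isArray = (vt & kVtArray) != 0;
    const bool isByRef = (vt & kVtByRef) != 0;
    if (isArray && isByRef)  info.member = UnionMember::PPArray;  // SAFEARRAY**
    else if (isArray)        info.member = UnionMember::PArray;   // SAFEARRAY*
    else if (isByRef)        info.member = UnionMember::ByRefScalar;
    else                     info.member = UnionMember::Scalar;
    return info;
}
```

So the value used in Fire_ChangedVars, the array, by-reference and record flags or'd together, decodes to "records, through pparray", and the plain string array decodes to "BSTR, through parray".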
Compile the project and try this out. Do not fear: unless you change the IDL file of the project, the generated code does not change. This is the reason we first define all methods in the event (sink) interface and only then implement the connection point interface in our object.
Final Note
As most of you may have noticed this has been written quite some time ago. The reason it is posted at this moment is that I had to use user defined structures (UDTs) for a demo project I work on, and this article was really helpful during its implementation. So I hope it is worth reading and helpful to the developer community as well.
References:
MSDN Library:
Platform SDK / Component Services / COM / Automation / User Defined Data Types.
Extending Visual Basic with C++ DLLs, by Bruce McKinney. April 1996.
MSJ magazine:
Q&A ActiveX / COM, by Don Box. MSJ, November 1996.
Understanding Interface Definition Language: A Developer's Survival Guide, by Bill Hludzinski. MSJ, August 1998.
Books:
Beginning ATL COM Programming, by Richard Grimes, George Reilly, Alex Stockton, Julian Templeman. Wrox Press. ISBN 1861000111.
Professional ATL COM Programming, by Richard Grimes. Wrox Press. ISBN 1861001401.
ioannhs_s
1. Why use COM
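Software engineering has moved from structured programming to object-oriented programming and now to COM programming, always with one goal: software should be stacked together like building blocks, assembled rather than written piece by piece. Structured programming works in terms of function blocks. A program is divided into many modules, each with its own job, aiming for high cohesion and low coupling. That was already a good start: different modules can be given to different people and then combined, which is already a form of assembly. The core of software engineering is modularity, and the ideal is 100% cohesion and 0% coupling. All of software development has been heading in that direction, and structured programming was only the beginning.
Object-oriented programming came next, a huge step forward from function-oriented structured programming. The natural world is made up of many different things, connected by countless intricate relationships, and it is these connections and interactions that make the world alive and active. We can regard a thing, as a concept, as stable and unchanging, while the relationships between things vary and move. Things are the essence of this world. Object orientation focuses on things, on that stable concept. Every thing has its inherent properties and its inherent behavior, and the object-oriented method describes exactly these stable parts. The function-oriented modular method focuses instead on the relationships between things. It does not see things at all; it sees only functions. When we split a program into modules, do we ask which objects a function relates to? Few people do. A function implements some capability, and that capability is necessarily tied to certain things, yet we consider only how the things interact to accomplish it and never grasp the things themselves. Frankly, this puts the cart before the horse and chases quick results, not because we lack the intelligence but because we did not think one step further. The function-oriented structured method attends only to relationships, and relationships change. The things themselves may not change much, but the relationships very likely will, and once they change you have a different world and a different function. With the object-oriented method we can meet every change with what stays constant: once the things are described as classes, all we change is how those classes are connected, reusing our class library. The procedure-oriented method constructs an unstable world, so even a small change may force the whole system to change.
Object orientation still has a problem, however, and it lies in the way reuse works. Building software like building blocks rests on having many kinds of reusable parts and modules. Class libraries come to mind first, because classes are the direct product of the object-oriented method. But class libraries are reused at source level, and that is their major flaw. First, this ties you to one programming language: your class library is written in one language, so you cannot use it from another. Second, you must recompile every time, because only compilation joins the library with your own code to produce an executable. During development this hardly matters. The trouble comes after development, when your EXE has already been built. If the library vendor then tells you they have a new version that is more powerful and faster, and you are tempted to use it in your own program, you must recompile and debug again! That is still some distance from our ideal of building-block construction, where taking one module out and putting a new one in should be easy. Here you not only have to recompile but also take a big risk, because you may have to change your own code.
The other kind of reuse that naturally comes to mind is the DLL. DLLs are everywhere in Windows and are its foundation, but they have shortcomings too, at least four:
(1) Name clashes. A DLL contains individual functions that we call by name, so what happens when two DLLs contain functions with the same name?
(2) Incompatible C++ name decoration across compilers. For a C++ function the compiler generates a decorated name from the function's parameter information, and that decorated name is what the DLL stores. Different compilers decorate names differently, so a DLL written in VC may not be usable in BC. You can use extern "C" to insist on standard C function behavior and turn decoration off, but that also gives up C++ overloading and polymorphism.
(3) Paths. Put the DLL in your own directory and other programs cannot find it; put it in the system directory and names may clash. A real component should be usable from any location, even from another machine, without the user ever having to think about it.
(4) Dependency between DLL and EXE. We normally use implicit linking, which means naming the DLL at build time. This is simple, but it binds the EXE and the DLL together at compile time. If a new version of the DLL is released we may well have to relink, because the addresses of the functions inside the DLL may have changed.
The shortcomings of the DLL are the strengths of COM. First, keep one thing in mind: COM, like the DLL, is reuse of binary code, so it does not have the problems of class library reuse. The other key point is that a COM component is itself a DLL; even an ActiveX control (.ocx) is in fact a DLL. So the DLL still has a great advantage for reuse. It is just that, through the elaborate COM protocol and COM's own mechanisms, we change the way reuse is done and use DLLs in a new way that overcomes their inherent defects and reaches a higher level of reuse. COM has no name-clash problem, because functions are not called by name at all but through the virtual function table, so name decoration is not an issue either. The path problem disappears as well, because components are found through the registry; they can be anywhere, even on another machine. Nor is there any dependency on the EXE to worry about: the two are loosely coupled, and a new version of a component can be dropped in without the application noticing.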
II. The COM Theory You Must Master to Program COM in VC
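I have seen many people study COM. After finishing a book they feel they understand its principles fairly well and that COM is nothing special, yet they still have no idea how to write a program with it. I was the same and went through this stage myself. For the basic principles of COM I recommend the book 《COM技术内幕》 (Inside COM). But reading such a book is far from enough. Our ultimate goal is to learn how to program with COM, not to study COM's internal mechanisms exhaustively. So I personally feel there is no need to spend a great deal of time getting to the bottom of COM's principles; it is hard work with little reward. We only need to master a few key concepts. Here I list the key concepts that I consider essential for programming in VC. (Everything here refers to COM programming in C++.)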
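(1) A COM component is actually a C++ class, and interfaces are pure abstract classes. A component derives from its interfaces. We can describe what COM is in plain C++ syntax: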
class IObject
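Is that clear? IObject is what we usually call an interface, and MyObject is the so-called COM component. Always remember that an interface is a pure abstract class: every function it contains is pure virtual, and it has no member variables. A COM component is a derived class that inherits from these pure abstract classes and implements their virtual functions, nothing more. The above also shows that COM components build on C++, and the concepts of virtual functions and polymorphism are especially important. Every function in COM is a virtual function and must be called through the virtual function table (vtable). This point is extremely important and must be kept in mind at all times. To show exactly what a vtable looks like, the following diagram is copied from 《COM+技术内幕》 (Inside COM+):
[Figure: vtable diagram]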
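The relationship between IObject and MyObject can be sketched in portable C++ as follows. This is only an illustration: the member names Function1 and Function2, their signatures, and the virtual destructor are assumptions made so the sketch compiles on its own, and they are not part of any real COM interface.

```cpp
#include <cassert>
#include <string>

// Hypothetical interface: pure virtual functions only, no data members.
class IObject {
public:
    virtual int Function1(int x) = 0;
    virtual std::string Function2() = 0;
    // Added only so this plain C++ sketch can be destroyed safely;
    // real COM interfaces manage lifetime through IUnknown::Release.
    virtual ~IObject() = default;
};

// Hypothetical component: inherits the interface and implements its functions.
class MyObject : public IObject {
public:
    int Function1(int x) override { return x * 2; }
    std::string Function2() override { return "MyObject"; }
};

// The client holds only an interface pointer; every call goes through the vtable.
int RunExample() {
    MyObject component;
    IObject* iface = &component;
    assert(iface->Function2() == "MyObject");
    return iface->Function1(21);
}
```

The client code in RunExample never names a MyObject member directly; it only uses the IObject pointer, which is the property the text stresses.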
(2) COM has three most basic interface classes: IUnknown, IClassFactory and IDispatch.
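The COM specification requires every component and every interface to derive from IUnknown. IUnknown contains three functions: QueryInterface, AddRef and Release. These three functions are extremely important, and their order must not change. QueryInterface queries for the other interfaces a component implements; put simply, it checks which other interface classes are among the component's base classes. AddRef increments the reference count and Release decrements it.
Reference counting is another very important COM concept. Roughly speaking, it works like this. A COM component is a DLL, and when a client program wants to use it, it must be loaded into memory. On the other hand, a component is not meant for one client only; many programs may use it at the same time. Yet the DLL is loaded only once, so there is only one copy of the component in memory. Who, then, releases the component? The client? Impossible: if you released the component, how could anyone else use it? So the component itself must take responsibility. Hence reference counting: the component keeps a count of how many clients are currently using it. Each additional user adds one, each client that finishes subtracts one, and when the last client releases it, the component knows that nobody is using it any more and frees itself.
Reference counting is a very error-prone part of COM programming, but fortunately VC's various class libraries have mostly hidden the AddRef calls. As far as I recall, I have never called AddRef in my own code; we only need to call Release at the right moments. There are at least two occasions on which you must remember to call Release: after calling QueryInterface, and after calling any function that returns an interface pointer. Check MSDN often to find out whether a function calls AddRef internally; if it does, the responsibility for calling Release falls to you. The implementation of these three IUnknown functions is very standardized but also tedious and error-prone. Fortunately, we may never need to implement them ourselves.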
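The reference-counting rules above can be illustrated with a small mock. This is a sketch in portable C++, not the real IUnknown: the type names IUnknownSketch, IHello and HelloComponent are hypothetical, plain strings stand in for 128-bit interface IDs, and QueryInterface returns bool instead of HRESULT.

```cpp
#include <cassert>
#include <cstring>

// Mock of IUnknown; string names stand in for real interface IDs (assumption).
struct IUnknownSketch {
    virtual bool QueryInterface(const char* iid, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
    virtual ~IUnknownSketch() = default;  // sketch only; real COM has no destructor here
};

struct IHello : IUnknownSketch {
    virtual int Hello() = 0;
};

static int g_liveObjects = 0;  // lets the example observe self-destruction

class HelloComponent final : public IHello {
    unsigned long refs_ = 1;  // the creator holds the first reference
public:
    HelloComponent() { ++g_liveObjects; }
    ~HelloComponent() override { --g_liveObjects; }
    bool QueryInterface(const char* iid, void** out) override {
        if (std::strcmp(iid, "IUnknown") == 0 || std::strcmp(iid, "IHello") == 0) {
            *out = static_cast<IHello*>(this);
            AddRef();  // every pointer handed out counts as one more user
            return true;
        }
        *out = nullptr;
        return false;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long left = --refs_;
        if (left == 0) delete this;  // last user gone: the component frees itself
        return left;
    }
    int Hello() override { return 42; }
};

void RunExample() {
    IUnknownSketch* unk = new HelloComponent;     // count = 1
    void* raw = nullptr;
    bool ok = unk->QueryInterface("IHello", &raw);  // count = 2
    assert(ok);
    IHello* hello = static_cast<IHello*>(raw);
    assert(hello->Hello() == 42);
    hello->Release();                             // count = 1, object still alive
    assert(g_liveObjects == 1);
    unk->Release();                               // count = 0, object deletes itself
    assert(g_liveObjects == 0);
}
```

Note how the caller releases the pointer obtained from QueryInterface, which matches the first of the two "remember to call Release" occasions named in the text.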
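IClassFactory exists to create COM components. We already know that a COM component is really just a class, so how do we normally instantiate a class? With new! Simple enough, and the same goes for a COM component. But who calls new? It cannot be the client, because the client cannot know the component's class name; if it did, the component's reusability would be greatly reduced. In fact the client knows only a 128-bit number that represents the component, which will be introduced shortly. So the client cannot create the component itself. Consider, too, that if the component is on a remote machine, you could hardly new an object there. So the responsibility for creating the component is handed to a separate object, the class factory. Every component must have an associated class factory that knows how to create it. When a client requests an instance of a component, the request actually goes to the class factory, which creates the instance and hands the instance pointer to the client. This is especially useful when components are created across processes or remotely, because then a simple new is not enough: marshaling is required, and these complex operations are left to the class factory object. The most important function of IClassFactory is CreateInstance, which, as the name suggests, creates a component instance. Normally we do not call it directly, because API functions wrap it for us; only in special cases do we call it ourselves. This is also an advantage of writing COM components in VC: it gives us more control, whereas VB gives us far fewer such opportunities.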
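The idea that only the class factory knows the concrete class can be sketched in portable C++. The names IShape, IShapeFactory, Circle and CircleFactory are hypothetical, and std::unique_ptr is used for ownership in place of COM reference counting, so this shows the division of knowledge rather than the real IClassFactory signature.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Interface known to both client and server.
struct IShape {
    virtual std::string Name() const = 0;
    virtual ~IShape() = default;
};

// Simplified stand-in for IClassFactory (assumption: no aggregation, no IIDs).
struct IShapeFactory {
    virtual std::unique_ptr<IShape> CreateInstance() = 0;
    virtual ~IShapeFactory() = default;
};

namespace server {
// Concrete class: client code never needs to see this name.
class Circle : public IShape {
public:
    std::string Name() const override { return "circle"; }
};

class CircleFactory : public IShapeFactory {
public:
    std::unique_ptr<IShape> CreateInstance() override {
        return std::unique_ptr<IShape>(new Circle);  // the only place that calls new
    }
};
}  // namespace server

// Client code: depends only on the two interfaces.
std::string ClientUse(IShapeFactory& factory) {
    std::unique_ptr<IShape> shape = factory.CreateInstance();
    assert(shape != nullptr);
    return shape->Name();
}
```

ClientUse compiles without any reference to Circle, which is the reusability argument the paragraph makes.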
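IDispatch is called the dispatch interface. What is it for? Besides C++ there are many other languages in the world, such as VB, VJ, VBScript and JavaScript. You could say that if there were not so many assorted languages, there would be no IDispatch. :-) We know that a COM component is a C++ class whose functions are called through the vtable. For VC this is no problem at all, since the design was aimed at C++ in the first place. VB used not to be able to do this, but now VB can use pointers and call functions through the vtable, and so can VJ. Some languages still cannot, namely scripting languages, typically VBScript and JavaScript. The reason is that they do not support pointers, and without pointers they cannot use polymorphism or call these virtual functions. Still, scripting languages cannot simply be ignored: they are what web pages use today, and distributed applications are a major market for COM components, so components have to be callable from scripts. Since the vtable route is not available, another way had to be found, and IDispatch was born. :-) A dispatch interface assigns a number to every function and every property. When a client wants to call one of them, it passes the number to the IDispatch interface, which then calls the corresponding function based on that number, and that is all. Of course the real process is far more complicated. It would be a fantasy to expect a bare number to tell anyone how to call a function: the caller has to know what parameters the function takes, what their types are, and what it returns, and handling all this in a uniform way is a real headache. The main function of IDispatch is Invoke. Clients call it, and Invoke then calls the corresponding function. Look at the code that implements Invoke in Microsoft's class libraries and you will be amazed at its complexity, because it must handle every kind of parameter type. Fortunately we do not have to do this ourselves, and may never get the chance. :-)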
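The "number every function, then call by number" idea can be sketched as follows. This is a deliberately reduced model, not the real IDispatch: CalculatorDispatch, GetIdOfName and the use of a vector of long for arguments are assumptions (the real interface passes VARIANT arguments and many more parameters).

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

class CalculatorDispatch {
    // Each callable member gets a number (hypothetical ids).
    std::map<std::string, int> ids_{{"Add", 1}, {"Negate", 2}};
public:
    // Resolves a member name to its number at run time; -1 if unknown.
    int GetIdOfName(const std::string& name) const {
        auto it = ids_.find(name);
        return it == ids_.end() ? -1 : it->second;
    }

    // Single entry point: selects the function by number and checks arguments.
    long Invoke(int dispId, const std::vector<long>& args) {
        switch (dispId) {
        case 1:
            if (args.size() != 2) throw std::invalid_argument("Add expects 2 arguments");
            return args[0] + args[1];
        case 2:
            if (args.size() != 1) throw std::invalid_argument("Negate expects 1 argument");
            return -args[0];
        default:
            throw std::out_of_range("unknown dispatch id");
        }
    }
};

// A script-like caller: it knows only a name, no pointers to member functions.
long RunExample() {
    CalculatorDispatch calc;
    int id = calc.GetIdOfName("Add");
    assert(id == 1);
    return calc.Invoke(id, {2, 3});
}
```

Even in this toy version, Invoke has to validate the argument list for each member, which hints at why the text calls a full implementation complicated.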
(3) dispinterface, dual and custom interfaces
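This subsection may seem out of place here, since these are terms used in ATL programming. My main aim is to discuss the advantages and drawbacks of automation interfaces, and these three terms may explain them better; you will run into them sooner or later anyway. I will explain them informally, perhaps not with full precision, much as pseudocode is used to describe an algorithm. :-)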
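An automation interface is an interface implemented with IDispatch. We have already explained what IDispatch does. Its advantage is that scripting languages such as VBScript and JavaScript can also use COM components, which makes components essentially language independent. It has two main drawbacks. The first is that it is slow and inefficient. This is obvious: a function can be called directly through the vtable, whereas going through Invoke adds an intermediate step, and in particular the function arguments must be converted to a standard format before the function is called, which costs a lot of time. So unless we are forced to do otherwise, we prefer vtable calls for efficiency. The second drawback is that only the prescribed automation data types can be used. Without IDispatch we can use whatever data types we like, and VC automatically generates the corresponding marshaling code. With an automation interface this is not possible, because the Invoke implementation is written in advance by VC and cannot foresee every type we might use. It can only handle a set of common data types, and it must also consider conversions between the data types of different languages. So the marshaling code that VC generates for automation interfaces applies only to the data types it prescribes. These types are rich enough for most purposes, but they cannot meet the needs of user-defined data structures. You can write your own marshaling code to handle your custom data structures, but that is no easy task.
Given the various drawbacks of IDispatch (it has one more: it is cumbersome to use :-) ), the usual recommendation today is to write dual-interface components. A dual interface is simply an interface derived from IDispatch. We know that every interface must derive from IUnknown, and IDispatch is no exception. An interface derived from IDispatch therefore effectively has two base classes, IUnknown and IDispatch, so the component can be called in two ways: through the vtable, as with any IUnknown-derived interface, or through automation dispatch via IDispatch::Invoke. This gives great flexibility: the component can be used both from C++ and from scripting languages, satisfying the needs of all sides.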
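By contrast, a dispinterface is a pure automation interface. It can simply be regarded as the IDispatch interface (although strictly speaking it is not), and it can only be called through automation. COM component events generally use this form of interface.
A custom interface is a class derived from IUnknown; obviously it can only be called through the vtable.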
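The difference between the three kinds can be seen in a small sketch of a dual interface: the same object is reachable both through ordinary virtual calls and through a numbered Invoke. The names IDispatchSketch, ICounter and Counter, and the dispatch ids 1 and 2, are hypothetical; the real IDispatch has a much richer signature.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Reduced stand-in for IDispatch (assumption: long arguments only).
struct IDispatchSketch {
    virtual long Invoke(int dispId, const std::vector<long>& args) = 0;
    virtual ~IDispatchSketch() = default;
};

// Dual interface: derives from the dispatch interface and adds vtable methods.
struct ICounter : IDispatchSketch {
    virtual long Increment(long by) = 0;  // also reachable as dispatch id 1
    virtual long Value() = 0;             // also reachable as dispatch id 2
};

class Counter : public ICounter {
    long value_ = 0;
public:
    long Increment(long by) override { value_ += by; return value_; }
    long Value() override { return value_; }
    // The automation path simply forwards to the vtable methods.
    long Invoke(int dispId, const std::vector<long>& args) override {
        switch (dispId) {
        case 1: return Increment(args.at(0));
        case 2: return Value();
        default: throw std::out_of_range("unknown dispatch id");
        }
    }
};

long RunExample() {
    Counter counter;
    ICounter* fast = &counter;         // C++ client: direct vtable calls
    IDispatchSketch* late = &counter;  // script-like client: calls by number
    fast->Increment(5);
    late->Invoke(1, {10});
    assert(fast->Value() == late->Invoke(2, {}));
    return fast->Value();
}
```

In these terms, a custom interface would offer only the ICounter-style methods, and a dispinterface would offer only the Invoke path.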
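(4) COM components come in three kinds: in-process, local and remote. For the latter two, interface pointers and function parameters must be marshaled.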
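COM is a DLL, and it has three modes of operation. It can be in-process, that is, in the same process as the caller; it can be on the same machine as the caller but in a different process; or it can be on a different machine altogether. One fundamental point to remember is that a COM component is only a DLL. It cannot run by itself; some process must look after it like a parent. In other words, a COM component must live inside a process. So who plays the guardian? First, a word about marshaling. Marshaling is a complex subject that I do not know well enough to explain fully, so I will only discuss a few of the most basic concepts. In a Win32 program every process has 4 GB of virtual address space and its own addressing, so the same block of data is very likely to have different addresses in different processes. Addresses therefore have to be translated between processes, and this is the marshaling problem. For local and remote components, the DLL and the client program are in different address spaces, so passing an interface pointer to the client requires marshaling. Windows already provides ready-made marshaling functions, so we do not need to do this complicated work ourselves. For remote components, passing function parameters is another kind of marshaling. DCOM is based on RPC. To pass data across a network you must follow a standard network data transfer protocol: the data is packed before it is sent and unpacked when it arrives. This process is marshaling. It is complex, but Windows has already done everything for us, and normally we do not need to write our own marshaling DLL.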
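The pack, send and unpack cycle described above can be sketched with a toy proxy and stub. This is an illustration only: the function names, the little-endian length-prefixed layout, and the direct function call standing in for the transport are all assumptions. Real DCOM uses the NDR wire format over RPC, which this sketch does not reproduce.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

using Buffer = std::vector<std::uint8_t>;

// Packs a 32-bit value as four little-endian bytes (layout is an assumption).
void PackU32(Buffer& buf, std::uint32_t v) {
    for (int i = 0; i < 4; ++i)
        buf.push_back(static_cast<std::uint8_t>((v >> (8 * i)) & 0xFF));
}

// Packs a string as its length followed by its bytes: the content, not the address.
void PackString(Buffer& buf, const std::string& s) {
    PackU32(buf, static_cast<std::uint32_t>(s.size()));
    buf.insert(buf.end(), s.begin(), s.end());
}

std::uint32_t UnpackU32(const Buffer& buf, std::size_t& pos) {
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
        v |= static_cast<std::uint32_t>(buf.at(pos++)) << (8 * i);
    return v;
}

std::string UnpackString(const Buffer& buf, std::size_t& pos) {
    std::uint32_t n = UnpackU32(buf, pos);
    if (n > buf.size() - pos) throw std::out_of_range("truncated buffer");
    std::string s(reinterpret_cast<const char*>(buf.data()) + pos, n);
    pos += n;
    return s;
}

// The real function, which would live in another address space.
std::uint32_t ServerCountChars(const std::string& text) {
    return static_cast<std::uint32_t>(text.size());
}

// Stub: unpacks the request, calls the real function, packs the reply.
Buffer StubHandle(const Buffer& request) {
    std::size_t pos = 0;
    std::string text = UnpackString(request, pos);
    Buffer reply;
    PackU32(reply, ServerCountChars(text));
    return reply;
}

// Proxy: looks like the server function to the client, but sends bytes.
std::uint32_t ProxyCountChars(const std::string& text) {
    Buffer request;
    PackString(request, text);
    Buffer reply = StubHandle(request);  // stands in for the RPC transport
    std::size_t pos = 0;
    std::uint32_t result = UnpackU32(reply, pos);
    assert(pos == reply.size());
    return result;
}
```

The client calls ProxyCountChars exactly as it would call the server function, which is why the text can say that we rarely notice marshaling when programming.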
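We just said that a COM component must live inside a process. A local-mode component usually takes the form of an EXE, so it is already a process itself. For a remote DLL we must find a process, and that process must contain marshaling code to implement basic marshaling. That process is dllhost.exe, COM's default DLL surrogate. In fact, in distributed applications we should use MTS as the DLL surrogate, because MTS is very powerful and is a tool designed specifically for managing distributed DLL components.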
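Marshaling is close at hand yet seems far away; we seldom pay attention to it when programming. This is one of COM's strengths, namely location independence: whether the component is remote, local or in-process, the programming is the same, and COM takes care of all the details. So we need not dig deep; a general idea is enough. Of course, if you have special marshaling requirements, you will need to understand the whole process in depth. For that I recommend 《COM+技术内幕》 (Inside COM+), which is definitely a good book on marshaling.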
(5) The core of a COM component is IDL.
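We want software to be assembled piece by piece, but not haphazardly and without rules; certain standards must be followed. For modules to cooperate seamlessly, the rules for their interaction must be agreed in advance, and that specification is the interface. We know that an interface is really a pure abstract class that declares a number of pure virtual functions waiting for some component to implement them. The interface is the key that allows two completely unrelated modules to be combined. Suppose we are an application vendor whose software needs a certain module, and we have no time to develop it ourselves, so we look for one on the market. How do we look? Perhaps the module we need already has an industry standard: someone has defined a standard interface, and many component vendors have implemented it in their components. Then what we look for is components that implement that interface. We do not care where a component comes from or what other features it has; we care only whether it implements our agreed interface well. Such an interface may be an industry standard, or it may be just an agreement drawn up between you and a few vendors, but either way it is a standard. It is the basis on which your software and other people's modules can be combined, and it is the standard by which COM components communicate.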
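COM is language independent: a component can be written in any language and called from any language platform. Yet so far we have discussed COM only in a C++ setting, so where does its language independence show? Put another way, how can we define interfaces in a language-neutral way? Earlier we defined them directly as pure abstract classes, but that clearly will not do, since nothing besides C++ understands them. For this reason Microsoft decided to use IDL to define interfaces. Put simply, IDL is a language that everyone understands; an interface defined in it is recognized on every language platform. Imagine the ideal, standard component workflow. We always start from IDL, first defining every interface in IDL, and then assign the implementation of the interfaces to different people. Some may be good at VC and others at VB; it does not matter. As project leader I do not care about that; I care only that you hand me the final DLL. What a fine development model this is: you can develop in any language, and the results can be enjoyed from any language.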
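(6) How a COM component runs, that is, how COM gets going.
In this part we build a minimal skeleton for creating a COM component and then look at how it is processed internally: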
IUnknown *pUnk=NULL;
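This is a typical skeleton for creating a COM component. What interests me, though, is CoCreateInstance, so let us see what it does internally. The following is pseudocode for its internal implementation: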
CoCreateInstance(....)
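This means: first obtain the class factory object, then create the component through the class factory and so obtain the IUnknown pointer. Going one step deeper, here is the internal pseudocode of CoGetClassObject: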
CoGetClassObject(.....)
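The two-step flow described above (obtain the class factory, then ask it to create the component) can be sketched in portable C++. Everything here is a stand-in: the Sketch functions, the string CLSIDs and the in-memory map that plays the role of the registry and of the DLL's exported entry point are assumptions, not the real CoCreateInstance or CoGetClassObject.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

struct IComponent {
    virtual std::string Describe() const = 0;
    virtual ~IComponent() = default;
};

struct IFactory {
    virtual std::unique_ptr<IComponent> CreateInstance() = 0;
    virtual ~IFactory() = default;
};

class Hello : public IComponent {
public:
    std::string Describe() const override { return "hello component"; }
};

class HelloFactory : public IFactory {
public:
    std::unique_ptr<IComponent> CreateInstance() override {
        return std::unique_ptr<IComponent>(new Hello);
    }
};

// Stand-in for "registry lookup + load DLL + call its class-object export".
using GetClassObjectFn = std::function<std::unique_ptr<IFactory>()>;

std::map<std::string, GetClassObjectFn>& Registry() {
    static std::map<std::string, GetClassObjectFn> table;
    return table;
}

// Step 1: find the server for this CLSID and obtain its class factory.
std::unique_ptr<IFactory> GetClassObjectSketch(const std::string& clsid) {
    auto it = Registry().find(clsid);
    if (it == Registry().end()) return nullptr;
    return it->second();
}

// Step 2: ask the factory for an instance; the factory is dropped afterwards.
std::unique_ptr<IComponent> CreateInstanceSketch(const std::string& clsid) {
    std::unique_ptr<IFactory> factory = GetClassObjectSketch(clsid);
    if (!factory) return nullptr;
    return factory->CreateInstance();
}

std::string RunExample() {
    Registry()["{hypothetical-clsid}"] = [] {
        return std::unique_ptr<IFactory>(new HelloFactory);
    };
    assert(CreateInstanceSketch("{unregistered}") == nullptr);
    std::unique_ptr<IComponent> obj = CreateInstanceSketch("{hypothetical-clsid}");
    return obj ? obj->Describe() : "";
}
```

The client in RunExample supplies only an identifier and receives an interface pointer, never touching HelloFactory or Hello by name.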
The figure below, copied from 《COM+技术内幕》 (Inside COM+), clearly shows the whole flow of CoCreateInstance.
[Figure: CoCreateInstance flow]
其实
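(7) The four functions that a typical self-registering COM DLL must have
DllGetClassObject: obtains the class factory pointer.
DllRegisterServer: writes the necessary information to the registry.
DllUnregisterServer: removes the registration information.
DllCanUnloadNow: called by the system when it is idle to determine whether the DLL can be unloaded.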
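A DLL has one more function, DllMain. COM does not require it to be implemented, but components generated by VC always include it automatically. Its main purpose is to obtain a global instance object.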
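How a DLL might decide its answer to DllCanUnloadNow can be sketched with two counters. This is a common pattern shown in portable C++ under assumptions: the names with the Sketch suffix and TrackedComponent are hypothetical, and the function returns bool, whereas the real export returns S_OK or S_FALSE.

```cpp
#include <atomic>
#include <cassert>

std::atomic<long> g_objectCount{0};  // live component instances
std::atomic<long> g_lockCount{0};    // outstanding server locks

// Any component class would bump the object count for its lifetime.
class TrackedComponent {
public:
    TrackedComponent() { ++g_objectCount; }
    ~TrackedComponent() { --g_objectCount; }
    TrackedComponent(const TrackedComponent&) = delete;
    TrackedComponent& operator=(const TrackedComponent&) = delete;
};

// Counterpart of a class factory's lock request (simplified).
void LockServerSketch(bool lock) {
    if (lock) ++g_lockCount; else --g_lockCount;
}

// The DLL may be unloaded only when nothing is alive and nothing is locked.
bool CanUnloadNowSketch() {
    return g_objectCount.load() == 0 && g_lockCount.load() == 0;
}

bool RunExample() {
    assert(CanUnloadNowSketch());
    {
        TrackedComponent c;
        assert(!CanUnloadNowSketch());  // an instance is still in use
    }
    LockServerSketch(true);
    bool whileLocked = CanUnloadNowSketch();
    LockServerSketch(false);
    return !whileLocked && CanUnloadNowSketch();
}
```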
(8) The important role of the registry in COM
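First, the concept of the GUID. Every class, interface and type library in COM is uniquely identified by a GUID. A GUID is a 128-bit value, and one generated by the dedicated algorithm is, for practical purposes, unique worldwide. Creating a COM component and querying its interfaces both go through the registry. With the registry, an application does not need to know the component's DLL file name or location; it only needs to look the component up by CLSID. When a version is upgraded, changing the registry information is enough to switch to the new DLL without anyone noticing.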
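The lookup by CLSID can be sketched with an in-memory table. This is a model only: Guid, GuidLess and RegistrySketch are hypothetical types, the byte values and file paths are made up, and the real registry stores this mapping under HKEY_CLASSES_ROOT\CLSID rather than in a C++ map.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <map>
#include <string>

// A GUID is 16 bytes, i.e. 128 bits.
struct Guid {
    std::uint8_t bytes[16];
};
static_assert(sizeof(Guid) == 16, "a GUID is 128 bits");

// Ordering so that Guid can be used as a map key.
struct GuidLess {
    bool operator()(const Guid& a, const Guid& b) const {
        return std::memcmp(a.bytes, b.bytes, sizeof a.bytes) < 0;
    }
};

// In-memory stand-in for the CLSID-to-server mapping kept in the registry.
class RegistrySketch {
    std::map<Guid, std::string, GuidLess> serverPath_;
public:
    void Register(const Guid& clsid, const std::string& dllPath) {
        serverPath_[clsid] = dllPath;  // re-registering replaces the old path
    }
    bool Lookup(const Guid& clsid, std::string& dllPath) const {
        auto it = serverPath_.find(clsid);
        if (it == serverPath_.end()) return false;
        dllPath = it->second;
        return true;
    }
};

std::string RunExample() {
    const Guid clsid = {{0x12, 0x34, 0x56}};  // made-up value, remaining bytes zero
    RegistrySketch registry;
    registry.Register(clsid, "C:\\components\\v1\\hello.dll");
    registry.Register(clsid, "C:\\components\\v2\\hello.dll");  // version upgrade
    std::string path;
    bool found = registry.Lookup(clsid, path);
    assert(found);
    return path;
}
```

The client's CLSID stays the same across the upgrade; only the registered path changes, which is the point the paragraph makes about switching versions.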
III. Attributes
(1) Pointer attributes
[unique]:
· Can have the value NULL.
· Can change during a call from NULL to non-NULL, from non-NULL to NULL, or from one non-NULL value to another.
· Can allocate new memory on the client. When the unique pointer changes from NULL to non-NULL, data returned from the server is written into new storage.
· Can use existing memory on the client without allocating new memory. When a unique pointer changes during a call from one non-NULL value to another, the pointer is assumed to point to a data object of the same type. Data returned from the server is written into existing storage specified by the value of the unique pointer before the call.
· Can orphan memory on the client. Memory referenced by a non-NULL unique pointer may never be freed if the unique pointer changes to NULL during a call and the client does not have another means of dereferencing the storage.
· Does not cause aliasing. Like storage pointed to by a reference pointer, storage pointed to by a unique pointer cannot be reached from any other name in the function.
The following restrictions apply to unique pointers:
· The [unique] attribute cannot be applied to binding-handle parameters (handle_t) and context-handle parameters.
· The [unique] attribute cannot be applied to [out]-only top-level pointer parameters (parameters that have only the [out] directional attribute).
· By default, top-level pointers in parameter lists are [ref] pointers. This is true even if the interface specifies pointer_default(unique). Top-level parameters in parameter lists must be specified with the [unique] attribute to be a unique pointer.
· Unique pointers cannot be used to describe the size of an array or union arm because unique pointers can have the value NULL. This restriction prevents the error that results if a NULL value is used as the array size or the union-arm size.
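The client-side behavior listed above for a [unique] pointer can be modeled in a few lines. This is a simplified sketch, not generated stub code: UnmarshalUnique is a hypothetical helper, it handles only int, and it allocates with new, whereas real MIDL-generated stubs use the RPC memory allocation routines.

```cpp
#include <cassert>

// Applies the value returned by the server to a client-side unique pointer.
// serverValue == nullptr models the server setting the pointer to NULL.
// Returns true when new client memory was allocated.
bool UnmarshalUnique(int*& clientPtr, const int* serverValue) {
    if (serverValue == nullptr) {
        // non-NULL -> NULL: the old storage is orphaned unless the caller
        // kept another way to reach it.
        clientPtr = nullptr;
        return false;
    }
    if (clientPtr == nullptr) {
        // NULL -> non-NULL: returned data is written into new storage.
        clientPtr = new int(*serverValue);
        return true;
    }
    // non-NULL -> non-NULL: returned data is written into existing storage.
    *clientPtr = *serverValue;
    return false;
}

int RunExample() {
    int existing = 1, fromServer = 7;
    int* p = &existing;
    bool allocated = UnmarshalUnique(p, &fromServer);
    assert(!allocated && p == &existing && existing == 7);

    int* q = nullptr;
    allocated = UnmarshalUnique(q, &fromServer);
    assert(allocated && q != nullptr);
    int result = *q;
    delete q;

    UnmarshalUnique(p, nullptr);
    assert(p == nullptr);  // 'existing' is still reachable here by name
    return result;
}
```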
[ref]: comparable to a const pointer in C++, in that the pointer value itself never changes during the call.
A reference pointer has the following characteristics:
· Always points to valid storage; never has the value NULL. A reference pointer can always be dereferenced.
· Never changes during a call. A reference pointer always points to the same storage on the client before and after the call.
· Does not allocate new memory on the client. Data returned from the server is written into existing storage specified by the value of the reference pointer before the call.
· Does not cause aliasing. Storage pointed to by a reference pointer cannot be reached from any other name in the function.
A reference pointer cannot be used as the type of a pointer returned by a function.
If no attribute is specified for a top-level pointer parameter, it is treated as a reference pointer.
[ptr]:
The full pointer designated by the [ptr] attribute approaches the full functionality of the C-language pointer. The full pointer can have the value NULL and can change during the call from NULL to non-NULL. Storage pointed to by full pointers can be reached by other names in the application supporting aliasing and cycles. This functionality requires more overhead during a remote procedure call to identify the data referred to by the pointer, determine whether the value is NULL, and to discover if two pointers point to the same data.
Use full pointers for:
· Remote return values.
· Double pointers, when the size of an output parameter is not known.
· Null pointers.
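The extra work a full pointer requires (noticing NULL and noticing that two pointers refer to the same data) can be sketched as an id assignment pass. This is an illustration of the idea under assumptions: AssignRefIds and its numbering scheme are hypothetical and do not reproduce the actual wire representation.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Assigns a reference id to each pointer: 0 for NULL, and the same id for
// pointers that refer to the same storage (aliases).
std::vector<int> AssignRefIds(const std::vector<const void*>& ptrs) {
    std::map<const void*, int> seen;  // address -> id already handed out
    std::vector<int> ids;
    int next = 1;
    for (const void* p : ptrs) {
        if (p == nullptr) {
            ids.push_back(0);
            continue;
        }
        auto it = seen.find(p);
        if (it == seen.end()) it = seen.emplace(p, next++).first;
        ids.push_back(it->second);
    }
    assert(ids.size() == ptrs.size());
    return ids;
}

std::vector<int> RunExample() {
    int a = 1, b = 2;
    // The first and last pointers alias the same object.
    return AssignRefIds({&a, &b, nullptr, &a});
}
```

With such ids, the receiving side can transmit the pointed-to data once per distinct id and rebuild the aliasing, which is the overhead the text attributes to [ptr].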