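Preface: The secrets below are not really Windows secrets. Many people may already know them, but probably not that many, so by presenting them today I hope I am contributing a little to all the fans on the BCB forum!! I registered on csdn half a year ago, but work kept me away, and only last week did I properly report for duty. Now that I am here I ought to bring a small gift for the occasion, so please take what follows as that gift!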
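1. Why do this?
I have been studying Chinese characters in Windows for quite a while. My real goal is Chinese speech synthesis, specifically syllable-based synthesis. Progress has been slow, and although the main goal is still out of reach, the long search has produced quite a few by-products. A few days ago ALNG(), a very persistent programmer (probably a typical example of programmer persistence), posted a thread on the forum about obtaining the Hanyu Pinyin and the strokes of Chinese characters. The discussion was lively, and many of the methods were new to me, for example counting strokes (I later checked the references and found that counting the strokes of characters in a stroke font really does give the result), so I learned something again. For lack of space I only made a few casual remarks and did not think much of it, but later I could not resist going back to explain that reverse conversion through the IME can yield both pinyin and strokes. I still had more to say, and since I have some time today, let me do something useful for everyone!
I do not know what this knowledge might bring you. It may or may not be useful, but if you want to build your own input method, your own TTS, or your own handwriting-recognition software, or if you want the strokes, stroke count, or Hanyu Pinyin of any Chinese character, I think it should be of some use!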
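2. Starting from Character Map
Start --> Programs --> Accessories --> System Tools --> Character Map
Everyone surely knows this chain of menus, so go ahead and open the program. Look carefully (alas, the forum has nowhere to insert a picture, otherwise a screenshot of Character Map would appear right here... heh, consider that a suggestion!!)
Ignore the "Font" box at the top for now. First set the character set to Unicode (why? Because it is the most complete: every character is in there!), and set the grouping to "Simplified Chinese by PinYin" (pinyin first!). A panel of initials appears with the letters A-Z; select A. Then, in the main window's ComboBox, hover over the "A" in the first row and look at the status bar: it shows "U+0041". Hover over "啊" and the status bar shows "U+554A". Hovering over other characters gives similar results: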
Character  Status bar text
A U+0041(0x41)
啊 U+554A(0xB0A1)
阿 U+963F(0xB0A2)
呵 U+5475(0xBAC7)
....
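The two numbers in each status-bar entry can be reproduced outside Character Map. The following is only a sketch, written in Python for brevity. It assumes that the value in parentheses is the character's code in the Simplified Chinese code page (GBK, code page 936), which is consistent with the rows above; `status_bar_label` is a hypothetical helper name, not a Windows API.

```python
def status_bar_label(ch: str) -> str:
    """Format one character the way the rows above are written."""
    code_point = ord(ch)        # Unicode code point, the U+xxxx part
    legacy = ch.encode("gbk")   # assumption: the 0xyyyy part is the GBK code
    return f"U+{code_point:04X}(0x{legacy.hex().upper()})"


for ch in "A啊阿":
    print(ch, status_bar_label(ch))
```

For an ASCII letter the legacy code is a single byte, while a Chinese character takes two bytes, which is why the value in parentheses has two or four hex digits.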
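Clearly the system holds a table that stores characters together with their pronunciations, and the positions at which they are stored correspond to one another. If we can decipher that correspondence, we have basically reached our goal. So what is this table, where is it stored, where is its entry point, and how do we look things up in it??? A mystery!! Later on we will see the entry point and be able to look up what we want!! Stay tuned, don't go away!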
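Now for something even more curious. Again set the character set to Unicode (for the same reason as before), and this time set the grouping to "Ideographs by Radicals" (now for the strokes!). If "Advanced view" is checked, a box titled "Group By" pops up, listing the basic radicals sorted by stroke count. Select "一", and the main window's ComboBox shows every character whose radical is "一". Look closely: the first row consists entirely of characters formed by adding one stroke to the radical "一", and the n-th row consists entirely of characters formed by adding n strokes to it. In other words, if the radical on this page has m strokes, every character in row i has m+i strokes.
If we could decipher this table, wouldn't we reach our goal as well? Where on earth is the entry point to this table??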
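The row rule above (radical strokes m plus row number i) is easy to write down. The sketch below uses a tiny hand-typed sample for the radical "一"; the data is illustrative only and is not read from the system table, and the names `RADICAL_STROKES`, `ROWS_FOR_YI`, and `stroke_table` are hypothetical.

```python
# Stroke count of the radical itself (m in the text above).
RADICAL_STROKES = {"一": 1}

# Hand-typed sample: row i holds characters with i strokes added to the radical.
ROWS_FOR_YI = {1: "丁", 2: "三", 3: "不"}


def stroke_table(radical: str, rows: dict) -> dict:
    """Map each character to its total stroke count, m + i."""
    m = RADICAL_STROKES[radical]
    return {ch: m + i for i, chars in rows.items() for ch in chars}


print(stroke_table("一", ROWS_FOR_YI))
```

If the system table could be read directly, the same arithmetic would give the stroke count of every character listed under a radical.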
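Let us turn once more to the U+xxxx(0xyyyy) label. Suddenly the 0x41 in the parentheses after A looks very familiar. Of course: it is the ASCII code of A.
So this table must have something to do with the ASCII table. Checking further confirms it: for every character that is in the ASCII table, the 0xYYYY value shown here matches the ASCII table exactly! In other words, the system has a table that contains the ASCII table: the Unicode code table!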
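This observation can be checked mechanically. The sketch below, in Python, loops over the 128 ASCII codes and confirms that the Unicode code point, the ASCII byte, and the GBK byte are all the same number; treating the value in parentheses as the GBK code is an assumption, and `ascii_values_match` is a hypothetical name.

```python
def ascii_values_match() -> bool:
    """Return True if every ASCII code equals its Unicode and GBK value."""
    for code in range(0x80):
        ch = chr(code)                        # character at this code point
        if ord(ch) != code:                   # Unicode code point
            return False
        if ch.encode("ascii") != bytes([code]):
            return False
        if ch.encode("gbk") != bytes([code]):  # assumed legacy code page
            return False
    return True


print(ascii_values_match())
```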
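3. A ramble through the Unicode code table
Every programmer knows a little about this table, and I will not go into why Unicode exists. The most important question is how the table is encoded. (Being a bit lazy, I am pasting the following straight from a Microsoft document. Reading the original may give you a deeper understanding than my own muddled explanation would :)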
Unicode
Unicode is a worldwide character-encoding standard. Windows NT, Windows 2000, and Windows XP use it exclusively at the system level for character and string manipulation. Unicode simplifies localization of software and improves multilingual text processing. By implementing it in your applications, you can enable the application with universal data exchange capabilities for global marketing, using a single binary file for every possible character code.
Unicode defines semantics for each character, standardizes script behavior, provides a standard algorithm for bidirectional text, and defines cross-mappings to other standards. Among the scripts supported by Unicode are Latin, Greek, Han, Hiragana, and Katakana. Supported languages include, but are not limited to, German, French, English, Greek, Chinese, and Japanese.
Unicode can represent all of the world's characters in modern computer use, including technical symbols and special characters used in publishing. Because each Unicode code value is 16 bits wide, it is possible to have separate values for up to 65,536 characters. Unicode-enabled functions are often referred to as "wide-character" functions. Note that the implementation of Unicode in 16-bit values is referred to as UTF-16. For compatibility with 8- and 7-bit environments, UTF-8 and UTF-7 are two transformations of 16-bit Unicode values. For more information, see The Unicode Standard, Version 2.0.
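As a small illustration of the 16-bit code values and the UTF-8 transformation mentioned above, the following Python sketch measures how many bytes one character occupies in each form. The helper name `encoded_sizes` is hypothetical.

```python
def encoded_sizes(ch: str) -> dict:
    """Byte length of one character in UTF-16 (no BOM) and in UTF-8."""
    return {
        "utf-16": len(ch.encode("utf-16-le")),
        "utf-8": len(ch.encode("utf-8")),
    }


for ch in "A啊":
    print(ch, encoded_sizes(ch))
```

Both characters fit in a single 16-bit value, but UTF-8 spends one byte on "A" and three on "啊".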
Windows supports applications that use either Unicode or the regular ANSI character set. Mixed use in the same application is also possible. Adding Unicode support to an application is easy, and you can even maintain a single set of sources from which to compile an application that supports either Unicode or the Windows ANSI character set.
Functions support Unicode by assigning its strings a specific data type and providing a separate set of entry points and messages to support this new data type. A series of macros and naming conventions make transparent migration to Unicode, or even compiling both non-Unicode and Unicode versions of an application from the same set of sources, a straightforward matter.
Implementing Unicode as a separate data type also enables the compiler's type checking to ensure that only Unicode parameters are used with functions expecting Unicode strings.
Surrogates
There is a need to support more characters than the 65,536 that fit in the 16-bit Unicode code space. For example, the Chinese speaking community alone uses over 55,000 characters. To answer this need, the Unicode Standard defines surrogates. A surrogate or surrogate pair is a pair of 16-bit Unicode code values that represent a single character. The first (high) surrogate is a 16-bit code value in the range U+D800 to U+DBFF. The second (low) surrogate is a 16-bit code value in the range U+DC00 to U+DFFF. Using surrogates, Unicode can support over one million characters. For more details about surrogates, refer to The Unicode Standard, version 2.0.
Windows 2000 introduces support for basic input, output, and simple sorting of surrogates. However, not all system components are surrogate compatible. Also, surrogates are not supported in Windows 95/98/Me.
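The surrogate ranges quoted above follow from the standard UTF-16 arithmetic. Here is a minimal Python sketch of the conversion in both directions; the function names are hypothetical and the code is for illustration only.

```python
def to_surrogates(code_point: int) -> tuple:
    """Split a code point above U+FFFF into (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points beyond the BMP use surrogates")
    v = code_point - 0x10000          # 20-bit offset
    high = 0xD800 + (v >> 10)         # top 10 bits -> U+D800..U+DBFF
    low = 0xDC00 + (v & 0x3FF)        # low 10 bits -> U+DC00..U+DFFF
    return high, low


def from_surrogates(high: int, low: int) -> int:
    """Recombine a surrogate pair into a single code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogates(0x20000)
print(f"U+20000 -> U+{high:04X} U+{low:04X}")
```

Ten bits in each half give 1,024 x 1,024 = 1,048,576 extra code points, which is the "over one million characters" figure in the text.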
The system supports surrogates in the following ways:
The cmap 12 OpenType font format is introduced, which directly supports the 4-byte character code. Refer to the OpenType font specification for more detail.
Windows USER supports surrogate-enabled IMEs.
Windows GDI APIs support cmap 12 so surrogates can be displayed correctly.
Uniscribe APIs support surrogates.
Windows controls, including Edit and Rich Edit, support surrogates.
HTML engine supports HTML page that includes surrogates for display, editing (through Outlook Express), and forms submission.
System sorting table supports surrogates.
Planes two and three (defined in ISO/IEC 10646) are reserved for ideographic characters. These planes fall in the high surrogate range of U+D840 to U+D8BF.
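The U+D840 to U+D8BF range can be confirmed by encoding the first code point of plane two and the last code point of plane three as UTF-16. The Python sketch below lets the standard codec do the work; `high_surrogate` is a hypothetical helper.

```python
def high_surrogate(code_point: int) -> int:
    """Return the first 16-bit unit of the UTF-16 form of a code point."""
    units = chr(code_point).encode("utf-16-be")   # big-endian, no BOM
    return int.from_bytes(units[:2], "big")


first = high_surrogate(0x20000)   # first code point of plane two
last = high_surrogate(0x3FFFF)    # last code point of plane three
print(f"U+{first:04X} .. U+{last:04X}")
```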
General Guidelines for Software Development
Windows handles surrogates as pairs of 16-bit code values. The system processes surrogate pairs in a way similar to the way it processes nonspacing marks. At display time, the surrogate pair display as one glyph by means of Uniscribe. (This conforms to the requirements in the Unicode Standard, version 2.0)
Applications automatically support surrogates if they support Unicode and use system controls and standard APIs, such as ExtTextOut and DrawText. Thus, if your code uses standard system controls or uses general ExtTextOut-type calls to display, surrogate pairs should work without any changes necessary.
Applications implementing their own editing support by working out glyph positions for themselves may use Uniscribe for all text processing. Uniscribe has separate APIs to deal with complex script processing (such as line service, hit testing, and cursor movement). The application must call the Uniscribe APIs specifically to get these advanced features. Applications written to the Uniscribe API are fully multilingual. However, this does impose a performance penalty, so some applications may want to do their own surrogate processing.
Because surrogates are well defined, you can also write your own code to handle surrogate text processing. When a program encounters a separated Unicode value from either the lower reserved range or the upper reserved range, it must be one half of a surrogate pair. Thus, you can detect a surrogate pair by doing simple range checking. If you encounter a Unicode value in the lower or upper range, then you need to track backward or forward one 16-bit width to get the rest of the character. Keep in mind that CharNext and CharPrev move by 16-bit code points, not by surrogates.
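The range checking described above might look like the following Python sketch, which walks a sequence of 16-bit code values and counts characters, treating a high surrogate followed by a low surrogate as one character. The helper names are hypothetical, and an unpaired surrogate is simply counted on its own here, which is a simplification.

```python
import struct


def utf16_units(text: str) -> list:
    """Return the 16-bit code values of a string, as Windows stores them."""
    data = text.encode("utf-16-le")
    return list(struct.unpack(f"<{len(data) // 2}H", data))


def count_characters(units: list) -> int:
    """Count characters, treating each surrogate pair as one character."""
    count = 0
    i = 0
    while i < len(units):
        is_high = 0xD800 <= units[i] <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        i += 2 if (is_high and has_low) else 1   # skip both halves of a pair
        count += 1
    return count


units = utf16_units("a\U00020000b")
print(len(units), count_characters(units))
```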
For sorting, all surrogate pairs are treated as two Unicode code points. Surrogates are sorted after other Unicode code points, but before the PUA (private user area). Sorting for a standalone surrogate character (that is, either the high or low character is missing) is not supported.
If you are a font or IME provider, note that Windows disables surrogate support by default. If you provide a font and IME package that requires surrogate support, you must set the following registry values:
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\LanguagePack]
SURROGATE=(REG_DWORD)0x00000002
[HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\International\Scripts\42]
IEFixedFontName=[Surrogate Font Face Name]
IEPropFontName=[Surrogate Font Face Name]
Unicode Subset Bitfields
The Unicode Subset Bitfields (USB) are used in the FONTSIGNATURE and LOCALESIGNATURE structures.
Bit  Unicode subrange  Description
0 0020 - 007e Basic Latin
1 00a0 - 00ff Latin-1 Supplement
2 0100 - 017f Latin Extended-A
3 0180 - 024f Latin Extended-B
4 0250 - 02af IPA Extensions
5 02b0 - 02ff Spacing Modifier Letters
6 0300 - 036f Combining Diacritical Marks
7 0370 - 03ff Basic Greek
8 Reserved
9 0400 - 04ff Cyrillic
10 0530 - 058f Armenian
11 0590 - 05ff Basic Hebrew
12 Reserved
13 0600 - 06ff Basic Arabic
14 Reserved
15 0900 - 097f Devanagari
16 0980 - 09ff Bengali
17 0a00 - 0a7f Gurmukhi
18 0a80 - 0aff Gujarati
19 0b00 - 0b7f Oriya
20 0b80 - 0bff Tamil
21 0c00 - 0c7f Telugu
22 0c80 - 0cff Kannada
23 0d00 - 0d7f Malayalam
24 0e00 - 0e7f Thai
25 0e80 - 0eff Lao
26 10a0 - 10ff Basic Georgian
27 Reserved
28 1100 - 11ff Hangul Jamo
29 1e00 - 1eff Latin Extended Additional
30 1f00 - 1fff Greek Extended
31 2000 - 206f General Punctuation
32 2070 - 209f Subscripts and Superscripts
33 20a0 - 20cf Currency Symbols
34 20d0 - 20ff Combining Diacritical Marks for Symbols
35 2100 - 214f Letter-like Symbols
36 2150 - 218f Number Forms
37 2190 - 21ff Arrows
38 2200 - 22ff Mathematical Operators
39 2300 - 23ff Miscellaneous Technical
40 2400 - 243f Control Pictures
41 2440 - 245f Optical Character Recognition
42 2460 - 24ff Enclosed Alphanumerics
43 2500 - 257f Box Drawing
44 2580 - 259f Block Elements
45 25a0 - 25ff Geometric Shapes
46 2600 - 26ff Miscellaneous Symbols
47 2700 - 27bf Dingbats
48 3000 - 303f Chinese, Japanese, and Korean (CJK) Symbols and Punctuation
49 3040 - 309f Hiragana
50 30a0 - 30ff Katakana
51 3100 - 312f Bopomofo
   31a0 - 31bf Extended Bopomofo
52 3130 - 318f Hangul Compatibility Jamo
53 3190 - 319f CJK Miscellaneous
54 3200 - 32ff Enclosed CJK Letters and Months
55 3300 - 33ff CJK Compatibility
56 ac00 - d7a3 Hangul
57 d800 - dfff Surrogates. Note that setting this bit implies that there is at least one codepoint beyond the Basic Multilingual Plane that is supported by this font.
58 Reserved
59 4e00 - 9fff CJK Unified Ideographs
   2e80 - 2eff CJK Radicals Supplement
   2f00 - 2fdf Kangxi Radicals
   2ff0 - 2fff Ideographic Description
   3400 - 4dbf CJK Unified Ideograph Extension A
60 e000 - f8ff Private Use Area
61 f900 - faff CJK Compatibility Ideographs
62 fb00 - fb4f Alphabetic Presentation Forms
63 fb50 - fdff Arabic Presentation Forms-A
64 fe20 - fe2f Combining Half Marks
65 fe30 - fe4f CJK Compatibility Forms
66 fe50 - fe6f Small Form Variants
67 fe70 - fefe Arabic Presentation Forms-B
68 ff00 - ffef Halfwidth and Fullwidth Forms
69 fff0 - fffd Specials
70 0f00 - 0fcf Tibetan
71 0700 - 074f Syriac
72 0780 - 07bf Thaana
73 0d80 - 0dff Sinhala
74 1000 - 109f Myanmar
75 1200 - 12bf Ethiopic
76 13a0 - 13ff Cherokee
77 1400 - 14df Canadian Aboriginal Syllabics
78 1680 - 169f Ogham
79 16a0 - 16ff Runic
80 1780 - 17ff Khmer
81 1800 - 18af Mongolian
82 2800 - 28ff Braille
83 a000 - a48c Yi
Yi Radicals
84-122 Reserved
123 Windows 2000/XP: Layout progress: horizontal from right to left
124 Windows 2000/XP: Layout progress: vertical before horizontal
125 Windows 2000/XP: Layout progress: vertical bottom to top
126 Reserved; must be 0
127 Reserved; must be 1
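To test one of the bits in the table above, the 128-bit field has to be indexed. The Python sketch below assumes the field is held as four 32-bit values (as in the fsUsb member of FONTSIGNATURE) with bit 0 being the lowest bit of the first value; that bit ordering is an assumption, and both `usb_bit_set` and the sample data are hypothetical.

```python
def usb_bit_set(fs_usb: list, bit: int) -> bool:
    """Check one Unicode Subset Bitfield bit in four 32-bit words."""
    if not 0 <= bit <= 127:
        raise ValueError("USB bits are numbered 0..127")
    word = fs_usb[bit // 32]               # which 32-bit word holds the bit
    return bool(word & (1 << (bit % 32)))  # test the bit inside that word


# Made-up sample: Basic Latin (bit 0), CJK Unified Ideographs (bit 59),
# and bit 127, which the table says must be 1.
sample = [0x00000001, 0x08000000, 0x00000000, 0x80000000]
print(usb_bit_set(sample, 59), usb_bit_set(sample, 49))
```

A font that sets bit 59 claims coverage of the CJK ideograph ranges listed in the table, which is the bit of interest for Chinese text.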
Note the most important method below, GetPhonetic, which goes the furthest toward meeting your needs!
IFELanguage 2
May 2001
Microsoft Corporation
This document names the methods associated with IFELanguage, Version 2.0, for Microsoft® IME 2002, Japanese version. (2 printed pages)
Interface ID: IID_IFELanguage2
Version 2.0
The IFELanguage interface provides services that may be language dependent. (Different for each target language.) The services listed below are those currently available.
IFELanguage version 2.0 is the latest version of IFELanguage Version 1.0 already in publication.
IFELanguage version 2.0 contains the following three functions defined in IFELanguage version 1.0:
IFELanguage::Open
IFELanguage::Close
IFELanguage::GetPhonetic
In addition, version 2.0 supports the following functions:
IFELanguage::GetMorphResult
IFELanguage::GetConversionModeCaps
IFELanguage::GetPhonetic
IFELanguage version 2.0 uses some data structures to give the clients detailed information. These are:
WDD (Word Descriptor)
MORRSLT (Sentence Descriptor)
See the specification of data structure in IFELanguage Data Structure. The definition of the functions can be seen in IFELanguage Functions.
IFELanguage Functions (function reference)
IFELanguage methods      Description
Open                     Open the interface to initiate a session
Close                    Close the interface to terminate a session
GetMorphResult           Get morphological analysis result
GetConversionModeCaps    Get conversion mode capability
GetPhonetic              Convert string to phonetic symbol
HRESULT IFELanguage::Open
This method must be called before use of IFELanguage for initialization.
Parameters: None.
Return Values:
HRESULT
S_OK: Successfully terminated.
S_FALSE: Fails to create result.
Note MS-IME 98 Japanese version has a limitation that multiple processes should not initialize via this call at the same time.
HRESULT IFELanguage::Close
This method must be called after use of IFELanguage for termination.
Parameters: None
Return Values:
HRESULT
S_OK: Successfully terminated.
S_FALSE: Fails to create result.
HRESULT IFELanguage::GetPhonetic
It converts the input string (which usually contains Kanji character) to phonetic symbol.
Parameters
BSTR string
(IN) a string of Kanji characters, to convert to phonetic symbols.
LONG start
(IN) the number of the character from which IFELanguage begins conversion. The first character is 1 (not 0).
LONG length
(IN) the length of character to convert. If this value is (-1), it means whole length from start column is selected.
BSTR *phonetic
(OUT) the result string. This string is allocated by SysAllocStringLen and must be freed by clients.
Return Values
HRESULT
S_OK: Successfully terminated.
S_FALSE: Fails to create result.
HRESULT IFELanguage::GetMorphResult
This method is used to get morphological analysis result. Before you use this function, call IFELanguage::Open for initialization once.
Parameters
DWORD dwRequest (IN) Kind of request for conversion.
FELANG_REQ_CONV
FELANG_REQ_RECONV
FELANG_REQ_REV
DWORD dwCMode (IN) Any combination of the values below is possible, which determines the conversion output characters and some conversion options.
FELANG_CMODE_NOPRUNING
FELANG_CMODE_PINYIN
FELANG_CMODE_BOPOMOHO
FELANG_CMODE_HANGUL
FELANG_CMODE_MONORUBY
FELANG_CMODE_KATAKANAOUT
FELANG_CMODE_HIRAGANAOUT
FELANG_CMODE_HALFWIDTHOUT
FELANG_CMODE_FULLWIDTHOUT
FELANG_CMODE_PRECONV
FELANG_CMODE_RADICAL
FELANG_CMODE_UNKNOWNREADING
FELANG_CMODE_MERGECAND
FELANG_CMODE_ROMAN
FELANG_CMODE_BESTFIRST
FELANG_CMODE_PLAURALCLAUSE
FELANG_CMODE_SINGLECONVERT
FELANG_CMODE_AUTOMATIC
FELANG_CMODE_PHRASEPREDICT
FELANG_CMODE_CONVERSATION
FELANG_CMODE_NAME
FELANG_CMODE_USENOREVWORDS
FELANG_CMODE_NOINVISIBLECHAR
INT cwchInput (IN) Number of characters of pwchInput
WCHAR *pwchInput (IN) Original input characters to be converted by morphology engine.
NULL means to get the next entry for the previously given string, with next rank.
This must be UNICODE string.
The order that next entries are returned is defined by an implementation.
DWORD *pfCInfo (IN) The information for each column (pwchInput[x]). Each pfCInfo[x] (column info) can be a combination of the flags below:
FELANG_CLMN_WBREAK
FELANG_CLMN_NOWBREAK
FELANG_CLMN_PBREAK
FELANG_CLMN_NOPBREAK
FELANG_CLMN_FIXR
FELANG_CLMN_FIXD
NULL can be used. That means all of pfCInfo[] are FALSE. (No request from client)
MORRSLT **ppResult (OUT) Address of an MORRSLT structure that receives morphology result data.
GetMorphResult() allocates memory (using the OLE task allocator) for the returned data, and sets *ppResult to point to the memory.
The application must free the memory pointed to by *ppResult, by using the CoTaskMemFree function.
Return Values:
HRESULT
S_OK: More candidates exist. If you call this function with pwchInput NULL, the next best candidate for the previous pwchInput will be gotten.
S_FALSE: Fails to create result.
E_NOCAND: No more candidates.
E_LARGEINPUT: Too large input. Fails to create result.
dwRequest
request for conversion.
FELANG_REQ_CONV : to get a composition string from a reading.
This request requires the input string is a reading which does not have any ideographic character.
FELANG_REQ_RECONV : to get a composition string from a composition string.
The request is to get a re-conversion result from a conversion result.
FELANG_REQ_REV : to get a reading from a composition string.
On this request, the contents of WDD[n].wDispPos and WDD[n].wReadPos are exchanged.
MORRSLT.pwchRead is used as the source of rev. conversion.
pfCInfo
Information of an input in each column
any combination of the flags below can be set for each column of an input string.
FELANG_CLMN_WBREAK pwchInput[x] is forced to be the start of a word in analysis.
FELANG_CLMN_NOWBREAK The result must not be divided here when IFELanguage makes a word. FELANG_CLMN_NOWBREAK always includes FELANG_CLMN_NOPBREAK.
FELANG_CLMN_PBREAK pwchInput[x] is forced to be the start of a phrase in analysis. FELANG_CLMN_PBREAK always includes FELANG_CLMN_WBREAK.
FELANG_CLMN_NOPBREAK The result must not be divided here when IFELanguage makes a phrase.
FELANG_CLMN_FIXR pwchInput[x] is used as the reading character in a conversion result on a FELANG_REQ_REV request. That is, the reading character of the conversion result is fixed to be pwchInput[x].
FELANG_CLMN_FIXD pwchInput[x] is used as the display character in a conversion result on a FELANG_REQ_CONV request. That is, the display character of the conversion result is fixed to be pwchInput[x].
dwCMode
Conversion mode
any combination of the values below is possible, which determine the conversion mode.
FELANG_CMODE_NOPRUNING: no pruning (filter) in morphology analysis.
On input of "", it may return, for example,"" ("" is the body part of ""). This result is impossible from grammar, but with FELANG_CMODE_NOPRUNING, one can get this candidate.
FELANG_CMODE_PINYIN: output is in Pinyin (Chinese phonetic) characters when FELANG_REQ_REV is specified.
FELANG_CMODE_BOPOMOHO: output is in Bopomofo (Taiwan phonetic) characters when FELANG_REQ_REV is specified.
FELANG_CMODE_HANGUL: output is in Hangul (Korean phonetic) characters when FELANG_REQ_REV is specified.
FELANG_CMODE_KATAKANAOUT: output is in Katakana (Japanese phonetic) characters when FELANG_REQ_REV is specified.
FELANG_CMODE_HIRAGANAOUT: output is in Hiragana (Japanese phonetic) characters when FELANG_REQ_REV is specified.
FELANG_CMODE_ROMAN: output is in Roman characters when FELANG_REQ_REV is specified.
FELANG_CMODE_HALFWIDTHOUT: output is in Half Width characters.
FELANG_CMODE_FULLWIDTHOUT: output is in Full Width characters. [default].
FELANG_CMODE_MONORUBY: it makes a reading string for mono-ruby wherever possible. If this flag is set, the returned result is put at paMonoRubyPos. This mode is valid when FELANG_REQ_REV is specified.
FELANG_CMODE_PRECONV: It does the following conversions before the actual conversion. They are implementation defined.
ROMAN to phonetic characters.
Auto-correction before the conversion.
Conversion of punctuation marks and brackets.
FELANG_CMODE_RADICAL: reading is returned in radical (CHN)
FELANG_CMODE_UNKNOWNREADING: reading is returned in unknown format. (CHN)
FELANG_CMODE_MERGECAND: The result strings (candidates) are merged when the strings are the same.
FELANG_CMODE_ROMAN (JPN): enable the roma-kana conversion.
FELANG_CMODE_BESTFIRST: force simple conversion to get better performance time rather than conversion accuracy.
FELANG_CMODE_PLAURALCLAUSE corresponds to the analysis in IME_SMODE_PLAURALCLAUSE
FELANG_CMODE_SINGLECONVERT corresponds to the analysis in IME_SMODE_SINGLECONVERT
FELANG_CMODE_AUTOMATIC corresponds to the analysis in IME_SMODE_AUTOMATIC
FELANG_CMODE_PHRASEPREDICT corresponds to the analysis in IME_SMODE_PHRASEPREDICT
FELANG_CMODE_CONVERSATION corresponds to the analysis in IME_SMODE_CONVERSATION
FELANG_CMODE_NAME(JPN) Name mode (same as FELANG_CMODE_PLAURALCLAUSE)
FELANG_CMODE_USENOREVWORDS(JPN) Optional conversion using additional words. (e.g. reverse conversion from address to ZIP code).
FELANG_CMODE_NOINVISIBLECHAR
Remove invisible characters from the output string when FELANG_CMODE_NOINVISIBLECHAR is specified. This flag is usually used for display purposes. Both the CHN and TWN IMEs return some spaces and invisible characters (e.g., the first tone mark, U+02C9, in Bopomofo). These invisible characters should be removed for display.
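When the flag is not used, the same clean-up can be done by the caller. The Python sketch below filters a reading string before display. The set of characters treated as invisible (space and U+02C9) is taken from the example in the text and is an assumption, not the IME's actual list; the sample reading is made up, and `strip_invisible` is a hypothetical helper.

```python
# Assumed set of characters to drop: space and the first tone mark U+02C9.
INVISIBLE = {" ", "\u02c9"}


def strip_invisible(reading: str) -> str:
    """Drop characters that should not be shown in a displayed reading."""
    return "".join(ch for ch in reading if ch not in INVISIBLE)


sample = "ㄇㄚ\u02c9 ㄏㄠˇ"   # made-up Bopomofo reading with a space
print(strip_invisible(sample))
```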
REMARKS:
This function should not fail on any input (except memory full, etc.) In the worst case, this function returns a result w/ an unknown word.
HRESULT IFELanguage::GetConversionModeCaps
This method is used to get the conversion mode capability of the IFELanguage. This function returns which FELANG_CMODE_xxx flags are supported by the object. *pdwCaps is a combination of the FELANG_CMODE_xxx flags which are supported.
Parameters:
DWORD *pdwCaps (OUT) *pdwCaps is a bit field which contains FELANG_CMODE_xxx.
Return Values:
HRESULT
S_OK: Successfully terminated.
S_FALSE: Fails to create result.
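Since *pdwCaps is a bit field, checking whether a mode is supported is a bitwise test. The Python sketch below shows the idea only. The bit values are placeholders generated by `enum.auto()`, not the real FELANG_CMODE_xxx constants, which should be taken from the IME header; `CMode` and `supports` are hypothetical names.

```python
import enum


class CMode(enum.IntFlag):
    """Stand-ins for a few FELANG_CMODE_xxx flags (placeholder values)."""
    PINYIN = enum.auto()
    HANGUL = enum.auto()
    RADICAL = enum.auto()


def supports(caps: CMode, wanted: CMode) -> bool:
    """True if every flag in `wanted` is present in the capability bits."""
    return (caps & wanted) == wanted


caps = CMode.PINYIN | CMode.RADICAL   # pretend result of GetConversionModeCaps
print(supports(caps, CMode.PINYIN), supports(caps, CMode.HANGUL))
```

Comparing against `wanted` rather than testing for non-zero matters when several flags are requested at once: all of them must be present.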
HRESULT IFELanguage::GetPhonetic
This method is a thin wrapper function of GetJMorphResult with FELANG_REQ_REV. It simply converts the input string (which usually contains Kanji character) to phonetic symbol. This function is designed to be called by VB (VBA) through TLB.
Parameters:
BSTR string (IN) a string of Kanji characters, to convert to phonetic symbols.
LONG start (IN) the number of the character from which IFELanguage begins conversion. The first character is 1 (not 0).
LONG length (IN) the length of character to convert. If this value is (-1), it means whole length from 'start' column is selected.
BSTR *phonetic (OUT) the result string. This string is allocated by SysAllocStringLen and must be freed by clients.
Return Values:
HRESULT
S_OK: Successfully terminated.
S_FALSE: Fails to create result.
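The start and length parameters use a 1-based position and a (-1) sentinel, which is easy to get wrong. The Python sketch below only illustrates which part of the input those two values select; it does not call the IME, `selected_range` is a hypothetical helper, and it counts Python characters, which matches 16-bit positions only when no surrogate pairs are involved.

```python
def selected_range(text: str, start: int, length: int) -> str:
    """Return the part of `text` chosen by a 1-based start and a length."""
    if start < 1:
        raise ValueError("start is 1-based; the first character is 1")
    begin = start - 1                                    # to 0-based index
    end = len(text) if length == -1 else begin + length  # -1 means to the end
    return text[begin:end]


print(selected_range("漢字変換", 1, -1))
print(selected_range("漢字変換", 3, 2))
```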
--by Friecin