Chapter 3 Encoding Character Sets | DBCS-Enabling Your Core-Code Base (GetCPInfo)

DBCS-Enabling Your Core-Code Base

As mentioned in Chapter 2, a good internationalization shortcut is to use a single source-code base for all language editions of a program. This means that all language editions are built from the same source files, but it can also mean that all language editions share some or all of the executable code. With a fully run-time DBCS-enabled code base, any language edition can handle double-byte characters when running on a DBCS edition of the operating system. A user can run an English application on a Japanese edition of Windows, freely typing in and editing kanji strings without problems. For example, Microsoft Visual C++ 2's integrated editing environment is fully DBCS-enabled. If you run Visual C++ 2 on Japanese Windows, you can put kanji literal characters and strings in your source files. The following is an example of fully run-time DBCS-enabled code:

// an example of a fully run-time DBCS-enabled function
int charcount (char *pszStr)
{
    int count;
    for (count = 0; *pszStr; pszStr = CharNext(pszStr))
        ++count;
    return count;
}

This might seem like a great scheme at first glance, but constantly calling the system API CharNext in the inner loop is needlessly expensive, especially when the application is running on a non-DBCS platform. (How many French users will actually run your program on a Japanese edition of Windows?) Not only will the code be less efficient, but it will have to contain buffers that are twice as large in order to hold DBCS characters.

Run-Time Optimization

If having fully run-time DBCS-enabled code is important, optimization can help. One option for run-time optimization is to bypass the system by writing your own version of CharNext, using information about the code page provided by the Win32 API GetCPInfo. The example in Figure 3-6 avoids the overhead of making a system call in the inner loop and uses an inline function to keep the code readable.

CPINFO CPInfo;      // a Windows-defined structure for code-page info
BYTE *vbLBRange;    // table of lead-byte range values, which can vary
                    // in length depending on the code page
BOOL vfDBCS;        // Are we running on a DBCS edition of Windows?

{
    // ...somewhere in the initialization code...
    GetCPInfo(CP_ACP, &CPInfo);
    vbLBRange = CPInfo.LeadByte;
    vfDBCS = (CPInfo.MaxCharSize > 1); // Is the max length in bytes of
                                       // a character in this code
                                       // page more than 1?
}

...

inline char* MyCharNext (char *pszStr)
{
    BYTE bRange = 0;
    BYTE bChar = (BYTE)*pszStr; // Compare as unsigned; lead bytes are
                                // 0x80 or above and char may be signed.

    // Check to see whether *pszStr is a lead byte. The constant 12
    // allows for up to 6 pairs of lead-byte range values.
    while ((bRange < 12) && (vbLBRange[bRange] != 0))
    {
        if ((bChar >= vbLBRange[bRange]) &&
            (bChar <= vbLBRange[bRange+1]))
            return (pszStr + 2); // Skip two bytes.

        bRange += 2; // Go to the next pair of range values.
    }

    return (pszStr + 1); // Skip one byte.
}

Figure 3-6 By writing your own version of CharNext, you optimize performance by avoiding the need to call the system for a heavily used operation.
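
The range scan in Figure 3-6 can be taken one step further by expanding the lead-byte ranges into a 256-entry flag table once at startup, so that each lead-byte test becomes a single array lookup. The following is only a sketch under stated assumptions: the names BuildLeadByteTable, TableCharNext, TableCharCount, g_isLead, and kSjisRanges are hypothetical and not part of Win32, and the code page 932 (Shift-JIS) ranges are hard-coded purely so that the sketch runs without Windows. In a real program you would pass CPInfo.LeadByte instead.

```cpp
#include <cassert>
#include <cstddef>

typedef unsigned char BYTE;   // stand-in for the Windows typedef

static bool g_isLead[256];    // hypothetical: true if the byte value is a lead byte

// Expand zero-terminated (low, high) pairs, laid out like CPINFO.LeadByte,
// into a per-byte flag table. Call once during initialization.
void BuildLeadByteTable(const BYTE *ranges, std::size_t maxBytes)
{
    for (int i = 0; i < 256; ++i)
        g_isLead[i] = false;

    for (std::size_t i = 0; i + 1 < maxBytes && ranges[i] != 0; i += 2)
        for (int b = ranges[i]; b <= ranges[i + 1]; ++b)
            g_isLead[b] = true;
}

// Advance by one character. A lead byte that is immediately followed by the
// terminator is treated as a single byte, so the pointer never passes the end.
inline const char *TableCharNext(const char *psz)
{
    assert(psz != nullptr);
    if (*psz == '\0')
        return psz;                       // already at the end of the string
    BYTE b = (BYTE)*psz;                  // index the table with an unsigned value
    if (g_isLead[b] && psz[1] != '\0')
        return psz + 2;                   // lead byte plus trail byte
    return psz + 1;                       // single-byte character
}

// Count characters (not bytes) using the table-driven step function.
int TableCharCount(const char *psz)
{
    int count = 0;
    for (; *psz; psz = TableCharNext(psz))
        ++count;
    return count;
}

// Sample data only: code page 932 (Shift-JIS) lead-byte ranges.
static const BYTE kSjisRanges[12] = { 0x81, 0x9F, 0xE0, 0xFC, 0, 0 };
```

After calling BuildLeadByteTable(kSjisRanges, 12), TableCharCount treats each lead-byte and trail-byte pair as one character. The table costs 256 bytes and one pass at startup, which is usually a fair trade when the step function sits in an inner loop.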

A further optimization would be to make DBCS-related calls only when the program is running on a DBCS platform. (See Figure 3-7.) You'll find that this amount of effort pays off only with code that's called the most frequently.

// fully DBCS-enabled code
int charcount (char *pszStr)
{
    int count;
    if (vfDBCS)
    {
        for (count = 0; *pszStr; pszStr = MyCharNext(pszStr))
            ++count;
    }
    else
    {
        for (count = 0; *pszStr; pszStr++)
            ++count;
    }

    return count;
}

Figure 3-7 Making DBCS-related calls only when a program runs on a DBCS platform enhances the performance of frequently called code.

Dual Compilation

Another widely used approach to DBCS enabling is dual compilation. Sections of string-handling code bracketed by

#ifdef DBCS
...
#else
...
#endif

allow you to use one set of source code files and substitute code using a compile-time switch. With this approach, the DBCS-enabled code doesn't affect your program when it's compiled with the DBCS switch off, which is a great advantage. The main disadvantage of this method is that in effect it creates a dual code base that you have to compile, test, and maintain separately. An example of this approach is shown below.

int charcount (char *pszStr)
{
    int count;
#ifndef DBCS
    for (count = 0; *pszStr; pszStr++)
#else
    for (count = 0; *pszStr; pszStr = CharNext(pszStr))
    // (or you could use MyCharNext)
#endif
        ++count;
    return count;
}

Macros and Inline Functions

You could greatly reduce the number of #ifdef DBCS blocks (and greatly increase the ease of maintaining the code) by defining several macros.

#ifndef DBCS
#define CharNext(pc) ((*(pc)) ? (pc) + 1 : (pc))
#define CharPrev(pcStart, pc) (((pc) > (pcStart)) ? (pc) - 1 : (pcStart))

#ifndef WIN32
#define IsDBCSLeadByte(bByte) (FALSE)
#endif
#endif

// The macro Dbcs can surround code that is DBCS-only.
#ifdef DBCS
#define Dbcs(x) (x)
#else
#define Dbcs(x) // Do nothing.
#endif
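
To illustrate how a wrapper macro of this kind might be used, here is a small sketch. The function CountBytes and the counter g_leadByteChecks are hypothetical names invented for the example, and the sketch assumes that the argument passed to Dbcs is a single expression. Note that a function-like macro must have no space between its name and the opening parenthesis.

```cpp
#include <cassert>

// Same shape as the Dbcs macro described above.
#ifdef DBCS
#define Dbcs(x) (x)
#else
#define Dbcs(x)              // expands to nothing in single-byte builds
#endif

int g_leadByteChecks = 0;    // hypothetical counter, touched only in DBCS builds

// Counts bytes. The bookkeeping wrapped in Dbcs(...) is compiled in only
// when DBCS is defined; otherwise the line reduces to an empty statement.
int CountBytes(const char *psz)
{
    assert(psz != nullptr);
    int count = 0;
    for (; *psz; ++psz)
    {
        Dbcs(++g_leadByteChecks);
        ++count;
    }
    return count;
}
```

When the file is compiled without the DBCS switch, g_leadByteChecks stays at zero because the wrapped expression is never compiled. An argument that contains a top-level comma would be split into two macro arguments, so this pattern suits short expressions best.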

Visual C++ developers can sometimes use inline functions instead of macros, thus gaining the benefit of easy code maintenance without the potential traps that come with simple text substitution.
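
As a rough sketch of that idea, the single-byte macros could be written as inline functions along the following lines. The names NextCharSB, PrevCharSB, and IsLeadByteSB are hypothetical, chosen so that they do not collide with the Win32 names. The bodies simply mirror the macro definitions.

```cpp
#include <cassert>

typedef unsigned char BYTE;   // stand-in for the Windows typedef

// Single-byte replacement for CharNext: stay on the terminator,
// otherwise advance by one byte.
inline const char *NextCharSB(const char *pc)
{
    assert(pc != nullptr);
    return *pc ? pc + 1 : pc;
}

// Single-byte replacement for CharPrev: never step back past the start.
inline const char *PrevCharSB(const char *pcStart, const char *pc)
{
    return (pc > pcStart) ? pc - 1 : pcStart;
}

// In a single-byte build no byte is ever a lead byte.
inline bool IsLeadByteSB(BYTE)
{
    return false;
}
```

Unlike a macro, an inline function evaluates its argument exactly once, so a call such as NextCharSB(p++) increments p a single time, and the compiler checks the argument types.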

Notice that because the sample code in Figure 3-6 above calls GetCPInfo, it will work on any Far East edition of Windows. Unfortunately, a large body of existing software uses the values in Figure 3-5 above to hard-code lead-byte or trail-byte ranges and thus has to be edited and recompiled to work for different DBCS code pages. Keep these functions in mind as you try to spare yourself additional work.


https://msdn.microsoft.com/en-us/library/cc194791.aspx
