Hi, I have a problem using the Query class from mysqlpp with wchar_t, and I hope somebody can help me with this. I want to insert Unicode text into a table. Here is the sample code:

mysqlpp::Connection con( "test", "localhost", "test", "" );
mysqlpp::Query query = con.query();
char * test = "abcdefghij";
wchar_t * wtest = L"abcdefghij";
query << "INSERT INTO tes VALUES( '" << test << "')";
query.execute();
query << "INSERT INTO tes VALUES( '" << wtest << "')";
query.execute();

The char string test gives the right result: 'abcdefghij' is inserted into the db. But the wchar_t string wtest does not: the record is filled with '004A00F4'. It seems Query does not accept Unicode, so how can I insert Unicode using mysqlpp?
Each Windows API function that takes a string actually comes in two versions. One version supports only 1-byte “ANSI” characters (a superset of ASCII), so they end in 'A'. Windows also supports the 2-byte subset of Unicode called UCS-2. Some call these “wide” characters, so the other set of functions end in 'W'. The MessageBox() API, for instance, is actually a macro, not a real function. If you define the UNICODE macro when building your program, the MessageBox() macro evaluates to MessageBoxW(); otherwise, to MessageBoxA().
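The macro dispatch described above can be imitated in portable C++. This is only a sketch of the mechanism: ShowTextA, ShowTextW, ShowText and TEXT_LITERAL are hypothetical names made up for illustration, not anything from the Windows headers.

```cpp
#include <string>

// Hypothetical stand-ins for an 'A'/'W' pair of API functions.
// They are NOT real Windows functions; each just reports what it received.
inline std::string ShowTextA(const char* text)
{
    return std::string("narrow:") + text;
}

inline std::string ShowTextW(const wchar_t* text)
{
    // Report the number of wide characters rather than the text itself.
    return "wide:" + std::to_string(std::wstring(text).size());
}

// The generic name is only a macro. Which function it names is decided
// at compile time by whether UNICODE is defined.
#ifdef UNICODE
#   define ShowText ShowTextW
#   define TEXT_LITERAL(s) L##s
#else
#   define ShowText ShowTextA
#   define TEXT_LITERAL(s) s
#endif
```

A call such as ShowText(TEXT_LITERAL("abc")) then compiles to one function or the other, depending on whether the program was built with -DUNICODE.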
Since MySQL uses the UTF-8 Unicode encoding and Windows uses UCS-2, you must convert data when passing text between MySQL++ and the Windows API. Since there’s no point in trying for portability — no other OS I’m aware of uses UCS-2 — you might as well use platform-specific functions to do this translation. Since version 2.2.2, MySQL++ ships with two Visual C++ specific examples showing how to do this in a GUI program. (In earlier versions of MySQL++, we did Unicode conversion in the console mode programs, but this was unrealistic.)
How you handle Unicode data depends on whether you’re using the native Windows API, or the newer .NET API. First, the native case:
// Convert a C string in UTF-8 format to UCS-2 format.
void ToUCS2(LPTSTR pcOut, int nOutLen, const char* kpcIn)
{
    MultiByteToWideChar(CP_UTF8, 0, kpcIn, -1, pcOut, nOutLen);
}

// Convert a UCS-2 string to C string in UTF-8 format.
void ToUTF8(char* pcOut, int nOutLen, LPCWSTR kpcIn)
{
    WideCharToMultiByte(CP_UTF8, 0, kpcIn, -1, pcOut, nOutLen, 0, 0);
}

These functions leave out some important error checking, so see examples/vstudio/mfc/mfc_dlg.cpp for the complete version.
If you’re building a .NET application (such as, perhaps, because you’re using Windows Forms), it’s better to use the .NET libraries for this:
// Convert a C string in UTF-8 format to a .NET String in UCS-2 format.
String^ ToUCS2(const char* utf8)
{
    return gcnew String(utf8, 0, strlen(utf8), System::Text::Encoding::UTF8);
}

// Convert a .NET String in UCS-2 format to a C string in UTF-8 format.
System::Void ToUTF8(char* pcOut, int nOutLen, String^ sIn)
{
    array<Byte>^ bytes = System::Text::Encoding::UTF8->GetBytes(sIn);
    // Leave room for the terminating null character.
    nOutLen = Math::Min(nOutLen - 1, bytes->Length);
    System::Runtime::InteropServices::Marshal::Copy(bytes, 0, IntPtr(pcOut), nOutLen);
    pcOut[nOutLen] = '\0';
}
Unlike the native API versions, these examples are complete, since the .NET platform handles a lot of things behind the scenes for us. We don’t need any error-checking code for such simple routines.
All of this assumes you’re using Windows NT or one of its direct descendants: Windows 2000, Windows XP, Windows Vista, or any “Server” variant of Windows. Windows 95 and its descendants (98, ME, and CE) do not support UCS-2. They still have the 'W' APIs for compatibility, but they just smash the data down to 8-bit and call the 'A' version for you.
from examples/vstudio/mfc/mfc_dlg.cpp
ToUCS2
// Convert a C string in UTF-8 format to UCS-2 format.
bool
CExampleDlg::ToUCS2(LPTSTR pcOut, int nOutLen, const char* kpcIn)
{
    if (strlen(kpcIn) > 0) {
        // Do the conversion normally
        return MultiByteToWideChar(CP_UTF8, 0, kpcIn, -1, pcOut,
                nOutLen) > 0;
    }
    else if (nOutLen > 1) {
        // Can't distinguish no bytes copied from an error, so handle
        // an empty input string as a special case.
        _tccpy(pcOut, _T(""));
        return true;
    }
    else {
        // Not enough room to do anything!
        return false;
    }
}
ToUTF8
// Convert a UCS-2 multibyte string to the UTF-8 format required by
// MySQL, and thus MySQL++.
bool
CExampleDlg::ToUTF8(char* pcOut, int nOutLen, LPCWSTR kpcIn)
{
    if (_tcslen(kpcIn) > 0) {
        // Do the conversion normally
        return WideCharToMultiByte(CP_UTF8, 0, kpcIn, -1, pcOut,
                nOutLen, 0, 0) > 0;
    }
    else if (nOutLen > 0) {
        // Can't distinguish no bytes copied from an error, so handle
        // an empty input string as a special case.
        *pcOut = '\0';
        return true;
    }
    else {
        // Not enough room to do anything!
        return false;
    }
}
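
To tie this back to the original question: the wchar_t text has to be turned into UTF-8 before it is streamed into the Query. On Windows the ToUTF8() function above does that. For anyone who wants to see what the conversion involves, or needs it outside Windows, here is a minimal portable sketch. wide_to_utf8 is a hypothetical helper written for this post, not part of MySQL++, and it assumes wchar_t holds either UTF-16 code units (Windows) or UTF-32 code points (most Unix-like systems).

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of MySQL++): encode a wide string as UTF-8.
// Accepts UTF-16 code units (16-bit wchar_t) or UTF-32 code points
// (32-bit wchar_t). Surrogate pairs are combined wherever they appear.
std::string wide_to_utf8(const std::wstring& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = static_cast<std::uint32_t>(in[i]);

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: it must be followed by a low surrogate.
            if (i + 1 >= in.size()) {
                throw std::invalid_argument("truncated surrogate pair");
            }
            std::uint32_t low = static_cast<std::uint32_t>(in[i + 1]);
            if (low < 0xDC00 || low > 0xDFFF) {
                throw std::invalid_argument("unpaired high surrogate");
            }
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
            ++i;    // the low surrogate has been consumed
        }
        else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            throw std::invalid_argument("unpaired low surrogate");
        }
        else if (cp > 0x10FFFF) {
            throw std::invalid_argument("code point out of range");
        }

        // Emit 1 to 4 bytes depending on the size of the code point.
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        }
        else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

With something like this in place, the failing line from the question would become query << "INSERT INTO tes VALUES( '" << wide_to_utf8(wtest) << "')"; so that Query only ever sees char data. This assumes the connection and the table column are both set up for UTF-8.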