How to Read and Write Unicode and UTF-8 Text

//A Unicode (UTF-16 little-endian) text file starts with a two-byte marker (BOM): 0xFF,0xFE
//Save
WideString W;
if (Memo1->Text.IsEmpty()) return;
TMemoryStream *ms=new TMemoryStream();
unsigned char S[]={0xFF,0xFE}; //unsigned char: 0xFF and 0xFE do not fit in a signed char

ms->Write(S,sizeof(S));
W=Memo1->Text;
ms->Write(W.c_bstr(),W.Length() * sizeof(WideChar));
ms->SaveToFile("C:\\test.txt"); //writes the whole stream regardless of Position
delete ms;

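//Load
WideString W;
TMemoryStream *ms=new TMemoryStream();
ms->LoadFromFile("C:\\test.txt");
unsigned char S[2];
const unsigned char s[]={0xFF,0xFE};

//Treat the file as Unicode only if it starts with the marker; free the stream before bailing out
if (ms->Size<2 || ms->Read(S,2)!=2 || S[0]!=s[0] || S[1]!=s[1])
{
  delete ms;
  return;
}
int len=(int)(ms->Size-2);
W.SetLength(len/sizeof(WideChar));
//Read only whole WideChars so an odd trailing byte cannot overrun the buffer
ms->Read(W.c_bstr(),W.Length() * sizeof(WideChar));
Memo1->Text=W;
delete ms;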
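
The marker check in the load code recognizes only one encoding. As a rough sketch of a broader check, the standard C++ function below (plain C++, not VCL) looks at the first bytes of a buffer and reports which marker it finds. The names TextEncoding and detectBom are made up for this example. It assumes the whole file, or at least its first three bytes, has already been read into a std::string, and it does not try to tell UTF-32 apart from UTF-16.

```cpp
#include <cstddef>
#include <string>

// Hypothetical names for this sketch; they are not part of the VCL.
enum class TextEncoding { Unknown, Utf8, Utf16LE, Utf16BE };

// Inspect the leading bytes of a buffer and report which marker (BOM), if any, is present.
TextEncoding detectBom(const std::string& bytes) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(bytes.data());
    const std::size_t n = bytes.size();
    // UTF-8 marker: EF BB BF
    if (n >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF) return TextEncoding::Utf8;
    // UTF-16 little-endian marker: FF FE (the one written by the save code)
    if (n >= 2 && p[0] == 0xFF && p[1] == 0xFE) return TextEncoding::Utf16LE;
    // UTF-16 big-endian marker: FE FF
    if (n >= 2 && p[0] == 0xFE && p[1] == 0xFF) return TextEncoding::Utf16BE;
    // No marker: the file may be ANSI, or UTF-8 saved without a marker.
    return TextEncoding::Unknown;
}
```

A file without any marker comes back as Unknown, so the caller still has to decide how to treat it.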

Related functions: StringToWideChar, WideCharToString

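Saving and loading UTF-8 works much the same way, except that the marker is three bytes: 0xEF,0xBB,0xBF. Convert the text with AnsiToUtf8() before writing it to the stream, and with Utf8ToAnsi() after reading it back.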
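
The marker handling for UTF-8 can be sketched in standard C++ (plain C++, not VCL) as follows. The helper names addUtf8Bom and stripUtf8Bom are invented for this example, and the sketch assumes the text has already been converted to UTF-8 bytes; it only adds or removes the three-byte marker and does no encoding conversion itself.

```cpp
#include <string>

// The three-byte UTF-8 marker (BOM): EF BB BF.
const std::string kUtf8Bom = "\xEF\xBB\xBF";

// Prepend the marker to text that is already UTF-8 encoded.
std::string addUtf8Bom(const std::string& utf8) {
    return kUtf8Bom + utf8;
}

// Return the payload without the marker; input that has no marker is returned unchanged.
std::string stripUtf8Bom(const std::string& bytes) {
    if (bytes.compare(0, kUtf8Bom.size(), kUtf8Bom) == 0)
        return bytes.substr(kUtf8Bom.size());
    return bytes;
}
```

In the VCL version, the result of AnsiToUtf8() would play the role of the utf8 argument, and the output of stripUtf8Bom would be what gets passed to Utf8ToAnsi().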
