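.NET Character Sets, Encoding, and Decoding

0 .NET character set encoding and decoding

.NET strings are Unicode internally (UTF-16). To convert text to another encoding such as GBK or UTF-8, use the Encoding class.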

using System;
using System.Text;

// Print each byte as two-digit uppercase hex, separated by spaces.
void PrintBytes(byte[] bytes)
{
    foreach (var b in bytes)
    {
        Console.Write("{0:X2} ", b);
    }
    Console.WriteLine();
}

// Needed on .NET Core / .NET 5+ so that code-page encodings such as GBK can be looked up.
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

string str = "主账号";

var gbkBytes = Encoding.GetEncoding("gbk").GetBytes(str);       // encode as GBK
var utf8Bytes = Encoding.UTF8.GetBytes(str);                    // encode as UTF-8
var unicodeBytes = Encoding.Unicode.GetBytes(str);              // encode as Unicode (UTF-16LE)

PrintBytes(gbkBytes);
PrintBytes(utf8Bytes);
PrintBytes(unicodeBytes);

var gbkStr = Encoding.GetEncoding("gbk").GetString(gbkBytes);   // decode from GBK
var utf8Str = Encoding.UTF8.GetString(utf8Bytes);               // decode from UTF-8
var unicodeStr = Encoding.Unicode.GetString(unicodeBytes);      // decode from Unicode (UTF-16LE)

Console.WriteLine(gbkStr);
Console.WriteLine(utf8Str);
Console.WriteLine(unicodeStr);

Output:

D6 F7 D5 CB BA C5
E4 B8 BB E8 B4 A6 E5 8F B7
3B 4E 26 8D F7 53
主账号
主账号
主账号

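When calling a C++ API, string handling inevitably raises character-encoding questions. The tests below examine how strings are encoded when they are marshaled to and from a C++ API.

1 On Windows

The first tests ran .NET on Windows, where .NET marshals strings with the system ANSI code page by default (GBK on the test machine, judging by the results below). The CharSet specified in DllImport applies to string parameters passed directly to the imported function. The CharSet values corresponded to the strings received by the C++ API as follows:

1.1 Request functions: C# => C++

1.1.1 Encoding of string parameters

Input text: 主账号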

CharSet   | Charset received by the C++ API | Bytes received
(not set) | GBK                             | D6 F7 D5 CB BA C5
Ansi      | GBK                             | D6 F7 D5 CB BA C5
Unicode   | Unicode (UTF-16LE)              | 3B 4E 26 8D F7 53
Auto      | Unicode (UTF-16LE)              | 3B 4E 26 8D F7 53
None      | GBK                             | D6 F7 D5 CB BA C5
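
The byte values in the table were observed on the C++ side, but the post does not show its native code. As a minimal sketch of how such a dump could be produced, the hypothetical helper `hex_dump` below (a name assumed here, not part of the tested API) formats the bytes of a narrow string the same way the table does:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: format the bytes of a NUL-terminated narrow string as
// space-separated, two-digit uppercase hex.
std::string hex_dump(const char* s) {
    assert(s != nullptr);
    std::string out;
    const std::size_t n = std::strlen(s);
    for (std::size_t i = 0; i < n; ++i) {
        char buf[4];
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        std::snprintf(buf, sizeof buf, "%02X",
                      static_cast<unsigned>(static_cast<unsigned char>(s[i])));
        if (!out.empty()) out += ' ';
        out += buf;
    }
    return out;
}
```

Calling `hex_dump` on the `ip` argument inside a native function such as RegisterFront would show which encoding the marshaler chose. A UTF-16 string read through a `char*` stops at the first zero byte, which for this input is only the terminator.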

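Test interface code:

    // The import was declared in each of the following five ways in turn (one at a time, since the signatures are identical)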

	[DllImport(LibName, CallingConvention = CallingConvention.StdCall)]
	public static extern void RegisterFront(string ip, int port);

	[DllImport(LibName, CharSet=CharSet.Ansi, CallingConvention = CallingConvention.StdCall)]
	public static extern void RegisterFront(string ip, int port);

	[DllImport(LibName, CharSet=CharSet.Unicode, CallingConvention = CallingConvention.StdCall)]
	public static extern void RegisterFront(string ip, int port);

	[DllImport(LibName, CharSet=CharSet.Auto, CallingConvention = CallingConvention.StdCall)]
	public static extern void RegisterFront(string ip, int port);

	[DllImport(LibName, CharSet=CharSet.None, CallingConvention = CallingConvention.StdCall)]
	public static extern void RegisterFront(string ip, int port);
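1.1.2 Encoding of strings inside struct parameters

The CharSet set in DllImport applies only to string parameters passed directly to the imported function. If a parameter is an object, the encoding of its string fields must be set where the marshaled type is defined, through the CharSet of the StructLayout attribute. In my tests, the CharSet values corresponded to the strings received by the C++ API as follows:

Input text: 主账号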

CharSet   | Charset received by the C++ API | Bytes received
(not set) | GBK                             | D6 F7 D5 CB BA C5
Ansi      | GBK                             | D6 F7 D5 CB BA C5
Unicode   | null                            | -
Auto      | null                            | -
None      | GBK                             | D6 F7 D5 CB BA C5

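The results resemble those above, except that Unicode (and Auto, which behaves like Unicode on Windows) does not get through. This is presumably a type mismatch: with CharSet.Unicode the string fields are marshaled as 2-byte characters, so the C++ struct would need wchar_t arrays rather than char arrays.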
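
To make the suspected mismatch concrete: the SizeConst of a ByValTStr field counts characters, so under CharSet.Unicode each field takes twice as many bytes and every later field moves. The sketch below compares two hypothetical, simplified native layouts (only the first three string fields of the request struct; the names `ReqNarrow` and `ReqWide` are assumptions, not the API's actual types):

```cpp
#include <cstddef>

// Narrow layout: what a C++ API built around char arrays expects
// (matches CharSet.Ansi marshaling).
struct ReqNarrow {
    char TradingDay[9];
    char PrimaryAccountID[32];
    char PrimaryAccountName[64];
};

// Wide layout: what the marshaler lays out for the same SizeConst values
// under CharSet.Unicode, where each element is a 2-byte UTF-16 code unit.
struct ReqWide {
    char16_t TradingDay[9];
    char16_t PrimaryAccountID[32];
    char16_t PrimaryAccountName[64];
};

// The second field starts at byte 9 in one layout and byte 18 in the other,
// so native code compiled against ReqNarrow reads the wrong memory.
static_assert(offsetof(ReqNarrow, PrimaryAccountID) == 9, "narrow offset");
static_assert(offsetof(ReqWide, PrimaryAccountID) == 18, "wide offset");
static_assert(sizeof(ReqWide) == 2 * sizeof(ReqNarrow), "wide is twice as large");
```

If the native API cannot be changed, keeping the struct on CharSet.Ansi (or leaving it unset) is the option the table above shows working.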

Code for the test interfaces and struct definitions:

[StructLayout(LayoutKind.Sequential, CharSet=CharSet.Ansi)]
public class StepReqAddPrimaryAccount
{
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 9)]
	public string? TradingDay;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 32)]
	public string? PrimaryAccountID;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? PrimaryAccountName;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? BrokerPassword;
	public int ChannelID;
	public bool IsAllowLogin;
	public AccountStatusType AccountStatus;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? Password;
	public int RiskGroupID;
	public int CommissionGroupID;
}
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
public class StepReqUpdatePrimaryAccount
{
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 9)]
	public string? TradingDay;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 32)]
	public string? PrimaryAccountID;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? PrimaryAccountName;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? BrokerPassword;
	public int ChannelID;
	public bool IsAllowLogin;
	public AccountStatusType AccountStatus;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? Password;
	public int RiskGroupID;
	public int CommissionGroupID;
}
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.None)]
public class StepReqAddAccount
{
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 9)]
	public string? TradingDay;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 32)]
	public string? AccountID;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? AccountName;
	public AccountStatusType AccountStatus;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? Password;
	public int TradeGroupID;
	public int RiskGroupID;
	public int CommissionGroupID;
}
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Auto)]
public class StepReqUpdateAccount
{
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 9)]
	public string? TradingDay;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 32)]
	public string? AccountID;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? AccountName;
	public AccountStatusType AccountStatus;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? Password;
	public int TradeGroupID;
	public int RiskGroupID;
	public int CommissionGroupID;
}
	[DllImport(LibName, CharSet=CharSet.Ansi, CallingConvention = CallingConvention.StdCall)]
	public static extern int ReqAddPrimaryAccount(StepReqAddPrimaryAccount reqAddPrimaryAccount, int requestID);
	[DllImport(LibName, CharSet = CharSet.Unicode, CallingConvention = CallingConvention.StdCall)]
	public static extern int ReqUpdatePrimaryAccount(StepReqUpdatePrimaryAccount reqUpdatePrimaryAccount, int requestID);
	[DllImport(LibName, CharSet = CharSet.Auto, CallingConvention = CallingConvention.StdCall)]
	public static extern int ReqAddAccount(StepReqAddAccount reqAddAccount, int requestID);
	[DllImport(LibName, CharSet = CharSet.None, CallingConvention = CallingConvention.StdCall)]
	public static extern int ReqUpdateAccount(StepReqUpdateAccount reqUpdateAccount, int requestID);

1.2 Callback functions: C++ => C#

Return value: correct

Encoding used in C++: GBK

CharSet   | Charset received by the C# callback | Decoding result
(not set) | GBK                                 | Correct
Ansi      | GBK                                 | Correct
Unicode   | null                                | Garbled
Auto      | null                                | Garbled
None      | GBK                                 | Correct
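
One plausible reading of the garbled Unicode and Auto rows is that memory holding narrow GBK bytes gets read as 2-byte characters. The following sketch, using a hypothetical helper `as_utf16le_units`, shows what that reinterpretation does to the bytes. It is only an illustration of the effect, not a trace of what the marshaler actually did in the test:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Reinterpret a byte buffer as little-endian UTF-16 code units, which is how
// a 2-byte-per-character view reads memory that really holds narrow bytes.
std::vector<std::uint16_t> as_utf16le_units(const std::string& bytes) {
    assert(bytes.size() % 2 == 0);  // this sketch expects whole code units
    std::vector<std::uint16_t> units;
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        const unsigned lo = static_cast<unsigned char>(bytes[i]);
        const unsigned hi = static_cast<unsigned char>(bytes[i + 1]);
        units.push_back(static_cast<std::uint16_t>(lo | (hi << 8)));
    }
    return units;
}
```

For the GBK bytes D6 F7 D5 CB BA C5 this yields the code units 0xF7D6, 0xCBD5 and 0xC5BA, none of which is one of the intended characters U+4E3B, U+8D26, U+53F7.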

Encoding used in C++: UTF-8

Related test code:

// Set CharSet in turn to: not set, Ansi, Unicode, Auto, None, and test each
[StructLayout(LayoutKind.Sequential, CharSet=CharSet.Ansi)]
public class StepRspInfo
{
	public int ErrorID;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 256)]
	public string? ErrorMsg;
}

Callback delegate definition:

[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate void OnRspAdminUserLogin(StepRspAdminUserLogin? rspAdminUserLogin, StepRspInfo? rspInfo, int requestID, bool isLast);

2 On Linux

2.1 Request functions: C# => C++

2.1.1 Encoding of string parameters

Input text: 主账号

CharSet   | Charset received by the C++ API | Bytes received
(not set) | UTF-8                           | E4 B8 BB E8 B4 A6 E5 8F B7
Ansi      | UTF-8                           | E4 B8 BB E8 B4 A6 E5 8F B7
Unicode   | Unicode (UTF-16LE)              | 3B 4E 26 8D F7 53
Auto      | UTF-8                           | E4 B8 BB E8 B4 A6 E5 8F B7
None      | UTF-8                           | E4 B8 BB E8 B4 A6 E5 8F B7
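
The Unicode row and the UTF-8 rows carry the same three characters in different encodings: 3B 4E is the little-endian form of U+4E3B, which UTF-8 writes as E4 B8 BB. A minimal sketch of that conversion, with hypothetical helper names `utf8_encode` and `utf16_to_utf8`, reproduces the table's bytes:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Encode one Unicode code point as UTF-8 (1 to 4 bytes).
std::string utf8_encode(char32_t cp) {
    assert(cp <= 0x10FFFF);
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Convert UTF-16 to UTF-8, combining surrogate pairs into one code point.
// Unpaired surrogates are passed through unchanged in this sketch.
std::string utf16_to_utf8(const std::u16string& s) {
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        char32_t cp = s[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < s.size() &&
            s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            cp = 0x10000 + ((cp - 0xD800) << 10) + (s[i + 1] - 0xDC00);
            ++i;
        }
        out += utf8_encode(cp);
    }
    return out;
}
```

Applied to the three code units 0x4E3B, 0x8D26, 0x53F7, `utf16_to_utf8` returns the nine bytes E4 B8 BB E8 B4 A6 E5 8F B7 listed in the UTF-8 rows.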

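Test interface code:

    // The import was declared in each of the following five ways in turn (one at a time, since the signatures are identical)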

	[DllImport(LibName, CallingConvention = CallingConvention.StdCall)]
	public static extern void RegisterFront(string ip, int port);

	[DllImport(LibName, CharSet=CharSet.Ansi, CallingConvention = CallingConvention.StdCall)]
	public static extern void RegisterFront(string ip, int port);

	[DllImport(LibName, CharSet=CharSet.Unicode, CallingConvention = CallingConvention.StdCall)]
	public static extern void RegisterFront(string ip, int port);

	[DllImport(LibName, CharSet=CharSet.Auto, CallingConvention = CallingConvention.StdCall)]
	public static extern void RegisterFront(string ip, int port);

	[DllImport(LibName, CharSet=CharSet.None, CallingConvention = CallingConvention.StdCall)]
	public static extern void RegisterFront(string ip, int port);
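2.1.2 Encoding of strings inside struct parameters

The encoding of string fields inside a struct is set through the CharSet of the StructLayout attribute. The CharSet values corresponded to the strings received by the C++ API as follows:

Input text: 主账号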

CharSet   | Charset received by the C++ API | Bytes received
(not set) | UTF-8                           | E4 B8 BB E8 B4 A6 E5 8F B7
Ansi      | UTF-8                           | E4 B8 BB E8 B4 A6 E5 8F B7
Unicode   | null                            | -
Auto      | UTF-8                           | E4 B8 BB E8 B4 A6 E5 8F B7
None      | UTF-8                           | E4 B8 BB E8 B4 A6 E5 8F B7

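The results resemble those above, except that Unicode does not get through. This is presumably a type mismatch: with CharSet.Unicode the string fields are marshaled as 2-byte characters, so the C++ struct would need a 16-bit character array (char16_t on Linux, where wchar_t is 4 bytes) rather than a char array.

Code for the test interfaces and struct definitions: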

[StructLayout(LayoutKind.Sequential, CharSet=CharSet.Ansi)]
public class StepReqAddPrimaryAccount
{
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 9)]
	public string? TradingDay;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 32)]
	public string? PrimaryAccountID;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? PrimaryAccountName;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? BrokerPassword;
	public int ChannelID;
	public bool IsAllowLogin;
	public AccountStatusType AccountStatus;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? Password;
	public int RiskGroupID;
	public int CommissionGroupID;
}
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
public class StepReqUpdatePrimaryAccount
{
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 9)]
	public string? TradingDay;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 32)]
	public string? PrimaryAccountID;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? PrimaryAccountName;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? BrokerPassword;
	public int ChannelID;
	public bool IsAllowLogin;
	public AccountStatusType AccountStatus;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? Password;
	public int RiskGroupID;
	public int CommissionGroupID;
}
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.None)]
public class StepReqAddAccount
{
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 9)]
	public string? TradingDay;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 32)]
	public string? AccountID;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? AccountName;
	public AccountStatusType AccountStatus;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? Password;
	public int TradeGroupID;
	public int RiskGroupID;
	public int CommissionGroupID;
}
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Auto)]
public class StepReqUpdateAccount
{
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 9)]
	public string? TradingDay;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 32)]
	public string? AccountID;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? AccountName;
	public AccountStatusType AccountStatus;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? Password;
	public int TradeGroupID;
	public int RiskGroupID;
	public int CommissionGroupID;
}
	[DllImport(LibName, CharSet=CharSet.Ansi, CallingConvention = CallingConvention.StdCall)]
	public static extern int ReqAddPrimaryAccount(StepReqAddPrimaryAccount reqAddPrimaryAccount, int requestID);
	[DllImport(LibName, CharSet = CharSet.Unicode, CallingConvention = CallingConvention.StdCall)]
	public static extern int ReqUpdatePrimaryAccount(StepReqUpdatePrimaryAccount reqUpdatePrimaryAccount, int requestID);
	[DllImport(LibName, CharSet = CharSet.Auto, CallingConvention = CallingConvention.StdCall)]
	public static extern int ReqAddAccount(StepReqAddAccount reqAddAccount, int requestID);
	[DllImport(LibName, CharSet = CharSet.None, CallingConvention = CallingConvention.StdCall)]
	public static extern int ReqUpdateAccount(StepReqUpdateAccount reqUpdateAccount, int requestID);

2.2 Callback functions: C++ => C#

Return value: correct

Encoding used in C++: GBK

CharSet   | Decoding result in C#
(not set) | Garbled
Ansi      | Garbled
Unicode   | Garbled
Auto      | Garbled
None      | Garbled
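
That every setting is garbled fits the request-side results: on Linux the narrow-string settings use UTF-8, and GBK bytes are generally not well-formed UTF-8. The sketch below is a simplified structural check (the name `is_valid_utf8` is an assumption, and it does not reject every overlong or surrogate encoding) applied to the byte values from this article:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Simplified structural UTF-8 check: verifies lead bytes and that each one
// is followed by the right number of 10xxxxxx continuation bytes.
bool is_valid_utf8(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t len = 0;
        if (lead < 0x80) len = 1;
        else if (lead >= 0xC2 && lead <= 0xDF) len = 2;
        else if (lead >= 0xE0 && lead <= 0xEF) len = 3;
        else if (lead >= 0xF0 && lead <= 0xF4) len = 4;
        else return false;  // stray continuation byte or invalid lead byte
        if (i + len > s.size()) return false;  // sequence is cut off
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;
        }
        i += len;
    }
    return true;
}

// Self-check with the GBK and UTF-8 bytes of the test string in this article.
void check_article_bytes() {
    assert(!is_valid_utf8("\xD6\xF7\xD5\xCB\xBA\xC5"));
    assert(is_valid_utf8("\xE4\xB8\xBB\xE8\xB4\xA6\xE5\x8F\xB7"));
}
```

The GBK sequence fails at its second byte: D6 announces a two-byte sequence, but F7 is not a continuation byte, so a UTF-8 decoder has nothing sensible to produce.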

Encoding used in C++: UTF-8

Related test code:

// Set CharSet in turn to: not set, Ansi, Unicode, Auto, None, and test each
[StructLayout(LayoutKind.Sequential, CharSet=CharSet.Ansi)]
public class StepRspInfo
{
	public int ErrorID;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 256)]
	public string? ErrorMsg;
}

Callback delegate definition:

[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate void OnRspAdminUserLogin(StepRspAdminUserLogin? rspAdminUserLogin, StepRspInfo? rspInfo, int requestID, bool isLast);

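3 Conclusion

When interoperating with a C++ API, use GBK for communication when running on Windows, and UTF-8 when running on Linux. These are the encodings that default marshaling produced and decoded correctly on each platform in the tests above.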
