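The iconv function in libiconv is very easy to misuse. If you skip the comment in the iconv.h header and call it the way habit suggests, you will most likely end up baffled by the results.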
size_t iconv (iconv_t cd, char* * inbuf, size_t *inbytesleft, char* * outbuf, size_t *outbytesleft);
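At first glance you would assume that the first argument is the handle, the second is the start of the source string to convert, the third is the length of the source in bytes, the fourth is the buffer that receives the result, and the fifth is the length of the converted result.
So you call it that way, and you get a segmentation fault or wrong output.
A closer look at the comment in the header: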
/* Converts, using conversion descriptor 'cd', at most '*inbytesleft' bytes
starting at '*inbuf', writing at most '*outbytesleft' bytes starting at
'*outbuf'.
Decrements '*inbytesleft' and increments '*inbuf' by the same amount.
Decrements '*outbytesleft' and increments '*outbuf' by the same amount. */
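Only then does it become clear: the second argument does point to the start of the source string, but iconv changes the value of *inbuf as it runs. It likewise advances *outbuf and decrements both *inbytesleft and *outbytesleft.
The variable names say as much: they are called bytesleft, not byteslen. So the object behind the fifth argument, *outbytesleft, must hold the capacity of the result buffer on entry, not the length of the result. For GBK to UTF-8, twice the source length is enough, because a 2-byte GBK character becomes at most 3 bytes of UTF-8; other encoding pairs may need more, and the test program here simply uses 5 times.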
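One way to keep these in/out parameters from biting is to hand iconv working copies of the pointers and counts and to derive the result length from what is left of the capacity. The following is only a minimal sketch under some assumptions: convert_buffer is a hypothetical helper name, not part of libiconv, and the cast on the input pointer assumes the `char **` prototype shown above (some older iconv.h versions declare `const char **` instead).

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper (not a libiconv function): converts src_len bytes of
 * src from encoding `from` to encoding `to`, writing at most dst_cap bytes
 * into dst. Returns the number of bytes written, or (size_t)-1 on failure. */
size_t convert_buffer(const char *to, const char *from,
                      const char *src, size_t src_len,
                      char *dst, size_t dst_cap)
{
    assert(src != NULL && dst != NULL);

    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    /* iconv advances these working copies; src and dst stay untouched. */
    char *in = (char *)src;      /* iconv only reads the input bytes */
    char *out = dst;
    size_t in_left = src_len;    /* bytes still to be read */
    size_t out_left = dst_cap;   /* free capacity, not a result length */

    size_t rc = iconv(cd, &in, &in_left, &out, &out_left);
    if (rc != (size_t)-1) {
        /* Flush any pending shift sequence (matters for stateful encodings). */
        rc = iconv(cd, NULL, NULL, &out, &out_left);
    }
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;

    /* Bytes written = initial capacity minus what is left. */
    return dst_cap - out_left;
}
```

A caller would pass something like convert_buffer("UTF-8", "GBK", src, strlen(src), dst, sizeof dst) and add the terminating NUL itself, since iconv does not write one.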
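/* Demo: convert a GBK string to UTF-8 and watch how iconv moves the pointers. */
#include <stdio.h>
#include <stdlib.h>   /* malloc, free */
#include <string.h>   /* strlen */
#include <iconv.h>
int main(void)
{
    /* "我是中国" spelled out as GBK bytes, so the result does not depend
       on the encoding the source file is saved in */
    char* strGB = "\xce\xd2\xca\xc7\xd6\xd0\xb9\xfa";
    size_t lenSrc = strlen(strGB);   /* iconv takes size_t*, not int* */
    size_t lenDst = lenSrc*5;        /* capacity of the output buffer */
    printf("lenDst=%zu\n", lenDst);
    char* out = (char*) malloc(lenDst + 1);   /* +1 for the terminating NUL */
    if (out == NULL) return 1;
    char* pFreeOut = out;   /* iconv advances out; keep the start for printing and free */
    iconv_t cd = iconv_open("UTF-8", "GBK");
    if (cd == (iconv_t)-1) { perror("iconv_open"); free(pFreeOut); return 1; }
    printf("1:strGB=%p, out=%p\n", (void*)strGB, (void*)out);
    size_t ret = iconv(cd, &strGB, &lenSrc, &out, &lenDst);
    if (ret == (size_t)-1) perror("iconv");
    *out = '\0';   /* iconv does not NUL-terminate its output */
    printf("2:strGB=%p, out=%p\n", (void*)strGB, (void*)out);
    printf("ret=%zu\n", ret);
    printf("lenDst=%zu\n", lenDst);
    printf("pFreeOut=%s\n", pFreeOut);
    iconv_close(cd);
    free(pFreeOut);
    return 0;
}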
Output:
lenDst=40
1:strGB=0x80487e8, out=0x9714008
2:strGB=0x80487f0, out=0x9714014
ret=0
lenDst=28
pFreeOut=我是中国
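After conversion, "我是中国" is 40-28=12 bytes long, which matches the distance the out pointer moved: 0x9714014-0x9714008=0xC=12. Likewise, strGB advanced by 0x80487f0-0x80487e8=8, the length of the GBK source.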
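Because iconv leaves the pointers and the remaining counts updated, a call that ran out of output space can be continued after the buffer is enlarged, instead of starting over. The sketch below illustrates this under stated assumptions: convert_grow is a hypothetical helper, not part of libiconv; it relies on iconv returning (size_t)-1 with errno set to E2BIG when the output buffer is full, and it treats every other error as fatal.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>

/* Hypothetical helper (not a libiconv function): converts src_len bytes of
 * src with an already opened descriptor cd, doubling the output buffer each
 * time iconv reports E2BIG. On success returns a malloc'ed buffer that the
 * caller must free and stores the byte count in *out_len; returns NULL on
 * failure. The result is not NUL-terminated. */
char *convert_grow(iconv_t cd, const char *src, size_t src_len, size_t *out_len)
{
    assert(src != NULL && out_len != NULL);

    size_t cap = src_len + 1;    /* deliberately small first guess */
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    char *in = (char *)src;      /* iconv only reads the input bytes */
    size_t in_left = src_len;
    char *out = buf;
    size_t out_left = cap;

    while (in_left > 0) {
        if (iconv(cd, &in, &in_left, &out, &out_left) != (size_t)-1)
            break;               /* all input consumed */
        if (errno != E2BIG) {    /* invalid or truncated input: give up */
            free(buf);
            return NULL;
        }
        /* Output full: grow the buffer and resume where iconv stopped. */
        size_t used = (size_t)(out - buf);
        size_t new_cap = cap * 2;
        char *grown = realloc(buf, new_cap);
        if (grown == NULL) {
            free(buf);
            return NULL;
        }
        buf = grown;
        out = buf + used;        /* realloc may have moved the block */
        out_left = new_cap - used;
        cap = new_cap;
    }

    *out_len = cap - out_left;   /* same arithmetic as 40-28=12 above */
    return buf;
}
```

The in pointer and in_left need no adjustment between rounds, since iconv has already moved them past the bytes it converted.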