Sometimes we need to inspect a UTF-8 encoded string, for example to print it while debugging. So how do we print the bytes of such a string? The obvious way is to display every byte of the string as a two-digit hexadecimal number.
Here is some sample code.
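#include <stdio.h>
#include <string.h>

/* Print the first sz bytes of str as two hex digits per byte. */
static void INFO(const char *str, size_t sz)
{
    char buf[1024];
    size_t i, shift = 0;

    /* Each byte needs 2 characters, plus 1 for the terminating '\0'. */
    for (i = 0; i < sz && shift + 3 <= sizeof(buf); i++)
    {
        /* Cast to unsigned char: a plain char >= 0x80 may be negative
           and would otherwise be printed as ffffffXX. */
        snprintf(buf + shift, sizeof(buf) - shift, "%02x", (unsigned char)str[i]);
        shift += 2;
    }
    buf[shift] = '\0';
    printf("strlen buf=%zu\n", strlen(buf));
    printf("%s\n", buf);
}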
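
int main(void)
{
    const char *s = "a123";
    // Alternative: print each byte directly, without the intermediate buffer.
    // const char *p = s;
    // for (; *p != '\0'; p++)
    // {
    //     printf("%02x", (unsigned char)*p);
    // }
    // printf("\n");
    INFO(s, strlen(s));
    return 0;
}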
--------
torstan: ./a.out
strlen buf=8
61313233
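Note:
Do not use strncpy/strcpy to copy a buffer that may contain '\0' bytes. Both functions treat the character '\0' as the end of the string: strcpy stops copying there, and strncpy stops copying there and fills the rest of the destination with '\0'. Well-formed UTF-8 contains a 0x00 byte only when it encodes U+0000 itself, but data that is handled by length rather than by terminator (for example a buffer that mixes text with other encodings or binary fields) can contain zero bytes anywhere.
The solution to this issue is to use memcpy with an explicit length rather than strncpy/strcpy.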
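Here is an example of a buffer before (line 1) and after (line 2) it was copied with strncpy. The two lines match up to the first 00 byte; everything after it has been replaced with zeros in the copy. (The many ff values are most likely bytes of 0x80 and above that were printed from a signed char without a cast to unsigned char.)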
1 ffffff24ff136bff367805ff64141350387dff24ffffffffff6e4a55ffffffffffff1dff00ff26ff
2 ffffff24ff136bff367805ff64141350387dff24ffffffffff6e4a55ffffffffffff1dff00000000