Python: converting between Chinese characters and byte sequences (encoding and decoding Chinese text)

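How many bytes a Chinese character occupies depends on the encoding used: in UTF-8 a common Chinese character takes 3 bytes, while in GB2312 it takes 2 bytes.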
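
To make the per-character sizes visible, here is a minimal sketch. The helper name `bytes_per_char` is hypothetical (it is not part of the original post or of the standard library); it only relies on the built-in `str.encode`. It also shows that ASCII letters stay at 1 byte in both encodings, and that a rare character outside the Basic Multilingual Plane takes 4 bytes in UTF-8.

```python
# Hypothetical helper for illustration: encoded size of each character.
def bytes_per_char(text, encoding):
    """Return a list of (character, number of bytes in `encoding`)."""
    return [(ch, len(ch.encode(encoding))) for ch in text]

print(bytes_per_char('成功', 'utf-8'))   # [('成', 3), ('功', 3)]
print(bytes_per_char('成功', 'gb2312'))  # [('成', 2), ('功', 2)]
print(bytes_per_char('a成', 'utf-8'))    # [('a', 1), ('成', 3)]
print(bytes_per_char('a成', 'gb2312'))   # [('a', 1), ('成', 2)]

# U+20000 lies outside the Basic Multilingual Plane: 4 bytes in UTF-8.
print(bytes_per_char('a\U00020000', 'utf-8'))
```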

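a = '成功'
# str -> bytes with UTF-8: 3 bytes per character here
zijie = a.encode('utf-8')
print(zijie, len(zijie))  # b'\xe6\x88\x90\xe5\x8a\x9f' 6

# bytes -> str: len() now counts characters, not bytes
hanzi = zijie.decode('utf-8')
print(hanzi, len(hanzi))  # 成功 2

# str -> bytes with GB2312: 2 bytes per character
zijie2 = a.encode('gb2312')
print(zijie2, len(zijie2))  # b'\xb3\xc9\xb9\xa6' 4

# decode with the same codec that was used to encode
hanzi2 = zijie2.decode('gb2312')
print(hanzi2, len(hanzi2))  # 成功 2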
