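How many bytes a Chinese character occupies depends on the encoding: in UTF-8 a typical Chinese character takes 3 bytes, while in GB2312 it takes 2 bytes.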
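a = '成功'
# Encode to UTF-8: 2 characters become 6 bytes (3 bytes each)
zijie = a.encode('utf-8')
print(zijie, len(zijie))  # b'\xe6\x88\x90\xe5\x8a\x9f' 6
# Decoding restores the string; len() counts characters, not bytes
hanzi = zijie.decode('utf-8')
print(hanzi, len(hanzi))  # 成功 2
# Encode to GB2312: the same 2 characters become 4 bytes (2 bytes each)
zijie2 = a.encode('gb2312')
print(zijie2, len(zijie2))  # b'\xb3\xc9\xb9\xa6' 4
hanzi2 = zijie2.decode('gb2312')
print(hanzi2, len(hanzi2))  # 成功 2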