[Python Notes 2.2] Unicode encoding and decoding problems when extracting a zip archive with zipfile

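For background on Unicode encoding and decoding in Python, see [Python Notes 2.1].
Extracting zip archives with zipfile is already covered by plenty of material online, so I won't repeat it here.
The problem addressed here is that a zip archive often stores its file names in a legacy encoding such as GBK rather than UTF-8, so the extracted files end up with garbled names. Below, the encode/decode helper functions summarized in [Python Notes 2.1] are used to fix this when extracting with zipfile. Time is short, so straight to the code.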
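
Before the full script, the core idea can be sketched in isolation: take the raw bytes of a file name and try a list of candidate encodings until one decodes without error. The sketch below is written for Python 3 so it runs on a current interpreter, while the article's own code targets Python 2. The name `decode_with_fallback` and the candidate list are assumptions made for illustration, not part of the original notes.

```python
# Illustrative sketch (Python 3). The helper name and the encoding list are
# assumptions for this example, not taken from the original article.
CANDIDATE_ENCODINGS = ("utf-8", "gbk", "big5")


def decode_with_fallback(raw, encodings=CANDIDATE_ENCODINGS):
    """Return (text, encoding) for the first encoding that decodes raw.

    Returns (None, None) when every candidate fails.
    """
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            # This candidate cannot represent the bytes; try the next one.
            continue
    return None, None
```

This is a heuristic: the first encoding that decodes without error wins, even if it is not the one the archive was created with. UTF-8 goes first because bytes from other encodings rarely form valid UTF-8, whereas the reverse is not true.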

Complete example code (including the code from [Python Notes 2.1])

#!/usr/bin/env python2
# -*- coding: utf-8 -*-
"""
Created on Mon Jul 23 15:40:00 2018

@author: 
"""
import os
import zipfile


# UTF_8_BOM = b'\xef\xbb\xbf'


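def un_zip(filepath, dst_dir):
    """Extract filepath into dst_dir, renaming each entry to a UTF-8 file name.

    Returns (0, message) on success and (-1, message) on failure.
    """
    encoding = 'utf-8'
    try:
        dst_dir = try_encode(dst_dir, encoding)
        zip_file = zipfile.ZipFile(filepath)
        try:
            for name in zip_file.namelist():
                # extract() writes the entry under its stored name, which may not be UTF-8.
                unzip_file = zip_file.extract(name, dst_dir)
                # Decode the stored name, then re-encode it as UTF-8 for the final path.
                rename_file = os.path.join(dst_dir, try_encode(str_decode(name), encoding))
                if rename_file != unzip_file:
                    os.rename(unzip_file, rename_file)
        finally:
            # Close the archive even if extraction or renaming fails.
            zip_file.close()
        return 0, 'unzip Success'
    except Exception as err:
        print('[un_zip]: Exception.ERROR: %s' % err)
        return -1, 'ERROR, unzip error! please upload the zip package named in utf-8 format.'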


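def try_encode(s, encoding="utf-8"):
    """Encode s with encoding; return None (after logging) if that fails."""
    if s is None:
        print('[try_encode]: input param None!')
        return s
    try:
        return s.encode(encoding)
    except UnicodeError as err:
        # On a byte str, Python 2 first decodes implicitly as ASCII, which
        # raises UnicodeDecodeError rather than UnicodeEncodeError.
        print(err)
        return None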


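def try_decode(s, decoding="utf-8"):
    """Decode s with decoding; return None (after logging) if that fails."""
    if isinstance(s, unicode):
        return s  # already decoded (zipfile returns unicode for UTF-8-flagged names)
    try:
        return s.decode(decoding)
    except UnicodeError as err:
        print(err)
        return None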


def str_decode(string):
    """Decode string by trying common encodings in order; return None if all fail."""
    # Order matters: the first encoding that decodes without error wins.
    for encoding in ("utf-8", "ascii", "GB2312", "GBK", "Big5"):
        dec = try_decode(string, encoding)
        if dec is not None:
            return dec
    print('[str_decode]: unknown encoding')
    return None
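
For readers on Python 3, the same idea needs a slightly different shape, so here is a rough sketch rather than a port of the script above. Python 3's zipfile decodes any file name that is not flagged as UTF-8 using cp437, so the original bytes can be recovered with `encode("cp437")` and then decoded again with a guessed encoding. The names `recover_name` and `unzip_with_names`, and the list of candidate encodings, are assumptions made for illustration. Files are written directly under the recovered name instead of being extracted and renamed, and paths that would escape the destination directory are rejected.

```python
# Illustrative sketch for Python 3; function names and the encoding list are
# assumptions for this example, not part of the original script.
import os
import shutil
import zipfile

UTF8_FLAG = 0x800  # general-purpose bit 11: the name is stored as UTF-8


def recover_name(info, encodings=("utf-8", "gbk", "big5")):
    """Return a best-guess text file name for a ZipInfo entry."""
    if info.flag_bits & UTF8_FLAG:
        return info.filename  # zipfile already decoded it as UTF-8
    try:
        raw = info.filename.encode("cp437")  # undo zipfile's default decoding
    except UnicodeEncodeError:
        return info.filename
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    return info.filename  # give up and keep the cp437 reading


def unzip_with_names(zip_path, dst_dir):
    """Extract zip_path into dst_dir using recovered names; return written paths."""
    root = os.path.realpath(dst_dir)
    written = []
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            target = os.path.realpath(os.path.join(root, recover_name(info)))
            # Refuse entries such as "../x" that would land outside dst_dir.
            if os.path.commonpath([root, target]) != root:
                raise ValueError("unsafe path in archive: %r" % info.filename)
            if info.is_dir():
                os.makedirs(target, exist_ok=True)
                continue
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with zf.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written
```

A call such as `unzip_with_names("upload.zip", "out")` (both paths hypothetical) returns the list of files written. On Python 3.11 and later, `zipfile.ZipFile` also accepts a `metadata_encoding` argument for reading archives whose names use a known legacy encoding, which may be simpler when the encoding does not have to be guessed.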
