Pandas: A Small Introductory Example

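A colleague who does testing needed a small feature: start from an Excel sheet that contains only a header row, then pass in a list and check each header against it. If the header is in the list, write 'ok' under that column; if not, write 'X'. It is a very simple example.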

Installation

pip install pandas==1.0.0
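Original sheet:

[Image: the sheet before processing, header row only]

Processed sheet:

[Image: the sheet after processing, ok or X under each header]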

Reading the Excel file

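import pandas as pd

# sheet_name=0 reads the first sheet; read_excel has no index parameter.
# dict() turns the DataFrame into {header: Series}, as in the full source below.
data = dict(pd.read_excel(path, sheet_name=0))
  • At the start every column is empty apart from its header, so each column has to be given an empty list: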
header = set(data.keys())
for i in header:
    if not len(data[i]):
        data[i] = []
    else:
        data[i] = list(data[i])
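
The conversion above can be tried without an Excel file. The following is a minimal sketch, assuming a header-only DataFrame built in memory stands in for the result of read_excel; the three column names are taken from the sample output in this post, and `columns` is a name made up for the illustration.

```python
import pandas as pd

# Stand-in for a sheet that has a header row and no data rows (assumption:
# read_excel on such a sheet also yields zero-length columns).
df = pd.DataFrame(columns=["账户", "Dashboard", "订单列表"])

# df.items() yields (header, Series) pairs; list() turns each Series into a
# plain Python list, which is [] for an empty column.
columns = {name: list(series) for name, series in df.items()}

print(columns)
```

Because list() on an empty Series already gives an empty list, the single comprehension covers both branches of the if/else shown above.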
  • A column that is not empty also has to be converted to a list; otherwise what you read back is not a list but table-like pandas data:
账户 Dashboard 订单列表 账户花费 系列花费 产品审核 SKU列表 广告账户 用户管理 部门管理 店铺管理 部门配置 团队ROI报表 异常数据 SKU销量 广告素材 个人中心
0  X         X   ok   ok   ok   ok    ok   ok   ok   ok   ok   ok      ok   ok    ok   ok   ok

Printing each column in a loop gives:

>>> for i in data.keys():
...     print(i, data[i])
...
账户 0    X
Name: 账户, dtype: object
Dashboard 0    X
Name: Dashboard, dtype: object
订单列表 0    ok
Name: 订单列表, dtype: object
账户花费 0    ok
Name: 账户花费, dtype: object
系列花费 0    ok
Name: 系列花费, dtype: object
产品审核 0    ok
Name: 产品审核, dtype: object
SKU列表 0    ok
Name: SKU列表, dtype: object
广告账户 0    ok
Name: 广告账户, dtype: object
...
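  • Adding to or changing a column without converting it to a list first also fails, because append on a Series expects another Series rather than a string: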
Traceback (most recent call last):
  File "test.py", line 36, in <module>
    main()
  File "test.py", line 20, in main
    data[key].append('ok')
  File "E:\test\venv\lib\site-packages\pandas\core\series.py", line 2582, in append
    return concat(
  File "E:\test\venv\lib\site-packages\pandas\core\reshape\concat.py", line 271, in concat
    op = _Concatenator(
  File "E:\test\venv\lib\site-packages\pandas\core\reshape\concat.py", line 357, in __init__
    raise TypeError(msg)
TypeError: cannot concatenate object of type '<class 'str'>'; only Series and DataFrame objs are valid
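
As a rough illustration of the two ways around this error, the sketch below uses a one-row DataFrame built in memory in place of the real sheet. The variable names (`column`, `values`, `extended`) are made up for the example.

```python
import pandas as pd

# A one-row frame standing in for a sheet that already holds one result row.
df = pd.DataFrame({"订单列表": ["ok"], "Dashboard": ["X"]})

column = df["订单列表"]   # a pandas Series, not a list

# Option 1: convert to a plain list, then use list.append.
values = list(column)
values.append("ok")

# Option 2: stay in pandas by wrapping the new value in a Series
# and concatenating the two Series.
extended = pd.concat([column, pd.Series(["ok"])], ignore_index=True)

print(values)
print(list(extended))
```

The post takes the first route, which keeps the rest of the code working with ordinary Python lists.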
  • Intersection of the headers
new_list = set(list1) & set(list2)
or
set(list1).intersection(list2)
  • Difference of the headers
new_list = set(list1) - set(list2)
or
set(list1).difference(list2)
  • Union of the headers
new_list = set(list1) | set(list2)
or
set(list1).union(list2)
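
A small worked example of the three set operations, using a shortened header list and input list modelled on the data in this post (the exact values are chosen for illustration only):

```python
header = ["账户", "Dashboard", "订单列表", "账户花费"]   # columns of the sheet
nb = ["A", "Dashb", "订单列表"]                          # list passed in

common = set(header) & set(nb)    # headers that also appear in nb
missing = set(header) - set(nb)   # headers that do not appear in nb
either = set(header) | set(nb)    # everything from both

# The method forms must be called on a set, but accept any iterable.
same_common = set(header).intersection(nb)

print(common)
print(sorted(missing))
print(len(either))
```

Note that "Dashb" does not match "Dashboard": set membership compares whole strings, so a partial name counts as missing.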
  • Then mark the headers in the intersection with 'ok' and the headers in the difference with 'X':
for key in new_list:
    if len(data[key]):
        data[key].append('ok')
    else:
        data[key] = ['ok']
cj_hd = header - set(nb)
for key in cj_hd:
    n_data = 'X' if key != '账户' else user_name  # special-case one column if needed
    if len(data[key]):
        data[key].append(n_data)
    else:
        data[key] = [n_data]
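
The marking step can also be written as a single pass over the columns. This is a sketch under the assumption that every column is already a plain list; `mark_columns` and its parameter names are hypothetical and do not appear in the original code.

```python
def mark_columns(columns, passed, user_name, user_column="账户"):
    """Append one result to every column list.

    Hypothetical helper: 'ok' if the header is in `passed`, otherwise 'X',
    except that `user_column` receives `user_name` instead of 'X'.
    """
    passed = set(passed)
    for key, values in columns.items():
        if key in passed:
            values.append("ok")
        elif key == user_column:
            values.append(user_name)
        else:
            values.append("X")
    return columns


columns = {"账户": [], "Dashboard": [], "订单列表": []}
mark_columns(columns, ["Dashb", "订单列表"], user_name="1221")
print(columns)
```

Every header falls into exactly one branch, so each call adds one value to every column and the columns stay the same length.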
  • If you open the file to check it after a test run, close it before running the program again to update it; otherwise the write fails:
Traceback (most recent call last):
  File "test.py", line 37, in <module>
    main()
  File "test.py", line 32, in main
    df.to_excel('3.xlsx', index=False)
  File "E:\test\venv\lib\site-packages\pandas\core\generic.py", line 2174, in to_excel
    formatter.write(
  File "E:\test\venv\lib\site-packages\pandas\io\formats\excel.py", line 738, in write
    writer.save()
  File "E:\test\venv\lib\site-packages\pandas\io\excel\_openpyxl.py", line 43, in save
    return self.book.save(self.path)
  File "E:\test\venv\lib\site-packages\openpyxl\workbook\workbook.py", line 392, in save
    save_workbook(self, filename)
  File "E:\test\venv\lib\site-packages\openpyxl\writer\excel.py", line 291, in save_workbook
    archive = ZipFile(filename, 'w', ZIP_DEFLATED, allowZip64=True)
  File "c:\users\user\appdata\local\programs\python\python38\lib\zipfile.py", line 1216, in __init__
    self.fp = io.open(file, filemode)
PermissionError: [Errno 13] Permission denied: 'xxx.xlsx'
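
One way to make this failure easier to read is to catch the PermissionError and print a hint. The sketch below is only an illustration: `write_when_closed` is a hypothetical helper, not part of pandas, and it takes the write step as a function so it can be tried without a real workbook.

```python
def write_when_closed(write, path):
    """Run write(path); report a locked file instead of raising.

    Hypothetical helper for illustration, not a pandas function.
    """
    try:
        write(path)
    except PermissionError:
        # Raised when the target cannot be opened for writing,
        # for example while the workbook is still open in Excel.
        print(f"Cannot write {path}: close the file and run the program again.")
        return False
    return True


# With a DataFrame df, the call would look like this:
# write_when_closed(lambda p: df.to_excel(p, index=False), "test.xlsx")
```

The helper returns True or False so the caller can decide whether to retry or stop.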
Full source code
import pandas as pd
import os


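def main(path, user_name, nb):
    # Read the first sheet as {header: Series}.
    # sheet_name=0 selects the first sheet; read_excel has no index parameter.
    data = dict(pd.read_excel(path, sheet_name=0))
    header = set(data.keys())
    # Convert every column to a plain list so that append() works.
    for i in header:
        if not len(data[i]):
            data[i] = []
        else:
            data[i] = list(data[i])
    # Headers that appear in nb get 'ok'.
    new_b = header & set(nb)
    for key in new_b:
        if len(data[key]):
            data[key].append('ok')
        else:
            data[key] = ['ok']
    # Headers missing from nb get 'X'; the '账户' column gets user_name instead.
    cj_hd = header - set(nb)
    for key in cj_hd:
        n_data = 'X' if key != '账户' else user_name
        if len(data[key]):
            data[key].append(n_data)
        else:
            data[key] = [n_data]
    df = pd.DataFrame(data=data)

    # The file must be closed in Excel at this point, or PermissionError is raised.
    df.to_excel(path, index=False)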



if __name__ == '__main__':
    path = os.path.join(os.getcwd(), 'test.xlsx')
    nb = ["A","Dashb", "订单列表",  "店铺管理", "产品审核", "SKU列表", "广告账户", "用户管理", "部门配置", "团队ROI报表", "异常数据",
          "SKU销量", "广告素材", "个人中心", ]
    user_name = '1221'
    main(path, user_name, nb)
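
For comparison, the same result row can be built without converting columns to lists, by creating one new row and concatenating it to the DataFrame. This is an alternative sketch, not the approach used in the source above; `add_result_row` and `user_column` are hypothetical names, and the frame is built in memory instead of being read from Excel.

```python
import pandas as pd


def add_result_row(df, nb, user_name, user_column="账户"):
    """Return a new DataFrame with one extra row of 'ok' / 'X' marks.

    Hypothetical helper for illustration.
    """
    passed = set(nb)
    row = {}
    for col in df.columns:
        if col in passed:
            row[col] = "ok"
        elif col == user_column:
            row[col] = user_name
        else:
            row[col] = "X"
    # Build a one-row frame with the same column order and append it.
    new_row = pd.DataFrame([row], columns=df.columns)
    return pd.concat([df, new_row], ignore_index=True)


sheet = pd.DataFrame(columns=["账户", "Dashboard", "订单列表"])
sheet = add_result_row(sheet, ["订单列表"], "1221")
print(sheet)
```

The function returns a new DataFrame instead of modifying its argument, so the caller decides when to write the result back with to_excel.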

