最近在做expansion of datset,所以需要把扩展的dataset写入到excel中
我已经矩阵运算全部搞定,最终输出的是两个输出 labels 和 features
自己整理为以下格式
label = [[0],
[1],
[2],
[3]
]
feature = [
[0.1, 0.2, 0.3, 0.4, 0.5],
[0.11, 0.21, 0.31, 0.41, 0.51],
[0.6, 0.7, 0.8, 0.9, 1.00],
[1.1, 1.2, 1.3, 1.4, 1.5],
]
先是准备用python带的xlrd xlrd 等包来操作感觉真的不太行
换思路,用第三方包openpyxl来操作
pip install openpyxl
官方文档在这里
https://openpyxl.readthedocs.io/en/stable/index.html
废话不多说,show you my code
# coding=utf-8
from openpyxl import Workbook
import numpy as np
wb = Workbook()
ws = wb.create_sheet("che")
label = [[0],
[1],
[2],
[3]
]
feature = [
[0.1, 0.2, 0.3, 0.4, 0.5],
[0.11, 0.21, 0.31, 0.41, 0.51],
[0.6, 0.7, 0.8, 0.9, 1.00],
[1.1, 1.2, 1.3, 1.4, 1.5],
]
#这个地方之所以 变成numpy格式是因为在很多时候我们都是在numpy格式下计算的,模拟一下预处理
label = np.array(label)
feature = np.array(feature)
label_input = []
for l in range(len(label)):
label_input.append(label[l][0])
ws.append(label_input)
for f in range(len(feature[0])):
ws.append(feature[:, f].tolist())
wb.save("chehongshu.xlsx")
openpyxl包用起来是真的方便,对于写入,只需要建立一个LIST进行append就好了,如果excel为空的那append就从第一行开始递增操作,你也可以理解为一个ws.append()操作就相当于写入一行,如果excel为有数据的时候,那写入操作从没有数据的那一行开始写入;这里也说一下本来想用Insert来着但是忽略了一个条件,就是insert有个前提条件就是For example to insert a row at 7 (before the existing row 7):,意思为插入之前你的数据的大小一定是比要插入的行数或者列数大的,也就是说插入只能插到里面,不能在边缘插。
for col in range(len(label)):
print col
ws.insert_cols(col+1)
for index, row in enumerate(ws.rows):
#print row
if index == 0:
#row[col+1].value = label[col][0]
print "label"
print label[col]
else:
print "feature"
print feature[col][index-1]
#row[col+1].value = feature[col][index-1]
def create_data_expansion(path, sheet):
data_init = pd.read_excel(path, sheet)
# print data_init
data_df = pd.DataFrame(data_init)
print data_df
data_df_transponse = data_df.T
label_expansion = np.array(data_df_transponse.index)
label_expansion_l = []
for l in range(len(label_expansion)):
label_expansion_l.append([l])
feature_expansion = np.array(data_df_transponse)
label_expansion = np.array(label_expansion_l)
return label_expansion, feature_expansion
if __name__ == "__main__":
path_name = "excel_demo.xlsx"
sheet_name = "11"
label, feature = create_data_expansion(path_name, sheet_name)
print label
print feature