更新,一直想试试进程池,因为看了B站的课程后,发现进程池比传统的multiprocess模块要方便多了,还快多了,就一直想试试,之前没有合适的例子,强化学习的部分,因为涉及到的东西太多,也没法当个简单的例子去测试。
刚好昨天做这个转换的时间非常长,典型的计算资源消耗型的任务,因此想着能不能利用起进程池,走一次流程。
试了一下果然效果拔群!
直接用nptdms模块读取的时候,如果你的tdms文件 not valid的话,会报错:
1). ValueError: Segment does not start with TDSm, but with $ '
2). ValueError: Segment does not start with TDSm, but with " %
原因是:
It sounds like you are trying to read files that are not valid TDMS files, as they don’t start with the first four TDSm bytes that should be at the beginning of all TDMS files.
参考链接:
https://github.com/adamreeve/npTDMS/issues/123
因此我只好选择最基本的文件读取方式,按行读取,手动处理为pandas的格式,然后保存为excel的文件格式。
Labview的TMDS格式文件在很多偏硬件相关的工作中经常出现,有时候面临将TDMS转成Excel通用格式的情况,正常来讲Excel只要装有TDMS插件读取是没有问题的,然而面对大量TDMS文件的时候,手动一个个将TDMS文件转成Excel文件显然很浪费时间,以下介绍使用Python配合pandas,xlsxwriter将TMDS文件批量转成Excel文件的方法:
其实就是数字+制表符+换行符的一个内容,按照最基本的文本读取,加上替换,就能完成格式转换。
974.832 1012.613 1016.344 1012.713 1012.009 1015.501 1000.094 1012.158 1011.943 1026.974 1011.168 1011.470 974.860 1011.517 1012.005 1062.471 1011.013 970.696 1012.960 1011.206 1011.439 1003.868 1011.456 1013.407 1011.972 1010.859 1011.276 1024.039 1039.353 1011.088 1012.036 1010.735 1011.564 1011.131 1011.837 1012.786
974.865 1012.567 1016.372 1012.719 1011.996 1015.527 1000.109 1012.148 1011.952 1026.951 1011.189 1011.471 974.846 1011.503 1012.065 1062.495 1011.044 970.726 1012.951 1011.242 1011.460 1003.885 1011.459 1013.423 1011.970 1010.856 1011.310 1024.043 1039.399 1011.112 1012.026 1010.742 1011.571 1011.164 1011.808 1012.732
974.862 1012.610 1016.382 1012.736 1012.038 1015.497 1000.135 1012.132 1011.952 1026.975 1011.145 1011.454 974.825 1011.512 1012.047 1062.503 1011.046 970.711 1012.946 1011.243 1011.487 1003.897 1011.468 1013.414 1012.003 1010.875 1011.280 1024.038 1039.380 1011.098 1012.008 1010.776 1011.573 1011.166 1011.865 1012.772
974.841 1012.577 1016.365 1012.700 1012.008 1015.509 1000.092 1012.156 1011.984 1026.962 1011.171 1011.443 974.826 1011.567 1012.039 1062.488 1011.032 970.715 1012.983 1011.187 1011.442 1003.905 1011.448 1013.408 1011.974 1010.837 1011.264 1024.038 1039.415 1011.086 1012.042 1010.779 1011.574 1011.141 1011.826 1012.772
import xlwt
import codecs
import os
"""
├─tdms_scripts
│ ├─excel
│ └─tdms
│──tmds2excel.py
"""
# 改为你的tmds文件夹!的路径,比如0静止
root_path = r'F:\rl_code\tdms_scripts'
# 修改这里,第0个为1,第1个从1001开始
task_num = 0
# 创建一个excel文件夹
try:
os.mkdir(os.path.join(root_path, 'excel'+str(task_num)))
except:
pass
tdms_files = os.listdir(os.path.join(root_path, 'tdms'))
# 将文件名从小到大排序
tdms_files.sort()
# [print(t) for t in tdms_files]
file_count = 1 + task_num * 1000
for tdms_file in tdms_files:
print("tdms_file:", tdms_file, end='\t')
sheetName = 'Sheet1'
start_row = 0
start_col = 0
wb = xlwt.Workbook(encoding = 'utf-8')
ws = wb.add_sheet(sheetName)
tdms_path = os.path.join(root_path, 'tdms', tdms_file)
f = open(tdms_path, encoding = 'utf-8')
row_excel = start_row
for line in f:
# 消除换行符
line = line.strip('\n')
# 以制表符'\t'作为切分对象,将长字符串,转为短字符串列表;
# 如果原数据用空格切分,则line = line.split(' ')
line = line.split('\t')
col_excel = start_col
len_line = len(line)
for j in range(len_line):
ws.write(row_excel,col_excel,line[j])
col_excel += 1
excel_path = os.path.join(root_path, 'excel'+str(task_num), str(file_count)+'.xls')
# 如果用原名,用下面的句子
# excel_path = os.path.join(root_path, 'excel', tdms_file.replace('tdms', 'xls'))
wb.save(excel_path)
row_excel += 1
f.close
print("excel_path:", excel_path)
file_count += 1
由于单个文件处理时间过长,因此写了一个多进程版本的处理脚本。
可以CPU核心数倍加速处理!
from multiprocessing import Pool
import xlwt
import codecs
import os
import time
"""
├─tdms_scripts
│ ├─excel
│ └─tdms
│──tmds2excel.py
"""
# 改为你的tmds文件夹!的路径,比如0静止
root_path = r'F:\rl_code\calibration\tdms_scripts'
# 修改这里,第0个为1,第1个从1001开始
task_num = 0
# 创建一个excel文件夹
try:
os.mkdir(os.path.join(root_path, 'excel'+str(task_num)))
except:
pass
tdms_files = os.listdir(os.path.join(root_path, 'tdms'))
# 将文件名从小到大排序
tdms_files.sort()
# [print(t) for t in tdms_files]
file_count = 1 + task_num * 1000
def tdms2excel(tdms_index_file):
st = time.time()
index, tdms_file = tdms_index_file
print("tdms_file:", tdms_file, end='\t')
sheetName = 'Sheet1'
start_row = 0
start_col = 0
wb = xlwt.Workbook(encoding = 'utf-8')
ws = wb.add_sheet(sheetName)
tdms_path = os.path.join(root_path, 'tdms', tdms_file)
f = open(tdms_path, encoding = 'utf-8')
row_excel = start_row
for line in f:
# 消除换行符
line = line.strip('\n')
# 以制表符'\t'作为切分对象,将长字符串,转为短字符串列表;
# 如果原数据用空格切分,则line = line.split(' ')
line = line.split('\t')
# print("line:", line)
col_excel = start_col
len_line = len(line)
for j in range(len_line):
ws.write(row_excel, col_excel, line[j])
col_excel += 1
excel_path = os.path.join(root_path, 'excel'+str(task_num), str(index)+'.xls')
# 如果用原名,用下面的句子
# excel_path = os.path.join(root_path, 'excel', tdms_file.replace('tdms', 'xls'))
wb.save(excel_path)
row_excel += 1
f.close
print("excel_path:", excel_path, end='\t')
print("trans_time:", time.time()-st)
if __name__ == '__main__':
p = Pool()
p.map(tdms2excel, [[index, tdms_file] for index, tdms_file in enumerate(tdms_files)])
我的笔记本是八核2.6Ghz的CPU,单个进程的处理时间在20-30秒之间,我用了进程池之后,八个进程一起执行,可以很明显的看到每隔8个,有一个时间分层,八个小工轮着干的效果就体现出来了~
这个比手动打开八个脚本,利用Python新开的脚本自动在另外的CPU核心运行的特性,利用其他CPU核心的方法,要优雅多了~
运行效果:
tdms_file: 1_2021_02_21_22_41_15.tdms excel_path: …cel0\0.xls single_time: 27.0803 total_time: 27.0957
tdms_file: 1_2021_02_21_22_41_58.tdms excel_path: …cel0\4.xls single_time: 27.4848 total_time: 27.5031
tdms_file: 1_2021_02_21_22_41_39.tdms excel_path: …cel0\2.xls single_time: 28.6732 total_time: 28.6935
tdms_file: 1_2021_02_21_22_41_48.tdms excel_path: …cel0\3.xls single_time: 29.9796 total_time: 29.9925
tdms_file: 1_2021_02_21_22_41_26.tdms excel_path: …cel0\1.xls single_time: 30.8471 total_time: 30.8605
tdms_file: 1_2021_02_21_22_42_24.tdms excel_path: …cel0\7.xls single_time: 31.049 total_time: 31.0644
tdms_file: 1_2021_02_21_22_42_14.tdms excel_path: …cel0\6.xls single_time: 32.2027 total_time: 32.2215
tdms_file: 1_2021_02_21_22_42_05.tdms excel_path: …cel0\5.xls single_time: 33.8177 total_time: 33.8405
tdms_file: 1_2021_02_21_22_42_32.tdms excel_path: …cel0\8.xls single_time: 33.5036 total_time: 60.6003
tdms_file: 1_2021_02_21_22_43_05.tdms excel_path: …el0\12.xls single_time: 29.8122 total_time: 60.6732
tdms_file: 1_2021_02_21_22_44_03.tdms excel_path: …el0\14.xls single_time: 28.4923 total_time: 60.7139
tdms_file: 1_2021_02_21_22_42_40.tdms excel_path: …cel0\9.xls single_time: 33.4553 total_time: 60.9589
tdms_file: 1_2021_02_21_22_43_54.tdms excel_path: …el0\13.xls single_time: 29.904 total_time: 60.9688
tdms_file: 1_2021_02_21_22_44_14.tdms excel_path: …el0\15.xls single_time: 28.3613 total_time: 62.2024
tdms_file: 1_2021_02_21_22_42_57.tdms excel_path: …el0\11.xls single_time: 34.8927 total_time: 64.8862
tdms_file: 1_2021_02_21_22_42_49.tdms excel_path: …el0\10.xls single_time: 39.0403 total_time: 67.7338
tdms_file: 1_2021_02_21_22_44_58.tdms excel_path: …el0\20.xls single_time: 25.6077 total_time: 86.577
tdms_file: 1_2021_02_21_22_44_36.tdms excel_path: …el0\17.xls single_time: 30.7522 total_time: 91.4259
tdms_file: 1_2021_02_21_22_44_44.tdms excel_path: …el0\18.xls single_time: 33.2317 total_time: 93.9461
tdms_file: 1_2021_02_21_22_44_51.tdms excel_path: …el0\19.xls single_time: 33.3145 total_time: 94.2744
tdms_file: 1_2021_02_21_22_44_28.tdms excel_path: …el0\16.xls single_time: 34.4439 total_time: 95.0447
tdms_file: 1_2021_02_21_22_45_12.tdms excel_path: …el0\22.xls single_time: 33.2763 total_time: 98.1626
tdms_file: 1_2021_02_21_22_45_05.tdms excel_path: …el0\21.xls single_time: 37.2711 total_time: 99.4745
tdms_file: 1_2021_02_21_22_45_19.tdms excel_path: …el0\23.xls single_time: 34.0878 total_time: 101.8221
tdms_file: 1_2021_02_21_22_45_27.tdms excel_path: …el0\24.xls single_time: 30.2179 total_time: 116.7959
tdms_file: 1_2021_02_21_22_45_35.tdms excel_path: …el0\25.xls single_time: 25.7747 total_time: 117.2011
tdms_file: 1_2021_02_21_22_45_56.tdms excel_path: …el0\28.xls single_time: 25.1771 total_time: 120.2228
tdms_file: 1_2021_02_21_22_45_42.tdms excel_path: …el0\26.xls single_time: 26.8273 total_time: 120.7738
tdms_file: 1_2021_02_21_22_45_50.tdms excel_path: …el0\27.xls single_time: 27.3055 total_time: 121.5819
tdms_file: 1_2021_02_21_22_46_03.tdms excel_path: …el0\29.xls single_time: 23.8782 total_time: 122.0407
ps: 欢迎做强化的同学加群一起学习:
深度强化学习-DRL:799378128
欢迎关注知乎帐号:未入门的炼丹学徒
CSDN帐号:https://blog.csdn.net/hehedadaq