While working on machine vision tasks I have run into cases where processing a batch of images took quite a long time, and I figured at the time that multiprocessing should speed it up. So here I record a code skeleton for it, to be reused whenever needed.
The original code:
from tqdm import tqdm
import glob

def process_img(fname: str):
    try:
        # do something
        i = 1
        return True, ""
    except Exception as e:
        return False, str(e)

if __name__ == '__main__':
    pbar = tqdm(total=1400)
    for fname in glob.iglob("./imgs/*.jpg", recursive=True):
        pbar.update()
        succ, msg = process_img(fname)
        if not succ:
            print(msg)
    pbar.close()
To make the later rewrite easier, we first refactor it with map:
from tqdm import tqdm
import glob

def process_img(fname: str):
    try:
        # do something
        i = 1
        return True, ""
    except Exception as e:
        return False, str(e)

if __name__ == '__main__':
    pbar = tqdm(total=1400)
    for succ, msg in map(process_img, glob.iglob("./imgs/*.jpg", recursive=True)):
        pbar.update()
        if not succ:
            print(msg)
    pbar.close()
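The map-based form matters because map is lazy: it yields one (success, message) tuple per input as the loop consumes it, which is exactly the call shape Pool.map takes over later. A minimal sketch of this pattern, with a toy process_item standing in for the real image function (hypothetical, just for illustration):

```python
def process_item(x: int):
    """Toy stand-in for process_img: returns a (success, message) tuple."""
    try:
        if x < 0:
            raise ValueError(f"negative input: {x}")
        return True, ""
    except Exception as e:
        return False, str(e)

# map is lazy; list() here just drains it so we can inspect the results
results = list(map(process_item, [1, -2, 3]))
```

Each failing input carries its error text in the second slot of the tuple, so the consuming loop only needs to check the boolean.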
Then we rewrite it into a parallel version based on multiprocessing. The code is as follows:
from tqdm import tqdm
import glob
from multiprocessing import Pool

def process_img(fname: str):
    try:
        # do something
        i = 1
        return True, ""
    except Exception as e:
        return False, str(e)

if __name__ == '__main__':
    pbar = tqdm(total=1400)
    with Pool(processes=None) as p:  # process pool; None defaults to the CPU core count
        for succ, msg in p.map(
            process_img,                                 # function to run in parallel
            glob.iglob("./imgs/*.jpg", recursive=True),  # iterable of inputs
            20                                           # chunksize: items per task sent to a worker
        ):
            pbar.update()
            if not succ:
                print(msg)
    pbar.close()
Regarding the chunksize parameter of p.map, the official documentation says:
This method chops the iterable into a number of chunks which it submits to the process pool as separate tasks. The (approximate) size of these chunks can be specified by setting chunksize to a positive integer.
It seems chunksize can also be left unset; something to verify in practice later.
Reference: 实用模块-9-多任务之蚂蚁搬家:multiprocessing模块_哔哩哔哩_bilibili