For multi-process code built on multiprocessing.Process():
One way to collect return values is multiprocessing.Manager(): build a result store that all worker processes can write into. (The Manager runs a separate server process and hands out proxy objects, rather than literal shared memory.)
A concrete example:
import random
import multiprocessing

# Function executed by each worker process
def worker(procnum, return_dict):
    """worker function"""
    print(str(procnum) + " represent!")
    num = random.randint(5, 20)
    arr = []
    for i in range(num):
        arr.append(i)
    # Store each worker's result, keyed by its process id
    return_dict[procnum] = (procnum, arr)

if __name__ == "__main__":
    manager = multiprocessing.Manager()
    # Build the shared result store (a dict proxy served by the manager process)
    return_dict = manager.dict()
    jobs = []
    for i in range(5):
        # Pass the shared store to each worker, tagging each process with an id
        p = multiprocessing.Process(target=worker, args=(i, return_dict))
        jobs.append(p)
        p.start()
    for proc in jobs:
        proc.join()
    # Once all processes have finished, iterate over the results
    for procnum, arr in return_dict.values():
        print(procnum, arr)
However, when the returned data is very large, the worker raises an error at the point where it stores its result (observed with VS Code on CentOS 7, Python 3.7). I have not found a fix for this yet.
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/multiprocessing/managers.py", line 788, in _callmethod
    conn = self._tls.connection
AttributeError: 'ForkAwareLocal' object has no attribute 'connection'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "excute_fanci2.py", line 357, in product
    return_dict[thread_id] = (domains, has_word)
  File "<string>", line 2, in __setitem__
  File "/usr/local/lib/python3.7/multiprocessing/managers.py", line 792, in _callmethod
    self._connect()
  File "/usr/local/lib/python3.7/multiprocessing/managers.py", line 779, in _connect
    conn = self._Client(self._token.address, authkey=self._authkey)
  File "/usr/local/lib/python3.7/multiprocessing/connection.py", line 492, in Client
    c = SocketClient(address)
  File "/usr/local/lib/python3.7/multiprocessing/connection.py", line 619, in SocketClient
    s.connect(address)
FileNotFoundError: [Errno 2] No such file or directory
An alternative approach, multiprocessing.Pool(), handles large return values; in my experiments so far it has not failed:
import multiprocessing

def worker(args):
    # Pool.map passes a single argument to the worker, so multiple
    # parameters have to be packed into one tuple and unpacked here
    threadname, res_path, thread_id = args
    result = [i for i in range(10000000)]
    return (result, thread_id)

if __name__ == "__main__":
    process_num = 20
    # Create a pool of worker processes
    pool = multiprocessing.Pool(processes=process_num)
    args_list = []
    # Build each worker's argument tuple; with multiple parameters,
    # mind how they are unpacked inside the worker
    for i in range(process_num):
        threadname = "thread" + str(i)
        res_path = str(i) + '_'
        args_list.append((threadname, res_path, i))
    # Hand the argument list to the pool and bind the worker function;
    # map returns a list containing each worker's return value
    results = pool.map(worker, args_list)
    pool.close()
    pool.join()
    for result, thread_id in results:
        print(thread_id, len(result))
Original post: "Python multithreaded programming: getting each thread's return value, and related problems", AdvSoul's blog, CSDN.