A process is the smallest unit of resource allocation; a thread is the smallest unit of CPU scheduling.
Every process requests its own independent resources and is isolated from the others. The following example shows that each child process modifies only its own copy of the list, and the parent never sees those changes:
from multiprocessing import Process

list1 = []

def add_num(num):
    list1.append(num)                 # each child process appends to its own copy of list1
    print(list1)

if __name__ == '__main__':
    p_list = []
    for i in range(10):
        p = Process(target=add_num, args=(i,))
        p_list.append(p)
    for p in p_list:
        p.start()
    for p in p_list:
        p.join()
    print(list1)                      # still empty in the parent process

Result:
[3]
[4]
[2]
[9]
[7]
[1]
[6]
[0]
[8]
[5]
[]
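For contrast, the same task written with threads does build up a single list, because threads share the memory of their process. A minimal sketch using the standard threading module:

from threading import Thread

list1 = []

def add_num(num):
    list1.append(num)      # every thread appends to the same list

if __name__ == '__main__':
    t_list = [Thread(target=add_num, args=(i,)) for i in range(10)]
    for t in t_list:
        t.start()
    for t in t_list:
        t.join()
    print(list1)           # contains all ten numbers (order may vary)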
So if several processes need to cooperate on the same task, they have to communicate with each other. How do processes communicate? There are generally three mechanisms:
1. Queue: used the same way as queue.Queue in multithreading, except that the queue must be passed to the child processes as an argument (a more robust way to end the consumer is shown in the sketch after the result below).
from multiprocessing import Process, Queue

list1 = []

def put_num(q):
    for i in range(10):
        q.put(i)

def get_num(q):
    while not q.empty():              # racy: empty() can be True before the producer has put anything
        list1.append(q.get())
        print(list1)

if __name__ == '__main__':
    q = Queue()
    p1 = Process(target=put_num, args=(q,))
    p2 = Process(target=get_num, args=(q,))
    p1.start()
    p2.start()
    p2.join()

Result:
[0]
[0, 1]
[0, 1, 2]
[0, 1, 2, 3]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4, 5]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6, 7]
[0, 1, 2, 3, 4, 5, 6, 7, 8]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
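One caveat about the consumer above: q.empty() is not a reliable signal across processes; if get_num happens to run before put_num has put anything, the loop exits immediately with an empty list. A more robust pattern is to let the producer send a sentinel value marking the end of the data; a minimal sketch, assuming None never appears in the real data:

from multiprocessing import Process, Queue

def put_num(q):
    for i in range(10):
        q.put(i)
    q.put(None)            # sentinel: nothing more will follow

def get_num(q):
    while True:
        num = q.get()      # blocks until an item arrives
        if num is None:    # sentinel received, stop consuming
            break
        print('got:', num)

if __name__ == '__main__':
    q = Queue()
    p1 = Process(target=put_num, args=(q,))
    p2 = Process(target=get_num, args=(q,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()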
2. Pipe: Pipe() returns a pair (conn1, conn2) representing the two ends of a pipe. Pipe takes a duplex parameter. If duplex is True (the default), the pipe is full-duplex and both conn1 and conn2 can send and receive. If duplex is False, conn1 can only receive and conn2 can only send.
send and recv are the methods for sending and receiving messages. In full-duplex mode, for example, you can call conn1.send to send a message and conn1.recv to receive one. If there is no message to receive, recv blocks. If the sending end of the pipe has been closed and nothing is left to read, recv raises EOFError.
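A quick sketch of the duplex=False case described above, run in a single process just to show which end does what:

from multiprocessing import Pipe

# With duplex=False the pipe is one-way: conn1 is the receiving end, conn2 the sending end.
conn1, conn2 = Pipe(duplex=False)
conn2.send('hello')
print(conn1.recv())        # -> 'hello'
# Calling conn1.send(...) here would fail, because conn1 is receive-only.

The full-duplex example below uses one sender and two receiver processes that share the same end of the pipe: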
from multiprocessing import Process, Pipe
import time

def send_num(pipe):
    for i in range(10):
        print('send:', i)
        pipe.send(i)
        time.sleep(0.3)

def recv_num(pipe):
    while True:                       # recv blocks when no message is available,
        num = pipe.recv()             # so this loop never exits on its own
        print('recv1:', num)
        time.sleep(1)

def recv_num2(pipe):
    while True:
        num = pipe.recv()
        print('recv2:', num)
        time.sleep(1.2)

if __name__ == '__main__':
    p = Pipe()
    p1 = Process(target=send_num, args=(p[0],))
    p2 = Process(target=recv_num, args=(p[1],))
    p3 = Process(target=recv_num2, args=(p[1],))
    p1.start()
    p2.start()
    p3.start()
    p1.join()
    p2.join()                         # blocks forever, since the receivers never return; see the sketch after the result
    p3.join()

Result:
send: 0
recv1: 0
send: 1
recv2: 1
send: 2
send: 3
recv1: 2
send: 4
recv2: 3
send: 5
send: 6
recv1: 4
send: 7
send: 8
recv2: 5
send: 9
recv1: 6
recv2: 7
recv1: 8
recv1: 9
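The receivers above never return: once the sender is done, recv simply blocks, so p2.join() and p3.join() wait forever. To make recv raise the EOFError mentioned earlier, every handle of the sending end has to be closed, including the parent's own copy. A minimal sketch with the parent acting as the single receiver:

from multiprocessing import Process, Pipe

def send_num(conn):
    for i in range(5):
        conn.send(i)
    conn.close()               # the sender closes its end when it is done

if __name__ == '__main__':
    recv_end, send_end = Pipe()
    p = Process(target=send_num, args=(send_end,))
    p.start()
    send_end.close()           # the parent must also close its own copy of the sending end
    while True:
        try:
            print('recv:', recv_end.recv())
        except EOFError:       # raised once no open handle of the sending end remains
            break
    p.join()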
3. Manager: a Manager provides real data sharing between processes. The Queue and Pipe above only pass data from one process to another; a Manager holds shared objects (lists, dicts, and so on) that several processes can read and modify.
from multiprocessing import Process, Manager

def run1(dict1, list1):
    dict1['num'] = 1
    list1.reverse()

def run2(dict1, list1):
    dict1['num'] = 2
    list1.sort()

if __name__ == '__main__':
    m = Manager()
    dict1 = m.dict()                  # proxy objects living in the manager process
    list1 = m.list(range(10))
    p1 = Process(target=run1, args=(dict1, list1))
    p2 = Process(target=run2, args=(dict1, list1))
    print(dict1)
    print(list1)
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    print(dict1)
    print(list1)

Result:
{}
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
{'num': 2}
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
What does this mean? When one process modifies a shared container such as a list or dict, the change is visible to the other processes, which can then modify it further on top of that. In the run above, run2's sort ran after run1's reverse, which is why the list ends up back in order and the dict holds {'num': 2}.
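This is also a straightforward fix for the very first example in this section: replace the plain list with a Manager list, and every process's append lands in the same shared object. A minimal sketch:

from multiprocessing import Process, Manager

def add_num(shared_list, num):
    shared_list.append(num)            # the append is forwarded to the manager process

if __name__ == '__main__':
    m = Manager()
    shared_list = m.list()
    p_list = [Process(target=add_num, args=(shared_list, i)) for i in range(10)]
    for p in p_list:
        p.start()
    for p in p_list:
        p.join()
    print(list(shared_list))           # all ten numbers are present, though the order may vary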
Next comes the process pool. Its purpose is simple: keep the number of worker processes within a fixed bound. Pool() with no argument creates as many workers as os.cpu_count() reports.
from multiprocessing import Pool
import time

def run(num):
    print('Go:', num)
    time.sleep(2)                     # simulate an I/O-bound task
    return 'Over:%s' % num

def run_back(msg):
    print('CallBack:', msg)

if __name__ == '__main__':
    pool = Pool()
    rest = []
    for i in range(10):
        # run is the target function and the tuple holds its arguments; callback names a
        # function that the main process calls with run's return value once the task finishes
        p = pool.apply_async(run, (i,), callback=run_back)
        rest.append(p)
    for p in rest:
        p.get()                       # wait for each result after all tasks have been submitted
    pool.close()
    pool.join()

Result:
Go: 0
Go: 1
Go: 2
Go: 3
Go: 4
CallBack: Over:0
Go: 5
CallBack: Over:1
CallBack:Go: Over:2
6
Go:CallBack: Over:3
7
CallBack:Go: Over:4 8
Go: 9
CallBack: Over:5
CallBack: Over:6
CallBack: Over:7
CallBack: Over:8
CallBack: Over:9
That is the process pool's asynchronous, non-blocking usage (the jumbled lines in the output appear because the workers' 'Go' prints and the main process's callback prints interleave on stdout).
There is also a blocking counterpart:
from multiprocessing import Pool
import time

def run(num):
    print('Go:', num)
    time.sleep(2)
    return 'Over:%s' % num

if __name__ == '__main__':
    pool = Pool()
    for i in range(10):
        p = pool.apply(run, (i,))     # apply blocks until run returns
        print(p)
    pool.close()
    pool.join()

The blocking method takes no callback.
Result:
Go: 0
Over:0
Go: 1
Over:1
Go: 2
Over:2
Go: 3
Over:3
Go: 4
Over:4
Go: 5
Over:5
Go: 6
Over:6
Go: 7
Over:7
Go: 8
Over:8
Go: 9
Now everything runs serially.
The real difference between the two comes down to when get is called:
from multiprocessing import Pool
import time

def run(num):
    print('Go:', num)
    time.sleep(2)
    return 'Over:%s' % num

def run_back(msg):
    print('CallBack:', msg)

if __name__ == '__main__':
    pool = Pool()
    rest = []
    for i in range(10):
        p = pool.apply_async(run, (i,), callback=run_back)
        # rest.append(p)
        p.get()            # waiting here serializes the submissions
    # for p in rest:
    #     p.get()
    pool.close()           # close first, then join
    pool.join()            # wait for the pool's workers to finish; if this is commented out, the program just ends

With get moved inside the submission loop, the tasks run one at a time again:
Go: 0
CallBack: Over:0
Go: 1
CallBack: Over:1
Go: 2
CallBack: Over:2
Go: 3
CallBack: Over:3
Go: 4
CallBack: Over:4
Go: 5
CallBack: Over:5
Go: 6
CallBack: Over:6
Go: 7
CallBack: Over:7
Go: 8
CallBack: Over:8
Go: 9
CallBack: Over:9
Take a look at the source code of apply:
def apply(self, func, args=(), kwds={}):
    '''
    Equivalent of `func(*args, **kwds)`.
    Pool must be running.
    '''
    return self.apply_async(func, args, kwds).get()
So apply simply calls self.apply_async(func, args, kwds).get() internally; under the hood it is apply_async followed by an immediate, blocking get.