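Inter-process communication and process pools in Python multiprocessing

A process is the smallest unit of resource allocation; a thread is the smallest unit of CPU scheduling.

Each process acquires its own resources, and processes are isolated from one another.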

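from multiprocessing import Process

list1 = []

def add_num(num):
    # Each child process appends to its own copy of list1
    list1.append(num)
    print(list1)


if __name__ == '__main__':

    p_list = []
    for i in range(10):
        p = Process(target=add_num, args=(i,))
        p_list.append(p)

    for p in p_list:
        p.start()

    for p in p_list:
        p.join()

    # The parent's list1 is still empty
    print(list1)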

Output

[3]
[4]
[2]
[9]
[7]
[1]
[6]
[0]
[8]
[5]
[]

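So when several processes need to work on the same task, they have to communicate with each other. How do processes communicate? Generally speaking, there are three ways:

1. Queue: used the same way queue.Queue is used with threads, except that the queue you create must be passed to the child processes as an argument.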

from multiprocessing import Process, Queue

list1 = []

def put_num(q):
    for i in range(10):
        q.put(i)
    q.put(None)  # sentinel: nothing more to read


def get_num(q):
    # q.empty() is unreliable across processes, so read until the sentinel
    while True:
        num = q.get()
        if num is None:
            break
        list1.append(num)
        print(list1)

if __name__ == '__main__':

    q = Queue()
    p1 = Process(target=put_num, args=(q,))
    p2 = Process(target=get_num, args=(q,))
    p1.start()
    p2.start()
    p2.join()
    p1.join()

Output

[0]
[0, 1]
[0, 1, 2]
[0, 1, 2, 3]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4, 5]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6, 7]
[0, 1, 2, 3, 4, 5, 6, 7, 8]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
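
Coming back to the very first example, a Queue is one way for the parent to actually collect what each child produced. The following is only a minimal sketch, not the code from this post: the names worker and collect are made up for illustration, and it assumes each child puts exactly one item, so the parent knows how many get() calls to make.

```python
from multiprocessing import Process, Queue


def worker(q, num):
    # Runs in a child process; the queue carries the value back to the parent.
    q.put(num)


def collect(count):
    q = Queue()
    procs = [Process(target=worker, args=(q, i)) for i in range(count)]
    for p in procs:
        p.start()
    # Drain the queue before joining: exactly one item per child is expected.
    results = [q.get() for _ in range(count)]
    for p in procs:
        p.join()
    # Arrival order depends on scheduling, so sort for a stable result.
    return sorted(results)


if __name__ == '__main__':
    print(collect(10))
```

Unlike the global list in the first example, the values end up in the parent, because they travel through the queue instead of living in each child's private memory.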

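2. Pipe: Pipe() returns a pair (conn1, conn2) representing the two ends of a pipe. It takes a duplex parameter. If duplex is True (the default), the pipe is bidirectional, so both conn1 and conn2 can send and receive. If duplex is False, conn1 can only receive messages and conn2 can only send them.
send and recv are the methods for sending and receiving messages. For example, in duplex mode you can call conn1.send to send a message and conn1.recv to receive one. If there is nothing to receive, recv blocks. If there is nothing left to receive and the other end has been closed, recv raises EOFError. The documentation also warns that data in a pipe may become corrupted if two processes read from (or write to) the same end at the same time, so the two-reader example below is for demonstration only.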

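import time
from multiprocessing import Process, Pipe

def send_num(pipe):
    for i in range(10):
        print('send:', i)
        pipe.send(i)
        time.sleep(0.3)
    # One sentinel per receiver, so that both receive loops can end
    pipe.send(None)
    pipe.send(None)

def recv_num(pipe):
    while True:
        num = pipe.recv()
        if num is None:
            break
        print('recv1:', num)
        time.sleep(1)

def recv_num2(pipe):
    while True:
        num = pipe.recv()
        if num is None:
            break
        print('recv2:', num)
        time.sleep(1.2)


if __name__ == '__main__':
    p = Pipe()
    p1 = Process(target=send_num, args=(p[0],))
    # Both receivers share p[1]; the sleeps keep their reads apart in this demo
    p2 = Process(target=recv_num, args=(p[1],))
    p3 = Process(target=recv_num2, args=(p[1],))

    p1.start()
    p2.start()
    p3.start()

    p1.join()
    p2.join()
    p3.join()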

Output

send: 0
recv1: 0
send: 1
recv2: 1
send: 2
send: 3
recv1: 2
send: 4
recv2: 3
send: 5
send: 6
recv1: 4
send: 7
send: 8
recv2: 5
send: 9
recv1: 6
recv2: 7
recv1: 8
recv1: 9
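
The duplex=False mode and the EOFError behaviour described earlier can be sketched as follows. This is an illustration under stated assumptions rather than the post's own code: producer and consume are hypothetical names, and there is a single reader, so no end of the pipe is shared between two readers.

```python
from multiprocessing import Process, Pipe


def producer(conn, count):
    for i in range(count):
        conn.send(i)
    # Closing the sending end lets the reader see end-of-stream.
    conn.close()


def consume(count):
    # duplex=False: recv_end can only receive, send_end can only send.
    recv_end, send_end = Pipe(duplex=False)
    p = Process(target=producer, args=(send_end, count))
    p.start()
    # The parent closes its own copy of the sending end; otherwise
    # recv() would block forever instead of raising EOFError.
    send_end.close()
    received = []
    while True:
        try:
            received.append(recv_end.recv())
        except EOFError:
            break
    p.join()
    return received


if __name__ == '__main__':
    print(consume(5))
```

Relying on EOFError means no sentinel value is needed, but every copy of the sending end has to be closed before the reader sees the end of the stream.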

3. Manager: a Manager lets processes share data, whereas the queue and the pipe above only pass data from one process to another.

from multiprocessing import Process, Manager

def run1(dict1,list1):
    dict1['num'] = 1
    list1.reverse()

def run2(dict1,list1):
    dict1['num'] = 2
    list1.sort()

if __name__ == '__main__':
    m = Manager()
    dict1 = m.dict()
    list1 = m.list(range(10))
    p1 = Process(target=run1,args=(dict1,list1))
    p2 = Process(target=run2,args=(dict1,list1))
    print(dict1)
    print(list1)

    p1.start()
    p2.start()

    p1.join()
    p2.join()
    print(dict1)
    print(list1)

Output

{}
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
{'num': 2}
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

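What this means is that a change one process makes to the contents of a shared container, such as a list or a dict, is visible to the other process, which then makes its own change on top of it. Which process runs last is up to the scheduler, so the final values can differ from run to run.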
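
Building on top of another process's change only works reliably if the read and the write cannot be interleaved. The sketch below goes slightly beyond the post: it assumes a counter stored in a Manager dict and guards each increment with a Manager lock. The names add_many and count_up are hypothetical.

```python
from multiprocessing import Process, Manager


def add_many(shared, lock, times):
    for _ in range(times):
        # shared['num'] += 1 is a read followed by a write on the proxy,
        # so the lock keeps another process from slipping in between.
        with lock:
            shared['num'] += 1


def count_up(workers, times):
    with Manager() as m:
        shared = m.dict({'num': 0})
        lock = m.Lock()
        procs = [Process(target=add_many, args=(shared, lock, times))
                 for _ in range(workers)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        # Read the value while the manager is still running.
        return shared['num']


if __name__ == '__main__':
    print(count_up(4, 50))
```

With the lock in place, every increment is applied, so the result is workers * times.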

Next comes the process pool. Its purpose is straightforward: it keeps the number of running processes within a fixed limit.

import time
from multiprocessing import Pool

def run(num):
    print('Go:', num)
    time.sleep(2)  # simulate an I/O operation
    return 'Over:%s' % num

def run_back(msg):
    print('CallBack:', msg)

if __name__ == '__main__':
    pool = Pool()
    rest = []
    for i in range(10):
        # run is the function, the tuple holds its arguments, and callback names the
        # callback, which the main process calls with run's return value
        p = pool.apply_async(run, (i,), callback=run_back)
        rest.append(p)
    for p in rest:
        p.get()
    pool.close()
    pool.join()

Output

Go: 0
Go: 1
Go: 2
Go: 3
Go: 4
CallBack: Over:0
Go: 5
CallBack: Over:1
CallBack:Go: Over:2 
6
Go:CallBack: Over:3
 7
CallBack:Go: Over:4 8

Go: 9
CallBack: Over:5
CallBack: Over:6
CallBack: Over:7
CallBack: Over:8
CallBack: Over:9
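
That the pool really caps the number of processes can be checked by counting the distinct worker PIDs that handle a batch of tasks. This is a small sketch with made-up names (which_worker, distinct_workers), and it uses pool.map only to keep the example short.

```python
import os
from multiprocessing import Pool


def which_worker(_):
    # Report the PID of the worker process that handled this task.
    return os.getpid()


def distinct_workers(pool_size, task_count):
    with Pool(processes=pool_size) as pool:
        pids = pool.map(which_worker, range(task_count))
    # No matter how many tasks there are, at most pool_size PIDs show up.
    return len(set(pids))


if __name__ == '__main__':
    print(distinct_workers(3, 12))
```

When processes is omitted, as in the post's Pool(), the pool size defaults to the number of CPUs reported by the system.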

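apply_async, used above, is the process pool's asynchronous, non-blocking method. The jumbled lines in its output appear because the worker processes and the callback in the main process write to stdout at the same time.
There is also a blocking method: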

import time
from multiprocessing import Pool

def run(num):
    print('Go:',num)
    time.sleep(2)
    return 'Over:%s'%num

def run_back(msg):
    print('CallBack:',msg)

if __name__ == '__main__':
    pool = Pool()
    for i in range(10):
        p = pool.apply(run, (i,))
        print(p)

    pool.close()
    pool.join()

The blocking method takes no callback.
Output

Go: 0
Over:0
Go: 1
Over:1
Go: 2
Over:2
Go: 3
Over:3
Go: 4
Over:4
Go: 5
Over:5
Go: 6
Over:6
Go: 7
Over:7
Go: 8
Over:8
Go: 9
Over:9

It ran serially, oh well.
In fact, the only difference between the two is where get() is called:

import time
from multiprocessing import Pool

def run(num):
    print('Go:',num)
    time.sleep(2)
    return 'Over:%s'%num

def run_back(msg):
    print('CallBack:',msg)

if __name__ == '__main__':
    pool = Pool()
    rest = []
    for i in range(10):
        p = pool.apply_async(run,(i,),callback=run_back)
        # rest.append(p)
        p.get()
    # for p in rest:
    #     p.get()
    pool.close()  # close first, then join
    pool.join()  # wait for the pool's workers to finish before exiting; if this is commented out, the program exits right away

I only moved the get() call a little, and the calls became blocking:

Go: 0
CallBack: Over:0
Go: 1
CallBack: Over:1
Go: 2
CallBack: Over:2
Go: 3
CallBack: Over:3
Go: 4
CallBack: Over:4
Go: 5
CallBack: Over:5
Go: 6
CallBack: Over:6
Go: 7
CallBack: Over:7
Go: 8
CallBack: Over:8
Go: 9
CallBack: Over:9

Here is the source of apply:

    def apply(self, func, args=(), kwds={}):
        '''
        Equivalent of `func(*args, **kwds)`.
        Pool must be running.
        '''
        return self.apply_async(func, args, kwds).get()

In its source, apply simply calls self.apply_async(func, args, kwds).get(), so under the hood it also uses apply_async.
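
The cost of calling get() right after each submission can be measured directly. The following is a rough sketch, assuming four tasks that each sleep for 0.3 seconds and a pool of four workers; nap, timed and compare are hypothetical names, and the exact timings will vary by machine.

```python
import time
from multiprocessing import Pool


def nap(seconds):
    time.sleep(seconds)
    return seconds


def timed(pool, blocking, tasks=4, seconds=0.3):
    start = time.perf_counter()
    if blocking:
        # apply() waits for each task before submitting the next one.
        for _ in range(tasks):
            pool.apply(nap, (seconds,))
    else:
        # Submit everything first, then wait for the results.
        pending = [pool.apply_async(nap, (seconds,)) for _ in range(tasks)]
        for r in pending:
            r.get()
    return time.perf_counter() - start


def compare():
    with Pool(processes=4) as pool:
        return timed(pool, blocking=True), timed(pool, blocking=False)


if __name__ == '__main__':
    serial, parallel = compare()
    print('apply: %.2fs, apply_async: %.2fs' % (serial, parallel))
```

The blocking variant takes at least tasks * seconds, while the asynchronous one should finish in roughly the time of a single task when there are enough workers.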
