Python Multi-Process Programming with multiprocessing

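Because of the GIL (Global Interpreter Lock), which guarantees that only one thread executes Python bytecode at any given moment, multithreading in Python is not truly parallel. Threads noticeably improve throughput only when the program is I/O-bound (a thread releases the GIL while it waits on I/O, which lets other threads run); for CPU-bound work they bring little benefit. To make full use of a multi-core CPU and run tasks in parallel, Python offers multi-process programming. multiprocessing is Python's multi-process module, and it makes developing multi-process applications straightforward. The module provides components such as Process, Pool, Queue and Manager.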
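To make the point about CPU-bound work concrete, here is a minimal sketch that spreads a CPU-heavy function over two worker processes. The function name count_primes and the limits are made up for illustration; the returned process IDs only show that the work really ran in separate processes.

```python
import os
from multiprocessing import Pool


def count_primes(limit):
    """CPU-bound work: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    # Also return the PID so the caller can see which process did the work
    return count, os.getpid()


if __name__ == "__main__":
    limits = [20000, 20000, 20000, 20000]  # hypothetical workload
    with Pool(processes=2) as pool:
        results = pool.map(count_primes, limits)
    for count, pid in results:
        print("pid %d counted %d primes" % (pid, count))
    print("parent pid:", os.getpid())
```

The worker PIDs printed here differ from the parent PID, which is exactly what threads under the GIL cannot give you.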

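1. The Process class

1.1 Constructor

def __init__(self, group=None, target=None, name=None, args=(), kwargs={})
group: the group the process belongs to; rarely used and normally left as None
target: the object the process invokes (a function, or any callable object, i.e. an instance of a class that implements __call__)
name: the process name
args: tuple of positional arguments passed to target
kwargs: dict of keyword arguments passed to target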
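The constructor parameters can be sketched as follows. The function greet and the process name "greeter-1" are hypothetical; the point is only how name, args and kwargs reach the target.

```python
from multiprocessing import Process, current_process


def greet(greeting, name="world"):
    # current_process().name is the name passed to the Process constructor
    message = "%s, %s!" % (greeting, name)
    print("[%s] %s" % (current_process().name, message))
    return message


if __name__ == "__main__":
    p = Process(
        target=greet,
        name="greeter-1",                    # hypothetical process name
        args=("Hello",),                     # a one-element tuple needs the trailing comma
        kwargs={"name": "multiprocessing"},
    )
    p.start()
    p.join()
```

Note that the return value of the target is discarded by Process; to get results back you need a Queue, a Pipe, a Manager or a Pool.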

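1.2 Instance methods

is_alive(): returns whether the process is still running
start(): starts the process; the child then waits to be scheduled by the operating system
join([timeout]): blocks the caller until the process on which join() was called terminates, or until the optional timeout expires
terminate(): stops the process immediately, whether or not its work is finished
run(): the method that start() arranges to be called in the child process; the default implementation calls target (if one was given) with args and kwargs, and a subclass can override it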
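A small sketch of how these methods behave over the life of one process. The function sleeper and the 30-second duration are just placeholders so that the child is guaranteed to still be running when it gets terminated.

```python
import time
from multiprocessing import Process


def sleeper(seconds):
    time.sleep(seconds)


if __name__ == "__main__":
    p = Process(target=sleeper, args=(30,))
    print("before start:", p.is_alive())      # False, nothing is running yet
    p.start()
    print("after start:", p.is_alive())       # True
    p.join(timeout=0.5)                       # gives up after 0.5 s, the child keeps sleeping
    print("after timed join:", p.is_alive())  # True
    p.terminate()                             # stop the child without waiting for it to finish
    p.join()                                  # reap the terminated child
    print("after terminate:", p.is_alive())   # False
```

A join() with a timeout does not stop the child; it only stops waiting, which is why is_alive() has to be checked afterwards.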

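1.3 Attributes

authkey: the process's authentication key (a byte string)
daemon: daemon flag; it can only be changed before start() is called
exitcode: the exit status code of the process (None while it has not terminated yet)
name: the process name
pid: the process ID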
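The attributes can be inspected like this. The function exit_with, the name "worker-a" and the exit code 3 are arbitrary values chosen for the sketch.

```python
import sys
import time
from multiprocessing import Process


def exit_with(code):
    time.sleep(0.2)
    sys.exit(code)  # becomes the exitcode seen by the parent


if __name__ == "__main__":
    p = Process(target=exit_with, args=(3,), name="worker-a")
    p.daemon = True                    # must be set before start()
    print(p.name, p.pid, p.exitcode)   # worker-a None None
    p.start()
    print("pid:", p.pid)               # a real process ID once started
    p.join()
    print("exitcode:", p.exitcode)     # 3
```

pid and exitcode are both None before start(), and exitcode stays None until the process has actually terminated.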

1.4 Examples

# Example 1: target is a function
import random
import time
from multiprocessing import Process


def foo(i):
    print(time.ctime(), 'process the %d begin ......' % i)
    time.sleep(random.uniform(1, 3))
    print(time.ctime(), 'process the %d end !!!' % i)


if __name__ == '__main__':
    print(time.ctime(), 'process begin...')

    p_lst = list()
    for i in range(4):
        p_lst.append(Process(target=foo, args=(i,)))

    # start the child processes
    for p in p_lst:
        p.start()
    # wait for all child processes to finish
    for p in p_lst:
        p.join()

    print(time.ctime(), 'process end!!!')

Output (sample run; the first and last lines were captured without their timestamps):

 process begin...
Thu Apr 19 10:33:52 2018 process the 0 begin ......
Thu Apr 19 10:33:52 2018 process the 1 begin ......
Thu Apr 19 10:33:52 2018 process the 2 begin ......
Thu Apr 19 10:33:52 2018 process the 3 begin ......
Thu Apr 19 10:33:53 2018 process the 0 end !!!
Thu Apr 19 10:33:53 2018 process the 3 end !!!
Thu Apr 19 10:33:53 2018 process the 2 end !!!
Thu Apr 19 10:33:53 2018 process the 1 end !!!
 process end!!!

# Example 2: target is a callable object
import random
import time
from multiprocessing import Process


class Foo(object):
    """Callable object used as the target of a Process."""

    def __init__(self, arg):
        super(Foo, self).__init__()
        self.arg = arg

    def __call__(self):
        print(time.ctime(), 'process the %d begin ......' % self.arg)
        time.sleep(random.uniform(1, 3))
        print(time.ctime(), 'process the %d end !!!' % self.arg)


if __name__ == '__main__':
    print(time.ctime(), 'process begin...')

    p_lst = list()
    for i in range(4):
        p_lst.append(Process(target=Foo(i)))

    # start the child processes
    for p in p_lst:
        p.start()
    # wait for all child processes to finish
    for p in p_lst:
        p.join()

    print(time.ctime(), 'process end!!!')

Output (sample run; the first and last lines were captured without their timestamps):

 process begin...
Thu Apr 19 10:47:05 2018 process the 0 begin ......
Thu Apr 19 10:47:05 2018 process the 1 begin ......
Thu Apr 19 10:47:05 2018 process the 3 begin ......
Thu Apr 19 10:47:05 2018 process the 2 begin ......
Thu Apr 19 10:47:06 2018 process the 2 end !!!
Thu Apr 19 10:47:06 2018 process the 0 end !!!
Thu Apr 19 10:47:07 2018 process the 1 end !!!
Thu Apr 19 10:47:07 2018 process the 3 end !!!
 process end!!!

# Example 3: derive a subclass of Process and create instances of it (inherit from Process and override run())
import random
import time
from multiprocessing import Process


class Myprocess(Process):
    def __init__(self, arg):
        super(Myprocess, self).__init__()
        self.arg = arg

    # override run()
    def run(self):
        print(time.ctime(), 'process the %d begin ......' % self.arg)
        time.sleep(random.uniform(1, 3))
        print(time.ctime(), 'process the %d end !!!' % self.arg)


if __name__ == '__main__':
    print(time.ctime(), 'process begin...')

    p_lst = list()
    for i in range(4):
        p_lst.append(Myprocess(i))

    # start the child processes
    for p in p_lst:
        p.daemon = True  # mark as a daemon (background) process; must be set before start()
        p.start()
    # wait for all child processes to finish
    for p in p_lst:
        p.join()

    print(time.ctime(), 'process end!!!')

Output (sample run; the first and last lines were captured without their timestamps):

 process begin...
Thu Apr 19 10:47:16 2018 process the 0 begin ......
Thu Apr 19 10:47:16 2018 process the 1 begin ......
Thu Apr 19 10:47:16 2018 process the 2 begin ......
Thu Apr 19 10:47:16 2018 process the 3 begin ......
Thu Apr 19 10:47:17 2018 process the 0 end !!!
Thu Apr 19 10:47:17 2018 process the 2 end !!!
Thu Apr 19 10:47:18 2018 process the 3 end !!!
Thu Apr 19 10:47:18 2018 process the 1 end !!!
 process end!!!

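2. The Pool class

Managing a large number of processes (dozens or hundreds) by hand with the Process class quickly becomes tedious; in that case a Pool (process pool) can manage them in one place. When every worker in the pool is busy, newly submitted tasks wait until a worker becomes free and are then picked up and executed.

2.1 Constructor

def __init__(self, processes=None, initializer=None, initargs=(), maxtasksperchild=None)
processes: number of worker processes in the pool; defaults to the value returned by os.cpu_count(), and the default is usually fine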

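2.2 Instance methods

apply(self, func, args=(), kwds={}): blocking call; runs func in one worker and blocks the caller until the result is ready, so it is rarely used
apply_async(self, func, args=(), kwds={}, callback=None): non-blocking version of apply; returns immediately with a result object
map(self, func, iterable, chunksize=None): behaves like the built-in map (with a single iterable); blocks the caller until all results are ready
map_async(self, func, iterable, chunksize=None, callback=None): non-blocking version of map
close(): closes the pool so that it no longer accepts new tasks
terminate(): stops the worker processes immediately
join(): blocks the caller until the worker processes exit; must be called after close() or terminate()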
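The difference between the blocking and non-blocking calls can be sketched like this. The function square is a stand-in for real work, and the 10-second timeouts are arbitrary safety values.

```python
from multiprocessing import Pool


def square(x):
    return x * x


if __name__ == "__main__":
    with Pool(processes=2) as pool:
        # map blocks until every result is ready and keeps the input order
        print(pool.map(square, range(5)))        # [0, 1, 4, 9, 16]

        # apply_async returns at once; get() waits for the actual value
        async_result = pool.apply_async(square, args=(7,))
        print(async_result.get(timeout=10))      # 49

        # map_async is the non-blocking counterpart of map
        pending = pool.map_async(square, [10, 20])
        print(pending.get(timeout=10))           # [100, 400]
```

Unlike Process, a Pool hands the return value of the function back to the caller, either directly (map) or through the result object's get().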

2.3 Example

# Pool example
import random
import time
from multiprocessing import Pool


def foo(i):
    print(time.ctime(), 'process the %d begin ......' % i)
    time.sleep(random.uniform(1, 3))
    print(time.ctime(), 'process the %d end !!!' % i)


if __name__ == '__main__':
    print(time.ctime(), 'process begin...')

    pool = Pool(processes=2)  # at most 2 worker processes run in parallel
    for i in range(4):
        pool.apply_async(foo, args=(i, ))  # submit 4 tasks without blocking

    pool.close()
    pool.join()

    print(time.ctime(), 'process end!!!')

Output (sample run; the first and last lines were captured without their timestamps):

 process begin...
Thu Apr 19 10:54:47 2018 process the 0 begin ......
Thu Apr 19 10:54:47 2018 process the 1 begin ......
Thu Apr 19 10:54:48 2018 process the 0 end !!!
Thu Apr 19 10:54:48 2018 process the 2 begin ......
Thu Apr 19 10:54:49 2018 process the 1 end !!!
Thu Apr 19 10:54:49 2018 process the 3 begin ......
Thu Apr 19 10:54:50 2018 process the 2 end !!!
Thu Apr 19 10:54:51 2018 process the 3 end !!!
 process end!!!

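3. The Queue class

Queue provides communication and data exchange between processes. Besides Queue, a Pipe can also be used for inter-process communication (a Pipe connects exactly two endpoints).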
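Since Pipe is only mentioned in passing, here is a minimal sketch of two processes talking over one. The function name child and the "ping" message are made up for illustration.

```python
from multiprocessing import Pipe, Process


def child(conn):
    request = conn.recv()        # wait for a message from the other end
    conn.send(request.upper())   # reply on the same connection
    conn.close()


if __name__ == "__main__":
    parent_conn, child_conn = Pipe()   # two connected endpoints, duplex by default
    p = Process(target=child, args=(child_conn,))
    p.start()
    parent_conn.send("ping")
    print(parent_conn.recv())          # PING
    p.join()
```

A Pipe fits a strict two-party conversation; with several producers or consumers a Queue is the easier choice.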

3.1 Constructor

def __init__(self, maxsize=0)
maxsize: maximum length of the queue; when maxsize <= 0 the limit is set to a very large value (2147483647 on my system)

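3.2 Instance methods

put(self, obj, block=True, timeout=None)
1. block is True: if the queue is full, the call blocks until a free slot appears; with a positive timeout it waits at most that many seconds and then raises queue.Full (Queue.Full in Python 2)
2. block is False: if the queue is full, queue.Full is raised immediately
get(self, block=True, timeout=None)
1. block is True: if the queue is empty, the call blocks until an item arrives; with a positive timeout it waits at most that many seconds and then raises queue.Empty (Queue.Empty in Python 2)
2. block is False: if the queue is empty, queue.Empty is raised immediately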
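The blocking rules above can be tried out in a single process. This is a minimal sketch; try_put and try_get are hypothetical helper names, and the timeouts are arbitrary.

```python
import queue  # queue.Full and queue.Empty are defined in the standard queue module
from multiprocessing import Queue


def try_put(q, item):
    """Non-blocking put: report failure instead of raising when the queue is full."""
    try:
        q.put(item, block=False)
        return True
    except queue.Full:
        return False


def try_get(q, timeout):
    """Blocking get with a timeout: return None if nothing arrives in time."""
    try:
        return q.get(block=True, timeout=timeout)
    except queue.Empty:
        return None


if __name__ == "__main__":
    q = Queue(maxsize=1)
    print(try_put(q, "a"))   # True
    print(try_put(q, "b"))   # False, the single slot is taken
    print(try_get(q, 1.0))   # a
    print(try_get(q, 0.1))   # None, the queue is empty again
```

Returning None for "nothing arrived" only works when None is never a real item; otherwise let the exception propagate.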

3.3 Example

# Queue example
import random
import time
from multiprocessing import Process, Queue


def write(q):
    for val in 'abcd':
        print(time.ctime(), 'put    %s to queue' % val)
        q.put(val)
        time.sleep(random.random())


def read(q):
    while True:
        value = q.get()
        print(time.ctime(), 'get %s from queue' % value)


if __name__ == '__main__':
    # the main process creates the Queue and passes it to the children
    q = Queue()
    pw = Process(target=write, args=(q, ))
    pr = Process(target=read, args=(q, ))
    # start child pw, which writes to the Queue
    pw.start()
    # start child pr, which reads from the Queue
    pr.start()
    # wait for the writer to finish
    pw.join()
    # the reader loops forever, so terminate it
    pr.terminate()

Output:

Thu Apr 19 11:14:31 2018 put    a to queue
Thu Apr 19 11:14:31 2018 get a from queue
Thu Apr 19 11:14:31 2018 put    b to queue
Thu Apr 19 11:14:31 2018 get b from queue
Thu Apr 19 11:14:31 2018 put    c to queue
Thu Apr 19 11:14:31 2018 get c from queue
Thu Apr 19 11:14:32 2018 put    d to queue
Thu Apr 19 11:14:32 2018 get d from queue
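The example above stops the reader with terminate(). A gentler variation, sketched here as an assumption rather than as part of the original example, sends a sentinel value through the queue so the reader can leave its loop on its own. SENTINEL, the second queue out and the upper-casing step are all made up for illustration.

```python
from multiprocessing import Process, Queue

SENTINEL = None  # hypothetical end-of-stream marker; must never be a real item


def write(q, values):
    for val in values:
        q.put(val)
    q.put(SENTINEL)  # tell the reader that nothing more will come


def read(q, out):
    # iter(callable, sentinel) keeps calling q.get() until it returns SENTINEL
    for value in iter(q.get, SENTINEL):
        out.put(value.upper())
    out.put(SENTINEL)


if __name__ == "__main__":
    q, out = Queue(), Queue()
    pw = Process(target=write, args=(q, "abcd"))
    pr = Process(target=read, args=(q, out))
    pw.start()
    pr.start()
    # drain the results before joining, so no child blocks on a full pipe
    results = list(iter(out.get, SENTINEL))
    pw.join()
    pr.join()
    print(results)   # ['A', 'B', 'C', 'D']
```

With a sentinel both children exit normally, so join() works for the reader as well and nothing gets killed halfway through an item.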

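4. The Manager class

Manager is a high-level interface for sharing data between processes.

The manager object returned by Manager() controls a server process. The Python objects held by that process can be accessed from other processes through proxies, which lets several processes share data safely. The types Manager supports are list, dict, Namespace, Lock, RLock, Semaphore, BoundedSemaphore, Condition, Event, Queue, Value and Array.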
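One thing worth a sketch: a single proxy call is handled by the server process, but a read-then-write sequence is two separate calls, so it needs a lock. Below is a minimal example using a managed dict together with a managed Lock; the function add_votes, the key "x" and the counts are hypothetical.

```python
from multiprocessing import Manager, Process


def add_votes(tally, lock, key, times):
    for _ in range(times):
        # read-modify-write through a proxy is not atomic, so guard it with the lock
        with lock:
            tally[key] = tally.get(key, 0) + 1


if __name__ == "__main__":
    with Manager() as mgr:
        tally = mgr.dict()
        lock = mgr.Lock()
        workers = [
            Process(target=add_votes, args=(tally, lock, "x", 50))
            for _ in range(4)
        ]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        print(dict(tally))   # {'x': 200}
```

Without the lock, two workers could read the same old value and one of the increments would be lost.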

4.1 Example

# Manager example: high-level interface for sharing data between processes
# Here a Manager holds a dict that is shared by several processes
import random
import time
from multiprocessing import Manager, Pool


def worker(d, key, val):
    print(time.ctime(), "insert the k-v pair to dict begin: {%d: %d}" % (key, val))
    time.sleep(random.uniform(1, 2))
    d[key] = val  # access the shared data
    print(time.ctime(), "insert the k-v pair to dict end: {%d: %d}" % (key, val))


if __name__ == '__main__':
    print(time.ctime(), "process for manager begin")
    mgr = Manager()
    d = mgr.dict()
    pool = Pool(processes=4)
    for i in range(10):
        pool.apply_async(worker, args=(d, i, i * i))

    pool.close()
    pool.join()
    # sort by key so the printed result does not depend on insertion order
    print('Result:  %s' % dict(sorted(d.items())))
    print(time.ctime(), "process for manager end")

Output:

Thu Apr 19 11:28:36 2018 process for manager begin
Thu Apr 19 11:28:36 2018 insert the k-v pair to dict begin: {0: 0}
Thu Apr 19 11:28:36 2018 insert the k-v pair to dict begin: {1: 1}
Thu Apr 19 11:28:36 2018 insert the k-v pair to dict begin: {2: 4}
Thu Apr 19 11:28:36 2018 insert the k-v pair to dict begin: {3: 9}
Thu Apr 19 11:28:37 2018 insert the k-v pair to dict end: {2: 4}
Thu Apr 19 11:28:37 2018 insert the k-v pair to dict begin: {4: 16}
Thu Apr 19 11:28:38 2018 insert the k-v pair to dict end: {3: 9}
Thu Apr 19 11:28:38 2018 insert the k-v pair to dict begin: {5: 25}
Thu Apr 19 11:28:38 2018 insert the k-v pair to dict end: {0: 0}
Thu Apr 19 11:28:38 2018 insert the k-v pair to dict begin: {6: 36}
Thu Apr 19 11:28:38 2018 insert the k-v pair to dict end: {1: 1}
Thu Apr 19 11:28:38 2018 insert the k-v pair to dict begin: {7: 49}
Thu Apr 19 11:28:39 2018 insert the k-v pair to dict end: {5: 25}
Thu Apr 19 11:28:39 2018 insert the k-v pair to dict begin: {8: 64}
Thu Apr 19 11:28:39 2018 insert the k-v pair to dict end: {4: 16}
Thu Apr 19 11:28:39 2018 insert the k-v pair to dict begin: {9: 81}
Thu Apr 19 11:28:40 2018 insert the k-v pair to dict end: {7: 49}
Thu Apr 19 11:28:40 2018 insert the k-v pair to dict end: {6: 36}
Thu Apr 19 11:28:41 2018 insert the k-v pair to dict end: {8: 64}
Thu Apr 19 11:28:41 2018 insert the k-v pair to dict end: {9: 81}
Result:  {0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}
Thu Apr 19 11:28:41 2018 process for manager end

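Note: everything above is a casual record of my own usage and only covers the basics.

Comments and criticism are welcome. Grab your snacks and drinks, and let's go!!!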
