Usage of Python's thread pool ThreadPoolExecutor

Because of the Python GIL, threads cannot execute Python bytecode truly in parallel; however, the GIL is released during blocking I/O, so for I/O-bound tasks multithreading or a thread pool still achieves effective concurrency (the tasks appear to run in parallel even though they do not).

Starting with Python 3.2, the concurrent.futures module provides two executors: the thread pool ThreadPoolExecutor and the process pool ProcessPoolExecutor. Both accept a task via submit(), which returns a Future object; through the Future you can query the task's execution status and retrieve its result.
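ProcessPoolExecutor is not covered further below, but it exposes the same submit/Future interface; a minimal sketch, assuming a hypothetical CPU-bound function cpu_bound_work:

from concurrent.futures import ProcessPoolExecutor


def cpu_bound_work(n):  # hypothetical CPU-bound task, for illustration only
    return sum(i * i for i in range(n))


if __name__ == '__main__':
    with ProcessPoolExecutor(max_workers=4) as executor:
        future = executor.submit(cpu_bound_work, 1_000_000)
        print(future.result())  # blocks until the worker process returns the value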

1. Basic usage of ThreadPoolExecutor

import time
from concurrent.futures import ThreadPoolExecutor


def io_bound_work(page):
    time.sleep(page)
    print(f"work {page} finished.")
    return page


if __name__ == '__main__':
    executor = ThreadPoolExecutor(max_workers=32)  # construct the pool; max_workers is the maximum number of worker threads
    future_1 = executor.submit(io_bound_work, 1)  # submit a task to the pool and get back a Future handle; note that submit() does not block, it returns immediately
    future_2 = executor.submit(io_bound_work, 2)
    future_3 = executor.submit(io_bound_work, 3)

    # done() reports whether the task has finished: True means finished, False means not yet
    print(f"future_1: {future_1.done()}")
    print(f"future_2: {future_2.done()}")
    print(f"future_3: {future_3.done()}")

    # result() blocks until the task finishes and returns its value; timeout is the maximum time to wait for the result, None means wait until the task ends
    print(f"main 1 result: {future_1.result(timeout=None)}")
    print(f"main 2 result: {future_2.result(timeout=None)}")
    print(f"main 3 result: {future_3.result(timeout=None)}")

Output:

future_1: False
future_2: False
future_3: False
work 1 finished.
main 1 result: 1
work 2 finished.
main 2 result: 2
work 3 finished.
main 3 result: 3
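The example above never shuts the pool down explicitly. ThreadPoolExecutor can also be used as a context manager, which calls shutdown(wait=True) when the with-block exits; a minimal sketch reusing io_bound_work from above:

import time
from concurrent.futures import ThreadPoolExecutor


def io_bound_work(page):
    time.sleep(page)
    print(f"work {page} finished.")
    return page


if __name__ == '__main__':
    # shutdown(wait=True) is called automatically when the with-block exits
    with ThreadPoolExecutor(max_workers=32) as executor:
        futures = [executor.submit(io_bound_work, i) for i in range(1, 4)]
        for f in futures:
            print(f"main result: {f.result()}")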

2. The wait method

import time
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED, ALL_COMPLETED


def io_bound_work(page):
    time.sleep(page)
    print(f"work {page} finished.")
    return page


if __name__ == '__main__':
    executor = ThreadPoolExecutor(max_workers=32)
    work_list = [executor.submit(io_bound_work, i) for i in range(1, 4)]

    # fs: the sequence of futures to wait on; timeout: the maximum time to wait; return_when: the condition on which wait returns;
    # ALL_COMPLETED returns after all futures finish, FIRST_COMPLETED returns as soon as the first future finishes, FIRST_EXCEPTION returns as soon as any future raises an exception
    ret = wait(fs=work_list, timeout=None, return_when=FIRST_COMPLETED)  # returns as soon as the first task completes
    print(f"work_list FIRST_COMPLETED, ret:{ret}")  # the return value contains both the completed (done) and the pending (not_done) futures

    ret = wait(fs=work_list, timeout=None, return_when=ALL_COMPLETED)  # returns once all tasks have completed
    print(f"work_list ALL_COMPLETED, ret:{ret}")

Output:

work 1 finished.
work_list FIRST_COMPLETED, ret:DoneAndNotDoneFutures(done={<Future at 0x... state=finished returned int>}, not_done={<Future at 0x... state=running>, <Future at 0x... state=running>})
work 2 finished.
work 3 finished.
work_list ALL_COMPLETED, ret:DoneAndNotDoneFutures(done={<Future at 0x... state=finished returned int>, <Future at 0x... state=finished returned int>, <Future at 0x... state=finished returned int>}, not_done=set())
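Note that wait() only splits the futures into the done and not_done sets; it does not fetch results by itself. A short sketch of collecting the results from the done set after ALL_COMPLETED:

import time
from concurrent.futures import ThreadPoolExecutor, wait, ALL_COMPLETED


def io_bound_work(page):
    time.sleep(page)
    return page


if __name__ == '__main__':
    executor = ThreadPoolExecutor(max_workers=32)
    work_list = [executor.submit(io_bound_work, i) for i in range(1, 4)]

    done, not_done = wait(work_list, timeout=None, return_when=ALL_COMPLETED)
    # the done set is unordered, so sort the results if the order matters
    print(sorted(f.result() for f in done))  # [1, 2, 3]
    print(len(not_done))                     # 0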

3. The as_completed method

import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def io_bound_work(page):
    time.sleep(page)
    print(f"work {page} finished.")
    return page


if __name__ == '__main__':
    executor = ThreadPoolExecutor(max_workers=32)
    work_list = [executor.submit(io_bound_work, i) for i in range(1, 4)]

    for future in as_completed(work_list):  # yields each future as soon as it completes, until every future in work_list has finished
        data = future.result()
        print(f"main: {data}")
    print('finished')

Output:

work 1 finished.
main: 1
work 2 finished.
main: 2
work 3 finished.
main: 3
finished
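A common pattern with as_completed() is to keep a dict that maps each future back to its argument, and to wrap result() in try/except because it re-raises any exception from the task; a sketch:

import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def io_bound_work(page):
    time.sleep(page)
    return page


if __name__ == '__main__':
    executor = ThreadPoolExecutor(max_workers=32)
    future_to_page = {executor.submit(io_bound_work, i): i for i in range(1, 4)}

    for future in as_completed(future_to_page):
        page = future_to_page[future]  # recover which argument this future belongs to
        try:
            print(f"page {page} -> {future.result()}")
        except Exception as exc:  # result() re-raises any exception raised inside the task
            print(f"page {page} raised {exc!r}")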

4. The map method

import time
from concurrent.futures import ThreadPoolExecutor


def io_bound_work(page):
    time.sleep(page)
    print(f"work {page} finished.")
    return page


if __name__ == '__main__':
    executor = ThreadPoolExecutor(max_workers=32)
    # fn: the function each task runs; iterables: the iterable(s) of arguments; timeout: the maximum time to wait for each result
    for result in executor.map(io_bound_work, [3, 2, 1], timeout=None):
        print(f"work result: {result}")

With map there is no need to call submit first. Executor.map works like Python's built-in higher-order map function: it applies the same function to every element of the sequence. The code above runs io_bound_work on every element of the list, distributing the calls across the pool's worker threads.

Unlike as_completed(), map yields results in the same order as the input list: the result of an earlier-submitted task is printed first, even if a later task finishes sooner.
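Like the built-in map, executor.map also accepts multiple iterables and passes one element from each to the function per call; a minimal sketch with a hypothetical two-argument task:

import time
from concurrent.futures import ThreadPoolExecutor


def fetch(page, retries):  # hypothetical two-argument task, for illustration only
    time.sleep(page)
    return f"page={page}, retries={retries}"


if __name__ == '__main__':
    with ThreadPoolExecutor(max_workers=32) as executor:
        # arguments are paired element-wise: (1, 3), (2, 2), (3, 1)
        for result in executor.map(fetch, [1, 2, 3], [3, 2, 1]):
            print(result)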
 
