Python 3.7 Study Notes 29: Concurrent Programming with Futures

1. The Concept of Concurrent Programming

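Let's first clear up a common misconception, illustrated in the figure below. Many people think concurrent programming means performing multiple operations at the same instant. In Python's threading and asyncio models it is the opposite: at any given moment only one operation is executing, and execution simply switches back and forth between them. As the figure shows, when threads or tasks run concurrently, they take turns executing until all operations are finished.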

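  • thread: a thread. With threading, the operating system knows everything about each thread, so it decides when to switch between threads. The obvious advantage is that the code is easy to write, because the programmer does not have to handle any switching. However, a switch can also happen in the middle of a statement (for example, x += 1), which makes race conditions likely.
  • task: a coroutine task. With asyncio, the main program can switch tasks only after the current task signals that it can be switched out, which avoids the kind of race condition just mentioned.
  • Concurrency is typically used in I/O-heavy scenarios. For example, when downloading multiple files from a website, the time spent on I/O can be far longer than the time the CPU spends processing.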
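
To make the x += 1 race condition from the thread bullet concrete, here is a minimal sketch of one common remedy: guarding the increment with threading.Lock. The Counter class and run_workers() helper are hypothetical names introduced only for this illustration.

```python
import threading


class Counter:
    """A shared counter whose increment is protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # value += 1 is a read-modify-write sequence; holding the lock
        # keeps a thread switch from interleaving two such sequences.
        with self._lock:
            self.value += 1


def run_workers(num_threads=4, increments=10000):
    """Increment one shared counter from several threads and return the total."""
    counter = Counter()

    def work():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place the final value equals num_threads * increments. Without it, updates may be lost whenever a switch lands between the read and the write.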

[Figure 1]
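
The task bullet above says an asyncio task is switched out only when it signals that it can be. The sketch below, assuming Python 3.7+ for asyncio.run(), shows two coroutines that yield control only at their await points. The worker() and run_tasks() names are illustrative, not taken from the original notes.

```python
import asyncio


async def worker(name, steps, log):
    for i in range(steps):
        log.append("{}{}".format(name, i))
        # The await is the only place where this task can be switched out.
        await asyncio.sleep(0)


async def run_tasks():
    log = []
    # Run both workers concurrently on a single thread.
    await asyncio.gather(worker("A", 2, log), worker("B", 2, log))
    return log


def main():
    return asyncio.run(run_tasks())
```

Because the code between two await points always runs to completion, an append can never be interrupted halfway, which is why this style avoids the race described for threads.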

2. The Concept of Parallel Programming

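  • Parallelism is what actually means happening simultaneously, at the same instant. Python's multiprocessing works this way. A simple way to think about it: if your computer has a 6-core processor, you can have Python start 6 processes that run at the same time to speed up the program. The schematic is shown below:
  • Parallelism is used more in CPU-heavy scenarios, such as parallel computation in MapReduce, where multiple machines and multiple processors are typically used to speed things up.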

[Figure 2]
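
As a rough sketch of the CPU-heavy case described above, the following spreads a deliberately expensive computation across processes with ProcessPoolExecutor. The count_primes() workload and the input sizes are assumptions chosen purely for illustration.

```python
import concurrent.futures


def count_primes(limit):
    """Count the primes below limit by trial division (deliberately CPU-heavy)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_parallel(limits):
    # With no max_workers argument, the pool size defaults to the
    # number of processors on the machine.
    with concurrent.futures.ProcessPoolExecutor() as executor:
        return list(executor.map(count_primes, limits))


if __name__ == "__main__":
    # The guard matters on platforms that start worker processes by
    # re-importing this module instead of forking.
    print(count_primes_parallel([10000, 20000, 30000]))
```

Each call to count_primes() runs in its own process, so the calls can occupy separate cores at the same instant.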

3. Multithreaded Programming with Futures

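  • Python's Futures are found in concurrent.futures and asyncio. In both, a future represents a deferred operation. Futures wrap pending operations and place them in a queue; the state of these operations can be queried at any time, and their results or exceptions can be retrieved once the operations complete.
  • As users, we usually do not need to think about how to create Futures; that is handled for us under the hood. What we actually need to do is schedule their execution.
  • For example, with the Executor class in concurrent.futures, calling executor.submit(func) schedules func() for execution and returns the created future instance so that you can query it later.
  • Here are some commonly used methods. done() reports whether the corresponding operation has completed: True means it has, False means it has not. Note that done() is non-blocking and returns immediately. The related add_done_callback(fn) registers fn to be called, with the future as its only argument, once the future completes.
  • Another important method is result(), which waits for the future to complete and then returns its result or raises its exception. as_completed(fs) takes an iterable of futures fs and returns an iterator that yields each future as it completes.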
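
The methods listed above can be tried without any network access. The sketch below uses a trivial square() function as a stand-in for real work; square() and demo() are hypothetical names for this example only.

```python
import concurrent.futures


def square(n):
    return n * n


def demo():
    finished = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
        # submit() schedules the call and returns a Future immediately.
        futures = [executor.submit(square, n) for n in range(5)]

        # Each callback receives the completed Future as its only argument.
        for future in futures:
            future.add_done_callback(lambda fut: finished.append(fut.result()))

        # as_completed() yields futures in completion order, so sort the
        # values to get a stable list.
        results = sorted(
            future.result()
            for future in concurrent.futures.as_completed(futures)
        )

    # Leaving the with block waits for the workers, so every future is done.
    all_done = all(future.done() for future in futures)
    return results, sorted(finished), all_done
```

Calling demo() returns the squared values, the values collected by the callbacks, and a flag confirming that done() is True for every future once the pool has shut down.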

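Let's implement the same task with a single thread and with multiple threads and compare them. The task simulates downloading pages from a website.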

  • The single-threaded code is as follows:
import requests
import time
def download_one(url):
    resp = requests.get(url)
    print('Read {} from {}'.format(len(resp.content), url))


def download_all(sites):
    for site in sites:
        download_one(site)

def main():
    sites = [
        'https://www.runoob.com/html/html-tutorial.html',
        'https://www.runoob.com/js/js-tutorial.html',
        'https://www.runoob.com/java/java-tutorial.html',
        'https://www.runoob.com/bootstrap/bootstrap-tutorial.html',
        'https://www.runoob.com/python3/python3-tutorial.html',
        'https://www.runoob.com/python/python-tutorial.html',
        'https://www.runoob.com/cprogramming/c-tutorial.html',
        'https://www.runoob.com/csharp/csharp-tutorial.html',
        'https://www.runoob.com/sql/sql-tutorial.html',
        'https://www.runoob.com/mysql/mysql-tutorial.html',
        'https://www.runoob.com/php/php-tutorial.html',
    ]
    start_time = time.perf_counter()
    download_all(sites)
    end_time = time.perf_counter()
    print('Download {} sites in {} seconds'.format(len(sites), end_time - start_time))

main()


/usr/local/bin/python3.7 /Users/zhonglinglong/PycharmProjects/whole_world/临时文件.py
Read 71447 from https://www.runoob.com/html/html-tutorial.html
Read 60397 from https://www.runoob.com/js/js-tutorial.html
Read 64683 from https://www.runoob.com/java/java-tutorial.html
Read 55653 from https://www.runoob.com/bootstrap/bootstrap-tutorial.html
Read 62888 from https://www.runoob.com/python3/python3-tutorial.html
Read 59152 from https://www.runoob.com/python/python-tutorial.html
Read 66737 from https://www.runoob.com/cprogramming/c-tutorial.html
Read 68789 from https://www.runoob.com/csharp/csharp-tutorial.html
Read 56418 from https://www.runoob.com/sql/sql-tutorial.html
Read 56978 from https://www.runoob.com/mysql/mysql-tutorial.html
Read 61135 from https://www.runoob.com/php/php-tutorial.html
Download 11 sites in 1.524568402 seconds

Process finished with exit code 0
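  • The full multithreaded code appears further below.
  • The main difference between the multithreaded and single-threaded versions is the following snippet. More threads is not always better, because each additional thread also costs resources to maintain; choose the number based on actual testing.
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        # For each element in sites, call download_one() concurrently
        executor.map(download_one, sites)
  • The parallel version is shown next. max_workers does not need to be set, because by default it is the number of processors on the machine.
    with concurrent.futures.ProcessPoolExecutor() as executor:
        executor.map(download_one, sites)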
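
Two details of executor.map() are easy to miss. Results come back in input order regardless of which call finishes first, and an exception raised inside a worker only surfaces when its result is retrieved from the returned iterator. The sketch below illustrates both with local stand-in functions (slow_double() and fail_on_two() are hypothetical, not part of the download example).

```python
import concurrent.futures
import time


def slow_double(n):
    # Later inputs sleep less, so they finish earlier.
    time.sleep(0.05 * (3 - n))
    return n * 2


def run_map():
    with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
        # The results are still returned in input order.
        return list(executor.map(slow_double, [0, 1, 2]))


def fail_on_two(n):
    if n == 2:
        raise ValueError("bad input: {}".format(n))
    return n


def run_map_with_error():
    collected = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
        results = executor.map(fail_on_two, [1, 2, 3])
        try:
            for value in results:
                collected.append(value)
        except ValueError as exc:
            # The worker's exception is re-raised here, during iteration.
            return collected, str(exc)
    return collected, None
```

If the iterator returned by map() is never consumed, a failing call goes unnoticed, so it is worth iterating over the results whenever errors matter.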
     
import concurrent.futures
import requests
import time


def download_one(url):
    resp = requests.get(url)
    print('Read {} from {}'.format(len(resp.content), url))


def download_all(sites):
    # Create a thread pool with 5 worker threads
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        # For each element in sites, call download_one() concurrently
        executor.map(download_one, sites)

    # Alternative: use a process pool for parallel execution
    # with concurrent.futures.ProcessPoolExecutor() as executor:
    #     executor.map(download_one, sites)


def main():
    sites = [
        'https://www.runoob.com/html/html-tutorial.html',
        'https://www.runoob.com/js/js-tutorial.html',
        'https://www.runoob.com/java/java-tutorial.html',
        'https://www.runoob.com/bootstrap/bootstrap-tutorial.html',
        'https://www.runoob.com/python3/python3-tutorial.html',
        'https://www.runoob.com/python/python-tutorial.html',
        'https://www.runoob.com/cprogramming/c-tutorial.html',
        'https://www.runoob.com/csharp/csharp-tutorial.html',
        'https://www.runoob.com/sql/sql-tutorial.html',
        'https://www.runoob.com/mysql/mysql-tutorial.html',
        'https://www.runoob.com/php/php-tutorial.html', ]

    start_time = time.perf_counter()
    download_all(sites)
    end_time = time.perf_counter()
    print('Download {} sites in {} seconds'.format(len(sites), end_time - start_time))

main()

/usr/local/bin/python3.7 /Users/zhonglinglong/PycharmProjects/whole_world/临时文件.py
Read 59152 from https://www.runoob.com/python/python-tutorial.html
Read 55653 from https://www.runoob.com/bootstrap/bootstrap-tutorial.html
Read 60397 from https://www.runoob.com/js/js-tutorial.html
Read 62888 from https://www.runoob.com/python3/python3-tutorial.html
Read 64683 from https://www.runoob.com/java/java-tutorial.html
Read 56418 from https://www.runoob.com/sql/sql-tutorial.html
Read 66737 from https://www.runoob.com/cprogramming/c-tutorial.html
Read 61135 from https://www.runoob.com/php/php-tutorial.html
Read 56978 from https://www.runoob.com/mysql/mysql-tutorial.html
Read 68789 from https://www.runoob.com/csharp/csharp-tutorial.html
Download 11 sites in 0.49166310300000005 seconds

Process finished with exit code 0

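Summary: in these runs the multithreaded version took about 0.49 seconds, against about 1.52 seconds for the single-threaded version, roughly a threefold speedup. The more sites there are to download, the larger the gap becomes.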

The example above can be rewritten using the common Futures methods:

import concurrent.futures
import requests
import time


def download_one(url):
    resp = requests.get(url)
    print('Read {} from {}'.format(len(resp.content), url))


def download_all(sites):
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        to_do = []
        for site in sites:
            future = executor.submit(download_one, site)
            to_do.append(future)

        for future in concurrent.futures.as_completed(to_do):
            future.result()


def main():
    sites = [
        'https://www.runoob.com/js/js-tutorial.html',
        'https://www.runoob.com/java/java-tutorial.html',
        'https://www.runoob.com/bootstrap/bootstrap-tutorial.html',
        'https://www.runoob.com/python3/python3-tutorial.html',
        'https://www.runoob.com/python/python-tutorial.html',
        'https://www.runoob.com/cprogramming/c-tutorial.html',
        'https://www.runoob.com/csharp/csharp-tutorial.html',
        'https://www.runoob.com/sql/sql-tutorial.html',
        'https://www.runoob.com/mysql/mysql-tutorial.html',
        'https://www.runoob.com/php/php-tutorial.html', ]

    start_time = time.perf_counter()
    download_all(sites)
    end_time = time.perf_counter()
    print('Download {} sites in {} seconds'.format(len(sites), end_time - start_time))

main()


/usr/local/bin/python3.7 /Users/zhonglinglong/PycharmProjects/whole_world/临时文件.py
Read 60397 from https://www.runoob.com/js/js-tutorial.html
Read 62888 from https://www.runoob.com/python3/python3-tutorial.html
Read 55653 from https://www.runoob.com/bootstrap/bootstrap-tutorial.html
Read 64683 from https://www.runoob.com/java/java-tutorial.html
Read 59152 from https://www.runoob.com/python/python-tutorial.html
Read 68789 from https://www.runoob.com/csharp/csharp-tutorial.html
Read 56418 from https://www.runoob.com/sql/sql-tutorial.html
Read 66737 from https://www.runoob.com/cprogramming/c-tutorial.html
Read 61135 from https://www.runoob.com/php/php-tutorial.html
Read 56978 from https://www.runoob.com/mysql/mysql-tutorial.html
Download 10 sites in 0.2515203130000001 seconds

Process finished with exit code 0
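  • Here we first call executor.submit() for each site to schedule its download, collecting the returned futures in the list to_do. Then as_completed() yields each future as it finishes, and future.result() retrieves its result (re-raising any exception the download hit).
  • Note, however, that the order in which the futures complete is not necessarily the order in which they appear in the list. Which one finishes first depends on system scheduling and on how long each future takes to run.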
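
Because completion order differs from submission order, a common pattern is to keep a dict that maps each future back to its input so results and errors can be attributed correctly. The sketch below assumes a simulated fake_download() in place of requests.get(); the job names and delays are made up for illustration.

```python
import concurrent.futures
import time


def fake_download(name, delay):
    """Stand-in for a real download: sleep, then succeed or fail."""
    time.sleep(delay)
    if name == "broken":
        raise ValueError("cannot fetch {}".format(name))
    return len(name)


def download_all(jobs):
    results, errors = {}, {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
        # Map each future back to the name it was submitted for.
        future_to_name = {
            executor.submit(fake_download, name, delay): name
            for name, delay in jobs
        }
        for future in concurrent.futures.as_completed(future_to_name):
            name = future_to_name[future]
            try:
                results[name] = future.result()
            except ValueError as exc:
                # One failed job does not stop the others.
                errors[name] = str(exc)
    return results, errors
```

For example, download_all([("slow", 0.2), ("fast", 0.0), ("broken", 0.1)]) returns the two successful results keyed by name and records the failure separately, whatever order the futures finish in.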
