Learning Python XI --- A First Look at Multithreaded Programming

Overview

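Execution of Python code is controlled by the Python virtual machine (the interpreter main loop). Only one thread of control runs inside that main loop at a time: the interpreter can host many threads, but at any given moment it executes exactly one of them.
Access to the Python virtual machine is governed by the global interpreter lock (GIL), which guarantees that only one thread runs at a time.
Thread control in Python mainly goes through two modules, "_thread" and "threading".
Using the _thread module directly is not recommended: once the main thread exits, all other threads are killed on the spot, with no cleanup.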
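The GIL only serializes the execution of Python bytecode. Blocking calls such as time.sleep() release it while they wait, which is why sleep-based examples like the ones in this post can still overlap. A minimal sketch of that effect; the names nap and run_naps are made up for illustration:

```python
import threading
from time import sleep, perf_counter

def nap(nsec):
    # sleep() releases the GIL while blocked, so other threads can run
    sleep(nsec)

def run_naps(count, nsec):
    """Start `count` threads that each sleep `nsec` seconds; return the elapsed time."""
    threads = [threading.Thread(target=nap, args=(nsec,)) for _ in range(count)]
    start = perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return perf_counter() - start

elapsed = run_naps(3, 0.2)
print('3 threads x 0.2s took about %.2fs' % elapsed)
```

The elapsed time should land near 0.2 seconds rather than 0.6, because the three sleeps overlap. CPU-bound pure-Python work would not overlap this way under the GIL.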

The _thread module

If you do use the _thread module, the reference code looks like this:

import _thread
from time import sleep, ctime

loops = [4,2]
def loop(nloop, nsec, lock):
    print('Start loop', nloop, 'at:', ctime())
    sleep(nsec)  # stand-in for the real work
    print('loop', nloop, 'done at:', ctime())
    lock.release()

def main():
    print('Starting at:', ctime())
    locks = []
    nloops = range(len(loops))

    # First, create and acquire one lock per thread.
    for i in nloops:
        lock = _thread.allocate_lock()  # get the lock
        lock.acquire()  # lock the lock...
        locks.append(lock) # store the lock

    # Then we just start the threads.
    for i in nloops:
        _thread.start_new_thread(loop, (i, loops[i], locks[i]))

    # When all the locks are released, we exit.
    for i in nloops:
        while locks[i].locked(): pass

    print('all DONE at:', ctime())

if __name__ == '__main__':
    main()

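It's kind of fun: we define two child threads here. The first runs for 4 seconds and the second for 2, so the second obviously finishes first.
1. allocate_lock creates a lock object and acquire takes it ("locks the lock"); each lock is then stored in a list.
2. start_new_thread launches each thread.
3. .locked() tells whether a lock is currently held.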

As the output shows, the second thread (loop 1) finishes at 19:33:40 and the first (loop 0) at 19:33:42.
Output:

Starting at: Mon Apr  2 19:33:38 2018
Start loop 1 at: Mon Apr  2 19:33:38 2018
Start loop 0 at: Mon Apr  2 19:33:38 2018
loop 1 done at: Mon Apr  2 19:33:40 2018
loop 0 done at: Mon Apr  2 19:33:42 2018
all DONE at: Mon Apr  2 19:33:42 2018

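Checking whether the child threads are done with a while loop at the end is obviously pretty dumb: it spins the CPU the whole time it waits.
By this point you probably have a couple of thoughts:
1. Could the child threads be set up so that the main thread simply won't finish until they have? Then the brute-force while check goes away.
2. Doing allocate_lock, acquire and so on for every single thread is a hassle. In short: is there something simpler, without all the busywork?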
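Even before switching modules, the busy-wait itself can be avoided: a second acquire() on a lock that is already held blocks until some thread releases it. The sketch below is one possible variation on the example, not the original code; worker and run_workers are names invented here, and the delays are shortened to keep it quick.

```python
import _thread
from time import sleep

def worker(n, nsec, lock, finished):
    sleep(nsec)          # stand-in for the real work
    finished.append(n)   # record that this worker completed
    lock.release()       # signal "done" to the main thread

def run_workers(delays):
    finished = []
    locks = []
    for n, nsec in enumerate(delays):
        lock = _thread.allocate_lock()
        lock.acquire()   # held until the worker releases it
        locks.append(lock)
        _thread.start_new_thread(worker, (n, nsec, lock, finished))
    for lock in locks:
        lock.acquire()   # blocks without spinning until the worker releases
    return sorted(finished)

print(run_workers([0.2, 0.1]))
```

This still needs one lock per thread, so the second complaint stands.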

The threading module

Daemon threads

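As hinted at earlier, the _thread module does not support daemon threads. Put simply, if a thread's daemon flag is False, the program cannot exit until that thread has finished. In other words, every non-daemon thread must run to completion before the main thread's exit can end the program.
A thread whose .daemon attribute is False is a non-daemon thread.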
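A small sketch of how the daemon flag can be set and inspected. The function name work is a placeholder; the point is that the flag has to be chosen before start().

```python
import threading
from time import sleep

def work():
    sleep(0.1)

t1 = threading.Thread(target=work)               # default: inherited from the creating thread
t2 = threading.Thread(target=work, daemon=True)  # set at construction
t3 = threading.Thread(target=work)
t3.daemon = True                                 # or set on the object before start()

print(t1.daemon, t2.daemon, t3.daemon)

t1.start()
raised = False
try:
    t1.daemon = True      # too late: the thread has already been started
except RuntimeError as exc:
    raised = True
    print('cannot change daemon after start:', exc)
t1.join()
```

When this runs from the main thread, t1.daemon is False, because the main thread itself is not a daemon.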

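Now for how to write threaded code with the higher-level threading module.
There are two main styles: function style and object-oriented style.
The function style looks like this: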

import threading
from time import sleep, ctime

loops = [4,2]

def loop(nloop, nsec):
    print('start loop', nloop, 'at:', ctime())
    sleep(nsec)
    print('loop', nloop, 'done at', ctime())

def main():
    print('Starting at:', ctime()) 
    threads = []
    nloops = range(len(loops))

    # create threads
    for i in nloops:
        t = threading.Thread(target=loop, args=(i, loops[i]))
        print(t.daemon)  # False: not a daemon thread by default
        threads.append(t)

    for i in nloops:  #start threads
        threads[i].start()

    for i in nloops:   #wait for all threads to finish
        threads[i].join()

    print('all DONE at:', ctime())

if __name__ == '__main__':
    main()

Output:

Starting at: Mon Apr  2 19:47:14 2018
False
False
start loop 0 at: Mon Apr  2 19:47:14 2018
start loop 1 at: Mon Apr  2 19:47:14 2018
loop 1 done at Mon Apr  2 19:47:16 2018
loop 0 done at Mon Apr  2 19:47:18 2018
all DONE at: Mon Apr  2 19:47:18 2018

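What got better:
1. No extra step to create and acquire locks: we create Thread objects directly, passing in the target function and its arguments, and get Thread instances back.
2. Each thread is started with .start().
3. Before, managing locks meant allocating, acquiring, releasing and polling them; now calling join() is all it takes.
  join() waits until the thread finishes or, if a timeout is given, until that time has elapsed. (No more while loop polling the lock state.)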
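The timeout form of join() is easy to misread, because join() returns None whether or not the thread finished. A minimal sketch, with a made-up slow function, that checks is_alive() afterwards:

```python
import threading
from time import sleep

def slow():
    sleep(0.5)

t = threading.Thread(target=slow)
t.start()

t.join(timeout=0.1)            # gives up after about 0.1s, finished or not
still_running = t.is_alive()   # join() returns None, so ask the thread itself
print('alive after timed join:', still_running)

t.join()                       # no timeout: block until the thread ends
print('alive after full join:', t.is_alive())
```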

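A few comparisons:
1. Remove the join calls:
"all DONE at" is printed right after the two child threads start, so the main thread pays no attention to them at all.
The main thread carries on, runs out of things to do, and then the interpreter waits for the non-daemon threads to finish before exiting.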

start loop 0 at: Mon Apr  2 20:07:47 2018
start loop 1 at: Mon Apr  2 20:07:47 2018
all DONE at: Mon Apr  2 20:07:47 2018
loop 1 done at Mon Apr  2 20:07:49 2018
loop 0 done at Mon Apr  2 20:07:51 2018

2. Set daemon=True and remove the join calls:
This behaves like the raw _thread module: the main thread does not wait for the child threads at all and just exits.

start loop 0 at: Mon Apr  2 20:10:27 2018
start loop 1 at: Mon Apr  2 20:10:27 2018
all DONE at: Mon Apr  2 20:10:27 2018

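3. Set daemon=True and keep the join calls:
"all DONE at" comes last, which shows that join makes the main thread wait for that child thread to finish before moving on. It "blocks" the main thread.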

start loop 0 at: Mon Apr  2 20:11:51 2018
start loop 1 at: Mon Apr  2 20:11:51 2018
loop 1 done at Mon Apr  2 20:11:53 2018
loop 0 done at Mon Apr  2 20:11:55 2018
all DONE at: Mon Apr  2 20:11:55 2018
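The first two comparisons can be reproduced side by side by running a tiny script in a child interpreter, once per daemon setting, and capturing what it prints. This is only a sketch: CHILD and run_child are names invented here, and the delay is shortened to 0.3 seconds.

```python
import subprocess
import sys

# Script run in a child interpreter; {daemon} is filled in by run_child().
CHILD = '''
import threading
from time import sleep

def loop():
    sleep(0.3)
    print("loop done")

t = threading.Thread(target=loop, daemon={daemon})
t.start()
print("main done")   # no join: the main thread ends right away
'''

def run_child(daemon):
    """Run CHILD with the given daemon flag and return its stdout lines."""
    result = subprocess.run(
        [sys.executable, '-c', CHILD.format(daemon=daemon)],
        capture_output=True, text=True, timeout=30,
    )
    return result.stdout.splitlines()

print('daemon=False:', run_child(False))
print('daemon=True: ', run_child(True))
```

With daemon=False the child process stays alive until loop finishes, so both lines appear. With daemon=True the process exits as soon as the main thread is done and "loop done" is never printed.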

One more example:

import threading
from time import sleep, ctime

loops = [4,2]

def loop(nloop, nsec):
    print('start loop', nloop, 'at:', ctime())
    sleep(nsec)
    print('loop', nloop, 'done at', ctime())

def main():
    print('Starting at:', ctime()) 
    threads = []
    nloops = range(len(loops))

    for i in nloops:
        t = threading.Thread(target=loop, args=(i, loops[i]))
        threads.append(t)

    for i in nloops:  #start threads
        threads[i].start()

    # for i in nloops:   #wait for all threads to finish
    #     threads[i].join()

    # only let the second thread 'join'
    threads[1].join()
    print('all DONE at:', ctime())

if __name__ == '__main__':
    main()

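As you have probably guessed, we join only threads[1], the 2-second one, so the main thread waits for that child thread to finish before it continues. The 4-second thread threads[0] is not joined, so it finishes after the main thread has already printed "all DONE".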

start loop 0 at: Mon Apr  2 20:17:18 2018
start loop 1 at: Mon Apr  2 20:17:18 2018
loop 1 done at Mon Apr  2 20:17:20 2018
all DONE at: Mon Apr  2 20:17:20 2018  # the main thread continues only after loop 1 has finished
loop 0 done at Mon Apr  2 20:17:22 2018

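Summary
1. The main thread waits for any child thread it has called join() on before it continues.
2. Threads created from the main thread with threading are non-daemon by default; the program can exit only after all of these threads have finished.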

Object-oriented style

import threading
from time import sleep, ctime

loops = [4,2]

class MyThread(threading.Thread):
    def __init__(self, func, args, name=''):
        super().__init__()  # the base-class constructor must be called first
        self.name = name
        self.func = func
        self.args = args
        self.res = None  # set by run(); stays None until the thread has run

    def getResult(self):
        return self.res

    def run(self):  # override run(); start() calls it in the new thread
        print('Starting', self.name, 'at:', ctime())
        self.res = self.func(*self.args)

def loop(nloop, nsec):
    print('start loop', nloop, 'at:', ctime())
    sleep(nsec)
    print('loop', nloop, 'done at', ctime())

def main():
    print('Starting at:', ctime()) 
    threads = []
    nloops = range(len(loops))

    for i in nloops:
        # t = threading.Thread(target=loop, args=(i, loops[i]))
        t = MyThread(loop, (i, loops[i]), loop.__name__)
        threads.append(t)

    for i in nloops:  #start threads
        threads[i].start()

    for i in nloops:   #wait for all threads to finish
        threads[i].join()

    print('all DONE at:', ctime())

if __name__ == '__main__':
    main()
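The getResult() method only pays off when the target function returns something, and loop returns None. A pared-down sketch of the same subclassing idea with a target that does return a value; ResultThread, square and squares are hypothetical names used only for this illustration:

```python
import threading

class ResultThread(threading.Thread):
    """Hypothetical variant of MyThread that keeps the target's return value."""

    def __init__(self, func, args=()):
        super().__init__()
        self.func = func
        self.args = args
        self.res = None  # defined up front so get_result() is always safe

    def run(self):
        # Runs in the new thread once start() is called
        self.res = self.func(*self.args)

    def get_result(self):
        return self.res

def square(x):
    return x * x

def squares(values):
    threads = [ResultThread(square, (v,)) for v in values]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # a result is only meaningful once its thread has finished
    return [t.get_result() for t in threads]

print(squares([2, 3, 4]))
```

Reading the result before join() returns would risk seeing None, which is why the results are collected only after every thread has been joined.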
