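The earlier posts "Understanding Python's GIL by Example", "Python's GIL Revisited", and "Python's GIL Revisited (Continued)" established that Python threads perform relatively poorly on multi-core machines, while the problem does not appear on a single core. Is there really no good way to make Python threads perform as well on multiple cores as they do on one? There is: restrict the Python process to a specified CPU.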
On Windows, one function restricts a process to specified CPUs: SetProcessAffinityMask
BOOL WINAPI SetProcessAffinityMask(
_In_ HANDLE hProcess,
_In_ DWORD_PTR dwProcessAffinityMask
);
dwProcessAffinityMask is a bit mask that selects the CPUs the process may run on. Bit N corresponds to CPU N, so 1 means CPU 0, 2 means CPU 1, and 3 means both CPU 0 and CPU 1.
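The mask arithmetic described here can be sketched in plain Python. The helper names `cpus_to_mask` and `mask_to_cpus` are hypothetical and only illustrate the bit layout; they are not part of the Windows API.

```python
def cpus_to_mask(cpus):
    """Build an affinity bit mask from an iterable of zero-based CPU indices."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << cpu  # setting bit N allows CPU N
    return mask


def mask_to_cpus(mask):
    """Return the sorted list of CPU indices whose bits are set in mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


if __name__ == "__main__":
    print(cpus_to_mask([0]))     # 1
    print(cpus_to_mask([1]))     # 2
    print(cpus_to_mask([0, 1]))  # 3
    print(mask_to_cpus(3))       # [0, 1]
```

The value 1 passed to SetAffinity in the test script therefore corresponds to `cpus_to_mask([0])`.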
Here is how to implement it:
utility.pyx
cdef extern from "Windows.h":
    ctypedef int BOOL
    ctypedef void * HANDLE
    ctypedef unsigned long DWORD_PTR
    int SetProcessAffinityMask(HANDLE hProcess,DWORD_PTR dwProcessAffinityMask) nogil
    HANDLE GetCurrentProcess() nogil

def SetAffinity(int mask):
    with nogil:
        SetProcessAffinityMask(<HANDLE>GetCurrentProcess(),mask)
Setup.py
from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize

ext = Extension("utility",
                define_macros = [('MAJOR_VERSION', '1'),
                                 ('MINOR_VERSION', '0')],
                sources = ["utility.pyx", ])

setup(
    name = 'callback',
    version = '1.0',
    description = 'This is a callback demo package',
    author = '',
    author_email = '[email protected]',
    url = '',
    long_description = '',
    ext_modules=cythonize([ext,]),
)
Build utility.pyd:
python Setup.py build_ext --inplace
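If building a Cython extension is inconvenient, the same Win32 call can probably be reached through the standard-library ctypes module. The following is a minimal sketch, not the approach used in this post. The function name `set_affinity` is hypothetical, and the mask is declared as `c_size_t` on the assumption that this matches the pointer-sized DWORD_PTR.

```python
import ctypes
import sys


def set_affinity(mask):
    """Restrict the current process to the CPUs whose bits are set in mask.

    Windows only. Returns True on success and raises OSError on failure.
    """
    if sys.platform != "win32":
        raise NotImplementedError("SetProcessAffinityMask exists only on Windows")

    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)

    # Declare signatures so the pseudo-handle is not truncated on 64-bit builds.
    kernel32.GetCurrentProcess.argtypes = []
    kernel32.GetCurrentProcess.restype = wintypes.HANDLE
    kernel32.SetProcessAffinityMask.argtypes = [wintypes.HANDLE, ctypes.c_size_t]
    kernel32.SetProcessAffinityMask.restype = wintypes.BOOL

    ok = kernel32.SetProcessAffinityMask(kernel32.GetCurrentProcess(), mask)
    if not ok:
        raise ctypes.WinError(ctypes.get_last_error())
    return True
```

Unlike the Cython version, this sketch checks the BOOL return value, so an invalid mask (for example, one naming a CPU the machine does not have) surfaces as an exception instead of being silently ignored.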
Now for the test case:
count.py
from threading import Thread
from threading import Event as TEvent
from multiprocessing import Process
from multiprocessing import Event as PEvent
import utility
utility.SetAffinity(1)
from timeit import Timer
import sys
sys.setcheckinterval(100) #(100000)

def countdown(n,event):
    while n > 0:
        n -= 1
    event.set()

def io_op(n,event,filename):
    f = open(filename,'w')
    while not event.is_set():
        f.write('hello,world')
    f.close()

def t1():
    COUNT=100000000
    event = TEvent()
    thread1 = Thread(target=countdown,args=(COUNT,event))
    thread1.start()
    thread1.join()

def t2():
    COUNT=100000000
    event = TEvent()
    thread1 = Thread(target=countdown,args=(COUNT//2,event))
    thread2 = Thread(target=countdown,args=(COUNT//2,event))
    thread1.start(); thread2.start()
    thread1.join(); thread2.join()

def t3():
    COUNT=100000000
    event = PEvent()
    p1 = Process(target=countdown,args=(COUNT//2,event))
    p2 = Process(target=countdown,args=(COUNT//2,event))
    p1.start(); p2.start()
    p1.join(); p2.join()

def t4():
    COUNT=100000000
    event = TEvent()
    thread1 = Thread(target=countdown,args=(COUNT,event))
    thread2 = Thread(target=io_op,args=(COUNT,event,'thread.txt'))
    thread1.start(); thread2.start()
    thread1.join(); thread2.join()

def t5():
    COUNT=100000000
    event = PEvent()
    p1 = Process(target=countdown,args=(COUNT,event))
    p2 = Process(target=io_op,args=(COUNT,event,'process.txt'))
    p1.start(); p2.start()
    p1.join(); p2.join()

if __name__ == '__main__':
    t = Timer(t1)
    print('countdown in one thread:%f'%(t.timeit(1),))
    t = Timer(t2)
    print('countdown use two thread:%f'%(t.timeit(1),))
    t = Timer(t3)
    print('countdown use two Process:%f'%(t.timeit(1),))
    t = Timer(t4)
    print('countdown in one thread with io op in another thread:%f'%(t.timeit(1),))
    t = Timer(t5)
    print('countdown in one process with io op in another process:%f'%(t.timeit(1),))
Compared with the earlier test case, two lines were added:
import utility
utility.SetAffinity(1)
Here is the output of the test case:
countdown in one thread:7.005823
countdown use two thread:4.790538
countdown use two Process:4.936478
countdown in one thread with io op in another thread:9.526901
countdown in one process with io op in another process:9.262508
For comparison, here is the earlier output from the single-core run:
countdown in one thread: 5.9650638561501195
countdown use two thread: 5.8188333656781595
countdown use two Process 6.197559396296269
countdown in one thread with io op in another thread: 11.369204522553051
countdown in one process with io op in another process: 11.79234388645473
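More programs were running during this test, so the numbers differ somewhat from the earlier run, but they are broadly consistent. I said before that Thread should be avoided on multi-core machines. That now looks wrong: it is enough to restrict which CPU the process runs on.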
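To put a number on "broadly consistent", the two sets of timings quoted above can be compared directly. This is only a quick arithmetic sketch. The short labels are mine, and each figure comes from a single run, so the ratios are indicative at best.

```python
# Timings in seconds, copied from the two runs quoted above.
labels = [
    "one thread",
    "two threads",
    "two processes",
    "thread + io thread",
    "process + io process",
]
pinned = [7.005823, 4.790538, 4.936478, 9.526901, 9.262508]
single_core = [
    5.9650638561501195,
    5.8188333656781595,
    6.197559396296269,
    11.369204522553051,
    11.79234388645473,
]

# A ratio below 1 means the pinned run on the multi-core machine was faster.
ratios = [p / s for p, s in zip(pinned, single_core)]

if __name__ == "__main__":
    for label, ratio in zip(labels, ratios):
        print("%-22s pinned / single-core = %.2f" % (label, ratio))
```

The ratios fall between roughly 0.8 and 1.2, which fits the claim that pinning the process to one CPU brings multi-core behaviour close to the single-core case.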
On Linux, the taskset command can likewise set which CPUs a process runs on. That will be covered in a later post.
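Until then, here is a minimal sketch of doing the same thing from inside Python on Linux. It uses `os.sched_setaffinity` (available on Linux since Python 3.3) rather than taskset. The helper name `pin_to_cpu` is hypothetical.

```python
import os


def pin_to_cpu(cpu=None):
    """Pin the calling process to one CPU and return the resulting CPU set.

    If cpu is None, the lowest-numbered CPU currently allowed is chosen,
    which avoids asking for a CPU that a container or cpuset forbids.
    """
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("os.sched_setaffinity is not available here")
    allowed = os.sched_getaffinity(0)  # pid 0 means the caller
    if cpu is None:
        cpu = min(allowed)
    os.sched_setaffinity(0, {cpu})
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    print(pin_to_cpu())
```

As with SetAffinity(1) in count.py, the call should come before any threads are started so that they inherit the restriction. From the shell, `taskset -c 0 python count.py` should have a comparable effect without changing the script.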
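One conclusion follows from this post: the impact of Python's GIL on multi-core CPUs is not as bad as previously thought. There are always more solutions than problems; it only depends on whether you persevere.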