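Servers and workstations are mostly 2-way x86 PC servers, generally equipped with multi-core CPUs. With that many CPU cores available, not using them for parallel work is a waste of resources, and multi-core parallelism has plenty of use cases.
The Python 2.7 standard library offers two modules that look suited to the job: threading and multiprocessing.
The two differ greatly, as the names make clear: threading implements multiple threads, while multiprocessing implements multiple processes. People familiar with MPI programming may assume that either one can be used for multi-core parallel computation. That is a misunderstanding, and the reason is the CPython interpreter's global interpreter lock (GIL).
The following is quoted from the official documentation: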
The Python interpreter is not fully thread-safe. In order to support multi-threaded Python programs, there’s a global lock, called the global interpreter lock or GIL, that must be held by the current thread before it can safely access Python objects. Without the lock, even the simplest operations could cause problems in a multi-threaded program: for example, when two threads simultaneously increment the reference count of the same object, the reference count could end up being incremented only once instead of twice.
Therefore, the rule exists that only the thread that has acquired the GIL may operate on Python objects or call Python/C API functions. In order to emulate concurrency of execution, the interpreter regularly tries to switch threads (see sys.setcheckinterval()). The lock is also released around potentially blocking I/O operations like reading or writing a file, so that other Python threads can run in the meantime.
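In one sentence:
Because of the GIL, the interpreter can execute at most one thread at a time. Even if another thread could be scheduled on an idle CPU core, it has to wait until the running thread releases the GIL, which happens when the interpreter forces a periodic switch or when the running thread blocks on I/O.
Note: a little sad, isn't it? In effect, CPython threads cannot run Python code on more than one core at the same time, so threads bring no speedup for CPU-bound work.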
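The quoted documentation also says the lock is released around potentially blocking operations, so threads can still overlap waiting time. Here is a minimal sketch of that effect, assuming time.sleep stands in for blocking I/O; fake_io, run_in_threads and the 0.2 s delay are hypothetical names and values chosen only for illustration:

```python
from __future__ import print_function

import threading
import time


def fake_io(delay):
    # Assumption: time.sleep stands in for a blocking read or write.
    # It does not hold the GIL while waiting, so other threads can run.
    time.sleep(delay)


def run_in_threads(count, delay):
    """Run `count` blocking calls in separate threads; return elapsed wall-clock seconds."""
    threads = [threading.Thread(target=fake_io, args=(delay,)) for _ in range(count)]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - start


if __name__ == '__main__':
    elapsed = run_in_threads(4, 0.2)
    # Run one after another, the four waits would add up to about 0.8 s.
    print('4 threads, 0.2 s each: %.2f s' % elapsed)
```

The elapsed time should be close to a single delay rather than the sum of all four, which is why threading remains useful for I/O-bound programs even though it does not help with CPU-bound loops.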
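The script below times the same CPU-bound countdown run serially, in two threads, and in two processes:

    # -*- coding: utf-8 -*-
    '''
    Created on 2017-06-20
    @author: will
    '''
    from threading import Thread
    import multiprocessing
    import time


    def countdown(n):
        # Pure CPU-bound loop with no blocking I/O, so a thread only gives up
        # the GIL when the interpreter forces a switch.
        while n > 0:
            n -= 1


    def mult_threads(n):
        # Split the work across two threads in the same process.
        t1 = Thread(target=countdown, args=(n // 2,))
        t2 = Thread(target=countdown, args=(n // 2,))
        t1.start()
        t2.start()
        t1.join()
        t2.join()


    def mult_process(n):
        # Split the work across two processes, each with its own interpreter and GIL.
        pn1 = multiprocessing.Process(target=countdown, args=(n // 2,))
        pn2 = multiprocessing.Process(target=countdown, args=(n // 2,))
        pn1.start()
        pn2.start()
        pn1.join()
        pn2.join()


    if __name__ == '__main__':
        num = 1000000000
        print 'CPU_Cores = :' + str(multiprocessing.cpu_count())
        # time.time() measures wall-clock time. The original run used time.clock(),
        # which on Unix reports only the calling process's CPU time and would not
        # account for work done in child processes.
        # Single thread, single process (serial)
        t1_s_time = time.time()
        countdown(num)
        print '1 thread 1 Process exec:' + str(time.time() - t1_s_time)
        # Two threads
        t2_s_time = time.time()
        mult_threads(num)
        print '2 threads exec:' + str(time.time() - t2_s_time)
        # Two processes
        t3_s_time = time.time()
        mult_process(num)
        end_time = time.time()
        print '1 Threads 2 Process execTime = ' + str(end_time - t3_s_time)

Output of the original run:

    CPU_Cores = :4
    1 thread 1 Process exec:36.5817461267
    2 threads exec:132.859294644
    1 Threads 2 Process execTime = 21.1290401559

In that run the two-thread time was mistakenly measured from the start of the serial run (the listing above corrects this), so the 132.86 s figure includes the 36.58 s serial run. The two threads alone took roughly 96.3 s.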
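Judging by the run times, two threads performed worse than a single thread, whereas two processes escaped the GIL and achieved real multi-core parallelism (about 21.1 s against 36.6 s for the serial run). When CPU-bound work needs to run on multiple cores, prefer the multiprocessing library.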
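As a sketch of that recommendation, multiprocessing.Pool can spread the countdown over all available cores instead of a fixed two processes. split_work and parallel_countdown are hypothetical helper names introduced here, the iteration count is arbitrary, and countdown is modified to return how many iterations it performed so the result can be checked:

```python
from __future__ import print_function

import multiprocessing


def countdown(n):
    # Same CPU-bound loop as the benchmark, but returns the number of iterations done.
    done = 0
    while n > 0:
        n -= 1
        done += 1
    return done


def split_work(total, parts):
    """Split `total` iterations into `parts` chunks whose sizes differ by at most one."""
    base, extra = divmod(total, parts)
    return [base + 1 if i < extra else base for i in range(parts)]


def parallel_countdown(total, workers):
    """Run the countdown in `workers` processes and return the total iterations done."""
    chunks = split_work(total, workers)
    pool = multiprocessing.Pool(processes=workers)
    try:
        # Each chunk runs in a separate worker process with its own GIL.
        results = pool.map(countdown, chunks)
    finally:
        pool.close()
        pool.join()
    return sum(results)


if __name__ == '__main__':
    workers = multiprocessing.cpu_count()
    print('iterations done:', parallel_countdown(1000000, workers))
```

Unlike the n // 2 split in the benchmark, split_work hands out the remainder too, so no iterations are lost when the total does not divide evenly. The `if __name__ == '__main__':` guard matters on platforms where worker processes re-import the main module.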