Python: Multiprocessing vs. Multithreading



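These notes draw on teacher Dana's Python course. Today I am writing a little about multiprocessing and multithreading so we can learn together and help each other improve. Thanks, teacher Dana!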

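# Multithreading vs. Multiprocessing

1. Process:

A process is one dynamic execution of a program over a data set. It generally consists of three parts: the program, the data set, and the process control block.

  •          A running state of a program
  •          Includes an address space, memory, a data stack, and so on
  •          Each process has its own fully independent runtime environment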

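2. Thread:

A thread is the smallest unit that the operating system can schedule. It lives inside a process and is the unit that actually does the work. A thread is a single sequential flow of control within a process; one process can run several threads concurrently, each carrying out a different task.

A thread is also called a lightweight process. It is the basic unit of CPU execution and the smallest unit of program execution, made up of a thread ID, a program counter, a register set, and a stack. Threads lower the cost of running a program concurrently and improve the operating system's concurrency. A thread owns almost no system resources of its own; it uses those of its process.

  •           An independently running part of a process; one process can have several threads
  •           A lightweight process
  •           The threads of one process share data and the runtime context
  •           Sharing raises mutual-exclusion problems

    How processes and threads relate:

    (1) A thread belongs to exactly one process, while a process can have many threads and always has at least one.
    (2) Resources are allocated to the process; all threads of a process share all of its resources.
    (3) The CPU is allocated to threads: what actually runs on the CPU is a thread.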
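
The three points above can be sketched in a few lines. This is only an illustration: the names `shared`, `worker`, and `run_demo` are made up here. Each thread appends to the same list and reports the same process ID, since the resources belong to the process rather than to the thread.

```python
import os
import threading

shared = []  # lives in the process, so every thread in it sees the same list


def worker(tag):
    # Record which thread ran and which process it ran in.
    shared.append((tag, os.getpid()))


def run_demo():
    shared.clear()
    threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(shared)


if __name__ == "__main__":
    for tag, pid in run_demo():
        print("thread", tag, "ran in process", pid)
```

All three tuples should carry the process ID of the script itself.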


     

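3. Global Interpreter Lock (GIL):

  •           Python code is executed under the control of the Python virtual machine (CPython)
  •           Only one thread of control runs in the interpreter's main loop at any moment
  •           Under Python multithreading, each thread runs like this:

                   Acquire the GIL

                   Run code until it sleeps or the Python virtual machine suspends it

                   Release the GIL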
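
A rough way to see the "run until it sleeps" step: threads that only sleep give up the GIL while they wait, so their waits overlap. The sketch below assumes standard CPython; `nap` and `timed_naps` are names invented for this example.

```python
import threading
import time


def nap(seconds):
    # time.sleep() blocks, so the thread releases the GIL while it waits.
    time.sleep(seconds)


def timed_naps(count=3, seconds=0.3):
    threads = [threading.Thread(target=nap, args=(seconds,)) for _ in range(count)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    # Run one after another, three 0.3 s naps would take about 0.9 s.
    print("elapsed: %.2f s" % timed_naps())
```

The elapsed time should land close to a single nap. CPU-bound loops would not overlap this way, because they need the GIL to make progress.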

4. Python packages:

  •          threading: the general-purpose package

         Sequential execution, which takes longer

import time


def loop1():
    # ctime() returns the current time as a string
    print('start loop 1 at:', time.ctime())
    time.sleep(4)
    print('End loop 1 at:', time.ctime())


def loop2():
    # ctime() returns the current time as a string
    print('start loop 2 at:', time.ctime())
    time.sleep(2)
    print('End loop 2 at:', time.ctime())


def main():
    print('starting at:', time.ctime())
    loop1()
    loop2()
    print('all done at:', time.ctime())


if __name__ == '__main__':
    main()


         Switch to multithreading with _thread to shorten the total time

import time
import _thread as thread


def loop1():
    # ctime() returns the current time as a string
    print('start loop 1 at:', time.ctime())
    time.sleep(4)
    print('End loop 1 at:', time.ctime())


def loop2():
    # ctime() returns the current time as a string
    print('start loop 2 at:', time.ctime())
    time.sleep(2)
    print('End loop 2 at:', time.ctime())


def main():
    print('starting at:', time.ctime())
    # loop1()
    # loop2()

    thread.start_new_thread(loop1, ())
    thread.start_new_thread(loop2, ())
    # Printed right away: start_new_thread() does not wait for the threads
    print('all done at:', time.ctime())


if __name__ == '__main__':
    main()
    # _thread has no join(), so keep the main thread alive until both loops finish
    time.sleep(5)

         Multithreading with arguments

import time
import _thread as thread


def loop1(in1):
    # ctime() returns the current time as a string
    print('start loop1 at:', time.ctime())
    print("argument:", in1)
    time.sleep(4)
    print('End loop1 at:', time.ctime())


def loop2(in1, in2):
    # ctime() returns the current time as a string
    print('start loop2 at:', time.ctime())
    print("argument 1:", in1, "argument 2:", in2)
    time.sleep(2)
    print('End loop2 at:', time.ctime())


def main():
    print('all loop at:', time.ctime())
    # loop1(1)
    # loop2(1, 2)
    # A one-element tuple needs the trailing comma
    thread.start_new_thread(loop1, ('with a comma', ))
    thread.start_new_thread(loop2, ('two', 'two arguments'))
    print('end loop at:', time.ctime())


if __name__ == '__main__':
    main()
    # _thread has no join(), so keep the main thread alive until both loops finish
    time.sleep(5)

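5. Using threading:

  •          Create a Thread instance directly with threading.Thread
  •         t = threading.Thread(target=xxx, args=(xxx,))
  •         t.start(): starts the thread
  •         t.join(): waits for the thread to finish; the thread that calls join() (the parent) stays blocked until the child thread has finished running

        Starting threads with t.start()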

import time
import threading


def loop1(in1):
    # ctime() returns the current time as a string
    print('start loop1 at:', time.ctime())
    print("argument:", in1)
    time.sleep(4)
    print('End loop1 at:', time.ctime())


def loop2(in1, in2):
    # ctime() returns the current time as a string
    print('start loop2 at:', time.ctime())
    print("argument 1:", in1, "argument 2:", in2)
    time.sleep(2)
    print('End loop2 at:', time.ctime())


def main():
    print('all loop at:', time.ctime())
    # loop1(1)
    # loop2(1, 2)
    # A one-element tuple needs the trailing comma
    t1 = threading.Thread(target=loop1, args=('with a comma', ))
    t1.start()
    t2 = threading.Thread(target=loop2, args=('two', 'two arguments'))
    t2.start()
    # Printed right away: start() does not wait for the threads
    print('end loop at:', time.ctime())


if __name__ == '__main__':
    main()

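        With join added, compare the output with the previous example: 'end loop at' is now printed only after both threads have finished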

import time
import threading


def loop1(in1):
    # ctime() returns the current time as a string
    print('start loop1 at:', time.ctime())
    print("argument:", in1)
    time.sleep(4)
    print('End loop1 at:', time.ctime())


def loop2(in1, in2):
    # ctime() returns the current time as a string
    print('start loop2 at:', time.ctime())
    print("argument 1:", in1, "argument 2:", in2)
    time.sleep(2)
    print('End loop2 at:', time.ctime())


def main():
    print('all loop at:', time.ctime())
    # loop1(1)
    # loop2(1, 2)
    # A one-element tuple needs the trailing comma
    t1 = threading.Thread(target=loop1, args=('with a comma', ))
    t1.start()
    t2 = threading.Thread(target=loop2, args=('two', 'two arguments'))
    t2.start()

    # Block here until both threads have finished
    t1.join()
    t2.join()

    print('end loop at:', time.ctime())


if __name__ == '__main__':
    main()
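
join() also accepts a timeout. As a small sketch (the names `slow_task` and `join_with_timeout` are made up for this example): join() always returns None, so after a timed join you have to ask the thread with is_alive() whether it really finished.

```python
import threading
import time


def slow_task():
    time.sleep(0.5)


def join_with_timeout():
    t = threading.Thread(target=slow_task)
    t.start()
    t.join(timeout=0.1)            # wait at most 0.1 s
    still_running = t.is_alive()   # the task needs 0.5 s, so it is not done yet
    t.join()                       # now wait until it really finishes
    return still_running, t.is_alive()


if __name__ == "__main__":
    print(join_with_timeout())
```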

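       Daemon threads (daemon)

  •             If a child thread is marked as a daemon thread, it exits automatically when the main thread ends
  •             Daemon threads are generally considered unimportant, or not allowed to run independently of the main thread
  •             Whether the daemon example shows the effect depends on the environment it runs in

            Non-daemon thread: every print runs as expected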

import time
import threading


def fun():
    print("start fun")
    time.sleep(2)
    print("end fun")


print("main thread")

t1 = threading.Thread(target=fun, args=())
t1.start()

time.sleep(1)
print("main thread end")

             Daemon thread: once the main thread ends, "end fun" is never printed

import time
import threading


def fun():
    print("start fun")
    time.sleep(2)
    print("end fun")


print("start thread")
t1 = threading.Thread(target=fun, args=())
# daemon must be set before start(); setDaemon() is deprecated since Python 3.10
t1.daemon = True
t1.start()
time.sleep(1)

print("end thread")
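
As a side note, the daemon flag can also be passed to the Thread constructor. The sketch below uses made-up helper names (`background`, `make_daemon`, `daemon_after_start_fails`) and also shows why the flag has to be set before start(): changing it on a started thread raises RuntimeError.

```python
import threading
import time


def background():
    time.sleep(0.2)


def make_daemon():
    # daemon=True here has the same effect as setting t.daemon = True before start()
    t = threading.Thread(target=background, daemon=True)
    t.start()
    return t


def daemon_after_start_fails(t):
    # Changing the flag on a thread that has already started is an error.
    try:
        t.daemon = False
    except RuntimeError:
        return True
    return False


if __name__ == "__main__":
    t = make_daemon()
    print("is daemon:", t.daemon)
    print("changing it after start fails:", daemon_after_start_fails(t))
    t.join()
```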

            

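         Commonly used thread attributes

  •             threading.current_thread(): returns the current Thread object (older alias: currentThread)
  •             threading.enumerate(): returns a list of the running threads, meaning threads that have started and not yet finished
  •             threading.active_count(): returns the number of running threads, the same as len(threading.enumerate()) (older alias: activeCount)
  •             t.name = "xxx": gives a thread a name (older method: t.setName)
  •             t.name: reads the name of a thread (older method: t.getName)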
    import time
    import threading
    
    
    def loop1():
        print('start loop1 at:', time.ctime())
        time.sleep(4)
        print('End loop1 at:', time.ctime())
    
    
    def loop2():
        print('start loop2 at:', time.ctime())
        time.sleep(2)
        print('End loop2 at:', time.ctime())
    
    
    def loop3():
        print('start loop3 at:', time.ctime())
        time.sleep(6)
        print('End loop3 at:', time.ctime())
    
    
    def main():
        print("starting at:", time.ctime())
        t1 = threading.Thread(target=loop1, args=())
        t1.name = "tre1"
        t1.start()
    
        t2 = threading.Thread(target=loop2, args=())
        t2.name = "tre2"
        t2.start()
    
        t3 = threading.Thread(target=loop3, args=())
        t3.name = "tre3"
        t3.start()
    
        time.sleep(3)
    
        for thr in threading.enumerate():
            print("Running thread name: {0}".format(thr.name))
        # active_count() includes the main thread
        print("Number of running threads: {0}".format(threading.active_count()))
    
        print("all done at: ", time.ctime())
    
    
    if __name__ == "__main__":
        # The threads are not daemons, so the program waits for them by itself
        main()

     

          Subclassing threading.Thread

  •             Inherit directly from Thread
  •             Override the run method
  •             An instance of the class can be started directly
    import time
    import threading
    
    
    # The class must inherit from threading.Thread
    class MyThread(threading.Thread):
        def __init__(self, arg):
            super(MyThread, self).__init__()
            self.arg = arg
    
        # run() must be overridden; it holds the work the thread actually does
        def run(self):
            time.sleep(2)
            print("The args for this is {0}".format(self.arg))
    
    
    for i in range(5):
        t = MyThread(i)
        t.start()
        t.join()
    
    print("main thread is done!!!!")
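
Note that the loop above calls join() right after start(), so each thread finishes before the next one begins. A variation, sketched here with a hypothetical `SquareThread` class and `run_all` helper, starts every thread first and joins afterwards so the waits overlap; it also stores a result on the instance, which is one common reason to subclass Thread.

```python
import threading
import time


class SquareThread(threading.Thread):
    # Hypothetical subclass for illustration: run() computes arg * arg.
    def __init__(self, arg):
        super().__init__()
        self.arg = arg
        self.result = None

    def run(self):
        time.sleep(0.2)
        self.result = self.arg * self.arg


def run_all(n=5):
    threads = [SquareThread(i) for i in range(n)]
    start = time.perf_counter()
    for t in threads:
        t.start()   # start everything first
    for t in threads:
        t.join()    # then wait for all of them
    return [t.result for t in threads], time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_all()
    print(results, "in %.2f s" % elapsed)
```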

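    Production-style example (my own view: wrap the arguments that need to be passed in a class, then pass them in directly, which tidies up the code)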

    import time
    import threading
    
    # loop = [4, 2]
    
    
    class ThreadFunc(object):
        def __init__(self, name):
            self.name = name
    
        def loop(self, nloop, nsec):
            '''
            :param nloop: name of the loop
            :param nsec: number of seconds to sleep
            :return: None
            '''
            print("Start loop", nloop, "at", time.ctime())
            time.sleep(nsec)
            print("Done loop", nloop, "at", time.ctime())
    
    
    def main():
        print("Starting at:", time.ctime())
    
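        # ThreadFunc("loop").loop is equivalent to these two statements:
        # t = ThreadFunc("loop")
        # t.loop
        # t1 and t2 below are therefore defined in the same way
        t = ThreadFunc("loop")
        t1 = threading.Thread(target=t.loop, args=("loop1", 4))
        # The one-line form below is the more compact, production-style way to write it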
        t2 = threading.Thread(target=ThreadFunc("loop").loop, args=("loop2", 2))
    
        t1.start()
        t2.start()
    
        t1.join()
        t2.join()
    
        print("all done at:", time.ctime())
    
    
    if __name__ == "__main__":
        # main() already joins both threads, so nothing else has to wait
        main()

     

6. Shared variables:

  •             Shared variables: when several threads access the same variable at the same time, the result can come out wrong
    import threading
    
    loopSum = 1000
    sum = 0
    
    
    def myAdd():
        global sum, loopSum
        for i in range(1, loopSum):
            sum += 1
    
    
    def myMinu():
        global sum, loopSum
        for i in range(1, loopSum):
            sum -= 1
    
    
    if __name__ == '__main__':
        print("starting ....{0}".format(sum))
        t1 = threading.Thread(target=myAdd, args=())
        t2 = threading.Thread(target=myMinu, args=())
    
        t1.start()
        t2.start()
    
        t1.join()
        t2.join()
    
        print("Done...{0}".format(sum))

     

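7. Lock:

  •             A flag that says a thread is currently using some resource
  •             How to use it:
  •            Acquire the lock
  •             Use the shared resource with peace of mind
  •             Release the lock
  •             What to lock: whichever resource is shared by several threads
  •             Understanding locks: a lock does not really lock anything; it is a token that a thread must hold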
    import threading
    
    loopSum = 1000
    sum = 0
    
    
    lock = threading.Lock()
    
    
    def myAdd():
        global sum, loopSum
        for i in range(1, loopSum):
            # Acquire the lock
            lock.acquire()
            sum += 1
            # Release the lock
            lock.release()
    
    
    def myMinu():
        global sum, loopSum
        for i in range(1, loopSum):
            lock.acquire()
            sum -= 1
            lock.release()
    
    
    if __name__ == '__main__':
        print("starting ....{0}".format(sum))
        t1 = threading.Thread(target=myAdd, args=())
        t2 = threading.Thread(target=myMinu, args=())
    
        t1.start()
        t2.start()
    
        t1.join()
        t2.join()
    
        print("Done...{0}".format(sum))
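
The acquire/release pair can also be written with a `with` statement, which releases the lock even if the body raises. A minimal sketch of the same add/subtract idea; the names `counter`, `add`, `subtract`, and `run` are chosen for this example only:

```python
import threading

counter = 0
lock = threading.Lock()


def add(times):
    global counter
    for _ in range(times):
        with lock:  # acquire() on entry, release() on exit, even on error
            counter += 1


def subtract(times):
    global counter
    for _ in range(times):
        with lock:
            counter -= 1


def run(times=50_000):
    global counter
    counter = 0
    t1 = threading.Thread(target=add, args=(times,))
    t2 = threading.Thread(target=subtract, args=(times,))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return counter


if __name__ == "__main__":
    print("final value:", run())
```

With the lock in place the additions and subtractions cancel out exactly.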

     

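8. Thread safety:

  •             A resource or variable is thread-safe if several threads can use it without a lock and nothing goes wrong
  •             Types that are not thread-safe: list, set, dict (compound read-modify-write operations on them need a lock)
  •             Thread-safe type: queue (queue.Queue)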
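
As a quick sketch of the thread-safe case: several threads can put items into one queue.Queue without any explicit lock and nothing gets lost. The `producer` and `collect` helpers below are invented for this illustration.

```python
import queue
import threading


def producer(q, start, count):
    # put() does its own locking, so no explicit Lock is needed here.
    for n in range(start, start + count):
        q.put(n)


def collect(producers=4, per_producer=250):
    q = queue.Queue()
    threads = [
        threading.Thread(target=producer, args=(q, i * per_producer, per_producer))
        for i in range(producers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    items = []
    # All producers are done, so draining until empty is safe at this point.
    while not q.empty():
        items.append(q.get())
    return sorted(items)


if __name__ == "__main__":
    print(len(collect()), "items collected")
```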

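9. The producer-consumer problem:

  •             A model that can be used to build a message queue
  •             A queue is a data structure for storing values; it is first in, first out, its elements wait in line, and it can be thought of as a special kind of list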
    import threading
    import time
    import queue
    
    
    class Produce(threading.Thread):
        def run(self):
            count = 0
            while True:
                # Only produce while fewer than 1000 items are waiting
                if q.qsize() < 1000:
                    for i in range(100):
                        count += 1
                        msg = 'produced item ' + str(count)
                        q.put(msg)
                        print(msg)
                time.sleep(0.5)
    
    
    class Consumer(threading.Thread):
        def run(self):
            count = 0
            while True:
                # Only consume while more than 100 items are waiting
                if q.qsize() > 100:
                    for i in range(3):
                        count += 1
                        msg = self.name + ' consumed ' + q.get()
                        print(msg)
                time.sleep(1)
    
    
    if __name__ == '__main__':
        # Named q so that it does not shadow the queue module
        q = queue.Queue()
    
        for i in range(500):
            q.put('initial item ' + str(i))
        for i in range(2):
            p = Produce()
            p.start()
        for i in range(5):
            c = Consumer()
            c.start()

     

  •             The deadlock problem
    import threading
    import time
    
    lock_1 = threading.Lock()
    lock_2 = threading.Lock()
    
    
    def func_1():
        print("func_1 starting.........")
        lock_1.acquire()
        print("func_1 acquired lock_1....")
        time.sleep(2)
        print("func_1 waiting for lock_2.......")
        lock_2.acquire()
        print("func_1 acquired lock_2.......")
    
        lock_2.release()
        print("func_1 released lock_2")
    
        lock_1.release()
        print("func_1 released lock_1")
    
        print("func_1 done..........")
    
    
    def func_2():
        print("func_2 starting.........")
        lock_2.acquire()
        print("func_2 acquired lock_2....")
        time.sleep(4)
        print("func_2 waiting for lock_1.......")
        lock_1.acquire()
        print("func_2 acquired lock_1.......")
    
        lock_1.release()
        print("func_2 released lock_1")
    
        lock_2.release()
        print("func_2 released lock_2")
    
        print("func_2 done..........")
    
    
    if __name__ == "__main__":
    
        print("main program starting..............")
        t1 = threading.Thread(target=func_1, args=())
        t2 = threading.Thread(target=func_2, args=())
    
        t1.start()
        t2.start()
    
        t1.join()
        t2.join()
    
        # Never reached: each thread holds one lock and waits forever for the other
        print("main program finished..............")
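
One common way out of this deadlock, which the original example does not show, is to make every thread take the locks in the same order. A minimal sketch; the `worker` function and the `finished` list are hypothetical names for this example:

```python
import threading
import time

lock_1 = threading.Lock()
lock_2 = threading.Lock()
finished = []


def worker(name, hold):
    # Both workers take lock_1 first and lock_2 second, so no cycle can form.
    with lock_1:
        time.sleep(hold)
        with lock_2:
            finished.append(name)


def run():
    finished.clear()
    t1 = threading.Thread(target=worker, args=("func_1", 0.1))
    t2 = threading.Thread(target=worker, args=("func_2", 0.2))
    t1.start()
    t2.start()
    t1.join(timeout=5)
    t2.join(timeout=5)
    return sorted(finished)


if __name__ == "__main__":
    print("finished:", run())
```

The second worker simply waits its turn for lock_1 instead of holding lock_2 while it waits.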

     

  •             Lock wait timeouts
    import time
    import threading
    
    
    lock_1 = threading.Lock()
    lock_2 = threading.Lock()
    
    
    def func_1():
        print("func_1 starting...")
        lock_1.acquire(timeout=4)
        print("func_1 acquired lock_1....")
        time.sleep(2)
        print("func_1 waiting for lock_2")
    
        # Give up after 2 seconds instead of waiting forever
        rst = lock_2.acquire(timeout=2)
        if rst:
            print("func_1 got lock_2")
            lock_2.release()
            print("func_1 released lock_2")
        else:
            print("func_1 did not get lock_2")
    
        lock_1.release()
        print("func_1 released lock_1")
    
        print("func_1 done....")
    
    
    def func_2():
        print("func_2 starting.........")
        lock_2.acquire()
        print("func_2 acquired lock_2....")
        time.sleep(4)
        print("func_2 waiting for lock_1.......")
        lock_1.acquire()
        print("func_2 acquired lock_1.......")
    
        lock_1.release()
        print("func_2 released lock_1")
    
        lock_2.release()
        print("func_2 released lock_2")
    
        print("func_2 done..........")
    
    
    if __name__ == '__main__':
        print('main thread starting')
        t1 = threading.Thread(target=func_1, args=())
        t2 = threading.Thread(target=func_2, args=())
    
        t1.start()
        t2.start()
    
        t1.join()
        t2.join()
    
        print("all thread end")

     

  •             Semaphore
  •             Limits how many threads may use a resource at the same time
    import threading
    import time
    
    
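    # The argument sets how many threads may use the resource at once
    semaphore = threading.Semaphore(3)
    
    
    def func():
        name = threading.current_thread().name
        # timeout must be passed by keyword; the first positional argument is `blocking`
        if semaphore.acquire(timeout=5):
            print(name + ' get semaphore ' + time.ctime())
            time.sleep(15)
            semaphore.release()
            print(name + ' release semaphore ' + time.ctime())
        else:
            # Only release what was actually acquired
            print(name + ' gave up waiting ' + time.ctime())
    
    
    for i in range(6):
        t1 = threading.Thread(target=func)
        t1.start()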
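
To check that the limit really holds, one can count how many threads are inside the guarded section at once. This is a sketch with invented bookkeeping names (`state_lock`, `active`, `max_active`, `use_resource`), using the semaphore as a context manager:

```python
import threading
import time

semaphore = threading.Semaphore(3)
state_lock = threading.Lock()  # protects the two counters below
active = 0
max_active = 0


def use_resource():
    global active, max_active
    with semaphore:  # at most 3 threads get past this line at once
        with state_lock:
            active += 1
            max_active = max(max_active, active)
        time.sleep(0.1)
        with state_lock:
            active -= 1


def run(n=6):
    global active, max_active
    active = max_active = 0
    threads = [threading.Thread(target=use_resource) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return max_active


if __name__ == "__main__":
    print("peak concurrency:", run())
```

The peak never goes above the value passed to Semaphore, however many threads are started.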

     

10. threading.Timer:

  •            Timer uses a thread to run a function after a given delay
    import threading
    import time
    
    
    def func():
        print("I am running...")
        time.sleep(4)
        print("I am done...")
    
    
    if __name__ == '__main__':
        t = threading.Timer(2, func)
        t.start()
    
        # The main thread keeps printing while the timer waits and then runs func
        for i in range(5):
            print("{0}****".format(i))
            time.sleep(3)
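
A Timer can also be cancelled while it is still waiting. A small sketch, where the `calls` list and the `run` helper are made up for this example:

```python
import threading

calls = []


def func():
    calls.append("ran")


def run():
    calls.clear()
    fired = threading.Timer(0.1, func)
    cancelled = threading.Timer(0.1, func)
    fired.start()
    cancelled.start()
    cancelled.cancel()  # only has an effect while the timer is still waiting
    # Timer is a Thread subclass, so join() works on it
    fired.join()
    cancelled.join()
    return list(calls)


if __name__ == "__main__":
    print(run())
```

Only the timer that was left alone ends up calling the function.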

     

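11. Reentrant lock (RLock):

  •         A lock that the same thread can acquire more than once
  •         Mainly for recursive calls that need to acquire the lock again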
    import threading
    import time
    
    
    class MyThread(threading.Thread):
        def run(self):
            global num
            time.sleep(1)
    
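            # timeout must be passed by keyword; the first positional argument is `blocking`
            if mutex.acquire(timeout=3):
                num = num + 1
                msg = self.name + ' set num to ' + str(num)
                print(msg)
                # The same thread acquires the RLock a second time without blocking
                mutex.acquire()
                mutex.release()
                mutex.release()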
    
    
    num = 0
    mutex = threading.RLock()
    
    
    def func_1():
        for i in range(5):
            t = MyThread()
            t.start()
    
    
    if __name__ == '__main__':
        func_1()
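
The recursive case mentioned above can be sketched like this. The function names `countdown`, `run`, and `plain_lock_reentry` are hypothetical; the last one is there only to contrast an RLock with a plain Lock.

```python
import threading

rlock = threading.RLock()


def countdown(n, trail):
    # Every recursive call re-acquires the lock this thread already holds.
    with rlock:
        trail.append(n)
        if n > 0:
            countdown(n - 1, trail)
    return trail


def run():
    trail = []
    t = threading.Thread(target=countdown, args=(3, trail))
    t.start()
    t.join()
    return trail


def plain_lock_reentry():
    lock = threading.Lock()
    lock.acquire()
    second = lock.acquire(blocking=False)  # a plain Lock refuses re-entry
    lock.release()
    return second


if __name__ == "__main__":
    print("recursion under RLock:", run())
    print("plain Lock acquired twice:", plain_lock_reentry())
```

With a plain Lock and a blocking acquire, the same recursion would hang on the second level.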

     

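# Alternatives to threads

 

1. subprocess

    Skips threads entirely and uses processes

    The main replacement for the older ways of spawning processes

2. multiprocessing

    Spawns child processes through an interface modeled on threading

    Lets you spawn processes across multiple cores or CPUs; the interface is very similar to threading

3. concurrent.futures

    A newer module for asynchronous execution

    Works at the level of tasks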
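
To give a feel for two of these, here is a short sketch. It assumes Python 3.7 or later (for `capture_output`), and the helpers `square`, `pool_squares`, and `child_output` are names invented for this example. The first part hands tasks to a thread pool from concurrent.futures; the second runs a separate Python interpreter as a child process with subprocess.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def square(n):
    return n * n


def pool_squares(values):
    # The executor owns the threads; we only submit tasks and collect results.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(square, values))


def child_output():
    # Run another Python interpreter as a child process and capture its stdout.
    done = subprocess.run(
        [sys.executable, "-c", "print('hello from child')"],
        capture_output=True,
        text=True,
        check=True,
    )
    return done.stdout.strip()


if __name__ == "__main__":
    print(pool_squares(range(5)))
    print(child_output())
```

pool.map returns results in the order of the inputs, whichever task finishes first.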

# Multiprocessing

Inter-process communication (IPC)

Processes share no state with each other

Creating a process

Create a Process instance directly

import time
import multiprocessing


def clock(interval):
    while True:
        print("The time is %s" % time.ctime())
        time.sleep(interval)


if __name__ == '__main__':
    p = multiprocessing.Process(target=clock, args=(5, ))
    print("sleep...")
    p.start()

    Deriving a subclass

import multiprocessing
from time import sleep, ctime


class ClockProcess(multiprocessing.Process):
    '''
    Two methods to note:
    __init__
    run
    '''

    def __init__(self, interval):
        super().__init__()
        self.interval = interval

    def run(self):
        while True:
            print("The time is %s" % ctime())
            sleep(self.interval)


if __name__ == '__main__':
    p = ClockProcess(3)
    p.start()
    while True:
        print('sleep...')
        sleep(1)

Viewing the pid and ppid through os, and how they relate

    

import multiprocessing
import os


def info(title):
    print(title)
    print("module name:", __name__)
    print("parent process:", os.getppid())
    print("process id:", os.getpid())


def f(name):
    info('function f')
    print('hello', name)


if __name__ == '__main__':
    info('main line')
    p = multiprocessing.Process(target=f, args=('bob', ))
    p.start()
    p.join()

The producer-consumer model

    JoinableQueue

    

import multiprocessing
from time import ctime


def consumer(input_q):
    print("Into consumer:", ctime())
    while True:
        item = input_q.get()
        print("pull", item, "out of q")
        input_q.task_done()
    print("Out of consumer:", ctime())


def producer(sequence, output_q):
    print("Into producer:", ctime())
    for item in sequence:
        output_q.put(item)
        print("put", item, "into q")
    print("Out of producer:", ctime())


if __name__ == '__main__':
    q = multiprocessing.JoinableQueue()
    cons_p = multiprocessing.Process(target=consumer, args=(q, ))
    cons_p.daemon = True
    cons_p.start()

    sequence = [1, 2, 3, 4]
    producer(sequence, q)
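    # Wait until every item has been marked with task_done()
    q.join()
    # No cons_p.join() here: the daemon consumer loops forever and is stopped when the main process exits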

    Using a sentinel in the queue

import multiprocessing
from time import ctime


def consumer(input_q):
    print("Into consumer:", ctime())
    while True:
        item = input_q.get()
        if item is None:
            # Mark the sentinel as done as well, otherwise q.join() never returns
            input_q.task_done()
            break
        print("pull", item, "out of q")
        input_q.task_done()
    print("Out of consumer:", ctime())


def producer(sequence, output_q):
    print("Into producer:", ctime())
    for item in sequence:
        output_q.put(item)
        print("put", item, "into q")
    print("Out of producer:", ctime())


if __name__ == '__main__':
    q = multiprocessing.JoinableQueue()
    cons_p = multiprocessing.Process(target=consumer, args=(q, ))
    # cons_p.daemon = True
    cons_p.start()

    sequence = [1, 2, 3, 4]
    producer(sequence, q)
    q.put(None)
    q.join()
    cons_p.join()

    Improving the sentinel

import multiprocessing
from time import ctime


def consumer(input_q):
    print("Into consumer:", ctime())
    while True:
        item = input_q.get()
        if item is None:
            # Mark the sentinel as done as well, otherwise q.join() never returns
            input_q.task_done()
            break
        print("pull", item, "out of q")
        input_q.task_done()
    print("Out of consumer:", ctime())


def producer(sequence, output_q):
    print("Into producer:", ctime())
    for item in sequence:
        output_q.put(item)
        print("put", item, "into q")
    print("Out of producer:", ctime())


if __name__ == '__main__':
    q = multiprocessing.JoinableQueue()
    # Assumption: the two sentinels below are meant for two consumers, one each
    cons_p1 = multiprocessing.Process(target=consumer, args=(q, ))
    cons_p1.daemon = True
    cons_p1.start()
    cons_p2 = multiprocessing.Process(target=consumer, args=(q, ))
    cons_p2.daemon = True
    cons_p2.start()

    sequence = [1, 2, 3, 4]
    producer(sequence, q)
    q.put(None)
    q.put(None)
    q.join()
    cons_p1.join()
    cons_p2.join()

 
