Troubleshooting

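While consuming data from Kafka with a ThreadPoolExecutor (thread pool), a large data volume caused memory usage to grow without bound until the machine froze and went down.
The pseudocode was: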

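from concurrent.futures import ThreadPoolExecutor

def deal_func(msg):
    # Message-handling logic goes here
    pass

pool = ThreadPoolExecutor(10)
# Connect to Kafka (client setup omitted)
while True:
    msg = client.poll(0.1)
    # Hand the message to the pool; submit() returns immediately
    pool.submit(deal_func, msg)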

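After this program was started, memory usage climbed steadily. Some reading turned up the cause: when submit() is called in a loop, the calling thread keeps throwing tasks at the pool and never checks, or waits, for an idle worker thread.
Verification program: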

import time
from concurrent.futures import ThreadPoolExecutor


def func(num):
    print(f'the {num} run')
    time.sleep(2)
    return num * num


def main():
    pool = ThreadPoolExecutor(2)
    for i in range(100):
        pool.submit(func, i)
        # Number of tasks still waiting in the pool's internal queue
        print(pool._work_queue.qsize())
    # Wait for all queued tasks to finish
    pool.shutdown()


if __name__ == '__main__':
    main()

The output is as follows:

1
...
95
96
97
98
the 2 runthe 3 run
the 4 run
the 5 run
the 6 run
the 7 run

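As the output shows, the submitting thread does not check whether the pool has an idle worker thread: the queue size keeps growing while only two tasks run at a time.
So where do the tasks go when they are handed to the pool?
Looking at the source code, we find: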

class ThreadPoolExecutor(_base.Executor):

        # ... code omitted (inside __init__) ...
        self._max_workers = max_workers
        self._work_queue = queue.SimpleQueue()
        self._threads = set()
        self._broken = False
        self._shutdown = False

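The code creates a SimpleQueue as the queue through which tasks are passed to the pool's worker threads, but no size limit is set on it. As tasks keep being added, whenever the pool consumes them more slowly than they are produced, tasks pile up in the queue and the process's memory keeps growing.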
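To make the difference between the two queue types concrete, here is a minimal sketch using only the standard `queue` module. The variable names (`unbounded`, `bounded`, `backlog`, `rejected`) are made up for illustration and do not come from the ThreadPoolExecutor source.

```python
import queue

# SimpleQueue takes no size argument, so put() never blocks and never fails.
unbounded = queue.SimpleQueue()
for i in range(10_000):
    unbounded.put(i)
backlog = unbounded.qsize()

# Queue(maxsize=2) refuses a third item until a consumer frees a slot.
bounded = queue.Queue(maxsize=2)
bounded.put("a")
bounded.put("b")
try:
    bounded.put("c", block=False)  # non-blocking put on a full queue
    rejected = False
except queue.Full:
    rejected = True

print(backlog, rejected)
```

With the default `block=True`, the third `put()` would simply wait instead of raising, and that waiting is the back-pressure the thread pool lacks by default.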

Solution

Since the thread pool uses an unbounded queue, we can subclass it and use a bounded queue instead, for example:

import queue
from concurrent.futures import ThreadPoolExecutor


class BoundThreadPoolExecutor(ThreadPoolExecutor):

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Replace the unbounded SimpleQueue with a bounded queue.
        # Note: _work_queue is a private attribute of ThreadPoolExecutor.
        self._work_queue = queue.Queue(2)

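The __init__ method overrides the pool's queue object with Queue(2), which limits the size of the queue. During execution, if the queue is full, submit() blocks until a worker thread becomes free and takes a task off the queue.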
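A quick way to check this behavior is to rerun the earlier experiment against the bounded pool and record the largest queue size seen. The sketch below repeats the class definition so it runs on its own; `run_bounded`, `task_count`, and the shortened sleep are assumptions made for the demonstration, and it still relies on the private `_work_queue` attribute, which may change between Python versions.

```python
import queue
import time
from concurrent.futures import ThreadPoolExecutor


class BoundThreadPoolExecutor(ThreadPoolExecutor):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Private attribute: swap the unbounded queue for a bounded one.
        self._work_queue = queue.Queue(2)


def func(num):
    time.sleep(0.01)  # short sleep so the demo finishes quickly
    return num * num


def run_bounded(task_count=20):
    pool = BoundThreadPoolExecutor(2)
    max_seen = 0
    futures = []
    for i in range(task_count):
        # submit() now blocks whenever two tasks are already waiting
        futures.append(pool.submit(func, i))
        max_seen = max(max_seen, pool._work_queue.qsize())
    pool.shutdown()
    return max_seen, [f.result() for f in futures]


if __name__ == '__main__':
    max_seen, results = run_bounded()
    print(max_seen, results[:5])
```

The backlog can no longer grow past the queue bound, so the producer is throttled to the speed of the workers. The trade-off is that the submitting thread (the Kafka polling loop in the original scenario) stalls while the queue is full.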

That explains why the process ran out of memory. But how is ThreadPoolExecutor actually implemented? The next chapter analyzes the source code further.

Memory Overflow After Using ThreadPoolExecutor (Part 1)
