Multithreaded Thread Stack OOM - jstack

failed: BoundedExecutor is in a failed state

    public void execute(Runnable task)
    {
        checkState(!failed.get(), "BoundedExecutor is in a failed state");

        queue.add(task);

        int size = queueSize.incrementAndGet();
        if (size <= maxThreads) {
            // If able to grab a permit (aka size <= maxThreads), then we are short exactly one draining thread
            try {
                coreExecutor.execute(this::drainQueue);
            }
            catch (Throwable e) {
                failed.set(true);
                log.error("BoundedExecutor state corrupted due to underlying executor failure");
                throw e;
            }
        }
    }

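A thread OOM occurred. In the code above, once failed has been set to true the executor never recovers: nothing resets the flag, so every later call to execute() is rejected by checkState with "BoundedExecutor is in a failed state".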
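The latching behaviour can be reproduced in isolation. The following is a minimal sketch, not the airlift source: MiniBoundedExecutor is a hypothetical, simplified stand-in, and the flaky core executor that throws OutOfMemoryError on its first call is an assumption made purely for illustration.

```java
import java.util.Arrays;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified stand-in for io.airlift.concurrent.BoundedExecutor.
// Only the failure-latching logic from the excerpt above is reproduced.
class StickyFailureDemo {
    static class MiniBoundedExecutor implements Executor {
        private final Queue<Runnable> queue = new ConcurrentLinkedQueue<>();
        private final AtomicInteger queueSize = new AtomicInteger();
        private final AtomicBoolean failed = new AtomicBoolean();
        private final Executor coreExecutor;
        private final int maxThreads;

        MiniBoundedExecutor(Executor coreExecutor, int maxThreads) {
            this.coreExecutor = coreExecutor;
            this.maxThreads = maxThreads;
        }

        @Override
        public void execute(Runnable task) {
            if (failed.get()) {
                throw new IllegalStateException("BoundedExecutor is in a failed state");
            }
            queue.add(task);
            int size = queueSize.incrementAndGet();
            if (size <= maxThreads) {
                try {
                    coreExecutor.execute(this::drainQueue);
                }
                catch (Throwable e) {
                    failed.set(true); // the flag is never reset anywhere
                    throw e;
                }
            }
        }

        private void drainQueue() {
            do {
                Runnable task = queue.poll();
                if (task != null) {
                    task.run();
                }
            }
            while (queueSize.getAndDecrement() > maxThreads);
        }
    }

    // Submits two tasks and records what happened to each submission.
    static String[] run() {
        AtomicBoolean firstCall = new AtomicBoolean(true);
        // Assumed failure mode: the underlying executor fails once, then works again.
        Executor flaky = command -> {
            if (firstCall.getAndSet(false)) {
                throw new OutOfMemoryError("unable to create new native thread");
            }
            command.run();
        };
        MiniBoundedExecutor executor = new MiniBoundedExecutor(flaky, 2);
        String[] outcomes = new String[2];
        for (int i = 0; i < outcomes.length; i++) {
            try {
                executor.execute(() -> { });
                outcomes[i] = "ok";
            }
            catch (Throwable t) {
                outcomes[i] = t.getClass().getSimpleName();
            }
        }
        return outcomes;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));
    }
}
```

The second submission is rejected even though the underlying executor would accept work again, which matches the "failed state" message seen after the first OOM. Under this design, recovering presumably means recreating the executor or restarting the process.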
2017-01-10T12:00:19.869+0800 ERROR Query-20170110_040019_00058_5rdfv-5997 io.airlift.concurrent.BoundedExecutor BoundedExecutor state corrupted due to underlying executor failure
2017-01-10T12:00:19.870+0800 ERROR hive-hive-2874 io.airlift.concurrent.BoundedExecutor BoundedExecutor state corrupted due to underlying executor failure
2017-01-10T12:00:19.870+0800 ERROR Query-20170110_040019_00057_5rdfv-5996 io.airlift.concurrent.BoundedExecutor BoundedExecutor state corrupted due to underlying executor failure
2017-01-10T12:00:19.872+0800 ERROR hive-hive-2856 io.airlift.concurrent.BoundedExecutor BoundedExecutor state corrupted due to underlying executor failure
2017-01-10T12:00:19.872+0800 WARN hive-hive-2874 com.facebook.presto.hive.util.ResumableTasks ResumableTask completed exceptionally
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:714)
at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:950)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1368)
at io.airlift.concurrent.BoundedExecutor.execute(BoundedExecutor.java:62)
at com.facebook.presto.hive.HiveSplitManager$ErrorCodedExecutor.execute(HiveSplitManager.java:317)
at com.facebook.presto.hive.util.AsyncQueue.completeAsync(AsyncQueue.java:140)
at com.facebook.presto.hive.util.AsyncQueue.offer(AsyncQueue.java:94)
at com.facebook.presto.hive.HiveSplitSource.addToQueue(HiveSplitSource.java:67)
at com.facebook.presto.hive.HiveSplitSource.addToQueue(HiveSplitSource.java:59)
at com.facebook.presto.hive.BackgroundHiveSplitLoader.loadSplits(BackgroundHiveSplitLoader.java:247)
at com.facebook.presto.hive.BackgroundHiveSplitLoader.access$300(BackgroundHiveSplitLoader.java:78)
at com.facebook.presto.hive.BackgroundHiveSplitLoader$HiveSplitLoaderTask.process(BackgroundHiveSplitLoader.java:179)
at com.facebook.presto.hive.util.ResumableTasks.safeProcessTask(ResumableTasks.java:45)
at com.facebook.presto.hive.util.ResumableTasks.lambda$submit$1(ResumableTasks.java:33)
at io.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:77)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

cat /proc/15676/status # show the process status
Threads: 325
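For monitoring, the same Threads field can be read programmatically. This is a rough sketch that assumes a Linux procfs layout; the class and method names are made up for illustration, and the method returns -1 when the field or the file is missing.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Illustrative helper (hypothetical names): reads the "Threads:" field
// that /proc/<pid>/status exposes on Linux.
class ProcStatusThreads {
    // Returns the number after "Threads:", or -1 if no such line exists.
    static int parseThreads(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith("Threads:")) {
                return Integer.parseInt(line.substring("Threads:".length()).trim());
            }
        }
        return -1;
    }

    // pid may be a numeric PID or "self" for the current JVM.
    static int threadsOf(String pid) throws IOException {
        Path status = Paths.get("/proc", pid, "status");
        if (!Files.isReadable(status)) {
            return -1; // not Linux, or the process is gone
        }
        return parseThreads(Files.readAllLines(status));
    }

    public static void main(String[] args) throws IOException {
        String pid = args.length > 0 ? args[0] : "self";
        System.out.println("Threads: " + threadsOf(pid));
    }
}
```

Run it with a PID argument (for example 15676) to inspect another process, or with no argument to inspect the JVM itself.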

java -XX:+PrintFlagsFinal -version | grep ThreadStackSize
intx CompilerThreadStackSize = 0
intx ThreadStackSize = 1024
intx VMThreadStackSize = 1024

cat /proc/sys/kernel/threads-max # maximum number of threads allowed system-wide
1030511

cat /proc/sys/kernel/pid_max # upper bound on PID values; threads take IDs from the same space, so this also caps processes plus threads
32768

pstree -p | wc -l # current number of threads on the system (approximate, counted as output lines)
3283

pstree -p 15676 | wc -l # number of threads in this PID
326

cat /proc/sys/vm/max_map_count # maximum number of VMAs per process
65530

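This file limits the number of VMAs (virtual memory areas) that a single process can own. A VMA is a contiguous region of virtual address space. VMAs are created throughout the life of a process whenever the program maps a file into memory, attaches to a shared memory segment, or allocates heap space. Tuning this value limits how many VMAs a process can own. Setting the limit too low can break applications, because the system returns out-of-memory errors once a process reaches its VMA limit, even though the process can free only a little memory for other kernel uses. If your system is running low on memory in the NORMAL zone, lowering this value can help free memory for the kernel.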
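To judge how close a process is to this limit, one rough approach is to count the lines of /proc/<pid>/maps (one line per mapping) and compare that with max_map_count. The sketch below assumes Linux procfs and uses hypothetical class and method names. It is relevant to thread creation because each new thread stack needs at least one mapping of its own.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Illustrative sketch (hypothetical names): compares the number of mappings
// of a process with the vm.max_map_count limit on Linux.
class VmaUsage {
    // Each non-blank line of /proc/<pid>/maps describes one mapping.
    static long countMappings(List<String> mapsLines) {
        return mapsLines.stream().filter(line -> !line.trim().isEmpty()).count();
    }

    // Fraction of the limit that is already in use.
    static double usageRatio(long mappings, long maxMapCount) {
        return (double) mappings / maxMapCount;
    }

    public static void main(String[] args) throws IOException {
        String pid = args.length > 0 ? args[0] : "self";
        Path maps = Paths.get("/proc", pid, "maps");
        Path limit = Paths.get("/proc/sys/vm/max_map_count");
        if (!Files.isReadable(maps) || !Files.isReadable(limit)) {
            System.out.println("procfs not readable; this sketch is Linux-only");
            return;
        }
        long mappings = countMappings(Files.readAllLines(maps));
        long max = Long.parseLong(Files.readAllLines(limit).get(0).trim());
        System.out.printf("%d of %d mappings in use (%.1f%%)%n",
                mappings, max, 100 * usageRatio(mappings, max));
    }
}
```

Reading the maps file of another user's process usually requires matching permissions, so in practice this would be run as the same user as the process or as root.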

pmap -x 15676 # show the virtual memory mappings of a single process

sar -q # show the thread count (plist-sz)

http://blog.csdn.net/odailidong/article/details/50561257

vi /etc/security/limits.d/90-nproc.conf

Its contents are as follows (note that nproc counts threads as well as processes, per user):
* soft nproc 1024
root soft nproc unlimited

[hdfs@xxxx]$ ps h -Led -o user | sort | uniq -c | sort -n
806 hdfs
2786 root

ps -o nlwp,pid,lwp,args -u hdfs | sort -n
Here the output is filtered by user hdfs; the first column (nlwp) is the thread count of each process.
