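Problem scenario: a new HTTP endpoint was added on the development server. After the test team had called this endpoint a certain number of times, the same failure appeared consistently: the HTTP service process was still alive, but no HTTP endpoint responded any more.
The HttpServer code is as follows: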
import java.net.InetSocketAddress;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class HttpServer extends Server {
    public HttpServer(int port) {
        super(port);
    }

    protected com.sun.net.httpserver.HttpServer server;
    // fixed-size thread pool
    protected ExecutorService executor = Executors.newFixedThreadPool(2*Runtime.getRuntime().availableProcessors()+1);

    public void bootstrap() {
        try {
            server = com.sun.net.httpserver.HttpServer.create(new InetSocketAddress(port), 0);
            server.setExecutor(executor);
            server.createContext("/", new HttpHandler());
            server.start();
            System.out.println("HttpServer bind: " + port);
        } catch (Throwable e) {
            e.printStackTrace();
        }
    }

    public void shutdown() {
        executor.shutdown();
        server.stop(0);
        System.out.println("HttpServer unbind: " + port);
    }
}
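Note: the pool size passed to Executors.newFixedThreadPool(2*Runtime.getRuntime().availableProcessors()+1) follows the article http://ifeve.com/how-to-calculate-threadpool-size/ . On this server the pool size works out to 2x4+1 = 9 threads.
Before analyzing the cause of the problem above, let's first look at the fixed thread pool, newFixedThreadPool, and the cached thread pool, newCachedThreadPool.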
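As a rough illustration of the difference between the two pools, the sketch below reads the maximum thread count each factory method configures. The class name PoolKinds and its helper methods are made up for this example, and the cast to ThreadPoolExecutor relies on how current JDKs implement these two factory methods rather than on anything the Executors API promises.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;

// Hypothetical helper class, not part of the project described in this post.
final class PoolKinds {

    /** Upper bound on worker threads for a fixed pool created with size n. */
    static int fixedMax(int n) {
        ExecutorService pool = Executors.newFixedThreadPool(n);
        try {
            // Assumption: the returned object is a plain ThreadPoolExecutor.
            return ((ThreadPoolExecutor) pool).getMaximumPoolSize();
        } finally {
            pool.shutdown();
        }
    }

    /** Upper bound on worker threads for a cached pool. */
    static int cachedMax() {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            return ((ThreadPoolExecutor) pool).getMaximumPoolSize();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("fixed(9) max threads = " + fixedMax(9));
        System.out.println("cached   max threads = " + cachedMax());
    }
}
```

A fixed pool never grows past the size it was created with and queues extra tasks, while a cached pool has no practical upper bound and starts a new thread whenever no idle one is available.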
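Now let's look at what caused the problem.
As described in the scenario, after testers had called the new endpoint a certain number of times, every HTTP request stopped getting a response.
When I heard about the problem, my first thought was that the process had crashed, so I logged in to the cloud server to check. The HTTP service process was running normally and had not exited abnormally. Next I opened the Alibaba Cloud console (the server is an Alibaba Cloud instance) and looked at the system and process monitoring: CPU, memory, and network all looked normal. The HTTP service log showed requests arriving but no responses going out, and this was true for every HTTP request. Since each HTTP request is handled by a thread taken from the thread pool, I suspected the pool and immediately ran jstack against the HTTP service process. The dump showed 9 blocked threads, all of them handling the newly added endpoint. Following the stack traces to the corresponding line of code revealed the problem: the endpoint performs an operation that can overrun, has no timeout handling to end it, and therefore blocks for a long time.
Every call to the faulty endpoint blocks one thread taken from the pool and never releases it. The jstack output showed 9 blocked threads handling that endpoint; after calling the endpoint again and taking another dump, there were still 9. So the HTTP service must be using a fixed-size thread pool.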
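The behaviour described above can be reproduced in isolation. The following is a minimal sketch, not the project's code: StarvationDemo and quickTaskCompletes are hypothetical names, and a CountDownLatch stands in for the endpoint that never returns. It fills a fixed pool with blocking tasks and then checks whether one more quick task gets to run.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class; the names are illustrative only.
final class StarvationDemo {

    /**
     * Starts `blockers` tasks that block until released, then submits one quick
     * task and reports whether it finished within waitMillis.
     */
    static boolean quickTaskCompletes(int poolSize, int blockers, long waitMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < blockers; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // stands in for the handler that never returns
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            Future<String> quick = pool.submit(() -> "ok");
            quick.get(waitMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false; // the quick task is still sitting in the pool's queue
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            return false;
        } finally {
            release.countDown();
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("9 threads, 9 blocked: " + quickTaskCompletes(9, 9, 500));
        System.out.println("9 threads, 8 blocked: " + quickTaskCompletes(9, 8, 500));
    }
}
```

With every thread occupied, the extra task waits in the pool's queue indefinitely, which matches the symptom of requests being logged but never answered while the thread count stays at 9.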
Checking the HttpServer source (posted above) confirms that it uses a fixed-size thread pool:
// fixed-size thread pool
protected ExecutorService executor = Executors.newFixedThreadPool(2*Runtime.getRuntime().availableProcessors()+1);
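The handler of the faulty endpoint, reproduced here with a deliberate infinite loop:
@Override
public void execute() { // handler logic for the /wait/test endpoint
    try {
        while (true) {
            Thread.sleep(2000); // this is line 14 of HttpCmdWaitTest.java; the thread dump below points here
        }
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    //
}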
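After requesting this endpoint more than 9 times, print the thread information with jstack pid. (To find the pid of the Java process, use jps -m -l or ps -aux | grep java. If the JDK development and debugging tools are not installed and jps and jstack are unavailable, see my other post on how to install them: https://blog.csdn.net/a704397849/article/details/88541377 )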
Full thread dump OpenJDK 64-Bit Server VM (25.201-b09 mixed mode):
"Attach Listener" #23 daemon prio=9 os_prio=0 tid=0x00007fd1ec003800 nid=0x517f waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"pool-2-thread-9" #22 prio=5 os_prio=0 tid=0x00007fd1e80c5800 nid=0x4fa7 waiting on condition [0x00007fd1f10a0000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at service.test.HttpCmdWaitTest.execute(HttpCmdWaitTest.java:14)
at frame.Command.start(Command.java:29)
at frame.http.HttpCmd.start(HttpCmd.java:26)
at frame.http.HttpHandler.handle(HttpHandler.java:34)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82)
at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:675)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:647)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"pool-2-thread-8" #21 prio=5 os_prio=0 tid=0x00007fd1e80c4000 nid=0x4f97 waiting on condition [0x00007fd1f10e1000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at service.test.HttpCmdWaitTest.execute(HttpCmdWaitTest.java:14)
at frame.Command.start(Command.java:29)
at frame.http.HttpCmd.start(HttpCmd.java:26)
at frame.http.HttpHandler.handle(HttpHandler.java:34)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82)
at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:675)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:647)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"pool-2-thread-7" #20 prio=5 os_prio=0 tid=0x00007fd1e8080800 nid=0x4f94 waiting on condition [0x00007fd1f1122000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at service.test.HttpCmdWaitTest.execute(HttpCmdWaitTest.java:14)
at frame.Command.start(Command.java:29)
at frame.http.HttpCmd.start(HttpCmd.java:26)
at frame.http.HttpHandler.handle(HttpHandler.java:34)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82)
at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:675)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:647)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"pool-2-thread-6" #19 prio=5 os_prio=0 tid=0x00007fd1e80c2000 nid=0x4f91 waiting on condition [0x00007fd1f1163000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at service.test.HttpCmdWaitTest.execute(HttpCmdWaitTest.java:14)
at frame.Command.start(Command.java:29)
at frame.http.HttpCmd.start(HttpCmd.java:26)
at frame.http.HttpHandler.handle(HttpHandler.java:34)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82)
at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:675)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:647)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"pool-2-thread-5" #18 prio=5 os_prio=0 tid=0x00007fd1e80c0000 nid=0x4f8e waiting on condition [0x00007fd1f11a4000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at service.test.HttpCmdWaitTest.execute(HttpCmdWaitTest.java:14)
at frame.Command.start(Command.java:29)
at frame.http.HttpCmd.start(HttpCmd.java:26)
at frame.http.HttpHandler.handle(HttpHandler.java:34)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82)
at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:675)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:647)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"pool-2-thread-4" #17 prio=5 os_prio=0 tid=0x00007fd1e80be000 nid=0x4f8d waiting on condition [0x00007fd1f11e5000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at service.test.HttpCmdWaitTest.execute(HttpCmdWaitTest.java:14)
at frame.Command.start(Command.java:29)
at frame.http.HttpCmd.start(HttpCmd.java:26)
at frame.http.HttpHandler.handle(HttpHandler.java:34)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82)
at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:675)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:647)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"pool-2-thread-3" #16 prio=5 os_prio=0 tid=0x00007fd1e80bc000 nid=0x4f8c waiting on condition [0x00007fd1f1226000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at service.test.HttpCmdWaitTest.execute(HttpCmdWaitTest.java:14)
at frame.Command.start(Command.java:29)
at frame.http.HttpCmd.start(HttpCmd.java:26)
at frame.http.HttpHandler.handle(HttpHandler.java:34)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82)
at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:675)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:647)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"pool-2-thread-2" #15 prio=5 os_prio=0 tid=0x00007fd1e80f9800 nid=0x4f86 waiting on condition [0x00007fd1f1267000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at service.test.HttpCmdWaitTest.execute(HttpCmdWaitTest.java:14)
at frame.Command.start(Command.java:29)
at frame.http.HttpCmd.start(HttpCmd.java:26)
at frame.http.HttpHandler.handle(HttpHandler.java:34)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82)
at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:675)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:647)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"pool-2-thread-1" #14 prio=5 os_prio=0 tid=0x00007fd1e802e800 nid=0x4f85 waiting on condition [0x00007fd1f12a8000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at service.test.HttpCmdWaitTest.execute(HttpCmdWaitTest.java:14)
at frame.Command.start(Command.java:29)
at frame.http.HttpCmd.start(HttpCmd.java:26)
at frame.http.HttpHandler.handle(HttpHandler.java:34)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82)
at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:675)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:647)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"DestroyJavaVM" #13 prio=5 os_prio=0 tid=0x00007fd21404b800 nid=0x4f5b waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Thread-3" #11 prio=5 os_prio=0 tid=0x00007fd2141d2000 nid=0x4f65 runnable [0x00007fd204078000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
- locked <0x00000000f8191ae0> (a sun.nio.ch.Util$3)
- locked <0x00000000f8191a58> (a java.util.Collections$UnmodifiableSet)
- locked <0x00000000f8191380> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
at sun.net.httpserver.ServerImpl$Dispatcher.run(ServerImpl.java:352)
at java.lang.Thread.run(Thread.java:748)
"server-timer" #10 daemon prio=5 os_prio=0 tid=0x00007fd2141d0000 nid=0x4f64 in Object.wait() [0x00007fd2040b9000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000f81a2ff8> (a java.util.TaskQueue)
at java.util.TimerThread.mainLoop(Timer.java:552)
- locked <0x00000000f81a2ff8> (a java.util.TaskQueue)
at java.util.TimerThread.run(Timer.java:505)
"Service Thread" #7 daemon prio=9 os_prio=0 tid=0x00007fd21413e800 nid=0x4f62 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007fd21413b800 nid=0x4f61 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007fd21412d000 nid=0x4f60 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007fd21412a800 nid=0x4f5f runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007fd214101000 nid=0x4f5e in Object.wait() [0x00007fd21806d000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000f8008ed0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
- locked <0x00000000f8008ed0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:216)
"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007fd2140fc800 nid=0x4f5d in Object.wait() [0x00007fd2180ae000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000f8006bf8> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
- locked <0x00000000f8006bf8> (a java.lang.ref.Reference$Lock)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
"VM Thread" os_prio=0 tid=0x00007fd2140f2800 nid=0x4f5c runnable
"VM Periodic Task Thread" os_prio=0 tid=0x00007fd214141000 nid=0x4f63 waiting on condition
JNI global references: 11
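In the thread dump above, at service.test.HttpCmdWaitTest.execute(HttpCmdWaitTest.java:14) appears 9 times, which confirms that the faulty endpoint is blocking 9 threads.
What remains is to work out what is wrong with the endpoint itself. In this example, of course, the cause is the infinite loop I wrote on purpose.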
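One way to keep a polling loop like this from holding a pool thread forever is to give it a deadline. The sketch below is only an assumption about how such a fix could look: DeadlineWait, waitUntil, and the condition being polled are hypothetical and do not come from the original project. pollMillis is assumed to be non-negative.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper; shows a polling loop that gives up after a timeout.
final class DeadlineWait {

    /**
     * Polls `condition` every pollMillis until it becomes true or until
     * timeoutMillis has elapsed. Returns true only if the condition was met.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            while (!condition.getAsBoolean()) {
                long remainingMillis = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remainingMillis <= 0) {
                    return false; // timed out: let the pool thread go back to work
                }
                Thread.sleep(Math.min(pollMillis, remainingMillis));
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    public static void main(String[] args) {
        boolean met = waitUntil(() -> false, 500, 100);
        System.out.println(met ? "condition met" : "gave up after timeout");
    }
}
```

When waitUntil returns false, the handler can send an error response and return, so the thread goes back to the pool instead of staying blocked.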
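Note: this is only a brief introduction to printing thread information with jstack. The jstack command has many more features that are not covered here. JVM tuning and troubleshooting, and tools with convenient charts for viewing and monitoring JVM processes such as jvisualvm and JConsole, will get their own posts when I have time.
The sections above showed how a service backed by a fixed-size thread pool created with newFixedThreadPool can stop responding to requests.
Someone may then point out that a fixed-size pool has a limited number of threads, and that real projects may have endpoints that really are slow. Several slow requests can easily occupy every thread in the pool, so later requests queue up and may time out without a response.
Can we switch from the fixed thread pool to a cached thread pool, which has no such size limit?
It is not that simple. Switching to a cached thread pool leads straight into another pitfall: the pool keeps creating threads until the system can no longer allocate the resources for a new one, and the process then fails with an out-of-memory error (typically java.lang.OutOfMemoryError: unable to create new native thread).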
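Between these two extremes sits a third configuration, shown here only as a sketch and not as something the original project uses: a ThreadPoolExecutor with a fixed number of threads, a bounded queue, and a rejection policy, so that overload becomes a visible error instead of a silent hang or unbounded thread growth. BoundedPool and its methods are hypothetical names.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical example class; sizes are illustrative, not recommendations.
final class BoundedPool {

    /** Fixed number of threads, bounded queue, reject when both are full. */
    static ThreadPoolExecutor create(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy());
    }

    /** Submits `tasks` blocking tasks and counts how many the pool rejects. */
    static int countRejected(int threads, int queueCapacity, int tasks) {
        ThreadPoolExecutor pool = create(threads, queueCapacity);
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < tasks; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await(); // simulate a handler that is stuck
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++; // a server could answer with an error status here
                }
            }
            return rejected;
        } finally {
            release.countDown();
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected = " + countRejected(9, 20, 40));
    }
}
```

Rejected requests still fail, but they fail fast and leave a trace in the logs, which makes a saturated pool much easier to notice than requests that wait in a queue forever.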
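The best fix is to optimize the slow endpoints so that they take less time. If that is not possible, a message queue is another option: put the slow operations on a queue and take them off one by one for execution. The exact approach depends on the requirements.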
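The queue idea can be sketched inside a single process with a BlockingQueue and one worker thread. This is an illustration under assumptions, not a replacement for a real message queue: SlowJobQueue, its methods, and the stop marker are hypothetical names introduced for the example.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-process job queue: handlers enqueue slow work and return at once.
final class SlowJobQueue {
    private static final Runnable STOP = () -> { }; // marker that ends the worker
    private final BlockingQueue<Runnable> jobs;
    private final Thread worker;

    private SlowJobQueue(int capacity) {
        jobs = new LinkedBlockingQueue<>(capacity);
        worker = new Thread(this::drain, "slow-job-worker");
    }

    static SlowJobQueue start(int capacity) {
        SlowJobQueue queue = new SlowJobQueue(capacity);
        queue.worker.start();
        return queue;
    }

    /** Returns false immediately when the queue is full instead of blocking. */
    boolean offer(Runnable job) {
        return jobs.offer(job);
    }

    /** Runs queued jobs one at a time until the stop marker is taken. */
    private void drain() {
        try {
            for (Runnable job = jobs.take(); job != STOP; job = jobs.take()) {
                try {
                    job.run();
                } catch (RuntimeException e) {
                    e.printStackTrace(); // one failing job must not kill the worker
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Lets already queued jobs finish, then stops the worker. */
    void close() {
        try {
            jobs.put(STOP);
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Small self-check: enqueue n counting jobs and report how many ran. */
    static int runAll(int n) {
        SlowJobQueue queue = start(n + 1);
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < n; i++) {
            queue.offer(done::incrementAndGet);
        }
        queue.close();
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println("jobs executed: " + runAll(5));
    }
}
```

A handler that calls offer returns right away, so the HTTP pool threads stay free; the caller then needs some other way to learn the result, for example a status endpoint, which depends on the requirements.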
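Personally, I think a fixed-size thread pool is the better choice here, and many of the projects I have seen that serve requests use one as well. This is not an absolute rule; it depends on what the actual project needs. Only when you really understand how fixed-size and cached thread pools behave, and are clear about your own project's requirements, will you know which one to choose.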