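Application runtime environment
Operating system: AIX 5.3
JRE: J2RE 5.0 IBM J9 2.3 AIX ppc-32 build j9vmap3223-20070426
Application server: WebLogic 9.2
JRE options: -Xms1500m -Xmx2000m
JRE garbage collection policy: optthruput
The IBM JRE does not need a PermSize setting.
After WebLogic has been running for a while, the garbage collection logs show that heap usage after a collection generally settles at around 200 MB. Even so, WebLogic hits an OutOfMemoryError roughly every 30 days. The error message is as follows: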
<Nov 1, 2010 10:22:00 AM GMT+08:00> <Error> <Server> <AppSev> <AppServer> <DynamicListenThread[Default]> <<WLS Kernel>> <> <> <1288578120680> <BEA-002608> <The ListenThread because of an error: java.lang.OutOfMemoryError: ZIP004:OutOfMemoryError, MEM_ERROR in inflateInit2
java.lang.OutOfMemoryError: ZIP004:OutOfMemoryError, MEM_ERROR in inflateInit2
at java.util.zip.Inflater.init(Native Method)
at java.util.zip.Inflater.<init>(Inflater.java:105)
at java.util.zip.ZipFile.getInflater(ZipFile.java:416)
at java.util.zip.ZipFile.getInputStream(ZipFile.java:359)
at java.util.zip.ZipFile.getInputStream(ZipFile.java:324)
at java.util.jar.JarFile.getInputStream(JarFile.java:417)
at sun.net.www.protocol.jar.JarURLConnection.getInputStream(JarURLConnection.java:165)
at java.net.URL.openStream(URL.java:1041)
at java.lang.ClassLoader.getResourceAsStream(ClassLoader.java:460)
at java.util.ResourceBundle$1.run(ResourceBundle.java:1101)
at java.security.AccessController.doPrivileged(AccessController.java:193)
at java.util.ResourceBundle.loadBundle(ResourceBundle.java:1097)
at java.util.ResourceBundle.findBundle(ResourceBundle.java:942)
at java.util.ResourceBundle.getBundleImpl(ResourceBundle.java:760)
at java.util.ResourceBundle.getBundle(ResourceBundle.java:716)
at weblogic.i18ntools.L10nLookup.getLocalizerBundle(L10nLookup.java:349)
at weblogic.i18ntools.L10nLookup.getLocalizer(L10nLookup.java:328)
at weblogic.logging.MessageLogger.log(MessageLogger.java:77)
at weblogic.server.ServerLogger.logChannelFailed(ServerLogger.java:388)
at weblogic.server.channels.DynamicListenThread$SocketAccepter.onAcceptException(DynamicListenThread.java:567)
at weblogic.server.channels.DynamicListenThread$SocketAccepter.accept(DynamicListenThread.java:523)
at weblogic.server.channels.DynamicListenThread$SocketAccepter.access$200(DynamicListenThread.java:418)
at weblogic.server.channels.DynamicListenThread.run(DynamicListenThread.java:164)
at java.lang.Thread.run(Thread.java:801)
>
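The error above was raised while decompressing data (in java.util.zip.Inflater). A similar error raised while compressing data (in java.util.zip.Deflater) looks like this: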
<Nov 1, 2010 10:22:07 AM GMT+08:00> <Error> <HTTP> <AppSev> <AppServer> <[ACTIVE] ExecuteThread: '335' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1288578127963> <BEA-101017> <[weblogic.servlet.internal.WebAppServletContext@4eac4eac - appName: '_appsdir_AppService_war', name: 'AppService.war', context-path: '/AppService'] Root cause of ServletException.
java.lang.OutOfMemoryError: ZIP002:OutOfMemoryError, MEM_ERROR in deflate_init2
at java.util.zip.Deflater.init(Native Method)
at java.util.zip.Deflater.<init>(Deflater.java:148)
at java.util.zip.Deflater.<init>(Deflater.java:165)
at java.util.zip.DeflaterOutputStream.<init>(DeflaterOutputStream.java:102)
at com.app.util.ZipUtil.zip(ZipUtil.java:34)
at com.app.service.chgpwd.action.ChgPWAction.chgPW(ChgPWAction.java:103)
at sun.reflect.GeneratedMethodAccessor1739.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:615)
at com.app.base.BaseAction.dispatchMethod(BaseAction.java:227)
at com.app.base.BaseAction.execute(BaseAction.java:156)
at org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java:421)
at org.apache.struts.action.RequestProcessor.process(RequestProcessor.java:226)
at org.apache.struts.action.ActionServlet.process(ActionServlet.java:1164)
at org.apache.struts.action.ActionServlet.doPost(ActionServlet.java:415)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:763)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:856)
at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:225)
at weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:127)
at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:283)
at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:175)
at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3214)
at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
at weblogic.security.service.SecurityManager.runAs(SecurityManager.java:121)
at weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:1983)
at weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:1890)
at weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1344)
at weblogic.work.ExecuteThread.execute(ExecuteThread.java:209)
at weblogic.work.ExecuteThread.run(ExecuteThread.java:181)
>
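The classes under the "com.app" package in the trace above belong to the application itself, which does use compression and decompression. The errors do point to an OOM raised during compression or decompression, yet the WebLogic log shows that memory usage had not reached the configured maximum. A log excerpt: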
<Nov 1, 2010 10:14:34 AM GMT+08:00> <Info> <Health> <AppSev> <AppServer> <weblogic.GCMonitor> <<anonymous>> <> <> <1288577674879> <BEA-310002> <55% of the total memory in the server is free>
<Nov 1, 2010 10:15:34 AM GMT+08:00> <Info> <Health> <AppSev> <AppServer> <weblogic.GCMonitor> <<anonymous>> <> <> <1288577734889> <BEA-310002> <23% of the total memory in the server is free>
<Nov 1, 2010 10:16:34 AM GMT+08:00> <Info> <Health> <AppSev> <AppServer> <weblogic.GCMonitor> <<anonymous>> <> <> <1288577794895> <BEA-310002> <67% of the total memory in the server is free>
<Nov 1, 2010 10:17:34 AM GMT+08:00> <Info> <Health> <AppSev> <AppServer> <weblogic.GCMonitor> <<anonymous>> <> <> <1288577854898> <BEA-310002> <34% of the total memory in the server is free>
<Nov 1, 2010 10:18:34 AM GMT+08:00> <Info> <Health> <AppSev> <AppServer> <weblogic.GCMonitor> <<anonymous>> <> <> <1288577914899> <BEA-310002> <3% of the total memory in the server is free>
<Nov 1, 2010 10:20:45 AM GMT+08:00> <Info> <Health> <AppSev> <AppServer> <weblogic.GCMonitor> <<anonymous>> <> <> <1288578045695> <BEA-310002> <53% of the total memory in the server is free>
Frustrating. So what is going on?
IBM support has a technote that looks relevant: http://www-01.ibm.com/support/docview.wss?uid=swg21227106
It reads as follows:
Improper closing of deflater object results in native memory exhaustion
Cause
Not explicitly closing and ending a deflater object (such as, DeflaterOutputStream, GZIPOutputStream or ZipOutputStream) causes a native memory leak. This native leak can result in OutOfMemoryError's in the WebSphere Application Server logs as well as JVM hangs and crashes. The following error in the WebSphere Application Server SystemOut.log or SystemErr.log files is a potential indicator of a native issue caused by this type of deflater leak:
Resolving the problem
To avoid the leak, the application programmer should insure that all deflater objects are explicitly closed and ended. The following is a code for properly ending a deflater:
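The technote's own code sample is not reproduced in the excerpt above. As a rough illustration of the pattern it describes, and not IBM's original code, here is a minimal sketch. The class name DeflaterSketch and its zip/unzip helpers are hypothetical and are not the application's com.app.util.ZipUtil. The point is that when you pass your own Deflater to a DeflaterOutputStream, closing the stream does not release the Deflater's native memory, so end() has to be called explicitly, ideally in a finally block. The same goes for an Inflater.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.Inflater;

// Hypothetical helper, for illustration only.
final class DeflaterSketch {

    // Compresses input and always releases the Deflater's native memory.
    static byte[] zip(byte[] input) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION);
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            DeflaterOutputStream out = new DeflaterOutputStream(buffer, deflater);
            try {
                out.write(input);
            } finally {
                // Finishes the compressed stream; it does not end() a
                // Deflater that the caller supplied.
                out.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            // Frees the native zlib state even if writing failed.
            deflater.end();
        }
        return buffer.toByteArray();
    }

    // Decompresses data produced by zip() and always ends the Inflater.
    static byte[] unzip(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                buffer.write(chunk, 0, n);
                if (n == 0 && !inflater.finished()) {
                    // No progress: the input is truncated or needs a dictionary.
                    throw new IllegalArgumentException("incomplete compressed data");
                }
            }
            return buffer.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt compressed data", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] data = "hello hello hello hello".getBytes();
        byte[] packed = zip(data);
        System.out.println(packed.length + " compressed bytes, "
                + unzip(packed).length + " bytes after round trip");
    }
}
```

If the stream is created with the single-argument constructor, new DeflaterOutputStream(out), it owns its Deflater and close() ends it, so in that case a reliable close() in a finally block is what matters.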
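All of the relevant streams in the application were already being closed. While analyzing the logs, I noticed that WebLogic itself also calls the compression and decompression libraries, and the first error shown above was in fact raised inside WebLogic's own code. So where else could the problem be? The search continued.
After optimizing our own application several times and reading a fair amount of material on WebLogic crashes, I began to suspect a WebLogic bug, and eventually narrowed it down to bug 8173442. A good blog post on it turns up in a search: http://www.hashei.me/tag/cr370915. The bug exists in WebLogic 8 through 10, although the offending code differs slightly between versions.
On the last day of 2010, in a maintenance window approved by the customer, we applied the patch to WebLogic. Since then, apart from planned shutdowns for hardware or software maintenance and upgrades, WebLogic has not crashed, and its longest uninterrupted run has been 3 months. For now I consider the problem solved.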