Plug memory leaks in enterprise Java applications

The Java automatic garbage collection process typically operates as a low-priority thread that constantly searches memory for unreachable objects, that is, objects not referenced by any object reachable from a live thread. Different JVMs use different algorithms to determine how to collect garbage most efficiently.

In the JVM, memory is allocated in two regions:

  • Stack: Where local variables (declared in methods and constructors) are allocated. Local variables are allocated when a method is invoked and de-allocated when the method is exited.
  • Heap: Where all objects created with the new keyword are allocated.

Since local variables are few in number and hold only primitive types and references, the stack usually does not overflow, except in cases of unusually deep or infinite recursion. The JVM throws a java.lang.OutOfMemoryError if it cannot get more heap memory to allocate new Java objects, which happens when the heap is full of live objects and cannot expand further.

Causes of memory leaks in Java
The four typical causes of memory leaks in a Java program are:

  1. Unknown or unwanted object references: These objects are no longer needed, but the garbage collector cannot reclaim their memory because another object still refers to them.
  2. Long-living (static) objects: These objects stay in memory for the application's full lifetime. Objects tagged to the session may also have the same lifetime as the session, which is created per user and remains until the user logs out of the application.
  3. Failure to clean up or free native system resources: Native system resources are resources allocated by a function external to Java, typically native code written in C or C++. Java Native Interface (JNI) APIs are used to embed native libraries/code into Java code.
  4. Bugs in the JDK or third-party libraries: Bugs in various versions of the JDK or in the Abstract Window Toolkit and Swing packages can cause memory leaks.

Detection of memory leaks
Some approaches to detecting memory leaks follow in the list below:

  1. Use the operating system process monitor, which tells how much memory a process is using. On Microsoft Windows, the task manager can show memory usage when a Java program is running. This mechanism gives a coarse-grained idea of a process's total memory utilization.
  2. Use the totalMemory() and freeMemory() methods of the java.lang.Runtime class, which show how much total heap memory is being controlled by the JVM, along with how much is not in use at a particular time. This mechanism can provide the heap's memory utilization. However, it does not show which objects occupy the heap.
  3. Use memory-profiling tools, e.g., JProbe Memory Debugger. These tools can provide a runtime picture of the heap and its utilization. However, they can be used only during development, not deployment, as they slow application performance considerably.
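The second approach can be sketched in a few lines. This is a minimal illustration, not code from the sample application: the class name HeapUsage and the allocation sizes are assumptions chosen for the example, and the figures it prints will vary from run to run because the garbage collector may run at any time.

```java
final class HeapUsage {
    /** Heap bytes in use right now: total heap held by the JVM minus its unused part. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedBytes();

        // Allocate roughly 16 MB and keep it reachable so it cannot be collected yet.
        byte[][] blocks = new byte[256][];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new byte[64 * 1024];
        }

        long after = usedBytes();
        System.out.printf("used before: %d KB, after: %d KB (%d blocks held)%n",
                before / 1024, after / 1024, blocks.length);
    }
}
```

As the list notes, these numbers describe the heap as a whole; they say nothing about which objects account for the growth.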

Causes of memory leaks in enterprise Java applications
In the subsequent sections, I analyze some causes of memory leaks in enterprise Java applications using a sample application and a memory profiling tool. I also suggest strategies for detecting and plugging such leaks in your own projects.

ResultSet and Statement objects
The Statement and ResultSet interfaces are used with Java Database Connectivity (JDBC) APIs. Statement/PreparedStatement objects are used for executing a SQL statement; ResultSet objects are used for storing SQL queries' results.

A Java Enterprise Edition (Java EE) application usually connects to the database by either making a direct connection to the database using JDBC thin drivers provided by the database vendor or creating a pool of database connections within the Java EE container using the JDBC drivers. If the application directly connects to the database, then on calling the close() method on the connection object, the database connection closes and the associated Statement and ResultSet objects close and are garbage collected. If a connection pool is used, a request to the database is made using one of the existing connections in the pool. In this case, on calling close() on the connection object, the database connection returns to the pool. So merely closing the connection does not automatically close the ResultSet and Statement objects. As a result, ResultSet and Statement will not become eligible for garbage collection, as they continue to remain tagged with the database connection in the connection pool.

To investigate the memory leak caused by not closing Statement and ResultSet objects while using a connection pool, I used a sample Java EE application that queries a database table and displays the results in a JSP (JavaServer Pages) page. It also allows you to save records to the database table. The application is deployed in iPlanet App Server 6.0. I used JProbe to analyze the memory utilization by the application. The sample application uses a database table with the following structure:

ID   NUMBER
NAME   VARCHAR2(300)
STREET   VARCHAR(500)
CITY   VARCHAR(500)
STATE   VARCHAR(200)
CREATEDON   DATE
VERSIONNO   NUMBER
DELETESTATUS   NUMBER
UPDATEDBY   VARCHAR(20)
UPDATEDON   DATE
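For illustration, a query against a table like this one, with its JDBC resources released after use, might look like the sketch below. It is not the sample application's code: the class name AddressQuery and the table name ADDRESS_BOOK are assumptions made for the example (the article does not name the table), and the pool is assumed to be exposed as a javax.sql.DataSource. The try-with-resources statement (Java 7 and later) is the compact equivalent of closing each resource in a finally block.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import javax.sql.DataSource;

final class AddressQuery {
    // ADDRESS_BOOK is a hypothetical table name; NAME is a column from the structure above.
    private static final String SQL = "SELECT NAME FROM ADDRESS_BOOK";

    /** Reads all names, closing the ResultSet and Statement even if an exception occurs. */
    static List<String> findNames(DataSource pool) throws SQLException {
        List<String> names = new ArrayList<>();
        try (Connection conn = pool.getConnection();
             PreparedStatement stmt = conn.prepareStatement(SQL);
             ResultSet rs = stmt.executeQuery()) {
            while (rs.next()) {
                names.add(rs.getString("NAME"));
            }
        } // Resources are closed in reverse order: rs, stmt, then conn (returned to the pool).
        return names;
    }
}
```

Because the ResultSet and the Statement are declared as resources, they are closed explicitly rather than left attached to the pooled connection.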

First, I executed the application with the Statement and ResultSet objects closed. Subsequently, I executed the application without closing the Statement and ResultSet objects. I performed the query operation 50 times and observed the memory usage pattern.

Scenario 1
The database table contains 100 rows and 10 columns. ResultSet and Statement objects are closed. The database connection is made using a connection pool. The memory usage results of this scenario are shown in Figure 1.


Figure 1. When queried once, the heap memory usage increases by 166.308 KB.

Figure 1 is a heap usage chart provided by JProbe. It gives a runtime summary of the heap memory in use over time as the Java EE application runs. The green area indicates the heap usage. The vertical line indicates a heap usage checkpoint has been set at that time. After setting the checkpoint, the query occurs and the heap memory usage shoots up as objects are created. Once the operation completes, the objects no longer referenced will be garbage collected by the JVM, so the memory usage decreases. Ideally at this time, all new objects should be released and garbage collected, and the heap usage should return to the value before the checkpoint was set. In this case, some new objects continue to occupy memory space, reflecting an increase in heap usage by 166.308 KB.

When queried 10 times, the heap memory usage increases by 175.512 KB, as illustrated in Figure 2.


Figure 2. Ten queries.

When queried 50 times, the heap memory usage increases by 194.128 KB, as shown in Figure 3.


Figure 3. Fifty queries.

The observed increase in memory was traced to the connection objects stored in the pool for subsequent reuse.

Scenario 2
The database table contains 100 rows and 10 columns. ResultSet and Statement objects are not closed. The database connection is made using a connection pool. When queried once, the heap memory usage increases by 187.356 KB, as shown in Figure 4.


Figure 4. Results from one query.

When queried 10 times, the heap memory usage increases by 217.016 KB.


Figure 5. Ten queries.

When queried 50 times, the heap memory usage increases by 425.404 KB.


Figure 6. Fifty queries.

After 50 queries, leaving the ResultSet and Statement objects open costs an additional 231.276 KB compared with closing them (425.404 KB versus 194.128 KB). These results show that over time, these objects will cause a huge memory leak, eventually generating an OutOfMemoryError.

In addition to the heap usage chart, JProbe also provides a runtime view of class instances in the heap. From the class instance summary, we can identify the objects present in the heap at any point in time. Figure 7 shows a part of the class instance view of Scenario 2.


Figure 7. Class instance summary.

Figure 7 clearly shows that 50 objects of OracleStatement, 500 objects of DBColumn, etc., exist in the heap and are not garbage collected. JProbe provides a reference/referrer tree for each class instance in the table, shown in Figure 8. From this tree we can identify how each class instance was created.


Figure 8. Referrer tree for the DBColumn object

From the referrer tree of DBColumn, we can see that it is created by the OracleStatement object. The class oracle.jdbc.driver.OracleStatement is the implementation of the Statement interface. So once the Statement object is closed, all associated DBColumn objects become eligible for garbage collection.

Recommendation
When a connection pool is used, calling close() on the connection object returns the connection to the pool for reuse; it does not actually close the connection. Thus, the associated Statement and ResultSet objects remain in memory. Hence, JDBC Statement and ResultSet objects must be explicitly closed in a finally block.

Collection objects
A collection is an object that organizes references to other objects. The collection itself has a long lifetime, but the elements in the collection do not. Hence, a memory leak will result if items are not removed from the collection when they are no longer needed. Java provides the Collection interface and implementation classes of this interface such as ArrayList and Vector. Using the same Java EE application tested in the previous scenario, I added the database query results to an ArrayList. When 35,000 rows were present in the database table, the application server threw a java.lang.OutOfMemoryError, with a default JVM heap size of 64 MB.


Figure 9. Heap summary when JVM throws java.lang.OutOfMemoryError.
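One way to keep a long-lived collection from growing without limit is to give it a removal policy. The following is a minimal sketch of that idea, not the approach used in the sample application: the class name BoundedCache and the least-recently-used policy are assumptions chosen for illustration. It relies on the removeEldestEntry() hook of java.util.LinkedHashMap, which is consulted after every insertion.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** A map that evicts its least recently used entry once it holds more than maxEntries. */
final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order, so the eldest entry is the LRU one
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Returning true tells LinkedHashMap to drop the eldest entry after a put.
        return size() > maxEntries;
    }
}
```

With a limit of three entries, for example, inserting ten rows leaves only the three most recent ones in the map; the references to the other seven are dropped, so those objects can be garbage collected.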

A collection with no policy for removing data causes a memory leak, known as the Leak Collection anti-pattern (read J2EE Design Patterns for more information on anti-patterns).

Recommendation
When collections are used, the object references stored in the collections should be programmatically cleaned to ensure that the collection size does not grow indefinitely. If the collection is being used to store a large table's query results, data access should be completed in batches.

Static variables and classes
In Java, usually a class member (variable or method) is accessed in conjunction with an object of its class. In the case of static variables and methods, it is possible to use a class member without creating an instance of its class. In this article, a class with static members is referred to as a static class. Before any instance of a class is created, the JVM creates a class object for it, which is itself allocated on the heap and loaded by a class loader, here the primordial class loader. In the case of static classes, all the static members are also initialized along with the class object. Once a static variable is initialized with data (typically an object), it remains in memory as long as the class that defines it stays in memory. Classes loaded by the primordial class loader stay in memory for the duration of the program and are not eligible for garbage collection. So such static classes and their associated static variables will never be garbage collected. Thus, using too many static variables leads to memory leaks.

Recommendation
Usage of static classes should be minimized as they stay in memory for the lifetime of the application.

The Singleton pattern
The Singleton pattern is an object-oriented design pattern used to ensure that a class has only one instance and provide a global point of access to that instance. The Singleton pattern can be implemented by doing the following:

  • Implementing a static method that returns an instance of the class
  • Making the constructor private so a class instance can be created only through the static method
  • Using a static variable to store the class instance

Example code for the Singleton pattern follows:

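public class Singleton {
   // The single instance, created lazily on first use
   private static Singleton singleton = null;

   // Private constructor: instances can be created only through getInstance()
   private Singleton() {
   }

   // Not thread-safe as written; synchronize if called from multiple threads
   public static Singleton getInstance() {
       if (singleton == null)
           singleton = new Singleton();
       return singleton;
   }
}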

The Singleton class is typically used as a factory to create objects. I cached these objects in an ArrayList to enable their speedy retrieval. When a new object must be created, it is retrieved from the cache if it is present there; otherwise, a new object is created.


Figure 10. Singleton class diagram.
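To make the caching factory concrete, here is a minimal sketch. It is an illustration under stated assumptions rather than the sample application's code: the class name WeakCachingFactory, the key-based lookup, and the use of a HashMap instead of an ArrayList are all choices made for the example. The cache holds each object through a java.lang.ref.WeakReference, so the long-lived factory on its own does not keep a cached object reachable.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** A factory that reuses cached objects but lets the garbage collector reclaim unused ones. */
final class WeakCachingFactory<K, V> {
    private final Map<K, WeakReference<V>> cache = new HashMap<>();
    private final Function<K, V> creator;

    WeakCachingFactory(Function<K, V> creator) {
        this.creator = creator;
    }

    synchronized V get(K key) {
        WeakReference<V> ref = cache.get(key);
        // ref.get() returns null once the object has been garbage collected.
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = creator.apply(key);
            // Replaces any stale entry; keys of collected values otherwise stay in the
            // map, so a production cache would also prune them.
            cache.put(key, new WeakReference<>(value));
        }
        return value;
    }
}
```

While a caller still holds a strong reference to an object, repeated get() calls with the same key return that same instance. Once no caller references it, the object becomes eligible for collection even though the factory itself lives on.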

Once the Singleton class is instantiated, it remains in memory for the application's lifetime. The objects it caches remain reachable through a live reference from it and, as a result, will never be garbage collected.

Recommendation
Avoid referencing objects from long-lasting objects. If such usage cannot be avoided, use a weak reference, a type of object reference that does not prevent the object from being garbage collected.

HttpSession vs. HttpRequest
HTTP is a request-response-based stateless protocol. If a client wants to send information to the server, it can be stored in an HttpRequest object (HttpServletRequest in the Servlet API). But that HttpRequest object will be available only for a single transaction. The HTTP server has no way to determine whether a series of requests came from the same client or from different clients. The HttpSession object is generally used to store data required from the time a user logs into the system until the user logs out. It brings statefulness into a transaction. The session can be used for storing information such as a user's security permissions. But often, programmers mistakenly store complex long-living data, such as a shopping cart, in the session, instead of using the business tier.

I experimented with the sample application to find the difference in memory usage between the HttpSession and HttpRequest objects, since data stored in HttpSession stays in memory until the user logs out of the application. I added the database table's query results to an ArrayList, which I then placed into the HttpRequest object in one scenario and into the HttpSession object in the other. Memory usage was observed for 50 query-and-save operations.

Scenario 1
The database table contains 100 rows. The output ArrayList is stored in the HttpRequest object to be passed back to the JSP page. After performing one query-and-save operation, the increase in memory usage is 166.308 KB.


Figure 11. Results for one query-and-save operation.

After completing 10 query-and-save operations, the increase in memory usage is 175.512 KB.


Figure 12. Ten operations.

After performing 50 query-and-save operations, the increase in memory usage is 194.128 KB.


Figure 13. Fifty query-and-save operations.

Scenario 2
The database table contains 100 rows. The output ArrayList is stored in the HttpSession object to be passed back to the JSP page. After one query-and-save operation, the increase in memory usage is 176.708 KB.


Figure 14. One query-and-save operation.

After 10 query-and-save operations, the increase in memory usage is 178.46 KB.


Figure 15. Ten operations.

After 50 query-and-save operations, the increase in memory usage is 216.552 KB.


Figure 16. Fifty operations.

When the data is stored in HttpSession instead of HttpRequest, 50 query-and-save operations increase memory usage by an additional 22.424 KB (216.552 KB versus 194.128 KB). This happens on a per-client basis, so the overhead multiplies with the number of clients. Over a period of time, this will lead to a significant memory leak in the application. The data stored in HttpSession stays in memory as long as the user is logged in. Putting too much data into HttpSession leads to the Overstuffed Session anti-pattern. Since HttpSession is implemented as a collection, this overstuffed session can be considered a variant of the Leak Collection anti-pattern.

Recommendation

  1. Use of HttpSession should be minimized and reserved for state that cannot realistically be kept on the request object
  2. Remove objects from HttpSession if they are no longer used
  3. Long-living data should be migrated to the business tier

Conclusion
I have highlighted some of the important programming scenarios where the JVM's garbage collection mechanism becomes ineffective. These situations necessitate appropriate precautions during design of the application itself. While closing ResultSets and Statements can be done after application development with comparatively low costs, other aspects that I have explained get deeply embedded in the design and could prove costly to correct. The garbage collector is a low-priority thread. Hence in a heavily loaded Java EE application, garbage collection itself may happen infrequently. Even those objects that could have been potentially garbage collected may actually stay in memory for a long time. So explicitly cleaning the heap may be a mandatory programming requirement in some applications; doing so must be considered on a case-by-case basis.


Plug memory leaks in enterprise Java applications JavaWorld   03/12/06 07:46 PM

About the author
Ambily Pankajakshan works as a scientist in the Centre for Artificial Intelligence and Robotics. She has more than five years of experience in the design and development of multitiered Java EE applications. Her areas of interest are performance-tuning Java EE applications and application servers. She holds a B.Tech degree in computer science and engineering from M.G. University, Kerala, India. Currently, she lives in Bangalore with her husband Nishore and son Ananthu.
