Memory Leakage in Internet Explorer - revisited(zz)

Memory Leakage in Internet Explorer - revisited
By volkan.ozcelik.

In this article, we will review JavaScript memory leakage patterns from a slightly different perspective and support it with diagrams and memory utilization graphs.

 

Introduction

If you are developing client-side re-usable scripting objects, sooner or later you will find yourself spotting out memory leaks. Chances are that your browser will suck memory like a sponge and you will hardly be able to find a reason why your lovely DHTML navigation's responsiveness decreases severely after visiting a couple of pages within your site.

A Microsoft developer Justing Rogers has described IE leak patterns in his excellent article.

In this article, we will review those patterns from a slightly different perspective and support it with diagrams and memory utilization graphs. We will also introduce several subtler leak scenarios. Before we begin, I strongly recommend you to read that article if you have not already read.

Why does the memory leak?

The problem of memory leakage is not just limited to Internet Explorer. Almost any browser (including but not limited to Mozilla, Netscape and Opera) will leak memory if you provide adequate conditions (and it is not that hard to do so, as we will see shortly). But (in my humble opinion, ymmv etc.) Internet Explorer is the king of leakers.

Don't get me wrong. I do not belong to the crowd yelling "Hey IE has memory leaks, checkout this new tool [link-to-tool] and see for yourself". Let us discuss how crappy Internet Explorer is and cover up all the flaws in other browsers".

Each browser has its own strengths and weaknesses. For instance, Mozilla consumes too much of memory at initial boot, it is not good in string and array operations; Opera may crash if you write a ridiculously complex DHTML script which confuses its rendering engine.

If you have read my previous article, then you already know that I am a usability, accessibility and standards crack. I love Opera. But that does not mean I am pursuing a holy war against IE. What I want here is to follow up an analytical path and examine various leaking patterns that may not be quite obvious at first glance.

Although we will be focusing on the memory leaking situations in Internet Explorer, this discussion is equally applicable to other browsers.

A simple beginning

Let us begin with a simple example:

[Exhibit 1 - Memory leaking insert due to inline script]

<html>
<head>
<script type="text/javascript">
    function LeakMemory(){
        var parentDiv = 
             document.createElement("<div onclick='foo()'>");

        parentDiv.bigString = new Array(1000).join(
                              new Array(2000).join("XXXXX"));
    }
</script>
</head>
<body>
<input type="button" 
       value="Memory Leaking Insert" onclick="LeakMemory()" />
</body>
</html>

The first assignment parentDiv=document.createElement(...); will create a div element and create a temporary scope for it where the scripting object resides. The second assignment parentDiv.bigString=... attaches a large object to parentDiv. When LeakMemory() method is called, a DOM element will be created within the scope of this function, a very large object will be attached to it as a member property and the DOM element will be de-allocated and removed from memory as soon as the function exits, since it is an object created within the local scope of the function.

When you run the example and click the button a few times, your memory graph will probably look like this:

Increasing the frequency

No visible leak huh? What if we do this a few hundred times instead of twenty, or a few thousand times? Will it be the same? The following code calls the assignment over and over again to accomplish this goal:

[Exhibit 2 - Memory leaking insert (frequency increased) ]

<html>
<head>
<script type="text/javascript">
    function LeakMemory(){
        for(i = 0; i < 5000; i++){
            var parentDiv = 
               document.createElement("<div onClick='foo()'>");
        }
    }
</script>
</head>
<body>
<input type="button" 
       value="Memory Leaking Insert" onclick="LeakMemory()" />
</body>
</html>

And here follows the corresponding graph:

The ramp in the memory usage indicates leak in memory. The horizontal line (the last 20 seconds) at the end of the ramp is the memory after refreshing the page and loading another (about:blank) page. This shows that the leak is an actual leak and not a pseudo leak. The memory will not be reclaimed unless the browser window and other dependant windows if any are closed.

Assume you have a dozen pages that have similar leakage graph. After a few hours, you may want to restart your browser (or even your PC) because it just stops responding. The naughty browser is eating up all your resources. However, this is an extreme case because Windows will increase the virtual memory size as soon as your memory consumption reaches a certain level.

This is not a pretty scenario. Your client/boss will not be very happy, if they discover such a situation in the middle of a product showcase/training/demo.


A careful eye may have caught that there is no bigString in the second example. This means that the leak is merely because of the internal scripting object (i.e. the anonymous script onclick='foo()'). This script was not deallocated properly. This caused memory leak at each iteration. To prove our thesis let us run a slightly different test case:

[Exhibit 3 - Leak test without inline script attached]

<html>
<head>
<script type="text/javascript">
    function LeakMemory(){
        for(i = 0; i < 50000; i++){
            var parentDiv = 
            document.createElement("div");
        }
    }
</script>
</head>
<body>
<input type="button" 
       value="Memory Leaking Insert" onclick="LeakMemory()" />
</body>
</html>

And here follows the corresponding memory graph:

As you can see, we have done fifty thousand iterations instead of 5000, and still the memory usage is flat (i.e. no leaks). The slight ramp is due to some other process in my PC.

Let us change our code in a more standard and somewhat unobtrusive manner (not the correct term here, but can't find a better one) without embedded inline scripts and re-test it.

Introducing the closure

Here is another piece of code. Instead of appending the script inline, we attach it externally:

[Exhibit 4 - Leak test with a closure]

<html>
<head>
<script type="text/javascript">
    function LeakMemory(){
        var parentDiv = document.createElement("div");
                          parentDiv.onclick=function(){
            foo();
        };

        parentDiv.bigString = 
          new Array(1000).join(new Array(2000).join("XXXXX"));
    }
</script>
</head>
<body>
<input type="button" 
       value="Memory Leaking Insert" onclick="LeakMemory()" />
</body>
</html>

If you don't know what a closure is, there are very good references on the web where you may find it. Closures are very useful patterns; you should learn them and keep them in your knowledge base.

And here is the graph that shows the memory leak. This is somewhat different from the former examples. The anonymous function assigned to parentDiv.onclick is a closure that closes over parentDiv, which creates a circular reference between the JS world and DOM and creates a well-known memory leakage issue:

To generate leak in the above scenario, we should click the button, refresh the page, click the button again, refresh the page and so on.

Clicking the button without a subsequent refresh will generate the leak only once. Because, at each click, the onclick event of parentDiv is reassigned and the circular reference over the former closure is broken. Hence at each page load there is only one closure that cannot be garbage collected due to circular reference. The rest is successfully cleaned up.

More leakage patterns

All of the patterns shown below are described in detail in Justing's article. I'm going through them just for the sake of completeness:

[Exhibit 5 - Circular reference because of expando property]

<html>
<head>
<script type="text/javascript">
    var myGlobalObject;

    function SetupLeak(){
        //Here a reference created from the JS World 
        //to the DOM world.
        myGlobalObject=document.getElementById("LeakedDiv");

        //Here DOM refers back to JS World; 
        //hence a circular reference.
        //The memory will leak if not handled properly.
        document.getElementById("LeakedDiv").expandoProperty=
                                               myGlobalObject;
    }
</script>
</head>
<body onload="SetupLeak()">
<div id="LeakedDiv"></div>
</body>
</html>

Here the global variable myGlobalObject refers to the DOM element LeakDiv; at the same time LeakDiv refers to the global object through its expandoProperty. The situation looks like this:

The above pattern will leak due to the circular reference created between a DOM node and a JS element.

Since the JScript garbage collector is a mark and sweep GC, you may think that it would handle circular references. And in fact it does. However this circular reference is between the DOM and JS worlds. DOM and JS have separate garbage collectors. Therefore they cannot clean up memory in situations like the above.

Another way to create a circular reference is to encapsulate the DOM element as a property of a global object:

[Exhibit 6 - Circular reference using an Encapsulator pattern]

<html>
<head>
<script type="text/javascript">
function Encapsulator(element){
    //Assign our memeber
    this.elementReference = element;

    // Makea circular reference
    element.expandoProperty = this;
}

function SetupLeak() {
    //This leaks
    new Encapsulator(document.getElementById("LeakedDiv"));
}
</script>
</head>
<body onload="SetupLeak()">
<div id="LeakedDiv"></div>
</body>
</html>

Here is how it looks like:

However, the most common usage of closures over DOM nodes is event attachment. The following code will leak:

[Exhibit 7 - Adding an event listener as a closure function]

<html>
<head>
<script type="text/javascript">
window.onload=function(){
    // obj will be gc'ed as soon as 
    // it goes out of scope therefore no leak.
    var obj = document.getElementById("element");
    
    // this creates a closure over "element"
    // and will leak if not handled properly.
    obj.onclick=function(evt){
        ... logic ...
    };
};
</script>
</head>
<body>
<div id="element"></div>
</body>
</html>

Here is a diagram describing the closure which creates a circular reference between the DOM world and the JS world.

The above pattern will leak due to closure. Here the closure's global variable obj is referring to the DOM element. In the mean time, the DOM element holds a reference to the entire closure. This generates a circular reference between the DOM and the JS worlds. That is the cause of leakage.

When we remove closure we see that the leak has gone:

[Exhibit 8- Leak free event registration - No closures were harmed]

<html>
<head>
<script type="text/javascript">
window.onload=function(){
    // obj will be gc'ed as soon as 
    // it goes out of scope therefore no leak.
    var obj = document.getElementById("element");
    obj.onclick=element_click;
};

//HTML DOM object "element" refers to this function
//externally
function element_click(evt){
    ... logic ...
}
</script>
</head>
<body>
<div id="element"></div>
</body>
</html>

Here is the diagram for the above code piece:

This pattern will not leak because as soon as the function window.onload finishes execution, the JS object obj will be marked for garbage collection. So there won't be any reference to the DOM node on the JS side.

And the last but not the least leak pattern is the "cross-page leak":

Collapse
[Exhibit 10 - Cross Page Leak]

<html>
<head>
<script type="text/javascript">
function LeakMemory(){
    var hostElement = document.getElementById("hostElement");
    // Do it a lot, look at Task Manager for memory response
    for(i = 0; i < 5000; i++){
        var parentDiv =
        document.createElement("<div onClick='foo()'>");

        var childDiv =
        document.createElement("<div onClick='foo()'>");

        // This will leak a temporary object
        parentDiv.appendChild(childDiv);
        hostElement.appendChild(parentDiv);
        hostElement.removeChild(parentDiv);
        parentDiv.removeChild(childDiv);
        parentDiv = null;
        childDiv = null;
    }
    hostElement = null;
}
</script>
</head>
<body>
<input type="button" 
       value="Memory Leaking Insert" onclick="LeakMemory()" />
<div id="hostElement"></div>
</body>
</html>

Since we observe memory leakage even in Exhibit 1, it is not surprising that this pattern leaks. Here is what happens: When we append childDiv to parentDiv, a temporary scope from childDiv to parentDiv is created which will leak a temporary script object. Note that document.createElement("<div onClick='foo()'>"); is a non-standard method of event attachment.

Simply using the "best practices" is not enough (as Justing has mentioned in his article as well). One should also adhere to standards as much as possible. If not, he may not have a single clue about what went wrong with the code that was working perfectly a few hours ago (which had just crashed unexpectedly).

Anyway, let us re-order our insertion. The code below will not leak:

Collapse
[Exhibit 11 - DOM insertion re-ordered - no leaks]

<html>
<head>
<script type="text/javascript">
function LeakMemory(){
    var hostElement = document.getElementById("hostElement");
    // Do it a lot, look at Task Manager for memory response
    for(i = 0; i < 5000; i++){
        var parentDiv =
          document.createElement("<div onClick='foo()'>");

        var childDiv =
          document.createElement("<div onClick='foo()'>");

        hostElement.appendChild(parentDiv);
        parentDiv.appendChild(childDiv);
        parentDiv.removeChild(childDiv);
        hostElement.removeChild(parentDiv);

        parentDiv = null;
        childDiv = null;
    }
    hostElement = null;
}
</script>
</head>
<body>
<input type="button" 
       value="Memory Leaking Insert" onclick="LeakMemory()" />
<div id="hostElement"></div>
</body>
</html>

We should keep in mind that, although it is the market leader, IE is not the only browser in the world. And writing IE-specific non-standard code is a bad practice of coding. The counter-argument is true as well. I mean, saying "Mozilla is the best browser so I write Mozila-specific code; I don't care what the heck happens to the rest" is an equally bad attitude. You should enlarge your spectrum as much as possible. As a corollary, you should write standards-compatible code to the highest extent, whenever possible.

Writing, "backwards compatible" code is "out" nowadays. The "in" is writing "forward compatible" (also known as standards compatible) code which will run now and in the future, in current and in future browsers, here and on the moon.

Conclusion

The purpose of this article was to show that not all leakage patterns are easy to find. You may hardly notice some of them, may be due to the small bookkeeping objects that become obvious only after several thousands of iterations. Knowing the internals of a process is an undeniable key to success. Be aware of the leak patterns. Instead of debugging your application in a brute-force manner, look at your code fragments and check whether there is any piece that matches a leakage pattern in your arsenal.

Writing defensive code and taking care of all possible leakage issues is not an over-optimization. To take it simpler, leak-proofness is not a feature the developer may choose to implement or not depending on his mood. It is a requirement for creating a stable, consistent and forward-compatible code. Every web developer should know about it. No excuses, sorry. There are several solutions proposed for leakage issues between a JS closure and a DOM object. Here are a few of them:

However, you should be aware of the fact that there are always unique cases where you may need to craft a solution for yourself. That's it for now. Although the article is not intended to be the "best practices in avoiding memory leakage", I hope it has pointed out some of the interesting issues.

Happy coding!

History

  • 2005-11-12: Article created.
<script src="/script/togglePre.js" type="text/javascript"></script>

About volkan.ozcelik


Volkan is a java enterprise architect who left his full-time senior developer position to venture his ideas and dreams. He codes C# as a hobby, trying to combine the .Net concept with his Java and J2EE know-how. He also works as a freelance web application developer/designer.

Volkan is especially interested in database oriented content management systems, web design and development, web standards, usability and accessibility.

He was born on May '79. He has graduated from one of the most reputable universities of his country (i.e. Bogazici University) in 2003 as a Communication Engineer. He also has earned his Master of Business Administration degree from a second university in 2006.

Click here to view volkan.ozcelik's online profile.


Other popular JavaScript articles:

你可能感兴趣的:(JavaScript,html,IE,XP,Opera)