a SoftReference based HashMap

Implementing a SoftReference based HashMap

Java is slow. Java is a memory hog.

But, if I have to write a network application, I would not hesitate for a second to use Java. Why? Because Java shines in threaded, network-based applications and because over the years, I have recognised the weaknesses of Java and have learnt (the hard way) what I should do, or not do, to avoid running into too serious problems. Recognising the problems takes a while to learn, which is why most Java jobs require you to have at least 2 years of Java programming experience.

One of the most common places of memory wastage is in hash maps, so SUN have provided a WeakHashMap to minimize memory usage in caches implemented using maps. A WeakHashMap stores the keys using WeakReference objects, which means in layman's terms that as soon as the key is not referenced from somewhere else in your program, the entry may be removed and is available for garbage collection. (Have a look at the JavaDocs for the java.util.WeakHashMap and java.lang.ref.WeakReference for more information) Say you write a middle tier that provides an OO view of your database via an application server. What you really want is for the values to be automatically released, rather than the keys. If you put all the objects into a normal HashMap, you can easily run out of memory when you access many different objects from different clients. But if you store the objects in a WeakHashMap they are cleared as soon as your clients is not referencing them anymore. What you really want, however, is to only have the objects released when the VM is running low on memory, since that is when you get problems.

Enter SoftReferences. As far as I understand, a SoftReference will only be garbage collected when the VM is running low on memory and the object it is pointing to is not accessed from a normal (hard) reference. This is probably a better option to use for the HashMap values than a WeakReference, but the default JDK collections don't include a GC-friendly HashMap based on values and neither does it provide a SoftReference based HashMap.

Before I show you a SoftHashMap implementation, based on ideas by Sydney (what's up doc?) Redelinghuys, I need to explain some ideas which will make our SoftHashMap more optimal.

   1. Each time we change the map (put, remove, clear) or ask for its size, we first want to go through the map and throw away entries for which the values have been garbage collected. It is quite easy to find out which soft references have been cleared. You can give the SoftReference a ReferenceQueue to which it is added when it is garbage collected.
   2. We don't want our hash map to be bullied by the garbage collector, so we provide the option of the map itself keeping a hard link to the last couple of objects (typically 100).
   3. The SoftHashMap will use a variant of the Decorator pattern to add this functionality to an internally kept java.util.HashMap. I'm busy working on a Design Patterns course based on the GOF book, let me know if you want further information.

Without further ado, here comes the SoftHashMap:

//: SoftHashMap.java
import java.util.*;
import java.lang.ref.*;

public class SoftHashMap extends AbstractMap {
  /** The internal HashMap that will hold the SoftReference. */
  private final Map hash = new HashMap();
  /** The number of "hard" references to hold internally. */
  private final int HARD_SIZE;
  /** The FIFO list of hard references, order of last access. */
  private final LinkedList hardCache = new LinkedList();
  /** Reference queue for cleared SoftReference objects. */
  private final ReferenceQueue queue = new ReferenceQueue();

  public SoftHashMap() { this(100); }
  public SoftHashMap(int hardSize) { HARD_SIZE = hardSize; }

  public Object get(Object key) {
    Object result = null;
    // We get the SoftReference represented by that key
    SoftReference soft_ref = (SoftReference)hash.get(key);
    if (soft_ref != null) {
      // From the SoftReference we get the value, which can be
      // null if it was not in the map, or it was removed in
      // the processQueue() method defined below
      result = soft_ref.get();
      if (result == null) {
        // If the value has been garbage collected, remove the
        // entry from the HashMap.
        hash.remove(key);
      } else {
        // We now add this object to the beginning of the hard
        // reference queue.  One reference can occur more than
        // once, because lookups of the FIFO queue are slow, so
        // we don't want to search through it each time to remove
        // duplicates.
        hardCache.addFirst(result);
        if (hardCache.size() > HARD_SIZE) {
          // Remove the last entry if list longer than HARD_SIZE
          hardCache.removeLast();
        }
      }
    }
    return result;
  }

  /** We define our own subclass of SoftReference which contains
   not only the value but also the key to make it easier to find
   the entry in the HashMap after it's been garbage collected. */
  private static class SoftValue extends SoftReference {
    private final Object key; // always make data member final
    /** Did you know that an outer class can access private data
     members and methods of an inner class?  I didn't know that!
     I thought it was only the inner class who could access the
     outer class's private information.  An outer class can also
     access private members of an inner class inside its inner
     class. */
    private SoftValue(Object k, Object key, ReferenceQueue q) {
      super(k, q);
      this.key = key;
    }
  }

  /** Here we go through the ReferenceQueue and remove garbage
   collected SoftValue objects from the HashMap by looking them
   up using the SoftValue.key data member. */
  private void processQueue() {
    SoftValue sv;
    while ((sv = (SoftValue)queue.poll()) != null) {
      hash.remove(sv.key); // we can access private data!
    }
  }
  /** Here we put the key, value pair into the HashMap using
   a SoftValue object. */
  public Object put(Object key, Object value) {
    processQueue(); // throw out garbage collected values first
    return hash.put(key, new SoftValue(value, key, queue));
  }
  public Object remove(Object key) {
    processQueue(); // throw out garbage collected values first
    return hash.remove(key);
  }
  public void clear() {
    hardCache.clear();
    processQueue(); // throw out garbage collected values
    hash.clear();
  }
  public int size() {
    processQueue(); // throw out garbage collected values first
    return hash.size();
  }
  public Set entrySet() {
    // no, no, you may NOT do that!!! GRRR
    throw new UnsupportedOperationException();
  }
}

And here comes some test code that demonstrates to a certain degree that I'm not talking complete bullsh*t. Soft and weak references are quite difficult to experiment with as there is a lot of freedom left to the writers of the JVM of how they must implement them. I wish the implementation would hold back longer before removing these references, that the JVM would wait until it is really running low on memory before clearing them, but unfortunately I am not the one who wrote the JVM. I have tried to question the authors of the java.lang.ref package to find out what the strategy is for references in future versions, but have not had any response yet.

//: SoftHashMapTest.java
import java.lang.ref.*;
import java.util.*;

public class SoftHashMapTest {
  private static void print(Map map) {
    System.out.println("One=" + map.get("One"));
    System.out.println("Two=" + map.get("Two"));
    System.out.println("Three=" + map.get("Three"));
    System.out.println("Four=" + map.get("Four"));
    System.out.println("Five=" + map.get("Five"));
  }
  private static void testMap(Map map) {
    System.out.println("Testing " + map.getClass());
    map.put("One", new Integer(1));
    map.put("Two", new Integer(2));
    map.put("Three", new Integer(3));
    map.put("Four", new Integer(4));
    map.put("Five", new Integer(5));
    print(map);
    byte[] block = new byte[10*1024*1024]; // 10 MB
    print(map);
  }
  public static void main(String[] args) {
    testMap(new HashMap());
    testMap(new SoftHashMap(2));
  }
}

Until next week, and please remember to forward this newsletter and send me your comments.

Heinz

你可能感兴趣的:(java,jvm,OO,Access,Go)