I've recently been wrestling with cloning complex objects, and I was tired of writing a separate clone function for every class, so I collected this deep clone technique from the web. Perfect for lazy people like me!
The java.lang.Object root superclass defines a clone() method that will, assuming the subclass implements the java.lang.Cloneable interface, return a copy of the object. While Java classes are free to override this method to do more complex kinds of cloning, the default behavior of clone() is to return a shallow copy of the object. This means that the values of all of the original object's fields are copied to the fields of the new object.
A property of shallow copies is that fields that refer to other objects will point to the same objects in both the original and the clone. For fields that contain primitive or immutable values (int, String, float, etc.), there is little chance of this causing problems. For mutable objects, however, cloning can lead to unexpected results. Figure 1 shows an example.
```java
import java.util.Vector;

public class Example1 {

    public static void main(String[] args) {

        // Make a Vector
        Vector original = new Vector();

        // Make a StringBuffer and add it to the Vector
        StringBuffer text = new StringBuffer("The quick brown fox");
        original.addElement(text);

        // Clone the vector and print out the contents
        Vector clone = (Vector) original.clone();
        System.out.println("A. After cloning");
        printVectorContents(original, "original");
        printVectorContents(clone, "clone");
        System.out.println(
            "--------------------------------------------------------");
        System.out.println();

        // Add another object (an Integer) to the clone and
        // print out the contents
        clone.addElement(new Integer(5));
        System.out.println("B. After adding an Integer to the clone");
        printVectorContents(original, "original");
        printVectorContents(clone, "clone");
        System.out.println(
            "--------------------------------------------------------");
        System.out.println();

        // Change the StringBuffer contents
        text.append(" jumps over the lazy dog.");
        System.out.println("C. After modifying one of original's elements");
        printVectorContents(original, "original");
        printVectorContents(clone, "clone");
        System.out.println(
            "--------------------------------------------------------");
        System.out.println();
    }

    public static void printVectorContents(Vector v, String name) {
        System.out.println("  Contents of \"" + name + "\":");

        // For each element in the vector, print out the index, the
        // class of the element, and the element itself
        for (int i = 0; i < v.size(); i++) {
            Object element = v.elementAt(i);
            System.out.println("    " + i + " (" +
                element.getClass().getName() + "): " + element);
        }
        System.out.println();
    }
}
```
Figure 1: Vector contents after cloning

The example makes a Vector and adds a StringBuffer to it. Note that StringBuffer (unlike, for example, String) is mutable; its contents can be changed after creation. Figure 2 shows the output of the example in Figure 1.
```
> java Example1
A. After cloning
  Contents of "original":
    0 (java.lang.StringBuffer): The quick brown fox

  Contents of "clone":
    0 (java.lang.StringBuffer): The quick brown fox

--------------------------------------------------------

B. After adding an Integer to the clone
  Contents of "original":
    0 (java.lang.StringBuffer): The quick brown fox

  Contents of "clone":
    0 (java.lang.StringBuffer): The quick brown fox
    1 (java.lang.Integer): 5

--------------------------------------------------------

C. After modifying one of original's elements
  Contents of "original":
    0 (java.lang.StringBuffer): The quick brown fox jumps over the lazy dog.

  Contents of "clone":
    0 (java.lang.StringBuffer): The quick brown fox jumps over the lazy dog.
    1 (java.lang.Integer): 5
```
In the first block of output ("A"), we see that the clone operation was successful: the original vector and the clone have the same size (1), content types, and values. The second block of output ("B") shows that the original vector and its clone are distinct objects. If we add another element to the clone, it appears only in the clone, and not in the original. The third block of output ("C") is, however, a little trickier. Modifying the StringBuffer that was added to the original vector has changed the value of the first element of both the original vector and its clone. The explanation for this lies in the fact that clone() made a shallow copy of the vector, so both vectors now point to the exact same StringBuffer instance.
This is, of course, sometimes exactly the behavior that you need. In other cases, however, it can lead to frustrating and inexplicable errors, as the state of an object seems to change “behind your back”.
The solution to this problem is to make a deep copy of the object. A deep copy makes a distinct copy of each of the object's fields, recursing through the entire graph of other objects referenced by the object being copied. The Java API provides no deep-copy equivalent to Object.clone(). One solution is to simply implement your own custom method (e.g., deepCopy()) that returns a deep copy of an instance of one of your classes. This may be the best solution if you need a complex mixture of deep and shallow copies for different fields, but it has a few significant drawbacks (a sketch of this approach follows the list):

- You must be able to modify the class (or subclass it); if the class is third-party and final, you are out of luck.
- If significant parts of the object's state are held in private fields of a superclass, you will not be able to access them.
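To make the first option concrete, here is a minimal sketch of a hand-written deepCopy() method. The Employee class and its fields are hypothetical, invented purely for illustration:

```java
import java.util.Date;

// Hypothetical class with a hand-written deep copy method.
// Every mutable field must be copied explicitly by hand.
public class Employee {

    private String name;        // immutable, safe to share
    private Date hireDate;      // mutable, must be copied
    private StringBuffer notes; // mutable, must be copied

    public Employee(String name, Date hireDate, StringBuffer notes) {
        this.name = name;
        this.hireDate = hireDate;
        this.notes = notes;
    }

    /**
     * Returns a deep copy of this Employee. Note that this method
     * must be kept in sync with the class's fields by hand.
     */
    public Employee deepCopy() {
        return new Employee(name,
                            new Date(hireDate.getTime()),
                            new StringBuffer(notes.toString()));
    }
}
```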
A common solution to the deep copy problem is to use Java Object Serialization (JOS). The idea is simple: write the object to a byte array using JOS's ObjectOutputStream and then use ObjectInputStream to reconstitute a copy of the object. The result will be a completely distinct object, with completely distinct referenced objects. JOS takes care of all of the details: superclass fields, following object graphs, and handling repeated references to the same object within the graph. Figure 3 shows a first draft of a utility class that uses JOS for making deep copies.
```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

/**
 * Utility for making deep copies (vs. clone()'s shallow copies) of
 * objects. Objects are first serialized and then deserialized. Error
 * checking is fairly minimal in this implementation. If an object is
 * encountered that cannot be serialized (or that references an object
 * that cannot be serialized) an error is printed to System.err and
 * null is returned. Depending on your specific application, it might
 * make more sense to have copy(...) re-throw the exception.
 *
 * A later version of this class includes some minor optimizations.
 */
public class UnoptimizedDeepCopy {

    /**
     * Returns a copy of the object, or null if the object cannot
     * be serialized.
     */
    public static Object copy(Object orig) {
        Object obj = null;
        try {
            // Write the object out to a byte array
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bos);
            out.writeObject(orig);
            out.flush();
            out.close();

            // Make an input stream from the byte array and read
            // a copy of the object back in.
            ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()));
            obj = in.readObject();
        }
        catch (IOException e) {
            e.printStackTrace();
        }
        catch (ClassNotFoundException cnfe) {
            cnfe.printStackTrace();
        }
        return obj;
    }
}
```
Unfortunately, this approach has some problems, too (a sketch of the custom serialization hooks mentioned in the second point follows the list):

- It only works when the object being copied, as well as all of the other objects it references directly or indirectly, are serializable (in other words, they must implement java.io.Serializable). Fortunately it is often sufficient to simply declare that a given class implements java.io.Serializable and let Java's default serialization mechanisms do their thing.
- Serializing and deserializing the object takes time. There are ways to speed serialization up (e.g., by implementing custom readObject() and writeObject() methods), but this will usually be the primary bottleneck.
- The byte array stream implementations in the java.io package are designed to be general enough to perform reasonably well for data of different sizes and to be safe to use in a multi-threaded environment. These characteristics, however, slow down ByteArrayOutputStream and (to a lesser extent) ByteArrayInputStream.
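As a sketch of the second point, a class can customize how JOS writes and reads its instances by defining private writeObject() and readObject() methods, which the serialization machinery calls automatically when present. The Session class here is hypothetical:

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical example of JOS's custom serialization hooks.
public class Session implements Serializable {

    private String user;

    // transient: skipped by serialization, restored in readObject()
    private transient long lastAccess;

    public Session(String user) {
        this.user = user;
        this.lastAccess = System.currentTimeMillis();
    }

    private void writeObject(ObjectOutputStream out)
            throws IOException {
        // Write the non-transient fields the default way. A real
        // implementation might write a more compact custom form here.
        out.defaultWriteObject();
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();

        // Recompute the transient field after deserialization.
        lastAccess = System.currentTimeMillis();
    }
}
```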
The first two of these problems cannot be addressed in a general way. We can, however, use alternative implementations of ByteArrayOutputStream and ByteArrayInputStream that make three simple optimizations:
1. ByteArrayOutputStream, by default, begins with a 32-byte array for the output. As content is written to the stream, the required size of the content is computed and (if necessary) the array is expanded to the greater of the required size or twice the current size. JOS produces output that is somewhat bloated (for example, fully qualified class names are included in uncompressed string form), so the 32-byte default starting size means that lots of small arrays are created, copied into, and thrown away as data is written. This has an easy fix: construct the array with a larger initial size.
2. All of the methods of ByteArrayOutputStream that modify the contents of the byte array are synchronized. In general this is a good idea, but in this case we can be certain that only a single thread will ever be accessing the stream. Removing the synchronization will speed things up a little. ByteArrayInputStream's methods are also synchronized.
3. The toByteArray() method creates and returns a copy of the stream's byte array. Again, this is usually a good idea: if you retrieve the byte array and then continue writing to the stream, the retrieved byte array should not change. For this case, however, creating another byte array and copying into it merely wastes cycles and makes extra work for the garbage collector.

An optimized version of ByteArrayOutputStream is shown in Figure 4.
```java
import java.io.InputStream;
import java.io.OutputStream;

/**
 * ByteArrayOutputStream implementation that doesn't synchronize methods
 * and doesn't copy the data on toByteArray().
 */
public class FastByteArrayOutputStream extends OutputStream {

    /**
     * Buffer and size
     */
    protected byte[] buf = null;
    protected int size = 0;

    /**
     * Constructs a stream with buffer capacity size 5K
     */
    public FastByteArrayOutputStream() {
        this(5 * 1024);
    }

    /**
     * Constructs a stream with the given initial size
     */
    public FastByteArrayOutputStream(int initSize) {
        this.size = 0;
        this.buf = new byte[initSize];
    }

    /**
     * Ensures that we have a large enough buffer for the given size.
     */
    private void verifyBufferSize(int sz) {
        if (sz > buf.length) {
            byte[] old = buf;
            buf = new byte[Math.max(sz, 2 * buf.length)];
            System.arraycopy(old, 0, buf, 0, old.length);
            old = null;
        }
    }

    public int getSize() {
        return size;
    }

    /**
     * Returns the byte array containing the written data. Note that this
     * array will almost always be larger than the amount of data actually
     * written.
     */
    public byte[] getByteArray() {
        return buf;
    }

    public final void write(byte b[]) {
        verifyBufferSize(size + b.length);
        System.arraycopy(b, 0, buf, size, b.length);
        size += b.length;
    }

    public final void write(byte b[], int off, int len) {
        verifyBufferSize(size + len);
        System.arraycopy(b, off, buf, size, len);
        size += len;
    }

    public final void write(int b) {
        verifyBufferSize(size + 1);
        buf[size++] = (byte) b;
    }

    public void reset() {
        size = 0;
    }

    /**
     * Returns a ByteArrayInputStream for reading back the written data
     */
    public InputStream getInputStream() {
        return new FastByteArrayInputStream(buf, size);
    }
}
```
Figure 4: An optimized ByteArrayOutputStream

The getInputStream() method returns an instance of an optimized version of ByteArrayInputStream that has unsynchronized methods. The implementation of FastByteArrayInputStream is shown in Figure 5.
```java
import java.io.InputStream;

/**
 * ByteArrayInputStream implementation that does not synchronize methods.
 */
public class FastByteArrayInputStream extends InputStream {

    /**
     * Our byte buffer
     */
    protected byte[] buf = null;

    /**
     * Number of bytes that we can read from the buffer
     */
    protected int count = 0;

    /**
     * Number of bytes that have been read from the buffer
     */
    protected int pos = 0;

    public FastByteArrayInputStream(byte[] buf, int count) {
        this.buf = buf;
        this.count = count;
    }

    public final int available() {
        return count - pos;
    }

    public final int read() {
        return (pos < count) ? (buf[pos++] & 0xff) : -1;
    }

    public final int read(byte[] b, int off, int len) {
        if (pos >= count)
            return -1;

        if ((pos + len) > count)
            len = (count - pos);

        System.arraycopy(buf, pos, b, off, len);
        pos += len;
        return len;
    }

    public final long skip(long n) {
        if ((pos + n) > count)
            n = count - pos;

        if (n < 0)
            return 0;

        pos += n;
        return n;
    }
}
```
Figure 5: An optimized ByteArrayInputStream

Figure 6 shows the deep copy utility rewritten to use these optimized stream classes.
```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

/**
 * Utility for making deep copies (vs. clone()'s shallow copies) of
 * objects. Objects are first serialized and then deserialized. Error
 * checking is fairly minimal in this implementation. If an object is
 * encountered that cannot be serialized (or that references an object
 * that cannot be serialized) an error is printed to System.err and
 * null is returned. Depending on your specific application, it might
 * make more sense to have copy(...) re-throw the exception.
 */
public class DeepCopy {

    /**
     * Returns a copy of the object, or null if the object cannot
     * be serialized.
     */
    public static Object copy(Object orig) {
        Object obj = null;
        try {
            // Write the object out to a byte array
            FastByteArrayOutputStream fbos =
                new FastByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(fbos);
            out.writeObject(orig);
            out.flush();
            out.close();

            // Retrieve an input stream from the byte array and read
            // a copy of the object back in.
            ObjectInputStream in =
                new ObjectInputStream(fbos.getInputStream());
            obj = in.readObject();
        }
        catch (IOException e) {
            e.printStackTrace();
        }
        catch (ClassNotFoundException cnfe) {
            cnfe.printStackTrace();
        }
        return obj;
    }
}
```
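As a quick usage sketch (hypothetical, not part of the original article), rerunning the scenario from Figure 1 through DeepCopy shows that the copy's elements are now fully independent of the original:

```java
import java.util.Vector;

public class DeepCopyExample {

    public static void main(String[] args) {
        Vector original = new Vector();
        StringBuffer text = new StringBuffer("The quick brown fox");
        original.addElement(text);

        // Deep copy: the copy gets its own StringBuffer instance.
        Vector copy = (Vector) DeepCopy.copy(original);

        text.append(" jumps over the lazy dog.");

        // Only the original sees the change.
        System.out.println("original: " + original.firstElement());
        System.out.println("copy:     " + copy.firstElement());
    }
}
```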
The extent of the speed boost will depend on a number of factors in your specific application (more on this later), but the simple class shown in Figure 7 tests the optimized and unoptimized versions of the deep copy utility by repeatedly copying a large object.
```java
import java.util.Date;
import java.util.Hashtable;
import java.util.Vector;

public class SpeedTest {

    public static void main(String[] args) {

        // Make a reasonably large test object. Note that this doesn't
        // do anything useful -- it is simply intended to be large, have
        // several levels of references, and be somewhat random. We start
        // with a hashtable and add vectors to it, where each element in
        // the vector is a Date object (initialized to the current time),
        // a semi-random string, and a (circular) reference back to the
        // object itself. In this case the resulting object produces
        // a serialized representation that is approximately 700K.
        Hashtable obj = new Hashtable();
        for (int i = 0; i < 100; i++) {
            Vector v = new Vector();
            for (int j = 0; j < 100; j++) {
                v.addElement(new Object[] {
                    new Date(),
                    "A random number: " + Math.random(),
                    obj
                });
            }
            obj.put(new Integer(i), v);
        }

        int iterations = 10;

        // Make copies of the object using the unoptimized version
        // of the deep copy utility.
        long unoptimizedTime = 0L;
        for (int i = 0; i < iterations; i++) {
            long start = System.currentTimeMillis();
            Object copy = UnoptimizedDeepCopy.copy(obj);
            unoptimizedTime += (System.currentTimeMillis() - start);

            // Avoid having GC run while we are timing...
            copy = null;
            System.gc();
        }

        // Repeat with the optimized version
        long optimizedTime = 0L;
        for (int i = 0; i < iterations; i++) {
            long start = System.currentTimeMillis();
            Object copy = DeepCopy.copy(obj);
            optimizedTime += (System.currentTimeMillis() - start);

            // Avoid having GC run while we are timing...
            copy = null;
            System.gc();
        }

        System.out.println("Unoptimized time: " + unoptimizedTime);
        System.out.println("  Optimized time: " + optimizedTime);
    }
}
```
A few notes about this test:
- Most of the speed boost in the optimized version comes from the use of FastByteArrayOutputStream. This has several implications: using FastByteArrayInputStream speeds things up a little, but the standard java.io.ByteArrayInputStream is nearly as fast. Copying with FastByteArrayOutputStream is faster than with the standard ByteArrayOutputStream, but is much more sensitive to the rate at which the buffer grows. If the objects you are copying tend to be of similar size, copying will be much faster if you initialize the buffer size and tweak the rate of growth.
- Measuring elapsed time with System.currentTimeMillis() is problematic, but for single-threaded applications and testing relatively slow operations it is sufficient. A number of commercial tools (such as JProfiler) will give more accurate per-method timing data.

These caveats aside, the performance difference is significant. For example, the code as shown in Figure 7 (on a 500MHz G3 Macintosh iBook running OS X 10.3 and Java 1.4.1) reveals that the unoptimized version requires about 1.8 seconds per copy, while the optimized version only requires about 1.3 seconds. Whether or not this difference is significant will, of course, depend on the frequency with which your application does deep copies and the size of the objects being copied.
For very large objects, an extension to this approach can reduce the peak memory footprint by serializing and deserializing in parallel threads. See “Low-Memory Deep Copy Technique for Java Objects” for more information.
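In outline, that technique serializes into one end of a pipe on one thread while another thread deserializes from the other end, so the full serialized form never sits in memory all at once. Here is a minimal sketch under that assumption; the LowMemoryDeepCopy class is hypothetical, and error handling is kept deliberately thin:

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

/**
 * Hypothetical sketch of the low-memory variant: the object is
 * serialized and deserialized concurrently through a pipe, so the
 * complete byte array never exists in memory. Error handling is
 * minimal; a failed read can leave the writer thread blocked.
 */
public class LowMemoryDeepCopy {

    public static Object copy(final Object orig)
            throws IOException, ClassNotFoundException {

        final PipedOutputStream pipeOut = new PipedOutputStream();
        PipedInputStream pipeIn = new PipedInputStream(pipeOut);

        // Writer thread: serializes the object into the pipe,
        // blocking whenever the pipe's buffer is full.
        Thread writer = new Thread(new Runnable() {
            public void run() {
                try {
                    ObjectOutputStream out =
                        new ObjectOutputStream(pipeOut);
                    out.writeObject(orig);
                    out.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        });
        writer.start();

        // This thread: deserializes the copy as the bytes arrive.
        ObjectInputStream in = new ObjectInputStream(pipeIn);
        return in.readObject();
    }
}
```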
Source: http://javatechniques.com/blog/faster-deep-copies-of-java-objects/