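System#arraycopy copies a block of data in memory directly, as one whole block, and it is implemented in native code.
When elements are instead assigned one at a time by index, most of the time is spent on addressing and on the individual assignments.
For that reason, System#arraycopy is the recommended way to copy an array or to grow an array dynamically.
The ArrayList we use all the time stores its elements in an array internally; when that array fills up, it relies on System#arraycopy to grow its internal storage.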
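The growth step described above can be sketched roughly as follows. This is a simplified illustration, not the actual java.util.ArrayList source: the class name GrowableIntArray, the initial capacity of 4, and the doubling growth factor are all assumptions made for the example.

```java
// Hypothetical, simplified int list; NOT the real java.util.ArrayList code.
final class GrowableIntArray {
    private int[] data = new int[4]; // assumed initial capacity
    private int size = 0;

    // Appends a value, growing the backing array first if it is full.
    void add(int value) {
        if (size == data.length) {
            grow();
        }
        data[size++] = value;
    }

    // Allocates a larger array and bulk-copies the old contents into it.
    private void grow() {
        int[] bigger = new int[data.length * 2]; // assumed growth factor
        System.arraycopy(data, 0, bigger, 0, size);
        data = bigger;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return data[index];
    }

    int size() { return size; }

    int capacity() { return data.length; }

    public static void main(String[] args) {
        GrowableIntArray list = new GrowableIntArray();
        for (int i = 0; i < 10; i++) {
            list.add(i * i);
        }
        System.out.println("size=" + list.size() + " capacity=" + list.capacity());
    }
}
```

The only part that matters here is grow(): one System.arraycopy call moves all existing elements, instead of a hand-written loop.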
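When I was younger and more headstrong, I copied arrays the same way I did everything else: write a for loop and shuffle the elements across. Later I grew up and discovered the benefits of System.arraycopy.
To measure the difference between the two, I wrote a simple program that copies an int[100000] with each approach, using nanoTime to compute the elapsed time:
The program is as follows: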
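A comparison of this kind might look like the sketch below. It is not the author's original program: the class name CopyTiming and its method names are hypothetical, and only the int[100000] array and the use of nanoTime come from the description above. A single-shot timing like this is noisy and depends on the JVM and on JIT warm-up, so the printed numbers should be read as a rough indication at best.

```java
// Hypothetical timing sketch: index loop versus System.arraycopy.
final class CopyTiming {
    // Copies src into dst one element at a time.
    static void copyWithLoop(int[] src, int[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
    }

    // Copies src into dst with a single bulk call.
    static void copyWithArraycopy(int[] src, int[] dst) {
        System.arraycopy(src, 0, dst, 0, src.length);
    }

    public static void main(String[] args) {
        int[] a = new int[100000];
        for (int i = 0; i < a.length; i++) {
            a[i] = i;
        }
        int[] b = new int[a.length];
        int[] c = new int[a.length];

        long t0 = System.nanoTime();
        copyWithLoop(a, b);
        long t1 = System.nanoTime();
        copyWithArraycopy(a, c);
        long t2 = System.nanoTime();

        System.out.println("for loop:         " + (t1 - t0) + " ns");
        System.out.println("System.arraycopy: " + (t2 - t1) + " ns");
    }
}
```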
int[] a = new int[100000];
for(int i=0;i [...]

[...] is_oop(), "JVM_ArrayCopy: src not an oop");
assert(d->is_oop(), "JVM_ArrayCopy: dst not an oop");
// Do copy
Klass::cast(s->klass())->copy_array(s, src_pos, d, dst_pos, length, thread);
JVM_END
The preceding statements are all checks; only the final call, copy_array(s, src_pos, d, dst_pos, length, thread), performs the actual copy. Following it further, in openjdk6-src/hotspot/src/share/vm/oops/typeArrayKlass.cpp:
void typeArrayKlass::copy_array(arrayOop s, int src_pos, arrayOop d, int dst_pos, int length, TRAPS) {
  assert(s->is_typeArray(), "must be type array");

  // Check destination
  if (!d->is_typeArray() || element_type() != typeArrayKlass::cast(d->klass())->element_type()) {
    THROW(vmSymbols::java_lang_ArrayStoreException());
  }

  // Check is all offsets and lengths are non negative
  if (src_pos < 0 || dst_pos < 0 || length < 0) {
    THROW(vmSymbols::java_lang_ArrayIndexOutOfBoundsException());
  }
  // Check if the ranges are valid
  if ( (((unsigned int) length + (unsigned int) src_pos) > (unsigned int) s->length())
    || (((unsigned int) length + (unsigned int) dst_pos) > (unsigned int) d->length()) ) {
    THROW(vmSymbols::java_lang_ArrayIndexOutOfBoundsException());
  }
  // Check zero copy
  if (length == 0)
    return;

  // This is an attempt to make the copy_array fast.
  int l2es = log2_element_size();
  int ihs = array_header_in_bytes() / WordSize;
  char* src = (char*) ((oop*)s + ihs) + ((size_t)src_pos << l2es);
  char* dst = (char*) ((oop*)d + ihs) + ((size_t)dst_pos << l2es);
  Copy::conjoint_memory_atomic(src, dst, (size_t)length << l2es); // the copy itself is still delegated here
}
In this function, too, everything before the end is a series of checks; only the last statement actually copies anything.
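Seen from the Java side, the checks in copy_array show up as the exceptions that System.arraycopy is documented to throw. The sketch below exercises each case; ArraycopyChecks and tryCopy are hypothetical names introduced only for this illustration.

```java
// Hypothetical probe for the argument checks performed before the copy.
final class ArraycopyChecks {
    // Runs one System.arraycopy call and reports the outcome as a short string.
    static String tryCopy(Object src, int srcPos, Object dst, int dstPos, int length) {
        try {
            System.arraycopy(src, srcPos, dst, dstPos, length);
            return "ok";
        } catch (ArrayStoreException e) {
            return "ArrayStoreException";
        } catch (IndexOutOfBoundsException e) {
            return "IndexOutOfBoundsException";
        }
    }

    public static void main(String[] args) {
        int[] ints = new int[8];
        // Destination element type differs from the source element type.
        System.out.println(tryCopy(ints, 0, new long[8], 0, 4));
        // Negative offset.
        System.out.println(tryCopy(ints, -1, new int[8], 0, 4));
        // Range runs past the end of the source: 6 + 4 > 8.
        System.out.println(tryCopy(ints, 6, new int[8], 0, 4));
        // srcPos + length would overflow a signed int; still rejected.
        System.out.println(tryCopy(ints, Integer.MAX_VALUE, new int[8], 0, 1));
        // Zero length with valid offsets: nothing to copy, no exception.
        System.out.println(tryCopy(ints, 0, new int[8], 0, 0));
    }
}
```

The Integer.MAX_VALUE case is the reason the range check in copy_array casts to unsigned int before adding: a signed sum could wrap around to a negative number and slip past the comparison.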
The corresponding function can be found in openjdk6-src/hotspot/src/share/vm/utilities/copy.cpp:
// Copy bytes; larger units are filled atomically if everything is aligned.
void Copy::conjoint_memory_atomic(void* from, void* to, size_t size) {
  address src = (address) from;
  address dst = (address) to;
  uintptr_t bits = (uintptr_t) src | (uintptr_t) dst | (uintptr_t) size;

  // (Note: We could improve performance by ignoring the low bits of size,
  // and putting a short cleanup loop after each bulk copy loop.
  // There are plenty of other ways to make this faster also,
  // and it's a slippery slope. For now, let's keep this code simple
  // since the simplicity helps clarify the atomicity semantics of
  // this operation. There are also CPU-specific assembly versions
  // which may or may not want to include such optimizations.)

  if (bits % sizeof(jlong) == 0) {
    Copy::conjoint_jlongs_atomic((jlong*) src, (jlong*) dst, size / sizeof(jlong));
  } else if (bits % sizeof(jint) == 0) {
    Copy::conjoint_jints_atomic((jint*) src, (jint*) dst, size / sizeof(jint));
  } else if (bits % sizeof(jshort) == 0) {
    Copy::conjoint_jshorts_atomic((jshort*) src, (jshort*) dst, size / sizeof(jshort));
  } else {
    // Not aligned, so no need to be atomic.
    Copy::conjoint_jbytes((void*) src, (void*) dst, size);
  }
}
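The alignment test in conjoint_memory_atomic is compact enough to be easy to misread, so here is a small Java sketch of the same selection logic. It is only a model: AlignmentDispatch and chooseUnit are hypothetical names, the addresses are plain long values supplied by the caller, and the unit sizes 8, 4, and 2 stand in for sizeof(jlong), sizeof(jint), and sizeof(jshort).

```java
// Hypothetical model of how the widest usable copy unit is chosen.
final class AlignmentDispatch {
    // OR-ing the three values keeps every low bit that is set in any of them,
    // so the result is a multiple of N only if src, dst, and size all are.
    static String chooseUnit(long srcAddr, long dstAddr, long sizeInBytes) {
        long bits = srcAddr | dstAddr | sizeInBytes;
        if (bits % 8 == 0) {
            return "jlong";
        } else if (bits % 4 == 0) {
            return "jint";
        } else if (bits % 2 == 0) {
            return "jshort";
        } else {
            return "jbyte";
        }
    }

    public static void main(String[] args) {
        // Addresses below are made-up values chosen only for their alignment.
        System.out.println(chooseUnit(0x1000, 0x2000, 400000));
        System.out.println(chooseUnit(0x1004, 0x2000, 400000));
        System.out.println(chooseUnit(0x1000, 0x2000, 6));
        System.out.println(chooseUnit(0x1001, 0x2000, 8));
    }
}
```

A single misaligned address or an odd byte count is enough to push the copy down to a narrower unit.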
Copy::conjoint_memory_atomic thus decides which copy function to call. Following conjoint_jints_atomic as our example, look further in openjdk6-src/hotspot/src/share/vm/utilities/copy.hpp:
// jints, conjoint, atomic on each jint
static void conjoint_jints_atomic(jint* from, jint* to, size_t count) {
  assert_params_ok(from, to, LogBytesPerInt);
  pd_conjoint_jints_atomic(from, to, count);
}
Going one level further down, in openjdk6-src/hotspot/src/cpu/zero/vm/copy_zero.hpp:
static void pd_conjoint_jints_atomic(jint* from, jint* to, size_t count) {
  _Copy_conjoint_jints_atomic(from, to, count);
}
And one level further still, in openjdk6-src/hotspot/src/os_cpu/linux_zero/vm/os_linux_zero.cpp:
void _Copy_conjoint_jints_atomic(jint* from, jint* to, size_t count) {
  if (from > to) {
    jint *end = from + count;
    while (from < end)
      *(to++) = *(from++);
  }
  else if (from < to) {
    jint *end = from;
    from += count - 1;
    to += count - 1;
    while (from >= end)
      *(to--) = *(from--);
  }
}
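As you can see, at this level the copy is nothing more than direct assignment over a block of memory, one jint after another. That avoids all the time spent going back and forth through array references, which is where the speedup comes from.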
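The two branches in _Copy_conjoint_jints_atomic exist because the source and destination ranges may overlap ("conjoint"). The Java sketch below models that direction choice with array indices in place of raw pointers and compares the result with System.arraycopy on the same overlapping range. ConjointCopy and conjointCopy are hypothetical names used only for this illustration.

```java
import java.util.Arrays;

// Hypothetical index-based model of an overlap-safe copy.
final class ConjointCopy {
    // Copies count ints inside one array from index `from` to index `to`,
    // picking the direction so that overlapping source data is read
    // before it is overwritten.
    static void conjointCopy(int[] mem, int from, int to, int count) {
        if (from > to) {
            // Destination is below the source: copy front to back.
            for (int i = 0; i < count; i++) {
                mem[to + i] = mem[from + i];
            }
        } else if (from < to) {
            // Destination is above the source: copy back to front.
            for (int i = count - 1; i >= 0; i--) {
                mem[to + i] = mem[from + i];
            }
        }
        // from == to: source and destination coincide, nothing to do.
    }

    public static void main(String[] args) {
        int[] x = {1, 2, 3, 4, 5, 0, 0};
        int[] y = x.clone();
        conjointCopy(x, 0, 2, 5);        // shift five elements right by two
        System.arraycopy(y, 0, y, 2, 5); // same shift with the library call
        System.out.println(Arrays.toString(x));
        System.out.println(Arrays.equals(x, y));
    }
}
```

Copying front to back in the second case would overwrite elements 2 and 3 before they had been read, which is exactly what the backward loop avoids.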