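While reading the source of ArrayList and Vector earlier, I noticed that both classes in the Sun JDK rely heavily on System.arraycopy to manipulate their data, particularly for moving elements within a single array and for copying elements between different arrays.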
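Some Java optimization material I found online also recommends System.arraycopy for bulk array operations. As those sources describe it, the idea is to let the processor handle many array elements with a single instruction, somewhat like the string instructions in x86 assembly (LODSB, LODSW, LODSD, STOSB, STOSW, STOSD): you set up the starting pointer and let the instruction loop. Each execution advances the pointer by one element, and the loop runs once per element to be processed.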
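The two usage patterns mentioned above (shifting elements inside one array, and copying between two arrays) can be sketched as follows. This is only an illustration: the class ArrayCopyUsage and its methods insertAt and concat are hypothetical names, and insertAt is merely similar in spirit to what ArrayList does, not a copy of its code.

```java
import java.util.Arrays;

// Hypothetical demo class; not taken from the JDK sources.
final class ArrayCopyUsage {
    // Insert `value` at `index` in an array holding `size` live elements.
    // Grows the array first if it is full, then shifts the tail right by one.
    static int[] insertAt(int[] data, int size, int index, int value) {
        int[] target = (size == data.length)
                ? Arrays.copyOf(data, Math.max(1, size * 2))
                : data;
        // Same-array move: source and destination ranges overlap,
        // which System.arraycopy is specified to handle correctly.
        System.arraycopy(target, index, target, index + 1, size - index);
        target[index] = value;
        return target;
    }

    // Copy between different arrays: concatenate a and b into a new array.
    static int[] concat(int[] a, int[] b) {
        int[] out = new int[a.length + b.length];
        System.arraycopy(a, 0, out, 0, a.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }

    public static void main(String[] args) {
        int[] buf = {1, 2, 4, 5, 0};                       // 4 live elements, 1 spare slot
        System.out.println(Arrays.toString(insertAt(buf, 4, 2, 3)));   // [1, 2, 3, 4, 5]
        System.out.println(Arrays.toString(concat(new int[] {1, 2}, new int[] {3}))); // [1, 2, 3]
    }
}
```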
The source of the java.lang.System class shows:
    public static native void arraycopy(Object src, int srcPos,
                                        Object dest, int destPos,
                                        int length);
arraycopy is a native method.
In the OpenJDK source package, the file “openjdk6-src\hotspot\src\share\vm\prims\jvm.cpp” contains its entry point, the “JVM_ArrayCopy” function:
    JVM_ENTRY(void, JVM_ArrayCopy(JNIEnv *env, jclass ignored, jobject src, jint src_pos,
                                  jobject dst, jint dst_pos, jint length))
      JVMWrapper("JVM_ArrayCopy");
      // Check if we have null pointers
      if (src == NULL || dst == NULL) {
        THROW(vmSymbols::java_lang_NullPointerException());
      }
      arrayOop s = arrayOop(JNIHandles::resolve_non_null(src));
      arrayOop d = arrayOop(JNIHandles::resolve_non_null(dst));
      assert(s->is_oop(), "JVM_ArrayCopy: src not an oop");
      assert(d->is_oop(), "JVM_ArrayCopy: dst not an oop");
      // Do copy
      Klass::cast(s->klass())->copy_array(s, src_pos, d, dst_pos, length, thread);
    JVM_END
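Most of the code above only validates the arguments; the final call to copy_array is what actually performs the copy. copy_array comes in two versions, one for type arrays and one for object arrays. The difference between the two was not obvious to me at first, but judging from the code of the two versions, a type array is an array of a Java primitive type, while an object array is an array whose elements are object references rather than primitives.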
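The split between primitive arrays and object arrays is also visible from the Java side, because System.arraycopy refuses to mix the two kinds. A small sketch, assuming a hypothetical helper tryCopy that reports which exception (if any) a copy raises:

```java
// Hypothetical demo class; tryCopy is an illustrative helper, not a JDK API.
final class ArrayKindDemo {
    // Returns "ok" if the copy succeeds, otherwise the simple name of the exception.
    static String tryCopy(Object src, Object dst, int length) {
        try {
            System.arraycopy(src, 0, dst, 0, length);
            return "ok";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Primitive array to primitive array of the same element type.
        System.out.println(tryCopy(new int[2], new int[2], 2));        // ok
        // Different primitive element types are rejected, with no widening.
        System.out.println(tryCopy(new int[2], new long[2], 2));       // ArrayStoreException
        // Primitive array to object array is rejected as well, with no boxing.
        System.out.println(tryCopy(new int[2], new Integer[2], 2));    // ArrayStoreException
        // Object arrays are checked element by element at copy time.
        System.out.println(tryCopy(new Object[] {"a", "b"}, new String[2], 2)); // ok
        System.out.println(tryCopy(new Object[] {"a", 1}, new String[2], 2));   // ArrayStoreException
        // A null source or destination fails before any copying is attempted.
        System.out.println(tryCopy(null, new int[2], 2));              // NullPointerException
    }
}
```

Note that in the failing object-array case the elements before the offending one have already been copied, as the System.arraycopy documentation states.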
The copy_array function found in “openjdk6-src\hotspot\src\share\vm\oops\typeArrayKlass.cpp”:
    void typeArrayKlass::copy_array(arrayOop s, int src_pos, arrayOop d, int dst_pos, int length, TRAPS) {
      assert(s->is_typeArray(), "must be type array");

      // Check destination
      if (!d->is_typeArray() || element_type() != typeArrayKlass::cast(d->klass())->element_type()) {
        THROW(vmSymbols::java_lang_ArrayStoreException());
      }

      // Check is all offsets and lengths are non negative
      if (src_pos < 0 || dst_pos < 0 || length < 0) {
        THROW(vmSymbols::java_lang_ArrayIndexOutOfBoundsException());
      }
      // Check if the ranges are valid
      if ( (((unsigned int) length + (unsigned int) src_pos) > (unsigned int) s->length())
         || (((unsigned int) length + (unsigned int) dst_pos) > (unsigned int) d->length()) ) {
        THROW(vmSymbols::java_lang_ArrayIndexOutOfBoundsException());
      }
      // Check zero copy
      if (length == 0)
        return;

      // This is an attempt to make the copy_array fast.
      int l2es = log2_element_size();
      int ihs = array_header_in_bytes() / wordSize;
      char* src = (char*) ((oop*)s + ihs) + ((size_t)src_pos << l2es);
      char* dst = (char*) ((oop*)d + ihs) + ((size_t)dst_pos << l2es);
      Copy::conjoint_memory_atomic(src, dst, (size_t)length << l2es);
    }
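The long first part again validates the arguments and throws the corresponding exception when a check fails. The last five lines compute the byte addresses of the first source and destination elements from the array header size and the element size, and then call conjoint_memory_atomic, which is where the elements actually start to be copied.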
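The range check above casts to unsigned before adding because the sum of two non-negative ints can overflow a signed int. The same checks and the offset arithmetic can be sketched in Java; the class and method names below are hypothetical, and the code only mirrors the logic of the C++ function rather than reproducing HotSpot behavior exactly.

```java
// Hypothetical sketch mirroring the checks in typeArrayKlass::copy_array.
final class TypeArrayCopyChecks {
    // True when both [srcPos, srcPos + length) and [dstPos, dstPos + length)
    // fit inside arrays of length srcLen and dstLen respectively.
    static boolean rangesValid(int srcPos, int dstPos, int length, int srcLen, int dstLen) {
        if (srcPos < 0 || dstPos < 0 || length < 0) {
            return false;
        }
        // length + pos may wrap to a negative int; comparing as unsigned
        // treats the wrapped value as a large number, so the check still fails.
        return Integer.compareUnsigned(length + srcPos, srcLen) <= 0
            && Integer.compareUnsigned(length + dstPos, dstLen) <= 0;
    }

    // A naive signed version, shown only to illustrate the overflow problem.
    static boolean naiveSrcRangeValid(int srcPos, int length, int srcLen) {
        return length + srcPos <= srcLen;
    }

    // Byte offset of element `pos` relative to the first element, given
    // log2 of the element size (2 for int, 3 for long), as in `pos << l2es`.
    static long elementByteOffset(int pos, int log2ElementSize) {
        return ((long) pos) << log2ElementSize;
    }

    public static void main(String[] args) {
        System.out.println(rangesValid(0, 0, 5, 5, 5));                    // true
        System.out.println(rangesValid(Integer.MAX_VALUE, 0, 1, 10, 10));  // false
        System.out.println(naiveSrcRangeValid(Integer.MAX_VALUE, 1, 10));  // true (wrongly accepted)
        System.out.println(elementByteOffset(3, 2));                       // 12
    }
}
```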
conjoint_memory_atomic is defined in “openjdk6-src\hotspot\src\share\vm\utilities\copy.cpp”:
    // Copy bytes; larger units are filled atomically if everything is aligned.
    void Copy::conjoint_memory_atomic(void* from, void* to, size_t size) {
      address src = (address) from;
      address dst = (address) to;
      uintptr_t bits = (uintptr_t) src | (uintptr_t) dst | (uintptr_t) size;

      // (Note:  We could improve performance by ignoring the low bits of size,
      // and putting a short cleanup loop after each bulk copy loop.
      // There are plenty of other ways to make this faster also,
      // and it's a slippery slope.  For now, let's keep this code simple
      // since the simplicity helps clarify the atomicity semantics of
      // this operation.  There are also CPU-specific assembly versions
      // which may or may not want to include such optimizations.)

      if (bits % sizeof(jlong) == 0) {
        Copy::conjoint_jlongs_atomic((jlong*) src, (jlong*) dst, size / sizeof(jlong));
      } else if (bits % sizeof(jint) == 0) {
        Copy::conjoint_jints_atomic((jint*) src, (jint*) dst, size / sizeof(jint));
      } else if (bits % sizeof(jshort) == 0) {
        Copy::conjoint_jshorts_atomic((jshort*) src, (jshort*) dst, size / sizeof(jshort));
      } else {
        // Not aligned, so no need to be atomic.
        Copy::conjoint_jbytes((void*) src, (void*) dst, size);
      }
    }
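The dispatch in conjoint_memory_atomic ORs the source address, the destination address, and the byte count together, so the low bits of the result are zero only if all three are aligned to the unit in question. A minimal Java sketch of that selection, using plain long values in place of real addresses (chooseUnitBytes is a hypothetical name):

```java
// Hypothetical sketch of the alignment-based dispatch; addresses are simulated as longs.
final class CopyUnitChooser {
    // Returns the widest copy unit in bytes (8, 4, 2 or 1) such that the
    // source address, destination address and byte count are all multiples of it.
    static int chooseUnitBytes(long srcAddr, long dstAddr, long sizeBytes) {
        long bits = srcAddr | dstAddr | sizeBytes;
        if ((bits & 7) == 0) return 8;   // everything 8-byte aligned: copy as jlongs
        if ((bits & 3) == 0) return 4;   // 4-byte aligned: copy as jints
        if ((bits & 1) == 0) return 2;   // 2-byte aligned: copy as jshorts
        return 1;                        // unaligned: plain byte copy
    }

    public static void main(String[] args) {
        System.out.println(chooseUnitBytes(16, 32, 64));  // 8
        System.out.println(chooseUnitBytes(16, 36, 64));  // 4
        System.out.println(chooseUnitBytes(16, 32, 6));   // 2
        System.out.println(chooseUnitBytes(17, 32, 8));   // 1
    }
}
```

One consequence is that the unit chosen need not match the Java element type: an int[] copy with an even element count and 8-byte-aligned addresses would take the jlong path under this rule.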
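conjoint_memory_atomic picks a copy routine based on the alignment of the source address, the destination address, and the byte count, not on the declared element type. The routines are all very similar, so here we only look at the implementation of conjoint_jints_atomic. First, in “openjdk6-src\hotspot\src\share\vm\utilities\copy.hpp” we find: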
    // jints, conjoint, atomic on each jint
    static void conjoint_jints_atomic(jint* from, jint* to, size_t count) {
      assert_params_ok(from, to, LogBytesPerInt);
      pd_conjoint_jints_atomic(from, to, count);
    }
Following pd_conjoint_jints_atomic leads to “openjdk6-src\hotspot\src\cpu\zero\vm\copy_zero.hpp”:
    static void pd_conjoint_jints_atomic(jint* from, jint* to, size_t count) {
      _Copy_conjoint_jints_atomic(from, to, count);
    }
That in turn leads to the file “openjdk6-src\hotspot\src\os_cpu\linux_zero\vm\os_linux_zero.cpp”:
    void _Copy_conjoint_jints_atomic(jint* from, jint* to, size_t count) {
      if (from > to) {
        jint *end = from + count;
        while (from < end)
          *(to++) = *(from++);
      }
      else if (from < to) {
        jint *end = from;
        from += count - 1;
        to   += count - 1;
        while (from >= end)
          *(to--) = *(from--);
      }
    }
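At this point, _Copy_conjoint_jints_atomic turns out to be a classic memory-block copy routine: it copies forward when the source lies above the destination and backward when it lies below, so overlapping ranges are handled correctly.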
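The direction choice matters only when the two ranges overlap. A small Java sketch, working on array indices instead of raw pointers, shows the same forward/backward logic next to a naive forward-only loop; the names conjointCopy and naiveForwardCopy are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch of an overlap-safe copy within one int array.
final class ConjointCopy {
    // Copies `count` elements from index `from` to index `to`,
    // choosing the direction so that no source element is overwritten before it is read.
    static void conjointCopy(int[] a, int from, int to, int count) {
        if (from > to) {
            for (int i = 0; i < count; i++) {          // forward
                a[to + i] = a[from + i];
            }
        } else if (from < to) {
            for (int i = count - 1; i >= 0; i--) {     // backward
                a[to + i] = a[from + i];
            }
        }
        // from == to: nothing to do
    }

    // Always copies forward; wrong when the destination overlaps the source from above.
    static void naiveForwardCopy(int[] a, int from, int to, int count) {
        for (int i = 0; i < count; i++) {
            a[to + i] = a[from + i];
        }
    }

    public static void main(String[] args) {
        int[] safe = {1, 2, 3, 4, 5};
        conjointCopy(safe, 0, 1, 4);
        System.out.println(Arrays.toString(safe));     // [1, 1, 2, 3, 4]

        int[] naive = {1, 2, 3, 4, 5};
        naiveForwardCopy(naive, 0, 1, 4);
        System.out.println(Arrays.toString(naive));    // [1, 1, 1, 1, 1]
    }
}
```

System.arraycopy gives the same result as conjointCopy here, since its documentation specifies that an overlapping copy behaves as if the source range were first copied to a temporary array.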
In the same file we can also find the _Copy_conjoint_jlongs_atomic function:
    void _Copy_conjoint_jlongs_atomic(jlong* from, jlong* to, size_t count) {
      if (from > to) {
        jlong *end = from + count;
        while (from < end)
          os::atomic_copy64(from++, to++);
      }
      else if (from < to) {
        jlong *end = from;
        from += count - 1;
        to   += count - 1;