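While working on a project recently, I ran into a problem: code that looked correct would intermittently throw an ArrayIndexOutOfBoundsException, which puzzled me for quite a while. Below is a simplified version of that code, reduced to what is needed to illustrate the problem.
1. Code and error message
The code is as follows: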
import java.util.ArrayList;
import java.util.List;
/**
 * Tests how StringBuilder and StringBuffer behave differently under multithreading.
 * Note: StringBuilder is not thread-safe, but it is faster;
 * StringBuffer is thread-safe, but it is slower.
 * @author Herry
 */
public class StringContactTest {
    public static void main(String[] args) {
        // Test concatenation with StringBuilder
        for (int i = 0; i < 20; i++) {
            stringContactWithBuilder();
        }
    }
    // Concatenate strings with StringBuilder
    public static void stringContactWithBuilder() {
        // Data to concatenate
        List<String> dataList = new ArrayList<>();
        // Fill in some sample values
        for (int i = 0; i < 20; i++) {
            dataList.add("data" + i);
        }
        StringBuilder stringBuilder = new StringBuilder();
        dataList.parallelStream().forEach(data -> {
            // Not thread-safe: stringBuilder is shared by every worker thread of the parallel stream
            stringBuilder.append(data);
            StringBuilder stringBuilder2 = new StringBuilder();
            dataList.parallelStream().forEach(data2 -> {
                // Same problem: stringBuilder2 is shared by the threads of the inner parallel stream
                stringBuilder2.append(data2);
            });
            System.out.println(stringBuilder2.toString());
        });
        System.out.println(stringBuilder.toString());
    }
}
The error message is as follows:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at java.util.concurrent.ForkJoinTask.getThrowableException(Unknown Source)
at java.util.concurrent.ForkJoinTask.reportException(Unknown Source)
at java.util.concurrent.ForkJoinTask.invoke(Unknown Source)
at java.util.stream.ForEachOps$ForEachOp.evaluateParallel(Unknown Source)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(Unknown Source)
at java.util.stream.AbstractPipeline.evaluate(Unknown Source)
at java.util.stream.ReferencePipeline.forEach(Unknown Source)
at java.util.stream.ReferencePipeline$Head.forEach(Unknown Source)
at com.liu.date20170625.StringContactTest.stringContactWithBuilder(StringContactTest.java:32)
at com.liu.date20170625.StringContactTest.main(StringContactTest.java:17)
Caused by: java.lang.ArrayIndexOutOfBoundsException
at java.lang.System.arraycopy(Native Method)
at java.lang.String.getChars(Unknown Source)
at java.lang.AbstractStringBuilder.append(Unknown Source)
at java.lang.StringBuilder.append(Unknown Source)
at com.liu.date20170625.StringContactTest.lambda$2(StringContactTest.java:36)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(Unknown Source)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(Unknown Source)
at java.util.stream.AbstractPipeline.copyInto(Unknown Source)
at java.util.stream.ForEachOps$ForEachTask.compute(Unknown Source)
at java.util.concurrent.CountedCompleter.exec(Unknown Source)
at java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
at java.util.concurrent.ForkJoinPool.helpComplete(Unknown Source)
at java.util.concurrent.ForkJoinPool.awaitJoin(Unknown Source)
at java.util.concurrent.ForkJoinTask.doInvoke(Unknown Source)
at java.util.concurrent.ForkJoinTask.invoke(Unknown Source)
at java.util.stream.ForEachOps$ForEachOp.evaluateParallel(Unknown Source)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(Unknown Source)
at java.util.stream.AbstractPipeline.evaluate(Unknown Source)
at java.util.stream.ReferencePipeline.forEach(Unknown Source)
at java.util.stream.ReferencePipeline$Head.forEach(Unknown Source)
at com.liu.date20170625.StringContactTest.lambda$0(StringContactTest.java:35)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(Unknown Source)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(Unknown Source)
at java.util.stream.AbstractPipeline.copyInto(Unknown Source)
at java.util.stream.ForEachOps$ForEachTask.compute(Unknown Source)
at java.util.concurrent.CountedCompleter.exec(Unknown Source)
at java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
at java.util.concurrent.ForkJoinPool$WorkQueue.execLocalTasks(Unknown Source)
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(Unknown Source)
at java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
at java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
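2. Analysis
At first I did not know where this exception was thrown from. After reading the source code I found that it comes from the append() method. The analysis follows.
According to the stack trace, the exception occurs at line 36 (the append() call inside the inner lambda; the line numbers refer to the original file, not to the simplified listing above). Let's trace into the source from there. Stepping into StringBuilder's append() method, the code is:
public StringBuilder append(String str) {
    super.append(str);
    return this;
}
StringBuilder's append() is implemented by calling append() of its parent class, AbstractStringBuilder. Here is the implementation of AbstractStringBuilder's append():
public AbstractStringBuilder append(String str) {
    if (str == null)
        return appendNull();
    int len = str.length();
    ensureCapacityInternal(count + len);
    str.getChars(0, len, value, count);
    count += len;
    return this;
}
As you can see, before appending, AbstractStringBuilder's append() first calls ensureCapacityInternal() to check whether there is enough space; if there is not, it grows the array and only then appends the string. However, the comment on ensureCapacityInternal() (shown below) states explicitly that the method is never synchronized, which means it is not safe to use from multiple threads.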
/**
* For positive values of {@code minimumCapacity}, this method
* behaves like {@code ensureCapacity}, however it is never
* synchronized.
* If {@code minimumCapacity} is non positive due to numeric
* overflow, this method throws {@code OutOfMemoryError}.
*/
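To make the "check, then grow" step more concrete, here is a small sketch that observes it through the public capacity() and ensureCapacity() methods, which the comment above says ensureCapacityInternal() behaves like. It relies on the documented initial capacity of 16 and the documented growth rule for ensureCapacity() (the larger of the requested minimum and twice the old capacity plus 2). The class name CapacityDemo and the method observe() are made up for illustration.

```java
class CapacityDemo {
    // Returns the capacity before and after asking for one more slot than is available.
    static int[] observe() {
        StringBuilder sb = new StringBuilder();  // documented initial capacity: 16
        int before = sb.capacity();
        sb.ensureCapacity(before + 1);           // not enough room, so the backing array grows
        int after = sb.capacity();               // larger of (before + 1) and (before * 2 + 2)
        return new int[] { before, after };
    }

    public static void main(String[] args) {
        int[] c = observe();
        System.out.println("before=" + c[0] + ", after=" + c[1]);
    }
}
```

Growing replaces the backing array and nothing in this path takes a lock, which is why two threads running it at the same time can step on each other.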
Once the capacity check has passed, String's getChars() method is called to copy the characters into the buffer. getChars() is implemented as follows:
public void getChars(int srcBegin, int srcEnd, char dst[], int dstBegin) {
    if (srcBegin < 0) {
        throw new StringIndexOutOfBoundsException(srcBegin);
    }
    if (srcEnd > value.length) {
        throw new StringIndexOutOfBoundsException(srcEnd);
    }
    if (srcBegin > srcEnd) {
        throw new StringIndexOutOfBoundsException(srcEnd - srcBegin);
    }
    System.arraycopy(value, srcBegin, dst, dstBegin, srcEnd - srcBegin);
}
Tracing further, the source of arraycopy() is:
public static native void arraycopy(Object src, int srcPos,
Object dest, int destPos,
int length);
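This is a native method. Up to this point we still have not found where the ArrayIndexOutOfBoundsException is thrown, but arraycopy() is the only remaining place that could throw it. Because it is a native method, its implementation is not in the JDK's Java sources (it lives in the HotSpot VM), so I looked it up online. It is as follows:
/*
 * The arraycopy method of java.lang.System
 */
JVM_ENTRY(void, JVM_ArrayCopy(JNIEnv *env, jclass ignored, jobject src, jint src_pos,
                              jobject dst, jint dst_pos, jint length))
  JVMWrapper("JVM_ArrayCopy");
  // Check if we have null pointers
  // (neither the source nor the destination array may be null)
  if (src == NULL || dst == NULL) {
    THROW(vmSymbols::java_lang_NullPointerException());
  }
  arrayOop s = arrayOop(JNIHandles::resolve_non_null(src));
  arrayOop d = arrayOop(JNIHandles::resolve_non_null(dst));
  assert(s->is_oop(), "JVM_ArrayCopy: src not an oop");
  assert(d->is_oop(), "JVM_ArrayCopy: dst not an oop");
  // Do copy
  // (this is the call that actually performs the copy)
  s->klass()->copy_array(s, src_pos, d, dst_pos, length, thread);
JVM_END
The function above does not perform the copy itself. It only checks that the source and destination arrays are not null, ruling out some error cases, and then delegates. The method that does the actual work is shown below. Note that this listing is the ObjArrayKlass version, which handles object arrays; a char[] such as StringBuilder's buffer goes through the TypeArrayKlass counterpart, which, as far as I can tell, performs the same kind of bounds checks.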
/*
 * The actual implementation behind java.lang.System's arraycopy method
 */
void ObjArrayKlass::copy_array(arrayOop s, int src_pos, arrayOop d,
                               int dst_pos, int length, TRAPS) {
  // s must be an object array
  assert(s->is_objArray(), "must be obj array");
  // If the destination is not an object array, throw ArrayStoreException
  if (!d->is_objArray()) {
    THROW(vmSymbols::java_lang_ArrayStoreException());
  }
  // Check is all offsets and lengths are non negative
  // (the position and length arguments must not be negative)
  if (src_pos < 0 || dst_pos < 0 || length < 0) {
    THROW(vmSymbols::java_lang_ArrayIndexOutOfBoundsException());
  }
  // Check if the ranges are valid
  // (the requested ranges must lie inside both arrays)
  if ( (((unsigned int) length + (unsigned int) src_pos) > (unsigned int) s->length())
     || (((unsigned int) length + (unsigned int) dst_pos) > (unsigned int) d->length()) ) {
    THROW(vmSymbols::java_lang_ArrayIndexOutOfBoundsException());
  }
  // Special case. Boundary cases must be checked first
  // This allows the following call: copy_array(s, s.length(), d.length(), 0).
  // This is correct, since the position is supposed to be an 'in between point', i.e., s.length(),
  // points to the right of the last element.
  // (nothing to copy when length == 0)
  if (length==0) {
    return;
  }
  // UseCompressedOops only selects between narrowOop and oop; how the two differ needs further study
  // Call do_copy to perform the copy
  if (UseCompressedOops) {
    narrowOop* const src = objArrayOop(s)->obj_at_addr(src_pos);
    narrowOop* const dst = objArrayOop(d)->obj_at_addr(dst_pos);
    do_copy(s, src, d, dst, length, CHECK);
  } else {
    oop* const src = objArrayOop(s)->obj_at_addr(src_pos);
    oop* const dst = objArrayOop(d)->obj_at_addr(dst_pos);
    do_copy (s, src, d, dst, length, CHECK);
  }
}
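The two range checks in copy_array() can be mirrored in Java to see which argument combinations are rejected. The sketch below is only an illustration: the class ArrayCopyBounds and both helper methods are hypothetical names, and the predicate is a transcription of the checks in the listing above, not JDK code. It compares that predicate with what the real System.arraycopy() does on char[] arrays.

```java
class ArrayCopyBounds {
    // Mirrors the negative-argument check and the two range checks from the listing above.
    static boolean outOfBounds(int srcLen, int srcPos, int dstLen, int dstPos, int length) {
        if (srcPos < 0 || dstPos < 0 || length < 0) {
            return true;
        }
        // Compare as unsigned values, like the C++ casts, so an overflowing sum is still caught.
        return Integer.compareUnsigned(length + srcPos, srcLen) > 0
            || Integer.compareUnsigned(length + dstPos, dstLen) > 0;
    }

    // Calls the real System.arraycopy and reports whether it rejected the arguments.
    static boolean arraycopyThrows(int srcLen, int srcPos, int dstLen, int dstPos, int length) {
        try {
            System.arraycopy(new char[srcLen], srcPos, new char[dstLen], dstPos, length);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Copying 40 chars into a 100-slot array at offset 90: only 10 slots remain.
        System.out.println(outOfBounds(40, 0, 100, 90, 40));
        System.out.println(arraycopyThrows(40, 0, 100, 90, 40));
    }
}
```

The case in main() is the situation StringBuilder ends up in when its count has moved further than the capacity check assumed.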
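The goal here was only to find where and why the exception is thrown, so I will not go any further; for example, I did not dig into the implementation of do_copy(). Combined with the fact that ensureCapacityInternal() is not synchronized, the cause is now clear: append() is being called from multiple threads (parallelStream, a parallel stream), so two or more threads can pass the capacity check in ensureCapacityInternal() even though there is not enough room for all of them, and the copy then runs past the end of the array. A simple example makes this easier to follow.
Suppose two threads, A and B, each need to append a string of length 40, and the buffer currently has 50 free slots. A passes the ensureCapacityInternal() check and is suspended before it executes getChars(). B then runs ensureCapacityInternal(), and its check also passes, because 40 < 50. When A and B go on to copy their data, whichever thread copies second throws the out-of-bounds exception: once the first thread has finished copying and advanced count, only 10 free slots remain, and since 10 < 40 the second copy does not fit.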
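The A/B scenario can be replayed deterministically on a single thread by performing the steps in the unlucky order by hand. This is only a sketch of that one interleaving, not a multithreaded reproduction: the class RaceReplay and its local variables are made up for illustration, a plain char[] and int stand in for AbstractStringBuilder's value and count fields, and the capacity of 100 is an assumption chosen so that 50 slots are free.

```java
class RaceReplay {
    // Replays the interleaving step by step; returns true if the second copy goes out of bounds.
    static boolean replay() {
        char[] value = new char[100];  // stand-in for the builder's backing array
        int count = 50;                // 50 slots used, 50 free
        char[] a = new char[40];       // data thread A wants to append
        char[] b = new char[40];       // data thread B wants to append

        // Step 1: A runs the capacity check, then is suspended before copying.
        boolean aFits = count + a.length <= value.length;
        // Step 2: B runs the same check against the same, still unchanged count.
        boolean bFits = count + b.length <= value.length;
        if (!aFits || !bFits) {
            throw new IllegalStateException("both checks are expected to pass");
        }

        // Step 3: A resumes, copies its data and advances count to 90.
        System.arraycopy(a, 0, value, count, a.length);
        count += a.length;

        // Step 4: B copies at the updated count; 90 + 40 exceeds the array length of 100.
        try {
            System.arraycopy(b, 0, value, count, b.length);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("second copy out of bounds: " + replay());
    }
}
```

Neither check triggers a resize, because each one on its own sees enough room. The failure only appears when the two appends are combined.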
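3. Conclusion
So for the sample code at the beginning there are two possible fixes: one is to change the parallel stream (parallelStream) into a sequential stream (stream); the other is to replace the non-thread-safe StringBuilder with the thread-safe StringBuffer. Note that with a parallel stream and StringBuffer the appends no longer fail, but the order in which the elements are appended is not guaranteed.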
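As a rough sketch, the two fixes could look like this. The class SafeConcat and its method names are invented for the example. The third method, which uses Collectors.joining(), is not one of the fixes described in this post; it is included as an assumption about what is often preferable, because it lets the stream do the concatenation and keeps the list order even when run in parallel.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class SafeConcat {
    // Same sample data as in the listing at the top of the post.
    static List<String> sampleData() {
        List<String> dataList = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            dataList.add("data" + i);
        }
        return dataList;
    }

    // Fix 1: sequential stream, so only one thread ever touches the StringBuilder.
    static String withSequentialStream(List<String> dataList) {
        StringBuilder sb = new StringBuilder();
        dataList.stream().forEach(s -> sb.append(s));
        return sb.toString();
    }

    // Fix 2: StringBuffer, whose append() is synchronized. No exception, but no fixed order either.
    static String withStringBuffer(List<String> dataList) {
        StringBuffer sb = new StringBuffer();
        dataList.parallelStream().forEach(s -> sb.append(s));
        return sb.toString();
    }

    // Alternative (not from the original post): no shared mutable state at all.
    static String withJoining(List<String> dataList) {
        return dataList.parallelStream().collect(Collectors.joining());
    }

    public static void main(String[] args) {
        List<String> data = sampleData();
        System.out.println(withSequentialStream(data));
        System.out.println(withStringBuffer(data));
        System.out.println(withJoining(data));
    }
}
```

The first and third methods produce the elements in list order. The StringBuffer version contains the same characters, but possibly in a different order from run to run.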
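I had not written much multithreaded code before, so when I hit this problem I did not know where to start, which is why I decided to dig into the cause. I knew in principle that StringBuilder is not thread-safe and that StringBuffer is, but in day-to-day use it is easy to overlook. I am recording the problem and its solution here as a reminder to myself, and I hope it is of some help to you as well. If anything here is wrong, corrections are very welcome.
[Reference blog] http://blog.csdn.net/u011642663/article/details/49512643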