Investigating an ArrayIndexOutOfBoundsException Thrown by StringBuilder

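While working on a project recently, I ran into a problem: code that looked fine would intermittently throw an ArrayIndexOutOfBoundsException, which puzzled me for a while. Below is a simplified version of that code, reduced to what is needed to show the problem.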

1. Code and error output
The code is as follows:


import java.util.ArrayList;
import java.util.List;

/**
 * Tests how StringBuilder and StringBuffer behave differently under multithreading.
 * Note: StringBuilder is not thread-safe but is faster;
 *       StringBuffer is thread-safe but is slower.
 * @author Herry
 */
public class StringContactTest {

    public static void main(String[] args) {        
        // Exercise the StringBuilder concatenation path
        for(int i=0; i < 20; i++) {         
            stringContactWithBuilder();         
        }   
    }

    // Concatenate strings with a StringBuilder
    public static void stringContactWithBuilder(){
        // Data to concatenate
        List<String> dataList = new ArrayList<>();
        // Fill with sample values
        for (int i = 0; i < 20; i++) {
            dataList.add("data" + i);
        }

        StringBuilder stringBuilder = new StringBuilder();
        dataList.parallelStream().forEach(data -> {     
            stringBuilder.append(data);     
            StringBuilder stringBuilder2 = new StringBuilder();
            dataList.parallelStream().forEach(data2 -> {                
                stringBuilder2.append(data2);               
            });
            System.out.println(stringBuilder2.toString());          
        });     
        System.out.println(stringBuilder.toString());       
    }

}

The error output is as follows:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
    at java.lang.reflect.Constructor.newInstance(Unknown Source)
    at java.util.concurrent.ForkJoinTask.getThrowableException(Unknown Source)
    at java.util.concurrent.ForkJoinTask.reportException(Unknown Source)
    at java.util.concurrent.ForkJoinTask.invoke(Unknown Source)
    at java.util.stream.ForEachOps$ForEachOp.evaluateParallel(Unknown Source)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(Unknown Source)
    at java.util.stream.AbstractPipeline.evaluate(Unknown Source)
    at java.util.stream.ReferencePipeline.forEach(Unknown Source)
    at java.util.stream.ReferencePipeline$Head.forEach(Unknown Source)
    at com.liu.date20170625.StringContactTest.stringContactWithBuilder(StringContactTest.java:32)
    at com.liu.date20170625.StringContactTest.main(StringContactTest.java:17)
Caused by: java.lang.ArrayIndexOutOfBoundsException
    at java.lang.System.arraycopy(Native Method)
    at java.lang.String.getChars(Unknown Source)
    at java.lang.AbstractStringBuilder.append(Unknown Source)
    at java.lang.StringBuilder.append(Unknown Source)
    at com.liu.date20170625.StringContactTest.lambda$2(StringContactTest.java:36)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(Unknown Source)
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(Unknown Source)
    at java.util.stream.AbstractPipeline.copyInto(Unknown Source)
    at java.util.stream.ForEachOps$ForEachTask.compute(Unknown Source)
    at java.util.concurrent.CountedCompleter.exec(Unknown Source)
    at java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
    at java.util.concurrent.ForkJoinPool.helpComplete(Unknown Source)
    at java.util.concurrent.ForkJoinPool.awaitJoin(Unknown Source)
    at java.util.concurrent.ForkJoinTask.doInvoke(Unknown Source)
    at java.util.concurrent.ForkJoinTask.invoke(Unknown Source)
    at java.util.stream.ForEachOps$ForEachOp.evaluateParallel(Unknown Source)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(Unknown Source)
    at java.util.stream.AbstractPipeline.evaluate(Unknown Source)
    at java.util.stream.ReferencePipeline.forEach(Unknown Source)
    at java.util.stream.ReferencePipeline$Head.forEach(Unknown Source)
    at com.liu.date20170625.StringContactTest.lambda$0(StringContactTest.java:35)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(Unknown Source)
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(Unknown Source)
    at java.util.stream.AbstractPipeline.copyInto(Unknown Source)
    at java.util.stream.ForEachOps$ForEachTask.compute(Unknown Source)
    at java.util.concurrent.CountedCompleter.exec(Unknown Source)
    at java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
    at java.util.concurrent.ForkJoinPool$WorkQueue.execLocalTasks(Unknown Source)
    at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(Unknown Source)
    at java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
    at java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)

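2. Analysis
At first I could not tell where the exception came from. After reading the source I found that it is thrown from within append(); the analysis follows.
According to the stack trace, the exception occurs at line 36 of the original source file (the stringBuilder2.append(data2) call in the inner lambda), so let's trace through the source from there. Stepping into StringBuilder's append() method, the code is: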

public StringBuilder append(String str) {
        super.append(str);
        return this;
    }

StringBuilder's append() simply delegates to append() in its parent class, AbstractStringBuilder. Here is that implementation:

public AbstractStringBuilder append(String str) {
        if (str == null)
            return appendNull();
        int len = str.length();
        ensureCapacityInternal(count + len);
        str.getChars(0, len, value, count);
        count += len;
        return this;
    }

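As you can see, before appending, AbstractStringBuilder.append() calls ensureCapacityInternal() to check whether there is enough room, and grows the array first if there is not. However, the comment on ensureCapacityInternal() (shown below) states explicitly that the method is never synchronized, which means it is not safe to call from multiple threads at once.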

/**
  * For positive values of {@code minimumCapacity}, this method
  * behaves like {@code ensureCapacity}, however it is never
  * synchronized.
  * If {@code minimumCapacity} is non positive due to numeric
  * overflow, this method throws {@code OutOfMemoryError}.
  */

Once the capacity check passes, String's getChars() method is called to copy the characters into the buffer. getChars() is implemented as follows:

public void getChars(int srcBegin, int srcEnd, char dst[], int dstBegin) {
        if (srcBegin < 0) {
            throw new StringIndexOutOfBoundsException(srcBegin);
        }
        if (srcEnd > value.length) {
            throw new StringIndexOutOfBoundsException(srcEnd);
        }
        if (srcBegin > srcEnd) {
            throw new StringIndexOutOfBoundsException(srcEnd - srcBegin);
        }
        System.arraycopy(value, srcBegin, dst, dstBegin, srcEnd - srcBegin);
    }

Following the trail further, here is the declaration of arraycopy():

public static native void arraycopy(Object src,  int  srcPos,
                                        Object dest, int destPos,
                                        int length);

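This is a native method. Up to this point we still have not found the place where the ArrayIndexOutOfBoundsException is thrown, but arraycopy() is the only remaining candidate. Since it is a native method, its source is not included with the JDK's Java sources, so I looked up the implementation online: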

/*
arraycopy method of java.lang.System
*/
JVM_ENTRY(void, JVM_ArrayCopy(JNIEnv *env, jclass ignored, jobject src, jint src_pos,  
                               jobject dst, jint dst_pos, jint length))  
  JVMWrapper("JVM_ArrayCopy");  
  // Check if we have null pointers  
  // Make sure neither the source nor the destination array is null
  if (src == NULL || dst == NULL) {  
    THROW(vmSymbols::java_lang_NullPointerException());  
  }  

  arrayOop s = arrayOop(JNIHandles::resolve_non_null(src));  
  arrayOop d = arrayOop(JNIHandles::resolve_non_null(dst));  
  assert(s->is_oop(), "JVM_ArrayCopy: src not an oop");  
  assert(d->is_oop(), "JVM_ArrayCopy: dst not an oop");  
  // Do copy  
  // The call that actually performs the copy
  s->klass()->copy_array(s, src_pos, d, dst_pos, length, thread);  
JVM_END  

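The function above does not perform the copy itself; it only checks that the source and destination arrays are non-null to rule out some error cases. The actual work is done in the function below. Note that this listing is the object-array version, ObjArrayKlass::copy_array. A StringBuilder's char[] is a primitive array and is handled by the corresponding TypeArrayKlass::copy_array, which, as far as I can tell, performs the same kind of bounds checks.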

/*
Actual implementation behind java.lang.System's arraycopy method
*/
void ObjArrayKlass::copy_array(arrayOop s, int src_pos, arrayOop d,  
                               int dst_pos, int length, TRAPS) {  
  // Assert that s is an object array
  assert(s->is_objArray(), "must be obj array");  

  // If the destination is not an object array, throw ArrayStoreException
  if (!d->is_objArray()) {  
    THROW(vmSymbols::java_lang_ArrayStoreException());  
  }  

  // Check is all offsets and lengths are non negative  
  // The offsets and the length must all be non-negative
  if (src_pos < 0 || dst_pos < 0 || length < 0) {  
    THROW(vmSymbols::java_lang_ArrayIndexOutOfBoundsException());  
  }  
  // Check if the ranges are valid  
  // Check whether the requested ranges run past the end of either array
  if  ( (((unsigned int) length + (unsigned int) src_pos) > (unsigned int) s->length())  
     || (((unsigned int) length + (unsigned int) dst_pos) > (unsigned int) d->length()) ) {  
    THROW(vmSymbols::java_lang_ArrayIndexOutOfBoundsException());  
  }  

  // Special case. Boundary cases must be checked first  
  // This allows the following call: copy_array(s, s.length(), d.length(), 0).  
  // This is correct, since the position is supposed to be an 'in between point', i.e., s.length(),  
  // points to the right of the last element.  
  // Nothing to copy when length == 0
  if (length==0) {  
    return;  
  }  
  // UseCompressedOops only selects between narrowOop and oop; the exact difference between the two needs further study
  // Call do_copy to perform the copy
  if (UseCompressedOops) {  
    narrowOop* const src = objArrayOop(s)->obj_at_addr(src_pos);  
    narrowOop* const dst = objArrayOop(d)->obj_at_addr(dst_pos);  
    do_copy(s, src, d, dst, length, CHECK);  
  } else {  
    oop* const src = objArrayOop(s)->obj_at_addr(src_pos);  
    oop* const dst = objArrayOop(d)->obj_at_addr(dst_pos);  
    do_copy (s, src, d, dst, length, CHECK);  
  }  
}  
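
The range check in the listing above can be observed from plain Java. The following is a minimal sketch, not part of the original investigation: the class name ArrayCopyBoundsDemo and the helper copyOverflows are made up for illustration. It catches IndexOutOfBoundsException, the type documented for System.arraycopy; HotSpot reports the ArrayIndexOutOfBoundsException subclass, as seen in the stack trace earlier.

```java
public class ArrayCopyBoundsDemo {

    // Hypothetical helper: returns true if System.arraycopy rejects the copy
    // because destPos + length runs past the end of the destination array.
    public static boolean copyOverflows(int destCapacity, int destPos, int length) {
        char[] src = new char[length];
        char[] dest = new char[destCapacity];
        try {
            System.arraycopy(src, 0, dest, destPos, length);
            return false;
        } catch (IndexOutOfBoundsException e) {
            // HotSpot throws ArrayIndexOutOfBoundsException, a subclass.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(copyOverflows(50, 0, 40));  // false: 0 + 40 <= 50
        System.out.println(copyOverflows(50, 40, 40)); // true: 40 + 40 > 50
    }
}
```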

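The goal here is only to find where and why the exception is thrown, so I did not dig further, for example into the implementation of do_copy(). Combined with the fact that ensureCapacityInternal() is not synchronized, the cause is now clear: append() is being called from multiple threads (parallelStream, a parallel stream), so two or more threads can each pass the capacity check in ensureCapacityInternal() while the buffer does not actually have room for all of them, and the copy then runs past the end of the array. A simple example makes this concrete.
Suppose threads A and B each need to append a 40-character string, and the buffer has 50 characters of space left. A passes the ensureCapacityInternal() check and is suspended before it executes getChars(). B then runs its own ensureCapacityInternal() check, which also passes because 40 < 50. When the two threads go on to copy, the one that copies second fails with an out-of-bounds exception: the first thread's copy completes and advances count by 40, leaving only 10 characters of space, and 10 < 40, so the second copy no longer fits.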
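
Because a real race is timing-dependent, it is easier to study by replaying the A/B schedule by hand on a single thread. The sketch below is a simplified model, not the JDK's code: AppendRaceDemo, hasRoomFor and copyAndAdvance are hypothetical names, the buffer is fixed at 50 characters, and the capacity check only tests for room instead of growing the array.

```java
public class AppendRaceDemo {

    // Hypothetical stand-ins for AbstractStringBuilder's value and count fields.
    private final char[] value = new char[50];
    private int count = 0;

    // Models the unsynchronized capacity check (this model never grows the array).
    public boolean hasRoomFor(int len) {
        return count + len <= value.length;
    }

    // Models the rest of append(): copy the characters, then advance count.
    public void copyAndAdvance(String str) {
        int len = str.length();
        System.arraycopy(str.toCharArray(), 0, value, count, len);
        count += len;
    }

    // Replays the schedule: A checks, B checks, B copies, A copies.
    public static String replay() {
        AppendRaceDemo sb = new AppendRaceDemo();
        String forty = new String(new char[40]).replace('\0', 'x');

        boolean aPassed = sb.hasRoomFor(40); // A: 0 + 40 <= 50, then A is suspended
        boolean bPassed = sb.hasRoomFor(40); // B: count is still 0, so this passes too
        sb.copyAndAdvance(forty);            // B copies; count becomes 40
        try {
            sb.copyAndAdvance(forty);        // A resumes and copies at offset 40
            return "no exception";
        } catch (IndexOutOfBoundsException e) {
            return "checks passed: " + (aPassed && bPassed) + ", second copy failed";
        }
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints "checks passed: true, second copy failed"
    }
}
```

Both checks succeed because each one reads count before either copy has happened; the failure only shows up at the second copy, which matches the arraycopy frame at the top of the "Caused by" trace.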

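3. Conclusion
For the sample code at the top, there are two ways to fix it: replace the parallel stream (parallelStream) with a sequential stream (stream), or replace the non-thread-safe StringBuilder with the thread-safe StringBuffer.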
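
The two fixes might look like the sketch below, written against the same 20-element sample list. The class and method names (SafeConcatDemo and so on) are made up for illustration. The third method, using Collectors.joining(), is an alternative that is not part of the original post: it lets the stream assemble the result itself, so no builder is shared between threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class SafeConcatDemo {

    // Same sample data as the original listing: "data0" .. "data19".
    public static List<String> sampleData() {
        List<String> dataList = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            dataList.add("data" + i);
        }
        return dataList;
    }

    // Fix 1: a sequential stream, so only one thread touches the StringBuilder.
    public static String withSequentialStream(List<String> dataList) {
        StringBuilder sb = new StringBuilder();
        dataList.stream().forEach(s -> sb.append(s));
        return sb.toString();
    }

    // Fix 2: keep the parallel stream but share a synchronized StringBuffer.
    // No exception is thrown, but the order of the pieces is not guaranteed.
    public static String withStringBuffer(List<String> dataList) {
        StringBuffer sb = new StringBuffer();
        dataList.parallelStream().forEach(s -> sb.append(s));
        return sb.toString();
    }

    // Alternative (not in the original post): let the stream join the strings.
    public static String withJoining(List<String> dataList) {
        return dataList.parallelStream().collect(Collectors.joining());
    }

    public static void main(String[] args) {
        List<String> dataList = sampleData();
        System.out.println(withSequentialStream(dataList));
        System.out.println(withStringBuffer(dataList));
        System.out.println(withJoining(dataList));
    }
}
```

Note that StringBuffer only makes each individual append() safe. With a parallel forEach the elements can still arrive in any order, so if the order of the output matters, the sequential stream or the joining collector is the safer choice.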

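I had not written much multithreaded code before, so when I hit this problem I did not know where to start, which is why I decided to get to the bottom of it. I knew in principle that StringBuilder is not thread-safe and StringBuffer is, but in practice it is easy to overlook. I am writing down the problem and its fixes as a reminder to myself, and I hope it is of some help to you too. If I got anything wrong, corrections are very welcome.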

[Reference blog] http://blog.csdn.net/u011642663/article/details/49512643
