String, StringBuffer, and StringBuilder

1. Comparing the three

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {

String is an immutable class, so every operation that appears to modify a string actually produces a new String object.
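A minimal sketch of what immutability means in practice. The class and method names here are made up for illustration; only the String methods are real API.

```java
class ImmutableStringDemo {
    // Returns true when "modifying" calls hand back different objects
    // and leave the original String untouched.
    static boolean modificationCreatesNewObject() {
        String original = "hello";
        String upper = original.toUpperCase();      // a new String, "HELLO"
        String joined = original.concat(" world");  // a new String, "hello world"
        return original.equals("hello")
                && upper != original
                && joined != original;
    }

    public static void main(String[] args) {
        System.out.println(modificationCreatesNewObject()); // true
    }
}
```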

 public final class StringBuffer
    extends AbstractStringBuilder
    implements java.io.Serializable, CharSequence
{

    @Override
    public synchronized int length() {
        return count;
    }

    @Override
    public synchronized int capacity() {
        return value.length;
    }

    @Override
    public synchronized StringBuffer append(String str) {
        toStringCache = null;
        super.append(str);
        return this;
    }

StringBuffer is mutable. Its public methods are declared synchronized, which makes each individual call thread-safe.
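As a rough illustration of that guarantee, the sketch below lets several threads append to one shared StringBuffer and checks that no append is lost. The class name, method name, and thread counts are arbitrary choices for this example.

```java
class StringBufferThreadDemo {
    // Each of `threads` threads appends one character `perThread` times;
    // returns the final length of the shared buffer.
    static int appendConcurrently(int threads, int perThread) {
        StringBuffer shared = new StringBuffer();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    shared.append('x'); // synchronized, so appends do not interleave
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join(); // wait until every worker has finished
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.length();
    }

    public static void main(String[] args) {
        System.out.println(appendConcurrently(4, 10_000)); // 40000
    }
}
```

Replacing the StringBuffer with a StringBuilder in this sketch removes the guarantee: the final length may come out short, or an append may throw.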

public final class StringBuilder
    extends AbstractStringBuilder
    implements java.io.Serializable, CharSequence
{

    @Override
    public StringBuilder append(String str) {
        super.append(str);
        return this;
    }

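StringBuilder offers essentially the same functionality as StringBuffer but drops the synchronized keyword, so it is the preferred choice for concatenation when the builder is not modified concurrently.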
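A typical single-threaded use might look like the following sketch. The join helper is hypothetical and exists only to show a builder that never leaves the method, so no synchronization is needed.

```java
class StringBuilderJoinDemo {
    // Joins the parts with a separator using one method-local StringBuilder.
    static String join(String separator, String... parts) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(parts[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(", ", "String", "StringBuffer", "StringBuilder"));
    }
}
```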

2. Concatenation

        String str = "aa" + "bb" + "cc";
        String str1 = str + "dd";

The corresponding bytecode:

         0: ldc           #2                  // String aabbcc
         2: astore_1
         3: new           #3                  // class java/lang/StringBuilder
         6: dup
         7: invokespecial #4                  // Method java/lang/StringBuilder."<init>":()V
        10: aload_1
        11: invokevirtual #5                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        14: ldc           #6                  // String dd
        16: invokevirtual #5                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        19: invokevirtual #7                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;

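As the bytecode shows:

  • 1) "aa" + "bb" + "cc" is folded directly into the constant "aabbcc" at compile time
  • 2) an ordinary + concatenation is compiled into appends on a StringBuilder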
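Both points can also be observed from Java code without reading bytecode. The sketch below uses reference comparison (==) on purpose, only to tell an interned constant apart from a String built at run time; the class and method names are invented for this example.

```java
class ConstantFoldingDemo {
    // "aa" + "bb" + "cc" is a compile-time constant, so it is the same
    // interned object as the literal "aabbcc".
    static boolean literalConcatIsFolded() {
        String str = "aa" + "bb" + "cc";
        return str == "aabbcc";
    }

    // str is an ordinary local variable, so str + "dd" is evaluated at run time
    // and yields a new object that is equal to, but not identical with, the literal.
    static boolean runtimeConcatIsNewObject() {
        String str = "aa" + "bb" + "cc";
        String str1 = str + "dd";
        return str1 != "aabbccdd" && str1.equals("aabbccdd");
    }

    public static void main(String[] args) {
        System.out.println(literalConcatIsFolded());    // true
        System.out.println(runtimeConcatIsNewObject()); // true
    }
}
```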

Performance comparison of + versus appending directly to a StringBuilder:

    public static void main(String[] args) {
        // Case 1: += on a String; every iteration builds a new String
        String test1 = "0";
        long current = System.currentTimeMillis();
        for (int i = 1; i <= 200_000; i++) {
            if (i % 10_000 == 0) {
                // Print the elapsed milliseconds for the last 10,000 iterations
                long temp = System.currentTimeMillis();
                System.out.println(temp - current);
                current = temp;
            }
            test1 += i;
        }
        System.out.println();

        // Case 2: one StringBuilder reused across all iterations
        StringBuilder test2 = new StringBuilder();
        current = System.currentTimeMillis();
        for (int i = 1; i <= 200_000; i++) {
            if (i % 10_000 == 0) {
                long temp = System.currentTimeMillis();
                System.out.println(temp - current);
                current = temp;
            }
            test2.append(i);
        }
    }

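Test results (each line is the elapsed time in milliseconds for one batch of 10,000 iterations; the first group is +=, the second is StringBuilder):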

452
1245
1555
2048
2899
3381
4524
4756
5379
5962
6720
3664
3721
4067
4431
4933
5349
5922
6192
6505

1
0
1
0
1
0
0
1
0
0
0
1
0
0
0
1
0
0
0
0

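Enabling the GC log with -XX:+PrintGCDetails shows that the large number of newly created temporary objects triggers frequent young-generation collections: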

[GC (Allocation Failure) [PSYoungGen: 33280K->760K(38400K)] 33280K->768K(125952K), 0.0010430 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
[GC (Allocation Failure) [PSYoungGen: 34029K->776K(38400K)] 34037K->784K(125952K), 0.0032711 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
[GC (Allocation Failure) [PSYoungGen: 34056K->715K(38400K)] 34064K->723K(125952K), 0.0028864 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
[GC (Allocation Failure) [PSYoungGen: 33995K->728K(71680K)] 34003K->736K(159232K), 0.0006853 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
[GC (Allocation Failure) [PSYoungGen: 67288K->814K(71680K)] 67296K->822K(159232K), 0.0009245 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
[GC (Allocation Failure) [PSYoungGen: 67374K->775K(134144K)] 67382K->783K(221696K), 0.0015878 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 

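The decompiled bytecode is shown below. Inside the loop, + creates a new StringBuilder on every iteration (offsets 44 to 63), which consumes memory quickly and triggers GC frequently.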

         0: ldc           #2                  // String 0
         2: astore_1
         3: invokestatic  #3                  // Method java/lang/System.currentTimeMillis:()J
         6: lstore_2
         7: iconst_1
         8: istore        4
        10: iload         4
        12: ldc           #4                  // int 200000
        14: if_icmpgt     70
        17: iload         4
        19: sipush        10000
        22: irem
        23: ifne          44
        26: invokestatic  #3                  // Method java/lang/System.currentTimeMillis:()J
        29: lstore        5
        31: getstatic     #5                  // Field java/lang/System.out:Ljava/io/PrintStream;
        34: lload         5
        36: lload_2
        37: lsub
        38: invokevirtual #6                  // Method java/io/PrintStream.println:(J)V
        41: lload         5
        43: lstore_2
        44: new           #7                  // class java/lang/StringBuilder
        47: dup
        48: invokespecial #8                  // Method java/lang/StringBuilder."<init>":()V
        51: aload_1
        52: invokevirtual #9                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        55: iload         4
        57: invokevirtual #10                 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
        60: invokevirtual #11                 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
        63: astore_1
        64: iinc          4, 1
        67: goto          10
        70: getstatic     #5                  // Field java/lang/System.out:Ljava/io/PrintStream;
        73: invokevirtual #12                 // Method java/io/PrintStream.println:()V
        76: new           #7                  // class java/lang/StringBuilder
        79: dup
        80: invokespecial #8                  // Method java/lang/StringBuilder."<init>":()V
        83: astore        4
        85: invokestatic  #3                  // Method java/lang/System.currentTimeMillis:()J
        88: lstore_2
        89: iconst_1
        90: istore        5
        92: iload         5
        94: ldc           #4                  // int 200000
        96: if_icmpgt     140
        99: iload         5
       101: sipush        10000
       104: irem
       105: ifne          126
       108: invokestatic  #3                  // Method java/lang/System.currentTimeMillis:()J
       111: lstore        6
       113: getstatic     #5                  // Field java/lang/System.out:Ljava/io/PrintStream;
       116: lload         6
       118: lload_2
       119: lsub
       120: invokevirtual #6                  // Method java/io/PrintStream.println:(J)V
       123: lload         6
       125: lstore_2
       126: aload         4
       128: iload         5
       130: invokevirtual #10                 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
       133: pop
       134: iinc          5, 1
       137: goto          92
       140: return
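Written back as source, offsets 44 to 63 of the bytecode above correspond roughly to the concatDesugared method in the sketch below, while offsets 126 to 133 correspond to concatWithOneBuilder. The class and method names are invented here; the point is only that all three variants produce the same text, and they differ in how many temporary objects they allocate.

```java
class LoopDesugarDemo {
    // The original form: += inside the loop.
    static String concatWithPlus(int n) {
        String s = "0";
        for (int i = 1; i <= n; i++) {
            s += i;
        }
        return s;
    }

    // Roughly what a JDK 8 javac generates for the loop body: a fresh
    // StringBuilder and a fresh String on every iteration.
    static String concatDesugared(int n) {
        String s = "0";
        for (int i = 1; i <= n; i++) {
            s = new StringBuilder().append(s).append(i).toString();
        }
        return s;
    }

    // One builder for the whole loop and a single toString() at the end.
    static String concatWithOneBuilder(int n) {
        StringBuilder sb = new StringBuilder("0");
        for (int i = 1; i <= n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(concatWithPlus(3)); // 0123
        System.out.println(concatDesugared(1_000).equals(concatWithOneBuilder(1_000))); // true
    }
}
```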

3. How the concatenation bytecode evolved across JDK versions

        String str = "aa" + "bb" + "cc";
        String str1 = str + "dd";

The same code compiled with JDK 10:

         0: ldc           #2                  // String aabbcc
         2: astore_1
         3: aload_1
         4: invokedynamic #3,  0              // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;
         9: astore_2
        10: return

BootstrapMethods:
  0: #25 REF_invokeStatic java/lang/invoke/StringConcatFactory.makeConcatWithConstants:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/invoke/CallSite;
    Method arguments:
      #26 \u0001dd

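In JDK 8, javac translates + concatenation into StringBuilder operations. JDK 10 instead emits an invokedynamic instruction (a change introduced in JDK 9 by JEP 280), which decouples the concatenation strategy from the bytecode that javac generates: if a future JVM improves the runtime implementation, no change to javac is needed.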
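To make the bootstrap step less abstract, the sketch below calls StringConcatFactory.makeConcatWithConstants by hand, passing the same recipe that appears under BootstrapMethods (the tag character U+0001 marks where the argument goes, followed by the constant dd). This is only an illustration of the mechanism and requires JDK 9 or later; application code would normally just use +. The class and method names are invented for this example.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

class IndyConcatDemo {
    // Builds the call site that javac would link for `str + "dd"`
    // and invokes its target method handle directly.
    static String concatViaFactory(String str) {
        try {
            CallSite site = StringConcatFactory.makeConcatWithConstants(
                    MethodHandles.lookup(),
                    "makeConcatWithConstants",
                    MethodType.methodType(String.class, String.class), // (String)String
                    "\u0001dd");                                       // recipe: argument, then "dd"
            return (String) site.getTarget().invokeExact(str);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(concatViaFactory("aabbcc")); // aabbccdd
    }
}
```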

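Running the same loop test on JDK 10 still shows frequent GC, which suggests that JDK 10 did not fundamentally change the cost of += in a loop: each iteration still builds a new String. The timings, in the same order as before (+= first, then StringBuilder):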

88
136
266
358
581
424
499
567
653
711
827
953
1038
1141
1247
1318
1412
1529
1601
1771

2
1
0
0
0
1
0
0
1
0
0
1
0
0
0
1
0
0
0
1

4. The evolution of String itself

JDK 8 stores the data in a char array. Each char occupies two bytes, which is wider than necessary for text in Latin-script languages.

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {
    /** The value is used for character storage. */
    private final char value[];

JDK 9 introduced the Compact Strings design. The following is from the JDK 10 source:

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {

    /**
     * The value is used for character storage.
     *
     * @implNote This field is trusted by the VM, and is a subject to
     * constant folding if String instance is constant. Overwriting this
     * field after construction will cause problems.
     *
     * Additionally, it is marked with {@link Stable} to trust the contents
     * of the array. No other facility in JDK provides this functionality (yet).
     * {@link Stable} is safe here, because value is never null.
     */
    @Stable
    private final byte[] value;

    /**
     * The identifier of the encoding used to encode the bytes in
     * {@code value}. The supported values in this implementation are
     *
     * LATIN1
     * UTF16
     *
     * @implNote This field is trusted by the VM, and is a subject to
     * constant folding if String instance is constant. Overwriting this
     * field after construction will cause problems.
     */
    private final byte coder;

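Since JDK 9, the String class maintains a coder field that identifies the encoding of the value array: LATIN1 or UTF16. It is chosen automatically when the String is created: if every character can be represented in LATIN1, coder is 0 (LATIN1); otherwise it is 1 (UTF16). In addition, all the related intrinsics were rewritten.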
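The selection rule can be sketched in a few lines. This is not the JDK's code: chooseCoder and payloadBytes are hypothetical helpers that mimic the rule described above (one byte per character if everything fits in LATIN1, two bytes otherwise), and they ignore object headers and the JVM option that can turn compact strings off.

```java
class CompactStringSketch {
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    // Hypothetical helper: LATIN1 if every char fits in one byte, else UTF16.
    static byte chooseCoder(CharSequence text) {
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) > 0xFF) {
                return UTF16;
            }
        }
        return LATIN1;
    }

    // Size of the character payload in bytes under the chosen coder:
    // length * 1 for LATIN1, length * 2 for UTF16.
    static int payloadBytes(CharSequence text) {
        return text.length() << chooseCoder(text);
    }

    public static void main(String[] args) {
        System.out.println(payloadBytes("hello"));        // 5
        // Two CJK characters, written as escapes to stay independent of file encoding
        System.out.println(payloadBytes("\u4f60\u597d")); // 4
    }
}
```

Under the JDK 8 char[] layout both strings would cost two bytes per character, so a pure-ASCII string of this kind takes roughly half the payload space in the compact layout.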
