1. Comparing the three
public final class String
implements java.io.Serializable, Comparable<String>, CharSequence {
String is an immutable class, so every modifying operation produces a new String object.
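A minimal sketch of what immutability means in practice. The class and method names here are hypothetical and used only for illustration: concat with a non-empty argument leaves the original untouched and returns a different object.

```java
class ImmutableStringDemo {
    // Hypothetical helper: true if concat returned the very same object.
    static boolean isSameObjectAfterConcat(String original, String suffix) {
        return original.concat(suffix) == original;
    }

    public static void main(String[] args) {
        String original = "abc";
        String modified = original.concat("def"); // does not change `original`
        System.out.println(original);             // abc
        System.out.println(modified);             // abcdef
        System.out.println(original == modified); // false
    }
}
```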
public final class StringBuffer
extends AbstractStringBuilder
implements java.io.Serializable, CharSequence
{
@Override
public synchronized int length() {
return count;
}
@Override
public synchronized int capacity() {
return value.length;
}
@Override
public synchronized StringBuffer append(String str) {
toStringCache = null;
super.append(str);
return this;
}
StringBuffer is mutable; its methods are declared synchronized, which makes it thread-safe.
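The effect of synchronized can be sketched as follows. The class name, method name, and thread counts are assumptions chosen for illustration. Several threads append to one shared StringBuffer, and no append is lost because each call holds the buffer's lock.

```java
class StringBufferThreadDemo {
    // Hypothetical helper: appends from several threads and returns the final length.
    static int appendFromThreads(int threads, int appendsPerThread) {
        StringBuffer shared = new StringBuffer();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < appendsPerThread; i++) {
                    shared.append('x'); // synchronized in StringBuffer
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join(); // wait until every worker has finished
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.length();
    }

    public static void main(String[] args) {
        System.out.println(appendFromThreads(4, 10_000)); // 40000
    }
}
```

Replacing StringBuffer with StringBuilder in this sketch would remove that guarantee; the result could then be shorter, or an exception could be thrown.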
public final class StringBuilder
extends AbstractStringBuilder
implements java.io.Serializable, CharSequence
{
@Override
public StringBuilder append(String str) {
super.append(str);
return this;
}
StringBuilder is functionally equivalent to StringBuffer but drops synchronized, so it is the preferred choice for concatenation when there is no concurrent modification.
2. Concatenation
String str = "aa" + "bb" + "cc";
String str1 = str + "dd";
Corresponding bytecode:
0: ldc #2 // String aabbcc
2: astore_1
3: new #3 // class java/lang/StringBuilder
6: dup
7: invokespecial #4 // Method java/lang/StringBuilder."<init>":()V
10: aload_1
11: invokevirtual #5 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
14: ldc #6 // String dd
16: invokevirtual #5 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
19: invokevirtual #7 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
As shown:
- 1) "aa" + "bb" + "cc" is folded at compile time into the constant "aabbcc"
- 2) An ordinary + concatenation is compiled into StringBuilder calls
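Both observations can also be checked from Java code without reading bytecode. The sketch below uses hypothetical class and method names and relies on reference comparison (==) purely as a diagnostic. A folded constant is the same interned object as the literal "aabbcc", while a run-time concatenation yields a newly created String.

```java
class ConstantFoldingDemo {
    // "aa" + "bb" + "cc" is a constant expression, so javac folds it to "aabbcc".
    static boolean literalConcatIsInterned() {
        String str = "aa" + "bb" + "cc";
        return str == "aabbcc"; // same interned constant
    }

    // str is not declared final, so str + "dd" is evaluated at run time.
    static boolean runtimeConcatIsNewObject() {
        String str = "aa" + "bb" + "cc";
        String str1 = str + "dd";
        return str1 != "aabbccdd" && str1.equals("aabbccdd");
    }

    public static void main(String[] args) {
        System.out.println(literalConcatIsInterned());  // true
        System.out.println(runtimeConcatIsNewObject()); // true
    }
}
```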
Performance comparison of + versus appending directly to a StringBuilder:
String test1 = "0";
long current = System.currentTimeMillis();
for (int i = 1; i <= 200_000; i++) {
if (i % 10_000 == 0) {
long temp = System.currentTimeMillis();
System.out.println(temp - current);
current = temp;
}
test1 += i;
}
System.out.println();
StringBuilder test2 = new StringBuilder();
current = System.currentTimeMillis();
for (int i = 1; i <= 200_000; i++) {
if (i % 10_000 == 0) {
long temp = System.currentTimeMillis();
System.out.println(temp - current);
current = temp;
}
test2.append(i);
}
}
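Test results, in milliseconds per 10,000 iterations (the first 20 values are from the + loop, the last 20 from the StringBuilder loop):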
452
1245
1555
2048
2899
3381
4524
4756
5379
5962
6720
3664
3721
4067
4431
4933
5349
5922
6192
6505
1
0
1
0
1
0
0
1
0
0
0
1
0
0
0
1
0
0
0
0
Enabling the GC log with -XX:+PrintGCDetails shows that the large number of newly created temporary objects triggers frequent GC:
[GC (Allocation Failure) [PSYoungGen: 33280K->760K(38400K)] 33280K->768K(125952K), 0.0010430 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
[GC (Allocation Failure) [PSYoungGen: 34029K->776K(38400K)] 34037K->784K(125952K), 0.0032711 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
[GC (Allocation Failure) [PSYoungGen: 34056K->715K(38400K)] 34064K->723K(125952K), 0.0028864 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
[GC (Allocation Failure) [PSYoungGen: 33995K->728K(71680K)] 34003K->736K(159232K), 0.0006853 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
[GC (Allocation Failure) [PSYoungGen: 67288K->814K(71680K)] 67296K->822K(159232K), 0.0009245 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
[GC (Allocation Failure) [PSYoungGen: 67374K->775K(134144K)] 67382K->783K(221696K), 0.0015878 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
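The decompiled bytecode is shown below. With + inside the loop, a new StringBuilder is created on every iteration (offsets 44 to 63), which consumes memory quickly and triggers frequent GC.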
0: ldc #2 // String 0
2: astore_1
3: invokestatic #3 // Method java/lang/System.currentTimeMillis:()J
6: lstore_2
7: iconst_1
8: istore 4
10: iload 4
12: ldc #4 // int 200000
14: if_icmpgt 70
17: iload 4
19: sipush 10000
22: irem
23: ifne 44
26: invokestatic #3 // Method java/lang/System.currentTimeMillis:()J
29: lstore 5
31: getstatic #5 // Field java/lang/System.out:Ljava/io/PrintStream;
34: lload 5
36: lload_2
37: lsub
38: invokevirtual #6 // Method java/io/PrintStream.println:(J)V
41: lload 5
43: lstore_2
44: new #7 // class java/lang/StringBuilder
47: dup
48: invokespecial #8 // Method java/lang/StringBuilder."<init>":()V
51: aload_1
52: invokevirtual #9 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
55: iload 4
57: invokevirtual #10 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
60: invokevirtual #11 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
63: astore_1
64: iinc 4, 1
67: goto 10
70: getstatic #5 // Field java/lang/System.out:Ljava/io/PrintStream;
73: invokevirtual #12 // Method java/io/PrintStream.println:()V
76: new #7 // class java/lang/StringBuilder
79: dup
80: invokespecial #8 // Method java/lang/StringBuilder."<init>":()V
83: astore 4
85: invokestatic #3 // Method java/lang/System.currentTimeMillis:()J
88: lstore_2
89: iconst_1
90: istore 5
92: iload 5
94: ldc #4 // int 200000
96: if_icmpgt 140
99: iload 5
101: sipush 10000
104: irem
105: ifne 126
108: invokestatic #3 // Method java/lang/System.currentTimeMillis:()J
111: lstore 6
113: getstatic #5 // Field java/lang/System.out:Ljava/io/PrintStream;
116: lload 6
118: lload_2
119: lsub
120: invokevirtual #6 // Method java/io/PrintStream.println:(J)V
123: lload 6
125: lstore_2
126: aload 4
128: iload 5
130: invokevirtual #10 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
133: pop
134: iinc 5, 1
137: goto 92
140: return
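The two loops in the bytecode above correspond roughly to the two methods below. This is a sketch with hypothetical names, and both variants start from "0" so that their results are comparable. They return the same text; the difference is that withPlus allocates a builder and a new String on each iteration, while withBuilder reuses one buffer.

```java
class LoopConcatDemo {
    // + in a loop: on JDK 8, javac creates a fresh StringBuilder per iteration.
    static String withPlus(int n) {
        String s = "0";
        for (int i = 1; i <= n; i++) {
            s += i;
        }
        return s;
    }

    // One StringBuilder reused for the whole loop.
    static String withBuilder(int n) {
        StringBuilder sb = new StringBuilder("0");
        for (int i = 1; i <= n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus(3));    // 0123
        System.out.println(withBuilder(3)); // 0123
    }
}
```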
3. Evolution of concatenation bytecode across JDK versions
String str = "aa" + "bb" + "cc";
String str1 = str + "dd";
On JDK 10:
0: ldc #2 // String aabbcc
2: astore_1
3: aload_1
4: invokedynamic #3, 0 // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;
9: astore_2
10: return
BootstrapMethods:
0: #25 REF_invokeStatic java/lang/invoke/StringConcatFactory.makeConcatWithConstants:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/invoke/CallSite;
Method arguments:
#26 \u0001dd
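In JDK 8, javac translates + concatenation into StringBuilder operations. JDK 10 instead emits an invokedynamic call (a change introduced in JDK 9 by JEP 280), which decouples the concatenation strategy from the bytecode generated by javac: if a future JVM improves the runtime implementation, no change to javac is needed.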
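To make the bootstrap step more concrete, the sketch below calls StringConcatFactory.makeConcatWithConstants by hand with the same recipe as in the listing ("\u0001dd", where \u0001 marks the position of the dynamic argument). It requires JDK 9 or later. The class and method names are hypothetical, and real code would simply write str + "dd" and let javac emit the invokedynamic instruction.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

class IndyConcatSketch {
    // Hypothetical helper: does by hand what the invokedynamic call site does for str + "dd".
    static String appendDd(String str) {
        try {
            CallSite site = StringConcatFactory.makeConcatWithConstants(
                    MethodHandles.lookup(),
                    "makeConcatWithConstants",
                    MethodType.methodType(String.class, String.class), // (String)String
                    "\u0001dd");                                       // recipe: <arg> followed by "dd"
            return (String) site.getTarget().invokeExact(str);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(appendDd("aabbcc")); // aabbccdd
    }
}
```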
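Running the same test on JDK 10 also shows frequent GC, which suggests that JDK 10 has not fundamentally changed the cost of + in a loop; every iteration still produces a new String: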
88
136
266
358
581
424
499
567
653
711
827
953
1038
1141
1247
1318
1412
1529
1601
1771
2
1
0
0
0
1
0
0
1
0
0
1
0
0
0
1
0
0
0
1
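One way to see why the per-interval time of the + loop keeps growing on both JDK versions is to count how many characters end up in the Strings it creates. The sketch below is a simplified model, not a measurement. It assumes each iteration writes one new String holding the whole accumulated text and ignores any intermediate buffers, so it is a lower bound. The names are hypothetical.

```java
class CopyCostSketch {
    // Length of the final text "0" + 1 + 2 + ... + n.
    static long finalLength(int n) {
        long len = 1; // the initial "0"
        for (int i = 1; i <= n; i++) {
            len += Integer.toString(i).length();
        }
        return len;
    }

    // Total characters across all Strings produced by `s += i` for i = 1..n.
    static long charsWrittenByPlus(int n) {
        long total = 0;
        long len = 1; // the initial "0"
        for (int i = 1; i <= n; i++) {
            len += Integer.toString(i).length();
            total += len; // a new String of this length is produced each iteration
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(finalLength(3));        // 4
        System.out.println(charsWrittenByPlus(3)); // 9
    }
}
```

Under this model the work done by + grows roughly quadratically with the number of iterations, while a single StringBuilder writes each character about once, apart from occasional buffer growth.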
4. Evolution of String itself
JDK 8 stores the data in a char array, but characters of Latin-script languages do not need a full 16-bit char.
public final class String
implements java.io.Serializable, Comparable<String>, CharSequence {
/** The value is used for character storage. */
private final char value[];
JDK 9 introduced the Compact Strings design; the following is the JDK 10 source:
public final class String
implements java.io.Serializable, Comparable<String>, CharSequence {
/**
* The value is used for character storage.
*
* @implNote This field is trusted by the VM, and is a subject to
* constant folding if String instance is constant. Overwriting this
* field after construction will cause problems.
*
* Additionally, it is marked with {@link Stable} to trust the contents
* of the array. No other facility in JDK provides this functionality (yet).
* {@link Stable} is safe here, because value is never null.
*/
@Stable
private final byte[] value;
/**
* The identifier of the encoding used to encode the bytes in
* {@code value}. The supported values in this implementation are
*
* LATIN1
* UTF16
*
* @implNote This field is trusted by the VM, and is a subject to
* constant folding if String instance is constant. Overwriting this
* field after construction will cause problems.
*/
private final byte coder;
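Since JDK 9, String maintains a coder field that identifies the encoding, LATIN1 or UTF16. It is chosen automatically when the String is created: if every character can be represented in LATIN1, coder is 0 (LATIN1); otherwise it is 1 (UTF16). In addition, all related intrinsics were rewritten.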
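The coder field is private, so it cannot be read directly, but the selection rule can be mirrored in user code. The sketch below is not the JDK implementation. fitsLatin1 and estimatedValueBytes are hypothetical helpers, and the byte estimate assumes Compact Strings is enabled, which is the default.

```java
class CompactStringSketch {
    // True if every char fits in one byte, i.e. the String could use the LATIN1 coder.
    static boolean fitsLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    // Estimated length of the backing byte[]: 1 byte per char for LATIN1, 2 for UTF16.
    static int estimatedValueBytes(String s) {
        return fitsLatin1(s) ? s.length() : 2 * s.length();
    }

    public static void main(String[] args) {
        System.out.println(estimatedValueBytes("hello"));        // 5
        System.out.println(estimatedValueBytes("\u4f60\u597d")); // 4 (two CJK chars, 2 bytes each)
    }
}
```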