A Look at Compact Strings in Java 9

This article looks at how Compact Strings work in Java 9.

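Compressed Strings (Java 6)

Java 6 introduced Compressed Strings: a string whose characters each fit in one byte was stored in a byte[], while a string that needed two bytes per character kept using char[]. The feature was switched on with -XX:+UseCompressedStrings, but it was short-lived: support for it was dropped in Java 7.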

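Compact Strings (Java 9)

Java 9 introduced Compact Strings as the successor to Java 6's Compressed Strings. The implementation is more thorough: String always stores its contents in a byte[] instead of a char[], and a new field, coder, records whether those bytes are encoded as LATIN1 or UTF16.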
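The storage idea can be illustrated with a small sketch. This is not the JDK's code: the class CompactSketch and its methods coderFor and encode are hypothetical names, and the little-endian UTF-16 byte order is an assumption made only to keep the example short.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration of the byte[] + coder idea; not the JDK implementation. */
class CompactSketch {
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    /** Returns LATIN1 if every char fits in one byte, otherwise UTF16. */
    static byte coderFor(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return UTF16;
            }
        }
        return LATIN1;
    }

    /** One byte per char for LATIN1, two bytes per char for UTF16. */
    static byte[] encode(String s) {
        return coderFor(s) == LATIN1
                ? s.getBytes(StandardCharsets.ISO_8859_1)
                : s.getBytes(StandardCharsets.UTF_16LE); // byte order is an assumption
    }

    public static void main(String[] args) {
        System.out.println(encode("tom").length);          // prints 3
        System.out.println(encode("\u6c64\u59c6").length); // two CJK chars, prints 4
    }
}
```

For text that is mostly ASCII or Latin-1, the backing array is half the size it would be with two bytes per character, which is the point of the feature.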

String

java.base/java/lang/String.java

    public final class String
        implements java.io.Serializable, Comparable<String>, CharSequence,
                   Constable, ConstantDesc {

        /**
         * The value is used for character storage.
         *
         * @implNote This field is trusted by the VM, and is a subject to
         * constant folding if String instance is constant. Overwriting this
         * field after construction will cause problems.
         *
         * Additionally, it is marked with {@link Stable} to trust the contents
         * of the array. No other facility in JDK provides this functionality (yet).
         * {@link Stable} is safe here, because value is never null.
         */
        @Stable
        private final byte[] value;

        /**
         * The identifier of the encoding used to encode the bytes in
         * {@code value}. The supported values in this implementation are
         *
         * LATIN1
         * UTF16
         *
         * @implNote This field is trusted by the VM, and is a subject to
         * constant folding if String instance is constant. Overwriting this
         * field after construction will cause problems.
         */
        private final byte coder;

        /** Cache the hash code for the string */
        private int hash; // Default to 0

        /** use serialVersionUID from JDK 1.0.2 for interoperability */
        private static final long serialVersionUID = -6849794470754667710L;

        /**
         * If String compaction is disabled, the bytes in {@code value} are
         * always encoded in UTF16.
         *
         * For methods with several possible implementation paths, when String
         * compaction is disabled, only one code path is taken.
         *
         * The instance field value is generally opaque to optimizing JIT
         * compilers. Therefore, in performance-sensitive place, an explicit
         * check of the static boolean {@code COMPACT_STRINGS} is done first
         * before checking the {@code coder} field since the static boolean
         * {@code COMPACT_STRINGS} would be constant folded away by an
         * optimizing JIT compiler. The idioms for these cases are as follows.
         *
         * For code such as:
         *
         *    if (coder == LATIN1) { ... }
         *
         * can be written more optimally as
         *
         *    if (coder() == LATIN1) { ... }
         *
         * or:
         *
         *    if (COMPACT_STRINGS && coder == LATIN1) { ... }
         *
         * An optimizing JIT compiler can fold the above conditional as:
         *
         *    COMPACT_STRINGS == true  => if (coder == LATIN1) { ... }
         *    COMPACT_STRINGS == false => if (false)           { ... }
         *
         * @implNote
         * The actual value for this field is injected by JVM. The static
         * initialization block is used to set the value here to communicate
         * that this static final field is not statically foldable, and to
         * avoid any possible circular dependency during vm initialization.
         */
        static final boolean COMPACT_STRINGS;

        static {
            COMPACT_STRINGS = true;
        }

        /**
         * Class String is special cased within the Serialization Stream Protocol.
         *
         * A String instance is written into an ObjectOutputStream according to
         * Object Serialization Specification, Section 6.2, "Stream Elements"
         */
        private static final ObjectStreamField[] serialPersistentFields =
            new ObjectStreamField[0];

        /**
         * Initializes a newly created {@code String} object so that it represents
         * an empty character sequence.  Note that use of this constructor is
         * unnecessary since Strings are immutable.
         */
        public String() {
            this.value = "".value;
            this.coder = "".coder;
        }

        //......

        public char charAt(int index) {
            if (isLatin1()) {
                return StringLatin1.charAt(value, index);
            } else {
                return StringUTF16.charAt(value, index);
            }
        }

        public boolean equals(Object anObject) {
            if (this == anObject) {
                return true;
            }
            if (anObject instanceof String) {
                String aString = (String)anObject;
                if (coder() == aString.coder()) {
                    return isLatin1() ? StringLatin1.equals(value, aString.value)
                                      : StringUTF16.equals(value, aString.value);
                }
            }
            return false;
        }

        public int compareTo(String anotherString) {
            byte v1[] = value;
            byte v2[] = anotherString.value;
            if (coder() == anotherString.coder()) {
                return isLatin1() ? StringLatin1.compareTo(v1, v2)
                                  : StringUTF16.compareTo(v1, v2);
            }
            return isLatin1() ? StringLatin1.compareToUTF16(v1, v2)
                              : StringUTF16.compareToLatin1(v1, v2);
        }

        public int hashCode() {
            int h = hash;
            if (h == 0 && value.length > 0) {
                hash = h = isLatin1() ? StringLatin1.hashCode(value)
                                      : StringUTF16.hashCode(value);
            }
            return h;
        }

        public int indexOf(int ch, int fromIndex) {
            return isLatin1() ? StringLatin1.indexOf(value, ch, fromIndex)
                              : StringUTF16.indexOf(value, ch, fromIndex);
        }

        public String substring(int beginIndex) {
            if (beginIndex < 0) {
                throw new StringIndexOutOfBoundsException(beginIndex);
            }
            int subLen = length() - beginIndex;
            if (subLen < 0) {
                throw new StringIndexOutOfBoundsException(subLen);
            }
            if (beginIndex == 0) {
                return this;
            }
            return isLatin1() ? StringLatin1.newString(value, beginIndex, subLen)
                              : StringUTF16.newString(value, beginIndex, subLen);
        }

        //......

        byte coder() {
            return COMPACT_STRINGS ? coder : UTF16;
        }

        byte[] value() {
            return value;
        }

        private boolean isLatin1() {
            return COMPACT_STRINGS && coder == LATIN1;
        }

        @Native static final byte LATIN1 = 0;
        @Native static final byte UTF16  = 1;

        //......
    }

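COMPACT_STRINGS defaults to true, so the feature is enabled by default (it can be turned off with the JVM option -XX:-CompactStrings).

The coder() method returns the coder field when COMPACT_STRINGS is true and UTF16 otherwise. The isLatin1() method returns true only when COMPACT_STRINGS is true and coder is LATIN1.

Methods such as charAt, equals, hashCode, indexOf and substring all rely on isLatin1() to decide whether to delegate to StringLatin1 or to StringUTF16.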
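The dispatch pattern described above can be sketched with a toy class. MiniString is a hypothetical name, its low-byte-first UTF16 layout is an assumption, and bounds checks are omitted; the real work in the JDK is done by StringLatin1 and StringUTF16.

```java
/** Hypothetical toy class showing the isLatin1() dispatch; not the JDK implementation. */
class MiniString {
    static final boolean COMPACT_STRINGS = true;
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    private final byte[] value;
    private final byte coder;

    MiniString(String s) {
        // Use one byte per char only if compaction is on and every char fits.
        boolean latin1 = COMPACT_STRINGS && s.chars().allMatch(c -> c <= 0xFF);
        coder = latin1 ? LATIN1 : UTF16;
        value = new byte[latin1 ? s.length() : s.length() * 2];
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (latin1) {
                value[i] = (byte) c;
            } else {
                value[2 * i] = (byte) c;            // low byte first (assumed order)
                value[2 * i + 1] = (byte) (c >> 8); // high byte second
            }
        }
    }

    private boolean isLatin1() {
        return COMPACT_STRINGS && coder == LATIN1;
    }

    int length() {
        return isLatin1() ? value.length : value.length >> 1;
    }

    char charAt(int index) {
        if (isLatin1()) {
            return (char) (value[index] & 0xFF);
        }
        return (char) ((value[2 * index] & 0xFF) | ((value[2 * index + 1] & 0xFF) << 8));
    }

    public static void main(String[] args) {
        MiniString latin = new MiniString("caf\u00e9");
        MiniString wide = new MiniString("a\u4e2d");
        System.out.println(latin.length() + " " + (int) latin.charAt(3)); // prints 4 233
        System.out.println(wide.length() + " " + (int) wide.charAt(1));   // prints 2 20013
    }
}
```

Every accessor first asks isLatin1() and then reads the array with the matching stride, which mirrors how the String methods listed above choose between the two helper classes.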

StringConcatFactory

Example

    public class Java9StringDemo {
        public static void main(String[] args) {
            String stringLiteral = "tom";
            String stringObject = stringLiteral + "cat";
        }
    }

In this code, stringObject is built by concatenating the variable stringLiteral with the literal "cat".

javap

    javac src/main/java/com/example/javac/Java9StringDemo.java
    javap -v src/main/java/com/example/javac/Java9StringDemo.class

      Last modified Apr 7, 2019; size 770 bytes
      MD5 checksum fecfca9c829402c358c4d5cb948004ff
      Compiled from "Java9StringDemo.java"
    public class com.example.javac.Java9StringDemo
      minor version: 0
      major version: 56
      flags: (0x0021) ACC_PUBLIC, ACC_SUPER
      this_class: #4                          // com/example/javac/Java9StringDemo
      super_class: #5                         // java/lang/Object
      interfaces: 0, fields: 0, methods: 2, attributes: 3
    Constant pool:
       #1 = Methodref          #5.#14         // java/lang/Object."<init>":()V
       #2 = String             #15            // tom
       #3 = InvokeDynamic      #0:#19         // #0:makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;
       #4 = Class              #20            // com/example/javac/Java9StringDemo
       #5 = Class              #21            // java/lang/Object
       #6 = Utf8               <init>
       #7 = Utf8               ()V
       #8 = Utf8               Code
       #9 = Utf8               LineNumberTable
      #10 = Utf8               main
      #11 = Utf8               ([Ljava/lang/String;)V
      #12 = Utf8               SourceFile
      #13 = Utf8               Java9StringDemo.java
      #14 = NameAndType        #6:#7          // "<init>":()V
      #15 = Utf8               tom
      #16 = Utf8               BootstrapMethods
      #17 = MethodHandle       6:#22          // REF_invokeStatic java/lang/invoke/StringConcatFactory.makeConcatWithConstants:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/invoke/CallSite;
      #18 = String             #23            // \u0001cat
      #19 = NameAndType        #24:#25        // makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;
      #20 = Utf8               com/example/javac/Java9StringDemo
      #21 = Utf8               java/lang/Object
      #22 = Methodref          #26.#27        // java/lang/invoke/StringConcatFactory.makeConcatWithConstants:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/invoke/CallSite;
      #23 = Utf8               \u0001cat
      #24 = Utf8               makeConcatWithConstants
      #25 = Utf8               (Ljava/lang/String;)Ljava/lang/String;
      #26 = Class              #28            // java/lang/invoke/StringConcatFactory
      #27 = NameAndType        #24:#32        // makeConcatWithConstants:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/invoke/CallSite;
      #28 = Utf8               java/lang/invoke/StringConcatFactory
      #29 = Class              #34            // java/lang/invoke/MethodHandles$Lookup
      #30 = Utf8               Lookup
      #31 = Utf8               InnerClasses
      #32 = Utf8               (Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/invoke/CallSite;
      #33 = Class              #35            // java/lang/invoke/MethodHandles
      #34 = Utf8               java/lang/invoke/MethodHandles$Lookup
      #35 = Utf8               java/lang/invoke/MethodHandles
    {
      public com.example.javac.Java9StringDemo();
        descriptor: ()V
        flags: (0x0001) ACC_PUBLIC
        Code:
          stack=1, locals=1, args_size=1
             0: aload_0
             1: invokespecial #1                  // Method java/lang/Object."<init>":()V
             4: return
          LineNumberTable:
            line 8: 0

      public static void main(java.lang.String[]);
        descriptor: ([Ljava/lang/String;)V
        flags: (0x0009) ACC_PUBLIC, ACC_STATIC
        Code:
          stack=1, locals=3, args_size=1
             0: ldc           #2                  // String tom
             2: astore_1
             3: aload_1
             4: invokedynamic #3,  0              // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;
             9: astore_2
            10: return
          LineNumberTable:
            line 11: 0
            line 12: 3
            line 13: 10
    }
    SourceFile: "Java9StringDemo.java"
    InnerClasses:
      public static final #30= #29 of #33;    // Lookup=class java/lang/invoke/MethodHandles$Lookup of class java/lang/invoke/MethodHandles
    BootstrapMethods:
      0: #17 REF_invokeStatic java/lang/invoke/StringConcatFactory.makeConcatWithConstants:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/invoke/CallSite;
        Method arguments:
          #18 \u0001cat

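The javap output shows that, from Java 9 on, the compiler emits an invokedynamic instruction whose bootstrap method is StringConcatFactory.makeConcatWithConstants to perform the concatenation. Java 8, by contrast, compiled the same expression into a chain of StringBuilder calls.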
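For comparison, the StringBuilder form that the Java 8 compiler produced for this expression can be written out by hand. The class ConcatDesugarSketch and its two methods are hypothetical names used only for this illustration.

```java
/** Hypothetical illustration: the + operator next to its Java 8 style expansion. */
class ConcatDesugarSketch {

    /** Roughly what Java 8's javac generated for: s + "cat". */
    static String java8Style(String s) {
        return new StringBuilder().append(s).append("cat").toString();
    }

    /** On Java 9+ this compiles to a single invokedynamic call site. */
    static String plusOperator(String s) {
        return s + "cat";
    }

    public static void main(String[] args) {
        System.out.println(java8Style("tom"));   // prints tomcat
        System.out.println(plusOperator("tom")); // prints tomcat
    }
}
```

Both methods return the same result. The difference is that the invokedynamic form leaves the choice of concatenation strategy to the runtime instead of fixing it in the class file.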

StringConcatFactory.makeConcatWithConstants

java.base/java/lang/invoke/StringConcatFactory.java

    public final class StringConcatFactory {

        //......

        /**
         * Concatenation strategy to use. See {@link Strategy} for possible options.
         * This option is controllable with -Djava.lang.invoke.stringConcat JDK option.
         */
        private static Strategy STRATEGY;

        /**
         * Default strategy to use for concatenation.
         */
        private static final Strategy DEFAULT_STRATEGY = Strategy.MH_INLINE_SIZED_EXACT;

        private enum Strategy {
            /**
             * Bytecode generator, calling into {@link java.lang.StringBuilder}.
             */
            BC_SB,

            /**
             * Bytecode generator, calling into {@link java.lang.StringBuilder};
             * but trying to estimate the required storage.
             */
            BC_SB_SIZED,

            /**
             * Bytecode generator, calling into {@link java.lang.StringBuilder};
             * but computing the required storage exactly.
             */
            BC_SB_SIZED_EXACT,

            /**
             * MethodHandle-based generator, that in the end calls into {@link java.lang.StringBuilder}.
             * This strategy also tries to estimate the required storage.
             */
            MH_SB_SIZED,

            /**
             * MethodHandle-based generator, that in the end calls into {@link java.lang.StringBuilder}.
             * This strategy also estimate the required storage exactly.
             */
            MH_SB_SIZED_EXACT,

            /**
             * MethodHandle-based generator, that constructs its own byte[] array from
             * the arguments. It computes the required storage exactly.
             */
            MH_INLINE_SIZED_EXACT
        }

        static {
            // In case we need to double-back onto the StringConcatFactory during this
            // static initialization, make sure we have the reasonable defaults to complete
            // the static initialization properly. After that, actual users would use
            // the proper values we have read from the properties.
            STRATEGY = DEFAULT_STRATEGY;
            // CACHE_ENABLE = false; // implied
            // CACHE = null;         // implied
            // DEBUG = false;        // implied
            // DUMPER = null;        // implied

            Properties props = GetPropertyAction.privilegedGetProperties();
            final String strategy =
                    props.getProperty("java.lang.invoke.stringConcat");
            CACHE_ENABLE = Boolean.parseBoolean(
                    props.getProperty("java.lang.invoke.stringConcat.cache"));
            DEBUG = Boolean.parseBoolean(
                    props.getProperty("java.lang.invoke.stringConcat.debug"));
            final String dumpPath =
                    props.getProperty("java.lang.invoke.stringConcat.dumpClasses");

            STRATEGY = (strategy == null) ? DEFAULT_STRATEGY : Strategy.valueOf(strategy);
            CACHE = CACHE_ENABLE ? new ConcurrentHashMap<>() : null;
            DUMPER = (dumpPath == null) ? null : ProxyClassesDumper.getInstance(dumpPath);
        }

        public static CallSite makeConcatWithConstants(MethodHandles.Lookup lookup,
                                                       String name,
                                                       MethodType concatType,
                                                       String recipe,
                                                       Object... constants) throws StringConcatException {
            if (DEBUG) {
                System.out.println("StringConcatFactory " + STRATEGY + " is here for " + concatType + ", {" + recipe + "}, " + Arrays.toString(constants));
            }

            return doStringConcat(lookup, name, concatType, false, recipe, constants);
        }

        private static CallSite doStringConcat(MethodHandles.Lookup lookup,
                                               String name,
                                               MethodType concatType,
                                               boolean generateRecipe,
                                               String recipe,
                                               Object... constants) throws StringConcatException {
            Objects.requireNonNull(lookup, "Lookup is null");
            Objects.requireNonNull(name, "Name is null");
            Objects.requireNonNull(concatType, "Concat type is null");
            Objects.requireNonNull(constants, "Constants are null");

            for (Object o : constants) {
                Objects.requireNonNull(o, "Cannot accept null constants");
            }

            if ((lookup.lookupModes() & MethodHandles.Lookup.PRIVATE) == 0) {
                throw new StringConcatException("Invalid caller: " +
                        lookup.lookupClass().getName());
            }

            int cCount = 0;
            int oCount = 0;
            if (generateRecipe) {
                // Mock the recipe to reuse the concat generator code
                char[] value = new char[concatType.parameterCount()];
                Arrays.fill(value, TAG_ARG);
                recipe = new String(value);
                oCount = concatType.parameterCount();
            } else {
                Objects.requireNonNull(recipe, "Recipe is null");

                for (int i = 0; i < recipe.length(); i++) {
                    char c = recipe.charAt(i);
                    if (c == TAG_CONST) cCount++;
                    if (c == TAG_ARG)   oCount++;
                }
            }

            if (oCount != concatType.parameterCount()) {
                throw new StringConcatException(
                        "Mismatched number of concat arguments: recipe wants " +
                                oCount +
                                " arguments, but signature provides " +
                                concatType.parameterCount());
            }

            if (cCount != constants.length) {
                throw new StringConcatException(
                        "Mismatched number of concat constants: recipe wants " +
                                cCount +
                                " constants, but only " +
                                constants.length +
                                " are passed");
            }

            if (!concatType.returnType().isAssignableFrom(String.class)) {
                throw new StringConcatException(
                        "The return type should be compatible with String, but it is " +
                                concatType.returnType());
            }

            if (concatType.parameterSlotCount() > MAX_INDY_CONCAT_ARG_SLOTS) {
                throw new StringConcatException("Too many concat argument slots: " +
                        concatType.parameterSlotCount() +
                        ", can only accept " +
                        MAX_INDY_CONCAT_ARG_SLOTS);
            }

            String className = getClassName(lookup.lookupClass());
            MethodType mt = adaptType(concatType);
            Recipe rec = new Recipe(recipe, constants);

            MethodHandle mh;
            if (CACHE_ENABLE) {
                Key key = new Key(className, mt, rec);
                mh = CACHE.get(key);
                if (mh == null) {
                    mh = generate(lookup, className, mt, rec);
                    CACHE.put(key, mh);
                }
            } else {
                mh = generate(lookup, className, mt, rec);
            }
            return new ConstantCallSite(mh.asType(concatType));
        }

        private static MethodHandle generate(Lookup lookup, String className, MethodType mt, Recipe recipe) throws StringConcatException {
            try {
                switch (STRATEGY) {
                    case BC_SB:
                        return BytecodeStringBuilderStrategy.generate(lookup, className, mt, recipe, Mode.DEFAULT);
                    case BC_SB_SIZED:
                        return BytecodeStringBuilderStrategy.generate(lookup, className, mt, recipe, Mode.SIZED);
                    case BC_SB_SIZED_EXACT:
                        return BytecodeStringBuilderStrategy.generate(lookup, className, mt, recipe, Mode.SIZED_EXACT);
                    case MH_SB_SIZED:
                        return MethodHandleStringBuilderStrategy.generate(mt, recipe, Mode.SIZED);
                    case MH_SB_SIZED_EXACT:
                        return MethodHandleStringBuilderStrategy.generate(mt, recipe, Mode.SIZED_EXACT);
                    case MH_INLINE_SIZED_EXACT:
                        return MethodHandleInlineCopyStrategy.generate(mt, recipe);
                    default:
                        throw new StringConcatException("Concatenation strategy " + STRATEGY + " is not implemented");
                }
            } catch (Error | StringConcatException e) {
                // Pass through any error or existing StringConcatException
                throw e;
            } catch (Throwable t) {
                throw new StringConcatException("Generator failed", t);
            }
        }

        //......
    }

makeConcatWithConstants delegates to doStringConcat, which validates its arguments and then calls generate to produce a MethodHandle. generate chooses an implementation according to STRATEGY, which is one of BC_SB, BC_SB_SIZED, BC_SB_SIZED_EXACT, MH_SB_SIZED, MH_SB_SIZED_EXACT and MH_INLINE_SIZED_EXACT. The default is MH_INLINE_SIZED_EXACT, and it can be overridden with the system property -Djava.lang.invoke.stringConcat.
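To see the bootstrap method at work outside of compiler-generated code, it can also be called by hand. The sketch below assumes Java 9 or later; ManualConcat and concatWithCat are hypothetical names. The recipe "\u0001cat" is the same one that appears in the javap output: \u0001 marks the slot for the dynamic argument and "cat" is copied literally.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

/** Hypothetical demo: invoking the concat bootstrap method directly. */
class ManualConcat {

    /** Builds a call site for (String)String with recipe "\u0001cat" and invokes it. */
    static String concatWithCat(String prefix) {
        try {
            CallSite site = StringConcatFactory.makeConcatWithConstants(
                    MethodHandles.lookup(),                            // caller's lookup
                    "makeConcatWithConstants",                         // name, as in the class file
                    MethodType.methodType(String.class, String.class), // (String)String
                    "\u0001cat");                                      // recipe: one argument, then "cat"
            MethodHandle mh = site.dynamicInvoker();
            return (String) mh.invokeExact(prefix);
        } catch (Throwable t) {
            // invokeExact declares Throwable; wrap it to keep the demo simple.
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(concatWithCat("tom")); // prints tomcat
    }
}
```

In compiled code the JVM performs this bootstrap step once per call site and then reuses the resulting method handle. Building a new call site on every call, as this demo does, is only meant to make the mechanism visible.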

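Summary

Java 9 introduced Compact Strings as the successor to Java 6's Compressed Strings. The implementation is more thorough: String always stores its contents in a byte[] instead of a char[], and a new field, coder, records whether those bytes are encoded as LATIN1 or UTF16.

isLatin1() returns true only when COMPACT_STRINGS is true and coder is LATIN1. Methods such as charAt, equals, hashCode, indexOf and substring all rely on it to decide whether to delegate to StringLatin1 or to StringUTF16.

Java 9 performs string concatenation through an invokedynamic call bootstrapped by StringConcatFactory.makeConcatWithConstants. Where Java 8 always compiled concatenation into StringBuilder calls, Java 9 offers several strategies to choose from: BC_SB (equivalent to the Java 8 approach), BC_SB_SIZED, BC_SB_SIZED_EXACT, MH_SB_SIZED, MH_SB_SIZED_EXACT and MH_INLINE_SIZED_EXACT. The default is MH_INLINE_SIZED_EXACT, and it can be overridden with -Djava.lang.invoke.stringConcat.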
