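Preface
This article takes a look at Compact Strings in Java 9.
Compressed Strings (Java 6)
Java 6 introduced Compressed Strings: strings that needed only one byte per character were stored in a byte[], while strings that needed two bytes per character continued to use a char[]. The feature was enabled with -XX:+UseCompressedStrings, but the option was deprecated in Java 7 and removed in Java 8.
Compact Strings (Java 9)
Java 9 introduced Compact Strings as a replacement for Java 6's Compressed Strings. The implementation is more thorough: a String is always backed by a byte[] instead of a char[], and a new field, coder, records whether those bytes are encoded as LATIN1 or UTF16.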
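To make the storage trade-off concrete, here is a minimal sketch of how a coder could be chosen and how many bytes the backing array would then need. The class CompactStorageSketch and its methods are hypothetical names used for illustration only; they are not the JDK's internal code, which performs this check with its own helpers.

```java
final class CompactStorageSketch {
    // Same numeric values as the LATIN1 and UTF16 constants in java.lang.String.
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    // Picks LATIN1 when every char fits into one byte (value <= 0xFF), otherwise UTF16.
    static byte chooseCoder(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return UTF16;
            }
        }
        return LATIN1;
    }

    // Size of the backing byte[]: one byte per char for LATIN1, two for UTF16.
    static int storageBytes(String s) {
        return s.length() << chooseCoder(s);
    }

    public static void main(String[] args) {
        System.out.println(storageBytes("tomcat"));    // 6 chars, all Latin-1
        System.out.println(storageBytes("tom\u732B")); // 4 chars, one outside Latin-1
    }
}
```

Because the coder doubles as a shift amount (0 or 1), a single non-Latin-1 character forces the whole string into two bytes per character in this sketch.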
String
java.base/java/lang/String.java
public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence,
               Constable, ConstantDesc {

    /**
     * The value is used for character storage.
     *
     * @implNote This field is trusted by the VM, and is a subject to
     * constant folding if String instance is constant. Overwriting this
     * field after construction will cause problems.
     *
     * Additionally, it is marked with {@link Stable} to trust the contents
     * of the array. No other facility in JDK provides this functionality (yet).
     * {@link Stable} is safe here, because value is never null.
     */
    @Stable
    private final byte[] value;

    /**
     * The identifier of the encoding used to encode the bytes in
     * {@code value}. The supported values in this implementation are
     *
     * LATIN1
     * UTF16
     *
     * @implNote This field is trusted by the VM, and is a subject to
     * constant folding if String instance is constant. Overwriting this
     * field after construction will cause problems.
     */
    private final byte coder;

    /** Cache the hash code for the string */
    private int hash; // Default to 0

    /** use serialVersionUID from JDK 1.0.2 for interoperability */
    private static final long serialVersionUID = -6849794470754667710L;

    /**
     * If String compaction is disabled, the bytes in {@code value} are
     * always encoded in UTF16.
     *
     * For methods with several possible implementation paths, when String
     * compaction is disabled, only one code path is taken.
     *
     * The instance field value is generally opaque to optimizing JIT
     * compilers. Therefore, in performance-sensitive place, an explicit
     * check of the static boolean {@code COMPACT_STRINGS} is done first
     * before checking the {@code coder} field since the static boolean
     * {@code COMPACT_STRINGS} would be constant folded away by an
     * optimizing JIT compiler. The idioms for these cases are as follows.
     *
     * For code such as:
     *
     *    if (coder == LATIN1) { ... }
     *
     * can be written more optimally as
     *
     *    if (coder() == LATIN1) { ... }
     *
     * or:
     *
     *    if (COMPACT_STRINGS && coder == LATIN1) { ... }
     *
     * An optimizing JIT compiler can fold the above conditional as:
     *
     *    COMPACT_STRINGS == true  => if (coder == LATIN1) { ... }
     *    COMPACT_STRINGS == false => if (false)           { ... }
     *
     * @implNote
     * The actual value for this field is injected by JVM. The static
     * initialization block is used to set the value here to communicate
     * that this static final field is not statically foldable, and to
     * avoid any possible circular dependency during vm initialization.
     */
    static final boolean COMPACT_STRINGS;

    static {
        COMPACT_STRINGS = true;
    }

    /**
     * Class String is special cased within the Serialization Stream Protocol.
     *
     * A String instance is written into an ObjectOutputStream according to
     * Object Serialization Specification, Section 6.2, "Stream Elements"
     */
    private static final ObjectStreamField[] serialPersistentFields =
        new ObjectStreamField[0];

    /**
     * Initializes a newly created {@code String} object so that it represents
     * an empty character sequence. Note that use of this constructor is
     * unnecessary since Strings are immutable.
     */
    public String() {
        this.value = "".value;
        this.coder = "".coder;
    }

    //......

    public char charAt(int index) {
        if (isLatin1()) {
            return StringLatin1.charAt(value, index);
        } else {
            return StringUTF16.charAt(value, index);
        }
    }

    public boolean equals(Object anObject) {
        if (this == anObject) {
            return true;
        }
        if (anObject instanceof String) {
            String aString = (String)anObject;
            if (coder() == aString.coder()) {
                return isLatin1() ? StringLatin1.equals(value, aString.value)
                                  : StringUTF16.equals(value, aString.value);
            }
        }
        return false;
    }

    public int compareTo(String anotherString) {
        byte v1[] = value;
        byte v2[] = anotherString.value;
        if (coder() == anotherString.coder()) {
            return isLatin1() ? StringLatin1.compareTo(v1, v2)
                              : StringUTF16.compareTo(v1, v2);
        }
        return isLatin1() ? StringLatin1.compareToUTF16(v1, v2)
                          : StringUTF16.compareToLatin1(v1, v2);
    }

    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            hash = h = isLatin1() ? StringLatin1.hashCode(value)
                                  : StringUTF16.hashCode(value);
        }
        return h;
    }

    public int indexOf(int ch, int fromIndex) {
        return isLatin1() ? StringLatin1.indexOf(value, ch, fromIndex)
                          : StringUTF16.indexOf(value, ch, fromIndex);
    }

    public String substring(int beginIndex) {
        if (beginIndex < 0) {
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        int subLen = length() - beginIndex;
        if (subLen < 0) {
            throw new StringIndexOutOfBoundsException(subLen);
        }
        if (beginIndex == 0) {
            return this;
        }
        return isLatin1() ? StringLatin1.newString(value, beginIndex, subLen)
                          : StringUTF16.newString(value, beginIndex, subLen);
    }

    //......

    byte coder() {
        return COMPACT_STRINGS ? coder : UTF16;
    }

    byte[] value() {
        return value;
    }

    private boolean isLatin1() {
        return COMPACT_STRINGS && coder == LATIN1;
    }

    @Native static final byte LATIN1 = 0;
    @Native static final byte UTF16  = 1;

    //......
}
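COMPACT_STRINGS defaults to true, so the feature is enabled by default. As the Javadoc notes, the actual value is injected by the JVM; the feature can be turned off with the JVM option -XX:-CompactStrings.
The coder() method returns the coder field when COMPACT_STRINGS is true and UTF16 otherwise. The isLatin1() method returns true only when COMPACT_STRINGS is true and coder is LATIN1.
Methods such as charAt, equals, hashCode, indexOf and substring all rely on isLatin1() to decide whether to delegate to StringLatin1 or to StringUTF16.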
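The dispatch pattern described above can be illustrated with a small stand-alone class. This is a simplified sketch, not the JDK implementation: CompactTextSketch is a hypothetical name, the COMPACT_STRINGS switch is left out, and the big-endian byte order used for the two-byte case is an arbitrary choice made for the sketch.

```java
final class CompactTextSketch {
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    private final byte[] value; // one or two bytes per char, depending on coder
    private final byte coder;

    CompactTextSketch(String s) {
        boolean latin1 = s.chars().allMatch(c -> c <= 0xFF);
        this.coder = latin1 ? LATIN1 : UTF16;
        this.value = new byte[s.length() << coder];
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (latin1) {
                value[i] = (byte) c;
            } else {
                // Big-endian layout, chosen only for this sketch.
                value[2 * i] = (byte) (c >> 8);
                value[2 * i + 1] = (byte) c;
            }
        }
    }

    boolean isLatin1() {
        return coder == LATIN1;
    }

    int length() {
        return value.length >> coder;
    }

    // Branches on the coder, as String.charAt does via isLatin1().
    char charAt(int index) {
        if (index < 0 || index >= length()) {
            throw new StringIndexOutOfBoundsException(index);
        }
        if (isLatin1()) {
            return (char) (value[index] & 0xFF);
        }
        int hi = value[2 * index] & 0xFF;
        int lo = value[2 * index + 1] & 0xFF;
        return (char) ((hi << 8) | lo);
    }

    public static void main(String[] args) {
        CompactTextSketch latin = new CompactTextSketch("tom");
        CompactTextSketch wide = new CompactTextSketch("to\u732B");
        System.out.println(latin.isLatin1() + " " + latin.charAt(1));
        System.out.println(wide.isLatin1() + " " + wide.length());
    }
}
```

Callers see the same charAt contract in both cases; only the private representation differs, which is why the change could be made without touching the public String API.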
StringConcatFactory
Example
public class Java9StringDemo {
    public static void main(String[] args) {
        String stringLiteral = "tom";
        String stringObject = stringLiteral + "cat";
    }
}
In this code, stringObject is built by concatenating the variable stringLiteral with the constant "cat".
javap
javac src/main/java/com/example/javac/Java9StringDemo.java
javap -v src/main/java/com/example/javac/Java9StringDemo.class
  Last modified 2019年4月7日; size 770 bytes
  MD5 checksum fecfca9c829402c358c4d5cb948004ff
  Compiled from "Java9StringDemo.java"
public class com.example.javac.Java9StringDemo
  minor version: 0
  major version: 56
  flags: (0x0021) ACC_PUBLIC, ACC_SUPER
  this_class: #4                          // com/example/javac/Java9StringDemo
  super_class: #5                         // java/lang/Object
  interfaces: 0, fields: 0, methods: 2, attributes: 3
Constant pool:
   #1 = Methodref          #5.#14         // java/lang/Object."
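The javap output shows that Java 9 compiles the concatenation to an invokedynamic instruction that is bootstrapped by StringConcatFactory.makeConcatWithConstants, whereas Java 8 compiles the same expression to a chain of StringBuilder calls.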
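To see what the bootstrap method produces, it can also be called by hand. The sketch below builds a call site for the expression arg + "cat" and invokes it. The class and method names (IndyConcatSketch, buildConcatHandle, appendCat) are hypothetical; the recipe string uses the character \1 as the placeholder for a dynamic argument, as documented for makeConcatWithConstants. This is only an illustration of the API, not the exact bytecode javac emits.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

final class IndyConcatSketch {

    // Builds a method handle of type (String)String that appends "cat" to its argument.
    static MethodHandle buildConcatHandle() throws Exception {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodType concatType = MethodType.methodType(String.class, String.class);
        // Recipe: one dynamic argument (tag character 1) followed by the literal "cat".
        String recipe = "\u0001cat";
        CallSite site = StringConcatFactory.makeConcatWithConstants(
                lookup, "makeConcatWithConstants", concatType, recipe);
        return site.dynamicInvoker();
    }

    // Convenience wrapper that hides the checked Throwable of invokeExact.
    static String appendCat(String prefix) {
        try {
            return (String) buildConcatHandle().invokeExact(prefix);
        } catch (Throwable t) {
            throw new IllegalStateException("concatenation call site failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(appendCat("tom")); // prints "tomcat"
    }
}
```

In compiled code the JVM runs the bootstrap once per call site and then reuses the linked method handle; the sketch rebuilds the handle on every call only to stay short.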
StringConcatFactory.makeConcatWithConstants
java.base/java/lang/invoke/StringConcatFactory.java
public final class StringConcatFactory {

    //......

    /**
     * Concatenation strategy to use. See {@link Strategy} for possible options.
     * This option is controllable with -Djava.lang.invoke.stringConcat JDK option.
     */
    private static Strategy STRATEGY;

    /**
     * Default strategy to use for concatenation.
     */
    private static final Strategy DEFAULT_STRATEGY = Strategy.MH_INLINE_SIZED_EXACT;

    private enum Strategy {
        /**
         * Bytecode generator, calling into {@link java.lang.StringBuilder}.
         */
        BC_SB,

        /**
         * Bytecode generator, calling into {@link java.lang.StringBuilder};
         * but trying to estimate the required storage.
         */
        BC_SB_SIZED,

        /**
         * Bytecode generator, calling into {@link java.lang.StringBuilder};
         * but computing the required storage exactly.
         */
        BC_SB_SIZED_EXACT,

        /**
         * MethodHandle-based generator, that in the end calls into {@link java.lang.StringBuilder}.
         * This strategy also tries to estimate the required storage.
         */
        MH_SB_SIZED,

        /**
         * MethodHandle-based generator, that in the end calls into {@link java.lang.StringBuilder}.
         * This strategy also estimate the required storage exactly.
         */
        MH_SB_SIZED_EXACT,

        /**
         * MethodHandle-based generator, that constructs its own byte[] array from
         * the arguments. It computes the required storage exactly.
         */
        MH_INLINE_SIZED_EXACT
    }

    static {
        // In case we need to double-back onto the StringConcatFactory during this
        // static initialization, make sure we have the reasonable defaults to complete
        // the static initialization properly. After that, actual users would use
        // the proper values we have read from the properties.
        STRATEGY = DEFAULT_STRATEGY;
        // CACHE_ENABLE = false; // implied
        // CACHE = null;         // implied
        // DEBUG = false;        // implied
        // DUMPER = null;        // implied

        Properties props = GetPropertyAction.privilegedGetProperties();
        final String strategy =
                props.getProperty("java.lang.invoke.stringConcat");
        CACHE_ENABLE = Boolean.parseBoolean(
                props.getProperty("java.lang.invoke.stringConcat.cache"));
        DEBUG = Boolean.parseBoolean(
                props.getProperty("java.lang.invoke.stringConcat.debug"));
        final String dumpPath =
                props.getProperty("java.lang.invoke.stringConcat.dumpClasses");

        STRATEGY = (strategy == null) ? DEFAULT_STRATEGY : Strategy.valueOf(strategy);
        CACHE = CACHE_ENABLE ? new ConcurrentHashMap<>() : null;
        DUMPER = (dumpPath == null) ? null : ProxyClassesDumper.getInstance(dumpPath);
    }

    public static CallSite makeConcatWithConstants(MethodHandles.Lookup lookup,
                                                   String name,
                                                   MethodType concatType,
                                                   String recipe,
                                                   Object... constants) throws StringConcatException {
        if (DEBUG) {
            System.out.println("StringConcatFactory " + STRATEGY + " is here for " + concatType
                    + ", {" + recipe + "}, " + Arrays.toString(constants));
        }

        return doStringConcat(lookup, name, concatType, false, recipe, constants);
    }

    private static CallSite doStringConcat(MethodHandles.Lookup lookup,
                                           String name,
                                           MethodType concatType,
                                           boolean generateRecipe,
                                           String recipe,
                                           Object... constants) throws StringConcatException {
        Objects.requireNonNull(lookup, "Lookup is null");
        Objects.requireNonNull(name, "Name is null");
        Objects.requireNonNull(concatType, "Concat type is null");
        Objects.requireNonNull(constants, "Constants are null");

        for (Object o : constants) {
            Objects.requireNonNull(o, "Cannot accept null constants");
        }

        if ((lookup.lookupModes() & MethodHandles.Lookup.PRIVATE) == 0) {
            throw new StringConcatException("Invalid caller: " +
                    lookup.lookupClass().getName());
        }

        int cCount = 0;
        int oCount = 0;
        if (generateRecipe) {
            // Mock the recipe to reuse the concat generator code
            char[] value = new char[concatType.parameterCount()];
            Arrays.fill(value, TAG_ARG);
            recipe = new String(value);
            oCount = concatType.parameterCount();
        } else {
            Objects.requireNonNull(recipe, "Recipe is null");

            for (int i = 0; i < recipe.length(); i++) {
                char c = recipe.charAt(i);
                if (c == TAG_CONST) cCount++;
                if (c == TAG_ARG) oCount++;
            }
        }

        if (oCount != concatType.parameterCount()) {
            throw new StringConcatException(
                    "Mismatched number of concat arguments: recipe wants " +
                            oCount +
                            " arguments, but signature provides " +
                            concatType.parameterCount());
        }

        if (cCount != constants.length) {
            throw new StringConcatException(
                    "Mismatched number of concat constants: recipe wants " +
                            cCount +
                            " constants, but only " +
                            constants.length +
                            " are passed");
        }

        if (!concatType.returnType().isAssignableFrom(String.class)) {
            throw new StringConcatException(
                    "The return type should be compatible with String, but it is " +
                            concatType.returnType());
        }

        if (concatType.parameterSlotCount() > MAX_INDY_CONCAT_ARG_SLOTS) {
            throw new StringConcatException("Too many concat argument slots: " +
                    concatType.parameterSlotCount() +
                    ", can only accept " +
                    MAX_INDY_CONCAT_ARG_SLOTS);
        }

        String className = getClassName(lookup.lookupClass());
        MethodType mt = adaptType(concatType);
        Recipe rec = new Recipe(recipe, constants);

        MethodHandle mh;
        if (CACHE_ENABLE) {
            Key key = new Key(className, mt, rec);
            mh = CACHE.get(key);
            if (mh == null) {
                mh = generate(lookup, className, mt, rec);
                CACHE.put(key, mh);
            }
        } else {
            mh = generate(lookup, className, mt, rec);
        }
        return new ConstantCallSite(mh.asType(concatType));
    }

    private static MethodHandle generate(Lookup lookup, String className, MethodType mt, Recipe recipe)
            throws StringConcatException {
        try {
            switch (STRATEGY) {
                case BC_SB:
                    return BytecodeStringBuilderStrategy.generate(lookup, className, mt, recipe, Mode.DEFAULT);
                case BC_SB_SIZED:
                    return BytecodeStringBuilderStrategy.generate(lookup, className, mt, recipe, Mode.SIZED);
                case BC_SB_SIZED_EXACT:
                    return BytecodeStringBuilderStrategy.generate(lookup, className, mt, recipe, Mode.SIZED_EXACT);
                case MH_SB_SIZED:
                    return MethodHandleStringBuilderStrategy.generate(mt, recipe, Mode.SIZED);
                case MH_SB_SIZED_EXACT:
                    return MethodHandleStringBuilderStrategy.generate(mt, recipe, Mode.SIZED_EXACT);
                case MH_INLINE_SIZED_EXACT:
                    return MethodHandleInlineCopyStrategy.generate(mt, recipe);
                default:
                    throw new StringConcatException("Concatenation strategy " + STRATEGY + " is not implemented");
            }
        } catch (Error | StringConcatException e) {
            // Pass through any error or existing StringConcatException
            throw e;
        } catch (Throwable t) {
            throw new StringConcatException("Generator failed", t);
        }
    }

    //......
}
makeConcatWithConstants delegates to doStringConcat, which validates its arguments and then calls generate to produce a MethodHandle. generate picks the generator according to STRATEGY, which is one of BC_SB, BC_SB_SIZED, BC_SB_SIZED_EXACT, MH_SB_SIZED, MH_SB_SIZED_EXACT and MH_INLINE_SIZED_EXACT. The default is MH_INLINE_SIZED_EXACT; it can be overridden with the system property -Djava.lang.invoke.stringConcat.
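The recipe validation that doStringConcat performs before generating a handle can be sketched in isolation. The class RecipeCountSketch and its methods are hypothetical and use IllegalArgumentException instead of StringConcatException to stay self-contained; the tag characters (1 for an argument, 2 for a constant) follow the Javadoc of makeConcatWithConstants.

```java
final class RecipeCountSketch {
    // Tag characters used inside a recipe string.
    static final char TAG_ARG = '\u0001';   // placeholder for a dynamic argument
    static final char TAG_CONST = '\u0002'; // placeholder for a bootstrap constant

    // Returns {argumentCount, constantCount} for the given recipe.
    static int[] countTags(String recipe) {
        int args = 0;
        int consts = 0;
        for (int i = 0; i < recipe.length(); i++) {
            char c = recipe.charAt(i);
            if (c == TAG_ARG) args++;
            if (c == TAG_CONST) consts++;
        }
        return new int[] {args, consts};
    }

    // Rejects a recipe whose tag counts do not match the call-site signature.
    static void validate(String recipe, int parameterCount, int constantCount) {
        int[] counts = countTags(recipe);
        if (counts[0] != parameterCount) {
            throw new IllegalArgumentException(
                    "recipe wants " + counts[0] + " arguments, but signature provides " + parameterCount);
        }
        if (counts[1] != constantCount) {
            throw new IllegalArgumentException(
                    "recipe wants " + counts[1] + " constants, but " + constantCount + " are passed");
        }
    }

    public static void main(String[] args) {
        // Recipe for stringLiteral + "cat": one argument, no separate constants.
        validate("\u0001cat", 1, 0);
        System.out.println("recipe accepted");
    }
}
```

Literal text such as "cat" lives directly in the recipe, so the demo from the example section needs one argument tag and no constant tags.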
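Summary
Java 9 introduced Compact Strings as a replacement for Java 6's Compressed Strings. The implementation is more thorough: a String is always backed by a byte[] instead of a char[], and a new field, coder, records whether those bytes are encoded as LATIN1 or UTF16.
isLatin1() returns true only when COMPACT_STRINGS is true and coder is LATIN1. Methods such as charAt, equals, hashCode, indexOf and substring all rely on isLatin1() to decide whether to delegate to StringLatin1 or to StringUTF16.
Java 9 optimizes string concatenation by using invokedynamic to call StringConcatFactory.makeConcatWithConstants. Where Java 8 rewrote concatenation into StringBuilder calls, Java 9 offers several strategies: BC_SB (equivalent to the Java 8 approach), BC_SB_SIZED, BC_SB_SIZED_EXACT, MH_SB_SIZED, MH_SB_SIZED_EXACT and MH_INLINE_SIZED_EXACT. The default is MH_INLINE_SIZED_EXACT; it can be overridden with the system property -Djava.lang.invoke.stringConcat.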