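FST Serialization/Deserialization
Nearly every serialized object stored as bytes follows a similar layout. Class files, .so files, and dex files are all alike in this respect. There is little innovation in the format itself; at most, the field contents are compressed, which is the same idea behind the UTF-8 encoding we use every day.
FST's serialized form is no exception and does nothing unconventional compared with other byte-oriented formats. Take the following FST-serialized byte file: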
00000000: 0001 0f63 6f6d 2e66 7374 2e46 5354 4265
00000010: 616e f701 fc05 7630 7374 7200
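Format: Header | class name length | class name string | field 1 type (1 byte) | [length] | content | field 2 type (1 byte) | [length] | content | ...
0000: type tag of the serialized payload; 00 denotes OBJECT
0001: class name encoding; 00 denotes UTF, 01 denotes ASCII
0002: length of the class name (1 byte) = 15
0003~0011: class name string (15 bytes), "com.fst.FSTBean"
0012: Integer type tag, 0xf7
0013: Integer value = 1
0014: String type tag, 0xfc
0015: String length = 5
0016~001a: String value "v0str"
001b: END (0x00, the last of the 28 bytes shown)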
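The layout above can be checked by walking the dump byte by byte. The following is only a sketch: the class and method names are made up for illustration, it hard-codes the field order of this one dump, and it is not how FST itself decodes a stream.

```java
import java.nio.charset.StandardCharsets;

public class FstDumpWalk {
    // The 28 bytes of the hex dump, copied verbatim.
    public static final String HEX =
        "00010f636f6d2e6673742e4653544265" +
        "616ef701fc05763073747200";

    public static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /** Walks the dump using the layout listed above and returns a readable summary. */
    public static String describe(byte[] b) {
        int pos = 2;                          // skip 0000 (OBJECT) and 0001 (ASCII)
        int nameLen = b[pos++] & 0xff;        // 0002: 15
        String className = new String(b, pos, nameLen, StandardCharsets.US_ASCII);
        pos += nameLen;                       // now at 0012
        int intTag = b[pos++];                // 0xf7 read as a signed byte
        int intValue = b[pos++];              // the value 1, stored in a single byte
        int strTag = b[pos++];                // 0xfc read as a signed byte
        int strLen = b[pos++] & 0xff;         // 5
        String str = new String(b, pos, strLen, StandardCharsets.US_ASCII);
        pos += strLen;
        return className + " int(tag " + intTag + ")=" + intValue
            + " string(tag " + strTag + ")=" + str
            + " trailing=" + (b.length - pos);
    }

    public static void main(String[] args) {
        // prints: com.fst.FSTBean int(tag -9)=1 string(tag -4)=v0str trailing=1
        System.out.println(describe(hexToBytes(HEX)));
    }
}
```

The single trailing byte is the 0x00 at offset 001b.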
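As shown above, the Integer takes only one byte after serialization (its value is 1) rather than the 4 bytes an int occupies in memory, so some compression rule is clearly being applied. For the details, see how FSTObjectInput#instantiateSpecialTag reads each type. The tag constants for the different types are defined in FSTObjectOutput: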
public class FSTObjectOutput implements ObjectOutput {
private static final FSTLogger LOGGER = FSTLogger.getLogger(FSTObjectOutput.class);
public static Object NULL_PLACEHOLDER = new Object() { public String toString() { return "NULL_PLACEHOLDER"; }};
public static final byte SPECIAL_COMPATIBILITY_OBJECT_TAG = -19; // see issue 52
public static final byte ONE_OF = -18;
public static final byte BIG_BOOLEAN_FALSE = -17;
public static final byte BIG_BOOLEAN_TRUE = -16;
public static final byte BIG_LONG = -10;
public static final byte BIG_INT = -9;
public static final byte DIRECT_ARRAY_OBJECT = -8;
public static final byte HANDLE = -7;
public static final byte ENUM = -6;
public static final byte ARRAY = -5;
public static final byte STRING = -4;
public static final byte TYPED = -3; // var class == object written class
public static final byte DIRECT_OBJECT = -2;
public static final byte NULL = -1;
public static final byte OBJECT = 0;
protected FSTEncoder codec;
...
}
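The constants are signed bytes, while a hex dump shows them unsigned, which is why BIG_INT (-9) appears as f7 and STRING (-4) as fc. A small sketch of the conversion (the class and helper names are illustrative only, the two values are copied from the listing above):

```java
public class FstTagBytes {
    // Values copied from the FSTObjectOutput constants quoted above.
    public static final byte BIG_INT = -9;
    public static final byte STRING = -4;

    /** Formats a signed tag byte as the two hex digits seen in a hex dump. */
    public static String toHex(byte tag) {
        return String.format("%02x", tag & 0xff);
    }

    /** Converts a raw dump byte (0..255) back into the signed tag value. */
    public static byte fromDump(int unsigned) {
        return (byte) unsigned;
    }

    public static void main(String[] args) {
        System.out.println(toHex(BIG_INT) + " " + toHex(STRING));   // prints: f7 fc
        System.out.println(fromDump(0xf7) + " " + fromDump(0xfc)); // prints: -9 -4
    }
}
```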
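Serializing an object to bytes amounts to persisting it. If the Bean definition has changed by the time the bytes are deserialized, the deserializer needs a compatibility strategy. In JDK serialization, serialVersionUID plays an important role in version control. FST addresses the problem by ordering fields according to the @Version annotation.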
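During deserialization, FST first obtains all fields of the object's class via reflection and sorts them. Pay attention to this sorting, because it is the key to compatibility and is how @Version works. FSTClazzInfo defines a comparator, defFieldComparator, that sorts all of a Bean's Fields: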
public final class FSTClazzInfo {
public static final Comparator<FSTFieldInfo> defFieldComparator = new Comparator<FSTFieldInfo>() {
@Override
public int compare(FSTFieldInfo o1, FSTFieldInfo o2) {
int res = 0;
if ( o1.getVersion() != o2.getVersion() ) {
return o1.getVersion() < o2.getVersion() ? -1 : 1;
}
// order: version, boolean, primitives, conditionals, object references
if (o1.getType() == boolean.class && o2.getType() != boolean.class) {
return -1;
}
if (o1.getType() != boolean.class && o2.getType() == boolean.class) {
return 1;
}
if (o1.isConditional() && !o2.isConditional()) {
res = 1;
} else if (!o1.isConditional() && o2.isConditional()) {
res = -1;
} else if (o1.isPrimitive() && !o2.isPrimitive()) {
res = -1;
} else if (!o1.isPrimitive() && o2.isPrimitive())
res = 1;
// if (res == 0) // 64 bit / 32 bit issues
// res = (int) (o1.getMemOffset() - o2.getMemOffset());
if (res == 0)
res = o1.getType().getSimpleName().compareTo(o2.getType().getSimpleName());
if (res == 0)
res = o1.getName().compareTo(o2.getName());
if (res == 0) {
return o1.getField().getDeclaringClass().getName().compareTo(o2.getField().getDeclaringClass().getName());
}
return res;
}
};
...
}
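The implementation shows that the comparison looks at the Field's version first and its type afterwards, so a Field with a larger version always sorts later. To see why sorting matters, look at FSTObjectInput#instantiateAndReadNoSer: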
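To make the effect of the ordering concrete, here is a simplified sketch. It assumes a hypothetical Field class in place of FSTFieldInfo and keeps only three of the comparator's criteria (version, type simple name, field name). The field name aNewStr is also invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class FieldOrderSketch {
    /** Hypothetical stand-in for FSTFieldInfo: only the parts this ordering needs. */
    public static final class Field {
        final int version;
        final String typeName;
        final String name;
        public Field(int version, String typeName, String name) {
            this.version = version;
            this.typeName = typeName;
            this.name = name;
        }
    }

    // Simplified ordering: version first, then type simple name, then field name.
    public static final Comparator<Field> ORDER = Comparator
        .comparingInt((Field f) -> f.version)
        .thenComparing(f -> f.typeName)
        .thenComparing(f -> f.name);

    /** Returns the field names in the order a reader using ORDER would visit them. */
    public static List<String> sortedNames(Field... fields) {
        List<Field> list = new ArrayList<>(Arrays.asList(fields));
        list.sort(ORDER);
        List<String> names = new ArrayList<>();
        for (Field f : list) {
            names.add(f.name);
        }
        return names;
    }

    public static void main(String[] args) {
        Field v0int = new Field(0, "Integer", "v0int");
        Field v0str = new Field(0, "String", "v0str");
        // Unannotated (version 0): the new field lands between the existing ones.
        System.out.println(sortedNames(v0int, v0str, new Field(0, "String", "aNewStr")));
        // prints: [v0int, aNewStr, v0str]
        // Annotated with version 1: the new field is guaranteed to come last.
        System.out.println(sortedNames(v0int, v0str, new Field(1, "String", "aNewStr")));
        // prints: [v0int, v0str, aNewStr]
    }
}
```

With a higher version the new field can only be appended, so the positions of the fields already present in old streams do not move.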
public class FSTObjectInput implements ObjectInput {
protected Object instantiateAndReadNoSer(Class c, FSTClazzInfo clzSerInfo, FSTClazzInfo.FSTFieldInfo referencee, int readPos) throws Exception {
Object newObj;
newObj = clzSerInfo.newInstance(getCodec().isMapBased());
...
} else {
FSTClazzInfo.FSTFieldInfo[] fieldInfo = clzSerInfo.getFieldInfo();
readObjectFields(referencee, clzSerInfo, fieldInfo, newObj,0,0);
}
return newObj;
}
protected void readObjectFields(FSTClazzInfo.FSTFieldInfo referencee, FSTClazzInfo serializationInfo, FSTClazzInfo.FSTFieldInfo[] fieldInfo, Object newObj, int startIndex, int version) throws Exception {
if ( getCodec().isMapBased() ) {
readFieldsMapBased(referencee, serializationInfo, newObj);
if ( version >= 0 && newObj instanceof Unknown == false)
getCodec().readObjectEnd();
return;
}
if ( version < 0 )
version = 0;
int booleanMask = 0;
int boolcount = 8;
final int length = fieldInfo.length;
int conditional = 0;
for (int i = startIndex; i < length; i++) { // note this loop
try {
FSTClazzInfo.FSTFieldInfo subInfo = fieldInfo[i];
if (subInfo.getVersion() > version ) { // this field belongs to a later version
int nextVersion = getCodec().readVersionTag(); // next version tag in the object stream
if ( nextVersion == 0 ) // old object read
{
oldVersionRead(newObj);
return;
}
if ( nextVersion != subInfo.getVersion() ) { // a field's version must never change, and it must match the version tag in the stream
throw new RuntimeException("read version tag "+nextVersion+" fieldInfo has "+subInfo.getVersion());
}
readObjectFields(referencee,serializationInfo,fieldInfo,newObj,i,nextVersion); // recurse to read the next version's fields
return;
}
if (subInfo.isPrimitive()) {
...
} else {
if ( subInfo.isConditional() ) {
...
}
// object: write the value just read into newObj through the FSTFieldInfo
Object subObject = readObjectWithHeader(subInfo);
subInfo.setObjectValue(newObj, subObject);
}
...
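This code is essentially all it takes to understand how FST keeps serialization and deserialization compatible. Note the loop: it iterates over the sorted Fields, and each FSTFieldInfo records details such as its position in the object stream and its type:
Serialization:
Deserialization: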
01:18:56.442 [main] WARN Serializer-Tools - Serializer-Tools-Deserializer:use FST
unable to decode:com.fst.FSTBean÷üv0strüv1str
01:18:56.455 [main] ERROR Serializer-Tools - error in fstSerializer
java.io.IOException: java.lang.NullPointerException
at org.nustaq.serialization.FSTObjectInput.readObject(FSTObjectInput.java:247)
at org.nustaq.serialization.FSTConfiguration.asObject(FSTConfiguration.java:1150)
at com.taobao.wireless.ripple.common.serialize.FstSerializer.deserialize(FstSerializer.java:27)
at com.fst.FSTSerial.deserilize(FSTSerial.java:44)
at com.fst.FSTSerial.main(FSTSerial.java:56)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
Caused by: java.lang.NullPointerException: null
at org.nustaq.serialization.FSTObjectInput.readObjectWithHeader(FSTObjectInput.java:357)
at org.nustaq.serialization.FSTObjectInput.readObjectFields(FSTObjectInput.java:712)
at org.nustaq.serialization.FSTObjectInput.instantiateAndReadNoSer(FSTObjectInput.java:566)
at org.nustaq.serialization.FSTObjectInput.readObjectWithHeader(FSTObjectInput.java:374)
at org.nustaq.serialization.FSTObjectInput.readObjectInternal(FSTObjectInput.java:331)
at org.nustaq.serialization.FSTObjectInput.readObject(FSTObjectInput.java:311)
at org.nustaq.serialization.FSTObjectInput.readObject(FSTObjectInput.java:245)
... 9 common frames omitted
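The code above therefore implies the usage rule for @Version: every newly added Field must carry a @Version annotation whose value is the current maximum version plus one, and Fields must never be deleted.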
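The rule can be replayed against the 28-byte dump from the beginning of this article with a toy reader. This is a deliberately reduced sketch, not FST's code: the Field class is hypothetical, the recursion in readObjectFields is flattened into a loop, only the two type tags present in the dump are handled, and it assumes the trailing 0x00 byte is what the reader sees as version tag 0.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class VersionedReadSketch {
    // The 28-byte stream from the hex dump earlier in this article.
    public static final String HEX =
        "00010f636f6d2e6673742e4653544265" + "616ef701fc05763073747200";

    /** Hypothetical, stripped-down stand-in for FSTFieldInfo. */
    public static final class Field {
        final String name;
        final int version;
        public Field(String name, int version) { this.name = name; this.version = version; }
    }

    public static byte[] bytes() {
        byte[] out = new byte[HEX.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(HEX.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /** Reads fields in sorted order, checking a version tag whenever the version increases. */
    public static Map<String, Object> read(byte[] b, Field... sortedFields) {
        int pos = 3 + (b[2] & 0xff);          // skip object tag, name encoding, length, class name
        int version = 0;
        Map<String, Object> values = new LinkedHashMap<>();
        for (Field f : sortedFields) {
            if (f.version > version) {
                int tag = b[pos++];           // treated as the version tag
                if (tag == 0) {
                    return values;            // old stream: newer fields keep their defaults
                }
                if (tag != f.version) {
                    throw new RuntimeException(
                        "read version tag " + tag + " fieldInfo has " + f.version);
                }
                version = tag;
            }
            int typeTag = b[pos++];
            if (typeTag == -9) {              // BIG_INT, value fits in one byte here
                values.put(f.name, (int) b[pos++]);
            } else if (typeTag == -4) {       // STRING with a one-byte length
                int len = b[pos++] & 0xff;
                values.put(f.name, new String(b, pos, len, StandardCharsets.US_ASCII));
                pos += len;
            } else {
                throw new IllegalStateException("unexpected type tag " + typeTag);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // New field annotated with version 1: the old stream still reads cleanly.
        System.out.println(read(bytes(),
            new Field("v0int", 0), new Field("v0str", 0), new Field("v1str", 1)));
        // Existing field re-annotated with version 1: the STRING tag is misread as a version tag.
        try {
            read(bytes(), new Field("v0int", 0), new Field("v0str", 1));
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The first call prints {v0int=1, v0str=v0str} and leaves v1str unset; the second prints "read version tag -4 fieldInfo has 1".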
The Javadoc on the @Version annotation also states explicitly that it exists for backward compatibility:
package org.nustaq.serialization.annotations;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD})
/**
* support for adding fields without breaking compatibility to old streams.
* For each release of your app increment the version value. No Version annotation means version=0.
* Note that each added field needs to be annotated.
*
* e.g.
*
* class MyClass implements Serializable {
*
* // fields on initial release 1.0
* int x;
* String y;
*
* // fields added with release 1.5
* @Version(1) String added;
* @Version(1) String alsoAdded;
*
* // fields added with release 2.0
* @Version(2) String addedv2;
* @Version(2) String alsoAddedv2;
*
* }
*
* If an old class is read, new fields will be set to default values. You can register a VersionConflictListener
* at FSTObjectInput in order to fill in defaults for new fields.
*
* Notes/Limits:
* - Removing fields will break backward compatibility. You can only Add new fields.
* - Can slow down serialization over time (if many versions)
* - does not work for Externalizable or Classes which make use of JDK-special features such as readObject/writeObject
* (AKA does not work if fst has to fall back to 'compatible mode' for an object).
* - in case you use custom serializers, your custom serializer has to handle versioning
*
*/
public @interface Version {
byte value();
}
Based on the analysis above, let's verify the scenarios one by one. First, define a simple test Bean:
public class FSTBean implements Serializable {
/** serialVersionUID */
private static final long serialVersionUID = -2708653783151699375L;
private Integer v0int;
private String v0str;
}
Prepare the serialization and deserialization methods:
public class FSTSerial {
private static void serialize(FstSerializer fst, String fileName) {
try {
FSTBean fstBean = new FSTBean();
fstBean.setV0int(1);
fstBean.setV0str("v0str");
byte[] v1 = fst.serialize(fstBean);
FileOutputStream fos = new FileOutputStream(new File(fileName));
fos.write(v1, 0, v1.length);
fos.close();
} catch (Exception e) {
e.printStackTrace();
}
}
private static void deserilize(FstSerializer fst, String fileName) {
try {
FileInputStream fis = new FileInputStream(new File(fileName));
ByteArrayOutputStream baos = new ByteArrayOutputStream();
byte[] buf = new byte[256];
int length = 0;
while ((length = fis.read(buf)) > 0) {
baos.write(buf, 0, length);
}
fis.close();
buf = baos.toByteArray();
FSTBean deserial = fst.deserialize(buf, FSTBean.class);
System.out.println(deserial);
System.out.println(deserial);
} catch (Exception e) {
e.printStackTrace();
}
}
public static void main(String[] args) {
FstSerializer fst = new FstSerializer();
serialize(fst, "byte.bin");
deserilize(fst, "byte.bin");
}
}
Verify that serialization and deserialization work:
23:42:57.705 [main] WARN Serializer-Tools - Serializer-Tools:use FST
23:42:57.715 [main] WARN Serializer-Tools - Serializer-Tools-Deserializer:use FST
FSTBean(v0int=1, v0str=v0str)
FSTBean(v0int=1, v0str=v0str)
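Case 1: Only read the existing bin file, after adding a new member v1str to FSTBean. An exception is thrown, because adding a Field at an already-used version is not allowed.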
public class FSTBean implements Serializable {
/** serialVersionUID */
private static final long serialVersionUID = -2708653783151699375L;
private Integer v0int;
private String v0str;
private String v1str;
}
unable to decode:com.fst.FSTBean֟v0str
01:27:29.878 [main] ERROR Serializer-Tools - error in fstSerializer
java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 28
at org.nustaq.serialization.FSTObjectInput.readObject(FSTObjectInput.java:247)
at org.nustaq.serialization.FSTConfiguration.asObject(FSTConfiguration.java:1150)
at com.taobao.wireless.ripple.common.serialize.FstSerializer.deserialize(FstSerializer.java:27)
at com.fst.FSTSerial.deserilize(FSTSerial.java:44)
at com.fst.FSTSerial.main(FSTSerial.java:54)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 28
at org.nustaq.serialization.coders.FSTStreamDecoder.readFByte(FSTStreamDecoder.java:296)
at org.nustaq.serialization.coders.FSTStreamDecoder.readFShort(FSTStreamDecoder.java:366)
at org.nustaq.serialization.FSTClazzNameRegistry.decodeClass(FSTClazzNameRegistry.java:163)
at org.nustaq.serialization.coders.FSTStreamDecoder.readClass(FSTStreamDecoder.java:478)
at org.nustaq.serialization.FSTObjectInput.readClass(FSTObjectInput.java:938)
at org.nustaq.serialization.FSTObjectInput.readObjectWithHeader(FSTObjectInput.java:347)
at org.nustaq.serialization.FSTObjectInput.readObjectFields(FSTObjectInput.java:712)
at org.nustaq.serialization.FSTObjectInput.instantiateAndReadNoSer(FSTObjectInput.java:566)
at org.nustaq.serialization.FSTObjectInput.readObjectWithHeader(FSTObjectInput.java:374)
at org.nustaq.serialization.FSTObjectInput.readObjectInternal(FSTObjectInput.java:331)
at org.nustaq.serialization.FSTObjectInput.readObject(FSTObjectInput.java:311)
at org.nustaq.serialization.FSTObjectInput.readObject(FSTObjectInput.java:245)
... 9 common frames omitted
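Case 2: Only read the existing bin file, after adding a new member v1str to FSTBean and annotating it with @Version(1). The result is normal and matches expectations. To verify that a missing annotation means version 0, change the 1 to 0 and test again; an exception is thrown.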
public class FSTBean implements Serializable {
/** serialVersionUID */
private static final long serialVersionUID = -2708653783151699375L;
private Integer v0int;
private String v0str;
@Version(1)
private String v1str;
}
01:29:35.678 [main] WARN Serializer-Tools - Serializer-Tools-Deserializer:use FST
FSTBean(v0int=1, v0str=v0str, v1str=null)
Process finished with exit code 0
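Case 3: Only read the existing bin file, after adding @Version(1) to the existing v0str. An exception is thrown, which shows that once a Field's version has been set it must not change, otherwise backward compatibility breaks. Compare the exception with the nextVersion != subInfo.getVersion() check mentioned earlier and the cause is clear: the -4 in the message is the STRING tag byte (0xfc), which the reader consumed as if it were a version tag.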
public class FSTBean implements Serializable {
/** serialVersionUID */
private static final long serialVersionUID = -2708653783151699375L;
private Integer v0int;
@Version(1)
private String v0str;
}
01:32:58.140 [main] WARN Serializer-Tools - Serializer-Tools-Deserializer:use FST
unable to decode:com.fst.FSTBean֟v0str
01:32:58.157 [main] ERROR Serializer-Tools - error in fstSerializer
java.io.IOException: java.lang.RuntimeException: read version tag -4 fieldInfo has 1
at org.nustaq.serialization.FSTObjectInput.readObject(FSTObjectInput.java:247)
at org.nustaq.serialization.FSTConfiguration.asObject(FSTConfiguration.java:1150)
at com.taobao.wireless.ripple.common.serialize.FstSerializer.deserialize(FstSerializer.java:27)
at com.fst.FSTSerial.deserilize(FSTSerial.java:44)
at com.fst.FSTSerial.main(FSTSerial.java:54)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
Caused by: java.lang.RuntimeException: read version tag -4 fieldInfo has 1
at org.nustaq.serialization.FSTObjectInput.readObjectFields(FSTObjectInput.java:674)
at org.nustaq.serialization.FSTObjectInput.instantiateAndReadNoSer(FSTObjectInput.java:566)
at org.nustaq.serialization.FSTObjectInput.readObjectWithHeader(FSTObjectInput.java:374)
at org.nustaq.serialization.FSTObjectInput.readObjectInternal(FSTObjectInput.java:331)
at org.nustaq.serialization.FSTObjectInput.readObject(FSTObjectInput.java:311)
at org.nustaq.serialization.FSTObjectInput.readObject(FSTObjectInput.java:245)
... 9 common frames omitted
null
Process finished with exit code 0
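Case 4: Gaps between versions. Suppose the highest version already written to the object stream is 10 and you want to define a new Field with version=5 in the in-memory Bean. Because of the sorting rule, this still throws, so versions must only ever increase and never go back.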
public class FSTBean implements Serializable {
/** serialVersionUID */
private static final long serialVersionUID = -2708653783151699375L;
private Integer v0int;
@Version(10)
private String v0str;
@Version(5) // versions must never go backwards!
private String v5str; // hypothetical name for the newly added field
}
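Case 5: Deleting the existing v0str causes no problem, but deleting the existing v0int does, because sorting is involved. Do not count on being able to delete fields: relying on an unstable sort order for compatibility will fail sooner or later.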
public class FSTBean implements Serializable {
/** serialVersionUID */
private static final long serialVersionUID = -2708653783151699375L;
private Integer v0int;
private String v0str;
}
Deleting v0str works:
01:40:00.548 [main] WARN Serializer-Tools - Serializer-Tools-Deserializer:use FST
FSTBean(v0int=1)
Deleting v0int fails:
01:38:39.479 [main] WARN Serializer-Tools - Serializer-Tools-Deserializer:use FST
java.lang.NullPointerException
at java.lang.String.length(String.java:623)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:420)
at java.lang.StringBuilder.append(StringBuilder.java:136)
at com.fst.FSTBean.toString(FSTBean.java:12)
at java.lang.String.valueOf(String.java:2994)
at java.io.PrintStream.println(PrintStream.java:821)
at com.fst.FSTSerial.deserilize(FSTSerial.java:45)
at com.fst.FSTSerial.main(FSTSerial.java:54)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
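Why the two deletions in Case 5 behave differently can be pictured with a toy model. The byte layout shown at the start carries no field names, so values can only be matched to fields by position. The sketch below illustrates just that pairing; the class and method names are invented, and it does not reproduce FST's actual failure, which surfaces as the NullPointerException above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PositionalPairingSketch {
    /** Pairs stream values with field names purely by position. */
    public static Map<String, Object> pair(List<Object> streamValues, List<String> readerFields) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (int i = 0; i < readerFields.size() && i < streamValues.size(); i++) {
            out.put(readerFields.get(i), streamValues.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        // Values in the order the old writer emitted them: v0int, then v0str.
        List<Object> stream = Arrays.<Object>asList(1, "v0str");
        // Reader without v0str: the first slot still lines up with v0int.
        System.out.println(pair(stream, Arrays.asList("v0int")));  // prints: {v0int=1}
        // Reader without v0int: v0str is now first and receives the Integer.
        System.out.println(pair(stream, Arrays.asList("v0str")));  // prints: {v0str=1}
    }
}
```

Deleting the last field in sort order leaves every earlier slot aligned, while deleting an earlier one shifts everything behind it.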
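That is about enough examples. Why look into this at all? During a multi-tenant cluster release, old and new code ended up consuming each other's messages, while the message-body Bean is defined in the SDK. Nobody noticed the change after the version upgrade: the pre-release environment runs on a single machine, so no compatibility issue could surface there, and the problem only appeared once the release reached the cluster.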