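Serialization converts a data object into a format that can be stored or transmitted; deserialization is the reverse operation, restoring an object from the serialized data. The most common serialization formats are JSON and XML, both widely accepted industry standards. Java additionally provides the Serializable interface for serializing and deserializing objects.
The Serializable interface is defined in the java.io package. It declares no fields or methods and serves only to mark a class as serializable. When serialized, an object is written as bytes to an OutputStream (BufferedOutputStream, ByteArrayOutputStream, DataOutputStream, FilterOutputStream, and so on), which can be stored in a local file or sent over a network. Deserialization parses the bytes from an InputStream and creates a new object from them.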
import java.io.Serializable;
import java.util.List;

public class Student implements Serializable {
private static final long serialVersionUID = 1L;
private int id;
private String name;
private boolean verified;
private List<Phone> phones;
public int getId() {
return id;
}
public void setId(int id) {
this.id = id;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public void setVerified(boolean verified) {
this.verified = verified;
}
public boolean isVerified() {
return verified;
}
public List<Phone> getPhones() {
return phones;
}
public void setPhones(List<Phone> phones) {
this.phones = phones;
}
static class Phone implements Serializable {
private static final long serialVersionUID = 2L;
private String number;
private String type;
public String getNumber() {
return number;
}
public void setNumber(String number) {
this.number = number;
}
public String getType() {
return type;
}
public void setType(String type) {
this.type = type;
}
}
}
Student student = new Student();
student.setId(1001);
student.setName("子云心");
student.setVerified(true);
List<Student.Phone> phoneList = new ArrayList<>();
Student.Phone phone1 = new Student.Phone();
phone1.setType("MOBILE");
phone1.setNumber("12345678910");
phoneList.add(phone1);
Student.Phone phone2 = new Student.Phone();
phone2.setType("HOME");
phone2.setNumber("0000-1234567");
phoneList.add(phone2);
student.setPhones(phoneList);
try {
byte[] byteData = serialization(student);
Student student2 = deserialization(byteData);
} catch (IOException e) {
e.printStackTrace();
} catch (ClassNotFoundException e) {
e.printStackTrace();
}
private byte[] serialization(Student student) throws IOException {
    // To save to a local file instead:
    // ObjectOutputStream objectOutputStream = new ObjectOutputStream(new FileOutputStream("test.txt"));
    // Here the bytes are collected in a ByteArrayOutputStream
    ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
    // try-with-resources flushes and closes the stream even if writeObject throws
    try (ObjectOutputStream objectOutputStream = new ObjectOutputStream(byteArrayOutputStream)) {
        objectOutputStream.writeObject(student);
    }
    return byteArrayOutputStream.toByteArray();
}
private Student deserialization(byte[] byteData) throws IOException, ClassNotFoundException {
    ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(byteData);
    // try-with-resources closes the stream even if readObject throws
    try (ObjectInputStream objectInputStream = new ObjectInputStream(byteArrayInputStream)) {
        return (Student) objectInputStream.readObject();
    }
}
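A class that implements Serializable should normally declare a static final long field named serialVersionUID, because the serialization runtime needs a version marker for the class in order to serialize and deserialize its instances. Note the following points:
Serialization can also be fully customized: implement the Externalizable interface instead of Serializable and provide the writeExternal and readExternal methods, which then control how every field is serialized and deserialized.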
public class Student implements Externalizable {
private static final long serialVersionUID = 1L;
private int id;
private String name;
private boolean verified;
private List<Phone> phones;
……
@Override
public void writeExternal(ObjectOutput out) throws IOException {
out.writeObject(id + 90000);
out.writeObject(name + ",你好");
out.writeObject(verified);
out.writeObject(phones);
}
@Override
public void readExternal(ObjectInput in) throws ClassNotFoundException, IOException {
id = (int)in.readObject();
name = (String) in.readObject();
verified = (boolean) in.readObject();
phones = (List<Phone>) in.readObject();
}
}
Points to note when using it:
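One detail that is easy to miss with Externalizable: deserialization creates the object through the class's public no-argument constructor and then calls readExternal, so the fields have to be read back in exactly the order they were written. The following is only a minimal sketch under those assumptions. The Account class and the roundTrip helper are hypothetical names invented for illustration, and the sketch writes primitives with writeInt and writeUTF instead of boxing them through writeObject.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical demo; Account is not part of the article's Student example.
class ExternalizableDemo {

    public static class Account implements Externalizable {
        private int id;
        private String name;

        // Deserialization instantiates the class through this public no-arg constructor.
        public Account() {
        }

        public Account(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(id);   // primitive written directly, no boxing
            out.writeUTF(name); // assumes name is non-null
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            id = in.readInt();   // same order as in writeExternal
            name = in.readUTF();
        }

        @Override
        public String toString() {
            return id + ":" + name;
        }
    }

    // Serializes the account to a byte array and reads it back as a new object.
    static Account roundTrip(Account account) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(account);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return (Account) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new Account(1001, "Ada"))); // prints 1001:Ada
    }
}
```

If the no-argument constructor is removed or made non-public, deserialization fails with an InvalidClassException.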
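Serializable also supports custom serialization of selected fields. If some fields need special handling, mark them with the transient modifier so that default serialization and deserialization skip them, then add writeObject(ObjectOutputStream objectOutputStream) and readObject(ObjectInputStream objectInputStream) methods to the class to serialize and deserialize those fields yourself.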
public class Student implements Serializable {
private static final long serialVersionUID = 1L;
private transient int id;
private transient String name;
private boolean verified;
……
private void writeObject(ObjectOutputStream objectOutputStream) throws IOException {
objectOutputStream.defaultWriteObject();
objectOutputStream.writeObject(id + 90000);
objectOutputStream.writeObject(name + ",你好");
}
private void readObject(ObjectInputStream objectInputStream) throws IOException, ClassNotFoundException {
objectInputStream.defaultReadObject();
id = (int)objectInputStream.readObject();
name = (String) objectInputStream.readObject();
}
}
Points to note when using it:
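To see what the transient modifier and the two hook methods actually change, it helps to compare a class that only marks a field transient with one that also defines writeObject and readObject. The sketch below does that. PlainLogin, CustomLogin and the roundTrip helper are hypothetical names, and reversing the string is just a stand-in for whatever special handling a real field would need. The hook methods must be private and have exactly these signatures, otherwise the serialization runtime does not pick them up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo classes, not taken from the article.
class TransientDemo {

    // transient field and no hook methods: the value is simply dropped.
    static class PlainLogin implements Serializable {
        private static final long serialVersionUID = 1L;
        String user = "ada";
        transient String token = "s3cret";
    }

    // Same fields, but writeObject/readObject store the token in reversed form.
    static class CustomLogin implements Serializable {
        private static final long serialVersionUID = 1L;
        String user = "ada";
        transient String token = "s3cret";

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject(); // writes the non-transient field 'user'
            out.writeObject(new StringBuilder(token).reverse().toString());
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();   // restores 'user'
            String stored = (String) in.readObject();
            token = new StringBuilder(stored).reverse().toString();
        }
    }

    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T obj) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    static String plainToken() {
        return roundTrip(new PlainLogin()).token;
    }

    static String customToken() {
        return roundTrip(new CustomLogin()).token;
    }

    public static void main(String[] args) {
        System.out.println(plainToken());  // prints null: the transient value was skipped
        System.out.println(customToken()); // prints s3cret: restored by readObject
    }
}
```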
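As the examples above show, serialization means creating an ObjectOutputStream and calling its writeObject method, so that is the place to start when studying how serialization works.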
ObjectOutputStream.java
public ObjectOutputStream(OutputStream out) throws IOException {
verifySubclass();
// Create a BlockDataOutputStream, bout, which serves as the underlying block-data output stream
bout = new BlockDataOutputStream(out);
……
// Write the stream header: STREAM_MAGIC identifies the serialization protocol, STREAM_VERSION is the protocol version
writeStreamHeader();
……
}
final static short STREAM_MAGIC = (short)0xaced;
final static short STREAM_VERSION = 5;
protected void writeStreamHeader() throws IOException {
bout.writeShort(STREAM_MAGIC);
bout.writeShort(STREAM_VERSION);
}
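The key logic in the ObjectOutputStream constructor is: it wraps the underlying stream in a BlockDataOutputStream (bout), then calls writeStreamHeader to write the magic number STREAM_MAGIC (0xACED) followed by the protocol version STREAM_VERSION (5).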
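The header is easy to check from the outside. The sketch below creates an ObjectOutputStream over a ByteArrayOutputStream, writes nothing, and looks at the first four bytes. StreamHeaderDemo and headerHex are hypothetical names. The explicit flush is needed because the header initially sits in the stream's internal buffer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class; only the two stream classes come from the JDK.
class StreamHeaderDemo {

    // Returns the first four bytes that ObjectOutputStream emits, as upper-case hex.
    static String headerHex() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(buffer);
            out.flush(); // push the buffered header into the byte array
            byte[] bytes = buffer.toByteArray();
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < 4; i++) {
                hex.append(String.format("%02X", bytes[i]));
            }
            out.close();
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(headerHex()); // prints ACED0005
    }
}
```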
ObjectOutputStream.java
public final void writeObject(Object obj) throws IOException {
……
try {
writeObject0(obj, false);
} catch (IOException ex) {
……
}
}
private void writeObject0(Object obj, boolean unshared) throws IOException {
……
depth++;
try {
……
Object orig = obj;
Class<?> cl = obj.getClass();
ObjectStreamClass desc;
Class repCl;
// Look up the ObjectStreamClass for this class; desc describes the class's fields and related information
desc = ObjectStreamClass.lookup(cl, true);
……
// Dispatch to a different write method depending on the object's type
if (obj instanceof Class) {
writeClass((Class) obj, unshared);
} else if (obj instanceof ObjectStreamClass) {
writeClassDesc((ObjectStreamClass) obj, unshared);
// END Android-changed: Make Class and ObjectStreamClass replaceable.
} else if (obj instanceof String) {
writeString((String) obj, unshared);
} else if (cl.isArray()) {
writeArray(obj, desc, unshared);
} else if (obj instanceof Enum) {
writeEnum((Enum<?>) obj, desc, unshared);
} else if (obj instanceof Serializable) {
// Objects of classes that implement Serializable end up here
writeOrdinaryObject(obj, desc, unshared);
} else {
if (extendedDebugInfo) {
throw new NotSerializableException(
cl.getName() + "\n" + debugInfoStack.toString());
} else {
throw new NotSerializableException(cl.getName());
}
}
} finally {
depth--;
bout.setBlockDataMode(oldMode);
}
}
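The key logic in writeObject0 is: it looks up the ObjectStreamClass descriptor for the object's class, then dispatches on the object's type. An object that implements Serializable (and is not a Class, ObjectStreamClass, String, array, or enum) goes to writeOrdinaryObject, and anything else raises NotSerializableException.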
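The final else branch can be triggered on purpose by writing an object whose class does not implement Serializable. The sketch below is a minimal illustration. NotSerializableDemo, Plain and trySerialize are hypothetical names. With the default settings (extendedDebugInfo off), the exception message is just the class name, as in the last throw statement above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class for the NotSerializableException branch.
class NotSerializableDemo {

    // Deliberately does not implement Serializable.
    static class Plain {
        int value = 1;
    }

    // Returns null if obj could be serialized, otherwise the exception message.
    static String trySerialize(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(trySerialize("text"));      // a String takes the writeString branch
        System.out.println(trySerialize(new Plain())); // falls through to the final else branch
    }
}
```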
final static byte TC_OBJECT = (byte)0x73;
private void writeOrdinaryObject(Object obj, ObjectStreamClass desc, boolean unshared) throws IOException {
……
try {
desc.checkSerialize();
// Write TC_OBJECT to mark a new object
bout.writeByte(TC_OBJECT);
// Write the class metadata
writeClassDesc(desc, false);
handles.assign(unshared ? null : obj);
// Check whether the class implements Externalizable and customizes the serialization of all fields
if (desc.isExternalizable() && !desc.isProxy()) {
writeExternalData((Externalizable) obj);
} else {
writeSerialData(obj, desc);
}
} finally {
if (extendedDebugInfo) {
debugInfoStack.pop();
}
}
}
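The key logic in writeOrdinaryObject is: (1) write TC_OBJECT to mark a new object; (2) call writeClassDesc to write the class metadata; (3) check whether the class implements Externalizable and call writeExternalData or writeSerialData accordingly.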
Start with step 2 of writeOrdinaryObject, the call to writeClassDesc, which ends up calling writeNonProxyDesc.
private void writeClassDesc(ObjectStreamClass desc, boolean unshared) throws IOException {
……
writeNonProxyDesc(desc, unshared);
}
final static byte TC_CLASSDESC = (byte)0x72;
private void writeNonProxyDesc(ObjectStreamClass desc, boolean unshared) throws IOException {
// Write TC_CLASSDESC to indicate that a new class descriptor follows
bout.writeByte(TC_CLASSDESC);
handles.assign(unshared ? null : desc);
if (protocol == PROTOCOL_VERSION_1) {
// do not invoke class descriptor write hook with old protocol
desc.writeNonProxy(this);
} else {
// By default writeClassDescriptor also delegates to desc.writeNonProxy
writeClassDescriptor(desc);
}
……
}
protected void writeClassDescriptor(ObjectStreamClass desc) throws IOException {
desc.writeNonProxy(this);
}
writeNonProxyDesc first writes 0x72 (TC_CLASSDESC), indicating that a new class descriptor follows, and then calls ObjectStreamClass's writeNonProxy method.
ObjectStreamClass.java
void writeNonProxy(ObjectOutputStream out) throws IOException {
// Write the class name
out.writeUTF(name);
// Write the serial version number; getSerialVersionUID returns the declared serialVersionUID or computes a default
out.writeLong(getSerialVersionUID());
byte flags = 0;
// Flag for classes that implement Externalizable
if (externalizable) {
flags |= ObjectStreamConstants.SC_EXTERNALIZABLE;
int protocol = out.getProtocolVersion();
if (protocol != ObjectStreamConstants.PROTOCOL_VERSION_1) {
flags |= ObjectStreamConstants.SC_BLOCK_DATA;
}
}
// Flag for classes that implement Serializable
else if (serializable) {
flags |= ObjectStreamConstants.SC_SERIALIZABLE;
}
// Set when a Serializable class defines its own writeObject() method to customize some fields
if (hasWriteObjectData) {
flags |= ObjectStreamConstants.SC_WRITE_METHOD;
}
if (isEnum) {
flags |= ObjectStreamConstants.SC_ENUM;
}
// Write the class flags
out.writeByte(flags);
// Write the number of fields
out.writeShort(fields.length);
// Write the field information
for (int i = 0; i < fields.length; i++) {
ObjectStreamField f = fields[i];
out.writeByte(f.getTypeCode());
out.writeUTF(f.getName());
if (!f.isPrimitive()) {
// For a non-primitive field (object or interface type), write its type string
out.writeTypeString(f.getTypeString());
}
}
}
public long getSerialVersionUID() {
if (suid == null) {
suid = AccessController.doPrivileged(
new PrivilegedAction<Long>() {
public Long run() {
return computeDefaultSUID(cl);
}
}
);
}
return suid.longValue();
}
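As the comments above note, writeNonProxy writes the class name, the value of serialVersionUID as the serial version number, the class flags, and the field information. If the class does not declare serialVersionUID, a default UID is computed from the class's information and used instead.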
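The layout described so far (stream header, TC_OBJECT, TC_CLASSDESC, class name, serialVersionUID, flags, field count) can be read back by hand with a plain DataInputStream. The following is a sketch for illustration only. ClassDescriptorDemo, Point and NoUid are hypothetical names, and the parsing stops right after the field count rather than decoding the whole stream. It also uses ObjectStreamClass.lookup to query the UID of a class that declares none.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.ObjectStreamConstants;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical demo; parses only the beginning of the serialized form.
class ClassDescriptorDemo {

    static class Point implements Serializable {
        private static final long serialVersionUID = 42L;
        int x = 1;
        int y = 2;
    }

    // No serialVersionUID declared, so the runtime computes a default one.
    static class NoUid implements Serializable {
        int v;
    }

    static String describe() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(new Point());
            }
            DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()));
            in.readShort(); // STREAM_MAGIC
            in.readShort(); // STREAM_VERSION
            if (in.readByte() != ObjectStreamConstants.TC_OBJECT
                    || in.readByte() != ObjectStreamConstants.TC_CLASSDESC) {
                throw new IllegalStateException("unexpected type codes");
            }
            String name = in.readUTF();      // class name
            long uid = in.readLong();        // serialVersionUID
            int flags = in.readByte();       // SC_SERIALIZABLE is 0x02
            int fieldCount = in.readShort(); // x and y
            return name + " uid=" + uid + " flags=" + flags + " fields=" + fieldCount;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The UID the runtime would write for a class, declared or computed.
    static long uidOf(Class<?> cl) {
        return ObjectStreamClass.lookup(cl).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(describe());
        System.out.println(uidOf(Point.class)); // the declared value, 42
        System.out.println(uidOf(NoUid.class)); // a computed default
    }
}
```

Because the computed default depends on details of the class, small changes to the class can change it, which is the usual argument for declaring serialVersionUID explicitly.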
Back to step 3 of writeOrdinaryObject: depending on whether the class implements Externalizable, either writeExternalData or writeSerialData is called next.
ObjectOutputStream.java
private void writeExternalData(Externalizable obj) throws IOException {
……
try {
curContext = null;
if (protocol == PROTOCOL_VERSION_1) {
obj.writeExternal(this);
} else {
bout.setBlockDataMode(true);
// Call the class's writeExternal method to perform the custom serialization
obj.writeExternal(this);
bout.setBlockDataMode(false);
// Write TC_ENDBLOCKDATA to mark the end of the custom data
bout.writeByte(TC_ENDBLOCKDATA);
}
} finally {
……
}
……
}
writeExternalData implements the "custom serialization of all fields" approach described above: it calls the writeExternal method that the Externalizable interface requires.
private void writeSerialData(Object obj, ObjectStreamClass desc) throws IOException {
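// Get the ClassDataSlot array describing the data layout of the object's class and its superclasses, ordered by inheritance: the topmost superclass first, the object's own class last.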
ObjectStreamClass.ClassDataSlot[] slots = desc.getClassDataLayout();
for (int i = 0; i < slots.length; i++) {
ObjectStreamClass slotDesc = slots[i].desc;
// Check whether the class declares a writeObject method
if (slotDesc.hasWriteObjectMethod()) {
// ……
try {
curContext = new SerialCallbackContext(obj, slotDesc);
bout.setBlockDataMode(true);
// Call the class's writeObject method, i.e. the "custom serialization of selected fields" approach above
slotDesc.invokeWriteObject(obj, this);
bout.setBlockDataMode(false);
// Write TC_ENDBLOCKDATA to mark the end of the custom data
bout.writeByte(TC_ENDBLOCKDATA);
} finally {
……
}
……
} else {
// Default automatic serialization
defaultWriteFields(obj, slotDesc);
}
}
}
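The key logic in writeSerialData is: it obtains the class data layout, ordered from the topmost serializable superclass down to the object's own class, and for each slot either invokes the class's writeObject method, if it declares one, or falls back to defaultWriteFields.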
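The superclass-first order of the slots can be observed with a two-level hierarchy in which each class has its own writeObject method. In this sketch, SlotOrderDemo, Base, Derived and the CALLS list are hypothetical names. Each hook records its class name before delegating to defaultWriteObject.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo of the order in which per-class writeObject hooks run.
class SlotOrderDemo {

    static final List<String> CALLS = new ArrayList<>();

    static class Base implements Serializable {
        private static final long serialVersionUID = 1L;
        int baseValue = 1;

        private void writeObject(ObjectOutputStream out) throws IOException {
            CALLS.add("Base");
            out.defaultWriteObject();
        }
    }

    static class Derived extends Base {
        private static final long serialVersionUID = 1L;
        int derivedValue = 2;

        private void writeObject(ObjectOutputStream out) throws IOException {
            CALLS.add("Derived");
            out.defaultWriteObject();
        }
    }

    // Serializes one Derived instance and returns the order of the hook calls.
    static List<String> writeOrder() {
        CALLS.clear();
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(new Derived());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new ArrayList<>(CALLS);
    }

    public static void main(String[] args) {
        System.out.println(writeOrder()); // prints [Base, Derived]
    }
}
```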
private void defaultWriteFields(Object obj, ObjectStreamClass desc) throws IOException {
Class<?> cl = desc.forClass();
……
desc.checkDefaultSerialize();
int primDataSize = desc.getPrimDataSize();
if (primVals == null || primVals.length < primDataSize) {
primVals = new byte[primDataSize];
}
desc.getPrimFieldValues(obj, primVals);
// Write the values of primitive fields
bout.write(primVals, 0, primDataSize, false);
ObjectStreamField[] fields = desc.getFields(false);
Object[] objVals = new Object[desc.getNumObjFields()];
int numPrimFields = fields.length - objVals.length;
desc.getObjFieldValues(obj, objVals);
// Write the values of non-primitive fields
for (int i = 0; i < objVals.length; i++) {
……
try {
// Recursive call to writeObject0
writeObject0(objVals[i], fields[numPrimFields + i].isUnshared());
} finally {
if (extendedDebugInfo) {
debugInfoStack.pop();
}
}
}
}
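The key logic in defaultWriteFields is: it first writes all primitive field values as a single block of bytes, then writes each object field by calling writeObject0 recursively.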
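One visible consequence of writing object fields through writeObject0, combined with the handles table that appears in the excerpts above, is that an object referenced from two fields is written once and comes back as a single shared instance. The sketch below checks this. SharedReferenceDemo and Pair are hypothetical names, and the explanation via back-references is my reading of the parts of writeObject0 that the excerpt elides.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: two fields that point at the same object.
class SharedReferenceDemo {

    static class Pair implements Serializable {
        private static final long serialVersionUID = 1L;
        int count = 2;       // primitive field, written in the byte block
        List<String> first;  // object fields, written by recursive calls
        List<String> second;
    }

    static Pair roundTrip(Pair pair) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(pair);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return (Pair) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if the copy's two fields still refer to one shared (new) list.
    static boolean sharedAfterRoundTrip() {
        ArrayList<String> shared = new ArrayList<>();
        shared.add("x");
        Pair pair = new Pair();
        pair.first = shared;
        pair.second = shared;
        Pair copy = roundTrip(pair);
        return copy.first == copy.second
                && copy.first != shared
                && copy.first.equals(shared);
    }

    public static void main(String[] args) {
        System.out.println(sharedAfterRoundTrip()); // prints true
    }
}
```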
Deserialization mirrors serialization almost step for step: create an ObjectInputStream, then call its readObject method.
ObjectInputStream.java
public ObjectInputStream(InputStream in) throws IOException {
verifySubclass();
// Create a BlockDataInputStream, bin, which serves as the underlying block-data input stream
bin = new BlockDataInputStream(in);
……
// Read the stream header and verify STREAM_MAGIC and STREAM_VERSION
readStreamHeader();
……
}
protected void readStreamHeader() throws IOException, StreamCorruptedException {
short s0 = bin.readShort();
short s1 = bin.readShort();
if (s0 != STREAM_MAGIC || s1 != STREAM_VERSION) {
throw new StreamCorruptedException(String.format("invalid stream header: %04X%04X", s0, s1));
}
}
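Since the header check runs inside the constructor, handing ObjectInputStream bytes that were not produced by ObjectOutputStream fails before readObject is ever called. A small sketch, with InvalidHeaderDemo and openStream as hypothetical names:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.StreamCorruptedException;
import java.io.UncheckedIOException;

// Hypothetical demo of the header check performed by the constructor.
class InvalidHeaderDemo {

    // Returns "ok" if the stream header is accepted, otherwise the exception message.
    static String openStream(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return "ok";
        } catch (StreamCorruptedException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Valid header: STREAM_MAGIC (0xACED) followed by STREAM_VERSION (5)
        System.out.println(openStream(new byte[] {(byte) 0xAC, (byte) 0xED, 0, 5}));
        // Arbitrary bytes: rejected by readStreamHeader
        System.out.println(openStream(new byte[] {0, 1, 2, 3}));
    }
}
```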
ObjectInputStream.java
public final Object readObject() throws IOException, ClassNotFoundException {
……
try {
Object obj = readObject0(false);
……
return obj;
} finally {
……
}
}
private Object readObject0(boolean unshared) throws IOException {
……
byte tc;
while ((tc = bin.peekByte()) == TC_RESET) {
bin.readByte();
handleReset();
}
……
try {
switch (tc) {
……
case TC_OBJECT:
return checkResolve(readOrdinaryObject(unshared));
……
}
} finally {
……
}
}
Recall that during serialization, once the object was found to implement Serializable, writeOrdinaryObject was called and wrote TC_OBJECT to mark a new object. The switch here therefore matches TC_OBJECT and calls readOrdinaryObject.
private Object readOrdinaryObject(boolean unshared) throws IOException {
……
// writeClassDesc wrote the class metadata during serialization; here it is read back
// readClassDesc also calls ObjectStreamClass's initNonProxy method, which is where serialVersionUID is checked for equality
ObjectStreamClass desc = readClassDesc(false);
desc.checkDeserialize();
Class<?> cl = desc.forClass();
……
// Create the object to return
Object obj;
try {
obj = desc.isInstantiable() ? desc.newInstance() : null;
} catch (Exception ex) {
……
}
……
// Check whether the class implements Externalizable and customizes the deserialization of all fields
if (desc.isExternalizable()) {
readExternalData((Externalizable) obj, desc);
} else {
readSerialData(obj, desc);
}
……
return obj;
}
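readOrdinaryObject mirrors ObjectOutputStream#writeOrdinaryObject. Its key logic is: read the class descriptor with readClassDesc (which is where serialVersionUID is verified), instantiate the object with desc.newInstance(), then fill in its data through readExternalData or readSerialData, depending on whether the class implements Externalizable.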
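A point worth knowing about desc.newInstance(): for a class that implements Serializable (as opposed to Externalizable), it does not run the class's own constructor. Only the no-argument constructor of the nearest non-serializable superclass runs, and the fields are then filled in from the stream. The sketch below counts constructor calls to make this visible. ConstructorDemo and Counter are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo: counts how often the Counter constructor runs.
class ConstructorDemo {

    static int constructorCalls = 0;

    static class Counter implements Serializable {
        private static final long serialVersionUID = 1L;
        int value;

        Counter(int value) {
            constructorCalls++;
            this.value = value;
        }
    }

    // Returns {constructor calls observed, value restored in the copy}.
    static int[] roundTripStats() {
        constructorCalls = 0;
        Counter original = new Counter(5); // the only explicit constructor call
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                Counter copy = (Counter) in.readObject();
                return new int[] {constructorCalls, copy.value};
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] stats = roundTripStats();
        System.out.println(stats[0] + " " + stats[1]); // prints 1 5
    }
}
```

Any validation done in the constructor is therefore skipped during deserialization; a readObject method is the usual place to repeat it.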
private void readExternalData(Externalizable obj, ObjectStreamClass desc) throws IOException {
……
try {
……
if (obj != null) {
try {
obj.readExternal(this);
} catch (ClassNotFoundException ex) {
……
}
}
……
} finally {
……
}
}
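readExternalData is the deserialization counterpart of the "custom serialization of all fields" approach above: it calls the readExternal method that the Externalizable interface requires.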
private void readSerialData(Object obj, ObjectStreamClass desc) throws IOException {
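// Get the ClassDataSlot array describing the data layout of the object's class and its superclasses, ordered by inheritance: the topmost superclass first, the object's own class last.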
ObjectStreamClass.ClassDataSlot[] slots = desc.getClassDataLayout();
for (int i = 0; i < slots.length; i++) {
ObjectStreamClass slotDesc = slots[i].desc;
if (slots[i].hasData) {
if (obj == null || handles.lookupException(passHandle) != null) {
defaultReadFields(null, slotDesc);
}
// Check whether the class declares a readObject method
else if (slotDesc.hasReadObjectMethod()) {
……
try {
curContext = new SerialCallbackContext(obj, slotDesc);
bin.setBlockDataMode(true);
// Call the class's readObject method, the read side of the "custom serialization of selected fields" approach above
slotDesc.invokeReadObject(obj, this);
} catch (ClassNotFoundException ex) {
handles.markException(passHandle, ex);
} finally {
……
}
defaultDataEnd = false;
} else {
// Default automatic deserialization
defaultReadFields(obj, slotDesc);
}
……
}
……
}
}
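The key logic in readSerialData is: it walks the same class data layout, from the topmost serializable superclass down to the object's own class, and for each slot that has data either invokes the class's readObject method, if it declares one, or falls back to defaultReadFields.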
private void defaultReadFields(Object obj, ObjectStreamClass desc) throws IOException {
Class<?> cl = desc.forClass();
……
int primDataSize = desc.getPrimDataSize();
if (primVals == null || primVals.length < primDataSize) {
primVals = new byte[primDataSize];
}
// Read the values of primitive fields
bin.readFully(primVals, 0, primDataSize, false);
if (obj != null) {
desc.setPrimFieldValues(obj, primVals);
}
int objHandle = passHandle;
ObjectStreamField[] fields = desc.getFields(false);
Object[] objVals = new Object[desc.getNumObjFields()];
int numPrimFields = fields.length - objVals.length;
// Read the values of non-primitive fields
for (int i = 0; i < objVals.length; i++) {
ObjectStreamField f = fields[numPrimFields + i];
// Recursive call to readObject0
objVals[i] = readObject0(f.isUnshared());
if (f.getField() != null) {
handles.markDependency(objHandle, passHandle);
}
}
if (obj != null) {
desc.setObjFieldValues(obj, objVals);
}
passHandle = objHandle;
}
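The key logic in defaultReadFields is: it first reads the block of primitive field values and assigns them to the object, then reads each object field by calling readObject0 recursively and assigns the results.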