序列化

序列化

学习资料

  • Java对象的序列化与反序列化-HollisChuang’s Blog
  • 《成神之路-基础篇》Java基础知识——序列化(已完结)-HollisChuang’s Blog
  • Java基础学习总结——Java对象的序列化和反序列化 - 孤傲苍狼 - 博客园
  • 深度解析JAVA序列化 -
  • Java对象序列化底层原理源码解析 -
  • 深入学习Java序列化 - 天凉好个秋

学习目标

  • 序列化的基本知识
    • 什么时序列化和反序列化
    • 序列化怎么使用
    • 序列化设计那些类;
    • 序列化的实现原理;
  • 序列化的优点和缺点
    • 序列化的用处;
    • 序列化和单例的关系
  • 学习步骤
    • [x] 序列化的使用
    • [x] 序列化的原理
    • [x] 序列化后文件是什么
    • [x] transient关键字是怎么起作用的
    • [x] userId的作用
    • [ ] 使用序列化需要注意的地方

序列化基础知识

  • 序列化和反序列化的定义
    序列化 (Serialization)是将对象的状态信息转换为可以存储或传输的形式的过程。一般将一个对象存储至一个储存媒介,例如档案或是记亿体缓冲等。在网络传输过程中,可以是字节或是XML等格式。而字节的或XML编码格式可以还原完全相等的对象。这个相反的过程又称为反序列化。
  • 序列化涉及的API

java.io.Serializable
java.io.Externalizable
ObjectOutput
ObjectInput
ObjectOutputStream
ObjectInputStream

序列化的使用

父类Person

import java.io.Serializable;

public class Person implements Serializable {
    private int age;
    private String name;

    public void setAge(int age){
        this.age = age;
    }

    public void setName(String name){
        this.name = name;
    }

    public int getAge(){
        return this.age;
    }

    public String getName(){
        return this.name;
    }
}

子类User

import java.io.Serializable;

public class User extends Person  {
    private int level;
    private String company;
    private transient int weight;

    public void setLevel(int level){
        this.level = level;
    }

    public int getLevel()
    {
        return this.level;
    }

    public void setCompany(String company){
        this.company = company;
    }

    public String getCompany()
    {
        return this.company;
    }

    public void setWeight(int weight){
        this.weight = weight;
    }

    public int getWeight(){
        return this.weight;
    }

    public String toString()
    {
        return "User is " + super.getAge() + " old and name is " + super.getName() +
                ", level is " + level + " in " + company + " and the weight is " + this.weight;
    }
}

序列化和反序列

import org.apache.commons.io.IOUtils;

import java.io.ObjectOutputStream;

import java.io.*;

public class SerializationTest {
    public static void main(String[] args)
    {
        User ft = new User();
        ft.setAge(25);
        ft.setName("uranus");
        ft.setLevel(3);
        ft.setCompany("Huawei");
        System.out.println(ft.toString());

        try(ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream("testSerialization"))){
            oos.writeObject(ft);
        }
        catch(IOException o)
        {
            o.printStackTrace();
        }

        // Read Object from file
        File f = new File("testSerialization");
        try(ObjectInputStream ois = new ObjectInputStream(new FileInputStream(f)))
        {
            User ft1 = (User) ois.readObject();
            System.out.println(ft1.toString());
        }
        catch(IOException o)
        {
            o.printStackTrace();
        }
        catch(ClassNotFoundException cfe)
        {
            cfe.printStackTrace();
        }
    }
}
  1. 如果一个java类实现了Serializable接口,则这个java类的对象可以序列化
  2. 通过ObjectOutputStream和ObjectInputStream对对象进行序列化和反序列化;
  3. 父类和子类在序列化和反序列化的关系
    1. 如果父类实现了接口Serializable,则子类不需要实现接口Serializable,子类对象也可以序列化;
    2. 如果父类没有实现接口Serializable,则子类需要自己实现接口Serializable,而且父类中的属性无法序列化。
  4. 序列化不保存静态变量;
  5. 使用关键字transient可以使变量不被序列化;
  6. 如果类A的变量是类B的引用,并且类B没有实现Serializable,则类A序列化时会报错。

序列化后的文件

aced -- STREAM_MAGIC. 声明使用了序列化协议.
0005 -- STREAM_VERSION. 序列化协议版本.
73 -- TC_OBJECT. 声明这是一个新的对象. 
72 -- 0x72: TC_CLASSDESC. 声明这里开始一个新Class。
0004 -- Class名字的长度
5573 6572 --  User, 类的名字(55-U, 73-s, 65-e, 72-r)
8697 a06f b50f eca3 -- SerialVersionUID, 序列化ID,如果没有指定,则会由算法随机生成一个8byte的ID.
02 -- 标记号. 该值声明该对象支持序列化
00 03 -- 该类所包含的域个数
49 -- 域类型. 49 代表”I”,也就是int 
0005 -- 域名字的长度;
6c65 7665 6c -- level, 字段的名字 
49 -- 字段类型. 49 代表”I”,也就是int 
0006 
7765 6967 6874 -- weight
4c -- L,表示Class或者Interface
00 07 -- 字段名长度
63 6f6d 7061 6e79 -- company
74 -- 标志位:TC_STRING,表示后面的数据是个字符串
00 12 
4c 6a61 7661 2f6c 616e 672f 5374 7269 6e67 3b -- Ljava/lang/String;
78 -- 标志位:TC_ENDBLOCKDATA,对象的数据块描述的结束
72 -- 0x72: TC_CLASSDESC. 声明这里开始一个新Class。
00 06
50 6572 736f 6e -- Person,父类的名字
b7 6bf8 886c 83fe fc -- SerialVersionUID, 序列化ID,如果没有指定,则会由算法随机生成一个8byte的ID.
02 -- 标记号. 该值声明该对象支持序列化
0002 -- 字段个数 
49 -- 字段类型. 49 代表”I”,也就是int 
00 03
61 6765 
4c -- L,表示Class或者Interface
00 04
6e 616d 65 -- name
71 ??
007e 0001 ??
78 -- 标志位:TC_ENDBLOCKDATA,对象的数据块描述的结束
70 -- 标志位:TC_NULL,Null object reference.
0000 0019 -- 父类字段Person.age = 25
74 ---- 标志位:TC_STRING,表示后面的数据是个字符串
00 06
75 7261 6e75 73 -- uranus
00 0000 03 -- User.level = 3
00 0000 64 -- User.Weight = 100
74 -- 标志位:TC_STRING,表示后面的数据是个字符串
0006 
4875 6177 6569 -- Huawei

序列化和反序列化的实现原理

从上面的代码可以看出,序列化和反序列化是使用ObjectOutputStream.writeObjectObjectInputStream.readObject完成的,所以需要更加这两个方法来探究序列化的实现原理。

序列化的原理

  • 首先查看ObjectOutputStream.writeObject方法
*/***
* * Write the specified object to the ObjectOutputStream.  The class of the*
* * object, the signature of the class, and the values of the non-transient*
* * and non-static fields of the class and all of its supertypes are*
* * written.  Default serialization for a class can be overridden using the*
* * writeObject and the readObject methods.  Objects referenced by this*
* * object are written transitively so that a complete equivalent graph of*
* * objects can be reconstructed by an ObjectInputStream.*
* **
* * 

Exceptions are thrown for problems with the OutputStream and for* * * classes that should not be serialized. All exceptions are fatal to the* * * OutputStream, which is left in an indeterminate state, and it is up to* * * the caller to ignore or recover the stream state.* * ** * ****@throws***InvalidClassException Something is wrong with a class used by* * * serialization.* * ****@throws***NotSerializableException Some object to be serialized does not* * * implement the java.io.Serializable interface.* * ****@throws***IOException Any exception thrown by the underlying* * * OutputStream.* * */* public final void writeObject(Object obj) throws IOException { if (enableOverride) { writeObjectOverride(obj); return; } try { writeObject0(obj, false); } catch (IOException ex) { if (depth == 0) { writeFatalException(ex); } throw ex; } }

- `writeObjectOverride`:在OjectOutputStream中是一个空方法,在自定义序列化类的时候,需要继承ObjectOutputStream,并且调用ObjectOutputStream的无参构造函数,重写此方法,从而自定义序列化类可以实现自己的序列化逻辑。

重点是方法writeObject0

private void writeObject0(Object obj, boolean unshared)
    throws IOException
{
   ...

        // remaining cases
        if (obj instanceof String) {
            writeString((String) obj, unshared);
        } else if (cl.isArray()) {
            writeArray(obj, desc, unshared);
        } else if (obj instanceof Enum) {
            writeEnum((Enum) obj, desc, unshared);
        } else if (obj instanceof Serializable) {
            writeOrdinaryObject(obj, desc, unshared);
        } else {
            if (*extendedDebugInfo*) {
                throw new NotSerializableException(
                    cl.getName() + "\n" + debugInfoStack.toString());
            } else {
                throw new NotSerializableException(cl.getName());
            }
        }
   ...
}
- 从方法`writeObject0`中可以看出,`String, Array, Enmu`类型和实现了`Serializable`接口的类才可以序列化。

查看writeOrdinaryObject方法

*/***
* * Writes representation of a "ordinary" (i.e., not a String, Class,*
* * ObjectStreamClass, array, or enum constant) serializable object to the*
* * stream.*
* */*
private void writeOrdinaryObject(Object obj,
                                 ObjectStreamClass desc,
                                 boolean unshared)
    throws IOException
{
    if (*extendedDebugInfo*) {
        debugInfoStack.push(
            (depth == 1 ? “root “ : “”) + “object (class \”” +
            obj.getClass().getName() + “\”, “ + obj.toString() + “)”);
    }
    try {
        desc.checkSerialize();

        bout.writeByte(*TC_OBJECT*);
        writeClassDesc(desc, false);
        handles.assign(unshared ? null : obj);
        if (desc.isExternalizable() && !desc.isProxy()) {
            writeExternalData((Externalizable) obj);
        } else {
            writeSerialData(obj, desc);
        }
    } finally {
        if (*extendedDebugInfo*) {
            debugInfoStack.pop();
        }
    }
}

写数据的方法是writeSerialData,作用是从可序列化的父类到子类写实例数据。
​ - 取出此类的继承层次,放在数组slots中,父类放在子类的前边;
​ - 遍历数组slots,判断每个类是否有实现writeObject;
​ - 如果实现了writeObject,通过slotDesc.invokeWriteObject(obj, this);调用writeObject进行序列化;
​ - 如果没有实现writeObject,则调用defaultWriteFields(obj, slotDesc)进行序列化。

*/***
* * Writes instance data for each serializable class of given object, from*
* * superclass to subclass.*
* */*
private void writeSerialData(Object obj, ObjectStreamClass desc)
    throws IOException
{
     // 取出此类的继承层次,父类放在子类的前边
   ObjectStreamClass.ClassDataSlot[] slots = desc.getClassDataLayout();
    for (int i = 0; i < slots.length; i++) {
        ObjectStreamClass slotDesc = slots[I].desc;
            // 判断类是否实现了自己的writeObject方法
        if (slotDesc.hasWriteObjectMethod()) {
            PutFieldImpl oldPut = curPut;
            curPut = null;
            SerialCallbackContext oldContext = curContext;

            if (*extendedDebugInfo*) {
                debugInfoStack.push(
                    “custom writeObject data (class \”” +
                    slotDesc.getName() + “\”)”);
            }
            try {
                curContext = new SerialCallbackContext(obj, slotDesc);
                bout.setBlockDataMode(true);
                slotDesc.invokeWriteObject(obj, this);
                bout.setBlockDataMode(false);
                bout.writeByte(*TC_ENDBLOCKDATA*);
            } finally {
                curContext.setUsed();
                curContext = oldContext;
                if (*extendedDebugInfo*) {
                    debugInfoStack.pop();
                }
            }

            curPut = oldPut;
        } else {
            defaultWriteFields(obj, slotDesc);
        }
    }
}

查看defaultWriteFields方法

*/***
* * Fetches and writes values of serializable fields of given object to*
* * stream.  The given class descriptor specifies which field values to*
* * write, and in which order they should be written.*
* */*
private void defaultWriteFields(Object obj, ObjectStreamClass desc)
    throws IOException
{
    Class cl = desc.forClass();
    if (cl != null && obj != null && !cl.isInstance(obj)) {
        throw new ClassCastException();
    }

    desc.checkDefaultSerialize();

    int primDataSize = desc.getPrimDataSize();
    if (primVals == null || primVals.length < primDataSize) {
        primVals = new byte[primDataSize];
    }
    desc.getPrimFieldValues(obj, primVals);
    bout.write(primVals, 0, primDataSize, false);

    ObjectStreamField[] fields = desc.getFields(false);
    Object[] objVals = new Object[desc.getNumObjFields()];
    int numPrimFields = fields.length - objVals.length;
    desc.getObjFieldValues(obj, objVals);
    for (int I = 0; I < objVals.length; I++) {
        if (*extendedDebugInfo*) {
            debugInfoStack.push(
                “field (class \”” + desc.getName() + “\”, name: \”” +
                fields[numPrimFields + I].getName() + “\”, type: \”” +
                fields[numPrimFields + I].getType() + “\”)”);
        }
        try {
            writeObject0(objVals[I],
                         fields[numPrimFields + I].isUnshared());
        } finally {
            if (*extendedDebugInfo*) {
                debugInfoStack.pop();
            }
        }
    }
}

类中的变量分为基本数据类型和引用类型
​ - 基本数据类型:bout.write(primVals, 0, primDataSize, false);写入基本数据类型。
​ - 引用类型:重复调用writeObject0,不断序列化引用的对象,因此如果引用的对象不是String, Array, Enmu类型和实现了Serializable接口的类,运行时会报错。
从上可以看出,对于每一个对象,先将对象中的基本数据类型序列化,然后循环的序列化所有的引用变量。

反序列化的原理

反序列化的过程和序列化类似,调用链如下

  • ObjectInputStream.readObject—> ObjectInputStream.readObject0—> ObjectInputStream.readOrdinaryObject—>ObjectInputStream.readSerialDate—>ObjectInputStream.defaultReadFields.

    • 在方法readSerialDate中从最上层的父类开始循环反序列化每个类
      • 首先通过slotDesc.hasReadObjectMethod()判断此类是否实现了readObject方法,如果实现了。。。。
      • 如果没有实现,调用defaultReadFields方法。

查看defaultReadFields方法,和序列化时相似,分别获取基本数据类型和引用类型

  • 基本数据类型:通过bin.readFully(primVals, 0, primDataSize, false);desc.setPrimFieldValues(obj, primVals);获取;

    • 引用类型:重复调用readObject0方法,objVals[I] = readObject0(f.isUnshared());
*/***
* * Reads in values of serializable fields declared by given class*
* * descriptor.  If obj is non-null, sets field values in obj.  Expects that*
* * passHandle is set to obj’s handle before this method is called.*
* */*
private void defaultReadFields(Object obj, ObjectStreamClass desc)
    throws IOException
{
    Class cl = desc.forClass();
    if (cl != null && obj != null && !cl.isInstance(obj)) {
        throw new ClassCastException();
    }

    int primDataSize = desc.getPrimDataSize();
    if (primVals == null || primVals.length < primDataSize) {
        primVals = new byte[primDataSize];
    }
        bin.readFully(primVals, 0, primDataSize, false);
    if (obj != null) {
        desc.setPrimFieldValues(obj, primVals);
    }

    int objHandle = passHandle;
    ObjectStreamField[] fields = desc.getFields(false);
    Object[] objVals = new Object[desc.getNumObjFields()];
    int numPrimFields = fields.length - objVals.length;
    for (int I = 0; I < objVals.length; I++) {
        ObjectStreamField f = fields[numPrimFields + I];
        objVals[I] = readObject0(f.isUnshared());
        if (f.getField() != null) {
            handles.markDependency(objHandle, passHandle);
        }
    }
    if (obj != null) {
        desc.setObjFieldValues(obj, objVals);
    }
    passHandle = objHandle;
}

static和transient字段不能被序列化的原因

ObjectOutputStream.writeObject0—>ObjectStreamClass.lookup—>ObjectStreamClass(final Class cl)—> ObjectStreamClass.getSerialFields—> ObjectStreamClass.getDefaultSerialField
在方法getDefaultSerialField中,将static和transient的字段去除。

*/***
* * Returns array of ObjectStreamFields corresponding to all non-static*
* * non-transient fields declared by given class.  Each ObjectStreamField*
* * contains a Field object for the field it represents.  If no default*
* * serializable fields exist, NO_FIELDS is returned.*
* */*
private static ObjectStreamField[] getDefaultSerialFields(Class cl) {
    Field[] clFields = cl.getDeclaredFields();
    ArrayList list = new ArrayList<>();
    int mask = Modifier.*STATIC*| Modifier.*TRANSIENT*;

    for (int I = 0; I < clFields.length; I++) {
        if ((clFields[I].getModifiers() & mask) == 0) {
            list.add(new ObjectStreamField(clFields[I], false, true));
        }
    }
    int size = list.size();
    return (size == 0) ? *NO_FIELDS*:
        list.toArray(new ObjectStreamField[size]);
}

自定义writeObject和readObject

在类中可以自定义writeObject和readObject方法,然后序列化和反序列化时会调用自定义的方法。

  • 序列化实现自定义writeObject

    • 在方法writeSerialData()中,如果判断类实现了自己的writeObject,会调用slotDesc.invokeWriteObject(obj, this);, 进入invokeWriteObject方法,有writeObjectMethod.invoke(obj, new Object[]{ out });,通过反射的方法获取自定义的writeObject。

    • writeObjectMethod是怎么获得的?
      在方法writeObject0中获取desc时,需要调用ObjectStreamClass的构造方法,在构造方法中有一段代码。
      调用getPrivateMethod获取类中的writeObject和readObject方法。

      cons = *getSerializableConstructor*(cl);
      writeObjectMethod = *getPrivateMethod*(cl, “writeObject”,
          new Class[] { ObjectOutputStream.class },
          Void.*TYPE*);
      readObjectMethod = *getPrivateMethod*(cl, “readObject”,
          new Class[] { ObjectInputStream.class },
          Void.*TYPE*);
      readObjectNoDataMethod = *getPrivateMethod*(
          cl, “readObjectNoData”, null, Void.*TYPE*);
      hasWriteObjectData = (writeObjectMethod != null);
      
  • 反序列化实现自定义readObject

    反序列化调用自定义readObject与序列化比较类似;

    • readSerialData中调用slotDesc.invokeReadObject(obj, this);,同样有readObjectMethod.invoke(obj, new Object[]{ in });通过反射获取类中自定义的readObject方法在invokeReadObeject中;

    • readObjectMethod的获取

      ObjectInputStrem.readOrdinaryObject-->ObjectInputStrem.readClassDesc-->ObjectInputStrem.readNonProxyDesc-->ObjectStreamClass.initNonProxy-->ObjectStreamClass.lookup

      lookup中调用了ObjectStreamClass的构造方法,和序列化部分类似,最终得到了readObjectMethod

SerialVersionId

虚拟机是否允许反序列化,不仅取决于类路径和功能代码是否一致,一个非常重要的一点是两个类的序列化 ID 是否一致(就是 private static final long serialVersionUID

序列化与单例模式

  • 单例与序列化的那些事儿

ObjectInputStream.readOrdinaryObject中,需要获取对象实例,对应代码为

Object obj;
try {
    obj = desc.isInstantiable() ? desc.newInstance() : null;
} catch (Exception ex) {
    throw (IOException) new InvalidClassException(
        desc.forClass().getName(),
        "unable to create instance").initCause(ex);
}

从代码中可以看出,desc.newInstance()通过反射调用无参构造函数创建了一个新的对象。

解决方法,在类中加入readResolve方法。

import java.io.Serializable;

public class Singleton implements Serializable {
    private volatile static Singleton singleton;
    private Singleton() {};


    public static Singleton getSingleton()
    {
        if (singleton == null)
        {
            synchronized(Singleton.class)
            {
                singleton = new Singleton();
            }
        }
        return singleton;
    }

    private Object readResolve()
    {
        return singleton;
    }
}

同样在方法readOrdinaryObject中,会判断类是否实现了readResolve方法,如果实现了,则会执行此方法,因此在此方法中指定返回对象的生成策略。

if (obj != null &&
    handles.lookupException(passHandle) == null &&
    desc.hasReadResolveMethod())
{
    Object rep = desc.invokeReadResolve(obj);
    if (unshared && rep.getClass().isArray()) {
        rep = cloneArray(rep);
    }
    if (rep != obj) {
        // Filter the replacement object
        if (rep != null) {
            if (rep.getClass().isArray()) {
                filterCheck(rep.getClass(), Array.getLength(rep));
            } else {
                filterCheck(rep.getClass(), -1);
            }
        }
        handles.setObject(passHandle, obj = rep);
    }
}

你可能感兴趣的:(序列化)