FlatBuffers Deserialization

FlatBuffers Introduction
FlatBuffers Schema Parsing
FlatBuffers Serialization
FlatBuffers Deserialization

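The previous article walked through FlatBuffers serialization in detail; this one covers the reverse process, deserialization.
The code is still SampleBinary.java, with only the output at the end slightly modified.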

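import MyGame.Sample.Color;
import MyGame.Sample.Equipment;
import MyGame.Sample.Monster;
import MyGame.Sample.Vec3;
import MyGame.Sample.Weapon;

import com.google.flatbuffers.FlatBufferBuilder;

import java.nio.ByteBuffer;

class SampleBinary {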
  // Example how to use FlatBuffers to create and read binary buffers.
  public static void main(String[] args) {
    FlatBufferBuilder builder = new FlatBufferBuilder(0);

    // Create some weapons for our Monster ('Sword' and 'Axe').
    int weaponOneName = builder.createString("Sword");
    short weaponOneDamage = 3;
    int weaponTwoName = builder.createString("Axe");
    short weaponTwoDamage = 5;

    // Use the `createWeapon()` helper function to create the weapons, since we set every field.
    int[] weaps = new int[2];
    weaps[0] = Weapon.createWeapon(builder, weaponOneName, weaponOneDamage);
    weaps[1] = Weapon.createWeapon(builder, weaponTwoName, weaponTwoDamage);

    // Serialize the FlatBuffer data.
    int name = builder.createString("Orc");
    byte[] treasure = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    int inv = Monster.createInventoryVector(builder, treasure);
    int weapons = Monster.createWeaponsVector(builder, weaps);
    int pos = Vec3.createVec3(builder, 1.0f, 2.0f, 3.0f);

    Monster.startMonster(builder);
    Monster.addPos(builder, pos);
    Monster.addName(builder, name);
    Monster.addColor(builder, Color.Red);
    Monster.addHp(builder, (short)500);
    Monster.addInventory(builder, inv);
    Monster.addWeapons(builder, weapons);
    Monster.addEquippedType(builder, Equipment.Weapon);
    Monster.addEquipped(builder, weaps[1]);
    int orc = Monster.endMonster(builder);

    builder.finish(orc); // You could also call `Monster.finishMonsterBuffer(builder, orc);`.

    // We now have a FlatBuffer that can be stored on disk or sent over a network.

    // ...Code to store to disk or send over a network goes here...

    // Instead, we are going to access it right away, as if we just received it.

    ByteBuffer buf = builder.dataBuffer();

    // Get access to the root:
    Monster monster = Monster.getRootAsMonster(buf);
    System.out.println(monster.hp());
    System.out.println(monster.mana());
    System.out.println(monster.name());
    System.out.println(monster.pos());
    System.out.println(monster.inventory(1));
    System.out.println(monster.weapons(0).name());
    System.out.println(monster.weapons(0).damage());
  }
}

Memory layout. "start" is the absolute index in the ByteBuffer (whose position() is 8), "from end" is the distance from the end of the 200-byte buffer, and bytes are shown as signed Java byte values. The path region is a vector of two Vec3 structs that the listing above does not populate.

start  from end  region          bytes
8      192       root_table      32 0 0 0 0 0
14     186       Monster vtable  26 0 44 0 32 0 0 0 24 0 28 0 0 0 20 0 27 0 16 0 15 0 8 0 4 0
40     160       Monster         26 0 0 0 40 0 0 0 100 0 0 0 0 0 0 1 56 0 0 0 64 0 0 0 -12 1 0 0 72 0 0 0 0 0 -128 63 0 0 0 64 0 0 64 64
84     116       path            2 0 0 0 0 0 -128 64 0 0 -96 64 0 0 -64 64 0 0 -128 63 0 0 0 64 0 0 64 64
112    88        weapons         2 0 0 0 52 0 0 0 28 0 0 0
124    76        inv             10 0 0 0 0 1 2 3 4 5 6 7 8 9 0 0
140    60        name            3 0 0 0 79 114 99 0
148    52        axe             -12 -1 -1 -1 0 0 5 0 24 0 0 0
160    40        Weapon vtable   8 0 12 0 8 0 6 0
168    32        sword           8 0 0 0 0 0 3 0 12 0 0 0
180    20        weaponTwoName   3 0 0 0 65 120 101 0
188    12        weaponOneName   5 0 0 0 83 119 111 114 100 0 0 0

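The code is simple: obtain a Monster instance, then call its accessors. Every accessor follows the same logic: use the field's vtable_offset to look up the field's offset in the vtable, then read at that offset inside the object's memory. For a non-reference type the value is read directly; for a reference type (string/vector/table) the value stored there is a relative offset to the referenced data, which has to be resolved before the data itself can be read.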

1. getRootAsMonster: locating the root type

Monster monster = Monster.getRootAsMonster(buf);
  |
  public static Monster getRootAsMonster(ByteBuffer _bb) { return getRootAsMonster(_bb, new Monster()); }
    |
    public static Monster getRootAsMonster(ByteBuffer _bb, Monster obj) { _bb.order(ByteOrder.LITTLE_ENDIAN); return (obj.__assign(_bb.getInt(_bb.position()) + _bb.position(), _bb)); }
      |
      public Monster __assign(int _i, ByteBuffer _bb) { __init(_i, _bb); return this; }
        |
        public void __init(int _i, ByteBuffer _bb) { bb_pos = _i; bb = _bb; }

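getRootAsMonster only does bookkeeping: (1) it records where the root table starts (the int stored at _bb.position() is the offset from that position to the start of the root table, so adding the two gives the table's absolute position, here 8 + 32 = 40); (2) it records which ByteBuffer to read from.
Serialization writes data from the high end of the ByteBuffer toward the low end; deserialization goes the other way and reads from low to high.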
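
The root lookup can be reproduced with nothing but java.nio. The following is a minimal sketch, not the generated code: the class name RootLookupSketch and the helper sampleBuffer are made up for illustration, and the 200-byte capacity and position 8 are assumptions read off the memory layout above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class RootLookupSketch {
    /** Same arithmetic as getRootAsMonster: position + the int stored at position. */
    static int rootTablePosition(ByteBuffer bb) {
        bb.order(ByteOrder.LITTLE_ENDIAN);
        return bb.getInt(bb.position()) + bb.position();
    }

    /** Rebuilds only the first four bytes of the dump (assumed layout: data starts at 8). */
    static ByteBuffer sampleBuffer() {
        ByteBuffer bb = ByteBuffer.allocate(200).order(ByteOrder.LITTLE_ENDIAN);
        bb.putInt(8, 32); // root offset, relative to position 8
        bb.position(8);
        return bb;
    }

    public static void main(String[] args) {
        // 8 + 32: the value that ends up in bb_pos
        System.out.println(rootTablePosition(sampleBuffer()));
    }
}
```

The result, 40, is the bb_pos that every later accessor adds to the offsets it gets from the vtable.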

2. Getting the offset from the vtable, using the __offset method of the Table class

  /**
   * Look up a field in the vtable.
   *
   * @param vtable_offset An `int` offset to the vtable in the Table's ByteBuffer.
   * @return Returns an offset into the object, or `0` if the field is not present.
   */
  protected int __offset(int vtable_offset) {
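    // Locate the start of the vtable.
    int vtable = bb_pos - bb.getInt(bb_pos);
    // bb.getShort(vtable) is the size of the vtable; bb.getShort(vtable + vtable_offset) is the entry for the field at vtable_offset. A result of 0 means the field is absent and its default value is used.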
    return vtable_offset < bb.getShort(vtable) ? bb.getShort(vtable + vtable_offset) : 0;
  }
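
To see the lookup with the numbers from the dump, here is a small sketch that rebuilds only the Monster vtable (at 14) and the first int of the Monster table (at 40) and then runs the same arithmetic as __offset. VtableLookupSketch, sampleBuffer and offset are illustrative names, and the absolute positions assume the layout shown earlier.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class VtableLookupSketch {
    static final int BB_POS = 40; // start of the Monster table in the dump

    /** Holds just the vtable and the table's back-pointer to it. */
    static ByteBuffer sampleBuffer() {
        ByteBuffer bb = ByteBuffer.allocate(200).order(ByteOrder.LITTLE_ENDIAN);
        // vtable size, table size, then one 16-bit entry per field
        short[] vtable = {26, 44, 32, 0, 24, 28, 0, 20, 27, 16, 15, 8, 4};
        for (int i = 0; i < vtable.length; i++) {
            bb.putShort(14 + 2 * i, vtable[i]);
        }
        bb.putInt(BB_POS, 26); // distance from the table back to its vtable
        return bb;
    }

    /** Same logic as Table.__offset, with bb and bb_pos passed explicitly. */
    static int offset(ByteBuffer bb, int bbPos, int vtableOffset) {
        int vtable = bbPos - bb.getInt(bbPos);
        return vtableOffset < bb.getShort(vtable) ? bb.getShort(vtable + vtableOffset) : 0;
    }

    public static void main(String[] args) {
        ByteBuffer bb = sampleBuffer();
        for (int vtableOffset = 4; vtableOffset <= 26; vtableOffset += 2) {
            System.out.println(vtableOffset + " -> " + offset(bb, BB_POS, vtableOffset));
        }
    }
}
```

A vtable_offset of 26 lies past the end of this 26-byte vtable, so the size check returns 0. That is how a reader generated from a newer schema falls back to defaults for fields an older writer never knew about.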

3. Reading primitive types

There are two cases:

  • Reading a non-default value
monster.hp()
  |
  public short hp() { int o = __offset(8); return o != 0 ? bb.getShort(o + bb_pos) : 100; }

Here __offset(8) returns 24 and o + bb_pos is 64, so the value is read at offset 64. bb.getShort(o + bb_pos) reads the bytes -12 1 (0xF4 0x01, little-endian), i.e. 500.

  • Reading a default value
monster.mana()
  |
  public short mana() { int o = __offset(6); return o != 0 ? bb.getShort(o + bb_pos) : 150; }

__offset(6) returns 0, so the default value 150 is used.
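
Both cases, and the step from the byte pair -12 1 to 500, can be checked with a short sketch. ScalarReadSketch and its methods are illustrative names; the buffer holds only the two hp bytes, placed at 64 as in the layout above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class ScalarReadSketch {
    /** Manual little-endian decode: low byte first, then the high byte shifted by 8. */
    static short decodeLittleEndian(byte low, byte high) {
        return (short) ((low & 0xFF) | ((high & 0xFF) << 8));
    }

    /** Same shape as the generated hp()/mana(): o == 0 means "field absent, use the default". */
    static short readShortOrDefault(ByteBuffer bb, int bbPos, int o, short defaultValue) {
        return o != 0 ? bb.getShort(o + bbPos) : defaultValue;
    }

    /** Only the two hp bytes from the dump. */
    static ByteBuffer sampleBuffer() {
        ByteBuffer bb = ByteBuffer.allocate(200).order(ByteOrder.LITTLE_ENDIAN);
        bb.put(64, (byte) -12);
        bb.put(65, (byte) 1);
        return bb;
    }

    public static void main(String[] args) {
        ByteBuffer bb = sampleBuffer();
        System.out.println(decodeLittleEndian((byte) -12, (byte) 1));   // 244 + 256
        System.out.println(readShortOrDefault(bb, 40, 24, (short) 100)); // hp: stored value
        System.out.println(readShortOrDefault(bb, 40, 0, (short) 150));  // mana: default
    }
}
```

A field equal to its default costs no bytes in the table, only a zero entry in the vtable, which is why mana never appears in the dump.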

4. Reading a string

monster.name()
  |
  public String name() { int o = __offset(10); return o != 0 ? __string(o + bb_pos) : null; }
    |
  /**
   * Create a Java `String` from UTF-8 data stored inside the FlatBuffer.
   *
   * This allocates a new string and converts to wide chars upon each access,
   * which is not very efficient. Instead, each FlatBuffer string also comes with an
   * accessor based on __vector_as_bytebuffer below, which is much more efficient,
   * assuming your Java program can handle UTF-8 data directly.
   *
   * @param offset An `int` index into the Table's ByteBuffer.
   * @return Returns a `String` from the data stored inside the FlatBuffer at `offset`.
   */
  protected String __string(int offset) {
    CharsetDecoder decoder = UTF8_DECODER.get();
    decoder.reset();
    
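    // bb.getInt(offset) is an offset relative to the current position; adding the current position gives the absolute offset.
    offset += bb.getInt(offset);
    ByteBuffer src = bb.duplicate().order(ByteOrder.LITTLE_ENDIAN);
    // The first field of a string stores the string's length.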
    int length = src.getInt(offset);
    src.position(offset + SIZEOF_INT);
    src.limit(offset + SIZEOF_INT + length);

    int required = (int)((float)length * decoder.maxCharsPerByte());
    CharBuffer dst = CHAR_BUFFER.get();
    if (dst == null || dst.capacity() < required) {
      dst = CharBuffer.allocate(required);
      CHAR_BUFFER.set(dst);
    }

    dst.clear();

    try {
      CoderResult cr = decoder.decode(src, dst, true);
      if (!cr.isUnderflow()) {
        cr.throwException();
      }
    } catch (CharacterCodingException x) {
      throw new Error(x);
    }

    return dst.flip().toString();
  }

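A table stores a string as a reference: the distance from the location that holds the reference to the start of the string. (Note that this is not the absolute offset of the referenced object but an offset relative to where the reference is stored; adding the two gives the absolute offset. This follows from how addOffset writes it.)
__offset returns 28, so __string is called with 68; bb.getInt(68) returns 72, and offset += bb.getInt(offset) yields 140, which is the position of the string.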
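
The 68 → 140 → "Orc" chain can be traced with a reduced sketch. It swaps the thread-local CharsetDecoder used by __string for a plain new String(bytes, UTF_8), which is simpler but allocates on every call. StringReadSketch, resolve and string are illustrative names, and the buffer holds only the name reference and the string itself, at the positions from the layout above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

final class StringReadSketch {
    static ByteBuffer sampleBuffer() {
        ByteBuffer bb = ByteBuffer.allocate(200).order(ByteOrder.LITTLE_ENDIAN);
        bb.putInt(68, 72);  // name slot in the Monster table: relative offset
        bb.putInt(140, 3);  // string header: length in bytes
        byte[] orc = {79, 114, 99, 0}; // "Orc" plus the terminating zero
        for (int i = 0; i < orc.length; i++) {
            bb.put(144 + i, orc[i]);
        }
        return bb;
    }

    /** Turns a stored relative offset into an absolute position. */
    static int resolve(ByteBuffer bb, int offset) {
        return offset + bb.getInt(offset);
    }

    /** Simplified stand-in for Table.__string (assumption: decoding via new String). */
    static String string(ByteBuffer bb, int offset) {
        int start = resolve(bb, offset);
        int length = bb.getInt(start);
        byte[] utf8 = new byte[length];
        for (int i = 0; i < length; i++) {
            utf8[i] = bb.get(start + 4 + i); // data begins after the 4-byte length
        }
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteBuffer bb = sampleBuffer();
        System.out.println(resolve(bb, 68)); // 68 + 72
        System.out.println(string(bb, 68));
    }
}
```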

5. Reading a struct

monster.pos().x()
  |
  public Vec3 pos() { return pos(new Vec3()); }
    |
    public Vec3 pos(Vec3 obj) { int o = __offset(4); return o != 0 ? obj.__assign(o + bb_pos, bb) : null; }
      |
      public Vec3 __assign(int _i, ByteBuffer _bb) { __init(_i, _bb); return this; }
        |
        public void __init(int _i, ByteBuffer _bb) { bb_pos = _i; bb = _bb; }
          |
          public float x() { return bb.getFloat(bb_pos + 0); }

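As with getRootAsMonster, this only records the ByteBuffer and the start position for the Vec3 object; its fields are then read directly at fixed byte offsets from that start. A struct is stored inline in the table, so no relative offset has to be resolved.
__offset(4) returns 32 and o + bb_pos is 72, which is exactly where pos sits in memory.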
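
As a quick check, the twelve pos bytes from the dump decode to the three floats passed to createVec3. This is a sketch with illustrative names (StructReadSketch, x, y, z); the +4 and +8 field offsets follow from Vec3 being three consecutive 4-byte floats.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class StructReadSketch {
    /** Only the inline Vec3 bytes, copied from the dump to position 72. */
    static ByteBuffer sampleBuffer() {
        ByteBuffer bb = ByteBuffer.allocate(200).order(ByteOrder.LITTLE_ENDIAN);
        byte[] pos = {0, 0, -128, 63, 0, 0, 0, 64, 0, 0, 64, 64};
        for (int i = 0; i < pos.length; i++) {
            bb.put(72 + i, pos[i]);
        }
        return bb;
    }

    // Struct fields live at fixed offsets from the struct's start; no vtable is involved.
    static float x(ByteBuffer bb, int structPos) { return bb.getFloat(structPos + 0); }
    static float y(ByteBuffer bb, int structPos) { return bb.getFloat(structPos + 4); }
    static float z(ByteBuffer bb, int structPos) { return bb.getFloat(structPos + 8); }

    public static void main(String[] args) {
        ByteBuffer bb = sampleBuffer();
        int structPos = 32 + 40; // __offset(4) + bb_pos
        System.out.println(x(bb, structPos) + ", " + y(bb, structPos) + ", " + z(bb, structPos));
    }
}
```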

6. Reading a vector

monster.inventory(1)
  |
  public int inventory(int j) { int o = __offset(14); return o != 0 ? bb.get(__vector(o) + j * 1) & 0xFF : 0; }
    |
     /**
      * Get the start data of a vector.
      *
      * @param offset An `int` index into the Table's ByteBuffer.
      * @return Returns the start of the vector data whose offset is stored at `offset`.
      */
      protected int __vector(int offset) {
        offset += bb_pos;
        return offset + bb.getInt(offset) + SIZEOF_INT;  // data starts after the length
      }

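Reading a vector is similar to reading a string. The main logic is in __vector(int offset), which returns the absolute position of the first element (just past the 4-byte length); elements are then read by index. Here __offset(14) returns 20, and __vector(20) computes 60 + bb.getInt(60) + 4 = 60 + 64 + 4 = 128, so inventory(1) reads the byte at 129, which is 1.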
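
The same numbers fall out of a sketch that keeps only the inventory reference and the vector itself. VectorReadSketch and its method names are illustrative; vectorLen mirrors what the runtime's length helper does, under the assumption that the length sits in the 4 bytes right before the data.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class VectorReadSketch {
    static final int BB_POS = 40; // start of the Monster table in the dump

    static ByteBuffer sampleBuffer() {
        ByteBuffer bb = ByteBuffer.allocate(200).order(ByteOrder.LITTLE_ENDIAN);
        bb.putInt(60, 64);  // inventory slot (o = 20): relative offset to the vector
        bb.putInt(124, 10); // vector header: element count
        for (int i = 0; i < 10; i++) {
            bb.put(128 + i, (byte) i); // treasure = {0, 1, ..., 9}
        }
        return bb;
    }

    /** Same arithmetic as Table.__vector: absolute position of element 0. */
    static int vector(ByteBuffer bb, int bbPos, int o) {
        int offset = o + bbPos;
        return offset + bb.getInt(offset) + 4; // data starts after the length
    }

    /** Element count, stored in the 4 bytes before the data. */
    static int vectorLen(ByteBuffer bb, int bbPos, int o) {
        int offset = o + bbPos;
        offset += bb.getInt(offset);
        return bb.getInt(offset);
    }

    /** Like the generated inventory(j): & 0xFF widens the signed byte to an unsigned value. */
    static int inventory(ByteBuffer bb, int bbPos, int o, int j) {
        return bb.get(vector(bb, bbPos, o) + j * 1) & 0xFF;
    }

    public static void main(String[] args) {
        ByteBuffer bb = sampleBuffer();
        System.out.println(vector(bb, BB_POS, 20));
        System.out.println(vectorLen(bb, BB_POS, 20));
        System.out.println(inventory(bb, BB_POS, 20, 1));
    }
}
```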

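7. Reading a table

Reading a table also resembles getRootAsMonster: once the offset has been resolved, __assign records the table's ByteBuffer and bb_pos, and the accessors described above can then be used. In a vector of tables each element is a 4-byte relative offset (hence j * 4), and __indirect(offset) resolves it by returning offset + bb.getInt(offset).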

monster.weapons(0)
  |
  public Weapon weapons(int j) { return weapons(new Weapon(), j); }
    |
    public Weapon weapons(Weapon obj, int j) { int o = __offset(18); return o != 0 ? obj.__assign(__indirect(__vector(o) + j * 4), bb) : null; }
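
Putting the pieces together for weapons(j).damage(): the sketch below rebuilds the weapons vector, the shared Weapon vtable and the two Weapon tables at the positions from the dump, then follows vector → indirect → vtable → field. TableInVectorSketch and its method names are illustrative, and only the bytes needed for damage are filled in.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class TableInVectorSketch {
    static final int BB_POS = 40; // start of the Monster table in the dump

    static ByteBuffer sampleBuffer() {
        ByteBuffer bb = ByteBuffer.allocate(200).order(ByteOrder.LITTLE_ENDIAN);
        bb.putInt(56, 56);  // weapons slot (o = 16): relative offset to the vector
        bb.putInt(112, 2);  // vector length
        bb.putInt(116, 52); // element 0 -> 116 + 52 = 168 (sword)
        bb.putInt(120, 28); // element 1 -> 120 + 28 = 148 (axe)
        short[] weaponVtable = {8, 12, 8, 6}; // vtable size, table size, name, damage
        for (int i = 0; i < weaponVtable.length; i++) {
            bb.putShort(160 + 2 * i, weaponVtable[i]);
        }
        bb.putInt(168, 8);            // sword: vtable at 168 - 8 = 160
        bb.putShort(174, (short) 3);  // sword damage
        bb.putInt(148, -12);          // axe: vtable at 148 + 12 = 160
        bb.putShort(154, (short) 5);  // axe damage
        return bb;
    }

    static int offset(ByteBuffer bb, int bbPos, int vtableOffset) {
        int vtable = bbPos - bb.getInt(bbPos);
        return vtableOffset < bb.getShort(vtable) ? bb.getShort(vtable + vtableOffset) : 0;
    }

    static int vector(ByteBuffer bb, int bbPos, int o) {
        int offset = o + bbPos;
        return offset + bb.getInt(offset) + 4;
    }

    static int indirect(ByteBuffer bb, int offset) {
        return offset + bb.getInt(offset);
    }

    /** Absolute position of the j-th Weapon table, i.e. what __assign receives. */
    static int weaponPos(ByteBuffer bb, int j) {
        return indirect(bb, vector(bb, BB_POS, 16) + j * 4);
    }

    /** Same shape as the generated Weapon.damage(), with bb_pos passed explicitly. */
    static short damage(ByteBuffer bb, int weaponPos) {
        int o = offset(bb, weaponPos, 6);
        return o != 0 ? bb.getShort(o + weaponPos) : 0;
    }

    public static void main(String[] args) {
        ByteBuffer bb = sampleBuffer();
        for (int j = 0; j < 2; j++) {
            int pos = weaponPos(bb, j);
            System.out.println(pos + " -> damage " + damage(bb, pos));
        }
    }
}
```

Both Weapon tables point at the same vtable at 160, one with a positive and one with a negative distance, which shows that the vtable reference is signed and that identical vtables are shared rather than written twice.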

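Summary

That mostly wraps up FlatBuffers. FlatBuffers offers a fresh perspective on serialization and deserialization: it is efficient in both memory and speed, yet its principles are simple. It is not as well known as Protocol Buffers, but quite a few projects use it. Arrow, which recently became an Apache top-level project, uses FlatBuffers as the serialized storage format for its schema. FlatBuffers does have one notable drawback: the generated code does not match typical calling habits, as the code above shows. Strings, vectors, and tables must be built first, before construction of the root type begins, and construction cannot be nested. That said, the objection is mostly a matter of habit and does not affect functionality, so most users can ignore it.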
