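Object Memory Layout & the HotSpot Object Model

Object Composition

An object's layout in memory can be divided into three regions: the object header, the instance data, and the alignment padding.

Object Header

Runtime Data (implemented by the Mark Word)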

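It includes the hash code, the GC generational age, the lock state flags, the locks held by a thread, the biased thread ID, and the bias timestamp (epoch).
This part is officially called the Mark Word. It is 32 bits long on a 32-bit VM and 64 bits long on a 64-bit VM.
It is a non-fixed data structure, so that as much data as possible can be stored in a very limited space.
For a 32-bit Mark Word, the layout of an object that is not locked is as follows: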

[Figure: WechatIMG281.png, layout of the 32-bit Mark Word for an unlocked object]

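25 bits hold the hash code, 4 bits hold the generational age, 1 bit is fixed at 0 (the biased-lock bit), and the lowest 2 bits are the lock flag.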

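Keep in mind that the layout of the Mark Word is not static: what it records, and therefore its structure, depends on the state of the object. The table below lists what the Mark Word stores in each state (the two flag bits are always present):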

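State	Stored content	Flag bits
Lightweight locked	Pointer to the lock record	00
Heavyweight locked	Pointer to the heavyweight lock (monitor)	10
GC mark	Empty	11
Biasable	Biased thread ID, bias timestamp (epoch), generational age	01 (biased-lock bit is 1)
Unlocked	Object hash code, generational age (as in the figure above)	01 (biased-lock bit is 0)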
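
The table can be read as a small decoding rule over the low bits of the Mark Word. The sketch below is a hypothetical helper, not HotSpot code: the function name `mark_state` and the strings it returns are invented for illustration, and it assumes the biased-lock bit sits directly above the two flag bits, as in the `biased_lock:1 lock:2` layout documented in markOop.hpp.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of HotSpot): names the state encoded in
// the low three bits of a mark word, following the state table.
inline std::string mark_state(std::uint64_t mark) {
    const std::uint64_t lock_bits = mark & 0x3;             // lowest two bits
    const bool biased_bit = ((mark >> 2) & 0x1) != 0;       // bit just above them
    switch (lock_bits) {
        case 0x0: return "lightweight-locked";              // flag bits 00
        case 0x2: return "heavyweight-locked";              // flag bits 10
        case 0x3: return "gc-marked";                       // flag bits 11
        default:  return biased_bit ? "biasable" : "unlocked";  // flag bits 01
    }
}

// Usage sketch. The values are made-up bit patterns, not mark words
// read from a running JVM.
inline void check_mark_state_examples() {
    assert(mark_state(0x1) == "unlocked");            // ...0 01
    assert(mark_state(0x5) == "biasable");            // ...1 01
    assert(mark_state(0x1000) == "lightweight-locked");
    assert(mark_state(0x1002) == "heavyweight-locked");
}
```

Because "biasable" and "unlocked" share the flag bits 01, the extra biased-lock bit is what tells the two apart.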

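Implementation of the Mark Word

HotSpot implements the Mark Word with the markOopDesc class (markOop is a pointer typedef for it); the corresponding code is in markOop.hpp. All excerpts here are based on OpenJDK 9, and other versions may be implemented differently.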

// The markOop describes the header of an object.
//
// Note that the mark is not a real oop but just a word.
// It is placed in the oop hierarchy for historical reasons.
//
// Bit-format of an object header (most significant first, big endian layout below):
//
//  32 bits:
//  --------
//             hash:25 ------------>| age:4    biased_lock:1 lock:2 (normal object)
//             JavaThread*:23 epoch:2 age:4    biased_lock:1 lock:2 (biased object)
//             size:32 ------------------------------------------>| (CMS free block)
//             PromotedObject*:29 ---------->| promo_bits:3 ----->| (CMS promoted object)
//
//  64 bits:
//  --------
//  unused:25 hash:31 -->| unused:1   age:4    biased_lock:1 lock:2 (normal object)
//  JavaThread*:54 epoch:2 unused:1   age:4    biased_lock:1 lock:2 (biased object)
//  PromotedObject*:61 --------------------->| promo_bits:3 ----->| (CMS promoted object)
//  size:64 ----------------------------------------------------->| (CMS free block)
//
//  unused:25 hash:31 -->| cms_free:1 age:4    biased_lock:1 lock:2 (COOPs && normal object)
//  JavaThread*:54 epoch:2 cms_free:1 age:4    biased_lock:1 lock:2 (COOPs && biased object)
//  narrowOop:32 unused:24 cms_free:1 unused:4 promo_bits:3 ----->| (COOPs && CMS promoted object)
//  unused:21 size:35 -->| cms_free:1 unused:7 ------------------>| (COOPs && CMS free block)
//
//  - hash contains the identity hash value: largest value is
//    31 bits, see os::random().  Also, 64-bit vm's require
//    a hash value no bigger than 32 bits because they will not
//    properly generate a mask larger than that: see library_call.cpp
//    and c1_CodePatterns_sparc.cpp.
//
//  - the biased lock pattern is used to bias a lock toward a given
//    thread. When this pattern is set in the low three bits, the lock
//    is either biased toward a given thread or "anonymously" biased,
//    indicating that it is possible for it to be biased. When the
//    lock is biased toward a given thread, locking and unlocking can
//    be performed by that thread without using atomic operations.
//    When a lock's bias is revoked, it reverts back to the normal
//    locking scheme described below.
//
//    Note that we are overloading the meaning of the "unlocked" state
//    of the header. Because we steal a bit from the age we can
//    guarantee that the bias pattern will never be seen for a truly
//    unlocked object.
//
//    Note also that the biased state contains the age bits normally
//    contained in the object header. Large increases in scavenge
//    times were seen when these bits were absent and an arbitrary age
//    assigned to all biased objects, because they tended to consume a
//    significant fraction of the eden semispaces and were not
//    promoted promptly, causing an increase in the amount of copying
//    performed. The runtime system aligns all JavaThread* pointers to
//    a very large value (currently 128 bytes (32bVM) or 256 bytes (64bVM))
//    to make room for the age bits & the epoch bits (used in support of
//    biased locking), and for the CMS "freeness" bit in the 64bVM (+COOPs).
//
//    [JavaThread* | epoch | age | 1 | 01]       lock is biased toward given thread
//    [0           | epoch | age | 1 | 01]       lock is anonymously biased
//
//  - the two lock bits are used to describe three states: locked/unlocked and monitor.
//
//    [ptr             | 00]  locked             ptr points to real header on stack
//    [header      | 0 | 01]  unlocked           regular object header
//    [ptr             | 10]  monitor            inflated lock (header is wapped out)
//    [ptr             | 11]  marked             used by markSweep to mark an object
//                                               not valid at any other time
//
//    We assume that stack/thread pointers have the lowest two bits cleared.

public:
  // Constants
  enum { age_bits                 = 4,
         lock_bits                = 2,
         biased_lock_bits         = 1,
         max_hash_bits            = BitsPerWord - age_bits - lock_bits - biased_lock_bits,
         hash_bits                = max_hash_bits > 31 ? 31 : max_hash_bits,
         cms_bits                 = LP64_ONLY(1) NOT_LP64(0),
         epoch_bits               = 2
  };

  // The biased locking code currently requires that the age bits be
  // contiguous to the lock bits.
  enum { lock_shift               = 0,
         biased_lock_shift        = lock_bits,
         age_shift                = lock_bits + biased_lock_bits,
         cms_shift                = age_shift + age_bits,
         hash_shift               = cms_shift + cms_bits,
         epoch_shift              = hash_shift
  };

The structure of the Mark Word is not fixed: each object state has its own corresponding layout.
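
To make the shift constants concrete, here is a minimal sketch that evaluates the two enums for the 64-bit case and uses them to pack and unpack a normal (unlocked) mark word. It is a standalone illustration, not HotSpot code: the `k...` constants mirror the enum values, while `field`, `MarkFields`, `decode_mark` and `make_unlocked_mark` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Values the enums take when BitsPerWord = 64:
// max_hash_bits = 64 - 4 - 2 - 1 = 57, so hash_bits is capped at 31,
// and cms_bits is 1 (LP64).
constexpr int kLockBits = 2;
constexpr int kBiasedLockBits = 1;
constexpr int kAgeBits = 4;
constexpr int kCmsBits = 1;
constexpr int kHashBits = 31;

constexpr int kLockShift = 0;
constexpr int kBiasedLockShift = kLockBits;              // 2
constexpr int kAgeShift = kLockBits + kBiasedLockBits;   // 3
constexpr int kCmsShift = kAgeShift + kAgeBits;          // 7
constexpr int kHashShift = kCmsShift + kCmsBits;         // 8

// Hypothetical helper: extracts `bits` bits starting at bit `shift`.
constexpr std::uint64_t field(std::uint64_t mark, int shift, int bits) {
    return (mark >> shift) & ((std::uint64_t{1} << bits) - 1);
}

struct MarkFields {
    std::uint64_t lock;
    std::uint64_t biased_lock;
    std::uint64_t age;
    std::uint64_t hash;
};

// Splits a 64-bit mark word laid out as a "normal object".
inline MarkFields decode_mark(std::uint64_t mark) {
    return MarkFields{field(mark, kLockShift, kLockBits),
                      field(mark, kBiasedLockShift, kBiasedLockBits),
                      field(mark, kAgeShift, kAgeBits),
                      field(mark, kHashShift, kHashBits)};
}

// Builds an unlocked mark word: biased_lock = 0, lock = 01.
inline std::uint64_t make_unlocked_mark(std::uint64_t hash, std::uint64_t age) {
    const std::uint64_t hash_mask = (std::uint64_t{1} << kHashBits) - 1;
    const std::uint64_t age_mask = (std::uint64_t{1} << kAgeBits) - 1;
    return ((hash & hash_mask) << kHashShift) | ((age & age_mask) << kAgeShift) | 0x1;
}

// Usage sketch with made-up values: pack a hash and an age, then read them back.
inline void check_round_trip() {
    const MarkFields f = decode_mark(make_unlocked_mark(0x1234567, 5));
    assert(f.hash == 0x1234567 && f.age == 5);
    assert(f.biased_lock == 0 && f.lock == 1);
}
```

Note that the age field is only 4 bits wide, so the largest age it can hold is 15.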

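Type Pointer

The type pointer in the object header points to the object's class metadata; the JVM uses this pointer to determine which class the object is an instance of.

If the object is an array, the header also needs a field that records the array length.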

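Instance Data

These are the fields of every type defined in the program code, including those inherited from superclasses.

The storage order of this part is affected by the JVM's field allocation strategy and by the order in which the fields are defined in the source code.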

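Alignment Padding

HotSpot requires the start address of an object to be a multiple of 8 bytes, which means the size of an object must also be a multiple of 8 bytes.

The Mark Word is 4 bytes (32 bits) on a 32-bit VM and 8 bytes (64 bits) on a 64-bit VM, and the type pointer follows it. Whenever the object header plus the instance data does not add up to a multiple of 8 bytes, alignment padding makes up the difference.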
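
The padding rule is plain round-up arithmetic. The sketch below is only an illustration under the 8-byte alignment stated above; `align_up` and `padding_for` are hypothetical helpers, and the header and field sizes in the usage function are example numbers, not values measured from a JVM.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kObjectAlignment = 8;  // bytes, as stated in the text

// Rounds `size` up to the next multiple of `alignment`.
// The bit trick only works when `alignment` is a power of two.
inline std::size_t align_up(std::size_t size, std::size_t alignment = kObjectAlignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}

// Number of padding bytes appended after header + instance data.
inline std::size_t padding_for(std::size_t header_bytes, std::size_t instance_bytes) {
    const std::size_t raw = header_bytes + instance_bytes;
    return align_up(raw) - raw;
}

// Usage sketch with example sizes.
inline void check_padding_examples() {
    assert(padding_for(8, 4) == 4);   // 12 bytes are padded to 16
    assert(padding_for(16, 8) == 0);  // 24 bytes are already aligned
}
```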

1 byte = 8 bit

1B = 1 byte

1b = 1 bit

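The HotSpot Object Model

OOP/Klass

The HotSpot JVM does not map each Java object directly onto a newly created C++ object of a corresponding class. Instead, it is built around an oop/klass model.

OOP: Ordinary Object Pointer, used to represent an object's instance data.

Klass: used to hold the metadata that describes a class.

klass.hpp describes Klass as follows: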

// A Klass provides:
//  1: language level class object (method dictionary etc.)
//  2: provide vm dispatch behavior for the object

// Both functions are combined into one C++ class.

// One reason for the oop/klass dichotomy in the implementation is
// that we don't want a C++ vtbl pointer in every object.  Thus,
// normal oops don't have any virtual functions.  Instead, they
// forward all "virtual" functions to their klass, which does have
// a vtbl and does the C++ dispatch depending on the object's
// actual type.  (See oop.inline.hpp for some of the forwarding code.)
// ALL FUNCTIONS IMPLEMENTING THIS DISPATCH ARE PREFIXED WITH "oop_"!

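The reason for the oop/klass split is to avoid carrying a C++ vtbl (virtual function table) pointer in every object. An oop has no virtual functions of its own; it forwards all "virtual" calls to its klass, which does have a vtbl and performs the dispatch.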
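
The forwarding idea can be shown with a toy model. Everything in the sketch below (`ToyOop`, `ToyKlass`, `ToyInstanceKlass`, `ToyArrayKlass`, `oop_name`) is a hypothetical name made up for illustration and is far simpler than the real HotSpot classes; it only demonstrates that the per-object struct stays free of a vtbl pointer while the per-class struct does the virtual dispatch.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

struct ToyOop;  // forward declaration

// Plays the role of Klass: one instance per class, and it owns the C++ vtbl.
struct ToyKlass {
    virtual ~ToyKlass() = default;
    // Following the "oop_" prefix convention mentioned in klass.hpp.
    virtual std::string oop_name(const ToyOop* obj) const = 0;
};

// Plays the role of oopDesc: plain data, no virtual functions.
struct ToyOop {
    const ToyKlass* klass;
    // "Virtual" behavior is forwarded to the klass.
    std::string name() const { return klass->oop_name(this); }
};

// The object type itself carries no vtbl pointer.
static_assert(!std::is_polymorphic<ToyOop>::value, "ToyOop must not have a vtbl");

struct ToyInstanceKlass : ToyKlass {
    std::string oop_name(const ToyOop*) const override { return "instance"; }
};

struct ToyArrayKlass : ToyKlass {
    std::string oop_name(const ToyOop*) const override { return "array"; }
};

// Usage sketch: two objects of the same C++ type behave differently
// because they point at different klasses.
inline void check_dispatch() {
    ToyInstanceKlass instance_klass;
    ToyArrayKlass array_klass;
    ToyOop a{&instance_klass};
    ToyOop b{&array_klass};
    assert(a.name() == "instance");
    assert(b.name() == "array");
}
```

The cost of the vtbl pointer is paid once per klass rather than once per object, which matters when a heap holds a very large number of small objects.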

OOP

oop is implemented by oopDesc (see oop.hpp). Part of the code is shown below:

class oopDesc {
  friend class VMStructs;
  friend class JVMCIVMStructs;
 private:
  volatile markOop _mark;
  union _metadata {
    Klass*      _klass;
    narrowKlass _compressed_klass;
  } _metadata;

  // Fast access to barrier set. Must be initialized.
  static BarrierSet* _bs;

 public:
  markOop  mark()      const { return _mark; }
  markOop* mark_addr() const { return (markOop*) &_mark; }

  void set_mark(volatile markOop m) { _mark = m; }

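Focus on the private section, because that is the data an oopDesc itself carries.

As the code shows, an oopDesc consists of two parts, _mark and _metadata. (The static member _bs is shared by all instances and is not stored in each object.)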

_mark

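_mark has type markOop. It is the implementation of the Mark Word, that is, of the runtime data in the object header.

See the section on the object header's runtime data above.

Its size matches the word size of the JVM: 32 bits on a 32-bit VM and 64 bits on a 64-bit VM.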

_metadata

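_metadata is a union. Both _klass and _compressed_klass refer to the object's Klass (an InstanceKlass for an instance of an ordinary class); _compressed_klass holds that reference in compressed (narrow) form, it does not point to a compressed object.

The _klass field establishes the link between an oop and its klass.

In a union, all members share the same block of memory, and the size of the union is determined by its largest member.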
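
The union property is easy to check in isolation. The sketch below uses hypothetical stand-ins (`FakeKlass`, `FakeNarrowKlass`, `FakeMetadata`) rather than the real HotSpot types, and it assumes that the narrow form is a 32-bit unsigned integer.

```cpp
#include <cassert>
#include <cstdint>

struct FakeKlass {};                     // hypothetical stand-in for Klass
using FakeNarrowKlass = std::uint32_t;   // assumed 32-bit stand-in for narrowKlass

// Same shape as oopDesc::_metadata: a full pointer or a narrow value,
// never both at once.
union FakeMetadata {
    FakeKlass*      klass;
    FakeNarrowKlass compressed_klass;
};

// The union is at least as large as each of its members.
static_assert(sizeof(FakeMetadata) >= sizeof(FakeKlass*), "must fit the pointer");
static_assert(sizeof(FakeMetadata) >= sizeof(FakeNarrowKlass), "must fit the narrow value");

// Returns true when both members start at the same address.
inline bool members_share_address() {
    FakeMetadata m;
    m.klass = nullptr;
    return static_cast<void*>(&m.klass) == static_cast<void*>(&m.compressed_klass);
}

// Usage sketch.
inline void check_union_layout() {
    assert(members_share_address());
    assert(sizeof(FakeMetadata) == sizeof(FakeKlass*));  // the pointer is the largest member here
}
```

Since only one member is meaningful at a time, code reading the union has to know from elsewhere which form is in use.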

instanceKlass.hpp describes InstanceKlass as follows:

// An InstanceKlass is the VM level representation of a Java class.
// It contains all information needed for at class at execution runtime.

//  InstanceKlass embedded field layout (after declared fields):
//    [EMBEDDED Java vtable             ] size in words = vtable_len
//    [EMBEDDED nonstatic oop-map blocks] size in words = nonstatic_oop_map_size
//      The embedded nonstatic oop-map blocks are short pairs (offset, length)
//      indicating where oops are located in instances of this klass.
//    [EMBEDDED implementor of the interface] only exist for interface
//    [EMBEDDED host klass        ] only exist for an anonymous class (JSR 292 enabled)
//    [EMBEDDED fingerprint       ] only if should_store_fingerprint()==true