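Interpretation
In the computer world, all data is stored in binary form. These binary numbers can represent every kind of information in human society: text, decimal numbers, floating-point numbers, and so on.
This raises a question: how does the computer itself recognize what this information is?
The short answer is that the computer cannot recognize it on its own; it depends on the interpretation supplied by the programmer.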
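As the figure above shows, when we write some binary data and then ask the computer what that data is, it cannot answer. To get the right answer we need another piece of data: metadata (data that describes the data). The instructions we issue to the machine usually carry this information implicitly; for example, the JVM bytecode iadd means "add two int values".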
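To make the point concrete, here is a minimal sketch in which the same 32-bit pattern is read as an int and as a float. The class name BitInterpretation and the sample value are illustrative choices, not taken from the original post.

```java
// Illustrative sketch: the bits never change, only the interpretation does.
class BitInterpretation {
    /** Reinterprets a 32-bit pattern as an IEEE 754 single-precision float. */
    static float asFloat(int bits) {
        return Float.intBitsToFloat(bits);
    }

    /** Reinterprets the low 16 bits of the pattern as a UTF-16 code unit. */
    static char asChar(int bits) {
        return (char) bits;
    }

    public static void main(String[] args) {
        int bits = 0x42280000; // one fixed bit pattern
        System.out.println("as int:   " + bits);          // 1109917696
        System.out.println("as float: " + asFloat(bits)); // 42.0
        System.out.println("as char:  " + asChar(0x41));  // A
    }
}
```

The type chosen by the program (int, float, char) plays the role of the metadata here; the stored bits themselves say nothing about it.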
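Metadata of this kind is everywhere in the lower layers of a computer. For example:
A single bit has only two patterns, 0 or 1, and both can be legitimate data: a file that is one bit long may validly contain either 0 or 1. Whether that bit is occupied therefore cannot be expressed by the bit itself; a separate piece of data is needed to describe it, and that data can also be regarded as metadata.
This is the basic principle behind disk block management and memory page frames.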
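The idea of keeping occupancy information outside the data can be sketched with a bitmap allocator. This is only an illustration under simple assumptions: the class BlockBitmap and its methods are hypothetical, and real file systems and page-frame allocators are considerably more elaborate.

```java
import java.util.BitSet;

// Illustrative sketch: one metadata bit per block records whether the block is in use.
class BlockBitmap {
    private final BitSet used;    // metadata: bit i set means block i is occupied
    private final int blockCount; // number of blocks being managed

    BlockBitmap(int blockCount) {
        this.blockCount = blockCount;
        this.used = new BitSet(blockCount);
    }

    /** Marks the first free block as used and returns its index, or -1 if none is free. */
    int allocate() {
        int index = used.nextClearBit(0);
        if (index >= blockCount) {
            return -1;
        }
        used.set(index);
        return index;
    }

    /** Marks a block as free again; the block's contents are left untouched. */
    void free(int index) {
        used.clear(index);
    }

    boolean isUsed(int index) {
        return used.get(index);
    }

    public static void main(String[] args) {
        BlockBitmap bitmap = new BlockBitmap(4);
        int first = bitmap.allocate();
        System.out.println("allocated block " + first + ", used = " + bitmap.isUsed(first));
    }
}
```

Note that freeing a block only flips the metadata bit, which is why the block's own bits can keep holding any value.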
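Of course, disks and memory do not operate on one bit at a time. The basic storage unit of a disk is the sector.
The tracks at the same position on all platter surfaces form a cylinder, and the heads within each cylinder are numbered from 0, top to bottom. Data is read and written cylinder by cylinder: the operation starts at head 0 of a cylinder and proceeds downward through the other surfaces (heads) of that same cylinder, and only after all heads of the cylinder have been read or written does the head assembly move to the next cylinder. The reason is that selecting a head only requires an electronic switch, whereas selecting a cylinder requires mechanical movement.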
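The cylinder-first ordering described above is reflected in the conventional mapping from a cylinder/head/sector (CHS) triple to a linear block number. The sketch below uses that conventional formula; the geometry values (16 heads, 63 sectors per track) and the class name ChsAddressing are assumptions for illustration only.

```java
// Illustrative sketch of conventional CHS-to-LBA addressing.
// The head index varies before the cylinder index, so consecutive
// blocks stay within one cylinder for as long as possible.
class ChsAddressing {
    /**
     * @param cylinder         0-based cylinder number
     * @param head             0-based head number within the cylinder
     * @param sector           1-based sector number within the track
     * @param headsPerCylinder assumed disk geometry
     * @param sectorsPerTrack  assumed disk geometry
     */
    static long chsToLba(int cylinder, int head, int sector,
                         int headsPerCylinder, int sectorsPerTrack) {
        return ((long) cylinder * headsPerCylinder + head) * sectorsPerTrack + (sector - 1);
    }

    public static void main(String[] args) {
        int heads = 16, sectors = 63; // hypothetical geometry
        System.out.println(chsToLba(0, 0, 1, heads, sectors)); // first sector of the disk
        System.out.println(chsToLba(0, 1, 1, heads, sectors)); // next head, same cylinder
        System.out.println(chsToLba(1, 0, 1, heads, sectors)); // first sector of next cylinder
    }
}
```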
For more on disk and file system management, see
https://www.jianshu.com/p/5f77b221165e
Recommended reading: 冬瓜哥, 《大话存储》.
Java Object
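Java objects are generally composite objects, and the same basic principle applies to them: the data itself lives in one place (the heap). To access such an object, the JVM must know where each field is located and what data it holds; this metadata also lives somewhere (the heap or the method area). References to objects must be recorded as well, because the JVM GC has to traverse all references in order to reclaim memory. This reference information is stored in a structure called the OopMap (see the discussion of root enumeration in 周志明, 《深入理解Java虚拟机》, 2nd edition).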
GC Root and OopMap analysis
mark word
Besides the content above, a Java object has another very important part: the mark word.
The object header consists of a mark word and a klass pointer.
The mark word has word size (4 bytes on 32-bit architectures, 8 bytes on 64-bit architectures), and the klass pointer has word size on 32-bit architectures. On 64-bit architectures the klass pointer either has word size or can be 4 bytes if the heap addresses can be encoded in those 4 bytes.
This optimization is called "compressed oops", and you can control it with the option UseCompressedOops.
In other words, the mark word matches the machine word size, and on a 64-bit JVM the klass pointer can shrink to 4 bytes through pointer compression. (On JDK 8 and later, the klass pointer itself is governed by the separate option UseCompressedClassPointers, which is enabled by default together with UseCompressedOops.)
Original source:
https://stackoverflow.com/questions/26357186/what-is-in-java-object-header
The following is excerpted from markOop.hpp in OpenJDK:
// Bit-format of an object header (most significant first, big endian layout below):
//
// 32 bits:
// --------
// hash:25 ------------>| age:4 biased_lock:1 lock:2 (normal object)
// JavaThread*:23 epoch:2 age:4 biased_lock:1 lock:2 (biased object)
// size:32 ------------------------------------------>| (CMS free block)
// PromotedObject*:29 ---------->| promo_bits:3 ----->| (CMS promoted object)
//
// 64 bits:
// --------
// unused:25 hash:31 -->| unused:1 age:4 biased_lock:1 lock:2 (normal object)
// JavaThread*:54 epoch:2 unused:1 age:4 biased_lock:1 lock:2 (biased object)
// PromotedObject*:61 --------------------->| promo_bits:3 ----->| (CMS promoted object)
// size:64 ----------------------------------------------------->| (CMS free block)
//
// unused:25 hash:31 -->| cms_free:1 age:4 biased_lock:1 lock:2 (COOPs && normal object)
// JavaThread*:54 epoch:2 cms_free:1 age:4 biased_lock:1 lock:2 (COOPs && biased object)
// narrowOop:32 unused:24 cms_free:1 unused:4 promo_bits:3 ----->| (COOPs && CMS promoted object)
// unused:21 size:35 -->| cms_free:1 unused:7 ------------------>| (COOPs && CMS free block)
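As a reading aid for the layout comment, the following sketch extracts the fields of a 64-bit mark word for the "normal object" case. The shifts and masks are derived only from the bit widths listed in the comment above (lock:2, biased_lock:1, age:4, one unused/cms_free bit, hash:31); the class MarkWordFields is a hypothetical helper, not part of HotSpot.

```java
// Illustrative sketch: field extraction for a 64-bit "normal object" mark word.
// Layout (most significant first): unused:25 hash:31 unused:1 age:4 biased_lock:1 lock:2
final class MarkWordFields {
    static final int LOCK_BITS = 2;
    static final int BIASED_LOCK_BITS = 1;
    static final int AGE_BITS = 4;
    static final int AGE_SHIFT = LOCK_BITS + BIASED_LOCK_BITS; // 3
    static final int HASH_SHIFT = AGE_SHIFT + AGE_BITS + 1;    // 8, skipping the unused/cms_free bit

    static int lock(long mark) {
        return (int) (mark & 0b11);
    }

    static int biasedLock(long mark) {
        return (int) ((mark >>> LOCK_BITS) & 1);
    }

    static int age(long mark) {
        return (int) ((mark >>> AGE_SHIFT) & 0xF);
    }

    static int hash(long mark) {
        return (int) ((mark >>> HASH_SHIFT) & 0x7FFFFFFFL);
    }

    public static void main(String[] args) {
        long mark = 0x9L; // binary ...0000 1001
        System.out.printf("lock=%d biased_lock=%d age=%d hash=%d%n",
                lock(mark), biasedLock(mark), age(mark), hash(mark));
    }
}
```

These accessors are only meaningful while the object is in the normal (unlocked, unbiased) state; in the other states the same bits hold a thread pointer or other data, as the remaining rows of the comment show.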
Actual layout
JOL can show how an object is actually laid out in memory.
The source is as follows:
public static class A {
    private boolean a;
}
# Running 64-bit HotSpot VM.
# Using compressed oop with 3-bit shift.
# Using compressed klass with 3-bit shift.
# WARNING | Compressed references base/shifts are guessed by the experiment!
# WARNING | Therefore, computed addresses are just guesses, and ARE NOT RELIABLE.
# WARNING | Make sure to attach Serviceability Agent to get the reliable addresses.
# Objects are 8 bytes aligned.
# Field sizes by type: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
# Array element sizes: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
**** Fresh object
org.openjdk.jol.samples.JOLSample_14_FatLocking$A object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 01 00 00 00 (00000001 00000000 00000000 00000000) (1)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
8 4 (object header) 16 f0 00 f8 (00010110 11110000 00000000 11111000) (-134156266)
12 1 boolean A.a false
13 3 (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 3 bytes external = 3 bytes total
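The 16-byte instance size in this output follows from simple arithmetic: a 12-byte header plus a 1-byte boolean gives 13 bytes, which is rounded up to the next multiple of the 8-byte object alignment. A minimal sketch of that calculation, with the hypothetical helper class InstanceSizeSketch:

```java
// Illustrative sketch: rounding an object's raw size up to the 8-byte alignment
// reported by JOL ("Objects are 8 bytes aligned").
final class InstanceSizeSketch {
    /** Rounds size up to a multiple of alignment; alignment must be a power of two. */
    static long alignUp(long size, long alignment) {
        return (size + alignment - 1) & -alignment;
    }

    /** Header plus fields, padded to the 8-byte object alignment. */
    static long instanceSize(long headerBytes, long fieldBytes) {
        return alignUp(headerBytes + fieldBytes, 8);
    }

    public static void main(String[] args) {
        long header = 12; // 8-byte mark word + 4-byte compressed class pointer
        long fields = 1;  // one boolean
        long size = instanceSize(header, fields);
        System.out.println("instance size: " + size);                  // 16
        System.out.println("padding: " + (size - header - fields));    // 3
    }
}
```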
What does "Using compressed oop with 3-bit shift" mean?
Why does a compressed pointer need to be shifted by 3 bits? Some background reading first:
https://wiki.openjdk.java.net/display/HotSpot/CompressedOops
https://stackoverflow.com/questions/25120546/trick-behind-jvms-compressed-oops
- On modern computer architectures, memory addresses are byte addresses: every byte has its own address.
- Java object references are addresses that point to the start of a word. (A word is 4 bytes on 32-bit and 8 bytes on 64-bit; this is why objects on a 64-bit JVM are aligned to 8 bytes, a fact used again below.)
- On a 64-bit machine, word alignment means that the bottom 3 bits of an object reference / address are zero (2^3 = 8).
- So, by shifting an address 3 bits to the right, we can "compress" up to 35 bits of a 64-bit address into a 32-bit word (a half-word relative to 64 bits).
- Decompression can be done by shifting 3 bits to the left, which puts those 3 zero bits back.
- 35 bits of addressing (of which only 32 are actually stored) allow us to represent object pointers for up to 32 GB of heap memory using compressed oops that fit in 32-bit (half-)words on a 64-bit machine.
"Using compressed oop with 3-bit shift" therefore refers to the 8-byte alignment described above.
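The shift-based compression can be sketched in a few lines. This is a simplified model that assumes a zero heap base, so an address is compressed purely by shifting; HotSpot may additionally add a heap base address, which is left out here. The class CompressedOopSketch is hypothetical.

```java
// Illustrative sketch of 3-bit-shift pointer compression with an assumed zero heap base.
final class CompressedOopSketch {
    static final int SHIFT = 3; // 8-byte alignment means the low 3 bits are always zero

    /** Compresses an 8-byte-aligned address of at most 35 bits into 32 bits. */
    static int encode(long address) {
        if ((address & 0b111) != 0) {
            throw new IllegalArgumentException("address is not 8-byte aligned");
        }
        if ((address >>> (32 + SHIFT)) != 0) {
            throw new IllegalArgumentException("address needs more than 35 bits");
        }
        return (int) (address >>> SHIFT);
    }

    /** Restores the address by treating the value as unsigned and re-adding the 3 zero bits. */
    static long decode(int narrow) {
        return (narrow & 0xFFFFFFFFL) << SHIFT;
    }

    /** Number of bytes addressable with 32 stored bits plus the 3 implied zero bits. */
    static long maxHeapBytes() {
        return 1L << (32 + SHIFT);
    }

    public static void main(String[] args) {
        long address = 0x12345678L;
        int narrow = encode(address);
        System.out.printf("0x%x -> 0x%x -> 0x%x%n", address, narrow, decode(narrow));
        System.out.println("max heap in GiB: " + (maxHeapBytes() >> 30)); // 32
    }
}
```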
Field sizes by type: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
out.printf("# %-19s: %d, %d, %d, %d, %d, %d, %d, %d, %d [bytes]%n",
"Field sizes by type",
oopSize,
sizes.booleanSize,
sizes.byteSize,
sizes.charSize,
sizes.shortSize,
sizes.intSize,
sizes.floatSize,
sizes.longSize,
sizes.doubleSize
);
Array element sizes: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
out.printf("# %-19s: %d, %d, %d, %d, %d, %d, %d, %d, %d [bytes]%n",
"Array element sizes",
U.arrayIndexScale(Object[].class),
U.arrayIndexScale(boolean[].class),
U.arrayIndexScale(byte[].class),
U.arrayIndexScale(char[].class),
U.arrayIndexScale(short[].class),
U.arrayIndexScale(int[].class),
U.arrayIndexScale(float[].class),
U.arrayIndexScale(long[].class),
U.arrayIndexScale(double[].class)
);
Object header analysis
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 01 00 00 00 (00000001 00000000 00000000 00000000) (1)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
8 4 (object header) 16 f0 00 f8 (00010110 11110000 00000000 11111000) (-134156266)
The current environment is 64-bit with compressed pointers enabled, so the mark word occupies 8 bytes, namely:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 01 00 00 00 (00000001 00000000 00000000 00000000) (1)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
The third row is the compressed class pointer, 4 bytes:
8 4 (object header) 16 f0 00 f8 (00010110 11110000 00000000 11111000) (-134156266)
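To work with these rows as numbers, the 12 header bytes can be combined into an 8-byte mark word and a 4-byte class pointer. The sketch below assumes the bytes are printed in memory order on a little-endian machine, which is consistent with the decimal values JOL shows next to each row; the class HeaderBytes is a hypothetical helper.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative sketch: reassembling the JOL header bytes (assumed little-endian).
final class HeaderBytes {
    /** Bytes 0..7 of the header form the mark word. */
    static long markWord(byte[] header) {
        return ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN).getLong(0);
    }

    /** Bytes 8..11 of the header form the compressed class pointer. */
    static int narrowKlass(byte[] header) {
        return ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN).getInt(8);
    }

    public static void main(String[] args) {
        // Header bytes of the fresh object, copied from the JOL output.
        byte[] fresh = {
            0x01, 0x00, 0x00, 0x00,
            0x00, 0x00, 0x00, 0x00,
            0x16, (byte) 0xf0, 0x00, (byte) 0xf8
        };
        System.out.printf("mark word:    0x%016x%n", markWord(fresh));
        System.out.printf("narrow klass: 0x%08x (%d)%n", narrowKlass(fresh), narrowKlass(fresh));
    }
}
```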
JDK synchronized
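Since JDK 1.6, synchronized has been optimized and now involves lock escalation. In theory, the lock transitions can be observed by combining JOL with the mark word definition in OpenJDK's markOop.hpp. However, given the author's limited understanding at present, the corresponding material in 方腾飞's 《Java并发编程的艺术》 does not fully match the structure defined in markOop.
The lock transitions actually observed are as follows.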
JOL source code
https://github.com/sparrowzoo/jol
Test source code
JOLSample_14_FatLocking
**** Fresh object
org.openjdk.jol.samples.JOLSample_14_FatLocking$A object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 01 00 00 00 (00000001 00000000 00000000 00000000) (1)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
8 4 (object header) 16 f0 00 f8 (00010110 11110000 00000000 11111000) (-134156266)
12 1 boolean A.a false
13 3 (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 3 bytes external = 3 bytes total
**** Before the lock
org.openjdk.jol.samples.JOLSample_14_FatLocking$A object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 98 68 0b 11 (10011000 01101000 00001011 00010001) (285960344)
4 4 (object header) 00 70 00 00 (00000000 01110000 00000000 00000000) (28672)
8 4 (object header) 16 f0 00 f8 (00010110 11110000 00000000 11111000) (-134156266)
12 1 boolean A.a false
13 3 (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 3 bytes external = 3 bytes total
**** With the lock
org.openjdk.jol.samples.JOLSample_14_FatLocking$A object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 9a 76 01 9d (10011010 01110110 00000001 10011101) (-1660848486)
4 4 (object header) b5 7f 00 00 (10110101 01111111 00000000 00000000) (32693)
8 4 (object header) 16 f0 00 f8 (00010110 11110000 00000000 11111000) (-134156266)
12 1 boolean A.a false
13 3 (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 3 bytes external = 3 bytes total
**** After the lock
org.openjdk.jol.samples.JOLSample_14_FatLocking$A object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 9a 76 01 9d (10011010 01110110 00000001 10011101) (-1660848486)
4 4 (object header) b5 7f 00 00 (10110101 01111111 00000000 00000000) (32693)
8 4 (object header) 16 f0 00 f8 (00010110 11110000 00000000 11111000) (-134156266)
12 1 boolean A.a false
13 3 (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 3 bytes external = 3 bytes total
**** After System.gc()
org.openjdk.jol.samples.JOLSample_14_FatLocking$A object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 09 00 00 00 (00001001 00000000 00000000 00000000) (9)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
8 4 (object header) 16 f0 00 f8 (00010110 11110000 00000000 11111000) (-134156266)
12 1 boolean A.a false
13 3 (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 3 bytes external = 3 bytes total
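One way to read these snapshots is to decode the lowest bits of the first header byte, which on a little-endian machine is the least significant byte of the mark word. The sketch below assumes the lock-bit encoding defined in markOop.hpp outside the excerpt quoted earlier (00 lightweight-locked, 01 unlocked, 10 inflated monitor, 11 marked by GC, and 101 in the low three bits for a biased object). The class LockStateDecoder is a hypothetical helper, and the interpretation is offered as a reading aid, not as a definitive account of HotSpot's locking.

```java
// Illustrative sketch: classifying a mark word by its lowest bits.
final class LockStateDecoder {
    enum State { UNLOCKED, BIASED, LIGHTWEIGHT_LOCKED, INFLATED, GC_MARKED }

    /** Decodes the lock state from the first (least significant) header byte. */
    static State decode(int firstHeaderByte) {
        if ((firstHeaderByte & 0b111) == 0b101) {
            return State.BIASED;
        }
        switch (firstHeaderByte & 0b11) {
            case 0b00: return State.LIGHTWEIGHT_LOCKED;
            case 0b01: return State.UNLOCKED;
            case 0b10: return State.INFLATED;
            default:   return State.GC_MARKED;
        }
    }

    /** GC age; only meaningful while the object is unlocked or biased. */
    static int age(int firstHeaderByte) {
        return (firstHeaderByte >>> 3) & 0xF;
    }

    public static void main(String[] args) {
        // First header byte of each snapshot, copied from the JOL output.
        String[] labels = {"Fresh object", "Before the lock", "With the lock",
                           "After the lock", "After System.gc()"};
        int[] firstBytes = {0x01, 0x98, 0x9a, 0x9a, 0x09};
        for (int i = 0; i < labels.length; i++) {
            System.out.println(labels[i] + ": " + decode(firstBytes[i]));
        }
        System.out.println("age after System.gc(): " + age(0x09));
    }
}
```

Under these assumptions the sequence reads: unlocked, lightweight-locked, inflated, still inflated after the lock is released, and unlocked again with age 1 after System.gc().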
Background reading on the theory of lock escalation:
http://ifeve.com/java-synchronized/
https://blog.csdn.net/qq838642798/article/details/64439761
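Summary
In summary:
- synchronized is not inherently slow: since JDK 1.6 the lock is escalated only as needed.
- Primitive types save space and should be preferred where possible. There is one caveat, however: when a wrapper value is unboxed into a primitive, it can easily throw a NullPointerException that is hard to locate and diagnose.
For example, int value = 2 * Config.getValue("key"), or a value read from a database: if the result is null, a NullPointerException is thrown immediately.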
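The pitfall can be reproduced in a few lines. In this sketch the Config lookup from the example is replaced by a hypothetical getValue backed by a HashMap, on the assumption that it returns an Integer and yields null for a missing key; the class UnboxingPitfall and its methods are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: auto-unboxing a null Integer throws NullPointerException.
final class UnboxingPitfall {
    static final Map<String, Integer> CONFIG = new HashMap<>();

    /** Hypothetical stand-in for Config.getValue: returns null when the key is absent. */
    static Integer getValue(String key) {
        return CONFIG.get(key);
    }

    /** Mirrors the risky expression; returns true if unboxing threw. */
    static boolean unboxingThrows(String key) {
        try {
            int value = 2 * getValue(key); // unboxes the Integer; fails if it is null
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    /** Safer variant: check for null before the value is unboxed. */
    static int doubledOrDefault(String key, int defaultValue) {
        Integer raw = getValue(key);
        return raw == null ? defaultValue : 2 * raw;
    }

    public static void main(String[] args) {
        System.out.println("missing key throws: " + unboxingThrows("key"));
        System.out.println("with default: " + doubledOrDefault("key", 0));
    }
}
```

The exception is raised on the line doing the arithmetic, not where the null originated, which is part of what makes it hard to trace.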