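volatile literally means "changeable". In Java it does two things: it guarantees visibility between threads, and it prevents instruction reordering.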
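As we know, every thread has its own private working area, and all threads share a common area, the heap. When a thread uses an object on the heap, it works on a copy in its private area. Any change is applied to that private copy first and is written back to shared memory (the heap) only at some later point.
This creates a visibility problem. Suppose thread one modifies the object and thread two is then supposed to process it, but thread two reads the object from shared memory before thread one has written its update back. Thread two ends up working with stale data.
So we use the volatile keyword to guarantee visibility for that variable. When a thread changes a volatile variable, the new value is flushed to main memory right away, and any copies cached by other threads are invalidated, so those threads must re-read the latest value from main memory. That is how visibility is guaranteed.
Under the hood this relies on the CPU's cache coherence protocol, the mechanism multi-core machines use to keep the caches of the individual CPUs consistent.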
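Note ⚠️: if you are not sure you need volatile, don't use it. The simpler the value it modifies, the better. Try not to put it on a reference, for example:
volatile ArrayList<Object> mylist = new ArrayList<>();
Here volatile applies only to the reference, so it guarantees visibility of the reference itself. Other threads are guaranteed to see the change when mylist is pointed at a different object, but they are not guaranteed to see changes to the contents of the list.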
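To make the distinction concrete, here is a minimal sketch of one common way to live with a volatile reference: never mutate the list in place, but build a new list and swap the reference in a single volatile write. The class `SnapshotHolder` and its methods are made up for this illustration and are not part of the original code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical holder class, for illustration only.
class SnapshotHolder {
    // volatile covers the reference: a reader sees either the old list or the new one.
    private volatile List<String> items = Collections.emptyList();

    // Copy, modify the copy, then publish it with a single volatile write.
    // synchronized only serializes writers so that no update is lost.
    synchronized void add(String item) {
        List<String> copy = new ArrayList<>(items);
        copy.add(item);
        items = Collections.unmodifiableList(copy);
    }

    // Readers take no lock; the volatile read returns a fully built list.
    List<String> snapshot() {
        return items;
    }

    public static void main(String[] args) {
        SnapshotHolder holder = new SnapshotHolder();
        holder.add("a");
        holder.add("b");
        System.out.println(holder.snapshot());
    }
}
```

In real code `java.util.concurrent.CopyOnWriteArrayList`, which is built on the same copy-on-write idea, is usually the more convenient choice.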
```java
package com.mashibing.testvolatile;

public class T01_ThreadVisibility {
    private static volatile boolean flag = true;

    public static void main(String[] args) throws InterruptedException {
        new Thread(() -> {
            while (flag) {
                // do sth
            }
            System.out.println("end");
        }, "server").start();

        Thread.sleep(1000);
        flag = false;
    }
}
```
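As we know, CPUs execute instructions in a pipeline for efficiency. To take full advantage of the pipeline, the compiler (and the CPU itself) may reorder instructions.
Some instructions, however, must execute in order and must not be reordered, so we mark the variable volatile.
We cannot switch off instruction reordering in the CPU, because it is a CPU-level optimization that is out of our hands. What we can do is forbid reordering at the virtual machine level.
Digging deeper, reordering is prevented with a read barrier (LoadFence) and a write barrier (StoreFence). These are CPU primitives that the hardware supports directly. A LoadFence requires every read before the barrier to complete before any operation after the barrier runs, and a StoreFence does the same for writes.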
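Java code cannot issue these CPU fences by name, but since Java 9 the `java.lang.invoke.VarHandle` API offers ordered access modes (and static methods such as `VarHandle.loadLoadFence()` and `VarHandle.storeStoreFence()`) that the JVM maps to whatever barriers the platform needs. The sketch below uses a release store and an acquire load to hand a value from one thread to another. The class `FencedMessage` and its members are hypothetical names chosen for this example.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical class, for illustration only. Requires Java 9 or later.
class FencedMessage {
    private int payload;     // plain field, no volatile
    private boolean ready;   // plain field, accessed through the VarHandle below

    private static final VarHandle READY;
    static {
        try {
            READY = MethodHandles.lookup()
                    .findVarHandle(FencedMessage.class, "ready", boolean.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    void publish(int value) {
        payload = value;
        // Release store: the write to payload cannot be reordered after this store.
        READY.setRelease(this, true);
    }

    int awaitPayload() {
        // Acquire load: the read of payload cannot be reordered before this load.
        while (!(boolean) READY.getAcquire(this)) {
            Thread.onSpinWait();
        }
        return payload;
    }

    // Starts one writer thread and reads the payload on the calling thread.
    static int demo(int value) {
        FencedMessage m = new FencedMessage();
        new Thread(() -> m.publish(value)).start();
        return m.awaitPayload();
    }

    public static void main(String[] args) {
        System.out.println(demo(42));
    }
}
```

Declaring `ready` as volatile would give the same guarantee with less ceremony. The VarHandle form is shown only because it makes the ordering on each side of the handoff explicit.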
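```java
public class MyObject {
    // Variant 3 needs this field to be volatile; the reason is explained after the code.
    private static /*volatile*/ MyObject INSTANCE;

    // The three variants are shown side by side.
    // A real class would keep only one of them, named getInstance().

    // Variant 1: singleton that is NOT thread-safe
    public static MyObject getInstanceUnsafe() {
        if (INSTANCE == null) {
            INSTANCE = new MyObject();
        }
        return INSTANCE;
    }

    // Variant 2: made thread-safe with synchronized
    public static synchronized MyObject getInstanceSynchronized() {
        if (INSTANCE == null) {
            INSTANCE = new MyObject();
        }
        return INSTANCE;
    }

    // Variant 3: finer-grained locking (double-checked locking)
    public static MyObject getInstanceDCL() {
        // other code that does not need the lock
        if (INSTANCE == null) {
            synchronized (MyObject.class) {
                if (INSTANCE == null) {
                    INSTANCE = new MyObject();
                }
            }
        }
        return INSTANCE;
    }
}
```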
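The first version of this singleton is clearly broken because it is not thread-safe. Adding synchronized to the getInstance method certainly fixes that.
But now there is another problem: we have bluntly turned the whole getInstance() method into a synchronized method. We would rather narrow the scope of the lock, which leads to DCL (Double Check Lock). It looks perfect and in practice almost never fails, but it can still go wrong.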
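The problem lies in instruction reordering within INSTANCE = new MyObject();. In the JVM, creating an object with new takes three steps:
1. Allocate memory for the object
2. Initialize the object's fields
3. Point the reference at that memory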
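Suppose these three steps are reordered to 1, 3, 2. The first thread starts creating the object and has just finished step 3: the reference already points at the memory, but the fields still hold their default values because initialization has not run yet. A second thread now arrives, sees that the reference is no longer null, and walks away with an object that has not been initialized.
This is unlikely to show up even under high concurrency, but under extremely heavy concurrency it can happen.
So we need to declare the INSTANCE field volatile, which prevents this reordering.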
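Putting this together, a minimal sketch of double-checked locking with the volatile field in place could look like the following. `LazySingleton` and the helper `distinctInstances` are hypothetical names used only for this example.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical class name, for illustration only.
class LazySingleton {
    // volatile forbids publishing the reference before initialization has finished.
    private static volatile LazySingleton instance;

    private LazySingleton() {
        // expensive initialization would go here
    }

    static LazySingleton getInstance() {
        LazySingleton local = instance;      // first check, no lock
        if (local == null) {
            synchronized (LazySingleton.class) {
                local = instance;            // second check, under the lock
                if (local == null) {
                    local = new LazySingleton();
                    instance = local;
                }
            }
        }
        return local;
    }

    // Calls getInstance() from several threads and counts the distinct objects seen.
    static int distinctInstances(int threadCount) {
        Set<LazySingleton> seen = ConcurrentHashMap.newKeySet();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> seen.add(getInstance()));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctInstances(8)); // prints 1
    }
}
```

Reading the field into a local variable is an optional refinement that avoids a second volatile read on the fast path; the check-lock-check structure is the same as in the DCL variant above.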
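volatile cannot replace synchronized: it guarantees visibility but not atomicity. Take an increment such as count++. It runs as at least three steps (read the value, add one, write it back), and another thread can slip in between those steps, so volatile alone cannot prevent inconsistent results when several threads update shared data.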
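The difference is easy to observe. The sketch below increments a volatile int and an `AtomicInteger` from several threads; `CounterDemo` and its `run` method are hypothetical names for this example. The atomic counter always reaches the expected total, while the volatile counter may fall short because increments get lost.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class, for illustration only.
class CounterDemo {
    private static volatile int volatileCount;
    private static final AtomicInteger atomicCount = new AtomicInteger();

    // Starts `threads` threads that each increment both counters `perThread` times.
    // Returns {volatile result, atomic result}.
    static int[] run(int threads, int perThread) {
        volatileCount = 0;
        atomicCount.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    volatileCount++;               // read, add, write: not atomic
                    atomicCount.incrementAndGet(); // one atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new int[] { volatileCount, atomicCount.get() };
    }

    public static void main(String[] args) {
        int[] r = run(4, 100_000);
        System.out.println("volatile: " + r[0] + ", atomic: " + r[1]);
    }
}
```

How many increments the volatile counter loses varies from run to run and from machine to machine; on a lightly loaded machine it can occasionally even reach the full total.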
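Cache line alignment
A cache line, 64 bytes on most current CPUs, is the basic unit the CPU uses to keep caches in sync. Keeping hot fields on separate cache lines is faster than letting them suffer false sharing.
Disruptor uses this kind of padding.
Note that JDK 8 introduced the @sun.misc.Contended annotation to provide cache line isolation.
To use this annotation you must lift the restriction with the JVM flag -XX:-RestrictContended
Also, the Java compiler or the JIT compiler may eliminate fields that are never used, so the padding fields must be declared volatile.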
```java
package com.mashibing.juc.c_028_FalseSharing;

public class T02_CacheLinePadding {
    private static class Padding {
        // 7 longs = 56 bytes of padding laid out in front of x
        public volatile long p1, p2, p3, p4, p5, p6, p7;
    }

    private static class T extends Padding {
        public volatile long x = 0L;
    }

    public static T[] arr = new T[2];

    static {
        arr[0] = new T();
        arr[1] = new T();
    }

    public static void main(String[] args) throws Exception {
        // Each thread writes only to its own element.
        Thread t1 = new Thread(() -> {
            for (long i = 0; i < 10_000_000L; i++) {
                arr[0].x = i;
            }
        });
        Thread t2 = new Thread(() -> {
            for (long i = 0; i < 10_000_000L; i++) {
                arr[1].x = i;
            }
        });

        final long start = System.nanoTime();
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        // elapsed time in milliseconds
        System.out.println((System.nanoTime() - start) / 1_000_000);
    }
}
```
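The padded version is only meaningful next to an unpadded one. The sketch below runs the same two-writer loop against both layouts so the timings can be compared on your own machine. `FalseSharingCompare` and its nested classes are hypothetical names for this example, and the iteration count is a parameter so that a short run is possible.

```java
// Hypothetical comparison harness, for illustration only.
class FalseSharingCompare {
    static class Plain {
        volatile long x;
    }

    static class PaddingBase {
        volatile long p1, p2, p3, p4, p5, p6, p7; // 56 bytes of padding
    }

    static class Padded extends PaddingBase {
        volatile long x;
    }

    static final Plain[] plain = { new Plain(), new Plain() };
    static final Padded[] padded = { new Padded(), new Padded() };

    // Runs both tasks on their own threads and returns the elapsed time in nanoseconds.
    static long time(Runnable a, Runnable b) {
        Thread t1 = new Thread(a);
        Thread t2 = new Thread(b);
        long start = System.nanoTime();
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return System.nanoTime() - start;
    }

    static long timePlain(long iterations) {
        return time(
            () -> { for (long i = 0; i < iterations; i++) plain[0].x = i; },
            () -> { for (long i = 0; i < iterations; i++) plain[1].x = i; });
    }

    static long timePadded(long iterations) {
        return time(
            () -> { for (long i = 0; i < iterations; i++) padded[0].x = i; },
            () -> { for (long i = 0; i < iterations; i++) padded[1].x = i; });
    }

    public static void main(String[] args) {
        long n = 10_000_000L;
        System.out.println("plain  (ms): " + timePlain(n) / 1_000_000);
        System.out.println("padded (ms): " + timePadded(n) / 1_000_000);
    }
}
```

On many machines the padded run is noticeably faster, but the numbers depend on the hardware, on where the JVM happens to place the objects, and on JIT warm-up, so treat a single run as an indication rather than a measurement.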
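MESI
False sharing
Write combining
The CPU has only a handful of write-combining buffers (four is the number commonly quoted for some Intel processors)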
```java
package com.mashibing.juc.c_029_WriteCombining;

public final class WriteCombining {
    private static final int ITERATIONS = Integer.MAX_VALUE;
    private static final int ITEMS = 1 << 24;
    private static final int MASK = ITEMS - 1;

    private static final byte[] arrayA = new byte[ITEMS];
    private static final byte[] arrayB = new byte[ITEMS];
    private static final byte[] arrayC = new byte[ITEMS];
    private static final byte[] arrayD = new byte[ITEMS];
    private static final byte[] arrayE = new byte[ITEMS];
    private static final byte[] arrayF = new byte[ITEMS];

    public static void main(final String[] args) {
        for (int i = 1; i <= 3; i++) {
            System.out.println(i + " SingleLoop duration (ns) = " + runCaseOne());
            System.out.println(i + " SplitLoop duration (ns) = " + runCaseTwo());
        }
    }

    public static long runCaseOne() {
        long start = System.nanoTime();
        int i = ITERATIONS;
        while (--i != 0) {
            int slot = i & MASK;
            byte b = (byte) i;
            arrayA[slot] = b;
            arrayB[slot] = b;
            arrayC[slot] = b;
            arrayD[slot] = b;
            arrayE[slot] = b;
            arrayF[slot] = b;
        }
        return System.nanoTime() - start;
    }

    public static long runCaseTwo() {
        long start = System.nanoTime();
        int i = ITERATIONS;
        while (--i != 0) {
            int slot = i & MASK;
            byte b = (byte) i;
            arrayA[slot] = b;
            arrayB[slot] = b;
            arrayC[slot] = b;
        }
        i = ITERATIONS;
        while (--i != 0) {
            int slot = i & MASK;
            byte b = (byte) i;
            arrayD[slot] = b;
            arrayE[slot] = b;
            arrayF[slot] = b;
        }
        return System.nanoTime() - start;
    }
}
```
Instruction reordering
```java
package com.mashibing.jvm.c3_jmm;

public class T04_Disorder {
    private static int x = 0, y = 0;
    private static int a = 0, b = 0;

    public static void main(String[] args) throws InterruptedException {
        int i = 0;
        for (;;) {
            i++;
            x = 0; y = 0;
            a = 0; b = 0;
            Thread one = new Thread(new Runnable() {
                public void run() {
                    // Thread one starts first, so the call below makes it wait a little for thread two.
                    // Readers can adjust the wait to suit the performance of their own machine.
                    //shortWait(100000);
                    a = 1;
                    x = b;
                }
            });
            Thread other = new Thread(new Runnable() {
                public void run() {
                    b = 1;
                    y = a;
                }
            });
            one.start(); other.start();
            one.join(); other.join();
            String result = "Run " + i + " (" + x + "," + y + ")";
            // (0,0) is only possible if the two statements in at least one thread were reordered
            if (x == 0 && y == 0) {
                System.err.println(result);
                break;
            } else {
                //System.out.println(result);
            }
        }
    }

    // Busy-waits for roughly `interval` nanoseconds.
    public static void shortWait(long interval) {
        long start = System.nanoTime();
        long end;
        do {
            end = System.nanoTime();
        } while (start + interval >= end);
    }
}
```
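As a counterpart, here is a sketch of the same experiment under the assumption that a and b are declared volatile. The Java Memory Model places all volatile reads and writes in one total order that respects program order, so whichever thread reads second must see the other thread's write, and (0,0) can no longer be observed. `VolatileDisorderCheck` is a hypothetical name for this example, and the loop is bounded so that it terminates.

```java
// Hypothetical class, for illustration only.
class VolatileDisorderCheck {
    private static volatile int a, b;
    private static int x, y;

    // Repeats the experiment `rounds` times and returns how often (x, y) == (0, 0) was seen.
    static int countZeroZero(int rounds) {
        int hits = 0;
        for (int i = 0; i < rounds; i++) {
            a = 0; b = 0;
            x = 0; y = 0;
            Thread one = new Thread(() -> { a = 1; x = b; });
            Thread other = new Thread(() -> { b = 1; y = a; });
            one.start();
            other.start();
            try {
                one.join();
                other.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            if (x == 0 && y == 0) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(countZeroZero(10_000)); // prints 0
    }
}
```

The fields x and y can stay plain here because they are only read after join(), which already establishes the necessary ordering with the finished threads.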
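### How the system keeps data consistent at the lowest level
1. If MESI can solve it, MESI is used
2. If it cannot, the bus is locked
### How the system guarantees ordering at the lowest level
1. Memory barriers: system primitives such as sfence, mfence, and lfence
2. Locking the bus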
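### How volatile prevents instruction reordering
1: Source code: volatile i
2: Bytecode: ACC_VOLATILE
3: JVM memory barriers
Instructions on either side of a barrier cannot be reordered across it, which guarantees ordering.
happens-before
as-if-serial
4: HotSpot implementation
bytecodeInterpreter.cpp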
```c++
int field_offset = cache->f2_as_index();
if (cache->is_volatile()) {
  if (support_IRIW_for_not_multiple_copy_atomic_cpu) {
    OrderAccess::fence();
  }
```
orderaccess_linux_x86.inline.hpp
```c++
inline void OrderAccess::fence() {
  if (os::is_MP()) {
    // always use locked addl since mfence is sometimes expensive
#ifdef AMD64
    __asm__ volatile ("lock; addl $0,0(%%rsp)" : : : "cc", "memory");
#else
    __asm__ volatile ("lock; addl $0,0(%%esp)" : : : "cc", "memory");
#endif
  }
}
```
The LOCK prefix gives the executing instruction exclusive use of shared memory on a multiprocessor system.
It flushes the contents of the current processor's cache to memory and invalidates the corresponding cache lines on the other processors.
It also acts as a memory barrier that ordered instructions cannot cross.
Install hsdis
Code
```java
public class T {
    public static volatile int i = 0;

    public static void main(String[] args) {
        // Call both methods often enough for the JIT to compile them.
        for (int i = 0; i < 1000000; i++) {
            m();
            n();
        }
    }

    public static synchronized void m() {
    }

    public static void n() {
        i = 1; // volatile write
    }
}
```
java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly T > 1.txt
Because the JIT emits assembly for all compiled code, search for T::m and T::n to find the assembly for m() and n().
============================= C1-compiled nmethod ==============================
----------------------------------- Assembly -----------------------------------
Compiled method (c1) 67 1 3 java.lang.Object::<init> (1 bytes)
total in heap [0x00007f81d4d33010,0x00007f81d4d33360] = 848
relocation [0x00007f81d4d33170,0x00007f81d4d33198] = 40
main code [0x00007f81d4d331a0,0x00007f81d4d33260] = 192
stub code [0x00007f81d4d33260,0x00007f81d4d332f0] = 144
metadata [0x00007f81d4d332f0,0x00007f81d4d33300] = 16
scopes data [0x00007f81d4d33300,0x00007f81d4d33318] = 24
scopes pcs [0x00007f81d4d33318,0x00007f81d4d33358] = 64
dependencies [0x00007f81d4d33358,0x00007f81d4d33360] = 8
--------------------------------------------------------------------------------
[Constant Pool (empty)]
--------------------------------------------------------------------------------
[Entry Point]
# {method} {0x00007f81d3cfe650} '<init>' '()V' in 'java/lang/Object'
# [sp+0x40] (sp of caller)
0x00007f81d4d331a0: mov 0x8(%rsi),%r10d
0x00007f81d4d331a4: shl $0x3,%r10
0x00007f81d4d331a8: cmp %rax,%r10
0x00007f81d4d331ab: jne 0x00007f81d47eed00 ; {runtime_call ic_miss_stub}
0x00007f81d4d331b1: data16 data16 nopw 0x0(%rax,%rax,1)
0x00007f81d4d331bc: data16 data16 xchg %ax,%ax
[Verified Entry Point]
0x00007f81d4d331c0: mov %eax,-0x14000(%rsp)
0x00007f81d4d331c7: push %rbp
0x00007f81d4d331c8: sub $0x30,%rsp
0x00007f81d4d331cc: movabs $0x7f81d3f33388,%rdi ; {metadata(method data for {method} {0x00007f81d3cfe650} '<init>' '()V' in 'java/lang/Object')}
0x00007f81d4d331d6: mov 0x13c(%rdi),%ebx
0x00007f81d4d331dc: add $0x8,%ebx
0x00007f81d4d331df: mov %ebx,0x13c(%rdi)
0x00007f81d4d331e5: and $0x1ff8,%ebx
0x00007f81d4d331eb: cmp $0x0,%ebx
0x00007f81d4d331ee: je 0x00007f81d4d33204 ;*return {reexecute=0 rethrow=0 return_oop=0}
; - java.lang.Object::<init>@0 (line 50)
0x00007f81d4d331f4: add $0x30,%rsp
0x00007f81d4d331f8: pop %rbp
0x00007f81d4d331f9: mov 0x108(%r15),%r10
0x00007f81d4d33200: test %eax,(%r10) ; {poll_return}
0x00007f81d4d33203: retq
0x00007f81d4d33204: movabs $0x7f81d3cfe650,%r10 ; {metadata({method} {0x00007f81d3cfe650} '<init>' '()V' in 'java/lang/Object')}
0x00007f81d4d3320e: mov %r10,0x8(%rsp)
0x00007f81d4d33213: movq $0xffffffffffffffff,(%rsp)
0x00007f81d4d3321b: callq 0x00007f81d489e000 ; ImmutableOopMap {rsi=Oop }
;*synchronization entry
; - java.lang.Object::<init>@-1 (line 50)
; {runtime_call counter_overflow Runtime1 stub}
0x00007f81d4d33220: jmp 0x00007f81d4d331f4
0x00007f81d4d33222: nop
0x00007f81d4d33223: nop
0x00007f81d4d33224: mov 0x3f0(%r15),%rax
0x00007f81d4d3322b: movabs $0x0,%r10
0x00007f81d4d33235: mov %r10,0x3f0(%r15)
0x00007f81d4d3323c: movabs $0x0,%r10
0x00007f81d4d33246: mov %r10,0x3f8(%r15)
0x00007f81d4d3324d: add $0x30,%rsp
0x00007f81d4d33251: pop %rbp
0x00007f81d4d33252: jmpq 0x00007f81d480be80 ; {runtime_call unwind_exception Runtime1 stub}
0x00007f81d4d33257: hlt
0x00007f81d4d33258: hlt
0x00007f81d4d33259: hlt
0x00007f81d4d3325a: hlt
0x00007f81d4d3325b: hlt
0x00007f81d4d3325c: hlt
0x00007f81d4d3325d: hlt
0x00007f81d4d3325e: hlt
0x00007f81d4d3325f: hlt
[Exception Handler]
0x00007f81d4d33260: callq 0x00007f81d489ad00 ; {no_reloc}
0x00007f81d4d33265: mov %rsp,-0x28(%rsp)
0x00007f81d4d3326a: sub $0x80,%rsp
0x00007f81d4d33271: mov %rax,0x78(%rsp)
0x00007f81d4d33276: mov %rcx,0x70(%rsp)
0x00007f81d4d3327b: mov %rdx,0x68(%rsp)
0x00007f81d4d33280: mov %rbx,0x60(%rsp)
0x00007f81d4d33285: mov %rbp,0x50(%rsp)
0x00007f81d4d3328a: mov %rsi,0x48(%rsp)
0x00007f81d4d3328f: mov %rdi,0x40(%rsp)
0x00007f81d4d33294: mov %r8,0x38(%rsp)
0x00007f81d4d33299: mov %r9,0x30(%rsp)
0x00007f81d4d3329e: mov %r10,0x28(%rsp)
0x00007f81d4d332a3: mov %r11,0x20(%rsp)
0x00007f81d4d332a8: mov %r12,0x18(%rsp)
0x00007f81d4d332ad: mov %r13,0x10(%rsp)
0x00007f81d4d332b2: mov %r14,0x8(%rsp)
0x00007f81d4d332b7: mov %r15,(%rsp)
0x00007f81d4d332bb: movabs $0x7f81f15ff3e2,%rdi ; {external_word}
0x00007f81d4d332c5: movabs $0x7f81d4d33265,%rsi ; {internal_word}
0x00007f81d4d332cf: mov %rsp,%rdx
0x00007f81d4d332d2: and $0xfffffffffffffff0,%rsp
0x00007f81d4d332d6: callq 0x00007f81f1108240 ; {runtime_call}
0x00007f81d4d332db: hlt
[Deopt Handler Code]
0x00007f81d4d332dc: movabs $0x7f81d4d332dc,%r10 ; {section_word}
0x00007f81d4d332e6: push %r10
0x00007f81d4d332e8: jmpq 0x00007f81d47ed0a0 ; {runtime_call DeoptimizationBlob}
0x00007f81d4d332ed: hlt
0x00007f81d4d332ee: hlt
0x00007f81d4d332ef: hlt
--------------------------------------------------------------------------------
============================= C1-compiled nmethod ==============================
----------------------------------- Assembly -----------------------------------
Compiled method (c1) 74 2 3 java.lang.StringLatin1::hashCode (42 bytes)
total in heap [0x00007f81d4d33390,0x00007f81d4d338a8] = 1304
relocation [0x00007f81d4d334f0,0x00007f81d4d33528] = 56
main code [0x00007f81d4d33540,0x00007f81d4d336c0] = 384
stub code [0x00007f81d4d336c0,0x00007f81d4d33750] = 144
metadata [0x00007f81d4d33750,0x00007f81d4d33758] = 8
scopes data [0x00007f81d4d33758,0x00007f81d4d337c0] = 104
scopes pcs [0x00007f81d4d337c0,0x00007f81d4d33890] = 208
dependencies [0x00007f81d4d33890,0x00007f81d4d33898] = 8
nul chk table [0x00007f81d4d33898,0x00007f81d4d338a8] = 16
--------------------------------------------------------------------------------
[Constant Pool (empty)]
--------------------------------------------------------------------------------
[Verified Entry Point]
# {method} {0x00007f81d3e6ddd0} 'hashCode' '([B)I' in 'java/lang/StringLatin1'
# parm0: rsi:rsi = '[B'
# [sp+0x40] (sp of caller)
0x00007f81d4d33540: mov %eax,-0x14000(%rsp)
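[The full output is far too long to include, roughly 140,000+ characters. If you are interested, message me privately for the complete results.]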