【6. C++ Basics】- Locks

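What locks provide

Atomicity + visibility
At any moment only one thread executes the code inside the lock. Reads inside the lock happen after everything before the lock has completed, and writes made inside the lock are visible by the time the lock is released.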
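A minimal sketch of both guarantees, assuming std::mutex and std::lock_guard. The helper name locked_count is hypothetical and only serves the illustration: several threads increment one plain counter, and every increment happens inside the lock.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: `threads` threads each add 1 to a shared counter
// `per_thread` times, with every increment done inside the lock.
long locked_count(int threads, int per_thread) {
    long counter = 0;
    std::mutex m;
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) {
        pool.emplace_back([&] {
            for (int j = 0; j < per_thread; ++j) {
                std::lock_guard<std::mutex> guard(m);  // one thread at a time
                ++counter;  // sees the value written by the previous holder
            }
        });
    }
    for (auto& t : pool) t.join();
    return counter;
}
```

Because each ++ runs under the lock, no increment is lost and the final value equals threads * per_thread.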

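Atomics

Operations

In the kernel, atomics are implemented with atomic instructions: https://code.woboq.org/linux/...
The following methods of the atomic library can take a memory-order argument (a memory barrier) to strengthen visibility.

store // atomic write
load // atomic read
exchange // atomic swap

compare_exchange_weak // compare-and-swap. Faster on some platforms, but it may fail spuriously and return false even when the two values are equal, so it is used in a loop. a.compare_exchange_weak(expect, val): if a == expect, then a.store(val) and it returns true; otherwise expect = a and it returns false.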

bool compare_exchange_weak (T& expected, T val, memory_order sync = memory_order_seq_cst) volatile noexcept;
Compares the contents of the atomic object's contained value with expected:
- if true, it replaces the contained value with val (like store).
- if false, it replaces expected with the contained value .
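The usual way to use compare_exchange_weak is a retry loop. The sketch below is one such loop; atomic_store_max is a hypothetical helper, not part of the standard library. It raises an atomic int to at least a candidate value and relies on the failure path refreshing expected.

```cpp
#include <atomic>

// Hypothetical helper: raise `a` to at least `candidate` with a CAS loop.
// Returns true if this call changed the value.
bool atomic_store_max(std::atomic<int>& a, int candidate) {
    int expected = a.load();
    while (expected < candidate) {
        // On failure (including a spurious one) expected is refreshed with
        // the current value of a, so the loop re-checks the condition.
        if (a.compare_exchange_weak(expected, candidate))
            return true;
    }
    return false;
}
```

A spurious failure only costs one more iteration, which is why the weak form is acceptable inside a loop.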

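 // Compiler-only barrier: the compiler may not move memory accesses across it.
 // MemoryBarrier() below is assumed to expand to this statement.
 __asm__ __volatile__("" : : : "memory");
 inline void* Acquire_Load() const {
    void* result = rep_;
    MemoryBarrier();   // later reads and writes stay after the load
    return result;
  }
  inline void Release_Store(void* v) {
    MemoryBarrier();   // earlier reads and writes stay before the store
    rep_ = v;
  }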

compare_exchange_strong

2. Memory barriers

    typedef enum memory_order {
        memory_order_relaxed, // no ordering guarantee, only atomicity
        memory_order_consume, // in the current thread, later operations that depend on the value loaded by this atomic operation cannot be reordered before it
        memory_order_acquire, // A load operation with this memory order performs the acquire operation on the affected memory location: no reads or writes in the current thread can be reordered before this load. All writes in other threads that release the same atomic variable are visible in the current thread (see Release-Acquire ordering below)
        memory_order_release, // A store operation with this memory order performs the release operation: no reads or writes in the current thread can be reordered after this store. All writes in the current thread are visible in other threads that acquire the same atomic variable (see Release-Acquire ordering below) and writes that carry a dependency into the atomic variable become visible in other threads that consume the same atomic (see Release-Consume ordering below).
        memory_order_acq_rel, // combines memory_order_acquire and memory_order_release
        memory_order_seq_cst // all loads and stores execute in a single total order (sequentially consistent)
    } memory_order;
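Release-acquire ordering is easiest to see as message passing. The sketch below assumes two threads and a hypothetical helper name pass_message: the producer writes ordinary data and then does a release store on a flag, and the consumer does acquire loads on the flag before reading the data.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: pass `value` from a producer thread to a consumer
// thread through a plain int, published with a release store.
int pass_message(int value) {
    int payload = 0;                 // ordinary, non-atomic data
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread producer([&] {
        payload = value;                                  // 1. write the data
        ready.store(true, std::memory_order_release);     // 2. publish it
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_acquire)) {  // 3. wait for the flag
            std::this_thread::yield();
        }
        seen = payload;                                   // 4. sees the write from step 1
    });
    producer.join();
    consumer.join();
    return seen;
}
```

With memory_order_relaxed on both sides the flag would still be atomic, but step 4 would no longer be guaranteed to observe step 1.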

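Lock-free queue (a singly linked list in this example)

#include <atomic>
#include <memory>
#include <utility>
using namespace std;

// Requires C++20 for atomic<shared_ptr<Node>>.
template<typename T>
class slist {
   struct Node { T t; shared_ptr<Node> next; };
   atomic<shared_ptr<Node>> head;
public:
   slist() =default;
   ~slist() =default;
   class reference {
      shared_ptr<Node> p;
   public:
      reference(shared_ptr<Node> p_) : p{move(p_)} {}
      T& operator*() { return p->t; }
      T* operator->() { return &p->t; }
   };
   auto find(T t) const {
      auto p = head.load();
      while (p && p->t != t)
         p = p->next;
      return reference{move(p)};
   }
   void push_front(T t) {
      auto p = make_shared<Node>();
      p->t = t;
      p->next = head.load();
      // Retry until the CAS succeeds; on failure p->next is reloaded with the current head.
      while (!head.compare_exchange_weak(p->next, p))
         {}
   }
   void pop_front() {
      auto p = head.load();
      // On failure p is reloaded with the current head, so p->next is re-read.
      while (p && !head.compare_exchange_weak(p, p->next))
         {}
   }
};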

### mutex

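std::mutex => pthread_mutex_lock
On Linux, glibc's pthread mutex comes in several kinds. The normal kind goes straight to futex; the adaptive kind spins first.
It loops on a CAS and waits on the futex.
cmpxchgl checks whether the futex word (the __lock member) is 0 (lock free); if so, it sets it to 1 (lock held).
pthread_cond_wait:
It also releases the mutex first, then waits on the futex of the cond (lll_futex_wait (&cond->__data.__futex, futex_val, pshared);), then locks the mutex again.
More on pthread locks: https://casatwy.com/pthreadde...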
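The release, wait, re-lock sequence of pthread_cond_wait is what std::condition_variable::wait exposes in C++ (on Linux it is typically built on the pthread primitives). A small sketch, with the hypothetical helper wait_for_value:

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper: the caller blocks until another thread sets a value.
int wait_for_value(int value) {
    std::mutex m;
    std::condition_variable cv;
    bool ready = false;
    int slot = 0;

    std::thread setter([&] {
        {
            std::lock_guard<std::mutex> guard(m);
            slot = value;
            ready = true;
        }
        cv.notify_one();
    });

    std::unique_lock<std::mutex> lock(m);
    // wait() releases m while blocked and re-locks it before returning;
    // the predicate guards against spurious wakeups.
    cv.wait(lock, [&] { return ready; });
    int result = slot;
    lock.unlock();
    setter.join();
    return result;
}
```

The predicate form matters: the waiter may arrive after the notification was sent, in which case it must not block at all.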

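Usage

Both boost and std provide these. boost is said to be somewhat faster than std.

Definitions: the mutex objects are boost::shared_mutex and boost::mutex.
lock_guard, shared_lock and unique_lock are class templates that manage a mutex.

In boost::shared_lock<T>, T can only be a shared_mutex type.
In unique_lock<T>, T can be any of the mutex types. A boost::shared_lock<boost::shared_mutex> object calls the shared_mutex's lock_shared method in its constructor and its unlock_shared method in its destructor. A boost::unique_lock<boost::shared_mutex> calls lock and unlock respectively.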

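Read-write lock:
typedef boost::shared_lock<boost::shared_mutex> readLock;
typedef boost::unique_lock<boost::shared_mutex> writeLock;
boost::shared_mutex rwmutex;
Usage (the lock object needs a name, otherwise nothing stays locked):
readLock lock(rwmutex);

Mutex:
typedef boost::unique_lock<boost::mutex> exclusiveLock;
boost::mutex m;
exclusiveLock lock(m);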
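The same read-write pattern with the std equivalents (std::shared_mutex and std::shared_lock, C++17) might look like the sketch below. Registry is a hypothetical class used only for illustration; readers take a shared lock, writers an exclusive one.

```cpp
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical example class built on the std versions of the lock types.
class Registry {
    mutable std::shared_mutex rwmutex_;
    std::map<std::string, int> data_;
public:
    void set(const std::string& key, int value) {
        // Exclusive lock: calls lock() now and unlock() at scope exit.
        std::unique_lock<std::shared_mutex> lock(rwmutex_);
        data_[key] = value;
    }
    // Returns fallback when the key is absent.
    int get(const std::string& key, int fallback) const {
        // Shared lock: calls lock_shared() now and unlock_shared() at scope exit.
        std::shared_lock<std::shared_mutex> lock(rwmutex_);
        auto it = data_.find(key);
        return it == data_.end() ? fallback : it->second;
    }
};
```

Many get() calls can run concurrently, while set() waits until it holds the mutex alone.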

tips

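With one writer and many readers, or many writers and many readers, the thread-safety problems that end in a core dump all come from address accesses: the element being read has been deleted, an array reserve reallocates, a map rebalances its tree, a hash table rehashes, something is deleted directly, and so on. A standalone ++ does not need protection against this kind of crash.

Then there are visibility and atomicity. Multiple writers without a lock (no atomicity or visibility guarantee) overwrite each other through interleaved instructions, for example fewer ++ take effect than were executed. A reader may read stale data, and a value used in an if condition may not take effect immediately because it sits in a register or in another CPU's cache.
About volatile: all it does is stop the compiler from optimizing the access, so the value is not taken from a register. It controls nothing else: later instructions can still be reordered ahead of it, the CPU still has its cache, and the cache's MESI protocol does not make an operation atomic without a locked instruction, so a thread can still fail to see a value. Use memory barriers, or simply use atomics or locks and reduce lock contention.

Kernel primitives (spinlocks, mutexes, memory barriers, etc.) make concurrent access to shared data safe, and they also prevent unwanted optimizations. If these synchronization primitives are used correctly, there is no need to use volatile as well.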
https://lwn.net/Articles/233482/
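For the lost-increment case, the "just use atomics" advice can be sketched as follows. atomic_count is a hypothetical helper: a multi-threaded ++ done with fetch_add on a std::atomic, which a volatile long could not make safe.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: a multi-threaded ++ done on a std::atomic.
// fetch_add is a single atomic read-modify-write, so no increment is lost.
long atomic_count(int threads, int per_thread) {
    std::atomic<long> counter{0};
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) {
        pool.emplace_back([&] {
            for (int j = 0; j < per_thread; ++j)
                counter.fetch_add(1, std::memory_order_relaxed);
        });
    }
    for (auto& t : pool) t.join();
    return counter.load();
}
```

memory_order_relaxed is enough here because only the count itself matters; thread join provides the ordering needed to read the final value.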

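barrier();
Prevents the compiler from reordering memory accesses across it. Values held in registers are discarded and reloaded from memory.
https://zhuanlan.zhihu.com/p/...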

spinlock

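User space and the kernel handle spinning very differently. The kernel can control a specific CPU, so its logic is much more complex.
A user-space spin can still end up blocking in the kernel. The kernel cannot do that: there it is a real busy loop, so performance has to be considered.

Writing your own spinlock

pthread provides a spinlock (pthread_spin_lock).

  while (!condition) {
    if (count > xxx)  break;      // stop spinning after a bounded number of tries
    count++;
    __asm__ volatile ("pause");   // x86 spin-wait hint
  }

  mutex();                        // fall back to a blocking mutex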
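A portable user-space version of this bounded spin could look like the sketch below. SpinLock, kSpinLimit and spin_count are hypothetical names, and std::this_thread::yield() stands in for both the x86 pause hint and the fallback to a mutex, so this is a simplification rather than the exact loop above.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical user-space spinlock: spin a bounded number of times, then
// yield the CPU instead of burning it. kSpinLimit is an arbitrary choice.
class SpinLock {
    std::atomic<bool> locked_{false};
    static constexpr int kSpinLimit = 1000;
public:
    void lock() {
        int count = 0;
        // exchange returns the previous value: false means we took the lock.
        while (locked_.exchange(true, std::memory_order_acquire)) {
            if (++count > kSpinLimit) {
                std::this_thread::yield();  // stand-in for blocking on a mutex
                count = 0;
            }
        }
    }
    void unlock() { locked_.store(false, std::memory_order_release); }
};

// Hypothetical driver: several threads increment a plain counter under the lock.
long spin_count(int threads, int per_thread) {
    long counter = 0;
    SpinLock lock;
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) {
        pool.emplace_back([&] {
            for (int j = 0; j < per_thread; ++j) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& t : pool) t.join();
    return counter;
}
```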

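Kernel spinlock

First attempt:
while (lock->locked);
        lock->locked = 1;
The test and the set are not atomic, so use: while (test_and_set(&lock->locked));
To spin on a plain read first, then: while (lock->locked || test_and_set(&lock->locked));
With this form, whoever wins the race at each release takes the lock, so a waiter can starve.
Introduce an owner and a queue: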
struct spinlock {
        unsigned short owner;
        unsigned short next;
};
void spin_lock(struct spinlock *lock)
{
        unsigned short next = xadd(&lock->next, 1);
        while (lock->owner != next);
}
void spin_unlock(struct spinlock *lock)
{
        lock->owner++;
}
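When CPUs contend for the spinlock, each update invalidates the spinlock's cache line on the other CPUs, so the line bounces between CPU caches. => Give each CPU its own structure and link the structures into a list.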
https://zhuanlan.zhihu.com/p/89058726
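The owner/next scheme translates to C++ atomics fairly directly. The sketch below is a user-space rendering under assumptions: TicketLock and ticket_count are hypothetical names, fetch_add plays the role of xadd, and the waiter yields instead of spinning hard.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical C++ rendering of the owner/next ticket lock.
class TicketLock {
    std::atomic<unsigned> next_{0};   // next ticket to hand out
    std::atomic<unsigned> owner_{0};  // ticket currently being served
public:
    void lock() {
        // Take a ticket (the xadd step), then wait until it is served.
        unsigned my = next_.fetch_add(1, std::memory_order_relaxed);
        while (owner_.load(std::memory_order_acquire) != my)
            std::this_thread::yield();
    }
    void unlock() {
        // Serve the next ticket; waiters are admitted in arrival order.
        owner_.fetch_add(1, std::memory_order_release);
    }
};

// Hypothetical driver: several threads increment a plain counter under the lock.
long ticket_count(int threads, int per_thread) {
    long counter = 0;
    TicketLock lock;
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) {
        pool.emplace_back([&] {
            for (int j = 0; j < per_thread; ++j) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& t : pool) t.join();
    return counter;
}
```

Every waiter still polls the same owner_ word, which is exactly the cache-line bouncing that per-CPU queue nodes are meant to remove.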

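Semaphores

Can sleep; can admit more than one holder.
pthread_mutex originally did not work across processes. Support was added later, but not every platform has it. Semaphores were the original mechanism for processes.
Lock (down): under the protection of a spinlock, add the caller to the wait list, release the spinlock and schedule out. After coming back, take the spinlock again and check whether up was set: if so return, otherwise loop.
Unlock (up): under the protection of a spinlock, take the first entry of the wait list, remove it, set up, and wake it.
https://zhuanlan.zhihu.com/p/...
Used in pfs for process synchronization.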
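A user-space analogue of down/up can be sketched with a mutex and a condition variable. This is an assumption-laden simplification: the Semaphore class and its try_down method are hypothetical, std::mutex plays the role of the protecting spinlock, and the condition variable replaces the explicit wait list.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical user-space counting semaphore.
class Semaphore {
    std::mutex m_;
    std::condition_variable cv_;
    int count_;
public:
    explicit Semaphore(int initial) : count_(initial) {}

    void down() {
        std::unique_lock<std::mutex> lock(m_);
        // Sleep until a permit exists; re-checked after every wakeup.
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }
    // Non-blocking variant: returns false instead of sleeping.
    bool try_down() {
        std::lock_guard<std::mutex> lock(m_);
        if (count_ == 0) return false;
        --count_;
        return true;
    }
    void up() {
        {
            std::lock_guard<std::mutex> lock(m_);
            ++count_;
        }
        cv_.notify_one();  // wake one waiter, if any
    }
};
```

With an initial count of 2, two callers pass down() without sleeping and the third sleeps until someone calls up().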

ABA

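The ABA problem of lock-free queues in rocksdb
Suppose location V stores the head node of a linked list. In a list hit by the ABA problem the original head is node1, and thread 2 changes the head twice: most likely it first sets the head to node2, then inserts node1 (in C++ it may also be a newly allocated node3 whose pointer happens to equal that of the already freed node1) at the front as the new head.

For thread 1 the head is still node1 (or rather the head's value is, since in C++ the address is the same but the content may now be node3's), so the CAS succeeds, but the state of the sublist after the head is no longer predictable.

Create a global array HP hp[N] whose elements are pointers, called hazard pointers. The array size is the number of threads, so each thread owns one HP.
By convention each thread may modify only its own HP, never another thread's, but it may read the HP values of other threads.
When a thread is about to access a critical data node, it first stores the node's pointer in its own HP, which tells the others not to free this node.
Each thread keeps a private list (free list). When the thread wants to free a node it puts the node on its own list, and once the list length reaches a preset number R it walks the list and frees every node that can be freed.
When a thread wants to free a node, it checks the global HP array. If no thread's HP equals the node's pointer, it frees the node; otherwise it does not free it and puts the node back on its own list.
Isn't this the same as a file that others cannot delete while someone still holds it? In a lock-free list with no delete, the next pointers are compared. The problem is that CAS takes the pointer itself and compares it.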
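The bookkeeping described above can be sketched as follows. This is a simplified illustration, not a complete hazard-pointer implementation: kThreads, kRetireLimit, ThreadCtx and the function names are hypothetical, each thread has exactly one HP slot, and the re-validation a reader must do after publishing its HP is left out.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

struct Node { int value; };

constexpr int kThreads = 4;              // N: one hazard pointer per thread (assumed)
constexpr std::size_t kRetireLimit = 2;  // R: scan threshold (assumed)

std::atomic<Node*> g_hp[kThreads] = {};  // global HP array, all null at start

// Per-thread state: the thread's HP slot index and its private retire list.
struct ThreadCtx {
    int id;
    std::vector<Node*> retired;
};

// Announce that this thread is about to access p.
void protect(ThreadCtx& ctx, Node* p) { g_hp[ctx.id].store(p); }
// Withdraw the announcement.
void clear_hazard(ThreadCtx& ctx) { g_hp[ctx.id].store(nullptr); }

// Free every retired node that no thread's HP points at; keep the rest.
void scan(ThreadCtx& ctx) {
    std::vector<Node*> keep;
    for (Node* n : ctx.retired) {
        bool hazardous = false;
        for (int i = 0; i < kThreads; ++i) {
            if (g_hp[i].load() == n) { hazardous = true; break; }
        }
        if (hazardous) keep.push_back(n);
        else delete n;
    }
    ctx.retired.swap(keep);
}

// Called instead of delete once a node has been unlinked from the structure.
void retire(ThreadCtx& ctx, Node* n) {
    ctx.retired.push_back(n);
    if (ctx.retired.size() >= kRetireLimit) scan(ctx);
}
```

A node announced by any thread survives the scan and stays on the retiring thread's list until a later scan finds it unprotected.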
https://www.drdobbs.com/lock-...
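This solves the freeing problem: it effectively maintains a queue of pending frees and does not free right away.
But it does not solve the case where a later allocation returns the same block of memory and the CAS compares only the value. Freeing can be delayed, but assignment cannot, so a version is still needed.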
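One way to carry a version is to pack it next to the value in a single CAS word. The sketch below makes several assumptions: VersionedHead is a hypothetical class, nodes are referred to by a 32-bit index rather than a pointer, and the version is a 32-bit counter bumped on every successful update.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical versioned head: low 32 bits hold a node index, high 32 bits
// a version that is bumped on every successful update.
class VersionedHead {
    std::atomic<std::uint64_t> word_{0};
    static std::uint64_t pack(std::uint32_t index, std::uint32_t version) {
        return (static_cast<std::uint64_t>(version) << 32) | index;
    }
public:
    std::uint64_t load() const { return word_.load(); }
    static std::uint32_t index_of(std::uint64_t w) {
        return static_cast<std::uint32_t>(w);
    }
    static std::uint32_t version_of(std::uint64_t w) {
        return static_cast<std::uint32_t>(w >> 32);
    }
    // Succeeds only if both index and version still match the snapshot.
    bool try_set(std::uint64_t snapshot, std::uint32_t new_index) {
        std::uint64_t desired = pack(new_index, version_of(snapshot) + 1);
        return word_.compare_exchange_strong(snapshot, desired);
    }
};
```

If another thread changes the head from A to B and back to A, the index matches again but the version does not, so a CAS based on the old snapshot fails instead of silently succeeding.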