原子操作相关资源

https://gcc.gnu.org/wiki/Atomic

Full support for C/C++ atomic operations as specified in draft ISO C and C++ documents.

Full support for C/C++ atomic operations as specified in draft ISO C and C++ documents.

The C++11 Memory Model and GCC: Summarized information and the GCC implementation details.

For C, see "Parallel memory sequencing model proposal"

N1349

For C++, see "The C++ memory model" and the "Atomic operations library" chapter in the most recent draft:

N3290: Final draft standard as of 11-Apr-2011

A Less Formal Explanation of the Proposed C++ Concurrency Memory Model explains the C++ memory model; Threads Basics explains the basic cross-language ideas behind it.

Concurrency memory model compiler consequences: Examples of optimizations that are no longer allowed under the new memory model.

Rationale for the C++ working paper definition of "memory location": why old Alphas need to use CAS to update chars.

Research papers

  • An Optimistic Approach to Lock-Free FIFO Queues
    Edya Ladan-Mozes, Nir Shavit

  • Fast and Lock-Free Concurrent Priority Queues for Multi-Thread Systems
    Håkan Sundell, Philippas Tsigas

  • Split-Ordered Lists: Lock-Free Extensible Hash Tables
    Ori Shalev, Nir Shavit

  • LF THREADS: A Lock-Free Thread Library
    Anders Gidenstam, Marina Papatriantafilou

  • Lock-free deques and doubly linked lists
    Håkan Sundell, Philippas Tsigas

  • Lock-Free Dynamically Resizable Arrays
    Damian Dechev, Peter Pirkelbauer, Bjarne Stroustrup

  • High Performance Dynamic Lock-Free Hash Tables and List-Based Sets
    Maged M. Michael

  • Practical lock-freedom
    Keir Fraser

  • Real-Time Computing with Lock-Free Shared Objects
    James H. Anderson, Srikanth Ramamurthy, Kevin Jeffay

C lock-free libraries

  • http://code.google.com/p/nbds/: Includes a port of Cliff Click's nonblocking hashmap.

Implementation notes

Use of -fno-builtin, -march=xxx flags considered harmful.

  • -fno-builtin disables the builtin atomics, therefore unresolved function calls will occur at linktime unless an external library is being provided.

  • -march=xxx is listed primarily because it may unexpectedly disable some atomic instructions. Knowledgable use may in fact enable some instructions on compilers targetting generic architecture features, but it is up to the user to know what they are doing.

GCC Manual

  • Built-in functions for atomic memory access

Libstdc++ Manual

  • Using Atomics and Concurrency

  • Internals Thread Safety

  • Atomics

Implementing C++1x and C1x atomics

Status

As of 4.4.0 release, gcc/g++ have experimental support for atomics, based on the existing __sync builtins for atomic operations and without specific attention to ordering. To use, gcc with #include , or g++ with #include , and link with libstdc++.

Built-in atomic support by architecture.

Fully supported atomic implementations which provide 1,2,4 and 8 byte instructions for fetch operations.

  • alpha
  • x86_64
  • mips
  • rs6000
  • ia64
  • s390
  • xtensa

If the full atomic implementation is not available, but a compare and swap built-in is available, then the operation will be generated using a compare and swap loop. Basically the value is loaded, and a loop is executed which performs the required arithmetic operation on the value and attempts a compare_and_swap() until it is successful. See optabs.c::expand_sync_fetch_operation() and optabs.c::expand_compare_and_swap_loop().

  • sparcv9, sparc64
  • pentium (586, 686)

All other architectures simply generate calls to the appropriate sync function name, and will fail at link time if a library with the required routines is not provided.

The following architectures do provide functions in libgcc, but only for linux targets. There are no native atomic instructions, but the linux kernel provides support for software atomic operations. These architectures support atomic operations by utilizing this software mechanism. Unless specified otherwise, there is support for 1,2 and 4 byte values, but no 8 byte support.

  • arm (implemented via compare and swap loops, see config/arm/linux-atomic.c)
  • pa (implemented via compare and swap loops, see config/pa/linux-atomic.c)
  • sh (provides the routines coded in assembly, see config/sh/linux-atomic.asm)

The blackfin port appears to attempt to implement atomics via sync.md, but ultimately the insn table doesn't get populated with them, so calls are generated.

ARM supports 8 byte atomics with cmpxchg64 kernel helper with gcc-4.8 + Linux 3.1 (or Linux 2.6.30 w/ armv6k+)

Issues

Memory Ordering

(copied from libstdc++/40297 comment #2)

__atomic2::atomic_flag::test_and_set() issues a full barrier when m == memory_order_relaxed (too much synchronization) but is not a release operation when m == memory_order_acq_rel (not enough synchronization.) sync_lock_test_and_set is always an acquire barrier, so the full barrier is only needed when the ordering specifies it should also be a release operation i.e.

切换行号显示
   1     bool
   2     test_and_set(memory_order __m = memory_order_seq_cst) volatile
   3     {
   4       // Redundant synchronize if built-in for lock is a full barrier.
   5       if (__m >= memory_order_release)
   6         __sync_synchronize();
   7       return __sync_lock_test_and_set(&_M_i, 1);
   8     }

__atomic2::atomic_flag::clear also issues an unwanted full memory barrier when m == memory_order_relaxed, and in debug mode doesn't enforce the requirement that m != acquire and m != acq_rel.

It seems strange to me that clear() allows memory_order_consume but not acquire. I'll ask on the lib reflector if that's an oversight, assuming it is, I would write it:

切换行号显示
   1     void
   2     clear(memory_order __m = memory_order_seq_cst) volatile
   3     {
   4       __glibcxx_assert(__m != memory_order_consume);
   5       __glibcxx_assert(__m != memory_order_acquire);
   6       __glibcxx_assert(__m != memory_order_acq_rel);
   7       __sync_lock_release(&_M_i);
   8       if (__m == memory_order_seq_cst)
   9         __sync_synchronize();
  10     }

atomic_*::store() has no memory barrier when m == memory_order_release.

atomic_*::fetch() would benefit from an atomic load operation.

atomic_*::exchange() needs a release barrier when m >= memory_order_release.

Plans

- Update docs

- Detail issues with existing sync builtins

  • (absence of store and load, operations on 1, 2, 4, 8 byte size types required, )

- Have verifiable, correct optimizations.

- Move C subset into libgcc

- Use to create lock-free containers

  • Working branch : cxx-lockfree

None: Atomic (2012-07-27 06:27:22由71编辑)


你可能感兴趣的:(LINUX)