HP garbage collector porting

Original text

http://www.hpl.hp.com/personal/Hans_Boehm/gc/porting.html

Conservative GC Porting Directions

The collector is designed to be relatively easy to port, but is not portable code per se.

The collector inherently has to perform operations, such as scanning the stack(s), that are not possible in portable C code.

All of the following assumes that the collector is being ported to a byte-addressable 32- or 64-bit machine.

Currently all successful ports to 64-bit machines involve LP64 targets.

The code base includes some provisions for P64 targets (notably win64), but that has not been tested.
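The LP64 and P64 labels name C data models. Under LP64, long and pointers are 64 bits wide while int stays at 32 bits; under P64 (more commonly written LLP64, the Win64 model), pointers are 64 bits but long remains 32 bits. As a minimal sketch, the helper below reports which model the current compiler targets. The name data_model is invented for this illustration and is not part of the collector.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper (not part of the collector): names the C data
 * model of the current target from the widths of long and void *. */
static const char *data_model(void)
{
    size_t long_bits = sizeof(long) * CHAR_BIT;
    size_t ptr_bits  = sizeof(void *) * CHAR_BIT;

    if (ptr_bits == 64 && long_bits == 64) return "LP64";
    if (ptr_bits == 64 && long_bits == 32) return "P64 (LLP64)";
    if (ptr_bits == 32 && long_bits == 32) return "ILP32";
    return "other";
}
```

Printing data_model() on the target before starting a port is a quick way to see whether it falls into the tested LP64 group.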

You are hereby discouraged from attempting a port to non-byte-addressable, or 8-bit, or 16-bit machines.

The difficulty of porting the collector varies greatly depending on the needed functionality.

In the simplest case, only some small additions are needed for the include/private/gcconfig.h file.

This is described in the following section.

Later sections discuss some of the optional features, which typically involve more porting effort.

Note that the collector makes heavy use of ifdefs.

Unlike some other software projects, we have concluded repeatedly that this is preferable to system dependent files, with code duplicated between the files.

However, to keep this manageable, we do strongly believe in indenting ifdefs correctly (for historical reasons usually without the leading sharp sign).

(Separate source files are of course fine if they don't result in code duplication.)

Adding Platforms to gcconfig.h

If neither thread support, nor tracing of dynamic library data is required, these are often the only changes you will need to make.

The gcconfig.h file consists of three sections:

  1. A section that defines GC-internal macros that identify the architecture (e.g. IA64 or I386) and operating system (e.g. LINUX or MSWIN32). This is usually done by testing predefined macros. By defining our own macros instead of using the predefined ones directly, we can impose a bit more consistency, and somewhat isolate ourselves from compiler differences.

    It is relatively straightforward to add a new entry here. But please try to be consistent with the existing code. In particular, 64-bit variants of 32-bit architectures generally are not treated as a new architecture. Instead we explicitly test for 64-bit-ness in the few places in which it matters. (The notable exception here is I386 and X86_64. This is partially historical, and partially justified by the fact that there are arguably more substantial architecture and ABI differences here than for RISC variants.)

    On GNU-based systems, cpp -dM empty_source_file.c seems to generate a set of predefined macros. On some other systems, the "verbose" compiler option may do so, or the manual page may list them.

  2. A section that defines a small number of platform-specific macros, which are then used directly by the collector. For simple ports, this is where most of the effort is required. We describe the macros below.

    This section contains a subsection for each architecture (enclosed in a suitable ifdef). Each subsection usually contains some architecture-dependent defines, followed by several sets of OS-dependent defines, again enclosed in ifdefs.

  3. A section that fills in defaults for some macros left undefined in the preceding section, and defines some other macros that rarely need adjustment for new platforms. You will typically not have to touch these. If you are porting to an OS that was previously completely unsupported, it is likely that you will need to add another clause to the definition of GET_MEM.
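
The three sections can be pictured with a heavily reduced sketch. It is not the real gcconfig.h: it covers only two architectures and one operating system, the predefined macros it tests (__i386__, __x86_64__, __linux__) are the usual GCC/Clang ones, the ALIGNMENT values are assumptions for typical toolchains, and the "UNKNOWN" fallbacks are placeholders added only so that the fragment compiles on any target.

```c
/* Section 1: map compiler-predefined macros to GC-internal names. */
#if defined(__i386__)
# define I386
#elif defined(__x86_64__)
# define X86_64
#endif
#if defined(__linux__)
# define LINUX
#endif

/* Section 2: one subsection per architecture, OS cases nested inside. */
#ifdef I386
# define MACH_TYPE "I386"
# define CPP_WORDSZ 32
# define ALIGNMENT 4          /* assumption: compiler- and OS-dependent */
# ifdef LINUX
#   define OS_TYPE "LINUX"
# endif
#endif
#ifdef X86_64
# define MACH_TYPE "X86_64"
# define CPP_WORDSZ 64
# define ALIGNMENT 8
# ifdef LINUX
#   define OS_TYPE "LINUX"
# endif
#endif

/* Section 3: defaults for anything left undefined above. */
#ifndef MACH_TYPE
# define MACH_TYPE "UNKNOWN"  /* placeholder for this sketch only */
#endif
#ifndef OS_TYPE
# define OS_TYPE "UNKNOWN"    /* placeholder for this sketch only */
#endif
#ifndef CPP_WORDSZ
# define CPP_WORDSZ 32        /* the documented default */
#endif
#ifndef ALIGNMENT
# define ALIGNMENT 1          /* always works, but performs poorly */
#endif
```

The nested directives keep the sharp sign in the first column and indent after it, which is how we read the indentation remark made earlier.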

The following macros must be defined correctly for each architecture and operating system:

MACH_TYPE
Defined to a string that represents the machine architecture. Usually just the macro name used to identify the architecture, but enclosed in quotes.
OS_TYPE
Defined to a string that represents the operating system name. Usually just the macro name used to identify the operating system, but enclosed in quotes.
CPP_WORDSZ
The word size in bits as a constant suitable for preprocessor tests, i.e. without casts or sizeof expressions. Currently always defined as either 64 or 32. For platforms supporting both 32- and 64-bit ABIs, this should be conditionally defined depending on the current ABI. There is a default of 32.
ALIGNMENT
Defined to be the largest N, such that all pointers are guaranteed to be aligned on N-byte boundaries. Defining it to be 1 will always work, but perform poorly. For all modern 32-bit platforms, this is 4. For all modern 64-bit platforms, this is 8. Whether or not X86 qualifies as a modern architecture here is compiler- and OS-dependent.
DATASTART
The beginning of the main data segment. The collector will trace all memory between DATASTART and DATAEND for root pointers. On some platforms, this can be defined to a constant address, though experience has shown that to be risky. Ideally the linker will define a symbol (e.g. _data) whose address is the beginning of the data segment. Sometimes the value can be computed using the GC_SysVGetDataStart function. Not used if either the next macro is defined, or if dynamic loading is supported, and the dynamic loading support defines a function GC_register_main_static_data() which returns false.
SEARCH_FOR_DATA_START
If this is defined, DATASTART will be defined to a dynamically computed value, which is obtained by starting with the address of _end and walking backwards until non-addressable memory is found. This often works on Posix-like platforms. It makes it harder to debug client programs, since startup involves generating and catching a segmentation fault, which tends to confuse users.
DATAEND
Set to the end of the main data segment. Defaults to end, where that is declared as an array. This works in some cases, since the linker introduces a suitable symbol.
DATASTART2, DATAEND2
Some platforms have two discontiguous main data segments, e.g. for initialized and uninitialized data. If so, these two macros should be defined to the limits of the second main data segment.
STACK_GROWS_UP
Should be defined if the stack (or thread stacks) grow towards higher addresses. (This appears to be true only on PA-RISC. If your architecture has more than one stack per thread, and is not already supported, you will need to do more work. Grep for "IA64" in the source for an example.)
STACKBOTTOM
Defined to be the cold end of the stack, which is usually the highest address in the stack. It must bound the region of the stack that contains pointers into the GC heap. With thread support, this must be the cold end of the main stack, which typically cannot be found in the same way as the other thread stacks. If this is not defined and none of the following three macros is defined, client code must explicitly set GC_stackbottom to an appropriate value before calling GC_INIT() or any other GC_ routine.
LINUX_STACKBOTTOM
May be defined instead of STACKBOTTOM. If defined, then the cold end of the stack will be determined at run time. Currently we usually read it from /proc.
HEURISTIC1
May be defined instead of STACKBOTTOM. STACK_GRAN should generally also be undefined and then redefined. The cold end of the stack is determined by taking an address inside GC_init's frame, and rounding it up to the next multiple of STACK_GRAN. This works well if the stack base is always aligned to a large power of two. (STACK_GRAN is predefined to 0x1000000, which is rarely optimal.)
HEURISTIC2
May be defined instead of STACKBOTTOM. The cold end of the stack is determined by taking an address inside GC_init's frame, incrementing it repeatedly in small steps (decrement if STACK_GROWS_UP), and reading the value at each location. We remember the value when the first Segmentation violation or Bus error is signalled, round that to the nearest plausible page boundary, and use that as the stack base.
DYNAMIC_LOADING
Should be defined if dyn_load.c has been updated for this platform and tracing of dynamic library roots is supported.
MPROTECT_VDB, PROC_VDB
May be defined if the corresponding "virtual dirty bit" implementation in os_dep.c is usable on this platform. This allows incremental/generational garbage collection. MPROTECT_VDB identifies modified pages by write protecting the heap and catching faults. PROC_VDB uses the /proc primitives to read dirty bits.
PREFETCH, PREFETCH_FOR_WRITE
The collector uses PREFETCH(x) to preload the cache with *x. This defaults to a no-op.
CLEAR_DOUBLE
If CLEAR_DOUBLE is defined, then CLEAR_DOUBLE(x) is used as a fast way to clear the two words at GC_malloc-aligned address x. By default, word stores of 0 are used instead.
HEAP_START
HEAP_START may be defined as the initial address hint for mmap-based allocation.
ALIGN_DOUBLE
Should be defined if the architecture requires double-word alignment of GC_malloced memory, e.g. 8-byte alignment with a 32-bit ABI. Most modern machines are likely to require this. This is no longer needed for GC7 and later.
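
The HEURISTIC1 computation described above is plain address arithmetic. A minimal sketch, assuming a downward-growing stack and a power-of-two STACK_GRAN; the function name heuristic1_stack_bottom is made up for illustration and does not appear in the collector.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_GRAN 0x1000000u  /* the predefined value quoted above */

/* Hypothetical helper: round an address inside the current frame up to
 * the next multiple of gran. gran must be a nonzero power of two. An
 * address that is already a multiple of gran is returned unchanged. */
static uintptr_t heuristic1_stack_bottom(uintptr_t addr_in_frame,
                                         uintptr_t gran)
{
    assert(gran != 0 && (gran & (gran - 1)) == 0);
    return (addr_in_frame + gran - 1) & ~(gran - 1);
}
```

A caller would pass the address of one of its own locals, for example heuristic1_stack_bottom((uintptr_t)&some_local, STACK_GRAN). The result is only a usable stack base if the real base happens to be aligned to STACK_GRAN, which is why the text recommends redefining it per platform.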

Additional requirements for a basic port

In some cases, you may have to add additional platform-specific code to other files.

A likely candidate is the implementation of GC_with_callee_saves_pushed in mach_dep.c.

This ensures that register contents that the collector must trace from are copied to the stack.

Typically this can be done portably, but on some platforms it may require assembly code, or just tweaking of conditional compilation tests.

For GC7, if your platform supports getcontext(), then defining the macro UNIX_LIKE for your OS in gcconfig.h (if it isn't defined there already) is likely to solve the problem.

Otherwise, if you are using gcc, __builtin_unwind_init() will be used, and should work fine.

If that is not applicable either, the implementation will try to use setjmp().

This will work if your setjmp implementation saves all possibly pointer-valued registers into the buffer, as opposed to trying to unwind the stack at longjmp time.

The setjmp_test test tries to determine this, but often doesn't get it right.
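
The setjmp() fallback can be sketched as follows. This is a simplified illustration, not the code in mach_dep.c: the names with_registers_on_stack, stack_fn and record_sp are hypothetical, and, as the text warns, the approach only helps if the platform's setjmp really stores the pointer-holding registers in the jmp_buf.

```c
#include <setjmp.h>
#include <string.h>

typedef void (*stack_fn)(void *approx_sp, void *arg);

/* Hypothetical analogue of GC_with_callee_saves_pushed: spill registers
 * into a jmp_buf that lives in this frame, then call fn while the buffer
 * is still on the stack, so a scan starting at approx_sp covers it. */
static void with_registers_on_stack(stack_fn fn, void *arg)
{
    jmp_buf regs;

    memset(&regs, 0, sizeof regs);  /* do not scan stale garbage */
    (void)setjmp(regs);             /* saves registers; never longjmp'd */
    fn((void *)&regs, arg);
}

/* Example callback: records the address it was handed. */
static void record_sp(void *approx_sp, void *arg)
{
    *(void **)arg = approx_sp;
}
```

Passing the address of the buffer to the callback also keeps the compiler from discarding the frame before the call, since the callee might read it.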

In GC6.x versions of the collector, tracing of registers was more commonly handled with assembly code.

In GC7, this is generally to be avoided.

Most commonly os_dep.c will not require attention, but see below.

Thread support

Supporting threads requires that the collector be able to find and suspend all threads potentially accessing the garbage-collected heap, and locate any state associated with each thread that must be traced.

The functionality needed for thread support is generally implemented in one or more files specific to the particular thread interface.

For example, somewhat portable pthread support is implemented in pthread_support.c and pthread_stop_world.c.

The essential functionality consists of:

GC_stop_world()
Stops all threads which may access the garbage-collected heap, other than the caller.
GC_start_world()
Restarts the other threads.
GC_push_all_stacks()
Pushes the contents of all thread stacks (or at least of pointer-containing regions in the thread stacks) onto the mark stack.

These very often require that the garbage collector maintain its own data structures to track active threads.

In addition, LOCK and UNLOCK must be implemented in gc_locks.h.

The easiest case is probably a new pthreads platform on which threads can be stopped with signals. In this case, the changes involve:

  1. Introducing a suitable GC_X_THREADS macro, which should be automatically defined by gc_config_macros.h in the right cases. It should also result in a definition of GC_PTHREADS, as for the existing cases.
  2. For GC7+, ensuring that the atomic_ops package at least minimally supports the platform. If incremental GC is needed, or if pthread locks don't perform adequately as the allocation lock, you will probably need to ensure that a sufficient atomic_ops port exists for the platform to provide an atomic test and set operation. (Current GC7 versions require more atomic_ops support than necessary. This is a bug.) For earlier versions define GC_test_and_set in gc_locks.h.
  3. Making any needed adjustments to pthread_stop_world.c and pthread_support.c. Ideally none should be needed. In fact, not all of this is as well standardized as one would like, and outright bugs requiring workarounds are common.
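
For illustration, an allocation lock built on an atomic test-and-set can be sketched with C11 <stdatomic.h>. This stands in for, and is not, the atomic_ops-based code in gc_locks.h: the names alloc_lock and try_lock, and the LOCK and UNLOCK definitions shown here, are assumptions of the sketch, and a production lock would back off or yield instead of spinning indefinitely.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Clear means free, set means held. */
static atomic_flag alloc_lock = ATOMIC_FLAG_INIT;

/* Returns true if the lock was free and is now held by the caller. */
static bool try_lock(void)
{
    return !atomic_flag_test_and_set_explicit(&alloc_lock,
                                              memory_order_acquire);
}

/* Spin until the test-and-set observes the flag clear. */
#define LOCK()   do { while (!try_lock()) { /* spin */ } } while (0)
#define UNLOCK() atomic_flag_clear_explicit(&alloc_lock, memory_order_release)
```

Acquire ordering on the test-and-set and release ordering on the clear keep the protected allocator state from being reordered outside the critical section.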

Non-preemptive threads packages will probably require further work. Similarly, thread-local allocation and parallel marking require further work in pthread_support.c, and may require better atomic_ops support.

Dynamic library support

So long as DATASTART and DATAEND are defined correctly, the collector will trace memory reachable from file scope or static variables defined as part of the main executable. This is sufficient if either the program is statically linked, or if pointers to the garbage-collected heap are never stored in non-stack variables defined in dynamic libraries.

If dynamic library data sections must also be traced, then

  • DYNAMIC_LOADING must be defined in the appropriate section of gcconfig.h.
  • An appropriate version of the function GC_register_dynamic_libraries() should be defined in dyn_load.c. This function should invoke GC_cond_add_roots(region_start, region_end, TRUE) on each dynamic library data section.

Implementations that scan for writable data segments are error prone, particularly in the presence of threads. They frequently result in race conditions when threads exit and stacks disappear. They may also accidentally trace large regions of graphics memory, or mapped files. On at least one occasion they have been known to try to trace device memory that could not safely be read in the manner the GC wanted to read it.

It is usually safer to walk the dynamic linker data structure, especially if the linker exports an interface to do so. But beware of poorly documented locking behavior in this case.
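
The overall shape of such a function can be sketched without any platform API. Everything below is hypothetical: struct segment stands for whatever the dynamic linker reports about a mapped region, cond_add_roots is a counting stand-in for the collector's GC_cond_add_roots, and register_segments takes the segment list as a parameter, whereas a real port would obtain it from the linker's own interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one mapped segment of a loaded library. */
struct segment {
    char *start;
    char *end;       /* one past the last byte */
    int   writable;  /* nonzero if the mapping is writable */
};

/* Stand-in for GC_cond_add_roots; it only counts registrations. */
static size_t roots_added = 0;
static void cond_add_roots(char *start, char *end, int tmp)
{
    (void)tmp;
    assert(start < end);
    roots_added++;
}

/* Visit every reported segment and register the writable, non-empty
 * ones as roots; read-only segments cannot hold mutable pointers. */
static void register_segments(const struct segment *segs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (segs[i].writable && segs[i].start < segs[i].end)
            cond_add_roots(segs[i].start, segs[i].end, 1 /* TRUE */);
    }
}
```

Filtering on information the linker provides, rather than probing the address space for writable mappings, is what avoids the graphics-memory and device-memory hazards described above.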

Incremental GC support

For incremental and generational collection to work, os_dep.c must contain a suitable "virtual dirty bit" implementation, which allows the collector to track which heap pages (assumed to be a multiple of the collector's block size) have been written during a certain time interval.

The collector provides several implementations, which might be adapted.

The default (DEFAULT_VDB) is a placeholder which treats all pages as having been written.

This ensures correctness, but renders incremental and generational collection essentially useless.
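
The contract of a virtual dirty bit implementation can be sketched in a few lines. The names below (page_was_dirty_default, mark_page_dirty, page_was_dirty, reset_dirty_bits, PAGE_SIZE_LOG2, N_PAGES) are invented for this illustration and do not match os_dep.c, and the 4 KiB page size is an assumption. A real MPROTECT_VDB or PROC_VDB implementation obtains the bits from page faults or from /proc rather than from explicit calls.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE_LOG2 12u    /* assumes 4 KiB pages */
#define N_PAGES        1024u  /* slots in this toy table */

static unsigned char dirty[N_PAGES / CHAR_BIT];

/* Pages are hashed into the table; collisions only make extra pages
 * look dirty, which costs time but never correctness. */
static size_t page_index(uintptr_t addr)
{
    return (size_t)(addr >> PAGE_SIZE_LOG2) % N_PAGES;
}

/* DEFAULT_VDB-style placeholder: every page counts as written. */
static int page_was_dirty_default(uintptr_t addr)
{
    (void)addr;
    return 1;
}

/* Record a write to the page containing addr. */
static void mark_page_dirty(uintptr_t addr)
{
    size_t i = page_index(addr);
    dirty[i / CHAR_BIT] |= (unsigned char)(1u << (i % CHAR_BIT));
}

/* Has the page containing addr been written since the last reset? */
static int page_was_dirty(uintptr_t addr)
{
    size_t i = page_index(addr);
    return (dirty[i / CHAR_BIT] >> (i % CHAR_BIT)) & 1;
}

/* Start a new interval, for example at the start of a collection. */
static void reset_dirty_bits(void)
{
    memset(dirty, 0, sizeof dirty);
}
```

With the placeholder, the collector must rescan every page in each increment, which is why it is correct but gives no benefit.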

Stack traces for debug support

If stack traces in objects are needed for debug support, GC_save_callers and GC_print_callers must be implemented.
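
On glibc-based systems, one way to capture callers is the backtrace() function from <execinfo.h>. The sketch below is an assumption about how such a port might look, not the collector's code: save_callers, print_callers, struct callinfo and NFRAMES are invented names, and other C libraries may not provide <execinfo.h> at all.

```c
#include <execinfo.h>
#include <stdio.h>

#define NFRAMES 8   /* return addresses kept per record; arbitrary here */

struct callinfo {
    void *pc[NFRAMES];
    int   n;        /* number of valid entries in pc */
};

/* Record up to NFRAMES return addresses of the current call chain. */
static void save_callers(struct callinfo *info)
{
    info->n = backtrace(info->pc, NFRAMES);
}

/* Print the saved addresses to stderr, one per line. */
static void print_callers(const struct callinfo *info)
{
    for (int i = 0; i < info->n; i++)
        fprintf(stderr, "\tcaller %d: %p\n", i, info->pc[i]);
}
```

Storing raw addresses at allocation time and translating them to symbols only when a report is printed keeps the per-object cost low.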

Disclaimer

This is an initial pass at porting guidelines. Some things have no doubt been overlooked.
