a.out Format Explained

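    I have recently been reading the Linux 0.11 source code, which touches on many Linux system internals, so I have collected the documentation on the a.out format below. The Wikipedia article is not detailed enough: http://en.wikipedia.org/wiki/A.out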

    Source: http://modman.unixdev.net/?sektion=5&page=a.out&manpath=SunOS-4.1.3


A.OUT(5)                                                              A.OUT(5)



NAME
       a.out - assembler and link editor output format

SYNOPSIS
       #include <a.out.h>
       #include <stab.h>
       #include <nlist.h>

AVAILABILITY
       Sun-2,  Sun-3,  and  Sun-4  systems only.  For Sun386i systems refer to
       coff(5).

DESCRIPTION
       a.out is the output format of the assembler as(1) and the  link  editor
       ld(1).  The link editor makes a.out executable files.

       A file in a.out format consists of: a header, the program text, program
       data, text and data relocation  information,  a  symbol  table,  and  a
       string table (in that order).  In the header, the sizes of each section
       are given in bytes.  The last three sections may be absent if the  pro-
       gram  was loaded with the -s option of ld or if the symbols and reloca-
       tion have been removed by strip(1).

       The machine type in the header indicates the type of hardware on  which
       the  object  code  can be executed.   Sun-2 code runs on Sun-3 systems,
       but not vice versa.  Program files predating release 3.0 are recognized
       by a machine type of `0'.  Sun-4 code may not be run on Sun-2 or Sun-3,
       nor vice versa.

   Header
       The header consists of an exec structure.  The exec  structure  has  the
       form:

              struct exec {
                        unsigned char       a_dynamic:1;      /* has a __DYNAMIC */
                        unsigned char       a_toolversion:7;  /* version of toolset used to create this file */
                        unsigned char       a_machtype;       /* machine type */
                        unsigned short      a_magic;          /* magic number */
                        unsigned long       a_text;           /* size of text segment */
                        unsigned long       a_data;           /* size of initialized data */
                        unsigned long       a_bss;            /* size of uninitialized data */
                        unsigned long       a_syms;           /* size of symbol table */
                        unsigned long       a_entry;          /* entry point */
                        unsigned long       a_trsize;         /* size of text relocation */
                        unsigned long       a_drsize;         /* size of data relocation */
              };

       The members of the structure are:

       a_dynamic      1 if the a.out file is dynamically linked or is a shared
                      object, 0 otherwise.

       a_toolversion  The version number of the toolset (as, ld, etc.) used to
                      create the file.

       a_machtype     One of the following:

                      0           pre-3.0 executable image

                      M_68010     executable image using only MC68010 instruc-
                                  tions that can run on Sun-2  or  Sun-3  sys-
                                  tems.

                      M_68020     executable  image using MC68020 instructions
                                  that can run only on Sun-3 systems.

                      M_SPARC     executable image  using  SPARC  instructions
                                  that can run only on Sun-4 systems.

       a_magic        One of the following:

                      OMAGIC      A text executable image which is not  to  be
                                  write-protected,  so  the  data  segment  is
                                  immediately  contiguous  with  the text seg-
                                  ment.

                      NMAGIC      A  write-protected  text  executable  image.
                                  The data segment begins at the first segment
                                  boundary following the text segment, and the
                                  text segment is not writable by the program.
                                  When the image is started  with  execve(2V),
                                  the  entire  text  and data segments will be
                                  read into memory.

                      ZMAGIC      A page-aligned text executable  image.   The
                                  data  segment  begins  at  the first segment
                                  boundary following the text segment, and the
                                  text segment is not writable by the program.
                                  The text and data sizes are  both  multiples
                                  of  the page size, and the pages of the file
                                  will be brought into the  running  image  as
                                  needed, and not pre-loaded as with the other
                                  formats.  This is the  default  format  pro-
                                  duced by ld(1).

                      The  macro  N_BADMAG takes an exec structure as an argu-
                      ment; it evaluates to 1 if the  a_magic  field  of  that
                      structure is invalid, and evaluates to 0 if it is valid.

       a_text         The size of the text segment, in bytes.

       a_data         The size of the initialized portion of the data segment,
                      in bytes.

       a_bss          The size of the "uninitialized" portion of the data seg-
                      ment, in bytes.  This portion is actually initialized to
                      zero.  The zeroes are not stored in the a.out file;  the
                      data  in  this portion of the data segment is zeroed out
                      when it is loaded.

       a_syms         The size of the symbol table, in bytes.

       a_entry        The virtual address of the entry point of  the  program;
                      when  the  image  is  started  with  execve,  the  first
                      instruction executed in the image is at this address.

       a_trsize       The size of the relocation information for the text seg-
                      ment.

       a_drsize       The size of the relocation information for the data seg-
                      ment.
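
       The header fields described above can be decoded from the raw bytes of
       a file, as in the following minimal sketch.  It is not the Sun header
       file.  It assumes a 32-byte big-endian header with 4-byte longs, that
       a_dynamic is the top bit of the first byte (bit-field order is up to
       the compiler), and the conventional magic values 0407, 0410 and 0413,
       none of which this page states.  The names aout_header, aout_bad_magic
       and aout_parse_header are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Conventional a.out magic numbers (octal); assumed, not listed in this page. */
enum { AOUT_OMAGIC = 0407, AOUT_NMAGIC = 0410, AOUT_ZMAGIC = 0413 };
/* Assumed header size: 4 bytes of flags/type/magic plus seven 4-byte fields. */
enum { AOUT_HDR_SIZE = 32 };

/* Hypothetical host-side copy of the exec header fields. */
struct aout_header {
    unsigned dynamic;      /* a_dynamic */
    unsigned toolversion;  /* a_toolversion */
    unsigned machtype;     /* a_machtype */
    unsigned magic;        /* a_magic */
    uint32_t text, data, bss, syms, entry, trsize, drsize;
};

/* Read a 32-bit big-endian value. */
static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Same role as N_BADMAG: 1 if the magic number is not recognized, else 0. */
static int aout_bad_magic(unsigned magic)
{
    return !(magic == AOUT_OMAGIC || magic == AOUT_NMAGIC ||
             magic == AOUT_ZMAGIC);
}

/* Decode a header from raw file bytes.  Returns 0 on success, -1 on failure. */
int aout_parse_header(const unsigned char *buf, size_t len,
                      struct aout_header *h)
{
    assert(buf != NULL && h != NULL);
    if (len < AOUT_HDR_SIZE)
        return -1;
    h->dynamic     = buf[0] >> 7;    /* assumed: top bit of byte 0 */
    h->toolversion = buf[0] & 0x7f;  /* assumed: low seven bits of byte 0 */
    h->machtype    = buf[1];
    h->magic       = ((unsigned)buf[2] << 8) | buf[3];
    h->text   = be32(buf + 4);
    h->data   = be32(buf + 8);
    h->bss    = be32(buf + 12);
    h->syms   = be32(buf + 16);
    h->entry  = be32(buf + 20);
    h->trsize = be32(buf + 24);
    h->drsize = be32(buf + 28);
    return aout_bad_magic(h->magic) ? -1 : 0;
}
```

       Decoding byte by byte rather than casting the buffer to a struct keeps
       the sketch independent of the host's byte order and bit-field layout.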

       The  macros  N_TXTADDR,  N_DATADDR,  and  N_BSSADDR  give  the   memory
       addresses at which the text, data, and bss segments, respectively, will
       be loaded.
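
       The address computation behind these macros can be sketched as below,
       following the a_magic descriptions above: data follows text directly
       for OMAGIC, and starts at the next segment boundary otherwise.  This
       is an illustration, not the macro definitions from <a.out.h>.  The
       text base address and the segment size are machine-dependent values
       that this page does not give, so they are passed in as parameters; the
       names aout_addrs and aout_load_addrs are hypothetical, and 0407 for
       OMAGIC is the conventional value, assumed here.

```c
#include <assert.h>
#include <stddef.h>

enum { AOUT_OMAGIC = 0407 };  /* conventional value; assumed */

/* Hypothetical result type: load addresses of the three segments. */
struct aout_addrs {
    unsigned long text;
    unsigned long data;
    unsigned long bss;
};

/* Round addr up to a multiple of seg (seg must be a power of two). */
static unsigned long round_up(unsigned long addr, unsigned long seg)
{
    return (addr + seg - 1) & ~(seg - 1);
}

/* Compute where text, data and bss would be loaded. */
void aout_load_addrs(unsigned magic, unsigned long a_text,
                     unsigned long a_data, unsigned long text_base,
                     unsigned long seg_size, struct aout_addrs *out)
{
    assert(out != NULL);
    assert(seg_size != 0 && (seg_size & (seg_size - 1)) == 0);

    out->text = text_base;
    if (magic == AOUT_OMAGIC)
        out->data = text_base + a_text;            /* contiguous with text */
    else
        out->data = round_up(text_base + a_text, seg_size);
    out->bss = out->data + a_data;                 /* bss follows the data */
}
```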

       In the ZMAGIC format, the size of the header is included in the size of
       the text section; in other formats, it is not.

       When  an a.out file is executed, three logical segments are set up: the
       text segment, the data segment (with uninitialized data,  which  starts
       off as all 0, following initialized data), and a stack.  For the ZMAGIC
       format, the header is loaded with the text segment; for  other  formats
       it is not.

       Program  execution  begins  at  the  address  given by the value of the
       a_entry field.

       The stack starts at the highest possible location in the memory  image,
       and  grows downwards.  The stack is automatically extended as required.
       The data segment is extended as requested by brk(2) or sbrk.

   Text and Data Segments
       The text segment begins at the start of the file for ZMAGIC format,  or
       just  after  the  header  for  the  other  formats.  The N_TXTOFF macro
       returns this absolute file position when given  an  exec  structure  as
       argument.  The data segment is contiguous with the text and immediately
       followed by the text relocation and then the data  relocation  informa-
       tion.   The  N_DATOFF  macro  returns the absolute file position of the
       beginning of the data segment when given an exec structure as argument.

   Relocation
       The relocation information appears after the text  and  data  segments.
       The  N_TRELOFF  macro returns the absolute file position of the reloca-
       tion information for the text segment, when given an exec structure  as
       argument.   The  N_DRELOFF  macro returns the absolute file position of
       the relocation information for the data segment,  when  given  an  exec
       structure   as   argument.   There  is  no  relocation  information  if
       a_trsize+a_drsize==0.
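
       Because the sections are stored back to back in the order given in the
       DESCRIPTION, each file position is the previous position plus the
       previous size.  The sketch below shows that arithmetic; it is not the
       actual macro text.  It assumes a 32-byte header and the conventional
       ZMAGIC value 0413, and the names aout_sizes, aout_offsets and
       aout_file_offsets are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { AOUT_ZMAGIC = 0413, AOUT_HDR_SIZE = 32 };  /* assumed values */

/* Hypothetical subset of the exec header needed for offsets. */
struct aout_sizes {
    unsigned magic;
    unsigned long text, data, trsize, drsize, syms;
};

/* Hypothetical result type: absolute file positions of each section. */
struct aout_offsets {
    unsigned long text;  /* role of N_TXTOFF  */
    unsigned long data;  /* role of N_DATOFF  */
    unsigned long trel;  /* role of N_TRELOFF */
    unsigned long drel;  /* role of N_DRELOFF */
    unsigned long sym;   /* role of N_SYMOFF  */
    unsigned long str;   /* role of N_STROFF  */
};

void aout_file_offsets(const struct aout_sizes *s, struct aout_offsets *o)
{
    assert(s != NULL && o != NULL);
    /* ZMAGIC counts the header as part of the text, so text starts at 0. */
    o->text = (s->magic == AOUT_ZMAGIC) ? 0 : AOUT_HDR_SIZE;
    o->data = o->text + s->text;
    o->trel = o->data + s->data;
    o->drel = o->trel + s->trsize;
    o->sym  = o->drel + s->drsize;
    o->str  = o->sym  + s->syms;
}
```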

   Relocation (Sun-2 and Sun-3 Systems)
       If a byte in the text or data involves  a  reference  to  an  undefined
       external  symbol,  as indicated by the relocation information, then the
       value stored in the file is an offset from the associated external sym-
       bol.   When  the  file is processed by the link editor and the external
       symbol becomes defined, the value of the symbol is added to  the  bytes
       in the file.  If a byte involves a reference to a relative location, or
       relocatable segment, then the value stored in the  file  is  an  offset
       from the associated segment.

       If  relocation  information  is  present, it amounts to eight bytes per
       relocatable datum as in the following structure:

              struct reloc_info_68k {
                  long r_address;              /* address which is relocated */
                  unsigned int r_symbolnum:24, /* local symbol ordinal */
                      r_pcrel:1,               /* was relocated pc relative already */
                      r_length:2,              /* 0=byte, 1=word, 2=long */
                      r_extern:1,              /* does not include value of sym referenced */
                      r_baserel:1,             /* linkage table relative */
                      r_jmptable:1,            /* pc-relative to jump table */
                      r_relative:1,            /* relative relocation */
                      :1;
              };


       If r_extern is 0, then r_symbolnum is actually an n_type for the  relo-
       cation (for instance, N_TEXT meaning relative to segment text origin.)
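
       An eight-byte record of this form could be unpacked on any host as in
       the sketch below.  The bit positions are an assumption: they suppose
       that the compiler allocates bit-fields from the most significant bit
       downward on these big-endian machines, which this page does not state.
       The names reloc68k and reloc68k_decode are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical unpacked form of one reloc_info_68k record. */
struct reloc68k {
    uint32_t address;    /* r_address */
    uint32_t symbolnum;  /* r_symbolnum, or an n_type when ext == 0 */
    unsigned pcrel;      /* r_pcrel */
    unsigned length;     /* r_length: 0=byte, 1=word, 2=long */
    unsigned ext;        /* r_extern */
    unsigned baserel;    /* r_baserel */
    unsigned jmptable;   /* r_jmptable */
    unsigned relative;   /* r_relative */
};

/* Unpack one 8-byte big-endian record (assumed MSB-first bit-fields). */
void reloc68k_decode(const unsigned char *rec, struct reloc68k *r)
{
    unsigned flags;

    assert(rec != NULL && r != NULL);
    r->address   = ((uint32_t)rec[0] << 24) | ((uint32_t)rec[1] << 16) |
                   ((uint32_t)rec[2] << 8)  |  (uint32_t)rec[3];
    r->symbolnum = ((uint32_t)rec[4] << 16) | ((uint32_t)rec[5] << 8) |
                    (uint32_t)rec[6];
    flags = rec[7];                 /* last byte holds the one-bit flags */
    r->pcrel    = (flags >> 7) & 1;
    r->length   = (flags >> 5) & 3;
    r->ext      = (flags >> 4) & 1;
    r->baserel  = (flags >> 3) & 1;
    r->jmptable = (flags >> 2) & 1;
    r->relative = (flags >> 1) & 1; /* bit 0 is the unnamed padding bit */
}
```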

   Relocation (Sun-4 System)
       If  a  byte  in  the  text or data involves a reference to an undefined
       external symbol, as indicated by the relocation information,  then  the
       value stored in the file is ignored. Unlike the Sun-2 and Sun-3 systems,
       the offset from the associated  symbol  is  kept  with  the  relocation
       record.  When the file is processed by the link editor and the external
       symbol becomes defined, the value of the symbol is added to  this  off-
       set,  and  the  sum is inserted into the bytes in the text or data seg-
       ment.

       If relocation information is present, it amounts to  twelve  bytes  per
       relocatable datum as in the following structure:

       enum reloc_type
       {
           RELOC_8, RELOC_16, RELOC_32,                    /* simplest relocs */
           RELOC_DISP8, RELOC_DISP16, RELOC_DISP32,        /* Disp's (pc-rel) */
           RELOC_WDISP30, RELOC_WDISP22,                   /* SR word disp's */
           RELOC_HI22, RELOC_22,                           /* SR 22-bit relocs */
           RELOC_13, RELOC_LO10,                           /* SR 13&10-bit relocs */
           RELOC_SFA_BASE, RELOC_SFA_OFF13,                /* SR S.F.A. relocs */
           RELOC_BASE10, RELOC_BASE13, RELOC_BASE22,       /* base_relative pic */
           RELOC_PC10, RELOC_PC22,                         /* special pc-rel pic */
           RELOC_JMP_TBL,                                  /* jmp_tbl_rel in pic */
           RELOC_SEGOFF16,                                 /* ShLib offset-in-seg */
           RELOC_GLOB_DAT, RELOC_JMP_SLOT, RELOC_RELATIVE, /* rtld relocs */
       };
       struct reloc_info_sparc /* used when header.a_machtype == M_SPARC */
       {
           unsigned long int r_address;  /* relocation addr (offset in segment) */
           unsigned int r_index:24;      /* segment index or symbol index */
           unsigned int r_extern:1;      /* if F, r_index==SEG#; if T, SYM idx */
           int :2;                       /* <unused> */
           enum reloc_type r_type:5;     /* type of relocation to perform */
           long int r_addend;            /* addend for relocation value */
       };

       If r_extern is 0, then r_index is actually an n_type for the  relocation
       (for instance, N_TEXT meaning relative to segment text origin.)
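
       The difference between the two relocation schemes can be sketched for
       the simplest case, a full 32-bit reference to an external symbol that
       has just become defined.  This is an illustration of the rule in the
       text, not the link editor's code: it ignores pc-relative and partial-
       word relocation types, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit big-endian value. */
static uint32_t get_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Store a 32-bit big-endian value. */
static void put_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Sun-2/Sun-3 style: the bytes in the file hold the offset from the
 * symbol, so the symbol value is added to them in place. */
void reloc_apply_68k_long(unsigned char *seg, size_t seg_len,
                          uint32_t r_address, uint32_t sym_value)
{
    assert(seg != NULL && seg_len >= 4 && r_address <= seg_len - 4);
    put_be32(seg + r_address, get_be32(seg + r_address) + sym_value);
}

/* Sun-4 style: the bytes in the file are ignored; the offset lives in
 * r_addend, and symbol value + addend overwrites the bytes. */
void reloc_apply_sparc_32(unsigned char *seg, size_t seg_len,
                          uint32_t r_address, uint32_t sym_value,
                          int32_t r_addend)
{
    assert(seg != NULL && seg_len >= 4 && r_address <= seg_len - 4);
    put_be32(seg + r_address, sym_value + (uint32_t)r_addend);
}
```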

   Symbol Table
       The N_SYMOFF macro returns the absolute file position of the symbol ta-
       ble  when  given an exec structure as argument.  Within this symbol ta-
       ble, distinct symbols point to disjoint areas in the string table (even
       when  two  symbols  have  the same name).  The string table immediately
       follows the symbol table; the N_STROFF macro returns the absolute  file
       position  of the string table when given an exec structure as argument.
       The first 4 bytes of the string table are not used for string  storage,
       but rather contain the size of the string table. This size includes the
       4 bytes; thus, the minimum string table size is 4.  Layout  information
       as given in the include file for the Sun system is shown below.

       The  layout  of a symbol table entry and the principal flag values that
       distinguish symbol types are given in the include file as follows:

       struct nlist {
             union {
                   char *n_name;      /* for use when in-memory */
                   long  n_strx;      /* index into file string table */
             } n_un;
             unsigned char n_type;    /* type flag, that is, N_TEXT etc; see below */
             char          n_other;
             short         n_desc;    /* see <stab.h> */
             unsigned      n_value;   /* value of this symbol (or adb offset) */
       };
       #define n_hash  n_desc         /* used internally by ld */
       /*
        * Simple values for n_type.
        */
       #define N_UNDF  0x0            /* undefined */
       #define N_ABS   0x2            /* absolute */
       #define N_TEXT  0x4            /* text */
       #define N_DATA  0x6            /* data */
       #define N_BSS   0x8            /* bss */
       #define N_COMM  0x12           /* common (internal to ld) */
       #define N_FN    0x1f           /* file name symbol */
       #define N_EXT   01             /* external bit, or'ed in */
       #define N_TYPE  0x1e           /* mask for all the type bits */
       /*
        * Other permanent symbol table entries have some of the N_STAB bits set.
        * These are given in <stab.h>
        */
       #define N_STAB  0xe0           /* if any of these bits set, don't discard */

       In the a.out file a symbol's n_un.n_strx field gives an index into  the
       string table.  An n_strx value of 0 indicates that no name is associated
       with a particular symbol table entry.  The  field  n_un.n_name  can  be
       used to refer to the symbol name only if the program sets this up using
       n_strx and appropriate data from the  string  table.   Because  of  the
       union  in  the  nlist  declaration, it is impossible in C to statically
       initialize such a structure.  If this  must  be  done  (as  when  using
       nlist(3V))  include  the  file  <nlist.h>,  rather  than <a.out.h>.  This
       contains the declaration without the union.
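
       Resolving a name from n_strx could look like the sketch below, which
       works on a copy of the string table held in memory.  It assumes the
       4-byte size is stored big-endian and that n_strx counts from the start
       of the string table (so valid names start at index 4 or later).  The
       bounds checks are a defensive addition, and the name aout_symbol_name
       is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the name at index n_strx, or NULL if the entry has no name or
 * the index does not point at a terminated string inside the table. */
const char *aout_symbol_name(const unsigned char *strtab, size_t len,
                             uint32_t n_strx)
{
    uint32_t size;

    assert(strtab != NULL);
    if (len < 4)
        return NULL;                       /* too short to hold the size */
    size = ((uint32_t)strtab[0] << 24) | ((uint32_t)strtab[1] << 16) |
           ((uint32_t)strtab[2] << 8)  |  (uint32_t)strtab[3];
    if (size < 4 || size > len)
        return NULL;                       /* size must include its 4 bytes */
    if (n_strx == 0)
        return NULL;                       /* 0 means "no name" */
    if (n_strx < 4 || n_strx >= size)
        return NULL;                       /* inside size field or past end */
    if (memchr(strtab + n_strx, '\0', size - n_strx) == NULL)
        return NULL;                       /* unterminated string */
    return (const char *)(strtab + n_strx);
}
```

       A program that wants to use n_un.n_name would store the returned
       pointer there after reading each entry, as the text describes.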

       If a symbol's type is undefined external, and the value field  is  non-
       zero,  the symbol is interpreted by the loader ld as the name of a com-
       mon region whose size is indicated by the value of the symbol.
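
       The n_type masks and the common-region rule above can be combined into
       a small classifier.  The sketch below uses the constant values listed
       in this page; the struct sym_info and the function aout_classify are
       hypothetical, and treating entries with N_STAB bits set as never being
       common is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Values copied from the n_type definitions in this page. */
enum {
    N_UNDF = 0x0,
    N_EXT  = 01,
    N_TYPE = 0x1e,
    N_STAB = 0xe0
};

/* Hypothetical summary of one symbol table entry. */
struct sym_info {
    unsigned type;   /* n_type & N_TYPE, for example N_TEXT */
    int external;    /* non-zero if the N_EXT bit is set */
    int is_stab;     /* non-zero if any N_STAB bit is set */
    int is_common;   /* undefined external with a non-zero value */
};

void aout_classify(unsigned char n_type, unsigned long n_value,
                   struct sym_info *out)
{
    assert(out != NULL);
    out->type      = n_type & N_TYPE;
    out->external  = (n_type & N_EXT) != 0;
    out->is_stab   = (n_type & N_STAB) != 0;
    /* For a common symbol, n_value is the size of the region. */
    out->is_common = !out->is_stab && out->external &&
                     out->type == N_UNDF && n_value != 0;
}
```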

SEE ALSO
       adb(1),  as(1),  cc(1V),  dbx(1),  ld(1),  nm(1),   strip(1),   brk(2),
       nlist(3V), coff(5)



                               18 February 1988                       A.OUT(5)
