GCC's back-end & assembly emission (1)

1.        Overview

To make porting GCC to other machines (architectures) as efficient and convenient as possible, GCC uses a machine description file (MD file) for each chip. A chip is described by a series of definitions called patterns. In general, the chip has to be described from two angles.

The first is the instruction set, defined in the form of RTL. It covers what the instructions look like (define_insn patterns), which instruction sequences are more efficient than other equivalent ones (define_peephole and define_peephole2), how to split a complex instruction into simpler ones so that one of them can be placed into a delay slot or used to fill the pipeline (define_split and define_insn_and_split; for delay slot and pipeline filling considerations, refer to Tool of genattrtab), and how to expand an operation into simpler instructions so that define_insn patterns can be matched (define_expand).

The second is the architecture description. Different chips of the same series may still have different function units and different pipeline structures. To exploit a chip's abilities as far as possible, we need to tell the compiler about these details. This description is also written in the RTL language.

These two forms of description are largely human readable. To convert them into a form that can be included in the source code of GCC, the developers designed a series of tools. These tools are essential; without them the back-end can do nothing. They not only do the major work of assembly generation, but also provide the basis for optimizations executed at a level close to the machine. (In release V4, a layer named SSA was introduced to enhance the compiler's ability to optimize at a level closer to the source code. With the machine description file, release V3 can provide very powerful optimization at the lower level; however, in this lower-level form, much information that helps dependence analysis, such as types and aliases, has been stripped, which keeps the compiler from taking some optimization opportunities.)

In the following, we study most of the tools that handle the machine description file to understand how the back-end is constructed. When GCC is built, these tools are compiled first and then run to parse the description file of the target machine and generate source code. After that, the rest of GCC's source (such as the front-end we have seen) is compiled together with the code emitted here to build the compiler.

2.        Tool of genconditions

2.1. Preparation for Code Emission

Before studying the genrecog tool, we first need to study the genconditions tool, as it produces the file insn-conditions.c, which is used by genrecog.

 

182  int
183  main (int argc, char **argv)                                        in genconditions.c
184  {
185    rtx desc;
186    int pattern_lineno; /* not used */
187    int code;
188
189    progname = "genconditions";
190
191    if (argc <= 1)
192      fatal ("No input file name.");
193
194    if (init_md_reader (argv[1]) != SUCCESS_EXIT_CODE)
195      return (FATAL_EXIT_CODE);
196
197    condition_table = htab_create (1000, hash_c_test, cmp_c_test, NULL);

2.1.1. Read in definitions in rtx

Note that condition_table above and condition_table below are two different static variables, declared in genconditions.c and gensupport.c respectively. init_md_reader, a common function also invoked by other tools, reads in rtx objects from the machine description file.

 

935  int
936  init_md_reader (const char *filename)                               in gensupport.c
937  {
938    FILE *input_file;
939    int c;
940    size_t i;
941    char *lastsl;
942
943    lastsl = strrchr (filename, '/');
944    if (lastsl != NULL)
945      base_dir = save_string (filename, lastsl - filename + 1);
946
947    read_rtx_filename = filename;
948    input_file = fopen (filename, "r");
949    if (input_file == 0)
950    {
951      perror (filename);
952      return FATAL_EXIT_CODE;
953    }
954
955    /* Initialize the table of insn conditions.  */
956    condition_table = htab_create (n_insn_conditions,
957                              hash_c_test, cmp_c_test, NULL);
958
959    for (i = 0; i < n_insn_conditions; i++)
960      *(htab_find_slot (condition_table, &insn_conditions[i], INSERT))
961        = (void *) &insn_conditions[i];
962    obstack_init (rtl_obstack);
963    errors = 0;
964    sequence_num = 0;
965
966    /* Read the entire file.  */
967    while (1)
968    {
969      rtx desc;
970      int lineno;
971
972      c = read_skip_spaces (input_file);
973      if (c == EOF)
974        break;
975
976      ungetc (c, input_file);
977      lineno = read_rtx_lineno;
978      desc = read_rtx (input_file);
979      process_rtx (desc, lineno);
980    }
981    fclose (input_file);
982
983    /* Process define_cond_exec patterns.  */
984    if (define_cond_exec_queue != NULL)
985      process_define_cond_exec ();
986
987    return errors ? FATAL_EXIT_CODE : SUCCESS_EXIT_CODE;
988  }

 

read_skip_spaces skips white space and comments, and returns the first significant character.

 

102  int
103  read_skip_spaces (FILE *infile)                                     in read-rtl.c
104  {
105    int c;
106
107    while (1)
108    {
109      c = getc (infile);
110      switch (c)
111      {
112        case '\n':
113          read_rtx_lineno++;
114          break;
115
116        case ' ': case '\t': case '\f': case '\r':
117          break;
118
119        case ';':
120          do
121            c = getc (infile);
122          while (c != '\n' && c != EOF);
123          read_rtx_lineno++;
124          break;
125
126        case '/':
127        {
128          int prevc;
129          c = getc (infile);
130          if (c != '*')
131            fatal_expected_char (infile, '*', c);
132
133          prevc = 0;
134          while ((c = getc (infile)) && c != EOF)
135          {
136            if (c == '\n')
137              read_rtx_lineno++;
138            else if (prevc == '*' && c == '/')
139              break;
140            prevc = c;
141          }
142        }
143        break;
144
145        default:
146          return c;
147      }
148    }
149  }

 

In an md file, ";" starts a comment that runs to the end of the line, and a “/*” and “*/” pair delimits a comment too. With the help of read_skip_spaces, read_rtx then constructs an rtx object from an rtx definition in the machine description file.
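The comment rules above can be tried out on an in-memory string. The following is only a minimal sketch: skip_spaces_and_comments is a hypothetical helper, not part of GCC, and unlike read_skip_spaces it returns a lone "/" to the caller instead of reporting a fatal error.

```c
#include <stddef.h>

/* Hypothetical helper, not GCC's read_skip_spaces: scans the string S
   starting at *POS, skips blanks, ';' line comments and C-style block
   comments, and returns the first significant character (or 0 at the
   end of the string).  *POS is left just past the returned character
   and *LINENO is incremented for every newline consumed.  */
static int
skip_spaces_and_comments (const char *s, size_t *pos, int *lineno)
{
  for (;;)
    {
      int c = (unsigned char) s[*pos];
      if (c == 0)
        return 0;
      (*pos)++;
      switch (c)
        {
        case '\n':
          (*lineno)++;
          break;
        case ' ': case '\t': case '\f': case '\r':
          break;
        case ';':
          /* Line comment: discard up to and including the newline.  */
          while (s[*pos] != 0 && s[*pos] != '\n')
            (*pos)++;
          if (s[*pos] == '\n')
            {
              (*pos)++;
              (*lineno)++;
            }
          break;
        case '/':
          if (s[*pos] != '*')
            return c;   /* Assumption: GCC reports a fatal error here.  */
          (*pos)++;
          {
            int prev = 0;
            while (s[*pos] != 0)
              {
                int d = (unsigned char) s[(*pos)++];
                if (d == '\n')
                  (*lineno)++;
                else if (prev == '*' && d == '/')
                  break;
                prev = d;
              }
          }
          break;
        default:
          return c;
        }
    }
}
```

Called on a buffer that begins with a ";" comment and a block comment, the helper should hand back the opening parenthesis of the first pattern, with the line counter advanced past both comments.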

 

509  rtx
510  read_rtx (FILE *infile)                                             in read-rtl.c
511  {
512    int i, j;
513    RTX_CODE tmp_code;
514    const char *format_ptr;
515    /* tmp_char is a buffer used for reading decimal integers
516      and names of rtx types and machine modes.
517      Therefore, 256 must be enough.  */
518    char tmp_char[256];
519    rtx return_rtx;
520    int c;
521    int tmp_int;
522    HOST_WIDE_INT tmp_wide;
523
524    /* Obstack used for allocating RTL objects.  */
525    static struct obstack rtl_obstack;
526    static int initialized;
527
528    /* Linked list structure for making RTXs: */
529    struct rtx_list
530    {
531      struct rtx_list *next;
532      rtx value;        /* Value of this node.  */
533    };
534
535    if (!initialized) {
536      obstack_init (&rtl_obstack);
537      initialized = 1;
538    }
539
540  again:
541    c = read_skip_spaces (infile); /* Should be open paren.  */
542    if (c != '(')
543      fatal_expected_char (infile, '(', c);
544
545    read_name (tmp_char, infile);
546
547    tmp_code = UNKNOWN;
548
549    if (! strcmp (tmp_char, "define_constants"))
550    {
551      read_constants (infile, tmp_char);
552      goto again;
553    }
554    for (i = 0; i < NUM_RTX_CODE; i++)
555      if (! strcmp (tmp_char, GET_RTX_NAME (i)))
556      {
557        tmp_code = (RTX_CODE) i;     /* get value for name */
558        break;
559      }
560
561    if (tmp_code == UNKNOWN)
562      fatal_with_file_and_line (infile, "unknown rtx code `%s'", tmp_char);
563
564    /* (NIL) stands for an expression that isn't there.  */
565    if (tmp_code == NIL)
566    {
567      /* Discard the closeparen.  */
568      while ((c = getc (infile)) && c != ')')
569        ;
570
571      return 0;
572    }
573
574    /* If we end up with an insn expression then we free this space below.  */
575    return_rtx = rtx_alloc (tmp_code);
576    format_ptr = GET_RTX_FORMAT (GET_CODE (return_rtx));

 

In the machine description file, every definition must be enclosed in a pair of parentheses; line 542 checks this. Then read_name is invoked to get the name of the definition.

 

154  static void
155  read_name (char *str, FILE *infile)                                 in read-rtl.c
156  {
157    char *p;
158    int c;
159
160    c = read_skip_spaces (infile);
161
162    p = str;
163    while (1)
164    {
165      if (c == ' ' || c == '\n' || c == '\t' || c == '\f' || c == '\r')
166        break;
167      if (c == ':' || c == ')' || c == ']' || c == '"' || c == '/'
168         || c == '(' || c == '[')
169      {
170        ungetc (c, infile);
171        break;
172      }
173      *p++ = c;
174      c = getc (infile);
175    }
176    if (p == str)
177      fatal_with_file_and_line (infile, "missing name or number");
178    if (c == '\n')
179      read_rtx_lineno++;
180
181    *p = 0;
182
183    if (md_constants)
184    {
185      /* Do constant expansion.  */
186      struct md_constant *def;
187
188      p = str;
189      do
190      {
191        struct md_constant tmp_def;
192
193        tmp_def.name = p;
194        def = htab_find (md_constants, &tmp_def);
195        if (def)
196          p = def->value;
197      } while (def);
198      if (p != str)
199        strcpy (str, p);
200    }
201  }

 

Above, at line 183, md_constants is a hash table (htab), while at line 186 md_constant is a simple struct with two char pointer members, name and value.
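The expansion loop at lines 189 to 197 keeps looking up the current string until no definition matches, so a constant whose value is itself the name of another constant is resolved as well. A minimal sketch of that behaviour follows; it assumes a small array with linear search in place of the htab, and find_constant and expand_constant are hypothetical names, not GCC functions.

```c
#include <stddef.h>
#include <string.h>

/* Same shape as the two-pointer struct described above.  */
struct md_constant
{
  const char *name;
  const char *value;
};

/* Hypothetical stand-in for the md_constants hash table: a linear
   search over a small array.  */
static const struct md_constant *
find_constant (const struct md_constant *tab, size_t n, const char *name)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (tab[i].name, name) == 0)
      return &tab[i];
  return NULL;
}

/* Follow NAME through the table until no further definition matches,
   mirroring the do/while loop in read_name.  Like that loop, this
   sketch does not guard against cyclic definitions.  */
static const char *
expand_constant (const struct md_constant *tab, size_t n, const char *name)
{
  const struct md_constant *def;

  while ((def = find_constant (tab, n, name)) != NULL)
    name = def->value;
  return name;
}
```

With a table holding the illustrative entries (FLAGS_REG, 17) and (CC_REG, FLAGS_REG), expanding CC_REG goes through FLAGS_REG and ends at 17, while a name that is not in the table is returned unchanged.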

A machine description file sometimes defines constants that are used in the instruction definitions. These constant definitions are introduced by define_constants. At line 551 in read_rtx above, read_constants is invoked to handle them.

 

421  static void
422  read_constants (FILE *infile, char *tmp_char)                       in read-rtl.c
423  {
424    int c;
425    htab_t defs;
426
427    c = read_skip_spaces (infile);
428    if (c != '[')
429      fatal_expected_char (infile, '[', c);
430    defs = md_constants;
431    if (! defs)
432      defs = htab_create (32, def_hash, def_name_eq_p, (htab_del) 0);
433    /* Disable constant expansion during definition processing.  */
434    md_constants = 0;
435    while ( (c = read_skip_spaces (infile)) != ']')
436    {
437      struct md_constant *def;
438      void **entry_ptr;
439
440      if (c != '(')
441        fatal_expected_char (infile, '(', c);
442      def = xmalloc (sizeof (struct md_constant));
443      def->name = tmp_char;
444      read_name (tmp_char, infile);
445      entry_ptr = htab_find_slot (defs, def, TRUE);
446      if (! *entry_ptr)
447        def->name = xstrdup (tmp_char);
448      c = read_skip_spaces (infile);
449      ungetc (c, infile);
450      read_name (tmp_char, infile);
451      if (! *entry_ptr)
452      {
453        def->value = xstrdup (tmp_char);
454        *entry_ptr = def;
455      }
456      else
457      {
458        def = *entry_ptr;
459        if (strcmp (def->value, tmp_char))
460          fatal_with_file_and_line (infile,
461                   "redefinition of %s, was %s, now %s",
462                   def->name, def->value, tmp_char);
463      }
464      c = read_skip_spaces (infile);
465      if (c != ')')
466        fatal_expected_char (infile, ')', c);
467    }
468    md_constants = defs;
469    c = read_skip_spaces (infile);
470    if (c != ')')
471      fatal_expected_char (infile, ')', c);
472  }

 

A constant definition has the form (define_constants [(name value)…(name value)]). Each name and value pair is retrieved and saved into a struct md_constant, which is in turn saved into the hash table md_constants.
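To make the bracketed form concrete, here is a rough sketch that parses the "[(name value) …]" part from a string into a fixed array. All names in it (struct pair, read_token, expect, parse_constants, MAX_TOKEN) are hypothetical; it uses a plain array instead of a hash table, and it returns -1 where read_constants would stop with a fatal error.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_TOKEN 32

struct pair
{
  char name[MAX_TOKEN];
  char value[MAX_TOKEN];
};

/* Copy one token, delimited by white space or a bracket, from *P into
   BUF.  Returns nonzero if at least one character was read.  */
static int
read_token (const char **p, char *buf)
{
  size_t n = 0;

  while (isspace ((unsigned char) **p))
    (*p)++;
  while (**p && !isspace ((unsigned char) **p)
         && **p != '(' && **p != ')' && **p != '[' && **p != ']'
         && n < MAX_TOKEN - 1)
    buf[n++] = *(*p)++;
  buf[n] = '\0';
  return n > 0;
}

/* Skip white space and consume WANT if it is the next character.  */
static int
expect (const char **p, char want)
{
  while (isspace ((unsigned char) **p))
    (*p)++;
  if (**p != want)
    return 0;
  (*p)++;
  return 1;
}

/* Parse "[(name value) ...]" into OUT.  Returns the number of pairs,
   or -1 on a syntax error, on overflow of OUT, or when a name is
   redefined with a different value.  An identical repeat is simply
   stored again in this sketch.  */
static int
parse_constants (const char *text, struct pair *out, int max)
{
  const char *p = text;
  int count = 0;

  if (!expect (&p, '['))
    return -1;
  while (!expect (&p, ']'))
    {
      if (count == max
          || !expect (&p, '(')
          || !read_token (&p, out[count].name)
          || !read_token (&p, out[count].value)
          || !expect (&p, ')'))
        return -1;
      for (int i = 0; i < count; i++)
        if (strcmp (out[i].name, out[count].name) == 0
            && strcmp (out[i].value, out[count].value) != 0)
          return -1;
      count++;
    }
  return count;
}
```

The redefinition check plays the role of the strcmp at line 459: giving the same name a second, different value is treated as an error.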

Other machine description patterns include define_insn, define_attr, define_peephole, define_split, define_expand, define_insn_and_split, etc. They also appear in the file rtl.def as special rtx objects. For define_insn, for example, we can find the following entry:

 

192  DEF_RTL_EXPR(DEFINE_INSN, "define_insn", "sEsTV", 'x')                                   in rtl.def

 

In this definition, the first parameter is the rtx code, the second is the name, the third is the format, in which “sEsTV” means the rtx object has at most 5 children, and the last is the rtx class. Let’s see what the characters in the format and the class mean.

For format we can get:

"0" field is unused (or used in a phase-dependent manner), prints nothing

"i" an integer, prints the integer

"n" like "i", but prints entries from `note_insn_name'

"w" an integer of width HOST_BITS_PER_WIDE_INT, prints the integer

"s" a pointer to a string, prints the string

"S" like "s", but optional: the containing rtx may end before this operand

"T" like "s", but treated specially by the RTL reader; only in machine description patterns.

"e" a pointer to an rtl expression, prints the expression

"E" a pointer to a vector that points to a number of rtl expressions, prints a list of the rtl expressions

"V" like "E", but optional: the containing rtx may end before this operand

"u" a pointer to another insn, prints the uid of the insn.

"b" is a pointer to a bitmap header.

"B" is a basic block pointer.

"t" is a tree pointer.

For class we can get:

"o" an rtx code that can be used to represent an object (e.g, REG, MEM)

"<" an rtx code for a comparison (e.g, EQ, NE, LT)

"1" an rtx code for a unary arithmetic expression (e.g, NEG, NOT)

"c" an rtx code for a commutative binary operation (e.g,, PLUS, MULT)

"3" an rtx code for a non-bitfield three input operation (IF_THEN_ELSE)

"2" an rtx code for a non-commutative binary operation (e.g., MINUS, DIV)

"b" an rtx code for a bit-field operation (ZERO_EXTRACT, SIGN_EXTRACT)

"i" an rtx code for a machine insn (INSN, JUMP_INSN, CALL_INSN)

"m" an rtx code for something that matches in insns (e.g, MATCH_DUP)

"g" an rtx code for grouping insns together (e.g, GROUP_PARALLEL)

"a" an rtx code for autoincrement addressing modes (e.g. POST_DEC)

"x" everything else

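As a small illustration of reading a format string, the sketch below maps each format character to a short description taken from the list above and computes how many operands a pattern must supply. Both function names are hypothetical, and the minimum count rests on an assumption: that only the optional operands "S" and "V" at the end of a format may be left out.

```c
#include <stddef.h>
#include <string.h>

/* Short description for one rtx format character, following the list
   above.  Returns "unknown" for characters not in that list.  */
static const char *
format_char_meaning (char c)
{
  switch (c)
    {
    case '0': return "unused field";
    case 'i': return "integer";
    case 'n': return "integer (note_insn_name index)";
    case 'w': return "wide integer";
    case 's': return "string";
    case 'S': return "optional string";
    case 'T': return "string handled specially by the reader";
    case 'e': return "rtl expression";
    case 'E': return "vector of rtl expressions";
    case 'V': return "optional vector of rtl expressions";
    case 'u': return "pointer to another insn";
    case 'b': return "bitmap header pointer";
    case 'B': return "basic block pointer";
    case 't': return "tree pointer";
    default:  return "unknown";
    }
}

/* Assumption of this sketch: only trailing 'S' and 'V' operands may be
   omitted, so the minimum operand count is the format length without
   them.  */
static size_t
min_operand_count (const char *format)
{
  size_t n = strlen (format);

  while (n > 0 && (format[n - 1] == 'S' || format[n - 1] == 'V'))
    n--;
  return n;
}
```

For "sEsTV" this gives a minimum of four operands and a maximum of five: a name string, a vector of expressions, a condition string, an output template, and an optional vector.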
 

The rtl definitions of other related patterns are:

 

200  DEF_RTL_EXPR(DEFINE_PEEPHOLE, "define_peephole", "EsTV", 'x')
211  DEF_RTL_EXPR(DEFINE_SPLIT, "define_split", "EsES", 'x')
239  DEF_RTL_EXPR(DEFINE_INSN_AND_SPLIT, "define_insn_and_split", "sEsTsESV", 'x')
243  DEF_RTL_EXPR(DEFINE_PEEPHOLE2, "define_peephole2", "EsES", 'x')
247  DEF_RTL_EXPR(DEFINE_COMBINE, "define_combine", "Ess", 'x')
260  DEF_RTL_EXPR(DEFINE_EXPAND, "define_expand", "sEss", 'x')
276  DEF_RTL_EXPR(DEFINE_DELAY, "define_delay", "eE", 'x')

 

So at line 554 in function read_rtx above, the patterns in the machine description file are recognized by name, and the corresponding rtx codes are fetched. At line 565, NIL is the code used by the rtl reader and printer to represent a null pointer. The patterns are then allocated as rtx objects at line 575, and read_rtx goes on to handle their operands.
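The name lookup works because each DEF_RTL_EXPR entry can be expanded several times with different macro definitions, once into an enumerator and once into a table of names or formats. The following is a minimal sketch of that technique. It assumes an inline RTL_DEFS list as a stand-in for including rtl.def, carries only four entries, and uses the hypothetical names rtx_name, rtx_format and lookup_rtx_code.

```c
#include <string.h>

/* Stand-in for rtl.def: a short list of entries in the same
   DEF_RTL_EXPR (code, name, format, class) shape.  */
#define RTL_DEFS \
  DEF_RTL_EXPR (UNKNOWN, "UnKnown", "*", 'x') \
  DEF_RTL_EXPR (DEFINE_INSN, "define_insn", "sEsTV", 'x') \
  DEF_RTL_EXPR (DEFINE_SPLIT, "define_split", "EsES", 'x') \
  DEF_RTL_EXPR (DEFINE_EXPAND, "define_expand", "sEss", 'x')

/* First expansion: every entry becomes an enumerator.  */
#define DEF_RTL_EXPR(ENUM, NAME, FORMAT, CLASS) ENUM,
enum rtx_code { RTL_DEFS NUM_RTX_CODE };
#undef DEF_RTL_EXPR

/* Second expansion: every entry contributes its name string.  */
#define DEF_RTL_EXPR(ENUM, NAME, FORMAT, CLASS) NAME,
static const char *const rtx_name[] = { RTL_DEFS };
#undef DEF_RTL_EXPR

/* Third expansion: every entry contributes its format string.  */
#define DEF_RTL_EXPR(ENUM, NAME, FORMAT, CLASS) FORMAT,
static const char *const rtx_format[] = { RTL_DEFS };
#undef DEF_RTL_EXPR

/* Linear search by name, as in the loop at line 554 of read_rtx.
   Returns UNKNOWN when no entry matches.  */
static enum rtx_code
lookup_rtx_code (const char *name)
{
  for (int i = 0; i < NUM_RTX_CODE; i++)
    if (strcmp (name, rtx_name[i]) == 0)
      return (enum rtx_code) i;
  return UNKNOWN;
}
```

Because the enumerators and both tables come from the same list, the code returned by the lookup can index rtx_format directly, which is the role GET_RTX_FORMAT plays at line 576.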

Let’s take an example from i386.md.

 

467  (define_insn "cmpdi_ccno_1_rex64"                                   in i386.md
468    [(set (reg 17)
469         (compare (match_operand:DI 0 "nonimmediate_operand" "r,?mr")
470                (match_operand:DI 1 "const0_operand" "n,n")))]
471    "TARGET_64BIT && ix86_match_ccmode (insn, CCNOmode)"
472    "@
473     test{q}\t{%0, %0|%0, %0}
474     cmp{q}\t{%1, %0|%0, %1}"
475    [(set_attr "type" "test,icmp")
476     (set_attr "length_immediate" "0,1")
477     (set_attr "mode" "DI")])

 
