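After init_loop returns, backend_init next calls init_reload to prepare for the reload pass.
What does the reload pass do?
The reload pass renumbers each pseudo register with the number of the hard register allocated to it. Pseudo registers that did not get a hard register are replaced by stack slots. The pass then finds instructions that are invalid because a value failed to end up in a register, or ended up in a register of the wrong kind, and fixes them by temporarily reloading the problematic values into registers. Additional instructions are generated to do the copying.
The reload pass also optionally eliminates the frame pointer and inserts code around calls to save and restore the registers those calls clobber.
To make the reload pass possible, init_reload gathers some information up front: how many levels of indirect addressing are possible through a register, whether indirect addressing through a symbol is possible, and whether addresses of the form (frame register + general-purpose register) are supported.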
431 void
432 init_reload (void) in reload1.c
433 {
434 int i;
435
436 /* Often (MEM (REG n)) is still valid even if (REG n) is put on the stack.
437 Set spill_indirect_levels to the number of levels such addressing is
438 permitted, zero if it is not permitted at all. */
439
440 rtx tem
441 = gen_rtx_MEM (Pmode,
442 gen_rtx_PLUS (Pmode,
443 gen_rtx_REG (Pmode,
444 LAST_VIRTUAL_REGISTER + 1),
445 GEN_INT (4)));
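The function first builds an rtx object of the form (mem (plus (reg) (const_int 4))), which it uses to measure how many levels of indirect addressing through a register are possible. gen_rtx_PLUS is defined in genrtl.h; it calls gen_rtx_fmt_ee to create the rtx object, which has two children that are both rtx expressions.
After the statement at line 440 above executes, the following rtx object has been created.
Figure 20: indirect addressing through a register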
init_reload (continued)
446 spill_indirect_levels = 0;
447
448 while (memory_address_p (QImode, tem))
449 {
450 spill_indirect_levels++;
451 tem = gen_rtx_MEM (Pmode, tem);
452 }
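spill_indirect_levels is the supported depth of indirect addressing. Indirect addressing here means the following: if spill_indirect_levels is 1, the reference (mem (reg n)), that is, the contents of the memory pointed to by register n, is still valid after register n has been spilled, because the stack slot that now holds n can serve as the address directly, without first being reloaded into a register. If spill_indirect_levels is 2, the reference (mem (mem (reg n))) also remains valid after register n is spilled.
With spill_indirect_levels, the reload pass can avoid reloading spilled registers when it does not have to.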
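The probing loop at lines 446 to 452 can be modelled in isolation. The sketch below is not GCC code: toy_addr, toy_memory_address_p and the hw_max_indirection parameter are hypothetical stand-ins for the rtx address, memory_address_p and the target's real addressing capability.

```c
#include <assert.h>

/* Toy address: a register-based address wrapped in `mem_depth` MEM layers.
   (Hypothetical type, for illustration only.) */
typedef struct {
    int mem_depth;
} toy_addr;

/* Stand-in for memory_address_p: a toy target accepts an address only if
   it is nested no deeper than the hardware supports. */
static int toy_memory_address_p(toy_addr a, int hw_max_indirection)
{
    return a.mem_depth <= hw_max_indirection;
}

/* Mirrors the while loop in init_reload: keep wrapping the address in one
   more MEM for as long as the target still accepts it. */
int toy_count_indirect_levels(int hw_max_indirection)
{
    toy_addr tem = { 1 };   /* starts as (MEM (PLUS (REG n) 4)) */
    int levels = 0;

    assert(hw_max_indirection >= 0);
    while (toy_memory_address_p(tem, hw_max_indirection)) {
        levels++;
        tem.mem_depth++;    /* tem = (MEM tem) */
    }
    return levels;
}
```

A toy target with no memory-indirect addressing, which is the x86-like case, yields toy_count_indirect_levels(0) == 0, while a toy target that supports two levels yields 2.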
On x86, memory_address_p returns false here, so spill_indirect_levels stays 0. Next, init_reload checks whether indirect addressing through a symbol is possible.
init_reload (continued)
454 /* See if indirect addressing is valid for (MEM (SYMBOL_REF ...)). */
455
456 tem = gen_rtx_MEM (Pmode, gen_rtx_SYMBOL_REF (Pmode, "foo"));
457 indirect_symref_ok = memory_address_p (QImode, tem);
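gen_rtx_SYMBOL_REF is likewise defined in genrtl.h, a file generated automatically by one of the compiler's build-time tools. It calls gen_rtx_fmt_s00; as the name suggests, the rtx object it creates has three children, the first a string and the other two unused. The rtx object is shown below.
Figure 21: indirect addressing through a symbol
Again on x86, indirect_symref_ok is 0. A nonzero value would mean that indirect addressing is supported when the innermost MEM has the form (MEM (SYMBOL_REF sym)), that is, when the address is itself the contents of the memory the symbol refers to. What about addresses of the form (frame register + GPR)?
init_reload (continued)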
459 /* See if reg+reg is a valid (and offsettable) address. */
460
461 for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
462 {
463 tem = gen_rtx_PLUS (Pmode,
464 gen_rtx_REG (Pmode, HARD_FRAME_POINTER_REGNUM),
465 gen_rtx_REG (Pmode, i));
466
467 /* This way, we make sure that reg+reg is an offsettable address. */
468 tem = plus_constant (tem, 4);
469
470 if (memory_address_p (QImode, tem))
471 {
472 double_reg_address_ok = 1;
473 break;
474 }
475 }
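double_reg_address_ok records whether addresses of the form (frame pointer register + GPR) + constant are supported. The rtx object created at line 463 is shown below; it is not yet in its final form.
Figure 22: address of the form (frame register + GPR)
As the comment says, at line 468 plus_constant adds an offset to the rtx object created above.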
1432 #define plus_constant(X, C) plus_constant_wide ((X), (HOST_WIDE_INT) (C)) in rtl.h
78 rtx
79 plus_constant_wide (rtx x, HOST_WIDE_INT c) in explow.c
80 {
81 RTX_CODE code;
82 rtx y;
83 enum machine_mode mode;
84 rtx tem;
85 int all_constant = 0;
86
87 if (c == 0)
88 return x;
89
90 restart:
91
92 code = GET_CODE (x);
93 mode = GET_MODE (x);
94 y = x;
95
96 switch (code)
97 {
98 case CONST_INT:
99 return GEN_INT (INTVAL (x) + c);
100
101 case CONST_DOUBLE:
102 {
…
113 }
114
115 case MEM:
…
129 break;
130
131 case CONST:
132 /* If adding to something entirely constant, set a flag
133 so that we can add a CONST around the result. */
134 x = XEXP (x, 0);
135 all_constant = 1;
136 goto restart;
137
138 case SYMBOL_REF:
139 case LABEL_REF:
140 all_constant = 1;
141 break;
142
143 case PLUS:
144 /* The interesting case is adding the integer to a sum.
145 Look for constant term in the sum and combine
146 with C. For an integer constant term, we make a combined
147 integer. For a constant term that is not an explicit integer,
148 we cannot really combine, but group them together anyway.
149
150 Restart or use a recursive call in case the remaining operand is
151 something that we handle specially, such as a SYMBOL_REF.
152
153 We may not immediately return from the recursive call here, lest
154 all_constant gets lost. */
155
156 if (GET_CODE (XEXP (x, 1)) == CONST_INT)
157 {
158 c += INTVAL (XEXP (x, 1));
159
160 if (GET_MODE (x) != VOIDmode)
161 c = trunc_int_for_mode (c, GET_MODE (x));
162
163 x = XEXP (x, 0);
164 goto restart;
165 }
166 else if (CONSTANT_P (XEXP (x, 1)))
167 {
168 x = gen_rtx_PLUS (mode, XEXP (x, 0), plus_constant (XEXP (x, 1), c));
169 c = 0;
170 }
171 else if (find_constant_term_loc (&y))
172 {
173 /* We need to be careful since X may be shared and we can't
174 modify it in place. */
175 rtx copy = copy_rtx (x);
176 rtx *const_loc = find_constant_term_loc (&copy);
177
178 *const_loc = plus_constant (*const_loc, c);
179 x = copy;
180 c = 0;
181 }
182 break;
183
184 default:
185 break;
186 }
187
188 if (c != 0)
189 x = gen_rtx_PLUS (mode, x, GEN_INT (c));
190
191 if (GET_CODE (x) == SYMBOL_REF || GET_CODE (x) == LABEL_REF)
192 return x;
193 else if (all_constant)
194 return gen_rtx_CONST (mode, x);
195 else
196 return x;
197 }
At line 166, CONSTANT_P checks whether the rtx object represents a constant.
293 #define CONSTANT_P(X) \ in rtl.h
294 (GET_CODE (X) == LABEL_REF || GET_CODE (X) == SYMBOL_REF \
295 || GET_CODE (X) == CONST_INT || GET_CODE (X) == CONST_DOUBLE \
296 || GET_CODE (X) == CONST || GET_CODE (X) == HIGH \
297 || GET_CODE (X) == CONST_VECTOR \
298 || GET_CODE (X) == CONSTANT_P_RTX)
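For our rtx object, the case at line 143 is taken. Its second operand is a register, so neither the test at line 156 nor the one at line 166 matches, and execution reaches line 171, which calls find_constant_term_loc.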
1818 rtx *
1819 find_constant_term_loc (rtx *p) in recog.c
1820 {
1821 rtx *tem;
1822 enum rtx_code code = GET_CODE (*p);
1823
1824 /* If *P IS such a constant term, P is its location. */
1825
1826 if (code == CONST_INT || code == SYMBOL_REF || code == LABEL_REF
1827 || code == CONST)
1828 return p;
1829
1830 /* Otherwise, if not a sum, it has no constant term. */
1831
1832 if (GET_CODE (*p) != PLUS)
1833 return 0;
1834
1835 /* If one of the summands is constant, return its location. */
1836
1837 if (XEXP (*p, 0) && CONSTANT_P (XEXP (*p, 0))
1838 && XEXP (*p, 1) && CONSTANT_P (XEXP (*p, 1)))
1839 return p;
1840
1841 /* Otherwise, check each summand for containing a constant term. */
1842
1843 if (XEXP (*p, 0) != 0)
1844 {
1845 tem = find_constant_term_loc (&XEXP (*p, 0));
1846 if (tem != 0)
1847 return tem;
1848 }
1849
1850 if (XEXP (*p, 1) != 0)
1851 {
1852 tem = find_constant_term_loc (&XEXP (*p, 1));
1853 if (tem != 0)
1854 return tem;
1855 }
1856
1857 return 0;
1858 }
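Note that on the first call to this function, the rtx object passed in is the one created at line 463 of init_reload. For this object, execution reaches line 1843 of find_constant_term_loc, which calls find_constant_term_loc recursively on the first child; that call returns 0. The code from line 1850 then runs, and its recursive call on the second child also returns 0. The outermost call, made at line 171 of plus_constant_wide, therefore returns 0.
Line 189 of plus_constant_wide then creates a new rtx object, shown below. Note that the expression is in prefix form, which needs no operator precedence.
Figure 23: address of the form (frame register + GPR), the final object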
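The path just traced can be reproduced with a miniature expression tree. The following is a sketch under stated assumptions, not GCC's implementation: toy_rtx, toy_plus_constant and toy_format are hypothetical names, only REG, CONST_INT and PLUS are modelled, and the register numbers are illustrative.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of rtx: only the three codes this walk-through needs. */
enum toy_code { TOY_REG, TOY_CONST_INT, TOY_PLUS };

struct toy_rtx {
    enum toy_code code;
    long value;              /* register number or integer value */
    struct toy_rtx *op[2];   /* operands; used by TOY_PLUS only */
};

static struct toy_rtx *toy_new(enum toy_code code, long value,
                               struct toy_rtx *a, struct toy_rtx *b)
{
    struct toy_rtx *x = malloc(sizeof *x);
    if (x == NULL)
        abort();
    x->code = code;
    x->value = value;
    x->op[0] = a;
    x->op[1] = b;
    return x;
}

struct toy_rtx *toy_reg(long n) { return toy_new(TOY_REG, n, NULL, NULL); }
struct toy_rtx *toy_int(long v) { return toy_new(TOY_CONST_INT, v, NULL, NULL); }
struct toy_rtx *toy_plus(struct toy_rtx *a, struct toy_rtx *b)
{
    return toy_new(TOY_PLUS, 0, a, b);
}

/* Simplified plus_constant: fold C into an integer term when one is found,
   otherwise wrap X in a new PLUS.  Nodes are never freed in this sketch. */
struct toy_rtx *toy_plus_constant(struct toy_rtx *x, long c)
{
    if (c == 0)
        return x;
    while (x->code == TOY_PLUS && x->op[1]->code == TOY_CONST_INT) {
        c += x->op[1]->value;   /* absorb the integer term and restart */
        x = x->op[0];
    }
    if (x->code == TOY_CONST_INT)
        return toy_int(x->value + c);
    return c != 0 ? toy_plus(x, toy_int(c)) : x;
}

/* Append S to the NUL-terminated string in BUF without overflowing it. */
static void toy_append(char *buf, size_t size, const char *s)
{
    size_t used = strlen(buf);
    if (used < size - 1)
        snprintf(buf + used, size - used, "%s", s);
}

/* Append the prefix form of X to the NUL-terminated string in BUF. */
void toy_format(const struct toy_rtx *x, char *buf, size_t size)
{
    char tmp[48];

    assert(buf != NULL && size > 0);
    switch (x->code) {
    case TOY_REG:
        snprintf(tmp, sizeof tmp, "(reg %ld)", x->value);
        toy_append(buf, size, tmp);
        break;
    case TOY_CONST_INT:
        snprintf(tmp, sizeof tmp, "(const_int %ld)", x->value);
        toy_append(buf, size, tmp);
        break;
    case TOY_PLUS:
        toy_append(buf, size, "(plus ");
        toy_format(x->op[0], buf, size);
        toy_append(buf, size, " ");
        toy_format(x->op[1], buf, size);
        toy_append(buf, size, ")");
        break;
    }
}
```

Calling toy_plus_constant(toy_plus(toy_reg(6), toy_reg(0)), 4) and formatting the result into an empty buffer gives (plus (plus (reg 6) (reg 0)) (const_int 4)), the same shape as the final object. When the sum already ends in an integer term, as in (plus (reg 6) (const_int 8)), the constant is folded into it instead of adding another PLUS.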
Then, at line 470 of init_reload, memory_address_p is called once more. As seen earlier, on x86 the real work is done by legitimate_address_p.
6033 int
6034 legitimate_address_p (enum machine_mode mode, rtx addr, int strict) in i386.c
6035 {
6036 struct ix86_address parts;
6037 rtx base, index, disp;
6038 HOST_WIDE_INT scale;
6039 const char *reason = NULL;
6040 rtx reason_rtx = NULL_RTX;
6041
6042 if (TARGET_DEBUG_ADDR)
6043 {
6044 fprintf (stderr,
6045 "\n======\nGO_IF_LEGITIMATE_ADDRESS, mode = %s, strict = %d\n",
6046 GET_MODE_NAME (mode), strict);
6047 debug_rtx (addr);
6048 }
6049
6050 if (ix86_decompose_address (addr, &parts) <= 0)
6051 {
6052 reason = "decomposition failed";
6053 goto report_error;
6054 }
Once again, ix86_decompose_address fills in the ix86_address structure for the address expression.
5566 static int
5567 ix86_decompose_address (rtx addr, struct ix86_address *out) in i386.c
5568 {
5569 rtx base = NULL_RTX;
5570 rtx index = NULL_RTX;
5571 rtx disp = NULL_RTX;
5572 HOST_WIDE_INT scale = 1;
5573 rtx scale_rtx = NULL_RTX;
5574 int retval = 1;
5575 enum ix86_address_seg seg = SEG_DEFAULT;
5576
5577 if (GET_CODE (addr) == REG || GET_CODE (addr) == SUBREG)
5578 base = addr;
5579 else if (GET_CODE (addr) == PLUS)
5580 {
5581 rtx addends[4], op;
5582 int n = 0, i;
5583
5584 op = addr;
5585 do
5586 {
5587 if (n >= 4)
5588 return 0;
5589 addends[n++] = XEXP (op, 1);
5590 op = XEXP (op, 0);
5591 }
5592 while (GET_CODE (op) == PLUS);
5593 if (n >= 4)
5594 return 0;
5595 addends[n] = op;
5596
5597 for (i = n; i >= 0; --i)
5598 {
5599 op = addends[i];
5600 switch (GET_CODE (op))
5601 {
5602 case MULT:
…
5607 break;
5608
5609 case UNSPEC:
…
5616 break;
5617
5618 case REG:
5619 case SUBREG:
5620 if (!base)
5621 base = op;
5622 else if (!index)
5623 index = op;
5624 else
5625 return 0;
5626 break;
5627
5628 case CONST:
5629 case CONST_INT:
5630 case SYMBOL_REF:
5631 case LABEL_REF:
5632 if (disp)
5633 return 0;
5634 disp = op;
5635 break;
5636
5637 default:
5638 return 0;
5639 }
5640 }
5641 }
5642 else if (GET_CODE (addr) == MULT)
5643 {
5644 index = XEXP (addr, 0); /* index*scale */
5645 scale_rtx = XEXP (addr, 1);
5646 }
5647 else if (GET_CODE (addr) == ASHIFT)
5648 {
...
5661 }
5662 else
5663 disp = addr; /* displacement */
5664
5665 /* Extract the integral value of scale. */
5666 if (scale_rtx)
5667 {
5668 if (GET_CODE (scale_rtx) != CONST_INT)
5669 return 0;
5670 scale = INTVAL (scale_rtx);
5671 }
5672
5673 /* Allow arg pointer and stack pointer as index if there is not scaling. */
5674 if (base && index && scale == 1
5675 && (index == arg_pointer_rtx
5676 || index == frame_pointer_rtx
5677 || (REG_P (index) && REGNO (index) == STACK_POINTER_REGNUM)))
5678 {
5679 rtx tmp = base;
5680 base = index;
5681 index = tmp;
5682 }
5683
5684 /* Special case: %ebp cannot be encoded as a base without a displacement. */
5685 if ((base == hard_frame_pointer_rtx
5686 || base == frame_pointer_rtx
5687 || base == arg_pointer_rtx) && !disp)
5688 disp = const0_rtx;
5689
5690 /* Special case: on K6, [%esi] makes the instruction vector decoded.
5691 Avoid this by transforming to [%esi+0]. */
5692 if (ix86_tune == PROCESSOR_K6 && !optimize_size
5693 && base && !index && !disp
5694 && REG_P (base)
5695 && REGNO_REG_CLASS (REGNO (base)) == SIREG)
5696 disp = const0_rtx;
5697
5698 /* Special case: encode reg+reg instead of reg*2. */
5699 if (!base && index && scale && scale == 2)
5700 base = index, scale = 1;
5701
5702 /* Special case: scaling cannot be encoded without base or displacement. */
5703 if (!base && !disp && index && scale != 1)
5704 disp = const0_rtx;
5705
5706 out->base = base;
5707 out->index = index;
5708 out->disp = disp;
5709 out->scale = scale;
5710 out->seg = seg;
5711
5712 return retval;
5713 }
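The do…while loop starting at line 5585 above walks this prefix-form expression; prefix notation makes the processing very easy. base ends up pointing to the rtx object hard_frame_pointer_rtx, index to the rtx object of the other register, and disp to the rtx object for the constant 4.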
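To see the two phases of that walk in isolation (peel the left-nested PLUS chain into an array, then classify the terms from the innermost outwards), here is a reduced sketch. It is an illustration under assumptions, not the i386.c code: toy_rtx and toy_address are hypothetical types, and MULT, UNSPEC, SUBREG, symbols and segments are left out.

```c
#include <assert.h>
#include <stddef.h>

enum toy_code { TOY_REG, TOY_CONST_INT, TOY_PLUS };

struct toy_rtx {
    enum toy_code code;
    long value;                     /* register number or integer value */
    const struct toy_rtx *op[2];    /* operands; used by TOY_PLUS only */
};

struct toy_address {
    const struct toy_rtx *base, *index, *disp;
};

/* Returns 1 and fills *OUT on success, 0 if the address cannot be split
   into at most one base, one index and one displacement. */
int toy_decompose_address(const struct toy_rtx *addr, struct toy_address *out)
{
    const struct toy_rtx *addends[4];
    const struct toy_rtx *op = addr;
    int n = 0;
    int i;

    assert(addr != NULL && out != NULL);
    out->base = out->index = out->disp = NULL;

    /* Phase 1: collect the right operand of every PLUS on the left spine. */
    while (op->code == TOY_PLUS) {
        if (n >= 3)
            return 0;               /* keep one slot for the innermost term */
        addends[n++] = op->op[1];
        op = op->op[0];
    }
    addends[n] = op;

    /* Phase 2: classify from the innermost term outwards, so the first
       register seen becomes the base and the second the index. */
    for (i = n; i >= 0; --i) {
        op = addends[i];
        switch (op->code) {
        case TOY_REG:
            if (out->base == NULL)
                out->base = op;
            else if (out->index == NULL)
                out->index = op;
            else
                return 0;           /* a third register cannot be encoded */
            break;
        case TOY_CONST_INT:
            if (out->disp != NULL)
                return 0;           /* only one displacement allowed */
            out->disp = op;
            break;
        default:
            return 0;               /* e.g. a PLUS nested on the right */
        }
    }
    return 1;
}
```

For the tree (plus (plus (reg fp) (reg ax)) (const_int 4)), the array is filled as {4, ax, fp} and then read backwards, so base is fp, index is ax and disp is the constant 4, matching the result described above.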
For the rest of legitimate_address_p, remember that strict is 0 unless we are in the register renaming pass or the reload pass.
legitimate_address_p (continued)
6056 base = parts.base;
6057 index = parts.index;
6058 disp = parts.disp;
6059 scale = parts.scale;
6060
6061 /* Validate base register.
6062
6063 Don't allow SUBREG's here, it can lead to spill failures when the base
6064 is one word out of a two word structure, which is represented internally
6065 as a DImode int. */
6066
6067 if (base)
6068 {
6069 reason_rtx = base;
6070
6071 if (GET_CODE (base) != REG)
6072 {
6073 reason = "base is not a register";
6074 goto report_error;
6075 }
6076
6077 if (GET_MODE (base) != Pmode)
6078 {
6079 reason = "base is not in Pmode";
6080 goto report_error;
6081 }
6082
6083 if ((strict && ! REG_OK_FOR_BASE_STRICT_P (base))
6084 || (! strict && ! REG_OK_FOR_BASE_NONSTRICT_P (base)))
6085 {
6086 reason = "base is not valid";
6087 goto report_error;
6088 }
6089 }
6090
6091 /* Validate index register.
6092
6093 Don't allow SUBREG's here, it can lead to spill failures when the index
6094 is one word out of a two word structure, which is represented internally
6095 as a DImode int. */
6096
6097 if (index)
6098 {
6099 reason_rtx = index;
6100
6101 if (GET_CODE (index) != REG)
6102 {
6103 reason = "index is not a register";
6104 goto report_error;
6105 }
6106
6107 if (GET_MODE (index) != Pmode)
6108 {
6109 reason = "index is not in Pmode";
6110 goto report_error;
6111 }
6112
6113 if ((strict && ! REG_OK_FOR_INDEX_STRICT_P (index))
6114 || (! strict && ! REG_OK_FOR_INDEX_NONSTRICT_P (index)))
6115 {
6116 reason = "index is not valid";
6117 goto report_error;
6118 }
6119 }
The macro REG_OK_FOR_INDEX_NONSTRICT_P checks whether the given register can hold the index part of an address.
1964 #define REG_OK_FOR_INDEX_NONSTRICT_P(X) \ in i386.h
1965 (REGNO (X) < STACK_POINTER_REGNUM \
1966 || (REGNO (X) >= FIRST_REX_INT_REG \
1967 && REGNO (X) <= LAST_REX_INT_REG) \
1968 || REGNO (X) >= FIRST_PSEUDO_REGISTER)
So registers ax, dx, cx, bx, si, di, bp, r8 through r15, and all pseudo registers can be used to hold an address index.
legitimate_address_p (continued)
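The same test can be written as a plain function to try out individual register numbers. The numeric constants below are assumptions chosen to match the register order listed above (ax = 0 up to bp = 6, sp = 7); the values for the REX and pseudo register boundaries in particular should be checked against the i386.h of the GCC version being read. The stack pointer is excluded because the x86 SIB byte has no encoding for it as an index.

```c
#include <assert.h>

/* Assumed register numbering, for illustration only. */
enum {
    TOY_STACK_POINTER_REGNUM  = 7,   /* ax dx cx bx si di bp are 0..6 */
    TOY_FIRST_REX_INT_REG     = 37,  /* r8  (assumed value) */
    TOY_LAST_REX_INT_REG      = 44,  /* r15 (assumed value) */
    TOY_FIRST_PSEUDO_REGISTER = 53   /* first pseudo (assumed value) */
};

/* Function form of the non-strict index check: accept the low integer
   registers, the REX integer registers, and every pseudo register. */
int toy_reg_ok_for_index_nonstrict(int regno)
{
    assert(regno >= 0);
    return regno < TOY_STACK_POINTER_REGNUM
           || (regno >= TOY_FIRST_REX_INT_REG
               && regno <= TOY_LAST_REX_INT_REG)
           || regno >= TOY_FIRST_PSEUDO_REGISTER;
}
```

Under these assumed numbers, register 0 (ax) and register 6 (bp) pass, register 7 (sp) fails, and any pseudo register passes because in non-strict mode it may still be assigned a suitable hard register later.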
6121 /* Validate scale factor. */
6122 if (scale != 1)
6123 {
…
6136 }
6137
6138 /* Validate displacement. */
6139 if (disp)
6140 {
6141 reason_rtx = disp;
6142
6143 if (GET_CODE (disp) == CONST
6144 && GET_CODE (XEXP (disp, 0)) == UNSPEC)
…
6166 else if (flag_pic && (SYMBOLIC_CONST (disp)
6167 #if TARGET_MACHO
6168 && !machopic_operand_p (disp)
6169 #endif
6170 ))
6171 {
…
6214 }
6215 else if (GET_CODE (disp) != LABEL_REF
6216 && GET_CODE (disp) != CONST_INT
6217 && (GET_CODE (disp) != CONST
6218 || !legitimate_constant_p (disp))
6219 && (GET_CODE (disp) != SYMBOL_REF
6220 || !legitimate_constant_p (disp)))
6221 {
6222 reason = "displacement is not constant";
6223 goto report_error;
6224 }
6225 else if (TARGET_64BIT && !x86_64_sign_extended_value (disp))
6226 {
6227 reason = "displacement is out of range";
6228 goto report_error;
6229 }
6230 }
6231
6232 /* Everything looks valid. */
6233 if (TARGET_DEBUG_ADDR)
6234 fprintf (stderr, "Success.\n");
6235 return TRUE;
6236
6237 report_error:
6238 if (TARGET_DEBUG_ADDR)
6239 {
6240 fprintf (stderr, "Error: %s\n", reason);
6241 debug_rtx (reason_rtx);
6242 }
6243 return FALSE;
6244 }
For the first register, ax, legitimate_address_p returns true, and so does memory_address_p. The for loop at line 461 therefore exits immediately and the following code runs.
init_reload (continued)
477 /* Initialize obstack for our rtl allocation. */
478 gcc_obstack_init (&reload_obstack);
479 reload_startobj = obstack_alloc (&reload_obstack, 0);
480
481 INIT_REG_SET (&spilled_pseudos);
482 INIT_REG_SET (&pseudos_counted);
483 }
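At line 481 above, spilled_pseudos records which pseudo registers need to be spilled. At line 482, pseudos_counted is used for communication between order_regs_for_reload and count_pseudo, to avoid counting the same pseudo register twice. Both variables are of bitmap type, and INIT_REG_SET initializes them.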
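The role of pseudos_counted, a set that prevents one pseudo register from being counted twice, can be illustrated with a small dense bit set. This is a sketch only: GCC's register sets are built on its own bitmap implementation, and toy_reg_set, toy_count_pseudo and the TOY_MAX_REGNO limit are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define TOY_MAX_REGNO 256
#define TOY_WORD_BITS (sizeof(unsigned long) * CHAR_BIT)
#define TOY_WORDS ((TOY_MAX_REGNO + TOY_WORD_BITS - 1) / TOY_WORD_BITS)

/* Dense bit set indexed by register number (toy stand-in for a regset). */
typedef struct {
    unsigned long bits[TOY_WORDS];
} toy_reg_set;

/* Plays the role of INIT_REG_SET: start with an empty set. */
void toy_init_reg_set(toy_reg_set *s)
{
    memset(s->bits, 0, sizeof s->bits);
}

int toy_regno_in_set(const toy_reg_set *s, unsigned regno)
{
    assert(regno < TOY_MAX_REGNO);
    return (int) ((s->bits[regno / TOY_WORD_BITS]
                   >> (regno % TOY_WORD_BITS)) & 1UL);
}

void toy_set_regno(toy_reg_set *s, unsigned regno)
{
    assert(regno < TOY_MAX_REGNO);
    s->bits[regno / TOY_WORD_BITS] |= 1UL << (regno % TOY_WORD_BITS);
}

/* Add COST to *TOTAL only the first time REGNO is seen; later calls for
   the same register are ignored because it is already in COUNTED. */
void toy_count_pseudo(toy_reg_set *counted, unsigned regno,
                      int cost, int *total)
{
    if (toy_regno_in_set(counted, regno))
        return;
    toy_set_regno(counted, regno);
    *total += cost;
}
```

Counting register 60 twice with cost 5 and register 61 once with cost 2 leaves the total at 7, since the second call for register 60 finds it already marked.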