Returning to finish_file, we have seen that flag_unit_at_a_time is set at optimization level ‘-O2’ and above. The flag can also be set with the option -funit-at-a-time, which [6] describes as follows:
Parse the whole compilation unit before starting to produce code. This allows some extra optimizations to take place but consumes more memory (in general). There are some compatibility issues with unit-at-a-time mode: enabling unit-at-a-time mode may change the order in which functions, variables, and top-level asm statements are emitted, and will likely break code relying on some particular ordering. The majority of such top-level asm statements, though, can be replaced by section attributes. The ‘-fno-toplevel-reorder’ option may be used to keep the ordering used in the input file, at the cost of some optimizations. unit-at-a-time mode removes unreferenced static variables and functions. This may result in undefined references when an asm statement refers directly to variables or functions that are otherwise unused. In that case either the variable/function shall be listed as an operand of the asm statement operand or, in the case of top-level asm statements the attribute used shall be used on the declaration. Static functions now can use non-standard passing conventions that may break asm statements calling functions directly. Again, attribute used will prevent this behavior. As a temporary workaround, ‘-fno-unit-at-a-time’ can be used, but this scheme may not be supported by future releases of GCC.
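As a small illustration of the advice in the passage above, the sketch below marks a file-local variable and function with GCC's used attribute so that they are kept even when nothing in the C code refers to them. The names table_for_asm and only_called_from_asm are hypothetical, and the top-level asm statement that would reference them is left as a comment because its text depends on the target.

```c
#include <assert.h>

/* Hypothetical data that only a top-level asm statement would reference.
   The `used' attribute asks GCC to emit it even though no C code reads it.  */
static const int table_for_asm[2] __attribute__ ((used)) = { 1, 2 };

/* Hypothetical function that only a top-level asm statement would call.
   Without the attribute, unit-at-a-time mode may drop it as unreferenced.  */
static int __attribute__ ((used))
only_called_from_asm (int x)
{
  assert (x >= 0);   /* illustrative precondition, not part of the idiom */
  return x + 1;
}

/* A target-specific top-level asm statement that refers to the two
   symbols above would go here; it is omitted in this sketch.  */
```

The attribute is placed on the declaration itself, which is the form the quoted text recommends for top-level asm statements.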
finish_file (continued)
2850 if (flag_unit_at_a_time )
2851 {
2852 cgraph_finalize_compilation_unit ();
2853 cgraph_optimize ();
2854 }
To optimize, the compiler needs to know how the pieces of code relate to one another; it can then strip out redundancy on the basis of those relationships. The function below analyzes the translation unit and builds a graph describing the dependences, the data flow and the control flow.
368 void
369 cgraph_finalize_compilation_unit (void) in cgraphunit.c
370 {
371 struct cgraph_node *node;
372
373 if (!flag_unit_at_a_time )
374 {
375 cgraph_assemble_pending_functions ();
376 return ;
377 }
378
379 cgraph_varpool_assemble_pending_decls ();
380 if (!quiet_flag)
381 fprintf (stderr, "\nAnalyzing compilation unit\n");
Recall that a declaration visible outside the translation unit is recorded in the queue cgraph_varpool_nodes_queue. If this queue is not empty at this point, the declarations it holds should normally already have had their assembly generated; any that have not are emitted here.
612 bool
613 cgraph_varpool_assemble_pending_decls (void) in cgraph.c
614 {
615 bool changed = false;
616
617 while (cgraph_varpool_nodes_queue )
618 {
619 tree decl = cgraph_varpool_nodes_queue ->decl;
620 struct cgraph_varpool_node *node = cgraph_varpool_nodes_queue ;
621
622 cgraph_varpool_nodes_queue = cgraph_varpool_nodes_queue ->next_needed;
623 if (!TREE_ASM_WRITTEN (decl))
624 {
625 assemble_variable (decl, 0, 1, 0);
626 changed = true;
627 }
628 node->next_needed = NULL;
629 }
630 return changed;
631 }
At line 623 above, TREE_ASM_WRITTEN on a VAR_DECL is nonzero if assembly code has already been written for it, in which case nothing needs to be done; otherwise assemble_variable is invoked. Note that this function should not emit assembly for automatic variables.
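The drain loop above can be modelled in isolation. The following is a minimal sketch, not GCC code: struct pending_node and drain_pending are hypothetical stand-ins for cgraph_varpool_node and cgraph_varpool_assemble_pending_decls, the asm_written flag plays the role of TREE_ASM_WRITTEN, and "emitting" is reduced to bumping a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for cgraph_varpool_node: only the fields the
   drain loop touches.  */
struct pending_node
{
  struct pending_node *next_needed;  /* next entry in the pending queue */
  bool asm_written;                  /* plays the role of TREE_ASM_WRITTEN */
  int emit_count;                    /* how many times the node was "emitted" */
};

/* Pop every node off *QUEUE, emitting those not yet written.
   Return true if at least one node was emitted.  */
static bool
drain_pending (struct pending_node **queue)
{
  bool changed = false;

  assert (queue != NULL);
  while (*queue)
    {
      struct pending_node *node = *queue;

      *queue = node->next_needed;    /* advance before touching the node */
      if (!node->asm_written)
        {
          node->asm_written = true;  /* stands in for assemble_variable */
          node->emit_count++;
          changed = true;
        }
      node->next_needed = NULL;      /* the node is no longer queued */
    }
  return changed;
}
```

Because the written flag is checked per node, running the drain a second time on the same nodes emits nothing and returns false.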
In this invocation the argument dont_output_data is 0, which means the initial value is actually output. Below, at line 1344, the hook prepare_assemble_variable is NULL for every front end in this version.
1335 void
1336 assemble_variable (tree decl, int top_level ATTRIBUTE_UNUSED, in varasm.c
1337 int at_end ATTRIBUTE_UNUSED, int dont_output_data)
1338 {
1339 const char *name;
1340 unsigned int align;
1341 int reloc = 0;
1342 rtx decl_rtl;
1343
1344 if (lang_hooks .decls.prepare_assemble_variable)
1345 (*lang_hooks .decls.prepare_assemble_variable) (decl);
1346
1347 last_assemble_variable_decl = 0;
1348
1349 /* Normally no need to say anything here for external references,
1350 since assemble_external is called by the language-specific code
1351 when a declaration is first seen. */
1352
1353 if (DECL_EXTERNAL (decl))
1354 return ;
1355
1356 /* Output no assembler code for a function declaration.
1357 Only definitions of functions output anything. */
1358
1359 if (TREE_CODE (decl) == FUNCTION_DECL)
1360 return ;
1361
1362 /* Do nothing for global register variables. */
1363 if (DECL_RTL_SET_P (decl) && GET_CODE (DECL_RTL (decl)) == REG)
1364 {
1365 TREE_ASM_WRITTEN (decl) = 1;
1366 return ;
1367 }
1368
1369 /* If type was incomplete when the variable was declared,
1370 see if it is complete now. */
1371
1372 if (DECL_SIZE (decl) == 0)
1373 layout_decl (decl, 0);
1374
1375 /* Still incomplete => don't allocate it; treat the tentative defn
1376 (which is what it must have been) as an `extern' reference. */
1377
1378 if (!dont_output_data && DECL_SIZE (decl) == 0)
1379 {
1380 error ("%Jstorage size of `%D' isn't known", decl, decl);
1381 TREE_ASM_WRITTEN (decl) = 1;
1382 return ;
1383 }
1384
1385 /* The first declaration of a variable that comes through this function
1386 decides whether it is global (in C, has external linkage)
1387 or local (in C, has internal linkage). So do nothing more
1388 if this function has already run. */
1389
1390 if (TREE_ASM_WRITTEN (decl))
1391 return ;
1392
1393 /* Make sure targetm.encode_section_info is invoked before we set
1394 ASM_WRITTEN. */
1395 decl_rtl = DECL_RTL (decl);
1396
1397 TREE_ASM_WRITTEN (decl) = 1;
1398
1399 /* Do no output if -fsyntax-only. */
1400 if (flag_syntax_only )
1401 return ;
A node must be complete before RTL nodes can be generated for it from its initial value. In the front end, completeness is checked through the node's size: every node created by make_node starts with size 0, and the size is calculated and set only in layout_decl.
Then, at line 1395, the macro DECL_RTL generates the rtx nodes for the VAR_DECL, and the variable decl_rtl refers to the resulting rtx MEM node (in this case the VAR_DECL must not be declared as register).
The code snippet below then begins configuring, within the MEM node, the memory allocated for the VAR_DECL.
assemble_variable (continued)
1403 app_disable ();
1404
1405 if (! dont_output_data
1406 && ! host_integerp (DECL_SIZE_UNIT (decl), 1))
1407 {
1408 error ("%Jsize of variable '%D' is too large", decl, decl);
1409 return ;
1410 }
1411
1412 name = XSTR (XEXP (decl_rtl, 0), 0);
1413 if (TREE_PUBLIC (decl) && DECL_NAME (decl))
1414 notice_global_symbol (decl);
1415
1416 /* Compute the alignment of this data. */
1417
1418 align = DECL_ALIGN (decl);
1419
1420 /* In the case for initialing an array whose length isn't specified,
1421 where we have not yet been able to do the layout,
1422 figure out the proper alignment now. */
1423 if (dont_output_data && DECL_SIZE (decl) == 0
1424 && TREE_CODE (TREE_TYPE (decl)) == ARRAY_TYPE)
1425 align = MAX (align, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (decl))));
1426
1427 /* Some object file formats have a maximum alignment which they support.
1428 In particular, a.out format supports a maximum alignment of 4. */
1429 #ifndef MAX_OFILE_ALIGNMENT
1430 #define MAX_OFILE_ALIGNMENT BIGGEST_ALIGNMENT
1431 #endif
1432 if (align > MAX_OFILE_ALIGNMENT)
1433 {
1434 warning ("%Jalignment of '%D' is greater than maximum object "
1435 "file alignment. Using %d", decl, decl,
1436 MAX_OFILE_ALIGNMENT/BITS_PER_UNIT);
1437 align = MAX_OFILE_ALIGNMENT;
1438 }
1439
1440 /* On some machines, it is good to increase alignment sometimes. */
1441 if (! DECL_USER_ALIGN (decl))
1442 {
1443 #ifdef DATA_ALIGNMENT
1444 align = DATA_ALIGNMENT (TREE_TYPE (decl), align);
1445 #endif
1446 #ifdef CONSTANT_ALIGNMENT
1447 if (DECL_INITIAL (decl) != 0 && DECL_INITIAL (decl) != error_mark_node)
1448 align = CONSTANT_ALIGNMENT (DECL_INITIAL (decl), align);
1449 #endif
1450 }
1451
1452 /* Reset the alignment in case we have made it tighter, so we can benefit
1453 from it in get_pointer_alignment. */
1454 DECL_ALIGN (decl) = align;
1455 set_mem_align (decl_rtl, align);
1456
1457 if (TREE_PUBLIC (decl))
1458 maybe_assemble_visibility (decl);
First, some background on GNU assembler syntax: if the first line of an input file is #NO_APP, or if the `-f' option is used, whitespace and comments are not removed from the input file. Within an input file, you can ask for whitespace and comment removal in specific portions by putting a line that says #APP before the text that may contain whitespace or comments, and a line that says #NO_APP after this text. This feature is mainly intended to support asm statements in compilers whose output is otherwise free of comments and whitespace. In C++ it is mostly used with embedded assembler, and GCC automatically inserts #APP at the head of an embedded assembler block that lies outside function scope. So at line 1403, if #APP is in use, app_disable first closes it with #NO_APP before the assembly for the variable is emitted.
284 void
285 app_disable (void) in final.c
286 {
287 if (app_on )
288 {
289 fputs (ASM_APP_OFF, asm_out_file );
290 app_on = 0;
291 }
292 }
The global variable app_on is nonzero while #APP is in use; ASM_APP_OFF is defined as “#NO_APP” for most targets (and as the empty string for a few others).
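The toggle can be pictured with a small model. This is only a sketch under stated assumptions: struct asm_buffer, sketch_emit, sketch_app_enable and sketch_app_disable are hypothetical names, an in-memory buffer replaces asm_out_file, and the marker strings "#APP\n" and "#NO_APP\n" are assumed values for the target macros.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical in-memory replacement for asm_out_file.  */
struct asm_buffer
{
  char text[128];
};

static int sketch_app_on;   /* nonzero while an #APP region is open */

/* Append S to the buffer; the assert guards against overflow.  */
static void
sketch_emit (struct asm_buffer *out, const char *s)
{
  assert (strlen (out->text) + strlen (s) < sizeof out->text);
  strcat (out->text, s);
}

/* Open an #APP region unless one is already open.  */
static void
sketch_app_enable (struct asm_buffer *out)
{
  if (!sketch_app_on)
    {
      sketch_emit (out, "#APP\n");      /* assumed marker text */
      sketch_app_on = 1;
    }
}

/* Close the #APP region if one is open; otherwise do nothing.  */
static void
sketch_app_disable (struct asm_buffer *out)
{
  if (sketch_app_on)
    {
      sketch_emit (out, "#NO_APP\n");   /* assumed marker text */
      sketch_app_on = 0;
    }
}
```

Since both functions check the flag first, calling the disable function repeatedly, or when no region is open, writes nothing, which is why assemble_variable can call app_disable unconditionally.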
Next, at line 1414, notice_global_symbol records the name of the first global variable in the translation unit, which will be used to generate the internal name for anonymous namespaces. The last part of the code above determines the alignment of the memory; for the Linux/x86 target, MAX_OFILE_ALIGNMENT is still undefined at line 1429 and is therefore set to BIGGEST_ALIGNMENT (defined as 32).
As we have seen, DECL_USER_ALIGN of the node is false unless the declaration explicitly uses an attribute to request a special alignment. Commonly, the target defines DATA_ALIGNMENT to increase the alignment of medium-size data so that it fits in fewer cache lines. The target also defines CONSTANT_ALIGNMENT to make string constants word-aligned, so that `strcpy' calls that copy constants can be done inline. The alignment determined this way is then stored in the rtx MEM node, where it guides the later emission of assembly.
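The order of these decisions can be sketched as a plain function. Everything here is illustrative: the SKETCH_ constants, sketch_data_alignment and choose_alignment are hypothetical, the cap of 256 bits is an arbitrary assumption, and the "objects of 32 bytes or more get at least 128 bits" rule merely stands in for whatever policy a target's DATA_ALIGNMENT implements.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_BITS_PER_UNIT 8
#define SKETCH_MAX_OFILE_ALIGNMENT 256   /* assumed cap, in bits */

/* Hypothetical stand-in for a target's DATA_ALIGNMENT policy: raise
   the alignment of objects of 32 bytes or more to at least 128 bits.  */
static unsigned int
sketch_data_alignment (unsigned int size_bytes, unsigned int align)
{
  if (size_bytes >= 32 && align < 128)
    return 128;
  return align;
}

/* Mirror the order used in assemble_variable: start from the declared
   alignment, clamp it to the object-file maximum, then let the target
   raise it unless the user asked for a specific alignment.  */
static unsigned int
choose_alignment (unsigned int decl_align, unsigned int size_bytes,
                  bool user_align)
{
  unsigned int align = decl_align;

  assert (align >= SKETCH_BITS_PER_UNIT);
  if (align > SKETCH_MAX_OFILE_ALIGNMENT)
    align = SKETCH_MAX_OFILE_ALIGNMENT;
  if (!user_align)
    align = sketch_data_alignment (size_bytes, align);
  return align;
}
```

Under these assumed rules, a 64-byte object declared with byte alignment ends up with 128-bit alignment, while the same object with a user-specified alignment keeps the value it was given.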
1822 void
1823 set_mem_align (rtx mem, unsigned int align) in emit-rtl.c
1824 {
1825 MEM_ATTRS (mem) = get_mem_attrs (MEM_ALIAS_SET (mem), MEM_EXPR (mem),
1826 MEM_OFFSET (mem), MEM_SIZE (mem), align,
1827 GET_MODE (mem));
1828 }
The macro MEM_ATTRS extracts the rtmem field from the rtx; this field has the following definition.
99 typedef struct mem_attrs GTY(()) in rtl.h
100 {
101 HOST_WIDE_INT alias; /* Memory alias set. */
102 tree expr; /* expr corresponding to MEM. */
103 rtx offset; /* Offset from start of DECL, as CONST_INT. */
104 rtx size; /* Size in bytes, as a CONST_INT. */
105 unsigned int align; /* Alignment of MEM in bits. */
106 } mem_attrs;
Above, MEM_ALIAS_SET, MEM_EXPR, MEM_OFFSET, and MEM_SIZE extract the contents of the corresponding fields of the structure (0 is returned if the rtmem field is empty; recall that gen_rtx_MEM, called from make_decl_rtl, places a NULL pointer in the rtmem field). So the net effect of set_mem_align is to update the align field with the help of get_mem_attrs.
291 static mem_attrs *
292 get_mem_attrs (HOST_WIDE_INT alias, tree expr, rtx offset, rtx size, in emit-rtl.c
293 unsigned int align, enum machine_mode mode)
294 {
295 mem_attrs attrs;
296 void **slot;
297
298 /* If everything is the default, we can just return zero.
299 This must match what the corresponding MEM_* macros return when the
300 field is not present. */
301 if (alias == 0 && expr == 0 && offset == 0
302 && (size == 0
303 || (mode != BLKmode && GET_MODE_SIZE (mode) == INTVAL (size)))
304 && (STRICT_ALIGNMENT && mode != BLKmode
305 ? align == GET_MODE_ALIGNMENT (mode) : align == BITS_PER_UNIT))
306 return 0;
307
308 attrs.alias = alias;
309 attrs.expr = expr;
310 attrs.offset = offset;
311 attrs.size = size;
312 attrs.align = align;
313
314 slot = htab_find_slot (mem_attrs_htab , &attrs, INSERT);
315 if (*slot == 0)
316 {
317 *slot = ggc_alloc (sizeof (mem_attrs));
318 memcpy (*slot, &attrs, sizeof (mem_attrs));
319 }
320
321 return *slot;
322 }
All instances of mem_attrs are cached in the hash table mem_attrs_htab to speed up lookup, and every memory attribute set is a singleton. Normally, declarations of the same type share the same memory attribute set.
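The singleton behaviour can be illustrated with a reduced model. This is a sketch, not the GCC implementation: struct sketch_attrs keeps only two of the five fields, a small fixed array with linear search replaces the hash table, and the names sketch_table and intern_attrs are hypothetical. The default case (alias 0, byte alignment) is represented by a null pointer, in the spirit of the early return in get_mem_attrs.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced, hypothetical version of mem_attrs.  */
struct sketch_attrs
{
  long alias;          /* alias set */
  unsigned int align;  /* alignment in bits */
};

#define SKETCH_TABLE_MAX 16
static struct sketch_attrs sketch_table[SKETCH_TABLE_MAX];
static size_t sketch_table_len;

/* Return the unique shared record for (ALIAS, ALIGN), creating it on
   first use.  The all-default combination is represented by NULL.  */
static const struct sketch_attrs *
intern_attrs (long alias, unsigned int align)
{
  size_t i;

  if (alias == 0 && align == 8)
    return NULL;

  /* Linear search stands in for htab_find_slot.  */
  for (i = 0; i < sketch_table_len; i++)
    if (sketch_table[i].alias == alias && sketch_table[i].align == align)
      return &sketch_table[i];

  assert (sketch_table_len < SKETCH_TABLE_MAX);
  sketch_table[sketch_table_len].alias = alias;
  sketch_table[sketch_table_len].align = align;
  return &sketch_table[sketch_table_len++];
}
```

Two requests with equal field values return the same pointer, so attribute sets can be compared by address; a request with different values yields a different record.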
Back in assemble_variable, at line 1458, maybe_assemble_visibility only handles declarations that carry the “visibility” attribute. The function invokes a back-end hook to output the name with the described visibility. We skip its handling here; [6] gives a detailed description of this attribute.