The grammar of the language can be drawn as a tree (of course, where a parent recurs as its own child node the recursion needs special treatment; most such nodes are highlighted in red in the figures). The root and starting point of the tree is translation-unit. From the figures below, we can see that building a parser for C++ is not an easy task. What’s more, the grammar given by the standard is left recursive in nature. Left recursion suits a bottom-up LR parser, but a top-down parser cannot use it directly, so such rules must first be rewritten as right recursion or iteration. Even so, the C++ grammar is ambiguous in places, and building an LR(1) parser for it with a parser generator is still very difficult. So GCC adopts a hand-written, tentative LL(n) parser, that is, one that supports backtracking. No doubt reading through the parser is a long journey. Now let’s go!
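To make the idea of a tentative parse concrete, here is a minimal sketch in C. It is not GCC's code: the token kinds, the toy_parser struct, and every toy_ function are hypothetical, and the toy grammar (a statement is either ID ID ; or ID ;) is an assumption chosen only to force a backtrack. The parser remembers its position in the token stream, tries one alternative, and rewinds if that alternative fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical token kinds for a toy grammar; not GCC's token set. */
enum toy_token { TOK_ID, TOK_SEMI, TOK_EOF };

/* Hypothetical parser state: a TOK_EOF-terminated array and a cursor. */
struct toy_parser {
  const enum toy_token *tokens;
  size_t pos;
};

/* Consume the next token if it has the expected kind.
   TOK_EOF is never consumed, so the cursor cannot run off the array. */
static bool toy_accept(struct toy_parser *p, enum toy_token kind)
{
  assert(p != NULL);
  if (p->tokens[p->pos] != kind)
    return false;
  if (kind != TOK_EOF)
    p->pos++;
  return true;
}

/* Alternative 1: ID ID ';' (think "type name;"). */
static bool toy_parse_declaration(struct toy_parser *p)
{
  return toy_accept(p, TOK_ID) && toy_accept(p, TOK_ID)
         && toy_accept(p, TOK_SEMI);
}

/* Alternative 2: ID ';' (think "expression;"). */
static bool toy_parse_expression_statement(struct toy_parser *p)
{
  return toy_accept(p, TOK_ID) && toy_accept(p, TOK_SEMI);
}

/* Try alternative 1 tentatively; on failure rewind and try alternative 2. */
static bool toy_parse_statement(struct toy_parser *p)
{
  size_t saved = p->pos;              /* start of the tentative parse */
  if (toy_parse_declaration(p))
    return true;                      /* commit: keep the consumed tokens */
  p->pos = saved;                     /* roll back and try the next rule */
  if (toy_parse_expression_statement(p))
    return true;
  p->pos = saved;                     /* nothing matched: consume nothing */
  return false;
}
```

Calling toy_parse_statement on the tokens ID ; first consumes the ID as if a declaration were starting, fails on the semicolon, rewinds to position 0, and then succeeds with the second alternative.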
Figure 43: syntax tree for declarations
Figure 44: syntax tree for class
Figure 45: syntax tree for statements
The main body of c_parse_file is cp_parser_translation_unit. According to [3], the abbreviated syntax tree for translation-unit is:
translation-unit
Ⅼ declaration-seq [opt] —— declaration-seq declaration
Ⅼ declaration
In this simple view, a translation-unit just contains a sequence of declarations. In C++, a translation-unit usually consists of a source file together with the header files it includes. From each translation-unit the compiler generates an object file, and the linker then joins these object files together to form the target file.
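The left-recursive rule declaration-seq: declaration-seq declaration cannot be coded directly as a recursive-descent function, because the function would call itself before consuming any token. A hand-written parser typically turns such a rule into a loop. The following is a minimal sketch in C of that rewrite, not GCC's implementation: the mini_ names and token kinds are hypothetical, and a whole declaration is reduced to a single MINI_DECL token.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical token kinds; one MINI_DECL stands for a whole declaration. */
enum mini_token { MINI_DECL, MINI_JUNK, MINI_EOF };

/* Hypothetical parser state: a MINI_EOF-terminated array, a cursor,
   and a count of the declarations parsed so far. */
struct mini_parser {
  const enum mini_token *tokens;
  size_t pos;
  size_t ndecls;
};

/* declaration: here just one MINI_DECL token. */
static bool mini_declaration(struct mini_parser *p)
{
  if (p->tokens[p->pos] != MINI_DECL)
    return false;
  p->pos++;
  p->ndecls++;
  return true;
}

/* declaration-seq [opt]: the left recursion becomes plain iteration. */
static void mini_declaration_seq_opt(struct mini_parser *p)
{
  while (mini_declaration(p))
    ;
}

/* translation-unit: declaration-seq [opt], then end of file. */
static bool mini_translation_unit(struct mini_parser *p)
{
  assert(p != NULL);
  mini_declaration_seq_opt(p);
  return p->tokens[p->pos] == MINI_EOF;
}
```

An empty token stream is accepted because the sequence is optional, while a stray token after the declarations leaves the cursor short of the end of file and the function reports failure.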
2319   static bool
2320   cp_parser_translation_unit (cp_parser* parser)                  in parser.c
2321   {
2322     while (true)
2323       {
2324         cp_parser_declaration_seq_opt (parser);
2325
2326         /* If there are no tokens left then all went well. */
2327         if (cp_lexer_next_token_is (parser->lexer, CPP_EOF))
2328           break;
2329
2330         /* Otherwise, issue an error message. */
2331         cp_parser_error (parser, "expected declaration");
2332         return false;
2333       }
2334
2335     /* Consume the EOF token. */
2336     cp_parser_require (parser, CPP_EOF, "end-of-file");
2337
2338     /* Finish up. */
2339     finish_translation_unit ();
2340
2341     /* All went well. */
2342     return true;
2343   }
Note the WHILE loop above: its body runs at most once. A successful parse breaks out at line 2328 and continues at line 2336, while a failed parse returns at line 2332. Written this way, the code avoids using “goto”.