Error recovering is a serious and quite diffcult topic. When encountering error, how to make parser resume the parsing upon valid tokens after the error is always not an easy task. In the front-end of C++, tentative parser offers an elegant way of recovery when we find error in the tried parsing and the roll-back makes another try possible. But when all possible are tried and none success, we have to abandon tokens until the parser can resume. In front-end for C++, there are several functions for this purpose. The first function cp_parser_skip_to_closing_parenthesis will skip tokens until find unnested closing parethesis (“)”) or unnested comma (if or_comma is true) or semicolon or the end of file.
2024 static int
2025 cp_parser_skip_to_closing_parenthesis (cp_parser *parser, in parser.c
2026 bool recovering,
2027 bool or_comma,
2028 bool consume_paren)
2029 {
2030 unsigned paren_depth = 0;
2031 unsigned brace_depth = 0;
2032
2033 if (recovering && !or_comma && cp_parser_parsing_tentatively (parser)
2034 && !cp_parser_committed_to_tentative_parse (parser))
2035 return 0;
2036
2037 while (true)
2038 {
2039 cp_token *token;
2040
2041 /* If we've run out of tokens, then there is no closing `)'. */
2042 if (cp_lexer_next_token_is (parser->lexer, CPP_EOF))
2043 return 0;
2044
2045 token = cp_lexer_peek_token (parser->lexer);
2046
2047 /* This matches the processing in skip_to_end_of_statement. */
2048 if (token->type == CPP_SEMICOLON && !brace_depth)
2049 return 0;
2050 if (token->type == CPP_OPEN_BRACE)
2051 ++brace_depth;
2052 if (token->type == CPP_CLOSE_BRACE)
2053 {
2054 if (!brace_depth--)
2055 return 0;
2056 }
2057 if (recovering && or_comma && token->type == CPP_COMMA
2058 && !brace_depth && !paren_depth)
2059 return -1;
2060
2061 if (!brace_depth)
2062 {
2063 /* If it is an `(', we have entered another level of nesting. */
2064 if (token->type == CPP_OPEN_PAREN)
2065 ++paren_depth;
2066 /* If it is a `)', then we might be done. */
2067 else if (token->type == CPP_CLOSE_PAREN && !paren_depth--)
2068 {
2069 if (consume_paren)
2070 cp_lexer_consume_token (parser->lexer);
2071 return 1;
2072 }
2073 }
2074
2075 /* Consume the token. */
2076 cp_lexer_consume_token (parser->lexer);
2077 }
2078 }
From above code, the so-called unnested, given in example, like “{ .. ( .. ( .., ); .. ), ( ..;);}”, those in bold and underline are unnested. Parameter recovering if is true, indicates we are doing error recovery. And if or_comma is false, means we ignore unnested comma, that is we skip till seeing unnested semicolon, closing parenthese, or closing brace, in which semicolon and closing brace both meaning the end of statement. This way, if before we are within tentative parse, at beginning, the front-end would push a node for saving possible deferred access control checkings in this parse into deferred access control checking stack, before this error recovery (ignoring unnested comma), the tentative parse must have been stopped (via calling cp_parser_parse_definitely, but it would only remove current context, not guarantee to clear the context list); otherwise we haven’t chance to restore the deferred access control stack. Condition “cp_parser_parsing_tentatively (parser) && cp_parser_committed_to_tentative_parse (parser)” is held for stopped tentative parse or untentative parse. If the condition at line 2033 unheld, we’d better leave the job to the front-end.
Can image more often cp_parser_skip_to_closing_parenthesis skips to closing parenthesis. To definitly skip to the end of a statement, it can use cp_parser_skip_to_end_of_statement. Here it doesn’t check if tentative parse has been stopped. It is because, at invoking this function, the parser can’t parse the statement, it should be ran unconditionally, and usually at this time, the front-end has stopped the tentavtive parse.
2084 static void
2085 cp_parser_skip_to_end_of_statement (cp_parser* parser) in parser.c
2086 {
2087 unsigned nesting_depth = 0;
2088
2089 while (true)
2090 {
2091 cp_token *token;
2092
2093 /* Peek at the next token. */
2094 token = cp_lexer_peek_token (parser->lexer);
2095 /* If we've run out of tokens, stop. */
2096 if (token->type == CPP_EOF)
2097 break;
2098 /* If the next token is a `;', we have reached the end of the
2099 statement. */
2100 if (token->type == CPP_SEMICOLON && !nesting_depth)
2101 break;
2102 /* If the next token is a non-nested `}', then we have reached
2103 the end of the current block. */
2104 if (token->type == CPP_CLOSE_BRACE)
2105 {
2106 /* If this is a non-nested `}', stop before consuming it.
2107 That way, when confronted with something like:
2108
2109 { 3 + }
2110
2111 we stop before consuming the closing `}', even though we
2112 have not yet reached a `;'. */
2113 if (nesting_depth == 0)
2114 break;
2115 /* If it is the closing `}' for a block that we have
2116 scanned, stop -- but only after consuming the token.
2117 That way given:
2118
2119 void f g () { ... }
2120 typedef int I;
2121
2122 we will stop after the body of the erroneously declared
2123 function, but before consuming the following `typedef'
2124 declaration. */
2125 if (--nesting_depth == 0)
2126 {
2127 cp_lexer_consume_token (parser->lexer);
2128 break;
2129 }
2130 }
2131 /* If it the next token is a `{', then we are entering a new
2132 block. Consume the entire block. */
2133 else if (token->type == CPP_OPEN_BRACE)
2134 ++nesting_depth;
2135 /* Consume the token. */
2136 cp_lexer_consume_token (parser->lexer);
2137 }
2138 }
Next cp_parser_skip_to_end_of_block_or_statement resembles, but it aims for block like DO..WHILE loop which should be ended by semicolon So only unnested semicolon is considered, besides unnested closing brace.
2162 static void
2163 cp_parser_skip_to_end_of_block_or_statement (cp_parser* parser) in parser.c
2164 {
2165 unsigned nesting_depth = 0;
2166
2167 while (true)
2168 {
2169 cp_token *token;
2170
2171 /* Peek at the next token. */
2172 token = cp_lexer_peek_token (parser->lexer);
2173 /* If we've run out of tokens, stop. */
2174 if (token->type == CPP_EOF)
2175 break;
2176 /* If the next token is a `;', we have reached the end of the
2177 statement. */
2178 if (token->type == CPP_SEMICOLON && !nesting_depth)
2179 {
2180 /* Consume the `;'. */
2181 cp_lexer_consume_token (parser->lexer);
2182 break;
2183 }
2184 /* Consume the token. */
2185 token = cp_lexer_consume_token (parser->lexer);
2186 /* If the next token is a non-nested `}', then we have reached
2187 the end of the current block. */
2188 if (token->type == CPP_CLOSE_BRACE
2189 && (nesting_depth == 0 || --nesting_depth == 0))
2190 break;
2191 /* If it the next token is a `{', then we are entering a new
2192 block. Consume the entire block. */
2193 if (token->type == CPP_OPEN_BRACE)
2194 ++nesting_depth;
2195 }
2196 }
Not suprising, there is a function especially for statement like FOR block which is not ended by semicolon.
2201 static void
2202 cp_parser_skip_to_closing_brace (cp_parser *parser) in parser.c
2203 {
2204 unsigned nesting_depth = 0;
2205
2206 while (true)
2207 {
2208 cp_token *token;
2209
2210 /* Peek at the next token. */
2211 token = cp_lexer_peek_token (parser->lexer);
2212 /* If we've run out of tokens, stop. */
2213 if (token->type == CPP_EOF)
2214 break;
2215 /* If the next token is a non-nested `}', then we have reached
2216 the end of the current block. */
2217 if (token->type == CPP_CLOSE_BRACE && nesting_depth-- == 0)
2218 break;
2219 /* If it the next token is a `{', then we are entering a new
2220 block. Consume the entire block. */
2221 else if (token->type == CPP_OPEN_BRACE)
2222 ++nesting_depth;
2223 /* Consume the token. */
2224 cp_lexer_consume_token (parser->lexer);
2225 }
2226 }
There are cases these methods may abandon too many tokens or even the method is inappropriate, for example when handling a template declaration, and an error occurs within the angle braces pair “< >”, it is not correct to advance to closing parenthesis, but advance to end of statement has too many tokens lost. It is preferred to discard tokens till seeing “>”. To accommodate such cases, cp_parser_skip_until_found will stop at specified token.
15109 static void
15110 cp_parser_skip_until_found (cp_parser* parser, in parser.c
15111 enum cpp_ttype type,
15112 const char* token_desc)
15113 {
15114 cp_token *token;
15115 unsigned nesting_depth = 0;
15116
15117 if (cp_parser_require (parser, type, token_desc))
15118 return;
15119
15120 /* Skip tokens until the desired token is found. */
15121 while (true)
15122 {
15123 /* Peek at the next token. */
15124 token = cp_lexer_peek_token (parser->lexer);
15125 /* If we've reached the token we want, consume it and
15126 stop. */
15127 if (token->type == type && !nesting_depth)
15128 {
15129 cp_lexer_consume_token (parser->lexer);
15130 return;
15131 }
15132 /* If we've run out of tokens, stop. */
15133 if (token->type == CPP_EOF)
15134 return;
15135 if (token->type == CPP_OPEN_BRACE
15136 || token->type == CPP_OPEN_PAREN
15137 || token->type == CPP_OPEN_SQUARE)
15138 ++nesting_depth;
15139 else if (token->type == CPP_CLOSE_BRACE
15140 || token->type == CPP_CLOSE_PAREN
15141 || token->type == CPP_CLOSE_SQUARE)
15142 {
15143 if (nesting_depth-- == 0)
15144 return;
15145 }
15146 /* Consume this token. */
15147 cp_lexer_consume_token (parser->lexer);
15148 }
15149 }
Of course, when finding tokens that may mark the end of statement, it should return to be conservative.
Tens of helpers are defined for parser, they help parser to retrieve and verify tokens. Routine cp_lexer_next_token_is can tell whether the token is of expected type.
667 static bool
668 cp_lexer_next_token_is (cp_lexer* lexer, enum cpp_ttype type) in parser.c
669 {
670 cp_token *token;
671
672 /* Peek at the next token. */
673 token = cp_lexer_peek_token (lexer);
674 /* Check to see if it has the indicated TYPE. */
675 return token->type == type;
676 }
Routine cp_parser_require, if finds the token is the expected type, consumes the token; otherwise issues error message.
15085 static cp_token *
15086 cp_parser_require (cp_parser* parser, in parser.c
15087 enum cpp_ttype type,
15088 const char* token_desc)
15089 {
15090 if (cp_lexer_next_token_is (parser->lexer, type))
15091 return cp_lexer_consume_token (parser->lexer);
15092 else
15093 {
15094 /* Output the MESSAGE -- unless we're parsing tentatively. */
15095 if (!cp_parser_simulate_error (parser))
15096 {
15097 char *message = concat ("expected ", token_desc, NULL);
15098 cp_parser_error (parser, message);
15099 free (message);
15100 }
15101 return NULL;
15102 }
15103 }
Routine cp_parser_require_keyword just like cp_parser_require, but expects the coming token is one of keywords of the language,
15157 static cp_token *
15158 cp_parser_require_keyword (cp_parser* parser, in parser.c
15159 enum rid keyword,
15160 const char* token_desc)
15161 {
15162 cp_token *token = cp_parser_require (parser, CPP_KEYWORD, token_desc);
15163
15164 if (token && token->keyword != keyword)
15165 {
15166 dyn_string_t error_msg;
15167
15168 /* Format the error message. */
15169 error_msg = dyn_string_new (0);
15170 dyn_string_append_cstr (error_msg, "expected ");
15171 dyn_string_append_cstr (error_msg, token_desc);
15172 cp_parser_error (parser, error_msg->s);
15173 dyn_string_delete (error_msg);
15174 return NULL;
15175 }
15176
15177 return token;
15178 }
Another similar function is cp_lexer_next_token_is_keyword.
688 static bool
689 cp_lexer_next_token_is_keyword (cp_lexer* lexer, enum rid keyword) in parser.c
690 {
691 cp_token *token;
692
693 /* Peek at the next token. */
694 token = cp_lexer_peek_token (lexer);
695 /* Check to see if it is the indicated keyword. */
696 return token->keyword == keyword;
697 }
Routine cp_parser_identifier then expects token of identifier instead.
2299 static tree
2300 cp_parser_identifier (cp_parser* parser) in parser.c
2301 {
2302 cp_token *token;
2303
2304 /* Look for the identifier. */
2305 token = cp_parser_require (parser, CPP_NAME, "identifier");
2306 /* Return the value. */
2307 return token ? token->value : error_mark_node;
2308 }