JavaCC Simple Grammar

options {
  LOOKAHEAD = 1;
  CHOICE_AMBIGUITY_CHECK = 2;
  OTHER_AMBIGUITY_CHECK = 1;
  STATIC = true;
  DEBUG_PARSER = false;
  DEBUG_LOOKAHEAD = false;
  DEBUG_TOKEN_MANAGER = false;
  ERROR_REPORTING = true;
  JAVA_UNICODE_ESCAPE = false;
  UNICODE_INPUT = false;
  IGNORE_CASE = false;
  USER_TOKEN_MANAGER = false;
  USER_CHAR_STREAM = false;
  BUILD_PARSER = true;
  BUILD_TOKEN_MANAGER = true;
  SANITY_CHECK = true;
  FORCE_LA_CHECK = false;
}

PARSER_BEGIN(Simple1)

/** Simple brace matcher. */
public class Simple1 {

  /** Main entry point. */
  public static void main(String args[]) throws ParseException {
    Simple1 parser = new Simple1(System.in);
    parser.Input();
  }

}

PARSER_END(Simple1)

/** Root production. */
void Input() :
{}
{
  MatchedBraces() ("\n"|"\r")* <EOF>
}

/** Brace matching production. */
void MatchedBraces() :
{}
{
  "{" [ MatchedBraces() ] "}"
}

After carefully reading Simple1.jj and the README in JavaCC's SimpleExamples directory, I am only now beginning to understand JavaCC's simple grammar.

Following this is a list of productions.  In this example, there are two productions, that define the non-terminals "Input" and "MatchedBraces" respectively.  In JavaCC grammars, non-terminals are written and implemented (by JavaCC) as Java methods.  When the non-terminal is used on the left-hand side of a production, it is considered to be declared and its syntax follows the Java syntax.  On the right-hand side its use is similar to a method call in Java.

The corresponding code in Simple1.jj is:

void Input() :
{}
{
  MatchedBraces() ("\n"|"\r")* <EOF>
}

/** Brace matching production. */
void MatchedBraces() :
{}
{
  "{" [ MatchedBraces() ] "}"
}

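Put simply: the grammar has two productions, one for "Input" and one for "MatchedBraces". Each non-terminal behaves like a Java method: the left-hand side of a production declares it using Java method syntax, and a use on the right-hand side reads like a Java method call.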
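To make the "non-terminals are methods" idea concrete, here is a minimal hand-written sketch of the same two productions in plain Java. It is not the code JavaCC generates; the class name BraceMatcher, the helper expect, and the use of IllegalStateException for parse failures are all assumptions made for illustration.

```java
/** Hand-written analogue of the two Simple1 productions (not JavaCC output). */
final class BraceMatcher {
    private final String text;
    private int pos = 0;

    private BraceMatcher(String text) {
        this.text = text;
    }

    /** Returns true if the whole text matches the Input production. */
    static boolean matches(String text) {
        try {
            new BraceMatcher(text).input();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    // Input ::= MatchedBraces ( "\n" | "\r" )* <EOF>
    private void input() {
        matchedBraces(); // right-hand-side use of a non-terminal: a method call
        while (pos < text.length()
                && (text.charAt(pos) == '\n' || text.charAt(pos) == '\r')) {
            pos++;
        }
        if (pos != text.length()) {
            throw new IllegalStateException("expected end of input at " + pos);
        }
    }

    // MatchedBraces ::= "{" [ MatchedBraces ] "}"
    private void matchedBraces() {
        expect('{');
        if (pos < text.length() && text.charAt(pos) == '{') {
            matchedBraces(); // the optional nested expansion
        }
        expect('}');
    }

    // Consumes one expected character or fails (hypothetical helper).
    private void expect(char c) {
        if (pos >= text.length() || text.charAt(pos) != c) {
            throw new IllegalStateException("expected '" + c + "' at " + pos);
        }
        pos++;
    }
}
```

Calling BraceMatcher.matches("{{}}\n") accepts the input, while BraceMatcher.matches("{}{}") rejects it, because Input allows only one MatchedBraces before the line terminators.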


Each production defines its left-hand side non-terminal followed by a colon.  This is followed by a bunch of declarations and statements within braces (in both cases in the above example, there are no declarations and hence this appears as "{}") which are generated as common declarations and statements into the generated method.  This is then followed by a set of expansions also enclosed within braces.

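That is, each production names its left-hand-side non-terminal, followed by a colon. Next comes a brace-enclosed block of Java declarations and statements (neither production above has any, so the block is just "{}"), which JavaCC copies into the generated method.

After that comes a second brace-enclosed block that holds the expansions.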
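As a rough illustration of where the two brace-enclosed blocks end up, the sketch below hand-writes a variant of MatchedBraces that declares a local variable and returns the nesting depth. This is an assumption about the general shape only, not actual JavaCC output; the class BraceDepth, the variable nested, and the helper expect are hypothetical names.

```java
/** Hand-written sketch: a production method with a declaration block. */
final class BraceDepth {
    private final String text;
    private int pos = 0;

    private BraceDepth(String text) {
        this.text = text;
    }

    /** Returns the nesting depth of a fully matched brace string. */
    static int depthOf(String text) {
        BraceDepth parser = new BraceDepth(text);
        int depth = parser.matchedBraces();
        if (parser.pos != text.length()) {
            throw new IllegalStateException("unexpected input at " + parser.pos);
        }
        return depth;
    }

    private int matchedBraces() {
        // Plays the role of the production's first { ... } block:
        // declarations placed at the top of the method.
        int nested = 0;

        // Plays the role of the second { ... } block: the expansion
        // "{" [ MatchedBraces() ] "}" turned into ordinary statements.
        expect('{');
        if (pos < text.length() && text.charAt(pos) == '{') {
            nested = matchedBraces();
        }
        expect('}');
        return nested + 1;
    }

    // Consumes one expected character or fails (hypothetical helper).
    private void expect(char c) {
        if (pos >= text.length() || text.charAt(pos) != c) {
            throw new IllegalStateException("expected '" + c + "' at " + pos);
        }
        pos++;
    }
}
```

With an empty declaration block, as in Simple1, the method would simply start with the statements derived from the expansion.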

Lexical tokens (regular expressions) in a JavaCC input grammar are either simple strings ("{", "}", "\n", and "\r" in the above example), or a more complex regular expression.  In our example above, there is one such regular expression "<EOF>" which is matched by the end of file.  All complex regular expressions are enclosed within angular brackets.

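That is, a lexical token (regular expression) in JavaCC is either a simple string or a more complex regular expression. This example has one complex regular expression, "<EOF>", which matches the end of the file. Every complex regular expression is enclosed in angle brackets.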
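The token set of this grammar is small enough to sketch by hand. The following Java is only an illustration of the idea, not the token manager JavaCC generates: the class BraceTokens, the enum Kind and its constant names are assumptions, and IllegalArgumentException stands in for the lexical error that a real token manager would report on any other character.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-written sketch of the five token kinds used by Simple1. */
final class BraceTokens {
    enum Kind { LBRACE, RBRACE, NEWLINE, CARRIAGE_RETURN, EOF }

    /** Splits the text into token kinds and appends EOF at the end. */
    static List<Kind> tokenize(String text) {
        List<Kind> kinds = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '{':
                    kinds.add(Kind.LBRACE);
                    break;
                case '}':
                    kinds.add(Kind.RBRACE);
                    break;
                case '\n':
                    kinds.add(Kind.NEWLINE);
                    break;
                case '\r':
                    kinds.add(Kind.CARRIAGE_RETURN);
                    break;
                default:
                    // No token in the grammar matches this character.
                    throw new IllegalArgumentException(
                            "lexical error at " + i + ": '" + c + "'");
            }
        }
        kinds.add(Kind.EOF); // the end of input plays the role of <EOF>
        return kinds;
    }
}
```

For example, BraceTokens.tokenize("{}\n") yields LBRACE, RBRACE, NEWLINE and EOF, while a space in the input is rejected because the grammar defines no token for it.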


The first production above says that the non-terminal "Input" expands to the non-terminal "MatchedBraces" followed by zero or more line terminators ("\n" or "\r") and then the end of file.

So an "Input" is one "MatchedBraces", then zero or more line terminators ("\n" or "\r"), then the end of the file.

The second production above says that the non-terminal "MatchedBraces" expands to the token "{" followed by an optional nested expansion of "MatchedBraces" followed by the token "}".  Square brackets [...] in a JavaCC input file indicate that the ... is optional.

So a "MatchedBraces" is "{", then an optional nested "MatchedBraces", then "}". Whatever appears inside square brackets [] is optional.

[...] may also be written as (...)?.  These two forms are equivalent.  Other structures that may appear in expansions are:

   e1 | e2 | e3 | ... : A choice of e1, e2, e3, etc.
   ( e )+             : One or more occurrences of e
   ( e )*             : Zero or more occurrences of e

Note that these may be nested within each other, so we can have something like:

   (( e1 | e2 )* [ e3 ] ) | e4

In short, [...] can also be written as (...)?; the two forms are equivalent. The other structures are the ones listed above.
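One way to read these operators is as ordinary control flow: a choice becomes a branch, ( e )* a while loop, and [ e ] an if. The sketch below recognizes the nested example (( e1 | e2 )* [ e3 ] ) | e4, assuming for illustration that e1 to e4 are the single characters 'a', 'b', 'c' and 'd'. The class name ExpansionDemo is hypothetical, and the code only models the language the expansion describes; how JavaCC itself decides between alternatives is not modeled here.

```java
/** Hand-written recognizer for (( 'a' | 'b' )* [ 'c' ] ) | 'd'. */
final class ExpansionDemo {
    static boolean matches(String s) {
        int pos = 0;
        if (pos < s.length() && s.charAt(pos) == 'd') {
            // Second alternative of the outer choice: e4
            pos++;
        } else {
            // First alternative: ( e1 | e2 )* [ e3 ]
            // ( e1 | e2 )* : loop while the next character is e1 or e2
            while (pos < s.length()
                    && (s.charAt(pos) == 'a' || s.charAt(pos) == 'b')) {
                pos++;
            }
            // [ e3 ] : consume e3 only if it is present
            if (pos < s.length() && s.charAt(pos) == 'c') {
                pos++;
            }
        }
        // Accept only if the whole string was consumed.
        return pos == s.length();
    }
}
```

Under these assumptions "abba", "abc", "c", "d" and the empty string are accepted, while "dc" and "ca" are rejected.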
