CMM compiler: syntax analysis of the CMM language, implemented with JavaCC

Grammar of CMM:

// Program start

program → stmt-sequence

stmt-sequence → statement | statement ; stmt-sequence

// Grammar of the individual statements

statement → if-stmt | while-stmt | read-stmt | write-stmt | assign-stmt | declare-stmt

// LL(1)

S → stmt

stmt → if-stmt morestmt | while-stmt morestmt | read-stmt morestmt | write-stmt morestmt | assign-stmt morestmt | declare-stmt morestmt

morestmt → $ | stmt

T: ε

// Grammar of variable and array declarations

declare-stmt → (int | real | int[] | real[]) identifier | (int | real | int[] | real[]) identifier identifierMulti ;

identifierMulti → , identifier | , identifier identifierMulti

// LL(1)

declare-stmt → int more identifier more-declare | real more identifier more-declare

more → ε | []

more-declare → ; | , identifier more-declare

// if statement

if-stmt → if ( exp ) { stmt-sequence }

| if ( exp ) { stmt-sequence } elseifelse-stmt

elseifelse-stmt → else if ( exp ) { stmt-sequence } elseifelse-stmt | else { stmt-sequence }

// LL(1)

if-stmt → if ( exp ) { stmt } more-ifelse

more-ifelse → ε | else else-stmt

else-stmt → if ( exp ) { stmt } more-ifelse | { stmt }

// while statement

while-stmt → while ( exp ) { stmt-sequence }

// LL(1)

while-stmt → while ( exp ) { stmt }

// read statement

read-stmt → read identifier

// LL(1)

read-stmt → read identifier ;

// write statement

write-stmt → write exp

// LL(1)

write-stmt → write exp ;

// Assignment statement

assign-stmt → identifier = exp | identifier [ number ] = exp

// LL(1)

assign-stmt → identifier other-assign

other-assign → = exp ; | [ number ] = exp ;

// Expression

exp → simple-exp comparison-op simple-exp | simple-exp

// LL(1)

exp → simple-exp more-exp

more-exp → ε | comparison-op simple-exp

// Comparison operators

comparison-op → < | == | <>

// LL(1)

comparison-op → < | == | <>

// Additive expression

simple-exp → simple-exp add-op term | term

// LL(1)

simple-exp → term more-term

more-term → ε | add-op term more-term

// Additive operators

add-op → + | -

// LL(1)

add-op → + | -

// Multiplicative expression

term → term mul-op factor | factor

// LL(1)

term → factor more-factor

more-factor → ε | mul-op factor more-factor

// Multiplicative operators

mul-op → * | /

// LL(1)

mul-op → * | /

// Factor

factor → number | identifier | ( exp ) | identifier [ number ]

// LL(1)

factor → number | identifier more-identifier | ( exp )

more-identifier → ε | [ number ]

number → number-real | number-int
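
To make the LL(1) expression rules above concrete, here is a minimal hand-written recursive-descent recognizer in Java. It is only a sketch and not the JavaCC-generated parser: the class name ExprRecognizer, the token regular expression, and the accepts helper are assumptions made for illustration. Each LL(1) nonterminal becomes one method, and the tail rules (more-term, more-factor) are written as loops.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative recognizer for the LL(1) expression rules (hypothetical class). */
class ExprRecognizer {
    // Assumed token shapes: real literal, int literal, identifier, operators, brackets.
    private static final Pattern TOKEN = Pattern.compile(
        "\\s*(\\d+\\.\\d+|\\d+|[A-Za-z_][A-Za-z_0-9]*|==|<>|[<+\\-*/()\\[\\]])");

    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    ExprRecognizer(String src) {
        String s = src.trim();
        Matcher m = TOKEN.matcher(s);
        int i = 0;
        while (i < s.length()) {
            m.region(i, s.length());
            if (!m.lookingAt()) throw new IllegalArgumentException("bad character at " + i);
            tokens.add(m.group(1));
            i = m.end();
        }
    }

    private String peek() { return pos < tokens.size() ? tokens.get(pos) : ""; }

    private void expect(String t) {
        if (!peek().equals(t)) throw new IllegalStateException("expected " + t);
        pos++;
    }

    private static boolean isNumber(String t) { return t.matches("\\d+(\\.\\d+)?"); }
    private static boolean isIdentifier(String t) { return t.matches("[A-Za-z_][A-Za-z_0-9]*"); }

    // exp -> simple-exp more-exp ; more-exp -> ε | comparison-op simple-exp
    void exp() {
        simpleExp();
        String t = peek();
        if (t.equals("<") || t.equals("==") || t.equals("<>")) { pos++; simpleExp(); }
    }

    // simple-exp -> term more-term ; more-term -> ε | add-op term more-term
    void simpleExp() {
        term();
        while (peek().equals("+") || peek().equals("-")) { pos++; term(); }
    }

    // term -> factor more-factor ; more-factor -> ε | mul-op factor more-factor
    void term() {
        factor();
        while (peek().equals("*") || peek().equals("/")) { pos++; factor(); }
    }

    // factor -> number | identifier more-identifier | ( exp )
    void factor() {
        String t = peek();
        if (t.equals("(")) {
            pos++; exp(); expect(")");
        } else if (isNumber(t)) {
            pos++;
        } else if (isIdentifier(t)) {
            pos++;
            if (peek().equals("[")) {          // more-identifier -> [ number ]
                pos++;
                if (!isNumber(peek())) throw new IllegalStateException("expected number");
                pos++;
                expect("]");
            }
        } else {
            throw new IllegalStateException("unexpected token '" + t + "'");
        }
    }

    /** Returns true if the whole input is a single exp. */
    static boolean accepts(String src) {
        try {
            ExprRecognizer r = new ExprRecognizer(src);
            r.exp();
            return r.pos == r.tokens.size();
        } catch (RuntimeException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts("2*(3+4)"));
        System.out.println(accepts("i_3[1]+2"));
        System.out.println(accepts("2*(3+"));
    }
}
```

Because each method decides what to do by looking at only the next token, the sketch needs no backtracking, which is the property the LL(1) rewrite is meant to provide.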

 

 

 

 

 

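Test list

Note: for a syntactically correct CMM source program, the test class prints "CMM source parsed successfully" and throws no exception. For a syntactically incorrect CMM source program, the interface program reports "CMM source parsing failed".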
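
The behaviour described in this note can be sketched in Java as follows. This is an assumption about how such a driver might look, not the author's test class: SyntaxCheckSketch and ParseAction are hypothetical names, the message strings are English renderings of the originals, and ParseAction stands in for a call such as parser.program().

```java
/** Sketch: no exception means success, any exception or error means failure. */
class SyntaxCheckSketch {
    /** Hypothetical stand-in for a parser call such as parser.program(). */
    interface ParseAction {
        void run() throws Exception;
    }

    static String check(ParseAction action) {
        try {
            action.run();
            return "CMM source parsed successfully";
        } catch (Exception | Error e) {
            // JavaCC reports syntax errors as ParseException (an Exception)
            // and lexical errors as TokenMgrError (an Error), so both are caught.
            return "CMM source parsing failed";
        }
    }

    public static void main(String[] args) {
        System.out.println(check(() -> { }));
        System.out.println(check(() -> { throw new Exception("syntax error"); }));
    }
}
```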

 

Test 1

CMM source under test:

int i; i=1;if(i==1){ read i;}else{read i;}

The relevant part of the test program Testeg1.java:

String str = "int i; " +
        "i=1;if(i==1)" +
        "{ read i;}" +
        "else" +
        "{read i;}";
eg1 parser = new eg1(new StringReader(str));
parser.program();

Test result: passed, no exception thrown!

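Test 2

CMM source under test (the embedded comments read "test array and variable declarations", "test while", and "test expressions"):

/*测试数组和变量声明*/int i_3[],i;i=1;/**测试while*/while(i==2){i=2;}/*测试表达式*/i=2*(3+4);

The relevant part of Testeg1.java:

String str = "/**测试数组" + "*/" +
        "int i_3[],i;i=1;" +
        "/**测试while*/ " +
        "while(i==2){i=2;}" +
        "i=2*(3+4);";
eg1 parser = new eg1(new StringReader(str));
parser.program();

Test result: passed, no exception thrown!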

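Test 3

CMM source under test:

// Test assignment to an array element and array elements used in expressions

i_3[2]=0;i=i_3[1]+2;

Test result: passed, no exception thrown!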

 

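JavaCC syntax analysis

Problems encountered:

Problem 1

My first version of the CMM grammar contained left recursion, which JavaCC does not support, so I replaced the left recursion with right recursion. For example:

stmt-sequence → stmt-sequence ; statement | statement

was changed to:

stmt-sequence → statement | statement ; stmt-sequence

which is written in JavaCC as follows: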

void stmt_sequence() : {}
{
  statement()
  (
    < SEMICOLON > stmt_sequence()
  )*
}

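However, this produced a new warning:

Warning: Choice conflict in (...)* construct

I do not fully understand the cause, but after asking around I learned how to make it go away. Adding the following line to the JavaCC options:

LOOKAHEAD = 2;

solved the problem.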
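
A plausible explanation, offered here as an assumption rather than something the original notes confirm: the rule both repeats with ( ... )* and calls stmt_sequence() again inside the loop. After the inner call returns, a following ";" could be consumed either by the inner loop or by the outer one, and one token of lookahead cannot tell them apart. As far as I understand the JavaCC documentation, setting the global LOOKAHEAD to a value other than 1 also turns off this ambiguity check unless FORCE_LA_CHECK is true, which would explain why the warning disappears. Writing the loop body as < SEMICOLON > statement() instead of < SEMICOLON > stmt_sequence() would remove the double choice. The hand-written Java sketch below shows that loop-only form; the class name and the placeholder token "stmt" are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Iterative recognizer for: stmt-sequence -> statement ( ";" statement )* */
class StmtSequenceSketch {
    private final List<String> tokens;
    private int pos = 0;

    StmtSequenceSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    // Placeholder: a "statement" is the single token "stmt" (an assumption for brevity).
    private boolean statement() {
        if (pos < tokens.size() && tokens.get(pos).equals("stmt")) {
            pos++;
            return true;
        }
        return false;
    }

    // Every ";" is consumed by this one loop, so there is no second place to match it.
    private boolean stmtSequence() {
        if (!statement()) return false;
        while (pos < tokens.size() && tokens.get(pos).equals(";")) {
            pos++;
            if (!statement()) return false;
        }
        return true;
    }

    /** Returns true if the token list is exactly one stmt-sequence. */
    static boolean accepts(String... tokens) {
        StmtSequenceSketch p = new StmtSequenceSketch(Arrays.asList(tokens));
        return p.stmtSequence() && p.pos == p.tokens.size();
    }

    public static void main(String[] args) {
        System.out.println(accepts("stmt", ";", "stmt"));
        System.out.println(accepts("stmt", ";"));
    }
}
```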

 

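Problem 2

When testing the following CMM source (the comments read "test" and "test while"):

/**测试*/int i_3[],i;i=1;/**测试while*/while(i==2){i=2;}i=2*(3+4);

the following error occurred:

Exception in thread "main" exercise.grammer.ParseException: Encountered " ";" "; "" at line 1, column 21.

Was expecting:

"," ...

According to the message, the parser expected a comma, so the problem lies in the declaration statement. Here is the code of declare_stmt: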

void declare_stmt():{}
{
  < INT > < IDENTIFIER >
  (
    < LBRACKET > < RBRACKET >
  )?
  (
    idetifierMutil()
  )?
  < SEMICOLON >
| < REAL > < IDENTIFIER >
  (
    < LBRACKET > < RBRACKET >
  )?
  (
    idetifierMutil()
  )?
  < SEMICOLON >
}

void idetifierMutil():{}
{
  < COMMA > < IDENTIFIER >
  (
    < LBRACKET > < RBRACKET >
  )?
  idetifierMutil()
}

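In this version idetifierMutil() always calls itself after matching ", identifier", so it has no terminating alternative and always demands another comma. I changed the code as follows, removing the recursion and repeating the rule with * instead: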

void declare_stmt():{}
{
  < INT > < IDENTIFIER >
  (
    < LBRACKET > < RBRACKET >
  )?
  (
    idetifierMutil()
  )*
  < SEMICOLON >
| < REAL > < IDENTIFIER >
  (
    < LBRACKET > < RBRACKET >
  )?
  (
    idetifierMutil()
  )*
  < SEMICOLON >
}

void idetifierMutil():{}
{
  < COMMA > < IDENTIFIER >
  (
    < LBRACKET > < RBRACKET >
  )?
}

This resolved the problem with declaring variables and arrays!

 

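Problem 3

While testing arrays.

CMM code under test: i_3[2]=0;i=i_3[1]+2;

The following exception was thrown:

Exception in thread "main" exercise.grammer.ParseException: Encountered " "[" "[ "" at line 1, column 15.

Was expecting:

    ";" ...

The error is at i_3[1]. The cause is that an array element is used in an expression.

CMM grammar:

factor → number | identifier | ( exp ) | identifier [ number ]

The corresponding JavaCC code was changed to: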

void factor():{}
{
  < INT_LITERAL >
| < REAL_LITERAL >
| < IDENTIFIER >
  (
    < LBRACKET > < INT_LITERAL > < RBRACKET >
  )?
| < LPAREN > exp() < RPAREN >
}

 

Retested: passed.

 

The JavaCC source follows:

 

/**
 * JavaCC file
 */
 
options {
  JDK_VERSION = "1.5";
  LOOKAHEAD = 2 ;
  FORCE_LA_CHECK =false;
}
PARSER_BEGIN(eg1)
package exercise.grammer;

public class eg1 {
 public static void main(String args[]) throws ParseException {
    eg1 parser = new eg1(System.in);
     try {
        switch (eg1.program()) {
        case 0:
          System.out.println("OK.");
          break;
        default:
          System.out.println("Goodbye.");
          break;
    }
      } catch (Exception e) {
        System.out.println("NOK.");
        System.out.println(e.getMessage());
        eg1.ReInit(System.in);
      } catch (Error e) {
        System.out.println("Oops.");
        System.out.println(e.getMessage());
       
      }
  
  }
}
PARSER_END(eg1)
SKIP :
{
 
  " "
| "\t"
| "\n"
| "\r"
| "\f"

}
/* COMMENT */
MORE :
{
  "//" : IN_SINGLE_LINE_COMMENT
|
  <"/**" ~["/"]> { input_stream.backup(1); } : IN_FORMAL_COMMENT
|
  "/*" : IN_MULTI_LINE_COMMENT
}

<IN_SINGLE_LINE_COMMENT>
SPECIAL_TOKEN :
{
  <SINGLE_LINE_COMMENT: "\n" | "\r" | "\r\n" > : DEFAULT
}

<IN_FORMAL_COMMENT>
SPECIAL_TOKEN :
{
  <FORMAL_COMMENT: "*/" > : DEFAULT
}

<IN_MULTI_LINE_COMMENT>
SPEC
  
