Building Your Own Calculator (0): Extended Backus-Naur Form (EBNF)

Basic Information

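Extended Backus-Naur Form (EBNF) is a metasyntax notation for expressing context-free grammars, which are a formal way of describing computer programming languages and other formal languages. It is an extension of the basic Backus-Naur Form (BNF) metasyntax notation.
It was originally developed by Niklaus Wirth. The most commonly used EBNF variants are defined by standards, notably ISO/IEC 14977.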
 
Here we introduce one form of EBNF, the one defined by the W3C. Its details can be found in XML Path Language (XPath) 2.0 (Second Edition), and read as follows:
 
 A.1.1 Notation
The following definitions will be helpful in defining precisely this exposition.
[Definition: Each rule in the grammar defines one symbol, using the following format:
symbol ::= expression
]
 
[Definition: A terminal is a symbol or string or pattern that can appear in the right-hand side of a rule, but never appears on the left hand side in the main grammar, although it may appear on the left-hand side of a rule in the grammar for terminals.]
 
The following constructs are used to match strings of one or more characters in a terminal:
 
[a-zA-Z]
matches any Char with a value in the range(s) indicated (inclusive).
 
[abc]
matches any Char with a value among the characters enumerated.
 
[^abc]
matches any Char with a value not among the characters given.
 
"string"
matches the sequence of characters that appear inside the double quotes.
 
'string'
matches the sequence of characters that appear inside the single quotes.

[http://www.w3.org/TR/REC-example/#NT-Example]
matches any string matched by the production defined in the external specification as per the provided reference.

Patterns (including the above constructs) can be combined with grammatical operators to form more complex patterns, matching more complex sets of character strings. In the examples that follow, A and B represent (sub-)patterns.
 
(A)
A is treated as a unit and may be combined as described in this list.
 
A?
matches A or nothing; optional A.
 
A B
matches A followed by B. This operator has higher precedence than alternation; thus A B | C D is identical to (A B) | (C D).
 
A | B
matches A or B but not both.
 
A - B
matches any string that matches A but does not match B.
 
A+
matches one or more occurrences of A. Concatenation has higher precedence than alternation; thus A+ | B+ is identical to (A+) | (B+).
 
A*
matches zero or more occurrences of A. Concatenation has higher precedence than alternation; thus A* | B* is identical to (A*) | (B*) 
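
For readers who already know regular expressions, most of the terminal-level constructs above have a close counterpart in Python's `re` module. The sketch below is only an illustration of that correspondence, not part of the W3C notation; the helper name `matches` is made up for this example, and the `A - B` line only covers the simple case where A and B are single characters.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True if the whole text matches the regular expression."""
    return re.fullmatch(pattern, text) is not None

# EBNF construct          rough regex counterpart
RANGE      = r"[a-zA-Z]"       # [a-zA-Z]
NEGATED    = r"[^abc]"         # [^abc]
OPTIONAL   = r"(?:ab)?"        # "ab"?
SEQ_OR_SEQ = r"ab|cd"          # "a" "b" | "c" "d"  ==  ("a" "b") | ("c" "d")
EXCEPT     = r"(?!q)[a-z]"     # [a-z] - "q"  (single-character case only)
ONE_PLUS   = r"a+"             # "a"+
ZERO_PLUS  = r"a*"             # "a"*

print(matches(RANGE, "Q"))          # a letter in the range
print(matches(NEGATED, "a"))        # "a" is excluded by [^abc]
print(matches(SEQ_OR_SEQ, "abd"))   # concatenation binds tighter than "|"
print(matches(EXCEPT, "q"))         # "q" is subtracted from [a-z]
```

Unlike regular expressions, EBNF rules can refer to each other recursively, so this mapping stops working once a rule such as a parenthesized expression refers back to itself.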

 

How to Use EBNF

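The most basic use is to define a language of our own. That may sound like a grand undertaking, but it is actually quite ordinary. Suppose we want to build a calculator that parses text input and computes the result. For example, if we enter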

1+(2*3+4)/5

we get the result 3. Naturally, we would also like to be able to enter something like the following and get the corresponding result.

 1+sinr(2)*root(4,2)

To build such a parser, we first face the question of how to describe our requirements. The answer is plain: we can use EBNF, as shown below.

[1] Expr ::= AdditiveExpr

[2] AdditiveExpr ::= MultiplicativeExpr ( ("+" | "-") MultiplicativeExpr )*

[3] MultiplicativeExpr ::= UnaryExpr ( ("*" | "/" | "%") UnaryExpr )*

[4] UnaryExpr ::= ("-" | "+")* PrimaryExpr

[5] PrimaryExpr ::= NumericLiteral | ParenthesizedExpr | FunctionCall

[6] NumericLiteral ::= IntegerLiteral | DecimalLiteral | DoubleLiteral

[7] ParenthesizedExpr ::= "(" Expr? ")"

[8] FunctionCall ::= FunctionName "(" ( Expr ( "," Expr )* )? ")"

[9] IntegerLiteral ::= Digits

[10] DecimalLiteral ::= ("." Digits) | (Digits "." [0-9]*)

[11] DoubleLiteral ::= (("." Digits) | (Digits ("." [0-9]*)?)) [eE] [+-]? Digits

[12] Digits ::= [0-9]+
 
[13] FunctionName ::= "sinr"
                | "sind"
                | "cosr"
                | "cosd"
                | "tanr"
                | "tand"
                | "asinr"
                | "asind"
                | "acosr"
                | "acosd"
                | "atanr"
                | "atand"
                | "power"
                | "root"
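
To make the grammar concrete, here is a minimal recursive-descent evaluator in Python with one method per rule group. It is a sketch under several assumptions that the grammar itself does not state: whitespace between tokens is skipped, an `r` or `d` suffix on a function name means radians or degrees, `power(x, y)` is x to the power y, `root(x, n)` is the n-th root of x, `%` is Python's modulo, and a `cosd` counterpart to `cosr` is included. Rule [7] permits an empty `()`, which has no numeric value, so this sketch rejects it. The names `tokenize`, `Parser` and `evaluate` are invented for the example.

```python
import math
import operator
import re

NUMBER = r"(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)(?:[eE][+-]?[0-9]+)?"  # rules [9]-[12]
TOKEN = re.compile(rf"\s*({NUMBER}|[a-z]+|[-+*/%(),])")  # skipping whitespace is an assumption
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv, "%": operator.mod}

# Assumed meanings of the names in rule [13] (not defined by the grammar).
FUNCTIONS = {
    "sinr": math.sin, "cosr": math.cos, "tanr": math.tan,
    "sind": lambda x: math.sin(math.radians(x)),
    "cosd": lambda x: math.cos(math.radians(x)),
    "tand": lambda x: math.tan(math.radians(x)),
    "asinr": math.asin, "acosr": math.acos, "atanr": math.atan,
    "asind": lambda x: math.degrees(math.asin(x)),
    "acosd": lambda x: math.degrees(math.acos(x)),
    "atand": lambda x: math.degrees(math.atan(x)),
    "power": lambda x, y: x ** y,
    "root": lambda x, n: x ** (1.0 / n),
}

def tokenize(text):
    """Split the input into number, name and operator tokens."""
    text, tokens, pos = text.rstrip(), [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected or 'a token'!r}, got {tok!r}")
        self.pos += 1
        return tok

    def expr(self):                          # [1] and [2]
        value = self.multiplicative()
        while self.peek() in ("+", "-"):
            value = OPS[self.take()](value, self.multiplicative())
        return value

    def multiplicative(self):                # [3]
        value = self.unary()
        while self.peek() in ("*", "/", "%"):
            value = OPS[self.take()](value, self.unary())
        return value

    def unary(self):                         # [4]
        sign = 1
        while self.peek() in ("+", "-"):
            if self.take() == "-":
                sign = -sign
        return sign * self.primary()

    def primary(self):                       # [5] to [8]
        tok = self.take()
        if tok == "(":                       # [7]; an empty "()" is rejected here
            value = self.expr()
            self.take(")")
            return value
        if tok in FUNCTIONS:                 # [8] and [13]
            self.take("(")
            args = []
            if self.peek() != ")":
                args.append(self.expr())
                while self.peek() == ",":
                    self.take()
                    args.append(self.expr())
            self.take(")")
            return FUNCTIONS[tok](*args)
        if re.fullmatch(NUMBER, tok):        # [6]
            return float(tok)
        raise SyntaxError(f"unexpected token {tok!r}")

def evaluate(text):
    parser = Parser(text)
    value = parser.expr()
    if parser.peek() is not None:            # leftover input is an error
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value

print(evaluate("1+(2*3+4)/5"))           # 3.0
print(evaluate("1+sinr(2)*root(4,2)"))
```

Because each loop in `expr` and `multiplicative` mirrors the `( ... )*` repetition in rules [2] and [3], operators of equal precedence are applied from left to right, and multiplication binds tighter than addition without any extra precedence table.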

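The rules can be read alongside the notation described in the first half of this article. It may feel unfamiliar at first, but after working through one or two rules you will see that they all follow the same idea. That is where the strength of EBNF lies.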
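
As one way to practice reading the rules, the numeric literal rules [9] to [12] can be transcribed almost symbol for symbol into Python regular expressions. This is a hedged sketch: the constant names and the `classify` helper are invented for the example, and the transcription assumes each literal is tested against the whole input string.

```python
import re

DIGITS = r"[0-9]+"                                                  # [12]
INTEGER = DIGITS                                                    # [9]
DECIMAL = rf"(?:\.{DIGITS}|{DIGITS}\.[0-9]*)"                       # [10]
DOUBLE = rf"(?:\.{DIGITS}|{DIGITS}(?:\.[0-9]*)?)[eE][+-]?{DIGITS}"  # [11]

def classify(text):
    """Return the name of the literal rule that matches text, or None."""
    rules = (("IntegerLiteral", INTEGER),
             ("DecimalLiteral", DECIMAL),
             ("DoubleLiteral", DOUBLE))
    for name, pattern in rules:
        if re.fullmatch(pattern, text):
            return name
    return None

for sample in ("42", "3.14", ".5", "3.", "1e10", "2.5E-3", "."):
    print(sample, classify(sample))
```

The three rules never overlap: only DecimalLiteral requires a dot without an exponent, and only DoubleLiteral requires the `e` or `E` marker, so the order in which they are tried does not matter.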

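One final note: this grammar is not my original work. It was simplified and adapted from XML Path Language (XPath) 2.0 (Second Edition). See the original specification for more information.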

See also the other articles in this series:

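  • Building Your Own Calculator (1): Preparing the Development Environment
  • Building Your Own Calculator (2): A Brand-New Way of Operating
  • Building Your Own Calculator (3): Token Parsing in 140 Lines of Code
  • Building Your Own Calculator (4): Done! Source Code Released!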

References

Wikipedia (Chinese edition): Extended Backus-Naur Form

Wikipedia:Extended Backus–Naur Form

W3C: the W3C Extended Backus-Naur Form as defined in the XML specification
