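Extended Backus–Naur Form (EBNF) is a metasyntax notation for describing the context-free grammars of formal languages such as computer programming languages. In short, it is a language for describing languages. It is an extension of the basic Backus–Naur Form (BNF) metasyntax notation.
EBNF was originally developed by Niklaus Wirth. The most commonly used variant is the one defined by the ISO/IEC 14977 standard.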
The basic form of an EBNF rule is shown below. A rule of this form is also called a production:
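LeftHandSide = RightHandSide.
The left-hand side (LeftHandSide) is also called a non-terminal symbol, and the right-hand side (RightHandSide) describes what that symbol is composed of.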
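1. The following conventions are used:
2. The EBNF operators written with ordinary characters are listed here in order of precedence (highest precedence at the top):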
* repetition-symbol
- except-symbol
, concatenate-symbol
| definition-separator-symbol
= defining-symbol
; terminator-symbol
. terminator-symbol
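3. The following bracket pairs override the normal precedence. The bracket pairs themselves also have an order of precedence (highest precedence at the top):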
' first-quote-symbol first-quote-symbol ' (* quotation *)
" second-quote-symbol second-quote-symbol " (* quotation *)
(* start-comment-symbol end-comment-symbol *) (* comment *)
( start-group-symbol end-group-symbol ) (* grouping *)
[ start-option-symbol end-option-symbol ] (* option *)
{ start-repeat-symbol end-repeat-symbol } (* repetition *)
? special-sequence-symbol special-sequence-symbol ? (* special sequence *)
The following example shows how to express repetition:
aa = "A";
bb = 3 * aa, "B";
cc = 3 * [aa], "C";
dd = {aa}, "D";
ee = aa, {aa}, "E";
ff = 3 * aa, 3 * [aa], "F";
gg = {3 * aa}, "D";
The terminal strings defined by these rules are as follows:
aa: A
bb: AAAB
cc: C AC AAC AAAC
dd: D AD AAD AAAD AAAAD etc.
ee: AE AAE AAAE AAAAE AAAAAE etc.
ff: AAAF AAAAF AAAAAF AAAAAAF
gg: D AAAD AAAAAAD etc.
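As a rough cross-check of the strings listed above, the seven rules can be hand-translated into regular expressions. This is only an illustrative sketch: the mapping used here (`n * x` becomes `x{n}`, `[x]` becomes `x?`, `{x}` becomes `x*`) and the names `RULES` and `matches` are assumptions made for this example, not part of EBNF or of any standard tool.

```python
import re

# Hand-written regex equivalents of the rules aa..gg above (assumed mapping:
# n * x -> x{n}, [x] -> optional, {x} -> zero or more repetitions).
RULES = {
    "aa": r"A",
    "bb": r"A{3}B",        # 3 * aa, "B"
    "cc": r"A{0,3}C",      # 3 * [aa], "C": up to three optional A's
    "dd": r"A*D",          # {aa}, "D"
    "ee": r"AA*E",         # aa, {aa}, "E": at least one A
    "ff": r"A{3}A{0,3}F",  # 3 * aa, 3 * [aa], "F"
    "gg": r"(?:A{3})*D",   # {3 * aa}, "D": A's come in groups of three
}


def matches(rule: str, text: str) -> bool:
    """Return True if the whole of `text` is derivable from `rule`."""
    return re.fullmatch(RULES[rule], text) is not None


if __name__ == "__main__":
    for rule, text in [("cc", "AAC"), ("cc", "AAAAC"), ("gg", "AAAD"), ("gg", "AD")]:
        print(rule, text, matches(rule, text))
```

Regular expressions are sufficient here only because none of these rules is recursive. A grammar with nested or recursive rules generally needs a real parser.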
In addition to the standard definitions, the Freescale documentation also uses the following conventions:
The examples below give a more intuitive feel for EBNF.
digit excluding zero = "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" ;
digit = "0" | digit excluding zero ;
natural number = digit excluding zero, { digit } ;
integer = "0" | [ "-" ], natural number ;
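digit excluding zero can be any single character from 1 to 9, and digit extends this to any single character from 0 to 9.
natural number can be 1, 2, …, 10, …, 12345, …, because {} means repetition any number of times, including zero times.
integer is either 0 or a natural number optionally preceded by a minus sign.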
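The four rules above can be mirrored almost one-to-one in code. The following is a minimal sketch, not a general EBNF engine: the function names `is_natural_number` and `is_integer` are invented for this illustration, and each function simply follows the rule quoted in its comment.

```python
DIGIT_EXCLUDING_ZERO = set("123456789")
DIGIT = {"0"} | DIGIT_EXCLUDING_ZERO


def is_natural_number(text: str) -> bool:
    # natural number = digit excluding zero, { digit } ;
    return (
        len(text) >= 1
        and text[0] in DIGIT_EXCLUDING_ZERO
        and all(ch in DIGIT for ch in text[1:])
    )


def is_integer(text: str) -> bool:
    # integer = "0" | [ "-" ], natural number ;
    if text == "0":
        return True
    if text.startswith("-"):
        text = text[1:]  # the optional minus sign
    return is_natural_number(text)


if __name__ == "__main__":
    for candidate in ["0", "7", "-42", "12345", "-0", "007", ""]:
        print(repr(candidate), is_integer(candidate))
```

Note that the grammar rejects "-0" and numbers with leading zeros such as "007", because natural number must start with digit excluding zero.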
This is the syntax of EBNF itself, described in EBNF:
Production = NonTerminal "=" Expression ".".
Expression = Term {"|" Term}.
Term = Factor {Factor}.
Factor = NonTerminal
| Terminal
| "(" Expression ")"
| "[" Expression "]"
| "{" Expression "}".
Terminal = Identifier | '"' <any sequence of characters> '"'.
NonTerminal = Identifier.
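A non-terminal symbol can have any name you like, whereas a terminal symbol is either an identifier that appears in the language being described or any sequence of characters enclosed in quotation marks.
A Factor can be a terminal symbol, a non-terminal symbol, or an expression enclosed in any of the three kinds of brackets.
A Term is made up of at least one Factor…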
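Because this self-describing grammar is so small, it maps naturally onto a recursive-descent parser with one method per rule. The sketch below is only an illustration under stated assumptions: identifiers are taken to be ASCII letters followed by letters or digits, a quoted terminal may not itself contain a double quote, and the names `tokenize`, `Parser`, and `parse_grammar` are invented for this example.

```python
import re

# Assumed token shapes: identifiers, double-quoted literals, and the
# punctuation used by the grammar above.
TOKEN = re.compile(
    r'\s*(?:(?P<ident>[A-Za-z][A-Za-z0-9]*)|(?P<lit>"[^"]*")|(?P<sym>[=|.()\[\]{}]))'
)


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens


class Parser:
    CLOSERS = {"(": ")", "[": "]", "{": "}"}

    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else (None, None)

    def expect(self, value):
        kind, tok = self.peek()
        if kind != "sym" or tok != value:
            raise SyntaxError(f"expected {value!r}, got {tok!r}")
        self.pos += 1

    def production(self):
        # Production = NonTerminal "=" Expression ".".
        kind, name = self.peek()
        if kind != "ident":
            raise SyntaxError("a production must start with a non-terminal")
        self.pos += 1
        self.expect("=")
        body = self.expression()
        self.expect(".")
        return (name, body)

    def expression(self):
        # Expression = Term {"|" Term}.
        terms = [self.term()]
        while self.peek() == ("sym", "|"):
            self.pos += 1
            terms.append(self.term())
        return ("alt", terms)

    def term(self):
        # Term = Factor {Factor}.
        factors = [self.factor()]
        while self.starts_factor():
            factors.append(self.factor())
        return ("seq", factors)

    def starts_factor(self):
        kind, tok = self.peek()
        return kind in ("ident", "lit") or (kind == "sym" and tok in self.CLOSERS)

    def factor(self):
        # Factor = NonTerminal | Terminal | "(" ... ")" | "[" ... "]" | "{" ... "}".
        kind, tok = self.peek()
        if kind in ("ident", "lit"):
            self.pos += 1
            return (kind, tok)
        if kind == "sym" and tok in self.CLOSERS:
            self.pos += 1
            inner = self.expression()
            self.expect(self.CLOSERS[tok])
            return (tok, inner)
        raise SyntaxError(f"unexpected token {tok!r}")


def parse_grammar(text):
    """Parse a sequence of productions and return (name, tree) pairs."""
    parser = Parser(text)
    rules = []
    while parser.peek() != (None, None):
        rules.append(parser.production())
    return rules


if __name__ == "__main__":
    for name, tree in parse_grammar('Expression = Term {"|" Term}. Term = Factor {Factor}.'):
        print(name, tree)
```

The terminating "." is what lets the parser tell where one production ends and the next begins, since a Term is just a run of adjacent Factors with no separator between them.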
A Pascal-like programming language that allows only assignments, described in EBNF:
(* a simple program syntax in EBNF − Wikipedia *)
program = 'PROGRAM', white space, identifier, white space,
'BEGIN', white space,
{ assignment, ";", white space },
'END.' ;
identifier = alphabetic character, { alphabetic character | digit } ;
number = [ "-" ], digit, { digit } ;
string = '"' , { all characters - '"' }, '"' ;
assignment = identifier , ":=" , ( number | identifier | string ) ;
alphabetic character = "A" | "B" | "C" | "D" | "E" | "F" | "G"
| "H" | "I" | "J" | "K" | "L" | "M" | "N"
| "O" | "P" | "Q" | "R" | "S" | "T" | "U"
| "V" | "W" | "X" | "Y" | "Z" ;
digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" ;
white space = ? white space characters ? ;
all characters = ? all visible characters ? ;
A syntactically correct program for this grammar looks like this:
PROGRAM DEMO1
BEGIN
A:=3;
B:=45;
H:=-100023;
C:=A;
D123:=B34A;
BABOON:=GIRAFFE;
TEXT:="Hello world!";
END.
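To see that the sample program really is derivable from the grammar, the rules can be flattened into a single regular expression. This sketch rests on two assumptions that the grammar leaves open through its special sequences: `white space` is taken to mean one or more whitespace characters, and `all characters` is taken to mean any character, including the space inside "Hello world!". The names `PROGRAM`, `SAMPLE`, and `is_valid_program` are invented for this example.

```python
import re

IDENTIFIER = r"[A-Z][A-Z0-9]*"   # alphabetic character, { alphabetic character | digit }
NUMBER = r"-?[0-9]+"             # [ "-" ], digit, { digit }
STRING = r'"[^"]*"'              # assumption: any character except a double quote
WS = r"\s+"                      # assumption: one or more whitespace characters
ASSIGNMENT = rf"{IDENTIFIER}:=(?:{NUMBER}|{IDENTIFIER}|{STRING})"
PROGRAM = re.compile(
    rf"PROGRAM{WS}{IDENTIFIER}{WS}BEGIN{WS}(?:{ASSIGNMENT};{WS})*END\."
)

SAMPLE = "\n".join([
    "PROGRAM DEMO1",
    "BEGIN",
    "A:=3;",
    "B:=45;",
    "H:=-100023;",
    "C:=A;",
    "D123:=B34A;",
    "BABOON:=GIRAFFE;",
    'TEXT:="Hello world!";',
    "END.",
])


def is_valid_program(source: str) -> bool:
    """Return True if `source` as a whole matches the program rule."""
    return PROGRAM.fullmatch(source) is not None


if __name__ == "__main__":
    print(is_valid_program(SAMPLE))
    print(is_valid_program(SAMPLE.replace("END.", "END")))
```

The grammar places no white space inside an assignment, so under this reading "A := 3;" would be rejected while "A:=3;" is accepted.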
[1] Freescale Semiconductor. HC(S)08/RS08 and S12(X) Build Tools Utilities Manual. 2010.
[2] Extended Backus–Naur form. https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form