COMP2022 Assignment2 课业·解析

COMP2022Assignment2课业解析

题意:

考察LL(1)文法的相关知识及实现基于预测分析表方法的LL(1)语法分析器

解析:

第一题分别要求列出给定文法G的终止符、非终止符、最左推导字符串及构建其语法树;第二题用泵引理证明文法是否非正则;第三题证明给定文法不是LL(1)文法,提示:存在左递归;第四题消除左递归和回溯,构造等价的LL(1)文法;第五题构造预测分析表;第六题编程实现预测分析表表驱动的LL(1)文法分析器;第七部分实现附加功能,判断出一些类型的语法错。符号串的分析流程图:

Amaze:
[图片]
COMP2022 Assignment2 课业·解析_第1张图片
第一题的最左推导:
COMP2022 Assignment2 课业·解析_第2张图片
涉及知识点:

LL(1)文法、预测分析表、泵引理

更多可加微信讨论

微信号:wanqcwanqcwanqc0219

pdf

COMP2022: Assignment 2
Due: 23:59pm Sunday 20th October 2019 (end week 10)
1 The grammar G [10%]
Consider the following grammar G, which represents a fragment of a simple programming language:
S ! SL j "
L ! A; j E; j C;
E ! (EBE) j N j V
A ! let V =E
C ! while E do S j while E do S else S
B ! + j - j * j >
V ! x j y j z
N ! ND j N0 j D
D ! 1 j 2 j 3 j 4 j 5 j 6 j 7 j 8 j 9 (you can denote this [1-9])
In this assignment, you can ignore whitespace in the strings (i.e. spaces and newlines are not part of the
language, but should be used for the sake of readability).
i) List the variables of G
ii) List the terminals of G
iii) Give a leftmost derivation of the string let x=(y-20); while 1 do y;;
iv) Draw a parse tree for the string: let x=(y-20); while 1 do y;;
2 Prove that L(G) is not regular [5%]
Prove that the language generated by G is not a regular language, by contradicting the Pumping Lemma.
3 Prove G is not LL(1) [5%]
Prove that G is not an LL(1) grammar.
4 Find an equivalent LL(1) grammar G′ [10%]
Find an equivalent grammar G′ which is LL(1), by using the grammar transformation techniques shown
in lectures, or otherwise. Describe the process and show your working.
1
5 LL(1) parse table [15%]
Complete the LL(1) parse table for G′. Describe the process and show your working, including:

  1. FIRST sets for all the production rules of G′
  2. FOLLOW sets for variables of G′ only if they are needed
    6 Implementation [25%]
    Implement a program which parses strings using an LL(1) table driven parser, using the table you
    determined for G′ in the previous exercise. You may use Python, Java, C, C++, or Lisp. If you’d like to
    use a different language then please check with us frst.
    • Input:
    The frst command line argument is the flename of a fle containing the string of characters to test.
    • Output:
  3. Print a trace of the execution, showing the steps followed by the program as it performs the
    left-most derivation. This should look similar to parsing the string through a PDA. An example
    of this is given in the appendices.
  4. After parsing the whole input fle, print ACCEPTED or REJECTED, depending on whether or not
    the string could be derived by the grammar.
  5. If there is a symbol in the input string which is not a terminal from the grammar, the program
    should output ERROR_INVALID_SYMBOL (This could be during or before trying to parse the
    input.)
    • All whitespace in the input fle should be ignored (line breaks, spaces, etc.). The output will be easier
    to read if you remove the whitespace before starting the parse.
    • Examples of the program output syntax are provided in the appendices.
    7 Extension [30%]
    You may pick one of the following extensions to improve your parser. Any one of the following ideas
    would be sufcient (do not implement more than one). They are listed roughly in order of increasing
    difculty/effort, but are worth the same marks:
    a) Use the FIRST and FOLLOW sets to implement an error recovery feature. This should give the user
    suggestions on possible corrections which could change strings which could not be derived in G′ For
    this extension you need to:
    • Implementation: If a second command line argument error is given, then instead of rejecting
    a string that is not in the language, it should make suggestions about how to correct it. The
    user chooses one of the options, then the program should continue the parse. Some examples
    are provided in the appendices. This should be included in the code you submit to PASTA.
    • Report: Explain how your error recovery feature uses the FIRST and FOLLOW sets to work,
    and show some useful examples.
    b) Extend your program to be able to evaluate the input program, if a second command line argument
    eval is given. Some examples are provided in the appendices, and some supplementary material will
    be provided on Ed to explain the algorithm needed to build and evaluate the parse tree. This should
    be included in the code you submit to PASTA.
    • Implementation:
    2
    – Build a parse tree as it performs the leftmost derivation.
    – Evaluate that parse tree.
    • Report: Give some examples of usage
    c) Implement a second parsing algorithm. You could implement:
    • the CYK algorithm, or
    • a LR(1) parser
    • Note: a recursive descent parser would not be sufcient (too easy).
    You will need to submit your code to PASTA, as well as including some details in your report
    comparing your second parser to the frst one (what are the advantages and disadvantages of this
    parser compared to the LL(1) table driven parser.)
    d) Something else? Check with your tutor to see if they think it’s appropriate.
    8 Submission details
    Due 23:59pm Sunday 20th October 2019. The late submission policy is detailed in the administrivia lecture
    slides from week 1. Please notify us if you intend to make a late submission, or if you believe you will not
    be able to submit, to make it easier for us to support you.
    8.1 Canvas submission
    You must submit a report as a single document (.pdf or .docx) to TurnItIn via Canvas. The written parts
    of the report must be text, not images of hand-writing. Any diagrams can be images, of course. The report
    should include:
    • Task 1 – 5: Your answers, working, and explanations
    • Task 6: A description of the testing runs you used, including examples of the output. Show enough
    to convince the marker that your testing was comprehensive
    • Task 7: A description of your extension, including documentation about how to use it and some
    examples of input/output
    8.2 PASTA submission
    Exact submission details (such as requirements on the name of the program) will be provided next week.
    • Task 6: submit your source code.
    • Task 7: submit your source code, if relevant
    The visible tests for task 6 (the implementation) will be just enough to show you that you have got
    the input/output syntax correct. It will not test the correctness of your program (you are expected to test
    that yourself!)
    There will be no automatic testing of any extra functionality you added as an extension.
    8.3 Marking criteria
    The weight of marks for each question are noted next to each question.
    • Tasks 1 – 5: The marks will be roughly evenly divided between correctness, and on your working
    and explanations.
    • Task 6: The marks will be roughly evenly divided between automatic marking (for correctness) and
    hand marking (based on your testing, the quality of your explanations, code, etc.)
    • Task 7: The extension will be hand marked.
    3
    9 Appendices
    9.1 Example of string derivations
    Suppose we had a program which parsed this grammar fragment (your grammar will be different!):
    E ! (F ) j T (start variable)
    F ! +EE
    T ! 0 j 1 j 2 j 3
    Example 1: For the input:

    (+12)
    ✝ ✆
    The output might look like this (note: it doesn’t need to line up neatly):

    (+12)$ E$
    (+12)$ (F)$
    +12)$ F)$
    +12)$ +EE)$
    12)$ EE)$
    12)$ VE)$
    12)$ 1E)$
    2)$ E)$
    2)$ V)$
    2)$ 2)$
    )$ )$
    $ $
    ACCEPTED
    ✝ ✆
    Example 2: For the input:

    ( +
    1
    3 )
    ✝ ✆
    The output is the same, because we ignore all the whitespace:

    (+12)$ E$
    (+12)$ (F)$

    ACCEPTED
    ✝ ✆
    Example 3: For the input:

    (++3)
    ✝ ✆
    We get:

    (++3)$ E$
    (++3)$ (F)$
    ++3)$ F)$
    ++3)$ +EE)$
    +3)$ EE)$
    REJECTED
    ✝ ✆
    It is rejected because there would be no entry in the parse table for (E; +).
    4
    9.2 Extension option A (error recovery)
    Reminder: your code should only run the extension if a second command line argument error is given.
    Otherwise it should behave normally.
    This appendix shows a few examples of what using an error recovery feature might look like. Your
    output and features do not need to be identical to these examples, it is only a guideline to give you some
    ideas.
    Suppose we had a program which parsed this grammar fragment (your grammar will be different!):
    E ! (F ) j T (start variable)
    F ! +EE
    T ! 0 j 1 j 2 j 3
    There are three key steps to implementing error recovery. More marks will be awarded to more useful/sophisticated features.
    9.2.1 Identifying the error
    The frst step is to be able to describe the error. Examples:

    (+1+)$ E$
    (+1+)$ (F)$
    +1+)$ F)$
    +1+)$ +EE)$
    1+)$ EE)$
    1+)$ VE)$
    1+)$ 1E)$
    +)$ E)$
    Error: got +, but expected {0, 1, 2, 3, (}.
    REJECTED
    ✝ ✆

    1)$ E$
    1)$ T$
    1)$ 1$
    )$ $
    Error: got ), but expected . R E J E C T E D ✝ ✆ 9.2.2 R e c o v e r i n g w i t h u s e r i n t e r v e n t i o n T h e n e x t s t e p w o u l d b e t o g i v e t h e u s e r t h e o p t i o n t o c o r r e c t t h e e r r o r a n d a l l o w t h e d e r i v a t i o n t o c o n t i n u e . F o r e x a m p l e , y o u m i g h t h a v e a n o p t i o n t o d e l e t e t h e n e x t i n p u t s y m b o l : ✞ ( + 123 ) . REJECTED ✝ ✆ 9.2.2 Recovering with user intervention The next step would be to give the user the option to correct the error and allow the derivation to continue. For example, you might have an option to delete the next input symbol: ✞ (+123) .REJECTED9.2.2RecoveringwithuserinterventionThenextstepwouldbetogivetheusertheoptiontocorrecttheerrorandallowthederivationtocontinue.Forexample,youmighthaveanoptiontodeletethenextinputsymbol:(+123) E$
    (+123)$ (F)$
    +123)$ F)$
    +123)$ +EE)$
    123)$ EE)$
    123)$ VE)$
    123)$ 1E)$
    23)$ E)$
    23)$ V)$
    23)$ 2)$
    5
    3)$ )$
    Error: got 3, but expected {)}.
    Delete input? Y
    )$ )$
    $ $
    ACCEPTED (+12)
    ✝ ✆
    It’s a bit more powerful if we also allow the user to insert a symbol:

    (+12$ E$
    (+12$ (F)$
    +12$ F)$
    +12$ +EE)$
    12$ EE)$
    12$ VE)$
    12$ 1E)$
    2$ E)$
    2$ V)$
    2$ 2)$
    $ )$
    Error: got , b u t e x p e c t e d ) . A d d i n p u t ? ) ) , but expected {)}. Add input? ) ) ,butexpected).Addinput?)) )$
    $ $
    ACCEPTED (+12)
    ✝ ✆
    Better still, would be to allow the user to choose to delete or insert (so they can replace a symbol)
    9.2.3 Making useful suggestions
    You might automatically suggest which is the ‘best’ correction. For example, you might have the program
    explore each option automatically, then recommend the correction which leads to an accepted string using
    a minimal number of changes. For performance reasons, it would be wise to limit the maximum number
    of changes to some fairly small number (e.g. no more than 5 changes).
    6
    9.3 Extension option B (evaluation)
    Reminder: your code should only run the extension if a second command line argument eval is given.
    Otherwise it should behave normally.
    9.3.1 Semantics of the language:
    • S := a sequence of one or more commands to execute.
    • E := an expression to evaluate, possibly including variables
    • (EBE) := apply the two expressions to some binary operator.
    – arithmetic is normal integer arithmetic (e.g. 5 - 3 is -2)
    – > := 1 if the inequality is true, 0 otherwise
    • N := a strictly positive integer.
    • V := a variable label.
    • let V = E; := assign the result of evaluating E, to V
    • E; := output (print) the result of evaluating E
    • while E do S; := as long as E evaluates to a non-zero number, do S
    • while E do S else S; := as long as E evaluates to a non-zero number, do the frst S. If E evaluates
    to zero (possibly after some repetitions), do the second S.
    • If there is more than one expression in the program, they should be evaluated in order. The only
    output is the E; statements!
    9.3.2 Examples of output:

Input Output Why
(5+3); 8
(5-3); -2
let x=1; there is no E; statement
let x=1;x; 1
1;2; 1
2 Statements are evaluated in order
let x=3;
while x do let x=(x-1); x;; 2 1
let x=3;
while x do let x=(x-1); x;
else 9;; 2 1 9
let x=3;
while x do let x=(x-1); x;
else 9;;
4; 2 1 9 4
let x=5; let y=1;
while (x>1) do let y=(x*y); let x=(x-1);;
y; 120 5! = 120
let x=5;
while (x>1) do let x=(x-1);;
x; 1 Variables are not scoped

7
9.3.3 More information:
In the resources section on Ed, in the category “Supplementary Material”, there are some resources to help
you:
• Slides explaining the concepts
• A worked example of the process of building a parse tree following this algorithm
• Some code examples of evaluating a parse tree for a simpler grammar (in Python, Java and C++)
8

你可能感兴趣的:(COMP2022 Assignment2 课业·解析)