Course source link
Week 2.2: Finite Automata
Regular expressions vs. finite automata
- Regular expressions = specification
- Finite automata = implementation
Structure of a finite automaton
A finite automaton consists of
- An input alphabet $\Sigma$
- A finite set of states $S$
- A start state $n$
- A set of accepting states $F \subseteq S$
- A set of transitions $state \xrightarrow{input} state$
Example of a transition
- Transition
  $S_1 \xrightarrow{a} S_2$
- Is read
  In state $S_1$, on input $a$, go to state $S_2$.
- If end of input and in accepting state $\Rightarrow$ accept
- Otherwise $\Rightarrow$ reject
  - terminates in state $S \notin F$
  - gets stuck
    The automaton is in a state that has no transition on the current input symbol.
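The accept/reject rule above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name `accepts`, the dictionary encoding of transitions, and the two-state automaton (which accepts only the string "1") are assumptions made for the example.

```python
def accepts(transitions, start, accepting, text):
    """Return True if the automaton accepts `text`.

    `transitions` maps (state, symbol) -> next state (assumed encoding).
    """
    state = start
    for symbol in text:
        key = (state, symbol)
        if key not in transitions:
            return False  # stuck: no transition from this state on this input
        state = transitions[key]
    # End of input: accept only if the current state is an accepting state.
    return state in accepting


# Hypothetical automaton that accepts only "1": A --1--> B, with B accepting.
TRANSITIONS = {("A", "1"): "B"}
START = "A"
ACCEPTING = {"B"}

print(accepts(TRANSITIONS, START, ACCEPTING, "1"))   # True
print(accepts(TRANSITIONS, START, ACCEPTING, "0"))   # False (stuck in A)
print(accepts(TRANSITIONS, START, ACCEPTING, "10"))  # False (stuck in B)
```

Both rejection cases from the list are covered: getting stuck returns early, and ending in a non-accepting state fails the final membership test.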
Graphical representation of finite automata
Examples
I: A finite automaton that accepts only “1”

| State | Input |
| --- | --- |
| A | $_\uparrow 1$ |
| B | $1_\uparrow$ |

Result: Accept

| State | Input |
| --- | --- |
| A | $_\uparrow 0$ |

Result: Reject

| State | Input |
| --- | --- |
| A | $_\uparrow 10$ |
| B | $1_\uparrow 0$ |

Result: Reject
Note: Language of a FA $\equiv$ set of accepted strings
II: A finite automaton accepting any number of 1’s followed by a single 0 (Alphabet: $\{0, 1\}$)

| State | Input |
| --- | --- |
| A | $_\uparrow 110$ |
| A | $1_\uparrow 10$ |
| A | $11_\uparrow 0$ |
| B | $110_\uparrow$ |

Result: Accept

| State | Input |
| --- | --- |
| A | $_\uparrow 100$ |
| A | $1_\uparrow 00$ |
| B | $10_\uparrow 0$ |

Result: Reject
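The state/input traces in Example II can be reproduced with a small sketch. The transition dictionary below is an assumption reconstructed from the description (state A loops on 1 and moves to the accepting state B on 0); the helper name `trace` and the use of `^` to mark the input pointer are illustrative choices.

```python
def trace(transitions, start, accepting, text):
    """Return (rows, accepted); each row is (state, input with '^' at the read position)."""
    rows = []
    state = start
    for i, symbol in enumerate(text):
        rows.append((state, text[:i] + "^" + text[i:]))
        nxt = transitions.get((state, symbol))
        if nxt is None:
            return rows, False  # stuck: no transition on this symbol
        state = nxt
    rows.append((state, text + "^"))  # all input consumed
    return rows, state in accepting


# Assumed automaton for "any number of 1's followed by a single 0".
ONES_THEN_ZERO = {("A", "1"): "A", ("A", "0"): "B"}

for word in ("110", "100"):
    rows, accepted = trace(ONES_THEN_ZERO, "A", {"B"}, word)
    for state, position in rows:
        print(state, position)
    print("Result:", "Accept" if accepted else "Reject")
```

For `100` the trace stops at `B 10^0`, because B has no transition on 0.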
III: Another kind of transition: $\varepsilon$-moves
The machine can move from state A to state B without consuming any input.

| State | Input |
| --- | --- |
| A | $x_1 {_\uparrow} x_2 x_3$ |
| B | $x_1 {_\uparrow} x_2 x_3$ |
Kinds of finite automata: DFA vs. NFA
- Deterministic Finite Automata (DFA)
  - One transition per input per state
  - No $\varepsilon$-moves
  - A DFA takes only one path through the state graph
- Nondeterministic Finite Automata (NFA)
  - Can have multiple transitions for one input in a given state
  - Can have $\varepsilon$-moves
  - An NFA can choose which path to take
An NFA accepts if some sequence of choices leads to an accepting state.
Example
An NFA can get into multiple states
- Input: $1 \qquad 0 \qquad 0$
- States: $\{A\} \qquad \{A, B\} \qquad \{A, B, C\}$
If the final set of possible states contains an accepting state, the nondeterministic machine accepts.
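Tracking the set of possible states can be sketched as follows. The NFA here is an assumption, since the original diagram is not reproduced in the text: A loops on both 0 and 1, A also moves to B on 0, B moves to C on 0, and C is accepting. It was chosen because it yields the state sets listed above for the input `100`.

```python
def nfa_step(delta, states, symbol):
    """Return every state reachable from `states` on `symbol`."""
    nxt = set()
    for s in states:
        nxt |= delta.get((s, symbol), set())
    return nxt


def nfa_run(delta, start, accepting, text):
    """Return (history, accepted), where history[i] is the state set after symbol i."""
    current = {start}
    history = []
    for symbol in text:
        current = nfa_step(delta, current, symbol)
        history.append(current)
    # Accept if at least one possible state is accepting.
    return history, bool(current & accepting)


# Assumed NFA (no epsilon-moves): note the two transitions out of A on "0".
DELTA = {
    ("A", "1"): {"A"},
    ("A", "0"): {"A", "B"},
    ("B", "0"): {"C"},
}

history, accepted = nfa_run(DELTA, "A", {"C"}, "100")
print([sorted(s) for s in history])  # [['A'], ['A', 'B'], ['A', 'B', 'C']]
print(accepted)                      # True
```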
Summary
- NFAs and DFAs recognise the same set of languages (the regular languages)
- DFAs are faster to execute
  - There are no choices to consider
- NFAs are, in general, smaller and more compact
Lexical analysis process
For each kind of rexp, define an NFA
Notation: NFA for rexp $M$
- For $\varepsilon$
- For input $a$
- For $AB$
- For $A + B$
- For $A^*$
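One way to realize the five cases above is a set of small constructor functions, each returning an NFA fragment with a single start state and a single accepting state. This is a sketch of one common variant (a Thompson-style construction); the diagrams from the lecture are not available here, so the exact placement of $\varepsilon$-edges, the class name `Frag`, and all function names are assumptions.

```python
import itertools

_ids = itertools.count()  # source of fresh state numbers


class Frag:
    """NFA fragment; edges are (source, label, target), label None = epsilon-move."""

    def __init__(self, start, accept, edges):
        self.start, self.accept, self.edges = start, accept, edges


def eps():
    s, f = next(_ids), next(_ids)
    return Frag(s, f, [(s, None, f)])


def sym(a):
    s, f = next(_ids), next(_ids)
    return Frag(s, f, [(s, a, f)])


def concat(a, b):  # AB: accepting state of A --eps--> start of B
    return Frag(a.start, b.accept, a.edges + b.edges + [(a.accept, None, b.start)])


def union(a, b):  # A+B: new start branches to both, both join a new accept
    s, f = next(_ids), next(_ids)
    extra = [(s, None, a.start), (s, None, b.start),
             (a.accept, None, f), (b.accept, None, f)]
    return Frag(s, f, a.edges + b.edges + extra)


def star(a):  # A*: skip A entirely, or loop through it any number of times
    s, f = next(_ids), next(_ids)
    extra = [(s, None, a.start), (s, None, f),
             (a.accept, None, a.start), (a.accept, None, f)]
    return Frag(s, f, a.edges + extra)


def closure(frag, states):
    """States reachable from `states` using only epsilon-moves."""
    seen, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for src, label, dst in frag.edges:
            if src == s and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def matches(frag, text):
    current = closure(frag, {frag.start})
    for ch in text:
        moved = {dst for src, label, dst in frag.edges if src in current and label == ch}
        current = closure(frag, moved)
    return frag.accept in current


# (1+0)*1 -- build a fresh fragment for every occurrence of a symbol.
nfa = concat(star(union(sym("1"), sym("0"))), sym("1"))
print(matches(nfa, "0101"))  # True
print(matches(nfa, "10"))    # False
```

Fragments share no states as long as each constructor call is used only once in an expression, which is why every symbol occurrence gets its own `sym(...)` call.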
NFA to DFA
Introduction to the $\varepsilon$-closure
The $\varepsilon$-closure of a state is the set of states that can be reached from it by following only $\varepsilon$-moves, including the state itself.
$\varepsilon\text{-closure}(B) = \{B, C, D\}$
$\varepsilon\text{-closure}(G) = \{A, B, C, D, G, H, I\}$
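The $\varepsilon$-closure can be computed with a simple worklist. In the sketch below, the $\varepsilon$-edges are an assumption: the NFA diagram is not reproduced in the text, so the edges were reconstructed to be consistent with the two closures stated above.

```python
# Assumed epsilon-edges of the example NFA (reconstructed, not taken from a figure).
EPS_EDGES = {
    "A": {"B", "H"},
    "B": {"C", "D"},
    "E": {"G"},
    "F": {"G"},
    "G": {"A", "H"},
    "H": {"I"},
}


def epsilon_closure(eps_edges, states):
    """Return all states reachable from `states` via epsilon-moves only."""
    closure = set(states)       # every state is in its own closure
    worklist = list(states)
    while worklist:
        s = worklist.pop()
        for t in eps_edges.get(s, ()):
            if t not in closure:
                closure.add(t)
                worklist.append(t)
    return closure


print(sorted(epsilon_closure(EPS_EDGES, {"B"})))  # ['B', 'C', 'D']
print(sorted(epsilon_closure(EPS_EDGES, {"G"})))  # ['A', 'B', 'C', 'D', 'G', 'H', 'I']
```

The function takes a set of states, so the same code also gives the closure of a set such as `{"E", "J"}`.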
How many different states?
An NFA may be in many states at any time.
If the NFA has $N$ states and $S$ is the set of states it is currently in, then $|S| \leq N$.
There are $2^N - 1$ non-empty subsets of $N$ states, so the set of possible configurations is finite.
What is in NFA?
states $S$
start $s \in S$
final $F \subseteq S$
$a(X) = \{\, y \mid x \in X \wedge x \xrightarrow{a} y \,\}$
$X$ is a set of states and $a$ is a character of the input alphabet.
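The function $a(X)$ is a one-line set comprehension. The edge list below is an assumption (the labeled edges of an example NFA for $(1+0)^*1$, reconstructed because no diagram is available), and the name `move` is an illustrative choice.

```python
def move(transitions, states, symbol):
    """a(X): every y such that x --symbol--> y for some x in X."""
    return {y for (x, a, y) in transitions if x in states and a == symbol}


# Assumed labeled (non-epsilon) edges, written as (source, symbol, target).
TRANSITIONS = [("C", "1", "E"), ("D", "0", "F"), ("I", "1", "J")]

X = {"A", "B", "C", "D", "H", "I"}
print(sorted(move(TRANSITIONS, X, "1")))  # ['E', 'J']
print(sorted(move(TRANSITIONS, X, "0")))  # ['F']
```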
What is in DFA?
states subsets of $S$
start $\varepsilon\text{-clos}(s)$
final $\{\, X \mid X \cap F \neq \emptyset \,\}$
Each state $X$ of the DFA is a set of states of the NFA.
$X \xrightarrow{a} Y$ if $Y = \varepsilon\text{-clos}(a(X))$
Example
- Find the $\varepsilon$-closure of the start state.
  The first state of the DFA is therefore the subset of states $\{A, B, C, D, H, I\}$.
- Work out from the start state what happens on each of the possible input values.
  The alphabet of this machine is $\{0, 1\}$, so there are two transitions out of the state: one on input 1 and one on input 0.
  $\varepsilon\text{-clos}(F) = \{F, G, H, I, A, B, C, D\}$
  $\varepsilon\text{-clos}(\{E, J\}) = \{E, J, G, H, I, A, B, C, D\}$
- The result.
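The whole conversion can be sketched as a breadth-first exploration of reachable subsets. The NFA below is an assumption: it is reconstructed to agree with the closures given in this example (start state A, accepting state J), since the diagram itself is not available. All names are illustrative.

```python
from collections import deque

# Assumed NFA for the example (reconstructed from the closures in the text).
EPS = {"A": {"B", "H"}, "B": {"C", "D"}, "E": {"G"}, "F": {"G"},
       "G": {"A", "H"}, "H": {"I"}}
DELTA = {("C", "1"): {"E"}, ("D", "0"): {"F"}, ("I", "1"): {"J"}}
START, ACCEPTING, ALPHABET = "A", {"J"}, "01"


def eps_closure(states):
    closure, worklist = set(states), list(states)
    while worklist:
        s = worklist.pop()
        for t in EPS.get(s, ()):
            if t not in closure:
                closure.add(t)
                worklist.append(t)
    return closure


def move(states, symbol):
    result = set()
    for x in states:
        result |= DELTA.get((x, symbol), set())
    return result


def subset_construction():
    """Return (start, states, transitions, finals) of the DFA."""
    start = frozenset(eps_closure({START}))
    states, transitions = {start}, {}
    queue = deque([start])
    while queue:
        X = queue.popleft()
        for a in ALPHABET:
            Y = frozenset(eps_closure(move(X, a)))
            if not Y:
                continue  # no transition on this symbol
            transitions[(X, a)] = Y
            if Y not in states:
                states.add(Y)
                queue.append(Y)
    finals = {X for X in states if X & ACCEPTING}
    return start, states, transitions, finals


start, states, transitions, finals = subset_construction()
print(len(states))                                  # 3
print("".join(sorted(transitions[(start, "1")])))  # ABCDEGHIJ
```

Only subsets reachable from the start state are generated, so this NFA yields 3 DFA states rather than all $2^N - 1$ possible subsets.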
Table-driven implementation of DFA
A DFA can be implemented by a 2D table $T$
- One dimension is states
- Other dimension is input symbol
- For every transition $S_i \xrightarrow{a} S_k$ define $T[i, a] = k$
For each state $i$ and input symbol $a$, the table stores the next state $k$ that the machine will move to.
Example: how to realize the DFA as a table.
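As a sketch of filling in such a table, the code below applies $T[i, a] = k$ to a hypothetical three-state DFA over $\{0, 1\}$ in which every state moves to state 1 on input 0 and to state 2 on input 1. The DFA, the state numbering, and the helper name `build_table` are assumptions for illustration.

```python
def build_table(transitions, num_states, symbols):
    """Return (T, columns): T[i][columns[a]] = k for every transition i --a--> k."""
    columns = {a: j for j, a in enumerate(symbols)}   # symbol -> column index
    T = [[None] * len(symbols) for _ in range(num_states)]
    for i, a, k in transitions:
        T[i][columns[a]] = k
    return T, columns


# Hypothetical DFA transitions, written as (state, symbol, next state).
DFA_TRANSITIONS = [
    (0, "0", 1), (0, "1", 2),
    (1, "0", 1), (1, "1", 2),
    (2, "0", 1), (2, "1", 2),
]

T, columns = build_table(DFA_TRANSITIONS, num_states=3, symbols="01")
for state, row in enumerate(T):
    print(state, row)
# 0 [1, 2]
# 1 [1, 2]
# 2 [1, 2]
```

An entry left as `None` would mean the DFA has no transition for that state and symbol.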
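How would we use this transition relation in a program?
int i = 0;
int state = 0;                                    /* start in state 0 */
/* T is indexed by state and by character code; input is a 0-terminated string */
while (input[i] != '\0') {
    state = T[state][(unsigned char)input[i++]];  /* look up next state, then advance */
}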
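A runnable Python counterpart of this loop is sketched below. The table, the column mapping, and the accepting set describe a hypothetical three-state DFA, and the final acceptance check is an addition that the loop itself does not show. The sketch assumes the input contains only symbols of the alphabet.

```python
def run_dfa(table, columns, accepting, text, start=0):
    """Drive the table once per input symbol, then test for acceptance."""
    state = start
    for ch in text:
        state = table[state][columns[ch]]  # one table lookup per symbol
    return state in accepting


# Hypothetical table: rows are states 0..2, columns are the symbols "0" and "1".
T = [[1, 2], [1, 2], [1, 2]]
COLUMNS = {"0": 0, "1": 1}
ACCEPTING = {2}

print(run_dfa(T, COLUMNS, ACCEPTING, "101"))  # True
print(run_dfa(T, COLUMNS, ACCEPTING, "10"))   # False
```

The work per input character is a single indexed lookup, which is where the speed of the table-driven DFA comes from.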
When the table has many duplicate rows, a slightly different representation can save some space.
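One possible space-saving representation, sketched here under assumptions, stores each distinct row once and keeps a one-dimensional array that maps every state to its shared row. The names `share_rows` and `next_state` and the example table are illustrative.

```python
def share_rows(table):
    """Return (row_index, rows): row_index[state] is the position of that state's row in `rows`."""
    rows = []        # distinct rows, each stored once
    position = {}    # row contents -> index in `rows`
    row_index = []   # one entry per state
    for row in table:
        key = tuple(row)
        if key not in position:
            position[key] = len(rows)
            rows.append(list(row))
        row_index.append(position[key])
    return row_index, rows


def next_state(row_index, rows, state, column):
    """Look up a transition through the extra level of indirection."""
    return rows[row_index[state]][column]


# Hypothetical table in which all three states have the same row.
T = [[1, 2], [1, 2], [1, 2]]
row_index, rows = share_rows(T)
print(row_index)  # [0, 0, 0]
print(rows)       # [[1, 2]]
print(next_state(row_index, rows, 2, 0))  # 1
```

The trade-off is one extra indirection per lookup in exchange for storing fewer rows.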
A DFA built from an NFA with $N$ states can have up to $2^N - 1$ states.
Pros and cons of using the NFA directly
- We might not want to convert to a DFA at all. The particular specification we gave may be very expensive to turn into a DFA, and the table may become truly huge, so we might be better off using the NFA directly.
- But the inner loop for simulating the NFA is much more expensive, because we have to deal with sets of states rather than single states.
- This saves a lot of table space, but it can be much slower to execute than a deterministic automaton.
Summary
- NFA $\rightarrow$ DFA conversion is key in the implementation of lexical analysis (LA)
- Tools trade between speed and space
  - DFAs: faster, less compact
  - NFAs: slower, concise
Note: my English is limited, so please point out any mistakes. Thank you!