A finite state automaton is a way of looking at a computer, a way of solving parsing problems and a way of designing special-purpose hardware. At any time the automaton is in one of a finite number of internal states. It is a black box. You feed it input, usually a character at a time. It turns its internal crank, applies its rules, goes into a new state and accepts more input. Every once in a while it also produces some output. A simple parser might classify Java source as either Java code or comment. It would have states for having seen /, //, /*, /*… *, /*… */ etc.
In the days of C, the state was stored as an integer 0 .. n. You had a complicated nest of switch statements that examined both the current state and the input to determine the next state.
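The switch-based style can be sketched in Java rather than C. The example below is a minimal illustration, not code from the original page: the class name CommentClassifier, the state constants and the stripComments method are all hypothetical. It handles the comment states mentioned in the introduction and, as a simplifying assumption, ignores string and character literals.

```java
/** Switch-based automaton that removes comments from Java-like source. */
class CommentClassifier {
    // States stored as plain integers, in the style of older C programs.
    static final int CODE = 0;          // ordinary code
    static final int SLASH = 1;         // seen "/", which might start a comment
    static final int LINE_COMMENT = 2;  // inside // ...
    static final int BLOCK_COMMENT = 3; // inside /* ...
    static final int BLOCK_STAR = 4;    // inside /* ... and just saw "*"

    /** Returns the source with comments removed (string literals are not handled). */
    static String stripComments(String source) {
        StringBuilder out = new StringBuilder();
        int state = CODE;
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            // The outer switch examines the current state,
            // the inner tests examine the input.
            switch (state) {
                case CODE:
                    if (c == '/') state = SLASH;
                    else out.append(c);
                    break;
                case SLASH:
                    if (c == '/') state = LINE_COMMENT;
                    else if (c == '*') state = BLOCK_COMMENT;
                    else { out.append('/').append(c); state = CODE; }
                    break;
                case LINE_COMMENT:
                    if (c == '\n') { out.append(c); state = CODE; }
                    break;
                case BLOCK_COMMENT:
                    if (c == '*') state = BLOCK_STAR;
                    break;
                case BLOCK_STAR:
                    if (c == '/') state = CODE;
                    else if (c != '*') state = BLOCK_COMMENT;
                    break;
                default:
                    throw new IllegalStateException("unknown state " + state);
            }
        }
        if (state == SLASH) out.append('/'); // a lone trailing slash is code
        return out.toString();
    }
}
```

Calling CommentClassifier.stripComments on a string returns the same text with the line and block comments dropped. Every new state means another case, which is why this style becomes hard to follow as the automaton grows.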
In Java version 1.4 or earlier, one way of writing finite state automata was to have a singleton class represent each possible state. A state variable holds the current state. You feed the input to a standard method of the class’s interface and it computes the next state. That way, states that are very similar can inherit default behaviours. You can have a static method to categorise the input and a separate instance method to handle each category, or use a common method and a switch to dispatch to code based on input category to decide the next state. You don’t need any switch code based on current state. The dynamic method overriding features of Java handle that.
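A minimal sketch of the one-object-per-state approach follows, assuming a toy automaton that recognises optionally signed integers. The names NumState, onSign, onDigit, onOther and isInteger are invented for illustration. Anonymous subclasses stand in for separate singleton classes to keep the example short.

```java
/** One singleton object per state; method overriding selects the transition. */
abstract class NumState {
    // The singleton instances, one per state. Each overrides only the
    // handlers where it differs from the inherited default.
    static final NumState START = new NumState("START") {
        @Override NumState onSign()  { return SIGN; }
        @Override NumState onDigit() { return DIGITS; }
    };
    static final NumState SIGN = new NumState("SIGN") {
        @Override NumState onDigit() { return DIGITS; }
    };
    static final NumState DIGITS = new NumState("DIGITS") {
        @Override NumState onDigit() { return DIGITS; }
    };
    static final NumState ERROR = new NumState("ERROR") { };

    private final String name;

    NumState(String name) { this.name = name; }

    // Default behaviour inherited by every state: unexpected input is an error.
    NumState onSign()  { return ERROR; }
    NumState onDigit() { return ERROR; }
    NumState onOther() { return ERROR; }

    /** Categorises the input, then dispatches to the handler for that category. */
    final NumState next(char c) {
        if (c == '+' || c == '-') return onSign();
        if (c >= '0' && c <= '9') return onDigit();
        return onOther();
    }

    /** Runs the automaton over the whole string; DIGITS is the only accepting state. */
    static boolean isInteger(String s) {
        NumState state = START; // the state variable
        for (int i = 0; i < s.length(); i++) {
            state = state.next(s.charAt(i));
        }
        return state == DIGITS;
    }

    @Override public String toString() { return name; }
}
```

There is no switch on the current state anywhere: calling state.next(c) lands in whichever override belongs to the current state object.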
In Java version 1.5 or later, one way to write finite state automata is to use an enum constant to represent each state. A custom next method on each enum constant examines the input and calculates the next state. The various parsers used by JDisplay work using enums. The problem with this approach is that you must use statics for variables shared between states, so you can’t instantiate several different finite state automata.
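The enum approach might look like the sketch below, which collapses each run of white space to a single space. It is an illustration only, not the JDisplay code: SpaceState, its constants and collapse are hypothetical names. As an assumption of this sketch, the shared output buffer is passed to next as a parameter, which is one possible way to sidestep static variables.

```java
/** Each enum constant is a state; its next method picks the following state. */
enum SpaceState {
    /** The last character seen was not white space. */
    IN_TEXT {
        @Override SpaceState next(char c, StringBuilder out) {
            if (Character.isWhitespace(c)) {
                out.append(' '); // emit one space for the whole run
                return IN_SPACE;
            }
            out.append(c);
            return IN_TEXT;
        }
    },
    /** Inside a run of white space; one space has already been emitted. */
    IN_SPACE {
        @Override SpaceState next(char c, StringBuilder out) {
            if (Character.isWhitespace(c)) {
                return IN_SPACE; // swallow the extra white space
            }
            out.append(c);
            return IN_TEXT;
        }
    };

    /** Examines one input character, may write output, returns the next state. */
    abstract SpaceState next(char c, StringBuilder out);

    /** Runs the automaton over the whole string. */
    static String collapse(String s) {
        StringBuilder out = new StringBuilder();
        SpaceState state = IN_TEXT;
        for (int i = 0; i < s.length(); i++) {
            state = state.next(s.charAt(i), out);
        }
        return out.toString();
    }
}
```

Because the enum constants themselves hold no mutable data here, two calls to collapse cannot interfere with each other. The trade-off is that every piece of shared context has to travel through the next method's parameters.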
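Finite state automata come in two flavours: DFA (Deterministic Finite Automaton) and NFA (Non-deterministic Finite Automaton). In a DFA, the current state and the input determine exactly one next state. In an NFA, the same state and input may allow several possible next states; the NFA accepts the input if any sequence of choices leads to an accepting state. Every NFA can be converted to an equivalent DFA.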
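An NFA can be run on ordinary hardware by tracking the set of all states it could be in after each input character. The sketch below assumes a small three-state NFA, invented for illustration, that accepts strings ending in "ab"; the class name EndsWithAbNfa and its methods are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

/** Simulates an NFA by tracking every state it could currently be in. */
class EndsWithAbNfa {
    /** All states reachable from one state on one input character. */
    static Set<Integer> step(int state, char c) {
        Set<Integer> next = new HashSet<>();
        if (state == 0) {
            next.add(0);               // stay, guessing the ending has not begun
            if (c == 'a') next.add(1); // or guess this 'a' starts the final "ab"
        } else if (state == 1 && c == 'b') {
            next.add(2);               // the guess is confirmed
        }
        // State 2 has no outgoing transitions: any further input kills that path.
        return next;
    }

    /** Accepts if at least one path through the NFA ends in state 2. */
    static boolean accepts(String input) {
        Set<Integer> current = new HashSet<>();
        current.add(0); // start state
        for (char c : input.toCharArray()) {
            Set<Integer> following = new HashSet<>();
            for (int s : current) {
                following.addAll(step(s, c));
            }
            current = following;
        }
        return current.contains(2);
    }
}
```

State 0 has two possible moves on 'a', which is what makes this automaton non-deterministic. The simulation itself is fully deterministic: the same input always produces the same set of states and the same answer.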
You might wonder what use a non-deterministic program could be. Consider a program that is non-deterministic in the everyday sense of relying on random numbers: integration in two dimensions by the Monte Carlo method. All you need is a way of determining whether a point is inside or outside the region you want to integrate. You generate random points within a bounding area of known size and count how many fall inside the region. The fraction that fall inside, multiplied by the bounding area, approximates the area of the region. You want a random pattern of test points because any regular pattern could be thrown off by regularities in the shape of the region you are trying to integrate.
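As a rough sketch of Monte Carlo integration, the code below estimates the area of a unit circle by scattering random points over the enclosing 2 × 2 square. The choice of region, the class name MonteCarloArea and the seed parameter are assumptions made for this example.

```java
import java.util.Random;

/** Estimates the area of a region by counting random points that land inside it. */
class MonteCarloArea {
    /** The region to integrate: here, the unit circle centred on the origin. */
    static boolean inside(double x, double y) {
        return x * x + y * y <= 1.0;
    }

    /** Scatters points over the square from -1 to 1 on both axes. */
    static double estimateArea(int samples, long seed) {
        Random random = new Random(seed);
        int hits = 0;
        for (int i = 0; i < samples; i++) {
            double x = random.nextDouble() * 2.0 - 1.0;
            double y = random.nextDouble() * 2.0 - 1.0;
            if (inside(x, y)) {
                hits++;
            }
        }
        double boxArea = 4.0; // area of the bounding square
        // Fraction of points inside, scaled by the bounding area.
        return boxArea * hits / samples;
    }
}
```

With enough samples the estimate approaches π, the true area of the unit circle. Different seeds give slightly different answers, and the error shrinks only slowly as the number of samples grows.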
Here is the source code for the Compactor finite state automaton that removes excess white space from HTML to make it more compact for transmission. Compactor: compacts a group of files, a single file or a string.
Listing: http://mindprod.com/jgloss/snippet/iframe/Compactor.java.htm
HTMLCharCategory : categorises characters. Listing: http://mindprod.com/jgloss/snippet/iframe/HTMLCharCategory.java.htm
TagCategory. Listing: http://mindprod.com/jgloss/snippet/iframe/TagCategory.java.htm
HTMLState : the finite state automaton that parses the HTML and decides which spaces and new view