A monolithic object’s behavior is a function of its state, and it must change its behavior at run-time depending on that state. Or, an application is characterized by large and numerous case statements that vector flow of control based on the state of the application.
The State pattern is a solution to the problem of how to make behavior depend on state.
The State pattern does not specify where the state transitions will be defined. The choices are two: the “context” object, or each individual State derived class. The advantage of the latter option is ease of adding new State derived classes. The disadvantage is each State derived class has knowledge of (coupling to) its siblings, which introduces dependencies between subclasses.
A table-driven approach to designing finite state machines does a good job of specifying state transitions, but it is difficult to add actions to accompany the state transitions. The pattern-based approach uses code (instead of data structures) to specify state transitions, but it does a good job of accommodating state transition actions.
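To make the table-driven alternative concrete, here is a minimal sketch in Object Pascal. The states, input classes, and transitions are hypothetical (loosely based on quoted and unquoted CSV fields) and are not taken from any particular implementation. The whole transition function lives in one constant array, which is why transitions are easy to specify and why there is no natural place to attach per-transition actions.

```pascal
program TableDrivenFsm;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Hypothetical states and input classes, for illustration only
  TState = (stFieldStart, stScanField, stScanQuoted, stEndQuoted, stError);
  TInput = (inQuote, inComma, inOther);

const
  // The transition function is pure data: Transitions[current state, input]
  Transitions : array[TState, TInput] of TState = (
    { stFieldStart } (stScanQuoted, stFieldStart, stScanField),
    { stScanField  } (stError,      stFieldStart, stScanField),
    { stScanQuoted } (stEndQuoted,  stScanQuoted, stScanQuoted),
    { stEndQuoted  } (stScanQuoted, stFieldStart, stError),
    { stError      } (stError,      stError,      stError)
  );

// Reduce a character to one of the three input classes
function Classify(Ch : Char) : TInput;
begin
  case Ch of
    '"' : Result := inQuote;
    ',' : Result := inComma;
  else
    Result := inOther;
  end;
end;

// Run the machine over a string and report the state it ends in
function FinalState(const s : string) : TState;
var
  i : Integer;
begin
  Result := stFieldStart;
  for i := 1 to Length(s) do
    Result := Transitions[Result, Classify(s[i])];
end;

begin
  // The table says where to go next, but not what to do on the way there
  if FinalState('ab,"cd"') = stEndQuoted then
    WriteLn('ended after a closing quote')
  else
    WriteLn('ended elsewhere');
end.
```

Adding an action such as "append the character to the current field" would mean widening the table entries or adding a second table, which is the difficulty described above.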
The state machine’s interface is encapsulated in the “wrapper” class. The wrappee hierarchy’s interface mirrors the wrapper’s interface with the exception of one additional parameter. The extra parameter allows wrappee derived classes to call back to the wrapper class as necessary. Complexity that would otherwise drag down the wrapper class is neatly compartmented and encapsulated in a polymorphic hierarchy to which the wrapper object delegates.
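The wrapper and wrappee arrangement can be sketched as follows. The names (TLamp, TLampState, and so on) are hypothetical and chosen only for illustration. Each state method takes the wrapper as its one extra parameter and uses it to call back when a transition is needed. The state objects are cached in the wrapper here, which is one possible choice among several.

```pascal
program WrapperWrappee;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TLamp = class; // Forward declaration of the wrapper

  // Wrappee base class: mirrors the wrapper's methods, plus one parameter
  TLampState = class(TObject)
  public
    procedure TurnOn(ALamp : TLamp); virtual;
    procedure TurnOff(ALamp : TLamp); virtual;
  end;

  TLampOnState = class(TLampState)
  public
    procedure TurnOff(ALamp : TLamp); override;
  end;

  TLampOffState = class(TLampState)
  public
    procedure TurnOn(ALamp : TLamp); override;
  end;

  // Wrapper: the only class that clients use
  TLamp = class(TObject)
  private
    FOnState : TLampOnState;
    FOffState : TLampOffState;
    FState : TLampState;
  public
    constructor Create;
    destructor Destroy; override;
    procedure SwitchedOn;  // called back by the states
    procedure SwitchedOff;
    procedure TurnOn;
    procedure TurnOff;
  end;

procedure TLampState.TurnOn(ALamp : TLamp);
begin
  WriteLn('already on'); // default when the request changes nothing
end;

procedure TLampState.TurnOff(ALamp : TLamp);
begin
  WriteLn('already off');
end;

procedure TLampOnState.TurnOff(ALamp : TLamp);
begin
  WriteLn('on -> off');
  ALamp.SwitchedOff; // call back through the extra parameter
end;

procedure TLampOffState.TurnOn(ALamp : TLamp);
begin
  WriteLn('off -> on');
  ALamp.SwitchedOn;
end;

constructor TLamp.Create;
begin
  inherited Create;
  FOnState := TLampOnState.Create;
  FOffState := TLampOffState.Create;
  FState := FOffState;
end;

destructor TLamp.Destroy;
begin
  FOnState.Free;
  FOffState.Free;
  inherited Destroy;
end;

procedure TLamp.SwitchedOn;
begin
  FState := FOnState;
end;

procedure TLamp.SwitchedOff;
begin
  FState := FOffState;
end;

procedure TLamp.TurnOn;
begin
  FState.TurnOn(Self); // delegate, passing the wrapper along
end;

procedure TLamp.TurnOff;
begin
  FState.TurnOff(Self);
end;

var
  Lamp : TLamp;
begin
  Lamp := TLamp.Create;
  try
    Lamp.TurnOn;  // prints "off -> on"
    Lamp.TurnOn;  // prints "already on"
    Lamp.TurnOff; // prints "on -> off"
  finally
    Lamp.Free;
  end;
end.
```

The wrapper contains no conditional logic at all; which message is printed depends entirely on which state object it is currently delegating to.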
The State pattern allows an object to change its behavior when its internal state changes. This pattern can be observed in a vending machine. Vending machines have states based on the inventory, amount of currency deposited, the ability to make change, the item selected, etc. When currency is deposited and a selection is made, a vending machine will either deliver a product and no change, deliver a product and change, deliver no product due to insufficient currency on deposit, or deliver no product due to inventory depletion.
This session consists of the development of a small application to read and pretty-print XML and CSV files. Along the way, we explain and demonstrate the use of the following patterns: State, Interpreter, Visitor, Strategy, Command, Memento, and Facade.
Allow an object to alter its behaviour when its internal state changes. The object will appear to change its class.
The State pattern is used when an object’s behaviour changes at run-time depending on its state. Indicators of the potential for using the pattern are long case statements or lists of conditional statements (the Switch Statements “bad smell”, to use refactoring parlance). In Delphi (as in most languages) a given object cannot actually change its class, so we have to use other schemes to mimic that behaviour, as we shall see.
The participants in an implementation are the context and the states. The context is the interface presented to clients of the subsystem being modelled by the State pattern. In our case this will be the TCsvParser class. Clients will never see the states, allowing us to change them at will. The only interface client subsystems are interested in is extracting the fields from a line of text.
We do this by using a finite state machine (FSM). Essentially, an FSM is a model of a set of states. From each state, particular inputs can cause transitions to other states. There are two sorts of special states. The Start state is the state the FSM is in before beginning work. End states are those where the processing finishes, and are usually denoted by double circles. The FSM for the parser is shown below:
In the State pattern, each of the states becomes a subclass of the base state class. Each subclass must implement the abstract method ProcessChar, which handles the input character and decides on the next state.
The interface section of the unit that uses the State pattern to parse CSV files is:
unit CsvParser;

interface

uses Classes;

type

  TCsvParser = class; // Forward declaration
  TParserStateClass = class of TCsvParserState;

  TCsvParserState = class(TObject)
  private
    FParser : TCsvParser;

    procedure ChangeState(NewState : TParserStateClass);
    procedure AddCharToCurrField(Ch : Char);
    procedure AddCurrFieldToList;
  public
    constructor Create(AParser : TCsvParser);
    procedure ProcessChar(Ch : AnsiChar; Pos : Integer); virtual; abstract;
  end;

  TCsvParserFieldStartState = class(TCsvParserState)
  private
  public
    procedure ProcessChar(Ch : AnsiChar; Pos : Integer); override;
  end;

  TCsvParserScanFieldState = class(TCsvParserState)
  private
  public
    procedure ProcessChar(Ch : AnsiChar; Pos : Integer); override;
  end;

  TCsvParserScanQuotedState = class(TCsvParserState)
  private
  public
    procedure ProcessChar(Ch : AnsiChar; Pos : Integer); override;
  end;

  TCsvParserEndQuotedState = class(TCsvParserState)
  private
  public
    procedure ProcessChar(Ch : AnsiChar; Pos : Integer); override;
  end;

  TCsvParserGotErrorState = class(TCsvParserState)
  private
  public
    procedure ProcessChar(Ch : AnsiChar; Pos : Integer); override;
  end;

  TCsvParser = class(TObject)
  private
    FState : TCsvParserState;
    // Cache state objects for greater performance
    FFieldStartState : TCsvParserFieldStartState;
    FScanFieldState : TCsvParserScanFieldState;
    FScanQuotedState : TCsvParserScanQuotedState;
    FEndQuotedState : TCsvParserEndQuotedState;
    FGotErrorState : TCsvParserGotErrorState;
    // Fields used during parsing
    FCurrField : string;
    FFieldList : TStrings;

    function GetState : TParserStateClass;
    procedure SetState(const Value : TParserStateClass);
  protected
    procedure AddCharToCurrField(Ch : Char);
    procedure AddCurrFieldToList;
    property State : TParserStateClass read GetState write SetState;
  public
    constructor Create;
    destructor Destroy; override;

    procedure ExtractFields(const s : string; AFieldList : TStrings);
  published
  end;
If we examine the parser class first, we see that we have a private instance of each of the state subclasses. In our case, where we could be parsing very long files, and the state is changing frequently, it makes sense to create all the objects once, and keep track of the current state.
If you have a situation where you have very many states (which is when this pattern really starts making a difference), especially if they are only needed occasionally, then it makes more sense to create and free the states on the fly. This might be an opportunity to use the automatic garbage collection property of interfaces, but be careful not to mix class and interface access to the state objects. It might also be a time to consider the Flyweight pattern (I’m going to refer you to the GoF for that).
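A minimal sketch of the interface-based alternative, assuming a hypothetical IToggleState interface with two hypothetical implementing classes. None of these names come from CsvParser.pas. Each state creates its successor on demand, and reference counting frees the old state once nothing refers to it. The state is only ever held through interface variables, which is the precaution mentioned above.

```pascal
program InterfaceStates;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Hypothetical state interface: a state names itself and picks its successor
  IToggleState = interface
    function Describe : string;
    function Next : IToggleState;
  end;

  TIdleState = class(TInterfacedObject, IToggleState)
  public
    function Describe : string;
    function Next : IToggleState;
  end;

  TBusyState = class(TInterfacedObject, IToggleState)
  public
    function Describe : string;
    function Next : IToggleState;
  end;

function TIdleState.Describe : string;
begin
  Result := 'idle';
end;

function TIdleState.Next : IToggleState;
begin
  Result := TBusyState.Create; // created on demand, not cached
end;

function TBusyState.Describe : string;
begin
  Result := 'busy';
end;

function TBusyState.Next : IToggleState;
begin
  Result := TIdleState.Create;
end;

var
  State, Following : IToggleState; // interface references only
  i : Integer;
begin
  State := TIdleState.Create;
  for i := 1 to 3 do begin
    WriteLn(State.Describe);  // prints "idle", then "busy", then "idle"
    Following := State.Next;  // new state object created here
    State := Following;       // previous state is freed automatically
  end;
end.
```

The cost is one allocation per transition, so for a parser that changes state on almost every character the cached-object design is probably still the better fit.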
Note that we are keeping track of the state using the class of the current state object. We can use a protected property (an example of the Self Encapsulate Field refactoring, as it happens) to access the field. The parser class also keeps the current field and the list of extracted fields. The states will use the protected methods to update them.
The states can manage this because the parser is passed as a parameter in the constructor. It is quite common for state objects to need access to the context in which they are being used. The base abstract state class defines methods for changing state, and updating the parser. Descendant classes only need to implement the character processing routine.
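The implementation section is not shown here, so the following is only a guess at how these pieces might fit together, cut down to two hypothetical states (TPlainState and TQuotedState) and a hypothetical TMiniParser. It is not the code in CsvParser.pas. The state asks for a transition by class, and the parser's property setter maps that class onto the matching cached instance.

```pascal
program ClassRefStates;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TMiniParser = class; // Forward declaration
  TMiniStateClass = class of TMiniState;
  TMiniState = class(TObject)
  private
    FParser : TMiniParser;
    procedure ChangeState(NewState : TMiniStateClass);
  public
    constructor Create(AParser : TMiniParser);
    procedure ProcessChar(Ch : Char); virtual; abstract;
  end;
  TPlainState = class(TMiniState)
    procedure ProcessChar(Ch : Char); override;
  end;
  TQuotedState = class(TMiniState)
    procedure ProcessChar(Ch : Char); override;
  end;

  TMiniParser = class(TObject)
  private
    FState : TMiniState;
    FPlainState : TPlainState; // cached state objects
    FQuotedState : TQuotedState;
    function GetState : TMiniStateClass;
    procedure SetState(const Value : TMiniStateClass);
  public
    constructor Create;
    destructor Destroy; override;
    procedure ProcessChar(Ch : Char);
    property State : TMiniStateClass read GetState write SetState;
  end;

constructor TMiniState.Create(AParser : TMiniParser);
begin
  inherited Create;
  FParser := AParser; // remember the context for later call-backs
end;

procedure TMiniState.ChangeState(NewState : TMiniStateClass);
begin
  FParser.State := NewState; // goes through the parser's SetState
end;

procedure TPlainState.ProcessChar(Ch : Char);
begin
  if Ch = '"' then ChangeState(TQuotedState);
end;
procedure TQuotedState.ProcessChar(Ch : Char);
begin
  if Ch = '"' then ChangeState(TPlainState);
end;

constructor TMiniParser.Create;
begin
  inherited Create;
  FPlainState := TPlainState.Create(Self);
  FQuotedState := TQuotedState.Create(Self);
  FState := FPlainState;
end;

destructor TMiniParser.Destroy;
begin
  FPlainState.Free;
  FQuotedState.Free;
  inherited Destroy;
end;

function TMiniParser.GetState : TMiniStateClass;
begin
  Result := TMiniStateClass(FState.ClassType);
end;

procedure TMiniParser.SetState(const Value : TMiniStateClass);
begin
  if Value = TPlainState then FState := FPlainState // pick cached instance
  else if Value = TQuotedState then FState := FQuotedState;
end;

procedure TMiniParser.ProcessChar(Ch : Char);
begin
  FState.ProcessChar(Ch); // delegate to the current state
end;

var
  Parser : TMiniParser;
begin
  Parser := TMiniParser.Create;
  try
    Parser.ProcessChar('"');
    WriteLn(Parser.State.ClassName); // prints "TQuotedState"
  finally
    Parser.Free;
  end;
end.
```

Because everything lives in one module, the states can reach the parser's State property regardless of its visibility, which is the same reason the protected members in the unit above are usable from the state classes.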
Let’s have a look at one of these routines, for the start state.
procedure TCsvParserFieldStartState.ProcessChar(Ch : AnsiChar; Pos : Integer);
begin
  case Ch of
    '"' : ChangeState(TCsvParserScanQuotedState);
    ',' : AddCurrFieldToList;
  else
    AddCharToCurrField(Ch);
    ChangeState(TCsvParserScanFieldState);
  end;
end;
If we get a double quote, the FSM goes into the Scan Quoted state. A comma means we have come to the end of the field, so we add it to the list. Anything else means we are starting a new field.
However, in the Scan Quoted state shown below, the transition when we get a double quote is different. This is what we mean by the behaviour depending on the state.
procedure TCsvParserScanQuotedState.ProcessChar(Ch : AnsiChar; Pos : Integer);
begin
  if (Ch = '"') then begin
    ChangeState(TCsvParserEndQuotedState);
  end else begin
    AddCharToCurrField(Ch);
  end;
end;
The rest of the code is quite straightforward. The only slightly different state is the Error state, where we raise an exception. The parser has one long method, only because it has to handle validity checks, setting up, and so on. The essential lines of ExtractFields are:
// Read through all the characters in the string
for i := 1 to Length(s) do begin
  // Get the next character
  Ch := s[i];
  FState.ProcessChar(Ch, i);
end;
This reads through the input line s, sending each character to the current state. Some sort of processing loop like this is not uncommon. I’ll leave the rest of the code to go through at your leisure. It’s all in CsvParser.pas.