Скачать книгу
of in order. For example, the automaton in Figure 5.2 generates someone really ran, but not someone ran really or someone ran quickly.
To see the connection between Type 3 rewriting grammars and finite‐state automata, we associate states with nonterminal symbols, and in particular associate the start state of an automaton with the starting nonterminal of a grammar. The grammar's rewrite rules correspond to the automaton's transitions. Consider for example the transition from state A to state B emitting someone. If we think of each nonterminal as characterizing all strings that can be used to move the automaton from state to an accepting state, then the rewrite rule “A someone B” says that a string can move the automaton from state A to an accepting state if it's made up of the symbol someone followed by a string that moves from state B to an accepting state. For transitions into an accepting state we include an additional rule with no nonterminal on the right hand side, such as “B ran”.
The string‐rewriting derivation and the tree structure in Figure 5.3 both show that someone ran and ran really quickly is generated by both the rewriting grammar and the automaton in Figure 5.2.
Figure 5.2 A simple finite‐state grammar, represented graphically and as rewrite rules
Figure 5.3 Two equivalent representations of how the grammar in Figure 5.2 generates the string someone ran and ran really quickly
The central idea in understanding the capabilities and limitations of FSGs is what I will call a string's forward set. Adopting the automaton perspective, the forward set of a string is the set of states that the automaton could reach from its initial state by taking some sequence of transitions that produce . For example, if the automaton in Figure 5.2 has, from its initial state, taken some sequence of transitions that produces someone ran, then the only state that it might be in is C; so the forward set of someone ran (for this automaton) is . Similarly, the forward set of someone really is , the forward set of someone ran and is , and the forward set of someone and is the empty set. The automaton generates a string if and only if the forward set of the string contains an accepting state.
Importantly, knowing the forward set of some initial part of a string (i.e. a prefix of the string, in formal terms) provides a head‐start toward calculating the forward set of the entire string. For example, suppose we are told that the forward set (using the grammar in Figure 5.2) of some particular string is . Then, writing for string concatenation, it is straightforward to see that the forward set of the string is , and that the forward set of is . We can be sure of this without knowing anything about except its forward set. The forward set captures everything there is to know about how the automaton treats the prefix ; it identifies what we can think of as the category of that subexpression.
This leads us to the crucial idea of intersubstitutability mentioned in the introduction. Notice that someone ran and someone really ran both have the same forward set, namely . So for any string , the forward set of must be the same as the forward set of – because in each case, all that matters is how the string can “move us forward” from whatever states happen to be in the common forward set of and . Furthermore, the two strings and are either both generated by the grammar, or neither is, since these two strings' common forward set either does or doesn't contain an accepting state.