A Companion to Chomsky. Группа авторов
an immediate constituent analysis for the generated string. For example, there is no natural way to assign such a tree structure to the string aaaa on the basis of the derivation shown in Figure 5.1. This is roughly because the grammar contains rules that do not simply replace a single symbol with a string (Chomsky and Miller, 1963, p. 294).
Another computational device that had attracted much attention in the 1950s was the finite‐state automaton (FSA) (e.g. Rabin and Scott, 1959). Chomsky (1956) investigated the generative capacity of FSAs and concluded that it was insufficient for natural language, and showed that a certain sort of phrase structure grammar formalizing the notion of immediate constituent analysis could handle at least some cases that FSAs could not.11
The classes of rewriting grammars investigated in Chomsky 1959 are therefore motivated by the interest in “devices with more generative capacity than finite automata but that are more structured (and, presumably, have less generative capacity) than arbitrary Turing machines” (Chomsky, 1963, p. 360), the idea being that “the intermediate systems are those that assign a phrase structure description to the resulting sentence” (Chomsky, 1959, p. 139). A sequence of three increasingly strict restrictions on rewriting grammars (Chomsky, 1959, p. 142) produces a hierarchy of classes of grammars, the broadest of which corresponds to Turing machines and the narrowest of which corresponds to finite‐state automata.
These three restrictions are shown in Table 5.1. In this table,
Table 5.1 Three increasing restrictions on rewriting grammars
Restriction | Required form of rules | ||
---|---|---|---|
Restriction 1, “context‐sensitive” |
|
where |
|
Restriction 2, “context‐free” |
|
where |
|
Restriction 3, “finite‐state” |
|
where |
|
and |
|
To be precise, these are restrictions on the form that individual rules can take, and a grammar is of a certain type if and only if all of its rules satisfy the corresponding restriction. But notice that rules that satisfy Restriction 2 necessarily also satisfy Restriction 1 (by choosing
1 The Type 0 grammars are simply all unrestricted rewriting grammars.The Type 1 grammars are those satisfying Restriction 1.The Type 2 grammars are those satisfying Restriction 2 (and therefore also Restriction 1).The Type 3 grammars are those satisfying Restriction 3 (and therefore also Restrictions 1 and 2).
The grammar in Figure 5.1 is a Type 0 grammar and does not belong to any of the more restricted classes, since rules such as “a B
It turns out that all four classes of grammars are distinct in their generative capacity: at each boundary, there are stringsets that can be generated by a Type
5.3 Type 3 Grammars: Finite State Grammars
As mentioned above, Type 3 grammars were understood from the outset to be a reformulation of finite‐state automata in the setting of rewriting grammars, so I will generally describe them as “finite‐state grammars” (FSGs).
Figure 5.2 shows an example of a FSG in rewrite‐rule format, and also the equivalent finite‐state automaton in a standard graphical representation. A finite‐state automaton works with a specified finite set of states (indicated with circles in the diagram) and a specified set of (terminal) symbols. One of the states is the designated start state (indicated with an unlabeled arrow), and some of the states are designated accepting states (indicated with a double circle). The workings of the automaton are specified as a set of transitions, each associating an ordered pair of states with a symbol, and represented graphically by an arrow.
A finite‐state automaton generates a string