A Companion to Chomsky. Multiple authors
19 The pair (buy, which book) and the pair (eat, what) are intersubstitutable, in the sense that we can replace the former with the latter in “which book did you buy yesterday” to produce “what did you eat yesterday”.
Chomsky's chosen approach to these phenomena, the notion of a grammatical transformation – extensively elaborated elsewhere (e.g., Chomsky, 1957, 1965) but formally somewhat removed from the work described here – was one way to resolve the tension created by the conflation of intersubstitutability and contiguity.20 In the transformational approach, the patterns described in (11) are handled by first deriving a structure in which the co‐dependent elements (e.g., will and come, or John and to be tall) are linearly contiguous, in a base component which functions essentially like a CFG and therefore ties contiguity and intersubstitutability together. This correctly prevents generating an expression that contains will without an accompanying verb like come (see (9)), or contains which book without a verb to select it, or contains the predicate be tall without a subject – but at the cost of grouping these co‐dependent elements together in ways that do not align with their relative linear positions. The transformational component resolves the tension created by tying co‐occurrence to contiguity, transforming a structure that has such co‐dependent elements adjacent into one where they are separated.
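The two-stage idea described above can be illustrated with a deliberately small sketch. It is not Chomsky's formalism: the function names `base_structure` and `front_wh`, the fixed sentence frame, and the token-list representation are all assumptions made for illustration. The first function plays the role of the base component, keeping the verb and the phrase it selects adjacent; the second plays the role of a transformation that separates them.

```python
def base_structure(verb, wh_phrase):
    """Toy base component: the verb and the phrase it selects are adjacent."""
    return ["you", "did", verb, wh_phrase, "yesterday"]


def front_wh(tokens, wh_phrase):
    """Toy transformation: front the wh-phrase and invert subject and auxiliary."""
    rest = [t for t in tokens if t != wh_phrase]
    subject, aux, *tail = rest
    return [wh_phrase, aux, subject, *tail]


base = base_structure("buy", "which book")
surface = front_wh(base, "which book")
print(" ".join(base))     # co-dependent elements contiguous
print(" ".join(surface))  # co-dependent elements separated
```

In this sketch the co-occurrence requirement is enforced only in `base_structure`, where contiguity makes it easy to state; the surface order is produced afterwards.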
But another logical possibility, when we are confronted with the patterns in (11), is to simply break the link between co‐dependence and linear contiguity right from the beginning. Multiple context‐free grammars (MCFGs) (Seki et al., 1991) provide a canonical instantiation of this option; see, e.g., Kallmeyer (2010, ch. 6) and Clark (2014) for overviews. Derivations in these grammars are most naturally understood in terms of a “bottom‐up” composition process, unlike the “top‐down” rewriting grammars that serve as the framework for the Chomsky hierarchy. MCFGs have proven to be a useful reference point for understanding and comparing various mildly context‐sensitive grammar formalisms (Joshi, 1985; Joshi et al., 1990), which sit between CFGs and CSGs on the scale of generative capacity, including formalisms expressed in terms of transformation‐like tree‐manipulating operations, such as Minimalist Grammars (Stabler, 1997, 2011) and Tree‐Adjoining Grammars (Joshi et al., 1975; Abeillé and Rambow, 2000; Frank, 2002).
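A minimal sketch of the bottom-up, tuple-based composition that MCFGs use may help here. The lexicon, the category names (`V`, `WH`, `VP`, `S`) and the fixed frame “did you ... yesterday” are assumptions chosen for illustration, not rules taken from Seki et al. (1991). The point is only that a category can derive a pair of strings, so that co-dependent elements are grouped together without ever being linearly adjacent.

```python
from itertools import product

# Hypothetical toy lexicon: each category derives tuples of strings.
LEXICON = {
    "V": [("buy",), ("eat",)],
    "WH": [("which book",), ("what",)],
}


def combine_vp(v, wh):
    """VP(v, wh) <- V(v), WH(wh): group the two parts as a pair, unordered in the string."""
    return (v[0], wh[0])


def combine_s(vp):
    """S(wh did you v yesterday) <- VP(v, wh): place the two parts discontiguously."""
    v, wh = vp
    return (f"{wh} did you {v} yesterday",)


def derive():
    """Build every VP pair bottom-up, then every sentence from those pairs."""
    vps = [combine_vp(v, wh) for v, wh in product(LEXICON["V"], LEXICON["WH"])]
    sentences = [combine_s(vp)[0] for vp in vps]
    return vps, sentences


vps, sentences = derive()
for s in sentences:
    print(s)
```

Because `combine_s` accepts any member of the `VP` category, the pairs (buy, which book) and (eat, what) are intersubstitutable in exactly the sense described earlier, even though neither pair ever forms a contiguous substring.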
5.6 Conclusion
The notion of intersubstitutability of subexpressions, or categorization of subexpressions into equivalence classes, is tightly related to the very idea of a grammar itself. Grammar formalisms differ in the ways that they compose these subexpressions (e.g. prefix‐suffix combinations, infix‐circumfix combinations), but this composition is mediated by categorization. Any interesting system of categorization involves isolating out the properties of a subexpression that affect its combinatory potential, and those that don't; those properties that need to be remembered or tracked, and those that can be safely ignored or forgotten. If everything is remembered and nothing is forgotten, a grammar reduces to a list of stored complete expressions (recall Figure 5.6); at the other extreme, a grammar that remembers nothing treats all subexpressions interchangeably, and therefore generates a set of expressions that exhibits no regularities. An interesting grammar is one that sits in between these two extremes, yielding constrained productivity.
The overall perspective that I have offered here is somewhat more optimistic about lasting contributions of the Chomsky hierarchy than linguists have generally been since the 1960s – not more optimistic about the role string‐generating grammars can play in linguistic theory, but more optimistic about the role that insights gleaned from the careful study of string‐generating grammars can play in an understanding of any kind of grammar.
Endnotes
1 Thanks to Bob Frank, Bruce Hayes, Jeff Heinz, Norbert Hornstein, Kyle Johnson, Paul Pietroski, and the editors and reviewers for helpful comments and suggestions on earlier drafts.
2 Chomsky (1956), Chomsky and Miller (1958), Chomsky (1959, 1963), Chomsky and Miller (1963), Miller and Chomsky (1963).
3 For standard presentations from the general perspective of the theory of computation, see e.g. Hopcroft and Ullman (1979), Lewis and Papadimitriou (1981) and Sipser (1997). For more linguistics‐oriented presentations, see e.g. Levelt (1974), Partee et al. (1990).
4 For generalizations beyond the case of strings as the generated objects, see the rich literature on tree grammars (e.g. Thatcher, 1967, 1973; Thatcher and Wright, 1968; Rounds, 1970; Rogers, 1997; Comon et al., 2007). Generalizations beyond binary grammaticality arise via the theory of semirings (e.g. Kuich, 1997; Goodman, 1999; Mohri, 2002).
5 Notice that the argument here does not concern the usefulness of the traditional notion of weak generative capacity that emerges from the original work on the Chomsky hierarchy, or the viewpoint which equates natural languages with sets of strings and asks where those sets of strings fall on the hierarchy (or extensions of it). The main point I hope to make here is that the usefulness of the Chomsky hierarchy for theoretical linguistics need not be limited to what emerges from those traditional and better‐known perspectives.
6 See e.g. Carnie (2013, pp. 48–50), Fromkin et al. (2000, pp. 147–151). Johnson (2019) gives a particularly clear presentation of the fundamental relationship between substitution classes and phrase structure.
7 See e.g. Chomsky (1963, pp. 358–359), Levelt (1974, pp. 106–109), Partee et al. (1990, pp. 516–517), Hopcroft and Ullman (1979, pp. 221–223).
8 This is based on an example from Hopcroft and Ullman (1979, pp. 220–221).
9 Much of the technical literature uses the term “language” here, but this creates unnecessary distractions.
10 See also Chomsky (1956, Section 3.1); Chomsky and Miller (1963, pp. 288–289).
11 The phrase structure grammars considered in section 3 of Chomsky (1956) do not correspond exactly to any of the classes in (1) that are discussed in Chomsky (1959).
12 Citing Harris (1951), Chomsky (2006, p. 172, fn. 15) writes that “The concept of ‘phrase structure grammar’ was explicitly designed to express the richest system that could reasonably be expected to result from the application of Harris‐type procedures to a corpus.”
13 Hopcroft and Ullman (1979, p. 224) show that this stringset can be generated by a grammar consisting of rules α → β where β is at least as long as α. The stringsets generable by grammars satisfying this “non‐contracting” requirement are the same as those generable by Type 1 grammars (Chomsky, 1959, pp. 144–145). The non‐contracting requirement is sometimes given as an alternative condition defining Type 1 grammars, e.g. Levelt (1974, pp. 27–29). Chomsky (1963, pp. 360–363), departing from the Chomsky (1959) numbering system that has now become standard, defines Type 1 grammars with the non‐contracting requirement, and calls grammars with rules satisfying the context‐sensitive format of Chomsky (1959) “Type 2 grammars.”