A Companion to Chomsky. Group of authors
5 The Chomsky Hierarchy1
TIM HUNTER
University of California–Los Angeles, Los Angeles, California
5.1 Introduction
An important cluster of closely related early Chomsky papers2 had two major consequences. First, they defined a new branch of mathematics, formal language theory, which has flourished in its own right. But second, and more importantly for our purposes, this new branch of mathematics provided the formal grounding for a new conception of linguistics in which grammars, rather than sentences or collections of sentences, were the scientifically central objects: instead of being derived from collections of sentences as compact summaries of observed regularities, grammars are seen as (ultimately mental) systems that determine the status of sentences. The “observed regularities” come to be seen as consequences of the structure of the underlying system, the grammar. The classification of grammars that became known as the Chomsky hierarchy was an exploration of what kinds of regularities could arise from grammars that had various conditions imposed on their structure.
Rather than laying out the mathematical theory in complete detail – numerous sources already provide this3 – my aim in this chapter is to bring out some key intuitions that emerge from the theory and to highlight their applicability to theoretical linguistics. Looking at a completely formal treatment makes it easy to overestimate the degree to which the important concepts are bound to certain idealizations, such as the restriction to strings as the objects being generated and a binary grammaticality/ungrammaticality distinction.4 While those idealizations are there in the theory, I hope to make the case that certain intuitions that emerge from the theory are meaningful and useful in ways that transcend those idealizations.5 To the extent that I succeed in making this case, the reader will be able to turn to the formal literature with some motivating ideas in mind about the important concepts to watch out for.
One idea that plays a major role is the intersubstitutability of subexpressions. This is familiar from the distributional approach to discovering syntactic categories that is sometimes presented in introductory textbooks.6 We reach the conclusion that cat and dog belong to the same category, for example, by noting that substituting one for the other does not change a sentence's grammaticality. While we might introduce the term “noun” or the book‐keeping symbol N as a label for the class that cat and dog both belong to, there is nothing to being a noun other than being intersubstitutable with other nouns; the two‐place predicate “belongs to the same category as” is more fundamental than the one‐place predicate “is a noun.” (This diverges from the view where a noun, for example, would be defined as a word that describes a person, place or thing.)
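The distributional test just described can be sketched in a few lines of Python. This is only a minimal illustration under strong assumptions: the six-sentence set TOY stands in for "the grammatical sentences" (a real language is not a finite list), and the function names contexts and intersubstitutable are hypothetical labels introduced here. Two words count as intersubstitutable when they occur in exactly the same contexts, which is equivalent to saying that swapping one for the other never changes a sentence's status.

```python
def contexts(word, sentences):
    """Collect every (left, right) context in which `word` occurs."""
    found = set()
    for sentence in sentences:
        tokens = sentence.split()
        for i, token in enumerate(tokens):
            if token == word:
                found.add((tuple(tokens[:i]), tuple(tokens[i + 1:])))
    return found


def intersubstitutable(a, b, sentences):
    """True if `a` and `b` occur in exactly the same contexts."""
    return contexts(a, sentences) == contexts(b, sentences)


# Assumption: a toy, finite stand-in for the set of grammatical sentences.
TOY = {
    "the cat sleeps",
    "the dog sleeps",
    "the cat sees the dog",
    "the dog sees the cat",
    "the cat sees the cat",
    "the dog sees the dog",
}

same_category = intersubstitutable("cat", "dog", TOY)
different_category = not intersubstitutable("cat", "sleeps", TOY)
```

Note that the code never mentions the label "noun": the only notion it implements is the two-place relation "belongs to the same category as".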
Intersubstitutability is closely related to the way different levels on the Chomsky hierarchy correspond to different kinds of memory. A grammar that will give rise to the intersubstitutability of cat and dog is one that ignores, or forgets, all the ways that they differ, collapsing all distinctions between them. Similarly for larger expressions: the distinctions between wash the clothes and go to a bar, such as the fact that they differ in number of words and the fact that only one of the two contains the word the, can be ignored. The flip side of the irrelevant information that a grammar ignores is the relevant information that a grammar tracks – this remembered, relevant information is essentially the idea of a category. Different kinds of grammars correspond to different kinds of memory in the sense that they differ in how these categories, this remembered information, are used to guide or constrain subsequent generative steps.
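One way to make the link between memory and categories concrete is a small left-to-right recognizer whose only memory is a single state name. The sketch below is a hypothetical toy, chosen purely for illustration: the transition table, the state names and the tiny vocabulary are all assumptions introduced here, not a fragment of any real grammar. After reading wash the clothes or go to a bar, the device ends up in the same state, so everything that distinguishes the two phrases has been forgotten, and what happens next depends only on the shared state.

```python
# Assumption: a hypothetical toy transition table; the state is the ONLY
# thing the device remembers about the words it has read so far.
TRANSITIONS = {
    ("start", "they"): "subject",
    ("subject", "will"): "need_vp",
    ("need_vp", "wash"): "washed",
    ("washed", "the"): "need_noun",
    ("need_noun", "clothes"): "vp_done",
    ("need_vp", "go"): "went",
    ("went", "to"): "need_det",
    ("need_det", "a"): "need_place",
    ("need_place", "bar"): "vp_done",
    ("vp_done", "and"): "need_vp",  # loop: another verb phrase may follow
}
ACCEPTING = {"vp_done"}


def state_after(words, state="start"):
    """Return the state reached after reading `words`, or None if stuck."""
    for word in words:
        state = TRANSITIONS.get((state, word))
        if state is None:
            return None
    return state


def grammatical(sentence):
    """A sentence is accepted if reading it ends in an accepting state."""
    return state_after(sentence.split()) in ACCEPTING


# Both verb phrases lead to the same remembered state, despite differing
# in length and in whether they contain the word "the".
same_memory = (
    state_after("they will wash the clothes".split())
    == state_after("they will go to a bar".split())
)
```

Because the two phrases leave the device in the same state, anything that can follow one can follow the other: grammatical("they will wash the clothes and go to a bar") and grammatical("they will go to a bar and wash the clothes") both hold, while grammatical("they will wash a bar") does not, since the tracked information after wash rules out a.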
Much of the discussion below aims to show that this idea of intersubstitutability gets at the core of how any sort of grammar differs from a mere collection of sentences, and how any sort of grammar might finitely characterize an infinite collection of expressions. A mechanism that never collapsed distinctions between expressions would be forced to specify all combinatorial possibilities explicitly, leaving no room for any sort of productivity; a mechanism that collapsed all distinctions would treat all expressions as intersubstitutable and impose no restrictions on how expressions combine to form others. An “interesting” grammar is one that sits somewhere in between these two extremes, collapsing some but not all distinctions, thereby giving rise to constrained productivity – productivity stems from the distinctions that are ignored, while constraints stem from those that are tracked. The task of designing a grammar to generate some desired pattern amounts to choosing which distinctions to ignore and which distinctions to track.
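The two extremes and the middle ground can be contrasted with three toy recognizers. This is a rough sketch under stated assumptions: the observed sentences, the vocabulary, the category labels (Det, N, Vi, Vt) and the two category patterns are all hypothetical choices made here for illustration. The first recognizer collapses no distinctions, the second collapses all of them, and the third collapses some (cat versus dog) while tracking others (noun versus verb).

```python
# Assumption: a hypothetical handful of observed sentences.
OBSERVED = {"the cat sleeps", "the dog sees the cat"}


def memorizer(sentence):
    """Collapses no distinctions: accepts only what was listed explicitly."""
    return sentence in OBSERVED


VOCAB = {"the", "cat", "dog", "sleeps", "sees"}


def free_for_all(sentence):
    """Collapses all distinctions: any non-empty string of known words."""
    words = sentence.split()
    return len(words) > 0 and all(w in VOCAB for w in words)


# Assumption: hypothetical category labels and permitted category sequences.
CATEGORY = {"the": "Det", "cat": "N", "dog": "N", "sleeps": "Vi", "sees": "Vt"}
PATTERNS = {("Det", "N", "Vi"), ("Det", "N", "Vt", "Det", "N")}


def categorizer(sentence):
    """Collapses some distinctions: only the category sequence matters."""
    cats = tuple(CATEGORY.get(w) for w in sentence.split())
    return cats in PATTERNS


def verdicts(sentence):
    """Return the three recognizers' judgments side by side."""
    return memorizer(sentence), free_for_all(sentence), categorizer(sentence)
```

For the unobserved sentence "the dog sleeps", the memorizer rejects it while the other two accept it; for the scrambled string "cat the sees", only the free-for-all accepts it. The categorizer is productive exactly where it ignores the difference between cat and dog, and constrained exactly where it tracks the difference between categories. This toy still accepts only finitely many sentences; characterizing an infinite collection would additionally require some form of repetition or recursion, which is left out here.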
5.2