A Companion to Chomsky. Various authors
Nevertheless, as Chomsky and Lasnik (1977) point out, the quest for descriptive adequacy led to a tremendously rich theory. This can be seen quite clearly in Peters and Ritchie (1973), whose explicit formalization contains a range of mechanisms that were proposed at the time, such as global rules and transderivational constraints. Let us look at these mechanisms briefly (building on the discussion in Lasnik and Lohndal 2013).
Lakoff (1970, p. 628) defines a global rule as a rule that states conditions on “configurations of corresponding nodes in nonadjacent trees in a derivation.” Transformations have generally been assumed to be Markovian, that is, to proceed one step at a time. Global rules, however, require a system whose power extends dramatically beyond Markovian properties. Ross (1969) famously provided an example of a global rule. In that paper, he extends results obtained in Ross (1967) involving island constraints. One such island constraint, the Coordinate Structure Constraint, is illustrated in (5); it prevents extraction from just one of the conjuncts. To make this visible, (5) shows a copy of who in the position from which it has been deleted.
(5) *Irv and someone were dancing, but I don't know who Irv and who were dancing.
Notably, Ross (1969) showed that if the island is not visible, the violation goes away. One way to make the island disappear is ellipsis, as in (6).
(6) Irv and someone were dancing, but I don't know who.
In (6), the coordinate structure, the constituent that forms the island, has been elided and is not pronounced. That makes the example acceptable. More formally, Ross argued that for an island violation to occur, the constituent that forms the island needs to be present at surface structure. If a transformation deletes this constituent, the constraint no longer applies. This deletion became known as sluicing (see van Craenenbroeck and Merchant 2013). To capture the contrast between (5) and (6), island constraints need to mention both the surface structure and the point in the derivation where the relevant constituent (who in (5)) moves out of the island (the coordinate structure in (5)). Because the constraint needs to mention both, it is a global rule.
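The contrast between a step-by-step (Markovian) check and a global one can be made concrete with a toy encoding. The sketch below is not Ross's formalization: the `Stage` class, its two fields, and both function names are hypothetical, and a tree is reduced to the set of constituent labels it contains. Under those assumptions, a check that sees only a single step cannot distinguish (5) from (6), whereas a check that also inspects the final (surface) stage can.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Sequence


@dataclass(frozen=True)
class Stage:
    """One tree in a derivation, reduced to labels (hypothetical toy encoding)."""
    constituents: FrozenSet[str]          # constituents present in this tree
    extracted_from: Optional[str] = None  # island the step producing this tree moved out of


def markovian_check(stage: Stage) -> bool:
    """Step-local check: flags any extraction from an island, seeing one step only."""
    return stage.extracted_from is not None


def global_check(derivation: Sequence[Stage]) -> bool:
    """Global check: extraction is a violation only if the island survives to the surface."""
    surface = derivation[-1]
    return any(
        stage.extracted_from is not None
        and stage.extracted_from in surface.constituents
        for stage in derivation
    )


ISLAND = "coordinate structure"

# (5): who moves out of the coordinate structure, which is still pronounced.
example_5 = [
    Stage(frozenset({ISLAND})),
    Stage(frozenset({ISLAND}), extracted_from=ISLAND),
]

# (6): the same movement, followed by sluicing, which deletes the island.
example_6 = example_5 + [Stage(frozenset())]

print(global_check(example_5))        # True: island present at surface structure
print(global_check(example_6))        # False: island elided
print(markovian_check(example_6[1]))  # True: the local check wrongly flags (6)
```

The design point is only that `global_check` takes the whole derivation as its argument, while `markovian_check` takes a single stage; that difference in what the rule is allowed to inspect is what makes the constraint global.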
Transderivational constraints, in turn, depend on derivations other than the one being considered. Hankamer (1973) provides arguments in favor of such constraints. One example involves the phenomenon known as gapping (see van Craenenbroeck and Merchant 2013). Among other examples, he uses the one in (7) (Hankamer 1973, pp. 26–27).
(7) Max wanted Ted to persuade Alex to get lost, and Walt, Ira.
The question is how such a string is derived, that is, what the correct derivation underlying (7) is. Possible candidates are (8a) and (8b).
(8) a. Max wanted Ted to persuade Alex to get lost, *and Walt [wanted] Ira [to persuade Alex to get lost]
    b. Max wanted Ted to persuade Alex to get lost, *and Walt [wanted Ted to persuade] Ira [to get lost]
Hankamer argued that both options in (8) are ruled out because (7) can also be derived from a different constituent structure that still yields the intended meaning, namely (9).
(9) Max wanted Ted to persuade Alex to get lost, and [Max wanted] Walt [to persuade] Ira [to get lost]
When the bracketed constituents are deleted, (9) becomes (7). Given this, the constraint would not just have to make reference to alternative derivations created from the same deep structure, but also to alternative derivations created from different deep structures. That raises nontrivial questions concerning the expressive power of such a computational system, and consequently also its learnability.6
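The deletion step can be checked mechanically. The following minimal sketch assumes that deletion simply removes every bracketed span from the second conjunct; the helper `delete_bracketed` and the dictionary `candidates` are illustrative names, not part of Hankamer's analysis, and the comma in (7) is ignored. It shows that (8a), (8b), and (9) all collapse to the same surface string, which is why a constraint choosing among them has to compare derivations rather than inspect just one.

```python
import re


def delete_bracketed(source: str) -> str:
    """Remove every [bracketed] span, then normalize whitespace."""
    without_brackets = re.sub(r"\[[^\]]*\]", " ", source)
    return " ".join(without_brackets.split())


# Second conjuncts of the candidate sources, with deleted material in brackets.
candidates = {
    "8a": "and Walt [wanted] Ira [to persuade Alex to get lost]",
    "8b": "and Walt [wanted Ted to persuade] Ira [to get lost]",
    "9": "and [Max wanted] Walt [to persuade] Ira [to get lost]",
}

surface = {label: delete_bracketed(text) for label, text in candidates.items()}

for label, string in surface.items():
    print(label, "->", string)  # each line ends in: and Walt Ira
```

Because the three outputs are identical, nothing in the surface string itself distinguishes the permitted source (9) from the excluded sources in (8).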
Any extension of the class of possible grammars requires significant empirical justification. Chomsky and Lasnik (1977) argued that this justification had not been provided in approaches that extended the original framework in Chomsky (1955/1975), Peters and Ritchie (1973), and comparable work; cf. Dougherty (1973), Chomsky (1973), and Brame (1976). Because of that, Chomsky and Lasnik proposed a new framework which restricted the number of possible grammars significantly. This was seen as a crucial step towards being able to explain the acquisition of grammatical competence, a central goal ever since Chomsky (1965).
The new framework departed from earlier frameworks in some crucial ways, not least in assuming that Universal Grammar is not an “undifferentiated” system. That is, it was argued that core grammar has highly restricted options, since it consists of universal principles and a few parameters that account for variation. In addition to the core, there is the periphery, consisting of “marked” phenomena, e.g. irregularities (such as irregular verbs) and exceptions more generally (e.g. English has prepositions, but also the marked exception ago, which comes after its complement). In other words, the approach required something similar to a theory of markedness, with all its complications (see Haspelmath 2006 for a comprehensive discussion). As Chomsky and Lasnik (1977, p. 430) say:
Systems that fall within core grammar constitute “the unmarked case”; we may think of them as optimal in terms of the evaluation metric. An actual language is determined by fixing the parameters of core grammar and then adding rules or rule conditions, using much richer resources, perhaps resources as rich as those contemplated in the earlier theories of [transformational grammar]7
Research was generally devoted to the core phenomena: “A reasonable approach would be to focus attention on the core system, putting aside phenomena that result from historical accident, dialect mixture, personal idiosyncrasies, and the like” (Chomsky and Lasnik 1993, p. 510).
The name for constraints in Chomsky and Lasnik (1977) was “filters.” In their paper, the hypothesis was that surface filters can capture effects of ordering, obligatoriness and contextual dependencies. Such surface filters would be universal; thus, we would not expect any variation between languages. This makes filters different from parameters. Furthermore, a third component was language‐specific filters. For example, to capture the ill‐formedness of (10a) in Standard English, the language‐specific filter in (10b) was proposed.
(10) a. *We want for to win.
     b. *[for‐to]
This filter deems any for‐to string illicit. Chomsky and Lasnik (1977, p. 442) claim that the rule in (10b) would be a “dialect” filter, since it was assumed to involve “a high degree of uncertainty and variation.” And, indeed, for‐to sequences are perfectly possible in, for example, Irish English dialects. In essence, then, a filter can either be outside of core grammar, like (10b), or part of core grammar, like the ban on stranding an affix (the Stranded Affix Filter, cf. Lasnik 1981).
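How a surface filter such as (10b) operates can be sketched in a few lines. The version below is a deliberate simplification: it treats the filter as a check on adjacent words in a string, whereas the filter in Chomsky and Lasnik (1977) is stated over syntactic structure. The function names `tokens` and `violates_for_to_filter` are hypothetical.

```python
import re


def tokens(sentence: str) -> list:
    """Lowercase the sentence and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", sentence.lower())


def violates_for_to_filter(sentence: str) -> bool:
    """Return True if 'for' is immediately followed by 'to' (string-level stand-in for *[for-to])."""
    words = tokens(sentence)
    return any(
        first == "for" and second == "to"
        for first, second in zip(words, words[1:])
    )


print(violates_for_to_filter("We want for to win."))       # True: excluded, as in (10a)
print(violates_for_to_filter("We want for Bill to win."))  # False: 'for' and 'to' not adjacent
print(violates_for_to_filter("We want to win."))           # False
```

Because the filter only inspects the output, a variety that allows for‐to sequences can be modeled in this sketch simply by not applying the check, which fits its status as a “dialect” filter outside core grammar.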
Chomsky and Lasnik's (1977) paper prepared the ground for a major change in how to think about universality and variation. We turn to that in the next section.
3.3 Principles and Parameters: Solving Plato's Problem
Chomsky (1981) is a fundamental contribution to the study of human language in its effort to develop a new theory of what is universal across languages and what is variable. The main change affected the notion of filters, which came to be replaced by parameters. Parameters were seen as providing the solution to two issues: How can we capture the observed variation across the world's languages, and how do humans know so much given the limited evidence that is available to us (Plato's problem)? The idea was that the child only had to set the correct value of each parameter, which was mostly thought to involve a choice between two options, much like a switchbox, as James Higginbotham put it. The head parameter is a simple example of this: You look at whether the verb precedes or follows the object, which gives you the two main word orders across the world's languages: verb–object or object–verb. As Chomsky pointed out:
If these parameters are embedded in a theory of UG that is sufficiently rich in structure, then the languages that are determined by fixing their values one way or another will