to distinguish looking and touching?
That's when my memories as a linguist, and conversations with Barbara Landau, brought back “chains of transformations” and The Great Verb Game. If you can't learn the meaning of see by seeing, perhaps its sense reveals itself in the syntactic structures it licenses. This hypothesis turns on the following idea. If sentence structures are projected from their semantics, then to some extent the semantics itself may be recoverable from the observed surface syntactic forms. Just Harris's position! In fact, such a theory may not be merely a fallback used by the sensorily deprived, but a general clue to the acquisition of word meanings, for almost every word is abstract and mind‐driven and requires more than observation of the world to understand.
Indeed, we discovered that the verbs of cognition and perception as used by the mother to her blind child crucially included sentence complements such as Let's see if there's cheese in the refrigerator. In contrast, Let's touch if there's cheese in the refrigerator never occurred. And it's by using this information that the congenitally blind child learns to “see.”
That's basically the answer. You could now reverse‐engineer it. That is, if you knew about how languages must map from semantics to syntax, and somebody gave you the syntax, you ought to be able to reconstruct something about the semantics, down to some level.
And that's what we called “syntactic bootstrapping,” a term which I invented to mock my good friend Steve Pinker's “semantic bootstrapping” (Pinker 1984). Pinker, who was right in many ways, was saying that you can acquire the syntactic structures by understanding the semantics. But everything I was doing was saying you can't, because that semantics business that he thinks you should have first isn't so easy. While Pinker's premise was false in part, there's something deep in his work about how the syntax–semantics correspondence can help you learn semantics. We get the main clues from what we learned from the deaf and the blind – that you can exploit that correspondence from the other end. At the point where verb learning is going on, enough syntax is accessible for the child to perceive. And from that, if you're a creature endowed in a certain way by nature, you can make a very good guess about the semantics of the verb in a sentence.
Here is an important detail. A verb will occur in several syntactic environments, which can differ in the number of arguments they express. So one can say, I gave a book to John, or I gave at the office, or I gave cash. But the maximal number of arguments semantically partitions the verb set in useful ways.
Of course this is complicated. But you know, infants aren't just infants; they're smart [laughter]. Right? So the relationship between number of arguments and number of NPs – there's something natural about it. And the same is true for the relation between sentence complementation and words like look and see, since we can perceive both objects and states of affairs. Therefore, just as in the original Great Verb Game, listener‐learners require several framing structures to retrieve the semantics of individual words.
[Mark: Is it fair to say that your syntactic bootstrapping theory is sort of a redemption of Zellig Harris's syntactic discovery procedures on the basis of transformational relations among sentence frames?] Right! Exactly. Because as a child, you don't get to hear underlying structures. You hear some form of a surface sentence. And you have to do something with it. So, yes, I think it's basically Zellig Harris rediscovered, redeemed, and put to purposes which he would hate [laughter].
But how else would you really understand Harris's theory? Here's how Harris expressed it: “You know, you really put together a grammar by intuition. But to have a theory you have to show that it could be mechanically invented.” You have to show that you could have done it algorithmically. A machine could have invented it. That was his story. And that machine which could have invented it is the theory of language.
Well, I believe that too, but I believe that's the language acquisition device, fondly referred to as Universal Grammar.
{The following sections from the original article have been omitted here:
sec 5. TAKING THE THEORY INTO THE LABORATORY
sec 6. THE HUMAN SIMULATION PARADIGM
sec 7. EASY WORDS: FIRST STEPS IN ACQUIRING THE LEXICON
sec 8. CONCEPTS
sec 9. WHORF: DO CONCEPTS COME FROM LANGUAGE ITSELF?}
7.10 Thoughts about the Future
After this long odyssey about our studies in language acquisition, let me return to Chomsky's early proposal concerning “the poverty of the stimulus” (the series of insights about the “stimulus‐free” nature of language learning and use). Most generally, Chomsky has invited us to consider the human Mind in light of the fact that every normal child acquires any known natural language to an expert level in a relatively brief period of time in infancy and early childhood, based on hearing (or, in the case of sign language, seeing) an adventitious set of sentences in context.
In contrast, behaviorist psychology saw language learning as an instance of more general principles of operant conditioning, a relatively straightforward distillation and organization of experience. But the models presented, as Chomsky argued, never offered a plausible account of the dimensions of structured generalization.
Classical AI invited an analogous critique by analyzing intelligence as applied logic. But logic and formal language theory left language open to massive and pervasive ambiguity, even if the “grammar” could be learned perfectly. So the next generation re‐interpreted intelligence as applied probability. This worked much better, except when it didn't.
Useful examples come from the so‐called Winograd Schema Challenge (Morgenstern et al., 2016). Consider the following sentences:
The town councillors refused to give the angry demonstrators a permit because they feared violence.
Who feared violence?
Answer 1: the town councillors
Answer 2: the angry demonstrators
Here the special word is “feared” and its alternate is “advocated” as in the following:
The town councillors refused to give the angry demonstrators a permit because they advocated violence.
Who advocated violence?
Any average 10‐year‐old can resolve these ambiguities flawlessly. But after years and years of machine modeling, these devices still go 50–50 on the chosen resolutions. Even the newer “deep learning” networks, which require vastly more training data than humans do, learn things that humans would never consider and lack the ability to integrate common‐sense reasoning.
Perhaps the core problem for the machine learner is that he never gets the joke. He mulishly acquires what is most probable and is stymied by the improbabilities of everyday life. Instructive for this problem are several newspaper headlines collected by Steven Pinker and republished in his (1994/2007) book The Language Instinct. One example is “Queen Mary has bottom scraped,” which makes every listener chuckle, but the machines have no useful chuckle routines. Another example from Pinker was the headline “Man gets six months in violin case.” True, humans, unlike machines, know how this choice should “usually” be made: they opt for the interpretation of “case” as litigation rather than container, and therefore giggle. But now consider the recent escape of the billionaire Carlos Ghosn from the Japanese legal system, which he accomplished by secreting himself inside a violin case (a double bass case, but who's counting). The point is that we humans understand the probabilities in the world and use them to interpret ambiguity, but when the improbable becomes the actual, we continue to understand.
These are the kinds of formal and substantive properties of language and thought that Chomsky has invited us to consider from the earliest to most recent of his writings. So extravagant did these