The Tangled Tree: A Radical New History of Life. David Quammen
his rejoinder to Darwin’s book, based mainly on proof by authority. He noted that Pictet saw no evidence for transmutation in the fossil record of fishes. Agassiz said that the resemblances among animals derive from—where?—the mind of the Creator. “It is well to take heed to the opinions of such masters in science,” Hitchcock wrote, “when so many, with Darwin at their head, are inclined to adopt the doctrine of gradual transmutation in species.”
That was mild but firm, a dismissive shrug. Hitchcock would ignore Charles Darwin and encourage his readers to do likewise. More telling, more defensive, was his other response: he removed the trees figure from his own book. No more Paleontological Chart. It seems never to have appeared in another edition of Elementary Geology.
Darwin and Darwin’s followers owned the tree image now. It would remain the best graphic representation of life’s history, evolution through time, the origins of diversity and adaptation, until the late twentieth century. And then rather suddenly a small group of scientists would discover: oops, no, it’s wrong.
Molecular phylogenetics, the study of evolutionary relatedness using molecules as evidence, began with a suggestion by Francis Crick, in 1958, offered passingly in an important paper devoted to something else. That was characteristic of Crick—so brilliant and recklessly imaginative that he sometimes influenced the course of biology even with his elbows.
You know Crick’s name from the most famous triumph of his life: solving the structure of the DNA molecule, with his young American partner James Watson, in 1953, for which he and Watson and one other scientist would eventually, in 1962, receive the Nobel Prize. Crick wasn’t wasting his time, in 1958, mooning about dreams of glory in Stockholm. He was still interested in DNA, but he had moved on from the sheer structural question to other big problems. He had bent his mind intensely, but with his usual sense of merry play, to the challenge of deciphering the genetic code.
The code, as you’ve heard many times but might need reminding, is written in an alphabet of four letters, each letter representing a component—a nucleotide base, in chemistry lingo—of the DNA double helix. The four letters are: A (for adenine), C (cytosine), G (guanine), and T (thymine). DNA’s full moniker is deoxyribonucleic acid, of course, and it’s worth understanding why. The two helical strands of the double helix, twining around a central axis in parallel with each other, are composed of units called nucleotides, linked in a chain, each nucleotide containing a base (that’s the A, C, G, or T), a sugar (that’s the deoxyribose), and a phosphate group (that’s the acidic part). The sugar end of one nucleotide bonds to the phosphate end of the next, forming the two long helical strands. I just called them parallel, but to be more precise, those strands are antiparallel to each other, since the sugar-phosphate binding gives them directionality—a front end and a back end—and the front end of one strand aligns with the back end of the other. The nucleotide bases, linked crossways by hydrogen bonds, hold the strands together. The base A pairs with T, the base C pairs with G, forming a stable structure, like the steps in a spiral staircase. This is the nifty arrangement that Watson and Crick deduced.
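The pairing and antiparallel rules just described can be put into a few lines of Python. This is only a sketch to make the geometry concrete: the function name `partner_strand` and the sample sequence are invented for illustration and come from no real genome. The one assumption carried over from the text is that A pairs with T, C pairs with G, and the two strands run in opposite directions.

```python
# Watson-Crick pairing as described in the text: A with T, C with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def partner_strand(strand: str) -> str:
    """Return the opposite strand, read in its own front-to-back direction.

    Hypothetical helper for illustration only.
    """
    # Walk the input from back to front (the strands are antiparallel)
    # and swap each base for its pairing partner.
    return "".join(PAIRS[base] for base in reversed(strand))


example = "ACGTTG"  # arbitrary illustrative sequence, not real data
print(partner_strand(example))
```

Applying the function twice returns the original sequence, which is the point of the structure: each strand fully determines the other.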
It’s not just a stable structure, though. It’s a wondrously efficient one for storing, copying, and applying heritable data. When the two strands are peeled apart, the sequence of bases along one of the strands (the template strand) represents genetic information ready to be duplicated or used. Watson and Crick noted that capacity with exquisite coyness in their 1953 paper. The paper was lapidary, only a page long, as published in the journal Nature, and included a sketch. Near the end, having proposed their double helix structure and the matchup of bases, always A with T and C with G, they wrote: “It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.”
But copying that material, for hereditary continuity, was one thing. Translating it into living organisms was another. Translated how? By what steps does the information in DNA become physically animate?
This mystery leads first to proteins. There are four kinds of molecule essential to living processes (carbohydrates, lipids, nucleic acids, and proteins), often collectively called the molecules of life. Proteins might be the most versatile, serving a wide range of structural, catalyzing, and transporting functions. Their piecemeal production, and the controls on the process of building and using them, are encoded in DNA. Every protein consists of a linear chain of amino acids, folded upon itself into an elaborate three-dimensional structure. Although about five hundred amino acids are known to chemistry, only twenty of those serve as the fundamental components of life, from which virtually all proteins are assembled. But what sequences of the four bases determine which amino acids shall be added to a chain? What combination of letters specifies leucine? What combination produces cysteine? What arrangement of A, C, G, and T delivers its meaning as glutamine? What spells tyrosine? This fundamental matter (how do bases designate aminos?) became known as “the coding problem,” to which Francis Crick addressed himself in the late 1950s. Solving it was a crucial step toward understanding how organisms grow, live, and replicate.
There were questions within questions. Do the bases work in combinations? If so, how many? Two-base clusters, selected variously from the group of four and in specified order (CT, CG, AA, and so on) would allow only sixteen combinations, not enough to code twenty amino acids. Then maybe clusters of three or more? If three (such as CTC, CGA, AAA), do those triplets overlap one another, or do they function separately, like three-letter words divided by commas? If there are commas, are there periods too? Four letters, in every possible combination of three, yield sixty-four variants. Are all sixty-four possible triplets used? If so, that implies some redundancy; different triplets coding for the same amino acid. Does the code include a way of saying “Stop”? If not, where does one gene end and another begin? Crick and others were keen to know.
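The arithmetic behind those questions is easy to check. The sketch below counts the ordered clusters that four letters can form at each length, then reads one made-up message two ways: as separate three-letter words and as overlapping ones. The helper names (`words`, `read_separately`, `read_overlapping`) and the sample message are invented for illustration. Nothing here says which scheme nature uses; it only lays out the possibilities.

```python
from itertools import product

BASES = "ACGT"
AMINO_ACIDS = 20  # the number of amino acids the code must specify


def words(length: int) -> list[str]:
    """Every ordered cluster of the four bases at the given length."""
    return ["".join(p) for p in product(BASES, repeat=length)]


# Compare the number of possible clusters with the twenty amino acids.
for n in (1, 2, 3):
    count = len(words(n))
    verdict = "enough" if count >= AMINO_ACIDS else "not enough"
    print(f"{n}-base clusters: {count} ({verdict} for {AMINO_ACIDS})")


def read_separately(seq: str, size: int = 3) -> list[str]:
    """Read the sequence as back-to-back words with no shared letters."""
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, size)]


def read_overlapping(seq: str, size: int = 3) -> list[str]:
    """Read a new word starting at every letter, so neighbors overlap."""
    return [seq[i:i + size] for i in range(len(seq) - size + 1)]


message = "CTCCGAAAA"  # made-up sequence built from the triplets in the text
print(read_separately(message))
print(read_overlapping(message))
```

Two-base clusters give sixteen words and three-base clusters give sixty-four, which is why triplets leave room for redundancy and for a possible "Stop" signal.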
Crick himself had also started thinking beyond that problem, to the question of how proteins are physically assembled from the coded information, with one amino acid brought into line after another. How does the template strand find or attract its amino acids? How do those units become linked? He wanted to learn not just the language of life—its letters, words, grammar—but also the mechanics of how it gets spoken: its equivalent of lungs, larynx, lips, and tongue.
Crick was back in England by the mid-1950s, after a sojourn in the United States, and based again at the Cavendish Laboratory in Cambridge, where he had worked with Jim Watson. He had a contract with the Medical Research Council (MRC), a government agency with some mandate for fundamental as well as medical research. Solving the DNA structure, though it had brought scientific fame to Crick and Watson and would eventually bring the Nobel Prize, provided no immediate cure for Crick’s dicey financial situation, all the more acute since the birth of his and his wife Odile’s third child. He had to work for pay: a modest salary from the MRC and whatever small change the occasional radio broadcast or popular article might bring. Now he was sharing his office, his pub lunches, his fevered conversations, and his blackboard with another scientist, Sydney Brenner, rather than with Watson. One colleague at the Cavendish, upon early acquaintance with Crick, concluded that “his method of working was to talk loudly all the time.” When not talking, or listening to Brenner, he spent his time reading scientific papers, rethinking the results of other researchers, combing through such bodies of knowledge for clues to the mysteries that engaged him. He was not an experimentalist, generating data. He was a theoretician—probably the century’s best and most intuitive in the biological sciences.
Sometime in 1957 Crick gathered his thoughts and his informed guesses on this