Genome: The Autobiography of a Species in 23 Chapters. Matt Ridley
a small piece in the News Chronicle, the double helix did not make the newspapers. Today most scientists consider it the most momentous discovery of the century, if not the millennium.
Many frustrating years of confusion were to follow the discovery of DNA’s structure. The code itself, the language by which the gene expressed itself, stubbornly retained its mystery. Finding the code had been, for Watson and Crick, almost easy – a mixture of guesswork, good physics and inspiration. Cracking the code required true brilliance. It was a four-letter code, obviously: A, C, G and T. And it was translated into the twenty-letter code of amino acids that make up proteins, almost certainly. But how? Where? And by what means?
Most of the best ideas that led to the answer came from Crick, including what he called the adaptor molecule – what we now call transfer RNA. Independently of all evidence, Crick arrived at the conclusion that such a molecule must exist. It duly turned up. But Crick also had an idea that was so good it has been called the greatest wrong theory in history. Crick’s ‘comma-free’ code is more elegant than the one Mother Nature uses. It works like this. Suppose that the code uses three letters in each word (if it uses two, that only gives sixteen combinations, which is too few). Suppose that it has no commas, and nogapsbetweenthewords. Now suppose that it excludes all words that can be misread if you start in the wrong place. So, to take an analogy used by Brian Hayes, imagine all three-letter English words that can be written with the four letters A, S, E and T: ass, ate, eat, sat, sea, see, set, tat, tea and tee. Now eliminate those that can be misread as another word if you start in the wrong place. For example, the phrase ateateat can be misread as ‘a tea tea t’ or as ‘at eat eat’ or as ‘ate ate at’. Only one of these three words can survive in the code.
Crick did the same with A, C, G and T. He eliminated AAA, CCC, GGG and TTT for a start. He then grouped the remaining sixty words into threes, each group containing the same three letters in the same rotating order. For example, ACT, CTA and TAC are in one group, because C follows A, T follows C, and A follows T in each; while ATC, TCA and CAT are in another group. Only one word in each group survived. Exactly twenty are left – and there are twenty amino acid letters in the protein alphabet! A four-letter code gives a twenty-letter alphabet.
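The arithmetic behind the magic number can be checked with a few lines of Python. This is only a sketch of the counting argument described above, not Crick's own procedure: the names `rotations` and `rotation_classes` are invented for the illustration. It counts the groups of rotated words but does not test whether any particular choice of one word per group is truly comma-free.

```python
from itertools import product

BASES = "ACGT"  # the four-letter alphabet of DNA


def rotations(word):
    """Return every word obtained by rotating the letters, e.g. ACT, CTA, TAC."""
    return {word[i:] + word[:i] for i in range(len(word))}


def rotation_classes(alphabet=BASES, length=3):
    """Group all words of the given length into sets of mutual rotations.

    Words such as AAA, whose rotations are all identical, are discarded,
    mirroring the first elimination step in the text.
    """
    classes = set()
    for letters in product(alphabet, repeat=length):
        word = "".join(letters)
        rots = rotations(word)
        if len(rots) < length:  # AAA, CCC, GGG, TTT
            continue
        classes.add(frozenset(rots))
    return classes


classes = rotation_classes()
print(len(BASES) ** 2)  # two-letter words: 16, too few for twenty amino acids
print(len(BASES) ** 3)  # three-letter words: 64
print(len(classes))     # groups of three rotated words: 20
```

Sixty-four words minus the four repeats leaves sixty, and sixty words in groups of three gives twenty groups, so at most twenty words can survive.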
Crick cautioned in vain against taking his idea too seriously. ‘The arguments and assumptions which we have had to employ to deduce this code are too precarious for us to feel much confidence in it on purely theoretical grounds. We put it forward because it gives the magic number – twenty – in a neat manner and from reasonable physical postulates.’ But the double helix did not have much evidence going for it at first, either. Excitement mounted. For five years everybody assumed it was right.
But the time for theorising was past. In 1961, while everybody else was thinking, Marshall Nirenberg and Johann Matthaei decoded a ‘word’ of the code by the simple means of making a piece of RNA out of pure U (uracil – the equivalent of DNA’s T) and putting it in a solution of amino acids. The ribosomes made a protein by stitching together lots of phenylalanines. The first word of the code had been cracked: UUU means phenylalanine. The comma-free code was wrong, after all. Its great beauty had been that it cannot have what are called reading-shift mutations, in which the loss of one letter makes nonsense of all that follows. Yet the version that Nature has instead chosen, though less elegant, is more tolerant of other kinds of errors. It contains much redundancy with many different three-letter words meaning the same thing.7
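Both properties of the real code, its vulnerability to reading-shift mutations and its redundancy, can be illustrated with a small sketch. The table below holds only a handful of entries from the standard genetic code, and the message and the helper names `codons` and `translate` are made up for the example.

```python
# A small excerpt of the standard genetic code (RNA words -> amino acids).
CODON_TABLE = {
    "UUU": "Phe", "UUC": "Phe",
    "UUA": "Leu", "UUG": "Leu",
    "CUU": "Leu", "CUC": "Leu", "CUA": "Leu", "CUG": "Leu",
    "UGU": "Cys", "UCC": "Ser",
}


def codons(rna):
    """Split an RNA string into complete three-letter words, left to right."""
    usable = len(rna) - len(rna) % 3  # ignore any leftover letters at the end
    return [rna[i:i + 3] for i in range(0, usable, 3)]


def translate(rna):
    """Look up each word; '?' marks words missing from the partial table."""
    return [CODON_TABLE.get(word, "?") for word in codons(rna)]


message = "UUUCUGUUCCUA"                 # hypothetical message
shifted = message[:3] + message[4:]      # the same message with one letter lost

print(translate(message))  # ['Phe', 'Leu', 'Phe', 'Leu']
print(translate(shifted))  # ['Phe', 'Cys', 'Ser']

# Redundancy: several different words mean the same amino acid.
leucine_words = [word for word, acid in CODON_TABLE.items() if acid == "Leu"]
print(len(leucine_words))  # 6
```

Losing a single letter changes every word after the gap, which is the reading-shift problem the comma-free code would have avoided. The six spellings of leucine show the redundancy that makes the real code forgiving of single-letter substitutions.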
By 1965 the whole code was known and the age of modern genetics had begun. The pioneering breakthroughs of the 1960s became the routine procedures of the 1990s. And so, in 1995, science could return to Archibald Garrod’s long-dead patients with their black urine and say with confidence exactly what spelling mistakes occurred in which gene to cause their alkaptonuria. The story is twentieth-century genetics in miniature. Alkaptonuria, remember, is a very rare and not very dangerous disease, fairly easily treated by dietary advice, so it had lain untouched by science for many years. In 1995, lured by its historical significance, two Spaniards took up the challenge. Using a fungus called Aspergillus, they eventually created a mutant that accumulated a purple pigment in the presence of phenylalanine: homogentisate. As Garrod suspected, this mutant had a defective version of the protein called homogentisate dioxygenase. By breaking up the fungal genome with special enzymes, identifying the bits that were different from normal and reading the code therein, they eventually pinned down the gene in question. They then searched through a library of human genes hoping to find one similar enough to stick to the fungal DNA. They found it, on the long arm of chromosome 3, a ‘paragraph’ of DNA ‘letters’ that shares fifty-two per cent of its letters with the fungal gene. Fishing out the gene in people with alkaptonuria and comparing it with the gene in people without the disease reveals that the sufferers have just one different letter that counts, either the 690th or the 901st. In each case just a single letter change messes up the protein so it can no longer do its job.8
This gene is the epitome of a boring gene, doing a boring chemical job in boring parts of the body, causing a boring disease when broken. Nothing about it is surprising or unique. It cannot be linked with IQ or homosexuality, it tells us nothing about the origin of life, it is not a selfish gene, it does not disobey Mendel’s laws, it cannot kill or maim. It is to all intents and purposes exactly the same gene in every creature on the planet – even bread mould has it and uses it for precisely the same job that we do. Yet the gene for homogentisate dioxygenase deserves its little place in history for its story is in microcosm the story of genetics itself. And even this dull little gene now reveals a beauty that would have dazzled Gregor Mendel, because it is a concrete expression of his abstract laws: a story of microscopic, coiled, matching helices that work in pairs, of four-letter codes, and the chemical unity of life.
Sir, what ye’re telling us is nothing but scientific Calvinism.
Anonymous Scottish soldier to William Bateson after a popular lecture 1
Open any catalogue of the human genome and you will be confronted not with a list of human potentialities, but a list of diseases, mostly ones named after pairs of obscure central-European doctors. This gene causes Niemann–Pick disease; that one causes Wolf–Hirschhorn syndrome. The impression given is that genes are there to cause diseases. ‘New gene for mental illness’, announces a website on genes that reports the latest news from the front, ‘The gene for early-onset dystonia. Gene for kidney cancer isolated. Autism linked to serotonin transporter gene. A new Alzheimer’s gene. The genetics of obsessive behaviour.’
Yet to define genes by the diseases they cause is about as absurd as defining organs of the body by the diseases they get: livers are there to cause cirrhosis, hearts to cause heart attacks and brains to cause strokes. It is a measure not of our knowledge but of our ignorance that this is the way the genome catalogues read. It is literally true that the only thing we know about some genes is that their malfunction causes a particular disease. This is a pitifully small thing to know about a gene, and a terribly misleading one. It leads to the dangerous shorthand that runs as follows: ‘X has got the Wolf–Hirschhorn gene.’ Wrong. We all have the Wolf–Hirschhorn gene, except, ironically, people who have Wolf–Hirschhorn syndrome. Their sickness is caused by the fact that the gene is missing altogether. In the rest of us, the gene is a positive, not a negative force. The sufferers have the mutation, not the gene.
Wolf–Hirschhorn syndrome is so rare and so serious – its gene is so vital – that its victims die young. Yet the gene, which lies on chromosome 4, is actually the most famous of all the ‘disease’ genes because of a very different disease associated with it: Huntington’s chorea. A mutated version of the gene causes Huntington’s chorea; a complete lack of the gene causes Wolf–Hirschhorn syndrome. We know very little about what the gene is there to do in everyday life, but we now know in excruciating detail how and why and where it can go wrong and what the consequence for the body is. The gene contains a single ‘word’, repeated