Replicating And Repairing The Genome: From Basic Mechanisms To Modern Genetic Technologies. Kenneth N Kreuzer
of nucleic acids: A structure for deoxyribose nucleic acid. Nature, 171(4356), 737–738.
How did they test that?
The base composition of DNA and Chargaff’s rule
Studies of DNA in the first half of the 1900s uncovered the basic units of the molecule, namely the sugar, phosphate, and four distinct bases (for a fascinating review of this period of research, see Frixione and Ruiz-Zamarripa, 2019). At the time, DNA was not suspected to be the genetic material because it seemed too simple. The dominant model was the “tetranucleotide hypothesis,” in which the four bases were linked to each other in a simple ring structure. This structural proposal was based on crude data that suggested roughly equal amounts of the four bases in preparations of DNA. Erwin Chargaff and his colleagues conducted careful and detailed studies that disproved this hypothesis and provided key evidence that helped Watson and Crick arrive at the correct structure of duplex DNA.
Chargaff and his associates meticulously purified DNA from various sources, carefully eliminating RNA contamination. Next, they hydrolyzed the DNA samples and determined the fractions of each of the four bases in the resulting mixtures. Without going into the chemical details, the determinations were done as follows. First, the base mixtures were subjected to paper chromatography, in which the various bases migrated to different positions along paper strips soaked in various chemical solutions. The four small areas of each paper strip that contained the separated bases were cut out, and the four bases were eluted into separate test tubes. The eluted bases were then analyzed by UV spectroscopy, which provided the absorption maxima (revealing base identity) and intensities (revealing amounts).
In one study, Chargaff et al. (1952) measured the base composition of sperm DNA isolated from four different sea urchin species (Table 1.1). In each of the four samples, the proportions of adenine and thymine were equal (within experimental error), as were the proportions of guanine and cytosine. However, the proportions of other pairs of bases were clearly disparate.
These data clearly disprove the tetranucleotide hypothesis mentioned above, which rested on roughly equal amounts of all four bases. The equality of adenine/thymine and guanine/cytosine, which was seen in disparate species (Table 1.2), is the so-called “Chargaff’s rule.”
Table 1.1. Base content of sea urchin sperm DNAs.
Base values are in mole percent; data from Chargaff et al. (1952).
Table 1.2. Base content of diverse species.
Base values are in mole percent; data from Chargaff as cited in Bansal (2003).
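The logic of the two tests described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 20-base sequence is made up (it is not data from Chargaff's studies), and the function names and the tolerance of 1 mole percent are hypothetical choices standing in for "within experimental error." The sketch builds a duplex from one strand and its complementary strand, computes the base composition in mole percent, and then asks whether the composition obeys Chargaff's rule and whether it fits the equal-amounts expectation behind the tetranucleotide hypothesis.

```python
from collections import Counter

# Watson-Crick pairing partners
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand):
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def mole_percent(strands):
    """Base composition (mole percent) of all strands pooled together."""
    counts = Counter()
    for strand in strands:
        counts.update(strand)
    total = sum(counts.values())
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def obeys_chargaff(comp, tol=1.0):
    """True if A matches T and G matches C within tol mole percent."""
    return (abs(comp["A"] - comp["T"]) <= tol
            and abs(comp["G"] - comp["C"]) <= tol)


def fits_tetranucleotide(comp, tol=1.0):
    """True if all four bases are present at about 25 mole percent each."""
    return all(abs(comp[base] - 25.0) <= tol for base in "ACGT")


# Hypothetical AT-rich strand, invented purely for illustration
top = "AATTAGATAACTAGAATTCA"

single = mole_percent([top])                            # one strand alone
duplex = mole_percent([top, reverse_complement(top)])   # base-paired duplex

print("single strand:", single, obeys_chargaff(single))
print("duplex:       ", duplex, obeys_chargaff(duplex))
print("duplex fits tetranucleotide expectation:", fits_tetranucleotide(duplex))
```

For this invented sequence, the duplex satisfies A = T and G = C because every base is counted together with its pairing partner, while the AT-rich composition departs from 25 mole percent of each base. The single strand on its own does not show the equalities, which illustrates why the rule points to base pairing in a two-stranded structure.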
1 The total number of human cells is difficult to estimate and somewhat controversial, but a recent compilation put the number at about 37 trillion. About 80% of human cells are red blood cells that lack a nucleus, leaving about 7 trillion nucleus-containing cells.
2 When discussing DNA or a particular DNA sequence, the nucleotide residues are often referred to simply as the base designations A, G, C, and T, rather than the more cumbersome dA, dG, dC, or dT, and without explicitly indicating the phosphate linkages. The deoxy designations will be used in situations where both RNA and DNA nucleotide residues are relevant and need to be distinguished.
3 See Okazaki et al. (1968) in Further Reading at the end of this chapter.
Chapter 2
The simple DNA replication system of a bacterial virus
2.1 Why the interest in a bacterial virus?
The DNA replication systems of prokaryotic and eukaryotic cells are quite complex, with over 20 proteins needed to replicate the genome of the model prokaryote Escherichia coli. This complexity undoubtedly relates to the relatively large genome sizes of these cells, the need for a very low error rate (particularly given the large genome), the need to replicate the genome once and only once per cell cycle, and the need to carefully couple DNA replication to other cellular events such as cell division.
Because of the complexity of cellular replication systems, many scientists were initially drawn to study DNA replication in simpler viral systems. The rationale was that the key functions in DNA replication would be more evident, the required proteins easier to identify and isolate, and the biochemical mechanisms easier to decipher in a system with fewer interacting parts (i.e., replication proteins). Viruses of bacterial cells, called bacteriophages or phages, provide such simple model replication systems. Some viruses co-opt the host replication machinery, while others encode their own replication proteins. For those viruses that encode their own replication proteins, the number of involved proteins is indeed smaller than for host cell DNA replication.
1969 Nobel Prize in Physiology or Medicine
This prize was awarded to Max Delbrück, Alfred D. Hershey, and Salvador E. Luria for their studies on bacterial viruses, elucidating important aspects of their genetic structure and replication mechanisms.
https://www.nobelprize.org/prizes/medicine/1969/summary/
The bacteriophage called T7 turned out to provide a particularly good system for detailed study in that only four proteins are needed to form a fully functional replisome (a few other proteins are involved in initiating and finalizing the process; see below). Studies in this simple T7 system have been very productive, leading to high-resolution structures of each of the replication proteins and the replisome, a complete reconstitution of the reaction with the purified proteins in vitro, and remarkably insightful studies on the dynamics of the replication process based on the in vitro system. These studies of the T7 replication system were spearheaded in the laboratory of Charles C. Richardson (Harvard Medical School). For the same reasons that the T7 system was experimentally tractable, it is also a great system to introduce many of the concepts and mechanisms in DNA replication, and that will be the topic of this chapter.
There is another reason for interest in the T7 replication system. Surprisingly, replication of phage T7 DNA has been found to be very similar to the replication of mitochondrial DNA in human cells. The T7 replication proteins (as well as T7 RNA polymerase) are structurally homologous to the corresponding proteins in human mitochondria, and the basic mechanisms involved in the replication of the two genomes have clear parallels. This has led to the proposal that the genes for mitochondrial replication proteins have their evolutionary origin in a bacterial virus. It is interesting to speculate about how the genes from a bacterial virus could have been captured during the process by which a bacterial cell formed a symbiotic relationship with a primitive eukaryotic cell to form the precursors of mitochondria. Anyone interested in understanding how mitochondria replicate their DNA should certainly start by carefully learning all there is to know about phage T7 DNA replication.
2.2 The four proteins involved in T7 DNA replication
The T7 genome encodes only three proteins needed to form the phage replisome: a DNA polymerase, a combined helicase/primase protein, and an ssDNA-binding protein. The fourth protein involved in T7 DNA replication is a host-encoded protein, thioredoxin, which interacts with the T7 DNA polymerase in a 1:1 complex. For readers who delve into the primary literature on this topic, the T7 DNA polymerase is called gene product 5 (gp5), the helicase/primase is gp4, and the ssDNA-binding protein is gp2.5. For simplicity, in this book, we will use only the generic names that reflect the functions of these proteins.
The T7 DNA polymerase contains two major domains, an N-terminal exonuclease domain of 201 amino acids and a C-terminal polymerase domain of 503