Structure and Function of the Bacterial Genome. Charles J. Dorman
and was cultured in Palo Alto, California, USA, in 1922 (Bachmann 1996). This isolate was the ancestor of W1485 from the Joshua Lederberg laboratory, the isolate that was named MG1655 by Mark Guyer (hence ‘MG’). The first E. coli chromosome to be sequenced came from this intensively studied MG1655 strain (Blattner et al. 1997). However, this was not the first bacterial chromosome to have its complete nucleotide sequence determined: that honour belongs to Haemophilus influenzae (Fleischmann et al. 1995).
The Blattner lab chose MG1655 because it has undergone relatively little genetic manipulation and is considered a good representative of wild‐type E. coli. It has been cured of bacteriophage lambda and of the F plasmid and has few genetic lesions. An ilvG mutation deprives it of acetohydroxy acid synthase II, making it prone to valine‐dependent isoleucine starvation (Lawther et al. 1981, 1982) and there is an IS5 insertion in the rfb locus that interferes with O‐antigen synthesis (Liu and Reeves 1994). If this mutation is repaired, the bacterium has its lipopolysaccharide expression reinstated and it becomes pathogenic in an infection model based on the worm Caenorhabditis elegans (Browning et al. 2013). Strain MG1655 displays mild starvation for pyrimidine arising from poor expression of its pyrE gene: the cause is a frameshift mutation at the end of the rph locus (rph‐1) (Jensen 1993). Interestingly, genome sequence analysis shows that MG1655 is closely related to NCTC 86, the bacterium originally named Bacillus coli by Theodor Escherich in 1885, isolated before the antibiotic era (Dunne et al. 2017).
The E. coli K‐12 chromosome is a single, covalently closed, circular, double‐stranded DNA molecule of 4 639 221 bp (Blattner et al. 1997). Although chromosome circularity is the norm in E. coli, cells in which the chromosome is artificially linearised (with the ends closed by hairpin turns) are viable, show few alterations in gene expression, have normal nucleoid structure, and do not display growth defects (Cui et al. 2007). Thus, the circular nature of the chromosome is not essential for its functionality or for its ability to be replicated and to be segregated at cell division.
The E. coli chromosome was visualised originally in the early 1960s by autoradiography of cells fed with tritiated thymidine in a classic experiment that also revealed the existence of the moving replication fork (Cairns 1963a,b). The chromosome undergoes bi‐directional replication from its oriC locus (Kaguni 2011), creating two replichores: Left and Right (Figure 1.1) (Lesterlin et al. 2005; Wang, X., et al. 2006). Through a process of semi‐conservative DNA replication, the bacterium acquires a second copy of its chromosome prior to cell division. In rapidly growing bacteria, one or more additional rounds of chromosome replication are initiated before the first one is completed, creating multiple copies of those chromosomal sequences that lie closest to oriC (Figure 1.1) (Cooper and Helmstetter 1968). Genes in the oriC‐proximal zones of the E. coli chromosome will be present in higher copy numbers than genes in Ter, the region of the chromosome where replication terminates. In slower‐growing bacterial populations, gene copy numbers are more in balance around the chromosome with only a twofold difference in copy number between genes close to oriC and those near Ter.
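The dependence of gene copy number on chromosomal position and growth rate can be illustrated with a short calculation. The following is a minimal sketch, not taken from the text: it assumes the Cooper and Helmstetter type of relation for an exponentially growing population, in which the average dosage of a locus relative to Ter is 2 raised to the power C(1 − m)/τ, where C is the time needed to replicate the chromosome, τ is the doubling time, and m is the fractional distance of the locus from oriC (0) to Ter (1). The function name and the C period of 40 minutes are illustrative assumptions.

```python
def relative_gene_dosage(position, c_period_min, doubling_time_min):
    """Population-averaged copy number of a locus relative to Ter.

    position: fractional distance along a replichore,
              0.0 at oriC and 1.0 at Ter (hypothetical parameterisation).
    c_period_min: time taken to replicate the chromosome, in minutes.
    doubling_time_min: culture doubling time, in minutes.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must lie between 0 (oriC) and 1 (Ter)")
    if c_period_min <= 0 or doubling_time_min <= 0:
        raise ValueError("times must be positive")
    # Loci nearer oriC are duplicated earlier, so they spend a larger
    # part of the cell cycle present in extra copies.
    return 2.0 ** (c_period_min * (1.0 - position) / doubling_time_min)


C_PERIOD = 40.0  # minutes; an illustrative assumption, not a value from the text

for tau in (20.0, 40.0):
    ratio = relative_gene_dosage(0.0, C_PERIOD, tau)
    print(f"doubling time {tau:.0f} min: oriC/Ter ratio = {ratio:.2f}")
```

Under these assumed values, the oriC-to-Ter ratio is 4.00 at a 20-minute doubling time, where rounds of replication overlap, and 2.00 at a 40-minute doubling time.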
Figure 1.1 The macrodomain structure of the E. coli chromosome. Shaded segments represent the Ori, Right, Ter, and Left macrodomains, and the Left and Right non‐structured regions. The curved arrows outside the circular chromosome represent the Left (anticlockwise) and Right (clockwise) replichores. (a) The positions of genes that encode NAPs, chromosome organisation factors, topoisomerases, proteins involved in the process of transcription, and the Hfq RNA chaperone are indicated around the periphery of the chromosome. (b) The positions of the seven rrn operons and genes encoding transcription regulators that are discussed in the text are shown. The positions of the lac operon and the bacteriophage lambda attachment site (attλ) are also indicated.
Most of our knowledge about chromosome replication and segregation comes from studying a handful of model organisms: E. coli, Caulobacter crescentus, Vibrio cholerae, and Bacillus subtilis. The focus in this chapter will be on E. coli, with comparisons to other organisms where this is useful.
1.3 Chromosome Replication: Initiation
Chromosome replication, segregation, and cell division are complex processes that must be coordinated to ensure the successful replication of the cell (Reyes‐Lamothe et al. 2012). The nutritional status of the cell and its metabolic flux are very influential in achieving this coordination and they have a direct bearing on the growth rate of the culture (Wang and Levin 2009).
Replication of the E. coli chromosome begins at a specific site, oriC, which has a number of important DNA sequence elements called DnaA boxes that make up the DnaA Oligomerisation Region, DOR (Figure 1.2) (Fuller et al. 1984; Jameson and Wilkinson 2017; Katayama et al. 2017). These boxes are bound by DnaA, an adenosine triphosphate (ATP)‐dependent initiator protein (Schaper and Messer 1995; Sutton and Kaguni 1997), which then forms a right‐handed helical protein oligomer along the DNA that unwinds the duplex at an A+T‐rich element known as the DNA Unwinding Element, DUE (Bramhill and Kornberg 1988a; Kowalski and Eddy 1989) (Figure 1.2). The DnaA oligomerisation process is assisted by another protein called DiaA (Ishida et al. 2004). The DUE has an A‐rich and a T‐rich DNA strand; once it is unwound, the T‐rich strand binds to the DnaA oligomers at the DOR. A helicase loader known as DnaC then loads the DnaB helicase onto the single‐stranded DNA (Kobori and Kornberg 1982). This helicase then recruits in turn the DnaG primase and DnaN, the DNA polymerase beta‐clamp (Fang et al. 1999). When fully assembled, this complex is known as the replisome (Figures 1.3 and 1.4).
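Because the DUE is defined by its A+T‐rich composition, a sliding‐window scan of base composition is one simple way to picture how such an element stands out from its surroundings. The sketch below is illustrative only: the function names are hypothetical, the sequence is a made‐up toy string rather than the real oriC sequence, and the window size is an arbitrary choice.

```python
def at_fraction(window):
    """Fraction of bases in the window that are A or T."""
    return sum(base in "AT" for base in window) / len(window)


def most_at_rich_window(sequence, window_size):
    """Return (start, fraction) for the first window with the highest A+T content.

    start is a 0-based index into the sequence.
    """
    seq = sequence.upper()
    if window_size <= 0 or window_size > len(seq):
        raise ValueError("window_size must be between 1 and the sequence length")
    best_start, best_fraction = 0, -1.0
    for start in range(len(seq) - window_size + 1):
        fraction = at_fraction(seq[start:start + window_size])
        # Strict comparison keeps the leftmost window when there are ties.
        if fraction > best_fraction:
            best_start, best_fraction = start, fraction
    return best_start, best_fraction


# Toy sequence (an assumption for illustration, NOT the oriC sequence):
# a G+C-rich flank, an A+T-rich core, and another G+C-rich flank.
toy = "GCGCGGCC" + "ATTATAAT" + "GGCCGCGC"
print(most_at_rich_window(toy, 8))
```

A scan of this kind only highlights composition; it says nothing about DnaA binding or about the energetics of strand separation, which are what actually determine where the duplex opens.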
Figure 1.2 Structure of oriC on the E. coli chromosome. The ATP‐dependent DnaA protein binds to sites throughout oriC and oligomerises in the DnaA Oligomerisation Region (DOR), driving DNA unwinding at the A+T‐rich DNA Unwinding Element (DUE). Single‐stranded T‐rich DNA in DUE binds to the DnaA oligomers at DOR. High‐affinity sites bind DnaA‐ATP or DnaA‐ADP; low affinity sites bind just DnaA‐ATP. Binding sites for the NAPs FIS and IHF are also shown: FIS and IHF modulate the process of replication initiation negatively and positively, respectively. The Dam methylase methylates oriC at several 5′‐GATC‐3′ sites (indicated by vertical arrows): hemimethylated sites bind SeqA, excluding DnaA and preventing untimely re‐initiation of chromosome replication.
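The role of Dam methylation described in the caption to Figure 1.2 can be expressed as a small piece of logic. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the sequence is a toy string and not the real oriC sequence, and the only rule modelled is the one given in the text, namely that hemimethylated 5′‐GATC‐3′ sites bind SeqA. Treating every other methylation state as "not sequestered" is a simplification.

```python
def find_gatc_sites(sequence):
    """Return the 0-based start positions of every 5'-GATC-3' site.

    GATC is its own reverse complement, so each position found on one
    strand marks a site on both strands.
    """
    seq = sequence.upper()
    sites = []
    start = seq.find("GATC")
    while start != -1:
        sites.append(start)
        start = seq.find("GATC", start + 1)
    return sites


def methylation_state(top_methylated, bottom_methylated):
    """Classify a GATC site by the methylation of its two strands."""
    if top_methylated and bottom_methylated:
        return "fully methylated"
    if top_methylated or bottom_methylated:
        return "hemimethylated"
    return "unmethylated"


def seqa_sequesters(top_methylated, bottom_methylated):
    """True when the site is hemimethylated, the state that binds SeqA."""
    return methylation_state(top_methylated, bottom_methylated) == "hemimethylated"


# Toy sequence for illustration only (NOT the oriC sequence).
toy = "TTGATCAAGGATCTTTTGATCC"
print(find_gatc_sites(toy))

# Just after the fork passes, the template strand is methylated and the
# newly synthesised strand is not, so the site is hemimethylated.
print(seqa_sequesters(top_methylated=True, bottom_methylated=False))
```

In this picture, a site becomes hemimethylated when it is replicated and stays sequestered until Dam methylates the new strand, which is why re‐initiation at a newly replicated oriC is delayed.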
Figure 1.3 Structure of the E. coli replisome in chromosome replication. The replisome is made up of the two cores of DNA Polymerase III, a gamma (γ) complex (or clamp loader) and the beta clamp together with a hexameric helicase, the DnaG primase, and the single‐stranded DNA‐binding protein, SSB. The DnaB helicase uses energy from ATP hydrolysis to translocate along the lagging strand, unwinding the DNA duplex. The two Polymerase III cores, linked by the tau subunits (Figure 1.4), are each dedicated to coordinated and simultaneous replication of the leading and lagging template strands of the replication fork. The ring‐like beta (β) clamp (DnaN), or processivity factor, encircles DNA and is attached to the replisome via the alpha