Molecular Biotechnology. Bernard R. Glick
In the natural bacterial CRISPR-Cas systems, short sequences (“protospacers”) from an invading DNA molecule are incorporated as “spacers” between repeat sequences in the CRISPR locus of the bacterial genome (Fig. 2.17A). Thus, the CRISPR locus contains an array of spacers, separated by repeat sequences, that is a record of past foreign DNA invasions that the bacterium survived. When the bacterium is subsequently invaded by a virus or plasmid whose DNA contains a sequence that is homologous to a spacer sequence, the spacer DNA is transcribed, producing a CRISPR RNA (crRNA) molecule that binds to and guides the Cas endonuclease complex to the target sequence in the invading DNA, which is cleaved (Fig. 2.17B). Recognition of the target sequence on the invading DNA requires that it be adjacent to a short specific sequence known as a protospacer adjacent motif (PAM). For example, the Streptococcus pyogenes endonuclease Cas9 recognizes a target sequence that is complementary to a crRNA only if it is immediately upstream of the motif NGG (where N is any nucleotide). PAMs are also important for selection of protospacers during spacer acquisition. The PAM requirement prevents cleavage of the bacterium’s own genome at sequences that are complementary to the crRNA, including the site in the CRISPR array from which the crRNA was transcribed, which lacks a PAM.
Figure 2.17 Bacterial CRISPR-Cas system for protection against invading bacteriophage. (A) Fragments of bacteriophage DNA (protospacer) are incorporated into the host bacterial genome as spacers between repeat sequences (gray) in the CRISPR array. (B) On subsequent invasion, the spacer DNA is transcribed to produce CRISPR RNA (crRNA) that guides an endonuclease (Cas) to a sequence in the invading DNA that is homologous to the spacer sequence and is adjacent to a protospacer adjacent motif (PAM). The viral genome is cleaved. Adapted by permission from Macmillan Publishers Ltd. from Yosef and Qimron, Nature 519:166–167, 2015.
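The PAM requirement described above can be illustrated with a short Python sketch that scans one strand of a DNA sequence for candidate S. pyogenes Cas9 target sites, that is, 20-nucleotide stretches immediately followed by NGG. The function name and the example sequence are hypothetical, and the sketch makes simplifying assumptions: it examines only the strand supplied and does not model guide-target mismatches.

```python
import re

def find_cas9_targets(dna, guide_len=20):
    """Return (start, protospacer, pam) for every site on the given strand
    where a guide_len-nt stretch is immediately followed by an NGG motif.

    Hypothetical helper for illustration; only the supplied strand is scanned.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]

# Made-up example: one 20-nt stretch followed by the PAM "TGG".
example = "TTT" + "ACGTACGTACGTACGTACGT" + "TGG" + "AAA"
sites = find_cas9_targets(example)
```

A fuller search would also scan the reverse complement, since a target site may lie on either strand of the DNA.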
Because of its relative simplicity compared to systems in other bacteria, the CRISPR-Cas system from S. pyogenes has been adapted for use as a genome engineering tool. In the natural S. pyogenes system, two RNA molecules, crRNA and transactivating crRNA (tracrRNA), form a crRNA:tracrRNA hybrid that directs the Cas9 endonuclease to the target site. For ease of use in genome engineering, the two RNAs are combined into a single guide RNA (sgRNA) that is 80 to 100 nucleotides long. The sgRNA is designed to include a 20-nucleotide sequence that is complementary to the target site (which is located adjacent to a PAM) and the fused crRNA:tracrRNA sequence, which forms a stem-loop structure involved in endonuclease binding (Fig. 2.18A). Following binding to the target sequence, the endonuclease makes a double-stranded break in the target DNA (Fig. 2.18B). This damage activates the cellular systems for DNA repair either by homologous recombination, in which DNA sequences with sufficient similarity are exchanged, or by nonhomologous end joining, in which sequences are deleted or inserted. The repair systems can be harnessed to disrupt, insert, or replace a DNA sequence at a targeted site.
Figure 2.18 CRISPR-Cas system for genome editing. (A) An 80- to 100-nucleotide-long single guide RNA (sgRNA) is constructed that contains a 20-nucleotide guide sequence (orange) that is complementary to the target site. The secondary structure, stabilized by intramolecular base-pairing between regions of the fused crRNA and tracrRNA sequences, is required for binding to the Cas9 endonuclease. (B) The sgRNA guides Cas9 to the target sequence (blue) in the genome. Target recognition requires an adjacent PAM sequence (red) NGG and complementarity between the guide sequence and the target sequence. Cas9 makes a double-stranded break in the target DNA (arrows), which is repaired by homologous recombination or nonhomologous end joining. The repair systems generate deletions and insertions at the target site.
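The design of an sgRNA can be sketched in Python as follows. The sketch derives the 20-nucleotide guide as the RNA reverse complement of the DNA strand it must pair with, and then appends the fused crRNA:tracrRNA sequence. It assumes that the guide occupies the 5' end of the sgRNA. The function names are hypothetical, and the scaffold used in the example is a placeholder of N's, not a real crRNA:tracrRNA sequence.

```python
# DNA base -> complementary RNA base
RNA_COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}

def guide_for_target(target_strand_dna):
    """Return the 20-nt RNA guide (5'->3') that base-pairs with the
    given DNA strand (5'->3'). Hypothetical helper for illustration."""
    target = target_strand_dna.upper()
    if len(target) != 20 or set(target) - set("ACGT"):
        raise ValueError("expected exactly 20 nucleotides of A, C, G, or T")
    # Base pairing is antiparallel, so the strand is read in reverse.
    return "".join(RNA_COMPLEMENT[base] for base in reversed(target))

def build_sgrna(target_strand_dna, scaffold_rna):
    """Join the guide to the fused crRNA:tracrRNA scaffold (assumed 3' of the guide)."""
    return guide_for_target(target_strand_dna) + scaffold_rna.upper()

# Placeholder scaffold: 70 unspecified nucleotides, NOT a real sequence.
placeholder_scaffold = "N" * 70
sgrna = build_sgrna("AAAACCCCGGGGTTTTACGT", placeholder_scaffold)
```

With a 70-nucleotide placeholder scaffold, the resulting molecule is 90 nucleotides long, which falls within the 80- to 100-nucleotide range given in the text.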
Insertion of a DNA sequence (donor sequence) into a target site requires introduction of the sequences for the sgRNA, the Cas9 endonuclease, and the donor DNA into a recipient cell. The sgRNA and Cas9 coding sequences may be introduced on a vector (Fig. 2.19A), or the sgRNA and Cas9 mRNA may be directly injected along with the donor DNA. When the genes are introduced on a vector, the promoters that drive expression of the sgRNA and the endonuclease, as well as the coding sequence (e.g., codon usage) for the endonuclease, are optimized for expression in the chosen host. The donor DNA sequence is flanked by sequences that are homologous to the target genomic site for insertion by homologous recombination (Fig. 2.19B). The vector is introduced into the recipient cell and, following expression of the sgRNA and the endonuclease, the recipient cell genome is cleaved at the target site. Activation of recombinases that mediate DNA repair results in recombination between homologous sequences on the vector and in the recipient genome and, thereby, insertion of the donor DNA into the genome at the target site (Fig. 2.19B).
Figure 2.19 (A) Vector for production of sgRNA and Cas9 in host cells. The gene encoding sgRNA contains a 20-nucleotide sequence (hatched region) that is complementary to the target site in the host genome. Promoters (arrows) for the sgRNA and Cas9 genes, and codon usage for Cas9, must be suitable for expression in host cells. An origin of replication (ori) and a selectable marker (e.g., bla encoding β-lactamase, which confers resistance to the antibiotic ampicillin) are included for initial vector construction in E. coli. The vector and donor DNA are introduced into a recipient cell. Following expression, the sgRNA guides the Cas9 endonuclease to the target sequence in the recipient cell chromosome and the endonuclease makes a double-stranded break in the target DNA. (B) The donor DNA sequence (green) is flanked by regions that are homologous to the target site (gray) for insertion by homologous recombination. Therefore, activation of recombinases that mediate DNA repair results in recombination between homologous sequences on the vector and in the recipient chromosome and, thereby, insertion of the donor DNA into the genome at the target site.
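The arrangement of donor DNA and flanking homologous sequences can be sketched with a toy string model in Python. The function names, the sequences, and the 4-nucleotide flanking regions are hypothetical and chosen only for readability. The position of the double-stranded break is supplied as a parameter rather than computed, and the recombination step is reduced to locating the two flanking regions in the genome and placing the donor sequence between them.

```python
def build_donor(genome, cut_index, insert, arm_len=4):
    """Flank `insert` with the genomic sequence on either side of the break.
    Toy model; hypothetical helper for illustration."""
    if not (arm_len <= cut_index <= len(genome) - arm_len):
        raise ValueError("break site is too close to the end of the sequence")
    left_arm = genome[cut_index - arm_len:cut_index]
    right_arm = genome[cut_index:cut_index + arm_len]
    return left_arm + insert + right_arm

def integrate(genome, donor, arm_len=4):
    """Insert the donor payload where both flanking regions match the genome."""
    left_arm = donor[:arm_len]
    payload = donor[arm_len:-arm_len]
    right_arm = donor[-arm_len:]
    site = genome.find(left_arm + right_arm)
    if site == -1:
        raise ValueError("flanking regions are not homologous to the genome")
    cut = site + arm_len
    return genome[:cut] + payload + genome[cut:]

genome = "AAAACCCCGGGGTTTTAAAACCCC"          # made-up target region
donor = build_donor(genome, cut_index=12, insert="GATTACA")
edited = integrate(genome, donor)
```

In this model the edited sequence carries the donor payload exactly at the break site, with the flanking regions restored on either side.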
Polymerase Chain Reaction
The polymerase chain reaction (PCR) is a simple, efficient procedure for synthesizing large quantities of a specific DNA sequence in vitro (see Milestone box on page 36). The reaction exploits the mechanism used by living cells to accurately replicate a DNA template. PCR can be used to produce millions of copies from a single template molecule and to detect a specific sequence in a complex mixture of DNA.
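To give a sense of how millions of copies can arise from a single template molecule, the following Python sketch models amplification under an idealized assumption that is not stated in the text above: each round of replication multiplies the number of copies by (1 + efficiency), so that at perfect efficiency the number doubles in every round. The function names and the efficiency parameter are hypothetical.

```python
def pcr_copies(cycles, start_copies=1, efficiency=1.0):
    """Idealized copy number after `cycles` rounds of replication.
    efficiency=1.0 means every template is copied in every round (assumption)."""
    return start_copies * (1 + efficiency) ** cycles

def cycles_to_reach(target_copies, start_copies=1, efficiency=1.0):
    """Smallest number of rounds needed to reach at least `target_copies`."""
    if efficiency <= 0:
        raise ValueError("efficiency must be greater than zero")
    cycles = 0
    copies = float(start_copies)
    while copies < target_copies:
        copies *= 1 + efficiency
        cycles += 1
    return cycles

rounds_for_a_million = cycles_to_reach(1_000_000)
```

Under these assumptions, 20 rounds of perfect doubling yield 2^20 = 1,048,576 copies from one template. Real reactions are less efficient, so this figure is best read as an upper bound.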
Milestone: Specific Enzymatic Amplification of DNA In Vitro: The Polymerase Chain Reaction
PCR, invented by Kary Mullis (U.S. patent 4,683,202), has had a tremendous impact on many research areas, including molecular biotechnology. The power of the method is in its simplicity, sensitivity, and specificity. It utilizes a mechanism similar to that used by our cells to accurately replicate a DNA template; it can detect and produce millions of copies from a single template molecule in a few hours; and, under appropriate conditions, it can amplify a specific sequence in a complex mixture of DNA molecules even when other similar sequences are present.
PCR was a unique idea that did not replace any existing technology. In the early 1980s, Kary Mullis was trying to solve the problem of using synthetic oligonucleotides to detect single-nucleotide mutations in sequences that were present in low concentration. He needed a method to increase the concentration of the target sequence. He reasoned that if he mixed heat-denatured DNA with two oligonucleotides that bound to opposite strands of the DNA at an arbitrary distance from each other and added some DNA polymerase and deoxynucleoside triphosphates, the polymerase would add the deoxynucleoside triphosphates to the hybridized oligonucleotides. The reaction did not yield the expected products. Mullis then heated the reaction products to separate the extended