Genome Editing in Drug Discovery. Group of authors
(a). If a host is reinfected by a previously encountered genetic element, the interference machinery can recognize the pathogen and digest its genome. Successful targeting by type II effectors (b) leads to recruitment of the Cas1:Cas2:Csn2 complex, which acquires prespacer sequences proximal to the double‐strand break. Type I systems (c) recruit the Cas1:Cas2 tetramer when efficient on‐target cleavage is inhibited, followed by the helicase Cas3, allowing the adaptation complex to acquire prespacers from the nontarget strand. In all cases, the mechanism of integration into the CRISPR array is similar (d). Prespacer ends are processed and then integrated through two nucleophilic attacks by the 3′‐OH ends at the first repeat sequence, with the gapped DNA filled in by the host machinery. Sites of DNA cleavage activity are depicted by orange arrows.
In order to rapidly respond to the ever‐changing gallery of infectious predators, CRISPR systems have evolved a further adaptation pathway that allows rapid acquisition of new spacers through the action of already existing CRISPR‐mediated immunity. This process is termed primed spacer acquisition or priming, and is at least 1000‐fold more efficient than naïve adaptation (Staals et al. 2016; Stringer et al. 2020). A reinfecting strain against which the host has already been immunized will be efficiently targeted and cleaved, allowing new spacers from proximal sequences to be acquired, updating the CRISPR array with more spacers. This mechanism is particularly important in preventing infection by new phage strains that arise by mutations in the target sequence, be it PAM or the protospacer, which would make CRISPR‐targeting ineffective (van Houte et al. 2016).
Priming differs mechanistically from naïve adaptation in its early stages, i.e. in the origin of the prespacer DNA. While in naïve adaptation the prespacers originate from intermediates of nucleolytic cleavage at compromised replication forks, in primed adaptation they are generated by the effector complex. The simplest mechanism is observed in type II systems (Figure 3.6b), where primed adaptation begins with Cas9‐crRNA‐tracrRNA complexes recognizing and then cleaving the protospacer in the invading genome. The resulting double‐strand breaks (DSBs) allow recruitment of the adaptation complex Cas1:Cas2:Csn2, which together with Cas9 leads to nucleolytic degradation of the bound DNA, although many mechanistic details are still missing (Wilkinson et al. 2019). Degradation of the invading DNA leads to the acquisition of spacers from both DNA strands, most frequently proximal to the DSB (Nussenzweig et al. 2019). Here, priming is absolutely dependent on on‐target cleavage by Cas9 and, as expected, mutations in the seed region or PAM abrogate spacer acquisition. As a result, the efficiency of spacer acquisition depends directly on the cutting efficiency mediated by a pre‐acquired spacer. This means that, once immunized, the host can rapidly acquire new spacers, forming a feed‐forward immunization response.
In contrast, type I priming is promoted by failure to recognize a perfect target. Mutations in the protospacer will nominally decrease cleavage efficiency; however, Cascade can nonetheless bind to such targets (Blosser et al. 2015). Failure to establish a full‐length R‐loop, in essence a failure to recognize a previously encountered target, is thought to abrogate potential off‐target cleavage and instead switch the Cascade complex into primed adaptation mode (Xue et al. 2016). This is achieved mechanistically by delaying the recruitment of the Cas3 nuclease (Figure 3.6c), permitting recruitment of the Cas1:Cas2 complex first (Redding et al. 2015). Once Cas3 subsequently binds, the Cas1:Cas2:Cas3 complex is able to translocate along the nontargeted DNA strand without degrading the target, with Cas1:Cas2 identifying new protospacers for integration (Hille et al. 2018). It remains unclear how prespacers are extracted once found, but it is conceivable that the machinery that processes prespacers is involved here as well.
3.4 CRISPR Systems as the Basis for New Tools in Drug Discovery
So far, we have discussed how different CRISPR systems exert their function and safeguard microbes from invading genetic elements. In the last part of this chapter, we turn to how the understanding of CRISPR biology has advanced biotechnology.
3.4.1 Cas Proteins for Gene Editing
The most prominent application of CRISPR systems since their discovery has been genome editing. This stemmed from the realization that Cas proteins can induce precise cuts in DNA (Marraffini and Sontheimer 2008; Garneau et al. 2010) guided by the crRNA molecule (Gasiunas et al. 2012; Jinek et al. 2012). The first protein shown to do this in a simple manner was the type II Cas9 protein of S. pyogenes. In type II systems, only one effector protein guided by crRNA:tracrRNA is required to introduce a guided DNA break, whereas type I systems require a dozen Cas proteins to form a functional Cascade complex. This simplicity has allowed the type II system to be used in a plethora of eukaryotic systems to introduce precise genetic changes (Cong et al. 2013; DiCarlo et al. 2013; Ding et al. 2013; Friedland et al. 2013; Hwang et al. 2013; Jinek et al. 2013; Mali et al. 2013). The system has been further simplified by fusing crRNA to tracrRNA into a chimeric single‐guide RNA (sgRNA) (Jinek et al. 2012), reducing the toolkit needed for gene editing to just two components. Because the sgRNA can very easily be replaced, this allows unprecedented flexibility in gene editing. SpyCas9‐mediated gene editing has since been one of the most prolific technologies in biomedical sciences, with applications ranging from precise (epi)genome engineering, modulation of gene activity, and forward genetic screens to proteomics, imaging studies, diagnostics, and therapy.
While gene editing will be addressed in detail elsewhere in this book, we would like to give the basics here. SpyCas9, guided by crRNA:tracrRNA (or more frequently sgRNA), once heterologously expressed in a cell, tissue, or organism, is directed to a specific site in the genome where, upon successful recognition, it introduces a blunt‐ended DSB. Repair of the DSB by the host machinery can result in mutagenic events such as insertions or deletions through nonhomologous end joining or microhomology‐mediated end joining (optimal for generating genetic knockouts), or can stimulate the introduction of a desired DNA sequence via recombination with a donor sequence (the desired outcome for model generation and therapy). It is therefore of utmost importance that Cas9 introduces a break specifically and efficiently, and only at the desired site. Furthermore, the precision and efficiency of other gene editing methods, such as base editing (Komor et al. 2016), prime editing (Anzalone et al. 2019), or site‐specific transposition (Chen and Wang 2019), are also dictated in part by the properties of the Cas protein.
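The targeting step described above can be illustrated with a minimal Python sketch. It is not taken from any published tool: the function name `find_target_sites`, the 20 nt protospacer length, the NGG PAM pattern for SpyCas9, and the cut position 3 bp upstream of the PAM are assumptions made for illustration, based on commonly reported properties of SpyCas9. For brevity the sketch scans the forward strand only.

```python
import re
from typing import Dict, List


def find_target_sites(sequence: str,
                      protospacer_len: int = 20,
                      cut_offset: int = 3) -> List[Dict[str, object]]:
    """Return candidate SpyCas9 sites on the forward strand of `sequence`.

    Assumptions (illustrative, not from the chapter): the PAM is NGG and sits
    directly 3' of the protospacer; the blunt cut falls `cut_offset` bases
    upstream of the PAM.
    """
    seq = sequence.upper()
    # A lookahead lets overlapping candidate sites be reported as well.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    sites = []
    for match in pattern.finditer(seq):
        start = match.start()
        sites.append({
            "start": start,                    # 0-based index of first protospacer base
            "protospacer": match.group(1),
            "pam": match.group(2),
            # The break lies between seq[cut_index - 1] and seq[cut_index].
            "cut_index": start + protospacer_len - cut_offset,
        })
    return sites
```

Calling `find_target_sites` on a sequence returns one dictionary per candidate site. A fuller version would also scan the reverse complement, since a target can lie on either strand.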
As discussed previously, Cas9 binds and identifies its target by recognizing the PAM sequence and then base pairing the crRNA, initially with the seed sequence and subsequently with the remainder of the target, followed by activation of the nucleolytic activities of the RuvC and HNH domains. A plethora of biochemical studies have unequivocally confirmed that mismatches between the crRNA and the seed sequence prevent Cas9 cleavage (Jinek et al. 2012). However, mismatches outside the seed sequence (i.e. 10–20 nt away from the PAM) are well tolerated and support efficient cutting (Anderson et al. 2015), allowing Cas9 to cleave at off‐target sites. Potential off‐target mutagenesis induced by Cas proteins is a serious concern in therapy and model development, in particular because Cas9 has been shown to induce large deletions and chromosomal rearrangements (Kosicki et al. 2018). To circumvent such problems, a number of computational and experimental methods have been developed to identify and prevent off‐target modifications (refer to Chapter 20).
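The distinction between seed and PAM‐distal mismatches can be sketched as a simple classifier. This is a deliberately crude illustration, not one of the published off‐target prediction methods: the names `classify_site` and `SEED_LENGTH`, the 10 nt seed boundary, and the NGG PAM check are assumptions chosen to mirror the description above. Real tools weight mismatches by position and identity rather than applying a hard cutoff.

```python
from typing import List

SEED_LENGTH = 10  # assumption: the 10 PAM-proximal nucleotides are treated as the seed


def has_ngg_pam(pam: str) -> bool:
    """True if `pam` is a 3 nt NGG motif (assumed SpyCas9 PAM)."""
    return len(pam) == 3 and pam[1:].upper() == "GG"


def mismatch_distances(guide: str, protospacer: str) -> List[int]:
    """Return the distance from the PAM (1 = PAM-adjacent) of every mismatch.

    Both sequences are written 5'->3' with the PAM lying 3' of the protospacer.
    """
    if len(guide) != len(protospacer):
        raise ValueError("guide and protospacer must have the same length")
    n = len(guide)
    return [n - i
            for i, (g, t) in enumerate(zip(guide.upper(), protospacer.upper()))
            if g != t]


def classify_site(guide: str, protospacer: str, pam: str) -> str:
    """Crude classification of a candidate site relative to a guide sequence."""
    if not has_ngg_pam(pam):
        return "no_pam"            # no PAM, so no target recognition is expected
    distances = mismatch_distances(guide, protospacer)
    if not distances:
        return "on_target"         # perfect match
    if any(d <= SEED_LENGTH for d in distances):
        return "seed_mismatch"     # expected to block cleavage
    return "distal_mismatch"       # may be tolerated: potential off-target site
```

Under these assumptions, a site carrying a single mismatch at the PAM‐distal end would be flagged as `"distal_mismatch"`, the category the text identifies as a source of off‐target cleavage.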
3.4.1.1 Cas9 Variants
A major strategy to reduce off‐target editing is to increase the specificity of the Cas9 protein. To this end, SpyCas9 has been mutated at strategic residues so that mismatches between the crRNA and the target strand are no longer compatible with activation of the enzyme’s nucleolytic activity, generating Cas9 variants (such as eSpCas9, HypaCas9, and SpCas9‐HF1) with improved specificity (Slaymaker et al. 2016; Kleinstiver et al. 2016a; Chen et al. 2017). However, improved specificity has often been shown to have a negative impact on enzyme activity, decreasing the overall editing efficiency (Jones et al. 2020; Schmid‐Burgk et al. 2020), likely due to the altered interaction between sgRNA