Genome Editing in Drug Discovery. Group of authors
between the crRNA and target RNA at the central seed sequence is essential for binding and then stimulating RNase activity (Abudayyeh et al. 2016; Tambe et al. 2018), whereas peripheral mismatches are tolerated to a greater extent. Further elements are also needed to activate the RNase activity of the HEPN domains. The sequence and structure of the hairpin are critical: reducing its length below 24 nt or mutating key nucleotides significantly decreased activity (Abudayyeh et al. 2016; Smargon et al. 2017). Analogous to the PAM, Cas13 also requires a protospacer flanking site (PFS) proximal to the target site; the exact sequence and position depend on the subtype and species. For example, Cas13a requires a 3’ non‐G PFS, while Cas13b needs a PFS on either side of the protospacer, with the 5’ PFS being depleted of C and the 3’ PFS having a consensus sequence of NAN or NNA, where N is any nucleotide (Abudayyeh et al. 2016; Smargon et al. 2017).
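The PFS rules quoted above can be written down as a simple sequence filter. The following Python sketch is only an illustration under stated assumptions: the function names are hypothetical, "depleted of C" is simplified to a hard "not C" rule, and the PFS is assumed to sit immediately next to the protospacer (one nucleotide for the 5’ PFS and for the Cas13a 3’ PFS, three nucleotides for the Cas13b 3’ PFS). Real preferences vary with subtype and species, as noted in the text.

```python
def cas13a_pfs_ok(target_rna: str, start: int, length: int) -> bool:
    """Check the 3' non-G rule: the base right after the protospacer must not be G.

    `start` is the 0-based index of the first protospacer base in `target_rna`
    (written 5'->3'); `length` is the protospacer length. Both are illustrative
    parameters, not values taken from the source.
    """
    seq = target_rna.upper()
    pfs_index = start + length
    if pfs_index >= len(seq):
        return False  # no flanking base available to inspect
    return seq[pfs_index] != "G"


def cas13b_pfs_ok(target_rna: str, start: int, length: int) -> bool:
    """Check a simplified double-sided rule: 5' base is not C, 3' triplet is NAN or NNA."""
    seq = target_rna.upper()
    end = start + length
    if start < 1 or end + 3 > len(seq):
        return False  # flanking bases are missing on one side
    five_prime = seq[start - 1]
    three_prime = seq[end:end + 3]
    # NAN -> A in the second position; NNA -> A in the third position
    return five_prime != "C" and (three_prime[1] == "A" or three_prime[2] == "A")
```

For instance, with the toy target `GACUGUAC` and a 4 nt protospacer starting at index 1, both checks pass: the 3’ flanking base is U, the 5’ flanking base is G, and the 3’ triplet UAC matches NAN.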
Productive base pairing between crRNA and target RNA triggers a conformational change that brings the two HEPN domains into close proximity, where they assemble into a composite catalytic site on the outer surface of the protein (Liu et al. 2017b; Liu et al. 2017c). Activated HEPN domains can indiscriminately degrade ssRNA, both the target at single‐stranded regions and any other RNA in the vicinity (Abudayyeh et al. 2016; East‐Seletsky et al. 2016). This indiscriminate RNase activity was shown to restrict invasion by RNA phages, such as MS2 (Abudayyeh et al. 2016). Furthermore, a recent study showed that type VI systems can confer immunity to DNA phages as well. As in type III systems, the indiscriminate RNase activity of Cas13 is able to induce growth arrest of the infected cells. However, while the growth arrest induced by type III systems is transient because Csm6 (which also contains HEPN domains) is deactivated, Cas13 is not deactivated (likely due to persistent transcription from the phage genome, which remains intact) and consequently induces dormancy. In this way, further expansion of the phage is prevented, achieving immunity at the population level (Meeske et al. 2019).
3.3.3 Adaptation
The last phase of CRISPR‐mediated immunity to be addressed here is the adaptation phase, also known as spacer acquisition. In this phase, the memory of previous infections is recorded, allowing an organism and its descendants to mount immunity against the reinvading genome. The CRISPR array acts as a genetic ledger, with spacers acting as records of infections (the most recent usually located closest to the leader sequence). The key players in spacer acquisition are Cas1 and Cas2, which are present in nearly every CRISPR system. In contrast to the ubiquity of these genes, understanding of the molecular mechanisms of adaptation is restricted to type I and type II systems, and many mechanistic details are still missing; more studies are needed to elucidate this process in its entirety.
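The ledger analogy can be made concrete with a small data structure. The sketch below is a deliberately simplified model, not a description of any real locus: the class name `CrisprArray`, its methods, and the placeholder sequences are all hypothetical, and it assumes that every new spacer is inserted at the leader‐proximal end, which the text describes only as the usual case.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CrisprArray:
    """Toy model of a CRISPR array as a chronological ledger of infections."""
    repeat: str
    # Index 0 is the leader-proximal position (assumed to hold the newest spacer).
    spacers: List[str] = field(default_factory=list)

    def acquire(self, spacer: str) -> None:
        """Record a new infection by inserting the spacer next to the leader."""
        self.spacers.insert(0, spacer)

    def as_sequence(self) -> str:
        """Return the repeat-spacer-repeat layout, read from the leader end."""
        parts = [self.repeat]
        for spacer in self.spacers:
            parts.extend([spacer, self.repeat])
        return "".join(parts)
```

Calling `acquire("aaa")` and then `acquire("bbb")` on an array with repeat `"RR"` leaves `spacers` as `["bbb", "aaa"]`, so reading from the leader end gives infections in reverse chronological order.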
The acquisition of spacers upon the invasion of an infectious genome that the host has not encountered before is often termed naïve spacer acquisition. This is an extremely rare event, estimated to occur with a frequency of 10⁻⁷ per infected cell (Hynes et al. 2014; Heler et al. 2015). The acquisition of spacers by the Cas1:Cas2 adaptation complex in these circumstances can be viewed as accidental, as it relies on other systems to generate a suitable substrate for the complex. In E. coli, spacer acquisition was linked to excessive replication of the viral or plasmid genome, during which the inherently unstable replication intermediates are recognized and digested by the RecBCD endonuclease complex (involved in restriction and recombination), generating small fragments that can be captured and further processed by the adaptation complex (Levy et al. 2015). A similar mechanism has been described in Gram‐positive species as well, where the AddAB restriction system is able to enrich for spacers (Figure 3.6a).
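To give a feel for what a per‐cell frequency of 10⁻⁷ means at the population level, the following back‐of‐the‐envelope sketch may help. It assumes, purely for illustration, that each infected cell acquires a spacer independently with the same probability; the function names and the population sizes used below are hypothetical and do not come from the cited studies.

```python
import math


def expected_acquisitions(infected_cells: int, frequency: float = 1e-7) -> float:
    """Expected number of cells that acquire a spacer (simple binomial mean)."""
    return infected_cells * frequency


def prob_at_least_one(infected_cells: int, frequency: float = 1e-7) -> float:
    """Probability that at least one infected cell acquires a spacer.

    Computes 1 - (1 - p)^N via log1p/expm1 to stay accurate for tiny p.
    """
    return -math.expm1(infected_cells * math.log1p(-frequency))
```

Under these assumptions, a population of 10⁹ infected cells would be expected to yield on the order of 100 acquisition events, whereas among 10⁷ infected cells the chance of seeing at least one event is roughly 63%.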
The latter phase of adaptation is mediated by the highly conserved Cas1:Cas2 adaptation complex. However, some CRISPR systems also require additional proteins, such as Csn2 and Cas9 of the S. pyogenes type II system (Heler et al. 2015; Wei et al. 2015), or Cas4 of type I and type V systems (Lee et al. 2018). Furthermore, in certain species with type III and type VI systems, fusions of cas1 and reverse transcriptase (RT) genes have been discovered (Silas et al. 2016; Gonzalez‐Delgado et al. 2019; Toro et al. 2019) and shown to insert spacers originating from both RNA and DNA, where the integration of RNA is followed by cDNA synthesis by the RT domain (Silas et al. 2016; Gonzalez‐Delgado et al. 2019).
The adaptation complex is composed of a central Cas2 dimer flanked on either side by a Cas1 dimer (Figure 3.6d). The captured DNA fragments (termed prespacer DNA) are bound by the central Cas2 dimer, with Tyr22 of E. coli Cas1 acting as a protein wedge that splays open the bound dsDNA fragment (Nuñez et al. 2014; Nuñez et al. 2015). The splayed ends of the bound prespacer DNA can be trimmed by host enzymes containing a DnaQ 3’‐5’ exonuclease domain (such as DNA polymerase III, exonuclease T, DnaQ, or in some species Cas2‐DnaQ fusion proteins) into fragments with 5 nt overhangs (Kim et al. 2020; Ramachandran et al. 2020), which are optimal for integration into the CRISPR locus (Nuñez et al. 2015). In some CRISPR systems, trimming can be performed by Cas4, a RecB nuclease domain‐containing protein (Lee et al. 2018). Either way, the prespacer DNA is held at a fixed length set by the distance between the two Cas1 subunits flanking the Cas2 dimer, ensuring a uniform length of spacers within the CRISPR array. Finally, the 3’ ends are positioned in the active sites of Cas1, poised for catalysis at the CRISPR array.
While the Cas1:Cas2 complex contacts protospacer DNA through the phosphate backbone rather than through base‐specific interactions, the selection of specific protospacers by the adaptation machinery is often nonrandom. Spacers are preferentially acquired from sequences proximal to PAM sites (Savitskaya et al. 2013), and the adaptation complex of the E. coli type I system is able to select functional prespacers by directly recognizing the PAM sequence (Datsenko et al. 2012; Swarts et al. 2012; Wang et al. 2015). Similarly, the aforementioned Cas4 also seems to have a key role in selecting prespacers (Rollie et al. 2018). Furthermore, recognition of the PAM sequence is also used to orient the prespacer in the CRISPR array so that the crRNA ultimately contains the correct sequence necessary for recognition (Shiimori et al. 2018).
Crucially, while the PAM is used to identify functional prespacers and position them in the correct orientation, it must be removed prior to integration into the CRISPR array; otherwise, it will induce self‐immunity. This is achieved by removing the PAM sequence immediately prior to integration, either by the exonucleases (Ramachandran et al. 2020) or by Cas4 (Lee et al. 2019). Coupling the recognition of functional PAM sites with prespacer processing increases the chance of integrating a functional spacer that will be compatible with future interference phases.
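The logic of PAM‐guided selection followed by PAM removal can be illustrated with a short sequence scan. This Python sketch is a toy model under explicit assumptions: the function name `select_prespacers` is hypothetical, the PAM motif and spacer length are caller‐supplied parameters rather than values for any particular system, and the protospacer is assumed to lie immediately 3’ of the PAM on the strand supplied. The returned spacers never include the PAM itself, mirroring the requirement that it be trimmed before integration.

```python
from typing import List


def select_prespacers(dna: str, pam: str, spacer_len: int) -> List[str]:
    """Return PAM-adjacent candidate spacers with the PAM itself excluded.

    Assumes (for illustration only) that the protospacer starts right after
    the PAM on the given strand. Candidates that would run off the end of
    the sequence are skipped.
    """
    seq = dna.upper()
    motif = pam.upper()
    spacers = []
    pos = seq.find(motif)
    while pos != -1:
        start = pos + len(motif)  # first base after the PAM
        candidate = seq[start:start + spacer_len]
        if len(candidate) == spacer_len:
            spacers.append(candidate)  # PAM bases are left behind
        pos = seq.find(motif, pos + 1)  # step by one to allow overlapping motifs
    return spacers
```

With the toy input `select_prespacers("TTAAGCCCCGGGGAAGTT", "AAG", 5)`, the first AAG yields the spacer `CCCCG`, while the second AAG lies too close to the end of the sequence to yield a full‐length candidate.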
Integration into the CRISPR array occurs more frequently at the 5’ end of the locus, adjacent to the AT‐rich leader sequence (Figure 3.6d). Efficient, on‐target integration requires a major distortion of the target DNA. In E. coli, this is performed by the architectural protein integration host factor (IHF), which binds tightly to the leader sequence of the CRISPR array (Nuñez et al. 2016), inducing a major bend in the DNA (Wright et al. 2017). The active integration complex binds to the bent DNA and catalyzes the nucleophilic attack of the 3’OH group of the prespacer DNA at each end of the repeat sequence in the CRISPR array (Nuñez et al. 2015), resulting in a gapped intermediate that is ultimately resolved by the host’s DNA repair and transcription machinery (Ivančić‐Baće et al. 2015; Budhathoki et al. 2020). In species lacking IHF (most Gram‐positive bacteria), recognition of the first repeat instead requires a short, 5 base pair‐long motif termed the leader‐anchoring site (LAS) (McGinn and Marraffini 2016), which is directly recognized by the adaptation complex.
Figure 3.6 Adaptation phase(s) in CRISPR systems. Depending on whether invading genomes have been encountered before, spacers can be acquired without (naïve) or with (primed) the assistance of the CRISPR interference machinery. Fragments of the invading DNA generated by the RecBCD/AddAB restriction