Genome Editing in Drug Discovery. Group of authors
major limitation of using SpyCas9 for genome editing is the PAM sequence. As discussed previously, type II effector proteins require a compatible PAM sequence downstream of the protospacer sequence. The preferred PAM for SpyCas9 is 5’‐NGG‐3’, although it can use other GC‐rich PAMs at lower efficiencies (Leenay et al. 2016). To expand the targeting repertoire, Cas9 variants have been evolved to use other PAM sequences. The initial variants drew on a structural understanding of the SpyCas9 mechanism: the key residue (Arg1335) involved in recognition of the third G in the PAM was mutated, and compensatory mutations in its vicinity allowed the resulting proteins, SpyCas9‐VQR and SpyCas9‐EQR, to use NGAN and NGNG as PAMs (Kleinstiver et al. 2015). A similar strategy was used to generate two other variants: QQR1, which uses NAAG (Anders et al. 2016), and SpyCas9‐NG, which requires only the dinucleotide 5’‐NG‐3’ (Nishimasu et al. 2018). More recently, thanks to sophisticated in vitro evolution systems based on phage‐assisted continuous evolution, multiple SpyCas9 variants that recognize non‐G PAMs have been evolved; these variants can use NRN and, to a lesser extent, NYN PAMs (Walton et al. 2020) and NRNH PAMs (where R represents A or G, Y is C or T, and H is A, C or T) (Miller et al. 2020). Although the broadened PAM recognition comes at some cost to efficiency, the development of these variants brings us one step closer to a Cas system able to target any genomic sequence without restrictions imposed by the PAM.
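To give a feel for how much each relaxation of the PAM widens the targeting range, the sketch below computes the fraction of positions on one strand of a random sequence that would carry a matching PAM. It assumes equal and independent base frequencies (real genomes are not uniform) and says nothing about editing efficiency; the names `IUPAC`, `PAMS`, and `pam_frequency` are illustrative and do not come from the cited studies.

```python
from fractions import Fraction

# IUPAC degenerate codes used in the text, mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "H": "ACT", "N": "ACGT",
}

# PAMs as quoted in the text; the grouping into one table is illustrative.
PAMS = {
    "SpyCas9": "NGG",
    "SpyCas9-VQR": "NGAN",
    "SpyCas9-EQR": "NGNG",
    "SpyCas9-NG": "NG",
    "NRN variant": "NRN",
    "NRNH variant": "NRNH",
}

def pam_frequency(pam):
    """Fraction of random positions matching `pam`, assuming uniform bases."""
    freq = Fraction(1)
    for code in pam:
        # Each position contributes (allowed bases) / 4.
        freq *= Fraction(len(IUPAC[code]), 4)
    return freq

if __name__ == "__main__":
    for name, pam in PAMS.items():
        print(f"{name:14s} {pam:5s} {pam_frequency(pam)}")
```

Under this simplified model, NGG is expected at 1 in 16 positions per strand, whereas an NRN PAM is expected at 1 in 2, which illustrates why relaxing the PAM matters even when efficiency drops.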
3.4.1.2 Cas9 Orthologs
While in vitro evolution of SpyCas9 has expanded its utility, all current variants still require a guanine in the PAM sequence or remain poorly active with non‐G PAMs, making variants of this enzyme largely unsuitable for targeting AT‐rich sequences. Furthermore, while mutagenesis of Cas9 has improved its specificity, structural constraints limit how far its properties can be evolved. Moreover, the size of the protein is relevant for therapeutic applications: the vectors typically used for nuclease delivery (AAV) have a packaging capacity (typically ~4.7 kb (Grieger and Samulski 2005)) very close to the size of the SpyCas9 expression module, which makes this system difficult to use.
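The packaging argument can be made concrete with a rough size check. The sketch below is illustrative only: the 4.7 kb capacity is the figure quoted above, the protein lengths (1368 amino acids for SpyCas9 and 1053 for Staphylococcus aureus Cas9, a smaller ortholog) are commonly cited values that are not given in this chapter, and `overhead_bp` is a hypothetical placeholder for the promoter, polyadenylation signal, and other non‐coding elements, whose real size depends on the construct.

```python
AAV_CAPACITY_BP = 4700  # ~4.7 kb, as quoted in the text

# Commonly cited protein lengths in amino acids (assumption: not from this chapter).
CAS9_LENGTH_AA = {"SpyCas9": 1368, "SauCas9": 1053}

def coding_length_bp(length_aa):
    """Coding sequence length: three bases per residue plus one stop codon."""
    return 3 * length_aa + 3

def remaining_capacity_bp(name, overhead_bp=1000):
    """Space left in the vector; a negative value means the cassette does not fit.

    overhead_bp is a hypothetical allowance for regulatory elements.
    """
    return AAV_CAPACITY_BP - coding_length_bp(CAS9_LENGTH_AA[name]) - overhead_bp

if __name__ == "__main__":
    for name, length_aa in CAS9_LENGTH_AA.items():
        print(name, coding_length_bp(length_aa), remaining_capacity_bp(name))
```

Even with no regulatory overhead at all, the SpyCas9 coding sequence leaves only a few hundred base pairs of headroom under these assumptions, which is the practical motivation for the smaller proteins discussed next.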
An alternative to in vitro evolution of Cas9 toward different activities, specificities, and sizes is to use Cas9 proteins originating from different species. Owing to their staggering sequence variation, Cas9 proteins from different species exhibit diverse specificities, activities, thermodynamic properties (e.g. greater stability and activity at higher temperatures), and PAM requirements. For example, in parallel with the discovery of SpyCas9 and the interference mechanisms, two orthogonal Cas9 proteins from Streptococcus thermophilus (StCas9) were described (Magadan et al. 2012; Karvelis et al. 2013); these were later shown to display editing efficiencies comparable to SpyCas9 in human cells, but with a substantially lower off‐target rate and longer, more diverse PAMs (NNAGAAW and NGGNG, where W is A or T) (Cong et al. 2013; Muller et al. 2016). Over the last decade, a number of different, often smaller, Cas9 proteins have been described and then used for genetic engineering in eukaryotic cells. These include Cas9 proteins from Staphylococcus aureus (SauCas9) with the NNGRRT PAM (Ran et al. 2015); Neisseria meningitidis Nme1Cas9 with the NNNNGATT PAM and lower off‐target activity due to longer crRNAs (Esvelt et al. 2013; Hou et al. 2013; Zhang et al. 2013), as well as Nme2Cas9 requiring NNNNCC (Edraki et al. 2019); Staphylococcus auricularis (SauriCas9) requiring an NNGG PAM (Hu et al. 2020); Francisella novicida (FnCas9) with an NGG PAM (with rational engineering further relieving the PAM constraint to YG) (Hirano et al. 2016); Campylobacter jejuni (CjCas9) with NNNNRYAC (Kim et al. 2017); Streptococcus canis (ScCas9) with NNG (Chatterjee et al. 2018); Geobacillus stearothermophilus (GeoCas9) using NNNNCRAA as a PAM (Harrington et al. 2017); and Streptococcus macacae (SmacCas9) requiring an NAAN PAM (Chatterjee et al. 2020).
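The PAM requirements listed above lend themselves to a simple lookup: given a sequence, which orthologs have at least one candidate site? The following is a minimal sketch under stated assumptions. It scans only the strand it is given, assumes a fixed 20 nt protospacer for every enzyme (preferred guide lengths actually differ between orthologs), and takes no account of activity or specificity. The table `ORTHOLOG_PAMS` copies the PAMs quoted in the text; the function names are hypothetical.

```python
import re

# IUPAC codes used in the PAMs above, written as regex character classes.
IUPAC_REGEX = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "N": "[ACGT]",
}

# PAMs as quoted in the text (read 5' to 3', downstream of the protospacer).
ORTHOLOG_PAMS = {
    "SpyCas9": "NGG",
    "SauCas9": "NNGRRT",
    "Nme1Cas9": "NNNNGATT",
    "Nme2Cas9": "NNNNCC",
    "SauriCas9": "NNGG",
    "FnCas9": "NGG",
    "CjCas9": "NNNNRYAC",
    "ScCas9": "NNG",
    "GeoCas9": "NNNNCRAA",
    "SmacCas9": "NAAN",
}

def pam_to_regex(pam):
    """Translate a degenerate PAM into a regular expression fragment."""
    return "".join(IUPAC_REGEX[code] for code in pam)

def find_sites(seq, pam, spacer_len=20):
    """Return (start, protospacer, pam_sequence) for every site on this strand.

    A lookahead is used so that overlapping sites are all reported.
    """
    pattern = re.compile(
        r"(?=([ACGT]{%d})(%s))" % (spacer_len, pam_to_regex(pam))
    )
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq.upper())]

def enzymes_with_sites(seq, spacer_len=20):
    """Names of orthologs with at least one candidate site in `seq`."""
    return sorted(
        name for name, pam in ORTHOLOG_PAMS.items()
        if find_sites(seq, pam, spacer_len)
    )
```

For example, `enzymes_with_sites("ACGTACGTACGTACGTACGT" + "CAGGGT")` lists the enzymes whose PAM fits somewhere in that 26 nt stretch. A practical tool would also scan the reverse complement and use enzyme‐specific guide lengths.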
Recently, a biochemical tour de force identified and characterized the activity of 79 novel Cas9 proteins. Previously unknown G‐, A‐, T‐, and C‐rich PAM repertoires, together with different cut patterns (blunt or staggered ends), kinetics, and crRNA sequences, were described, forming the basis of a catalogue of orthologs that can be used for genome editing in the future (Gasiunas et al. 2020). Together with the ever‐expanding and ever‐improving set of variants with a wide range of PAM specificities (Collias and Beisel 2021), these efforts suggest that it will become possible to choose a Cas9 protein for genome editing on the basis of the desired target sequence, without being constrained by the PAM requirements, specificity, or activity of any single enzyme.
3.4.1.3 The Use of Other Cas Proteins in Genome Editing
In parallel with the advent of new SpyCas9 variants and Cas9 orthologs, there was a push to examine whether other types of CRISPR‐Cas proteins can be used for gene editing. Although type I systems were essentially the first CRISPR systems to be examined in detail (Barrangou et al. 2007), the fact that they require a multitude of proteins (at various stoichiometries) made the technology unappealing for use in eukaryotic systems. Type I systems are used extensively for precise engineering in microbes (Kiro et al. 2014; Li et al. 2016; Pyne et al. 2016; Xu et al. 2019), and have only recently been implemented for editing in human cells. Because type I systems rely on the hyperactive Cas3 helicase/nuclease, editing by the Cascade complex leads to large deletions (up to 100 kb) from a single cut site (Dolan et al. 2019; Morisaka et al. 2019; Osakabe et al. 2020). Type I systems are therefore an excellent tool for deleting large segments of the genome (which could be useful for removing transgenes from model organisms or interrogating regulatory elements, for example). The system has also been adapted for introducing small insertions by fusing the Cascade complex to the FokI endonuclease domain. When Cas3 was omitted and FokI‐Cascade fusions were expressed to target proximal sequences, DNA sequences were successfully deleted (Cameron et al. 2019). Although the efficiency of Cascade‐FokI mediated editing is low compared with standard SpyCas9, it is on par with other genome editing approaches using catalytically inactive Cas proteins (see Section 3.4.2).
The most prominent example of another type of CRISPR system used for genome editing is Cas12a (originally named Cpf1), a member of the type V systems (Zetsche et al. 2015), which has some beneficial features over Cas9 proteins. First, the first three described Cas12a proteins, from Francisella novicida (FnCas12a), Acidaminococcus sp. (AsCas12a), and Lachnospiraceae bacterium (LbCas12a), use T‐rich PAM sequences, such as TTN and TTTN, making them a complementary tool to Cas9 proteins that use G‐rich PAMs. Second, Cas12a proteins are able to autonomously process crRNA from pre‐crRNA, unlike Cas9 proteins, which require tracrRNA (Deltcheva et al. 2011). This property has been exploited to significantly improve simultaneous genome editing at multiple sites: multiple Cas12a‐compatible crRNAs are expressed as a single pre‐crRNA transcript, in contrast to Cas9 multiplex editing, which requires multiple expression modules, one for each sgRNA (Zetsche et al. 2017; DeWeirdt et al. 2021). The most prominent benefit of Cas12a is, however, its cleavage mechanism: Cas12a generates staggered ends (unlike the blunt ends generated by most Cas9 proteins) outside the critical seed region. Staggered ends are particularly suitable for precise integration of DNA via NHEJ or MMEJ (Maresca et al. 2013). Because Cas12a cuts the target DNA away from the seed sequence, a site repaired via NHEJ (leading to small indels) will still support cleavage. Cas12a editing is therefore likely to promote resection of the staggered DNA break, leading to deletions spreading into the seed region or, more favorably, promoting HDR and MMEJ (Begemann et al. 2017; Moreno‐Mateos et al. 2017; Li et al. 2018a). Finally, Cas12a shows lower tolerance to mismatches, with only mismatches in the last 4 nt of a 23 nt‐long crRNA still supporting cleavage (Kleinstiver et al. 2016b), in contrast to SpyCas9, which will cut DNA even with 10 mismatches (Klein et al. 2018; Jones et al. 2020).
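The mismatch rule quoted above for Cas12a can be written as a toy predicate. This is a deliberately simplified sketch: it encodes only the statement that mismatches in the last 4 nt of a 23 nt crRNA are tolerated, numbers positions from the first (PAM‐proximal) base of the guide, which is an assumption about orientation, and ignores mismatch type, mismatch number, and sequence context. The function names and the `tolerant_tail` parameter are hypothetical.

```python
def mismatch_positions(spacer, target):
    """1-based positions at which the guide and the target site differ."""
    if len(spacer) != len(target):
        raise ValueError("spacer and target must have the same length")
    return [
        i + 1
        for i, (a, b) in enumerate(zip(spacer.upper(), target.upper()))
        if a != b
    ]

def cas12a_toy_cleavage(spacer, target, tolerant_tail=4):
    """Toy rule: cleavage is predicted only if every mismatch lies in the
    last `tolerant_tail` positions of the guide."""
    first_tolerant = len(spacer) - tolerant_tail + 1
    return all(pos >= first_tolerant for pos in mismatch_positions(spacer, target))

if __name__ == "__main__":
    guide = "ACGTACGTACGTACGTACGTACG"          # 23 nt, arbitrary example
    distal = guide[:21] + "T" + guide[22:]     # mismatch at position 22
    proximal = guide[:4] + "T" + guide[5:]     # mismatch at position 5
    print(cas12a_toy_cleavage(guide, distal))
    print(cas12a_toy_cleavage(guide, proximal))
```

A real specificity model would weight each position and mismatch type separately; the point here is only that the tolerated window sits at the far end of the guide.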
As with Cas9 systems, new Cas12a orthologs have been described that expand the targeting spectrum, allowing a Cas12a protein with convenient PAM requirements to be chosen, such as those of Coprococcus eutactus (CeCas12a) or F. novicida (FnCas12a) (Aliaga Goltsman et al. 2020; Chen et al. 2020; Zetsche et al. 2020). Furthermore, AsCas12a has been evolved for higher specificity and activity (Kleinstiver et al. 2019).
Other type V systems have very recently been adapted for gene editing. Type V‐B systems, epitomized by Cas12b effector proteins, were difficult to use for gene editing because of their collateral ssDNase activity and low activity at mammalian physiological temperature.