Genome Engineering for Crop Improvement. Group of authors
of the more successful methods for redirecting targeting involves generating a library of three zinc‐finger variants from a pre‐selected pool of zinc‐finger monomers (Maeder et al. 2008). The resulting library of zinc‐finger arrays can then be interrogated using a bacterial two‐hybrid screen, in which binding of the zinc‐finger array to a pre‐determined sequence results in the expression of a selectable marker gene. This method has generated highly active zinc‐finger nuclease (ZFN) pairs for sites within animal and plant genomes. Since the development of ZFN technology, several studies have sought to engineer specific zinc‐finger modules for each of the 64 codon triplets (Bae et al. 2003; Dreier et al. 2001; Pabo et al. 2001). To date, ZFNs have been designed and used in numerous species, and the development of more specific and efficient design methods has also reduced off‐target effects. The three most commonly used platforms for engineering zinc‐finger domains are Context‐Dependent Assembly (CoDA), Oligomerized Pool Engineering (OPEN), and Modular Assembly (MA). Several software resources are also available: ZiFiT for designing engineered zinc fingers, ZiFDB as a database of zinc fingers, and ZFNGenome for identifying potential ZFN targets in several model organisms (Kim et al. 2009; Mandell and Barbas 2006; Sander et al. 2007).
Figure 1.1 (A) Diagrammatic representation of DSB formation mediated by (a) zinc‐finger nucleases (ZFNs), (b) transcription activator‐like effector nucleases (TALENs), and (c) clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9. (B) dCas9‐based targeted genome regulation by (a) activation of gene expression, (b) repression of gene expression, and (c) DNA methylation.
Source: Adapted from Mahfouz et al. (2014) © 2014. Reproduced with the permission of John Wiley & Sons.
Zinc‐finger nucleases have been widely used for plant genome engineering. Plant species that have been modified using zinc‐finger nucleases include Arabidopsis, maize, soybean, and tobacco, among others (Ainley et al. 2013; Cai et al. 2009; Curtin et al. 2011; Lloyd et al. 2005; Marton et al. 2010; Osakabe et al. 2010; Shukla et al. 2009; Townsend et al. 2009; Wright et al. 2005; Zhang et al. 2010). With their relatively small size (~300 amino acids per zinc‐finger nuclease monomer) and further advances in methods for redirecting targeting (Sander et al. 2011a), zinc‐finger nucleases should continue to be an effective technology for editing plant genomes.
1.3 TALENs
Transcription activator‐like effector nucleases (TALENs) are fusion proteins consisting of a DNA‐binding domain and a DNA‐cleavage domain. Whereas the DNA‐cleavage domain is the same in zinc‐finger nucleases and TALENs (the catalytic portion of FokI), the DNA‐binding domains differ. The TALEN DNA‐binding domain is derived from TALE proteins found in the plant pathogen Xanthomonas. These proteins are composed of direct repeats of 33–35 amino acids, and nearly all arrays found in Xanthomonas end with a final half repeat consisting of the first 20 amino acids of the normal repeat. Two amino acids within each repeat (positions 12 and 13) are responsible for recognizing a single nucleotide base; these amino acids are referred to as repeat‐variable diresidues (RVDs). Once the TALE effector code was broken (i.e. the relationship between each RVD and its corresponding target base was established) (Boch et al. 2009; Moscou and Bogdanove 2009), the ability to redirect TALE targeting, and thus to use TALEs as a genome engineering tool, was realized (Christian et al. 2010; Li et al. 2011; Mahfouz et al. 2011). To make TALENs useful in gene targeting, the basic requirement is the modular assembly of repeat sequences, each containing the RVD that corresponds to the intended nucleotide target. The most widely used RVDs and their nucleotide targets are: HD, cytosine; NG, thymine; NI, adenine; NN, guanine and adenine; NS, adenine, cytosine, and guanine; and N*, all four nucleotides. This one‐to‐one correspondence of a single RVD to a single DNA base has eliminated the construction challenges caused by the context‐dependency seen with zinc fingers and meganucleases. However, one limitation of TALENs is that the target sequence must have a thymine at the −1 position (Boch et al. 2009). Further, the long and repetitive nature of TALENs puts a strain on delivery methods in which cargo capacity or stability is limiting.
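The RVD code listed above lends itself to a simple lookup. The following Python sketch is illustrative only: it assumes one preferred RVD per base (NI, HD, NN, NG), treats position −1 as the base immediately 5′ of the target, and uses hypothetical function names (`design_rvd_array`, `rvd_array_matches`) that do not correspond to any published design tool.

```python
# RVDs and the bases they recognize, as listed in the text.
RVD_TARGETS = {
    "HD": {"C"},
    "NG": {"T"},
    "NI": {"A"},
    "NN": {"G", "A"},
    "NS": {"A", "C", "G"},
    "N*": {"A", "C", "G", "T"},
}

# Assumption for illustration: pick one commonly used RVD per base.
PREFERRED_RVD = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def design_rvd_array(genomic_seq, start, length):
    """Return one RVD per base for the target genomic_seq[start:start + length].

    Raises ValueError if the base at position -1 (index start - 1) is not
    thymine, or if the target contains a character other than A, C, G, T.
    """
    seq = genomic_seq.upper()
    if start < 1 or length < 1 or start + length > len(seq):
        raise ValueError("target must fit in the sequence and leave room for position -1")
    if seq[start - 1] != "T":
        raise ValueError("position -1 must be thymine")
    target = seq[start:start + length]
    unknown = set(target) - set(PREFERRED_RVD)
    if unknown:
        raise ValueError("unsupported bases in target: %s" % sorted(unknown))
    return [PREFERRED_RVD[base] for base in target]


def rvd_array_matches(rvds, target):
    """Check whether every RVD in the array can recognize the aligned base."""
    target = target.upper()
    if len(rvds) != len(target):
        return False
    return all(base in RVD_TARGETS[rvd] for rvd, base in zip(rvds, target))


if __name__ == "__main__":
    # "T" at position -1, followed by the 4-base target "ACGT".
    print(design_rvd_array("TACGT", start=1, length=4))
```

Because NN recognizes both guanine and adenine, an array designed this way may also bind sites that differ from the intended target at NN positions; `rvd_array_matches` can be used to check such alternative sites.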
The assembly of engineered TALE repeat arrays can be challenging because the repeat sequences are nearly identical; therefore, a number of platforms have been designed to facilitate this assembly. These can be classified into three categories: standard restriction enzyme and ligation‐based cloning methods (Huang et al. 2011; Sander et al. 2011); Golden Gate assembly methods (Briggs et al. 2012; Cermak et al. 2011; Engler et al. 2008); and solid‐phase assembly methods (Heigwer et al. 2013; Wang et al. 2012).
Several online tools are available for designing TALE effectors against specific gene sequences and for off‐target analysis. Examples include E‐TALEN (Lin et al. 2014), Scoring Algorithm for Predicting TALEN Activity (SAPTA) (Neff et al. 2013), Mojo‐hand (Coordinators 2013), and TAL Effector‐Nucleotide Targeter (TALE‐NT). TALE‐NT is a collection of versatile web‐based tools comprising TALEN Targeter, TAL Effector Targeter, Target Finder, Paired Target Finder, and TALEN Targeter Off‐Target Counter (Christian et al. 2013).
Several studies have demonstrated the usefulness of TALENs in different plant species, including Arabidopsis (Zhang et al. 2013), tobacco (Wang et al. 2012; Wendt et al. 2013), barley (Li et al. 2012), rice (Shan et al. 2013a), and Brachypodium (Reyon et al. 2011). Taken together, the modular nature of TALE repeats, along with efficient methods for assembling repetitive DNA sequences (Garneau et al. 2010; Wang et al. 2012), has enabled TALENs to become one of the premier tools for plant genome engineering.
1.4 CRISPR‐Cas System
The most recent addition to the SSN family is the CRISPR/Cas system, which is naturally present in bacteria and archaea and provides adaptive immunity against invading plasmids and viruses. The CRISPR/Cas system destroys invading nucleic acids by introducing targeted DNA breaks (Garneau et al. 2010).
There are three major types of CRISPR/Cas system: Types I–III (Makarova et al. 2011). The Type II system was adopted for genome engineering a few years ago (Cong et al. 2013; Zhang et al. 2011). In this system, two components enable targeted DNA cleavage: a Cas9 protein and an RNA complex consisting of a CRISPR RNA (crRNA; contains 20 nucleotides of RNA that are homologous to the target site) and a trans‐activating CRISPR RNA (tracrRNA). The Cas9 protein introduces a double‐stranded DNA break at sequences that are homologous to the crRNA sequence and lie upstream of a protospacer‐adjacent motif (PAM; e.g. NGG for Streptococcus pyogenes Cas9). For genome engineering purposes, the complexity of the system was reduced by fusing the crRNA and tracrRNA to generate a single‐guide RNA (gRNA). However, off‐target cleavage is a limitation of the CRISPR/Cas system (Cho et al. 2014; Fu et al. 2013).
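The targeting rule described above (a 20‐nucleotide sequence immediately upstream of an NGG PAM) can be sketched as a simple sequence scan. The Python example below is a minimal illustration, not a substitute for dedicated gRNA design software: the function names are hypothetical, only the S. pyogenes NGG PAM is considered, and positions on the minus strand are reported relative to the reverse‐complemented sequence.

```python
import re


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_cas9_targets(seq, protospacer_len=20):
    """List candidate Cas9 target sites on both strands.

    Each hit is (strand, start, protospacer, pam), where the protospacer is
    the stretch matched by the crRNA/gRNA and the PAM (NGG) follows it
    directly on the same strand. 'start' is 0-based on the scanned strand.
    """
    seq = seq.upper()
    # The lookahead lets overlapping candidate sites be reported as well.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


if __name__ == "__main__":
    example = "A" * 20 + "TGG"  # toy sequence: one site on the plus strand
    for hit in find_cas9_targets(example):
        print(hit)
```

Scanning both strands matters because a usable PAM may occur on either strand of the target locus.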
Target site recognition in the CRISPR‐Cas system is mediated by an RNA:DNA interaction (as opposed to the protein:DNA interaction used by meganucleases, zinc‐finger nucleases, and TALENs). Redirecting Cas9 to a new target involves modifying 20 nucleotides within the crRNA or gRNA. Although these 20 nucleotides direct Cas9 binding and cleavage, the system has been shown to tolerate mismatches, with a higher tolerance closer to the 5′ end of the target sequence (Fu et al. 2013). Results from recent studies suggest the first 8–12 nucleotides, in addition to the PAM sequence, are most critical for target site recognition (Sternberg et al. 2014; Wu et al. 2014). To reduce off‐targeting, several methods have been developed, including dual nicking of DNA (Mali et al. 2013; Ran et al. 2013), fusion of catalytically dead Cas9 to FokI (Guilinger et al. 2014; Tsai et al. 2014), and shortening of the gRNA sequence (Fu et al. 2014). Several software tools have been developed in recent years for identifying target sequences in the genome and designing specific gRNAs; these are listed in Table 1.1.
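The position dependence of mismatch tolerance can be illustrated with a small comparison of a guide against a candidate off‐target site. This Python sketch rests on stated assumptions: both sequences are written 5′ to 3′ as DNA with the PAM at the 3′ end, the critical region is taken to be the 12 nucleotides nearest the PAM (consistent with mismatches being better tolerated toward the 5′ end), and the function name and `seed_len` parameter are hypothetical. It counts mismatches only and does not predict cleavage.

```python
def mismatch_profile(guide, site, seed_len=12):
    """Compare a guide with a candidate site of equal length.

    Both sequences are read 5' to 3', with the PAM assumed to follow the
    3' end. Returns the total number of mismatches, the number falling in
    the PAM-proximal 'seed' (the last seed_len positions), and the 0-based
    mismatch positions counted from the 5' end.
    """
    guide, site = guide.upper(), site.upper()
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    positions = [i for i, (g, s) in enumerate(zip(guide, site)) if g != s]
    seed_start = len(guide) - seed_len
    seed_mismatches = [i for i in positions if i >= seed_start]
    return {
        "total": len(positions),
        "seed": len(seed_mismatches),
        "positions": positions,
    }


if __name__ == "__main__":
    guide = "ACGT" * 5
    distal_mismatch = "T" + guide[1:]     # mismatch at the 5' end
    proximal_mismatch = guide[:-1] + "A"  # mismatch next to the PAM
    print(mismatch_profile(guide, distal_mismatch))
    print(mismatch_profile(guide, proximal_mismatch))
```

Under these assumptions, a site whose mismatches all fall outside the seed would be a more plausible off‐target candidate than one with the same number of mismatches near the PAM.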
The