Spectrums of Amyotrophic Lateral Sclerosis. Multiple authors
results from altered TDP‐43 [24]. A loss‐of‐function model for TDP‐43 implicates not only mis‐spliced targets but also a lower expression of these targets. Conversely, conserved exons that should be included in properly‐spliced transcripts are skipped (skiptic) in cells with TDP‐43 variants [26]. In mice carrying certain variants in either the RNA‐binding or low‐complexity domains, exons were either aberrantly included (cryptic) or excluded (skiptic), respectively [26].
Cytoplasmic TDP‐43–containing inclusions are a common hallmark of ALS, appearing in as many as 97% of ALS cases [27] and in all but SOD1 variant carriers [28]. There is still debate about whether the aggregates are toxic per se. First, while most ALS‐causing variants occur in the C‐terminal region of the TARDBP transcript, aggregated C‐terminal TDP‐43 fragments do not appear to be the driving force of the pathology [29]. Second, the N‐terminus of TDP‐43 appears to strongly affect cell pathology and aggregation [30], although ALS‐associated variants are not often observed in the first exons of TARDBP. Lastly, the normal function of the RNA‐binding domains of TDP‐43 appears to prevent the aggregation of the protein: when RNA targets of TDP‐43 are not available to bind, aggregation of TDP‐43 increases [31]. Supporting the involvement of RNA‐binding for ALS pathology, TARDBP variants require intact TDP‐43 RNA‐binding domains to exert neurotoxic effects [32].
Fused in Sarcoma (FUS)
Using loss‐of‐heterozygosity mapping in a consanguineous family, variants in a region of chromosome 16 were found to be linked to ALS [33]. Simultaneously, a separate group found fused in sarcoma (FUS) variants segregating dominantly in several pedigrees after screening a similar linkage region [34]. Protein‐altering variants have been observed across the FUS coding sequence, but the most penetrant variants (those not observed in unaffected controls) are clustered in the nuclear localization signal (NLS) domain encoded by the last exon [17]. FUS variants have been linked to strongly penetrant and dominantly inherited forms of ALS. Indeed, variants in FUS have been consistently associated with earlier onset and even juvenile cases of ALS, with age at onset generally below the disease average [35]. While FUS variants are a rare cause of both familial and sporadic ALS (approximately 4% of familial ALS and less than 1% of sporadic ALS [4]), the discovery of another aggregating RNA‐binding protein in ALS strengthened the themes of protein aggregation, RNA metabolism, and nuclear trafficking [17].
FUS and TDP‐43 are similar in that they localize in the nucleus under normal conditions, regulate splicing of pre‐mRNA, transport transcripts from the nucleus, and form cytoplasmic aggregates in the presence of specific ALS‐associated variants [36, 37]. However, FUS is also a transcription factor, binding to open chromatin to regulate transcription [38]. FUS binds RNA directly at long introns but also interacts with the splicing machinery to affect RNA processing indirectly [37]. FUS has a different RNA sequence recognition motif from TDP‐43 and does not bind the same set of transcripts [36]. In addition to its sequence motif, FUS binds a specific RNA secondary structure [35]. Cytoplasmic FUS aggregates might indicate a loss of these specific RNA‐binding functions as FUS is sequestered [39], and it remains unclear whether FUS aggregates are toxic or an indication of lowered FUS activity.
Chromosome 9 Open Reading Frame 72 (C9orf72)
A hexanucleotide repeat expansion (HRE) of GGGGCC in the first intron of chromosome 9 open reading frame 72 (C9orf72) was discovered by two simultaneous studies [40, 41]. The locus containing C9orf72 had been identified in linkage scans of large families with multiple ALS cases [42], but the actual variant was elusive because the contemporary paradigm was to search for single nucleotide exonic variants. The C9orf72 HRE was the first noncoding variant with a substantial impact on ALS genetic research [4]. Currently the most common genetic cause of ALS, the C9orf72 HRE explains about 10% of all ALS cases (approximately 40% of familial and 7% of sporadic) [4]. The frequency of the C9orf72 HRE depends strongly on population, ranging from 20% of Finnish ALS cases to very rare in Asian populations [7]. Alleles in the range of 2 to 20 repeats are considered normal and non‐pathogenic, whereas repeat lengths above 30 are strongly penetrant for ALS. Indeed, repeat lengths of several thousand have been reported, with a potential correlation between disease severity and length [43]. Intermediate lengths between 20 and 30 repeats have been observed and have recently been recognized as associated with ALS, although with a lower risk than expanded alleles [44]. The C9orf72 HRE is unstable at very large repeat lengths [43], but somatic expansion of normal‐length alleles is unlikely to occur [45].
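The repeat‐length categories described above can be written out as a small sketch. This is an illustration only, not a genotyping or clinical tool. The function names are hypothetical, and the handling of the boundary values is an assumption: the text gives 2 to 20 as normal, 20 to 30 as intermediate, and above 30 as expanded, so here exactly 20 is counted as normal and exactly 30 as intermediate. Counts below 2 are outside the stated range and are simply grouped with normal alleles.

```python
import re

HEXAMER = "GGGGCC"  # repeat unit of the C9orf72 HRE, as given in the text


def longest_repeat_run(seq: str, unit: str = HEXAMER) -> int:
    """Return the largest number of uninterrupted tandem copies of `unit` in `seq`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


def classify_repeat_count(n: int) -> str:
    """Map a repeat count to the categories described in the text.

    Assumed boundaries (the text leaves 20 and 30 ambiguous):
    n <= 20 normal, 21-30 intermediate, n > 30 expanded.
    """
    if n < 0:
        raise ValueError("repeat count cannot be negative")
    if n <= 20:
        return "normal"
    if n <= 30:
        return "intermediate"
    return "expanded"


if __name__ == "__main__":
    # Toy sequence: five tandem copies, a two-base interruption, then two more.
    toy = "AT" + HEXAMER * 5 + "TT" + HEXAMER * 2
    n = longest_repeat_run(toy)
    print(n, classify_repeat_count(n))
```

Counting only the longest uninterrupted run is a simplification chosen for clarity; interrupted repeats would need a more tolerant definition.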
Adding to the evidence that RNA metabolism is a significant factor in ALS pathology, C9orf72 HRE carriers show RNA foci and aberrant RNA translation. RNA foci result when HRE transcripts amass together and subsequently sequester RNA‐binding proteins [46]. For example, RNA‐binding proteins such as hnRNPA2B1 and hnRNPA1, as well as the splicing factors SRSF1 and SRSF2, are colocalized in RNA foci [46]. TDP‐43 has been observed to bind G‐quadruplex structures containing the GGGGCC repeat, implicating a loss of TDP‐43 function in C9orf72 ALS [47]. These sequestered proteins could have an impact on proper RNA splicing, and indeed transcriptomic aberrations have been observed in the cerebellums of C9orf72 HRE patients [48]. RNA foci are likely not toxic per se, as expression of RNA containing only very long repeats (without the C9orf72 transcript) does not cause acute toxicity in Drosophila neurons [49]. While the C9orf72 HRE is in an intronic region of the pre‐spliced C9orf72 transcript, the repeat itself can be translated into dipeptide repeat proteins (DPRs) [50]. Through a process known as repeat‐associated non‐ATG‐initiated (RAN) translation, five different DPRs with two repeating amino acids each (depending on direction and reading frame) are generated. DPRs might affect ALS pathology in several ways. First, zebrafish models have demonstrated that expressing DPRs without the C9orf72 HRE results in motor deficits and morphological defects [51]. Second, poly‐Gly‐Arg and poly‐Pro‐Arg DPRs are toxic in Drosophila neurons, and a mass spectrometry screen of proteins bound to DPRs showed enrichment of ribosomal proteins, potentially implicating translational inhibition [52]. Third, the poly‐Pro‐Arg DPR might alter proper nuclear pore function and affect nuclear import and export [53]. The direct effects of DPRs on cell function and survival are still debated, and more study will be needed to determine whether the levels generated by the inherited C9orf72 HRE are related to ALS progression.
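The statement that RAN translation yields five different DPRs "depending on direction and reading frame" can be checked with a short sketch that reads the GGGGCC repeat in three frames on each strand using the standard genetic code. The function names and the number of repeat copies are hypothetical choices for illustration. The sketch ignores how RAN translation is initiated and everything outside the repeat; it only shows which dipeptide each frame encodes.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G (third position varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str, frame: int) -> str:
    """Translate `seq` from offset `frame`, using complete codons only."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


def ran_dipeptides(unit: str = "GGGGCC", copies: int = 10) -> dict:
    """Map (strand, frame) to the two-residue unit encoded by the repeat."""
    sense = unit * copies
    strands = {"sense": sense, "antisense": reverse_complement(sense)}
    return {
        (name, frame): translate(strand, frame)[:2]
        for name, strand in strands.items()
        for frame in range(3)
    }


if __name__ == "__main__":
    products = ran_dipeptides()
    for key, dipeptide in products.items():
        print(key, dipeptide)
    # Residue order within a unit depends on where reading starts, so compare as sets.
    distinct = {frozenset(d) for d in products.values()}
    print(len(distinct), "distinct dipeptide repeats")
```

Under these assumptions the sense frames encode Gly‐Ala, Gly‐Pro, and Gly‐Arg, and the antisense frames encode Gly‐Pro, Pro‐Ala, and Pro‐Arg. Gly‐Pro arises from both strands, which is why six frames give five distinct DPRs.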
RECENTLY DISCOVERED GENES
The rate of gene discovery appears linear, with new ALS genes discovered each year (Figure 2.1a). However, newly discovered genes tend to explain very few ALS cases, and despite the constant rate of gene discovery, the percentage of genetically explained cases appears to have plateaued (Figure 2.1b). Nonetheless, for some of these genes there is significant evidence of involvement in ALS, and through them we can expand the scope of ALS cell biology theory. Below we discuss three recent gene discoveries, chosen because they were replicated in multiple cohorts or pedigrees.
Annexin A11 (ANXA11)
By examining families with multiple affected individuals, Smith et al. observed the Annexin A11 (ANXA11) p.D40G variant segregating with ALS [62]. In a replication cohort, additional variants were observed; and while more than half were in the N‐terminus, no functional domain was enriched for variants. These variants may account for as much as 1–2% of ALS, whether sporadic or familial. The finding of ANXA11 variants in ALS patients was quickly replicated in Chinese cohorts [63–65], with results ranging from ANXA11 being a frequently altered gene in sporadic ALS patients [65] to ANXA11 variants being rare and of uncertain impact [64]. As variants will be found at varying frequencies in different cohorts, more studies will be needed to assess whether any given variant is coincidental or an actual cause of disease. Similar to some RNA‐binding proteins and SOD1 [17], ANXA11 variants appear to cause its encoded protein to aggregate when overexpressed, and in turn these