Chemistry and Biology of Non-canonical Nucleic Acids. Naoki Sugimoto
of Z-DNA-binding proteins has been identified in various organisms including eukaryotes, prokaryotes, and viruses [10].
2.4 Branched DNA with Junction
Single-stranded DNAs that have palindromic sequences in closely positioned 5′ and 3′ regions form a stable hairpin structure. Even when such a sequence lies within a duplex such as genomic DNA, negative supercoiling can loosen the twist of the helix and induce the formation of these intrastrand structures. A cruciform is a structure in which two intrastrand hairpins form at opposite positions within the duplex (Figure 2.6). The region where the branched DNA structure arises is called a junction. A similar branched DNA structure with a four-way junction, called the Holliday junction, is formed during homologous recombination (Figure 2.6). X-ray and NMR studies of the four-way junction showed that the four arms are aligned in pairs to give an oblique “X” structure with continuity of base stacking and helical axes across the junction. This feature, in which the helices formed by the branched chains stack coaxially on each other, is also frequently observed in RNAs that contain junctions. This suggests the importance of stacking interactions between nucleobases, which stabilize the helix even when the helices are separated by a junction. Numerous proteins have been shown to interact with the cruciform by recognizing features of the structure such as DNA crossovers, four-way junctions, and curved or bent DNA. Many of them are involved in chromatin organization, transcription, replication, DNA repair, and other processes [11].
Two pathways, S-type and C-type, have been proposed for cruciform formation in DNA (Figure 2.6) [12]. The S-type pathway is triggered by a small opening of the duplex, which allows a few bases to form intrastrand base pairs that act as a nucleus for the branched structure. Formation of these few base pairs is followed by migration and extension of the intrastrand base pairing, which is further facilitated by negative supercoiling. The C-type pathway occurs when a considerably long region of the duplex melts owing to flanking AT-rich sequences, which allows one-step formation of the mature cruciform. The C-type pathway probably does not occur at physiological ion concentrations, but it may take place in DNA regions with a propensity to undergo substantial denaturation, such as replication origins [13].
Figure 2.6 Structure of the four-way junction that forms the cruciform and the Holliday junction. (a) S-type and C-type pathways to form a cruciform structure from a long duplex. (b) Structure of the Holliday junction formed by DNA strands with the 5′-TCGGTACCGA-3′ sequence (PDB ID: 1M6G). A top view of the four consecutive TA nucleobases in the region of interstrand exchange is shown above the structure.
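The palindromic sequences discussed in this section can be checked computationally. The following Python sketch is an illustration only and is not taken from the cited studies: the function names (`reverse_complement`, `is_self_complementary`, `find_inverted_repeats`) and the thresholds `min_arm` and `max_loop` are hypothetical choices. It tests whether a strand equals its own reverse complement and scans for inverted repeats that could, in principle, fold into the hairpin arms of a cruciform. It considers sequence only and ignores supercoiling and thermodynamic stability.

```python
# Watson-Crick complements for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_self_complementary(seq: str) -> bool:
    """True if the strand reads the same as its reverse complement (a palindrome)."""
    return seq.upper() == reverse_complement(seq)


def find_inverted_repeats(seq: str, min_arm: int = 4, max_loop: int = 6):
    """List candidate hairpin arms as (start, arm_length, loop_length) tuples.

    For every possible loop position and loop length, the two arms are
    extended outward for as long as the facing bases are complementary.
    min_arm and max_loop are illustrative thresholds (assumptions).
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for left in range(n):                      # last base of the 5' arm
        for loop in range(max_loop + 1):       # unpaired bases between the arms
            right = left + loop + 1            # first base of the 3' arm
            arm = 0
            while (
                left - arm >= 0
                and right + arm < n
                # .get() returns None for non-ACGT characters, which never match
                and COMPLEMENT.get(seq[left - arm]) == seq[right + arm]
            ):
                arm += 1
            if arm >= min_arm:
                hits.append((left - arm + 1, arm, loop))
    return hits


if __name__ == "__main__":
    holliday = "TCGGTACCGA"  # sequence quoted in the caption of Figure 2.6(b)
    print(is_self_complementary(holliday))
    print(find_inverted_repeats(holliday))
```

For the sequence in Figure 2.6(b), 5′-TCGGTACCGA-3′, `is_self_complementary` returns `True`, because the strand reads the same as its reverse complement.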
2.5 Multi-stranded DNA Helices
Base pairing through hydrogen bonds and stacking interactions between the assembled nucleobases are the main factors that maintain the organized structure of nucleic acids. Multi-stranded structures are formed when nucleobases in three or more different strands interact with each other to form an assembled structure and stack stably on one another. Figure 2.7 shows the canonical duplex and other typical multi-stranded helices of nucleic acids whose structures are registered in the Protein Data Bank, including those consisting of unnatural nucleobases.
Among the multi-stranded nucleic acid structures, triplexes and tetraplexes, composed of three and four strands, respectively, have been confirmed not only in aqueous solution but also under intracellular conditions, and they have attracted attention as non-canonical structures that contribute to gene regulation. Various studies have examined the molecular mechanisms by which they influence gene expression from the viewpoints of thermodynamic stability and conformational dynamics. The structural characteristics and the factors that contribute to the thermodynamic stabilities of the triplex and of representative tetraplexes, the G-quadruplex and the i-motif, are described in Chapter 3. The effects of these multi-stranded structures on gene expression are explained in Chapters 5–8, focusing on DNA replication including telomeric regions, RNA transcription, and protein translation.
2.6 Structures in RNA
RNA is transcribed from DNA as a single strand, whereas natural DNA basically exists as a pair of complementary strands. The information that is translated into functional proteins as a sequence of amino acids is encoded in the primary sequence of RNA. On the other hand, information that modulates gene expression resides in the higher-order structures of RNA, which depend on secondary and tertiary interactions.
2.6.1 Basic Structure Distinctions of RNA
As described in Chapter 1, natural RNA differs from DNA in two aspects of its chemical structure. One is the ribose sugar, which bears a 2′-hydroxyl group, and the other is the uracil nucleobase, which lacks the 5-methyl group of thymine in DNA. Regarding structure formation, the 2′-hydroxyl group of ribose is the most important factor, conferring conformational features distinct from those of DNA. The basic feature that RNA forms an A-type duplex arises because the RNA ribose adopts the C3′-endo sugar pucker owing to the presence of the 2′-hydroxyl group. The constitutive characteristic that RNA is basically single-stranded is also an important element that allows RNA to form various secondary and tertiary structures.
Figure 2.7 Typical structures of multi-stranded DNA helices. (a) Canonical duplex (PDB ID: 1BNA). (b) Triplex (PDB ID: 1D3X). (c) G-quadruplex (PDB ID: 139D). (d) i-motif (PDB ID: 1YBL). (e) DNA hexaplex (PDB ID: 2FZA). (f) DNA octaplex (PDB ID: 1V3P). Top views of the coplanar nucleobase interactions that are important for the formation of each multi-stranded helix are shown alongside the structures. BrC indicates cytosine modified by bromine.
2.6.2 Elements in RNA Secondary Structures
Single-stranded RNAs basically form their structures through Watson–Crick-type base pairing within the same strand. Bases that are not involved in base pairs remain as single-stranded loops. Depending on how these single-stranded loop regions are formed, RNA adopts various secondary structures.
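The division of a strand into paired and unpaired bases can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions, not a structure prediction method: the function name `pair_from_ends`, the minimum loop size `min_loop`, and the use of dot-bracket notation (parentheses for paired bases, dots for unpaired bases) are choices made here and do not come from the text. The sketch only pairs the 5′ end with the 3′ end and works inward, using Watson–Crick pairs alone, and it ignores thermodynamic stability.

```python
# Watson-Crick pairs in RNA; wobble pairs such as G-U are deliberately left out.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def pair_from_ends(rna: str, min_loop: int = 3) -> str:
    """Pair the 5' end with the 3' end, moving inward, and return dot-bracket text.

    '(' and ')' mark bases in the intrastrand duplex; '.' marks bases left
    unpaired in the single-stranded loop. min_loop, the smallest loop allowed,
    is an assumption of this sketch.
    """
    rna = rna.upper()
    structure = ["."] * len(rna)
    i, j = 0, len(rna) - 1
    # Keep pairing while at least min_loop bases would remain between i and j.
    while j - i - 1 >= min_loop and (rna[i], rna[j]) in WATSON_CRICK:
        structure[i], structure[j] = "(", ")"
        i += 1
        j -= 1
    return "".join(structure)


if __name__ == "__main__":
    print(pair_from_ends("GGGAAAUCCC"))
```

For the hypothetical sequence `GGGAAAUCCC`, the three G–C pairs at the ends are marked as paired and the four central bases remain as the loop, giving `(((....)))`.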
2.6.2.1 Hairpin Loop
A hairpin loop, also called a stem loop, is formed when an RNA sequence has complementary regions on its 5′ and 3′ sides. The complementary regions form a duplex, and the region in between remains as a single-stranded loop. It is an essential secondary structure element not only for interactions with proteins but also for the further formation of complex tertiary structures of RNA. RNA transcripts from the DNA regions, which potentially form