Molecular Biotechnology. Bernard R. Glick
then digested with the restriction endonuclease prior to insertion into a vector via sticky-end ligation.
Figure 2.11 Synthesis of double-stranded cDNA using gene-specific primers (A) or oligo(dT) primers (B). A short oligonucleotide primer is added to a mixture of purified mRNA and anneals to a complementary sequence on the mRNA. Reverse transcriptase catalyzes the synthesis of a DNA strand from the primer using the mRNA as a template. To synthesize the second strand of DNA, the mRNA is nicked by RNase H, which creates initiation sites for E. coli DNA polymerase I. The 5′ exonuclease activity of DNA polymerase I removes RNA sequences that are encountered as DNA synthesis proceeds. The ends of the cDNA are blunted using T4 DNA polymerase prior to cloning.
When the sequence of the target mRNA intended for cloning is not known or when several target mRNAs in a single sample are of interest, cDNA can be generated from all of the mRNAs using an oligo(dT) primer rather than a gene-specific primer (Fig. 2.11B). The mixture of cDNAs, ideally representing all of the mRNAs produced by the cell, is cloned into a vector to create a cDNA library that can be screened for the target sequence(s) (described below).
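As a rough illustration of oligo(dT)-primed first-strand synthesis, the following Python sketch treats the first cDNA strand as the DNA reverse complement of a polyadenylated mRNA. The function name, the 12-nucleotide primer length, and the example sequences are hypothetical choices made for illustration; the sketch ignores second-strand synthesis, incomplete extension, and internal priming.

```python
# Base pairing used by reverse transcriptase: RNA template base -> DNA base.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna, primer_length=12):
    """Return the first-strand cDNA (written 5'->3') for an mRNA (written 5'->3').

    The oligo(dT) primer is assumed to anneal only if the mRNA ends with at
    least `primer_length` adenines (an assumed, illustrative threshold).
    Returns None when no poly(A) tail is available for priming.
    """
    if not mrna.endswith("A" * primer_length):
        return None
    # Synthesis starts at the 3' end of the mRNA, so read the template backward.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(mrna))


# Hypothetical transcripts: one polyadenylated, one without a poly(A) tail.
polyadenylated = "AUGGCC" + "A" * 12
no_tail = "AUGGCCGU"

print(first_strand_cdna(polyadenylated))
print(first_strand_cdna(no_tail))
```

In this model, a transcript without a poly(A) tail yields no cDNA, which is one reason a library primed with oligo(dT) only ideally represents every mRNA in the cell.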
Recombinational Cloning
Recombinational cloning is a rapid and versatile system for cloning sequences without restriction endonuclease and ligation reactions. It is particularly useful when a large number of DNA fragments are to be cloned into one type of vector, for example, when protein-coding sequences are introduced into an expression vector so that thousands of different proteins can be produced and purified in parallel to create a proteomic microarray (described later in this chapter). One method, known as Gateway cloning technology, exploits the mechanism used by bacteriophage λ to integrate viral DNA into the host bacterial genome during infection. Bacteriophage λ integrates into the E. coli chromosome at a specific sequence (25 bp) in the bacterial genome known as the attachment bacteria (attB) site. The bacteriophage genome has a corresponding attachment phage (attP) sequence (243 bp) that can recombine with the bacterial attB sequence with the help of the bacteriophage λ recombination protein integrase and an E. coli-encoded protein called integration host factor (Fig. 2.12A). Recombination between the attP and attB sequences results in insertion of the phage genome into the bacterial genome to create a prophage with attachment sites attL (100 bp) and attR (168 bp) at the left and right ends of the integrated bacteriophage λ DNA, respectively. For subsequent excision of the bacteriophage λ DNA from the bacterial chromosome, recombination between the attL and attR sites is mediated by integration host factor, integrase, and bacteriophage λ excisionase (Fig. 2.12B). The recombination events occur at precise locations without either the loss or gain of nucleotides.
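The statement that no nucleotides are lost or gained can be checked against the attachment-site lengths quoted above: attB plus attP (25 + 243 bp) should equal attL plus attR (100 + 168 bp). The short Python sketch below does only this bookkeeping; the dictionary and function names are illustrative, not part of any established tool.

```python
# Attachment-site lengths in base pairs, as quoted in the text.
ATT_SITE_LENGTH_BP = {"attB": 25, "attP": 243, "attL": 100, "attR": 168}


def total_length(sites):
    """Sum the lengths (bp) of the named attachment sites."""
    return sum(ATT_SITE_LENGTH_BP[site] for site in sites)


def is_conservative(substrates, products):
    """True if the products contain exactly as many base pairs as the substrates."""
    return total_length(substrates) == total_length(products)


# Integration: attB x attP -> attL + attR. Excision is the reverse reaction.
integration_ok = is_conservative(["attB", "attP"], ["attL", "attR"])
excision_ok = is_conservative(["attL", "attR"], ["attB", "attP"])

print(total_length(["attB", "attP"]), total_length(["attL", "attR"]))
print(integration_ok, excision_ok)
```

Both pairs of sites total 268 bp, which is consistent with site-specific recombination that neither adds nor removes nucleotides.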
Figure 2.12 Integration (A) and excision (B) of bacteriophage λ into and from the E. coli genome via recombination between attachment (att) sites in the bacterial and bacteriophage DNA.
For recombinational cloning, a modified attB sequence is added to each end of the target DNA. The attB sequences are modified so that they will only recombine with specific attP sequences. For example, attB1 recombines only with attP1, and attB2 recombines only with attP2. The target DNA with flanking attB1 and attB2 sequences is mixed with a vector (donor vector) that has attP1 and attP2 sites flanking a toxin gene that will be used for negative selection following transformation into a host cell (Fig. 2.13A). Integrase and integration host factor are added to the mixture of DNA molecules to catalyze in vitro recombination between the attB1 and attP1 sites and between the attB2 and attP2 sites. As a consequence of the two recombination events, the toxin gene sequence between the attP1 and attP2 sites on the donor vector is replaced by the target gene. The recombination events create new attachment sites flanking the target gene sequence (designated attL1 and attL2), and the plasmid with the attL1-target gene-attL2 sequence is referred to as an entry clone. The mixture of original and recombinant DNA molecules is transformed into E. coli, and cells that are transformed with donor vectors that have not undergone recombination retain the toxin gene and therefore do not survive. Host cells carrying the entry clone are positively selected by the presence of a selectable marker.
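The pairing rules and the two-way selection described above can be summarized in a small Python model. This is a minimal sketch, not a simulation of the biochemistry: the `Plasmid` class, the function names, and the all-or-nothing survival rule are assumptions made for illustration, and the marker name SM1 follows the labeling in Fig. 2.13A.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plasmid:
    name: str
    left_site: str
    cargo: str       # sequence carried between the two attachment sites
    right_site: str
    marker: str      # selectable marker on the vector backbone


# Each modified attB site recombines only with its matching attP site,
# and the recombination converts it to the corresponding attL site.
ATTB_PARTNER = {"attB1": "attP1", "attB2": "attP2"}
ATTB_PRODUCT = {"attB1": "attL1", "attB2": "attL2"}


def bp_reaction(target_sites, target, donor):
    """Return the entry clone, or None if the site pairs do not match."""
    left, right = target_sites
    if ATTB_PARTNER.get(left) != donor.left_site:
        return None
    if ATTB_PARTNER.get(right) != donor.right_site:
        return None
    return Plasmid("entry clone", ATTB_PRODUCT[left], target,
                   ATTB_PRODUCT[right], donor.marker)


def survives(plasmid, selected_marker):
    """A transformed cell survives only without the toxin and with the marker."""
    return plasmid.cargo != "toxin" and plasmid.marker == selected_marker


donor = Plasmid("donor vector", "attP1", "toxin", "attP2", "SM1")
entry = bp_reaction(("attB1", "attB2"), "target gene", donor)
swapped = bp_reaction(("attB2", "attB1"), "target gene", donor)

print(entry)
print(survives(entry, "SM1"), survives(donor, "SM1"), swapped)
```

Because the left and right sites have different specificities, a fragment with its sites in the opposite order finds no matching partners in this model, so the target gene can enter the donor vector in only one orientation.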
Figure 2.13 Recombinational cloning. (A) Recombination (thin vertical lines) between a target gene with flanking attachment sites (attB1 and attB2) and a donor vector with attP1 and attP2 sites on either side of a toxin gene results in an entry clone where the target gene is flanked by attL1 and attL2 sites. The selectable marker (SM1) enables selection of cells transformed with an entry clone. The protein encoded by the toxin gene kills cells transformed with nonrecombined donor vectors. The origin of replication of the donor vector is not shown. (B) Recombination between the entry clone with flanking attL1 and attL2 sites and a destination vector with attR1 and attR2 sites results in an expression clone with attB1 and attB2 sites flanking the target gene. The selectable marker (SM2) enables selection of transformed cells with an expression clone. The second plasmid, designated as a by-product, has the toxin gene flanked by attP1 and attP2 sites. Cells with an intact destination vector that did not undergo recombination or that retain the by-product plasmid are killed by the toxin. Transformed cells with an entry clone, which lacks the SM2 selectable marker, are selected against. The origins of replication and the sequences for expression of the target gene are not shown.
The advantage of this procedure is the ability to easily transfer the target gene to a variety of vectors that have been developed for different purposes. For example, to produce high levels of the protein encoded by the cloned gene, the target DNA can be transferred to a destination vector that carries a promoter and other expression signals. An entry clone is mixed with a destination vector that has attR1 and attR2 sites flanking a toxin gene (Fig. 2.13B). In the presence of integration host factor, integrase, and bacteriophage λ excisionase, the attL1 and attL2 sites on the entry clone recombine with the attR1 and attR2 sites, respectively, on the destination vector. This results in the replacement of the toxin gene on the destination vector with the target gene from the entry clone, and the resultant plasmid is designated an expression clone. The reaction mixture is transformed into E. coli, and a selectable marker is used to isolate transformed cells that carry an expression clone. Cells that carry an intact destination vector or the exchanged entry plasmid (known as a by-product plasmid) will not survive, because these carry the toxin gene. Destination vectors are available for maintenance and expression of the target gene in various host cells such as E. coli and yeast, insect, and mammalian cells.
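The fate of each of the four plasmids present after this second reaction can be tabulated with a short Python sketch. The class and function names are hypothetical, the marker names SM1 and SM2 follow Fig. 2.13, and the model assumes for simplicity that each transformed cell takes up a single plasmid.

```python
from typing import NamedTuple, Tuple


class Plasmid(NamedTuple):
    name: str
    att_sites: Tuple[str, str]
    cargo: str    # sequence carried between the attachment sites
    marker: str   # selectable marker on the backbone


def lr_reaction(entry, destination):
    """Exchange the cargo between an entry clone and a destination vector."""
    if entry.att_sites != ("attL1", "attL2"):
        raise ValueError("entry clone must carry attL1 and attL2")
    if destination.att_sites != ("attR1", "attR2"):
        raise ValueError("destination vector must carry attR1 and attR2")
    # attL x attR regenerates attB sites around the target gene and
    # attP sites around the toxin gene; each backbone keeps its own marker.
    expression = Plasmid("expression clone", ("attB1", "attB2"),
                         entry.cargo, destination.marker)
    by_product = Plasmid("by-product", ("attP1", "attP2"),
                         destination.cargo, entry.marker)
    return expression, by_product


def survives_selection(plasmid, selected_marker="SM2"):
    """A cell survives only if its plasmid lacks the toxin and carries the marker."""
    return plasmid.cargo != "toxin" and plasmid.marker == selected_marker


entry = Plasmid("entry clone", ("attL1", "attL2"), "target gene", "SM1")
destination = Plasmid("destination vector", ("attR1", "attR2"), "toxin", "SM2")
expression, by_product = lr_reaction(entry, destination)

outcomes = {p.name: survives_selection(p)
            for p in (entry, destination, expression, by_product)}
print(outcomes)
```

Under these assumptions only the expression clone passes both filters: the destination vector and the by-product are eliminated by the toxin, and the unreacted entry clone is lost because it lacks the SM2 marker.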
Genomic Libraries
A genomic library is a collection of DNA fragments, each cloned into a vector, that represents the entire genomic DNA, or cDNA derived from the total mRNA, in a sample. For example, the genomic library may contain fragments of the entire genome extracted from cells in a pure culture of bacteria or from tissue from a plant or animal. A genomic library can also contain the genomes of all of the organisms present in a complex sample, such as the microbial community on human tissue. Such libraries are known as metagenomic libraries. Whole-genome libraries may be used to identify genes that contain specific sequences, encode particular functions, or interact with other molecules.
To create a genomic library, the DNA extracted from the cells (cell cultures or tissues) of a source organism (or a community of organisms for a metagenomic library) is first digested with a restriction endonuclease. Often a restriction endonuclease that recognizes a sequence of four nucleotides, such as Sau3AI, is used. Although four-cutters will theoretically cleave the DNA approximately once in every 256 bp, the reaction conditions are set to give a partial, not a complete, digestion to generate fragments of all possible sizes (Fig.