Mammalian Genome 2: 252-259, 1992

© Springer-Verlag New York Inc. 1992

Rapid isolation of desired sequences from lone linker PCR amplified cDNA mixtures: Application to identification and recovery of expressed sequences in cloned genomic DNA

Kuniya Abe*

Furusawa MorphoGene Project, ERATO, Research Development Corporation of Japan (JRDC), 5-9-6 Tohkohdai, Tsukuba 300-26, Japan

Received July 18, 1991; accepted September 9, 1991

Abstract. A simple and efficient method for the rapid isolation of specific sequences from PCR-amplified cDNA mixtures has been developed. cDNA mixtures obtained using lone linker PCR (Ko et al. 1990) appeared to be highly representative even though the starting material, 100 ng-2 µg of total RNA, is much less than is required for making an ordinary cDNA library. With this method, cDNA mixtures were obtained from limited materials, including early mouse embryos and primordial germ cells. For selective enrichment of desired cDNAs, biotinylated probe was hybridized with the lone linker-linked cDNA in solution and the resulting probe-cDNA hybrid was captured by streptavidin-coated magnetic beads. After appropriate washing, cDNA was released from the beads and subjected to amplification followed by cloning into a vector. Using genomic fragments isolated during chromosomal walking in the T/t complex of mouse Chromosome (Chr) 17, cDNAs encoding novel germ cell specific genes have been readily isolated by the above procedures. The method, termed random access retrieval of genetic information through PCR (RARGIP), will greatly streamline the entire process from RNA to cDNA. Its potential applications in various areas of molecular genetics are discussed.

*Present address: Institute for Medical Genetics, Kumamoto University Medical School, Kuhonji 4-24-1, Kumamoto 862, Japan.

Introduction

During mammalian development, various parts of the genome are activated or inactivated to drive the molecular events underlying morphological changes. Capturing the flow of genetic information, its classification and mapping on the genome are central fields of mouse developmental genetics. Such molecular genetic analysis, however, has been a technical challenge in mammals. As part of a long-term project on the developmental mutations mapped on the T/t complex (Bennett 1975; Silver 1985), we have been conducting a detailed characterization of cloned genomic DNA, ultimately to find mutated genes. The T/t complex is a large genomic region located on mouse Chr 17, which contains a set of mutations affecting germ cells and early embryos (Bennett 1975). Molecular genetic analysis of the mutations thus requires construction of cDNA libraries corresponding to early stages of embryogenesis or germ cell development. Such an attempt has been hampered by the difficulty of obtaining sufficient material: for instance, theoretically, at least 7,000 blastocysts would have to be dissected out to make one representative library (Watson and McConnel 1987). Construction of cDNA libraries from several developmental stages or mutant embryos may therefore be impractical. Furthermore, although improved methods for constructing physical maps of large genomic segments have been established, the identification and recovery of expressed sequences from cloned DNA remains a cumbersome task. The procedures usually involve either restriction mapping to identify CpG islands, which often mark the 5' portion of most housekeeping genes (Bird 1986), or derivation of probes subsequently used for "Zoo blots" or Northern blots to find putative transcribed sequences (Abe et al. 1988; Bell 1989). It has been found that transcripts thus detected sometimes merely represent cross-hybridizing sequences that map to a different genetic region. For confirmation, isolation and mapping of corresponding cDNA clones is absolutely necessary as a final step.
K. Abe: Rapid isolation of sequences from PCR-amplified cDNA

Searching for embryo-expressed genes in a large genomic region therefore requires repeated, large-scale screening of libraries with numerous probes. With the advent of PCR technology, candidate genes active in early embryos could be amplified without construction and screening of libraries. However, detailed sequencing analysis of genomic fragments derived at every walking step and synthesis of specific primers may become laborious, and cDNA sequencing outside the amplified region is eventually necessary. Recently, techniques that amplify DNA sequences tagged by "universal" primer sequences through PCR have been reported (Kinzler and Vogelstein 1989; Akowitz and Manuelidis 1989; Ko et al. 1990). These methods do not require sequence-specific primers, but allow amplification of any DNA fragments flanked by the linkers with efficiency comparable to regular PCR. We found that this modified PCR method could amplify each constituent of a highly complex mixture of DNA fragments, such as mouse genomic DNA, with similar efficiency, and termed it lone linker PCR (LL-PCR; Ko et al. 1990). For example, 1 ng of mouse DNA consisting of ~300 copies of the genome was digested with restriction enzyme RsaI. This DNA mixture should contain more than 10^6 different species of fragments. Southern hybridization of amplified genomic DNA revealed that eight unique copy fragments tested were all retained in similar quantity. LL-PCR should therefore be applicable to the construction of representative cDNA libraries. Akowitz and Manuelidis (1989) also suggested the possibility of constructing general cDNA libraries from limited material. In this paper, synthesis, amplification and detailed characterization of cDNAs derived from limited samples such as early mouse embryos and primordial germ cells are described. A novel method coupling LL-PCR and the biotin-avidin technique for hybrid capture after solution hybridization was also devised. This "screening in solution" procedure, combining the efficiency of regular PCR and the versatility of conventional screening, made it possible to enrich for genes of interest directly from cDNA mixtures, skipping large-scale screening steps.
Here it has been applied to recovering expressed sequences from cloned genomic DNA. Another application using oligonucleotide probes is also presented.
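The figures quoted above for the genomic DNA test of LL-PCR can be checked with rough arithmetic. The sketch below is only an illustration: the haploid genome mass (~3 pg) and size (~3 × 10^9 bp) and the assumption of equal base frequencies are not taken from this paper. RsaI recognizes the 4-bp site GTAC.

```python
# Back-of-envelope check of the genomic DNA test of LL-PCR.
# ASSUMPTIONS (not from the paper): a haploid mouse genome weighs ~3 pg,
# spans ~3e9 bp, and the four bases occur at equal frequency.

HAPLOID_GENOME_PG = 3.0      # assumed mass of one haploid genome (pg)
HAPLOID_GENOME_BP = 3.0e9    # assumed size of one haploid genome (bp)
RSAI_SITE = "GTAC"           # RsaI recognition sequence (4-bp cutter)


def genome_copies(mass_ng: float) -> float:
    """Number of haploid genome copies contained in mass_ng of DNA."""
    return mass_ng * 1000.0 / HAPLOID_GENOME_PG


def expected_fragments(site: str) -> float:
    """Expected number of fragment species per genome for an enzyme whose
    site occurs on average once every 4**len(site) bp."""
    return HAPLOID_GENOME_BP / 4 ** len(site)


copies = genome_copies(1.0)
fragments = expected_fragments(RSAI_SITE)
print(f"genome copies in 1 ng: ~{copies:.0f}")
print(f"expected RsaI fragment species: ~{fragments:.1e}")
```

Under these assumptions the estimates agree in order of magnitude with the ~300 genome copies and the more than 10^6 fragment species stated in the text.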

Materials and methods

Embryos and PGC

Early mouse embryos were obtained by dissecting C57BL/6 or ICR mice and were frozen immediately in liquid N2 and stored at -80°C. Primordial germ cells (PGC) were collected from 12.5 d.p.c. genital ridges and separated from somatic cells by the method of Hashimoto and co-workers (1990).

Isolation of total RNA

Total RNA from embryos, cell lines, and various adult organs was isolated by the guanidinium thiocyanate (GTC) method of Chirgwin and colleagues (1979) with some modification. Tissues were lysed in 500 µl of 4 M GTC, 25 mM sodium citrate, 0.1 M β-mercaptoethanol and homogenized in an Eppendorf tube with a tight-fitting plastic pestle. The lysate was overlaid on a 1.5 ml cushion of 5.7 M CsCl and centrifuged in a Beckman TLS 55 rotor at 50000 rpm for 3 h. The resulting pellet was dissolved in TE, extracted with n-butanol/CHCl3 (4:1) and precipitated in EtOH in the presence of glycogen as a carrier. An aliquot of the RNA sample and a known amount of standard RNA were electrophoresed and transferred to Hybond-N membrane. The Northern blot was hybridized with a 28S rRNA probe to examine the integrity and relative amount of the isolated RNA. In our experience, this procedure was successful for isolation of RNA from several thousand cells or as little as 50 ng of total RNA. Very little DNA is detectable in these CsCl-purified RNA preparations. If DNA contamination is suspected, it is recommended that the sample be treated with RNase-free DNase (Promega). Using several different sources of RNA, cDNAs were synthesized, ligated to lone linker and amplified. Synthesis and amplification of 10 d.p.c. mouse embryonic cDNA is described below as a typical example.

Synthesis and amplification of cDNA

Two µg of total RNA isolated from 10 d.p.c. mouse embryos was mixed with 100 ng of ClaI-oligo(dT) primer adaptor (5'-CTGATGATCGATTTTTTTTTTTTTTTTT-3') and heated at 70°C for 10 min. For estimation of sequence representation, in vitro transcribed RNA corresponding to the bacterial neomycin resistance gene (neo) or Herpes simplex virus thymidine kinase (tk) gene was mixed with the embryonic RNA at various ratios. These genes were cloned into pSP64 poly(A) vector (Promega) so that the cDNAs contained a poly(A) tail. First-strand cDNA was synthesized in a 20 µl reaction mixture containing 50 mM Tris-HCl, pH 8.3, 75 mM KCl, 3 mM MgCl2, 10 mM DTT, 500 µM each of dNTPs, and 200 units of Moloney Murine Leukemia Virus RNase H- reverse transcriptase (SuperScript, BRL) at 45°C for 1 h. Second-strand cDNA was synthesized in a mixture of 50 mM Tris-HCl, pH 7.6, 100 mM KCl, 5 mM MgCl2, 100 µM β-NAD, 10 mM (NH4)2SO4, 5 mM DTT, 230 u/ml DNA polymerase I, 200 u/ml E. coli DNA ligase, and 200 u/ml RNase H at 12°C for 2 h. Double-stranded cDNA was treated with T4 DNA polymerase at 12°C for 10 min. One-twentieth of the synthesized, blunt-ended cDNA was ligated to phosphorylated "lone linker" LL-Sal2A,B (5) for 16 h at 16°C in a 50 µl mixture of 6.6 mM Tris-HCl, pH 7.6, 1 mM MgCl2, 1 mM DTT, 0.03 mM ATP, 0.1 mM spermidine-HCl, 5% (v/v) polyethylene glycol 6000, 3 µg of each lone linker, ds cDNA, and 20 units/µl of T4 DNA ligase (TAKARA). Ligation efficiency was close to 100% under these conditions.

LL-Sal2A: 5'-pTCGAGTCGACTATATGTACC-3'
LL-Sal2B:     3'-TCAGCTGATATACATGGp-5'

This linker differs from both a conventional linker and an adaptor, as it has a non-palindromic protruding end and a blunt end, which prevents multimerization of linkers (Ko et al. 1990). Excess linker was removed by Centricon-100 (Amicon) filtration (1000 × g, 3 × 15 min) and one half of the ligated cDNA, estimated to be ~2 ng, was used for subsequent amplification steps. Lone linker PCR (LL-PCR) amplification was performed in a 100 µl mixture containing 10 mM Tris-HCl (pH 8.3), 50 mM KCl, 2.5 mM MgCl2, 10 mM β-mercaptoethanol, 200 µM each of dATP, dCTP, dGTP and dTTP, 0.01% gelatin, 100 pmol of LL-Sal2A primer, the linked cDNA, 2 µg of gene 32 protein (Ambion, Tex., USA) and 2 units of AmpliTaq (Perkin Elmer/Cetus). After 14 cycles of amplification (94°C for 45 s, 53°C for 2 min, 72°C for 7 min), one-tenth of the reaction product was subjected to an additional five cycles of amplification in 100 µl of reaction buffer under the same conditions. Under similar conditions, fragments up to ~2 kb were found to be amplified with nearly equal efficiency (Jeffereys et al. 1988). When Vent DNA polymerase (New England Biolabs, USA), a heat-stable DNA polymerase with proofreading activity, was used, the reaction buffer was replaced by the buffer supplied with the enzyme.
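The structure of the lone linker described above (one protruding end, one blunt end) can be checked directly from the two printed sequences. The following is a minimal sketch; the helper name and variables are illustrative only, and the terminal phosphate groups are omitted because they do not affect base pairing.

```python
# Consistency check of the lone linker LL-Sal2A/B as printed in the text.
# Helper and variable names are illustrative, not from the paper.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


LL_SAL2A = "TCGAGTCGACTATATGTACC"       # printed 5'->3'
LL_SAL2B_3TO5 = "TCAGCTGATATACATGG"     # printed 3'->5'
LL_SAL2B = LL_SAL2B_3TO5[::-1]          # rewritten 5'->3'

# Region of LL-Sal2A that pairs with LL-Sal2B.
paired_region = reverse_complement(LL_SAL2B)

# Blunt end: the paired region must coincide with the 3' end of LL-Sal2A.
duplex_ok = LL_SAL2A.endswith(paired_region)

# Protruding end: the unpaired 5' bases of LL-Sal2A.
overhang = LL_SAL2A[: len(LL_SAL2A) - len(LL_SAL2B)]
overhang_is_palindromic = overhang == reverse_complement(overhang)

# The linker name suggests a SalI site (GTCGAC) for later cloning.
has_sali_site = "GTCGAC" in LL_SAL2A

print("strands pair with a blunt end:", duplex_ok)
print("5' overhang:", overhang, "| palindromic:", overhang_is_palindromic)
print("contains SalI site:", has_sali_site)
```

A non-palindromic overhang cannot anneal to a second copy of itself, which is consistent with the statement that linker multimerization is prevented.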

Probes and plasmids

Probes used in this study are: (1) pSPMI3A, a cDNA encoding mouse cytoskeletal β-actin, a kind gift of K. Tokunaga (Tokunaga et al. 1986); (2) 1-19, a 4.8 kb EcoRI-SalI fragment corresponding to the mouse 28S rRNA gene (Tiemeier et al. 1977); (3) probe A, a 1.25 kb SmaI genomic DNA fragment isolated from cosmid clone 31/2.1, corresponding to the male germ cell specific gene Tctex-3, mapped to the distal inversion of the T/t complex (Ha et al. 1991; Yeom et al. 1991); (4) probe B, a 2.15 kb BamHI-NotI fragment from cosmid 51/1.1 mapped close to the H-2K gene (Ha et al. 1991; Yeom et al. 1991); (5) probe C, a 5.45 kb KpnI fragment from cosmid 51/1.1; and (6) Hi12-5, a partial cDNA clone for the germ cell specific gene Tctex-7, isolated by screening a spleen cDNA library with probe C. There appear to be a total of four copies of Tctex-7 in the genome, of which two are highly homologous and have been mapped to Chr 17. These probes were radiolabeled by random priming or biotinylated using a Bionick nick-translation kit (BRL). An amino-modified oligonucleotide probe corresponding to the linker region between zinc finger motifs (5'-CACAC(G/A)GG(C/G)GA(C/G)AA(G/A)CCCTT(T/C)T-3') was synthesized using an Applied Biosystems 391 DNA synthesizer. The oligonucleotide was biotinylated with biotin-XX-NHS ester according to the method suggested by the supplier (Clontech, USA). The labeled oligomer was separated from the unreacted materials on 20% denaturing PAGE and purified by standard procedure (Maniatis et al. 1982). Plasmid clones encoding neo and tk were kindly supplied by M.S.H. Ko (Ko 1990).
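To make the composition of the degenerate zinc-finger linker probe concrete, the sketch below expands the printed sequence into its individual oligonucleotides. The function name and the parsing convention (alternatives written as "(X/Y)") are assumptions introduced here for illustration.

```python
# Expand the degenerate oligonucleotide probe into all member sequences.
# The "(X/Y)" notation follows the printed probe; expand() is a
# hypothetical helper written for this illustration.
import itertools
import re

DEGENERATE_PROBE = "CACAC(G/A)GG(C/G)GA(C/G)AA(G/A)CCCTT(T/C)T"


def expand(pattern: str) -> list[str]:
    """Return every sequence encoded by a pattern with (X/Y) alternatives."""
    # Each token is either a parenthesized set of alternatives or one base.
    tokens = re.findall(r"\(([ACGT/]+)\)|([ACGT])", pattern)
    choices = [alts.split("/") if alts else [base] for alts, base in tokens]
    return ["".join(combo) for combo in itertools.product(*choices)]


variants = expand(DEGENERATE_PROBE)
print("number of oligonucleotide species:", len(variants))
print("length of each oligonucleotide:", len(variants[0]))
print("first variant:", variants[0])
```

With five two-fold degenerate positions, the probe is a mixture of 2^5 = 32 distinct 22-mers.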

Solution hybridization for RARGIP

Five hundred ng of probe DNA was biotinylated and the labeled probe was washed twice by Centricon-30 to remove unincorporated nucleotides. Approximately 1/50 of the labeled probe was mixed with 20 µg of sheared salmon sperm DNA and the amplified cDNA; about 10 ng of cDNA was used for the actin probe and 100 ng for the other probes. The DNA mixture was heat-denatured, mixed with 100 µl of buffer containing 50% formamide, 5× SSPE, 0.1% SDS and 5× Denhardt's solution, and renatured at 42°C for 2 h (for actin) or 15 h (for other probes). For the oligonucleotide probe, 200 ng of the biotinylated oligomer was hybridized with 20 ng of cDNA derived from mouse PGC in 3 M trimethyl ammonium chloride (TMA-Cl), 0.1 M NaPO4, pH 6.8, 1 mM EDTA, 5× Denhardt's solution, 0.6% SDS at 42°C for 2 h.

Separation of probe-cDNA hybrids

After the solution hybridization reaction, probe-cDNA hybrids were separated from unhybridized DNA using streptavidin-coated magnetic beads, Dynabeads M-280 (Dynal, Oslo, Norway; Uhlen 1989). Twenty µl of 10 mg/ml Dynabeads was prehybridized with salmon sperm DNA for 1-2 h, then mixed with the probe-cDNA solution and incubated for 15-30 min at room temperature. The hybrids captured by the beads were washed three times in 500 µl of 0.2× SSC, 0.1% SDS at 65°C for 15 min and finally in 0.1× SSC at 65°C for 15 min. Washed beads were resuspended in 50 µl of 0.1× SSC and heated to 90°C, and the released cDNA was set aside. For oligoprobes, the hybrids were washed three times in 3 M TMA-Cl, 50 mM Tris-Cl, pH 8.0, 0.2% SDS at 50°C for 20 min each. One-half of the released cDNA was subjected to two-step LL-PCR amplification under conditions similar to those described above.

General methods

Filter hybridization for Southern blots, Northern blots and library screening was carried out in 5× SSPE, 5× Denhardt's solution and 0.5% SDS at 65°C, and the filters were washed twice in 2× SSC, 0.1% SDS at 65°C for 30 min and once in 0.1× SSC at 65°C for 30 min. The filters were then exposed to imaging plates (Fuji Film), which were subsequently scanned and analyzed by Bioimage Analyzer (Fuji Film). For vector cloning, amplified cDNA was digested with SalI and partially filled with dCTP and dTTP, then ligated to partially-filled pT7T318U plasmid vector (Pharmacia) according to the method of Zabarovsky and Allikmets (1986), or SalI-digested cDNA was ligated to dephosphorylated, XhoI-digested λZAPII phage vector (Stratagene). Ligated DNAs were used either for transformation of DH5α competent cells or transduction of XL1B cells. Miniprep DNA analysis of randomly-picked plasmid clones revealed that more than 70% of clones contained inserts.

Results

General characterization of amplified cDNA

Double-stranded cDNA was synthesized from several different sources of RNA including embryonal carcinoma cells, testicular cells, primordial germ cells purified from 12.5 d.p.c. genital ridges and early post-implantation embryos dissected at 6 d.p.c., 7 d.p.c., 8 d.p.c., or 10 d.p.c. The amount of starting material ranged from 100 ng to 2 µg of total RNA. After linker ligation, cDNA was amplified using one strand of the linker as a PCR primer (Fig. 1). The size of the amplified cDNA was distributed from several hundred bases to ~2 kb in length (Fig. 2A). Typically, around 5 µg of cDNA was obtained from a single, two-step PCR reaction under the conditions described in Materials and methods. Determination of the insert size of 36 randomly-picked clones revealed that these clones contained inserts of 1.29 ± 0.56 kb on average (Fig. 2B). The shortest was 0.5 kb, while a 2.6 kb insert was also found in one of the clones tested. A part of the 7.5 d.p.c. cDNA was cloned into plasmids and screened with the β-actin probe to isolate four independent clones with inserts of 0.6, 0.85, 0.9, and 1.4 kb. Sequence analysis revealed no mismatches in 1975 bp, in complete agreement with the published actin sequence, and no gross rearrangements such as deletion or recombination in these clones.

Fig. 1. Schematic representation of the RARGIP method. A population of any double-stranded DNA containing desired sequence(s) is ligated to lone linker, of which one strand can serve as a primer for lone linker PCR (LL-PCR). Hybrids between biotinylated probe and target DNAs are separated from unhybridized DNA by magnetic beads and the single-stranded target cDNAs are amplified by LL-PCR. (Diagram labels: primer; lone linker linked DNA; target; biotinylated probe; solution hybridization; hybrid capture; unhybridized ssDNA; avidin-coated magnetic bead; lone linker-PCR.)

Fig. 2. (A) Ethidium bromide stained pattern of amplified cDNA. Approximately 1-2 ng of lone linker linked cDNA was amplified under the conditions described in the Materials and methods and separated on a 1% TBA agarose gel. Left lane: LL-Bam2A,B linked mouse 7.5 d.p.c. cDNA. Right lane: LL-SalI2A,B linked 8.5 d.p.c. mouse embryo-derived cDNA. (M) Size marker, λ HindIII digest + HaeIII digest of φX174 DNA. (B) Demonstration of inserts from 24 randomly-picked cDNA clones derived from 10.5 d.p.c. LL-PCR-based cDNA library.

As another way to evaluate the quality of the cDNA, a very small amount of in vitro transcribed mRNA was mixed with the starting materials as a representative of rare messages, and I asked whether these standard RNAs were indeed converted to cDNA and amplified. As shown in Table 1, poly(A)+ RNA coding for the tk gene and/or the neo gene was added to 2 µg of total 10 d.p.c. embryonic RNA at a ratio of 1:400,000 or 0.00025% (200 fg), or 1:200,000 or 0.0005% (400 fg), assuming that 4% of total RNA is poly(A)+ RNA. Double-stranded cDNA was synthesized using this RNA mixture, and 1/20 of the cDNA was ligated to lone linker and amplified. The presence and relative amount of tk or neo cDNA in the starting cDNA or in cDNA amplified by LL-PCR was assessed by regular PCR. Both tk and neo message added at 1:200,000 were readily detected in LL-PCR-amplified cDNA pools and found to be amplified many fold, while tk added at 1:400,000 could not be detected. This result suggests that the rarest class messages are clonable with this method even though the starting RNA amount is much less than is required for making an ordinary cDNA library.
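The spike-in amounts quoted for the standard RNAs follow from the stated assumptions (2 µg of total RNA, of which 4% is poly(A)+ RNA). The short sketch below reproduces that arithmetic; the function names are illustrative only.

```python
# Reproduce the spike-in arithmetic for the tk/neo standard RNAs.
# Inputs are the values stated in the text; function names are illustrative.

TOTAL_RNA_UG = 2.0        # total embryonic RNA used (µg)
POLYA_FRACTION = 0.04     # assumed poly(A)+ fraction of total RNA
FG_PER_UG = 1e9           # 1 µg = 1e9 fg


def spike_in_fg(ratio_denominator: int) -> float:
    """Mass (fg) of standard RNA added at 1:ratio_denominator of poly(A)+ RNA."""
    polya_fg = TOTAL_RNA_UG * FG_PER_UG * POLYA_FRACTION
    return polya_fg / ratio_denominator


def ratio_percent(ratio_denominator: int) -> float:
    """Express a 1:N ratio as a percentage of the poly(A)+ RNA."""
    return 100.0 / ratio_denominator


for denominator in (400_000, 200_000):
    print(
        f"1:{denominator:,} -> {ratio_percent(denominator):.5f}% "
        f"-> {spike_in_fg(denominator):.0f} fg"
    )
```

The two ratios correspond to 200 fg and 400 fg of standard RNA, matching the amounts given in the text.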

Table 1. Detection of tk and neo cDNA in LL-PCR amplified cDNA.

                     Preamplified cDNA   LL-PCR amplified cDNA   Fold increase
Exp. 1
  tk (0.00025%)              -                     -                   -
  neo (0.0005%)              -                     +                >10,000
Exp. 2
  tk (0.0005%)               -                     +                  >500

Exp. 1: cDNA was made from an RNA mixture containing both tk and neo. To test for the presence of tk and neo cDNA, a part (1% of total) of the starting, pre-amplified cDNA as well as LL-PCR amplified cDNA equivalent to 0.001% of the starting cDNA were used as templates for the following PCR reaction (94°C 45 s, 58°C 2 min, 72°C 2 min for 30 cycles; neo primers, 5'-TATCAGGACATAGCGTTGGC-3' and 5'-GCGAAGAACTCCAGCATGAG-3'; tk primers, 5'-GCTGGCACTCTGTCGATACC-3' and 5'-TGTCTGCTCAGTCCAGTCGTGG-3'). For neo, a band of the expected size was seen only in LL-PCR amplified cDNA (0.001% of starting), while it was undetected in pre-amplified cDNA (1% of starting). The increase was 1/0.001 = 10,000-fold. Exp. 2 was done in the same way as Exp. 1 except with increased tk content and template amount.

Rapid cDNA isolation by RARGIP

As mentioned in the Introduction, we have been analyzing cloned genomic DNA isolated from the T/t complex of the mouse. To facilitate the process of identification and recovery of transcribed sequences in large genomic DNA, the following procedure was devised. Figure 1 illustrates the principle of the RARGIP method. A population of double-stranded DNA containing target sequences is ligated to lone linker. To select a specific sequence, a biotinylated probe DNA is hybridized with the linker-linked DNA in solution. Hybrids formed between the target and the probe are captured and immobilized on magnetic beads by biotin-avidin interactions. The beads are easily removed from the suspension with the use of a magnet, so the hybrids can readily be separated from the rest of the DNA, and this "batch" procedure also facilitates washing of the hybrids at an appropriate stringency. Single-stranded target DNA is released from the probe, amplified, converted to double-stranded DNA, cloned into a vector, and finally selected by small-scale screening. Because enrichment of specific sequences is achieved in solution with this method, large-scale filter hybridization steps can be avoided.

For example, cDNAs encoding actin were isolated by RARGIP as a preliminary test. The biotin-labeled actin probe was hybridized with the linked cDNA in solution for 2 h, and hybrids trapped by streptavidin-coated magnetic beads were amplified and analyzed. Heterogeneous DNA ranging from several hundred bases to ~2 kb was obtained, while small amounts of much shorter DNA were seen in negative controls lacking biotinylated probe during hybridization (Fig. 3A). The material found in negative controls should represent amplified cDNA that bound nonspecifically to the magnetic beads. The amplified cDNA hybridized with the actin probe was then cloned and checked for the presence of actin sequences by colony hybridization. Of 445 clones screened, 436 (or 98% of the colonies) were positive for actin (Table 2), clearly demonstrating that the desired gene sequence was greatly enriched in one step. Sequencing analysis of the actin-positive clones unambiguously showed that the clones encoded actin message (not shown).

Fig. 3. Amplification of specific cDNAs trapped by streptavidin-coated magnetic beads. (A) Actin probe; lane 1, negative control lacking actin probe but containing salmon sperm DNA; lane 2, 10 ng of LL-linked cDNA was hybridized with actin probe in the presence of salmon sperm DNA, then trapped via biotinylated probe on the beads, washed, and amplified using the following parameters (94°C 45 s, 53°C 2 min, 72°C 4 min for 20 cycles, then one-tenth of the PCR products were subjected to an additional five cycles of amplification). (B) Tctex-3 probe (1.25 kb SmaI); lane 1, amplification of cDNA captured by Tctex-3 probe; lane 2, negative control; lanes 3 and 4, Southern blot of DNA shown in lanes 1 and 2 hybridized with Tctex-3 probe. (M) Size marker, HindIII digest of λ DNA + φX174 DNA digested with HaeIII.

To ask whether this technique can retrieve much rarer messages than actin, and is indeed applicable to identification of expressed sequences in cosmid clones, several genomic probes derived from the t complex were then used. During the course of chromosomal walking from the mouse H-2K gene toward the Crya-1 locus (Abe et al. 1988; Yeom et al. 1991), we found a number of putative embryo-expressed or germ cell-specific genes. Among them, genes designated as Tctex-3 and Tctex-7 were identified on the basis of RNA blot analysis with genomic probes. Northern blot data showed that Tctex-3 was expressed only in male germ cells and Tctex-7 was found in early embryos, spleen and male germ cells, but not in other adult tissues including liver, kidney, brain and heart. The transcript size for both genes was coincidentally 2.4 kb. Probes A and B (Fig. 4), which were used for Northern analysis, failed to isolate corresponding cDNA clones after screening 4 × 10^5 plaques of commercially available testis or spleen cDNA libraries. Another round of screening with probe C yielded two positives (Table 2). Thus, messages for these genes appear not to be abundant. Nevertheless, cDNAs specific to these probes were readily isolated by the RARGIP method.

Table 2. Isolation of cDNA by RARGIP.

Probe        RARGIP              Conventional screening
Actin        436/445 (98%)       n.t.
probe A      66/820 (8%)         none in 4 × 10^5
probe B      180/1384 (13%)      none in 4 × 10^5
probe C      n.t.                2 in 4 × 10^5
Hi12-5 a     230/610 (38%)       2 in 4 × 10^5

n.t. = not tested. a Hi12-5 is one of two clones isolated by probe C from a conventional library.

Probe A and cDNA clone Hi12-5, detected by probe C, were used to pull out corresponding cDNAs by the RARGIP method. To obtain a higher Cot value, the probes were hybridized with 100 ng of testis cDNA for 16 h; then, after selection of the hybrids, captured cDNAs were amplified. Amplified cDNAs were found to be specific to the probes as determined by Southern blot analysis (Fig. 3B). Colony hybridization experiments demonstrated that positive clones for probe A and Hi12-5 represented 8% (66/820 colonies) and 38% (230/610 colonies), respectively, of the total number of colonies. In order to obtain cDNAs covering other genomic areas or elongated cDNAs, probe B, which failed to detect any cDNA clones by conventional screening, was then used for RARGIP selection of lone linker-linked, random-primed testis cDNA. A number of overlapping clones were further isolated by probe B (Table 2 and Fig. 4).

Fig. 4. Location of genomic probes on cosmid clones. Restriction map of cosmids harboring Tctex-3 and -7, and approximate position of the probes and some of the isolated cDNAs are shown. H (HindIII site); B (BamHI); S (SmaI); K (KpnI); R (EcoRI). Cosmid clone 31/2.1 mapped within the distal inversion of the T/t complex, and 65/2.1 mapped about 100 kb distal to H-2K in the t complex (16). Clone 4-2-5 is one of the cDNA clones corresponding to Tctex-3. cDNA clones #19, #29 and Hi12-5 are in Tctex-7. (Diagram labels: Cosmid 31/2.1; 16 kb; probe A; 4-2-5 cDNA (Tctex-3); Cosmid 65/2.1; probe B; probe C; #29; #19; Hi12-5 (Tctex-7); 1 kb scale bar.)

Southern hybridization of the isolated cDNAs to genomic and cosmid DNA demonstrated that they indeed mapped to the correct genetic position. For instance, probe A-positive cDNAs all hybridized to the 16.0 kb BamHI fragment on genomic blots, and the genomic area was further narrowed down to the 1.25 kb SmaI fragment located within the 16.0 kb region (Fig. 4) on cosmid DNA blots. Single bands of the expected sizes and tissue distributions were detected on Northern blots by the isolated cDNA probes, suggesting that they indeed correspond to Tctex-3 or Tctex-7 exons. The positive clones for probe A or Hi12-5 were randomly isolated and their insert sizes were examined. As shown in Fig. 5, inserts of several different sizes from ~400 bp to over 1 kb were observed. It is noteworthy that the insert size varied from clone to clone even though the clones were selected from amplified cDNA: for probe A (Fig. 5A), seven different sizes were found in nine randomly-picked clones, and all of the Hi12-5 positives were of different sizes (Fig. 5B). Such differences should reflect their independent origin during cDNA synthesis, suggesting strongly that a broad spectrum of sequences is available in the amplified cDNA mixtures. Hybridization and partial sequencing analysis showed that the clones shared similar, if not identical, sequences with each other and with the genomic probes (not shown). The data described above verified that authentic cDNAs were successfully cloned by the RARGIP method.
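The colony counts reported for RARGIP (Table 2) can be converted into percentages and compared with the yield of conventional screening. The sketch below is only a rough illustration: the fold-enrichment figure is a calculation introduced here, not a value reported in the paper, and it compares different libraries, so it should be read as an order-of-magnitude indication. Where conventional screening found no clone, one clone per 4 × 10^5 plaques is used as an upper bound.

```python
# Percent positives after RARGIP and a rough enrichment estimate relative
# to conventional screening. Counts are taken from the text and Table 2;
# the fold-enrichment calculation is an illustration, not a reported result.

RARGIP_COUNTS = {            # probe: (positive colonies, colonies screened)
    "actin": (436, 445),
    "probe A": (66, 820),
    "probe B": (180, 1384),
    "Hi12-5": (230, 610),
}
CONVENTIONAL_PLAQUES = 4e5   # plaques screened per conventional library
CONVENTIONAL_POSITIVES = {"probe A": 0, "probe B": 0, "Hi12-5": 2}


def percent_positive(positives: int, screened: int) -> float:
    """Percentage of screened colonies that were positive."""
    return 100.0 * positives / screened


def fold_enrichment(probe: str) -> float:
    """Ratio of RARGIP frequency to conventional frequency.
    With zero conventional positives, one positive is assumed, so the
    returned value is a lower bound."""
    positives, screened = RARGIP_COUNTS[probe]
    conventional = max(CONVENTIONAL_POSITIVES[probe], 1) / CONVENTIONAL_PLAQUES
    return (positives / screened) / conventional


percentages = {p: round(percent_positive(*c)) for p, c in RARGIP_COUNTS.items()}
for probe, pct in percentages.items():
    print(f"{probe}: {pct}% positive after RARGIP")
for probe in CONVENTIONAL_POSITIVES:
    print(f"{probe}: roughly {fold_enrichment(probe):,.0f}-fold or more")
```

The rounded percentages reproduce the values quoted in the text (98%, 8%, 13% and 38%).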

Isolation of zinc-finger containing genes from a cDNA pool of mouse primordial germ cells

The data described above clearly indicated that DNA fragments derived from cloned genomic DNA could be used to isolate corresponding cDNAs. Another common probing strategy, namely the use of oligonucleotide probes, was tested in solution hybridization. A degenerate oligonucleotide mixture corresponding to the linker region found between zinc-finger motifs was biotinylated and hybridized with linked cDNA made from mouse primordial germ cells at low stringency, that is, in 3 M trimethyl ammonium chloride hybridization buffer at 42°C for 2 h. After washing at 50°C, bound cDNA was amplified and cloned into the plasmid. Five out of 30 randomly-picked clones were found to contain sequences hybridizing to the radiolabeled oligo-probe. Partial sequencing analysis revealed that at least two of them have a typical zinc-finger motif and encode previously unknown genes. Genomic Southern data suggest that the five cDNAs belong to five different genes. Details of these five genes and others will be described elsewhere. The data indicate that the RARGIP method can isolate not only known genes but also related members of gene families, with either oligonucleotide probes or longer DNA fragments.

Discussion

The data described here shows that the RARGIP method allows rapid retrieval of desired sequences, and also evaluation of cDNA mixtures obtained by LL-PCR. Successful detection of Tctex clones of several independent origins indicated the representation of a wide variety of sequences in the mixture. The data obtained with neo and tk internal standard messages also demonstrated that even the rarest class of messages are clonable with this method, even though it is easy to use and has much reduced requirements for the starting materials. The effectiveness of this method is in contrast to the results obtained with conventional libraries for which only two clones for Tctex-7 and none for Tctex-3 were isolated by screening 4 x 105

Fig. 5. Inserts of randomly-picked clones positive for Tctex-3 (A), and for Tctex-7 (B). (A) Insert DNAs prepared from nine randomly selected positives for Tctex-3. (B) Insert DNAs from ten different clones positive for Tctex-7.

258

clones. The amplified cDNA pool is highly representative, which is important when it is applied to subtractive cloning, for example. With the LL-PCR format, either oligo-(dT) primed or random primed cDNA can be amplified and the common primer artifact, "primer dimer," occurring with other PCR-based cDNA amplification strategies (Welsh et al. 1990) is less of a problem since self-ligated primers were removed by centricon filtration and mispriming between primers was found to be low. One hundred ng of total, not poly (A) selected RNA was the lowest amount used in the present study, though it is possible to start from much smaller quantities. In fact, Belyavsky and co-workers (1989) and Welsh and co-workers (1990) constructed cDNA libraries from 10-50 cells. However, considering the efficiency of cDNA synthesis, in which only 10-30% of the starting RNA is converted to cDNA and the non-quantitative recovery of materials at each step especially prior to amplification, it is likely that rare messages may be lost if the starting materials are too limited. It is noteworthy, in this context, that the synthetic tk mRNA mixed at 0.00025% seemed to be lost (Table 2) during cDNA construction. Nevertheless, the present method significantly reduces the effort required for collecting the starting materials: for example, only - 7 0 blastocysts are needed for the present method instead of 7,000 required for conventional methods (Watson and McConnel 1987). Thus, it should be possible to construct the cDNA libraries from any stages of mouse development, or from specific parts of embryos or selected regions of various organs using this technique. Limitations of the RARGIP method may result from problems intrinsic to PCR amplification, that is, a relatively higher error rate of DNA synthesis as Taq polymerase lacks proofreading ability (Gelfand 1989; Keohavong et al. 1989), and generation of anomalous fragments caused by incorrect priming or recombination during PCR (Meyerhans et al. 
1990). Although it may be difficult to avoid these problems completely, careful choice of amplification conditions (Innis and Gelfand 1990), use of reagents that enhance correct priming or DNA synthesis, and application of a heat-stable polymerase with proofreading function should circumvent at least some of them. For instance, two-step amplification with fewer cycles and a longer extension time helps to reduce anomalous fragment formation (K. Abe, unpublished; Meyerhans et al. 1990). Gene 32 protein of T4 phage, which is known to improve the activity and accuracy of Taq polymerase and consequently regular PCR efficiency (Schwarz et al. 1990), is also effective for LL-PCR (K. Abe, unpublished). Polymerase with 3'-5' exonuclease activity has been used successfully for LL-PCR. In fact, as described in the text, sequencing data for some of the amplified cDNAs suggest that the error rate was not very high. Although further studies will be required to examine critically the performance of such new PCR reagents, progress in basic PCR technology should improve this modified PCR method as well.

The method for rapid genetic information retrieval presented here can be a useful substitute for conventional screening procedures in certain instances. Searching for expressed sequences in cloned, large genomic regions involves many rounds of large-scale screening with multiple probes (Abe et al. 1988). The RARGIP method avoids large-scale filter hybridization and is well suited for such purposes, as described in this paper. The high probability of finding even rare-class transcripts should facilitate the entire process. It is also possible that a set of cosmids or YAC clones covering a part of the genome, or even sorted chromosomes, could be used as probes. Large genomic DNA can be digested, LL-linked, and amplified; such linked DNA can then serve as probes for hybridization with target cDNAs linked to a different lone linker sequence. Highly repetitive sequences that interfere with hybridization can be removed from genomic DNA by subtractive procedures (Elvin et al. 1990; Djabali et al. 1990; K. Abe, unpublished).

Recently, a method termed inverse PCR (Ochman et al. 1988; Triglia et al. 1988) was developed to gain information about unknown sequences adjacent to known sequences. The present method is applicable to this use and may be more versatile, as it does not require knowledge of the probe sequence, specific primers, or the template DNA self-ligation step. Preliminary results showed that genomic DNA containing integrated transgenes could be isolated from transgenic mouse DNA by the RARGIP method (K. Abe, unpublished).

Another major application may be in detecting related, but different, sequences from the same organism or from phylogenetically distant species. Genes with vital biological functions often share similar sequences, forming a family of genes (Dressler and Gruss 1988). Low-stringency hybridization in solution may be the best way to isolate such related gene sequences, since each parameter affecting DNA reassociation can be controlled more precisely in solution (Anderson and Young 1985).
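As a rough illustration of which parameters are meant here, the following minimal sketch (not part of the original study) estimates how salt concentration, GC content, duplex length, formamide, and sequence mismatch shift the melting temperature of a long DNA duplex. It assumes a widely quoted empirical approximation, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%GC) - 0.63(%formamide) - 600/L, with about 1 °C subtracted per 1% mismatch; published coefficients vary slightly between sources. The function name, the probe length, and the example values are hypothetical and chosen only for illustration.

```python
import math


def duplex_tm_celsius(na_molar, gc_percent, length_bp,
                      formamide_percent=0.0, mismatch_percent=0.0):
    """Approximate melting temperature (deg C) of a long DNA-DNA duplex.

    Assumed empirical rule of thumb; coefficients differ slightly between
    sources, so treat the result as an estimate, not a measured value.
    """
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("na_molar and length_bp must be positive")
    tm = (81.5
          + 16.6 * math.log10(na_molar)   # higher salt stabilizes the duplex
          + 0.41 * gc_percent             # GC pairs raise Tm
          - 0.63 * formamide_percent      # formamide lowers Tm
          - 600.0 / length_bp)            # short duplexes melt earlier
    # Assumed correction: roughly 1 deg C lower per 1% mismatched bases.
    return tm - 1.0 * mismatch_percent


# Hypothetical 500-bp probe, 50% GC, 0.3 M Na+.
perfect = duplex_tm_celsius(0.3, 50.0, 500)
# A related family member assumed to differ at 15% of positions.
related = duplex_tm_celsius(0.3, 50.0, 500, mismatch_percent=15.0)

print(f"perfect match Tm: {perfect:.1f} C")
print(f"15% mismatch Tm:  {related:.1f} C")
```

Under these assumptions, lowering the hybridization or wash temperature, raising the salt concentration, or reducing formamide moves the conditions further below the Tm of the mismatched duplex, which is what allows related but non-identical sequences to remain hybridized to the probe.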
Ease of construction and of screening many different libraries may make the method an alternative to existing procedures in which conventional libraries are screened at low stringency or cDNAs are isolated by PCR with a highly degenerate set of primers (Lee and Caskey 1990). The successful isolation of zinc-finger-containing genes from previously inaccessible material such as PGCs strongly suggests the feasibility of this strategy.

Acknowledgments. The author would like to express his gratitude to Dr. Mitsuru Furusawa for support and encouragement during the course of this study; Dr. Minoru S.H. Ko for comments on the manuscript; Ms. Kaori Nishiguchi and Ms. Hiromi Kawabata for the synthesis of oligonucleotides and technical assistance; Dr. Toshiaki Noce for information on the zinc-finger consensus sequence and the PGC sample; Dr. Susan F. Godsave for critical reading of the manuscript; and Young Il Yeom, Haesook Ha, and Drs. Karen Artzt and Dorothea Bennett for sharing information and materials concerning Tctex genes.

References

Abe, K., Wei, J.-F., Wei, F.-S., Hsu, Y.-C., Uehara, H., Artzt, K., and Bennett, D.: Searching for coding sequences in the mammalian genome: The H-2K region of the mouse MHC is replete with genes expressed in embryos. EMBO J 7: 3441-3449, 1988.
Akowitz, A. and Manuelidis, L.: A novel cDNA/PCR strategy for efficient cloning of small amounts of undefined RNA. Gene 81: 295-306, 1989.
Anderson, L.M. and Young, B.D.: Quantitative filter hybridisation. In B.D. Hames and S.J. Higgins (eds.); Nucleic Acid Hybridization: A Practical Approach, pp. 73-111, IRL Press, Oxford, 1985.
Bell, J.: Chromosome crawling in the MHC. Trends Genet 5: 289-290, 1989.
Belyavsky, A., Vinogradova, T., and Rajewsky, K.: PCR-based cDNA library construction: General cDNA libraries at the level of a few cells. Nucl Acids Res 17: 2919-2932, 1989.
Bennett, D.: The T-locus of the mouse. Cell 6: 441-454, 1975.
Bird, A.P.: CpG islands and the function of DNA methylation. Nature 321: 209-213, 1986.
Chirgwin, J.M., Przybyla, A.E., MacDonald, R.J., and Rutter, W.J.: Isolation of biologically active ribonucleic acid from sources enriched in ribonuclease. Biochemistry 18: 5294-5299, 1979.
Djabali, M., Nguyen, C., Roux, D., Demengeot, J., Yang, H.M., and Jordan, B.R.: A simple method for the direct use of total cosmid clones as hybridization probes. Nucl Acids Res 18: 6166, 1990.
Dressler, G.R. and Gruss, P.: Do multigene families regulate vertebrate development? Trends Genet 4: 214-219, 1988.
Elvin, P., Slynn, G., Black, D., Graham, A., Butler, R., Riley, J., Anand, R., and Markham, A.F.: Isolation of cDNA clones using yeast artificial chromosome probes. Nucl Acids Res 18: 3913-3917, 1990.
Gelfand, D.H.: Taq DNA polymerase. In PCR Technology, pp. 17-22, Stockton Press, New York, 1989.
Ha, H., Howard, C.A., Yeom, I.Y., Abe, K., Uehara, H., Artzt, K., and Bennett, D.: Testis-expressed genes in the mouse t-complex have expression differences between wild-type and t-mutant mice. Develop Genet, in press, 1991.
Hashimoto, N., Kubokawa, R., Yamazaki, K., Noguchi, M., and Kato, Y.: Germ cell deficiency causes testis cord differentiation in reconstructed mouse fetal ovaries. J Exp Zool 253: 61-70, 1990.
Innis, M.A. and Gelfand, D.H.: Optimization of PCRs. In M.A. Innis, D.H. Gelfand, J.J. Sninsky, and T.J. White (eds.); PCR Protocols: A Guide to Methods and Applications, pp. 3-12, Academic Press, New York, 1990.
Jeffreys, A.J., Wilson, V., Neumann, R., and Keyte, J.: Amplification of human minisatellites by the polymerase chain reaction: Towards DNA fingerprinting of single cells. Nucl Acids Res 16: 10953-10971, 1988.
Keohavong, P. and Thilly, W.G.: Fidelity of DNA polymerases in DNA amplification. Proc Natl Acad Sci USA 86: 9253-9257, 1989.
Kinzler, K.W. and Vogelstein, B.: Whole genome PCR: Application to the identification of sequences bound by gene regulatory proteins. Nucl Acids Res 17: 3645-3653, 1989.
Ko, M.S.H.: An "equalized cDNA library" by the reassociation of short double-stranded cDNAs. Nucl Acids Res 18: 5705-5711, 1990.
Ko, M.S.H., Ko, S.B.H., Takahashi, N., Nishiguchi, K., and Abe, K.: Unbiased amplification of a highly complex mixture of DNA fragments by "lone linker"-tagged PCR. Nucl Acids Res 18: 4293-4294, 1990.
Lee, C.C. and Caskey, T.: cDNA cloning using degenerate primers. In M.A. Innis, D.H. Gelfand, J.J. Sninsky, and T.J. White (eds.); PCR Protocols: A Guide to Methods and Applications, pp. 46-53, Academic Press, New York, 1990.
Maniatis, T., Fritsch, E.F., and Sambrook, J.: Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, New York, 1982.
Meyerhans, A., Vartanian, J.-P., and Wain-Hobson, S.: DNA recombination during PCR. Nucl Acids Res 18: 1787-1791, 1990.
Ochman, H., Gerber, A.S., and Hartl, D.L.: Genetic applications of an inverse polymerase chain reaction. Genetics 120: 621-625, 1988.
Schwarz, K., Hansen-Hagge, T., and Bartram, C.: Improved yields of long PCR products using gene 32 protein. Nucl Acids Res 18: 1079, 1990.
Silver, L.M.: Mouse t-haplotypes. Ann Rev Genet 19: 179-208, 1985.
Tiemeier, D.C., Tilghman, S.M., and Leder, P.: Purification and cloning of a mouse ribosomal gene fragment in coliphage λ. Gene 2: 173-191, 1977.
Tokunaga, K., Taniguchi, H., Yoda, K., Shimizu, M., and Sakiyama, S.: Nucleotide sequence of a full-length cDNA for mouse cytoskeletal β-actin mRNA. Nucl Acids Res 14: 2829, 1986.
Triglia, T., Peterson, M.G., and Kemp, D.J.: A procedure for in vitro amplification of DNA sequences that lie outside the boundaries of known sequences. Nucl Acids Res 16: 8186, 1988.
Uhlen, M.: Magnetic separation of DNA. Nature 340: 733-734, 1989.
Watson, C.J. and McConnel, J.: Construction of cDNA libraries for pre-implantation mouse embryos. In M. Monk (ed.); Mammalian Development: A Practical Approach, pp. 183-197, IRL Press, Oxford, 1987.
Welsh, J., Liu, J.-P., and Efstratiadis, A.: Cloning of PCR-amplified total cDNA: Construction of a mouse oocyte cDNA library. Genet Anal Tech Appl 7: 5-17, 1990.
Yeom, Y.I., Abe, K., Bennett, D., and Artzt, K.: Testis-/embryo-expressed genes are clustered in the mouse H-2K region. Proc Natl Acad Sci USA, in press, 1992.
Zabarovsky, E.R. and Allikmets, R.L.: An improved technique for the efficient construction of gene libraries by partial filling-in of cohesive ends. Gene 42: 119-123, 1986.
