J. Mol. Biol. (1990) 211, 19-33

Intramolecular DNA Triplexes, Bent DNA and DNA Unwinding Elements in the Initiation Region of an Amplified Dihydrofolate Reductase Replicon

Mark S. Caddle, Richard H. Lussier and Nicholas H. Heintz†

Department of Pathology, University of Vermont College of Medicine
Burlington, VT 05405, U.S.A.

(Received 9 May 1989, and in revised form 16 August 1989)
The nucleotide sequence of 6.2 kb (1 kb = 10³ base-pairs) of DNA that encompasses the earliest replicating portion of the amplified dihydrofolate reductase domains of CHOC 400 cells has been determined. Origin region DNA contains two AluI family repeats, a novel repetitive element (termed ORR-1), a TGGGT-rich region, and several homopurine/homopyrimidine and alternating purine/pyrimidine tracts, including an unusual cluster of simple repeating sequences composed of (G-C)5, (A-C)18, (A-G)21, (G)9, (CAGA)4, GAGGGAGAGAGGCAGAGAGGG, (A-G)27. Recombinant plasmids containing origin region sequences were examined for DNA structural conformations previously implicated in origin activation. Mung bean nuclease sensitivity assays for DNA unwinding elements show the preferred order of nuclease cleavage at neutral pH in supercoiled origin plasmids to be: (A-T)23 >> the (A-G) cluster >> (A)38 >> vector = (AATT)4. At acid pH, the hierarchy of cleavage preferences changes to: the (A-G) cluster >> (A-T)23 >> (AATT)4 > vector = (A)38. A region of stably bent DNA was identified and shown not to be reactive in the mung bean nuclease unwinding assay at either acid or neutral pH. Intermolecular hybridization studies show that, in the presence of torsional stress at pH 5.2, the (A-G) cluster forms triple-stranded DNA. These results show that the origin region of an amplified chromosomal replicon contains a novel repetitive element and multiple sequence elements that facilitate DNA bending, DNA unwinding and the formation of intramolecular triple-stranded DNA.
1. Introduction
The methotrexate-resistant Chinese hamster cell strain, CHOC 400, contains 1000 copies of an early-replicating sequence that includes the gene for dihydrofolate reductase (dhfr‡; Milbrandt et al., 1981). The amplified dhfr domains, each approximately 270 kb in length (Montoya-Zavala & Hamlin, 1985; Looney & Hamlin, 1987), are situated in tandem arrays (Looney et al., 1988) in several homogeneously staining chromosomal regions (HSRs) that begin replication immediately upon entry into S phase (Hamlin & Biedler, 1981; Milbrandt et al., 1981). Pulse-labeling studies in synchronized cells indicate that replication of the amplified domains commences within a series of restriction fragments, termed early-labeled fragments, or ELFs, located 3' to the dhfr gene (Heintz & Hamlin, 1982; Heintz et al., 1983; Heintz & Stillman, 1988). Hybridization of replication intermediates formed during the onset of S phase to cloned ELFs indicates that DNA synthesis begins within a 4.3 kb XbaI fragment located approximately 14 kb 3' to the last exon of the dhfr gene (Burhans et al., 1986a,b). "In gel" renaturation analysis of the labeling pattern of amplified restriction fragments suggests an initiation site is located within a 1.8 kb BamHI-HindIII subfragment of the 4.3 kb XbaI fragment (Leu & Hamlin, 1989). This site is enriched for repetitive sequences contained within an "origin-specific" DNA fraction (Anachkova & Hamlin, 1989). Nearby sequences have been reported to function as an autonomously replicating sequence (ARS) element in yeast (Hamlin et al., 1988). A map of the amplified domain of CHOC 400 cells that locates these and other biological landmarks within the ELF-F/ELF-F' doublet relative to the dhfr gene is provided in Figure 1.

The molecular precedents established by the study of prokaryotic, viral, and yeast origins have permitted construction of generalized models for initiation of DNA synthesis (Bramhill & Kornberg, 1988; Umek et al., 1988). These models postulate that binding of the initiator protein assists in the localized melting of a nearby A+T-rich region that may have special structural properties. Umek et al. postulate that origin-associated A+T-rich regions, or DNA unwinding elements, represent a fundamental thermodynamic property of replication origins, and suggest that the sensitivity of these sequences to mung bean nuclease (MBN) cleavage at neutral pH in supercoiled molecules directly reflects the free energy required for DNA unwinding (Umek et al., 1988b). Indeed, the MBN assay for DNA unwinding indicates that the A+T-rich origin sequences of bacteriophage PM2 and the yeast 2 µm plasmid are more readily unwound than all other sequences in their genomes (Kowalski, 1984; Umek & Kowalski, 1987). Deletion analysis indicates that facile DNA unwinding is an essential feature of several origins, including the H4 ARS (Umek & Kowalski, 1988) and oriC (Bramhill & Kornberg, 1988a). Another structural feature, stably bent DNA, has been found near a variety of origins, including those of bacteriophage lambda (Zahn & Blattner, 1987), the ARS-1 element from yeast (Snyder et al., 1986), and simian virus 40 (SV40) (Deb et al., 1986). Deletion mutagenesis and sequence substitution experiments suggest that DNA bending may be important for activity of the SV40 (Deb et al., 1986) and ARS-1 (Williams et al., 1988) origins. DNA bending, which may be enhanced by the binding of specific proteins (Zahn & Blattner, 1987), has been shown to promote disruption of the DNA helix (Ramstein & Lavery, 1988).

Lack of a definitive assay for vertebrate origin function has hindered precise delineation of the DNA sequences that comprise the dhfr origin of replication. In the absence of such an assay, we determined the nucleotide sequence of 6.2 kb of DNA from the ELF region of the amplified dhfr domain, and then analyzed recombinant plasmids containing these DNA sequences for structural properties that may be related to origin structure or function. Here we report that the dhfr origin region contains five types of unusual sequences: (1) a novel repetitive DNA sequence; (2) stably bent DNA; (3) homologies to yeast chromosomal origins; (4) elements that promote DNA unwinding; and (5) a cluster of simple alternating repeats that forms triple-stranded DNA when subjected to torsional stress at acid pH.

† Author to whom all correspondence should be addressed.
‡ Abbreviations used: dhfr, dihydrofolate reductase; kb, 10³ bases or base-pairs; bp, base-pair(s); HSR, homogeneously staining chromosomal region; ELF, early-labeled fragment; ARS, autonomously replicating sequence; MBN, mung bean nuclease; ssDNA, single-stranded DNA.

© 1990 Academic Press Limited
2. Materials and Methods

(a) DNA sequencing

Origin region DNA was sequenced by the dideoxy method (Sanger et al., 1977). To sequence the 4.3 kb XbaI fragment from recombinant cosmid S13 (Burhans et al., 1986b), the fragment was subcloned in pUC12, yielding plasmid S13X-24. S13X-24 was linearized at the unique pUC12 PstI cloning site 5' to the insert DNA, and a graded series of Bal31 deletions was prepared and subcloned in the M13 vector, mp10, as described (Poncz et al., 1982). Phage were propagated in the Escherichia coli strain JM101, and 1 to 2 µg of phage ssDNA was prepared and sequenced with Klenow fragment and the universal 17-mer primer, as described (Messing, 1983). [32P]dATP was used as the labeled precursor. The 6% to 10% (w/v) polyacrylamide sequencing gels were prepared as described (Maniatis et al., 1982). To fill gaps missing in the collection of Bal31 deletions, subclones of selected restriction fragments were constructed in M13 or pTZ vectors (U.S. Biochemicals), ssDNA was prepared, and each template was sequenced as described. Sequence ambiguities were also resolved by using specific oligonucleotide primers made on an ABI model 381A DNA synthesizer. The 1.6 kb HindIII fragment that overlaps S13X-24 was subcloned into M13mp18 in both orientations, ssDNA was prepared, and the sequence determined by the dideoxy method using custom oligonucleotide primers synthesized as necessary. Each segment of the 6.2 kb region was sequenced a minimum of 4 times in separate experiments.

Sequence information was assembled with a package of programs from the Molecular Biology Computer Research Resource (MBCRR) at Harvard University, including the Intelligenetics program "Gel", or the Gel Assembly programs of the Genetics Computer Group, University of Wisconsin (Devereux et al., 1984). Sequence analysis was performed with the following programs: DASHER, LOCAL, and ALIGN from the MBCRR; the programs Wordsearch, Compare, StemLoop, and Repeat from the Genetics Computer Group at the University of Wisconsin; and the matrix homology programs of Pustell (IBI) and DNASIS (LKB). Unless otherwise indicated, program default parameters were used in sequence analyses.

The dhfr origin region was surveyed for anomalously migrating fragments by a 2-dimensional gel technique (Anderson, 1986). Restriction digests of plasmid subclones were separated on an agarose gel in TAE buffer (40 mM-Tris-acetate (pH 7.8), 1 mM-EDTA) at room temperature. The agarose gel lane was excised, reoriented 90° relative to the first dimension, and cast in a polyacrylamide gel. The 2nd dimension gel was run at 4°C to exaggerate the effects of DNA bending on fragment mobility.
Mung bean nuclease (MBN) sensitivity assays were performed essentially as described (Umek & Kowalski, 1987, 1988; Umek et al., 1988a,b). Plasmid DNA was prepared by equilibrium sedimentation in cesium chloride gradients containing ethidium bromide (Maniatis et al., 1982), and used at native levels of supercoiling. Plasmid DNA (1 to 1.5 µg) was preincubated for 1.5 min at 37°C in 10 mM-Tris·HCl (pH 7.0 to 7.2), prior to the addition of MBN (BRL). The activity of each MBN preparation was titrated under reaction conditions at both neutral and acid pH to yield approximately one ssDNA nick per supercoiled molecule. The reactions were stopped after an additional 60 min incubation at 37°C by the addition of an equal volume of phenol/chloroform (24:1, v/v). The DNA was isolated, digested with the indicated restriction enzymes, and the products were end-labeled with [α-32P]dATP and Klenow fragment (Maniatis et al., 1982). The products of the end-labeling reactions (30,000 cts/min per lane) were denatured for 2 min at 98°C, adjusted to 30 mM-NaOH, 1 mM-EDTA with tracking dye and glycerol, and subjected to electrophoresis in alkaline agarose gels in 30 mM-NaOH, 1 mM-EDTA. The gels were neutralized by three 20 min washes in 40 mM-Tris-acetate (pH 7.9), 1 mM-EDTA, dried, and exposed to X-ray film for 3 to 36 h at -70°C. Cleavage sites were mapped by digesting the nuclease reaction products with selected combinations of restriction endonucleases. In some instances, Southern blot hybridization with selected probes was required to resolve ambiguities (data not shown). The lengths of ssDNA fragments were determined from short exposures and were judged to be accurate within ±50 bp.

Intermolecular hybridizations were performed as described (Htun & Dahlberg, 1988). The 20 µl reactions containing 0.2 µg of supercoiled plasmid DNA and 0.1 µg of the indicated recombinant M13 phage DNA were incubated in 200 mM-sodium acetate (pH 5.2 or 7.2) for 24 h at 37°C. The hybridization products were separated by electrophoresis at 4°C in 1% agarose gels in 10 µg ethidium bromide/ml, 40 mM-Tris-acetate (pH 7.8), 1 mM-EDTA. The gels were photographed, transferred to Zetaprobe (Biorad), hybridized with the 2.9 kb ClaI fragment from M13mp18 replicative form DNA, washed, and exposed to X-ray film as described previously.

3. Results

(a) Primary DNA sequence of dhfr origin region DNA

(i) Biological landmarks

The dhfr origin region has been identified by a variety of replication studies (reviewed in Introduction). In Figure 1(a), the locations of previously reported biological landmarks are collated relative to the dhfr gene. Outlined in Figure 1(b) is the strategy used to determine the primary nucleotide sequence of this region from the 5' XbaI site of the 4.3 kb XbaI fragment of S13X-24 to the 3' end of the overlapping 1.63 kb HindIII fragment of ELF-F. The sequence is 6157 nucleotides in length (Fig. 2); the EcoRI site that separates ELF-F' from
[Figure 1 graphic. Panel (a): restriction map (scale in kb) of the dhfr gene (exons and introns) and the 3' early-replicating region containing ELF-F', ELF-F and ELF-C, annotated with the following activities: early replication in synchronized cells and immediate replication in G1/S nuclei in vitro; hybridization of replication intermediates from G1/S cells or nuclei; initiation site by "in gel" renaturation analysis; hybridization of "origin-specific" DNA fraction; ARS activity in yeast; micrococcal nuclease-sensitive site in chromatin. References cited in the panel: Heintz & Hamlin (1982); Heintz et al. (1983); Heintz & Stillman (1988); Burhans et al. (1986b); Leu & Hamlin (1989); Anachkova & Hamlin (1989); Hamlin et al. (1988). Panel (b): region sequenced and sequencing strategy.]
Figure 1. Biological landmarks in the initiation region of the amplified dhfr domain of CHOC 400 cells. (a) A restriction map of the 3' early-replicating region that locates the early-labeled fragments (ELF) F, F', and C relative to the dhfr gene is presented. Selected restriction fragments that demonstrate the indicated activities within the EcoRI ELF-F/ELF-F' doublet are referenced. (b) Sequencing strategy for a portion of the initiation region. Arrows indicate the orientation and distance of individual sequencing reactions. X, XbaI; R, EcoRI; H, HindIII; P, PstI.
[Figure 2 graphic: nucleotide sequence of the dhfr origin region, numbered 1 to 6157 in rows of 100 nucleotides, with the XbaI, BamHI, HindIII and EcoRI sites, boxed elements 1 to 8, the type I and type II AluI repeats, ORR-1, the oligo(dA) tracts B1 to B5, the ARS homologies and the ELF-F'/ELF-F boundary marked.]
Figure 2. Primary nucleotide sequence of dhfr origin DNA. The sequence of the dhfr origin region from the 5' XbaI site of the 4.3 kb XbaI fragment of S13X-24 to the 3' HindIII site of pMC-G (see Figs 1 and 6) is presented. Boxed elements numbered 1 to 8 designate sequence motifs reported to adopt non-B form structural configurations (see the text). The locations of ORR-1 and AluI family repeats are indicated. B1 to B5 indicate the 5 tracts of oligo(dA) that are phased with a 10 bp periodicity in the 280 bp HaeIII bent DNA fragment located at nucleotides 3342 to 3622. Bold lines indicate homologies to either the consensus sequence of the ARS core (nucleotides 4776 to 4786 and 4782 to 4792) or the ARS 3' conserved element (nucleotides 4709 to 4719 and 4761 to 4771). The EcoRI site that separates ELF-F' from ELF-F is located at nucleotide 4280.
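The phasing of oligo(dA) tracts described in the legend can be checked computationally. The following is a minimal Python sketch, not the procedure used in this study: the function names, the minimum tract length of 3 and the toy sequence are all assumptions made for illustration, and only one strand is scanned (runs of T on the scanned strand, which are A tracts on the complementary strand, are ignored).

```python
import re

def a_tracts(seq, min_len=3):
    """Return (start, length) for each run of A of at least min_len (1-based start).

    min_len=3 is an assumed threshold, not a value taken from the paper.
    """
    pattern = r"A{%d,}" % min_len
    return [(m.start() + 1, len(m.group())) for m in re.finditer(pattern, seq.upper())]

def center_spacings(tracts):
    """Distances between the centers of consecutive tracts."""
    centers = [start + (length - 1) / 2 for start, length in tracts]
    return [b - a for a, b in zip(centers, centers[1:])]

# Hypothetical toy sequence (not dhfr origin sequence): five A5 tracts,
# each followed by five non-A bases, giving a 10 bp repeat.
toy = ("AAAAA" + "CGTCG") * 5
tracts = a_tracts(toy)
spacings = center_spacings(tracts)
```

A spacing close to one helical turn (about 10 to 10.5 bp) places the tracts on the same face of the helix, which is the arrangement the legend describes for B1 to B5.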
ELF-F is located at position 4280. Overall, the sequence is 59% A+T. Sequencing shows the XbaI fragment of S13X-24, estimated as 4.3 kb, is actually 4540 bp long.

(ii) Repetitive elements

In an earlier study, type I and II Alu family probes failed to hybridize to Southern blots of the 4.3 kb XbaI fragment (Burhans et al., 1986b). However, sequence homology searches revealed the presence of two AluI family repeats located at nucleotides 2248 to 2433 and nucleotides 3536 to 3683. These repeats were not identified by hybridization because each differs from the Alu type I and II repeats present in the probe sequence by greater than 30% (data not shown). The type II AluI family repeat located at nucleotides 2248 to 2433 contains a long poly(A) tail. Homology searches also revealed a highly conserved motif located at position 3085 to 3242 that we have termed origin-region repeat 1, or ORR-1 (see Fig. 2). ORR-1 is homologous to extragenic sequences from the mouse MHC class I H2-K gene promoter region (Kimura et al., 1986), the 3'-flanking region of the mouse immunoglobulin heavy chain gene (Cheng et al., 1982), and the 5'-flanking region of the rat metallothionein-1 pseudogene b (Anderson et al., 1986). The location
[Figure 3 graphic. Panel (a): maps showing the location and orientation of ORR-1 in the dhfr origin region (3' to the last exon of dhfr) and at the musigcd17, musmhkba and ratmtlpb loci (scale bar 1 kb). Panel (b): alignment of ORR-1 with musigcd17, musmhkba and ratmtlpb, with the derived consensus. Panel (c): alignment of ORR-1 with the ad-5 integration site in BHK268-C31 Chinese hamster cells, with the ORP-A homology marked.]
Figure 3. (a) Locations and orientations of ORR-1. The orientation (small arrow) and location of ORR-1 relative to the direction of transcription (large arrow) for each locus are indicated. The following Genbank designations are used to identify each sequence: musigcd17, mouse IgD 3' region; musmhkba, mouse MHC class I H2-K gene promoter region; ratmtlpb, 5'-flanking region of the rat metallothionein-1 pseudogene b. Hatched regions indicate exons, except for the pseudogene, where the hatched region indicates the sequences homologous to the processed mRNA. (b) Sequence conservation of ORR-1. Gaps introduced into each sequence are indicated by dots; nucleotides conserved in all 4 sequences are boxed. (c) Integration of adenovirus type 5 (ad-5) DNA into a sequence homologous to ORR-1. The sequence encompassing the integration site of ad-5 DNA in the genome of BHK268-C31 cells is compared to ORR-1. Identical nucleotides are indicated by an asterisk. The ORR-1 homology to the adenovirus origin region protein A (ORP-A) binding site is underlined.

and orientation of ORR-1 is compared to those of the mouse and rat sequences in Figure 3(a). Comparison of the four sequences with one another (Fig. 3(b)) shows that the ORR-1 motif contains two regions of strong identity; nucleotides 32 to 71 contain 26 of 39 residues common to all four
sequences, while nucleotides 91 to 115 contain 20 of 25 bases common to all four sequences. Independent searches of Genbank with these two conserved regions revealed an additional match to an adenovirus type 5 (ad-5) DNA integration site. Alignment of ORR-1 with the right-hand ad-5 integration site of the cell line BHK268-C31 (Westin et al., 1982) shows that the two sequences are nearly identical (Fig. 3(c)). The right-hand integration junction occurs at nucleotide 5083 of the ad-5 genome, which is joined to nucleotide 106 of the ORR-1 motif. Comparison of the adenovirus sequences upstream from the viral-genomic DNA junction reveals no significant sequence homology between the pre-integration site and the ad-5 genome. A portion of the ORR-1 repeat is homologous to the consensus binding site for a nuclear protein that binds to the adenovirus origin of replication (ORP-A; Rosenfeld et al., 1987).

(iii) Short direct repeats and palindromes

Since a variety of viral and prokaryotic origins contain reiterated binding sites for initiation factors, the origin region sequence was analyzed for short direct repeats. Although the sequence contains at least 70 perfect direct repeats of 8 to 12 bp, the repeats are not reiterated in clusters, but rather are distributed in an apparently random pattern throughout the sequence. Similarly, analysis of the origin region sequence with the program StemLoop (program parameters were 80% match in stems 10 to 50 bp; loops 10 to 300 bp) reveals only imperfect stem-loop matches of various lengths (not shown).

(iv) Homo- and heteropolymeric repeats

The origin region sequence contains several types of simple homo- or heteropolymeric repeats. Positions 1023 to 1053 contain (A)14 followed by (C-A)9. This is followed at positions 1185 to 1337 by 142 nucleotides of G+T-rich sequence in which the telomeric-like motif TGGGT (Henderson et al., 1987) is repeated 14 times. Fifty-seven of 61 nucleotides on one strand from position 2248 to 2309 are A residues derived from the poly(A) tail associated with an AluI repetitive element. A tract of (A-T)23 is located at position 3766 to 3811. An unusual array of homopurine/homopyrimidine and alternating purine/pyrimidine sequences is located at positions 5734 to 5921 at the 3' end of the sequence. This array is characterized by two tracts of alternating (A-G), one of 42 nucleotides and a second of 54 nucleotides. These tracts are interrupted by (G)9 and (CAGA)4, followed by GAGGGAGAGAGGC. The entire array is preceded by (G-C)5 and (A-C)18. Strand bias in this region results in over 180 bp of sequence that lacks T on one strand. Homology searches show that this array has a significant percentage of identity with the human U2 RNA gene and several non-transcribed spacer regions of the ribosomal DNA clusters from mammals (summarized in Table 1).

(v) Homology to ARS elements
The 5' half of the 1.6 kb HindIII fragment that comprises nucleotides 4524 to 6157 of the origin region sequence has ARS activity in yeast (Hamlin et al., 1988). ARS elements are characterized by two consensus sequences: a core sequence of (A/T)TTTAT(A/G)TTT(A/T) that is required for ARS function (Kearsey, 1984) and a 3' consensus sequence of (A/T)(A/T)(A/T)GCTAAAAG (Palzkill et al., 1986). The dhfr region reported to have ARS activity has one 83 bp segment (nucleotides 4709 to 4792) that contains two overlapping 10 of 11 bp matches to the core sequence, and two 9 of 11 bp matches to the more heterogeneous 3' consensus sequence (designated by bold lines, Fig. 2). These sequences are embedded in a block of approximately 450 bp (nucleotides 4615 to 5069) that is 76% A+T.
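A search for "10 of 11" matches to the ARS core consensus can be sketched as a sliding-window comparison on both strands. The Python below is an illustrative sketch only; it is not the program used in this study, the function names are hypothetical, and the consensus is written with the IUPAC codes W (A or T) and R (A or G) as a shorthand for the bracketed positions quoted above.

```python
# 11 bp ARS core consensus (A/T)TTTAT(A/G)TTT(A/T) in IUPAC shorthand.
ARS_CORE = "WTTTATRTTTW"
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "AT", "R": "AG"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def count_matches(window, consensus=ARS_CORE):
    """Number of positions at which the window agrees with the consensus."""
    return sum(base in IUPAC[sym] for base, sym in zip(window, consensus))

def scan(seq, consensus=ARS_CORE, min_matches=10):
    """Return (start, strand, matches) for windows meeting the threshold.

    start is 1-based on the strand given; '-' means the match is read
    on the complementary strand of that window.
    """
    seq = seq.upper()
    n = len(consensus)
    hits = []
    for i in range(len(seq) - n + 1):
        window = seq[i:i + n]
        for strand, candidate in (("+", window), ("-", reverse_complement(window))):
            matches = count_matches(candidate, consensus)
            if matches >= min_matches:
                hits.append((i + 1, strand, matches))
    return hits

# Hypothetical toy sequences (not taken from the dhfr origin region).
perfect = scan("GGGC" + "ATTTATGTTTA" + "CCGG")
one_off = scan("ATTTATGTTTC")
```

Setting `min_matches=9` with the 3' consensus written as `"WWWGCTAAAAG"` would give the corresponding search for the 9 of 11 bp matches, under the same assumptions.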
Table 1
Homologies of the poly(dA-dG) cluster with other sequences

Genbank designation   Sequence location                                           % Identity   Length (bp)
HUMUG20               Human U2 small nuclear RNA gene                             55.7         379
HUMRGNTSA             Human non-transcribed ribosomal spacer (t-2 class)          63.5         296
HUMRGNTSC             Human non-transcribed ribosomal spacer (t-1 class)          51.9         343
HUMRGNTSB             Human non-transcribed ribosomal spacer (+-0 class)          56.1         264
RATRRNAOB             Rat spacer downstream from 28S rRNA gene                    57.0         365
RATNTSI               Rat non-transcribed spacer downstream from 28S rRNA gene    54.6         315

The program Wordsearch was used to scan Genbank for sequences homologous to the poly(dA-dG) cluster of pMC-G. The length and percentage of identical residues for each sequence are presented.
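The "% Identity" and "Length" columns can be illustrated with a short Python sketch that scores two already-aligned sequences. This is an assumption-laden illustration, not a reimplementation of Wordsearch: the function name is hypothetical, gap positions (written "." or "-") are simply skipped, and the toy alignment is invented.

```python
def percent_identity(a, b, gap_chars=".-"):
    """Return (compared_positions, percent_identical) for two aligned strings.

    Positions where either sequence has a gap character are ignored;
    that convention is an assumption of this sketch.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in gap_chars or y in gap_chars:
            continue
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return compared, 100.0 * identical / compared

# Hypothetical toy alignment (not data from Table 1).
length, identity = percent_identity("AGAGAGGG", "AGACAG.G")
```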
Figure 4. Two-dimensional gel analysis of dhfr origin region DNA. HaeIII digests of S13X-24 (a) and pMC-G (b) were resolved in neutral 2-dimensional agarose/polyacrylamide gels at 4°C as described in Materials and Methods. The arrowhead indicates the 280 bp fragment located at nucleotides 3342 to 3622 of the sequence presented in Fig. 2. The position of size markers in the 1st dimension gel is presented in bp.

(b) Structural motifs in the origin region sequence

(i) Bent DNA

Stably bent DNA is associated with a number of viral, bacterial and yeast origins of replication (see Introduction). To determine if the dhfr origin region contains bent DNA sequences, a two-dimensional gel technique was used to survey the plasmids S13X-24 and pMC-G (see Fig. 5 for plasmid designations) for small restriction fragments that migrate anomalously on polyacrylamide gels (Anderson, 1986). As described by Anderson, fragments of random composition form a smooth arc in the second dimension, whereas those with unusual structures migrate to positions discordant with their size. An HaeIII digest of S13X-24 reveals two fragments that migrate anomalously in the two-dimensional gel assay (Fig. 4(a)). The first, which migrates 10% faster than expected in the second dimension, is a 434 bp fragment derived from the cloning vector. This fragment is 58% G+C; the basis for its increased mobility is unknown. The second fragment migrates with an apparent size of 290 (±10) bp in the first dimension agarose gel, and with an apparent size of 345 (±10) bp in the second dimension polyacrylamide gel (arrow, Fig. 4(a)). This fragment was cloned from the second dimension gel and sequenced; it is 280 bp in length, comprises nucleotides 3342 to 3622, and contains five tracts of oligo(dA) spaced 10 bp apart (labeled B1 to B5 in Fig. 2). Elsewhere we will describe cyclic permutation assays and temperature-dependent alterations in the electrophoretic properties of this fragment that confirm that the 280 bp HaeIII fragment contains stably bent DNA (M. Caddle & N. H. Heintz, unpublished results). At neutral pH, the 970 bp HaeIII fragment of pMC-G that contains the cluster of (A-G) tracts does not migrate anomalously in the two-dimensional gel assay (Fig. 4(b)). Nor is altered migration observed for the 251 bp HaeIII fragment that contains (A-T)23 or the 910 bp fragment that contains the poly(A) stretch located from nucleotides 2248 to 2309. The A+T-rich region of pMC-G with reported ARS activity also shows normal migration in the two-dimensional gel assay.
(ii) Nuclease sensitivity of origin region DNA

The dhfr origin sequence contains a number of candidate elements that are reported to adopt non-B form DNA structures. To simplify discussion of the following structural studies, these motifs have been numbered 1 to 8 in the order in which they appear 5' to 3' in the dhfr origin region sequence (see Figs 2 and 5). Region 1, located at nucleotides 551 to 585, consists of 33 contiguous A or T residues in which the motif (AATT) is repeated four times. This A+T-rich region abuts a G-rich region from nucleotides 500 to 550. Region 2, at positions 1022 to 1053, is composed of (A)14(A-C)9, a sequence reported to form Z-DNA under torsional stress (Rich et al., 1984). Region 3, at nucleotides 1185 to 1337, contains 14 copies of the telomeric-like repeat TGGGT (Henderson et al., 1987). Region 4, located at nucleotides 2264 to 2302, consists of 38 contiguous A residues, a sequence of enhanced flexibility (Hogan et al., 1983) that is difficult to condense into nucleosomes (Simpson & Kunzler, 1979). Region 5, the bent DNA sequence, is located at nucleotides 3410 to 3460; bent DNA has been shown to promote disruption of duplex DNA (Koo et al., 1986; Ramstein & Lavery, 1988). Region 6, the (A-T)23 tract located at nucleotides 3766 to 3811, should be readily extruded as a cruciform in supercoiled plasmids (Greaves et al., 1985; ...) ... are interrupted by (G)9, a sequence implicated in triple helix formation in supercoiled plasmids (Kohwi & Kohwi-Shigematsu, 1988), and are preceded by (G-C)5, a classical Z-DNA motif.

To begin to define the contributions of these sequences to structural organization of the dhfr origin region, a panel of recombinant plasmids containing various fragments of the origin region was constructed (Fig. 5). Each plasmid was analyzed for DNA unwinding elements and/or cruciform formation by determining its sensitivity to mung bean nuclease at neutral pH. The same plasmids then were tested for Z-DNA, triple-stranded DNA, or other non-B conformers stabilized at acid pH by repeating the MBN sensitivity assays at
[Figure 5: restriction maps of the dhfr origin region and the recombinant plasmids, with a key to sequence elements 1 to 8.]

Element  Sequence                    Position
1        (AATT)4                     550-582
2        (dA)14(dC-dA)9              1022-1053
3        (TGGGT)14                   1185-1337
4        Poly (dA)38                 2264-23[?]2
5        Bent DNA                    3342-3622
6        (dA-dT)23                   3766-3811
7        ARS Homology/76% A+T        4709-4792
8        Poly (dA-dG) cluster        5734-5921

Figure 5. Locations of sequence elements within dhfr origin region plasmids. The identities and locations of the sequence elements designated 1 to 8 in Fig. 1 and the text are denoted as filled boxes within the recombinant plasmids pMC-A, -B, -D, -F, -G, and S13X-24. Not shown is the 700 bp HindIII-KpnI portion of pMC-D that extends from the 3' HindIII site (nucleotide 6157).
pH 5.2. Formation of triple-stranded DNA was monitored by the intermolecular hybridization assay described by Htun & Dahlberg (1988).

(iii) DNA unwinding activity at neutral pH

A prerequisite for initiation of DNA synthesis is the unwinding and subsequent separation of duplex DNA in the origin template. Sequences that promote facile unwinding can be detected as MBN-sensitive sites at neutral pH when subjected to superhelical stress in plasmid DNA (Umek et al., 1988a,b). Since supercoiled vector containing no genomic insert reveals constitutive cleavage sites (Fig. 6(a), lanes 14 and 15), it provides a reference for comparing the unwinding activity of various genomic sequences. When supercoiled pMC-D, which contains a 4.5 kb BamHI-KpnI fragment that includes nucleotides 1534 to 6157 (see Fig. 5), is incubated with MBN at neutral pH in a low ionic strength buffer, four single-stranded fragments in addition to the full-length HindIII products are observed (Fig. 6(a), compare lane 3 to lane 1). (The ssDNA fragments generated by MBN cleavage in this and following experiments are designated by letters corresponding to the fragment from which they are derived as
delineated in Fig. 5, and by numerals to allow comparison of the digestion products of individual fragments in different plasmids.) The ssDNA products of 730 nucleotides (F1) and 320 nucleotides (F2) (Fig. 6(a), lane 3) result from cleavage at region 6 (the (A-T)23 sequence). Cleavage also occurs at a lesser frequency at region 8 (the (A-G) cluster), yielding ssDNA fragments of about 1300 (G1) and 360 nucleotides (G2) (Fig. 6(a), lane 3). Note that treatment of pMC-D with MBN after digestion with HindIII generates no additional ssDNA fragments, confirming that the cleavage sites are induced by supercoiling (Fig. 6(a), lane 2). Not unexpectedly, the (A-T)23 sequence remains the preferred cleavage site in pMC-F, a plasmid that contains only the 1 kb HindIII fragment located at nucleotides 3490 to 4524 (Fig. 6(a), lanes 6 and 7). Incubation of pMC-G in the MBN assay at neutral pH results in equal amounts of single-strand scission at two sites (Fig. 6(a), lane 5). Cleavage at one site yields ssDNA products of 1300 (G1) and 360 nucleotides (G2) identical with those observed in pMC-D (compare lanes 3 and 5). Cleavage at the second site yields products of 1350 (G3) and 320 nucleotides (G4). Mapping with various restriction endonucleases (not shown) locates both nuclease-sensitive sites within pMC-G to region 8. Supercoiled pMC-A is also cleaved by MBN at two sites. Prominent digestion occurs within region 4, generating ssDNA fragments after digestion with EcoRI and HindIII of about 1240 (A1) and 750 nucleotides (A2) (Fig. 6(a), lane 9). A lesser portion of the pMC-A molecules are cleaved in vector sequences, yielding ssDNA fragments of 1585 (V1) and 1010 nucleotides (V2) (compare lanes 9 and 15). pMC-B is cleaved with about equal efficiency at two sites, one within the vector (compare lanes 9 and 11 in Fig. 6(a)) and one within the insert (lane 11), yielding ssDNA fragments of 1100 (B1) and 610 nucleotides (B2).
Analysis of XbaI digests of the pMC-B cleavage products (not shown) locates the MBN cleavage site 610 nucleotides 3' to the XbaI site at nucleotide position 3, or within region 1. To ascertain the relative sensitivity of sites within pMC-B to those observed in pMC-A and pMC-F, digestion assays were performed on plasmid S13X-24, which spans nucleotides 1 to 4540 (Fig. 6(a), lanes 12 and 13). In this instance, prominent cleavage products of 730 (F1) and 320 nucleotides (F2) are observed, indicating preferential cleavage at region 6. Thus, in the context of S13X-24, the sites within fragments A and B are not reactive. Systematic analysis of the cleavage patterns of supercoiled plasmids containing various fragments of the origin region sequence allows ranking of the ease with which each DNA sequence element is attacked by MBN relative to one another and to the vector control (summarized in Fig. 6(c)). The preferred order of cleavage activity at neutral pH relative to endogenous vector sites is: (A-T)23 >> (A-G) >> (A)38 > vector = (AATT)4, or 6 >> 8 >> 4 > vector = 1. The (A)14(C-A)9 motif (region 2), the G+T-rich region (region 3), the bent DNA sequence (region 5),
[Figure 6, panels (a) to (c): autoradiograms of MBN digests, lanes 1 to 15, for plasmids pMC-D, pMC-G, pMC-F, pMC-A, pMC-B, pUC13 and S13X-24.]
Figure 6. Mung bean nuclease assay for DNA unwinding at neutral pH. The indicated plasmids were incubated with MBN at 37°C for 1 h at pH 7.2. Plasmid DNA was isolated, digested with the indicated restriction endonucleases (RE), end-labeled, and the products were resolved by alkaline agarose gel electrophoresis and autoradiography. The bold letters on the left of each panel indicate restriction fragments, regardless of context, corresponding to those indicated in Fig. 5. Thus, A denotes the 1955 bp BamHI-HindIII fragment of pMC-A that is also contained in pMC-D and S13X-24; B denotes the 1535 bp XbaI-BamHI fragment of pMC-B that is also contained within S13X-24; F denotes the 1030 bp fragment of pMC-F that is also contained within pMC-D and S13X-24; and G denotes the 1630 bp fragment of pMC-G that is also contained within pMC-D. V indicates the vector fragment; V* indicates a vector band derived from pMC-D that retains the 3' 790 bp HindIII-KpnI portion of the insert after digestion with HindIII. Digestion products derived from various fragments or the vector are denoted by letters and numerals on the right of each panel. G1 and G2, for example, correspond to the products of MBN attack within fragment G in the plasmid pMC-D that are also observed in the
and the A+T-rich region containing ARS homology (region 7) were not reactive at neutral pH in any context tested.

(iv) Effect of pH on mung bean nuclease cleavage
A number of non-B form DNA conformers are stabilized by an acidic environment. In particular, the alternating sequence (C-T)n.(G-A)n has been shown to undergo a structural transition at acid pH and moderate supercoiling to a nuclease-sensitive conformation, consistent with triplex DNA formation (for review, see Wells et al., 1988). Triplex DNA formation is not limited to alternating purine/pyrimidine sequences, as (G)n and other simple homopurine/homopyrimidine repeats have also been implicated in triple helix formation (Kohwi & Kohwi-Shigematsu, 1988; Mirkin et al., 1987). It is also important to note that nuclease sensitivity is not diagnostic for triplex DNA structures, since other non-B DNA conformations are also sensitive to single-stranded nucleases (for a review, see Wells, 1988). To ascertain the effect of pH on cleavage specificities by MBN, we repeated the MBN digestion studies at pH 5.2. As shown in Figure 6(b), the cleavage pattern of pMC-D changes significantly when the pH is lowered from 7.2 to 5.2. While region 6 remains quite reactive at pH 5.2, region 8 represents the most nuclease-sensitive site at acid pH (Fig. 6(b), lane 5). In contrast to neutral pH, cleavage at acid pH within region 8 generates a single broad band of about 1300 nucleotides (G5) in pMC-G (lane 5). The (A-T)23 tract of region 6 that is the primary cleavage site at neutral pH remains reactive at acid pH in pMC-F (Fig. 6(b), lane 7). As was observed for neutral pH, the vector sequences in pMC-D, pMC-F and pMC-G are not reactive. The plasmid pMC-A, which contains regions 4 and 5, is cleaved at the same site within the insert at acid pH that is observed at neutral pH; vector sequences appear somewhat more reactive in pMC-A at acid pH (compare lane 9, Fig. 6(a), with lane 9, Fig. 6(b)). The cleavage pattern of pMC-B at acid pH is also similar to that observed at neutral pH (lanes 11, Fig. 6(a) and 6(b)), with vector sequences remaining reactive.
Thus, reducing the pH from 7.2 to 5.2 does not appear to alter cleavage specificity, but does affect the reactivity of regions 1 to 3 relative to vector in both pMC-A and pMC-B. Vector cleavage sites are not altered by an acidic environment (Fig. 6(b), lane 15). The cleavage pattern of S13X-24 at acid pH is also identical with that observed at neutral pH, with region 6 remaining the most reactive site (Fig. 6(b), lane 13). In some experiments, long exposures reveal minor products consistent with cleavage within fragment "A" at region 4 and within fragment "B" at region 2. Vector sequences are not reactive in the context of S13X-24. In a manner similar to that used for ranking the relative reactivity of each sequence element at neutral pH, we determined the order of preferred nuclease sensitivity at acid pH to be: the (A-G) cluster >> (A-T)23 >> (AATT)4 > (A)38 = vector, or 8 >> 6 >> 1 > vector = 4. In some experiments, (A)14(C-A)9 was weakly reactive. Bent DNA, the A+T-rich region containing ARS homology, and the telomeric-like TGGGT region were not reactive in any context tested.

(v) Triple-stranded DNA formation at acidic pH

Models for nuclease and chemical sensitivity of simple repeating purine/pyrimidine sequences similar to those of region 8 favor a triple-helix DNA conformation in which the strand containing (T-C)n folds back and forms Hoogsteen base-pairs with the double helix, forcing the (A-G)n strand into a single-stranded conformation (reviewed by Wells et al., 1988). In the triplex configuration, the extruded strand is available for strand-specific intermolecular hybridization with ssDNA containing (T-C)n (Htun & Dahlberg, 1988). To determine if the MBN cleavage of pMC-D and pMC-G at region 8 was the result of triplex DNA formation, an intermolecular hybridization experiment was performed. Recombinant M13 phage DNA containing either the (T-C) strand (clone 154/3) or the (A-G) strand (clone 154/4) of the 1.6 kb HindIII fragment of pMC-G was incubated overnight at 37°C with supercoiled plasmids pMC-G and pMC-D at both acidic (pH 5.2) and neutral (pH 7.2) pH.
The hybridization products were separated by agarose gel electrophoresis at pH 7.8, blotted, and probed with a labeled fragment specific to M13 in order to visualize M13-plasmid complexes. As shown in Figure 7, complex formation between M13 ssDNA and either plasmid at acid pH is limited to only those reactions containing phage with single-stranded (C-T) tracts (lanes 2 and 5). For
digestion products of pMC-G. The sizes of digestion products are estimated within ± 50 bases as follows: A1, 1240; A2, 750; B1, 1200; B2, 660; F1, 730; F2, 320; G1, 1350; G2, 360; G3, 1300; G4, 300; V1, 1585; V2, 1005. Abbreviations for restriction endonucleases are: H3, HindIII; RI, EcoRI; B, BamHI. (b) MBN cleavage patterns of origin region plasmids at acid pH. MBN activity was titrated at pH 5.2 to yield an average of one ssDNA nick per molecule, and the MBN digestion experiment presented in (a) was repeated at pH 5.2. Cleavage products were resolved as before. Lane 2 displays the products of pMC-D treated with nuclease after digestion with HindIII. Full-length restriction fragments and digestion products are denoted with letters and numbers as in (a), with the following exceptions: G5, 1300; G6, 350. (c) Summary of MBN sensitivity at neutral and acid pH. The relative reactivity of MBN-sensitive sites in each plasmid is indicated by an arrow. Reactivity at pH 7.2 is indicated above each plasmid; reactivity at pH 5.2 is indicated below each plasmid. The lower line summarizes the hierarchy of reactivity of sites 1 to 8 as discussed in the text.
[Figure 7, panels (a) and (b): gel lanes 1 to 16, with a lane key indicating pH (5.2 or 7.2) and the presence of pMC-D, pMC-G, M13-154/3 and M13-154/4.]
Figure 7. Triple-stranded DNA formation in plasmids pMC-G and pMC-D at acid pH. Supercoiled plasmid DNA was incubated with recombinant M13 phage DNA from either the poly(dT-dC) strand (154/3) or the poly(dA-dG) strand (154/4) of the 1.6 kb HindIII fragment of pMC-G at either acid or neutral pH for 24 h at 37°C. The intermolecular hybridization products were separated by agarose gel electrophoresis at 4°C. The gel was stained with ethidium bromide (a), blotted to nitrocellulose, and the blot was hybridized with an M13-specific probe (b). Complexes specific to the hybridization reactions are denoted by arrowheads. SSL, single-stranded linear phage DNA; SSC, single-stranded circular phage DNA.
pMC-G, which contains the (A-G) tracts isolated in the 1.6 kb HindIII fragment, two complexes are visualized, one migrating (relative to linear DNA) at 2.2 kb and a second at 6 kb (lane 2). For pMC-D, a single complex between phage 154/3 and the plasmid that migrates at approximately 6 kb is formed (lane 4). Examination of the ethidium bromide staining pattern of the gel shows that all the plasmid-recombinant M13 complexes migrate behind the position of the supercoiled plasmid DNA (arrows, Fig. 7(a)). At neutral pH, no hybridization between pMC-D or pMC-G and either recombinant M13 phage DNA is observed (Fig. 7(a) and (b)). Note that the probe does not cross-react with plasmid sequences (Fig. 7(b), lanes 1, 4, 9 and 12).

4. Discussion

Experimental evidence for specific origins in vertebrate chromosomes comes from numerous biochemical and physical observations, and is supported by the molecular precedents established by the study of the replication of double-stranded DNA molecules from diverse biological sources. Because origins of replication from animal cells might be expected to share common features with their
counterparts from other sources, we have analyzed 6.2 kb of DNA derived from the CHO dhfr origin region for homologies to other sequences, and for selected structural features.
The origin region sequence contains two members of the highly repetitive Alu family of interspersed repeats. Because the 30 kb initiation region contains numerous other Alu family repeats (at least 10), these elements likely have no direct role in DNA replication. In contrast to the Alu and other repeats, hybridization experiments (data not shown) indicate that the ORR-1 motif located at position 3085 to 3243 occurs only once in each of the amplified dhfr domains. The highly conserved nature of this extragenic motif (Fig. 3) indicates that the sequence may have a specific function. The selection of a site homologous to ORR-1 for adenovirus DNA integration in the cell line BHK268-C31 suggests that ORR-1 may share features with other adenovirus DNA integration sites, many of which are actively transcribed into small RNAs (Doerfler et al., 1983; Gahlmann et al., 1984). The transcriptional status of the ORR-1 motif in CHOC 400 cells is unknown. Both the ORR-1 and the bent DNA sequence are located within the 1.8 kb BamHI-HindIII fragment (nucleotides 1534 to 3490) that "in gel" renaturation experiments suggest as containing an initiation site (Leu & Hamlin, 1989).
(b) Other homologies
The cluster of homopurine/homopyrimidine tracts at the 3' end of the sequence has significant identity with the U2 RNA gene of humans and with portions of the non-transcribed spacer regions of ribosomal RNA genes from a number of vertebrate species (Table 1). In yeast, origin mapping experiments show that rDNA repeats contain initiation sites within the non-transcribed spacer region (Brewer & Fangman, 1988; Linskens & Huberman, 1988). Interestingly, the rDNA regions in yeast also contain a barrier to replication that prevents fork movement in a 3' to 5' direction through the transcription units (Brewer & Fangman, 1988; Linskens & Huberman, 1988). Since (A-G) repeats cause replication fork pausing in SV40 DNA molecules (Rao et al., 1988), we are presently utilizing two-dimensional gel techniques to determine if replication forks pause near the cluster of homopurine/homopyrimidine tracts in vivo.
A number of sequence elements in dhfr origin region DNA are reactive in the MBN assay for DNA unwinding elements. In this study, the observed hierarchy of MBN cleavage was: (A-T)23 >> the (A-G) cluster >> (A)38 >> vector = (AATT)4. The remaining origin region elements, including the A+T-rich region with ARS homology, were unreactive. Context has dramatic effects on nuclease sensitivity, suggesting that the relationship between the ability of DNA sequences to unwind in vitro and a role in DNA replication in vivo may be very complex. Though the (A-T)23 tract of region 6 was the preferred cleavage site at neutral pH in every context examined, we consider it unlikely that this sequence represents an element of the dhfr replication origin. Although (A-T) has unique properties that permit it to be readily extruded as a cruciform in supercoiled plasmids (Greaves et al., 1985), it is also reactive with nucleases in linear DNA (McClellan et al., 1986). It would be interesting to test the function of this sequence as a DNA unwinding element in yeast ARS elements to determine if facile unwinding, or some other property of the A+T-rich regions associated with origins, is critical for initiation activity. The hypersensitivity of pMC-A at neutral pH in region 4, which contains (A)38, was unexpected. Although theoretical considerations predict that poly(A) should display non-B form conformations under superhelical stress (reviewed by Wells et al., 1988), previous studies indicate that (A)n is not
reactive with ssDNA nucleases in supercoiled plasmids (Hanvey et al., 1988a). The basis for this discrepancy is unknown. The weak cleavage at region 1 within pMC-B is reminiscent of the reactivity of an A+T-rich region induced by neighboring G-rich homopurine sequences in the 3' end of the rat L1 (long interspersed repeat DNA) element (Usdin & Furano, 1988). Our results also show that sensitivity to MBN at neutral pH is not limited to A+T-rich sequences. Digestion at the (A-G) cluster in pMC-G, for example, likely reflects cleavage due to other non-B form conformers. Preliminary primer extension experiments that have mapped the MBN cleavage sites to nucleotide resolution in region 8 suggest that cleavage occurs within the sequence element itself, as well as at its junction with flanking sequences (M. Caddle & N. H. Heintz, unpublished results). Fine structure mapping with additional enzymatic and chemical probes will be required for resolving the precise basis for the nuclease sensitivity of this and other sequence elements in the dhfr origin region.

(d) Different sequences promote unwinding and bending
The relationship of sequences that promote DNA bending and DNA unwinding remains unclear. The two-dimensional gel assay at neutral pH shows that the (A-G) tracts, (A)38, and (A-T)23, all nuclease-sensitive at neutral pH, do not induce markedly abnormal migration of small fragments in polyacrylamide gels. Rather, anomalous migration is limited to the 280 bp HaeIII fragment located at nucleotides 3342 to 3622 (Fig. 4), which is not a detectable cleavage site for MBN in any context yet tested. Thus, DNA bending is not equivalent to unwinding, nor is nuclease sensitivity predictive of sequences that induce anomalous migration of linear fragments on polyacrylamide gels. We suspect that the 270 kb of DNA that comprises the amplified dhfr domain contains numerous elements that demonstrate varying propensities for melting or bending in different contexts and under particular experimental environments. Elements that promote localized unwinding or bending may be meaningful in initiation of DNA replication only when such sequences are adjacent to or interact with initiator binding sites.

(e) Nuclease sensitivity and triplex DNA at acid pH
Incubation of the panel of plasmids used in the DNA unwinding assays with MBN at acid pH shows that protonation may alter cleavage preferences and specificity (Fig. 6). In particular, the weak reactivity of region 8 at neutral pH is markedly enhanced at pH 5.2. As shown in Figure 7, intermolecular hybridization studies with recombinant, single-stranded M13 phage DNA and the supercoiled plasmids pMC-D and pMC-G indicate that the DNA strand containing (A-G) is extruded at acid pH in a
form available for hybridization to ssDNA containing (T-C). Modeling studies suggest that triplex DNA introduces dramatic kinks in duplex DNA (reviewed by Wells, 1988). Since at least two triplex conformations are available to each (A-G) tract, the conformations available to the (A-G) cluster in the dhfr origin region may be quite complex. The observation that two complexes are visualized for the reactions containing pMC-G, while a single complex is visualized for reactions containing pMC-D (Fig. 7(b)), suggests that a significant portion of supercoiled pMC-G molecules may contain two regions of triple-stranded DNA. Ongoing structural studies support this interpretation (M. Caddle & N. H. Heintz, unpublished data). Since the intermolecular complexes formed at acid pH are stable during electrophoresis at pH 7.8, it should be possible to study these hybrids, once formed at acid pH, under a variety of physiological conditions.
Origins of replication are composed of multiple elements that function in specific stages of the multi-step initiation process. The results presented here show that a relatively small region within the earliest replicating portion of the amplified dhfr domain contains a novel conserved repetitive element and a number of sequence motifs that have unusual structural properties under various contextual, torsional and environmental influences. While we recognize that formation of non-B form structures in vitro requires non-physiological environments, we suspect that cellular proteins, transcription or replication may exert regional influences that promote the formation of particular structural conformations in vivo. Such conformations need not persist in vivo, but may be transient in nature, occurring only during defined stages of the cell cycle in response to various influences. Documentation of the structures available to the dhfr origin sequences in vitro, and demonstration that these or other conformations occur in vivo, is required before a role for particular conformations in initiation of DNA replication can be delineated.

Note added in proof: Recent origin mapping experiments confirm that the bidirectional origin of replication for the CHO dhfr gene is located within the region studied in this paper (Handeli, S., Klar, A., Meuth, M. & Cedar, H. (1989). Cell, 57, 909-920).

We
thank Jane Selegue for her early help with the DNA sequencing, Laurie Rabens and Judy Kessler for assistance with the manuscript, Ben Van Houten and Nat Heintz for critical comments, and Susan Russo and Temple Smith at MBCRR for assistance with the sequence analysis. This work was supported by grant GM32859 from the NIH.
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
McClellan, J. A., Palecek, E. & Lilley, D. M. J. (1986). Nucl. Acids Res. 14, 9291-9309.
Messing, J. (1983). Methods Enzymol. 101, 20-78.
Milbrandt, J. D., Heintz, N. H., White, W. C., Rothman, S. M. & Hamlin, J. L. (1981). Proc. Nat. Acad. Sci., U.S.A. 78, 6043-6047.
Mirkin, S. M., Lyamichev, V. I., Drushlyak, K. N., Dobrynin, V. N., Filippov, S. A. & Frank-Kamenetskii, M. D. (1987). Nature (London), 330, 495-497.
Montoya-Zavala, M. & Hamlin, J. L. (1985). Mol. Cell. Biol. 5, 619-627.
Palzkill, T. G., Oliver, S. G. & Newlon, C. S. (1986). Nucl. Acids Res. 14, 6247-6263.
Poncz, M., Solowiejczyk, D., Ballantine, M., Schwartz, E. & Surrey, S. (1982). Proc. Nat. Acad. Sci., U.S.A. 79, 4298-4302.
Ramstein, J. & Lavery, R. (1988). Proc. Nat. Acad. Sci., U.S.A. 85, 7231-7235.
Rao, B. S., Manor, H. & Martin, R. G. (1988). Nucl. Acids Res. 16, 8077-8094.
Rich, A., Nordheim, A. & Wang, A. H.-J. (1984). Annu. Rev. Biochem. 53, 791-846.
Rosenfeld, P. J., O'Neill, E. A., Wides, R. J. & Kelly, T. J. (1987). Mol. Cell. Biol. 7, 875-886.
Sanger, F., Nicklen, S. & Coulson, A. R. (1977). Proc. Nat. Acad. Sci., U.S.A. 74, 5463-5467.
Simpson, R. T. & Kunzler, P. (1979). Nucl. Acids Res. 6, 1387-1415.
Snyder, M., Buchman, A. R. & Davis, R. W. (1986). Nature (London), 324, 87-89.
Umek, R. M. & Kowalski, D. (1987). Nucl. Acids Res. 15, 4467-4480.
Umek, R. M. & Kowalski, D. (1988). Cell, 52, 559-567.
Umek, R. M., Eddy, M. J. & Kowalski, D. (1988a). Cancer Cells, 6, 473-478.
Umek, R. M., Linskens, M. H. K., Kowalski, D. & Huberman, J. A. (1988b). Biochim. Biophys. Acta, 1007, 1-14.
Usdin, K. & Furano, A. V. (1988). Proc. Nat. Acad. Sci., U.S.A. 85, 4416-4420.
Wells, R. D. (1988). J. Biol. Chem. 263, 1095-1098.
Wells, R. D., Collier, D. A., Hanvey, J. C., Shimizu, M. & Wohlrab, F. (1988). FASEB J. 2, 2939-2949.
Westin, G., Visser, L., Zabielski, J., van Mansfeld, A. D. M., Pettersson, U. & Rozijn, Th. H. (1982). Gene, 17, 263-270.
Williams, J. S., Eckdahl, T. T. & Anderson, J. N. (1988). Mol. Cell. Biol. 8, 2763-2769.
Zahn, K. & Blattner, F. R. (1987). Science, 236, 416-422.

Edited by R. A. Laskey