J. Mol. Biol. (1990) 211, 19-33

Intramolecular DNA Triplexes, Bent DNA and DNA Unwinding Elements in the Initiation Region of an Amplified Dihydrofolate Reductase Replicon

Mark S. Caddle, Richard H. Lussier and Nicholas H. Heintz†

Department of Pathology, University of Vermont College of Medicine, Burlington, VT 05405, U.S.A.

(Received 9 May 1989, and in revised form 16 August 1989)

The nucleotide sequence of 6.2 kb (1 kb = 10^3 base-pairs) of DNA that encompasses the earliest replicating portion of the amplified dihydrofolate reductase domains of CHOC 400 cells has been determined. Origin region DNA contains two AluI family repeats, a novel repetitive element (termed ORR-1), a TGGGT-rich region, and several homopurine/homopyrimidine and alternating purine/pyrimidine tracts, including an unusual cluster of simple repeating sequences composed of (G-C)5, (A-C)18, (A-G)21, (G)9, (CAGA)4, GAGGGAGAGAGGCAGAGAGGG, (A-G)27. Recombinant plasmids containing origin region sequences were examined for DNA structural conformations previously implicated in origin activation. Mung bean nuclease sensitivity assays for DNA unwinding elements show the preferred order of nuclease cleavage at neutral pH in supercoiled origin plasmids to be: (A-T)23 >> the (A-G) cluster >> (A)38 >> vector = (AATT)4. At acid pH, the hierarchy of cleavage preferences changes to: the (A-G) cluster >> (A-T)23 >> (AATT)4 > vector = (A)38. A region of stably bent DNA was identified and shown not to be reactive in the mung bean nuclease unwinding assay at either acid or neutral pH. Intermolecular hybridization studies show that, in the presence of torsional stress at pH 5.2, the (A-G) cluster forms triple-stranded DNA. These results show that the origin region of an amplified chromosomal replicon contains a novel repetitive element and multiple sequence elements that facilitate DNA bending, DNA unwinding and the formation of intramolecular triple-stranded DNA.

1. Introduction

The methotrexate-resistant Chinese hamster cell strain, CHOC 400, contains 1000 copies of an early-replicating sequence that includes the gene for dihydrofolate reductase (dhfr‡; Milbrandt et al., 1981). The amplified dhfr domains, each approximately 270 kb in length (Montoya-Zavala & Hamlin, 1985; Looney & Hamlin, 1987), are situated in tandem arrays (Looney et al., 1988) in several homogeneously staining chromosomal regions (HSRs) that begin replication immediately upon entry into S phase (Hamlin & Biedler, 1981; Milbrandt et al., 1981). Pulse-labeling studies in synchronized cells indicate that replication of the amplified domains commences within a series of restriction fragments, termed early-labeled fragments, or ELFs, located 3' to the dhfr gene (Heintz & Hamlin, 1982; Heintz et al., 1983; Heintz & Stillman, 1988). Hybridization of replication intermediates formed during the onset of S phase to cloned ELFs indicates that DNA synthesis begins within a 4.3 kb XbaI fragment located approximately 14 kb 3' to the last exon of the dhfr gene (Burhans et al., 1986a,b). "In gel" renaturation analysis of the labeling pattern of amplified restriction fragments suggests that an initiation site is located within a 1.8 kb BamHI-HindIII subfragment of the 4.3 kb XbaI fragment (Leu & Hamlin, 1989). This site is enriched for repetitive sequences contained within an "origin-specific" DNA fraction (Anachkova & Hamlin, 1989). Nearby sequences have been reported to function as an autonomously replicating sequence (ARS) element in yeast (Hamlin et al., 1988). A map of the amplified domain of CHOC 400 cells that locates these and other biological landmarks within the ELF-F/ELF-F' doublet relative to the dhfr gene is provided in Figure 1.

The molecular precedents established by the study of prokaryotic, viral, and yeast origins have permitted construction of generalized models for initiation of DNA synthesis (Bramhill & Kornberg, 1988; Umek et al., 1988). These models postulate that binding of the initiator protein assists in the localized melting of a nearby A+T-rich region that may have special structural properties. Umek et al. postulate that origin-associated A+T-rich regions, or DNA unwinding elements, represent a fundamental thermodynamic property of replication origins, and suggest that the sensitivity of these sequences to mung bean nuclease (MBN) cleavage at neutral pH in supercoiled molecules directly reflects the free energy required for DNA unwinding (Umek et al., 1988b). Indeed, the MBN assay for DNA unwinding indicates that the A+T-rich origin sequences of bacteriophage PM2 and the yeast 2 µm plasmid are more readily unwound than all other sequences in their genomes (Kowalski, 1984; Umek & Kowalski, 1987). Deletion analysis indicates that facile DNA unwinding is an essential feature of several origins, including the H4 ARS (Umek & Kowalski, 1988) and oriC (Bramhill & Kornberg, 1988a). Another structural feature, stably bent DNA, has been found near a variety of origins, including those of bacteriophage lambda (Zahn & Blattner, 1987), the ARS-1 element from yeast (Snyder et al., 1986), and simian virus 40 (SV40) (Deb et al., 1986). Deletion mutagenesis and sequence substitution experiments suggest that DNA bending may be important for activity of the SV40 (Deb et al., 1986) and ARS-1 (Williams et al., 1988) origins. DNA bending, which may be enhanced by the binding of specific proteins (Zahn & Blattner, 1987), has been shown to promote disruption of the DNA helix (Ramstein & Lavery, 1988).

Lack of a definitive assay for vertebrate origin function has hindered precise delineation of the DNA sequences that comprise the dhfr origin of replication. In the absence of such an assay, we determined the nucleotide sequence of 6.2 kb of DNA from the ELF region of the amplified dhfr domain, and then analyzed recombinant plasmids containing these DNA sequences for structural properties that may be related to origin structure or function. Here we report that the dhfr origin region contains five types of unusual sequences: (1) a novel repetitive DNA sequence; (2) stably bent DNA; (3) homologies to yeast chromosomal origins; (4) elements that promote DNA unwinding; and (5) a cluster of simple alternating repeats that forms triple-stranded DNA when subjected to torsional stress at acid pH.

† Author to whom all correspondence should be addressed.
‡ Abbreviations used: dhfr, dihydrofolate reductase; kb, 10^3 bases or base-pairs; bp, base-pair(s); HSR, homogeneously staining chromosomal region; ELF, early-labeled fragment; ARS, autonomously replicating sequence; MBN, mung bean nuclease; ssDNA, single-stranded DNA.

© 1990 Academic Press Limited

2. Materials and Methods

(a) DNA sequencing

Origin region DNA was sequenced by the dideoxy method (Sanger et al., 1977). To sequence the 4.3 kb XbaI fragment from recombinant cosmid S13 (Burhans et al., 1986b), the fragment was subcloned in pUC12, yielding plasmid S13X-24. S13X-24 was linearized at the unique pUC12 PstI cloning site 5' to the insert DNA, and a graded series of Bal31 deletions was prepared and subcloned in the M13 vector, mp10, as described (Poncz et al., 1982). Phage were propagated in the Escherichia coli strain JM101, and 1 to 2 µg of phage ssDNA was prepared and sequenced with Klenow fragment and the universal 17-mer primer, as described (Messing, 1983). [32P]dATP was used as the labeled precursor. The 6% to 10% (w/v) polyacrylamide sequencing gels were prepared as described (Maniatis et al., 1982). To fill gaps missing in the collection of Bal31 deletions, subclones of selected restriction fragments were constructed in M13 or pTZ vectors (U.S. Biochemicals), ssDNA was prepared, and each template was sequenced as described. Sequence ambiguities were also resolved by using specific oligonucleotide primers made on an ABI model 381A DNA synthesizer. The 1.6 kb HindIII fragment that overlaps S13X-24 was subcloned into M13mp18 in both orientations, ssDNA was prepared, and the sequence determined by the dideoxy method using custom oligonucleotide primers synthesized as necessary. Each segment of the 6.2 kb region was sequenced a minimum of 4 times in separate experiments. Sequence information was assembled with a package of programs from the Molecular Biology Computer Research Resource (MBCRR) at Harvard University, including the Intelligenetics program "Gel", or the Gel Assembly programs of the Genetics Computer Group, University of Wisconsin (Devereux et al., 1984). Sequence analysis was performed with the following programs: DASHER, LOCAL, and ALIGN from the MBCRR; the programs Wordsearch, Compare, StemLoop, and Repeat from the Genetics Computer Group at the University of Wisconsin; and the matrix homology programs of Pustell (IBI) and DNASIS (LKB). Unless otherwise indicated, program default parameters were used in sequence analyses.

The dhfr origin region was surveyed for anomalously migrating fragments by a 2-dimensional gel technique (Anderson, 1986). HaeIII digests of plasmid subclones were separated on a [...]% agarose gel in TAE buffer (40 mM-Tris-acetate (pH 7.8), 1 mM-EDTA) at room temperature. The agarose gel lane was excised, reoriented 90° relative to the first dimension, and cast in a 5% polyacrylamide gel. The 2nd dimension gel was run at 4°C to exaggerate the effects of DNA bending on fragment mobility.

Mung bean nuclease (MBN) sensitivity assays were performed essentially as described (Umek & Kowalski, 1987, 1988; Umek et al., 1988a,b). Plasmid DNA was prepared by equilibrium sedimentation in cesium chloride gradients containing ethidium bromide (Maniatis et al., 1982), and used at native levels of supercoiling. Plasmid DNA (1 to 1.5 µg) was preincubated for 1.5 min at 37°C in 10 mM-Tris·HCl (pH 7.0 to 7.2), prior to the addition of MBN (BRL). The activity of each MBN preparation was titrated under reaction conditions at both neutral and acid pH to yield approximately one ssDNA nick per supercoiled molecule. The reactions ([...] µl) were stopped after an additional 60 min incubation at 37°C by the addition of an equal volume of phenol/chloroform (24:1, v/v). The DNA was isolated, digested with the indicated restriction enzymes, and the products were end-labeled with [α-32P]dATP and Klenow fragment (Maniatis et al., 1982). The products of the end-labeling reactions (30,000 cts/min per lane) were denatured for 2 min at 98°C, adjusted to 30 mM-NaOH, 1 mM-EDTA, 0.05% (w/v) [...], [...]% (v/v) glycerol, and subjected to electrophoresis in [...]% (w/v) alkaline agarose gels in 30 mM-NaOH, 1 mM-EDTA. The gels were neutralized by three 20 min washes in 40 mM-Tris-acetate (pH 7.9), 1 mM-EDTA, dried, and exposed to X-ray film for 3 to 36 h at -70°C. Cleavage sites were mapped by digesting the nuclease reaction products with selected combinations of restriction endonucleases. In some instances, Southern blot hybridization with selected probes was required to resolve ambiguities (data not shown). The lengths of ssDNA fragments were determined from short exposures and were judged to be accurate within ±50 bp.

Intermolecular hybridizations were performed as described (Htun & Dahlberg, 1988). The 20 µl reactions containing 0.2 µg of supercoiled plasmid DNA and 0.1 µg of the indicated recombinant M13 phage DNA were incubated in 200 mM-sodium acetate (pH 5.2 or 7.2) for 24 h at 37°C. The hybridization products were separated by electrophoresis at 4°C in 1% agarose gels in 10 µg ethidium bromide/ml, 40 mM-Tris-acetate (pH 7.8), 1 mM-EDTA. The gels were photographed, transferred to Zetaprobe (Biorad), hybridized with the 2.9 kb ClaI fragment from M13mp18 replicative form DNA, washed, and exposed to X-ray film as described previously.

3. Results

(a) Primary DNA sequence of dhfr origin region DNA

(i) Biological landmarks

The dhfr origin region has been identified by a variety of replication studies (reviewed in Introduction). In Figure 1(a), the locations of previously reported biological landmarks are collated relative to the dhfr gene. Outlined in Figure 1(b) is the strategy used to determine the primary nucleotide sequence of this region from the 5' XbaI site of the 4.3 kb XbaI fragment of S13X-24 to the 3' end of the overlapping 1.63 kb HindIII fragment of ELF-F. The sequence is 6157 nucleotides in length (Fig. 2); the EcoRI site that separates ELF-F' from

[Figure 1. Panel (a): restriction map (scale in kb; exons and introns of the dhfr gene marked) with ELF-F', ELF-F and ELF-C, listing the following activities with their references: early replication in synchronized cells and immediate replication in G1/S nuclei in vitro; hybridization of replication intermediates from G1/S cells or nuclei; initiation site by 'in gel' renaturation analysis; hybridization of 'origin-specific' DNA fraction; micrococcal nuclease-sensitive site in chromatin; ARS activity in yeast. Panel (b): region sequenced and sequencing strategy.]

Figure 1. Biological landmarks in the initiation region of the amplified dhfr domain of CHOC 400 cells. (a) A restriction map of the 3' early-replicating region that locates the early-labeled fragments (ELF) F, F', and C relative to the dhfr gene is presented. Selected restriction fragments that demonstrate the indicated activities within the EcoRI ELF-F/ELF-F' doublet are referenced. (b) Sequencing strategy for a portion of the initiation region. Arrows indicate the orientation and distance of individual sequencing reactions. X, XbaI; R, EcoRI; H, HindIII; P, PstI.


[Figure 2 sequence listing: 6157 nucleotides numbered in rows of 100, with the XbaI, BamHI, HindIII and EcoRI sites, the type I and type II AluI repeats, the oligo(dA) tracts B1 to B5, the ARS homologies, the ELF-F'/ELF-F boundary and the boxed elements marked.]

Figure 2. Primary nucleotide sequence of dhfr origin DNA. The sequence of the dhfr origin region from the 5' XbaI site of the 4.3 kb XbaI fragment of S13X-24 to the 3' HindIII site of pMC-G (see Figs 1 and 6) is presented. Boxed elements numbered 1 to 8 designate sequence motifs reported to adopt non-B form structural configurations (see the text). The locations of ORR-1 and AluI family repeats are indicated. B1 to B5 indicate the 5 tracts of oligo(dA) that are phased with a 10 bp periodicity in the 280 bp HaeIII bent DNA fragment located at nucleotides 3342 to 3622. Bold lines indicate homologies to either the consensus sequence of the ARS core (nucleotides 4776 to 4786 and 4782 to 4792) or the ARS 3' conserved element (nucleotides 4709 to 4719 and 4761 to 4771). The EcoRI site that separates ELF-F' from ELF-F is located at nucleotide 4280.

ELF-F is located at position 4280. Overall, the sequence is 59% A+T. Sequencing shows that the XbaI fragment of S13X-24, estimated as 4.3 kb, is actually 4540 bp long.

(ii) Repetitive elements

In an earlier study, type I and II Alu family probes failed to hybridize to Southern blots of the 4.3 kb XbaI fragment (Burhans et al., 1986b). However, sequence homology searches revealed the presence of two AluI family repeats located at nucleotides 2248 to 2433 and nucleotides 3536 to 3683. These repeats were not identified by hybridization because each differs from the Alu type I and II repeats present in the probe sequence by greater than 30% (data not shown). The type II AluI family repeat located at nucleotides 2248 to 2433 contains a long poly(A) tail. Homology searches also revealed a highly conserved motif located at positions 3085 to 3242 that we have termed origin-region repeat 1, or ORR-1 (see Fig. 2). ORR-1 is homologous to extragenic sequences from the mouse MHC class I H2-K gene promoter region (Kimura et al., 1986), the 3'-flanking region of the mouse immunoglobulin heavy chain gene (Cheng et al., 1982), and the 5'-flanking region of the rat metallothionein-1 pseudogene b (Anderson et al., 1986). The location and orientation of ORR-1 is compared to those of the mouse and rat sequences in Figure 3(a). Comparison of the four sequences with one another (Fig. 3(b)) shows that the ORR-1 motif contains two regions of strong identity; nucleotides 32 to 71 contain 26 of 39 residues common to all four sequences, while nucleotides 91 to 115 contain 20 of 25 bases common to all four sequences. Independent searches of Genbank with these two conserved regions revealed an additional match to an adenovirus type 5 (ad-5) DNA integration site. Alignment of ORR-1 with the right-hand ad-5 integration site of the cell line BHK268-C31 (Westin et al., 1982) shows that the two sequences are nearly identical (Fig. 3(c)). The right-hand integration junction occurs at nucleotide 5083 of the ad-5 genome, which is joined to nucleotide 106 of the ORR-1 motif. Comparison of the adenovirus sequences upstream from the viral-genomic DNA junction reveals no significant sequence homology between the pre-integration site and the ad-5 genome. A portion of the ORR-1 repeat is homologous to the consensus binding site for a nuclear protein that binds to the adenovirus origin of replication (ORP-A; Rosenfeld et al., 1987).

[Figure 3. Panel (a): maps of the Chinese hamster dhfr origin region and of the musigcd17, musmhkba and ratmtlpb loci showing the position of ORR-1 (scale bar 1 kb). Panel (b): alignment of Orr1, Musigcd17, Musmhkba and Ratmtlpb with the consensus. Panel (c): alignment of the ad-5 integration site in BHK268-C31 cells with ORR-1, with the ORP-A homology marked.]

Figure 3. (a) Locations and orientations of ORR-1. The orientation (small arrow) and location of ORR-1 relative to the direction of transcription (large arrow) for each locus are indicated. The following Genbank designations are used to identify each sequence: musigcd17, mouse IgD 3' region; musmhkba, mouse MHC class I H2-K gene promoter region; ratmtlpb, 5'-flanking region of the rat metallothionein-1 pseudogene b. Hatched regions indicate exons, except for the pseudogene, where the hatched region indicates the sequences homologous to the processed mRNA. (b) Sequence conservation of ORR-1. Gaps introduced into each sequence are indicated by dots; nucleotides conserved in all 4 sequences are boxed. (c) Integration of adenovirus type 5 (ad-5) DNA into a sequence homologous to ORR-1. The sequence encompassing the integration site of ad-5 DNA in the genome of BHK268-C31 cells is compared to ORR-1. Identical nucleotides are indicated by an asterisk. The ORR-1 homology to the adenovirus origin region protein A (ORP-A) binding site is underlined.

(iii) Short direct repeats and palindromes

Since a variety of viral and prokaryotic origins contain reiterated binding sites for initiation factors, the origin region sequence was analyzed for short direct repeats. Although the sequence contains at least 70 perfect direct repeats of 8 to 12 bp, the repeats are not reiterated in clusters, but rather are distributed in an apparently random pattern throughout the sequence. Similarly, analysis of the origin region sequence with the program StemLoop (program parameters were 80% match in stems 10 to 50 bp; loops 10 to 300 bp) reveals only imperfect stem-loop matches of various lengths (not shown).

(iv) Homo- and heteropolymeric repeats

The origin region sequence contains several types of simple homo- or heteropolymeric repeats. Positions 1023 to 1053 contain (A)14 followed by (C-A)9. This is followed at positions 1185 to 1337 by 142 nucleotides of G+T-rich sequence in which the telomeric-like motif TGGGT (Henderson et al., 1987) is repeated 14 times. Fifty-seven of 61 nucleotides on one strand from position 2248 to 2309 are A residues derived from the poly(A) tail associated with an Alu repetitive element. A tract of (A-T)23 is located at positions 3766 to 3811. An unusual array of homopurine/homopyrimidine and alternating purine/pyrimidine sequences is located at positions 5734 to 5921 at the 3' end of the sequence. This array is characterized by two tracts of alternating (A-G), one of 42 nucleotides and a second of 54 nucleotides. These tracts are interrupted by (G)9 and (CAGA)4, followed by GAGGGAGAGAGGC. The entire array is preceded by (G-C)5 and (A-C)18. Strand bias in this region results in over 180 bp of sequence that lacks T on one strand. Homology searches show that this array has a significant percentage of identity with the human U2 RNA gene and several non-transcribed spacer regions of the ribosomal DNA clusters from mammals (summarized in Table 1).

(v) Homology to ARS elements

The 5' half of the 1.6 kb HindIII fragment that comprises nucleotides 4524 to 6157 of the origin region sequence has ARS activity in yeast (Hamlin et al., 1988). ARS elements are characterized by two consensus sequences: a core sequence of (A/T)TTTAT(A/G)TTT(A/T) that is required for ARS function (Kearsey, 1984) and a 3' consensus sequence of (A/T)(A/T)(A/T)GCTAAAAG (Palzkill et al., 1986). The dhfr region reported to have ARS activity has one 83 bp segment (nucleotides 4709 to 4792) that contains two overlapping 10 of 11 bp matches to the core sequence, and two 9 of 11 bp matches to the more heterogeneous 3' consensus sequence (designated by bold lines, Fig. 2). These sequences are embedded in a block of approximately 450 bp (nucleotides 4615 to 5069) that is 76% A+T.
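
The consensus comparison described above can be illustrated with a short sketch. It is not the program used in this study (the searches were run with the packages listed in Materials and Methods). It assumes that a match is scored position by position against the consensus on both strands, with W standing for A or T and R for A or G, and the names `ARS_CORE`, `ARS_3PRIME`, `scan` and `at_fraction` are hypothetical.

```python
# Minimal sketch: score 11 bp windows against a degenerate consensus.
# All names here are illustrative; this is not the software used in the paper.

# Degenerate codes needed for the two consensus sequences quoted in the text.
CODES = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "AT", "R": "AG"}

ARS_CORE = "WTTTATRTTTW"    # (A/T)TTTAT(A/G)TTT(A/T)
ARS_3PRIME = "WWWGCTAAAAG"  # (A/T)(A/T)(A/T)GCTAAAAG

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement; any base other than A, C, G, T becomes N."""
    return "".join(COMPLEMENT.get(base, "N") for base in reversed(seq.upper()))


def count_matches(window, consensus):
    """Number of positions at which the window agrees with the consensus."""
    return sum(base in CODES[code] for base, code in zip(window, consensus))


def scan(seq, consensus, min_matches):
    """Return sorted (start, strand, matches) for windows on either strand.

    start is the 1-based coordinate, on the given (top) strand, of the
    leftmost nucleotide covered by the window.
    """
    seq = seq.upper()
    width = len(consensus)
    hits = []
    for strand, text in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(text) - width + 1):
            matches = count_matches(text[i:i + width], consensus)
            if matches >= min_matches:
                if strand == "+":
                    start = i + 1
                else:
                    # Map the offset on the reverse strand back to the top strand.
                    start = len(text) - i - width + 1
                hits.append((start, strand, matches))
    return sorted(hits)


def at_fraction(seq):
    """Fraction of A and T residues in a non-empty sequence."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)
```

With the sequence of Figure 2 held in a string `seq` (an assumption; it is not reproduced here), `scan(seq, ARS_CORE, 10)` would list candidate 10 of 11 matches to the core, and `at_fraction(seq[4614:5069])` would give the A+T content of nucleotides 4615 to 5069.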

Table 1
Homologies of the poly(dA-dG) cluster with other sequences

Genbank designation   Sequence location                                          % Identity   Length (bp)
HUMUG20               Human U2 small nuclear RNA gene                            55.7         379
HUMRGNTSA             Human non-transcribed ribosomal spacer (t-2 class)         63.5         296
HUMRGNTSC             Human non-transcribed ribosomal spacer (t-1 class)         51.9         343
HUMRGNTSB             Human non-transcribed ribosomal spacer (t-0 class)         56.1         264
RATRRNAOB             Rat spacer downstream from 28S rRNA gene                   57.0         365
RATNTSI               Rat non-transcribed spacer downstream from 28S rRNA gene   54.6         315

The program Wordsearch was used to scan Genbank for sequences homologous to the poly(dA-dG) cluster of pMC-G. The length and percentage of identical residues for each sequence are presented.
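
As a reading aid for the two numerical columns of Table 1, the sketch below computes an aligned length and a percentage of identical residues for one pair of gapped sequences. How Wordsearch treats gap positions is not stated here, so counting a gap opposite a base as a compared, non-identical position is an assumption, and the function name and the example strings are illustrative only.

```python
# Minimal sketch of a length / percent-identity calculation for one alignment.
# The gap handling is an assumption, not a description of Wordsearch.

def percent_identity(aligned_a, aligned_b, gap_chars="-."):
    """Return (compared_length, percent_identical) for two aligned strings.

    Columns where both sequences carry a gap are ignored. A gap opposite
    a base counts as a compared position that is not identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a in gap_chars and b in gap_chars:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return compared, 100.0 * identical / compared


# Toy example: 16 aligned columns, one mismatch and one gap.
length, identity = percent_identity("GACAGTAGTTATCAGA", "GACAGAAGTTAT-AGA")
```

A different convention (for example, excluding gapped columns from the length) would change both values, so figures computed this way need not reproduce those in Table 1.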

[Figure 4: panels (a) and (b), two-dimensional gel images.]

Figure 4. Two-dimensional gel analysis of dhfr origin region DNA. HaeIII digests of S13X-24 (a) and pMC-G (b) were resolved in neutral 2-dimensional agarose/polyacrylamide gels at 4°C as described in Materials and Methods. The arrowhead indicates the 280 bp fragment located at nucleotides 3342 to 3622 of the sequence presented in Fig. 2. The position of size markers in the 1st dimension gel is presented in bp.

(b) Structural motifs in the origin region sequence

(i) Bent DNA

Stably bent DNA is associated with a number of viral, bacterial and yeast origins of replication (see Introduction). To determine if the dhfr origin region contains bent DNA sequences, a two-dimensional gel technique was used to survey the plasmids S13X-24 and pMC-G (see Fig. 5 for plasmid designations) for small restriction fragments that migrate anomalously on polyacrylamide gels (Anderson, 1986). As described by Anderson, fragments of random composition form a smooth arc in the second dimension, whereas those with unusual structures migrate to positions discordant with their size. An HaeIII digest of S13X-24 reveals two fragments that migrate anomalously in the two-dimensional gel assay (Fig. 4(a)). The first, which migrates 10% faster than expected in the second dimension, is a 434 bp fragment derived from the cloning vector. This fragment is 58% G+C; the basis for its increased mobility is unknown. The second fragment migrates with an apparent size of 290 (±10) bp in the first dimension agarose gel, and with an apparent size of 345 (±10) bp in the second dimension polyacrylamide gel (arrow, Fig. 4(a)). This fragment was cloned from the second dimension gel and sequenced; it is 280 bp in length, comprises nucleotides 3342 to 3622, and contains five tracts of oligo(dA) spaced 10 bp apart (labeled B1 to B5 in Fig. 2). Elsewhere we will describe cyclic permutation assays and temperature-dependent alterations in the electrophoretic properties of this fragment that confirm that the 280 bp HaeIII fragment contains stably bent DNA (M. Caddle & N. H. Heintz, unpublished results). At neutral pH, the 970 bp HaeIII fragment of pMC-G that contains the cluster of (A-G) tracts does not migrate anomalously in the two-dimensional gel assay (Fig. 4(b)). Nor is altered migration observed for the 251 bp HaeIII fragment that contains (A-T)23 or the 910 bp fragment that contains the poly(A) stretch located from nucleotides 2248 to 2309. The A+T-rich region of pMC-G with reported ARS activity also shows normal migration in the two-dimensional gel assay.
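
The phasing of oligo(dA) tracts noted above can be checked on any sequence with a few lines of code. The sketch below is illustrative only: the minimum tract length of three A residues, the use of centre-to-centre distances, the 1 bp tolerance and the function names are all assumptions, and the example sequence is a toy string, not the HaeIII fragment.

```python
import re

# Minimal sketch: locate runs of A and test whether they recur with a
# helical-repeat spacing. Thresholds and names are assumptions.

def a_tracts(seq, min_len=3):
    """Return (start, end) pairs, 1-based inclusive, for runs of >= min_len A."""
    pattern = re.compile("A{%d,}" % min_len)
    return [(m.start() + 1, m.end()) for m in pattern.finditer(seq.upper())]


def tract_spacings(tracts):
    """Centre-to-centre distances between consecutive tracts, in bp."""
    centres = [(start + end) / 2 for start, end in tracts]
    return [b - a for a, b in zip(centres, centres[1:])]


def is_phased(tracts, period=10.0, tolerance=1.0):
    """True if there are at least two tracts and every spacing is near period."""
    spacings = tract_spacings(tracts)
    return bool(spacings) and all(abs(s - period) <= tolerance for s in spacings)


# Toy sequence: five A5 tracts separated by five other bases.
toy = "AAAAACGCGC" * 5
tracts = a_tracts(toy)
```

For the toy sequence the five tracts are 10 bp apart and `is_phased(tracts)` is true; tracts on the complementary strand (runs of T) are not considered in this sketch.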

(ii) Nuclease sensitivity of origin region DNA

The dhfr origin sequence contains a number of candidate elements that are reported to adopt non-B form DNA structures. To simplify discussion of the following structural studies, these motifs have been numbered 1 to 8 in the order in which they appear 5' to 3' in the dhfr origin region sequence (see Figs 2 and 5). Region 1, located at nucleotides 551 to 585, consists of 33 contiguous A or T residues in which the motif (AATT) is repeated four times. This A+T-rich region abuts a G-rich region from nucleotides 500 to 550. Region 2, at positions 1022 to 1053, is composed of (A)14(A-C)9, a sequence reported to form Z-DNA under torsional stress (Rich et al., 1984). Region 3, at nucleotides 1185 to 1337, contains 14 copies of the telomeric-like repeat TGGGT (Henderson et al., 1987). Region 4, located at nucleotides 2264 to 2302, consists of 38 contiguous A residues, a sequence of enhanced flexibility (Hogan et al., 1983) that is difficult to condense into nucleosomes (Simpson & Kunzler, 1979). Region 5, the bent DNA sequence, is located at nucleotides 3410 to 3460; bent DNA has been shown to promote disruption of duplex DNA (Koo et al., 1986; Ramstein & Lavery, 1988). Region 6, the (A-T)23 tract located at nucleotides 3766 to 3811, should be readily extruded as a cruciform in supercoiled plasmids (Greaves et al., 1985). [...] are interrupted by (G)9, a sequence implicated in triple helix formation in supercoiled plasmids (Kohwi & Kohwi-Shigematsu, 1988), and are preceded by (G-C)5, a classical Z-DNA motif. To begin to define the contributions of these sequences to structural organization of the dhfr origin region, a panel of recombinant plasmids containing various fragments of the origin region was constructed (Fig. 5).
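
Simple tracts of the kind numbered above, such as (A)38, (A-T)23 or (AATT)4, can be located with a short scan for tandem copies of a short unit. The sketch below is one possible way to do this and is not the Repeat program used in this study; the unit length limit, the minimum copy number and the function name are assumptions, and the example is a toy string.

```python
import re

# Minimal sketch: report perfect tandem repeats of short units.
# Parameters and names are illustrative assumptions.

def tandem_repeats(seq, max_unit=5, min_copies=4):
    """Return sorted (start, end, unit, copies), 1-based inclusive coordinates.

    Only perfect, uninterrupted repeats of units up to max_unit bases are
    reported. A unit that is itself a repeat of a shorter unit is skipped,
    because the shorter unit already describes the same tract.
    """
    seq = seq.upper()
    hits = []
    for size in range(1, max_unit + 1):
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_copies - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            reducible = any(
                unit == unit[:d] * (size // d)
                for d in range(1, size)
                if size % d == 0
            )
            if reducible:
                continue
            copies = (m.end() - m.start()) // size
            hits.append((m.start() + 1, m.end(), unit, copies))
    return sorted(hits)


# Toy sequence containing (A)6, (A-T)5 and (AATT)4.
toy = "GC" + "A" * 6 + "CG" + "AT" * 5 + "C" + "AATT" * 4 + "G"
found = tandem_repeats(toy)
```

Because the scan is made on one strand only and requires perfect copies, imperfect tracts and repeats written on the complementary strand would need additional handling.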
Each plasmid was analyzed for DNA unwinding elemen1.sand/or carucaiform formation by dt%ermininp it’s st&tirit>, t0 mung hean nucleasc at neutral pH. The tiamc\ plasm mids then were t,ested for Z-DNA. triple-xtrarrtletl DNA. or other non-H czonformers stabilized at ac,itl pH by repeating tbe MBN sensitjivity assays at

[Figure 5 map: XbaI, BamHI and HindIII sites across the dhfr origin region, the inserts of the recombinant plasmids, and a 1 kb scale bar. Image not reproduced.]

Element   Sequence                   Position
1         (AATT)4                    550-582
2         (dA)14(dC-dA)9             1022-1053
3         (TGGGT)14                  1185-1337
4         Poly(dA)38                 2264-2332
5         Bent DNA                   3342-3622
6         (dA-dT)23                  3766-3811
7         ARS homology/76% A+T       4709-4792
8         Poly(dA-dG) cluster        5734-5921

Figure 5. Locations of sequence elements within dhfr origin region plasmids. The identities and locations of the sequence elements designated 1 to 8 in Fig. 1 and the text are denoted as filled boxes within the recombinant plasmids pMC-A, -B, -D, -F, -G, and S13X-24. Not shown is the 700 bp HindIII-KpnI portion of pMC-D that extends from the 3' HindIII site (nucleotide 6157).
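As a quick consistency check on the element coordinates, the length of an inclusive coordinate range can be compared with the length implied by the repeat description. The sketch below does this for regions 2 and 6 only; the helper names are hypothetical, and reading region 2 as (A)14(C-A)9 is an assumption based on the partly legible text.

```python
def span_length(start, end):
    """Number of nucleotides in an inclusive coordinate range."""
    return end - start + 1

def expected_length(units):
    """Length implied by a list of (repeat_unit, copy_number) pairs."""
    return sum(len(unit) * copies for unit, copies in units)

# Coordinates from the text; repeat descriptions as read from the text
# (the (C-A) copy number of region 2 is an assumption).
CHECKS = {
    "region 2, (A)14(C-A)9": ((1022, 1053), [("A", 14), ("CA", 9)]),
    "region 6, (A-T)23": ((3766, 3811), [("AT", 23)]),
}

if __name__ == "__main__":
    for name, (coords, units) in CHECKS.items():
        print(name, span_length(*coords), expected_length(units))
```

When the two numbers agree, the stated coordinates and the stated repeat structure describe the same stretch of sequence; a mismatch would point to a transcription error in one of them.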

pH 5.2. Formation of triple-stranded DNA was monitored by the intermolecular hybridization assay described by Htun & Dahlberg (1988).

(iii) DNA unwinding activity at neutral pH

A prerequisite for initiation of DNA synthesis is the unwinding and subsequent separation of duplex DNA in the origin template. Sequences that promote facile unwinding can be detected as MBN-sensitive sites at neutral pH when subjected to superhelical stress in plasmid DNA (Umek et al., 1988a,b). Since supercoiled vector containing no genomic insert reveals constitutive cleavage sites (Fig. 6(a), lanes 14 and 15), it provides a reference for comparing the unwinding activity of various genomic sequences. When supercoiled pMC-D, which contains a 4.5 kb BamHI-KpnI fragment that includes nucleotides 1534 to 6157 (see Fig. 5), is incubated with MBN at neutral pH in a low ionic strength buffer, four single-stranded fragments in addition to the full-length HindIII products are observed (Fig. 6(a), compare lane 3 to lane 1). (The ssDNA fragments generated by MBN cleavage in this and following experiments are designated by letters corresponding to the fragment from which they are derived as


delineated in Fig. 5, and by numerals to allow comparison of the digestion products of individual fragments in different plasmids.) The ssDNA products of 730 nucleotides (F1) and 320 nucleotides (F2) (Fig. 6(a), lane 3) result from cleavage at region 6 (the (A-T)23 sequence). Cleavage also occurs at a lesser frequency at region 8 (the (A-G) cluster), yielding ssDNA fragments of about 1300 (G1) and 360 nucleotides (G2) (Fig. 6(a), lane 3). Note that treatment of pMC-D with MBN after digestion with HindIII generates no additional ssDNA fragments, confirming that the cleavage sites are induced by supercoiling (Fig. 6(a), lane 2). Not unexpectedly, the (A-T)23 sequence remains the preferred cleavage site in pMC-F, a plasmid that contains only the 1 kb HindIII fragment located at nucleotides 3490 to 4524 (Fig. 6(a), lanes 6 and 7).

Incubation of pMC-G in the MBN assay at neutral pH results in equal amounts of single-strand scission at two sites (Fig. 6(a), lane 5). Cleavage at one site yields ssDNA products of 1300 (G1) and 360 nucleotides (G2) identical with those observed in pMC-D (compare lanes 3 and 5). Cleavage at the second site yields products of 1350 (G3) and 320 nucleotides (G4). Mapping with various restriction endonucleases (not shown) locates both nuclease-sensitive sites within pMC-G to region 8.

Supercoiled pMC-A is also cleaved by MBN at two sites. Prominent digestion occurs within region 4, generating ssDNA fragments after digestion with EcoRI and HindIII of about 1240 (A1) and 750 nucleotides (A2) (Fig. 6(a), lane 9). A lesser portion of the pMC-A molecules are cleaved in vector sequences, yielding ssDNA fragments of 1585 (V1) and 1010 nucleotides (V2) (compare lanes 9 and 15). pMC-B is cleaved with about equal efficiency at two sites, one within the vector (compare lanes 9 and 11 in Fig. 6(a)) and one within the insert (lane 11), yielding ssDNA fragments of 1100 (B1) and 610 nucleotides (B2). Analysis of XbaI digests of the pMC-B cleavage products (not shown) locates the MBN cleavage site 610 nucleotides 3' to the XbaI site at nucleotide position 3, or within region 1.

To ascertain the relative sensitivity of sites within pMC-B to those observed in pMC-A and pMC-F, digestion assays were performed on plasmid S13X-24, which spans nucleotides 1 to 4540 (Fig. 6(a), lanes 12 and 13). In this instance, prominent cleavage products of 730 (F1) and 320 nucleotides (F2) are observed, indicating preferential cleavage at region 6. Thus, in the context of S13X-24, the sites within fragments A and B are not reactive.

Systematic analysis of the cleavage patterns of supercoiled plasmids containing various fragments of the origin region sequence allows ranking of the ease with which each DNA sequence element is attacked by MBN relative to one another and to the vector control (summarized in Fig. 6(c)). The preferred order of cleavage activity at neutral pH relative to endogenous vector sites is: (A-T)23 >> (A-G) >> (A)38 > vector = (AATT)4, or 6 >> 8 >> 4 > vector = 1. The (A)14(C-A)9 motif (region 2), the G+T-rich region (region 3), the bent DNA sequence (region 5),

[Figure 6, panels (a) to (c): autoradiograms of MBN digestion products of pMC-D, pMC-G, pMC-F, pMC-A, pMC-B, vector and S13X-24 (lanes 1 to 15, with and without MBN), and a summary diagram. Images not reproduced.]

Figure 6. Mung bean nuclease assay for DNA unwinding at neutral pH. The indicated plasmids were incubated with MBN at 37°C for 1 h at pH 7.2. Plasmid DNA was isolated, digested with the indicated restriction endonucleases (RE), end-labeled, and the products were resolved by alkaline agarose gel electrophoresis and autoradiography. The bold letters on the left of each panel indicate restriction fragments, regardless of context, corresponding to those indicated in Fig. 5. Thus, A denotes the 1955 bp BamHI-HindIII fragment of pMC-A that is also contained in pMC-D and S13X-24; B denotes the 1535 bp XbaI-BamHI fragment of pMC-B that is also contained within S13X-24; F denotes the 1030 bp fragment of pMC-F that is also contained within pMC-D and S13X-24; and G denotes the 1630 bp fragment of pMC-G that is also contained within pMC-D. V indicates the vector fragment; V* indicates a vector band derived from pMC-D that retains the 3' 790 bp HindIII-KpnI portion of the insert after digestion with HindIII. Digestion products derived from various fragments or the vector are denoted by letters and numerals on the right of each panel. G1 and G2, for example, correspond to the products of MBN attack within fragment G in the plasmid pMC-D that are also observed in the

and the A+T-rich region containing ARS homology (region 7) were not reactive at neutral pH in any context tested.

(iv) Effect of pH on mung bean nuclease cleavage

A number of non-B form DNA conformers are stabilized by an acidic environment. In particular, the alternating sequence (C-T)n·(G-A)n has been shown to undergo a structural transition at acid pH and moderate supercoiling to a nuclease-sensitive conformation, consistent with triplex DNA formation (for review, see Wells et al., 1988). Triplex DNA formation is not limited to alternating purine/pyrimidine sequences, as (G)n and other simple homopurine/homopyrimidine repeats have also been implicated in triple helix formation (Kohwi & Kohwi-Shigematsu, 1988; Mirkin et al., 1987). It is also important to note that nuclease sensitivity is not diagnostic for triplex DNA structures, since other non-B DNA conformations are also sensitive to single-stranded nucleases (for a review, see Wells, 1988).

To ascertain the effect of pH on cleavage specificities by MBN, we repeated the MBN digestion studies at pH 5.2. As shown in Figure 6(b), the cleavage pattern of pMC-D changes significantly when the pH is lowered from 7.2 to 5.2. While region 6 remains quite reactive at pH 5.2, region 8 represents the most nuclease-sensitive site at acid pH (Fig. 6(b), lane 5). In contrast to neutral pH, cleavage at acid pH within region 8 generates a single broad band of about 1300 nucleotides (G5) in pMC-G (lane 5). The (A-T)23 tract of region 6 that is the primary cleavage site at neutral pH remains reactive at acid pH in pMC-F (Fig. 6(b), lane 7). As was observed for neutral pH, the vector sequences in pMC-D, pMC-F and pMC-G are not reactive.

The plasmid pMC-A, which contains regions 4 and 5, is cleaved at the same site within the insert at acid pH that is observed at neutral pH; vector sequences appear somewhat more reactive in pMC-A at acid pH (compare lane 9, Fig. 6(a), with lane 9, Fig. 6(b)). The cleavage pattern of pMC-B at acid pH is also similar to that observed at neutral pH (lanes 11, Fig. 6(a) and 6(b)), with vector sequences remaining reactive. Thus, reducing the pH from 7.2 to 5.2 does not appear to alter cleavage specificity, but does affect the reactivity of regions 1 to 3


relative to vector in both pMC-A and pMC-B. Vector cleavage sites are not altered by an acidic environment (Fig. 6(b), lane 15). The cleavage pattern of S13X-24 at acid pH is also identical with that observed at neutral pH, with region 6 remaining the most reactive site (Fig. 6(b), lane 13). In some experiments, long exposures reveal minor products consistent with cleavage within fragment "A" at region 4 and within fragment "B" at region 2. Vector sequences are not reactive in the context of S13X-24.

In a manner similar to that used for ranking the relative reactivity of each sequence element at neutral pH, we determined the order of preferred nuclease sensitivity at acid pH to be: the (A-G) cluster >> (A-T)23 >> (AATT)4 > (A)38 = vector, or 8 >> 6 >> 1 > vector = 4. In some experiments, (A)14(C-A)9 was weakly reactive. Bent DNA, the A+T-rich region containing ARS homology, and the telomeric-like TGGGT region were not reactive in any context tested.

(v) Triple-stranded DNA formation at acidic pH

Models for nuclease and chemical sensitivity of simple repeating purine/pyrimidine sequences similar to those of region 8 favor a triple-helix DNA conformation in which the strand containing (T-C)n folds back and forms Hoogsteen base-pairs with the double helix, forcing the (A-G)n strand into a single-stranded conformation (reviewed by Wells et al., 1988). In the triplex configuration, the extruded strand is available for strand-specific intermolecular hybridization with ssDNA containing (T-C)n (Htun & Dahlberg, 1988). To determine if the MBN cleavage of pMC-D and pMC-G at region 8 was the result of triplex DNA formation, an intermolecular hybridization experiment was performed. Recombinant M13 phage DNA containing either the (T-C) strand (clone 154/3) or the (A-G) strand (clone 154/4) of the 1.6 kb HindIII fragment of pMC-G was incubated overnight at 37°C with supercoiled plasmids pMC-G and pMC-D at both acidic (pH 5.2) and neutral (pH 7.2) pH. The hybridization products were separated by agarose gel electrophoresis at pH 7.8, blotted, and probed with a labeled fragment specific to M13 in order to visualize M13-plasmid complexes. As shown in Figure 7, complex formation between M13 ssDNA and either plasmid at acid pH is limited to only those reactions containing phage with single-stranded (C-T) tracts (lanes 2 and 5). For

digestion products of pMC-G. The sizes of digestion products are estimated within ±50 bases as follows: A1, 1240; A2, 750; B1, 1200; B2, 660; F1, 730; F2, 320; G1, 1350; G2, 360; G3, 1300; G4, 300; V1, 1585; V2, 1005. Abbreviations for restriction endonucleases are: H3, HindIII; RI, EcoRI; B, BamHI. (b) MBN cleavage patterns of origin region plasmids at acid pH. MBN activity was titrated at pH 5.2 to yield an average of one ssDNA nick per molecule, and the MBN digestion experiment presented in (a) was repeated at pH 5.2. Cleavage products were resolved as before. Lane 2 displays the products of pMC-D treated with nuclease after digestion with HindIII. Full-length restriction fragments and digestion products are denoted with letters and numbers as in (a), with the following exceptions: G5, 1300; G6, 350. (c) Summary of MBN sensitivity at neutral and acid pH. The relative reactivity of MBN-sensitive sites in each plasmid is indicated by an arrow. Reactivity at pH 7.2 is indicated above each plasmid; reactivity at pH 5.2 is indicated below each plasmid. The lower line summarizes the hierarchy of reactivity of sites 1 to 8 as discussed in the text.
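The mapping logic behind these assignments is simple arithmetic: a single MBN cut inside an end-labeled restriction fragment gives a product whose length places the cut that far from one fragment end, and the candidate position is then compared with the known element coordinates within the stated ±50 base uncertainty. The following is a minimal sketch of that reasoning, not the authors' procedure; the function names are hypothetical, and the region and fragment coordinates are those quoted in the text.

```python
# Region coordinates (nucleotide positions) as stated in the text.
REGIONS = {
    1: (551, 585),
    2: (1022, 1053),
    3: (1185, 1337),
    4: (2264, 2332),
    5: (3410, 3460),
    6: (3766, 3811),
    7: (4709, 4792),
    8: (5734, 5921),
}

def regions_near(position, tol=50):
    """Regions whose span, widened by tol on each side, contains position."""
    return [rid for rid, (start, end) in sorted(REGIONS.items())
            if start - tol <= position <= end + tol]

def regions_consistent(frag_start, frag_end, product_len, tol=50):
    """Regions compatible with a product measured from either fragment end."""
    hits = set()
    for position in (frag_start + product_len, frag_end - product_len):
        hits.update(regions_near(position, tol))
    return sorted(hits)

if __name__ == "__main__":
    # pMC-F HindIII fragment (nucleotides 3490 to 4524), products F1 and F2.
    print(regions_consistent(3490, 4524, 730))
    print(regions_consistent(3490, 4524, 320))
    # pMC-B: cut mapped 610 nucleotides 3' to the XbaI site at position 3.
    print(regions_near(3 + 610))
```

Because a product length alone cannot say which fragment end it was measured from, the sketch tests both ends; in the experiments this ambiguity was resolved by additional restriction mapping.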

[Figure 7, panels (a) and (b): gel lanes 1 to 16 showing pMC-D and pMC-G incubated with M13-154/3 or M13-154/4 at pH 5.2 and pH 7.2. Images not reproduced.]

Figure 7. Triple-stranded DNA formation in plasmids pMC-G and pMC-D at acid pH. Supercoiled plasmid DNA was incubated with recombinant M13 phage DNA from either the poly(dT-dC) strand (154/3) or the poly(dA-dG) strand (154/4) of the 1.6 kb HindIII fragment of pMC-G at either acid or neutral pH for 24 h at 37°C. The intermolecular hybridization products were separated by agarose gel electrophoresis at 4°C. The gel was stained with ethidium bromide (a), blotted to nitrocellulose, and the blot was hybridized with an M13-specific probe (b). Complexes specific to the hybridization reactions are denoted by arrowheads. SSL, single-stranded linear phage DNA; SSC, single-stranded circular phage DNA.
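The strand specificity of this assay follows from Watson-Crick complementarity: only a probe carrying the pyrimidine-rich repeat can pair with an extruded (A-G) strand. The sketch below illustrates the point with toy repeats; the 10-copy sequences and the function names are assumptions for illustration and are not the cloned inserts.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def can_pair(probe, target):
    """True if probe is fully Watson-Crick complementary to target."""
    return reverse_complement(probe) == target.upper()

if __name__ == "__main__":
    extruded = "AG" * 10      # toy stand-in for the extruded (A-G) strand
    probe_tc = "CT" * 10      # toy stand-in for the 154/3 (T-C) strand
    probe_ag = "AG" * 10      # toy stand-in for the 154/4 (A-G) strand
    print(can_pair(probe_tc, extruded))
    print(can_pair(probe_ag, extruded))
```

The same check applied to the (A-G) probe fails, which mirrors the observation that only reactions containing the (T-C) strand phage yield complexes at acid pH.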

pMC-G, which contains the (A-G) tracts isolated in the 1.6 kb HindIII fragment, two complexes are visualized, one migrating (relative to linear DNA) at 2.2 kb and a second at 6 kb (lane 2). For pMC-D, a single complex between phage 154/3 and the plasmid that migrates at approximately 6 kb is formed (lane 4). Examination of the ethidium bromide staining pattern of the gel shows that all the plasmid-recombinant M13 complexes migrate behind the position of the supercoiled plasmid DNA (arrows, Fig. 7(a)). At neutral pH, no hybridization between pMC-D or pMC-G and either recombinant M13 phage DNA is observed (Fig. 7(a) and (b)). Note that the probe does not cross-react with plasmid sequences (Fig. 7(b), lanes 1, 4, 9 and 12).

4. Discussion

Experimental evidence for specific origins in vertebrate chromosomes comes from numerous biochemical and physical observations, and is supported by the molecular precedents established by the study of the replication of double-stranded DNA molecules from diverse biological sources. Because origins of replication from animal cells might be expected to share common features with their counterparts from other sources, we have analyzed 6.2 kb of DNA derived from the CHO dhfr origin region for homologies to other sequences, and for selected structural features.

The origin region sequence contains two members of the highly repetitive Alu family of interspersed repeats. Because the 30 kb initiation region contains numerous other AluI family repeats (at least 10), these elements likely have no direct role in DNA replication. In contrast to the AluI and other repeats, hybridization experiments (data not shown) indicate that the ORR-1 motif located at position 3085 to 3243 occurs only once in each of the amplified dhfr domains. The highly conserved nature of this extragenic motif (Fig. 3) indicates that the sequence may have a specific function. The selection of a site homologous to ORR-1 for adenovirus DNA integration in the cell line BHKBBX-C31 suggests that ORR-1 may share features with other adenovirus DNA integration sites, many of which are actively transcribed into small RNAs (Doerfler et al., 1983; Gahlmann et al., 1984). The transcriptional status of the ORR-1 motif in CHOC 400 cells


is unknown. Both the ORR-1 and the bent DNA sequence are located within the 1.8 kb BamHI-HindIII fragment (nucleotides 1534 to 3490) that "in gel" renaturation experiments suggest as containing an initiation site (Leu & Hamlin, 1989).

(b) Other homologies

The cluster of homopurine/homopyrimidine tracts at the 3' end of the sequence has significant identity with the U2 RNA gene of humans and with portions of the non-transcribed spacer regions of ribosomal RNA genes from a number of vertebrate species (Table 1). In yeast, origin mapping experiments show that rDNA repeats contain initiation sites within the non-transcribed spacer region (Brewer & Fangman, 1988; Linskens & Huberman, 1988). Interestingly, the rDNA regions in yeast also contain a barrier to replication that prevents fork movement in a 3' to 5' direction through the transcription units (Brewer & Fangman, 1988; Linskens & Huberman, 1988). Since (A-G) repeats cause replication fork pausing in SV40 DNA molecules (Rao et al., 1988), we are presently utilizing two-dimensional gel techniques to determine if replication forks pause near the cluster of homopurine/homopyrimidine tracts in vivo.

A number of sequence elements in dhfr origin region DNA are reactive in the MBN assay for DNA unwinding elements. In this study, the observed hierarchy of MBN cleavage was: (A-T)23 >> the (A-G) cluster >> (A)38 >> vector = (AATT)4. The remaining origin region elements, including the A+T-rich region with ARS homology, were unreactive. Context has dramatic effects on nuclease sensitivity, suggesting that the relationship between the ability of DNA sequences to unwind in vitro and a role in DNA replication in vivo may be very complex.

Though the (A-T)23 tract of region 6 was the preferred cleavage site at neutral pH in every context examined, we consider it unlikely that this sequence represents an element of the dhfr replication origin. Although (A-T) has unique properties that permit it to be readily extruded as a cruciform in supercoiled plasmids (Greaves et al., 1985), it is also reactive with nucleases in linear DNA (McClellan et al., 1986). It would be interesting to test the function of this sequence as a DNA unwinding element in yeast ARS elements to determine if facile unwinding, or some other property of the A+T-rich regions associated with origins, is critical for initiation activity.

The hypersensitivity of pMC-A at neutral pH in region 4, which contains (A)38, was unexpected. Although theoretical considerations predict that poly(A) should display non-B form conformations under superhelical stress (reviewed by Wells et al., 1988), previous studies indicate that (A)n is not


reactive with ssDNA nucleases in supercoiled plasmids (Hanvey et al., 1988a). The basis for this discrepancy is unknown. The weak cleavage at region 1 within pMC-B is reminiscent of the reactivity of an A+T-rich region induced by neighboring G-rich homopurine sequences in the 3' end of the rat L1 (long interspersed repeat DNA) element (Usdin & Furano, 1988).

Our results also show that sensitivity to MBN at neutral pH is not limited to A+T-rich sequences. Digestion at the (A-G) cluster in pMC-G, for example, likely reflects cleavage due to other non-B form conformers. Preliminary primer extension experiments that have mapped the MBN cleavage sites to nucleotide resolution in region 8 suggest that cleavage occurs within the sequence element itself, as well as at its junction with flanking sequences (M. Caddle & N. H. Heintz, unpublished results). Fine structure mapping with additional enzymatic and chemical probes will be required for resolving the precise basis for the nuclease sensitivity of this and other sequence elements in the dhfr origin region.

(d) Different sequences promote unwinding and bending

The relationship of sequences that promote DNA bending and DNA unwinding remains unclear. The two-dimensional gel assay at neutral pH shows that the (A-G) tracts, (A)38, and (A-T)23, all nuclease-sensitive at neutral pH, do not induce markedly abnormal migration of small fragments in polyacrylamide gels. Rather, anomalous migration is limited to the 280 bp HaeIII fragment located at nucleotides 3342 to 3622 (Fig. 4), which is not a detectable cleavage site for MBN in any context yet tested. Thus, DNA bending is not equivalent to unwinding, nor is nuclease sensitivity predictive of sequences that induce anomalous migration of linear fragments on polyacrylamide gels. We suspect that the 270 kb of DNA that comprises the amplified dhfr domain contains numerous elements that demonstrate varying propensities for melting or bending in different contexts and under particular experimental environments. Elements that promote localized unwinding or bending may be meaningful in initiation of DNA replication only when such sequences are adjacent to or interact with initiator binding sites.

(e) Nuclease sensitivity and triplex DNA at acid pH

Incubation of the panel of plasmids used in the DNA unwinding assays with MBN at acid pH shows that protonation may alter cleavage preferences and specificity (Fig. 6). In particular, the weak reactivity of region 8 at neutral pH is markedly enhanced at pH 5.2. As shown in Figure 7, intermolecular hybridization studies with recombinant, single-stranded M13 phage DNA and the supercoiled plasmids pMC-D and pMC-G indicate that the DNA strand containing (A-G) is extruded at acid pH in a


form available for hybridization to ssDNA containing (T-C). Modeling studies suggest that triplex DNA introduces dramatic kinks in duplex DNA (reviewed by Wells, 1988). Since at least two triplex conformations are available to each (A-G) tract, the conformations available to the (A-G) cluster in the dhfr origin region may be quite complex. The observation that two complexes are visualized for the reactions containing pMC-G, while a single complex is visualized for reactions containing pMC-D (Fig. 7(b)), suggests that a significant portion of supercoiled pMC-G molecules may contain two regions of triple-stranded DNA. Ongoing structural studies support this interpretation (M. Caddle & N. H. Heintz, unpublished data). Since the intermolecular complexes formed at acid pH are stable during electrophoresis at pH 7.8, it should be possible to study these hybrids, once formed at acid pH, under a variety of near physiological conditions.
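Candidate triplex-forming tracts of the kind discussed here are homopurine runs on one strand (equivalently, homopyrimidine runs when read on the other strand). A minimal sketch of a scan for such runs is given below. It is an illustration only: the function name, the 10-residue threshold and the toy sequence are assumptions, and a positive hit indicates a candidate tract, not evidence of triplex formation.

```python
import re

def purine_pyrimidine_tracts(seq, min_len=10):
    """Return (start, end, kind) for homopurine or homopyrimidine runs.

    Coordinates are 0-based with an exclusive end; kind is "purine" for
    runs of A/G and "pyrimidine" for runs of C/T on the strand supplied.
    """
    pattern = re.compile(r"[AG]{%d,}|[CT]{%d,}" % (min_len, min_len))
    tracts = []
    for match in pattern.finditer(seq.upper()):
        kind = "purine" if match.group()[0] in "AG" else "pyrimidine"
        tracts.append((match.start(), match.end(), kind))
    return tracts

if __name__ == "__main__":
    # Hypothetical toy sequence, NOT the dhfr origin region.
    toy = "TTCC" + "AG" * 8 + "TTCC"
    print(purine_pyrimidine_tracts(toy))
```

Applied to a real sequence, the threshold would need to be chosen with the experimental system in mind, since the text notes that supercoiling and pH both influence which tracts actually adopt a triplex conformation.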

Origins of replication are composed of multiple elements that function in specific stages of the multi-step initiation process. The results presented here show that a relatively small region within the earliest replicating portion of the amplified dhfr domain contains a novel conserved repetitive element and a number of sequence motifs that have unusual structural properties under various contextual, torsional and environmental influences. While we recognize that formation of non-B form structures in vitro requires non-physiological environments, we suspect that cellular proteins, transcription or replication may exert regional influences that promote the formation of particular structural conformations in vivo. Such conformations need not persist in vivo, but may be transient in nature, occurring only during defined stages of the cell cycle in response to various influences. Documentation of the structures available to the dhfr origin sequences in vitro, and demonstration that these or other conformations occur in vivo, is required before a role for particular conformations in initiation of DNA replication can be delineated.

Note added in proof: Recent origin mapping experiments confirm that the bidirectional origin of replication for the CHO dhfr gene is located within the region studied in this paper (Handeli, S., Klar, A., Meuth, M. & Cedar, H. (1989). Cell, 57, 909-920).

We thank Jane Selegue for her early help with the DNA sequencing, Laurie Rabens and Judy Kessler for assistance with the manuscript, Ben Van Houten and Nat Heintz for critical comments, and Susan Russo and Temple Smith at MBCRR for assistance with the sequence analysis. This work was supported by grant GM32859 from the NIH.

Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
McClellan, J. A., Palecek, E. & Lilley, D. M. J. (1986). Nucl. Acids Res. 14, 9291-9309.
Messing, J. (1983). Methods Enzymol. 101, 20-78.
Milbrandt, J. D., Heintz, N. H., White, W. C., Rothman, S. M. & Hamlin, J. L. (1981). Proc. Nat. Acad. Sci., U.S.A. 78, 6043-6047.
Mirkin, S. M., Lyamichev, V. I., Drushlyak, K. N., Dobrynin, V. N., Filippov, S. A. & Frank-Kamenetskii, M. D. (1987). Nature (London), 330, 495-497.
Montoya-Zavala, M. & Hamlin, J. L. (1985). Mol. Cell. Biol. 5, 619-627.
Palzkill, T. G., Oliver, S. G. & Newlon, C. S. (1986). Nucl. Acids Res. 14, 6247-6263.
Poncz, M., Solowiejczyk, D., Ballantine, M., Schwartz, E. & Surrey, S. (1982). Proc. Nat. Acad. Sci., U.S.A. 79, 4298-4302.
Ramstein, J. & Lavery, R. (1988). Proc. Nat. Acad. Sci., U.S.A. 85, 7231-7235.
Rao, B. S., Manor, H. & Martin, R. G. (1988). Nucl. Acids Res. 16, 8077-8094.
Rich, A., Nordheim, A. & Wang, A. H.-J. (1984). Annu. Rev. Biochem. 53, 791-846.
Rosenfeld, P. J., O'Neill, E. A., Wides, R. J. & Kelly, T. J. (1987). Mol. Cell. Biol. 7, 875-886.
Sanger, F., Nicklen, S. & Coulson, A. R. (1977). Proc. Nat. Acad. Sci., U.S.A. 74, 5463-5467.
Simpson, R. T. & Kunzler, P. (1979). Nucl. Acids Res. 6, 1387-1415.
Snyder, M., Buchman, A. R. & Davis, R. W. (1986). Nature (London), 324, 87-89.
Umek, R. M. & Kowalski, D. (1987). Nucl. Acids Res. 15, 4467-4480.
Umek, R. M. & Kowalski, D. (1988). Cell, 52, 559-567.
Umek, R. M., Eddy, M. J. & Kowalski, D. (1988a). Cancer Cells, 6, 473-478.
Umek, R. M., Linskens, M. H. K., Kowalski, D. & Huberman, J. A. (1988b). Biochim. Biophys. Acta, 1007, 1-14.
Usdin, K. & Furano, A. V. (1988). Proc. Nat. Acad. Sci., U.S.A. 85, 4416-4420.
Wells, R. D. (1988). J. Biol. Chem. 263, 1095-1098.
Wells, R. D., Collier, D. A., Hanvey, J. C., Shimizu, M. & Wohlrab, F. (1988). FASEB J. 2, 2939-2949.
Westin, G., Visser, L., Zabielski, J., van Mansfeld, A. D. M., Pettersson, U. & Rozijn, Th. H. (1982). Gene, 17, 263-270.
Williams, J. S., Eckdahl, T. T. & Anderson, J. N. (1988). Mol. Cell. Biol. 8, 2763-2769.
Zahn, K. & Blattner, F. R. (1987). Science, 236, 416-422.

Edited by R. A. Laskey
