Cell, Vol. 13, 345-358, February 1978, Copyright © 1978 by MIT

Nucleotide Sequence of the DNA Replication Origin for Human Papovavirus BKV: Sequence and Structural Homology with SV40

Ravi Dhar, Ching-Juh Lai and George Khoury
Laboratory of DNA Tumor Viruses
National Cancer Institute
Bethesda, Maryland 20014

Summary

DNA and RNA sequencing techniques were used to obtain the sequence surrounding the origin of DNA replication for human papovavirus BKV. The structure is characterized by a true palindrome of 17 residues followed by two sets of symmetrical sequences and a stretch of 20 AT residues. Within the two symmetrical sequences is a segment containing a strong purine bias, 23 of 26 nucleotides. These structures are similar, if not identical, to those found in the region of the SV40 replication origin. Within the homologous DNA segments, 60-60% of the BKV and SV40 nucleotides are the same. The remarkable similarity of BKV and SV40 sequences containing the origins of DNA replication would appear to confirm our previous suggestion of an evolutionary relationship between the two genomes. In addition, topological similarities between these sequences suggest the possibility of certain structural requirements for bidirectional replication origins in these superhelical DNAs.

Introduction

Human papovavirus BKV has been isolated from the urine of renal transplant recipients (Gardner et al., 1971) and from the urine and a brain tumor of patients with the Wiskott-Aldrich syndrome (Takemoto et al., 1974), a rare genetic disorder manifested by defects in both humoral and cellular immunity. The evidence for this virus being a human agent includes its isolation from humans, the presence of anti-BKV antibodies in almost 70% of most adult populations (Gardner, 1973; Mantyjarvi et al., 1973; Shah, Daniel and Worszawski, 1973) and the preferential growth of the virus in human cells in tissue culture (Mason and Takemoto, 1976). Like papovaviruses polyoma and SV40, BKV is a 45 nm particle containing a double-stranded superhelical genome with a molecular weight of about 3.4 x 10^6 daltons.
It produces an early T antigen which almost completely cross-reacts immunologically with that of SV40; late capsid polypeptides, however, are easily distinguished from those of SV40 on the basis of size and antigenicity (Gardner et al., 1971; Padgett et al., 1971; Takemoto and Mullarkey, 1973; Field et al., 1974; Mullarkey, Hruska and Takemoto, 1974). In addition to a lytic interaction with human cells, BKV, like SV40, undergoes an abortive interaction with a number of cells, which in many cases leads to stable transformation (Major and di Mayorca, 1973; Portolani, Bardanti-Brodano and LaPlaca, 1975; van der Noordaa, 1976).

While the biological properties of BKV and SV40 show similarities, distinct differences are apparent. BKV reproduces in monkey kidney cells but grows to much higher titers in human embryonic kidney (HEK) cells. SV40 (especially laboratory strains) grows poorly if at all in HEK cells (C.-J. Lai, unpublished results). Nevertheless, it appears that the BKV T antigen can interact with the SV40 genome, since BKV preinfection of AGMK cells can complement an early temperature-sensitive mutant of SV40 at the nonpermissive temperature (Mason and Takemoto, 1976). On the other hand, BKV does not appear to complement the growth of SV40 in HEK cells (C.-J. Lai and G. Khoury, manuscript in preparation). Since the mechanism of viral complementation appears to involve the shared interaction between biologically active proteins and the viral DNA, we were interested in comparing the sequences of BKV and SV40, especially in the region of the origin for DNA replication.

A number of DNA-DNA hybridization experiments have shown that, depending upon the degree of stringency of the technique used, between 10 and 50% of the DNA sequences of BKV and SV40 are complementary. Most of this homology was localized to the late region of the viral genomes (Khoury et al., 1975a; Osborn et al., 1976). More recently, Newell et al. (1977) studied the homology between these two viral genomes using a technique in which the two viral DNAs are covalently ligated to one another (Ferguson and Davis, 1975). After denaturation, the “snap-back” hybrid molecules showed as much as 80% sequence homology under nonstringent conditions; the nonhomologous 20% was located in DNA regions close to the initiation and termination sites for viral DNA replication.
Since hybridization data cannot provide the detailed analysis needed for a comparison of the active DNA sites on the two viral genomes, we turned to nucleotide sequencing. The location of the initiation sites for DNA replication in SV40 (Nathans and Danna, 1972; Fareed, Garon and Salzman, 1972) and BKV (P. M. Howley, M. S. Law, G. Khoury and M. A. Martin, manuscript in preparation) is known, and both map at about 0.67 units from the Eco RI site (0.00 map units). DNA synthesis has been shown to proceed bidirectionally (Danna and Nathans, 1972; Fareed et al., 1972) with termination occurring at a random site approximately 180° from the origin (Lai and Nathans, 1975). A detailed sequence of the SV40 replication origin has already been obtained (Subramanian,
Dhar and Weissman, 1977a), and the functional segment has been localized by the analysis of deletion mutants to <100 nucleotides at approximately 0.655 to 0.675 map units (Lai and Nathans, 1974; Shenk, 1977; C. Cole, personal communication; K. Subramanian and T. Shenk, personal communication). This segment contains a number of interesting features including true and symmetrical palindromes, tandem repeats and a run of AT-rich residues. In addition, it contains the 5’ end of the early SV40 RNA (Dhar et al., 1974b; Khoury et al., 1975b; Dhar et al., 1977a). This study reports the nucleotide sequence of the BKV replication origin and compares it with that of SV40.

Results

Direct DNA Sequencing

Restriction endonuclease cleavage of BKV form I DNA with Hind III generates four fragments which can be isolated on 4% polyacrylamide gels (Howley et al., 1975). The origin of replication has been localized to the Hind III-C fragment by P. M. Howley, M. S. Law, G. Khoury and M. A. Martin (manuscript in preparation) and C.-J. Lai and G. Khoury (unpublished results) at approximately 0.67 map units from the unique Eco RI cleavage site. The general approach used in obtaining the sequence around the origin of replication of BKV DNA was first to isolate the Hind III-C fragment and to subject this fragment to further restriction enzyme cleavage with Hae III, Alu III, Bam X or Hinf I. The order of the resulting fragments is shown in Figure 1.

The restriction enzyme fragments obtained by cleavage of Hind III-C separately with each of the above enzymes were labeled with 32P at their 5’ ends using γ-32P-ATP and T4 polynucleotide kinase as described by Maxam and Gilbert (1977). The specific fragment to be sequenced was purified by electrophoresis on 7% polyacrylamide gels, eluted, and either recleaved with another of the restriction endonucleases or strand-separated. The two strands of the fragment were separated either on 7% polyacrylamide gels or by two-dimensional homochromatography. These procedures resulted in a fragment that was labeled only at one 5’ end and could then be directly sequenced using the chemical procedure described by Maxam and Gilbert (1977). Alternatively, the end-labeled fragment was subjected to partial digestion with snake venom phosphodiesterase (SVP) and two-dimensional homochromatography to obtain the sequence close to the 5’ ends of the fragment.

The order of the restriction enzyme fragments was determined by sequencing a series of overlapping fragments. For example, 5’ end-labeled Hind III Alu III-C1, when digested with Hinf I, generated two fragments. The larger fragment, Hind III Alu III Hinf I-C1-1, when subjected to partial SVP digestion and to a Maxam and Gilbert-type analysis, gave the sequence overlapping Hind III Hae III-C1, -C7, -C6 and -C3, in this order (Figures 11 and 12) (the complete sequence of Hind III Hae III-C1 is known and will be published elsewhere). Figure 6 shows an example of overlapping fragments which were determined by partial SVP digestion and two-dimensional homochromatography. Fragment Hind III Hinf I-C3 was labeled at both 5’ ends; one end provides the sequence from the Hind III site and the other from the Hinf I site. The sequence from the Hinf I site overlaps the Alu III site at position 64 (Figure 13). This also establishes the order of the two Hind III Hae III fragments (Hind III Hae III-C4 and -C2). The remainder of the sequence was obtained by a similar type of analysis (Figures 2-12), and it is summarized in Figure 13. In most cases, the sequence was verified using the same procedures on the complementary DNA strand.

Figure 1. Restriction Endonuclease Cleavage Map of Hind III-C Fragment
The vertical arrows indicate the position of the restriction endonuclease cleavage sites. The cleavage of Hind III-C fragment with Hae III, Alu III, Hinf I and Bam X gives rise to 8, 6, 5 and 2 fragments, respectively. The fragments are numbered according to their relative sites. The origin of DNA replication is marked on the map.

Figure 2. Autoradiogram of a Two-Dimensional Homochromatogram of the 5’ End-Labeled Fragment Hind III Hae III-C4 Partially Digested with Snake Venom Phosphodiesterase (Maniatis et al., 1975)
(1) denotes electrophoresis in the first dimension at pH 3.5, and (2) denotes homochromatography in the second dimension. The sequence deduced from the Hind III site on the early strand from nucleotide 2 is AGCTTTTCTCATTAAGGG, and that from the Hae III site on the late strand from nucleotide 47 is CCTTGAAAGAGCTGCCT. The first few nucleotides of the sequence were deduced by eluting the spots from the thin-layer plates, running them on one-dimensional DEAE paper at pH 3.5 and calculating the ratio of the migration of the oligonucleotide to that of xylene cyanol dye (Subramanian et al., 1977a).

RNA Sequencing of DNA Fragments

The RNA sequencing was performed primarily to confirm the sequence obtained by direct DNA sequencing. The five largest Hind III Hae III fragments were transcribed with E. coli RNA polymerase in the presence of four ribonucleotide triphosphates, one of which was labeled in the α position. Both strands of the fragments were transcribed with almost equal efficiency. Transcripts were purified and digested with either T1 or pancreatic RNAase. The oligonucleotides derived were analyzed further by digestion with appropriate ribonucleases, and nearest-neighbor analysis was performed wherever necessary.

In the elucidation of the sequence of some of the largest T1 or pancreatic RNAase digestion products, information obtained from the complementary strand was used. For example, the T1 product within Hind III Hae III-C2 is a mixture of two oligonucleotides: AACUUUAUCCAUUUUUG(C) and AAAUCCCUAUUCUUUUG(C). The complements of the former T1 product are the pancreatic RNAase digestion products GGAU(A) and AAAAAU(G). The complements of the latter T1 product are AAAAGAAU(A) and AGGGAU(U), respectively. To determine the number of A residues in a given sequence, we used information obtained by direct DNA sequence analysis of the fragment, or we analyzed the product by one-dimensional high voltage electrophoresis on DEAE paper at pH 1.7 and calculated the ratio of the mobilities of the product with respect to the xylene cyanol dye. The data obtained with the transcripts are not included here since it was possible to obtain the complete sequence using the direct DNA sequencing techniques. Nevertheless, these results served to confirm the accuracy of the DNA sequence.

Figure 3. Fragment Hind III Hae III Alu III-CC1 Subjected to Three Sets of Chemical Reactions and Run on a 20% Polyacrylamide Gel Containing 9 M Urea (Maxam and Gilbert, 1977)
The label was at nucleotide 2 of the early strand (Figure 13).

Figure 4. Partial Snake Venom Digestion Pattern of Fragment Hind III Alu III-C4
Both the 5’ ends were labeled, and the sequence deduced from nucleotide 2 on the early strand is AGCTTTTCTC, and that from nucleotide 36 on the late strand is CTGCCTGGGAAA. The rest of the details are the same as in Figure 2.

Polarity and Strandedness

The polarity of the sequence was established by digesting uniformly labeled BKV DNA with Hae III, Hind III or the combination of the two enzymes; the fragments were then separated by electrophoresis on polyacrylamide gels to determine which Hae III fragments contained the Hind III sites. These fragments were next recleaved with Hind III and co-migrated with Hind III fragments cleaved with Hae III, thus establishing that the Hind III Hae III-C4 fragment was at the Hind III-C,-D junction, and that the Hind III Hae III-C5 fragment was at the Hind III-C,-A junction (data not presented).

The strandedness of the sequence was determined by first separating the complementary strands of the 5’ end-labeled Hind III Hae III-C5 fragment. The strand with a 5’ end labeled at the Hind III-C,-A junction was then annealed to the separated strands of Eco RI linear BKV DNA immobilized on nitrocellulose paper. Hybridization was detected only to the strand corresponding to the template for early BKV RNA (C.-J. Lai and G. Khoury, manuscript in preparation). Thus we confirm that the BKV DNA strands which serve as the templates for early and late BKV RNA are, in fact, the strands which bear a striking similarity to the E and L SV40 DNA strands, respectively (Figure 14).

Discussion

In this study, we present the sequence of 255 nucleotides from a fragment which includes the initiation site for DNA replication of papovavirus BKV. As shown in Figure 14, the most interesting aspects of this study arise from the comparison of this sequence with that of a similar region from the SV40 genome. It was known from DNA-DNA hybridization studies that considerable sequence homology existed between the two viral genomes (Khoury et al., 1975a; Howley et al., 1975; Osborn et al., 1976; Newell et al., 1977). For any specific DNA segment, however, both the extent and the location of this homology can be appreciated only with a knowledge of the nucleotide sequence.

Similarity of the Replication Origins

The nucleotide sequence of the SV40 DNA near the A-C junction, proximal to the 5’ end of the early region of the SV40 RNA, has been published (Subramanian et al., 1977a; Dhar et al., 1977b). In Figure 14, we compare this sequence with that determined for BKV. The nucleotides contained within boxes represent sequence identity between the two genomes; approximately 85% of the first 100 nucleotides, corresponding to the origin-proximal end of the SV40 Hind II + III A fragment, are the same.

The presumed limits of the “replication origin” of SV40 have been determined by analyzing sets of deletion mutants on both sides of an essential set of nucleotides (Lai and Nathans, 1974; Shenk, 1977; C. Cole, personal communication; K. Subramanian and T. Shenk, personal communication). For SV40, this includes the nucleotides from approximately 130-210 (Figures 14 and 15b), which map between 0.655 and 0.675 units. A remarkably similar set of sequences in BKV DNA extends from approximately nucleotides 150-240 (Figures 14 and 15a); it almost certainly corresponds to SV40 in terms of function.

Within the replication origins is a true palindrome of 17 nucleotides (Figures 15a and 15b) for both BKV (CCTCAGAAAAAGCCTCC; 151 to 166) and SV40 (CCTCCAAAAAAGCCTCC; 128-144). The two sequences are almost identical, and both contain only one nucleotide which does not contribute to the palindrome. This sequence is followed by two symmetrical segments represented with intrastrand homology (Figure 15). The first segment (BKV = 169-195; SV40 = 148-162) shows considerable sequence divergence between the two viruses. The second symmetrical region (BKV = 196-218; SV40 = 165-191) is characterized by extensive sequence homology (23 of 24 bases) and a high concentration of GC residues. Beyond the second symmetrical segment there is a long AT-rich sequence. For BKV, this sequence is 20 nucleotides long (219-238) and contains a single run of 9 A residues in a row; for SV40, the AT stretch is 17 nucleotides long (193-209) and contains 8 A residues in tandem. There is also a purine bias on the early strand of both symmetrical regions (23 of 26 nucleotides for BKV from position 183-208, Figures 14 and 15a; 17 of 24 for SV40 from position 156-179, Figures 14 and 15b).

The similarity within the regions containing the replication origins of these viruses might simply reflect a close evolutionary relationship between the two (Khoury et al., 1975a), or it might also represent a functional requirement for the specific DNA sequences. This second possibility is supported by the observation that even where the nucleotide sequence diverges within this region, the general topology is retained [for example, within the first symmetrical segment (Figure 15); in the tandem repeat sequences beyond nucleotide 240 for BKV and 230 for SV40 (R. Dhar, C.-J. Lai and G. Khoury, manuscript in preparation)].

Most of the DNA segment presented in Figure 15b is indispensable for viral function. It has been implicated in the control of DNA replication and the control of transcription, and as the recognition site for DNA binding proteins. Given the comparison of the SV40 and BKV sequences, we speculate below on the possible functional significance of some of these structures.

Figure 5. Partial Snake Venom Digestion Patterns of (a) Hind III Hae III Alu III-C2-1 and (b) Hind III Hae III Alu III-C2-2
The sequence of (a) from nucleotide 150 on the late strand is CCTGCAAAACTATT, and the sequence of (b) from nucleotide 48 on the early strand is CCTAAAAGGTCCATGAG.

Figure 6. Partial Snake Venom Digestion Pattern of Hind III Hinf I-C3
The sequence from nucleotide 2 on the early strand is AGCTTTTCTCATT, and that from nucleotide 75 on the late strand is AATCCATGGAGCT.

Transcriptional Regulation

Early SV40 transcription proceeds counterclockwise on the conventional map (Khoury et al., 1973; Sambrook et al., 1973). The 5’ end of the early SV40 RNA has been localized close to the origin of DNA replication (Dhar et al., 1974b; Khoury et al., 1975b; Reed and Alwine, 1977; Dhar et al., 1977a) and has been mapped at approximately nucleotide 170 (Figure 15b) (Dhar et al., 1977a; Subramanian et al., 1977a). By analogy, we speculate that the 5’ end of BKV early RNA will be located at about map position 190. In view of the evidence for processing of at least the late SV40 RNA, it is possible that these 5’ ends are not located within the region of the early promoter. On the other hand, there is considerable evidence from bacterial systems that promoter sites located near AT-rich stretches closely precede the 5’ end of the message (Pribnow, 1975). Such a sequence is present in both genomes as described above. In addition, there is a purine bias on the early strand in the region proximal to the 5’ end of the early RNA, a situation similar to that for yeast 5S RNA (Maxam et al., 1977; Valenzuela et al., 1977). Thus we speculate that this region might function as a promoter (or as a processing signal) for early viral RNA.

A number of lines of evidence suggest that short stretches of A residues might serve as terminators for transcription (Lebowitz, Weissman and Radding, 1971; Rosenberg, Weissman and de Crombrugghe, 1975; Sklar, Yot and Weissman, 1975) or, alternatively, as processing signals for the 3’ terminal end of messages (Rosenberg and Kramer, 1977; Robertson, Dickson and Dunn, 1977). Several studies with procaryotic systems have shown that transcription termination can occur when there are more than 6 A residues in the DNA preceded by GC residues (see Gilbert, 1976). For example, there are 29 A residues immediately beyond the end of the coding sequence for yeast Saccharomyces cerevisiae 5S ribosomal RNA, short runs of A residues beyond the coding region for Xenopus oocyte 5S RNA (Brown and Brown, 1976), and 5 A residues at and beyond the DNA sequences which code for the adenovirus VA RNA (Celma et al., 1977a; Celma, Pan and Weissman, 1977b). Furthermore, there are 6 A nucleotides beyond the 3’ ends of the major early and late SV40 RNAs (Dhar et al., 1974a, 1974b). Thus it is interesting that short A stretches occur on the early strand of BKV (157-161) and SV40 (133-138) counterclockwise or to the left of the 5’ end of the early RNA (Figure 15). These could signal the termination of a small RNA (40-50 nucleotides) transcribed near the origin of replication.
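The structural features quoted above (the near-perfect 17-nucleotide palindromes, the short A stretches and the purine bias) can be checked directly against the quoted sequences. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, the two palindromes are copied from the text, and the BKV segment for positions 183-208 is our own transcription of the early strand in Figure 13, so it should be treated as an assumption.

```python
# Early-strand sequences (5' -> 3') quoted in the text.
BKV_PALINDROME = "CCTCAGAAAAAGCCTCC"    # begins at BKV nucleotide 151
SV40_PALINDROME = "CCTCCAAAAAAGCCTCC"   # begins at SV40 nucleotide 128
# Assumption: BKV nucleotides 183-208 as transcribed from Figure 13.
BKV_183_208 = "GAGAGAAAGGGTGGAGGCAGAGGCGG"


def mirror_mismatches(seq):
    """Count mirror-image position pairs (i, n-1-i) whose bases differ.

    A "true palindrome" in the sense used here reads the same in both
    directions on one strand, so a perfect one gives 0.
    """
    n = len(seq)
    return sum(1 for i in range(n // 2) if seq[i] != seq[n - 1 - i])


def longest_run(seq, base, first_position):
    """Return (start, end, length) of the longest run of `base`.

    Coordinates are 1-based map positions, offset by `first_position`.
    """
    best = (0, 0, 0)
    i = 0
    while i < len(seq):
        if seq[i] != base:
            i += 1
            continue
        j = i
        while j < len(seq) and seq[j] == base:
            j += 1
        if j - i > best[2]:
            best = (first_position + i, first_position + j - 1, j - i)
        i = j
    return best


def purine_count(seq):
    """Number of A or G residues in `seq`."""
    return sum(1 for b in seq if b in "AG")


if __name__ == "__main__":
    print(mirror_mismatches(BKV_PALINDROME), mirror_mismatches(SV40_PALINDROME))
    print(longest_run(BKV_PALINDROME, "A", 151))
    print(longest_run(SV40_PALINDROME, "A", 128))
    print(purine_count(BKV_183_208), "of", len(BKV_183_208))
```

Under these assumptions each palindrome has a single mismatched mirror pair, the longest A runs fall at BKV 157-161 and SV40 133-138, and the BKV segment contains 23 purines in 26 positions, in agreement with the figures given in the text.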
Such a small RNA could function in the control of transcription or in the initiation of viral DNA replication. The transition from the primer RNA to DNA in the origin of replication of the colicin E1 plasmid consists of a segment of 5 A residues preceded by a stretch of GC-rich residues (Tomizawa, Ohmori and Bird, 1977; D. Bastia, personal communication). The SV40 and BKV origins of replication could be analogous to that of colicin E1 in this respect. The 2-fold axis of symmetry observed for the colicin E1 origin of replication is not as extensive, however, as that found for BKV and SV40 DNA.

Translational Controls

Translational initiation mechanisms in both procaryotic and eucaryotic systems have been shown to recognize the RNA triplet AUG. There are a number of these encoded within the proximal portion of the early strand of SV40 in each of the three possible reading frames. Within a similar segment of BKV, four such triplets are encoded and all are in the same reading frame (Figure 13). It appears probable that the triplets read as nonsense codons in procaryotic systems are also recognized as translational termination signals in eucaryotic cells (Forget et al., 1975; Proudfoot, 1977; Efstratiadis, Kafatos and Maniatis, 1977). There are no termination codons within the sequence which we have determined in the BKV early strand or within a similar segment on the SV40 early DNA strand (Dhar et al., 1977b) in phase with the putative translational starts.

If the putative translation start signals for the early BKV and SV40 proteins are correct, then one can predict the amino acid sequence of the amino terminal end of the respective peptides. Figure 16 shows that while there is some variability in the nucleotide sequence, it occurs predominantly in first and third codon positions. Thus we might predict that as many as 28 out of the first 32 N terminal amino acids of the BKV and SV40 early proteins could be identical. The prediction of similarities in the N terminal portions of these early peptides is supported in part by the data comparing the methionine-labeled tryptic peptides of SV40 and BKV T antigens (Simmons and Martin, 1977). These speculations are of course based upon the assumption that the DNA sequence in this region is accurately reflected in the mRNA. In view of recent findings on splicing of RNAs (Klessig, 1977; Aloni et al., 1977; Berget, Moore and Sharp, 1977; Celma et al., 1977a; Chow et al., 1977), however, these assumptions may be erroneous.

To the right of (that is, clockwise from) the SV40 origin is a set of tandem repeat sequences (Dhar et al., 1977b; Subramanian, Reddy and Weissman, 1977b). Similarly in BKV, we have found a set of large repeats in a similar segment, but there is no sequence homology to those found in SV40 (R. Dhar, manuscript in preparation). The function of these repeats is not clear.

Figure 7. Fragment Hind III Hinf I Alu III-C1-1 Labeled at Position 73 on the Early Strand
The fragment was subjected to three sets of chemical reactions, and aliquots of the sample were analyzed by electrophoresis at three different time periods (Maxam and Gilbert, 1977).

Figure 8. Partial Snake Venom Digest of Fragment Hind III Hinf I Alu III-C1-1
The sequence from nucleotide 73 on the early strand is ATTCTTCCCTGTT.

Figure 9. 5’ End-Labeled Hind III Hae III-C3 Fragment Strand-Separated by Two-Dimensional Homochromatography
The strand labeled at nucleotide 151 was subjected to three sets of chemical reactions and analyzed by electrophoresis (Maxam and Gilbert, 1977).

Figure 10. Partial Snake Venom Digestion Pattern of the Separated Strands of Hind III Hae III-C3 Fragment
(a) The sequence from the early strand from nucleotide 151 is CCTCAGAAAAAGCCTCCACACCCTT, and (b) the sequence from the late strand from nucleotide 208 is CCGCCTCTGCCTCCACCCTT.

Figure 11. Partial Snake Venom Digestion Pattern of Hind III Alu III Hinf I-C1-1
The fragment was labeled on the late strand at nucleotide 253, and the sequence is CTCCTCCCTGTGGCCTT.

Experimental Procedures

Materials

α-32P-labeled ribonucleotide triphosphates (spec. act. 100-200 Ci/mmole) were purchased from either New England Nuclear (Boston, Massachusetts) or ICN Pharmaceuticals (Irvine, California). γ-32P-ATP (spec. act. 19094000 Ci/mmole) was purchased from ICN Pharmaceuticals (Irvine, California). Most of the restriction endonucleases were purchased from New England BioLabs (Beverly, Massachusetts). Polynucleotide kinase and restriction endonuclease Hae III were a gift from Dr. Sherman Weissman or were obtained from New England BioLabs. Endonuclease Hinf I was a gift from Dr. Roberto DiLauro. Alkaline phosphatase and pancreatic RNAase were obtained from Worthington Biochemicals (Freehold, New Jersey). Snake venom phosphodiesterase was purchased from Boehringer-Mannheim Biochemicals (Indianapolis, Indiana), and T1 RNAase from Sankyo (Tokyo, Japan).

Figure 12. Fragment Hind III Alu III Hinf I-C1-1 Labeled on the Late Strand at Position 253
The fragment was subjected to chemical analysis, and aliquots were electrophoresed at two different time intervals (Maxam and Gilbert, 1977).

Prototype BKV was grown in second or third passage human embryonic kidney (HEK) cells at a multiplicity of 0.1-1 pfu per cell. Form I BKV DNA was extracted by the procedure of Hirt (1967) followed by equilibrium centrifugation on CsCl-ethidium bromide gradients (Radloff, Bauer and Vinograd, 1967).

5’ End Labeling and DNA Sequencing

5’ end labeling of the restriction endonuclease fragments derived by cleavage of the Hind III-C fragment with Hae III, Alu III, Bam X or Hinf I was performed as described by Maxam and Gilbert (1977). To obtain label at only one end, the fragments were either cut with an appropriate endonuclease or strand-separated on polyacrylamide gels as described by Maxam and Gilbert (1977). Strand separation of fragments smaller than 60 nucleotides was performed by two-dimensional homochromatography: high voltage electrophoresis with cellogel strips at pH 3.5 in the first dimension and thin-layer homochromatography (HE 7% unhydrolyzed RNA) in the second dimension (Maniatis, Jeffrey and Kleid, 1975). After the fragments were obtained with only one end labeled, direct DNA sequencing was performed by the method of Maxam and Gilbert (1977). A few nucleotides near the 5’ ends were sequenced by partial digestion with snake venom phosphodiesterase, followed by two-dimensional homochromatography or one-dimensional high voltage electrophoresis on DEAE paper at pH 3.5 and 1.7 (Maniatis et al., 1975; Subramanian et al., 1977a). From the ratios of the mobilities of the oligonucleotides to the mobility of the xylene cyanol dye, it was possible to confirm some of the sequences near the 5’ end.

RNA Sequencing

BKV DNA was cleaved with Hind III restriction endonuclease, and the Hind III-C fragment was purified on 4% polyacrylamide gels. Gels of approximately 40 cm in length were run for 16 hr at 150 V in standard buffer. This fragment was further cleaved with Hae III restriction endonuclease, and eight fragments were obtained (see Figure 1). The five largest fragments were transcribed with E. coli RNA polymerase as described (Lebowitz et al., 1971) in the presence of the four ribonucleotide triphosphates, one of which was labeled with 32P (spec. act. 100-200 Ci/mmole). The in vitro transcripts were purified and digested with either T1 or pancreatic RNAase, and two-dimensional homochromatograms were then run as described by Brownlee and Sanger (1969).

Figure 13. Nucleotide Sequence of Part of Hind III-C Fragment near the Hind III-A,-C Junction of BKV DNA
The sequence is numbered from the Hind III side. The upper line is the sequence of the early strand DNA, and the lower line is that of the late strand. The arrows indicate the restriction endonuclease cleavage sites for various enzymes. The line indicates the recognition site of Mbo II restriction endonuclease. The ATG triplets are marked in boxes and represent the possible translational initiation codons, all of which are in the same reading frame. (Sites labeled in the figure: Hind III, Eco RII, Alu III, Hae III, Bam X, Hinf I and Mbo II. The arrows and boxes are not reproduced in the transcription below.)

    Early    1  AAGCTTTTCTCATTAAGGGAAGATTTCCCCAGGCAGCTCTTTC   43
    Late        TTCGAAAAGAGTAATTCCCTTCTAAAGGGGTCCGTCGAGAAAG

    Early   44  AAGGCCTAAAAGGTCCATGAGCTCCATGGATTCTTCCCTGTTA   86
    Late        TTCCGGATTTTCCAGGTACTCGAGGTACCTAAGAAGGGACAAT

    Early   87  AGAACTTTATCCATTTTTGCAAAAATTGCAAAAGAATAGGGAT   129
    Late        TCTTGAAATAGGTAAAAACGTTTTTAACGTTTTCTTATCCCTA

    Early  130  TTCCCCAAATAGTTTTGCAGGCCTCAGAAAAAGCCTCCACAC    171
    Late        AAGGGGTTTATCAAAACGTCCGGAGTCTTTTTCGGAGGTGTG

    Early  172  CCTTACTACTTGAGAGAAAGGGTGGAGGCAGAGGCGGCCTCGG   214
    Late        GGAATGATGAACTCTCTTTCCCACCTCCGTCTCCGCCGGAGCC

    Early  215  CCTCTTATATATTATAAAAAAAAAGGCCACAGGGAGGAGCT     255
    Late        GGAGAATATATAATATTTTTTTTTCCGGTGTCCCTCCTCGA

Figure 14. Comparison of the Nucleotide Sequence from the Early Strand of BKV Hind III-C DNA with the Early Strand of SV40 DNA near the Hind II+III-A,-C Junction
The boxes indicate the homology between the two viral DNAs. The vertical arrow indicates the Hind III cleavage site in SV40, and the horizontal arrow indicates the direction of the Hind II+III-A and -C fragments of SV40 DNA. (Only the first rows of the alignment are legible in this copy; the second SV40 row is incomplete.)

    BKV     1  AAGCTTTTCTCATTAAGGGAAGATTTCCCCAGGCAGCTCTTTC
    SV40    1  ATGCCTTTCTCATCAGAGGAATATTCCCCCAGGCACTCCTTTC

    BKV    44  AAGGCCTAAAAGGTCCATGAGCTCCATGGATTCTTCCCTGTTA
    SV40   44  AAGACCTAGAAGGTCCATTAGCTGCAAAGATTCCTC
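As a rough illustration of how the boxed identities in Figure 14 translate into a percentage, the sketch below counts matching positions in the first 43-nucleotide row of the BKV and SV40 early strands. It is a simplified, ungapped comparison with a hypothetical function name; the alignment in the original figure may place gaps, so this number is not expected to reproduce the figure of approximately 85% quoted in the Discussion for the first 100 nucleotides.

```python
# First row of Figure 14 (early strands, 5' -> 3'), as transcribed here.
BKV_ROW1 = "AAGCTTTTCTCATTAAGGGAAGATTTCCCCAGGCAGCTCTTTC"
SV40_ROW1 = "ATGCCTTTCTCATCAGAGGAATATTCCCCCAGGCACTCCTTTC"


def ungapped_identity(a, b):
    """Return (matches, length) for two equal-length sequences.

    Positions are compared one-to-one; no gaps are introduced.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches, len(a)


if __name__ == "__main__":
    matches, length = ungapped_identity(BKV_ROW1, SV40_ROW1)
    print(f"{matches}/{length} identical ({100 * matches / length:.1f}%)")
```

For this single row the count is 33 of 43 positions. A gapped alignment over the full region would be needed to compare with the homology values reported in the text.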

Acknowledgments

We thank Dr. Sherman Weissman for his help in the initial experiments; Myles Brown, Shoshana Segal, Franz-Josef Ferdinand and Yosef Aloni for helpful discussions; and Michael Lewis and Monika Konig for excellent technical assistance. A portion of the cell lysates used in these experiments was supplied by University Laboratories, Inc., through the Office of Program Resources and Logistics, Viral Cancer Program, National Cancer Institute.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received September 9, 1977; revised December

Figure 15. Comparison of the Double Hairpin Model Structures for (a) the BKV DNA Origin of Replication between Positions 146-242 (Figure 14) and (b) the SV40 DNA Origin of Replication between Positions 122-216 (Figure 14)
The limits for the origin of replication for SV40 are based on the information obtained from deletion mutants (Shenk, 1977; T. Shenk and K. Subramanian, personal communication; C. Cole, personal communication). The dashes indicate the palindromic sequence, and the solid lines indicate the homology between the BKV and SV40 origins of replication. The dots indicate base pairing.

Figure 16. Early Strand-Specific RNA Sequence for BKV and SV40 Extending 99 Nucleotides in a Counterclockwise Direction from a Potential Translational Initiation Codon at Approximately 0.65 Map Units
Between the two sets of sequences are the putative polypeptides for which they encode. See text for discussion.

    BKV    AUG GAU AAA GUU CUU AAC AGG GAA GAG UCC AUG GAG CUC AUG GAC CUU UUA GGC
    BKV    met asp lys val leu asn arg glu glu ser met glu leu met asp leu leu gly
    SV40   met asp lys val leu asn arg glu glu ser leu gln leu met asp leu leu gly
    SV40   AUG GAU AAA GUU UUA AAC AGA GAG GAA UCU UUG CAG CUA AUG GAC CUU CUA GGU

    BKV    CUU GAA AGA GCU GCC UGG GGA AAU CUU CCC UUA AUG AGA AAA GCU
    BKV    leu glu arg ala ala trp gly asn leu pro leu met arg lys ala
    SV40   leu glu arg ser ala trp gly asn ile pro leu met arg lys ala
    SV40   CUU GAA AGG AGU GCC UGG GGG AAU AUU CCU CUG AUG AGA AAG GCA

Maniatis. T., Jeffrey, A. and Kleid, D. (1975). Proc. Nat. Acad. Sci USA 72, 1184-I 166. Mantyjarvi, R. A., Meurman. 0. H., Vilna, L. and Berglund, 8. (1973). Ann. Clin. Res. 5, 283-287. Mason, D.A. and Takemoto, K. K. (1976). J. Virol. 77, 106&1062. Maxam, A. and Gilbert, W. (1977). Proc. Nat. Acad. Sci. USA 74, 557-564. Maxam, A. M., Tizard, R., Skryabin, K. G. and Gilbert, W. (1977). Nature 267, 643-645. Mullarkey, 0. H., Hruska, J. F. and Takemoto, K. K. (1974). J. Virol. 13, 1014-1019. Nathans. D. and Danna, K. J. (1972). Nature New Biol. 236, 20& 202. Newell, N., Lai. C.-J., Khoury, G. and Kelly, T. J., Jr. (1977). J. Virol.. in press. Osborn, J. E., Robertson, S. M.. Padgett, 6. L., Walker, D. L. and Weisblum, B. (1976). J. Virol. 19, 675-684. Padgett, B. L., Walker, D. L., RuRhein, G. M.. Eckroede, R. J. and Dessel, R. (1971). Lancet i. 1257-1260. Portolani. M., Bardanti-Brodano, G. and LaPlaca, M. (1975). J. Virol. 75, 420-422. Pribnow, D. (1975). Proc. Nat. Acad. Sci. USA 72, 764-788. Proudfoot, N. J. (1977). Cell 70, 559-570. Radloff, R., Bauer, W. and Vinograd, J. (1967). Proc. Nat. Acad. Sci. USA 57, 1514-1521. Reed, S. I. and Alwine, J. C. (1977). Cell 11, 523-531. Robertson, f-f. D., Dickson, E. and Dunn, J. J. (1977). Proc. Nat. Acad. Sci. USA 74, 822826. Rosenberg, M. and Kramer, R. A. (1977). Proc. Nat. Acad. Sci. USA 74, 964-988. Rosenberg, M., Weissman, S. and decrombrugghe, 8. (1975). J. Biol. Chem. 250, 4755-4764. Sambrook, J., Sugden. B., Keller, W. and Sharp, P. A. (1973) Proc. Nat. Acad. Sci. USA 70. 3711-3715. Shah, K. V., Daniel, R. N. and Worszawski, S. (1973). J. Infect. Dis. 128. 764-787.

1,1977

Rekrsnces Aloni. Y., Dhar, R., Laub, 0. Horowitz, M. and Khoury, G. (1977). Proc. Nat. Acad. Sci. USA, in press. Berget. S. M.. Moore, C. and Sharp, P. A. (1977). Proc. Nat. Acad. Sci. USA, in press. Brown, R. D. and Brown, D. D. (1976). J. Mol. Biol. 702, l-14. Brownlee, C. G. and Sanger, F. (1969). Eur. J. Biochem. 11, 395408. Celma, M. L.. Dhar, Ft., Pan, J. and Weissman, S. (1977a). Nucl. Acids Res.. in press. Celma, M., Pan, J. and Weissman, S. (1977b). J. Biol. Chem., in press. Chow, L. T., Gelinas, R. E., Broker, T. R. and Roberts, R. J. (1977). Cetl.12, l-6. Danna, K. J. and Nathans. D. (1972). Proc. Nat. Acad. Sci. USA 69, 3fX17-3100. Dhar, R., Zain, B. S., Weissman, S. M. and Pan, J. (1974a). Proc. Nat. Acad. Sci. USA 71, 371-375. Dhar, R., Subramanian. K. N., Zain, B. S., Pan, J. and Weissman, S. M. (1974b). Cold Spring Harbor Symp. Quant. Biol. 39, 153160. Dhar, R., Subramanian, K. N., Pan. J. and Weissman. S. M. (1977a). J. Biol. Chem.252, 368-376. Dhar, R., Subramanian, K. N.. Pan, J. and Weissman, S. M. (1977b). Proc. Nat. Acad. Sci. USA 74, 827-831. Efstratiadis, A., Kafatos, F. C. and Maniatis, T. (1977). Cell 70, 571-565. Fareed, G. C., Garon, C. F. and Salrman, N. P. (1972). J. Virol. 10, 484-491. Ferguson, J. and Davis R. W. (1975). J. Mol. Bioi. 94, 135-149. Field, A. M., Gardner, D. S., Goodbody, R. A. and Woodhouse, M. A. (1974). J. Clin. Pathol. 27, 341-347. Forget, 8. G., Marotta, C. A., Weissman, S. M. and Cohen-Solal. M. (1975). Proc. Nat. Acad. Sci. USA 72, 3614-3618. Gardner, S. D. (1973). Br. Med. J. i. 77-78. Gardner, S. D.. Field, A. M., Coleman, D. V. and Hulme. B. (1971). Lancet i. 1253-1257. Gilbert, W. (1976). In RNA Polymerase. R. Losick and M. Chamberlain, eds. (Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press), pp. 193-295. Hirt, B. (1967). J. Mol. Biol. 26, 365-369. Howley, P. M., Khoury, G., Byrne, J. C., Takemoto, K. K. and Martin, M. A. (1975). J. Virol. 16, 959-973. Khoury, G., Martin, M. A., Lee, T. N. 
H., Danna, K. and Nathans, D. (1973). J. Mol. Biol. 78, 377-389. Khoury, G., Howley, P. M., Garon. C., Mullarkey, M. F., Takemoto, K. K. and Martin, M. A. (1975a). Proc. Nat. Acad. Sci. USA 72, 2563-2567. Khoury, G., Howley, P., Nathans, D. and Martin, M. A. (1975b). J. Virol. 75, 433-437. Klessig, D. F. (1977). Cell 72, 9-21. Lai. C.-J. and Nathans, D. (1974). J. Mol. Biol. 89, 179-193. Lai, C,J. and Nathans, D. (1975). J. Mol. Biol. 97, 113-118. Lebowitz, P., Weissman, S. M. and Radding, C. (1971) J. Biol. Chem. 246, 5120-5139. Major, E. 0. and diMayorca, G. (1973). Proc. Nat. Acad. Sci. USA 70,3216-3212.

Shenk, T. (1977). J. Mol. Biol. 173, 503-515. Simmons, D. and Martin, M. (1977). Proc. Nat. Acad. Sci. USA, in press. Sklar. J., Yot, P. and Weissman, S. M. (1975). Proc. Nat. Acad. Sci. USA 72, 1817-1821. Subramanian, K. N., Dhar, R. and Weissman, S. M. (1977a). J. Biol. Chem. 252, 355-367. Subramanian, K. N., Reddy, V. B. and Weissman, S. M. (1977b). Cell 10, 497-507. Takemoto, K. K. and Mullarkey, M. F. (1973). J. Virol. 12, 625 631. Takemoto, K. K., Rabson, A. S., Mullarkey, M. F., Blaese. R. M., Garon, C. F. and Nelson, D. (1974). J. Nat. Cancer Inst. 53, 12051207. Tomizawa, J., Ohmori, tf. and Bird, R. E. (1977). Proc. Nat. Acad. Sci. USA 74, 18651889. Valenzuela. P., Bell, G. I., Masiarez, F. R., DeGennaro, L.-J. and Rutter. W. J. (1977). Nature267, 641-643. van der Noordaa, J. (1976). J. Gen. Virol. 30, 371-373.
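The comparison in Figure 16 can be retraced with a short Python sketch, given here as an illustration rather than as part of the original analysis. It translates the two 99-nucleotide sequences with the standard genetic code and lists the positions at which the putative polypeptides differ. The sequences are transcribed from the figure. Three codons that are hard to read in the source figure are taken as CUU (BKV codons 19 and 27) and GAA (SV40 codon 9); this is an assumption, although each reading agrees with the amino acid printed in the figure. All variable and function names are hypothetical.

```python
from itertools import product

# Standard genetic code, one-letter amino acids, codons enumerated in
# U, C, A, G order for the first, second and third base.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product("UCAG", repeat=3), AMINO_ACIDS)
}

# Sequences as read from Figure 16 (see the assumptions stated above).
BKV = (
    "AUG GAU AAA GUU CUU AAC AGG GAA GAG UCC AUG GAG CUC AUG GAC CUU UUA GGC "
    "CUU GAA AGA GCU GCC UGG GGA AAU CUU CCC UUA AUG AGA AAA GCU"
).split()
SV40 = (
    "AUG GAU AAA GUU UUA AAC AGA GAG GAA UCU UUG CAG CUA AUG GAC CUU CUA GGU "
    "CUU GAA AGG AGU GCC UGG GGG AAU AUU CCU CUG AUG AGA AAG GCA"
).split()


def translate(codons):
    """Translate a list of RNA codons into a one-letter peptide string."""
    return "".join(CODON_TABLE[c] for c in codons)


def count_matches(a, b):
    """Count positions at which two equal-length strings agree."""
    return sum(x == y for x, y in zip(a, b))


bkv_peptide = translate(BKV)
sv40_peptide = translate(SV40)

# 1-based positions where the two putative polypeptides differ.
aa_diffs = [
    (i + 1, a, b)
    for i, (a, b) in enumerate(zip(bkv_peptide, sv40_peptide))
    if a != b
]
nt_matches = count_matches("".join(BKV), "".join(SV40))

if __name__ == "__main__":
    print("BKV :", bkv_peptide)
    print("SV40:", sv40_peptide)
    print("amino acid differences:", aa_diffs)
    print("identical nucleotides:", nt_matches, "of", len("".join(BKV)))
```

Run as written, the sketch reports four amino acid differences (positions 11, 12, 22 and 27); the remaining 29 residues are identical. The nucleotide count depends on the assumed readings of the unclear codons and should be treated accordingly.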
