Cell, Vol. 14, 695-711, July 1978, Copyright © 1978 by MIT

Structure of the Adenovirus 2 Early mRNAs

Arnold J. Berk and Phillip A. Sharp
Center for Cancer Research
Department of Biology
Massachusetts Institute of Technology
Cambridge, Massachusetts 02139

Summary

We have defined the structure of adenovirus 2 (Ad2) cytoplasmic RNAs produced during the early phase of infection. Hybrids between cytoplasmic RNA and DNA restriction fragments of the viral genome were digested with endonuclease S1 or exonuclease VII, and the products were analyzed by gel electrophoresis. Seven abundant cytoplasmic RNAs (assumed to be mRNAs) were identified, and all have a spliced structure. Different mRNAs produced from a single transcriptional unit contain extensively overlapping sequences, and differ from each other by the pattern in which genome sequences are spliced together. The structures of the early Ad2 mRNAs are consistent with a model for mRNA biosynthesis in which an initial transcript is processed into a mature mRNA by “splicing out” internal sequences. The pattern of spliced mRNAs produced from the early region responsible for the transforming activity of Ad2 resembles the splicing pattern of the oncogenic early mRNAs of simian virus 40 (SV40). This fact, in conjunction with recent DNA sequencing results, leads us to suggest that, like the SV40 tumor antigens, the polypeptides encoded by these Ad2 mRNAs have an identical amino acid sequence at their N terminal ends, but have different C terminal sequences.

Introduction

Berget, Moore and Sharp (1977) have shown that the sequences comprising the mRNA for the major adenovirus 2 (Ad2) virion polypeptide, hexon, are encoded by three separate regions of the viral genome, separated from each other by as much as 8 kilobases. Other late Ad2 mRNAs have been shown to have a similar structure (Chow et al., 1977b; Klessig, 1977) and are referred to as being “spliced.” These observations have profoundly influenced theories concerning the mechanisms of mRNA biosynthesis and regulation of gene expression in eucaryotes, since it appears that this type of spliced structure may be a general one for eucaryotic messenger RNAs (Breathnach, Mandel and Chambon, 1977; Brack and Tonegawa, 1977; Jeffreys and Flavell, 1977; Tilghman et al., 1978).

This paper defines the structure of viral cytoplasmic RNAs synthesized during the early phase of infection with Ad2. These RNAs are probably synthesized entirely as a result of host cell functions. Of thirteen cytoplasmic RNAs transcribed from four regions of the Ad2 genome, eleven have a spliced structure. These findings demonstrate that functions required for the synthesis of these spliced RNA molecules exist in uninfected cells, and that splicing is, in fact, a general feature of eucaryotic mRNA structure.

The early mRNAs of Ad2 are transcribed from four regions of the viral genome (Sharp, Gallimore and Flint, 1974; Pettersson, Tibbetts and Philipson, 1976). From analysis of pulse-labeled nascent chains and studies of the sensitivity of early mRNA synthesis to ultraviolet irradiation, it appears that each of these early regions has at least one independent promoter for the initiation of transcription (Berk and Sharp, 1977a; Evans et al., 1977). In this work, we show that a cytoplasmic RNA from an early region typically has a sequence at its 5’ end which maps onto the genome in a segment believed to contain the promoter for transcription of that early region. The cytoplasmic RNAs from a single early region often contain extensively overlapping sequences. The differences between these RNAs consist of relatively subtle alterations of the pattern in which genome sequences are spliced together. These findings are consistent with a model for early Ad2 mRNA biosynthesis in which each early region is initially transcribed into a long RNA. The initial transcript is then processed into a mature mRNA by removal or “splicing out” of internal sequences (Berget et al., 1977b; Klessig, 1977).

Results

Strategy and Terminology

For the purposes of this paper, we define “colinear transcripts” or “co-transcripts” to be the sequences within an RNA molecule which are transcribed from a contiguous segment of a DNA genome. A spliced RNA molecule is composed of two or more co-transcripts joined at points defined as “splice points.” Biochemical methods have been described for accurately mapping the regions of viral genomes which encode co-transcripts present at low cellular concentrations (Berk and Sharp, 1977b). In those studies, co-transcripts 350 nucleotides or longer, found in the cytoplasm of cells during the early phase of infection with Ad2, were detected by virtue of their ability to hybridize to contiguous segments of 32P-labeled DNA, and thereby to protect these hybridized sections of DNA from degradation by the single-strand-specific endonuclease S1. Here we describe the patterns in which these co-transcripts are joined at splice points.

Figure 1 depicts the structure of a hybrid between a spliced RNA molecule and genome DNA. RNA-DNA duplex forms over the lengths of the co-transcripts comprising the spliced RNA (a and b in the figure). Single-stranded DNA extends beyond the 5’ and 3’ ends of the spliced RNA, and a loop of single-stranded DNA occurs in the hybrid at the splice point in the RNA. The single-stranded DNA loop is formed by the intervening sequence in the genome which occurs between the sequences present in the RNA co-transcripts (length c in Figure 1). When the hybrid structure is digested with the single-strand-specific endonuclease S1, the single-stranded DNA is hydrolyzed. The resulting structure consists of two specific single-stranded DNA fragments, of lengths a and b, hybridized to the RNA molecule. In the experiments reported here, the RNA in such hybrids is unlabeled and the DNA is labeled with 32P. When such an S1-digested hybrid is analyzed by electrophoresis on a neutral gel, a single major band is observed following autoradiography of the gel. The band migrates at the mobility expected for a duplex DNA molecule equal in length to the spliced RNA molecule, that is, equal to length a + b. If the S1-digested hybrid is denatured and analyzed by electrophoresis, two bands are observed to migrate with the mobilities of the two RNA co-transcripts (lengths a and b).

The hybrid structure shown in Figure 1 can be analyzed further following digestion with the single-strand-specific exonuclease, exonuclease VII of E. coli. This exonuclease digests single-stranded DNA processively from both 5’ and 3’ ends, but ceases digestion at a duplex region (Chase and Richardson, 1974). Thus the product of digestion with exonuclease VII is a hybrid molecule in which a continuous DNA strand of length a + b + c is hybridized to the spliced RNA over the sequences a and b.
When this structure is denatured and analyzed by electrophoresis, a single radioactive band is observed migrating at the rate expected for a single-stranded DNA molecule of length a + b + c.

By hybridizing cytoplasmic RNA isolated from infected cells to 32P-labeled restriction fragments of Ad2 DNA, and analyzing the structure of the S1 and exonuclease VII digestion products on neutral and denaturing gels, it is possible to deduce the structure and mapping coordinates of spliced viral RNAs (Berk and Sharp, 1978). Data obtained by these methods confirm and extend previous mapping results for the early Ad2 co-transcripts, and define the structure of eleven spliced early Ad2 mRNAs produced from the four early regions.

Figure 1. Diagram of a Hybrid between a Spliced RNA Molecule and the Coding Strand of Genome DNA

The spliced RNA (wavy line) is composed of two colinear transcripts (a and b) joined at a splice point (A). An intervening sequence (c) occurs in the DNA between the sequences complementary to a and b.

Early Region 1

Early region 1 maps between ~1.5 and ~11.5 map units (Berk and Sharp, 1977b; Chow et al., 1977a), and is transcribed in the rightward direction (Sharp et al., 1974; Pettersson et al., 1976). Expression of early genes from this region of the genome is required for transformation by Ad2 (Graham et al., 1974; Williams, Young and Austin, 1974; Gallimore et al., 1975; Flint et al., 1976; Harrison, Graham and Williams, 1977; Graham, Harrison and Williams, 1978). Complementation studies with temperature-sensitive and host-range mutants of the closely related virus adenovirus 5 (Ad5) suggest that three early gene functions map in this region (Harrison et al., 1977).

The data required to deduce the structure of the early cytoplasmic RNAs encoded in this region are obtained from the gels shown in Figures 2a-2c. Table 1 lists the lengths of the major bands observed on the denaturing and native gels of S1- or exonuclease VII-digested RNA-DNA hybrids. The map positions of five co-transcripts in early region 1, as determined in previous work (Berk and Sharp, 1977b), are represented by heavy lines above the genome map in Figure 2d. Data from exonuclease VII digestion of RNA-DNA hybrids and analysis of S1 digestion products on neutral gels allow us to determine how these co-transcripts are linked together in spliced RNA molecules.
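The band patterns expected from the idealized hybrid of Figure 1 can be written out as a short sketch. The class and function names below are illustrative and not part of the paper. The example uses the 660 and 375 nucleotide co-transcripts discussed in this section. The loop length c = 165 is an assumption chosen only so that the sketch reproduces the nominal 1200 nucleotide exonuclease VII band; it is not a measured intron length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplicedHybrid:
    """Idealized hybrid of Figure 1: two co-transcripts and one DNA loop."""
    a: int  # length of the first co-transcript (nucleotides)
    b: int  # length of the second co-transcript (nucleotides)
    c: int  # length of the intervening genome sequence (nucleotides)


def s1_neutral_gel(h: SplicedHybrid) -> list[int]:
    """S1 cuts the single-stranded DNA, but the RNA still holds both
    protected DNA pieces together, so one band of length a + b is seen."""
    return [h.a + h.b]


def s1_denaturing_gel(h: SplicedHybrid) -> list[int]:
    """After denaturation the two protected DNA pieces migrate separately."""
    return sorted([h.a, h.b], reverse=True)


def exo7_denaturing_gel(h: SplicedHybrid) -> list[int]:
    """Exo VII digests only from the DNA ends and stops at duplex,
    so the loop c remains part of one continuous DNA strand."""
    return [h.a + h.b + h.c]


# Illustrative values: a and b from the text, c assumed (see lead-in).
hybrid = SplicedHybrid(a=660, b=375, c=165)
print("S1, neutral gel:       ", s1_neutral_gel(hybrid))
print("S1, denaturing gel:    ", s1_denaturing_gel(hybrid))
print("Exo VII, denaturing gel:", exo7_denaturing_gel(hybrid))
```

Comparing the three predictions with the observed bands for one restriction fragment is the basic inference step used throughout the Results. Real RNAs with more than one splice would need a list of co-transcripts and loops rather than the single pair modeled here.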
Table 1. Mobilities of Bands in Figure 2

Figure  Track  Fragment        Digestion  Observed(a)  Expected(a)
2a      1      Bam I-B         S1         1600         1850
                                          660          660
                                          405(b)       485
                                          375(b)       375
2a      2      Bam I-B         Exo VII    5900
                                          3600         3500
                                          2450         2400
                                          1200         1050
2a      3      Hpa I-E         Exo VII    1150         1025
2a      4      Hpa I-C         S1         1850         1850
                                          465          465
2a      5      Hpa I-C         Exo VII    2450         2400
2a      6      Sma I-E         S1         1650         1850
                                          373          375
2a      7      Sma I-E         Exo VII    2350         2300
2a      8      Bgl II-E        S1         1550         1700
                                          660          660
                                          465          405
                                          375(b)       375
2a      9      Bgl II-E        Exo VII    2850         2800
                                          1550         1700
                                          1200         1050
2a      10     Hind III-F + G  S1         2200(c)      2300(c)
                                          1900(c)      1900(c)
                                          1600(c)      1600(c)
                                          1150         1050
                                          660          660
                                          485          485
2a      11     Hind III-F + G  Exo VII    2350         2300
                                          2200         2100
                                          1900         1900
                                          1600         1600
                                          1200         1050
                                          1150         1050
2b      1      Sma I-J         S1         500          500
                                          475          475
2b      2      Sma I-J         Exo VII    510          500
                                          485          475
2b      3      Hpa I-E         S1         660          660
                                          475          475
                                          350          350
2b      4      Bgl II-B        S1         450          450
                                          160          160
2b      5      Bgl II-B        Exo VII    800          735
2b      6      Sma I-E         S1         375          375
                                          330          330
2c      1      Bam I-B         S1         2300         2300
                                          1800         1850
                                          1100         1035
                                          950          850
2c      2      Hpa I-E         S1         1050         1010
                                          900          825
2c      3      Hpa I-C         S1         2300         2300
                                          1800         1850
2c      4      Bgl II-E        S1         1650         1715
                                          1100         1035
2c      5      Sma I-E         S1         2150         2125
                                          1800         1850
2c      6      Hind III-F + G  S1         2150         2150(c)
                                          1950(c)      1950(c)
                                          1600(c)      1600(c)
                                          1150         1050
                                          1100         1035
                                          950          850

(a) Lengths are listed in nucleotides. Under “expected,” we have listed the lengths of S1- and Exo VII-resistant DNA calculated from the map coordinates of the proposed RNAs shown in Figure 2d. These values are in good agreement with the observed values taken from the gels shown in Figure 2. We could find no other scheme for the arrangement of RNAs in this region which agreed with the observed lengths of S1- and Exo VII-resistant DNAs.
(b) Molecular weights of these bands were also determined by electrophoresis on acrylamide gels.
(c) These bands are generated by hybridization to Hind III-F and result from RNAs encoded in region 4 (see Figure 6).

For example, exonuclease VII digestion of RNA-DNA hybrids formed between early cytoplasmic RNA and Bam I-B, which includes the entire sequence of early region 1, results in a single-stranded DNA molecule 1200 nucleotides in length (Figure 2a, track 2). All but approximately 25 nucleotides of this exonuclease VII-resistant DNA are contained in Hpa I-E (Figure 2a, track 3). When the hybrid to Hpa I-E is digested with S1 and resolved on a neutral agarose gel, RNA-DNA hybrids migrating at 1050 and 900 nucleotides are observed (Figure 2c, track 2). With the results of the previous data mapping the positions of the co-transcripts, we conclude that the 660 nucleotide co-transcript is linked to the 375 nucleotide co-transcript, and the 485 nucleotide co-transcript is linked to the 375 nucleotide co-transcript in the covalently continuous spliced RNA molecules (Figure 2d).

From all the data shown in Figure 2, we deduce the structure of four early Ad2 cytoplasmic RNAs mapping in region 1, which accumulate in the cytoplasm of HeLa cells 8 hr after infection in the presence of 20 µg/ml cytosine arabinoside. As discussed above, two cytoplasmic RNAs have 5’ sequences mapping at 1.5 ± 0.07 map units, 500 ± 25 nucleotides to the left of the Sma I-J/E boundary, and 3’ sequences mapping at 4.5 ± 0.06 map units, 25 ± 20 nucleotides to the right of the Hpa I-E/C junction. Since these mRNA molecules have the same 5’ and 3’ sequences, they generate one band of 1200 nucleotides on the alkaline gel of exonuclease VII-digested hybrid following hybridization to Bam I-B (0-30.5 map units), Bgl II-E (0-9.4 map units) and Hind III-G (0-7.5 map units), and a slightly shorter band following hybridization to Hpa I-E (0-4.4 map units) (Figure 2a, tracks 2, 9, 11 and 3, respectively). These two RNA molecules are approximately 1035 and 860 nucleotides long, respectively, and generate bands of approximately that length (migrating at 1100 and 950 nucleotides

relative to duplex DNA markers) on neutral gels of S1-treated hybrids following hybridization to Bam I-B, Bgl II-E and Hind III-G, and slightly shorter bands following hybridization to Hpa I-E (Figure 2c, tracks 1, 4, 6 and 2, respectively). The end points of the co-transcripts comprising these mRNAs are mapped more precisely relative to restriction endonuclease cleavage sites by acrylamide gel electrophoresis of the denatured products of S1-treated hybrids (Figure 2b).

Figure 2. Autoradiograms of S1- and Exo VII-Digested RNA-DNA Hybrids of Early Region 1 Cytoplasmic RNA, and Diagrams of the Deduced RNA Structures

(a) Alkaline agarose gel of S1- and Exo VII-digested early RNA-restriction fragment hybrids. Tracks 1, 4, 6, 8 and 10 are S1 digestion products of early RNA hybridized to Bam I-B, Hpa I-C, Sma I-E, Bgl II-E and Hind III-F+G, respectively. Tracks 2, 3, 5, 7, 9 and 11 are Exo VII digestion products of early RNA hybridized to Bam I-B, Hpa I-E, Hpa I-C, Sma I-E, Bgl II-E and Hind III-F+G, respectively. Tracks 12 and 13 are Hind III and Sma I marker digests of Ad2 DNA, respectively. The lengths (in kilobases) of the Ad2 Hind III restriction fragments are: A, 8.23; B, 5.08; C, 3.33; D, 3.19; E, 3.12; F, 2.73; G, 2.63 (F and G co-migrate); H, 2.21; I, 2.03; J, 1.30; K, 0.95. The Ad2 Sma I restriction fragments have lengths of: A, 7.04; B, 6.27 (A and B run as a doublet); C, 5.22; D, 4.24; E, 2.84; F, 2.31; G, 2.21; H, 1.51; I, 1.33; J, 1.05; K, 0.63 (not visible on this exposure).
(b) 8 M urea 5-10% acrylamide gel of single-stranded DNA. Tracks 1, 3, 4 and 6 are S1 products of early RNA hybridized to Sma I-J, Hpa I-E, Bgl II-B and Sma I-E, respectively. Tracks 2 and 5 are Exo VII products of early RNA hybridized to Sma I-J and Bgl II-B, respectively. Track 7 is a marker Hae III digest of SV40 DNA. The lengths (in nucleotides) of the Hae III fragments are: A, 1662; B, 752; C, 540; D, 372; E, 329; F,, 322; Fib, 298; F,, 300; G, 272; H, 193.
(c) Neutral agarose gel of S1-digested early RNA-restriction fragment hybrids. Hybridization was to: Bam I-B (track 1); Hpa I-E (track 2); Hpa I-C (track 3); Bgl II-E (track 4); Sma I-E (track 5); Hind III-F+G (track 6). Track 7 is a marker digest of Ad2 digested with Sma I. Fragments D, E, F, G, H, I and J are shown.
(d) Diagrams of the structure of region 1 cytoplasmic RNAs deduced from the data in (a, b and c). Genome sequences present in an RNA are represented by a line above the corresponding region of the genome map. (The genome map is marked off in map units and kilobases from the left end.) Caret symbols represent the joining of colinear transcripts into one RNA molecule, and the 3’ end of an RNA is indicated by an arrowhead. The more abundant cytoplasmic RNAs are represented by heavy lines, and the less abundant cytoplasmic RNAs by thin lines. Numbers above the lines represent the co-transcript lengths (in nucleotides). Restriction maps relevant to the analysis are represented below the genome map. The Bam I-B fragment is not shown because it spans from 0-30.5 map units.

A third early cytoplasmic RNA in region 1 maps directly to the right of the two mRNAs discussed above, and is also composed of two co-transcripts. The total length of this RNA, 2300 nucleotides, is determined from the neutral S1 gel following hybridization to Bam I-B (Figure 2c, track 1). It is composed of two co-transcripts of 1850 and 485 nucleotides, as observed by electrophoresis of denatured S1-treated hybrids (Figure 2a, tracks 1, 4 and 6; Figure 2b, track 4). The 5’ and 3’ sequences present in this mRNA are separated by 2400 nucleotides along the genome, as indicated by exonuclease VII digestion of the hybrids (Figure 2a, tracks 2 and 5). The left end of the 1850 nucleotide co-transcript is mapped at 4.5 ± 0.2 map units, 1150 ± 55 nucleotides from the Hind III-F/G junction and 1550 ± 80 nucleotides from the Bgl II-E/B junction (Figure 2a, tracks 8 and 10). The right end maps at 9.9 ± 0.005 map units, 160 ± 10 nucleotides to the right of the Bgl II-E/B junction (Figure 2b, track 4). The end points of the 485 nucleotide transcript are at 10.2 ± 0.05, 330 ± 20 nucleotides to the left of the Sma I-E/M junction, and at 11.6 ± 0.1, 155 ± 35 nucleotides to the right of the Sma I-E/M junction (Figure 2b, track 6).

The bands migrating at 1800 nucleotides on the neutral agarose gel following hybridization to Bam I-B, Hpa I-C and Sma I-E (Figure 2c, tracks 1, 3 and 5) are due to S1 cutting of the RNA chain at the splice point in a fraction of the 2300 nucleotide mRNA-DNA hybrids under the S1 digestion conditions used in these studies. Exonuclease VII digestion of these hybrids does not produce a band of this length on the alkaline agarose gel (Figure 2a, tracks 2, 5 and 7), indicating that most or all of the 1850 nucleotide co-transcript is spliced to the neighboring 485 nucleotide co-transcript.

Following hybridization of early RNA to Bam I-B DNA and digestion with exonuclease VII, a band is observed on the alkaline agarose gel migrating at 3600 nucleotides (Figure 2a, track 2). Bands are also observed at 2850 and 2350 nucleotides following exonuclease VII digestion of early RNA hybridized to Bgl II-E and Hind III-G, respectively (Figure 2a, tracks 9 and 11). This is the result expected for an RNA species which contains 5’ sequences from map position 1.5 and 3’ sequences from map position 11.6, that is, sequences from both ends of early region 1, as well as sequences at the Bgl II-E/B and Hind III-G/C junctions. Thus the 2300 nucleotide RNA appears to be composed of two species: one RNA contains two co-transcripts mapping from 4.5-9.9 and from 10.2-11.6, and the second RNA contains these two co-transcripts plus a third co-transcript with its 5’ end mapping at 1.5.

It seemed possible that all of the 2300 nucleotide RNA might contain the 5’ co-transcript mapping at 1.5. If such a co-transcript were sufficiently short or A-U rich, the RNA-DNA hybrid of this presumed co-transcript might be very close to its melting temperature under the conditions of the exonuclease VII digestion. Thus in some molecules, exonuclease VII might be blocked from processive digestion at position 1.5, but in another fraction of molecules, exonuclease VII might digest through these sequences. In this second fraction of molecules, processive digestion by exonuclease VII would be blocked by hybrid at position 4.5.
To test this possibility, early RNA was hybridized to Bgl II-E DNA, and the products were digested with exonuclease VII under standard conditions at 45, 35 and 25°C. At each temperature, approximately the same relative fraction of counts remained in the 1550 and 3000 nucleotide bands following hybridization to Bgl II-E and exonuclease VII digestion (Figure 3). (The difference in the observed mobility of the longer Exo VII-protected fragment in this experiment, 3000 nucleotides, and in the previous experiment, 2850 nucleotides, is due to an ~5% variation in length estimations determined from alkaline agarose gels.) The production of these two bands is therefore not due to the presence of an RNA-DNA hybrid at position 1.5 which is close to its Tm at 45°C under the conditions of exonuclease VII digestion. Rather, there are probably two separate species of RNAs present in the early cytoplasmic RNA preparation which lead to the production of these exonuclease VII-resistant single-stranded DNAs. One species has 5’ sequences from position 1.5 and 3’ sequences from position 11.6, whereas the other has 5’ and 3’ sequences mapping at 4.5 and 11.6, respectively.

The co-transcript at 1.5 map units is probably <50 nucleotides long because no doublet character to the 2300 nucleotide band is observed in neutral agarose gels of S1-treated early RNA-Bam I-B hybrids (Figure 2c, track 1), and because no bands <350 nucleotides are observed on 8 M urea gradient acrylamide gels of the denatured S1 products of early RNA hybridized to left end restriction fragments (Figure 2b, tracks 1 and 3). We estimate that it would be possible to detect both a doublet character to the bands on the neutral S1 gel and a band on the acrylamide gels if this proposed colinear transcript were >50 nucleotides in length.

Figure 3. Autoradiogram of an Alkaline Agarose Gel of Exo VII Products of Early RNA Hybridized to Bgl II-E

Exo VII digestion was at 45°C for 1 hr, 35°C for 2 hr or 25°C for 4 hr. The increased digestion times at 35 and 25°C were used to maintain a constant amount of exonuclease digestion at each temperature, assuming an approximate 2 fold decrease in rate per 10°C decrease in temperature. The band migrating at 3300 nucleotides is a full-length Bgl II-E fragment which was not digested, probably due to renaturation before Exo VII digestion.
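The digestion times used for the temperature series in Figure 3 follow directly from the stated assumption of an approximately 2 fold decrease in rate per 10°C decrease in temperature. A minimal sketch of that arithmetic is given below. The function name and the default parameter are illustrative, and the 2 fold factor is the approximation quoted in the legend, not a measured value.

```python
def compensated_time_hr(t_ref_hr: float, temp_ref_c: float, temp_c: float,
                        fold_per_10c: float = 2.0) -> float:
    """Digestion time at temp_c that gives the same extent of digestion
    as t_ref_hr at temp_ref_c, if the rate changes by fold_per_10c
    for every 10 degrees C."""
    return t_ref_hr * fold_per_10c ** ((temp_ref_c - temp_c) / 10.0)


# Reference condition from the Figure 3 legend: 1 hr at 45 degrees C.
for temp in (45, 35, 25):
    print(f"{temp} C: {compensated_time_hr(1.0, 45, temp):.1f} hr")
```

Under this assumption the three conditions give the same total amount of digestion, so any change in band ratios would reflect hybrid stability rather than digestion extent.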

Early Region 2

Early region 2 of Ad2 and of the closely related virus Ad5 is known to encode a phosphoprotein of molecular weight 72,000 daltons (as estimated by SDS-polyacrylamide gel electrophoresis) which has single-stranded DNA binding activity and is required for viral DNA replication (Ginsberg et al., 1974; Grodzicker et al., 1974; van der Vliet et al., 1975; Lewis et al., 1976; Sugawara, Gilead and Green, 1977). Transcription of this region in Ad2 is known to be in the leftward direction (Sharp et al., 1974; Pettersson et al., 1976), and production of mRNA from this region of the genome is more sensitive to ultraviolet irradiation of purified virions than is mRNA production from any of the other three early regions (Berk and Sharp, 1977b).

We have previously mapped two long co-transcripts in region 2 (Berk and Sharp, 1977b). These have lengths of 1600 and 1700 nucleotides, and have the same 3’ end mapping at 61.6 ± 0.1, and 5’ ends mapping at 66.4 ± 0.2 and 66.7 ± 0.2, respectively. Studies of the kinetics of ultraviolet inactivation of mRNA production from this region suggest that the promoter for transcription initiation in region 2 maps at approximately 10 map units from the 5’ ends of these colinear transcripts (Berk and Sharp, 1977b). To determine whether sequences from this putative promoter are covalently joined to the long colinear transcripts of region 2, the products of exonuclease VII digestions of early cytoplasmic RNA, hybridized to Sma I-A and Bgl II-C, were analyzed on alkaline agarose gels (Figure 4a, tracks 5 and 7). A prominent high molecular weight band migrating at 4550 nucleotides was observed in these experiments following hybridization to Sma I-A. A prominent band migrating at 3950 was also observed following hybridization to Bgl II-C. These results indicate that sequences from the region in which the promoter had been approximately mapped are indeed spliced to the 5’ ends of mRNAs from region 2. The 5’ end of this sequence maps at 74.9 ± 0.6, 3950 ± 200 nucleotides to the right of the Bgl II-J/C junction. Splicing does not occur at the 3’ end of these mRNAs because the length of the Bgl II-J fragment protected from exonuclease VII digestion is the same as that protected from endonuclease S1 digestion (Figure 4a, tracks 4 and 8).

At least one more splice, however, occurs toward the 5’ end of these mRNAs. This conclusion follows from the observation of a single band migrating at 2450 nucleotides on alkaline agarose gels of exonuclease VII-digested Eco RI-B-early cytoplasmic RNA hybrids (Figure 4a, track 6). The Eco RI-B fragment which maps from 58.5 to 70.7 does not contain a sequence homologous to the 5’ co-transcript of these messages, which maps at 74.9 ± 0.6. Thus the processive digestion of the Eco RI-B l strand hybridized to region 2 mRNAs is initiated at its 3’ end (at 70.7 map units), and proceeds until it is blocked by RNA-DNA hybrid, which occurs at sequence mapping position 68.6 ± 0.3, 2450 nucleotides to the right of the sequences encoding the 3’ end of region 2 mRNAs (Figure 4a, track 6). The early mRNAs from this region therefore contain a co-transcript comprised of sequences extending in the leftward direction along the genome from position 68.6 ± 0.3.

The lengths of the co-transcripts mapping at 74.9 ± 0.6 and at 68.6 ± 0.3 are determined as follows. Analysis of the S1 digestion products of early RNA hybridized to Eco RI-B (58.5-70.7), Sma I-A (56.9-77.0) and Bgl II-C (63.6-77.9) all yield products which migrate on neutral agarose gels, as expected for duplex molecules ~200 nucleotides longer than the long co-transcripts identified by analysis on alkaline agarose gels (Figures 4a and 4c). The sum of the lengths of the co-transcripts mapping at 68.6 ± 0.3 and at 74.9 ± 0.6 must therefore equal ~200 nucleotides.
Most of this length is accounted for by the co-transcript at 68.6 ± 0.3, since the native S1 products of early RNA hybridized to Eco RI-B (which does not include RNA-DNA hybrid at map position 74.9 ± 0.6) co-migrate with the native S1 products of early RNA hybridized to Sma I-A (which includes DNA complementary to all identified co-transcripts of the region 2 mRNAs) (Figure 4c, tracks 2 and 3). We estimate that the co-transcript mapping at 74.9 ± 0.6 is <50 nucleotides in length. It must be >~15 nucleotides in length to form a stable hybrid under the conditions of the exonuclease VII digestion. A more accurate estimate of the length of the co-transcript mapping at 68.6 ± 0.3 is obtained by analyzing the denatured S1 products of early RNA hybridized to Eco RI-B on 8 M urea gradient acrylamide gels (Figure 4b). The S1-generated complement of this co-transcript migrates as expected for a single-stranded DNA molecule 170 ± 10 nucleotides in length. On the basis of these results, we deduce the structures of early region 2 mRNAs which are shown in Figure 4d.

Figure 4. Autoradiograms of Gels of S1- and Exo VII-Digested RNA-DNA Hybrids from Early Region 2, and the Deduced Structure of the RNAs

(a) Alkaline agarose gel of: S1 products of early RNA hybridized to Sma I-A (track 1), Eco RI-B (track 2), Bgl II-C (track 3) and Bgl II-J (track 4); Exo VII products of early RNA hybridized to Sma I-A (track 5), Eco RI-B (track 6), Bgl II-C (track 7) and Bgl II-J (track 8). Marker Sma I and Hind III digests of Ad2 DNA are in tracks 9 and 10, respectively.
(b) 8 M urea gradient polyacrylamide gel electrophoresis of S1 products of early RNA hybridized to Eco RI-B (track 1) and marker Hae III digest of SV40 DNA (track 2).
(c) Neutral agarose gel of early RNA hybridized to Sma I-A (track 2), Eco RI-B (track 3), Bgl II-C (track 4) and Bgl II-J (track 5). Marker Sma I digest of Ad2 (track 1).
(d) Diagram of the deduced structures of the early region 2 mRNAs. The genome is represented by a line marked off in map units and kb from the left end. Heavy lines above the map represent the corresponding genome sequences of the strand which are present in the mRNAs. Thin-lined caret symbols represent the joining of genome sequences at splice points. The arrowhead represents the 3’ end of the RNA, and the lengths (in nucleotides) of co-transcripts are indicated above the heavy lines representing the co-transcripts. Restriction-cut sites and fragments relevant to the analysis are diagrammed below the genome map.

A prominent band migrating at 650 nucleotides is observed on alkaline gels of exonuclease VII-digested hybrid to restriction fragments Sma I-A and Bgl II-C (Figure 4a, tracks 5 and 7). The same band is observed following exonuclease VII digestion of these denatured restriction fragments when they are not hybridized to RNA. It results from the presence of a short inverted repeated sequence in this region of the genome. The repeated sequence is separated by 650 nucleotides. This feature of the Ad2 sequence was first noted in electron microscopic studies of single-stranded Ad2 DNA by Wu, Roberts and Davidson (1977). These investigators mapped the “structural feature” at approximately 73 map units. By electron microscopy, we have mapped these inverted repeated sequences more precisely relative to the ends of the Bgl II-C restriction fragment. The repeated sequences (<100 nucleotides long) map at positions 72.7 ± 0.4 and 74.3 ± 0.3.

Bands are observed on the alkaline agarose gels of exonuclease VII-digested hybrid to Sma I-A and Bgl II-C, which migrate at 4250 and 3650 nucleotides, respectively (Figure 4a, tracks 5 and 7). We believe that these bands result from hybridization of degraded region 2 mRNAs to these restriction fragments. Exonuclease VII digestion of the hybridized DNA strands in these hybrids is blocked at the left end by RNA-DNA hybrid to the 3’ end of the region 2 mRNAs, and at the right end at position 74.3 by the duplex stem of the loop formed by the inverted repeat. Exonuclease VII digestion of hybrids formed with fragmented DNA strands would yield DNA segments with right ends at position 68.6. This process would explain the occurrence of bands migrating at 2450 and 1750 nucleotides following exonuclease VII digestion of RNA hybridized to Sma I-A and Bgl II-C, respectively (Figure 4a, tracks 5 and 7). In addition, minor bands migrating between 2000 and 3000 nucleotides are observed in this experiment following hybridization to Bgl II-C and exonuclease VII digestion (Figure 4a, track 7). These may be due to incomplete digestion by exonuclease VII, since corresponding exonuclease VII-resistant DNAs were not observed in the experiment with Sma I-A.
In control experiments, 32P-labeled Bgl II-C single strands (63.6 to 77.9 map units) were hybridized to an excess of Eco RI-B single strands (58.5 to 70.7 map units). Digestion of these hybrids with exonuclease VII and analysis of the products by alkaline agarose gel electrophoresis indicate that digestion is indeed blocked at the right end by the inverted repeat at 74.3, and at the left end by the DNA-DNA hybrid region at 63.6. This exonuclease VII-resistant product co-migrates on alkaline agarose gels with the minor band migrating at 3650 nucleotides, resulting from exonuclease VII digestion of Bgl II-C-region 2 mRNA hybrids (data not shown). In addition, in the control exonuclease VII digestion, a band was observed migrating at the mobility expected for a fragment mapping from 63.6 (the left end Bgl II-C cleavage site) to 70.7 (the right end Eco RI-B cleavage site). This could result from exonuclease VII digestion of hybrid between an Eco RI-B strand and a fragment of the Bgl II-C strand in which the inverted repeat was lost. These results are consistent with the proposed explanations for the production of the minor bands observed in the region 2 exonuclease VII experiments.

The relative intensities of the 1600 and 1700 nucleotide bands observed on alkaline gels, and of the 1800 and 1900 nucleotide bands observed on neutral gels following S1 treatment of region 2 hybrids, varied between different early RNA preparations, indicating that the relative concentrations of the mRNAs from region 2 varied. Figure 4a shows an alkaline agarose gel of S1-treated hybrids using an RNA preparation which had very low concentrations of the 1600 nucleotide co-transcript. It is noteworthy that only a 1700 nucleotide co-transcript is detected in this region of Ad5-infected HeLa cells (T. Harrison, unpublished results).
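The exonuclease VII band lengths quoted for region 2 can be checked against the map coordinates with a rough calculation. The sketch below is illustrative only. It assumes about 350 nucleotides per map unit (a genome of roughly 35 kb divided into 100 map units, which is not stated in the text). It also places the two short co-transcripts using the 170 nucleotide estimate and the <50 nucleotide upper bound given above. The function name and data layout are hypothetical.

```python
NT_PER_MAP_UNIT = 350  # assumption: ~35 kb genome / 100 map units


def exo7_protected_nt(fragment, segments):
    """Length of the DNA strand left after Exo VII digestion.

    fragment: (left, right) ends of the restriction fragment, in map units.
    segments: (left, right) genome positions of each co-transcript.
    Exo VII digests inward from both fragment ends and stops at the
    outermost RNA-DNA duplex, so everything between the outermost
    duplexes (loops included) is retained.
    """
    f_left, f_right = fragment
    clipped = []
    for left, right in segments:
        lo, hi = max(left, f_left), min(right, f_right)
        if hi > lo:  # keep only segments with homology in this fragment
            clipped.append((lo, hi))
    if not clipped:
        return 0
    span = max(hi for _, hi in clipped) - min(lo for lo, _ in clipped)
    return round(span * NT_PER_MAP_UNIT)


# Region 2 mRNA (the longer species), positions in map units.
region2_rna = [
    (61.6, 66.7),                          # long co-transcript
    (68.6 - 170 / NT_PER_MAP_UNIT, 68.6),  # ~170 nt co-transcript
    (74.9 - 50 / NT_PER_MAP_UNIT, 74.9),   # 5' co-transcript, <50 nt (upper bound used)
]

fragments = {
    "Sma I-A": (56.9, 77.0),
    "Bgl II-C": (63.6, 77.9),
    "Eco RI-B": (58.5, 70.7),
}

for name, frag in fragments.items():
    print(name, exo7_protected_nt(frag, region2_rna))
```

With these assumptions the sketch gives about 4655, 3955 and 2450 nucleotides for Sma I-A, Bgl II-C and Eco RI-B, to be compared with the observed bands at 4550, 3950 and 2450 nucleotides. The Eco RI-B case shows why the 2450 nucleotide band implies an internal co-transcript at 68.6: the fragment lacks the sequences at 74.9, so digestion from the right end proceeds to the next duplex.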

Early Region 3

Early region 3 is transcribed in the rightward direction (Sharp et al., 1974; Pettersson et al., 1976), and maps between roughly 76 and 86 map units (Berk and Sharp, 1977a; Chow et al., 1977a). Large portions of the rightward part of this region are not required for lytic growth of adenoviruses in tissue culture, since they are substituted by SV40 DNA in nondefective Ad2-SV40 hybrid viruses (Kelly and Lewis, 1973) and deleted in recently selected viable Ad5 deletion mutants (Jones and Shenk, 1978). Viable deletion mutants extending to the left of ~79 map units, however, have not been obtained (Jones and Shenk, 1978).

We have previously determined the map positions of five co-transcripts in region 3 (Berk and Sharp, 1977a). An ~350 nucleotide co-transcript was found to have its 5’ end between 76 and 77 map units. Co-transcripts of 1500 and 2400 nucleotides have their 3’ ends at 83.5±0.3 and 86.0±0.3 map units, respectively. In addition, two co-transcripts were observed to have a 5’ end at 76.8±0.2, possibly the same position as the 5’ end of the ~350 nucleotide transcript, and had 3’ ends coincident with the 3’ ends of the 1500 and 2400 nucleotide co-transcripts. These co-transcripts are observed following electrophoresis of denatured S1-treated hybrids to Sma I-C (Figure 5a, track 1; Figure 5b, track 1). Acrylamide gel electrophoresis provides a more precise length measurement of the short co-transcript from this region, 340±20 nucleotides [Figure 5b, track 1; also results of hybridization to Bgl II-C (data not shown)].

The results of exonuclease VII analysis of cytoplasmic RNAs from this region demonstrate that the 340 nucleotide transcript is spliced to the 5’ end of both the 1500 and 2400 nucleotide co-transcripts. Only two bands are observed following exonuclease VII treatment, and these co-migrate with the 2200 and 3100 nucleotide bands generated by S1 (Figure 5a, track 3). This indicates that the low abundance 2200 and 3100 nucleotide transcripts do not have a spliced structure but exist as unspliced transcripts. The structures of the spliced cytoplasmic RNAs from this region are confirmed by determination of total RNA lengths by neutral agarose gel electrophoresis of S1-treated Sma I-C hybrids. Bands are seen at approximately 1850 and 2750 nucleotides, as predicted by the splicing pattern described above (Figure 5c, track 1). Again, as expected, bands equal in length to the unspliced transcripts of lengths 2200 and 3100 nucleotides are observed (Figure 5c, track 1).

The 5’ ends of the 1500 and 2400 nucleotide co-transcripts are mapped at 79.1±0.06, 275±20 nucleotides to the left of the Hind III-H/L junction (Figure 5b, track 1). The 5’ end of the 340 nucleotide co-transcript is mapped at 76.8±0.2, 1100±60 nucleotides to the left of the Hind III-H/L junction (Figure 5a, tracks 2 and 4). The deduced structures of the region 3 cytoplasmic RNAs are depicted in Figure 5d.
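The logic that links a splicing pattern to the three kinds of gel bands used in this analysis can be sketched in a few lines. The example below is an illustrative model only, not the authors' procedure: the class `CoTranscript`, the function `predict_bands`, and the choice of the Hind III-H/L junction as coordinate zero are assumptions introduced here. It uses the region 3 values quoted above (a 340 nucleotide 5’ co-transcript starting 1100 nucleotides to the left of the junction, and a 1500 nucleotide co-transcript starting 275 nucleotides to the left of it).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoTranscript:
    """A colinear genome segment present in an RNA (coordinates in nucleotides)."""
    start: int  # left end on the genome
    end: int    # right end on the genome

    @property
    def length(self) -> int:
        return self.end - self.start


def predict_bands(co_transcripts: list[CoTranscript]) -> dict[str, object]:
    """Predict protected DNA sizes for one spliced RNA hybridized to genome DNA."""
    ordered = sorted(co_transcripts, key=lambda c: c.start)
    return {
        # S1 cuts the single-stranded DNA loop at each splice point, so a
        # denaturing (alkaline) gel shows one DNA piece per co-transcript.
        "s1_alkaline": [c.length for c in ordered],
        # On a neutral gel the continuous RNA holds the pieces together,
        # so the hybrid migrates at the total RNA length.
        "s1_neutral": sum(c.length for c in ordered),
        # Exonuclease VII digests only from free DNA ends, so the DNA between
        # the outermost ends of the hybrid survives, loop included.
        "exo7_alkaline": ordered[-1].end - ordered[0].start,
    }


# Region 3 example; coordinates are relative to the Hind III-H/L junction (= 0).
leader = CoTranscript(start=-1100, end=-1100 + 340)
body = CoTranscript(start=-275, end=-275 + 1500)
bands = predict_bands([leader, body])
print(bands)
```

In this sketch the neutral-gel prediction (1840 nucleotides) is close to the approximately 1850 nucleotide band reported above, and the exonuclease VII prediction (2325 nucleotides) is in the neighborhood of the 2200 nucleotide band, allowing for the stated uncertainties in the mapped positions.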

Early Region 4

Region 4 is believed to encode at least two early viral proteins which have been identified in tissue culture cells during the early phase of viral infection: an 11,000 dalton protein and a less prominent 19,000 dalton protein (Lewis et al., 1976). Transcription of region 4 is in the leftward direction (Sharp et al., 1974; Pettersson et al., 1976). We have previously mapped four co-transcripts in region 4. All have their 3’ end at 91.1±0.4 map units, 350±125 nucleotides to the left of the Sma I-C/G junction (Berk and Sharp, 1977a). The most abundant of these is 1900 nucleotides long. Three less abundant transcripts have lengths of 1500, 1600 and 2300 nucleotides. The 1500 nucleotide co-transcript is not reproducibly observed in the experiments reported here, but the other three co-transcripts are observed following S1 treatment and alkaline gel electrophoresis of hybrids to Eco RI-C (Figure 6a, tracks 1 and 2).

Following exonuclease VII digestion of Eco RI-C hybrids, only a single band is observed migrating at 2800 nucleotides (Figure 6a, track 4). This observation indicates that the three co-transcripts observed in these studies all have sequences spliced to their 5’ ends which map 2800±150 nucleotides to the right of their 3’ ends; that is, at 99.1±0.6 map units. When early RNA is hybridized to restriction fragments Hpa I-D (85.0-98.5) and Sma I-G (91.9-98.2), and the hybrids are digested with exonuclease VII, the products co-migrate with the products of S1 digestion on alkaline agarose gels (Figure 6a, tracks 5 and 6). This indicates that there are no splices in the early RNA sequences mapping between 85.0 and 98.5. Thus we conclude that the three region 4 early RNAs observed here are composed of two co-transcripts each: the 1900, 2300 and 1600 nucleotide co-transcripts having 3’ ends mapping at 91.1±0.3 are each spliced to a co-transcript mapping at 99.1±0.6.

No short co-transcripts were detected on a 5-25% gradient acrylamide gel following S1 digestion of Eco RI-C hybrids. We conclude that the co-transcript(s) at the 5’ end of the region 4 RNAs is (are) <100 nucleotides in length. These co-transcripts mapping at 99.1±0.6 are likely to be at least 15 nucleotides in length to form stable hybrid under the conditions of the exonuclease VII digestion. The short length of the 5’ co-transcripts on these RNAs is confirmed by the finding that the products of S1 digestion of hybrids to Eco RI-C (which includes the 5’ co-transcript sequence) and Hpa I-D (which does not) co-migrate on a neutral agarose gel (Figure 6b, tracks 1 and 2). We estimate that we could have detected a difference in molecular weight of 50 nucleotides. We therefore conclude that the 5’ co-transcript(s) spliced at the 5’ ends of the region 4 RNAs is <50 nucleotides in length and maps at 99.9±0.6 map units (Figure 6c).
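The position of the 5’ co-transcript follows from the 3’ end position and the length of the exonuclease VII product. The sketch below reproduces that arithmetic under two assumptions that are not stated in the text: a conversion of about 350 nucleotides per map unit (a genome of roughly 35,000 nucleotides), and independent measurement errors combined in quadrature. The function name `offset_position` is hypothetical.

```python
import math

# Assumption: ~35,000-nucleotide genome divided into 100 map units.
NT_PER_MAP_UNIT = 350.0


def offset_position(anchor_mu, anchor_err_mu, distance_nt, distance_err_nt):
    """Map position a given distance to the right of an anchor, with uncertainty.

    The two errors are combined in quadrature, which assumes they are independent.
    """
    position = anchor_mu + distance_nt / NT_PER_MAP_UNIT
    error = math.hypot(anchor_err_mu, distance_err_nt / NT_PER_MAP_UNIT)
    return position, error


# 3' end at 91.1 +/- 0.4 map units; exonuclease VII product of 2800 +/- 150 nucleotides.
pos, err = offset_position(91.1, 0.4, 2800, 150)
print(f"{pos:.1f} +/- {err:.1f} map units")
```

With these assumptions the result is 99.1 with an uncertainty of about 0.6 map units, in line with the value quoted above; whether the authors combined errors this way is not stated.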

Relative Abundance of the Early RNAs

To determine the relative abundance of the early cytoplasmic RNAs, total Ad2 DNA was hybridized to early cytoplasmic RNA. The products of S1 digestion were resolved by electrophoresis on a neutral agarose gel, along with the native S1 products of separate hybridizations to restriction fragments encompassing each of the four early regions (Figure 7). Each hybridization was performed with the same molar concentration of Ad2 sequences, and Ad2 DNA was present in an approximately 10-fold molar excess over the most abundant viral RNA. Because the DNA is present in excess during this hybridization, the relative intensities of the bands in Figure 7 reflect the relative abundance of the early Ad2 cytoplasmic RNAs. Approximately the same relative intensities of bands on a gel of S1-treated hybrids were observed in a second experiment, in which the hybridization was performed with a 10-fold greater Ad2 DNA concentration than used here (data not shown). In this second experiment, hybridization proceeded for Cot1/2 × 10, or to >99% completion. Thus the relative intensities of the bands in Figure 7 reflect the relative concentrations of the RNAs which are represented by

Figure 5. Autoradiograms of Gels of S1- and Exo VII-Digested Early RNA-DNA Hybrids from Early Region 3, and the Deduced Structures of the RNAs

(a) Alkaline agarose gel of: the S1 products of early RNA hybridized to Sma I-C (track 1) and Hind III-H (track 2); the Exo VII products of early RNA hybridized to Sma I-C (track 3) and Hind III-H (track 4). Track 5 is a marker Sma I digest of Ad2. There is a faint band in track 1 migrating at ~1200 nucleotides which we have not interpreted because a corresponding band has not been observed in hybridizations to other restriction fragments.
(b) 8 M urea gradient polyacrylamide gel of the S1 products of early RNA hybridized to Hind III-H (track 1) and a marker Hae III digest of SV40 DNA (track 2).
(c) Neutral agarose gel of the S1 products of early RNA hybridized to Sma I-C (track 1) and Hind III-H (track 2), and of a marker Sma I-A digest of Ad2 DNA. The faint band migrating at 2.21 kb in track 2 is due to a small fraction of renatured Hind III-H DNA.
(d) Diagram of the deduced structure of the region 3 cytoplasmic RNAs. The relevant restriction maps are shown below the genome map. Co-transcripts comprising the cytoplasmic RNAs are represented by lines above genome sequences included in the co-transcripts. Bold lines represent abundant cytoplasmic RNAs, and thin lines represent less abundant cytoplasmic RNAs. Caret symbols connecting two co-transcripts indicate that they are joined at a splice point into one covalently continuous RNA chain, and arrowheads indicate the 3’ ends of the RNAs. The lengths in nucleotides of the co-transcripts are indicated by numbers above the bold and narrow lines.

these bands, and not differences in the hybridization kinetics of various regions of the genome under the conditions of the analysis.

Examination of Figure 7 reveals that the region 2 and 3 cytoplasmic RNAs are the most abundant early cytoplasmic RNAs 8 hr post-infection in the

Figure 6. Autoradiograms of Gels of S1 and Exo VII Products of Region 4 Early RNA-DNA Hybrids, and Diagrams of the Deduced Structures of the RNAs

(a) Alkaline agarose gel of S1 and Exo VII products of RNA-DNA hybrids. S1 products of early RNA hybridized to Eco RI-C (track 1), Hpa I-D (track 2) and Sma I-G (track 3). Exo VII products of hybridizations to Eco RI-C (track 4), Hpa I-D (track 5) and Sma I-G (track 6). Tracks 7 and 8 are markers of Ad2 DNA digested with Sma I and Hind III, respectively.
(b) Neutral agarose gel of S1 products of early RNA hybridized to Eco RI-C (track 1), Hpa I-D (track 2) and Sma I-G (track 3). Marker Sma I digest of Ad2 DNA (track 4). The material migrating between 1650 and 1950 nucleotides in track 2 was not reproducibly observed.
(c) Deduced structures of the early region 4 cytoplasmic RNAs presented as in Figure 2.

presence of cytosine arabinoside. The next most abundant mRNAs are transcribed from early region 1, and the least abundant are transcribed from early region 4. These results are in good agreement with earlier studies, which employed analysis of aqueous hybridizations to separated DNA strands to enumerate the number of early cytoplasmic RNAs transcribed from each of the four regions accumulating in HeLa cells 8 hr post-infection (Flint and Sharp, 1976). The approximate fraction of total early cytoplasmic RNA represented by each of the detected RNA species was determined by densitometry of the autoradiogram in Figure 7, and the results are presented in Table 2.
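The quantities tabulated from the densitometry can be related to one another by simple arithmetic. The following sketch is an assumption about how such values might be derived, not a description of the authors' calculation: the share of a region is taken as the mass fraction within that region, and relative molarity as mass divided by length, normalized to a reference RNA. The function names are hypothetical; the inputs are the region 1 entries of Table 2 and the 1700 nucleotide region 2 RNA (26% of total) as reference.

```python
def percent_of_region(mass_percents):
    """Share of each RNA within its region, from its share of total RNA mass."""
    total = sum(mass_percents)
    return [100.0 * m / total for m in mass_percents]


def relative_molarity(mass_percent, length_nt, ref_mass_percent, ref_length_nt):
    """Molar abundance relative to a reference RNA (mass divided by length)."""
    return (mass_percent / length_nt) / (ref_mass_percent / ref_length_nt)


# Region 1 entries of Table 2: RNA lengths (nucleotides) and % of total RNA.
lengths = [2366, 1035, 860]
masses = [8.8, 5.6, 1.4]

shares = percent_of_region(masses)
molarities = [relative_molarity(m, n, 26.0, 1700) for m, n in zip(masses, lengths)]

print([round(s, 1) for s in shares])
print([round(x, 2) for x in molarities])
```

The computed regional shares agree with the tabulated 56, 35 and 8.9%, while the computed molarities come out slightly higher than the tabulated 0.23, 0.33 and 0.10, so the normalization used for the table may differ somewhat from the one assumed here.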

Discussion

Structure of the Early Ad2 Cytoplasmic RNAs

We have defined the general structure of the early Ad2 cytoplasmic RNAs by biochemical methods outlined earlier (Berk and Sharp, 1978). The structures of these RNAs encoded in early regions 1, 2, 3 and 4 (from left to right along the genome) are summarized in Figures 2d, 4d, 5d and 6c, respectively. In these figures, the genome sequences incorporated into an RNA are represented by a line above the genome map. Continuously transcribed sequences, called “co-transcripts,” which are joined into one covalently continuous RNA chain, are connected by a caret symbol. Abundant RNAs are represented by bold lines, and RNAs present at lower cellular concentrations are represented by thin lines. All the abundant stable cytoplasmic RNAs, which we assume to be mRNAs, have a spliced structure.

Biosynthesis of the Early mRNAs

Studies of the transcription and structure of viral nuclear RNA during the late phase of infection (Bachenheimer and Darnell, 1975; Berget et al., 1977b; Goldberg, Weber and Darnell, 1977) led us to suggest that mature mRNAs are produced by post-transcriptional splicing (Berget et al., 1977b); that is, a long initial transcript is processed into message by removal of specific internal sequences and ligation at splice points of the resulting RNA chains. Available data concerning the biosynthesis of the early Ad2 mRNAs are also consistent with the post-transcriptional splicing model. The model predicts that the size of the initial transcript from which an mRNA is processed will be at least as long as the length of genome DNA spanned by the sequences encoding the co-transcripts of that mRNA. The model also predicts that the order in which co-transcripts are spliced will be the same as the order in which they map on the genome. Direct sizing of pulse-labeled nuclear RNA molecules transcribed from the four early regions (Craig and Raskas, 1976), and studies of the kinetics of inactivation of early mRNA production by ultraviolet irradiation of purified virions (Berk and Sharp, 1977a), indicate that the sizes of the initial transcripts from each of the early regions are equal to the lengths predicted by the post-transcriptional splicing model (Figures 2d, 4d, 5d and 6c). In addition, the order in which co-transcripts are spliced together is the same as the order in which they map on the viral genome. Furthermore, the map positions of the 5’ co-transcripts of the early mRNAs are approximately coincident with the sites of initiation of transcription (Berk and Sharp, 1977a; Evans et al., 1977). Thus splicing of the early Ad2 mRNAs may frequently conserve the 5’ sequences of the initial transcript.

The early Ad2 mRNAs are synthesized by host cell functions. Since this is the case, the enzymatic activities required to produce the variety of splicing patterns found in the early Ad2 mRNAs must preexist in the uninfected cell. It is not surprising, therefore, that many eucaryotic cellular mRNAs have been found to have a spliced structure (Mandel and Chambon, 1977; Brack and Tonegawa, 1977; Jeffreys and Flavell, 1977; Tilghman et al., 1978). Some possible benefits of a spliced gene structure have been presented (Gilbert, 1978).

In the case of the early Ad2 genes, splicing may have a role in the regulation of gene expression. As discussed above, initial transcripts may be processed into one of several possible mRNAs as the result of splicing. Factors such as the primary structure of an initial transcript and the activity of enzymes which produce a particular splice would be expected to determine the relative numbers of the possible mRNAs produced from a transcriptional unit. The activities of such enzymes might be regulated. Thus regulation of gene expression in Ad2 may occur at two levels. First, initiation of transcription may be regulated as in the switch from the early to the late phase of infection. Second, once transcription occurs, the number of each of the possible mRNAs produced from a transcriptional unit may be further controlled by the regulation of splicing functions.

Figure 7. Relative Abundance of the Early Cytoplasmic RNAs

Early cytoplasmic RNA was hybridized to an excess of 32P-labeled Ad2 DNA or restriction fragments encompassing each of the early regions. The resulting hybrids were digested with S1 and resolved by electrophoresis on a neutral agarose gel. The bands observed on the autoradiogram result from S1-resistant RNA-DNA hybrids which migrate as expected for the full-length cytoplasmic RNAs. Hybridization of early RNA was to: full-length Ad2 (track 1); Bam I-B (early region 1, track 2); Sma I-A (early region 2, track 3); Sma I-C (early region 3, track 4); or Eco RI-C (early region 4, track 5). Each band present in track 1 co-migrates with a band in one of [...] these early cytoplasmic RNAs, and is equal to the relative intensity of the corresponding band in track 1. The relative concentrations of the cytoplasmic RNAs from any one early region are equal to the relative intensities of the bands in the track from that early region (tracks 1-5).

Table 2. Relative Abundance of the Early Ad2 Cytoplasmic RNAs

Region   RNA    % Total RNA   Relative Molarity   % Region
1        2366   8.8           0.23                56
         1035   5.6           0.33                35
         860    1.4           0.10                8.9
2        1900   26            0.63                46
         1700   26            1.00                52
3        3100   0.4           0.0076              1.4
         2750   1.7           0.036               7.2
         2200   3.6           0.10                18
         1850   16            0.59                75
4        2300   0.4           0.011               6.6
         1900   5.6           0.16                85
         1600   0.5           0.019               0.3

RNAs are listed by their lengths in nucleotides. Values are derived from the densitometry of the gel shown in Figure 7. Relative masses of the RNA species are listed under “%.” The relative molarities of each of the species are also shown.

Comparison with Previous Work

The results reported here are in general agreement with the results of the electron microscopic study of early Ad2 cytoplasmic RNA-genome DNA hybrids published while this work was in progress (Kitchingman, Lai and Westphal, 1977). In particular, the electron microscopic structure of mRNA-DNA hybrids indicated that region 2 mRNAs are composed of three co-transcripts fused at two splice points, and that region 3 mRNAs are composed of two co-transcripts fused at one splice point. The resolution of the electron microscopic techniques used, however, did not allow definition of the two distinct cytoplasmic RNAs encoded in region 2, nor of the four cytoplasmic RNAs encoded in region 3. In addition, some splices in the region 1 mRNAs were not detected, because the intervening sequences between splice points in region 1 mRNAs form single-stranded DNA loops in mRNA-genome DNA hybrids which are too small to be readily visualized in the electron microscope.

The structures of the early Ad2 mRNAs presented here are in excellent agreement with earlier estimates of the approximate lengths and mapping coordinates of early Ad2 mRNAs (Craig, Zimmer and Raskas, 1975; Buttner, Veres-Molnar and Green, 1976; Flint, 1977; Wilson et al., 1978). Differences [particularly the failure to detect a 900 nucleotide mRNA composed of two 450 nucleotide co-transcripts from the right half of early region 1 detected by Kitchingman et al. (1977) and in late RNA by Chow et al. (1977b)] are probably due to differences in viral RNAs expressed in cytosine arabinoside as opposed to cycloheximide-treated cells (Wilson et al., 1978). In our studies, early mRNA was prepared from cells treated with cytosine arabinoside to prevent entry into the late phase of infection. Using our methodology, we find (T. Harrison et al., unpublished results) that the early RNAs produced by wild-type Ad5 in the presence of cytosine arabinoside are identical to those produced in cells infected at the nonpermissive temperature with the early temperature-sensitive Ad5 mutant, ts 125 (Ensinger and Ginsberg, 1972). In addition, the region 1 early Ad2 mRNAs detected here are identical to the region 1 early mRNAs detected in Ad2-transformed cells (A. Berk, unpublished results).

The deduced structures of the early Ad2 mRNAs (Figures 2d, 4d, 5d and 6c) may be underestimates of the complexity of the splicing patterns present in these mRNAs. It is possible that additional co-transcripts are present in the early mRNAs which are too short to form stable hybrids with genome DNA under the conditions of the biochemical analysis. Such very short co-transcripts (<15 nucleotides in length) would not be detected by the methods we used.

In general, we detect more cytoplasmic RNAs from a particular region of Ad2 than necessary to account for the identified early polypeptides encoded in these regions (Lewis et al., 1976). In early region 1 we detect three mRNAs, while only two polypeptides of 15,000 and 55,000 daltons have been shown to map in this region. We detect two largely homologous mRNAs in region 2, which has been identified to encode only a single polypeptide of 72,000 daltons. In region 3 we detect four RNAs where one 15,500 dalton polypeptide has been mapped, and in region 4 we detect three RNAs where two polypeptides of 11,000 and 19,000 daltons have been mapped. While we do not know that all of the observed cytoplasmic RNAs are mRNAs, the early RNA data suggest that there are more early viral encoded polypeptides than have been detected thus far.

Similarity of mRNAs from the Oncogenic Regions of Ad2 and SV40

The structures of the mRNAs detected from early region 1 are shown in Figure 2d. It is interesting to note that there is a similarity between the splicing pattern of these mRNAs and the splicing pattern of the early mRNAs of SV40 (Berk and Sharp, 1978). This similarity is especially noteworthy in view of the impact of the early functions encoded in these regions on the physiology of mammalian cells. Considerable evidence has accumulated to indicate that early SV40 functions are responsible for transformation by this virus. Similarly, expression of early functions encoded in region 1 is required for transformation by Ad2 (Graham et al., 1974; Williams et al., 1974; Gallimore et al., 1975; Harrison et al., 1977; Graham et al., 1978).

Two viral gene products, large and small T antigens, are produced during the early phase of SV40 infection and in SV40-transformed cells. These polypeptides have molecular weights of 94,000 and 17,000 daltons, respectively, as determined by SDS-polyacrylamide gel electrophoresis (Prives et al., 1977; Rundell et al., 1977). Similarly, two spliced mRNAs of total length 2500 and 2200 nucleotides are produced in these cells (Figure 8; Berk and Sharp, 1978). These two mRNAs have the same 5’ and 3’ termini but differ in their pattern of splicing. The 2500 nucleotide mRNA has a 630 nucleotide co-transcript spliced to a 1900 nucleotide co-transcript, while the 2200 nucleotide mRNA has a 330 nucleotide co-transcript joined to a similar or identical 1900 nucleotide co-transcript (Figure 8).

Studies of the gene products produced by SV40 deletion mutants (Crawford et al., 1978), analysis of the large and small T antigen polypeptide sequences (Paucha et al., 1978), and analysis of the sequence of SV40 DNA (Subramanian, Reddy and Weissman, 1977; Thimmappaya and Weissman, 1977) demonstrate the following scheme for the translation of these mRNAs. The 17,000 dalton polypeptide is translated from the 2500 nucleotide mRNA by initiation near the 5’ end of the 630 nucleotide co-transcript and termination at known termination codons near the 3’ end of this co-transcript. The 94,000 dalton polypeptide is synthesized from the 2200 nucleotide mRNA by initiation at the same AUG as the small T antigen mRNA, but in this case, translation can proceed through to the end of the 1900 nucleotide co-transcript because the termination codons near the splice point in the 2500 nucleotide mRNA are not present in the 2200 nucleotide mRNA. The two early SV40 gene products therefore share the same N terminal amino acid sequence, but have different C terminal sequences.

The 860 and 1035 nucleotide mRNAs encoded in Ad2 region 1 have a splicing pattern which is similar to the SV40 early mRNAs in that these two Ad2 mRNAs share the same 5’ sequence. The longer of the two is composed of a 5’ colinear transcript 660 nucleotides long, which is spliced to a co-transcript which maps immediately adjacent to it on the Ad2 genome. The 5’ colinear transcript of the shorter of these mRNAs is again spliced to this same sequence. The similarity in the structure of these Ad2 mRNAs and the early SV40 mRNAs extends to the level of DNA sequence as follows. In the SV40 sequence, translation termination codons exist in all three reading frames just to the 5’ side of the splice in the longer mRNA (Thimmappaya and Weissman, 1977). Similarly, in the Ad2 sequence, translation termination codons occur in all three reading frames just to the 5’ side of the splice point in the 1035 nucleotide mRNA (J. Maat and H. van Ormondt, personal communication), and these termination codons are removed from the 860 nucleotide mRNA due to the splicing pattern.

The similarity in structure between the region 1 Ad2 mRNAs and the early SV40 mRNAs suggests that these Ad2 mRNAs may be translated in a manner analogous to the SV40 mRNAs. This argument therefore postulates the existence of two early Ad2 proteins encoded in region 1 which have an identical N terminal sequence. Furthermore, the similar splicing pattern suggests that these two unrelated viruses have evolved a similar regulatory mechanism which operates at the level of splicing to control the expression of early functions involved in the process of transformation by these viruses. It remains to be determined whether this similarity in the organization of genetic information in the oncogenic regions of SV40 and Ad2 is reflected in a similarity in the activities of the proteins encoded by these regions.

Figure 8. Similarity in the Splicing Pattern and Relative Position of Translation Initiation and Termination Codons between the Ad2 Early Region 1 Cytoplasmic RNAs and the Early mRNAs of Simian Virus 40

The approximate positions of translation termination codons in the early SV40 mRNAs are represented by the symbol T. Presumed termination codons in the early Ad2 mRNAs are represented by the symbol (T). The location of these presumed termination codons follows from the DNA sequence studies of H. van Ormondt and J. Maat (personal communication).

Experimental Procedures

Preparation of High Specific Activity 32P-Labeled Restriction Fragments of Ad2 DNA

32P-labeled Ad2 DNA with a specific activity of 1-3 x Iv cpm/µg was prepared and digested with restriction enzymes as described (Berk and Sharp, 1977b). Restriction fragments were resolved by electrophoresis as described (Berk and Sharp, 1977b), and recovered from gel slices by electroelution.

Enzymes

Eco RI, Bam I, Sma I, Hind III and Bgl I were purified in our laboratory. Bgl II was purchased from New England BioLabs. Endonuclease S1 was purified by the method of Vogt (1973). Later in the course of this work, S1 purchased from Boehringer-Mannheim Biochemicals was used. Exonuclease VII of E. coli was a gift from Stephen Goff.

Isolation of Cytoplasmic RNA

RNA was isolated from cells treated with 20 µg/ml cytosine arabinoside 8 hr post-infection with 20 pfu per cell. Cytoplasmic RNA was isolated as described (Berk and Sharp, 1977b).

80% Formamide Hybridization

Hybridization according to the method of Casey and Davidson (1977) was performed as described (Berk and Sharp, 1977b). In general, hybridizations were in 100 µl with 5-10 mg/ml of cytoplasmic RNA and 3 µg/ml of Ad2 DNA, or the equivalent molarity of a restriction fragment. Of this, 66 µl were digested with S1 endonuclease and 33 µl were digested with exonuclease VII.

Endonuclease S1 and Exonuclease VII Digestion of RNA-DNA Hybrids

S1 digestion of RNA-DNA hybrids was as described (Berk and Sharp, 1977b). Briefly, the products of the 80% formamide hybridization were rapidly diluted into 10 vol of 0°C 0.25 M NaCl, 1 mM ZnSO4, 30 mM NaOAc, 5% glycerol, 20 µg/ml denatured salmon sperm DNA (Vogt, 1973), and digested with 5 units (as defined by Vogt, 1973) per ml of endonuclease S1 for 30 min at 45°C. Digestion was terminated by adding 10 µg of yeast RNA and 2 vol of ethanol. Ethanol precipitates were collected by centrifugation and subjected to gel electrophoresis.

Exonuclease VII digestion of RNA-DNA hybrids was as described (Berk and Sharp, 1978). The products of the 80% formamide hybridization were diluted into 10 vol of 0.03 M KCl, 0.01 M Tris (pH 7.4), 0.01 M Na2 EDTA, chilled to 0°C. An amount of E. coli exonuclease VII, purified by the method of Chase and Richardson (1974) and sufficient to digest 2 µg of thermally denatured linear SV40 DNA to 95% acid solubility in 60 min at 45°C in this buffer, was added per ml of exonuclease VII reaction mix; the solution was then incubated at 45°C for 60 min. The reaction was terminated by the addition of NaCl to 0.1 M and the addition of 2 vol of ethanol. The ethanol precipitate was collected by centrifugation and subjected to alkaline agarose gel electrophoresis (McDonell, Simon and Studier, 1977) or electrophoresis on 8 M urea gradient acrylamide gels.

Gel Electrophoresis

Alkaline agarose gels were run as described (McDonell et al., 1977). Neutral agarose gels and 8 M urea gradient acrylamide gels were as described (Berk and Sharp, 1978). Prior to electrophoresis on acrylamide gels, ethanol precipitates were dissolved in 0.1 M NaOH, 0.01 M EDTA, and incubated at 68°C for 30 min. Samples were chilled and diluted with 1 vol of water and tracking dyes before layering.

Electron Microscopy

The isolated Bgl II-C fragment of Ad2 was denatured in 95% formamide at 37°C for 10 min, diluted to 38% formamide and spread on 10% formamide at 23°C (Wu et al., 1976). Grids were stained with uranyl acetate and rotary-shadowed with Pt-Pd. Molecules were photographed on 35 mm film, printed and measured with a Neumonics digitizer.

Acknowledgments

A. J. B. thanks the Helen Hay Whitney Foundation for a postdoctoral fellowship. P. A. S. is the recipient of an American Cancer Society career development award. This work was partly supported by a grant from the American Cancer Society and a core grant. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received April 3, 1978; revised May 3, 1978

References

Bachenheimer. S. and Darnell, USA 72, 4445-4449.

J. E. (1975).

Berget, S. M., Moore, C. and Acad. Sci. USA74, 3171-3175.

Sharp,

Proc.

Nat. Acad.

P. A. (1977a).

Proc.

Sci. Nat.

Berget, S. M., Berk, A. J., Harrison, T. and Sharp, P. A. (1977b). Cold Spring Harbor Symp. Quant. Biol. 42, in press. Berk.

A. J. and Sharp,

P. A. (1977a).

Berk,

A. J. and Sharp,

P. A. (1977b).

Berk, A. J. and Sharp, 1274-l 278. Breathnach, 270, 314-319.

P. A. (1978).

R.. Mandel,

Brock. C. and Tonegawa. 5652-5656.

Cell 72, 45-55. Cell 12, 721-732. Proc.

Nat. Acad.

J. L. and

Chambon,

S. (1977).

Proc.

Sci. USA, 75,

P. (1977).

Nat. Acad.

Nature

Sci. USA 74,


Buttner, W., Verez-Molnar, 107, 93-l 14. Casey,

Z. and Green,

J. and Davidson,

N. (1977).

Chase, J. W. and Richardson, 4553-4561. Chow, L. T.. Roberts, Cell 17, 619-836.

M. (1976).

Nucl.

Acids

Biol.

Res. 4, 1539-1552.

C. C. (1974).

J. M., Lewis,

J. Mol.

J. Biol.

J. B. and Broker,

Chem.

249,

R. E.. Broker,

Craig, 1202-l

E. A., Zimmer. 213.

S. and

Craig,

E. A. and Raskas,

Raskas,

Roberts,

H. J. (1975).

H. J. (1976).

R. J.

J. Virol.

75,

Cell 8, 205-213.

Crawford, L. V., Cole, C. N., Smith, A. E., Pauch, E., Tegtmeyer, P.. Rundell, K. and Berg, P. (1978). Proc. Nat. Acad. Sci. USA, 75, 117-121. Ensinger,

M. J. and Ginsberg,

H. S. (1972).

J. Virol.

R. M., Fraser, N., Ziff, E.. Weber, J. E. (1977). Cell 72, 733-739.

Flint,

S. J. (1977).

Flint,

S. J. and Sharp,

J. Virol.

Gilbert,

W. (1978).

J. Mol.

J., Williams,

Gallimore, P. H., Sharp, Biol. 89, 49-72.

P. A. and

Nature271,

Biol. 106, 749-771.

J. and Sharp, Sambrook,

S., Weber,

P. A. (1976).

J. (1975).

J. Mol.

501.

Ginsberg, H. S., Ensinger, M. J.. Kauffman, and Londholm, U. (1974). Cold Spring Harbor 39, 419-426. Goldberg, 621.

M. and

23, 44-52.

P. A. (1976).

Flint, S. J., Sambrook, Virology 72, 456-470.

IO, 328-339.

J.. Wilson,

J. and Darnell,

R. S., Mayer, Symp. Quant.

J. E.. Jr. (1977).

A. J. Biol.

Cell 10,617-

Graham, F. L., Abrahams, P. J., Mulder, C., Heijnecker, H. L.. Warnaar, S. O., de Vries, F. A. J.. Fiers, W. and van der Eb, A. J. (1974). Cold Spring Harbor Symp. Ouant. Biol. 39, 637-650. Graham, 10-21.

F. L., Harrison,

Grodzicker, T., Cold Spring Harrison, 329.

J. (1978).

F. and Williams,

A. J. and Flavell,

Jones,

N. and Shenk,

Kelly,

T. J. and Lewis,

J. (1977).

R. A. (1977).

T. (1978).

D. F. (1977).

86,

J. (1974).

Virology

77, 319-

Cell 12, 1097-1108.

Cell 13, 181-188.

A. M. (1973).

J. Virol.

12, 643-652.

Kitchingman, G. R., Lai, S.-Pu and Westphal, Nat. Acad. Sci. USA 74, 4392-4395. Klessig,

Virology,

Williams, J., Sharp, P. A. and Sambrook, Harbor Symp. Quant. Biol. 39, 439-446.

T., Graham,

Jeffreys,

T. and Williams,

H. (1977).

Proc.

Cell 12, 9-21.

Lewis, J. B., Atkins, J. F., Baum, P. R., Solem, F. and Anderson, C. W. (1976). Cell 7, 141-151. McDonell, M. W., Simon, Biol. 110, 119-146.

M. N. and Studier,

F. W. (1977).

Paucha, E., Mellor, A., Harvey, R. and Smith, Nat. Acad. Sci. USA 75, in press. Pettersson, U., Tibbetts, 107, 479-501.

C. and Philipson,

R., Gesteland,

L. (1976).

R.

J. Mol.

A. E. (1978).

Proc.

J. Mol. Biol.

Prives, C., Gilboa, E., Revel, M. and Winocour, Nat. Acad. Sci. USA 74, 457-461.

E. (1977).

Rundell, K., Collins, J. K., Tegtmeyer, P., Ozer, and Nathans, D. (1977). J. Virol. 21, 636-646.

H. L., Lai, C. J.

Sharp, Harbor

P. A., Gallimore, P. H. and Flint, Symp. Quant. Biol. 39, 457-474.

Subramanian, K. N., Reddy, Cell 10, 497-507. Sugawara, 346.

K., Gilead,

S. J. (1974).

V. B. and Weissman,

Z. and Green,

M. (1977).

Proc.

Cold Spring S. M. (1977).

J. Virol.

S. M. (1977).

van der Vliet, P. C., Levine, A. J., Ensinger, H. S. (1975). J. Virol. 15, 348-354.

T. R. (1977a).

T. R. and

B. and Weissman,

21, 338-

Cell 11, 837-843.

Tilghman, S. M., Tiemeier, D. C., Seidman, J. G., Peterlin, B. M., Sullivan, M., Maizel, J. V. and Leder, P. (1978). Proc. Nat. Acad. Sci. USA 75, 725-729.

Vogt,

Chow, L. T., Gelinas, (1977b). Cell 12, 1-8.

Evans, Darnell,

Thimmappaya,

V. M. (1973).

Eur. J. Biochem.

M. J. and Ginsberg,

33, 192-200.

Williams, J. F., Young, H. and Austin, Harbor Symp. Quant. Biol. 39, 427-437.

P. (1974).

Wilson, M. C., Sawicki, S. G., Salditt-Georgieff, E. (1978). J. Virol. 25, 97-103. Wu, M., Roberts, 777.

R. J. and Davidson,

N. (1977).

Cold

Spring

M. and Darnell, J. Virol.

J.

21, 766-

Structure of the adenovirus 2 early mRNAs.
