VIROLOGY 186, 280-285 (1992)

The Sequence of the Genome of Adenovirus Type 5 and Its Comparison with the Genome of Adenovirus Type 2

JADWIGA CHROBOCZEK, FRANK BIEBER, AND BERNARD JACROT

CNRS, URA 1333, Institut Max Von Laue-Paul Langevin, and European Molecular Biology Laboratory, Grenoble Outstation, 156X, 38042 Grenoble Cedex, France

Received June 19, 1991; accepted September 25, 1991

We report the sequence of 7558 nucleotides of the adenovirus type 5 genome. With this sequence and previously published data, the complete sequence of this genome is now available and can be compared with the already known sequence of the adenovirus type 2 genome. These two serotypes belong to the same subgroup, and sequence comparison shows 94.7% homology between the two genomes. The differences are not randomly distributed. Transitions between C and T and between A and G account in total for 58.3% of the differences, and for 68.6% in the genome devoid of the fiber and hexon genes (instead of the 33% expected for an equal probability of changes). In the fiber gene the transitions account for 47% of the differences. The detailed analysis of the nucleotide substitutions between the two genomes suggests that the Ad2 genome could derive from that of Ad5, with the exception of the fiber gene, which is likely to be present in the Ad2 genome as a result of genetic recombination. The homology between the amino acid sequences of the structural proteins varies from 100% (proteins pVII and IX) to only 69.2% for the fiber. © 1992 Academic Press, Inc.

INTRODUCTION

The human adenoviruses constitute a large family with 42 serotypes identified so far. These serotypes are usually classified in seven subgroups, essentially on the basis of DNA homology (Green et al., 1979) determined by methods such as restriction mapping or hybridization. The homology is above 50% within a subgroup but less than 25% between subgroups. The complete sequence of the genome of serotype 2 has been determined (Roberts et al., 1984), and partial sequences of biologically important parts of the genome of several other serotypes can be found in the data banks. Adenovirus type 5 (Ad5) is a serotype belonging to the same subgroup as adenovirus type 2 (Ad2). The parts of the Ad5 genome that were previously sequenced include the left 32% (see Van Ormondt and Galibert, 1984), the region of the hexon gene (Kinloch et al., 1984), the early E3 transcription unit (Cladaras and Wold, 1985), the regions of the fiber gene (Chroboczek and Jacrot, 1987) and of the penton base gene (Neumann et al., 1988), the region between coordinates 60 and 72% (Kruijer et al., 1980, 1981), and finally part of the right-hand extremity (see Van Ormondt and Galibert, 1984). We have completed the sequence of the 35,935 bp which constitute the genome of Ad5, and a comparison can be made with the 35,937 bp of Ad2.

0042-6822/92 $3.00
Copyright © 1992 by Academic Press, Inc. All rights of reproduction in any form reserved.

MATERIAL AND METHODS

Nucleotide sequencing

The DNA used for sequencing was isolated from bacteria carrying either the plasmid Bluescribe M13+, into which we inserted the appropriate Ad5 DNA fragments (map coordinates 27.3-46.6 and 45.1-52.6, respectively), or the plasmid pFG23 (map coordinates 60-100), a kind gift of Frank Graham. Nucleotide sequencing was carried out according to the method of Sanger (1977) by direct plasmid sequencing; sometimes subcloning into M13tg130 and M13tg131 (Amersham) was used. The enzyme used for sequencing was Sequenase from USB. The appropriate oligonucleotide primers were synthesized at EMBL (Heidelberg) and at CENG (Grenoble). The complete sequence was determined on both strands.

Sequence comparison

The DNA and protein sequences were analyzed and compared with the help of the program library of the University of Wisconsin Genetics Computer Group.

RESULTS AND DISCUSSION

Nucleotide sequence

We have sequenced Ad5 DNA between coordinates 32 and 38% (2158 bp), 45 and 52% (2470 bp), 72 and 76% (1512 bp), and 92 and 96% (1264 bp). We have also sequenced 154 nucleotides which were missing in the sequence on the right-hand side. Altogether we have sequenced some 7564 nucleotides to obtain the complete sequence of the Ad5 genome. More precisely, we have sequenced the following fragments:

11,565 to 13,722, between two HindIII sites.
16,286 to 18,919, between an SfiI site and a SmaI site. The sequence determination was stopped at 18,765, as the previously published sequence started at 18,618.
25,819 to 27,331, between a site defined by a primer synthesized using the published sequence and the EcoRI site.
33,096 to the right-hand DNA end, starting from a SmaI site. The sequence was determined between 33,096 and 34,359 and between 34,700 and 34,859 to cover the missing gaps.

The complete sequence is available from GenBank (Accession Number M73260). Ad5 DNA comprises 35,935 nucleotides, two fewer than Ad2 DNA. The two sequences can be easily aligned without ambiguity except in the E3 region. In this region an alignment slightly different from the one proposed by Cladaras and Wold (1985) can be used. In all, not including the gaps, there are 1,688 mismatches between the two genomes. The best alignment is obtained with 38 gaps (with lengths ranging between 1 and 36 nucleotides and containing in total 213 nucleotides) in the Ad5 sequence and 31 gaps (with lengths between 1 and 69 nucleotides and a total length of 211) in the Ad2 sequence. So, adding the mismatches and the gaps, there are altogether 2112 differences between the genomes of these two serotypes. This corresponds to 5.34% of the genome, or to a homology of 94.7%, which is lower than that (99%) estimated from DNA hybridization (Green et al., 1979). This discrepancy is not surprising, as hybridization followed by digestion with the S1 endonuclease is not sensitive to point mutations or one-nucleotide gaps, which constitute a large part of the differences between the two genomes. It may be worth noting that a previous estimate using the available restriction maps (Chroboczek and Jacrot, 1987), namely 96% homology, was not too far from the truth. This indicates that, in the absence of sequence information, this second method should be preferred to the first one to estimate DNA homology.

[FIG. 1: sequence panels for proteins 100K, IIIa, V, 33K, pVI, pVIII, and 52/55K; residue-level data not recoverable here.]

FIG. 1. Sequences of adenovirus type 5 structural proteins. The sequences of the previously unpublished Ad5 structural proteins are given in one-letter code. Amino acids which are different in Ad2 are given below the corresponding Ad5 amino acid. A star indicates an amino acid missing in one or the other serotype. Protein pVII, which is identical in the two serotypes, is not shown.

Comparison of the protein sequences

The newly determined sequences provide the complete amino acid sequences of several late gene products, so far unknown or only partially known. This applies to the following proteins: IIIa, V, pVI, pVII, 33K, 100K, and the 52/55K protein. These sequences are given in Fig. 1 together with the corresponding Ad2 products. The sequences of the structural proteins and of the proteins involved in the morphogenesis of the virion of both serotypes can be compared (Table 1a). It was shown previously (Kinloch et al., 1984; Chroboczek and Jacrot, 1987) that there are quite large differences for the two structural proteins exposed at the surface of the virion, namely, the hexon and the fiber. For all the other ones there are altogether 52 amino acids different out of 3,853 (1.35%) between the two serotypes.

TABLE 1a
DIFFERENCES BETWEEN THE GENES OF THE MAIN PROTEINS INVOLVED IN THE ARCHITECTURE OR IN THE MORPHOGENESIS OF ADENOVIRUS TYPES 2 AND 5

               Differences in        Differences in
               DNA sequences (%)     amino acid sequences (%)
Hexon          17                    13.7
Fiber          27                    30.8
Penton base    1.3                   1.4
IIIa           0.9                   0.17
IX             0.24                  0
pVI            1.2                   1.2
pVIII          3.5                   0.88
pVII           0.67                  0
V              2.2                   1.63
100K           3.13                  2.35
23K            0.65                  0.49
33K            3.22                  3.95
52/55K         1.4                   0.72

Note. The first nine proteins are structural proteins and the last four are involved in the morphogenesis of the virion. The proteins pVI, pVII, and pVIII are present in the virion in their mature form after a proteolytic cleavage made by the virus-coded 23K protease.

It is difficult to make a similar analysis with the early nonstructural proteins since not all early products are well identified. Table 1b shows the comparison of a few well-characterized early proteins. The main feature is that for the gene products of the families E1, E2, and E4 the differences between the two serotypes are of the order of 1%, as for the majority of late proteins. The differences are much larger for the E3 products, which have already been analyzed by Cladaras and Wold (1985). Although the E3 family may play a role in modulating the host response to the infection, as is well established for the 19K protein (Wold et al., 1985), the E3 family is nonessential, as pointed out by Cladaras and Wold (1985), since the virus can be grown even when the corresponding DNA is deleted.

In general, the proteins produced by Ad2 and Ad5, with the exception of the fiber, the hexon, and the E3 products, differ by about 1.5%. If this percentage is the one which exists between the two serotypes in the absence of selective immunological and functional pressure, the probability of differences should follow a Poisson distribution. This does not seem to be the case; the existence of two proteins (pVII and IX) with identical amino acid sequences in the two serotypes is highly improbable with such a distribution. Nor do the differences at the DNA level follow a Poisson distribution.

The differences in the 100K protein amino acid sequence are somewhat above average (see Table 1a). This protein attaches to the hexon and takes part in its folding and transport. The hexon protein shows large differences between these two serotypes (Kinloch et al., 1984). One may speculate that the differences in the 100K protein from Ad2 and Ad5 are necessary to accommodate the variations in the hexons. More generally, it is probable that, owing to their function, some proteins can tolerate more amino acid changes.

Comparison of the genomes

There are no significant differences in the frequency of pairs or triplets of nucleotides. The nucleotide composition of the two genomes is remarkably similar (Table 2); the number of G and C differs by only four nucleotides out of a total of nearly 20,000. This is a striking fact when one considers that 80% of the differences between the two genomes involve a G or a C. A rough calculation shows that, on the basis of random mutations, one expects a difference larger than that observed by at least one order of magnitude.

TABLE 1b
DIFFERENCES IN SOME OF THE GENE PRODUCTS OF EARLY GENES

                                  Differences in
                                  amino acid sequences (%)
E1B products
  21K                             1.7
  55K                             2.42
E2B products
  105K (polymerase)               0.29
  87K (terminal protein)          1.38
  IVa2                            1.11
E2A products
  72K (DNA binding protein)       1.7
E3A products
  gp 19K                          17.6
  10.5K                           3.5
E4 products
  11K                             0

TABLE 2
BASE COMPOSITION OF AD2 AND AD5 GENOMES

               Total genome                       Fiber gene
               Ad2              Ad5               Ad2             Ad5
A              8342 (23.2%)     8367 (23.3%)      568 (32.5%)     542 (31.0%)
C              10045 (28.0%)    10073 (28.0%)     433 (24.8%)     455 (26.1%)
G              9793 (27.3%)     9761 (27.2%)      311 (17.8%)     317 (18.2%)
T              7757 (21.6%)     7734 (21.5%)      437 (25.0%)     432 (24.7%)
G + C/total    0.55205          0.55197           0.425           0.442

Note. In the last two columns the base composition is given for the r-strand of the fiber gene.

That mutations between the two genomes are not random is confirmed by the analysis of the 1688 nucleotide changes shown in Table 3. The most frequent differences are transitions (mutations between purines or between pyrimidines) between C and T and between A and G, which account for 58.3% of the differences between the two genomes instead of the 33% expected if mutations between bases were random. Indeed, this high proportion of transitions corresponds to what is observed (59.2%) for the mutations in pseudogenes (Li et al., 1984), which are supposed to evolve without selective pressure. In this case the observed proportion of transitional mutations is 59% and the most frequent transitions are C to T and G to A. If one does not include the genes for the hexon and the fiber, transitions represent 68.6% of the differences.

A chemical mechanism which favors transitions at the expense of transversions in point mutations has been described by Topal and Fresco (1976). This mechanism is based on intermediate states with noncanonical base pairing. Another mechanism increases the frequency of transitions from C to T, namely, the conversion of methylated cytosine to thymine upon deamination (Coulondre et al., 1978; Razin and Riggs, 1980). This conversion also increases the frequency of transitions from G to A in one strand as a result of a C to T transition in the other strand.

In the absence of selective pressure, these mechanisms should account for the ratio of transitions to transversions. If there is a selective pressure, this pressure will favor the silent or conservative mutations. Silent mutations are largely associated with transitions on the third base of codons, and they do not permit escape from the immunological pressure. Most of the conservative mutations are due to transversions. For instance, the conservative substitution between aspartic and glutamic acid or between leucine and isoleucine can result only from transversions. So one expects, and one observes (Nei, 1987), that for genes subject to selective pressure the proportion of transitions is not as large as in genes which mutate without this pressure. However, for globin genes this proportion is still higher

TABLE 3
DIFFERENT TYPES OF BASE "MUTATIONS" BETWEEN AD2 AND AD5

Ad2    Ad5      I      II     III
T      C        284    65     54
T      A        114    40     45
T      G        68     20     24
C      T        257    70     59
C      A        117    36     32
C      G        55     11     11
A      T        111    45     44
A      C        120    22     56
A      G        212    40     54
G      T        61     18     14
G      C        58     25     13
G      A        231    49     52

Note. This table should be read as follows: the first line means that 284 T of Ad2 are replaced by C in Ad5; among them 65 are in the hexon gene and 54 are in the fiber gene. I, mutations in the complete genome; II, mutations in the hexon gene; III, mutations in the fiber gene.
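The transition percentages quoted in the text can be recomputed from the counts of Table 3. The short Python sketch below is only an illustration (the variable and function names are hypothetical, not part of the original analysis). It assumes that "the genome devoid of the fiber and hexon genes" corresponds to column I minus columns II and III. The 33% expectation follows from the fact that 4 of the 12 possible base substitutions are transitions.

```python
# Counts from Table 3, keyed by (Ad2 base, Ad5 base).
# Values: (I complete genome, II hexon gene, III fiber gene).
TABLE3 = {
    ("T", "C"): (284, 65, 54), ("T", "A"): (114, 40, 45), ("T", "G"): (68, 20, 24),
    ("C", "T"): (257, 70, 59), ("C", "A"): (117, 36, 32), ("C", "G"): (55, 11, 11),
    ("A", "T"): (111, 45, 44), ("A", "C"): (120, 22, 56), ("A", "G"): (212, 40, 54),
    ("G", "T"): (61, 18, 14), ("G", "C"): (58, 25, 13), ("G", "A"): (231, 49, 52),
}

# Transitions: purine <-> purine or pyrimidine <-> pyrimidine.
TRANSITIONS = {("T", "C"), ("C", "T"), ("A", "G"), ("G", "A")}


def transition_percent(counts):
    """Percentage of substitutions in `counts` that are transitions."""
    total = sum(counts.values())
    transitions = sum(n for pair, n in counts.items() if pair in TRANSITIONS)
    return 100.0 * transitions / total


genome = {pair: v[0] for pair, v in TABLE3.items()}
hexon = {pair: v[1] for pair, v in TABLE3.items()}
fiber = {pair: v[2] for pair, v in TABLE3.items()}
# Assumption: genome without hexon and fiber genes = I - II - III.
rest = {pair: v[0] - v[1] - v[2] for pair, v in TABLE3.items()}

for name, counts in [("complete genome", genome), ("hexon gene", hexon),
                     ("fiber gene", fiber), ("genome without hexon and fiber", rest)]:
    print(f"{name}: {sum(counts.values())} substitutions, "
          f"{transition_percent(counts):.1f}% transitions")
```

Run as written, the column I counts sum to the 1688 mismatches reported above, and the four percentages come out at 58.3, 50.8, 47.8, and 68.6%, in agreement with the values given in the text.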

than 33% expected for random mutations. The fiber and the hexon carry most, if not all, of the antigenic determinants of the virion. There must exist a strong selective pressure on the genes of these two proteins. Indeed, the nucleotide substitutions in these genes are rather different from the ones observed in the rest of the adenovirus genome; the transitions account for only 50.8% of the substitutions in the hexon gene and 47.8% in the fiber gene. This last figure is close to the 45.5% observed for globin genes (Gojobori et al., 1982). One may wonder if one of the two serotypes resulted from mutations of the other one or if both derive from a common ancestor. To try to answer this question, a possible approach is to consider the relative rate of transitions from C or G toward T or A compared to the opposite transitions (T or A toward C or G). Li et al. (1984) have shown that, in the absence of selective pressure, the first type is more frequent than the second. To make such a comparison one must adjust the percentages of mutations to take into account the

TABLE 4
PERCENTAGE OF THE 12 TYPES OF NUCLEOTIDE SUBSTITUTIONS BETWEEN AD2 AND AD5 GENOMES

                  Ad5
Ad2      A        T        C        G
A        -        6.94     7.51     13.26
T        7.67     -        19.11    4.57
C        6.08     13.35    -        2.86
G        12.31    3.25     3.09     -

Note. These percentages have been corrected to take into account the base composition of the genome (Gojobori et al., 1982; see text). This table indicates, for instance, that 19.11% of the mutations going from the Ad2 to the Ad5 genome are a replacement of a T by a C.
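The correction for base composition can be illustrated with a short Python sketch. It rests on an assumption made here, not on a statement of the exact procedure of Gojobori et al. (1982): each raw count of Table 3 (column I) is divided by the number of occurrences of the original base in the Ad2 genome (Table 2), and the twelve resulting rates are rescaled to sum to 100%. The names used are hypothetical. Under this assumption the computed values agree closely with those of Table 4.

```python
# Base counts of the Ad2 genome (Table 2, total genome).
AD2_BASE_COUNTS = {"A": 8342, "C": 10045, "G": 9793, "T": 7757}

# Raw substitution counts for the complete genome (Table 3, column I),
# keyed by (Ad2 base, Ad5 base).
GENOME_COUNTS = {
    ("T", "C"): 284, ("T", "A"): 114, ("T", "G"): 68,
    ("C", "T"): 257, ("C", "A"): 117, ("C", "G"): 55,
    ("A", "T"): 111, ("A", "C"): 120, ("A", "G"): 212,
    ("G", "T"): 61, ("G", "C"): 58, ("G", "A"): 231,
}


def normalized_percentages(counts, base_counts):
    """Rescale per-base substitution rates so that they sum to 100%.

    Assumed normalization: rate(i -> j) = n(i -> j) / N(i), where N(i) is
    the number of bases i in the genome taken as the starting point.
    """
    rates = {pair: n / base_counts[pair[0]] for pair, n in counts.items()}
    scale = 100.0 / sum(rates.values())
    return {pair: rate * scale for pair, rate in rates.items()}


table4 = normalized_percentages(GENOME_COUNTS, AD2_BASE_COUNTS)
for (ad2, ad5), pct in sorted(table4.items(), key=lambda item: -item[1]):
    print(f"Ad2 {ad2} -> Ad5 {ad5}: {pct:.2f}%")
```

The largest entry is the replacement of a T in Ad2 by a C in Ad5 (about 19.1%), followed by the C to T, A to G, and G to A replacements, as in Table 4.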

TABLE 5
PERCENTAGE OF THE 12 TYPES OF NUCLEOTIDE SUBSTITUTIONS BETWEEN AD2 AND AD5 GENOMES EXCLUDING THE HEXON AND FIBER GENES

                  Ad5
Ad2      A        T        C        G
A        -        3.04     5.80     16.3
T        4.25     -        24.20    3.52
C        5.47     14.35    -        3.69
G        14.11    3.15     2.17     -

fact that the four bases are not in equal number in the genome. This was done according to Gojobori et al. (1982). In Table 4, these normalized percentages are given for the total genome. The most frequent mutation is the one in which a C in Ad5 is replaced by a T in Ad2. Considering what was said before, this suggests that the evolution has been from Ad5 to Ad2. The point is even stronger if one considers the mutations in the genome while excluding the hexon and fiber genes (Table 5). Then, not only are the C to T transitions (going from Ad5 to Ad2) more numerous, but there are also more G to A transitions going from Ad5 to Ad2, as expected if the evolution has been in that direction. The percentages of the different types of nucleotide substitutions are very similar to those found by Wu and Maeda (1987) for a fragment of primate DNA which has no known function and is mildly constrained by selection.

However, one must be careful before drawing firm conclusions. The mechanism which favors C to T transitions is a consequence of DNA methylation. Wienhues and Doerfler (1985) found no evidence for DNA methylation during productive infections in cell culture. However, the situation might be different in the normal life cycle of the virus.

A similar comparison made for the fiber gene of the two serotypes shows very different features (Table 6). As already mentioned, the transitions are not so frequent as in the rest of the genome. In particular, the C (from Ad5) to T (from Ad2) transition is not the most frequent substitution. This certainly reflects the effects of selective pressure. It also suggests that this gene has evolved independently from the rest of the genome and that the final genome of Ad2 results from an evolution of the Ad5 genome with a recombination with a fiber and possibly a hexon gene from a yet unknown origin.

TABLE 6
PERCENTAGE OF THE 12 TYPES OF NUCLEOTIDE SUBSTITUTIONS BETWEEN AD2 AND AD5 GENOMES IN THE FIBER GENE

                  Ad5
Ad2      A        T        C        G
A        -        7.79     9.91     9.56
T        9.99     -        12.01    5.33
C        6.75     12.44    -        2.32
G        15.74    4.23     3.93     -

REFERENCES

CHROBOCZEK, J., and JACROT, B. (1987). The sequence of adenovirus fiber: Similarities and differences between serotypes 2 and 5. Virology 161, 549-554.
CLADARAS, C., and WOLD, W. S. (1985). DNA sequence of the early E3 transcription unit of adenovirus 5. Virology 140, 28-43.
COULONDRE, C., MILLER, J. H., FARABAUGH, P. J., and GILBERT, W. (1978). Molecular basis of base substitution hotspots in Escherichia coli. Nature 274, 775-780.
GOJOBORI, T., LI, W.-H., and GRAUR, D. (1982). Patterns of nucleotide substitutions in pseudogenes and functional genes. J. Mol. Evol. 18, 360-369.
GREEN, M., MACKEY, J. K., WOLD, W. S. M., and RIGDEN, P. (1979). Thirty-one adenovirus serotypes (Ad1-Ad31) form five groups (A-E) based upon DNA genome homologies. Virology 93, 481-492.
KINLOCH, R., MACKAY, N., and MAUTNER, V. (1984). Adenovirus hexon. Sequence comparison of subgroup C serotypes 2 and 5. J. Biol. Chem. 259, 6431-6436.
KRUIJER, W., VAN SCHAIK, F. M. A., and SUSSENBACH, J. S. (1980). Nucleotide sequence of a region of adenovirus 5 DNA encoding a hitherto unidentified gene. Nucleic Acids Res. 9, 4439-4457.
KRUIJER, W., VAN SCHAIK, F. M. A., and SUSSENBACH, J. S. (1981). Structure and organisation of the gene coding for the DNA binding protein of adenovirus type 5. Nucleic Acids Res. 10, 4493-4500.
LI, W. H., WU, C. I., and LUO, C. C. (1984). Nonrandomness of point mutation as reflected in nucleotide substitutions in pseudogenes and its evolutionary implications. J. Mol. Evol. 21, 58-71.
NEI, M. (1987). "Molecular evolutionary genetics." Columbia Univ. Press, New York.
NEUMANN, R., CHROBOCZEK, J., and JACROT, B. (1988). Determination of the nucleotide sequence for the penton base gene of human adenovirus type 5. Gene 69, 153-157.
RAZIN, A., and RIGGS, A. D. (1980). DNA methylation and gene function. Science 210, 604-610.
ROBERTS, R. J., O'NEILL, K. E., and YEN, C. T. (1984). J. Biol. Chem. 259, 13,968-13,985.
TOPAL, M. D., and FRESCO, J. R. (1976). Complementary base pairing and the origin of substitution mutations. Nature 263, 285-289.
VAN ORMONDT, H., and GALIBERT, F. (1984). Nucleotide sequences of adenovirus DNA. In "The molecular biology of adenoviruses" (W. Doerfler, Ed.), Springer-Verlag, Berlin/New York.
WIENHUES, U., and DOERFLER, W. (1985). Lack of evidence for methylation of parental and newly synthesized adenovirus type 2 DNA in productive infections. J. Virol. 56, 320-324.
WOLD, W. S. M., CLADARAS, C., DEUTSCHER, S. L., and KAPOOR, Q. S. (1985). The 19-kDa glycoprotein coded by region E3 of adenovirus. J. Biol. Chem. 260, 2424-2431.
WU, C.-I., and MAEDA, N. (1987). Inequality in mutation rates of the two strands of DNA. Nature 327, 169-170.
