Int. J. Exp. Path. (1990) 71, 905-918

Current Status Review

Molecular biology of cytomegalovirus

V.C. Emery and P.D. Griffiths
Department of Virology, Royal Free Hospital School of Medicine, London, UK

Cytomegalovirus (CMV) is ubiquitous. By adulthood 75% of people in developed countries and 100% in developing countries have become infected. Infection is spread readily within families, probably through salivary contact, and can also be acquired sexually. The virus is well adapted to its host, with the result that most infected individuals remain entirely asymptomatic. However, if the host cell-mediated immune response is suboptimal, due either to immaturity or disease, then CMV can cause serious morbidity and mortality. Thus, CMV is a major pathogen for the foetus, for allograft recipients, and for those with AIDS. The aim of this review is to summarize current knowledge of the molecular biology of CMV. It should be remembered that CMV has the largest genome of all the viruses known to infect man, and so the review will aim to give a summary for the general reader rather than provide minute details for the specialist.

Structure of the CMV genome

The genome of human CMV consists of a linear molecule of double-stranded DNA containing 229 354 base pairs (Chee et al. 1990). The genome has a high G + C content and is divided into two regions (see Fig. 1): the unique short region (Us), which is 35 418 base pairs in length and bounded by two 2524 base pair repeat sequences designated IRS and TRS (Weston & Barrell 1986), and the

unique long region (UL), which is 169 972 base pairs in length and is flanked by two repeats of 11 247 base pairs (known as IRL and TRL). Both the Us and UL can be orientated in either direction, thus giving rise to four isomeric forms of the virion DNA. In CMV-infected cells these isomers are present in equimolar amounts.

Restriction maps of the cell culture adapted strains of CMV, viz. AD169, Towne and Davis, have been available for some time (Greenaway et al. 1982). The Hind III and EcoRI restriction map of strain AD169 is shown in Fig. 1. In this review the identity of genes and proteins will be referred to by their location within specific Hind III or EcoRI restriction fragments. Restriction fragment length polymorphism studies have demonstrated that each clinical isolate of CMV possesses restriction patterns that are distinct from those of other isolates and from the laboratory adapted strains (Chandler & McDougall 1986; Grillner & Blomberg 1984; Colimon et al. 1985). Most restriction site heterogeneity has been located in the UL and Us, but variation at the Us-UL junction has also been noted (Weststrate et al. 1983). These observations have been exploited to determine the clinical consequences of reactivation of latent virus and reinfection with new CMV strains in bone marrow and renal transplant patients (Winston et al. 1985; Grundy et al. 1988). In renal transplant recipients the major source of CMV viraemia

Correspondence: P.D. Griffiths, Department of Virology, Royal Free Hospital School of Medicine, London NW3 2QG, UK.


and disseminated infection is via the donor kidney. In addition to primary CMV infection, reinfection from this source leads to more overt disease than does reactivation of latent virus present in the recipient (Grundy et al. 1988). In contrast, CMV disease largely results from reactivation of latent virus in patients undergoing bone marrow transplant (Winston et al. 1985). It has been reported that a seropositive donor of T-cell depleted bone marrow can provide significant protection against CMV disease in the recipient (Grob et al. 1987).

Gene expression

After infection the genes of CMV are expressed in three distinct phases, designated immediate early (IE), early (E) and late (L), or α, β and γ respectively (McDonough & Spector 1983; Wathen & Stinski 1982).
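The three kinetic classes can be written down as a small dependency table. The sketch below is only an illustration: the event labels are hypothetical, and the prerequisites are a simplified reading of the review's later statements that early expression depends on IE gene products and that late gene synthesis follows early gene synthesis and virus replication.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KineticClass:
    name: str            # designation used in the review
    greek: str           # alternative Greek designation
    requires: frozenset  # simplified prerequisites (hypothetical labels)


# Assumed, simplified prerequisites; not a quantitative model of infection.
CLASSES = {
    "IE": KineticClass("immediate early", "alpha", frozenset()),
    "E": KineticClass("early", "beta", frozenset({"IE products"})),
    "L": KineticClass(
        "late",
        "gamma",
        frozenset({"IE products", "E products", "DNA replication"}),
    ),
}


def expressible(events):
    """Return the kinetic classes whose prerequisites all appear in `events`."""
    done = set(events)
    return [key for key, kc in CLASSES.items() if kc.requires <= done]


if __name__ == "__main__":
    print(expressible([]))
    print(expressible(["IE products"]))
    print(expressible(["IE products", "E products", "DNA replication"]))
```

The table deliberately ignores the exceptions discussed later in the review, such as early genes that remain transcriptionally active late in infection.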

Immediate early gene expression

The CMV IE transcripts originate from distinct regions of the genome, with the most active region located within the Hind III E fragment of the AD169 genome. Such transcriptional activity is distinct from that observed in HSV-infected cells, where transcripts originate in the repeat sequences (Clements et al. 1977; Easton & Clements 1980). The IE region contains four major transcriptional units, IE 1 to IE 4 (Jahn et al. 1984; Stinski et al. 1983). Most information regarding the immediate early events in viral infection has been obtained by analysis of the IE 1 and IE 2 regions.

IE 1 encodes the most abundant species of IE RNA, a spliced transcript 1.95 kb in length (Stinski et al. 1983; Wilkinson et al. 1984; Jahn et al. 1984). The mRNA encodes a phosphorylated nuclear protein of Mr 75 000 (Gibson 1981; Akrigg et al. 1985; Michelson-Fiske et al. 1977) that represents the major IE protein found in infected cells 1 h post-infection (Stenberg et al. 1984). The upstream region of IE 1 is complex. It contains an enhancer region between nucleotides -118 and


-524 that can activate transcription in a broad range of cells (Boshart et al. 1985; Foecking & Hofstetter 1986) and can stimulate expression from the IE 1 promoter. This region also contains four types of repeat sequences, comprising 16, 18, 19 and 21 base pair repeats, which modulate transcription (Boshart et al. 1985; Ghazal et al. 1987; Stinski & Roehr 1985; Thomsen et al. 1984). The 18 and 19 base pair repeats are the most active in modulating IE gene transcription (Stinski & Roehr 1985), and the 19 base pair repeats have been shown to be identical to sequences important in responses to cyclic AMP (reviewed by Roesler et al. 1988). Activation of lymphocytes may thus stimulate expression of the IE genes via trans-acting factors (Hunninghacke et al. 1989) such as ATF (Activating Transcription Factor) that mediate cAMP effects (Hai et al. 1988; Horikoshi et al. 1988; Montminy et al. 1986; Montminy & Bilezikjian 1987). The upstream region has multiple binding sites for other nuclear transcription factors such as nuclear factor 1 and NFκB (Ghazal et al. 1987, 1988a, b; Henninghausen & Fleckenstein 1986). The end result of this complex interplay between transcription factors is to stimulate transcription and hence expression of the major immediate early genes.

The intact IE product down regulates its own promoter, i.e. exerts negative feedback inhibition, somewhat analogous to the situation with HSV IE genes (Deluca et al. 1984; Preston 1979, 1981). This mechanism accounts for the decreased transcription of the IE unit at late times of infection (Stenberg & Stinski 1985).

The transcription pattern from the IE 2 region is complex and a variety of multiply spliced mRNA species are produced (Akrigg et al. 1985; Stenberg et al. 1985). In in vitro translation systems the predominant protein encoded by the IE 2 region has an Mr of 56 000, but other products with Mrs from 16 500 to 42 000 have also been detected (Stinski et al. 1983).
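The observation that enhancer repeats match cyclic AMP response elements is the kind of finding that can be reproduced with a simple motif scan. The following is a minimal sketch under stated assumptions: the core motif TGACGTCA is the commonly cited cAMP response element consensus and is not quoted in this review, the test sequence is invented for illustration and is not CMV sequence, and `find_motif` is a hypothetical helper.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_motif(seq, motif):
    """Return (position, strand) for each exact match of `motif` in `seq`.

    Positions are 0-based on the forward strand. A '-' hit means the
    reverse complement of the motif was found at that position. A
    palindromic motif is reported once, as a '+' hit.
    """
    seq, motif = seq.upper(), motif.upper()
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits


if __name__ == "__main__":
    # Assumed consensus core, not taken from the review.
    CRE_CORE = "TGACGTCA"
    # Invented toy sequence carrying two copies of the core.
    toy = "AAAA" + CRE_CORE + "CCCC" + CRE_CORE + "GG"
    for position, strand in find_motif(toy, CRE_CORE):
        print(position, strand)
```

Exact matching is the simplest choice; real binding sites tolerate mismatches, so a position weight matrix would be the usual next step.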
The IE 2 region has been further subdivided into IE 2a and b to allow for differential splicing patterns. Transcriptional activity in the IE 2 region changes during infection; in the IE phase of infection spliced mRNAs are evident, whereas late in infection unspliced transcripts are more abundant (Stenberg et al. 1985). As infection proceeds, transcription from the IE 1 promoter is down regulated and transcription at the IE 2 promoter initiates (Hermiston et al. 1987). The IE 2 region has been shown to activate a range of heterologous promoters, for example the adenovirus E2 promoter (Hermiston et al. 1987) and the HIV-1 LTR (Davies et al. 1987). The transactivation resides in proteins encoded by the IE 2a region, although a contribution from the IE 1 region cannot be excluded in order for maximal activation.

Early gene expression

The early phase of CMV gene expression is dependent on IE gene products. Early transcripts have been mapped to all regions of the CMV genome (DeMarchi 1981; Wathen et al. 1981; Wathen & Stinski 1982; McDonough & Spector 1983; Chang et al. 1989).

The IE gene products transactivate the early gene promoters. The mechanism of this activation and the cis-acting elements of the CMV early genes responsible for their temporal regulation have not been extensively studied. One careful report by Spector and colleagues (Staprans et al. 1988) describes the requirements for the activation of an early gene promoter located in close proximity to the IE genes. This early gene unit encodes four phosphoproteins (Wright et al. 1988) translated from a 2.2 kb transcript (Staprans & Spector 1986). By coupling the early gene promoter to chloramphenicol acetyl transferase (CAT) and performing DNA deletion analysis, the area responsive to IE activation was mapped to a region 323 base pairs upstream of transcription initiation. This region contains a 9 bp direct repeat element that was homologous with sequences present in the adenovirus E2 promoter and which has been shown to be important in binding cellular factors such as ATF (Lee et al. 1987; Lee & Green 1987). Co-transfection of cells with the early gene construct and a CMV DNA fragment containing the IE genes led to a dramatic increase in the early promoter activity (118-fold increase over basal activity). Whether the transactivation of this early promoter occurs through the IE 2a gene product, as has been shown for the activation of the adenovirus E2 promoter, remains to be elucidated.

Post-transcriptional regulation of the early genes has also been demonstrated. Elegant experiments by Mocarski and colleagues (Geballe et al. 1986b) have shown that a 142 nucleotide cis-acting element in the 5' leader of a non-spliced early gene located in the TRL region of the genome (McDonough et al. 1985) regulates gene expression post-transcriptionally. When this element was translocated to an IE gene the latter was regulated as if it were an early gene. The cis-dominant signal appeared to act by blocking expression post-transcriptionally until a viral function subsequently activated the gene to yield full expression at the appropriate time during infection. The protein encoded by the early gene has been studied (Geballe & Mocarski 1988) and it has been shown that the presence of short open reading frames (ORF) upstream of the authentic translational start sites affects translation of the mRNA. In the presence of these upstream AUG codons expression of the major ORFs is delayed for several hours. Such short ORFs have been identified in the 5' leader region of other CMV transcripts (Davis & Huang 1985; Jahn et al. 1987b; Kouzarides et al. 1987a, b).

A recent study (Gretch & Stinski 1990) has analysed the transcriptional patterns of the glycoprotein gene family encoded by the Hind III X region (HXLF genes and part of the gcII complex) located in the Us region of the CMV genome. These studies have shown that despite the presence of an enhancer element downstream of the HXLF genes, immediate early transcripts could not be detected. In contrast, all genes were transcriptionally active at both early and late times. It is of interest that the first four gene products (HXLF 1, 2, 3 and 4) are transcribed as bicistronic mRNAs of 1.6 kb (HXLF 1 and 2) and 1.7 kb (HXLF 3 and 4) whereas the HXLF 5 and 6 genes are monocistronic in nature.

The relative complexity of gene expression during CMV infection is highlighted by the fact that the studies discussed above involve two different types of early gene product. One follows the classical description of an early gene, i.e. transcription occurs following IE gene synthesis but is significantly reduced late in infection (Staprans et al. 1988). In contrast, the studies by Mocarski and colleagues (Geballe et al. 1986b) and Gretch and Stinski (1990) have been performed with early genes that are also transcriptionally active late in infection.

Late gene synthesis

Once the early genes have been synthesized, virus replication can occur and late gene synthesis ensues. The majority of late genes studied so far encode virus structural proteins, e.g. the matrix and tegument proteins and the envelope glycoproteins (Gibson 1983; Stinski 1977). Dissection of the cis-acting sequences present upstream of regions of the late genes has not been performed but it is clear that certain late genes must require the presence of immediate early and/or early gene products for their expression.

Recent data (Depto & Stenberg 1989) concerning the transcriptional activation of the pp65 gene have shown that both the IE regions 1 and 2 were necessary for activation of the pp65 promoter as assessed using a CAT assay system. The minimum promoter sequence responsive to the IE genes was 61 nucleotides upstream of the cap site, and an 8 bp sequence, ATTTCGGG, was necessary for the activation of the pp65 promoter. Deletion of this octamer sequence prevented promoter activation. These data suggest that interactions between the IE proteins and this octamer sequence may be important for the regulation of the CMV pp65 gene. It is interesting that other late CMV promoters do not contain this octameric sequence, although other repeats and inverted repeats, including other octamers, have been identified (Cranage et al. 1988; Davis & Huang 1985; Greenaway & Wilkinson 1987; Jahn et al. 1987a; Kouzarides et al. 1987a, b; Staprans & Spector 1986).

Transcriptional regulation of the major late DNA binding protein p52 encoded by the ICP36 transcription unit occurs via the differential utilization of three distinct transcription initiation sites. At early times (8 h post-infection) only the proximal and distal sites are active, whereas at later times (36 h post-infection) the middle start site is activated. The fact that transcription from the middle start site is dependent on DNA replication implies that at late times in infection the gamma-specific regulatory elements in the ICP36 promoter are interspersed with and dominant over the early regulatory elements.

Post-transcriptional and translational events similar to those described for the early genes have also been shown to affect late gene expression (Geballe et al. 1986a).

CMV Proteins

In general, virus specific proteins can be classified into: structural, i.e. found in the virion, or non-structural, i.e. enzymes found within the infected cell responsible for important anabolic steps. It should be noted that since some enzymes may be contained within the virion itself the term 'structural' is inappropriate, so these proteins are described as being virion-associated. This classification is somewhat of an over-simplification because each protein may exist as a precursor and require extensive post-translational modification by cleavage, glycosylation, phosphorylation or myristoylation to reach its mature form. Furthermore, mature proteins may function not in isolation but in association with other proteins, forming large complexes. Also, it is interesting to note


[Figure: map of the CMV genome on a kilobase scale (0 to 230 kb); legible labels include TRS, pp28 and MCP.]
