Molecular biology of retroelements.

VIRUS GENES 4:1, 93-99, 1990 © Kluwer Academic Publishers, Manufactured in The Netherlands

Conference Report

Molecular Biology of Retroelements

HANS WILL¹ AND ROGER HULL²

¹Max-Planck-Institut für Biochemie, München, FRG; ²John Innes Institute, AFRC Institute for Plant Science Research, Norwich, UK

Received November 23, 1989 Accepted November 26, 1989

Introduction

Retroelements are a family of sequences whose propagation or creation involves reverse transcription and that are found in various kingdoms of the living world, ranging from bacteria to humans. The study of retroelements is assuming increasing importance, not only because this group contains several important pathogens, but also because reverse transcription is now recognized as a strong driving force in evolution. A recent EMBO workshop* focused on two major aspects of this rapidly moving subject. A description of the variety and evolution of these elements has been published by Hull and Will (1). Here we summarize some of the reports on the molecular biology of both viral and nonviral retroelements.

Transcription Control in Retroelement Gene Expression

In the terminal redundancy of the RNA pregenomes of retroelements that contain long terminal repeats (LTRs), consensus sequences for processing/polyadenylation occur at both ends. Why these signals are recognized only at the 3’ end is not clear. H. Sanfaçon (Friedrich Miescher Institute, Basel, Switzerland) searched in cauliflower mosaic virus (CaMV) for sequence elements playing a role in the bypass of the 5’ signal and the specific use of the 3’ signal. The data of Sanfaçon and colleagues suggest that bypass of the 5’ signal requires a minimal distance to the 5’ end of the RNA. Deleting the AAUAAA motif (signal sequence for polyadenylation) did not interfere much with the formation of correct 3’ ends, which indicates the existence of an independent signal for 3’ end processing or termination. In contrast to the situation in animals, sequences downstream of the RNA 3’ end are not necessary for correct 3’ end formation of CaMV RNA. A putative stem/loop structure upstream of the AAUAAA sequence appears to be of major importance for correct processing/polyadenylation.

*Molecular Biology of Retroid Viruses and Elements. Organizers: T. Hohn, J. Fütterer, and H. Schaller. Flumersberg, Switzerland, 3-7 April, 1989.

Fig. 1. Genome organizations of selected retroelements. The genomes are represented as being on the full-length RNA. For CaMV the seven open reading frames (ORFs) are shown as I-VII; the X-protein of HBV is indicated; for HIV 1 the small ORFs, some of which are spliced, are listed as a-c; for Ty 1 the gag and pol analogs are shown as A and B.

Y. Shaul (Weizmann Institute of Science, Rehovot, Israel) described enhancer- and transactivator-mediated mechanisms of transcription regulation in hepatitis B virus (HBV). Three sequence elements, E, EP, and NF-1a, have been identified in the HBV enhancer region that are important, in particular for liver-cell-specific nucleocapsid promoter activity. The E element has strong intrinsic enhancer activity, whereas EP (although its presence is crucial for enhancer function) has very little. Therefore, EP appears to act by coordinating specific interactions between enhancer binding elements. A novel EP binding phosphoprotein that may mediate this function has been isolated and characterized. The HBV enhancer function is also mediated by the HBV-encoded X-protein (for genome position, see Fig. 1), which exerts its effect through the E element. A short acidic region of
the X protein has been shown to be a crucial element of its trans-activating function.

The HBV X protein is a potential candidate for involvement in hepatocarcinogenesis. M. Höhne (University of Göttingen, FRG) addressed this question by using immortalized mouse cells, which, after transfection with cloned hepadnavirus DNA, produced virus particles and became malignantly transformed. After passage through nude mice and establishment of stable tumor cell lines, amplification and rearrangement of the integrated viral DNA and strongly enhanced X-gene expression were noted, both probably related to the acquisition of the malignant phenotype. Increasing X-mRNA transcription paralleled induction of c-fos but not c-ras transcription. In addition, increased levels of intracisternal type A particle transcripts and endogenous Moloney murine leukaemia virus (MoMLV) transcripts were observed, whereas VL30 transcription was curtailed. In a similar vein, C. Pourcel (Institut Pasteur, Paris, France), studying the potential cooperation of carcinogens and HBV in hepatocarcinogenesis, found elevated expression of X mRNA in tumor cells established from transgenic mice previously transfected with cloned viral DNA and treated with carcinogens. Taken together, these data indicate that X-gene expression can effectively interfere with the transcription of cellular genes. However, similar observations have so far not been reported for human hepatocellular carcinoma tissue, and therefore the relevance for the in-vivo situation remains to be investigated.

Translation Control in Retroelement Gene Expression

S. Kingsman (University of Oxford, Oxford, England) reported on ribosomal frameshift mechanisms involved in pol-gene expression and identified key differences between human immunodeficiency virus (HIV) and the yeast retrotransposon Ty 1 (for genome organization, see Fig. 1). For HIV, ribosomal frameshifting moves expression from the gag gene into the -1 phase of the pol gene, whereas the analogous pol gene of Ty 1 (TyB protein) is in the +1 phase. Both retroelements require only a short sequence to direct the frameshift, whereas a stem/loop structure is crucial for efficient frameshifting leading to the gag-pol precursor protein of Rous sarcoma virus (RSV). The nature of the minimal sequence required for frameshifting is also different for HIV and Ty. Ribosomal frameshifting in HIV occurs by using a cellular signal that also directs a shift in normal cellular genes (shown for antithrombin III). The sequence required appears to consist of a short run of Ts only. Such runs of Ts are under-represented in cellular mRNAs, which may be of biological significance. For HIV there is obviously no specificity of frameshifting, suggesting that whenever the ribosomes meet such runs of Ts, they may become sloppy in translation, provided there are no structural sequence elements preventing it. For the Ty 1 elements the sequence directing the frameshift has been narrowed down to 11 nucleotides and, in contrast to HIV, there is specificity for +1 frameshifting. Other Ty elements, however, do not show any primary sequence identity with the frameshift sequence of Ty 1 and Ty 2, suggesting that more than one frameshift mechanism may be used by yeast retrotransposons.

In contrast to most retroviruses, the pol gene of all known pararetroviruses (retroviruses that encapsidate DNA, e.g. hepadna- and caulimoviruses) starts with an ATG codon and is located in the +1 frame. By mutation and complementation analysis, it was shown for CaMV (M. Schulze, Friedrich Miescher Institut, Basel, Switzerland) and hepadnaviruses (H.J. Schlicht, ZMBH, Heidelberg, FRG) that the pol gene can be expressed both in vivo and in vitro independently from the nucleocapsid protein gene. Thus, a nucleocapsid-polymerase precursor protein may not be made. One function of the gag polypeptide in the retroviral reverse transcriptase (RTase) precursor is tacitly assumed to be to guarantee encapsidation of the RTase into the nucleocapsid and thus to prevent reverse transcription of cellular RNAs. This raises the question of how the pararetroviral RTase/DNA polymerase is directed specifically towards the viral template. Synthesis of hepadna- and caulimovirus pol proteins might be initiated at internal pol-gene AUGs of the RNA pregenome or on so far unidentified minor pol mRNAs, which could be produced from a cryptic pol-gene promoter. In addition, preliminary data have been presented on the possible existence in CaMV of viral gene products that can trans-activate posttranscriptionally the expression of downstream genes on bicistronic viral mRNAs (J.M. Bonneville, Friedrich Miescher Institut, Basel, Switzerland). Further experiments are needed to completely rule out the in-vivo expression of capsid-pol fusion proteins of wild-type pararetroviruses by ribosomal frameshifting, which may have a role not easily testable in vitro. These data also raise the fundamental question of the role of retroviral gag-pol precursor proteins and whether they are essential for the retroviral life cycle.

A complex picture of translational control mechanisms mediated by untranslated leader regions (UTRs) emerged from studies on the 35S RNA of CaMV (J. Fütterer, Friedrich Miescher Institut, Basel, Switzerland) and the RNA genome of HIV (J. Ciao, GBF, Braunschweig, FRG). In the CaMV UTR both translation-stimulating and translation-inhibiting sequences have been identified. A model was presented that could explain how the general inhibition by certain leader sequences could be alleviated. Ribosome shunting past an inhibiting stretch of sequences could occur by the interaction of two leader sequences with each other and with cellular factors. CaMV gene VI and host, but not nonhost, cellular factors can enhance ribosome shunting. For HIV, a stem/loop structure present at the 5’ end of all viral RNA is thought to be a crucial element in translation regulation during infection. Ciao and colleagues showed that the translational inhibitory effects of the 5’ UTR of HIV observed in fibroblasts, in frog oocytes, and in in-vitro translation systems are not observed in T lymphocytes. Conceivably, in T cells the inhibition is not induced or is compensated by cellular mechanisms. A cellular kinase induced by HIV-UTR-containing RNAs in vitro, which curtails the function of eIF2 in translation by phosphorylation, may play a role in such translational regulation mechanisms. Stem/loop structures in the UTR of RSV do not influence translation initiation, as shown by deletion mutagenesis and gene expression studies (J.-L. Darlix, Centre de Recherche de Biochimie, Toulouse, France).
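As a rough illustration of what the "-1 phase" and "+1 phase" in the frameshifting results of this section mean, the following Python sketch splits a toy mRNA into codons in the original frame up to a slip point and then continues one nucleotide back (-1) or one nucleotide forward (+1). The sequence, the slip position, and the function names are hypothetical and chosen only for illustration; they are not taken from HIV, Ty 1, or any data presented at the workshop.

```python
def read_codons(mrna, start=0):
    """Split mrna into complete triplets, beginning at index start."""
    return [mrna[i:i + 3] for i in range(start, len(mrna) - 2, 3)]


def read_with_frameshift(mrna, slip_after, shift):
    """Read codons in the original frame, then slip by `shift` nucleotides.

    slip_after: number of codons read in the original frame before the slip.
    shift: -1 (reading resumes one nucleotide back) or
           +1 (reading resumes one nucleotide forward).
    """
    if shift not in (-1, 1):
        raise ValueError("shift must be -1 or +1")
    if slip_after < 1:
        raise ValueError("slip_after must be at least 1")
    boundary = 3 * slip_after
    upstream = read_codons(mrna[:boundary])
    downstream = read_codons(mrna, start=boundary + shift)
    return upstream + downstream


# Hypothetical toy message containing a short run of Us (assumption,
# not a real viral sequence).
toy = "AUGGCUUUUUUCGGGAAGAUC"

print("frame 0:", read_codons(toy))
print("-1 slip:", read_with_frameshift(toy, slip_after=3, shift=-1))
print("+1 slip:", read_with_frameshift(toy, slip_after=3, shift=+1))
```

After a -1 slip the last nucleotide of the preceding codon is read a second time, whereas after a +1 slip one nucleotide is skipped; in both cases every downstream codon differs from the original frame, which is how a gag-pol fusion can arise from two overlapping reading frames.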

Function and Processing of Proteins of Retroelements

K. Strebel (NIH, Washington, USA) investigated the function of the HIV proteins VIF and VPU (see Fig. 1 for genome organization). Elimination of VIF has previously been shown to markedly reduce the infectivity of virions. It could now be demonstrated that the VIF protein does not affect CD4/env interaction and has no effect on the release or packaging of virions, nor on replication or synthesis of RTase. On the basis of preliminary data obtained by in vitro translation, Strebel speculated on a possible role of VIF in posttranslational modification (phosphorylation and/or glycosylation) of other viral proteins. VPU is an only recently described functional HIV gene product. As shown by molecular, biochemical, and immunological techniques, the VPU gene product is a nonglycosylated, membrane-associated phosphoprotein expressed in large amounts in infected cells. Although VPU is not virion associated, the absence of functional VPU leads to reduced release of viral particles and intracellular accumulation of viral proteins. This suggests that VPU is a matrix protein involved in the regulation of particle release from infected cells.

The function and processing of pol-gene-encoded proteins was a major theme discussed at the meeting. For duck hepatitis B virus (DHBV), R. Bartenschlager (ZMBH, Heidelberg, FRG) presented evidence for the function of the amino-terminal domain of the pol-gene product as a primer in the initiation of DNA minus-strand synthesis. It is not clear whether initially a full-length pol protein serves as primer that later is cleaved off to release functional RTase, or whether processing does not occur at all, which could implicate a dual function as a primer and as an RTase. So far, there is no evidence for a virus-encoded protease. Mutation analysis suggests that a putative aspartic-like protease, as previously identified by comparative sequence analysis in the nucleocapsid gene, is of no functional relevance (M. Nassal, ZMBH, Heidelberg, FRG). In contrast to hepadnaviruses, an aspartic protease with sequence similarities to RSV protease has been identified in gene V (pol-gene analog) of CaMV (M. Torruella, Friedrich Miescher Institut, Basel, Switzerland). As in retroviruses, this protease can be inhibited by pepstatin A, consistent with it being an aspartic protease, and it cleaves autocatalytically the 80-kD precursor into 58-kD and 22/20-kD polypeptides. Thus, the two types of pararetroviruses not only have major differences in gene organization (CaMV: 7 ORFs, pol-gene domain arrangement is protease, RTase, RNase H; hepadnaviruses: 3-4 ORFs, arrangement of pol-gene domains is DNA minus-strand primer, tether, RTase, RNase H; see Fig. 1), but CaMV clearly uses extensively processed pol proteins, whereas hepadnavirus pol genes appear not to encode an aspartic protease and the pol protein may not need to be processed.


Data on the processing and functional analysis of HIV 1, HIV 2, and simian immunodeficiency virus (SIV) pol-gene products have been presented by several participants. All five enzymatic activities encoded in the pol gene (aspartic protease, RTase, DNA-dependent DNA polymerase, RNase H, and the endonuclease) have been expressed in E. coli (S.F.J. LeGrice, F. Hoffmann-La Roche, Basel, Switzerland; J. Mills, Hoffmann-La Roche, Welwyn Garden City, UK; and K. Mölling, Max-Planck-Institut für Genetik, Berlin, FRG). The major conclusions from these studies were: the aspartic protease mediates all processing events of the gag-pol polyprotein; the RTase is a heterodimer of a 66-kD and a 51-kD protein, the latter polypeptide arising from C-terminal processing of the 66-kD protein, which also gives rise to a 15-kD protein (RNase H); RNase H activity resides both in the p66 and in the p15 polypeptide; the endonuclease activity resides on a 31-kD polypeptide derived from the C terminus of the pol gene and cleaves specifically in the HIV LTR in vitro. By activity gel analysis, RTase activity isolated from virions of HIV 1, HIV 2, and SIV can be detected both with unprocessed gag-pol precursor molecules and with partially processed and completely processed pol peptides (U. Bertazzoni, CNR, Pavia, Italy), suggesting that processing and dimerization are not a sine qua non for RTase activity. How this relates to the in-vivo situation is not clear.

Replication and Retrotransposition of Retroelements

J.L. Darlix (CRBGC, Toulouse, France) in a collaborative study investigated the mechanisms and types of proteins involved in the correct positioning of the retroviral tRNA primer for DNA minus-strand synthesis. Previous studies had demonstrated that in RSV and MoMLV the tRNA primer is positioned by the nucleocapsid proteins p12 and p10, respectively. Studies with HIV now showed specific binding of both the nucleocapsid protein p15 and the RTase heterodimer p66/p51 to the tRNA(Lys) primer. According to Darlix’s model, the first function of the nucleocapsid protein is to interact with genomic RNA, which induces the formation of the dimeric RNA genome complex. For this step probably only two to three molecules of nucleocapsid protein are required. Later, during precapsid formation at the membrane, the tRNA primer is positioned at the primer binding site. The RTase binds to a small domain of the tRNA and induces its annealing to the primer binding site. The nucleocapsid protein is, however, actively involved in unfolding the tRNA and the dimeric RNA genome, and this is necessary for the annealing process.

A detailed study of the mechanisms of hepadnaviral replication and of the sequence elements involved in the initiation of minus- and plus-strand DNA synthesis was presented by C. Seeger (Cornell University, Ithaca, NY, USA). Unlike conventional LTR-containing retroelements, hepadnaviruses use a genome-linked protein for the initiation of DNA minus-strand synthesis. A small RNA derived from the 5’ end of the RNA pregenome is transferred to a specific site with sequence complementarity on the DNA minus strand and serves as a primer for DNA plus-strand synthesis. The results shown demonstrate that a sequence motif only five nucleotides in length will correctly initiate DNA minus-strand synthesis. In contrast to retroviruses, hepadnavirus DNA minus-strand synthesis initiates at the 3’ end of the RNA pregenome. Transfer of the RNA primer for DNA plus-strand initiation requires no specific sequence motif, but only a short sequence complementarity of the RNA primer sequence with the DNA plus-strand primer site. The 3’ end of the RNA primer, presumably created by RNase H cleavage, is determined prior to the transfer and appears not to have stringent sequence requirements. A minor site for the synthesis of DNA plus strands has been identified at a polypurine tract located close to the 3’ end of the RNA. This site is, however, not essential for virus viability, as shown by mutation analysis.

Mechanisms of retrotransposition have been studied by J. Boeke (Johns Hopkins School of Medicine, Baltimore, MD, USA) using Ty 1 of yeast. Looking at integration into two loci (the URA3 and LYS2 genes), he showed that transposition can occur at a very large number of sites. Hot spots that receive multiple independent transpositions into the same site (to the nucleotide) exist, although a real consensus sequence for transposition does not. Transposition is accompanied by a high rate of recombination in the vicinity of a newly transposed intact Ty element. An in-vitro system has been established that allows the mimicking of transposition events indistinguishable from those occurring in vivo. A similar system has recently been established for MoMLV integration. These data suggest that mechanisms of retroviral integration and retrotransposition are very similar.

Conclusions

The observations presented on the molecular biology of primarily viral retroelements illustrate the complexity of the replication and expression of these molecules. They are revealing the subtle and intricate mechanisms by which these molecules bypass the constraints of the normal cellular machinery. Whether the majority of nonviral retroelements use similar mechanisms remains to be seen.

Acknowledgment

We thank M. Harvey for preparing the figure.

Reference

1. Hull, R. and Will, H. Trends Genet. 5, 357-359, 1989.
