Nuclear RNA-binding Proteins

JACK D. KEENE AND CHARLES C. QUERY

Department of Microbiology and Immunology
Duke University Medical Center
Durham, North Carolina 27710

I. RNA-binding Proteins
II. RNA-Protein Interactions
    A. Detection of RNA Binding in Vitro
    B. Sequence Similarities among RNA-associated Proteins
III. RRM Family of Proteins
    A. Members
    B. RNA Recognition Motif
    C. Origins of the RRM Family
    D. Evidence for Direct Interaction of the RRM with RNA
    E. Specificity of RNA Recognition
    F. Do RRMs Constitute “RNA-binding Domains”?
IV. Structural Features of RN...
V. Regulatory Potentials of the RRM Family of Proteins
VI. Conclusions and Perspectives
References
Note Added in Proof

The control of gene expression involves several steps at which specific sequences in pre-mRNA transcripts, as well as those in small RNA molecules, are recognized by proteins. RNA-binding proteins can be expected to mediate interactions in a variety of cellular processes, including those occurring in the transcription complex, the spliceosome and the ribosome. Members of one family of nuclear proteins that bind to RNA contain a specific RNA recognition motif (RRM). The RRM family of proteins functions at several levels in RNA processing and some family members are involved in tissue-specific as well as developmentally regulated gene expression. This review describes the proteins that contain this RRM and discusses the potential involvement of these proteins in the control of gene expression at the level of RNA processing. These proteins are modular in structure and often contain at least two types of interactive surfaces, one or more that interacts specifically with RNA and another that interacts with other molecules. Studies to date indicate that, despite the strong homology among these proteins, they have unique properties of recognition that allow them to distinguish RNAs of diverse structure.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 41. Copyright © 1991 by Academic Press, Inc. All rights of reproduction in any form reserved.

I. RNA-binding Proteins

RNA in living cells is rarely naked. In most cases, RNA is bound to proteins recognizing regions that include RNA termini, homopolymeric sequences, or specific stem-loop structures. The attachment of protein to RNA may be covalent, as in the case of the poliovirus VPg protein, or, more commonly, may involve hydrophobic interactions, hydrogen bonding, or charge pairing. The less specific proteins may recognize just the phosphate backbone of RNA, and the more specific proteins may interact with RNA bases. The first three-dimensional determination of RNA structure was obtained by X-ray crystallography of phenylalanine tRNA (1, 2); more recently, co-crystals of glutamine tRNA synthetase and glutamine tRNA have been reported (3). These latter studies demonstrated contacts between the protein and both the bases and the phosphate backbone of the RNA. In some cases the amino-acid-RNA contacts involved unexpected residues.

An overview of RNA-associated proteins is presented in Table I. Among the many known RNA-associated proteins, the bacteriophage R17 coat protein is the best understood (4, 5). This protein contacts a stem-loop structure of the phage RNA and depends on specific sequences in the loop for recognition. The sequence of the RNA stem is less critical; however, a specific “bulged” nucleotide in the stem is vital for high-affinity binding of the protein. In other recent examples, bulged nucleotides have been found to be dispensable (O. Uhlenbeck, personal communication).

It has long been known that transcription cofactor TFIIIA can bind to the gene encoding 5-S RNA and mediate the function of RNA polymerase III (6). TFIIIA can also bind to 5-S RNA, and an autoregulatory mechanism for the control of 5-S transcription was proposed (6). Studies of the structure of TFIIIA have implicated the zinc-binding fingers in nucleic acid recognition (7).
In recent years, viral proteins involved in interactions with the RNAs of the human immunodeficiency virus (HIV) have been identified. These include the tat, rev, and rex proteins implicated in RNA processing and transport of viral mRNA to the cytoplasm (8). These proteins interact with specific regions of the lentivirus RNAs such as the tar and the rev-response element, as recently demonstrated by direct RNA binding studies (8-11).

Proteins that are uniquely associated with heterogeneous nuclear RNA (hnRNA) or with small nuclear RNAs (snRNAs) are known, but few studies have demonstrated their direct interaction with RNA. Several of these proteins may attach to ribonucleoprotein (RNP) complexes through protein-protein interactions and others may bind directly to RNA. Methods to examine direct interactions have involved cross-linking with UV light¹ (12, 13) and in vitro reconstitution of the RNP complex (see Section II). In some cases, these approaches have allowed discrimination between direct and indirect attachment of proteins to RNA.

TABLE I
RNA-ASSOCIATED PROTEINSᵃ

A.
    Heterogeneous nuclear RNP proteins (hnRNP proteins A1, A2/B1, C1/C2, E, L)
    Small nuclear RNP proteins (snRNP proteins 70K, A, B″, PRP-24ᵇ)
    mRNP proteins (poly(A)-binding protein, eIF-4B)
    Pre-rRNP protein (nucleolin)
    Splicing factors (ASF/SF2ᶜ; pPTBᶠ; Drosophila tra-2, Sxl)
    Transcription factors (La, E. coli rhoᵈ)
    Helix-destabilizing proteins (UP1, HDP, SSB1)
    Others: Ro-60K; Neuronal protein (elav); yeast NSR1ᵉ; Malaria CARP; Phage proteins (T4 gp32ᵈ, φ29 gp10); Maize AAIP; Chloroplast proteins (28, 31, and 33 kD); Drosophila bicoidᵈ

B.
    Ribosomal proteins
    Signal recognition particle proteins
    tRNA synthetases
    RNA virus core and nucleocapsid proteins
    RNA-dependent DNA or RNA polymerases
    Ribonucleases
    RNA-RNA helicases
    Others (TFIIIA, rev, rex, tat)

ᵃ RNA-associated proteins that possess (A) or lack (B) the RNA recognition motif (RRM) depicted in Fig. 2. These proteins are known to be associated with RNA; some are known to contact RNA directly and others are present in RNP complexes but may not be in direct contact with RNA. In most categories within part (A), examples also exist that do not possess the RRM. Abbreviations: PRP, precursor RNA processing; eIF-4B, eukaryotic initiation factor-4B; ASF, alternative splicing factor; SF2, splicing factor 2; tra-2, transformer-2; Sxl, sex-lethal; SSB1, yeast single-stranded binding protein; elav, embryonic-lethal abnormal-visual; AAIP, abscisic acid-inducible protein; NSR1, nuclear signal recognition protein; CARP, clustered-asparagine-rich protein; pPTB, polypyrimidine tract-binding protein. UP1 is proteolytically derived from hnRNP A1.
ᵇ K. Shannon and C. Guthrie, personal communication.
ᶜ H. Ge, P. Zuo and J. L. Manley; and A. Krainer, personal communication.
ᵈ Atypical RRM. These proteins share partial similarity with the conserved RNP-1 and RNP-2 sequences, but do not match at some other positions expected to be critical for the structure of the domain, such as conserved hydrophobic residues (29).
ᵉ Teri Mélèse, personal communication.
ᶠ M. Garcia-Blanco, personal communication.

II. RNA-Protein Interactions

A. Detection of RNA Binding in Vitro

Methods of studying RNA-protein complexes isolated from cells have been available for several years (reviewed in 14), but have not allowed determination of which proteins in the complex were in contact with the RNA. Several methods have been used to examine directly complexes formed in vitro (e.g., fluorescence quenching), but have had limited applicability because of the requirement for purified components. More recently, other methods that do not require homogeneous materials have been employed. Some useful methods are discussed below, and some examples are shown in Fig. 1 using the U1-snRNP-A protein.

¹ See the article on this point by Budowsky and Abdurashidova in Vol. 37 of this series. [Eds.]

FIG. 1. Five different methods of detecting RNA binding. (A) Immunoprecipitation of an in vitro translated (TI) protein containing an epitope tag using a tag-specific antibody (16, 27). Similar immunoprecipitation methods for RNA binding have used autoantibodies from patients or antibodies produced in animals (26, 32). (B) The WestNorthern blot procedure uses radiolabeled protein and unlabeled RNA transferred to a solid surface (C. C. Query and J. D. Keene, unpublished). (C) A mobility-shift assay with radiolabel in the protein, (D) the Northwestern blot procedure, and (E) a mobility-shift assay with the radiolabel in the RNA are according to published procedures described in the text. In each example shown here, the U1-snRNP-A protein was used to bind to U1 RNA, stem-loop II of U1 RNA (SL2), a deletion of SL2 (ΔSL2), or stem-loop I of U1 RNA (SL1). The epitope tag used in (A) is the 12-amino-acid gene-10 peptide of phage T7 (16, 27, 29). Comp, nonspecific competitor RNA.

[Fig. 1 panel labels: C. Mobility Shift; D. Northwestern Blot; E. RNA Mobility Shift (U1 SL1 RNA, U1 SL2 RNA).]

1. FILTER BINDING

A standard approach to the study of both DNA and RNA binding involves attachment to nitrocellulose filters (15). Protein binds directly to various solid substrates, but nucleic acids require special treatments, such as chemical denaturation or UV light exposure, to bind to these materials. However, if the nucleic acid binds to the protein that, in turn, binds to the nitrocellulose, one can identify specific protein-nucleic acid interactions by labeling the nucleic acid. Such methods have been used to study protein-RNA binding quantitatively, and dissociation constants have been approximated in this manner (4).

The “Northwestern blot” method is a modification of the filter-binding assay, in which the protein is separated on acrylamide gels prior to transfer to nitrocellulose and probing with labeled RNA (Fig. 1D). When denaturing polyacrylamide gels are used, renaturation of the binding domain of the protein is required. Thus, some RNA-binding proteins are not amenable to analysis by this method. Furthermore, proteins that require accessory factors for RNA binding will not be detected by Northwestern blotting (16). On the other hand, this method has the advantage of not requiring previous purification of protein, and it may serve as an alternative method when antibody precipitation and other methods are not feasible. For example, Northwestern blotting has been used to identify new proteins that bind to an RNA sequence of interest (17, 18), and to determine the RNA sequence specificity of a known RNA-binding protein (19). Recently, this method was used to determine a site of recognition of the Drosophila sex-lethal (Sxl) protein on the alternatively spliced pre-mRNA of the transformer (tra) gene (20).

2. FLUORESCENCE QUENCHING

Under circumstances in which aromatic amino acids such as tryptophan, tyrosine, and phenylalanine are involved in RNA binding, it has been possible to detect binding by measuring the reduction of fluorescence (21). This method is very sensitive if the correct amino acids are involved in RNA contact, but requires the use of highly purified components. The method is most sensitive when tryptophan emission is altered upon binding, and less sensitive for detecting phenylalanines, because their emission can be obscured by nucleic acids. Fluorescence-quenching techniques have been used to quantitate interactions of the yeast poly(A)-binding protein with poly(A) RNA (22).

3. UV-CROSS-LINKING

The ability to cross-link RNA covalently to protein using UV light¹ has become increasingly popular in recent years as an indicator of RNA binding. The method is dependent on an appropriate juxtaposition of photoreactive RNA bases and amino acids (23). For example, pyrimidine bases, especially uracil, can be cross-linked to hydroxylated amino acids (e.g., tyrosine) that are in close proximity. Thus, this method is limited to coincidental proximities, but has proved useful in many cases.¹ It has also been criticized as not being a measure of specific RNA-protein binding, but only one indication of an association. Under some conditions, one may be able to cross-link proteins that are not directly bound to RNAs. Thus, precautions should be taken in the interpretation of interactions involving UV-cross-linking. More recently, in combination with in vitro RNA competition assays, UV-cross-linking has been used as a method to detect sequence-specific RNA interactions (12, 24).

4. ANTIBODY PRECIPITATION

The presence of autoantibodies reactive with snRNA-binding proteins allows a convenient method of selection of RNPs containing these proteins (25). These antibodies can immunoprecipitate either RNPs synthesized in vivo or complexes bound in vitro. However, the use of auto-antisera often is compromised by the presence of more than one antibody specificity or by interactions with the RNP that differ from those occurring with the protein alone. Sera from animals immunized with a specific RNA-binding protein or the use of an antibody reactive with an antigenic “tag” that has been attached to a recombinant protein can circumvent these difficulties (26, 27). An example of the latter is a phage-T7 gene-10 peptide of 12 amino acids fused to the amino terminus of proteins expressed from certain vectors (28). Antibodies to this tag (16, 27, 29) allowed efficient immunoprecipitation of various in vitro bound complexes (Fig. 1A).

5. MOBILITY SHIFT

Nondenaturing gel electrophoresis has been widely used to assay DNA-protein complexes (30). RNA-protein complexes may similarly be examined. For example, a shift in the mobility of a labeled RNA in the presence of cellular extracts and excess competitor RNAs may indicate the presence of a specific binding protein (Fig. 1E). In some cases, resistance of nucleotides in the complex to ribonuclease has been used to indicate specificity (9, 31). Alternatively, RNA binding can be assessed by a change in the mobility of labeled protein, produced by in vitro translation, in the presence of specific RNAs (27, 32) (Fig. 1C). In all cases, competition experiments using nonspecific RNAs are required to verify the specificity of the complex formed. Methods using nondenaturing gels for analysis of large multicomponent complexes such as spliceosomes and polyadenylation complexes have been reviewed recently (33).


6. BIOTINYLATION OF RNA

Methods have been developed to bind proteins to RNAs containing biotinylated nucleotides so that the RNP complex can be isolated on immobilized avidin. The use of biotinylated RNA as a handle for studying RNA-protein interactions has centered on the isolation of components of the spliceosome (33-35). Recently, such methods also have allowed study of the binding of U1 RNA to the U1-snRNP-A protein (36). This method has the advantage of allowing detection of labeled proteins from whole-cell extracts, as well as from in vitro translation systems. On the other hand, background binding is often present, and in vitro transcribed RNA must be used. Thus, if the RNA folds incorrectly, if modified bases are required, or if biotinylation interferes with binding, this method would be less useful.

7. WESTNORTHERN BLOTTING

A counterpart to the Northwestern blot is the WestNorthern blot (Fig. 1B), which is a Northern blot probed with labeled protein (37). This method of RNA binding has many of the same advantages and disadvantages as the Northwestern blot, but has other applications. For example, conditions of binding need not be compatible with antibodies or gel electrophoresis. However, only a limited number of proteins have been amenable to this technique. One useful application of Northwestern and WestNorthern blots in studying RNA-protein interactions is screening expression libraries to isolate the cognate ligand. We have recently used this method to bind sequences in stem-loop II of U1 RNA that interact with the U1-snRNP-A protein (37).

In summary, a variety of RNA-binding methods have been developed that allow the study of RNA-protein interactions, and they have various advantages. No single method is ideal for all such interactions, and different methods may have to be applied in each individual case. Quantitative binding studies using different methods indicate that most RNA-binding proteins studied to date have association constants in the range of 10⁷ to 10⁹ M⁻¹, corresponding to dissociation constants of about 10⁻⁷ to 10⁻⁹ M (4, 16, 22, 27, 37-39).
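The affinities quoted above are expressed in M⁻¹, the units of an association constant; the dissociation constant is simply its reciprocal. The following Python sketch illustrates that conversion and the fraction of RNA expected to be bound at a given protein concentration. It assumes simple 1:1 binding with protein in large excess over RNA, which is a simplification and not a model taken from the cited studies; the function names are hypothetical.

```python
def dissociation_constant(k_assoc_per_molar: float) -> float:
    """Convert an association constant (M^-1) to a dissociation constant (M)."""
    if k_assoc_per_molar <= 0:
        raise ValueError("association constant must be positive")
    return 1.0 / k_assoc_per_molar


def fraction_rna_bound(protein_molar: float, k_assoc_per_molar: float) -> float:
    """Fraction of RNA in complex for 1:1 binding (assumed model).

    Assumes free protein is approximately equal to total protein,
    i.e. protein is in large excess over labeled RNA.
    """
    x = k_assoc_per_molar * protein_molar
    return x / (1.0 + x)


if __name__ == "__main__":
    # Range of association constants quoted in the text: 1e7 to 1e9 M^-1.
    for k_assoc in (1e7, 1e8, 1e9):
        kd = dissociation_constant(k_assoc)
        bound = fraction_rna_bound(1e-8, k_assoc)  # 10 nM protein, illustrative
        print(f"Ka = {k_assoc:.0e} M^-1, Kd = {kd:.0e} M, bound at 10 nM = {bound:.2f}")
```

Under this model, half of the RNA is bound when the protein concentration equals the dissociation constant, which is why affinities in this range call for binding assays run at nanomolar to sub-micromolar protein concentrations.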

B. Sequence Similarities among RNA-associated Proteins

Sequence elements characteristic of a group of RNA-associated proteins began to emerge with the observation (40) of four copies of an 80-amino-acid repeat in the poly(A)-binding protein. The most conserved region of eight amino acids common among these repeats and in the hnRNP-A1 protein led (41) to the term RNP consensus sequence to describe the octamer. Other groups also noted the presence of conserved sequences in other RNA-associated proteins and speculated that such sequences might be involved in direct RNA binding (42-56). More recently, conservation of the octamer and a separate hexamer sequence in a collection of RNA-associated proteins was noted (49, 50). The amino-acid consensus sequences were referred to as RNP 1 and RNP 2, respectively.

Several workers independently observed extensively conserved sequences in a diverse collection of approximately 20 proteins ranging in source from Escherichia coli to humans, and the region of sequence similarity was found to span approximately 80 residues (32, 42, 49, 55). The presence of the 80-amino-acid motif in a broad family of proteins and direct evidence for interactions of this motif with RNA led to its designation as an “RNA recognition motif,” or RRM (32). It has also been referred to as an “RNP consensus sequence type-RNA binding domain” (44) and an “RNP-80 motif” (36). However, it is certain that not all RNA-binding proteins contain this RRM; other motifs may yet be discovered. The variety of RNA-associated proteins identified to date is outlined in Table I. Those found to contain at least one RRM are listed in Section A, and those apparently lacking this motif are listed in Section B.

III. RRM Family of Proteins

A. Members

Several members of this family of RNA-associated proteins are involved in aspects of RNA metabolism, including pre-mRNA transcription, splicing, and, possibly, stability and transport. Proteins containing the RRM are associated with hnRNA (A1, B1/A2, C1/C2, E, and L proteins), small RNAs (A, B″, 70K,² La, and Ro-60K² proteins), mature mRNAs [poly(A)-binding protein (PAB protein) and eIF-4B], and pre-rRNA (nucleolin) (Table I). Some RRM-containing proteins are helix-destabilizing proteins (UP1, HDP, and SSB1), and one is a translational repressor (T4 gp32). Other members, such as Drosophila embryonic-lethal-abnormal-visual-system (elav), p9, tra-2, and Sxl, were implicated in RNA-protein interactions because they contain the RRM. Genes tra-2 and Sxl are linked in a regulatory cascade pathway of alternative pre-mRNA splicing. The product of Sxl interacts with the tra pre-mRNA (20); however, tra-2 has not yet been shown to contact any specific RNA molecule. It is logical to predict that it may associate with pre-mRNAs or with the snRNAs that interact with pre-mRNA (reviewed in 45).

² K here represents kDa, kilodaltons (or kilo-atomic-mass-units, kamu). However, 70K is the name given (53) to a 52-kDa protein (32). [Eds.]

B. RNA Recognition Motif

The RRM of 80 amino acids contains the RNP consensus octamer near the center. Conserved residues are present on both sides of the RNP octamer, but are more abundant in the amino-terminal half of the motif. About 30 residues toward the amino terminus from the octamer is the hexamer of conserved amino acids corresponding to RNP 2 (49). In addition to these regions, there are many other conserved positions in the RRM that contain predominantly phenylalanine, glycine, or alanine residues (32). An amino-acid consensus for the RRM has been derived that assigns residues as shown in Fig. 2. Nine residues within the RRM consensus are conserved as particular amino acids and therefore may be essential for the RNA binding function or for the structure of the RNA-binding domain.

In determining structure/function relationships within the RRM, a central question is: Which residues are essential for sequence-specific RNA binding? Furthermore, are sequences inside or outside of the RRM essential for sequence specificity? A short amino-acid sequence flanking the amino-terminal side of the RNP octamer is critical for the specificity of recognition of U1 and U2 RNAs by the A and B″ proteins, respectively (16). No information is presently available regarding specific amino-acid contacts with bases and phosphates in the RNA. Thus, it is not known which regions within the RRM influence affinity for RNA, but it has been suggested that the RNP octamer has this potential. These questions can be addressed, in part, by site-directed mutagenesis and subdomain switching among RRMs. Ultimately, three-dimensional structure data will be required to resolve many of these issues.
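The layout described above (an octamer near the center of the motif, with a hexamer about 30 residues toward the amino terminus) lends itself to a simple sequence scan. The Python sketch below shows one way such a scan might look. The two residue patterns are simplified illustrations chosen for this example and are assumptions; they are not the Fig. 2 consensus, and the function name, gap window, and test sequence are hypothetical.

```python
import re

# Illustrative, simplified patterns (assumptions, not the Fig. 2 consensus).
RNP1_OCTAMER = re.compile(r"[RK]G[FY][GA]F[VIL].[FY]")
RNP2_HEXAMER = re.compile(r"[LI][FY][VI][GK][NG][LM]")


def find_candidate_rrms(seq, min_gap=20, max_gap=40):
    """Return (hexamer_start, octamer_start) pairs, 0-based.

    A pair is reported when a hexamer-like match starts between
    min_gap and max_gap residues amino-terminal to an octamer-like
    match, loosely reflecting the "about 30 residues" spacing.
    """
    seq = seq.upper()
    hexamer_starts = [m.start() for m in RNP2_HEXAMER.finditer(seq)]
    hits = []
    for octamer in RNP1_OCTAMER.finditer(seq):
        for h in hexamer_starts:
            if min_gap <= octamer.start() - h <= max_gap:
                hits.append((h, octamer.start()))
    return hits


if __name__ == "__main__":
    # Synthetic sequence built for the example; not a real protein.
    toy = "MA" + "LFVGNL" + "S" * 24 + "RGFGFVTF" + "EE"
    print(find_candidate_rrms(toy))
```

A scan of this kind only flags candidates. As the text notes, several family members match the consensus poorly, so a pattern search is at best a first pass before alignment against the full 80-residue motif.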

C. Origins of the RRM Family

The similarities manifest among the proteins containing the RRM make it likely that many of these proteins are truly homologous. It has been proposed that, in some cases, exons represent protein domains that rearrange as units in evolution (57). The question of whether the RRM is coded within a single exon has been examined for three cases: La, nucleolin, and U1 snRNP-70K (the gene for yeast PAB protein has been sequenced, but has no introns). The human La (47) and mouse nucleolin (58) genes contain introns within each of the conserved octamer sequences, although the Xenopus 70K gene does not (59). All three genes contain introns at positions near the flanking edges of the RRMs. Thus, it appears that the RRMs of these proteins may contain distinct structural elements encoded on separate exons. Exon shuffling as well as gene duplication events may have contributed to the evolution of these RNA-binding proteins.

D. Evidence for Direct Interaction of the RRM with RNA

Evidence that the RRMs of RNA-associated proteins interact with RNA comes from several approaches. The 320-amino-acid hnRNP-A1 protein and a 195-amino-acid proteolysis fragment of hnRNP-A1 (UP1) can bind single-stranded (ss) DNA and ssRNA (60). Two 92-amino-acid fragments isolated from the UP1 portion of hnRNP-A1 were UV-cross-linked from their phenylalanine residues to oligo-deoxythymidine [positions 3 (aromatic) and 46 (phenylalanine) in Fig. 2] (52).

The yeast PAB protein, which contains four RRMs (Fig. 3) and is essential for growth, can be truncated to a 66-amino-acid fragment and still support some degree of growth (22). This peptide contains the amino half of one RRM, but no RNP octamer, which suggests that the RNP octamer may not be required for the function of the PAB protein. This is so far the only genetic system in which the dispensability of RRM sequences has been examined.

A 33-kDa fragment of nucleolin can interact with processed 18-S and 28-S RNA in a Northwestern assay, although neither the RNA sequence specificity of this association nor interactions with pre-rRNA was examined (46). This 33-kDa binding fragment encompassed nearly three 80- to 90-amino-acid RRM repeats, suggesting that at least part of the RNA-binding activity is associated with the RRMs.

[Fig. 2 graphic: consensus residues by position (numbering 40, 50, 60, 70, 80 visible); label “RNP 1 or RNP octamer”.]

FIG. 2. Amino-acid consensus for the RNA recognition motif family of proteins (32). The residues are designated PO (polar), AL (aliphatic), AR (aromatic), AC (acidic), HO (hydrophobic), AB (amides, acids, or bases), BA (basic), AA (amides or acids), NP (nonpolar), CH (charged), X (unassigned), A (alanine), M (methionine), D (aspartic acid), T (threonine), G (glycine), F (phenylalanine), and R (arginine). The most conserved regions (designated RNP 1 for the RNP consensus sequence and RNP 2) are bracketed (44) and the most highly conserved positions are shaded. Dashes indicate positions of variability between family members, where insertions or deletions occur. Asterisks indicate the two aromatic residues in the hnRNP-A1 protein that were cross-linked to oligo-deoxythymidine (52).

FIG. 3. Structural characteristics and modular features of the RNA recognition motif (RRM) family of proteins. Two classes are shown that correspond to the presence of single (class-I) or multiple (class-II) repeats of the RRM. The highly charged arginine-rich regions of the U1-snRNP-70K and transformer-2 proteins are designated RD/RE/RS. Potential nucleotide (ATP)-binding and zinc-binding finger (Zn2+) motifs are also noted. Abbreviated proteins are defined in the text with appropriate references and as described (32). The diagrams are not necessarily to scale, and the sizes of the regions flanking the RRMs vary. Note that UP2 represents a partial clone.


A prokaryotic example of an RNA-binding protein within the family is the E. coli transcription termination factor rho (61). A proteolytic fragment of 155 amino acids from the amino terminus specifically bound RNAs that contained rho-dependent termination sequences. This domain has subsequently been observed to contain an 80-amino-acid RRM (32), although it possesses less sequence similarity to the consensus than most other members of the RRM family.

A minimal region constituting an RNA-binding domain of a protein within the RRM family has been defined for the U1-snRNP-70K protein (32). The protein was progressively truncated until a 125-amino-acid fragment retained the specificity and affinity of the full-length protein for U1 RNA in a direct binding assay, whereas smaller fragments did not retain activity. For the 70K-U1 RNA interaction, 35 residues carboxy-terminal to the RRM were required for the full function of the domain; the RRM alone did not bind to U1 RNA. Thus, the RNA-binding domain of this protein includes many residues external to the RRM. Determinants of specificity were proposed to reside at nonconserved positions within the RRM or at residues external to the RRM.

Recent studies of the U1-snRNP-A protein indicate that a single-unit RNA-binding domain encompasses only one of two RRMs (27, 36). In this case, only six residues in addition to the 80-residue RRM constituted an 86-amino-acid RNA-binding domain for U1 RNA. Thus, the flanking sequences required for the A protein to bind U1 RNA are fewer than those required for the 70K protein to bind.

Studies of the RNA-binding features of the La and Ro-60K proteins demonstrate that large regions of the protein external to the RRM are essential. The La protein can be deleted at its carboxy terminus and still retain binding activity, but removal of even a few amino acids at the amino terminus completely abolishes binding to precursor transcripts of RNA polymerase III (29; S. Clarkson, personal communication).
The Ro-60K protein is more fragile in that removal of even a few amino acids at the amino or carboxy terminus abolishes binding to the hY (human cytoplasmic Y) RNAs (62). Thus, the La (47) and Ro-60K (48) proteins may be atypical in comparison to other RRM family members, but they demonstrate the diverse nature of the RNA-binding domains in that long-range intramolecular interactions may be involved in dictating the specificity of RNA recognition.

Another recent variation on the theme of diversity in the recognition of RNA by members of the RRM family involves the U2-snRNP-B″ protein that binds to both U1 and U2 RNAs with low affinity in vitro. Upon addition of the U2-snRNP-A′ protein, however, the affinity of B″ for U2 RNA increases at least 100-fold (16). The A′ protein appears to act on B″ through protein-protein interactions involving a region of leucine periodicity in A′ (81). These findings open the possibility that long-range interactions from within a protein (intramolecular) or interactions involving separate accessory proteins (intermolecular) may influence the specificity and affinity of RNA recognition and binding by members of this family of proteins.

Analysis of the RNA-binding domain defined for the U1-snRNP-A protein (27, 36) using two-dimensional nuclear magnetic resonance has revealed a highly ordered structure for this module (D. W. Hoffman, C. C. Query, B. L. Golden, S. W. White, and J. D. Keene, unpublished). This RNA-binding domain consists largely of β-structure, with four antiparallel strands and two α-helices. Critical aromatic residues in RNP 1 and RNP 2 believed to contact RNA are adjacent to one another in a β-sheet, and both project to the surface of the molecule. These adjacent aromatic residues are highly constrained and intolerant to significant variation (37, 62).

Evidence to date demonstrates that an RRM can be a component of an RNA-binding domain. However, the requirement of particular residues within this motif and of sequences outside of it, as well as the contributions of accessory proteins, vary among members of the family. These findings indicate that the structural features of the RRM proteins that allow recognition and binding to RNA are complex and are likely to involve a variety of molecular interactions that are unique to each protein.

E. Specificity of RNA Recognition

The diverse functions of various members of the RRM family of proteins are reflected in the different RNAs they recognize (Table II). However, the RNA-binding site has been defined in only a few cases. For example, PAB protein recognizes the homopolymer poly(A) (22, 40, 41). Although the sequence is simple, poly(A) can assume complex higher-order structures (63), and the role of base-stacking in this protein-nucleic acid interaction has not been studied. An affinity of 2 × 10⁷ M⁻¹ for one of the binding domains of PAB protein has been determined (22). It is estimated that 12 adenylate residues constitute the binding site on poly(A). It was also suggested (22) that multiple poly(A)-binding domains on PAB protein could allow transfer of the protein between strands of poly(A).

The helix-destabilizing proteins (UP1, UP2, HDP, and SSB1) (32) appear to recognize ssDNA and ssRNA with relatively little sequence specificity and thereby destabilize native base-paired structures (60, 63-65). UP1 has an affinity for ssDNA of approximately 10⁷ M⁻¹ (60, 66). The binding to DNA by both UP1 and T4 gp32 proteins has been extensively investigated and reviewed (67). T4 gp32 also binds to an RNA stem-loop structure in the 5' untranslated region of its mRNA in order to autoregulate its translation (68).

Binding sites in RNA for the hnRNP proteins A1 and C1/C2 have been studied by in vitro binding and UV-cross-linking. A subset of these proteins has been shown to bind to intron sequences near splice acceptor sites that included the pyrimidine-rich tract in pre-mRNA (69). Others have shown that these proteins can be UV-cross-linked to RNA sequences near the AAUAAA polyadenylation signal (12, 13). These apparently conflicting results may be reconciled by the proposed existence of high- versus low-affinity binding sites along the pre-mRNA for these proteins (69). The technique of UV-cross-linking does not allow measurements of affinity and is also limited by the possibility that cross-linked regions of RNA and protein may represent coincidental proximity of reactive bases and amino acids rather than true binding sites (23). These issues will be resolved by the development of direct binding assays and binding affinity determinations. However, UV-cross-linking does have the significant advantage of allowing examination of protein-RNA association in vivo.

TABLE II
RNA SPECIFICITIES OF THE RRM FAMILY (a)

Protein                         RNA                           Specific sequence
UP1/UP2                         Single strand                 Nonspecific
hnRNP proteins                  Pre-mRNA                      Nonspecific (b)
E. coli rho                     RNA transcripts               Cytosine-rich
PAB protein                     Poly(A)                       Homopolymer
La                              Polymerase-III transcripts    3'-terminal oligo(U)
Ro-60K                          hY RNAs                       Specific
U1 snRNP-70K and -A             U1 RNA stem-loop              Specific
U2 snRNP-B″                     U2 RNA stem-loop              Specific
Sex-lethal (Sxl)                transformer (tra) pre-mRNA    Specific
Nucleolin                       Pre-rRNA                      Unknown
Others: tra-2, elav, bicoid,    Unassigned                    Unknown
  AAIP, CARP, NSR1, etc.

(a) RNA sequence specificity involved in recognition by the RRM family of cellular proteins. Proteins that contain the RRM are listed in approximate order of complexity of the specific RNAs with which they associate.
(b) Preferences for homopolymers or pyrimidine-rich sequences have been demonstrated in some cases.

The specificity of RNA recognition by the La (70, 71) and Ro-60K (72) proteins has been demonstrated. The main recognition target of the La protein is the sequence of 3'-terminal uridylates in precursor RNA-polymerase-III transcripts (reviewed in 73). The Ro proteins were shown by nuclease protection studies to bind the stem structure of 3' and 5' base-paired termini of the hY RNAs, which are also RNA-polymerase-III transcripts (72). In vitro binding assays using recombinant La protein (47) and recombinant Ro-60K protein (48) were recently developed, but the specificity, stoichiometry, and affinity of RNA binding have not been reported.

Rho-dependent transcription termination involves the binding of the protein to untranslated regions of mRNA (61). Rho has affinity for both ssRNA and ssDNA, and high affinity for poly(C) (reviewed in 74 and 75). The ability of rho to recognize both RNA and DNA may be related to its function


in unwinding RNA-DNA duplexes during the process of transcription release.

The RNA recognition properties of the snRNP proteins 70K, A, and B″ have been studied using both cell extracts and in vitro binding assays. The 70K protein and perhaps the A protein were suggested to contact stem-loop I and probably other regions of U1 RNA (76, 77). Recombinant 70K protein requires 31 nucleotides of U1 RNA stem-loop I for binding (26). Contacts with other regions of the RNA were not detected by these assays (26), but could not be ruled out. In addition, direct binding assays show that 36 nucleotides in stem-loop II of U1 RNA are sufficient to bind to the A protein (19). It has been suggested (78) that B″ binds to stem-loop III or IV of U2 RNA. The extensive sequence similarity of its RRMs to those of the U1-snRNP-A protein (79) suggests that it probably recognizes a discrete sequence in a manner similar to the U1-snRNP-A protein (19). Recent studies indicate that stem-loop IV of U2 RNA is involved in the major contact with the B″ protein (16).

Studies of the sequence specificity of RNA binding by the RRM family of proteins have suffered, in part, from a lack of precise probing methods. RNase protection, chemical probing, and fragment binding have been used, but each method has distinct limitations. Application of affinity measurements to a broader group of RRM-containing proteins would improve the understanding of RNA binding specificities. Footprinting methods, like those used for DNA-binding proteins, are hindered by the conformations assumed by RNA and by its altered conformation when bound to protein. Thus, improved methods of RNA-protein structural probing will be needed to define further the base and phosphate contacts between members of the RRM family of proteins and their cognate RNAs.
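To make the affinity values quoted in this section easier to compare, the short sketch below converts an association constant into a dissociation constant and estimates site occupancy. It assumes simple, non-cooperative 1:1 binding with the free protein concentration known, which has not been established for any of the proteins discussed here; the function and constant names are hypothetical and serve only as illustration.

```python
"""Illustrative conversion of association constants (Ka) to Kd and occupancy.

Assumption (not from the text): simple, non-cooperative 1:1 binding.
"""


def dissociation_constant(k_assoc: float) -> float:
    """Return Kd in molar units as the reciprocal of Ka (given in M^-1)."""
    if k_assoc <= 0:
        raise ValueError("association constant must be positive")
    return 1.0 / k_assoc


def fraction_bound(free_protein: float, k_assoc: float) -> float:
    """Return the fraction of RNA sites occupied: Ka[P] / (1 + Ka[P])."""
    if free_protein < 0:
        raise ValueError("concentration must be non-negative")
    occupancy_term = k_assoc * free_protein
    return occupancy_term / (1.0 + occupancy_term)


# Association constants quoted in the text, in M^-1.
KA_PAB_DOMAIN = 2e7   # one binding domain of PAB protein on poly(A)
KA_UP1_SSDNA = 1e7    # UP1 on ssDNA (approximate)

if __name__ == "__main__":
    for name, ka in (("PAB domain", KA_PAB_DOMAIN), ("UP1", KA_UP1_SSDNA)):
        kd_nanomolar = dissociation_constant(ka) * 1e9
        half_point = fraction_bound(dissociation_constant(ka), ka)
        print(f"{name}: Kd ~ {kd_nanomolar:.0f} nM, "
              f"fraction bound at [P] = Kd: {half_point:.2f}")
```

Under these assumptions the two quoted constants correspond to dissociation constants of roughly 50 nM and 100 nM. Cross-linking data alone cannot supply such numbers, which is why direct binding assays are called for above.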

F. Do RRMs Constitute “RNA-binding Domains”?

As knowledge concerning this group of proteins has emerged, many investigators have assumed that any region of a protein that contains the RNA recognition motif or the RNP octamer is an active RNA-binding domain. Such assignment must be viewed with caution, because domains are defined as single units with structural integrity or functional activity (80). In contrast, a motif is defined as a pattern of related sequences. As noted above, the RNA-binding domain of the U1-snRNP-70K protein includes and requires amino-acid sequences carboxy-terminal to the boundaries of the RRM (32). However, some proteins with identified RRMs contain no sequences carboxy-terminal to the motif. It is possible that dissimilar regions flanking the RRM as well as the less conserved positions within the RRM are involved in the specificity of RNA recognition, as noted above (Section


III,B). It is also possible that some of these RRMs are degenerate and do not associate with RNA. Considering the differences between the apparently functional RNA-binding domains of PAB protein (66 amino acids) and U1 snRNP-70K (125 amino acids), it seems that the amino-acid sequence requirements for RNA binding by different RRM proteins may vary widely. Therefore, it is important to define each RNA-binding domain as a singular structural or functional unit with the same binding efficiency and specificity as the complete protein.

In the example described above (Section III,D), it was found that the B″ protein component of U2 snRNPs binds with modest affinity to both U1 and U2 snRNAs (16). Upon addition of the A′ component of U2 snRNPs, the binding of B″ becomes specific for U2 snRNA, and the affinity increases dramatically. Furthermore, the U2-snRNP-A′ protein does not bind U2 RNA alone, but appears to interact with the U2 snRNP through protein-protein contacts involving a region of leucine periodicity in A′ (81). The in vivo significance of these findings is yet to be demonstrated. However, it is possible that some RRMs cannot function as specific RNA-binding domains except in the presence of accessory proteins. Thus, the RNA-binding domain in such cases may require trans-acting accessory factors in addition to the RRM to be functional. Such accessory proteins may also participate in regulating functions involved in RNA processing that are mediated by RNA binding.

IV. Structural Features of RNA-binding Proteins Containing RRMs

A. Classes of Proteins within the RRM Family

From the collection of proteins identified that contain the RRM, at least two distinct classes within the family are evident. Most members contain a single RRM (class I), but some contain multiple copies (class II) (Fig. 3). For example, the snRNA-associated proteins A and B″ contain two copies of the RRM, the elav protein contains three copies, while PAB protein and nucleolin each contain four copies. The questions of whether class II members contain multiple RNA-binding domains, whether they contact more than one RNA, and whether multiple RRMs act cooperatively to bind a single RNA have not been addressed experimentally. It is possible that the binding domain of a class II protein could require the combined interactions of more than one RRM. For the U1-snRNP-A (27, 36) and the U2-snRNP-B″ (16) proteins, only the amino-terminal RRM has been shown to constitute an RNA-binding domain for U1 and U2 RNAs, respectively. (See Section III,D for a discussion of the tertiary structure of the U1-snRNP-A protein.)


B. Modular Structure

The members of this family show structural analogy to DNA-binding proteins such as GAL4, λ repressor, and homeo-box (POU) proteins that contain two identifiable surfaces (82, 83). As depicted in Fig. 3, the La, rho, hnRNP-C1/C2, and Ro-60K proteins contain an RRM in the amino-terminal half and an ATP-binding motif (La, rho, and hnRNP-C1/C2) or a zinc-binding finger motif (Ro-60K) in the carboxy-terminal half of the molecule (47, 48, 50, 84). Likewise, the U1-snRNP-70K protein and the Drosophila tra-2 product contain an RRM in the amino-terminal portions and highly charged “RD/RE/RS” sequences in their carboxy-terminal portions (32, 43, 45, 53, 85). Thus, members of the RRM family class I appear to contain two types of interactive surfaces, one that has the potential to bind RNA and another that may interact with other molecules, including some that have regulatory functions in transcription or splicing. For members of class II of the RRM family, it is possible that the repeated RRMs are independent binding domains that interact with other RNA sequences. Therefore, these multiple RRM-containing proteins can be viewed also as containing structural modules for multiple molecular interactions.

V. Regulatory Potentials of the RRM Family of Proteins

A. Transcription

Two biological processes in which these proteins have been implicated are transcription and pre-mRNA processing. For example, the mammalian La protein and the E. coli rho protein are involved in the termination of RNA transcription (61, 86). These proteins are analogous in terms of both structural organization (Fig. 3) and function. The La protein binds directly to unprocessed transcripts of RNA polymerase III (70) and interacts with other components of the transcription complex, such as TFIIIC (86). Rho is involved in factor-dependent termination of transcription by E. coli RNA polymerase (reviewed in 75). The mammalian Ro-60K protein contains one RRM and a potential zinc-binding finger (48). The function of the Ro RNP is not known, but it has been suggested that it also plays a role in transcription. Ro-60K and La proteins can co-exist on the same RNP complexes (72). It should be noted, however, that these proteins may also play a role in the processing of RNA transcripts. Thus, all members of the RRM family may be part of a network that is interrelated through RNA processing.

B. RNA Processing

Members of the RRM family of proteins that are associated with mRNA or pre-mRNA include PAB protein and the hnRNA-bound proteins, A1 and C1/C2 (69). PAB protein is associated with the 3' poly(A) tails of processed mRNA (41), but its function has not been defined. The hnRNP-A1 and C1/C2 proteins have been suggested to have some limited sequence specificity that allows them to select intron sequences (69) or polyadenylation signal sequences (12, 13). Therefore, it is generally accepted that these proteins have a role in mRNA maturation and transport (reviewed in 49).

The snRNP proteins contain amino-acid sequence motifs (RRM, RD/RE/RS, leucine periodicity) that participate in RNA-protein and protein-protein interactions to assemble RNP complexes, including the spliceosome. Although the structural versus functional roles of snRNP proteins are not completely understood, the standard U1, U2, U4/U6, and U5 snRNPs and their associated proteins appear to participate in a constitutive (“housekeeping”) pathway that is common to the tissues of most metazoans. Thus, tissue-specific or developmentally specific proteins containing similar amino-acid sequence motifs may participate in a regulatory (alternative) splicing pathway.

Strong sequence similarity between a group of Drosophila proteins and the U1-snRNP-70K protein suggests a regulatory function for these members of the RRM family in pre-mRNA splicing. The 70K protein contains two distinct types of region: the U1 RNA-binding domain and two arginine-rich (RD/RE/RS) regions that consist of repeating arginine-aspartic acid, arginine-glutamic acid, and arginine-serine residues (32, 53). Two Drosophila proteins that are members of the RRM family are the tra-2 (43, 45) and Sxl products (42). tra-2 and Sxl mRNAs have been shown to be alternatively spliced and to participate in the regulation of the alternative splicing pathway of sex determination, which also involves tra, double-sex (dsx) (87), and several other proteins (reviewed in 45 and 88).
tra-2 contains both the RRM and an RD/RE/RS region and, thus, is strikingly similar in its motifs to the 70K protein (Fig. 1). Likewise, the Drosophila bicoid protein appears (89) to be a member of the RRM family, but also contains a homeodomain for DNA binding. Bicoid plays a role in pattern development and may mediate mRNA gradients across the embryo.

Two other Drosophila proteins that contain regions quite similar to the RD/RE/RS regions of 70K, tra (90) and the protein product of the suppressor-of-white-apricot locus, su(w^a) (91), have been implicated in the autoregulation of their pattern of pre-mRNA splicing (90, 92, 93). Whether tra or su(w^a) directly contacts RNA has not been investigated, but both lack the RRM. The gag proteins of type-C retroviruses also contain arginine-rich sequences similar to those in 70K (94), but also lack an apparent RRM. Whether such viral proteins play a role in splicing is not clear.

Models may be envisioned in which tra-2 mimics the 70K protein to specify splice-site selection, or in which tra-2 or other RD/RE/RS-containing proteins compete with the 70K protein by recognition of pre-mRNA or other splicing factors at a specific pre-mRNA sequence and thereby modulate a putative “trans” function of the RD/RE/RS regions of the 70K protein. By analogy with transcription factors, the RD/RE/RS sequences may be interchangeable among trans-acting splicing factors (95).

VI. Conclusions and Perspectives

Pre-mRNA splicing, along with transcription and translation, is an important step in the control of eukaryotic gene expression. A pathway of splicing involving a standard set of constitutive proteins and snRNAs is well studied, but mechanisms that govern specific recognition remain to be elucidated. An amino-acid sequence motif in a family of proteins involved in mRNA splicing is part of an RNA-binding domain that interacts directly with RNA. The resultant RNP complexes involved in splicing may possess different components at points in development where regulatory functions are required. Thus, patterns of pre-mRNA splicing may be dictated by specific sets of trans-acting RNA-binding proteins that modify the interactions of the constitutive proteins.

Possible Control Signals Involved in Splice-site Selection

It is possible that U1 snRNPs associate initially with pre-mRNA through low-specificity electrostatic interactions, perhaps involving the highly charged RD/RE/RS regions of the U1-snRNP-70K protein or other proteins of the U1 snRNP. Preliminary evidence suggests that the RD/RE/RS region enables the 70K protein to interact with a variety of ssRNAs with relatively little RNA sequence specificity (16, 26). The U1 snRNP may be capable of one-dimensional diffusion along the pre-mRNA until an appropriate exon-intron junction is recognized through RNA-RNA base-pairing of the 5' end of U1 RNA with the donor-site consensus sequence (14, 96-99). Electrostatic interactions between the RD/RE/RS regions of 70K and the pre-mRNA, as well as other proteins, may further stabilize the complex, allowing engagement of the U1 snRNP at the donor site.

Donor splice-site selection may be influenced, in part, by competition within the spliceosome by trans-acting regulatory proteins such as those discussed above. For example, electrostatic interactions between the 70K protein and the pre-mRNA might be unfavorable in regions where a trans-acting RNA-binding protein is positioned. Thus, the U1 snRNP would be displaced from that particular donor splice site if the RNA-RNA interactions were not strong enough to maintain a stable complex. Simultaneous interactions of U1 snRNPs with sequences or factors near the branch point and splice acceptor sites could result in formation of a commitment complex similar to that involved in yeast pre-mRNA splicing (100). In spliceosome assembly, electrostatic interactions between the 70K RD/RE/RS sequence on U1 snRNPs at the donor splice site and components at the splice acceptor site may be disrupted by the competing RD/RE/RS sequences of trans-acting regulatory proteins. Alternatively, site-specific RRM proteins could attract the 70K protein and U1 snRNP toward a specific acceptor site. Therefore, members of this family of proteins might serve as negative or positive trans-activators in splice-site selection. Thus, trans-active splicing proteins may have specific RNA-binding domains that recognize the pre-mRNA and also possess an RD/RE/RS sequence. For example, genetic evidence suggests that tra-2 may recognize six repeats of a specific 18-nucleotide stretch in an exon of dsx (45, 87).

In some regulatory pathways, the specificity of splice-site selection may be controlled by the expression of tissue-specific members of the RRM family. These may be dominant over factors controlling the constitutive splicing pathway. The possibility that splicing patterns in different tissues are determined by a variety of snRNAs appears unlikely because the snRNAs are relatively generic among tissues. It seems reasonable to consider that the RRM family of proteins and those containing RD/RE/RS regions possess trans-acting functions that are tissue-specific and act within the spliceosome or during spliceosome assembly to modulate pre-mRNA processing and/or transport. The identification and analysis of additional members of the RRM family of proteins should help to resolve these models of pre-mRNA splice-site selection by trans-acting regulatory proteins.

REFERENCES

1. S. H. Kim, G. J. Quigley, F. L. Suddath, A. McPherson, D. Sneden, J. J. Kim, J. Weinzierl and A. Rich, Science 179, 285 (1973).
2. J. D. Robertus, J. E. Ladner, J. T. Finch, D. Rhodes, R. S. Brown, B. F. C. Clark and A. Klug, Nature 250, 546 (1974).
3. M. A. Rould, J. J. Perona, D. Söll and T. A. Steitz, Science 246, 1135 (1989).
4. J. Carey, V. Cameron, P. L. de Haseth and O. C. Uhlenbeck, Bchem 22, 2601 (1983).
5. P. J. Romaniuk, P. Lowary, H.-N. Wu, G. Stormo and O. C. Uhlenbeck, Bchem 26, 1563 (1987).
6. D. R. Engelke, S. Y. Ng, B. S. Shastry and R. G. Roeder, Cell 19, 717 (1980).
7. J. Miller, A. D. McLachlan and A. Klug, EMBO J. 4, 1609 (1985).
8. M. H. Malim, J. Hauber, S.-Y. Le, J. V. Maizel and B. R. Cullen, Nature 338, 254 (1989).
9. M. L. Zapp and M. R. Green, Nature 342, 714 (1989).
10. T. J. Daly, K. S. Cook, G. S. Gray, T. E. Maione and J. R. Rusche, Nature 342, 816 (1989).
11. A. W. Cochrane, C. H. Chen and C. A. Rosen, PNAS 87, 1198 (1990).
12. J. Wilusz, D. I. Feig and T. Shenk, MCBiol 8, 4477 (1988).
13. C. L. Moore, J. Chen and J. Whoriskey, EMBO J. 7, 3159 (1988).
14. J. A. Steitz, D. L. Black, V. Gerke, K. A. Parker, A. Kramer, D. Frendewey and W. Keller, in “Structure and Function of Major and Minor Small Nuclear Ribonucleoprotein Particles” (M. L. Birnstiel, ed.), p. 115. Springer-Verlag, Berlin, 1988.
15. S.-Y. Lin and A. D. Riggs, JMB 72, 671 (1972).
16. R. C. Bentley and J. D. Keene, MCBiol 11, 1829 (1991).
17. V. Gerke and J. A. Steitz, Cell 47, 973 (1986).
18. J. Tazi, C. Alibert, J. Temsamani, I. Reveillaud, G. Cathala, C. Brunel and P. Jeanteur, Cell 47, 755 (1986).
19. C. Lutz-Freyermuth and J. D. Keene, MCBiol 9, 2975 (1989).
20. K. Inoue, K. Hoshijima, H. Sakamoto and Y. Shimura, Nature 344, 461 (1990).
21. R. C. Kelly, D. E. Jensen and P. H. von Hippel, JBC 251, 7240 (1976).
22. A. B. Sachs, R. W. Davis and R. D. Kornberg, MCBiol 7, 3268 (1986).
23. K. C. Smith, in “Photochemistry and Photobiology of Nucleic Acids” (S. Y. Wang, ed.), Vol. 2, p. 187. Academic Press, New York, 1976.
24. M. A. Garcia-Blanco, S. F. Jamison and P. A. Sharp, Genes Dev. 4, 1874 (1990).
25. M. R. Lerner and J. A. Steitz, PNAS 76, 5495 (1979).
26. C. C. Query, R. C. Bentley and J. D. Keene, MCBiol 9, 4872 (1989).
27. C. Lutz-Freyermuth, C. C. Query and J. D. Keene, PNAS 87, 6393 (1990).
28. A. H. Rosenberg, B. N. Lade, D.-S. Chui, S.-W. Lin, J. J. Dunn and F. W. Studier, Gene 56, 125 (1987).
29. D. Kenan and J. D. Keene, unpublished.
30. I. A. Hope and K. Struhl, Cell 43, 177 (1985).
31. E. A. Leibold and H. N. Munro, PNAS 85, 2171 (1988).
32. C. C. Query, R. C. Bentley and J. D. Keene, Cell 57, 89 (1989).
33. M. M. Konarska, in “Methods in Enzymology” (J. E. Dahlberg and J. N. Abelson, eds.), Vol. 180, p. 442. Academic Press, San Diego, California, 1989.
34. P. J. Grabowski and P. A. Sharp, Science 233, 1294 (1986).
35. C. H. Agris, M. E. Nemeroff and R. M. Krug, MCBiol 9, 259 (1989).
36. D. Scherly, W. Boelens, W. J. van Venrooij, N. A. Dathan, J. Hamm and I. W. Mattaj, EMBO J. 8, 4263 (1989).
37. C. C. Query and J. D. Keene, unpublished.
38. F. Cobianchi, R. L. Karpel, K. R. Williams, V. Notario and S. H. Wilson, JBC 263, 1063 (1988).
39. P. C. Ryan and D. E. Draper, Bchem 28, 9949 (1989).
40. A. B. Sachs, M. W. Bond and R. D. Kornberg, Cell 45, 827 (1986).
41. S. A. Adam, T. Nakagawa, M. S. Swanson, T. K. Woodruff and G. Dreyfuss, MCBiol 6, 2932 (1986).
42. L. R. Bell, E. M. Maine, P. Schedl and T. W. Cline, Cell 55, 1037 (1988).
43. H. Amrein, M. Gorman and R. Nothiger, Cell 55, 1025 (1988).
44. R. J. Bandziulis, M. S. Swanson and G. Dreyfuss, Genes Dev. 3, 431 (1989).
45. B. S. Baker, Nature 240, 1037 (1989).
46. B. Bugler, H. Bourbon, B. Lapeyre, M. O. Wallace, J.-H. Chang, F. Amalric and M. O. J. Olson, JBC 262, 10922 (1987).
47. J. C. Chambers, D. Kenan, B. J. Martin and J. D. Keene, JBC 263, 18043 (1988).
48. S. L. Deutscher, J. B. Harley and J. D. Keene, PNAS 85, 9479 (1988).
49. G. Dreyfuss, M. S. Swanson and S. Piñol-Roma, TIBS 13, 86 (1988).
50. M. S. Swanson, T. Y. Nakagawa, K. LeVan and G. Dreyfuss, MCBiol 7, 1731 (1987).
51. B. Lapeyre, H. Bourbon and F. Amalric, PNAS 84, 1472 (1987).
52. B. M. Merrill, K. L. Stone, F. Cobianchi, S. H. Wilson and K. R. Williams, JBC 263, 3307 (1988).
53. H. Theissen, M. Etzerodt, R. Reuter, C. Schneider, F. Lottspeich, P. Argos, R. Lührmann and L. Philipson, EMBO J. 5, 3209 (1986).
54. K. R. Williams, K. L. Stone, M. B. LoPresti, B. M. Merrill and S. R. Planck, PNAS 82, 5666 (1985).
55. B. M. Merrill and K. R. Williams, in “The Eukaryotic Nucleus: Molecular Biochemistry and Macromolecular Assemblies” (P. Strauss and S. Wilson, eds.), p. 579. Telford, Caldwell, N.J., 1990.
56. S. R. Haynes, M. L. Rebbert, B. A. Mozer, R. Forquignon and I. B. Dawid, PNAS 84, 1819 (1987).
57. J. E. Darnell, Science 202, 1257 (1978).
58. H.-M. Bourbon, B. Lapeyre and F. Amalric, JMB 200, 627 (1988).
59. M. Etzerodt, R. Vignali, G. Ciliberto, D. Scherly, I. W. Mattaj and L. Philipson, EMBO J. 7, 4311 (1988).
60. G. Herrick and B. Alberts, JBC 251, 2133 (1976).
61. A. J. Dombroski and T. Platt, PNAS 85, 2538 (1988).
62. S. L. Deutscher and J. D. Keene, unpublished.
63. S. L. Broitman, D. D. Im and J. R. Fresco, PNAS 84, 5120 (1987).
64. S. R. Planck and S. H. Wilson, JBC 255, 11547 (1980).
65. A. Y.-S. Jong, M. W. Clark, M. Gilbert, A. Oehm and J. L. Campbell, MCBiol 7, 2947 (1987).
66. R. L. Karpel and A. C. Burchard, Bchem 19, 4674 (1980).
67. J. W. Chase and K. R. Williams, ARB 55, 103 (1986).
68. H. M. Krisch and B. Allet, PNAS 79, 4937 (1982).
69. M. S. Swanson and G. Dreyfuss, EMBO J. 7, 3519 (1988).
70. J. Rinke and J. A. Steitz, Cell 29, 149 (1982).
71. J. E. Stefano, Cell 29, 149 (1984).
72. S. L. Wolin and J. A. Steitz, PNAS 81, 1996 (1984).
73. J. D. Keene, S. L. Deutscher, D. Kenan and A. Kelekar, Mol. Biol. Rep. 12, 235 (1987).
74. P. H. von Hippel, D. G. Bear, W. D. Morgan and J. A. McSwiggen, ARB 53, 389 (1984).
75. T. Platt, ARB 55, 339 (1986).
76. J. Hamm, M. Kazmaier and I. W. Mattaj, EMBO J. 6, 3479 (1987).
77. J. Patton and T. Pederson, PNAS 85, 747 (1988).
78. I. W. Mattaj, Cell 46, 905 (1986).
79. P. T. G. Sillekens, W. J. Habets, R. P. Beijer and W. J. van Venrooij, EMBO J. 6, 3841 (1987).
80. W. R. Taylor, in “Nucleic Acid and Protein Sequence Analysis: A Practical Approach” (M. J. Bishop and C. J. Rawlings, eds.), p. 290. IRL Press, Washington, D.C., 1987.
81. L. D. Fresco, D. S. Harper and J. D. Keene, MCBiol 11, 1578 (1991).
82. M. Ptashne, Nature 335, 683 (1988).
83. M. Levine and T. Hoey, Cell 55, 537 (1988).
84. J. L. Pinkham and T. Platt, NARes 11, 3531 (1983).
85. R. A. Spritz, K. Strunk, C. S. Surowy, S. O. Hoch, D. E. Barton and U. Francke, NARes 15, 10373 (1987).
86. E. Gottlieb and J. A. Steitz, EMBO J. 8, 851 (1989).
87. K. C. Burtis and B. S. Baker, Cell 56, 997 (1989).
88. J. Hodgkin, Cell 56, 905 (1989).
89. M. Rebagliati, Cell 58, 231 (1989).
90. R. T. Boggs, P. Gregor, S. Idriss, J. M. Belote and M. McKeown, Cell 50, 739 (1987).
91. Z. Zachar, T.-B. Chou and P. M. Bingham, EMBO J. 6, 4105 (1987).
92. P. M. Bingham, T.-B. Chou, I. Mims and Z. Zachar, Trends Genet. 4, 134 (1988).
93. T.-B. Chou, Z. Zachar and P. M. Bingham, EMBO J. 6, 4095 (1987).
94. C. C. Query and J. D. Keene, Cell 51, 211 (1987).
95. P. Bingham, personal communication.
96. S. M. Mount, I. Pettersson, M. Hinterberger, A. Karmas and J. A. Steitz, Cell 33, 509 (1983).
97. P. A. Sharp, Science 235, 766 (1987).
98. T. Maniatis and R. Reed, Nature 325, 673 (1987).
99. C. Guthrie and B. Patterson, ARGen 22, 387 (1988).
100. B. Seraphin and M. Rosbash, Cell 59, 349 (1989).

NOTE ADDED IN PROOF: Since this manuscript was submitted (April 1990), relevant papers by Scherly et al. [Nature 345, 502 (1990) and EMBO J. 9, 3675 (1990)] concerning the specificity of RNA recognition have appeared. In addition, papers by Nagai et al. [Nature 348, 515 (1990)] and by Hoffman et al. [PNAS 88, 2495 (1991)] have described the tertiary structure of the U1 RNA-binding domain of the U1-snRNP-A protein. Additional RRM family members have recently been reported. These include: eukaryotic initiation factor-4B (eIF-4B) [Milburn et al., EMBO J. 9, 2783 (1990)]; Bj6, a chromosomal puff-specific protein product of the Drosophila no-on transient A gene that is required for correct visual system development [von Besser et al., Chromosoma 100, 37 (1990); Jones and Rubin, Neuron 4, 711 (1990)]; X16, which also contains an RD/RE/RS-like region and is expressed differentially in tissues [Ayane et al., NARes 19, 1273 (1991)]; CARP, a malarial clustered asparagine-rich protein [Kuma et al., FEBS Lett. 260, 67 (1990)]; φ29 gp10, an RNA-associated viral shell prohead connector [Grimes and Anderson, JMB 215, 559 (1990)]; and several chloroplastid proteins [Li and Sugiura, EMBO J. 9, 3059 (1990)]. We thank the laboratories of Mariano Garcia-Blanco, Christine Guthrie, Adrian Krainer, Jim Manley, and Teri Mélèse for communicating results prior to publication.
