TRANSCRIPTION 2016, VOL. 7, NO. 4, 141–151 http://dx.doi.org/10.1080/21541264.2016.1183071

RESEARCH PAPER

RGG boxes within the TET/FET family of RNA-binding proteins are functionally distinct

Bess Ling Chau, King Pan Ng, Kim K. C. Li, and Kevin A. W. Lee

Division of Life Science, The Hong Kong University of Science & Technology, Clear Water Bay, Kowloon, Hong Kong S.A.R., China

ABSTRACT

The multi-functional TET (TAF15/EWS/TLS) or FET (FUS/EWS/TAF15) protein family of higher organisms harbors a transcriptional-activation domain (EAD) and an RNA-binding domain (RBD). The transcriptional activation function is, however, only revealed in oncogenic TET-fusion proteins because in native TET proteins it is auto-repressed by RGG-boxes within the TET RBD. Autorepression is suggested to involve direct cation–pi interactions between multiple Arg residues within RGG boxes and EAD aromatics. Via analysis of TET transcriptional activity in different organisms, we report herein that repression is not autonomous but instead requires additional trans-acting factors. This finding is not supportive of a proposed model whereby repression occurs via a simple intramolecular EAD/RGG-box interaction. We also show that RGG-boxes present within reiterated YGGDRGG repeats that are unique to TAF15 are defective for repression due to the conserved Asp residue. Thus, RGG boxes within TET proteins can be functionally distinguished. While our results show that YGGDRGG repeats are not involved in TAF15 auto-repression, their remarkable number and conservation strongly suggest that they may confer specialized properties to TAF15 and thus contribute to functional differentiation within the TET/FET protein family.

ARTICLE HISTORY

Received 28 March 2016; Revised 20 April 2016; Accepted 21 April 2016

KEYWORDS

oncoproteins; RNA-binding proteins; RGG box; TET/FET protein family; transcriptional repression

Introduction

The TET (TLS/EWS/TAF15; Fig. 1) or FET (FUS/EWS/TAF15) proteins are members of the large RNP family of RNA-binding proteins1 with a characteristic RNA-Recognition-Motif (RRM) and form a subfamily due to unique features of the RRM.2,3 TETs are widely expressed in different tissues, are present in many subcellular compartments and are involved in several aspects of RNA biogenesis and gene regulation.4-8 Furthermore, TETs connect with numerous other proteins in protein networks9 and scaffolds10 and accordingly have diverse normal 11-17 and abnormal biological roles.7,18,19 TET proteins can be divided broadly into an N-terminal transcriptional-activation domain (EAD) and a C-terminal RNA-binding domain (RBD). EAD has largely been characterized in Ewing’s family oncoproteins (EFPs) that arise due to chromosomal translocations involving EWS/TET genes and various transcription factors.7,8,20 In contrast, the role of EAD in native TET proteins remains obscure. TET RBD function is better understood and, intriguingly, the Drosophila TET-related protein Cabeza/SARFH contains only the RBD and can partially substitute for TETs in mammalian cells.2,21 The latter observation is consistent with the central role of TET proteins in different aspects of RNA biogenesis.5,7,8 The arginine/glycine-rich regions of TETs (referred to herein as RGG boxes) directly interact with RNA but are also involved in many other TET functions via protein–protein interactions or as targets for methylation by Protein Arginine Methyl Transferases (PRMTs).22 TET proteins are able to regulate gene expression (and in some cases transcription) both positively and negatively.6,15-17,23-27 In some cases, the TET RBD 23 or RGG boxes in other TET relatives 28,29 even participate in transcriptional activation. Significantly, however, in experimental DNA-bound fusions, the TET EAD is strongly repressed by TET RGG-boxes.30,31 Given the broad expression of TET proteins, the presence of TETs in a high proportion of transcription complexes32 and the potency of transcriptional

CONTACT Kevin A.W. Lee [email protected] 3 Newports, High Wych Road, Sawbridgeworth, Herts CM21 0HP, United Kingdom Color versions of one or more of the figures in the article can be found online at www.tandfonline.com/ktrn. © 2016 Taylor & Francis

B. L. CHAU ET AL.

Figure 1. TET/FET protein family. The TET/FET protein family contains three members: TAF15 (TBP-Associated Factor 15), EWS (Ewing’s Sarcoma oncoprotein) and TLS/FUS (Translocated in Liposarcoma). TETs are a subfamily of RNA-binding proteins containing an N-terminal region referred to here as the EWS-activation domain (EAD, purple boxes) and a C-terminal RNA-binding domain (RBD). EAD is also present in the Ewing’s family of fusion oncoproteins (EFPs) that arise due to aberrant chromosomal translocations involving a TET protein and a transcription factor partner. EFPs are potent EAD-dependent transcriptional activators. The TET RBD contains two elements (an RRM and Arg/Gly-rich RGG boxes) commonly found in other RNA-binding proteins and a C2–C2 zinc finger (Z). The TET RRM has unique features that define the TET subgroup within the larger RNP family of RNA-binding proteins. TET proteins are restricted to chordates, although a protein containing a highly homologous RBD (Cabeza/SARFH) but lacking EAD is found in Drosophila. TET gene organization and overall sequences are remarkably similar, but TAF15 exhibits distinct sequence variations in both EAD and RGG regions. TAF15 EAD has a significantly more negative unit charge than both EWS and TLS (as indicated by the D and E residues in the figure) and the TAF15 RGG3 region consists almost exclusively of highly conserved YGGDR(G/S)G repeats (75% of RGG triplets) that are not present in EWS or TLS. In mammalian cells, transcriptional activation by EAD can be assayed experimentally following fusion to the DNA-binding domain (DBD) of the yeast activator protein Gal4, and cis-linked TET RGG boxes strongly repress EAD. Experiments to detect RGG box-mediated repression were performed using Gal4-fusion proteins containing either EAD or a substitute transcriptional activation domain (VP16) from the Herpes Simplex virus.

activation by EAD, RGG-mediated repression of EAD is most likely crucial for biological control of native TET proteins. TET autorepression has been suggested to result from physical masking of EAD by direct intramolecular interaction with RGG boxes,32-34 but this question has not been directly examined. Although TET family members have a high degree of homology, remarkably similar organization (Fig. 1) and overlapping expression patterns,35,36 individual TET proteins perform quite specific and non-redundant functions. The molecular basis for TET specialization remains largely unknown but must be accounted for, in part, by sequence variations that differentiate TET proteins. First, for example, the TAF15 EAD has significant negative charge compared with EWS and TLS (Fig. 1). Second, a large proportion (75%) of RGG boxes in TAF15 are present as part of a conserved repeat, YGGDR(G/S)G, that is totally absent from EWS and TLS. The functional consequences of the above sequence variations in TET proteins are, however, unknown. Here, we show that RGG box-mediated TET autorepression is not operative in vitro or in yeast cells. These findings are incompatible with repression resulting from a simple intramolecular masking of EAD by the RBD 32-34 and instead indicate that repression requires additional trans-acting factors. We also show that RGG boxes present within the prevalent YGGDR(G/S)G repeats of TAF15 are unable to repress transcription. Thus, RGG boxes within the TET family can be functionally distinguished and may contribute to TET protein specialization.

Results and discussion

Transcriptional repression by RGG boxes requires additional trans-acting factors

RGG-boxes within the EWS RBD repress EAD-mediated trans-activation and also repress several other activation domains (including that of the well-characterized Herpes Virus VP16 protein) in mammalian cells.31 To probe whether this repression phenomenon is an intrinsic property of RGG boxes or, alternatively, whether additional trans-acting factors are required, we tested for repression in different cellular and biochemical contexts. EAD 37 and VP16 38 both activate transcription in mammalian nuclear extracts, and we first tested for repression activity in this in vitro system (Fig. 2A). Bacterially produced histidine-tagged Gal4-VP16 derivatives and a promoter containing Gal4 binding sites were added to HeLa cell nuclear extracts, and transcriptional activity was detected by primer extension of correctly initiated RNA transcripts.37 Within the linear and saturated range for the assay (as shown), Gal4-VP16 (G4VP16) had the same activity as the corresponding protein (G4VPSR4) that contains a reiterated EWS peptide (SR4) harboring a number of RGG boxes sufficient for repression in mammalian cells.31 This in vitro result indicates that RGG-boxes do not repress transcription autonomously and therefore suggests that additional trans-acting factors (that are inactive or deficient in nuclear extracts) are required for repression.

We sought to substantiate the hypothesis that repression requires additional factors via in vivo experiments. EAD 39 and VP16 38 both activate transcription in yeast and, given the large evolutionary gap between mammals and yeast, we asked whether repression can operate in yeast (Fig. 2B). To this end, a yeast reporter strain (Y190) harboring a chromosomal Gal4-dependent β-galactosidase reporter was transformed with expression vectors for test proteins fused to the DNA-binding domain of Gal4. Transcriptional activation was scored by staining of yeast filters for β-galactosidase activity (Fig. 2B). Gal4-VP16 fusions containing either the intact EWS RBD (VPRBD), or the EWS RBD lacking only the RRM (VPRM4) or containing the EWS peptide SR4 (VPSR4), in each case had activity comparable with Gal4-VP16. Similarly, the activity of a Gal4-EAD fusion containing the SR4 peptide (EADSR4) was comparable with Gal4-EAD (EAD). These in vivo experiments indicate that RGG boxes do not mediate repression autonomously and that a trans-acting mammalian factor (with no functional equivalent in yeast) is required.

Figure 2. RGG-box mediated repression in different contexts. (A) Repression in mammalian cell extracts. Transcription assays were performed using nuclear extracts and 500 ng of a reporter plasmid (pG5E4Cat) containing five Gal4-binding sites. Exogenous Gal4-fusion proteins were added and transcriptional activity monitored by detection of correctly initiated transcripts using primer extension followed by autoradiography. The open triangle (top) indicates increasing amounts (5, 15 and 50 ng) of Gal4-VP16 (G4VP16) or the corresponding protein (G4VPSR4) containing a reiterated peptide (SR4) with EWS RGG-boxes sufficient for repression in vivo.31 (B) Repression in yeast S. cerevisiae. Gal4-EAD or Gal4-VP16 fusion proteins were tested for activity in S. cerevisiae strain Y190 containing a chromosomal Gal4-β-galactosidase reporter. The upper panel shows exogenous protein expression levels (Western blot) and the lower panel staining of yeast filters to score activation of the Gal4-β-galactosidase reporter. G4EAD is a Gal4-EAD fusion protein containing the intact EWS EAD. EADSR4 corresponds to G4EAD but contains the EWS RGG-boxes present in the SR4 peptide. Gal4-VP16 fusion proteins (VP) are as follows: VPSR4 contains the SR4 peptide; VPRBD contains the intact EWS RBD; VPRM4 contains the EWS RBD but lacks the consensus RRM. (C) Activity of EAD mutants in yeast S. cerevisiae. A series of Gal4-EAD mutant proteins, previously characterized in mammalian cells,40 were tested for transcriptional activity in yeast strain Y190. Yeast were transformed with the plasmids pG4DA, pG4DI, pG4DF, pG4REV, pG4D78 and pG4SCR (see the Materials and methods section) expressing the corresponding Gal4-EAD fusion proteins indicated. Upper panel (protein expression levels) and lower panel (transcriptional activity) are as described in part B above. (D) Repression in insect cells. Sf21 insect cells were untransfected (–) or transfected with pZ7Luc together with pDest-57Z or pDest-57ZR plasmids. Luciferase activity (left panel) was assayed 40 hours post-transfection and a representative experiment is shown (bkg, background, 0.2 kRLU/sec; 57Z, 4386.6 kRLU/sec; 57ZR, 16.3 kRLU/sec). 57Z and 57ZR protein expression was confirmed by detection of KT3-epitope-tagged proteins present in cell extracts (right panel).

The inability of yeast (S. cerevisiae) to support RGG box-mediated repression prompted us to ask whether this might be accounted for by EAD functioning differently in yeast (Fig. 2C). In mammalian cells, the properties of a panel of EAD mutants (DA, DI, DF, D78, REV and SCR, see the Materials and methods section) have established that EAD functions as a polyaromatic, sequence-independent, disordered polypeptide.34,40 Using Gal4-EAD fusions, we tested the above panel of EAD mutants in yeast and found that mutants that retain full activity (DF, REV and SCR) or have dramatically reduced activity (DA, DI and D78) in mammalian cells exhibit similar properties in yeast (Fig. 2C). Thus, the lack of RGG-mediated repression in yeast is not due to differing modes of EAD action in yeast versus mammalian cells.

TET auto-repression has been suggested to result from direct physical masking of EAD by intramolecular interaction with the RBD 30,32,33 and specifically via RGG boxes.31 A recent study further proposed a specific mechanism for intramolecular EAD/RGG interactions based on cation–pi interactions between multiple EAD aromatics and Arg residues.34 Computational analysis showed that intramolecular cation–pi interactions within EWS are highly probable and, furthermore, that they could account for robust blocking of EAD transcriptional activity.34 However, the findings here appear to rule out a simple intramolecular masking mechanism and instead point to the requirement for additional trans-acting factors. Future studies using the yeast assay offer a potential avenue for identification and characterization of the putative mammalian protein(s) required for RGG-mediated TET auto-repression.
Lack of TET auto-repression in yeast may not be surprising given that there are no EWS homologues in yeast and that EWS (and other TET family relatives) arose only later during evolution of higher organisms.41 Thus, yeast have never been exposed to any evolved functions driven by native EWS. Intriguingly, however, the Cabeza/SARFH protein of Drosophila2,21 is highly related to EWS2,21 but lacks EAD and contains only the RBD (Fig. 1). Insect cells may thus provide an evolutionary probe into trans-acting factors required for TET auto-repression, and we therefore tested insect cells for the ability to support EAD-mediated transactivation and repression (Fig. 2D).

Cultured Sf21 cells were co-transfected with an expression plasmid for a minimal activator (57Z) containing EAD residues 1–57 fused to the DNA-binding domain of Zta 42 and a luciferase reporter, Z7Luc.43 57Z protein activates transcription to very high levels in Sf21 cells (comparable with mammalian cell activity), consistent with the substantial conservation of the general transcription machinery in insect and mammalian cells.44,45 In striking contrast to yeast, the EWS RBD strongly represses the EAD in Sf21 cells (Fig. 2D, compare 57Z with 57ZR). This result indicates that the trans-acting factor(s) required for repression are present in Sf21 cells and therefore arose independently of EAD during evolution. Repression therefore most likely reflects a primordial function of the EWS/TET RBD (and in particular RGG boxes) that existed prior to evolution of native EWS. Uncovering the functions of RGG-boxes in insect cells and the proteins involved should provide insights into the mechanism of RGG box-mediated repression. Identifying protein factors required for TET autorepression will be very challenging due to the vast array of candidates. TETs participate in protein networks,9 scaffolds,10 hnRNP complexes,46 snRNP complexes,47,48 and otherwise interact with a wide range of proteins including RNA helicases, SMN protein 47-49 and even structural proteins.46 Similarly, various interactions of TET proteins with RNA 6,26,50-52 or single-stranded DNA 27 might be involved in repression. It is significant, however, that several aspects of the repression phenomenon rule out particular mechanisms and thus, perhaps, the involvement of many of the TET-interacting proteins/RNAs alluded to above.
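As a worked illustration, the luciferase readings quoted in the Figure 2D legend (background 0.2, 57Z 4386.6 and 57ZR 16.3 kRLU/sec) can be converted into a degree of repression. The sketch below is a minimal example and not the authors' stated procedure: the function names are hypothetical, and subtracting background before taking the ratio is an assumption, since the text does not say how repression was calculated.

```python
# Luciferase readings quoted in the Figure 2D legend (kRLU/sec).
BACKGROUND = 0.2
ACTIVATOR_57Z = 4386.6    # EAD residues 1-57 fused to the Zta DNA-binding domain
ACTIVATOR_57ZR = 16.3     # same activator carrying the intact EWS RBD


def percent_repression(reference, test, background=0.0):
    """Repression of `test` relative to `reference`, in percent.

    Assumption (not stated in the paper): background is subtracted
    from both readings before the ratio is taken.
    """
    ref_net = reference - background
    test_net = test - background
    if ref_net <= 0:
        raise ValueError("reference activity must exceed background")
    return 100.0 * (1.0 - test_net / ref_net)


def fold_repression(reference, test, background=0.0):
    """How many fold lower the net `test` activity is than `reference`."""
    test_net = test - background
    if test_net <= 0:
        raise ValueError("test activity must exceed background")
    return (reference - background) / test_net


if __name__ == "__main__":
    pct = percent_repression(ACTIVATOR_57Z, ACTIVATOR_57ZR, BACKGROUND)
    fold = fold_repression(ACTIVATOR_57Z, ACTIVATOR_57ZR, BACKGROUND)
    print(f"57ZR versus 57Z: {pct:.1f}% repression ({fold:.0f}-fold)")
```

With the same convention, a test protein that is more active than the reference gives a negative value, which is one way to express a lack of repression.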
Repression occurs directly at promoters 30 and, significantly, is not dominant but instead can be overcome by additional activators.31 Thus, we have suggested that the major role of repression is simply to prevent promiscuous activation by promoter-bound TET proteins while being neutral for other activators within the same transcription complexes.31 This mechanistic framework argues against the involvement of the many TET-interacting proteins and nucleic acid factors that either allow TETs to repress transcription 26,27,53 or that may divert TET proteins to other cellular compartments and away from transcription complexes.46-48 Similarly, PRMTs directly interact with and methylate RGG boxes within TET proteins,6,22 but these interactions stimulate positive gene regulation by TAF15 6 or increase EWS accumulation in the cytosol.54 Neither of the above effects is indicative of a role for PRMTs in repression of TETs within otherwise active transcription complexes.

RGG boxes within TET family members have differential repressor activity

The RGG regions in TET proteins (Fig. 1) are low-complexity disordered sequences, consisting almost entirely of RGG tri-peptides and less abundant Asp and Phe residues, the latter of which are not required for RGG box-mediated repression.31 In striking contrast to all other RGG boxes (including those in TET proteins and other RNA-binding proteins), a large proportion (42%) of RGG boxes in TAF15 are present as part of an extended, highly conserved repeat sequence, YGGDRGG. The prevalence of the YGGDRGG heptapeptide in TAF15, and the total absence of this sequence in EWS and TLS, strongly suggests that it may enable specialization of TAF15. In relation to the current study, two points should be noted. First, Tyr-enriched disordered peptides (such as those containing multiple YGGDRGG repeats) are likely to activate transcription and hence counteract repression.40 Second, since the basic nature of the Arg side chain is important for repression,31 the vicinal Asp residues present in YGGDRGG repeats might also be expected to compromise repression. In view of this, we asked whether the YGGDRGG sequence could, like other RGG boxes, mediate transcriptional repression in mammalian cells (Fig. 3). As previously shown,31 four copies of the SR4 peptide from the RGG3 region of EWS (Fig. 1) strongly repress transcription when fused to either VP16 or EAD (Fig. 3). The SR4 peptide represses transcription by 94% and this is an underestimate given the higher expression level of GVPSR4 versus GVP16 proteins (see Fig. 4). However, the related peptide sequence T15R (see Fig. 4), containing the YGGDRGG repeat from TAF15 (and harboring the same number of Arg residues as the SR4 peptide), is unable to repress Gal4-VP16 and in fact has slightly elevated activity (1.4-fold higher). Thus, for the TET protein family, some RGG boxes (those within the SR4 peptide from EWS) can repress transcription while others (those within the TAF15 YGGDRGG repeat) cannot do so.
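The statement that a given proportion of RGG boxes lies within the repeat amounts to a simple motif count. The sketch below shows one way such a count could be made. It is illustrative only: `TOY_SEQUENCE` is a made-up string and not the TAF15 sequence, the function name is hypothetical, and treating an RGG box as R followed by G or S and then G is an assumption chosen so that the R(G/S)G inside each YGGDR(G/S)G repeat is included in the total.

```python
import re

# Repeat as written in the text: Y-G-G-D-R-(G or S)-G.
REPEAT = re.compile(r"YGGDR[GS]G")
# Assumption: an "RGG box" is R, then G or S, then G.
RGG_BOX = re.compile(r"R[GS]G")

# Hypothetical toy sequence (NOT TAF15): two repeats plus two free RGG boxes.
TOY_SEQUENCE = "YGGDRGG" + "YGGDRSG" + "FRGGPRGGM"


def rgg_box_summary(sequence):
    """Count RGG boxes and how many of them fall inside a repeat match."""
    repeat_spans = [m.span() for m in REPEAT.finditer(sequence)]
    box_starts = [m.start() for m in RGG_BOX.finditer(sequence)]
    in_repeat = sum(
        any(start <= pos and pos + 3 <= end for start, end in repeat_spans)
        for pos in box_starts
    )
    total = len(box_starts)
    fraction = in_repeat / total if total else 0.0
    return {"total": total, "in_repeat": in_repeat, "fraction": fraction}


if __name__ == "__main__":
    print(rgg_box_summary(TOY_SEQUENCE))
```

Applying the same function to a real protein sequence would require the sequence itself, which is not reproduced in the text, and the resulting fraction depends on how an RGG box is defined.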
Figure 3. Testing the TAF15 YGGDRGG repeat (T15R) for repression. Transcription assays. Mammalian JEG3 cells were transfected with 5 µg of GAL4-dependent reporter plasmid (pG1E4Cat for VP16 or pG5E4Cat for EAD) and 2.5 µg of plasmid expressing the test protein. Transcriptional activity was monitored by CAT assay and a representative TLC result is shown (c, chloramphenicol; ac, acetylated chloramphenicol). Fusion proteins contained the GAL4 DNA-binding domain (G4), an activator sequence (VP16 or EAD residues 1–127) and the RGG-rich SR4 peptide from EWS 31 or a reiterated TAF15 peptide (T15R) containing multiple copies of the TAF15 YGGDRGG repeat sequence.

To gain insight into the molecular basis for the differential repressor activity of particular RGG boxes, we tested several other R-rich peptides (Fig. 4). In addition to the EWS SR4 peptide, two other peptides (RGG1 and RGG3C) from the RGG1 and RGG3 regions of EWS, respectively (Fig. 1), strongly repressed transcription (86% and 96% repression, respectively). In addition, an SR4 derivative (SR4S) with serine substitution adjacent to R, as found naturally in several RGG boxes in TAF15, was also repression competent (92% repression). Finally, we tested three naturally occurring R-rich peptides from non-TET proteins (HIV-1 Tat49/62, SmDIC and hnRNPUC) and all of these peptides exhibited repressor activity similar to the EWS SR4 peptide (96%, 92% and 98% repression, respectively). The ability of various distinct R-rich peptides to repress transcription indicates that there is little spatial or sequence requirement and that an RG-rich sequence suffices for repression.

Figure 4. Testing other RG-rich peptides for repressor activity. Transcription assays in JEG3 cells were as described in Fig. 3 using pG1E4Cat as reporter and a Gal4-VP16 fusion (VP16) as reference activator. A representative CAT assay is shown at the top (c, chloramphenicol; ac, acetylated chloramphenicol) with epitope-tagged activator levels shown by Western blot (using antibody KT3) below the CAT assay. Test peptide sequences are as follows. Peptides SR4 (from EWS) and T15R (from TAF15) serve as references for strong repression and lack of repression, respectively (Fig. 3). Peptides RGG1 and RGG3C are natural peptides from the RGG1 and RGG3 regions of EWS (see Fig. 1) and contain the same number of RGG-boxes. Peptide SR4S corresponds to SR4 except with Ser replacing Gly adjacent to each Arg. Tat49/62 contains residues 49–62 of the HIV-1 Tat protein and the natural sequences from SmDIC and hnRNPUC proteins are shown. The degree of repression for each peptide is indicated in the right-hand column.

Asp residues in the YGGDRGG peptide eliminate transcriptional repression

To characterize the TAF15 YGGDRGG repeat, we altered the sequence and tested for repression (Fig. 5). Simultaneously changing both Y and D to A (mutant AGGA, 95% repression) fully restored repression and changing D to A alone (YGGA, 91% repression) restored most of the repression activity. The higher repression activity for AGGA versus YGGA is most likely accounted for by a small transcriptional activation effect contributed by the Ys in the YGGA peptide (Ng et al., 2007; Song et al., 2007). Next, we examined the effect of altering the EWS SR4 peptide sequence to make it resemble the TAF15 YGGDRGG repeat (Fig. 5). Increasing the distance between R residues in SR4 by insertion of GG (mutant EGG, 96% repression) or YG (mutant EYG, 91% repression) had little effect on repression, although again the presence of Ys in EYG probably accounts for the lower repression activity of EYG versus EGG. Lack of a spatial constraint on R residues, as demonstrated, is consistent with strong repression by several unrelated R-rich sequences (see Fig. 4). In contrast to insertion of GG or YG, insertion of YD (EYD) or GD (EGD) between the R residues in SR4 in both cases eliminated repression. Thus, for all of the above mutants, repression is only maintained if R residues are in large excess over D residues (Fig. 5). Together, the above results show that the inability of the TAF15 YGGDRGG peptide to repress transcription is fully accounted for by the presence of Asp residues in the TAF15 sequence.

Figure 5. Mutational analysis of T15R. Transcription assays and data presentation are as described in Fig. 4 with SR4 and T15R as positive and negative references for repression, respectively. The location of TAF15 peptides T15R and C1–C8 within the TAF15 RBD is shown at the top. TAF15 peptide sequences are as follows. C5 (TAF15 residues 159–219); C6 (316–349); C7 (377–413); C1 (398–413); C2 (414–451); C3 (459–490); T15R (491–518); C4 (527–562); C8 (570–589). AGGA and YGGA peptides are altered versions of T15R as shown. EGG, EYG, EYD and EGD are altered versions of SR4 as shown. The number in parentheses following each peptide sequence indicates the excess number of Arg residues over Asp. The degree of repression for each peptide is indicated in the right-hand column.

Because Lys can substitute for Arg in repression,31 we have previously suggested that the positive charge of the Arg side chain is a key determinant of repression. Together with the lack of sequence and spatial constraints described herein (Figs. 4 and 5) and the highly disordered nature of RGG-enriched peptides,34 it is apparent that repression is mediated by a very simple, polybasic disordered peptide. Accordingly, it is likely that elimination of repression caused by the presence of negatively charged Asp residues adjacent to Arg is explained (at least partly) by charge neutralization.

A subset of RGG boxes accounts for repression by the TAF15 RBD

All or most RGG boxes in EWS (see Fig. 4) and all three RG-rich regions in EWS (RGG1-3, Fig. 1) contribute to repression.30 In contrast, 75% of the RGG boxes in TAF15 (those present in YGGDR(G/S)G repeats) do not repress transcription. It was therefore of significance to establish whether the sum total of RGG boxes in TAF15 (and hence the TAF15 RBD itself) can repress transcription. Given that the number of RGG boxes in TAF15 is well in excess of the number required for repression,30,31 we tested a protein (C5-8) containing only those RGG boxes present in TAF15 peptide sequences C5, C6, C7 and C8 (Fig. 5). C5-8 is relatively deficient in D content and thus is predicted to be repression competent. This prediction is verified (C5-8 represses transcription by 98%, Fig. 5), demonstrating that sequences present naturally in the TAF15 RBD (those present in C5-8) suffice for full repression. This finding is consistent with a previous report that the N-terminal region of the TAF15 RBD can repress the TAF15 EAD.55 Together with the fact that the RBD of TLS is also competent for repression (data not shown), it is apparent that the EADs of all three members of the TET protein family can be autorepressed by their corresponding RBDs. To complete the current study, we tested a series of individual TAF15 RG-rich peptides (C1-C8 and T15R) that span the entire TAF15 RBD (Fig. 5). This analysis revealed that peptides with a large excess of R residues over D are repression competent (C5 yields 94% repression, C6 98%, C7 87%, C1 84%, C8 79%) while those with only a minor excess or no excess of R over D are not repression competent (C3 13% repression, C4 15% and T15R). These results for natural TAF15 sequences substantiate the conclusion from direct mutagenesis that the lack of repressor activity for the TAF15 YGGDRGG peptide is due to the presence of Asp residues. Expansion, mutation and transposition of repetitive sequences have played a prominent role in protein evolution 56,57 and are particularly significant for the generation of novel functions in intrinsically disordered protein regions 58 that predominate in TET proteins.

While several functions of RGG boxes are maintained in all TET proteins (for example, as shown here, all TET proteins contain sufficient RGG boxes to repress EAD), our findings show that the subset of RGG boxes within the highly reiterated TAF15 YGGDRGG repeat can be functionally distinguished. In addition, we have observed that the Asp residues in the above repeat peptide confer an inability to bind PRMT1 (KL and KAWL, unpublished observations). In light of the extraordinary number and conservation of YGGDRGG repeats present in TAF15, the fact that these repeats are unique to TAF15 and that these repeats play no role in TET auto-repression, our results strongly suggest that YGGDRGG repeats are likely to have other unique properties that contribute to a biological specialization of TAF15.
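The charge bookkeeping used in Figure 5 (the excess of Arg over Asp for each peptide) is easy to reproduce. The sketch below is illustrative: the three test strings are hypothetical four-copy repeats built from the motifs named in the text (the actual peptide sequences appear only in the figure), the 50% cutoff is an arbitrary choice for the example, and the percent values are those quoted above for natural TAF15 peptides.

```python
def arg_asp_excess(peptide):
    """Number of Arg (R) minus number of Asp (D) in a one-letter sequence."""
    seq = peptide.upper()
    return seq.count("R") - seq.count("D")


# Hypothetical peptides: four copies of each motif (copy number is assumed).
T15R_LIKE = "YGGDRGG" * 4   # native repeat, one D per R
YGGA_LIKE = "YGGARGG" * 4   # D changed to A
AGGA_LIKE = "AGGARGG" * 4   # Y and D both changed to A

# Percent repression quoted in the text for natural TAF15 peptides.
REPORTED_REPRESSION = {
    "C5": 94, "C6": 98, "C7": 87, "C1": 84, "C8": 79, "C3": 13, "C4": 15,
}


def repression_competent(percent_by_peptide, cutoff=50.0):
    """Names of peptides at or above an arbitrary illustrative cutoff."""
    return sorted(
        name for name, pct in percent_by_peptide.items() if pct >= cutoff
    )


if __name__ == "__main__":
    for label, seq in [("T15R-like", T15R_LIKE),
                       ("YGGA-like", YGGA_LIKE),
                       ("AGGA-like", AGGA_LIKE)]:
        print(label, "R minus D =", arg_asp_excess(seq))
    print("competent:", repression_competent(REPORTED_REPRESSION))
```

Net charge is only a proxy here; the text attributes the loss of repression at least partly to charge neutralization, so a simple R minus D count should be read as a rough guide and not as a predictive model.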

Conclusion

The TET subfamily of RNA-binding proteins is intriguing because all three members have remarkably similar overall sequence and structure but carry out at least some distinct biological functions. The biochemical basis for such differentiation may lie in the small variations in sequence composition residing in the intrinsically disordered EAD and RGG-rich regions of TET proteins (see Fig. 1). To our knowledge, we have presented the first evidence for biochemical distinction within different TET RGG-boxes, namely the effect of conserved D residues in a subset of YGGDRGG TAF15 repeats. Considering the conservation of multiple YGGDRGG repeats unique to TAF15, our findings suggest that these repeats expand the functionality of TET RGG-boxes and thus will contribute to biological specialization of TAF15. Future insights into this proposition should arise from identification of proteins that specifically bind to YGGDRGG repeats. We have previously proposed that RGG-mediated TET repression is likely to be crucial for preventing large-scale, promiscuous gene transcription.30,31 Further understanding of the molecular mechanism of repression should therefore provide insights into biological control of native TET proteins. The finding that RGG-mediated TET repression is not autonomous (as previously proposed 34) but instead requires additional trans-acting factors paves the way for identification of such proteins. Yeast (S. cerevisiae) does not support repression and may provide a vehicle for identification of the putative mammalian protein(s) required for repression. Finally, repression is operative in insect cells, indicating that the trans-acting factors required co-evolved with the TET RNA-binding domain (RBD). Further characterization of the Cabeza/SARFH RBD/RGG-boxes in insect cells may provide insights into the mechanism and additional proteins required for RGG box-mediated TET repression.

Materials and methods

Plasmids

For in vitro transcription studies, pETG4VP16 was derived from pET15b (Promega) and expresses a histidine-tagged Gal4VP16 fusion protein corresponding to that used for previous in vivo studies.31 pETG4VPSR4 expresses a Gal4VP16 fusion containing four copies of the RGG-rich SR4 peptide (Fig. 4) from EWS.31 The transcriptional reporter pG5E4TCAT was previously described.59 Yeast expression plasmids for Gal4 fusion proteins were derived from pGBT9 as previously described.39 All proteins expressed contain a C-terminal 7-residue peptide from SV40 T-antigen for detection by western blotting using KT3 antibody.60 For trans-activation experiments in yeast, the panel of EAD mutants has been functionally characterized in mammalian cells in the context of the natural EFP, EWS/ATF1.40 pG4DA, pG4DI, pG4DF, pG4REV, pG4D78 and pG4SCR were constructed by replacing the native EAD sequence in pG4/E28539 with the corresponding EAD sequences from pDA, pDI, pDF, pREV, pD78m and pSCR, respectively.40 Plasmids expressing Gal4VP16 derivatives for repression experiments in yeast were as follows. pVPRBD expresses a protein containing Gal4VP16 and the intact EWS RBD. pVPRM4 corresponds to pVPRBD but lacks the EWS RRM.30 pVPSR4 expresses a protein containing Gal4VP16 and four copies of the RGG-rich SR4 peptide from EWS.31 pEADSR4 expresses a protein containing the intact EAD and four copies of the RGG-rich SR4 peptide from EWS.31 Plasmids for expression in Sf21 insect cells were derived from the pXINsect-DEST39 Gateway vector (Life Technologies). pDEST-57Z 61 expresses a well-characterized EWS/ATF1 derivative (57Z) containing residues 1–57 of the EWS-Activation-Domain (EAD) linked to the ATF1 portion of EWS/ATF1, except that the ATF1 bZip domain is substituted by the Zta bZip domain.42 pDEST57RZ expresses a protein (57RZ) corresponding to 57Z but containing the intact EWS RBD. The transcriptional activity of 57Z and 57RZ was detected using a reporter plasmid, pZ7Luc, containing a minimal promoter and seven Zta binding sites linked to luciferase.43 For experiments in mammalian cells, the transcriptional reporter pG1E4TCAT was previously described.59 Mammalian protein expression vectors were derived from pSG424 62 and proteins were tagged with the epitope for monoclonal antibody KT3.60 pG4VP16 and pVPSR4 are as described previously.31 All other mammalian expression plasmids used express proteins that correspond to VPSR4 but contain the RGG-related test peptides indicated in the figure legends and main text.

Mammalian cell reporter assays

JEG3 cells were grown in Dulbecco's modification of Eagle's medium containing 10% FCS. Freshly passaged cells were transfected by the calcium phosphate co-precipitation method as previously described.63 CAT assays and western blotting of epitope-tagged proteins were carried out as previously described.30

Sf21 insect cell reporter assays

Sf21 cells were maintained at 27 °C in Grace's supplemented medium containing 10% fetal bovine serum. For transfections using Cellfectin II Reagent (Life Technologies), cells were seeded at 30% confluence in 24-well tissue culture dishes and transfected using 200 ng of pZ7Luc reporter and 200 ng of test activator (p57Z or p57RZ). Cell extracts were prepared 40 hours after transfection and luciferase assays were performed using Steady-Glo luciferase substrate (Promega).

Yeast (S. cerevisiae) reporter assays

Plasmids expressing Gal4-VP16 or Gal4-EAD fusion proteins were used to transform yeast strain Y190 and trans-activation of a chromosomal lacZ reporter was scored by staining of filters for β-galactosidase. Representative filters are shown. Expression of activator proteins was confirmed by western blotting of KT3 epitope-tagged proteins.

In vitro transcription

Nuclear extract from HeLa S3 cells was prepared as previously described 64 with some modifications.65 Transcription reactions dependent on exogenous test proteins were performed as previously described 37 with the indicated amounts of DNA template and test protein samples (see Fig. 2A). Absolute protein levels for histidine-tagged G4VP16 and G4VPSR4 were determined by Coomassie blue staining compared with BSA standards. For transcript detection, primer extensions were performed as previously described.65 Primer extension products were resolved on 10% denaturing polyacrylamide/urea gels that were exposed directly to Kodak Biomax XAR film at -80 °C for 2 hours. Histidine-tagged G4VP16 and G4VPSR4 were prepared as described previously for Gst-fusion proteins 37 from bacterial (BL21) cell lysates obtained from 300 mL of culture. Lysates were mixed with 200 µL Ni-affinity resin (NTA, Qiagen) for 1 hour, and the resin was washed with 5 mL of LB, washed three times with 1 mL of 50 mM imidazole (in 10 mM tris pH 8.0) and finally eluted with 200 µL elution buffer (100 mM imidazole, 10 mM tris pH 8.0 and 20% glycerol). The eluate was aliquoted and quick-frozen in liquid nitrogen.


Disclosure of potential conflicts of interest

No potential conflicts of interest were disclosed.

Acknowledgments

We greatly appreciate the efficient and economical gene synthesis service of TOP Gene Technologies, Montreal, QC, Canada. Contributions were as follows: Ling Chau performed experiments and analyzed data; King Pan Ng performed experiments; Kim Li performed experiments; Kevin Lee analyzed data and wrote the paper.


References

[1] Haynes SR. The RNP motif protein family. N Biologist 1992; 4:421-429.
[2] Immanuel D, Zinszner H, Ron D. Association of SARFH (Sarcoma-Associated RNA-Binding Fly Homolog) with regions of chromatin transcribed by RNA polymerase II. Mol Cell Biol 1995; 15:4562-4571.
[3] Bertolotti A, Lutz Y, Heard DJ, Chambon P, Tora L. hTAFII68, a novel RNA/ssDNA-binding protein with homology to the pro-oncoproteins TLS/FUS and EWS is associated with both TFIID and RNA polymerase II. EMBO J 1996; 15:5022-5031.
[4] Calvio C, Neubauer G, Mann M, Lamond AI. Identification of hnRNP P2 as TLS/FUS using electrospray mass spectrometry. RNA 1995; 1:724-733.
[5] Law WJ, Cann KL, Hicks GG. TLS, EWS and TAF15: a model for transcriptional integration of gene expression. Briefings Functional Genomics Proteomics 2006; 5:8-14.
[6] Jobert L, Pinzon N, Van Herreweghe E, Jady BE, Guialis A, Kiss T, Tora L. Human U1 snRNA forms a new chromatin-associated snRNP with TAF15. EMBO Rep 2009; 10:494-500.
[7] Tan AY, Manley JL. The TET family of proteins: functions and roles in disease. J Mol Cell Biol 2009; 1:82-92.
[8] Kovar H. Dr. Jekyll and Mr. Hyde: The Two Faces of the FUS/EWS/TAF15 Protein Family. Sarcoma 2011; 2011:837474.
[9] Rual JF, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, Berriz GF, Gibbons FD, Dreze M, Ayivi-Guedehoussou N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature 2005; 437:1173-1178.
[10] Cortese MS, Uversky VN, Dunker AK. Intrinsic disorder in scaffold proteins: Getting more from less. Prog Biophys Mol Biol 2008; 98:85-106.
[11] Hicks GG, Singh N, Nashabi A, Mai S, Bozek G, Klewes L, Arapovic D, White EK, Koury MJ, Oltz EM, et al. Fus deficiency in mice results in defective B-lymphocyte development and activation, high levels of chromosomal instability and perinatal death. Nature Genetics 2000; 24:175-179.
[12] Fujii R, Okabe S, Urushido T, Inoue K, Yoshimura A, Tachibana T, Nishikawa T, Hicks GG, Takumi T. The RNA binding protein TLS is translocated to dendritic spines by mGluR5 activation and regulates spine morphology. Curr Biol 2005; 15:587-593.
[13] Li H, Watford W, Li C, Parmelee A, Bryant MA, Deng C, O'Shea J, Lee SB. Ewing sarcoma gene EWS is essential for meiosis and B lymphocyte development. J Clin Invest 2007; 117:1314-1323.
[14] Andersson MK, Stahlberg A, Arvidsson Y, Olofsson A, Semb H, Stenman G, Nilsson O, Aman P. The multifunctional FUS, EWS and TAF15 proto-oncoproteins show cell type-specific expression patterns and involvement in cell spreading and stress response. BMC Cell Biol 2008; 9:37.
[15] Paronetto MP, Minana B, Valcarcel J. The Ewing sarcoma protein regulates DNA damage-induced alternative splicing. Mol Cell 2011; 43:353-368.
[16] Ballarino M, Jobert L, Dembele D, de la Grange P, Auboeuf D, Tora L. TAF15 is important for cellular proliferation and regulates the expression of a subset of cell cycle genes through miRNAs. Oncogene 2013; 32:4646-4655.
[17] Ibrahim F, Maragkakis M, Alexiou P, Maronski MA, Dichter MA, Mourelatos Z. Identification of in vivo, conserved, TAF15 RNA binding sites reveals the impact of TAF15 on the neuronal transcriptome. Cell Reports 2013; 3:301-308.
[18] Kaneb HM, Dion PA, Rouleau GA. The FUS about arginine methylation in ALS and FTLD. EMBO J 2012; 31:4249-4251.
[19] Dormann D, Haass C. Fused in sarcoma (FUS): An oncogene goes awry in neurodegeneration. Mol Cell Neurosci 2013; 56:475-486.
[20] Lee KAW. Molecular recognition by the EWS transcriptional activation domain. In Fuzziness: structural disorder in protein complexes, Tompa P, Fuxreiter M, eds.; Adv Exp Med Biol, Vol. 725, New York: Landes Bioscience; 2011; Chapter 7:106-125.
[21] Stolow DT, Haynes SR. Cabeza, a Drosophila gene encoding a novel RNA binding protein, shares homology with EWS and TLS, two genes involved in human sarcoma formation. Nucl Acids Res 1995; 23:835-843.
[22] Belyanskaya LL, Gehrig PM, Gehring H. Exposure on cell surface and extensive arginine methylation of Ewing sarcoma (EWS) protein. J Biol Chem 2001; 276:18681-18687.
[23] Uranishi H, Tetsuka T, Yamashita M, Asamitsu K, Shimizu M, Itoh M, Okamoto T. Involvement of the pro-oncoprotein TLS (Translocated in Liposarcoma) in nuclear factor-κB p65-mediated transcription as a coactivator. J Biol Chem 2001; 276:13395-13401.
[24] Rossow KI, Janknecht R. The Ewing's sarcoma gene product functions as a transcriptional activator. Cancer Res 2001; 61:2690-2695.


[25] Araya N, Hirota K, Shimamoto Y, Miyagishi M, Yoshida E, Ishida J, Kaneko S, Kaneko M, Nakajima T, Fukamizu A. Cooperative interaction of EWS with CREB-binding protein selectively activates hepatocyte nuclear factor 4-mediated transcription. J Biol Chem 2003; 278:5427-5432.
[26] Wang X, Arai S, Song X, Reichart D, Du K, Pascual G, Tempst P, Rosenfeld MG, Glass CK, Kurokawa R. Induced ncRNAs allosterically modify RNA-binding proteins in cis to inhibit transcription. Nature 2008; 454:126-130.
[27] Tan AY, Riley TR, Coady T, Bussemaker HJ, Manley JL. TLS/FUS (translocated in liposarcoma/fused in sarcoma) regulates target gene transcription via single-stranded DNA response elements. Proc Natl Acad Sci USA 2012; 109:6030-6035.
[28] Lee MH, Mori S, Raychaudhuri P. Trans-activation by the hnRNP K protein involves an increase in RNA synthesis from the reporter genes. J Biol Chem 1996; 271:3420-3427.
[29] Dempsey LA, Hanakahi LA, Maizels N. A specific isoform of hnRNPD interacts with DNA in the LR1 heterodimer: canonical RNA binding motifs in a sequence-specific duplex DNA binding protein. J Biol Chem 1998; 273:29224-29229.
[30] Li KKC, Lee KAW. Transcriptional activation by the EWS oncogene can be cis-repressed by the EWS RNA-binding-domain. J Biol Chem 2000; 275:23053-23058.
[31] Alex D, Lee KAW. RGG-boxes of the EWS oncoprotein repress a range of transcriptional activation domains. Nucleic Acids Res 2005; 33:1323-1331.
[32] Bertolotti A, Melot T, Acker J, Vigneron M, Delattre O, Tora L. EWS, but not EWS-FLI-1, is associated with both TFIID and RNA polymerase II: interactions between two members of the TET Family, EWS and hTAFII68, and subunits of TFIID and RNA polymerase II complexes. Mol Cell Biol 1998; 18:1489-1497.
[33] Petermann R, Mossier BM, Aryee DN, Khazak V, Golemis EA, Kovar H. Oncogenic EWS-Fli1 interacts with hsRPB7, a subunit of human RNA polymerase II. Oncogene 1998; 17:603-610.
[34] Song J, Ng SC, Tompa P, Lee KAW, Chan HS. Polycation-pi interactions are a driving force for molecular recognition by an intrinsically disordered oncoprotein family. PLoS Comput Biol 2013; 9(9):e1003239.
[35] Aman P, Panagopoulos I, Lassen C, Fioretos T, Mencinger M, Toresson H, Hoglund M, Forster A, Rabbitts TH, Ron D, et al. Expression patterns of the human sarcoma-associated genes FUS and EWS and the genomic structure of FUS. Genomics 1996; 37:1-8.
[36] Melot T, Dauphinot L, Sevenet N, Radvanyi F, Delattre O. Characterisation of a new brain-specific isoform of the EWS protein. Eur J Biochem 2001; 268:3483-3489.
[37] Ng KP, Li KKC, Lee KAW. In vitro activity of the EWS oncogene transcriptional activation domain. Biochem 2009; 48:2849-2857.
[38] Chasman DI, Leatherwood J, Carey M, Ptashne M, Kornberg RD. Activation of yeast polymerase II transcription by Herpesvirus VP16 and GAL4 derivatives in vitro. Mol Cell Biol 1989; 9:4746-4749.
[39] Pan S, Ming KY, Dunn TA, Li KKC, Lee KAW. The EWS/ATF1 fusion protein contains a dispersed activation domain that functions directly. Oncogene 1998; 16:1625-1631.
[40] Ng KP, Potikyan G, Savene RO, Denny CT, Uversky VN, Lee KAW. Multiple aromatic side chains within a disordered structure are critical for transcription and transforming activity of EWS family oncoproteins. Proc Natl Acad Sci USA 2007; 104:479-484.
[41] Azuma M, Embree LJ, Sabaawy H, Hickstein DD. Ewing sarcoma protein Ewsr1 maintains mitotic integrity and proneural cell survival in the zebrafish embryo. PLoS One 2007; 10:e979.
[42] Feng L, Lee KAW. A repetitive element containing a critical tyrosine residue is required for transcriptional activation by the EWS/ATF1 oncogene. Oncogene 2001; 20:4161-4168.
[43] Ng KP, Cheung F, Lee KAW. A transcription assay for EWS oncoproteins in xenopus oocytes. Protein Cell 2010; 1:927-934.
[44] Thomas MC, Chiang CM. The general transcription machinery and general cofactors. Critical Rev Biochem Mol Biol 2006; 41:105-178.
[45] Casamassimi A, Napoli C. Mediator complexes and eukaryotic transcription regulation: An overview. Biochimie 2007; 89:1439-1446.
[46] Pahlich S, Quero L, Roschitzki B, Leemann-Zakaryan RP, Gehring H. Analysis of Ewing sarcoma (EWS)-binding proteins: interaction with hnRNP M, U, and RNA-Helicases p68/72 within protein-RNA complexes. J Proteome Res 2009; 8:4455-4465.
[47] Paushkin S, Gubitz AK, Massenet S, Dreyfuss G. The SMN complex, an assemblyosome of ribonucleoproteins. Curr Op Cell Biol 2002; 14:305-312.
[48] Shaw DJ, Morse R, Todd AG, Eggleton P, Lorson CL, Young PJ. Identification of a self-association domain in the Ewing's sarcoma protein: a novel function for arginine-glycine-glycine rich motifs? J Biochem 2010; 147:885-893.
[49] Young PJ, Francis JW, Lince D, Coon K, Androphy EJ, Lorson CL. The Ewing's sarcoma protein interacts with the Tudor domain of the survival motor neuron protein. Mol Brain Res 2003; 119:37-49.
[50] Ohno T, Ouchida M, Lee L, Gatalica Z, Rao VN, Reddy ESP. The EWS gene, involved in Ewing family of tumors, malignant melanoma of soft parts and desmoplastic small round cell tumors, codes for an RNA binding protein with novel regulatory domains. Oncogene 1994; 9:3087-3097.
[51] Lerga A, Hallier M, Delva L, Orvain C, Gallais I, Marie J, Moreau-Gachelin F. Identification of an RNA binding specificity for the potential splicing factor TLS. J Biol Chem 2001; 276:6807-6816.
[52] Hoell JI, Larsson E, Runge S, Nusbaum JD, Duggimpudi S, Farazi TA, Hafner M, Borkhardt A, Sander C, Tuschl T. RNA targets of wild-type and mutant FET family proteins. Nat Struct Biol 2011; 18:1428-1431.
[53] Wilson BJ, Bates GJ, Nicol SM, Gregory DJ, Perkins ND, Fuller-Pace FV. The p68 and p72 DEAD box RNA helicases interact with HDAC1 and repress transcription in a promoter-specific manner. BMC Mol Biol 2004; 5:11.
[54] Araya N, Hiraga H, Kako K, Arao Y, Kato S, Fukamizu A. Transcriptional down-regulation through nuclear exclusion of EWS methylated by PRMT1. Biochem Biophys Res Comm 2005; 329:653-660.
[55] Bertolotti A, Bell B, Tora L. The N-terminal domain of human TAFII68 displays transactivation and oncogenic properties. Oncogene 2000; 18:8000-8010.
[56] McClintock B. The significance of responses of the genome to challenge. Science 1984; 226:792-801.
[57] Biemont C, Vieira C. Genetics: Junk DNA as an evolutionary force. Nature 2006; 443:521-524.
[58] Tompa P. Intrinsically unstructured proteins evolve by repeat expansion. BioEssays 2003; 25:847-855.
[59] Emami KH, Carey M. A synergistic increase in potency of a multimerised VP16 transcriptional activation domain. EMBO J 1992; 11:5005-5012.


[60] MacArthur H, Walter G. Monoclonal antibodies specific for the carboxy terminus of Simian virus 40 large T antigen. J Virol 1984; 52:483-491.
[61] Todorova R. In vitro interaction between the N-terminus of the Ewing's sarcoma protein and the subunit of RNA polymerase II hsRPB7. Mol Biol Rep 2009; 36:1269-1274.
[62] Sadowski I, Ptashne M. A vector for expressing GAL4(1-147) fusions in mammalian cells. Nucl Acids Res 1989; 17:7539.
[63] Brown AD, Lopez-Terrada D, Denny CT, Lee KAW. Promoters containing ATF-binding sites are de-regulated in tumour-derived cell lines that express the EWS/ATF1 oncogene. Oncogene 1995; 10:1749-1756.
[64] Dignam JD, Lebovitz RM, Roeder RG. Accurate transcription initiation by RNA polymerase II in a soluble extract from isolated mammalian nuclei. Nucl Acids Res 1983; 11:1475-1489.
[65] Lee KAW, Green MR. A cellular transcription factor E4F1 interacts with an E1a-inducible enhancer and mediates constitutive enhancer function in-vitro. EMBO J 1987; 6:1345-1353.
