Biochimie (1991) 73, 17-28 © Société française de biochimie et biologie moléculaire / Elsevier, Paris


A phylogenetic study of U4 snRNA reveals the existence of an evolutionarily conserved secondary structure corresponding to 'free' U4 snRNA

E Myslinski*, C Branlant**

Laboratoire d'Enzymologie et de Génie Génétique, Université de Nancy I, URA CNRS 457, Bd des Aiguillettes, BP 239, 54506 Vandœuvre-lès-Nancy, France

(Received 13 December 1990; accepted 24 December 1990)

Summary - The nucleotide sequence of Physarum polycephalum U4 snRNA*** was determined and compared to published U4 snRNA sequences. The primary structure of P polycephalum U4 snRNA is closer to that of plants and animals than to that of fungi. However, both fungal and P polycephalum U4 snRNAs lack the 3' terminal hairpin, and this may be a common feature of lower eucaryote U4 snRNAs. We found that the secondary structure model we previously proposed for 'free' U4 snRNA is compatible with the various U4 snRNA sequences published. The possibility of forming this tetrahelix structure is preserved by several compensatory base substitutions and by compensatory nucleotide insertions and deletions. According to this finding, association between U4 and U6 snRNAs implies the disruption of 2 internal helical structures of U4 snRNA. One has a very low free energy, but the other, which represents one-half of the helical region of the 5' hairpin, requires 4 to 5 kcal to be opened. The remaining part of the 5' hairpin is maintained in the U4/U6 complex and we observed the conservation, in all U4 snRNAs studied, of a U bulge residue at the limit between the helical region which has to be melted and that which is maintained. The 3' domain of U4 snRNA is less conserved in both size and primary structure than the 5' domain; its structure is also more compact in the RNA in solution. In this domain, only the Sm binding site and the presence of a bulge nucleotide in the hairpin on the 5' side of the Sm site are conserved throughout evolution.

snRNA U4 / secondary structure / Physarum polycephalum / U4/U6 complex / RNA evolution

Introduction

Removal of intervening sequences from pre-messenger RNA is essential for gene expression in eucaryotes. It proceeds in a 2-step reaction. First, cleavage occurs at the 5' splice site, generating the 5' exon and a reaction intermediate containing the 3' exon and a lariat form of the intervening sequence. Second, the 5' and the 3' exons are ligated, releasing the matured mRNA and the intron in a lariat form [1]. This 2-step reaction requires several factors including 5 small nuclear RNAs (U1, U2, U4, U5 and U6 snRNAs) [2, 3] and occurs in a highly ordered structure termed a spliceosome [4, 5]. U1, U2 and U5 snRNAs are present as individual ribonucleoprotein complexes (snRNPs) [6-8], in contrast to U4 and U6 snRNAs, which are present together in a single snRNP particle [9, 10]. The snRNPs assemble onto pre-mRNA to generate the active spliceosome. U1 and U2 snRNPs interact directly with the 5' splice site and branched site of pre-mRNA, respectively [11-15]; U5 snRNP probably interacts with the 3' splice site [16, 17]. No specific site of action has been identified for the U4/U6 snRNP.

The splicing reaction appears to depend upon a series of conformational rearrangements of the spliceosomal structure. In particular, the U4/U6 association seems to be destabilized just before cleavage at the 5' splice site [18-20]. This raises the question of a dynamic U4/U6 interaction and of the existence of 'free' U4 snRNA in vivo. There is one peculiar situation in which U4 snRNA may also be free of association with U6 snRNA in vivo: its maturation step in the cytoplasm. Like U1, U2 and U5 pre-snRNAs [21], U4 pre-snRNA is directed to the cytoplasm following transcription, which is not the case for U6 snRNA [22]. In the cytoplasmic compartment, U4 snRNA undergoes a 3' terminal maturation process and binds snRNP proteins [21-23]. As found for other U snRNAs, 'free' U4 snRNA is expected to have a defined secondary structure conserved throughout evolution [24].

The secondary structure of U4 snRNA has long been a matter of debate. Several models have been proposed to explain the U4/U6 association [10, 25]. Only that recently proposed by Brow and Guthrie is strongly supported by phylogenetic and genetic evidence [26]. Based on an experimental study of U4 snRNA secondary structure [27] and on the primary structure determination of Drosophila U4 snRNA [28], we previously proposed a secondary structure model for 'free' U4 snRNA. The U4 snRNA segments which are not base-paired with U6 snRNA in the Brow and Guthrie model are folded in the same way in our model and in the Brow and Guthrie model. The question was whether the additional U4 snRNA internal base-pairings we proposed for 'free' U4 snRNA had biological significance. One way to answer this question was to check whether the possibility of forming the corresponding base pairs was conserved throughout evolution by compensatory mutations.

The number of determined U4 snRNA nucleotide sequences was not high: man, rat, chicken [27], Drosophila [28], broad bean [29], Caenorhabditis elegans [30], Saccharomyces cerevisiae [31], Schizosaccharomyces pombe [32], Kluyveromyces lactis [33], Trypanosoma brucei [34]. We decided to determine an additional U4 snRNA sequence, that of Physarum polycephalum, a unicellular slime mold at the frontier between plants and animals [35]. We then compared all U4 snRNA sequences with the aim of defining a common secondary structure corresponding to 'free' U4 snRNA and of identifying highly evolutionarily conserved structural motifs of U4 snRNA. The results are discussed taking into account recent data on spliceosome assembly and function.

*Present address: Institut de Biologie Moléculaire et Cellulaire du CNRS - 67000 Strasbourg, France
**Correspondence and reprints
***Accession No of P polycephalum U4 snRNA in the EMBL data bank: X13840

Materials and Methods

Isolation of P polycephalum U4 snRNA
Nuclei were isolated from P polycephalum strain M3 CVIII microplasmodia [36], according to Mohberg and Rusch [37]. Nuclear RNAs, isolated by 2 successive phenol extractions of the nuclei at 4°C in 100 mM NaCl, 10 mM Tris-HCl buffer pH 7.6, as previously reported [38], were fractionated on a 15-30% sucrose gradient in the same buffer. The U snRNAs were immunopurified from the 4S-8S RNA fraction using antibodies specific for 2,2,7-trimethylguanosine and protein A-Sepharose as described by Krol et al [39]. The m3G-specific antibodies were a generous gift of Dr Lührmann (Berlin). The immunopurified U snRNAs were then 3' end-labelled using [5'-32P] pCp and T4 RNA ligase [40] and fractionated by electrophoresis on a 15% polyacrylamide gel in the presence of 8 M urea.

RNA sequencing

Chemical method
The chemical degradation method of Peattie [41] was applied to the 3' end-labelled molecules.

Fig 1. Electrophoretic fractionation of P polycephalum and rat U snRNAs on a 15% polyacrylamide gel, in the presence of 8 M urea. P polycephalum U snRNAs (lane a) and rat U snRNAs (lane b) were obtained by the previously described 2-step procedure [43]. They were 3' end-labelled using [5'-32P] pCp and T4 RNA ligase. The RNA in each band numbered in the figure was subjected to sequence analysis. Band 7 was found to contain U4 snRNA.

Reverse transcriptase method [42]
A 30-mer DNA oligonucleotide complementary to the region of U4 snRNA sequenced by the chemical method was chemically synthesized (Pharmacia DNA synthesizer). It was 5' end-labelled with [γ-32P] ATP and T4 polynucleotide kinase and used as a primer for RNA sequencing with reverse transcriptase directly on the 4S-8S RNA mixture recovered from the sucrose gradient. 20 µg of RNA mixture was annealed with 40 pmol of primer.
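The primer design described above can be illustrated with a minimal Python sketch that derives the DNA oligonucleotide complementary to an RNA segment. The function name and the 10-nucleotide target are hypothetical and serve only as an illustration; the target is not taken from the P polycephalum U4 snRNA sequence.

```python
# Complementary DNA base for each RNA base.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}


def complementary_dna_primer(rna_target: str) -> str:
    """Return the DNA primer (written 5'->3') that anneals to an RNA target (5'->3')."""
    # Reading the target backwards gives the primer in 5'->3' orientation;
    # each base is then replaced by its DNA complement.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna_target.upper()))


# Hypothetical RNA segment, chosen for illustration only.
example_target = "GCAUUGGCAA"
print(complementary_dna_primer(example_target))  # TTGCCAATGC
```

A primer obtained in this way anneals to the RNA template, and reverse transcriptase extends it towards the 5' end of the RNA.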

Results

Isolation and sequence analysis of P polycephalum U4 snRNA
Nuclear RNAs were prepared from P polycephalum microplasmodia. The U snRNAs were isolated from this mixture by the previously described 2-step procedure (sucrose gradient plus immunoselection) [43]. Figure 1 shows the fractionation pattern of 3' end-labelled P polycephalum U snRNAs after electrophoresis on a 15% polyacrylamide gel. Band 7 was found to contain U4 snRNA after sequence analysis of the eluted RNAs. In order to determine the complete nucleotide sequence of P polycephalum U4 snRNA, the 3' half of the molecule was subjected to the chemical method for RNA sequencing, whereas the 5' half was subjected to the reverse transcriptase method. This second method was directly applied to the mixture of small nuclear RNAs recovered from the 4S-8S peak of the sucrose gradient. A 30-mer deoxyoligonucleotide complementary to the P polycephalum U4 snRNA sequence located between nucleotides 66 and 98 (numbered from the 5' end) (fig 2) was used as a primer. Although U4 snRNA only represents ~1% of the total 4S-8S RNA mixture, satisfactory sequencing results were obtained using this approach. Only the 2 nucleotides proximal to the triphosphate of the cap structure could not be determined: the reverse transcriptase reaction stopped before them. This stop is probably explained by the presence of base modifications. The sequence determined for the RNA of band 7 is given in figure 2.

Comparison of U4 snRNA primary structures
The nucleotide sequence determined for P polycephalum U4 snRNA has been aligned with published U4 snRNA sequences (fig 2). The high degree of homology observed confirmed that the isolated RNA really corresponds to U4 snRNA. This is in contrast with the RNA previously sequenced by Adams et al [44]: these authors sequenced a P polycephalum snRNA that they considered to be U4 snRNA. However, this RNA displays no significant homology with U4 snRNA sequences from other species. The present results suggest that it corresponds to an unidentified additional snRNA species present in P polycephalum.

An important point to notice is that P polycephalum U4 snRNA, like S cerevisiae [31], S pombe [32] and K lactis [33] U4 snRNAs, is truncated at its 3' end. P polycephalum U4 snRNA is the shortest U4 snRNA sequenced up to now, with the exception of Trypanosoma brucei U4 snRNA [34], which is a special case as it is involved in trans splicing. P polycephalum U4 snRNA is 124 nucleotides long, whereas S pombe, vertebrate, broad bean and S cerevisiae U4 snRNAs are 128, 145, 152 and 160 nucleotides long, respectively. The short length observed could have resulted from the isolation of a degradation product. Against this hypothesis, we note that the fractionation pattern of snRNAs illustrated in figure 1 was obtained in a reproducible manner, that the RNA of each band detected on the autoradiogram was subjected several times to sequence analysis, and that band 7 was reproducibly found to be the only one corresponding to U4 snRNA. In addition, RNA degradation products generally have a 3' phosphate at their 3' end, and this was not the case for the RNA we sequenced.

As previously observed by Guthrie and Patterson [33], the 5' and 3' halves of U4 snRNA have different evolutionary behaviour (fig 2). The 5' half displays a higher degree of nucleotide sequence conservation, with large blocks which are highly conserved from yeast to man. Only point insertions or deletions have occurred in this 5' half, whereas large blocks have been inserted or deleted in the 3' half. Determining a percentage of homology in the case of large insertions or deletions is not really meaningful. For this reason, we calculated the percentage of homology between the 5' domains (84 nucleotides at the 5' end) of the various molecules sequenced. T brucei RNA has not been included in the comparison since, as revealed by the alignment in figure 2, only its extreme 5' terminal region retains significant homology with other U4 snRNAs.

As can be seen in table I, the degrees of homology between P polycephalum and vertebrate U4 snRNAs on one hand, and broad bean and vertebrate U4 snRNAs on the other hand, are very similar. We noted that the homology between vertebrate and C elegans U4 snRNAs is higher than that between vertebrate and P polycephalum U4 snRNAs, which is in accord with the expected phylogenetic distances between these organisms [45]. The U4 snRNA sequences giving the lowest score of homology with any U4 snRNA compared are those from the yeasts (table I). P polycephalum RNA displays a very low level of homology with yeast RNAs, which is in accord with the previous observations of Gonzales-y-Merchand and Cox [35]. These authors concluded from a comparison of tubulin sequences that physarales are closer to kingdom animalia than to kingdom fungi. Another important point to notice is the high degree of divergence between S cerevisiae and S pombe RNA sequences. It is greater than that between C elegans and man or broad bean and man RNAs.

The secondary structure of 'free' U4 snRNA
We previously proposed a secondary structure model for 'free' U4 snRNA which was based on the results of an experimental study of rat U4 snRNA in solution [27] and on a comparison between vertebrate and Drosophila U4 snRNAs [28]. We checked whether this model was compatible with the various primary structures of U4 snRNA published since that time.


Figures 3A and 3B show the results obtained when U4 RNAs from higher eucaryotes (broad bean and C elegans) and from lower eucaryotes (P polycephalum, S cerevisiae and S pombe) are respectively folded according to this model.

As can be seen in these figures, the proposed model is compatible with all published U4 snRNA primary structures. As shown in figure 3A, where broad bean RNA is compared with chicken RNA, a huge number of compensatory base mutations, nucleotide deletions

Fig 2. U4 snRNA sequence alignment. The RNA has been divided into 2 domains based on its secondary structure (fig 3): A: the 5' domain; B: the 3' domain. The sequences of man, rat, chicken [27], Drosophila [28], Caenorhabditis elegans [30], broad bean [29], Physarum polycephalum (this paper), Kluyveromyces lactis [33], Saccharomyces cerevisiae [31] and Schizosaccharomyces pombe [32] U4 snRNAs have been aligned. Since base modifications were only determined for a few species, they were not taken into account in the comparison. Thus, as human and rat U4A snRNA sequences only differ by a base modification at position 2 in the human RNA, a single sequence has been given for the 2 species. In the case of the 3' domain, we nevertheless aligned a second sequence, that of the U4C snRNA variant, which differs from the U4A snRNA in the 3' domain. For reasons of clarity, only U4 snRNA a from broad bean was used and, finally, only the limited areas of Trypanosoma brucei RNA [34] having a significant homology with other U4 snRNAs have been aligned. Only the differences with respect to rat and human RNAs are indicated; conserved nucleotides are represented by dots. Δ means deletion of a single nucleotide, ∇ insertion of a single nucleotide; dedicated symbols also mark the deletion of a segment and the insertion of an 8-nucleotide fragment. The consensus line is derived by the following rules: a) invariant residues are indicated as a capital A, G, C, U; b) Y and R indicate an invariant pyrimidine or purine, respectively; c) a residue which is conserved in at least 7 sequences among the 9 aligned is represented by a lower case letter; d) a means that the residue found at this position is an A or a G residue; e) n means no significant conservation. Inverted arrows above the sequences indicate the proposed intramolecular helices numbered I to IV, as in figure 3. Sm refers to the Sm protein binding site. The 2 segments underlined in the consensus sequence indicate the 2 segments base-paired with U6 snRNA in the Brow and Guthrie model [26].
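The consensus rules a), b), c) and e) of the figure 2 legend can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, gaps are written as "-", rule b) is assumed to take precedence over rule c) when both apply, rule d) is not implemented, and the 9 short sequences are invented for the example rather than taken from the alignment.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}


def consensus_symbol(column: str, min_conserved: int = 7) -> str:
    """Consensus symbol for one alignment column (one character per sequence)."""
    residues = set(column)
    # Rule a): the same residue in every sequence -> capital letter.
    if len(residues) == 1 and residues <= (PURINES | PYRIMIDINES):
        return column[0]
    # Rule b): always a purine (R) or always a pyrimidine (Y).
    # Assumption: this rule is applied before rule c).
    if residues <= PURINES:
        return "R"
    if residues <= PYRIMIDINES:
        return "Y"
    # Rule c): one residue present in at least `min_conserved` sequences -> lower case.
    counts = Counter(base for base in column if base != "-")
    if counts:
        base, count = counts.most_common(1)[0]
        if count >= min_conserved:
            return base.lower()
    # Rule e): no significant conservation.
    return "n"


def consensus_line(alignment: list[str]) -> str:
    """Consensus of equal-length aligned sequences, column by column."""
    return "".join(consensus_symbol("".join(col)) for col in zip(*alignment))


# Invented 4-column alignment of 9 sequences (illustration only).
toy_alignment = ["GAUA", "GAUC", "GAUG", "GGUU", "GGUA", "GAUC", "GAUG", "GGAU", "GA--"]
print(consensus_line(toy_alignment))  # GRun
```

In the invented example, the first column is invariant (G), the second contains only purines (R), the third has U in 7 of 9 sequences (u) and the fourth shows no conservation (n).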


Table I. Estimation of the percentage of homology between the U4 snRNA 5' domains of various eucaryotic species. The reference length for this domain has been taken as 84 nucleotides (length of the human U4 snRNA 5' domain). The numbers in this table represent the percentage of conserved positions in a pair of species (taking into account substitutions, deletions and insertions) with the 84-nucleotide length taken as the reference. The nucleotide sequences used for the comparison are from: human [27], Drosophila [28], Caenorhabditis elegans [30], broad bean [29], Physarum polycephalum (this paper), Saccharomyces cerevisiae [31], Schizosaccharomyces pombe [32]. As human, rat and chicken RNAs are very similar, only one sequence has been included in the comparison. For the same reason, the K lactis sequence, which is very close to that of S cerevisiae, has not been used.

               Human   Drosophila   C elegans   Broad bean   Physarum   S cerevisiae   S pombe
Human          100
Drosophila      77.4   100
C elegans       79.8    77.4        100
Broad bean      75      64.3         75         100
Physarum        63.1    54.8         60.7        61.9        100
S cerevisiae    48.8    52.4         47.6        48.8         41.7      100
S pombe         53.6    58.3         39.3        39.3         42.8       66.7          100
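The calculation behind table I can be sketched as follows: conserved positions are counted in a pairwise alignment and expressed as a percentage of the fixed reference length. This is a minimal illustration, assuming that gaps are written as "-" and never count as conserved; the function name and the two short sequences are hypothetical, and the reference length of the example is set to 8 instead of 84 to keep it readable.

```python
def percent_homology(aligned_a: str, aligned_b: str, reference_length: int = 84) -> float:
    """Percentage of conserved positions relative to a fixed reference length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # A position is conserved when both sequences carry the same residue;
    # substitutions, insertions and deletions (gaps) all count as differences.
    conserved = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * conserved / reference_length


# Invented pair of aligned sequences (illustration only): 6 conserved positions out of 8.
print(percent_homology("GCUU-GCA", "GCAUUGCA", reference_length=8))  # 75.0
```

Using a fixed reference length rather than the alignment length keeps the scores of all species pairs on the same scale, which is the convention stated in the table legend.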

and insertions preserve the possibility of forming the 3 hairpin structures denoted I-III, and hairpin IV when the corresponding sequence is present. We noticed that not only the possibility of forming the hairpin structures is conserved, but also the thermodynamic stability of the corresponding helices. For instance, an internal A-U pair and an internal G·U pair in vertebrate helix I are replaced by a G-C and an A-U pair, respectively, in broad bean helix I, which increases the stability. On the other hand, the first A-U pair of helix I in vertebrate RNA is disrupted by A to G and U to G replacements in the broad bean RNA, so that the overall stability of hairpin I, calculated according to Tinoco et al [46] or taking into account new thermodynamic parameters proposed for RNA [47-49], is similar in vertebrate and broad bean RNAs. Depending on the parameters used, it varies between -10 and -8 kcal. This conservation of the stability is also observed for hairpin II. It contains either a perfect helix of 4 base pairs or a helix of 5 base pairs when a G·U pair or a looped-out nucleotide is present. The free energy of this hairpin is very low: -0.5 kcal according to Tinoco et al [46]; using new parameters [47-49] it is even positive. We would nevertheless like to mention that such hairpin structures have been shown to exist in ribosomal RNA [50]. From the present data, we concluded that the secondary structure model we previously proposed for 'free' U4 snRNA is supported by phylogenetic evidence and should correspond to a structure of biological importance.
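The notion of compensatory base substitution used above can be made concrete with a short Python sketch that tests whether two strands can still form a helix after mutation. It is a simplified illustration, assuming that only Watson-Crick and G·U pairs are allowed and ignoring bulges and free energies; the function names and the 3-nucleotide strands are hypothetical and are not taken from the U4 snRNA helices.

```python
# Watson-Crick pairs plus the G.U wobble pair.
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(base_a: str, base_b: str) -> bool:
    """True if the two residues can form a Watson-Crick or G.U pair."""
    return (base_a, base_b) in ALLOWED_PAIRS


def helix_is_preserved(left_strand: str, right_strand: str) -> bool:
    """Check that every position of an antiparallel helix can base-pair.

    Both strands are written 5'->3', so the first residue of the left
    strand faces the last residue of the right strand.
    """
    if len(left_strand) != len(right_strand):
        raise ValueError("strands must have the same length")
    return all(can_pair(a, b) for a, b in zip(left_strand, reversed(right_strand)))


# Invented strands (illustration only).
print(helix_is_preserved("GAC", "GUC"))  # True: reference helix
print(helix_is_preserved("GGC", "GCC"))  # True: A-U replaced by G-C (compensatory change)
print(helix_is_preserved("GCC", "GUC"))  # False: single change without compensation
```

A pair of substitutions that keeps the result true in one species while the residues differ from those of another species is the kind of compensatory change taken here as phylogenetic support for a helix.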


Evolutionarily conserved motifs of 'free' U4 snRNA structure
As mentioned above, the 5' domain of U4 snRNA has a higher degree of conservation than the 3' domain in both size and primary structure.

The 5' domain
It comprises a 12-13-nucleotide long single-stranded region at the extreme 5' end and 2 hairpin structures, I and II. Hairpin I has a large top loop (15-17 nucleotides) and a long stem (13-15 base pairs). Two nucleotides are always looped out of the stem at invariable positions: one, which is a purine residue (except in the yeasts, where it is a U residue), is located on the left strand and is separated by 3 base pairs from the top loop. The second one, which is a uridine residue in all RNAs, is located on the right strand and is separated by 8 base pairs from the top loop. Such conservation of bulge nucleotides probably has a biological significance: they may be involved in RNA/protein or RNA/RNA interactions. Hairpin II has a short stem (4-5 base pairs) and a top loop of variable size in RNAs from lower eucaryotes (7-11 nucleotides). If we now look at the primary structure conservation within these 3 secondary structure elements (fig 3C), one striking observation is the high degree of sequence conservation in some base-paired regions, which is not usual in RNA evolution [50]. A large part

Fig 3. The proposed secondary structure for 'free' U4 snRNA. The various RNAs have been folded according to the model previously proposed on the basis of a comparison between vertebrate and Drosophila U4 snRNAs [28]. A: U4 snRNA from higher eucaryotes: a) broad bean [29], the full sequence is that of U4 snRNA b, the mutations and insertions in U4 snRNA are indicated by circled letters; b) Caenorhabditis elegans [30]; c) chicken RNA (full sequence) compared with broad bean RNA (circled letters, • = deletion); d) conserved nucleotides in U4 snRNAs from the 6 higher eucaryote species studied (man, rat, chicken, Drosophila, broad bean, C elegans). B: U4 snRNAs from lower eucaryotes: a) Physarum polycephalum (this paper); b) Saccharomyces cerevisiae [31]; c) Schizosaccharomyces pombe [32]. C: Conserved nucleotides in all U4 snRNAs studied.
