Gene, 102 (1991) 189-196
© 1991 Elsevier Science Publishers B.V. 0378-1119/91/$03.50

GENE 04095

Structure, chromosomal localization and evolutionary conservation of the gene encoding human U1 snRNP-specific A protein

(Autoantigen; exon/intron distribution; gene duplication; chromosomal mapping; genomic library; promoter; U2-B″ protein)

Rob L.H. Nelissen a, Peter T.G. Sillekens a,*, Ria P. Beijer a, Ad H.M. Geurts van Kessel b and Walther J. van Venrooij a

a Department of Biochemistry and b Department of Human Genetics, University of Nijmegen, Nijmegen (The Netherlands)

Received by H. van Ormondt: 20 November 1990
Revised: 7 February 1991
Accepted: 18 February 1991

SUMMARY

Three specific proteins, called A, 70K and C, are present in the U1 small nuclear ribonucleoprotein (snRNP) particle, in addition to the common proteins. The human U1 snRNP-specific A protein is, apart from a proline-rich region, highly similar to the U2 snRNP-specific protein B″. To examine the homologous regions at the genomic level, we isolated and characterized the human U1-A gene. The human U1-A protein appears to be encoded by a single-copy gene and its locus has been mapped to the q arm of chromosome 19. The gene, about 14-16 kb in length, consists of six exons. The regions homologous to the U2-B″ gene are not limited to single exons and are mostly not confined by exon-exon junctions in the corresponding U1-A mRNA. However, the proline-rich region of U1-A, absent in U2-B″, is encoded by a single exon, suggesting a specific function for this domain of U1-A. The region of the cap site and upstream sequences contain interesting similarities to the promoter region of other snRNP protein-encoding genes and several housekeeping genes, in particular the vertebrate ribosomal protein-encoding genes. Hybridization experiments with various vertebrate genomic DNAs revealed that U1-A sequences are evolutionarily conserved in all tested vertebrate genomes, except for chicken, duck and pigeon. The divergence of these avian genomes is probably typical for the class of birds.

INTRODUCTION

The U1 snRNP particle is the most abundant of the U snRNPs which participate in pre-mRNA splicing as components of the spliceosome (reviews: Maniatis and Reed, 1987; Steitz et al., 1988). It was the first U snRNP shown to be required for splicing (Kramer et al., 1984) and mutational complementation experiments revealed that direct base-pairing interactions between the U1 snRNA and the 5' splice sites are involved in the recognition of these exon-intron junctions (Zhuang and Weiner, 1986; Zhuang et al., 1987). The protein components of the U1 snRNP particle seem to increase the affinity of the snRNP particle for the 5' splice site (Mount et al., 1983; Heinrichs et al., 1990). The U1 proteins 70K, A and C are U1 snRNP-specific, whereas the proteins B, B', D, D', E, F and G are also present in other snRNPs (reviewed by Lührmann, 1988; Van Venrooij and Sillekens, 1989). Only very little is known about their exact functions. The U1 snRNP-specific proteins contain sequence elements also observed in other nucleic acid-binding proteins. In proteins U1-A and U1-C a proline-rich region is present, whereas U1-70K and U1-A contain the RNP-80 motif that is involved in RNA-binding (Swanson et al., 1987; Dreyfuss et al., 1988; Scherly et al., 1989; 1990).

We have previously reported the molecular cloning and sequence analysis of the cDNAs for the human U1 snRNP-specific A protein (Sillekens et al., 1987) and the human U2 snRNP-specific B″ protein (Habets et al., 1987). Both proteins were shown to have similar internal sequences. Furthermore, mutual sequence comparison revealed two extremely similar regions located in the C- and N-terminal parts of either protein. These structural relationships indicate that the genes encoding U1-A and U2-B″ have emerged from a common ancestral gene and that probably, prior to the gene duplication giving rise to separate genes for the U1-A and U2-B″ proteins, an internal sequence duplication in the progenitor gene has taken place (Sillekens et al., 1987).

This report describes the isolation and structural analysis of the human U1 snRNP-specific A protein gene, whose 5' flanking region shows interesting similarities with the promoter regions of vertebrate ribosomal protein-encoding genes and of three other cloned snRNP-encoding genes, the U1 snRNP-specific 70K protein of Xenopus (Etzerodt et al., 1988) and Drosophila (Mancebo et al., 1990) and the human U snRNP E protein (Stanford et al., 1988).

Correspondence to: Dr. R.L.H. Nelissen, Department of Biochemistry, University of Nijmegen, P.O. Box 9101, NL 6500 HB Nijmegen (The Netherlands) Tel. (31)-80-614254; Fax (31)-80-540525.
* Present address: Organon Teknika, P.O. Box 84, 5280 AB Boxtel (The Netherlands) Tel. (31)-4116-54468.

Abbreviations: aa, amino acid(s); bp, base pair(s); cDNA, DNA complementary to RNA; kb, kilobase or 1000 bp; nt, nucleotide(s); r, ribosomal; SDS, sodium dodecyl sulfate; snRNP, small nuclear ribonucleoprotein; tsp, transcription start point(s); U1-A, U1 snRNP-specific protein A; U1-A, gene encoding U1-A; U2-B″, U2 snRNP-specific protein B″; U2-B″, gene encoding U2-B″.

RESULTS AND DISCUSSION

(a) Isolation of the gene encoding human U1 snRNP-specific A protein

A human (liver) genomic library in the λEMBL3 replacement vector was screened with the EcoRI insert of pHA-4, a full-length cDNA clone encoding human U1 snRNP-specific A protein (Sillekens et al., 1987). Initial screening of about 5 × 10⁵ plaques revealed eight recombinant phages bearing sequences homologous to the U1-A cDNA. Restriction-enzyme digestion followed by gel electrophoresis showed identical restriction patterns for all these phages, indicating that their genomic inserts originated from a single chromosomal locus. Further restriction-enzyme and blotting analyses showed that the overlapping inserts of two genomic clones, termed λGEN-HA1 and λGEN-HA2, contained all hybridizing sequences in the human genome (Fig. 1).

The EcoRI digest of λGEN-HA1, carrying 17 kb of genomic DNA, revealed two hybridizing fragments of 4.9 and 7.8 kb (Fig. 1B, lane 1). When the transcriptional orientation of the gene was determined, the smaller of these fragments turned out to contain the 5' moiety of the U1-A gene (data not shown). The other clone, bearing 12.5 kb of genomic DNA, contained EcoRI fragments of 7.1 kb and 2.8 kb that hybridized with the cDNA probe (Fig. 1B, lane 2). The 7.1-kb fragment overlaps with the 7.8-kb fragment of λGEN-HA1, whereas the 2.8-kb fragment hybridized to a 3' probe of the U1-A cDNA (data not shown). Human genomic DNA digested with EcoRI yielded three hybridizing fragments of 8.0, 4.9, and 2.8 kb (Fig. 1A). Taken together, these results demonstrate that all genomic U1-A sequences are located within the two overlapping inserts of λGEN-HA1 and λGEN-HA2 (Fig. 1C). Detailed analysis of both inserts, recloned with SalI in pSP65, revealed (see section b) that the overlapping fragments contain only one gene. Therefore, it was concluded that the human U1-A protein is encoded by a single gene per haploid genome.

Fig. 1. Comparison of positive genomic gene clones λGEN-HA1 and λGEN-HA2 with human genomic DNA. Plaques (5 × 10⁵) from a λEMBL3 human genomic library were lifted onto nitrocellulose filters (Schleicher & Schuell) and screened with 32P-labelled nick-translated pHA-4 (Sillekens et al., 1987) under hybridization conditions as described by Church and Gilbert (1984) in the presence of 100 µg/ml single-stranded herring-sperm DNA (20-24 h at 65°C). After hybridization, blots were washed twice with 0.5 M Na phosphate pH 7.0/1% SDS/1 mM EDTA and once with 0.25 M Na phosphate pH 7.0/1% SDS/1 mM EDTA (30 min each time at 65°C). Positive plaques were purified and DNA inserts isolated. (A) For comparison with total human DNA, this total DNA was digested with EcoRI, electrophoresed on a 0.7% agarose gel, and hybridized with 32P-labelled nick-translated EcoRI insert of cDNA clone pHA-4 (Sillekens et al., 1987). (B) λGEN-HA1 and λGEN-HA2 DNA inserts were digested with EcoRI and hybridized on Southern blot as described above. HindIII-digested λ-DNA was used as marker (M). (C) Cloning of the human U1 snRNP-specific A protein gene. The physical EcoRI restriction map of the human U1-A gene and its flanking sequences is given in the upper line. The thick bars represent those regions of λGEN-HA1 and λGEN-HA2 in which hybridization to the A protein cDNA was found.

(b) Structure of the human U1-A gene

To elucidate the detailed exon-intron structure of the whole gene, the nt sequence of all exons and parts of their flanking introns was established via the shotgun strategy of Deininger (1983). Clones containing exon sequences were selected by hybridization with the EcoRI insert of the cDNA plasmid pHA-4 and subsequently subjected to dideoxy sequencing reactions. The clones contain the complete single-copy U1-A gene, flanked by at least 4.1 kb of upstream and 3.5 kb of downstream sequences. The coding sequence extends over 14-16 kb and is split into six exons. The sequence of all exons was completely determined, whereas the introns were only partially sequenced (Fig. 2). The boundaries between exons and introns were defined by sequence comparison with the U1-A cDNA pHA-4 (Sillekens et al., 1987). Furthermore, comparison of the nt sequence of the gene with that of the human cDNA clone did not reveal any discrepancy in their nt sequences, indicating that the gene described here indeed corresponds to the functional U1-A gene.
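The size figures reported for the gene can be cross-checked with simple arithmetic. The sketch below is not part of the original analysis: it sums the six exon sizes listed in Table I and compares the result with the 1.4-kb mRNA and the 14-16-kb gene span quoted in the text. It assumes that the exon sizes in Table I add up to the complete spliced transcript, and it borrows the 150-200 nt poly(A) tail assumed in section (c); all variable and function names are illustrative.

```python
# Exon sizes (bp) as listed in Table I of this paper.
EXON_SIZES_BP = {1: 198, 2: 173, 3: 180, 4: 174, 5: 89, 6: 380}

# Poly(A) tail length range (nt) assumed in section (c).
POLY_A_TAIL_NT = (150, 200)

# Approximate span of the gene (bp) as stated in section (b).
GENE_SPAN_BP = (14_000, 16_000)


def spliced_length(exon_sizes):
    """Length of the spliced transcript without poly(A) tail, in nt."""
    return sum(exon_sizes.values())


def mrna_size_range(exon_sizes, tail_range):
    """Expected (min, max) mRNA length once a poly(A) tail is added."""
    body = spliced_length(exon_sizes)
    return body + tail_range[0], body + tail_range[1]


def exonic_fraction_range(exon_sizes, gene_span):
    """(min, max) fraction of the gene span that is made up of exons."""
    body = spliced_length(exon_sizes)
    return body / gene_span[1], body / gene_span[0]


if __name__ == "__main__":
    low, high = mrna_size_range(EXON_SIZES_BP, POLY_A_TAIL_NT)
    frac_low, frac_high = exonic_fraction_range(EXON_SIZES_BP, GENE_SPAN_BP)
    print(f"spliced exons: {spliced_length(EXON_SIZES_BP)} nt")
    print(f"mRNA with poly(A): {low}-{high} nt")
    print(f"exonic fraction of gene: {frac_low:.1%}-{frac_high:.1%}")
```

Under these assumptions the exons total roughly 1.2 kb, so the transcript with its tail comes to between 1.3 and 1.4 kb, in line with the size deduced from Northern blotting, while exons make up well under a tenth of the gene.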
The features of the exon-intron junctional regions are shown in Table I. Exon lengths extend from 89 bp to 380 bp. The boundaries of all the intron sequences are consistent with the deduced 5' and 3' splice site consensus sequences (Mount, 1982) and the pyrimidine-rich tract located between the consensus AG dinucleotide at the 3' splice site and the site of lariat formation is present in all the introns (Table I). The 3' end of the gene was also mapped by sequence comparison of the gene and the pHA-4 cDNA clone, which contains the complete 3' untranslated region as well as a stretch of eight adenosine residues (Sillekens et al., 1987). The poly(A) tract is added 27 nt downstream from the unusual polyadenylation signal ATTAAA (Fig. 2).

It is remarkable that neither of the two homologous regions, which the U1-A protein shares with the U2-B″ protein, is encoded by a single exon and that the boundaries of the N-terminal region of homology do not coincide with exon junctions in the U1-A gene. The homologous region in the N-terminal part is encoded by exons 1, 2 and part of exon 3, whereas exons 5 and 6 encode the C-terminal region of homology. It is significant, however, that the stretch of 58 aa from the middle part of U1-A (aa 143-200) that is not present in U2-B″ is encoded by a single exon, namely exon 4. This exon spans the Pro-rich domain of U1-A. Given the fact that Pro-rich regions have also been found in other proteins that bind single-stranded nucleic acids (Garoff et al., 1980; Kruijer et al., 1981; Adam et al., 1986), this protein segment might encompass an additional RNA-binding capacity of U1-A. The already known RNA-binding domain of the U1-A protein (aa 1-101, see Scherly et al., 1989) is contained in exons 1, 2 and a part of exon 3.

The previously noticed sequence similarity between the N- and C-terminal regions of U1-A, which is indicative of a sequence duplication within the progenitor gene from which the individual U1-A and U2-B″ genes have descended (Sillekens et al., 1987), is not reflected in the exon distribution of the gene. Therefore, after sequence duplication in the progenitor gene, initially both segments must have developed independently without high sequence conservation, and loss or gain of introns has taken place since the duplicated segments diverged. Only after the gene duplication event, giving rise to separate genes for U1-A and U2-B″, at least parts of these regions have been evolutionarily conserved.

TABLE I
Exon-intron junction features of the human U1-A gene a

Exon/intron   Exon size   5' splice site   3' splice site                                 Intron size
number        (bp)        sequence         sequence                                       (kb)
1             198         G gtgagt         ...gctcaaaggtctttttttccccca c tgcag AG         ND
2             173         ATG gtgagc       ...gatccccacccgccctgctctctgtt t ggtag CGT      ND
3             180         CCG gtaagc       ...gtaaccacgcactctcctccctctct c cacag GGC      ND
4             174         CCT gtgagt       ...gctcaccgactcccctataccc c cgcag CTT          0.5
5             89          CA gtaagt        ...ctgagtccctgaggtctgtcgttctc t ttcag G        1.9
6             380
consensus                 AG gtragt        ...(y)n n cag G

a Exon sequences are in upper-case letters, whereas intron sequences are in lower case. The 5' and 3' splice site consensus sequences are from Mount (1982) and the branch point consensus from Green (1986). The size of introns 1, 2, and 3 has not been determined (ND, not determined).

Fig. 2. Sequence of the human U1-A gene. Flanking 5' and 3' sequences and intron sequences are in lower-case letters, whereas exon sequences are printed in upper-case letters. The deduced aa sequence is given in single-letter notation above the second nt of the corresponding triplets. Those aa sequences of U1-A which are highly similar to aa sequences of U2-B″ are indicated by a shaded background. The positions of the exon-intron junctions are deduced from the U1-A cDNA sequence (Sillekens et al., 1987). The polyadenylation signal is underlined. The position of the polyadenylation site, based upon comparison with the cDNA sequence, is indicated by a downward arrow. The tsp was determined by primer extension experiments and is indicated with a star. Primer extension with double-stranded DNA primers (see section c) was carried out in the presence of radiolabelled dATP as described (Sambrook et al., 1989). The translational stop codon is marked by two stars. For sequencing, a library of random subfragments of two overlapping genomic inserts was constructed in M13mp18 as described by Deininger (1983). Plaque filters were prepared and hybridized with the EcoRI insert of pHA-4 (Sillekens et al., 1987). The DNA fragments of clones containing coding sequences (exon-containing fragments I-VI) were sequenced by the dideoxy chain-termination method (Sanger et al., 1977). The gel readings were recorded, edited and compared by the Staden programs (1979). (GenBank accession Nos. for exon-containing fragments I-VI: M60779-M60784, respectively.)

(c) The 5' flanking region: presence of G/C boxes and pyrimidine-rich segment

About 200 bp upstream from the cDNA sequence were sequenced from the λGEN-HA1 insert (Fig. 3). Primer extension experiments with a double-stranded primer derived from the 5' end of pHA-4 (positions +1 to +124 of the cDNA) revealed that the tsp is situated at nt -19 relative to the cDNA start point (Fig. 2). This was confirmed with a second primer (pHA-4 positions +28 to +185) (data not shown). Assuming a poly(A) tail of 150-200 nt, the location of the cap site is consistent with the size of the human U1-A mRNA (1.4 kb) deduced from Northern blotting (Sillekens et al., 1987).

Remarkably, the promoter region of the U1-A gene has several features typical for promoters of housekeeping genes. Upstream from the cap site no canonical TATA box (Breathnach and Chambon, 1981) can be found. A CCAAT box-like sequence, often found between nt -100 and -40 (Efstratiadis et al., 1980), is present 151 nt upstream from the cap site. Three G/C boxes, often found in the promoter region of housekeeping genes (Dynan, 1986), are present upstream from the cap site (Fig. 3). One of these contains the sequence 5'-CCGCCC. This motif (the inverted form of 5'-GGGCGG) represents a potential binding site for transcription factor Sp1 (Kadonaga et al., 1986). Finally, in the 5' flanking sequence a pyrimidine-rich region surrounding the cap site can be identified which is also present in promoters of many housekeeping genes, in particular vertebrate r-protein genes (review: Mager, 1988). Furthermore, the cap site is situated in a 5'-CTTCC motif and a second copy of this motif is present at position +24 (relative to cap site; Fig. 2). The CTTCC motif is a highly conserved sequence found at or near the tsp of all known vertebrate r-protein genes (Mager, 1988).

The presence of an Sp1 consensus motif and/or a CTTCC cap-site motif was also observed in the 5' upstream region of a human snRNP-associated E protein gene (Stanford et al., 1988) and genes of U1 snRNP-specific 70K proteins from Xenopus (Etzerodt et al., 1988) and Drosophila (Mancebo et al., 1990). Therefore, genes encoding snRNPs may be regulated in a common fashion, similar to the transcriptional control of r-protein genes and several other housekeeping genes.

Fig. 3. Sequence features of the 5' flanking region of the human U1-A gene. The sequence encompassing the upstream region is depicted in lower-case letters and part of the exon 1 sequence is given in capital letters. Three G/C boxes, a putative 5'-CCAAT box and a pyrimidine-rich motif are underlined. Two 5'-CTTCC motifs are placed in grey boxes.

(d) Chromosome localization of the human U1-A gene

To determine on which human chromosome the U1-A gene is located, a panel of ten hamster × human somatic cell hybrids was used (Geurts van Kessel et al., 1983). The restriction enzyme EcoRI gave a good resolution of the hamster and human U1-A gene bands. DNA of the hybrid cells was digested with EcoRI, electrophoresed in an agarose gel and transferred to a nitrocellulose filter. The filter was hybridized to the nick-translated cDNA insert of pHA-4 (Sillekens et al., 1987). The EcoRI fragments of the human U1-A gene were detected in four cell lines that retained human chromosome 19 and were not detected in EcoRI digests of DNA from hybrid cells that lacked this chromosome (Table II). All other human chromosomes could be excluded by at least three discordant hybrids (Table II). The results are consistent with assignment of the U1-A gene to human chromosome 19.

A more precise subregional mapping of the U1-A gene was carried out, as described above, with a panel of ten hamster × human somatic cell hybrids, each containing only human DNA of a well determined fragment of chromosome 19. The concordant/discordant segregation (not shown) indicated that the U1-A gene is localized in the chromosome 19q13.1 region. Interestingly, the gene for the U1 snRNP-specific 70K protein has also been located on chromosome 19q (Spritz et al., 1987; Schonk et al., 1990). However, as the latter gene has been located to the 13.2 region, the probability of these snRNP-encoding genes being organized in a cluster is small.

(e) Detection of U1-A gene sequences in genomic DNAs

Southern-blot analysis was performed to test the presence of U1-A gene elements in the genome of other species. EcoRI-digested genomic DNA of different phylogenetic sources was hybridized using the 32P-labelled nick-translated EcoRI insert of pHA-4 as probe (Fig. 4). All mammalian species showed clear hybridizing fragments under the moderate conditions of stringency used. In the genomic DNA of rabbit, only a single strongly hybridizing fragment was detected (Fig. 4, lane 5), whereas a more complex pattern of bands was observed in the genomes of calf, mouse, hamster, and guinea pig (Fig. 4, lanes 2, 4, 7, and 9) as compared to the three EcoRI fragments hybridizing in the monkey (Fig. 4, lane 10) and human genome (Fig. 1A).

TABLE II
Correlation of human pHA-4 sequences with human chromosomes in hamster × human somatic cell hybrids a

[Table body: for each human chromosome (1-22, X and Y), the numbers of hybrids scored +/+, -/-, +/- and -/+ (hybridization/human chromosome) and the number of discordant hybrids. The individual values are not legible in this copy.]

a The numbers of hybrids showing concordant (+/+ or -/-) and discordant (+/- or -/+) segregation with pHA-4 sequences are given for each chromosome. For chromosomal mapping, DNA from ten hamster × human somatic cell lines (Geurts van Kessel et al., 1983) and controls was digested with EcoRI. Each digested DNA sample (10 µg) was electrophoresed on a 0.7% agarose gel, blotted onto nitrocellulose and hybridized to the 32P-labelled nick-translated EcoRI insert of pHA-4 (Sillekens et al., 1987).
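The concordance scoring behind the chromosome assignment in section (d) and Table II can be illustrated with a short sketch. The code below is a hedged reconstruction of the general counting procedure, not the authors' own workflow: for every chromosome it tallies how many hybrids are +/+, -/-, +/- or -/+ (hybridization signal/chromosome retained) and reports the chromosomes without any discordant hybrid. The toy panel of five hybrids and three chromosomes is hypothetical and does not reproduce the data of Table II; function names are illustrative.

```python
def segregation_counts(panel, chromosomes):
    """Tally concordant and discordant hybrids for each chromosome.

    panel: list of (probe_positive, retained) tuples, one per hybrid,
           where retained is the set of human chromosomes in that hybrid.
    Keys are written as hybridization/chromosome, e.g. "+/-" means the
    probe hybridized although the chromosome is absent.
    """
    table = {}
    for chrom in chromosomes:
        counts = {"+/+": 0, "-/-": 0, "+/-": 0, "-/+": 0}
        for probe_positive, retained in panel:
            key = ("+" if probe_positive else "-") + "/" + (
                "+" if chrom in retained else "-"
            )
            counts[key] += 1
        table[chrom] = counts
    return table


def candidate_chromosomes(table):
    """Chromosomes for which no hybrid is discordant."""
    return [
        chrom
        for chrom, counts in table.items()
        if counts["+/-"] + counts["-/+"] == 0
    ]


# Hypothetical toy panel (NOT the data of Table II).
toy_panel = [
    (True, {"19", "7"}),
    (True, {"19"}),
    (False, {"7", "X"}),
    (False, {"X"}),
    (True, {"19", "X"}),
]
toy_table = segregation_counts(toy_panel, ["7", "19", "X"])
toy_candidates = candidate_chromosomes(toy_table)

if __name__ == "__main__":
    for chrom, counts in toy_table.items():
        print(chrom, counts)
    print("fully concordant:", toy_candidates)
```

In a real panel the same tally would be run over all 24 chromosomes. The paper's criterion is stricter than "zero discordants" in one respect: every chromosome other than 19 was excluded by at least three discordant hybrids.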

Fig. 4. Detection of U1-A gene sequences in insect and vertebrate DNAs. Total genomic DNA was isolated from various tissues of all species described via standard methods (Van der Putten et al., 1979). Samples of 10 µg were digested, electrophoresed on 0.7% agarose gels and transferred to nitrocellulose. The Southern blot was hybridized and washed, as described in Fig. 1 legend, with 32P-labelled nick-translated EcoRI insert of cDNA clone pHA-4 (Sillekens et al., 1987). The DNAs were extracted from the following species: lanes: 1, trout; 2, calf; 3, rat; 4, mouse; 5, rabbit; 6, chicken; 7, hamster; 8, Drosophila; 9, guinea pig; 10, monkey. As markers, HindIII-digested λ-DNA fragments were run in parallel (lane M).

Genomic fish nt sequences with homology to the cDNA of the human U1-A protein could be detected. Three significantly hybridizing DNA fragments were observed in trout genomic DNA (Fig. 4, lane 1). Insect DNA also appears to contain sequence elements with homology to the human U1-A cDNA (Fig. 4, lane 8). For the invertebrate species Drosophila a protein immunologically related to the human U1-A protein has been reported (Wieben and Pederson, 1982; Wooley et al., 1982). Only chicken genomic DNA does not contain fragments hybridizing to the human U1-A cDNA (Fig. 4, lane 6). Interestingly, probing with a U2-B″ cDNA did not reveal hybridizing fragments in the chicken genome either, whereas in other vertebrates the evolutionary conservation at the DNA level was similar to that of U1-A (data not shown). Evolutionary conservation among vertebrates has also been found for the genes of other snRNPs, namely the E (Wieben et al., 1985), U1-C (Sillekens et al., 1988) and U1-70K (Mancebo et al., 1990) polypeptides. The divergence of the chicken mRNA sequence corresponding to the U1-A protein probably holds true for the entire class of birds since the genomic DNAs from duck and pigeon did not hybridize either (data not shown).

A precise estimation of the number of U1-A genes in all the species cannot be made from the experiment described in Fig. 4. However, the presence of only a single prominent band in the genomic DNAs of Drosophila and rabbit and the pattern in monkey, which resembles the one observed in human genomic DNA, suggest that in these species, just as in man, the U1-A gene occurs in only one copy.

(f) Conclusions

(1) The human U1 snRNP-specific A protein is encoded by a single-copy gene, consisting of six exons, which is localized on chromosome 19q13.1. The gene sequences are conserved in several vertebrate classes, but probably not in birds.

(2) Exon sequences coding for N- and C-terminal homologous regions of protein U1-A evolved differently after a probable duplication within the gene. The homologous regions between proteins U1-A and U2-B″ are not encoded by single exons. This indicates a diverging evolution of the two genes descending from the same progenitor gene. The Pro-rich domain of protein U1-A, not present in protein U2-B″, is encoded by a single exon (exon 4), suggesting a specific function of this domain in the U1-A protein.

(3) In the promoter region of the U1-A gene the presence of G/C boxes, a pyrimidine-rich segment and a CTTCC motif around the cap site, as well as the lack of a TATA-box sequence, are characteristics encountered in housekeeping genes and, in particular, in vertebrate ribosomal protein genes. As the genes encoding the human snRNP-associated protein E and the U1 snRNP-specific protein 70K (Xenopus, Drosophila) contain similar promoter features, this supports the assumption that these genes might be controlled in a common way.
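The promoter features listed in conclusion (3) are short sequence motifs, so a first-pass search for them can be sketched in a few lines. The example below is an illustration only and was not used in this study: it scans a sequence for G/C boxes (5'-GGGCGG or its inverted form 5'-CCGCCC), a CCAAT box, the CTTCC motif and a TATA box, and reports positions relative to the tsp (+1, with no position 0). The TATA pattern is a deliberately simplified assumption, the toy sequence is hypothetical rather than the U1-A sequence of Fig. 3, and all names are illustrative.

```python
import re

# Motifs named in the text; the TATA pattern is a simplified assumption.
MOTIFS = {
    "GC box": r"GGGCGG|CCGCCC",
    "CCAAT box": r"CCAAT",
    "CTTCC motif": r"CTTCC",
    "TATA box": r"TATA[AT]A",
}


def relative_position(index, tsp_index):
    """Convert a 0-based string index to a position relative to the tsp.

    The tsp itself is +1 and the nt just upstream is -1 (no position 0).
    """
    offset = index - tsp_index
    return offset + 1 if offset >= 0 else offset


def scan_promoter(seq, tsp_index):
    """Return {motif name: [start positions relative to the tsp]}."""
    seq = seq.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead makes overlapping occurrences visible as well.
        lookahead = re.compile(f"(?=({pattern}))")
        hits[name] = [
            relative_position(m.start(), tsp_index)
            for m in lookahead.finditer(seq)
        ]
    return hits


# Hypothetical toy sequence (NOT the sequence of Fig. 3); the tsp is
# placed inside the CTTCC motif, as described for the U1-A gene.
toy_seq = "ttgggcggatccaatgcaccgcccgtactcttccgag"
toy_hits = scan_promoter(toy_seq, tsp_index=31)

if __name__ == "__main__":
    for name, positions in toy_hits.items():
        print(f"{name}: {positions if positions else 'not found'}")
```

A scan of this kind only flags candidate elements. Whether a match is functional, for instance as an Sp1 binding site, has to be established experimentally.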

ACKNOWLEDGEMENTS

We would like to thank Be Wieringa (University of Nijmegen) for kindly providing genomic DNA of several species, and Gerard Grosveld (Erasmus University of Rotterdam) and the Department of Human Genetics (University of Nijmegen) for the use of the genomic library. The present investigations have been carried out in part under the auspices of the Netherlands Foundation for Chemical Research (SON) and with the financial aid of the Netherlands Organization for Scientific Research (NWO) and the Dutch Rheumatism Foundation.

REFERENCES

Adam, S.A., Nakagawa, T., Swanson, M., Woodruff, T.K. and Dreyfuss, G.: mRNA polyadenylate-binding protein: gene isolation and sequencing and identification of a ribonucleoprotein consensus sequence. Mol. Cell. Biol. 6 (1986) 2932-2943.

Breathnach, R. and Chambon, P.: Organization and expression of eukaryotic split genes coding for proteins. Annu. Rev. Biochem. 50 (1981) 349-383.

Church, G.M. and Gilbert, W.: Genomic sequencing. Proc. Natl. Acad. Sci. USA 81 (1984) 1991-1995.

Dahlberg, J.E. and Lund, E.: The genes and transcription of the major small nuclear RNAs. In: Birnstiel, M.L. (Ed.), Structure and Function of Major and Minor Small Nuclear Ribonucleoprotein Particles. Springer-Verlag, Berlin, 1988, pp. 38-70.

Deininger, P.L.: Random subcloning of sonicated DNA: application to shotgun DNA sequence analysis. Anal. Biochem. 129 (1983) 216-223.

Dreyfuss, G., Swanson, M.S. and Pinol-Roma, S.: Heterogeneous nuclear ribonucleoprotein particles and the pathway of mRNA formation. Trends Biochem. Sci. 13 (1988) 86-91.

Dynan, W.: Promoters for housekeeping genes. Trends Genet. 2 (1986) 196-197.

Efstratiadis, A., Posakony, J.W., Maniatis, T., Lawn, R.M., O'Connell, C., Spritz, R.A., De Riel, J.K., Forget, B.G., Weissman, S.M., Slightom, J.L., Blechl, A.E., Smithies, O., Baralle, F.E., Shoulders, C.C. and Proudfoot, N.J.: The structure and evolution of the human β-globin gene family. Cell 21 (1980) 653-668.

Etzerodt, M., Vignali, R., Ciliberto, G., Scherly, D., Mattaj, I.W. and Philipson, L.: Structure and expression of a Xenopus gene encoding an snRNP protein (U1 70K). EMBO J. 7 (1988) 4311-4321.

Garoff, H., Frischauf, A.M., Simons, K., Lehrach, H. and Delius, H.: The capsid protein of Semliki Forest virus has clusters of basic amino acids and prolines in its amino-terminal region. Proc. Natl. Acad. Sci. USA 77 (1980) 6376-6380.

Geurts van Kessel, A.H.M., Tettero, P.A.T., Von dem Borne, A.E.G.K., Hagemeijer, A. and Bootsma, D.: Expression of human myeloid-associated surface antigens in human-mouse myeloid cell hybrids. Proc. Natl. Acad. Sci. USA 80 (1983) 3748-3752.

Green, M.R.: Pre-mRNA splicing. Annu. Rev. Genet. 20 (1986) 671-708.

Habets, W.J., Sillekens, P.T.G., Hoet, M.H., Schalken, J.A., Roebroek, A.J.M., Leunissen, J.A.M., Van de Ven, W.J.M. and Van Venrooij, W.J.: Analysis of a cDNA clone expressing a human autoimmune antigen: full-length sequence of the U2 small nuclear RNA-associated B″ antigen. Proc. Natl. Acad. Sci. USA 84 (1987) 2421-2425.

Heinrichs, V., Bach, M., Winkelmann, G. and Lührmann, R.: U1-specific protein C needed for efficient complex formation of U1 snRNP with a 5' splice site. Science 247 (1990) 69-72.

Kadonaga, J.T., Jones, K.A. and Tjian, R.: Promoter-specific activation of RNA polymerase-II transcription by Sp1. Trends Biochem. Sci. 11 (1986) 20-23.

Kramer, A., Keller, W., Appel, B. and Lührmann, R.: The 5' terminus of the RNA moiety of U1 small nuclear ribonucleoprotein particles is required for the splicing of messenger RNA precursors. Cell 38 (1984) 299-307.

Kruijer, W., Van Schaik, F.M.A. and Sussenbach, J.S.: Structure and organization of the gene coding for the DNA binding protein of adenovirus type 5. Nucleic Acids Res. 9 (1981) 4439-4457.

Lührmann, R.: snRNP proteins. In: Birnstiel, M.L. (Ed.), Structure and Function of Major and Minor Small Nuclear Ribonucleoprotein Particles. Springer-Verlag, Berlin, 1988, pp. 71-99.

Mager, W.H.: Control of ribosomal protein gene expression. Biochim. Biophys. Acta 949 (1988) 1-15.

Mancebo, R., Lo, P.C.H. and Mount, S.M.: Structure and expression of the Drosophila melanogaster gene for the U1 small nuclear ribonucleoprotein particle 70K protein. Mol. Cell. Biol. 10 (1990) 2492-2502.

Maniatis, T. and Reed, R.: The role of small nuclear ribonucleoprotein particles in pre-mRNA splicing. Nature 325 (1987) 673-678.

Mount, S.M.: A catalogue of splice junction sequences. Nucleic Acids Res. 10 (1982) 459-472.

Mount, S.M., Pettersson, I., Hinterberger, M., Karmas, A. and Steitz, J.A.: The U1 small nuclear RNA-protein complex selectively binds a 5' splice site in vitro. Cell 33 (1983) 509-518.

Sambrook, J., Fritsch, E.F. and Maniatis, T.: Molecular Cloning. A Laboratory Manual. Second ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989, pp. 7.79-7.87.

Sanger, F., Nicklen, S. and Coulson, A.R.: DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74 (1977) 5463-5467.

Scherly, D., Boelens, W., Van Venrooij, W.J., Dathan, N.A., Hamm, J. and Mattaj, I.W.: Identification of the RNA binding segment of human U1 A protein and definition of its binding site on U1 snRNA. EMBO J. 8 (1989) 4163-4170.

Scherly, D., Boelens, W., Dathan, N.A., Van Venrooij, W.J. and Mattaj, I.W.: Major determinants of the specificity of interaction between small nuclear ribonucleoproteins U1A and U2B″ and their cognate RNAs. Nature 345 (1990) 502-506.

Schonk, D., Van Dijk, P., Riegman, P., Trapman, J., Helm, C., Craig, I., Sillekens, P., Van Venrooij, W., Wimmer, E., Geurts van Kessel, A., Ropers, H.-H. and Wieringa, B.: Assignment of seven genes to distinct intervals on the midportion of human chromosome 19q around the myotonic dystrophy gene region. Cytogenet. Cell Genet. 54 (1990) 15-19.

Sillekens, P.T.G., Habets, W.J., Beijer, R.P. and Van Venrooij, W.J.: cDNA cloning of the human U1 snRNA-associated A protein: extensive homology between U1 and U2 snRNA-specific proteins. EMBO J. 6 (1987) 3841-3848.

Sillekens, P.T.G., Beijer, R.P., Habets, W.J. and Van Venrooij, W.J.: Human U1 snRNP-specific C protein: complete cDNA and protein sequence and identification of a multigene family in mammals. Nucleic Acids Res. 16 (1988) 8307-8321.

Spritz, R.A., Strunk, K., Surowy, C.S., Hoch, S.O., Barton, D.E. and Francke, U.: The human U1-70K snRNP protein: cDNA cloning, chromosomal localization, expression, alternative splicing and RNA-binding. Nucleic Acids Res. 15 (1987) 10373-10391.

Staden, R.: A strategy of DNA sequencing employing computer programs. Nucleic Acids Res. 6 (1979) 2601-2610.

Stanford, D.R., Perry, C.A., Holicky, E.L., Rohleder, A.M. and Wieben, E.D.: The small nuclear ribonucleoprotein E protein gene contains four introns and has upstream similarities to genes for ribosomal proteins. J. Biol. Chem. 263 (1988) 17772-17779.

Steitz, J.A., Black, D.L., Gerke, V., Parker, K.A., Kramer, A., Frendewey, D. and Keller, W.: Functions of the abundant U-snRNPs. In: Birnstiel, M.L. (Ed.), Structure and Function of Major and Minor Small Nuclear Ribonucleoprotein Particles. Springer-Verlag, Berlin, 1988, pp. 115-154.

Swanson, M.S., Nakagawa, T.Y., LeVan, K. and Dreyfuss, G.: Primary structure of human nuclear ribonucleoprotein particle C proteins: conservation of sequence and domain structures in heterogeneous nuclear RNA, mRNA and pre-rRNA-binding proteins. Mol. Cell. Biol. 7 (1987) 1731-1739.

Van der Putten, H., Terwindt, E., Berns, A. and Jaenisch, R.: The integration sites of endogenous and exogenous Moloney murine leukemia virus. Cell 18 (1979) 109-116.

Van Venrooij, W.J. and Sillekens, P.T.G.: Small nuclear RNA associated proteins: autoantigens in connective tissue diseases. Clin. Exp. Rheumatol. 7 (1989) 635-645.

Wieben, E.D. and Pederson, T.: Small nuclear ribonucleoproteins of Drosophila: identification of U1 RNA-associated proteins and their behaviour during heat shock. Mol. Cell. Biol. 2 (1982) 914-920.

Wieben, E.D., Rohleder, A.M., Nenninger, J.M. and Pederson, T.: cDNA cloning of a human autoimmune nuclear ribonucleoprotein antigen. Proc. Natl. Acad. Sci. USA 82 (1985) 7914-7918.

Wooley, J.C., Cone, R.D., Tartof, D. and Chung, S.-Y.: Small nuclear ribonucleoprotein complexes of Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 79 (1982) 6762-6766.

Zhuang, Y. and Weiner, A.M.: A compensatory base change in U1 snRNA suppresses a 5' splice site mutation. Cell 46 (1986) 827-835.

Zhuang, Y., Leung, H. and Weiner, A.M.: The natural 5' splice site of simian virus 40 large T antigen can be improved by increasing the base complementarity to U1 RNA. Mol. Cell. Biol. 7 (1987) 3018-3020.
