J. Biochem. 107, 73-76 (1990)

Complete Primary Structure of the Human Estrogen-Responsive Gene (pS2) Product1

Kazutoshi Mori, Ritsuko Fujii, Noriyoshi Kida, Haruo Takahashi, Shoichi Ohkubo, Masahiko Fujino, Mitsuhiro Ohta, and Kyozo Hayashi2

Department of Pharmaceutics, Gifu Pharmaceutical University, Gifu, Gifu 502; Sanwa Kagaku Kenkyusho Co., Ltd., Inabe-gun, Mie 511-04; Tsukuba Research Laboratories, Takeda Chemical Industries Co., Ltd., Tsukuba, Ibaragi 305; and Department of Biochemistry, Clinical Research Center, Utano National Hospital, Ukyo-ku, Kyoto, Kyoto 616

Received for publication, July 10, 1989

pS2 is a human gene whose transcription is directly triggered by estrogen in human breast cancer cells (MCF-7). We describe here the complete primary structure of the pS2 gene product. The pS2 protein purified from conditioned medium of MCF-7 cells was S-pyridylethylated and digested with TPCK-trypsin. Five major fragments were obtained by reverse-phase HPLC. Amino acid sequence analysis of these tryptic peptides established that the pS2 protein comprises a 60-amino acid polypeptide. The sequence of the pS2 protein was completely identical to that deduced from the nucleotide sequence of the pS2 gene, if the signal polypeptide is excluded. Furthermore, two cDNA clones encoding an 84-amino acid precursor pS2 protein were isolated from a cDNA library constructed with RNA from MCF-7 cells cultured in the presence of estrogen. The nucleotide sequence of one clone (pS2B1) was identical to that of the pS2 cDNA previously reported except for one nucleotide in the 3' untranslated region. The other clone (pS2B2) was longer by 73 nucleotides, at the 5' end, than pS2B1. The additional 73 nucleotides are located just upstream of the sequence of pS2B1 in the structure of the pS2 gene, indicating that the pS2 gene has two start sites for transcription. However, an mRNA molecule corresponding to pS2B1 but not to pS2B2 was detected in the cells on RNA blot hybridization analysis, indicating that one transcriptional start site is mainly used.

pS2 cDNA was originally isolated by Chambon and coworkers (1) from human breast cancer cells (MCF-7) derived from a patient with metastatic breast cancer (2). pS2 mRNA of about 600 nucleotides rapidly accumulates in response to estrogen treatment of the cells (1, 3), which contain estrogen receptors and respond to estrogen with increased rates of DNA, RNA, and protein synthesis (4, 5). The single-copy pS2 gene contains a single open reading frame encoding an 84-amino acid polypeptide (6). Transcription of the gene is directly triggered by estrogen (7) and the 5' flanking sequence of the gene possesses the properties of an estrogen-inducible promoter (8). Thus, expression of the pS2 gene in MCF-7 cells is considered to be an excellent model system for understanding the mechanism of regulation of hormone-induced gene expression. However, induction of pS2 mRNA and the pS2 protein by estrogen does not appear to be responsible for the estrogen-induced growth of the cells (9, 10). At present, the biological significance of the pS2 protein remains unclear. We recently identified a protein synthesized and secreted by MCF-7 cells as the pS2 gene product by means of N-terminal 36-amino acid sequence analysis (11). This paper reports extensive characterization of the pS2 product, including the complete amino acid sequence of the pS2 protein purified from the medium and the isolation of two cDNA clones encoding the pS2 protein from cells cultured in the presence of estrogen.

1 This work was supported in part by Grants-in-Aid for the Encouragement of Young Scientists (K. Mori) and for Developmental Scientific Research (K. Hayashi) from the Ministry of Education, Science and Culture of Japan.
2 To whom correspondence should be addressed.
Abbreviations: TPCK-trypsin, L-1-tosylamide-2-phenylethyl chloromethyl ketone-treated trypsin from bovine pancreas; FCS, fetal calf serum.
Vol. 107, No. 1, 1990

MATERIALS AND METHODS

Materials—TPCK-trypsin was obtained from Sigma; insulin from Novo; β-estradiol from Tokyo Kasei (Tokyo); and restriction enzymes from Toyobo (Osaka) and Takara Shuzo (Kyoto). Steroid hormone-depleted FCS was prepared by mixing FCS (Bocknek) with dextran-coated charcoal as described (12). The pS2 protein was purified from serum-free medium conditioned with MCF-7 cells as described (11).

Isolation of Tryptic Peptides of the pS2 Protein—Seventy micrograms of the purified pS2 protein was reduced for 5 min at 100°C with 1 µl of 2-mercaptoethanol in 500 µl of 0.2 M N-ethylmorpholine/acetate buffer (pH 8.0) and then S-β-4-pyridylethylated for 90 min at room temperature, with the addition of 3 µl of 4-vinylpyridine, according to the method of Fullmer (13). The reaction mixture was directly applied to a column of µBondapak C18 (10 µm pore size, 3.9 × 300 mm; Waters) at a flow rate of 1 ml/min at room temperature. The S-pyridylethylated pS2 protein was eluted from the column with a linear 50-min gradient of acetonitrile (0-50%) containing 0.1% trifluoroacetic acid. The protein was evaporated to dryness, dissolved in 500 µl of 0.1 M Tris/HCl (pH 8.0), and then digested for 5 h at 37°C with 1 µg of TPCK-trypsin. The resulting tryptic peptides were separated by the same procedure as described above.

Amino Acid Sequence Analysis—The tryptic peptides of the pS2 protein were automatically sequenced with an Applied Biosystem Sequencer (model 470A) equipped with an on-line Phenylthiohydantoin Analyzer (model 129A). Polybrene was used as a carrier.

cDNA Library Construction—MCF-7 cells were cultured in Dulbecco's modified Eagle's minimum essential medium containing 10% untreated FCS, insulin (2 µM), and β-estradiol (10 nM). Total RNA was extracted by the guanidinium thiocyanate method (14) and poly(A)+ RNA was prepared by oligo(dT)-cellulose (Pharmacia) column chromatography (15). The MCF-7 cell cDNA library was constructed, using 5.2 µg of vector primer DNA (Pharmacia) and 30 µg of poly(A)+ RNA, according to the method of Okayama and Berg (16).

Sequences of the Oligodeoxyribonucleotide Probes—Four probes were synthesized by a modification of the triester method (17), labeled at the 5'-terminal end with [γ-32P]ATP (Amersham, 6,000 Ci/mmol) and T4 polynucleotide kinase (Toyobo), and then purified by passage through a Sephadex G-50 (Pharmacia) column. Two 25-mer probes (I and II) were used for colony screening. The sequence of probe I is identical to nucleotides 73-97 and that of probe II is complementary to nucleotides 225-249 in the sequence of pS2 cDNA (Figs. 3 and 4). Two 30-mer probes (A and B) were used for RNA blot hybridization. The sequences of probes A and B were complementary to nucleotides 194-223 and -102 to -73 in the sequence of pS2 cDNA (Figs. 3 and 4), respectively.

Colony Screening—Escherichia coli HB101 was transformed with the recombinant DNA by the calcium-shock procedure (18).
Colony hybridization was performed by the method of Hanahan and Meselson (19). The filters were prehybridized at 50°C for 3 h in 3×SSC (1× = 0.15 M NaCl/0.015 M sodium citrate, pH 7.0) containing 10× Denhardt's solution (1× = 0.02% polyvinylpyrrolidone/0.02% Ficoll/0.02% bovine serum albumin), 0.1% SDS, and 50 µg/ml of denatured salmon sperm DNA. Hybridization was carried out using end-labeled probe I or probe II at 50°C for 18 h in 4×SSC containing 10× Denhardt's solution, 0.1% SDS, and 25 µg/ml of denatured salmon sperm DNA. The filters were washed at 50°C with 4×SSC containing 0.1% SDS. DNA sequencing was performed by the method of Sanger et al. (20).

RNA Blot Hybridization Analysis—MCF-7 cells were grown in Dulbecco's modified Eagle's minimum essential medium containing 10% untreated FCS. When the cells reached the subconfluent state, the medium was replaced with medium containing 10% steroid hormone-depleted FCS. After 48-h incubation, the cells were cultured for 48 h in fresh medium containing 10% steroid hormone-depleted FCS in the presence or absence of β-estradiol (10 nM). Twenty micrograms of total RNA isolated from the cells was denatured in the presence of 50% formamide containing 2.2 M formaldehyde, 20 mM morpholinopropane sulfonic acid (pH 7.0), 5 mM sodium acetate, and 1 mM EDTA, electrophoresed in a 1.5% agarose gel, transferred to a nitrocellulose filter (Schleicher & Schuell), and then baked in a vacuum oven for 5 h at 75°C according to the standard method (21). Prehybridization, hybridization with end-labeled probe A or probe B, and washing were carried out as described above.

RESULTS AND DISCUSSION

Complete Primary Structure of the pS2 Protein Purified from Conditioned Medium of MCF-7 Cells—Digestion of the S-pyridylethylated pS2 protein with TPCK-trypsin yielded five major fragments (designated as T1-T5), as judged on reverse-phase HPLC (Fig. 1). The amino acid sequences of these tryptic peptides were determined automatically with a gas-phase protein sequencer. Cys was identified as S-β-(4-pyridylethyl)cysteine. T1, T2, T3, T4, and T5 were composed of 12, 9, 18, 16, and 21 amino acids, respectively. The approximate ratio of the yields of N-terminal phenylthiohydantoin-amino acids of the five tryptic peptides was T1:T2:T3:T4:T5 = 9:8:2:6:7. The N-terminal 36-amino acid sequence of the intact pS2 protein, which we have already reported (11), includes the whole sequences of T1, T3, and T4, and a part of the sequence of T2. From these results, the five tryptic peptides can be aligned as shown in Fig. 2. T4 overlapped T3; the presence of two amino acids (Glu13 and Gln15) on both sides probably caused the incomplete digestion of the S-pyridylethylated pS2 protein at Arg14 with TPCK-trypsin (22). Correspondingly, if the yields of the N-terminal phenylthiohydantoin-amino acids of T3 and T4 were added, the sum was comparable to the yields of the other tryptic peptides.

Isolation and Characterization of Two cDNA Clones Encoding the pS2 Protein—cDNA clones encoding the pS2 protein were isolated from a cDNA library constructed with RNA from MCF-7 cells cultured in the presence of β-estradiol. Six clones hybridizing with both probes I and II were obtained on screening approximately 1 × 10³ ampicillin-resistant colonies.
Thus the content of the mRNA encoding the pS2 protein in the cells is considered to be about 0.6%, this value being comparable to that (0.8%) reported by Prud'homme et al. (23). Digestion of these clones with PstI and PvuII revealed that one clone, No. 51 (designated as pS2B1), contained a cDNA insert of approximately the same length as the pS2 cDNA isolated by Jakowlew et al. (3), and that another clone, No. 52 (designated as pS2B2), contained a cDNA insert longer by about 70 bp, at the 5' end, than pS2B1. The nucleotide sequences of pS2B1 and pS2B2 were then determined, as schematically shown in Fig. 3. As shown in Fig. 4A, pS2B1 contained a cDNA insert of 490 bp excluding the poly(dA)-poly(dT) tract and the poly(dG)-poly(dC) tail, while pS2B2 contained one of 563 bp. Both inserts have a single open reading frame of 252 nucleotides encoding an 84-amino acid polypeptide. The amino acid sequence of the pS2 protein purified from the medium (Fig. 2) was completely identical to the deduced sequence over the region extending from residue 25 to 84. Thus the pS2 gene product is synthesized as a precursor protein composed of 84 amino acids and secreted as a mature protein composed of 60 amino acids after cleavage of the signal polypeptide.
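The figures quoted in this paragraph can be cross-checked with simple arithmetic. The following Python sketch is only an illustrative consistency check: the variable names are ours, and the colony count is the approximate value (1 × 10³) given in the text.

```python
# Consistency check of the numbers reported for the cDNA clones.
# Variable names are illustrative; all values are taken from the text.

positive_clones = 6          # clones hybridizing with both probes I and II
colonies_screened = 1_000    # approximately 1 x 10^3 ampicillin-resistant colonies
clone_frequency_percent = 100 * positive_clones / colonies_screened

insert_ps2b1_bp = 490        # pS2B1 insert, homopolymer tracts excluded
extra_5prime_nt = 73         # additional 5' nucleotides found in pS2B2
insert_ps2b2_bp = insert_ps2b1_bp + extra_5prime_nt

precursor_residues = 84      # length of the deduced precursor
mature_residues = 60         # length found by protein sequencing
orf_nucleotides = 3 * precursor_residues   # coding nucleotides, stop codon not counted
signal_residues = precursor_residues - mature_residues
first_mature_residue = signal_residues + 1  # position within the precursor

print(f"clone frequency      : {clone_frequency_percent:.1f}%")
print(f"pS2B2 insert         : {insert_ps2b2_bp} bp")
print(f"open reading frame   : {orf_nucleotides} nucleotides")
print(f"signal polypeptide   : {signal_residues} residues")
print(f"mature protein starts: residue {first_mature_residue}")
```

The computed values agree with those stated above: a clone frequency of 0.6%, a 563-bp insert for pS2B2, a 252-nucleotide open reading frame, and a mature protein beginning at residue 25 of the precursor.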


The nucleotide sequence of pS2B1 was identical to that of the pS2 cDNA isolated by Jakowlew et al. (3) except for one nucleotide in the 3' untranslated region (double-underlined in Fig. 4A). Nucleotide 289 (C) was reported to be G in the sequence of pS2 cDNA (3) but to be C in the sequence of the pS2 gene (6). On the other hand, pS2B2 contained 73 nucleotides at the 5' end in addition to a sequence completely identical to that of pS2B1. As shown in Fig. 4B, the additional 73 nucleotides are located just upstream of the sequence of pS2B1 in the structure of the pS2 gene, indicating that the pS2 gene has two start sites for transcription. The sequences TATAAAA (nucleotides -69 to -63) and GGCAAAT (nucleotides -112 to -106) are the TATA box and CAAT-like box for pS2B1, respectively, as reported by Jeltsch et al. (6). In the case of pS2B2, the sequences GATAAAA (nucleotides -162 to -156) and GCAAAT (nucleotides -206 to -201) are probably the TATA box and the CAAT-like box, respectively. Since a TATA box and a CAAT box are usually located about 25 bp and 70-80 bp upstream from the start site for transcription, respectively (24), dozens of nucleotides may be missing at the 5' end of pS2B2 and the actual cap site for an mRNA corresponding to pS2B2 may be one of the adenines located before or after nucleotide -130.

The levels of mRNAs corresponding to pS2B1 and pS2B2 were then examined by RNA blot hybridization analysis using two end-labeled oligodeoxyribonucleotide probes. Probe A is complementary to both pS2B1 and pS2B2, while probe B is only complementary to pS2B2 (Fig. 3). Total RNA was extracted from MCF-7 cells cultured in steroid hormone-depleted medium in the presence or absence of β-estradiol (10 nM) to examine the estrogen-inducibility of the mRNA. As shown in Fig. 5, a single band of RNA was detected with probe A for cells cultured in the absence of β-estradiol (lane 1) and its density was dramatically increased by treatment of the cells with β-estradiol (lane 2). The band of RNA was detected at the same migration position as that hybridizing with the nick-translated pS2 cDNA. In contrast, no bands were detected with probe B, irrespective of the absence (lane 3) or presence (lane 4) of β-estradiol in the medium, indicating that the content of an mRNA corresponding to pS2B2 was below the level of detection in the assay. Thus, although the pS2 gene has two start sites for transcription, one site is mainly used. Several eukaryotic genes, for instance, the α-amylase gene (25) and the alcohol dehydrogenase gene (26), have been shown to possess two start sites for transcription. Two

Fig. 1. Reverse-phase HPLC of the tryptic peptides of the pS2 protein. The S-pyridylethylated pS2 protein was digested with TPCK-trypsin and then applied to a column of µBondapak C18 as described under "MATERIALS AND METHODS." The column eluate was monitored at 210 nm (solid line) and the broken line shows the percentage of acetonitrile in the elution medium. T1-T5 denote the five major fragments obtained, in order of elution. [Chromatogram; horizontal axis: Time (min).]

Fig. 3. Restriction map and sequencing procedure for pS2B1 and pS2B2. The open box and the hatched box denote the region encoding the mature pS2 protein and the signal peptide, respectively. The straight lines at the sides of these boxes show the 5' and 3' untranslated regions. The broken line indicates the nucleotide sequence present in pS2B2 but not in pS2B1. The wavy lines, 5' and 3', are the poly(dG)-poly(dC) tail and the poly(dA)-poly(dT) tract, respectively. The left and right closed boxes are the locations of probes I and II used for colony screening, respectively. The right and left dotted boxes are the locations of probes A and B used for RNA blot hybridization, respectively. The relevant restriction sites are shown together with the nucleotide numbers derived from Fig. 4 in parentheses. The length and direction of the sequenced restriction fragments are shown by the arrows. [Map; scale bar: 100 bp.]

  1  Glu-Ala-Gln-Thr-Glu-Thr-Cys-Thr-Val-Ala-Pro-Arg-Glu-Arg-Gln-
 16  Asn-Cys-Gly-Phe-Pro-Gly-Val-Thr-Pro-Ser-Gln-Cys-Ala-Asn-Lys-
 31  Gly-Cys-Cys-Phe-Asp-Asp-Thr-Val-Arg-Gly-Val-Pro-Trp-Cys-Phe-
 46  Tyr-Pro-Asn-Thr-Ile-Asp-Val-Pro-Pro-Glu-Glu-Glu-Cys-Glu-Phe

Fig. 2. Complete amino acid sequence of the pS2 protein. The five solid lines between two arrowheads show the sequences of the tryptic peptides (T1-T5) identified by Edman degradation.
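The alignment of the tryptic peptides described in the text can be illustrated with an in-silico digestion of the 60-residue sequence of Fig. 2. The Python sketch below is not the procedure used in this work; it assumes a simplified cleavage rule (cleavage on the C-terminal side of Lys or Arg unless the next residue is Pro), uses a one-letter transcription of the Fig. 2 sequence, and the function name `tryptic_digest` is ours. The assignment of the 18-residue and 16-residue peptides to residues 13-30 and 15-30 is inferred from the peptide lengths and the incomplete cleavage at Arg14 reported in the text.

```python
# In-silico tryptic digestion of the mature pS2 protein (60 residues, Fig. 2).
# ASSUMPTION of this sketch: trypsin cleaves after Lys (K) or Arg (R)
# unless the following residue is Pro (P).

MATURE_PS2 = "EAQTETCTVAPRERQNCGFPGVTPSQCANKGCCFDDTVRGVPWCFYPNTIDVPPEEECEF"


def tryptic_digest(sequence: str) -> list[tuple[int, int, str]]:
    """Return (first, last, peptide) tuples using 1-based inclusive positions."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        cleavable = residue in "KR" and not is_last and sequence[i + 1] != "P"
        if cleavable or is_last:
            peptides.append((start + 1, i + 1, sequence[start:i + 1]))
            start = i + 1
    return peptides


complete_digest = tryptic_digest(MATURE_PS2)
for first, last, peptide in complete_digest:
    print(f"{first:2d}-{last:2d} ({len(peptide):2d} aa) {peptide}")

# Incomplete cleavage at Arg14 leaves Glu13-Arg14 attached to residues 15-30.
missed_cleavage_peptide = MATURE_PS2[12:30]
print(f"13-30 ({len(missed_cleavage_peptide)} aa) {missed_cleavage_peptide}")

predicted_lengths = sorted(len(p) for _, _, p in complete_digest if len(p) > 2)
predicted_lengths = sorted(predicted_lengths + [len(missed_cleavage_peptide)])
reported_lengths = sorted([12, 9, 18, 16, 21])  # T1-T5 as given in the text
```

Under these assumptions a complete digest gives peptides 1-12, 13-14, 15-30, 31-39, and 40-60. Setting aside the dipeptide Glu-Arg and adding the missed-cleavage peptide 13-30 gives lengths of 9, 12, 16, 18, and 21 residues, the same set as reported for T1-T5.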

[Fig. 4, panels A and B: nucleotide sequences of pS2B1 and pS2B2 with the deduced amino acid sequence; the sequence text is not legible in this copy.]

[Fig. 5: RNA blot; the origin and the positions of 28S RNA and 18S RNA are marked beside the lanes.]

Fig. 4. Nucleotide sequences of pS2B1 and pS2B2, and the deduced amino acid sequence. (A) Nucleotides are numbered in the 5'- to 3'-direction, beginning with the first adenine of the initiating methionine and ending with the last nucleotide before the poly(dA) tract. Nucleotides in the 5' untranslated region have negative numbers. The sequence of the 73 nucleotides present in pS2B2 but not in pS2B1 is broken-underlined. The deduced amino acid sequence is shown below the nucleotide sequence, with the numbering starting from the initiating methionine. The signal peptide is underlined. One nucleotide, 289, which differs from that in the sequence of pS2 cDNA previously reported (3), is double-underlined. (B) The nucleotide sequence present only in pS2B2 is indicated by the open box in the sequence of the pS2 gene (6). The numbering of nucleotides is the same as in (A). The CAAT-like boxes and TATA boxes are underlined and broken-underlined, respectively.
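The numbering scheme of Fig. 4 (the adenine of the initiating ATG is +1, the 5' untranslated region is numbered negatively, and there is no nucleotide 0) can be made concrete with a short Python sketch. The helper `span_length` and all constant names are ours. The 5' end of pS2B1 at nucleotide -40 is an assumption taken from the labelling of the figure and is not stated in the text; the probe coordinates and insert lengths are those given under "MATERIALS AND METHODS" and in the Results.

```python
# Coordinate bookkeeping for the Fig. 4 numbering scheme:
# +1 is the A of the initiating ATG; 5' untranslated nucleotides are
# numbered -1, -2, ...; there is no nucleotide 0.


def span_length(first: int, last: int) -> int:
    """Number of nucleotides from `first` to `last`, inclusive."""
    if first == 0 or last == 0 or first > last:
        raise ValueError("positions must be non-zero with first <= last")
    length = last - first + 1
    if first < 0 < last:  # the span crosses the -1/+1 boundary
        length -= 1
    return length


# Probe coordinates as given in the text.
PROBES = {"I": (73, 97), "II": (225, 249), "A": (194, 223), "B": (-102, -73)}
probe_lengths = {name: span_length(*span) for name, span in PROBES.items()}
print(probe_lengths)

# ASSUMPTION (from the figure labelling, not stated in the text):
# the pS2B1 insert starts at nucleotide -40.
PS2B1_FIRST = -40
PS2B1_INSERT_BP = 490
EXTRA_5PRIME_NT = 73

ps2b1_last = PS2B1_INSERT_BP + PS2B1_FIRST    # last nucleotide before poly(dA)
ps2b2_first = PS2B1_FIRST - EXTRA_5PRIME_NT   # 5' end of the pS2B2 insert
ps2b2_only = (ps2b2_first, PS2B1_FIRST - 1)   # region absent from pS2B1

probe_b_first, probe_b_last = PROBES["B"]
probe_b_specific = ps2b2_only[0] <= probe_b_first and probe_b_last <= ps2b2_only[1]

print(f"pS2B1 spans {PS2B1_FIRST} to {ps2b1_last}")
print(f"pS2B2 spans {ps2b2_first} to {ps2b1_last}")
print(f"probe B lies entirely in the pS2B2-only region: {probe_b_specific}")
```

With the assumed 5' end, the two 25-mers and two 30-mers have the stated lengths, the inserts span 490 and 563 nucleotides, and probe B falls wholly within the 73 nucleotides unique to pS2B2, which is consistent with its use to detect an mRNA corresponding to pS2B2 only.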

mRNAs encoding an identical mouse α-amylase but containing different 5' terminal sequences are tissue specific. Thus, the
