DNA AND CELL BIOLOGY Volume 10, Number 10, 1991 Mary Ann Liebert, Inc., Publishers Pp. 757-769

Identification of a Second Human Subtilisin-Like Protease Gene in the fes/fps Region of Chromosome 15

MICHAEL C. KIEFER, JEFFREY E. TUCKER, RICHARD JOH, KATHERINE E. LANDSBERG, DAVID SALTMAN,* and PHILIP J. BARR

ABSTRACT

A cDNA encoding a novel human subtilisin-like protease was identified by a polymerase chain reaction (PCR) methodology. PCR primers were designed to be specific for the subfamily of eukaryotic subtilisin-like proteases with specificity for paired basic amino acid residue processing motifs. The gene encoding this protease, designated PACE4, also encoded a smaller subtilisin-related polypeptide derived by alternate mRNA splicing. The deduced PACE4 protein sequence contained a number of features not present in other family members, including an extended signal peptide region and a relatively large carboxyl-terminal cysteine-rich region with no obvious membrane anchor sequence. As with the fur gene product, the tissue distribution of PACE4 was widespread, with comparatively higher levels in the liver. An additional relationship to the fur gene product was shown by chromosomal localization studies. The close proximity of the fur and PACE4 genes on chromosome 15 suggests that these genes probably evolved from a common ancestor by gene duplication.

INTRODUCTION

The human fur gene encodes a subtilisin-like protease alternately named furin (Roebroek et al., 1986) or PACE (Wise et al., 1990; Barr et al., 1991) and was discovered by its extremely close proximity to the fes/fps proto-oncogene. The specificity of the encoded enzyme for endoproteolytic cleavage at paired basic amino acid residue motifs (dibasic sites) was predicted by its homology to the yeast dibasic processing endoprotease KEX2 (Fuller et al., 1989). Subsequent studies showed that furin/PACE indeed has the capacity and specificity for such proteolytic processing. It can accurately and efficiently cleave substrates such as pro-von Willebrand factor (van de Ven et al., 1990; Wise et al., 1990), pro-β-nerve growth factor (Bresnahan et al., 1990), proalbumin, complement pro-C3 (Misumi et al., 1991), and a prorenin mutant (Hosaka et al., 1991).

Concurrent with these studies, two additional mammalian subtilisin-like proteases with cleavage specificities for dibasic sites were identified by polymerase chain reaction (PCR) methodologies (Seidah et al., 1990; Smeekens and Steiner, 1990; Smeekens et al., 1991). These two endoproteases, designated PC2 and PC1 or PC3, are active in the processing of neuroendocrine precursors such as pro-opiomelanocortin (POMC) and, in contrast to furin/PACE, processing occurs via the regulated secretion pathway (Körner et al., 1991; Seidah et al., 1990; Thomas et al., 1991). Thus, although exact roles in vivo remain to be determined for each of these proteases, the discovery of this class of dibasic processing enzymes has provided a solution to a long-standing enigma of cell biology.

The dibasic processing, subtilisin-like proteases appear to form a discrete branch of the subtilisin family, with distinctive motifs surrounding their catalytic Asp, His, and Ser residues (Barr, 1991; Barr et al., 1991). This is best illustrated by comparison with human tripeptidyl peptidase II (TPP II), a subtilisin-like exopeptidase with no apparent specificity for dibasic sites (Barr, 1991; Tomkinson and Jonsson, 1991). The comparison indicates that PCR primers based on DNA sequences encoding the catalytic Asp and His residue motifs should invariably prime the amplification of DNA sequences encoding subtilisin-like proteases with cleavage specificity for paired basic amino acid residues. Accordingly, we have extended the observations of Smeekens and Steiner (1990) by using such PCR primers for the isolation of a novel subtilisin-like protease cDNA.

Chiron Corporation, Emeryville, CA 94608. *Genelabs Incorporated, Redwood City, CA.


We show that the gene that encodes this protease, designated PACE4, can give rise to an additional, smaller, subtilisin-related protein by alternate mRNA splicing, and is located in the same region of chromosome 15 as the fur and fes/fps genes.

MATERIALS AND METHODS

Cell culture and RNA isolation

HepG2 human hepatoma, 293 human kidney, and U937 human monocyte-like cells were obtained from the ATCC and maintained in Dulbecco's modified Eagle's medium (DMEM) containing 10% fetal calf serum, 100 units/ml penicillin, and 100 µg/ml streptomycin. UC729-6 human lymphoblastoid cells were obtained from the ATCC and maintained in RPMI-1640 with 2 mM glutamine, 10% fetal calf serum, 100 units/ml penicillin, and 100 µg/ml streptomycin. Hamster insulinoma cells (HIT-15) were a gift from Dr. Y. McHugh (Chiron Corp.). Human umbilical vein endothelial cells (HUVEC) were a gift from Dr. D. Gospodarowicz (University of California, San Francisco). Human osteosarcoma tissue was obtained from Dr. Marshall Urist (University of California, Los Angeles). RNA was isolated by the guanidinium thiocyanate method (Chirgwin et al., 1979) with modifications (Freeman et al., 1983). Poly(A)+ RNA was purified by fractionation over oligo(dT)-cellulose (Aviv and Leder, 1972). Human liver poly(A)+ RNA was obtained from Clontech (Palo Alto, CA).

Construction and screening of cDNA and genomic DNA libraries

A 293 cDNA library was constructed in λZAPII as described by Zapf et al. (1990). A total of 6 × 10^7 independent recombinant clones were obtained. The HepG2 cDNA library (Zapf et al., 1990) and a human osteosarcoma tissue (Ost4) cDNA library (Kiefer et al., 1991b) have been described previously. Approximately 300,000 recombinant phage from the Ost4 library were screened as described (Kiefer et al., 1991b) using the two PACE4 oligonucleotide probes. The HepG2 and 293 cDNA libraries were plated as above and screened with a 1.0-kb Bgl II DNA fragment derived from the 5' end of clone Os-1 from the Ost4 cDNA library. The 293 cDNA library was screened further with a 500-bp Eco RI fragment derived from the 3' end of cDNA clone K-15, from the 293 cell cDNA library, and a synthetic oligonucleotide (PACE2115) derived from a sequence complementary to the 5' end of PACE4 cDNA and including the initiation codon (5'-GGAGGCATAGCGGCGAC-3'). The cDNA probes were labeled as described (Feinberg and Vogelstein, 1984) and were hybridized to filters under conditions described above for the oligonucleotide probes except that the hybridization solution contained 40% formamide. The filters were washed twice for 20 min in 0.1× SSC, 0.1% NaDodSO4 at 65°C.

To obtain the human PACE4 gene, 600,000 colonies from a human genomic DNA library cloned in the cosmid vector pWE15 (Stratagene, San Diego, CA) were plated and screened with a 1.0-kb Bgl II DNA fragment from the 5' end of cDNA clone Os-1 and subsequently with a 1.1-kb Eco RI DNA fragment from the 3' end of cDNA clone L2-1 (Fig. 1A). Two overlapping clones of ~35-40 kb were obtained which hybridized to the 5' probe (PACE4-COS 3) and 3' probe (PACE4-COS 8.1).
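The statement that the screening oligonucleotides are complementary to the PACE4 cDNA can be checked directly. The following is a minimal sketch, not part of the original methods: the probe sequences are those quoted in Materials and Methods (the 24-mer and 22-mer are listed under Oligonucleotide synthesis), while the two sense-strand segments were transcribed by hand from Fig. 2A and should be treated as an assumption of this example. The function names are hypothetical.

```python
# Illustrative check only. The cDNA segments below were transcribed by hand
# from Fig. 2A and are an assumption of this sketch.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Probes as quoted in the text, written 5'->3'.
PROBES = {
    "PACE2115": "GGAGGCATAGCGGCGAC",
    "24-mer": "TAATCATTGCCGTTCACGTCGTAG",
    "22-mer": "GCTGGCATCATATCGTGGAGAT",
}

# Sense strand: end of the 5' untranslated region followed by the first codons.
CDNA_5PRIME = "GAGCCTGTCGCCGCT" + "ATGCCTCCGCGCGCGCCG"

# Sense strand: nucleotides 661-726 of the composite PACE4 cDNA.
CDNA_661_726 = (
    "TCCTACGCCAGCTACGACGTGAACGGCAATGATTATGACCCATCTCCACGATATGATGCC"
    "AGCAAT"
)


def binding_site(probe, sense_strand):
    """Return the 0-based start of the probe's target on the sense strand, or -1."""
    return sense_strand.find(reverse_complement(probe))


if __name__ == "__main__":
    targets = {
        "PACE2115": CDNA_5PRIME,
        "24-mer": CDNA_661_726,
        "22-mer": CDNA_661_726,
    }
    for name, probe in PROBES.items():
        site = binding_site(probe, targets[name])
        print(name, "anneals" if site >= 0 else "does not anneal", "at offset", site)
```

In this sketch the reverse complement of PACE2115 spans the ATG initiation codon, which agrees with the description of that oligonucleotide in the text.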

Oligonucleotide synthesis

Oligonucleotide adaptors, probes, and primers were synthesized as previously described (Kiefer et al., 1991a,b). The two consensus PCR primers used to identify PACE4 were a sense primer consisting of a mixture of 16 23-mers (5'-AGATCTGAATTCGAC/tGAC/tGGXAT-3') and an antisense primer consisting of a mixture of 64 27-mers (5'-AGATCTAAGCTTACACCG/tXGTXCCA/gTG-3'), where X denotes all four deoxynucleotides. The PACE4 probes used to screen a human osteosarcoma tissue (Ost4) cDNA library were a 24-mer (5'-TAATCATTGCCGTTCACGTCGTAG-3') and a 22-mer (5'-GCTGGCATCATATCGTGGAGAT-3') that were derived from the sequence complementary to the PACE4 PCR product (see below).

PCR

PCR was performed as described (Saiki et al., 1985) with modifications (Kiefer et al., 1991a). The reactions were performed using the PCR primers described above at 8 µM and 0.1-1.0 ng/ml cDNA synthesized from RNA extracted from human osteosarcoma tissue, human liver, or kidney 293 cells. PCR products migrating between 120 and 170 bp on a 7% acrylamide gel were excised, subcloned into M13, and sequenced.

Plasmid and cosmid isolation, subcloning, and DNA sequencing

Plasmid DNA was propagated in E. coli strains HB101, D1210, or XL-1 Blue (Stratagene). DNA sequencing was performed by the dideoxy chain-termination method using M13 primers as well as specific internal primers. Ambiguous regions were resolved using 7-deaza-2'-deoxyguanosine-5'-triphosphate (Barr et al., 1986) and Sequenase (U.S. Biochemicals).

Northern blot analysis

Northern blots containing poly(A)+ RNA (2 µg) from human heart, brain, placenta, lung, liver, skeletal muscle, kidney, and pancreas were purchased from Clontech (Palo Alto, CA). Poly(A)+ RNA (2 µg) from various other cell lines and tissues was fractionated on 1.4% agarose gels in the presence of formaldehyde (Lehrach et al., 1977), transferred directly to nitrocellulose, and processed as described (Thomas, 1980). Hybridization and washing conditions were as described above for the screening of cDNA libraries using cDNA probes.
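The mixture sizes quoted for the two consensus PCR primers under Oligonucleotide synthesis (16 23-mers and 64 27-mers) follow from the number of alternative bases at each degenerate position. The sketch below is an illustration rather than part of the original methods. It assumes the notation used in the text, in which a pair such as C/t marks a two-way choice at a single position and X marks any of the four deoxynucleotides; the function names are hypothetical.

```python
from math import prod


def parse_degenerate(primer):
    """Return one string of allowed bases per primer position.

    Notation as in the text: 'C/t' is C or T at one position,
    'X' is any of A, C, G, or T.
    """
    positions = []
    i = 0
    while i < len(primer):
        if primer[i + 1:i + 2] == "/":        # two-way choice such as 'C/t'
            positions.append((primer[i] + primer[i + 2]).upper())
            i += 3
        elif primer[i].upper() == "X":        # fully degenerate position
            positions.append("ACGT")
            i += 1
        else:                                 # fixed base
            positions.append(primer[i].upper())
            i += 1
    return positions


def length_and_degeneracy(primer):
    """Return (primer length in nt, number of distinct sequences in the mixture)."""
    positions = parse_degenerate(primer)
    return len(positions), prod(len(p) for p in positions)


SENSE = "AGATCTGAATTCGAC/tGAC/tGGXAT"
ANTISENSE = "AGATCTAAGCTTACACCG/tXGTXCCA/gTG"

if __name__ == "__main__":
    print("sense:", length_and_degeneracy(SENSE))          # (23, 16)
    print("antisense:", length_and_degeneracy(ANTISENSE))  # (27, 64)
```

The first 12 nucleotides of each primer are fixed, so all of the degeneracy lies in the 3' portion that is meant to anneal to the template.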

FIG. 1. Restriction maps, cDNA clones, and composite diagrams of PACE4 (A) and PACE4.1 (B). Composite diagrams display coding regions and 5' and 3' untranslated regions. The amino-terminal signal sequences are represented by shaded boxes, and the remainder of the encoded proteins by hatched boxes. The unique carboxyl-terminal exon of PACE4.1 is represented by an open box. Individual clones (Os-1, L2-1, and L2-10 for PACE4; K-15 and K-1.1 for PACE4.1) are shown above their respective composite diagram. Restriction endonucleases are abbreviated as follows: B, Bcl I; G, Bgl II; R, Eco RI; Sm, Sma I; Sp, Sph I. Each clone contains synthetic DNA-derived Eco RI and Bgl II sites at their termini.

Chromosomal localization of the PACE4 gene by in situ hybridization and detection

Hybridization was performed as described (Pinkel et al., 1988) with some modifications. First, 10 µl of hybridization mixture containing 50% formamide, 10% dextran sulfate, 2× SSC, 1 µg of salmon sperm DNA, 0.5 µg of sonicated human DNA, and 0.1 µg of biotin-labeled probe were denatured by heating to 70°C for 10 min, and then cooled on ice. The hybridization mixture was incubated for 20 min at 37°C before being applied to denatured slides. Hybridization was carried out in a moist chamber at 37°C for 16 hr. Slides were washed three times for 2 min in 50% formamide, 2× SSC at 45°C and once in 2× SSC at room temperature. Detection of biotinylated probes was performed with 5 µg/ml of fluoresceinated avidin (Vector Laboratories Inc., Burlingame, CA), followed by three washes in 4× SSC, 0.1% Tween.

Slides hybridized to probes labeled with biotin and digoxigenin were processed as follows: (i) after the post-hybridization washes, the slides were incubated with 2.5 µg/ml avidin-Texas Red (Vector) in 4× SSC, 5% nonfat dry milk, and 0.1% Tween for 30 min at room temperature; (ii) the slides were washed three times for 5 min in 4× SSC, 0.1% Tween at room temperature; (iii) 30 min incubation in 15 µg/ml sheep anti-digoxigenin Fab fragments (Boehringer Mannheim) and 5 µg/ml biotinylated goat anti-avidin (Vector); (iv) three washes in 4× SSC, 0.1% Tween; (v) 30 min incubation in 1:50 dilution of fluorescein isothiocyanate (FITC)-conjugated rabbit anti-sheep IgG (Vector) and 2.5 µg/ml avidin-Texas Red in 4× SSC, 5% milk, and 0.1% Tween; (vi) three 5-min washes in 4× SSC, 0.1% Tween. Slides were mounted in anti-fade (p-phenylenediamine dihydrochloride (Sigma) adjusted to pH 8 with bicarbonate) containing 0.01-0.1 µg/ml of propidium iodide and 0.1 µg/ml of 4',6-diamidino-2-phenylindole dihydrochloride.

RESULTS

Molecular cloning of PACE4 and 4.1 cDNAs and the PACE4 gene

To identify and clone new members of the human subtilisin-like protease gene family, we designed degenerate PCR primers corresponding to stretches of amino acids in the highly conserved catalytic domains of the yeast KEX2 protein, human PACE, human PC2, and mouse PC1/PC3. These included the amino acid sequences containing the active-site Asp (DDGI) and the active-site His (HGTRC). After PCR amplification of cDNA reverse-transcribed from human liver tissue, human kidney 293 cell, and human osteosarcoma tissue poly(A)+ RNA, the expected products (~140 bp) were gel-purified, pooled, and subcloned into M13 mp18. Sequence analysis of 10 PCR products revealed one sequence that was predicted to encode a stretch of 37 amino acids displaying 30-60% sequence identity to the yeast KEX2 protein, human furin/PACE, human PC2, and mouse PC1/PC3.

In an attempt to isolate a full-length cDNA encoding the new human subtilisin-like protease, an osteosarcoma tissue (Ost4) cDNA library was screened with an oligonucleotide probe derived from the sequence of the PCR product. Among 300,000 recombinant plaques screened, one clone (Os-1) hybridized to the probe and contained a ~3.6-kb cDNA insert (Fig. 1A). Comparison of the deduced amino acid sequence of this clone with the sequence of furin/PACE revealed that it was missing approximately 200 amino acids from the amino terminus. Subsequently, 300,000 recombinant plaques from a HepG2 cDNA library were probed with a 1.0-kb Bgl II DNA fragment derived from the 5' end of the Os-1 cDNA clone. This resulted in the isolation of two approximately full-length cDNA clones with inserts of ~4.4 kb (clone L2-1) and ~4.3 kb (clone L2-10).

Preliminary Northern blot analysis of several tissues and cell lines using the 1.0-kb Os-1 DNA probe identified a 4.4-kb mRNA transcript in most tissues (described below). In addition, the kidney cell line 293 contained an mRNA of approximately 2.0 kb. The size of this transcript suggested that an alternative form of the PACE4 protein existed, since the mRNA size required to encode PACE4 is 2.9 kb (Fig. 1). To isolate this PACE4 variant cDNA, termed PACE4.1, a 293 kidney cell cDNA library was constructed and 300,000 recombinant phage were screened with the 1.0-kb Os-1 cDNA probe. Several positive clones were identified, including one that contained a cDNA insert of 1.7 kb (clone K-15). Preliminary DNA sequence analysis revealed that K-15 was identical to PACE4 at the 5' end but lacked 270 bp encoding the amino-terminal 90 amino acids. DNA sequence analysis also showed that PACE4.1 had a unique 3' end that was contained almost entirely on a 0.6-kb Eco RI DNA fragment (Fig. 1B). Therefore, this unique PACE4.1 Eco RI DNA fragment and an oligonucleotide probe complementary to the 5' end of PACE4 were used to rescreen the 293 kidney cell cDNA library. One clone (K-1.1) that hybridized to both probes contained a full-length 1.9-kb cDNA insert.
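The sizes quoted in this section can be cross-checked against the positions of the two primer motifs in the deduced PACE4 sequence. The sketch below is illustrative only. It assumes that residues 201-260 are as read from Fig. 2A (one-letter code, transcribed by hand), and that each primer carries a 12-nucleotide non-annealing 5' tail, as inferred from the primer sequences given in Materials and Methods. The text does not state whether the quoted ~140 bp includes these tails, so both figures are computed.

```python
# PACE4 residues 201-260 in one-letter code, transcribed by hand from Fig. 2A
# (an assumption of this sketch).
SEGMENT_FIRST_RESIDUE = 201
SEGMENT = (
    "VTILDDGIERNHPDLAPNYD"
    "SYASYDVNGNDYDPSPRYDA"
    "SNENKHGTRCAGEVAASANN"
)

ASP_MOTIF = "DDGI"    # active-site Asp motif targeted by the sense primer
HIS_MOTIF = "HGTRC"   # active-site His motif targeted by the antisense primer
TAIL_NT = 12          # assumed non-annealing 5' tail on each primer


def motif_start(segment, motif, first_residue):
    """Return the residue number at which `motif` begins in `segment`."""
    index = segment.find(motif)
    if index < 0:
        raise ValueError("motif %s not found" % motif)
    return first_residue + index


asp_start = motif_start(SEGMENT, ASP_MOTIF, SEGMENT_FIRST_RESIDUE)
his_start = motif_start(SEGMENT, HIS_MOTIF, SEGMENT_FIRST_RESIDUE)

# Residues lying strictly between the two motifs.
intervening_residues = his_start - (asp_start + len(ASP_MOTIF))

# Template span from the first codon of the Asp motif to the last codon of the His motif.
template_span_nt = 3 * (his_start + len(HIS_MOTIF) - asp_start)
product_with_tails_nt = template_span_nt + 2 * TAIL_NT

# Coding capacity quoted in the text.
orf_nt = 3 * 969            # 969-residue precursor
k15_missing_nt = 3 * 90     # 90 amino-terminal residues absent from K-15

if __name__ == "__main__":
    print("Asp motif starts at residue", asp_start)
    print("His motif starts at residue", his_start)
    print("intervening residues:", intervening_residues)
    print("template span (nt):", template_span_nt)
    print("product including primer tails (nt):", product_with_tails_nt)
    print("ORF (nt):", orf_nt, "| K-15 deficit (nt):", k15_missing_nt)
```

Under these assumptions the intervening stretch is 37 residues, matching the 37 amino acids reported for the PCR product, and both computed product sizes fall within the 120-170 bp window excised from the gel.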

Structures of PACE4 and 4.1 deduced from cDNA sequences

The translation of PACE4 is probably initiated at the ATG start codon shown in Fig. 2A, since this is the only

CGGGAACGCGC

CGCGGCCGCCTCCTCCTCCCCGGCTCCCGCCCGCGGCGGTGTTGGCGGCGGCGGTGGCGGCGGCGGCGGCGCTTCCCCG GCGCGGAGCGGCTTTAAAAGGCGGCACTCCACCCCCCGGCGCACTCGCAGCTCGGGCGCCGCGCGAGCCTGTCGCCGCT Met Pro Pro Arg Ala Pro Pro Ala Pro Gly Pro Arg Pro Pro Pro Arg Ala Ala Ala Ala ATG CCT CCG CGC GCG CCG CCT GCG CCC GGG CCC CGG CCG CCG CCC CGG GCC GCC GCC GCC

21

61

Thr Asp Thr Ala Ala Gly Ala Gly Gly Ala Gly Gly Ala Gly Gly Ala Gly Gly Pro Gly ACC GAC ACC GCC GCG GGC GCG GGG GGC GCG GGG GGC GCG GGG GGC GCC GGC GGG CCC GGG

41 121

Phe

61 181

Cys

81 241

Gly

101 301

Gly

121 361

Ser Ser Arg Gly Pro His Thr Phe Leu Arg Met Asp Pro Gin Val Lys Trp Leu Gin Gin AGT AGC AGA GGC CCT CAC ACC TTC CTC.AGA ATG GAC CCC CAG GTG AAA TGG CTC CAG CAA

141 421

Gin Glu Val

161 481

Asn Asp Pro lie Trp Ser Asn Met Trp Tyr Leu His Cys Gly Asp Lys Asn Ser Arg Cys AAC GAC CCC ATT TGG TCC AAC ATG TGG TAC CTG CAT TGT GGC GAC AAG AAC AGT CGC TGC

181

Arg

541

Arg Pro

Leu Ala Pro Arg Pro Trp Arg Trp Leu Leu Leu Leu Ala Leu Pro Ala Ala GCG CCG CGT CCC TGG CGC TGG CTG CTG CTG CTG GCG CTG CCT GCC GCC

TTC CGG CCGl C CTC

Ser Ala'Pro Pro Pro Arg Pro Val Tyr Thr Asn His Trp Ala Val Gin Val Leu Gly TGC TCC GCG CCC CCG CCG CGC CCC GTC TAC ACC AAC CAC TGG GCG GTG CAA GTG CTG GGC

Pro Ala Glu Ala Asp Arg Val Ala Ala Ala His Gly Tyr Leu Asn Leu Gly Gin lie GGC CCG GCC GAG GCG GAC CGC GTG GCG GCG GCG CAC GGC TAC CTC AAC TTG GGC CAG ATT Asn Leu Glu Asp Tyr Tyr His Phe Tyr His Ser Lys Thr Phe Lys Arg Ser Thr Leu GGA AAC CTG GAA GAT TAC TAC CAT TTT TAT CAC AGC AAA ACC TTT AAA AGA TCA ACC TTG

:g~Glr Val Arg Ser Asp Pro Gin Ala Leu Tyr Phe Lys Arg Arg Val Lys Arg'Gln CAG GAA GTG AAA CGA AGG GTG AAG AGA CAG GTG CGA AGT GAC CCG CAG GCC CTT TAC TTC

Ser Glu Met Asn Val Gin Ala Ala Trp Lys Arg Gly Tyr Thr Gly Lys Asn Val Val CGG TCG GAA ATG AAT GTC CAG GCA GCG TGG AAG AGG GGC TAC ACA GGA AAA AAC GTG GTG

201 601

Val Thr lie Leu Asp Asp Gly Ile Glu Arg Asn His Pro Asp Leu Ala Pro Asn Tyr Asp GTC ACC ATC CTT GAT GAT GGC ATA GAG AGA AAT CAC CCT GAC CTG GCC CCA AAT TAT GAT

221

Ser Tyr Ala Ser Tyr Asp Val Asn Gly Asn Asp Tyr Asp Pro Ser Pro Arg Tyr Asp Ala TCC TAC GCC AGC TAC GAC GTG AAC GGC AAT GAT TAT GAC CCA TCT CCA CGA TAT GAT GCC

661

*

241

721

Ser Asn Glu Asn Lys His Gly Thr Arg Cys Ala Gly Glu Val Ala Ala Ser Ala Asn Asn AGC AAT GAA AAT AAA CAC GGC ACT CGT TGT GCG GGA GAA GTT GCT GCT TCA GCA AAC AAT

261 781

Ser Tyr Cys lie Val Gly lie Ala Tyr Asn Ala Lys Ile Gly Gly lie Arg Met Leu Asp TCC TAC TGC ATC GTG GGC ATA GCG TAC AAT GCC AAA ATA GGA GGC ATC CGC ATG CTG GAC

281

Gly Asp Val Thr Asp Val Val Glu Ala Lys Ser

841



Leu Gly Ile Arg Pro Asn Tyr lie Asp GGC GAT GTC ACA GAT GTG GTC GAG GCA AAG TCG CTG GGC ATC AGA CCC AAC TAC ATC GAC

301 901

321 961

Ile Tyr Ser Ala Ser Trp Gly Pro Asp Asp Asp Gly Lys Thr Val Asp Gly Pro Gly Arg ATT TAC AGT GCC AGC TGG GGG CCG GAC GAC GAC GGC AAG ACG GTG GAC GGG CCC GGC CGA Leu Ala Lys Gin Ala Phe Glu Tyr Gly Ile Lys Lys Gly Arg Gin Gly Leu Gly Ser He CTG GCT AAG CAG GCT TTC GAG TAT GGC ATT AAA AAG GGC CGG CAG GGC CTG GGC TCC ATT

* 341 1021

361 1081

_

Ser Gly Asn Gly Gly Arg Glu Gly Asp Tyr Cys Ser Cys Asp Gly Tyr TTC GTC TGG GCA TCT GGG AAT GGC GGG AGA GAG GGG GAC TAC TGC TCG TGC GAT GGC TAC Phe Val

Trp Ala

_

Thr Asn Ser He Tyr Thr He Ser Val Ser Ser Ala Thr Glu Asn Gly Tyr Lys Pro Trp ACC AAC AGC ATC TAC ACC ATC TCC GTC AGC AGC GCC ACC GAG AAT GGC TAC AAG CCC TGG Leu Glu Glu Cys Ala Ser Thr Leu Ala Thr Thr Tyr Ser Ser Gly Ala Phe Tyr Glu TAC CTG GAA GAG TGT GCC TCC ACC CTG GCC ACC ACC TAC AGC AGT GGG GCC TTT TAT GAG

381 1141

Tyr

401 1201

Arg Lys He Val Thr Thr Asp

*

421

1261 441 1321 461

1381 481 1441

501

Leu Arg Gin Arg Cys Thr Asp Gly His Thr Gly Thr Ser CGA AAA ATC GTC ACC ACG GAT CTG CGT CAG CGC TGT ACC GAT GGC CAC ACT GGG ACC TCA _

Val Ser Ala Pro Met Val Ala Gly He He Ala Leu Ala Leu Glu Ala Asn Ser Gln Leu GTC TCT GCC CCC ATG GTG GCG GGC ATC ATC GCC TTG GCT CTA GAA GCA AAC AGC CAG TTA

Trp Arg Asp Val Gin His Leu Leu Val Lys Thr Ser Arg Pro Ala His Leu Lys Ala ACC TGG AGG GAC GTC CAG CAC CTG CTA GTG AAG ACÁ TCC CGG CCG GCC CAC CTG AAA GCG Thr

Ser Asp Trp Lys Val Asn Gly Ala Gly His Lys Val Ser His Phe Tyr Gly Phe Gly Leu AGC GAC TGG AAA GTA AAC GGC GCG GGT CAT AAA GTT AGC CAT TTC TAT GGA TTT GGT TTG

Asp Ala Glu Ala Leu Val Val Glu Ala Lys Lys Trp Thr Ala Val Pro Ser Gin His GTG GAC GCA GAA GCT CTC GTT GTG GAG GCA AAG AAG TGG ACÁ GCA GTG CCA TCG CAG CAC

Val

1501

Met Cys Val Ala Ala Ser Asp Lys Arg Pro Arg Ser He Pro Leu Val Gin Val Leu Arg ATG TGT GTG GCC GCC TCG GAC AAG AGA CCC AGG AGC ATC CCC TTA GTG CAG GTG CTG CGG

521 1561

Thr Thr Ala Leu Thr Ser Ala Cys Ala Glu His Ser Asp Gin Arg Val Val Tyr Leu Glu ACT ACG GCC CTG ACC AGC GCC TGC GCG GAG CAC TCG GAC CAG CGG GTG GTC TAC TTG GAG

541 1621

561 1681

Arg Arg Gly Asp Leu Gin He Tyr Leu CAC GTG GTG GTT CGC ACC TCC ATC TCA CAC CCA CGC CGA GGA GAC CTC CAG ATC TAC CTG His Val Val Val Arg Thr Ser He Ser His Pro

Gly Thr Lys Ser Gin Leu Leu Ala Lys Arg Leu Leu Asp Leu Ser Asn GTT TCT CCC TCG GGA ACC AAG TCT CAA CTT TTG GCA AAG AGG TTG CTG GAT CTT TCC AAT

Val Ser Pro Ser

1741

Glu Gly Phe Thr Asn Trp Glu Phe Met Thr Val His Cys Trp Gly Glu Lys Ala Glu Gly GAA GGG TTT ACA AAC TGG GAA TTC ATG ACT GTC CAC TGC TGG GGA GAA AAG GCT GAA GGG

601 1801

Gin Trp Thr Leu Glu Ile Gin Asp Leu Pro Ser Gln Val Arg Asn Pro Glu Lys Gin Gly CAG TGG ACC TTG GAA ATC CAA GAT CTG CCA TCC CAG GTC CGC AAC CCG GAG AAG CAA GGG

621 1861

Lys Leu Lys Glu Trp Ser Leu He Leu Tyr Gly Thr Ala Glu His Pro Tyr His Thr Phe AAG TTG AAA GAA TGG AGC CTC ATA CTG TAT GGC ACA GCA GAG CAC CCG TAC CAC ACC TTC

641

Ser Ala His Gln Ser Arg Ser Arg Met Leu Glu Leu Ser Ala Pro Glu Leu Glu Pro Pro AGT GCC CAT CAG TCC CGC TCG CGG ATG CTG GAG CTC TCA GCC CCA GAG CTG GAG CCA CCC

581

1921

1981

Lys Ala Ala Leu Ser Pro Ser Gln Val Glu Val Pro Glu Asp Glu Glu Asp Tyr Thr Ala AAG GCT GCC CTG TCA CCC TCC CAG GTG GAA GTT CCT GAA GAT GAG GAA GAT TAC ACA GCT

681 2041

Gln Ser Thr Pro Gly Ser Ala Asn He Leu Gln Thr Ser Val Cys His Pro Glu Cys Gly CAA TCC ACC CCA GGC TCT GCT AAT ATT TTA CAG ACC AGT GTG TGC CAT CCG GAG TGT GGT

701 2101

Asp Lys Gly Cys Asp Gly Pro Asn Ala Asp Gln Cys Leu Asn Cys Val

721 2161

Gly

741 2221

Thr Ala Ala

761

Ala Thr Gln

661

2281

781 2341

His Phe Ser Leu GAC AAA GGC TGT GAT GGC CCC AAT GCA GAC CAG TGC TTG AAC TGC GTC CAC TTC AGC CTG

Ser Val Lys Thr Ser Arg Lys Cys Val Ser Val Cys Pro Leu Gly Tyr Phe Gly Asp GGG AGT GTC AAG ACC AGC AGG AAG TGC GTG AGT GTG TGC CCC TTG GGC TAC TTT GGG GAC

Arg Arg Cys Arg Arg Cys His Lys Gly Cys Glu Thr Cys Ser Ser Arg Ala ACA GCA GCA AGA CGC TGT CGC CGG TGC CAC AAG GGG TGT GAG ACC TGC TCC AGC AGA GCT

Cys Leu Ser Cys Arg Arg Gly Phe Tyr His His Gln Glu Met Asn Thr Cys GCG ACG CAG TGC CTG TCT TGC CGC CGC GGG TTC TAT CAC CAC CAG GAG ATG AAC ACC TGT

Val Thr Leu Cys Pro Ala Gly Phe Tyr Ala Asp Glu Ser Gln Lys Asn Cys Leu Lys Cys

GTG ACC CTC TGT CCT GCA GGA TTT TAT GCT GAT GAA AGT CAG AAA AAT TGC CTT AAA TGC

His Pro Ser Cys

Lys Lys Cys

Val

821 2461

Phe Ser Leu Ala

Arg Gly

Cys

841 2521

Glu Leu He

861 2581

Glu Glu

881

Cys Gly

801 2401

2641 901

2701 921 2761

Asp Glu Pro Glu Lys Cys Thr Val Cys Lys Glu Gly

CAC CCA AGC TGT AAA AAG TGC GTG GAT GAA CCT GAG AAA TGT ACT GTC TGT AAA GAA GGA

TTC AGC CTT GCA CGG GGC

Ser

ÂGC

He Pro Asp Cys Glu Pro Gly Thr Tyr Phe Asp Ser TGC ATT CCT GAC TGT GAG CCA GGC ACC TAC TTT GAC TCA

Arg Cys Gly Glu Cys His His Thr Cys Gly Thr Cys Val Gly Pro Gly Arg GAG CTG ATC AGA TGT GGG GAA TGC CAT CAC ACC TGC GGA ACC TGC GTG GGG CCA GGC AGA Cys He His Cys Ala Lys Asn Phe His Phe His Asp Trp Lys Cys Val Pro Ala GAA GAG TGC ATT CAC TGT GCG AAA AAC TTC CAC TTC CAC GAC TGG AAG TGT GTG CCA GCC Glu Gly Phe Tyr Pro Glu Glu Met Pro Gly Leu Pro His Lys Val Cys Arg Arg TGT GGT GAG GGC TTC TAC CCA GAA GAG ATG CCG GGC TTG CCC CAC AAA GTG TGT CGA AGG

Cys Asp Glu Asn Cys Leu Ser Cys Ala Gly Ser Ser Arg Asn Cys Ser Arg Cys Lys Thr TGT GAC GAG AAC TGC TTG AGC TGT GCA GGC TCC AGC AGG AAC TGT AGC AGG TGT AAG ACG Ser Cys He Thr Asn His Thr Cys Ser Asn Ala Asp Glu GGC TTC ACA CAG CTG GGG ACC TCC TGC ATC ACC AAC CAC ACG TGC AGC AAC GCT GAC GAG

Gly Phe Thr Gln Leu Gly Thr


941 2821 961 2 881 2 950

302 9 3108

3187 32 6 6 334 5 3424 3503 3582 3661 374 0

3819 3898

3 977 4 056 4135 4214

Thr Phe

Cys Glu

Met Val

Lys

Ser Asn

Arg Leu Cys Glu Arg Lys Leu Phe Ile Gin Phe

ACA TTC TGC GAG ATG GTG AAG TCC AAC CGG CTG TGC GAA CGG AAG CTC TTC ATT CAG TTC

Cys Cys Arg Thr Cys Leu Leu Ala Gly 0C TGC TGC CGC ACG TGC CTC CTG GCC GGG TAA GGGTGCCTAGCTGCCCACAGAGGGCAGGCACTCCCATCC ATCCATCCGTCCACCTTCCTCCAGACTGTCGGCCAGAGTCTGTTTCAGGAGCGGCGCCCTGCACCTGACAGCTTTATCT CCCCAGGAGCAGCATCTCTGAGCACCCAAGCCAGGTGGGTGGTGGCTCTTAAGGAGGTGTTCCTAAAATGGTGATATCC TCTCAAATGCTGCTTGTTGGCTCCAGTCTTCCGACAAACTAACAGGAACAAAATGAATTCTGGGAATCCACAGCTCTGG CTTTGGAGCAGCTTCTGGGACCAT AAGTTTACTGAATCTTC AAG ACCAAAGCAGAAAAGAAAGGCGCTTGGCATC ACAC

ATCACTCTTCTCCCCGTGCTTTTCTGCGGCTGTGTAGTAAATCTCCCCGGCCCAGCTGGCGAACCCTGGGCCATCCTCA

CATGTGACAAAGGGCCAGCAGTCTACCTGCTCGTTGCCTGCCACTGAGCAGTCTGGGGACGGTTTGGTCAGACTATAAA TAAGATAGGTTTGAGGGCATAAAATGTATGACCACTGGGGCCGGAGTATCTATTTCTACATAGTCAGCTACTTCTGAAA CTGCAGCAGTGGCTTAGAAAGTCCAATTCCAAAGCCAGACCAGAAGATTCTATCCCCCGCAGCGCTCTCCTTTGAGCAA GCCGAGCTCTCCTTGTTACCGTGTTCTGTCTGTGTCTTCAGGAGTCTC ATGGCCTGAACGACCACCTCGACCTGATGCA GAGCCTTCTGAGGAGAGGCAACAGGAGGCATTCTGTGGCCAGCCAAAAGGTACCCCGATGGCCAAGCAATTCCTCTGAA CAAAATGTAAAGCCAGCCATGCATTGTTAATCATCCATCACTTCCCATTTTATGGAATTGCTTTTAAAATACATTTGGC CTCTGCCCTTCAGAAGACTCGTTTTTAAGGTGGAAACTCCTGTGTCTGTGTATATTACAAGCCTACATGACACAGTTGG ATTTATTCTGCCAAACCTGTGTAGGCATTTTATAAGCTACATGTTCTAATTTTTACCGATGTTAATTATTTTGACAAAT ATTTCATATATTTTCATTGAAATGCACAGATCTGCTTGATCAATTCCCTTGAATAGGGAAGTAACATTTGCCTTAAATT TTTTCGACCTCGTCTTTCTCCATATTGTCCTGCTCCCCTGTTTGACGACAGTGCATTTGCCTTGTCACCTGTGAGCTGG AGAGAACCCAGATGTTGTTTATTGAATCTACAACTCTGAAAGAGAAATCAATGAAGCAAGTACAATGTTAACCCTAAAT TAATAAAAGAGTTAACATCCC

A

GlyHisLysGlyAlaAlaValAlaPheTrpTrpThrlleGlyTrpProTrpAsnValAm

GGTCATAAAGGTGCGGCAGTGGCGTTCTGGTGGACCATTGGGTGGCCCTGGAATGTGTAG GAAGGGGTGTCATGAATTCCTTAAAAGGACTCTCCAAATAGCATTAGTTGTTATTATTAA

CTTAAAAGGACTCTCCAAATAGCATTAGTTGTTATTATTAATTGTGTGTCACAAGAATTT AAAACGCATGTGCAGCTATTTAAGAAAAGTATCCCGGAAGCTCACAGTGACATTACGGAA

GAACCCTCAGGTCACAAGAGTCTGGGGTCTCCTATACTCTATAACTTTGGCCACACCGAG ACACCACCTATACCAATATTTACTCATAGTTCTCTTTAAGCCAGGAGCAATGACGTGTGC CTATAGTCGCAGCTACTAGGGAAGTTGAGGCAGGAGGATTGCTTGAGCCCAGGAATTTGA GTCTAGCCTGGACAACACAGCAGGACTCCATCTCTTAAAAAAAAAATTACTTCCCCCACT ACTTTTTTTGACATAAAAAAATGTATTTTAAAAGGAAACTGTACTACATCTAGTTAATCA

TAGGTTTGATATGTAGTTACGTATTTTTTCTAATGTGCATTAAAACAAATCCATAATTAT TAAAATAAATGTTGTTTGTGTGCCAAAAAAAAAAAA

B

SerHisPheTyrGlyPheGlyLeuValAspAlaGluAla 5'...ggcctgtcttttcag

TTAGCCATTTCTATGGATTTGGTTTGGTGGACGCAGAAGCT

LeuValValGluAlaLysLysTrpThrAlaValProSer CTCGTTGTGGAGGCAAAGAAGTGGACAGCAGTGCCATCG

GlnHisMetCysValAlaAlaSerAspLysArgPro CAGCACATGTGTGTGGCCGCCTCGGACAAGAGACCCAG

gtaaggctctgctgt...3'

C

FIG. 2. cDNA and partial gene sequences of PACE4 and PACE4.1. A. Composite sequence of PACE4 cDNA and its translation product. The proposed signal peptidase cleavage site between Ala-63 and Pro-64 is arrowed, as is a putative propeptide processing site after Arg-149. The subtilisin-like catalytic domain is boxed, and includes the active-site Asp, His, Asn, and Ser residues (*). Consensus sites for Asn-linked glycosylation are marked by diamonds, and cysteine residues are marked by bars. B. Composite PACE4.1 cDNA sequence. The sequence is identical to the PACE4 cDNA sequence up to Lys-471. The additional 16-amino-acid sequence, the termination codon, and the 3' untranslated region that form the truncated cDNA and encoded protease sequence of PACE4.1 are shown. The divergence point between PACE4 and PACE4.1 cDNA sequences is arrowed subsequent to Lys-471. C. DNA sequence of the homolog of exon 12 of the furin/PACE gene (Barr et al., 1991). This sequence contains the intron/exon splice junction for commitment to either PACE4 or PACE4.1 mRNAs. This exon sequence corresponds to PACE4 residues Ser-473 to Pro-510.
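The residue numbers given in the legend to Fig. 2 and in the text fix the lengths of the proposed PACE4 segments and of PACE4.1. The following sketch simply carries out that arithmetic. It is illustrative, the variable names are hypothetical, and the cleavage sites are the proposed (not experimentally demonstrated) sites described in the text.

```python
# Residue numbers and molecular weights as quoted in the text and the Fig. 2 legend.
PRECURSOR_RESIDUES = 969       # PACE4 open reading frame
SIGNAL_LAST_RESIDUE = 63       # proposed signal peptidase cleavage after Ala-63
PROPEPTIDE_LAST_RESIDUE = 149  # putative propeptide processing after Arg-149
LAST_SHARED_RESIDUE = 471      # PACE4 and PACE4.1 are identical up to Lys-471
PACE41_UNIQUE_RESIDUES = 16    # additional residues unique to PACE4.1
EXON_FIRST, EXON_LAST = 473, 510   # exon shown in Fig. 2C

signal_length = SIGNAL_LAST_RESIDUE
propeptide_length = PROPEPTIDE_LAST_RESIDUE - SIGNAL_LAST_RESIDUE
mature_length = PRECURSOR_RESIDUES - PROPEPTIDE_LAST_RESIDUE
pace41_length = LAST_SHARED_RESIDUE + PACE41_UNIQUE_RESIDUES
exon_residues = EXON_LAST - EXON_FIRST + 1

# Average mass per residue implied by the calculated molecular weights.
pace4_da_per_residue = 106400 / PRECURSOR_RESIDUES
pace41_da_per_residue = 53300 / pace41_length

if __name__ == "__main__":
    print("signal:", signal_length, "propeptide:", propeptide_length,
          "mature:", mature_length)
    print("PACE4.1 precursor length:", pace41_length)
    print("residues encoded by the Fig. 2C exon:", exon_residues)
    print("Da per residue: %.1f (PACE4), %.1f (PACE4.1)"
          % (pace4_da_per_residue, pace41_da_per_residue))
```

Both calculated molecular weights imply close to 110 Da per residue, a typical average for proteins, so the quoted values of 106.4 kD and 53.3 kD are mutually consistent with precursor lengths of 969 and 487 residues.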

in-frame Met codon in the 5' region of the cDNA. In contrast to the amino-terminal sequence of furin/PACE, this Met is not followed immediately by a classical hydrophobic signal sequence. However, amino acids 43-63 of the large open reading frame closely resemble the proposed signal sequence for furin/PACE (Barr et al., 1991), including a predicted signal peptidase cleavage site following the alanine residue at position 63 (Fig. 2A, arrowed) (von Heijne, 1986). The open reading frame encodes a PACE4 precursor protein of 969 amino acids with a calculated molecular weight of 106.4 kD. As with the KEX2 protein and its mammalian counterparts, PACE4 contains a region of clustered basic residues immediately preceding the catalytic domain (Fig. 2A). By direct comparison with the known cleavage sites for KEX2, PC2, and PC1/PC3 (Christie et al., 1991; Shennan et al., 1991; R.S. Fuller, personal communication), the RVKR motif at amino acids 146-149 probably represents a propeptide cleavage junction, thereby making Gln-150 the likely amino terminus of mature PACE4.

The inferred sequence of PACE4 contains three consensus sites for N-linked glycosylation and 56 Cys residues (Figs. 2A and 3), with 44 of these Cys residues clustered in a carboxyl-terminal Cys-rich region analogous to (although somewhat longer than) that of furin/PACE. Again, in contrast to furin/PACE, which has a classical transmembrane domain, PACE4 does not appear to have such a region of hydrophobic amino acid residues for anchorage in cell membranes. Moreover, PACE4 does not have the carboxyl-terminal amphipathic type anchor sequence proposed for PC2 and PC1/PC3 (Smeekens et al., 1991) and carboxypeptidase E (Fricker et al., 1990), thereby suggesting an alternate mode of subcellular localization for PACE4.

The PACE4.1 cDNA, isolated from the human kidney cell line, encodes a much smaller subtilisin-like protease with a termination codon immediately downstream from the catalytic domain, leading to a calculated molecular weight of 53.3 kD (Figs. 2B and 3). Truncation at this point removes an important region of the molecule, referred to as the P-domain. This region was identified in the yeast KEX2 protein and is essential for proteolytic activity (R.S. Fuller, personal communication). The PACE4.1 cDNA also contains a 3' untranslated region that is distinct from that of PACE4 (Fig. 2B), but is encoded by a cosmid clone containing the PACE4 gene (PACE4-COS 3; data not shown). Alternate RNA splicing can be inferred by comparing the cDNA sequence to the sequences of genomic clones isolated from this cosmid. For example, the sequence of a 620-bp Pst I fragment from this cosmid encodes an exon (equivalent to exon 12 of furin/PACE) that clearly illustrates the intron/exon splice junction leading to either PACE4 or PACE4.1 (Fig. 2C).

FIG. 3. Organization comparison of PACE4, PACE4.1, and related proteases. The structures of the yeast and mammalian dibasic processing enzymes are shown schematically. Active-site Asp, His, Asn (Asp), and Ser residues are shown. Potential glycosylation sites are indicated by diamonds. Putative signal peptides (Sig) and transmembrane domains (TMD) are indicated by solid boxes, as are the proposed membrane-binding amphipathic helices (AH) of human PC2 and murine PC1/PC3. Cysteine-rich (CRR) and serine/threonine-rich (S/TRR) regions are hatched boxes. Known pro-regions of KEX2 (R.S. Fuller, personal communication), PC2, and PC1/PC3 (Christie et al., 1991; Shennan et al., 1991), and similar potential pro-regions for furin/PACE, PACE4, and PACE4.1 are shown as shaded boxes.

Similarity of PACE4 to other dibasic processing endoproteases

Comparisons of the sequence and structure of PACE4 and PACE4.1 with the KEX2 protein and its mammalian homologs are shown in Figs. 3 and 4. The spacing of the catalytic Asp, His, Asn (Asp), and Ser residues is conserved, as is the placement of the proposed processing site for removal of the propeptide region of each protease (Fig. 3). The functional utilization of this cleavage site has been demonstrated for the KEX2 protein (R.S. Fuller, personal communication), and for PC2 and PC1/PC3 (Christie et al., 1991; Shennan et al., 1991). However, whether or not this cleavage is required for activity, and a role for autocatalysis in this process, remain to be determined. Amino acid sequence identities between the catalytic domain of PACE4 and those of the previously described dibasic processing enzymes are shown in Fig. 4 and quantitated in Table 1.

Tissue distribution of PACE4 mRNA

To determine the tissue distribution of PACE4 and PACE4.1, we carried out Northern blot analysis of poly(A)+ RNA from a variety of tissues and cell lines, using cDNA probes specific to each transcript (Fig. 5A-D). A single PACE4 4.4-kb transcript was seen at various intensities in all tissues and cell lines tested (Fig. 5, A and C). This is similar to the widespread tissue distribution found with furin/PACE (Schalken et al., 1987; Barr et al., 1991)

[Sequence alignment: catalytic domains of yKEX2, hPACE, hPC2, mPC1/PC3, and hPACE4]

FIG. 4. Amino acid sequence identities within the catalytic domains of the yeast and mammalian dibasic processing enzymes. Amino acids that are identical in three or more enzymes are shaded. Quantitative amino acid sequence identities are tabulated in Table 1.
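The two quantities that Fig. 4 and Table 1 report, columns shared by three or more enzymes and pairwise percent identity, can be illustrated with a minimal Python sketch. The paper does not state how gapped columns were counted, so the choice here to skip any column in which either sequence has a gap is an assumption. The function names and the toy alignment are hypothetical and are not the sequences of Fig. 4.

```python
from itertools import combinations


def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def shaded_columns(alignment: dict, min_count: int = 3, gap: str = "-") -> list:
    """Indices of columns where one residue occurs in at least min_count sequences."""
    rows = list(alignment.values())
    shaded = []
    for i, column in enumerate(zip(*rows)):
        residues = [r for r in column if r != gap]
        if residues and max(residues.count(r) for r in set(residues)) >= min_count:
            shaded.append(i)
    return shaded


# Hypothetical toy alignment for illustration only (not taken from Fig. 4).
toy = {
    "seqA": "GTSAAAP-A",
    "seqB": "GTSASAPIA",
    "seqC": "GTSVSAPMV",
    "seqD": "GTS-AAPEA",
}

for (n1, s1), (n2, s2) in combinations(toy.items(), 2):
    print(f"{n1} vs {n2}: {percent_identity(s1, s2):.1f}%")
print("shaded columns:", shaded_columns(toy))
```

Other conventions, such as dividing by the length of the shorter sequence, would give slightly different percentages, so values computed this way need not reproduce Table 1 exactly.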


Table 1. Overall Percent Amino Acid Sequence Identities Between the Catalytic Domains [Figs. 2(a), 4] of Saccharomyces cerevisiae KEX2 and the Mammalian Paired Basic Amino Acid Residue Cleaving Enzymes

              furin/PACE    PC2    PC1/PC3    PACE4
yKEX2             48         47       49        45
furin/PACE                   56       65        69
PC2                                   57        53
PC1/PC3                                         63

[Figure 5, panels A-D: Northern blots; lanes 1-8 in A and B, lanes 9-15 in C and D; RNA size markers at 9.5, 7.5, 4.4, 2.4, and 1.3 kb]

FIG. 5. Northern blot analysis of PACE4 and PACE4.1. The blots shown in A and B contain human poly(A)+ RNAs (2 µg) from heart (lane 1), brain (lane 2), placenta (lane 3), lung (lane 4), liver (lane 5), skeletal muscle (lane 6), kidney (lane 7), and pancreas (lane 8) and were purchased from Clontech (Palo Alto, CA). The blots shown in C and D contain poly(A)+ RNAs (2 µg) from hamster insulinoma HIT-15 (lane 9), human embryonic kidney cell line 293 (lane 10), human hepatoma cell line HEP G2 (lane 11), human umbilical vein endothelial cells (HUVEC; lane 12), human monocyte cell line U937 (lane 13), human lymphoblastoid cell line UC-729-6 (lane 14), and human osteosarcoma tissue (lane 15). RNA molecular weight markers (BRL) were run on adjacent lanes and identified by methylene blue staining after gel transfer. The sizes of the markers, in kilobases (kb), are shown. The blots were hybridized with a probe specific for PACE4 (A and C) or a probe specific for PACE4.1 (B and D).
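As a rough illustration of how a transcript size might be read from the marker lanes, the sketch below interpolates log10(size) linearly against migration distance between the two flanking markers. This is a common approximation and is not stated to be the authors' procedure. The marker sizes are those shown in Fig. 5, whereas the migration distances and the function name `estimate_size_kb` are invented for the example.

```python
import math


def estimate_size_kb(distance_mm, markers):
    """Interpolate log10(size) linearly between the two flanking markers.

    markers is a list of (migration distance in mm, size in kb) pairs.
    """
    pts = sorted(markers, key=lambda m: m[0])  # order by migration distance
    for (d1, s1), (d2, s2) in zip(pts, pts[1:]):
        if d1 <= distance_mm <= d2:
            t = (distance_mm - d1) / (d2 - d1)
            log_size = math.log10(s1) + t * (math.log10(s2) - math.log10(s1))
            return 10 ** log_size
    raise ValueError("band lies outside the marker range")


# Marker sizes (kb) are those shown in Fig. 5. The migration distances (mm)
# are hypothetical values for illustration, not measurements from the paper.
markers = [(10.0, 9.5), (14.0, 7.5), (23.0, 4.4), (33.0, 2.4), (43.0, 1.3)]

print(f"{estimate_size_kb(28.0, markers):.2f} kb")
```

A band that falls outside the outermost markers raises an error instead of being extrapolated, since the log-linear relation is least reliable at the extremes of a gel.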


and is in contrast to the limited neuroendocrine tissue distribution of PC2 and PC3 transcripts (Seidah et al., 1990; Smeekens and Steiner, 1990; Smeekens et al., 1991). However, the 2.0-kb PACE4.1 transcript was seen only in the 293 kidney cell line from which it was originally isolated (Fig. 5, B and D). The lack of the 2.0-kb PACE4.1 transcript in all tissues tested suggests that it may represent an aberrantly spliced PACE4 transcript limited to certain cultured cell types.

The PACE4 gene localizes to the fes/fps region of chromosome 15

Chromosomal localization of the fur and PACE4 genes was done by in situ hybridization using isolated cosmid clones (Barr et al., 1991, and present study). The results are shown in Fig. 6. The contiguous fur and fes/fps genes were localized to 15q25, in agreement with previous mapping studies using tritium-labeled probes (Roebroek et al., 1986). PACE4 also localized to 15q, but appeared to localize more telomerically than fur, and was assigned to 15q26. To confirm the relative order of the fur and PACE4 genes on 15q, the cosmid probes were labeled with digoxigenin and biotin, respectively (Fig. 6). In 24 of the 50 chromosomes analyzed (48%), fur (green) was proximal to PACE4 (red). In 18 of the chromosomes (36%), both colors were fused, suggesting that the fur and PACE4 genes are within 5 megabases of each other.

DISCUSSION

We used a PCR approach to identify a novel human subtilisin-like protease. Our PCR primers were designed to amplify sequences encoding members of the subfamily of mammalian subtilisins with cleavage specificity for paired basic amino acid residues. The encoded protease we cloned is the fourth such enzyme from mammalian tissue sources and, hence, is designated PACE4. The structure of PACE4, inferred from a composite cDNA sequence, was very similar to the previously reported dibasic processing endoproteases, with the highest amino acid sequence identities found in the catalytic domains. In this region, PACE4 was most similar to furin/PACE; however, the comparison of PACE4 with PC1/PC3 is a cross-species comparison, since only rodent PC1/PC3 sequences are currently available. Despite this similarity, other predicted structural features of PACE4 differ rather extensively from the other three family members. For example, the PACE4 precursor appears to have an extended signal peptide sequence, and has neither the classical transmembrane anchor domain found in furin/PACE nor the amphipathic helix anchor proposed for PC2 and PC1/PC3. Moreover, a large cysteine-rich region that spans over 270 residues is found in the carboxyl-terminal region of PACE4. These differences in structure may direct alternate subcellular targeting for PACE4 that may correlate with substrate specificity in vivo.

Truncation of the PACE4 protease sequence by translation of an alternately spliced mRNA species gives rise to a shorter inferred protease, designated PACE4.1. Since this mRNA species was detected at high levels only in a human kidney cell line, it is not clear whether PACE4.1 represents a true protease, or is an artefact of mRNA splicing. Interestingly, the structure of the PACE4.1 precursor predicts a secreted polypeptide. The lack, however, of a P-domain, a region that is necessary for correct folding and activity in the KEX2 protein, suggests that this polypeptide might not be catalytically active. Thus, secretion, detection in vivo, and activity studies become extremely important areas of future research on the PACE4.1 translation product.

PACE4 mRNA was expressed in a broad array of tissues. This is similar to the widespread tissue distribution of furin/PACE mRNA (Schalken et al., 1987; Barr et al., 1991), although PACE4 mRNA is expressed more abundantly in liver than in other tissues. Also, since furin/PACE is unable to process prorenin (Hatsuzawa et al., 1990), and coexpression of PC1/PC3 with prorenin leads only to partial removal of the propeptide (Hosaka et al., 1991), the relatively high levels of PACE4 expression in the kidney suggest PACE4 may be a candidate prorenin processing enzyme along with PC1/PC3 and renal cathepsin B (Wang et al., 1991).

Preliminary studies on the PACE4 gene indicate that, unlike the compact intron/exon structure of the furin/PACE gene (van de Ven et al., 1990; Barr et al., 1991), the size of the newly discovered gene is greater by virtue of much larger introns. However, the intron/exon splice junctions appear to be highly conserved (our unpublished observations). This is not surprising based on previous studies (reviewed by Gilbert, 1985), and our finding that the fur and PACE4 genes map to the same region of chromosome 15, thus indicating a common ancestry for these two genes. Interestingly, our data show that although the fur and fes/fps genes are within 1 kb of each other, the fur and PACE4 genes are further apart and, indeed, are distinguishable by in situ labeling of metaphase chromosomes. Since these two genes most likely arose through gene duplication, this region of chromosome 15 may contain additional subtilisin-like protease genes. The discovery of new dibasic processing enzyme genes and cDNAs, and determination of the exact substrate specificities of their protein products in vivo, will continue to add to our understanding of this crucial aspect of eukaryotic cell biology.
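The paired basic residue motifs that define this enzyme subfamily can be located in a protein sequence with a simple pattern scan. The sketch below is illustrative only. It searches for any Lys/Arg pair and for the Arg-X-Lys/Arg-Arg motif named in the title of Hosaka et al. (1991). The peptide string, the pattern names, and the function `find_sites` are hypothetical, and a sequence match is at most a candidate site: it says nothing about whether any given protease cleaves there in vivo.

```python
import re

# Lookahead patterns so that overlapping sites are all reported.
DIBASIC = re.compile(r"(?=([KR][KR]))")    # any pair of Lys/Arg residues
RXKR_R = re.compile(r"(?=(R.[KR]R))")      # Arg-X-Lys/Arg-Arg


def find_sites(sequence, pattern):
    """Return (1-based start position, matched residues) for every match."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence.upper())]


# Hypothetical peptide for illustration only, not a sequence from this paper.
toy = "MASLRSKRSLDEKKAVG"

print(find_sites(toy, DIBASIC))
print(find_sites(toy, RXKR_R))
```

Using a lookahead keeps each match zero-width, so a run such as Lys-Arg-Arg is reported as two overlapping dibasic pairs instead of one.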


FIG. 6. Fluorescence in situ hybridization of cosmid DNA to metaphase chromosomes. A. Biotinylated fur gene (Barr et al., 1991) hybridizing to 15q25. B. Biotinylated PACE4 gene hybridizing to 15q26. C, D, and E. Digoxigenin-labeled fur (FITC) and biotinylated PACE4 gene (Texas red). The fur gene maps proximal to PACE4 in C and D. The two cosmids fuse in E, but the green-yellow can be discerned as being proximal to the red fluorescence.
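The proportions quoted for the two-color hybridization follow directly from the chromosome counts given in the Results (24 and 18 of 50). The short sketch below reproduces that arithmetic. The category labels and the function name `tally` are illustrative, and the chromosomes not covered by the two reported categories are left unclassified because the text does not describe them.

```python
def tally(counts, total):
    """Return each category's share of total as a percentage."""
    if sum(counts.values()) > total:
        raise ValueError("category counts exceed the total")
    return {label: 100.0 * n / total for label, n in counts.items()}


# Counts reported in the text; the remaining chromosomes are not described there.
total = 50
counts = {"fur proximal to PACE4": 24, "signals fused": 18}

shares = tally(counts, total)
unclassified = total - sum(counts.values())

for label, pct in shares.items():
    print(f"{label}: {pct:.0f}%")
print("not described in the text:", unclassified)
```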



ACKNOWLEDGMENTS

We thank Peter T. Anderson for preparation of the manuscript, Dr. Steven P. Smeekens for critical review, and Dr. Robert S. Fuller, Stanford University, for providing unpublished information. The sequence reported in this paper has been submitted to GenBank under accession number M80482.

REFERENCES

AVIV, H., and LEDER, P. (1972). Purification of biologically active globin messenger RNA by chromatography on oligothymidylic acid-cellulose. Proc. Natl. Acad. Sci. USA 69, 1408-1412.

BARR, P.J. (1991). Mammalian subtilisins: The long-sought dibasic processing endoproteases. Cell 66, 1-3.

BARR, P.J., THAYER, R.M., LAYBOURN, P., NAJARIAN, R.C., SEELA, F., and TOLAN, D.R. (1986). 7-Deaza-2'-deoxyguanosine-5'-triphosphate: Enhanced resolution in M13 dideoxy sequencing. Biotechniques 4, 428-432.

BARR, P.J., MASON, O.B., LANDSBERG, K.E., WONG, P.A., KIEFER, M.C., and BRAKE, A.J. (1991). cDNA and gene structure for a human subtilisin-like protease with cleavage specificity for paired basic amino acid residues. DNA Cell Biol. 10, 319-328.

BENJANNET, S., RONDEAU, N., DAY, R., CHRÉTIEN, M., and SEIDAH, N.G. (1991). PC1 and PC2 are proprotein convertases capable of cleaving proopiomelanocortin at distinct pairs of basic residues. Proc. Natl. Acad. Sci. USA 88, 3564-3568.


BRESNAHAN, P.A., LEDUC, R., THOMAS, L., THORNER, J., GIBSON, H.L., BRAKE, A.J., BARR, P.J., and THOMAS, G. (1990). Human fur gene encodes a yeast KEX2-like endoprotease that cleaves pro-β-NGF in vivo. J. Cell Biol. 111, 2851-2859.

CHIRGWIN, J.M., PRZYBYLA, A.E., MACDONALD, R.J., and RUTTER, W.J. (1979). Isolation of biologically active ribonucleic acid from sources enriched in ribonuclease. Biochemistry 18, 5294-5299.

CHRISTIE, D.L., BATCHELOR, D.C., and PALMER, D.J. (1991). Identification of kex2-related proteases in chromaffin granules by partial amino acid sequence analysis. J. Biol. Chem. 266, 15679-15683.

FEINBERG, A.P., and VOGELSTEIN, B. (1984). A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 137, 266-267.

FREEMAN, G.J., CLAYBERGER, C., DEKRUYFF, R., ROSENBLUM, D.S., and CANTOR, H. (1983). Sequential expression of new gene programs in inducer T-cell clones. Proc. Natl. Acad. Sci. USA 80, 4094-4098.

FRICKER, L.D., DAS, B., and ANGELETTI, R.H. (1990). Identification of the pH-dependent membrane anchor of carboxypeptidase E (EC 3.4.17.10). J. Biol. Chem. 265, 2476-2482.

FULLER, R.S., BRAKE, A.J., and THORNER, J. (1989). Intracellular targeting and structural conservation of a prohormone-processing endoprotease. Science 246, 482-486.

GILBERT, W. (1985). Genes-in-pieces revisited. Science 228, 823-824.

HATSUZAWA, K., HOSAKA, M., NAKAGAWA, T., NAGASE, M., SHODA, A., MURAKAMI, K., and NAKAYAMA, K. (1990). Structure and expression of mouse furin, a yeast KEX2-related protease. J. Biol. Chem. 265, 22075-22078.

HOSAKA, M., NAGAHAMA, M., KIM, W.-S., WATANABE, T., HATSUZAWA, K., IKEMIZU, J., MURAKAMI, K., and NAKAYAMA, K. (1991). Arg-X-Lys/Arg-Arg motif as a signal for precursor cleavage catalyzed by furin within the constitutive secretory pathway. J. Biol. Chem. 266, 12127-12130.

KIEFER, M.C., JOH, R.S., BAUER, D.M., and ZAPF, J. (1991a). Molecular cloning of a new human insulin-like growth factor binding protein. Biochem. Biophys. Res. Commun. 176, 219-225.

KIEFER, M.C., MASIARZ, F.R., BAUER, D.M., and ZAPF, J. (1991b). Identification and molecular cloning of two new 30-kDa insulin-like growth factor binding proteins isolated from adult human serum. J. Biol. Chem. 266, 9043-9049.

KÖRNER, J., CHUN, J., HARTER, D., and AXEL, R. (1991). Isolation and functional expression of a mammalian prohormone processing enzyme, murine prohormone convertase 1. Proc. Natl. Acad. Sci. USA 88, 6834-6838.

LEHRACH, H., DIAMOND, D., WOZNEY, J.M., and BOEDTKER, H. (1977). RNA molecular weight determinations by gel electrophoresis under denaturing conditions, a critical re-examination. Biochemistry 16, 4743-4751.

MISUMI, Y., ODA, K., FUJIWARA, T., TAKAMI, N., TASHIRO, K., and IKEHARA, Y. (1991). Functional expression of furin demonstrating its intracellular localization and endoprotease activity for processing of proalbumin and complement pro-C3. J. Biol. Chem. 266, 16954-16959.

PINKEL, D., LANDEGENT, J., COLLINS, C., FUSCOE, J., SEGRAVES, R., LUCAS, J., and GRAY, J. (1988). Fluorescence in situ hybridization with human chromosome-specific libraries: Detection of trisomy 21 and translocations of chromosome 4. Proc. Natl. Acad. Sci. USA 85, 9138-9142.

ROEBROEK, A.J.M., SCHALKEN, J.A., BUSSEMAKERS, M.J.G., VAN HEERIKHUIZEN, H., ONNEKINK, C., DEBRUYNE, F.M.J., BLOEMERS, H.P.J., and VAN DE VEN, W.J.M. (1986). Characterization of human c-fes/fps reveals a new transcription unit (fur) in the immediately upstream region of the proto-oncogene. Mol. Biol. Rep. 11, 117-125.

SAIKI, R.K., SCHARF, S., FALOONA, F., MULLIS, K.B., HORN, G.T., ERLICH, H.A., and ARNHEIM, N. (1985). Enzymatic amplification of β-globin genomic sequences and restriction site analysis for diagnosis of sickle cell anemia. Science 230, 1350-1354.

SCHALKEN, J.A., ROEBROEK, A.J.M., OOMEN, P.P.C.A., WAGENAAR, S.SC., DEBRUYNE, F.M.J., BLOEMERS, H.P.J., and VAN DE VEN, W.J.M. (1987). fur Gene expression as a discriminating marker for small cell and nonsmall cell lung carcinomas. J. Clin. Invest. 80, 1545-1549.

SEIDAH, N.G., GASPAR, L., MION, P., MARCINKIEWICZ, M., MBIKAY, M., and CHRÉTIEN, M. (1990). cDNA sequence of two distinct pituitary proteins homologous to Kex2 and furin gene products: Tissue-specific mRNAs encoding candidates for pro-hormone processing proteinases. DNA Cell Biol. 9, 415-424.

SHENNAN, K.I.J., SEAL, A.J., SMEEKENS, S.P., STEINER, D.F., and DOCHERTY, K. (1991). Site-directed mutagenesis and expression of PC2 in microinjected Xenopus oocytes. J. Biol. Chem. (in press).

SMEEKENS, S.P., and STEINER, D.F. (1990). Identification of a human insulinoma cDNA encoding a novel mammalian protein structurally related to the yeast dibasic processing protease Kex2. J. Biol. Chem. 265, 2997-3000.

SMEEKENS, S.P., AVRUCH, A.S., LAMENDOLA, J., CHAN, S.J., and STEINER, D.F. (1991). Identification of a cDNA encoding a second putative prohormone convertase related to PC2 in AtT20 cells and islets of Langerhans. Proc. Natl. Acad. Sci. USA 88, 340-344.

THOMAS, P.S. (1980). Hybridization of denatured RNA and small DNA fragments transferred to nitrocellulose. Proc. Natl. Acad. Sci. USA 77, 5201-5205.

THOMAS, L., LEDUC, R., THORNE, B.A., SMEEKENS, S.P., STEINER, D.F., and THOMAS, G. (1991). Kex2-like endoproteases PC2 and PC3 accurately cleave a model prohormone in mammalian cells: Evidence for a common core of neuroendocrine processing enzymes. Proc. Natl. Acad. Sci. USA 88, 5297-5301.

TOMKINSON, B., and JONSSON, A.-K. (1991). Characterization of cDNA for human tripeptidyl peptidase II: The N-terminal part of the enzyme is similar to subtilisin. Biochemistry 30, 168-174.

VAN DE VEN, W.J.M., VOORBERG, J., FONTIJIN, R., PANNEKOEK, H., VAN DEN OUWELAND, A.M.W., VAN DUIJNHOVEN, H.L.P., ROEBROEK, A.J.M., and SIEZEN, R.J. (1990). Furin is a subtilisin-like proprotein processing enzyme in higher eukaryotes. Mol. Biol. Rep. 14, 265-275.

VON HEIJNE, G. (1986). A new method for predicting signal sequence cleavage sites. Nucleic Acids Res. 14, 4683-4690.

WANG, P.H., DO, Y.S., MACAULAY, L., SHINAGAWA, T., ANDERSON, P.W., BAXTER, J.D., and HSUEH, W.A. (1991). Identification of renal cathepsin B as a human prorenin-processing enzyme. J. Biol. Chem. 266, 12633-12638.

WISE, R.J., BARR, P.J., WONG, P.A., KIEFER, M.C., BRAKE, A.J., and KAUFMAN, R.J. (1990). Expression of a human proprotein processing enzyme: Correct cleavage of the von Willebrand precursor at a paired basic amino acid site. Proc. Natl. Acad. Sci. USA 87, 9378-9382.

YANISCH-PERRON, C., VIEIRA, J., and MESSING, J. (1985). Improved M13 phage cloning vectors and host strains: Nucleotide sequences of the M13mp18 and pUC19 vectors. Gene 33, 103-119.

ZAPF, J., KIEFER, M., MERRYWEATHER, J., MASIARZ, F., BAUER, D., BORN, W., FISCHER, J.A., and FROESCH, E.R. (1990). Isolation from adult human serum of four insulin-like growth factor (IGF) binding proteins and molecular cloning of one of them that is increased by IGF1 administration and in extra-pancreatic tumor hypoglycemia. J. Biol. Chem. 265, 14892-14898.

Address reprint requests to:
Dr. Philip J. Barr
4560 Horton Street
Emeryville, CA 94608

Received for publication October 1, 1991.
