ARCHIVES OF BIOCHEMISTRY AND BIOPHYSICS
Vol. 296, No. 1, July, pp. 190-197, 1992

cDNA Clone to Chick Corneal Chondroitin/Dermatan Sulfate Proteoglycan Reveals Identity to Decorin¹

Weishi Li, Jean-Paul Vergnes, Pamela K. Cornuet, and John R. Hassell²

Department of Ophthalmology, The Eye and Ear Institute of Pittsburgh, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania 15213

Received January 2, 1992, and in revised form March 3, 1992

A 1.6-kb cDNA clone was isolated by screening a library prepared from chick corneal mRNA with a cDNA clone to bovine decorin. The cDNA contained an open reading frame coding for a Mr 39,683 protein. A 19-amino-acid match with sequence from the N-terminus of core protein from the corneal chondroitin/dermatan sulfate proteoglycan confirmed the clone as a corneal proteoglycan, and the homology with human and bovine decorin confirmed its identity as decorin. Structural features of the deduced sequence include a 16-amino-acid signal peptide, a 14-amino-acid propeptide, cysteine residues at the N- and C-terminal regions, and a central leucine-rich region (comprising 63% of the protein) containing nine repeats of the sequence LXXLXLXXNXL/I. Chick decorin contains three variations of this sequence that are tandemly linked to form a unit and three units tandemly linked to form the leucine-rich region. The presence of β bend amino acids flanking the units may serve to delineate the units as structural elements of the leucine-rich region. Sequence homology within the repeats and the spacing of the repeats suggest that this region arose by duplication. Chick decorin primarily differs from mammalian decorins in the 19-amino-acid sequence that starts the N-terminus of the core protein. Within this region, the serine that serves as a potential acceptor for the chondroitin/dermatan sulfate side chain is preceded by a glycine instead of being followed by a glycine as it is in the mammalian decorins and all other mammalian proteoglycans. © 1992 Academic Press, Inc.

¹ The nucleotide sequence reported in this paper has been assigned Accession No. X-63797 by the GenBank/EMBL Data Bank.
² To whom correspondence should be addressed.
0003-9861/92 $5.00 Copyright © 1992 by Academic Press, Inc. All rights of reproduction in any form reserved.

The corneal stroma contains proteoglycans with chondroitin/dermatan sulfate side chains and proteoglycans with keratan sulfate side chains. These proteoglycans help maintain the regular arrangement of collagen fibrils within the corneal stroma (1, 2), a feature necessary for corneal transparency (3). The chondroitin/dermatan sulfate proteoglycans isolated from cornea have been principally characterized using biochemical and immunochemical methodology. Biosynthetic radiolabeling studies with chick corneas show that this proteoglycan contains one chondroitin/dermatan sulfate side chain and one to three N-linked oligosaccharides (4) and that it is synthesized from a Mr 40,000 precursor protein (5). Amino acid analysis of the chondroitin/dermatan sulfate proteoglycan isolated from bovine corneas shows that its core protein is leucine-rich (6). Antibodies to decorin, a proteoglycan present in a variety of different connective tissues, including skin, bone, tendon, and cartilage (6, 7), have been shown to react with the core protein of the chondroitin/dermatan sulfate proteoglycan isolated from cornea (7). Antibodies to biglycan, another proteoglycan that shows a wide tissue distribution, also react with the core protein of the chondroitin/dermatan sulfate proteoglycan from corneas but to a lesser extent than antibodies to decorin (7). RNA from bovine corneas, however, hybridizes with a cDNA clone to bovine decorin but not with a cDNA clone to bovine biglycan (7). These observations suggest that the chondroitin/dermatan sulfate proteoglycan of the cornea is decorin.

Although decorin is a relatively small proteoglycan, it has multiple biological activities. It has been shown to bind to collagen (8, 9) and TGF-β (10) and to inhibit the in vitro formation of collagen fibrils (11, 12). These properties are not altered by the removal of the glycosaminoglycan side chain, which suggests that the activities reside in the Mr 40,000 core protein. A comparison of the amino acid sequence of decorin from different species can help to identify functionally important domains. In the present study, we isolated and sequenced the N-terminus of the chick corneal chondroitin/dermatan sulfate proteoglycan. The 21-amino-acid sequence obtained was, with the exception of four amino acids, distinct from the N-terminus of human or

bovine decorin. Consequently, we used a cDNA clone to bovine decorin (13) to isolate a cDNA clone to chick decorin from a cDNA library prepared from chick corneal mRNA and compared the deduced sequence to those of bovine and human decorin. The results showed that although the N-terminal sequence of the chick decorin core protein showed only 26% identity with the mammalian decorins, the rest of the core protein is 84% identical to mammalian decorins. This indicates that the corneal chondroitin/dermatan sulfate proteoglycan is decorin.

MATERIALS AND METHODS

Purification and sequencing of decorin. The corneal chondroitin/dermatan sulfate proteoglycan was isolated as previously described (5). Briefly, corneas were excised from adult chicken heads obtained from a local slaughterhouse and extracted in 4 M guanidine-HCl. The extracts were then exchanged into urea and chromatographed on DEAE in 7 M urea to separate glycoproteins from the proteoglycans. The isolated proteoglycans were digested with chondroitinase ABC (Seikagaku Kogyo Co.), which removes the chondroitin/dermatan sulfate side chains, thereby converting those proteoglycans with chondroitin/dermatan sulfate side chains to core proteins. The chondroitinase digestion of the corneal proteoglycans produced only one core protein (Mr 40,000). This core protein was purified by FPLC³ and sequenced by Edman degradation on a Beckman Model 890M or a Porton Model 209E. Phenylthiohydantoin derivatives were identified by HPLC using a sodium acetate (containing 3.5% tetrahydrofuran)/acetonitrile gradient system.

Construction and screening of a chick cornea-derived cDNA library. Seven-day-old white leghorn chicks were decapitated and their corneas removed by excision. The corneas were frozen immediately in liquid nitrogen and stored at -80°C. RNA was extracted from the corneas using a guanidine isothiocyanate-phenol-chloroform procedure (14). Poly(A)+ mRNA was isolated by affinity chromatography on oligo(dT)-cellulose (Pharmacia LKB Biotechnology) and used to construct a Uni-ZAP XR cDNA library (Stratagene Custom Library Section). Approximately 375,000 plaque-forming recombinants from this library were transduced into Escherichia coli PLK-F cells and were screened with a 1.3-kb probe from a bovine cDNA clone (PG 28) for decorin (13) using moderate stringency (washes with 0.1× SSC, 0.1% SDS, 45°C). The fragment was labeled with 32P to high specific activity ([α-32P]dCTP, 3000 Ci/mmol, DuPont/New England Nuclear) using random primer labeling (Pharmacia Oligolabeling Kit). Positive clones were selected and rescreened to purity with the same method. These positive clones were then screened with a degenerate oligonucleotide probe (14-mer) based on sequence present in the N-terminal sequence obtained from sequencing the core protein. The probe was end labeled with γ-32P (10 mCi/ml) (ICN). Denatured phage plaques on Duralose filters (Stratagene) were washed, incubated in prehybridization solution, and then hybridized with the labeled probe at 42°C for 8 h. Following moderate stringency washes, the filters were dried and exposed to Kodak X-OMAT AR film with an intensifying screen at -80°C overnight. Positive clones were selected for further analysis. The pBluescript plasmid containing the cDNA insert was rescued from the Uni-ZAP XR vector by coinfecting the Uni-ZAP phage with f1 helper phage R408 into E. coli XL1-Blue cells. The rescued phagemids from these clones were then grown up in XL1-Blue cells and purified (Stratagene protocol). The inserts on the plasmids were cut with EcoRI and XhoI and their size was measured by agarose gel electrophoresis.
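The degenerate 14-mer used for the second screen is printed in the Results as GA(T/C)GA(T/C)CC(T/C/A/G)GT(T/C/A/G)AT. As an illustration only, the following minimal Python sketch enumerates the oligonucleotides in such a pool and checks that each one encodes Asp-Asp-Pro-Val followed by the first two bases of an Ile codon. The function names are hypothetical and the parenthesized notation is parsed as printed; nothing here is taken from the authors' own methods.

```python
from itertools import product

# Degenerate 14-mer exactly as printed in the Results section.
PROBE = "GA(T/C)GA(T/C)CC(T/C/A/G)GT(T/C/A/G)AT"

def parse_degenerate(probe):
    """Turn 'GA(T/C)...' notation into one list of allowed bases per position."""
    positions, i = [], 0
    while i < len(probe):
        if probe[i] == "(":
            close = probe.index(")", i)
            positions.append(probe[i + 1:close].split("/"))
            i = close + 1
        else:
            positions.append([probe[i]])
            i += 1
    return positions

def expand(probe):
    """List every concrete oligonucleotide in the degenerate pool."""
    return ["".join(bases) for bases in product(*parse_degenerate(probe))]

def encodes_ddpv(oligo):
    """Check the first four codons against Asp-Asp-Pro-Val (standard genetic code)."""
    codons = [oligo[i:i + 3] for i in range(0, 12, 3)]
    return (codons[0] in ("GAT", "GAC")
            and codons[1] in ("GAT", "GAC")
            and codons[2].startswith("CC")    # CCN = Pro
            and codons[3].startswith("GT"))   # GTN = Val

oligos = expand(PROBE)
print(len(oligos))  # pool size: 2 * 2 * 4 * 4
```

Stopping the probe after the second base of the Ile codon avoids adding a third degenerate position for Ile (ATT, ATC, or ATA), which keeps the pool at 64 sequences.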

³ Abbreviations used: FPLC, fast protein liquid chromatography; SSC, standard saline citrate; SDS, sodium dodecyl sulfate; PAGE, polyacrylamide gel electrophoresis.


DNA sequencing of cDNAs. Sequencing was done via the dideoxy chain termination method (15) using the Sequenase Version 2.0 Kit (United States Biochemical) with [a-??l]-dATP (1350 Ci/mmol, DuPont/New England Nuclear) and synthetic oligonucleotides. The sequencing reactions were loaded onto 6% polyacrylamide-urea gels and electrophoresed for varying times. The dried gels were exposed to X-OMAT AR (Kodak) film overnight at room temperature. The sequence from the developed films was read using an IBI Gel Reader with a Macintosh SE/30 computer using MacVector software.

Northern blot analysis. Total RNA was isolated from Day 15 embryonic chick brain, whole eyes (less cornea), heart, intestine, liver, and muscle, and from corneal fibroblasts grown in culture for 7 days, as previously described (16). Fifteen micrograms of total RNA from each of the above tissue types was electrophoresed through a 1% agarose gel containing formaldehyde and transferred to Genescreen overnight (New England Nuclear protocol). Following prehybridization, the blot was hybridized with the [32P]-labeled probe (prepared as described above) at 42°C for 32 h. After high stringency washes (New England Nuclear protocol), the washed blot was exposed to X-OMAT AR film (Kodak) with an intensifying screen at -80°C.

RESULTS

N-terminal sequencing of the chondroitin/dermatan sulfate proteoglycan core protein isolated from chick corneas yielded the following sequence: D-E-G-X-A-D-V-A-P-X-D-D-P-V-I-S-G-F-X-P-V, where X is an undetermined amino acid. This sequence, with the exception of residues 1, 2, 12, and 13, is distinct from the N-terminus of bovine or human decorin.

Screening 375,000 recombinants of the λ Uni-ZAP XR cDNA library with the cDNA for bovine decorin yielded 22 positive clones that were purified through secondary and tertiary screenings. Further selection was done by using a degenerate 14-mer oligonucleotide probe, GA(T/C)GA(T/C)CC(T/C/A/G)GT(T/C/A/G)AT, based on the sequence Asp-Asp-Pro-Val-Ile found in the N-terminal sequence of the chick chondroitin/dermatan sulfate proteoglycan. This oligonucleotide hybridized to 5 of the 22 positive clones. The sizes of cDNA inserts in these five clones were estimated using restriction enzyme digestion and agarose gel electrophoresis. Two clones (9 and 15) containing the largest inserts (1.5 kb) were selected for further characterization. Synthetic oligonucleotides complementary to the plasmid near the 5' and 3' ends of the insert were used to initiate sequencing. Sequencing continued from both strands using synthetic oligonucleotide primers to produce a series of overlapping sequences in which all regions in both strands had been read at least two times except, because of secondary structure, part of the 3' untranslated region immediately adjacent to the poly(A)+ region.

The nucleotide sequence obtained for the insert in clone 15 is shown in Fig. 1. A CACTG sequence (bases 1313-1317) and a TTCAA sequence (bases 1268-1272), both of which have been proposed to have a role in the formation of the 3' terminus of mRNA (17-19), are present upstream of the putative polyadenylation signal (bases 1345-1350). The 1.6-kb cDNA has an open reading frame that encodes a protein of 357 amino acids (Mr 39,683) (Fig.

FIG. 1. The cDNA sequence and deduced amino acid sequence of chick decorin. Nucleotide bases are numbered on the left and amino acids are numbered on the right. The wavy line denotes the signal peptide and the double underlined region matches the amino acid sequence from the N-terminus of the chick corneal chondroitin sulfate proteoglycan (shown in single letter code). Cysteine residues are indicated by (*). Possible N-glycosylation sites are noted by (A). The stop codon is designated by a dashed line. A possible polyadenylation signal (AAATAAA) as well as the sequences CACTG and TTCAA, which have been proposed to have a role in the formation of the 3' terminus of mRNA, have been underlined.

1). Residues 31-50 match 17 of the 21 amino acids obtained from sequencing the N-terminus of the proteoglycan, thereby confirming the identity of the clone as the chick corneal chondroitin/dermatan sulfate proteoglycan. The open reading frame starts with a hydrophobic region (Fig. 1, wavy underlined region) of 16 residues which contains features of a typical signal peptide. The application of the "(-3,-1)-rule" predicts the signal sequence cleavage site to be between Ala (residue 16) and Thr (residue 17) (20). The next 14 residues, which immediately follow the putative signal peptide but precede the N-terminus of the mature proteoglycan, contain several charged amino acids uncharacteristic of a leader sequence. This may represent a prepeptide immediately N-terminal to the start of the mature core protein. The deduced sequence contains two potential N-linked oligosaccharide attachment sites (Asn-Xxx-Thr/Ser, denoted by A), and these are located in the C-terminal half of the protein.
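A scan for the Asn-Xxx-Thr/Ser pattern named above can be sketched in a few lines of Python. This is a hedged illustration, not the authors' procedure: the function name is hypothetical, the fragment is the stretch around the unit 2, motif 3 region as it can be read from the legible lines of Fig. 1, and the starting residue number (201) is inferred from the figure's line numbering. The pattern is taken literally from the text, so it does not apply the common refinement that excludes proline at the middle position.

```python
import re

# Asn-Xxx-Thr/Ser as written in the text; the lookahead reports overlapping sites too.
SEQUON = re.compile(r"(?=N[A-Z][ST])")

def find_sequons(protein, offset=1):
    """Return residue numbers of Asn residues that begin an N-X-T/S site.

    `offset` is the residue number assigned to protein[0].
    """
    return [m.start() + offset for m in SEQUON.finditer(protein)]

# Fragment read from Fig. 1; numbering from 201 is an assumption based on the
# line-end residue numbers printed in the figure.
fragment = "SYIRIADTNITSIPKGLPPSLT"
print(find_sequons(fragment, offset=201))
```

With these assumptions the single hit falls at residue 209, directly after the threonine at residue 208 that the text describes as replacing the conserved asparagine in this repeat.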

FIG. 2. Alignment of the leucine repeats in chick decorin. Identical residues are boxed. A consensus sequence is shown below each motif (motif 1, LPPSLTELXLXXNKISKI; motif 2, LXNLXXLXLXXNXI; motif 3, LXXLXLXXNNL). X, any amino acid.

There are a total of six cysteine residues (denoted by *) in the protein, excluding the cysteine in the proposed signal peptide sequence. Four of the cysteines are located near the N-terminus and two are near the C-terminus.

Chick decorin possesses an extensive leucine-rich region that contains multiple leucine-rich repeats. Nine of the leucine-rich repeats are located between amino acids 77 and 279, constituting 63% of the protein. Sequence analysis of these repeats indicates that there are three variations, or motifs, containing the basic sequence LXXLXLXXNXL/I (Fig. 2). Motif 1 is LPPSLTELXLXXNKISKI, motif 2 is LXNLXXLXLXXNXI, and motif 3 is LXXLXLXXNNL. The three motifs have different amino acids at positions other than the basic sequence, and motifs 2 and 3 are progressively shorter. The motifs are arranged in a distinct pattern. Motifs 1, 2, and 3 are tandemly linked to form a unit and three such units are tandemly linked to form the leucine-rich region. Motif 3 in unit 2 is the least conserved of the nine leucine-rich repeats. There is an isoleucine in three of the four positions for leucine and a threonine (residue 208) in place of the asparagine. This particular leucine-rich repeat also contains one of the two potential N-linked oligosaccharide sites on the protein.

The spacing of the motifs and the units supports the proposed alignment (Fig. 2). Using asparagine, the most conserved amino acid in eight of the motifs, and threonine for motif 3 in unit 2 as a reference point within each motif, the units are 21 amino acids apart while the motifs within a unit are 24 amino acids apart with one exception: motifs 2 and 3 in unit 2 are 26 amino acids apart. The similarities in amino acid sequence between motif 2 and motif 3 in all three units also support the proposed alignment. A composite showing the spacing of the motifs and the units, the position of the cysteine residues, and the location of possible N-linked oligosaccharide sites on the proteoglycan is shown in Fig. 3.

The 12 proline residues within the leucine-rich repeat region (residues 78-83) also show a unique distribution (Fig. 2). Only one proline residue is found within the basic sequence of LXXLXLXXNXL/I and this is in motif 2 of unit 2. Three of the proline residues are between motif 2 and motif 3. Eight of the proline residues, however, are located at the beginning of motif 1 or in the region just following motif 3. Thus, all three units are flanked by proline residues at both their amino and carboxy terminal ends. In addition to proline, the most common amino acids found in β bend structures include glycine, serine, asparagine, and aspartic acid. These amino acids are also present in the regions flanking the units. For example, preceding unit 1 (starting at amino acid 74) is the sequence PXDXPXD and following unit 1 (starting at amino acid 143) is the sequence PXNXPXS. Thus, there is a sequence of seven amino acids containing a β bend amino acid in every other position (with proline in the first and fifth position within the sequence) both following and preceding each unit except at the carboxy terminus of unit 3, where only half of this sequence is found. This β bend amino acid-rich sequence likely results in a β bend structure that delineates the units as structural elements.

Examination of the amino acid sequence (Fig. 1) reveals one more potential leucine repeat near the carboxy terminus (amino acids 292-302). It has the structure IXXVXLXXNKI. Comparison to the consensus sequence (LXXLXLXXNXI/L) shows that two leucines have been replaced, one by isoleucine (I), the other by valine (V). Both isoleucine and valine are chemically quite similar to leucine and they often substitute for leucine in the other motifs. This potential repeat may belong to motif

FIG. 3. Structural model for decorin. The symbols used in the diagram are as follows: C, cysteine; N, asparagine; T, threonine; inverted Y, potential N-linked glycosylation site; open box, LPPSLTELXLXXNKISKI; closed box, LXNLXXLXLXXNXI; hatched box, LXXLXLXXNNL. The dashed line represents the signal peptide.

1 since it contains a lysine (K) and is followed by XXXXF (residues 303-310), which is identical to what follows motif 1, but the spacing between this potential repeat and motif 3 in unit 3 is 23 amino acids, not the 21 that is usually found between motif 1 and motif 3. This may reflect the degeneracy not only in sequence but also in spacing arrangement. In view of these differences, this potential repeat was not included as a motif.

Comparison of the chick amino acid sequence with those of human (21) and bovine (13) decorin reveals striking homology (Fig. 4). Major variations exist only in the signal peptide region and the most N-terminal region, but overall the sequence is 77% identical with those of human and bovine decorin. This confirms the identity of the cDNA clone as chick decorin.

The amino acid sequence of chick decorin was compared to those of bovine fibromodulin (22), chick lumican (23), and human biglycan (7) because these proteoglycans also have extensive leucine-rich repeats (Fig. 5). The boxed areas contain the residues which are identical to those of decorin. All share the same basic structures: the alignment of the four cysteine residues at the N-terminus, the alignment of the two cysteine residues at the C-terminus, and the presence of the multiple leucine-rich repeats in the middle, with most parts of motifs well conserved in all four proteins. The amino- and carboxy-terminal positions of these sequences show the greatest variation. The percentage identity of decorin with biglycan is 50%, with fibromodulin 29%, and with lumican 26%.

The presence of decorin mRNA in a variety of tissues was examined by Northern blot analysis (Fig. 6). It was detected as a 2.5-kb band at relatively high levels in RNA from chick corneal fibroblasts (lane 6) and whole eyes (less cornea) (lane 5). Lower levels were present in RNA from heart (lane 4) and even lower levels were present in RNA from breast muscle (lane 1), intestine (lane 3), and liver (lane 2). No decorin mRNA was detected in the brain (lane 7).

DISCUSSION

The deduced amino acid sequence from the cDNA clone to chick decorin contains an open reading frame of 357 amino acids corresponding to a molecular weight of 39,683 for the complete protein and of 37,240 for the protein without the signal peptide. This theoretical molecular weight of the mature protein is consistent with that previously determined by SDS-PAGE for the precursor protein made in the presence of tunicamycin (5). The 77% identity with bovine and human decorin confirms the identity of this clone as chick decorin. The match with the amino acid sequence obtained from the N-terminus of the chick corneal chondroitin/dermatan sulfate proteoglycan confirms its identity as a corneal proteoglycan.

FIG. 4. Amino acid sequence comparison of chick decorin (CDN), bovine decorin (BDN), and human decorin (HDN). Residues identical to those of chick decorin are boxed.

FIG. 5. Amino acid sequence comparison of chick decorin (DCN), human biglycan (BGN), chick lumican (LUM), and bovine fibromodulin (BFM). Residues identical to those of decorin are boxed.

FIG. 6. Northern blot analysis of chick decorin RNA in tissues from Day 15 chick embryos and cultured corneal fibroblasts from Day 15 corneas. Fifteen micrograms of total RNA was loaded on the gel in each lane. Lane 1, breast muscle; lane 2, liver; lane 3, intestine; lane 4, heart; lane 5, whole eye (less cornea); lane 6, corneal fibroblasts; lane 7, brain.

Northern blot analysis shows that the mRNA for chick decorin is, like the mammalian decorins, expressed in a variety of different tissues. For mammalian decorin, these tissues are primarily those that are type I and type II collagen-rich (24).

The most striking feature of this protein is the presence of multiple leucine-rich sequences (LXXLXLXXNXL/I), which have been shown to contain β sheet structure (25). Chick decorin contains three variations or motifs of this leucine sequence. The sequence alignment (Fig. 2) and motif spacing (Fig. 3) suggest that the original leucine-rich repeat was partially duplicated twice to make a unit of three different motifs, and then the unit was duplicated twice to produce a total of nine leucine-rich motifs. A sequence of β bend amino acids present in the regions flanking the units probably serves to create loops between the β sheet structures produced by the leucine-rich sequences. It is interesting to note that, although biglycan, fibromodulin, and lumican deviate from this β bend sequence, they do contain two proline residues flanking both ends of the units.

The spacing (in number of amino acids) between the conserved asparagines in the nine leucine-rich motifs in decorin is similar to the spacing of the leucine-rich repeats in other members of the small interstitial proteoglycan gene family (Table I). It is interesting to note that in all these proteoglycans the spacing between motifs within units ranges from 23 to 26 amino acids, but between units the motif spacing is always 21 amino acids. The highly conserved nature of this pattern indicates that the units play an important role in the structural integrity and/or the function of the core protein of these proteoglycans.

The leucine-rich repeat is found in at least 19 other proteins besides decorin. These proteins are found in a diverse range of species, from the 83-kDa subunit of human carboxypeptidase N to yeast adenylate cyclase and the Drosophila proteins Toll and chaoptin (26-31). In most cases the repeats have been implicated in protein-protein binding. Examination of the functions of these leucine motif-bearing proteins shows that they interact with other components in a protein-protein or protein-cell membrane (extracellularly and intracellularly) fashion. The leucine-rich regions in the small interstitial proteoglycans may be involved in both of these interactions. In decorin, this region may be important for binding to collagen (8, 9), fibronectin, or TGF-β (10).
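The repeat consensus and the spacing pattern described above lend themselves to a mechanical check. The following Python sketch is illustrative only: the helper names are hypothetical, the two protein fragments are short stretches read from the legible lines of Fig. 1, and the spacing strings are copied from Table I. It is not a reconstruction of how the alignment in Fig. 2 was produced.

```python
import re

# Basic repeat LXXLXLXXNXL/I; the lookahead keeps overlapping candidates.
LRR = re.compile(r"(?=(L..L.L..N.[LI]))")

def find_repeats(protein):
    """Return (0-based start, matched 11-mer) for every consensus hit."""
    return [(m.start(), m.group(1)) for m in LRR.finditer(protein)]

# Two fragments read from Fig. 1 (assumed to be transcribed correctly).
print(find_repeats("LKKLERLYLSKNNLKELP"))    # contains a motif 3 type repeat
print(find_repeats("GLPPSLTELHLDGNKISKI"))   # contains a motif 1 type repeat

# Asparagine-to-asparagine spacings as listed in Table I.
TABLE_I = {
    "Decorin":      "24-24-21-24-26-21-24-24",
    "Biglycan":     "25-24-21-24-25-21-24-24",
    "Lumican":      "24-26-21-24-25-21-24-25",
    "Fibromodulin": "24-26-21-24-23-21-24-25",
}

def split_spacing(row):
    """Separate within-unit spacings from the two between-unit spacings.

    With nine motifs in three units, the 3rd and 6th of the eight gaps
    (indices 2 and 5) are the ones that cross a unit boundary.
    """
    values = [int(v) for v in row.split("-")]
    between = [values[2], values[5]]
    within = [v for i, v in enumerate(values) if i not in (2, 5)]
    return within, between

for name, row in TABLE_I.items():
    within, between = split_spacing(row)
    print(name, min(within), max(within), between)
```

A strict regular expression of this kind finds only repeats that match the basic sequence exactly; degenerate repeats such as motif 3 of unit 2, where isoleucine and threonine replace consensus residues, would need a looser pattern or a scoring approach.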

TABLE I
Motif Spacing in a Small Interstitial Proteoglycan Gene Family

Proteoglycan name    Number of amino acids between the conserved asparagine in adjacent leucine-rich repeatsᵃ
Decorin              24-24-21-24-26-21-24-24
Biglycan             25-24-21-24-25-21-24-24
Lumican              24-26-21-24-25-21-24-25
Fibromodulin         24-26-21-24-23-21-24-25

ᵃ Except for motif 3 in unit 2 of decorin, where there is a threonine in place of the asparagine.

A comparison of the amino acid sequence of chick decorin with that of the mammalian decorins reveals some interesting differences. The 16-residue signal peptides in human and bovine decorin are completely identical, but the chick decorin signal peptide is only 38% identical to those of the mammalian decorins. In contrast, the prepeptide region (residues 17-30) of chick decorin shares 93% identity with the prepeptide region of bovine decorin and 78% identity with the prepeptide region of human decorin. Likewise, the prepeptide regions of human and bovine biglycan are 86% identical (32). The prepeptide region of human decorin is, however, only 35% identical to the prepeptide region of human biglycan. Thus, the prepeptide regions of decorin and biglycan may have separate functions, although the prepeptide region of decorin is not found on the proteoglycan extracted from the extracellular matrix (33). The 19-amino-acid sequence starting with the N-terminus (residue 31) of the mature proteoglycan is only 26% conserved between chick and the other two species, with only 50% conserved between bovine and human. This region may be species specific or may simply be structural filler to separate the glycosaminoglycan attachment site from the N-terminal cysteine globule and the leucine-rich repeats. The rest of the chick decorin core protein shows 84% identity with the bovine decorin core protein. Thus, overall bovine and human decorin are more closely related to each other (93% identity at the amino acid level) than to chick decorin (77% identity).

Chick and mammalian decorin differ in the sequence of amino acids surrounding the glycosaminoglycan attachment site. The serine at position 4 from the N-terminus of the core protein has been identified as the chondroitin/dermatan sulfate attachment site for human and bovine decorin (7, 21, 34, 35). The deduced sequence for chick decorin also shows a serine in this position (residue 34, Fig. 1). N-terminal sequencing of the chick decorin core protein, however, did not identify the amino acid in this position, suggesting that it was modified. This serine (residue 34) in chick decorin is preceded by a glycine, but in mammalian decorin the serine is followed by a glycine. All the proposed consensus sequences for chondroitin/dermatan sulfate attachment contain Ser-Gly (36-39) except for the α2 chain of chick collagen IX, where Gly-Ser was proposed (40). It is interesting to note that both chick decorin and chick collagen IX have the sequence E-G preceding the serine and the sequence A-D following the serine. It is possible that the sequence E-G-S-A-D may be a chick-specific recognition site for glycosaminoglycan attachment on decorin and collagen IX.
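The percent identities quoted throughout the Discussion are position-by-position comparisons of aligned sequences. A minimal sketch of that calculation follows, under stated assumptions: the function name is hypothetical, gaps are not handled, and the two 34-residue strings are the second and third rows of one legible block of Fig. 4 (taken here to be the two mammalian sequences, based on the order given in the caption). Because this is a short excerpt, the result is not expected to reproduce the whole-protein figures given in the text.

```python
def percent_identity(a, b):
    """Percent of aligned positions carrying the same residue (no gap handling)."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

# Rows as printed in Fig. 4; which species each row belongs to is an assumption.
row_2 = "SYIRIADTNITTIPQGLPPSLTELHLDGNKITYV"
row_3 = "SYIRIADTNITSIPQGLPPSLTELHLDGNKISRV"

print(round(percent_identity(row_2, row_3), 1))
```

For these two rows, 31 of 34 positions agree. A full comparison would first require an alignment that places gaps, which this sketch deliberately leaves out.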

ACKNOWLEDGMENTS

This work was supported by NIH Grants EY08104 and 5 P30 EY08098 and by Research to Prevent Blindness, Inc. John R. Hassell is a Doris and Jules Stein Professor. The authors thank Shukti Chakravarti, Jody Rada, Tom Blochberger, and Nirmala SundarRaj for their helpful critique of the manuscript.

REFERENCES

1. Borcherding, M., Blacik, L. J., Sittig, R. A., Bizzell, J. W., Breen, M., and Weinstein, H. G. (1975) Exp. Eye Res. 21, 59-70.
2. Hassell, J. R., Cintron, C., Kublin, C., and Newsome, D. A. (1983) Arch. Biochem. Biophys. 222, 362-369.
3. Maurice, D. M. (1984) in The Eye (Davson, H., Ed.), Vol. 1B, pp. 1-158, Academic Press, London.
4. Midura, R. J., and Hascall, V. C. (1989) J. Biol. Chem. 264, 1423-1430.
5. Schrecengost, P. K., Blochberger, T. C., and Hassell, J. R. (1992) Arch. Biochem. Biophys. 292, 54-61.
6. Heinegard, D. K., Bjorne-Persson, A., Coster, L., Franzen, A., Gardell, S., Malstrom, A., Paulsson, M., Sandfolk, R., and Vogel, K. (1985) Biochem. J. 230, 181-194.
7. Fisher, L. W., Termine, J. D., and Young, M. F. (1989) J. Biol. Chem. 264, 4571-4576.
8. Pringle, G., and Dodd, C. (1990) J. Histochem. Cytochem. 38, 1405-1411.
9. Brown, D. C., and Vogel, K. G. (1989) Matrix 9, 468-478.
10. Yamaguchi, Y., Mann, D. M., and Ruoslahti, E. (1990) Nature 346, 281-284.
11. Vogel, K. G., Paulsson, M., and Heinegard, D. (1984) Biochem. J. 223, 587-597.
12. Rada, J. A., Schrecengost, P. K., and Hassell, J. R. (1991) Invest. Ophthalmol. Vis. Sci. 32, 1010. [Abstract]
13. Day, A. A., McQuillan, C. I., Termine, J. D., and Young, M. R. (1987) Biochem. J. 248, 801-805.
14. Chomczynski, P., and Sacchi, N. (1987) Anal. Biochem. 162, 156-159.
15. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467.
16. Sugrue, S. P. (1991) Invest. Ophthalmol. Vis. Sci. 32, 140-146.
17. Berget, S. M. (1984) Nature 309, 179-182.
18. McLauchlan, J., Gaffney, D., Whitton, J. L., and Clements, J. B. (1985) Nucleic Acids Res. 13, 1347-1368.
19. Urano, Y., Watanobe, K., Sakai, M., and Tamaoki, T. (1986) J. Biol. Chem. 261, 3244-3251.
20. Von Heijne, G. (1986) Nucleic Acids Res. 14, 4683-4690.
21. Krusius, T., and Ruoslahti, E. (1986) Proc. Natl. Acad. Sci. USA 83, 7683-7687.
22. Oldberg, A., Antonsson, P., Lindblom, K., and Heinegard, D. (1989) EMBO J. 8, 2601-2604.
23. Blochberger, T. C., Vergnes, J.-P., Hemple, J., and Hassell, J. R. (1992) J. Biol. Chem. 267, 347-352.
24. Bianco, P., Fisher, L. W., Young, M. F., Termine, J. D., and Robey, P. G. (1990) J. Histochem. Cytochem. 38, 1549-1563.
25. Krantz, D. D., Zidovetzki, R., Kagan, B. L., and Zipursky, S. L. (1991) J. Biol. Chem. 266, 16801-16807.
26. Ohkura, H., and Yanagida, M. (1991) Cell 65, 149-157.
27. Mikol, D. D., Alexakos, M. J., Bayley, C. A., Lemons, R. S., LeBeau, M. M., and Stefansson, K. (1990) J. Cell Biol. 111, 2673-2679.
28. Fresno, L. D., Harper, D. S., and Keene, J. D. (1981) Mol. Cell. Biol. 11, 1578-1589.
29. Madisen, L., Neubauer, M., Plowman, G., Rosen, D., Segarini, P., Dasch, J., Thompson, A., Ziman, J., Bentz, H., and Purchio, A. F. (1990) DNA Cell Biol. 9, 303-309.
30. Smiley, B. L., Stadnyk, A. W., Myler, P. J., and Stuart, K. (1990) Mol. Cell. Biol. 10, 6436-6444.
31. Rothberg, J. M., Jacobs, R. J., Goodman, C. S., and Artavanis-Tsakonas, S. L. (1990) Genes Dev. 4, 2169-2187.
32. Marcum, J. A., and Thompson, M. A. (1991) Biochem. Biophys. Res. Comm. 175, 706-712.
33. Sawhney, R. S., Hering, T. M., and Sandell, L. J. (1991) J. Biol. Chem. 266, 9231-9240.
34. Chopra, R. K., Pearson, C. H., Pringle, G. A., Fackre, D. S., and Scott, P. G. (1985) Biochem. J. 232, 277-279.
35. Mann, D. M., Yamaguchi, Y., Bourdon, M. A., and Ruoslahti, E. (1990) J. Biol. Chem. 265, 5317-5323.
36. Neame, P. J., Choi, H. U., and Rosenberg, L. C. (1989) J. Biol. Chem. 264, 2653-2658.
37. Ruoslahti, E., Bourdon, M., and Krusius, T. (1986) Functions of the Proteoglycans, pp. 260-271, Ciba Foundation Symposium 124, Wiley, Chichester.
38. Bourdon, M., Krusius, T., Campbell, S., Schwartz, N., and Ruoslahti, E. (1989) Proc. Natl. Acad. Sci. USA 84, 3194-3198.
39. Bourdon, M. A., Oldberg, A., Pierschbacher, M., and Ruoslahti, E. (1985) Proc. Natl. Acad. Sci. USA 82, 1321-1325.
40. McCormick, D., van der Rest, M., Goodship, J., Lozano, G., Ninomiya, Y., and Olsen, B. R. (1987) Proc. Natl. Acad. Sci. USA 84, 4044-4048.
