Biochimica et Biophysica Acta, 1132 (1992) 225-227
© 1992 Elsevier Science Publishers B.V. All rights reserved 0167-4781/92/$05.00
cDNA sequence for rat dermatan sulfate proteoglycan-II (decorin)
Susan R. Abramson and J. Frederick Woessner, Jr.
Department of Biochemistry and Molecular Biology, REPSCEND Labs and Department of Medicine, University of Miami School of Medicine, Miami, FL (USA)
(Received 10 June 1992)
Key words: Decorin; Proteoglycan; cDNA sequence; (Rat)
A cDNA clone for dermatan sulfate proteoglycan-II, or decorin, has been isolated from a rat uterus library and sequenced. The cDNA and deduced amino acid sequences are 79 and 77% identical to the previously reported human and bovine sequences, respectively. The rat protein contains potential attachment sites for two glycosaminoglycan chains and four N-linked oligosaccharides, six conserved cysteine residues and multiple repeats of a leucine-rich sequence, LXXLXLXXNXL/I. Overlapping the C-end of one of these repeats is an NKISK sequence, which has been implicated in binding to fibronectin.
The physical properties of a connective tissue are determined by the extracellular matrix components of that tissue. These include the major structural proteins such as collagens, as well as various glycoproteins and proteoglycans (PGs). One such PG is decorin, a small dermatan sulfate PG that is present in many types of connective tissue, including tendon, skin, bone, smooth muscle and articular cartilage, and has been shown to bind specifically to collagen types I and VI, fibronectin and transforming growth factor-β (TGF-β) [1-4]. Our studies have demonstrated the presence of decorin in the rat cervix and have indicated that this proteoglycan may play a role in the reorganization of fibrillar collagen in the cervix during the ripening and dilatation processes [5,6]. We have isolated a cDNA clone for rat decorin so that we can study its gene expression in the cervix during pregnancy.
A Sprague-Dawley rat uterus cDNA library in λgt10 (Clontech, Palo Alto, CA) was screened by PCR and plaque hybridization. Degenerate PCR primers were designed based on our previously reported amino acid sequence for rat decorin and the cDNA sequences for human and bovine decorin [7,8] and were used to
Correspondence to: S.R. Abramson, University of Miami School of Medicine, Department of Biochemistry (R-127), P.O. Box 016960, Miami, FL 33101, USA.
The sequence data reported in this paper have been submitted to the EMBL/GenBank Data Libraries under the accession number Z12298.
Abbreviations: PG, proteoglycan; GAG, glycosaminoglycan; PCR, polymerase chain reaction; TGF-β, transforming growth factor-β; LRR, leucine-rich repeat.
amplify a 512 bp fragment from the library. This PCR product was directly sequenced with the dideoxynucleotide Sanger method (Sequenase Version 2.0 kit) to confirm that it was a fragment of the decorin cDNA. The PCR product was then used as a probe to screen the library by plaque hybridization. Three rounds of screening were performed in order to isolate positive plaques. Two clones of length 1.3 kb and 1.1 kb were chosen to be subcloned into pGEM-3Zf(+). The pGEM-3Zf(+) clones were transformed into DH5α cells, amplified, isolated via the alkaline lysis protocol given by Promega, and sequenced as above. Multiple primers were used in order to sequence the clones completely.
The two clones, of length 1311 and 1065 bp, are identical at their 3' ends, the size difference being due to truncation at the 5' end. A single point of sequence difference between the two clones is found in the open reading frame, but this makes no difference in the deduced amino acid sequence (see Fig. 1). The 1.3 kb clone contains a complete open reading frame, as evidenced by a Met initiation codon preceded by three stop codons and followed by 353 sense codons before the next stop codon is found in that reading frame. The 5' and 3' untranslated regions are 179 and 67 bp, respectively. The predicted 354 residue protein contains the identical amino acid sequences reported for the N-terminus and a CNBr fragment of decorin isolated from the rat cervix. The N-terminal sequence previously reported begins 30 residues from the translation initiation site, indicating that there is a signal peptide which is cleaved before secretion. Indeed, the
-179 cg
-177 gttttttttttttttcaacctagtgacagtcacagagcagcaccaccccctcctccttt
-118 (line not recovered)
-59 ttgagggctccggtggcaaatacccggattaaaaggtggtgaaaacgcatgagacaacc
1 ATG AAG GCA ACT CTC GTC TTA TTC CTT CTG GCG CAA GTC TCT TGG
1 Met Lys Ala Thr Leu Val Leu Phe Leu Leu Ala Gln Val Ser Trp
46 GCT GGA CCA TTT GAG CAG AGA GGA TTA TTT GAC TTC ATG CTA GAA
16 Ala Gly Pro Phe Glu Gln Arg Gly Leu Phe Asp Phe Met Leu Glu
91 GAT GAG GCC TCT GGC ATA ATC CCT TAC GAC CCT GAC AAT CCC CTG
31 Asp Glu Ala Ser Gly Ile Ile Pro Tyr Asp Pro Asp Asn Pro Leu
136 ATA TCT ATG TGC CCC TAC CGA TGC CAA TGC CAT CTC CGA GTG GTG
46 Ile Ser Met Cys Pro Tyr Arg Cys Gln Cys His Leu Arg Val Val
181 CAG TGT TCT GAT CTG GGT CTG GAC AAA GTA CCC TGG GAG TTT CCA
61 Gln Cys Ser Asp Leu Gly Leu Asp Lys Val Pro Trp Glu Phe Pro
226 CCT GAC ACA ACA TTG CTA GAC CTG CAA AAC AAC AAA ATA ACA GAG
76 Pro Asp Thr Thr Leu Leu Asp Leu Gln Asn Asn Lys Ile Thr Glu
271 ATC AAA GAG GGG GCC TTT AAG AAC CTG AAG GAC TTG CAT ACC TTG
91 Ile Lys Glu Gly Ala Phe Lys Asn Leu Lys Asp Leu His Thr Leu
316 ATC CTT GTC AAC AAC AAG ATC AGC AAG ATC AGC CCA GAG GCA TTT
106 Ile Leu Val Asn Asn Lys Ile Ser Lys Ile Ser Pro Glu Ala Phe
361 ??? CCT CTA GTG AAG TTG GAA AGG CTT TAT CTG TCT AAG AAC CAC
121 Lys Pro Leu Val Lys Leu Glu Arg Leu Tyr Leu Ser Lys Asn His
406 CTA AAG GAG CTG CCC GAA AAA TTG CCC AAA ACA CTC CAG GAG CTT
136 Leu Lys Glu Leu Pro Glu Lys Leu Pro Lys Thr Leu Gln Glu Leu
451 CGA CTC CAC GAC AAT GAG ATC ACC ??? CTG AAG AAA TCT GTG TTC
151 Arg Leu His Asp Asn Glu Ile Thr Lys Leu Lys Lys Ser Val Phe
496 AAT GGA CTG AAC CGT ATG ATT GTC ATA GAA CTG GGC GGC AAC CCA
166 Asn Gly Leu Asn Arg Met Ile Val Ile Glu Leu Gly Gly Asn Pro
541 CTG AAA AAC TCT GGG ATT GAA AAT GGA GCC TTG CAG GGA ATG AAG
181 Leu Lys Asn Ser Gly Ile Glu Asn Gly Ala Leu Gln Gly Met Lys
586 GGT CTC GGA TAC ATC CGC ATC TCA GAC ACC AAC ATA ACT GCT ATT
196 Gly Leu Gly Tyr Ile Arg Ile Ser Asp Thr Asn Ile Thr Ala Ile
631 CCT CAA GGT CTG CCC ACT TCT ATC AGT GAA CTG CAT CTG GAT GGC
211 Pro Gln Gly Leu Pro Thr Ser Ile Ser Glu Leu His Leu Asp Gly
A (marker printed above the next line; position 684 is the third base of codon 228, ATC)
676 AAC AAG ATC GCC AAA GTT GAT GCA GCC AGC CTG AAA GGA ATG TCT
226 Asn Lys Ile Ala Lys Val Asp Ala Ala Ser Leu Lys Gly Met Ser
721 AAT TTG TCT AAG CTG GGT TTG AGC TTC AAT AGC ATC ACC GTT GTG
241 Asn Leu Ser Lys Leu Gly Leu Ser Phe Asn Ser Ile Thr Val Val
766 GAA AAT GGC AGT CTG GCT AAT GTT CCT CAT CTG AGG GAG CTC CAC
256 Glu Asn Gly Ser Leu Ala Asn Val Pro His Leu Arg Glu Leu His
811 TTG GAC AAC AAC AAA CTC CTC AGA GTG CCT GCT GGG CTG GCA CAG
271 Leu Asp Asn Asn Lys Leu Leu Arg Val Pro Ala Gly Leu Ala Gln
856 CAT AAA TAT GTC CAG GTC GTC TAC CTT CAT AAC AAC AAC ATC TCC
286 His Lys Tyr Val Gln Val Val Tyr Leu His Asn Asn Asn Ile Ser
901 GAA GTT GGG CAG CAT GAC TTC TGC CTC CCT TCA TAC CAG ACT AGG
301 Glu Val Gly Gln His Asp Phe Cys Leu Pro Ser Tyr Gln Thr Arg
946 AAG ACT TCC TAC ACT GCC GTG AGT CTT TAT AGC AAC CCT GTC CGG
316 Lys Thr Ser Tyr Thr Ala Val Ser Leu Tyr Ser Asn Pro Val Arg
991 TAT TGG CAA ATT CAC CCA CAC ACC TTC AGA TGT GTC TTC GGG CGC
331 Tyr Trp Gln Ile His Pro His Thr Phe Arg Cys Val Phe Gly Arg
1036 TCT ACC ATT CAA CTT GGG AAC TAC AAG TAA ctcccaaacagcctcattt
346 Ser Thr Ile Gln Leu Gly Asn Tyr Lys .
(??? = codon not legible in this copy)
first 20 residues of the N-terminus comprise a highly hydrophobic region, typical of a signal peptide. The calculated Mr for the precursor and secreted forms are 39807 and 36366, respectively, the latter being in agreement with the reported Mr of 37000 for the decorin core protein isolated from the rat cervix. The rat decorin cDNA and core protein are 79 and 77% identical to the human and bovine sequences, excluding a gap equivalent to five and six amino acids, respectively (Fig. 2). This gap occurs near the N-terminus, in the hypervariable region, which has been shown to be the major site of difference among decorin protein sequences. The rat protein sequence contains two of the three potential GAG chain attachment sites found in the human and bovine decorins, including the site, four residues from the N-terminus, where the single GAG chain of decorin attaches. The cDNA data confirm the unusual sequence following this site found in rat decorin (SGII instead of SGIG), reported by Kokenyesi and Woessner. The rat protein has four potential N-linked glycosylation sites, three of which are also present in the human and bovine proteins. Furthermore, six cysteine residues are conserved in all three homologs (see Fig. 2). The rat decorin core protein contains six exact leucine-rich repeats (LRR) of the consensus sequence LXXLXLXXNXL/I (10 repeats are found if stringency is relaxed), which has been reported to be found in all the proteins in a subfamily of small proteoglycans, which includes decorin, biglycan, fibromodulin and lumican.
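The consensus search described above can be illustrated with a short pattern match. This is a minimal sketch and not the authors' procedure: it assumes that the consensus LXXLXLXXNXL/I can be written as the regular expression `L..L.L..N.[LI]` (X = any residue) and applies it to a 14-residue fragment of the deduced rat sequence (residues 102-115 as read from Fig. 1). The names `LRR_PATTERN` and `find_lrr` are hypothetical.

```python
import re

# Consensus LXXLXLXXNXL/I written as a regular expression (X = any residue).
# The lookahead allows overlapping matches to be reported as well.
LRR_PATTERN = re.compile(r"(?=(L..L.L..N.[LI]))")


def find_lrr(protein):
    """Return (start, matched_text) for each exact consensus match (0-based start)."""
    return [(m.start(), m.group(1)) for m in LRR_PATTERN.finditer(protein)]


# Residues 102-115 of the deduced rat decorin sequence, as read from Fig. 1.
fragment = "LHTLILVNNKISKI"

repeats = find_lrr(fragment)
nkisk_start = fragment.find("NKISK")

print(repeats)       # [(0, 'LHTLILVNNKI')]
print(nkisk_start)   # 8
```

In this fragment the repeat spans offsets 0-10 and NKISK begins at offset 8, so the two share the last three residues of the repeat, in line with the statement that NKISK overlaps the C-end of one repeat.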
Furthermore, the LRR is found in a wide variety of proteins whose function depends on protein-protein interactions, including receptor and adhesion proteins (human glycoprotein Ib and oligodendrocyte-myelin glycoprotein, rat lutropin-choriogonadotropin receptor, and Drosophila slit, toll and chaoptin proteins) and enzymatic or inhibitory proteins (adenylate cyclase, lysine carboxypeptidase and ribonuclease/angiogenin inhibitor) [12-14]. Decorin also contains the NKISK sequence which has been shown to be important for decorin's ability to bind to fibronectin. This sequence overlaps the C-end of one LRR (see Fig. 2) and is also found in a similar position in the α-2-glycoprotein. Synthetic peptides containing LRRs form amphipathic β-sheets, and it seems plausible that decorin's LRRs and conserved cysteine residues provide a structural framework for at least
Fig. 1. Nucleotide and deduced amino acid sequence for rat decorin. Nucleic acid sequences in capital letters indicate the open reading frame; small letters indicate the untranslated regions. Underlined sequence indicates previously reported amino acid sequence from isolated rat decorin. Numbering starts at initiation codon (positions before this are negative). Position 684 is the point of contention between the two clones, as indicated.
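The figures quoted in the text can be cross-checked with a few lines of code. The sketch below is illustrative only and is not the analysis software used in the paper: it translates the first 15 codons of the open reading frame with the standard genetic code and adds up the reported segment lengths (5' untranslated region, Met plus 353 sense codons, one stop codon, 3' untranslated region). The names `CODON_TABLE` and `translate` are hypothetical helpers.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(orf):
    """Translate an in-frame DNA string codon by codon ('*' marks a stop)."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, usable, 3))


# First 15 codons of the open reading frame shown in Fig. 1.
first_codons = (
    "ATG AAG GCA ACT CTC GTC TTA TTC CTT CTG GCG CAA GTC TCT TGG"
).replace(" ", "")
signal_start = translate(first_codons)

# Length bookkeeping using the values stated in the text.
five_prime_utr = 179
three_prime_utr = 67
sense_codons = 1 + 353   # initiating Met plus 353 further sense codons
stop_codons = 1
total_bp = five_prime_utr + 3 * (sense_codons + stop_codons) + three_prime_utr

print(signal_start)  # MKATLVLFLLAQVSW
print(total_bp)      # 1311
```

The total of 1311 bp agrees with the reported length of the longer clone, and the translated residues match the start of the hydrophobic N-terminal region.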
Rat (r) decorin, deduced amino acid sequence:
1 MKATLVLFLLAQVSWAGPFEQRGLFDFMLEDEASGIIPYDPDNPLISMCPYRCQCHLRVV
61 QCSDLGLDKVPWEFPPDTTLLDLQNNKITEIKEGAFKNLKDLHTLILVNNKISKISPEAF
121 KPLVKLERLYLSKNHLKELPEKLPKTLQELRLHDNEITKLKKSVFNGLNRMIVIELGGNP
181 LKNSGIENGALQGMKGLGYIRISDTNITAIPQGLPTSISELHLDGNKIAKVDAASLKGMS
241 NLSKLGLSFNSITVVENGSLANVPHLRELHLDNNKLLRVPAGLAQHKYVQVVYLHNNNIS
301 EVGQHDFCLPSYQTRKTSYTAVSLYSNPVRYWQIHPHTFRCVFGRSTIQLGNYK
Hypervariable region (rat residues 39-48; dots and dashes are alignment gaps):
r YDPDNPLISM......
b EEHFPEVPEIEPMGPV
h EV--DRDFEPSL-GPV
Identity with the rat sequence: bovine (b) 77%, human (h) 79%.
[The remaining bovine and human residues and the symbols marking points of interest are not legible in this copy of the figure.]
Fig. 2. Comparison of rat, bovine and human decorin deduced amino acid sequences. Points of interest are shown: v , potential glycosaminoglycan attachment site; I , potential N-linked oligosaccharide attachment site; •, cysteine residue; ~ , leucine-rich repeat; small box, putative fibronectin binding site.
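A percent identity of the kind quoted for Fig. 2 can be computed from an alignment as sketched below. This is one plausible reading of "identical, excluding a gap" and is an assumption, not the FASTA calculation used in the paper: columns in which either row carries a gap character are skipped, and identity is reported over the remaining columns. The function name `percent_identity` and the two short aligned strings are invented for illustration and are not decorin data.

```python
def percent_identity(seq_a, seq_b, gap_chars="-."):
    """Percent identical residues over aligned columns where neither row has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a, seq_b):
        if a in gap_chars or b in gap_chars:
            continue  # gap columns are excluded from the comparison
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Made-up aligned rows (hypothetical, for illustration only).
row_1 = "ACDE-FGH"
row_2 = "ACDQWF-H"

identity = percent_identity(row_1, row_2)
print(round(identity, 1))  # 83.3
```

Here six columns are compared after the two gap columns are dropped, and five of them match.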
some of its protein-protein interactions with collagens, fibronectin and TGF-β.
We wish to thank Ms. Carolyn Taplin for her technical support and Drs. Gregory Conner and Rudolf Werner for their expert cloning advice and assistance with the manuscript. DNA and protein sequence analyses were performed using IntelliGenetics Suite of Molecular Biology Analysis Programs and FASTA, respectively. This work was supported by NIH grant HD 06673.
References
1 Scott, J.E. (1991) Int. J. Biol. Macromol. 13, 157-161.
2 Bidanset, D.J., Guidry, C., Rosenberg, L.C., Choi, H.U., Timpl, R. and Hook, M. (1992) J. Biol. Chem. 267, 5250-5256.
3 Schmidt, G., Hausser, H. and Kresse, H. (1991) Biochem. J. 280, 411-414.
4 Yamaguchi, Y., Mann, D.M. and Ruoslahti, E. (1990) Nature 346, 281-284.
5 Kokenyesi, R. and Woessner, J.F. (1989) Biochem. J. 260, 413-419.
6 Kokenyesi, R. and Woessner, J.F. (1990) Biol. Reprod. 42, 87-97.
7 Krusius, T. and Ruoslahti, E. (1986) Proc. Natl. Acad. Sci. USA 83, 7683-7687.
8 Day, A.A., McQuillan, C.I., Termine, J.D. and Young, M.R. (1987) Biochem. J. 248, 801-805.
9 Sambrook, J., Fritsch, E.F. and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor.
10 Titus, D.E. (1991) Promega Protocols and Applications Guide, 2nd Edn., Madison.
11 Blochberger, T.C., Vergnes, J.-P., Hempel, J. and Hassell, J.R. (1992) J. Biol. Chem. 267, 347-352.
12 Ruoslahti, E. (1988) Annu. Rev. Cell Biol. 4, 229-255.
13 Rothberg, J.M., Jacobs, J.R., Goodman, C.S. and Artavanis-Tsakonas, S. (1990) Genes Dev. 4, 2169-2187.
14 Krantz, D.D., Zidovetzki, R., Kagan, B.L. and Zipursky, S.L. (1991) J. Biol. Chem. 266, 16801-16807.