Pancreas, Vol. 6, No. 2, pp. 157-161. © 1991 Raven Press, Ltd., New York

cDNA Sequence and Deduced Amino Acid Sequence of Human Preprocolipase

W. Renaud and J. C. Dagorn

U315 INSERM de Physiologie et Pathologie Digestives, Marseille, France

Summary: Complementary DNA clones for human pancreatic colipase were identified in human pancreatic cDNA libraries by hybridization with a pool of synthetic oligonucleotides containing all possible coding sequences for amino acids 75 to 80 of the partial human colipase protein sequence (Sternby et al., Biochim Biophys Acta 1984;784:75). Alignment of overlapping cDNA clones yielded an mRNA sequence of 504 nucleotides [not including the poly(A) tail] encoding a polypeptide of 112 amino acids. The prepeptide comprised 17 amino acids, with an amino-terminal cluster of charged residues followed by a hydrophobic core of 12 residues typical of leader sequences. The deduced human procolipase sequence comprised 95 residues, including a propeptide of 5 residues. It was in complete agreement with the partial sequence previously obtained by protein sequencing. Northern blot analysis revealed that the polyadenylated preprocolipase transcript had a length of approximately 680 nucleotides.

Key Words: Human pancreas-Colipase-Pancreatic mRNA-Molecular cloning.

Pancreatic lipase (E.C. 3.1.1.3) is responsible for the hydrolysis of dietary triacylglycerides in the duodenal lumen (1). The enzyme by itself, however, is inactive when its substrate is associated with bile salts and phospholipids, which is the case in vivo. Under these conditions, addition of colipase restores lipolytic activity (2,3). Colipase is a small protein cofactor (molecular weight of about 10,000) present in the pancreas of all vertebrates tested so far (4). It is secreted in a proform (5) that is hydrolyzed by trypsin to generate active colipase (Fig. 1). The primary sequences of procolipases from several species have been determined (6-8) and a significant degree of homology was found among them (9). Human procolipase has been sequenced, except for its carboxy-terminal end (8,9). This protein is of particular clinical interest since lipase is not completely saturated by colipase under physiological conditions; a deficiency in colipase therefore results in impaired lipolytic activity and fat malabsorption (10). Cases of colipase deficiency have been reported, sometimes with normal levels of lipase (11,12), suggesting alteration of preprocolipase gene expression. As a first step in the characterization of that genetic defect, we have cloned the human preprocolipase messenger RNA and established its nucleotide sequence. We also deduced from the nucleotide sequence the complete amino acid sequence of the preprotein.

MATERIALS AND METHODS

Human pancreatic cDNA libraries
Human pancreas was obtained from cadaver kidney transplant donors. Total RNA was extracted according to Chirgwin et al. (13) and the polyadenylated fraction was purified on an oligo(dT) cellulose column (14). A first cDNA library was constructed in pBR322 with cDNA synthesized with reverse transcriptase, by oligo(dT) priming of the first strand and oligo(dC) priming of the second strand after oligo(dG) tailing of the single-stranded cDNA (15). The library contained approximately 4.5 × 10^3 recombinants. A second library, containing 5 × 10^4 independent clones, was constructed in λgt10 as previously described (16). Biohazards associated with the experiments described in this publication have been examined previously by the French National Control Committee.

Manuscript received December 7, 1989; revised manuscript accepted February 13, 1990. Address correspondence and reprint requests to Dr. J. C. Dagorn, U315 INSERM de Physiologie et Pathologie Digestives, 46 Boulevard de la Gaye, F-13009 Marseille, France.

FIG. 1. Organization of human preprocolipase mRNA and sequencing strategy of the corresponding cDNA clones. In the coding region of the mRNA (box), the hatched region corresponds to the prepeptide, the black region to the propeptide, and the open region to the mature protein. The noncoding regions are indicated on the 5' and 3' sides. Positions of cDNA clones W32A, W5, and W97 relative to the mRNA sequence are given (grey bars). Arrows indicate the directions of sequencing. Positions of the synthetic oligonucleotides used in cloning (MC1) and in sequencing (WR2) are indicated by closed rectangles. Clone W32A is in pBR322. Clones W5 and W97 are in λgt10.

Library screening
The cDNA library in pBR322 was screened with a mixture of synthetic 17-mer oligonucleotides. Their sequence was derived from the amino acid sequence YYKCPC (amino acids 75 through 80), selected in the human colipase sequence (9) for minimum codon degeneracy. The oligonucleotide mixture (MC1) comprised the following sequences: 5'-CA(A/T/G/C)GG(A/G)CA(T/C)TT(A/G)TA(A/G)TA-3'. The oligonucleotide pool was end-labeled with [32P]ATP and polynucleotide kinase (17) to a specific activity of 2 × 10^9 cpm/µg and used in the colony hybridization procedure of Benton and Davis (18). The λgt10 library was screened with the insert of clone W32A, originally selected from the pBR322 library by oligonucleotide hybridization. The cDNA insert was labeled by random priming to a specific activity of 1 × 10^9 cpm/µg and used for screening by plaque hybridization (18).

DNA sequencing
Selected recombinant clones were sequenced by the method of either Maxam and Gilbert (17) or Sanger (19) after subcloning in the sequencing vectors M13 mp18/mp19 (20), using the universal primer or an appropriate synthetic oligonucleotide. Both strands of the cloned cDNAs were sequenced.

Northern blot analysis
Total pancreatic RNA was analyzed by electrophoresis on agarose-methylmercury gels (21), transferred to nitrocellulose, and hybridized to random primer-labeled inserts by the technique of Thomas (22).

RESULTS

Screening the pBR322 and λgt10 libraries of human pancreatic cDNA for preprocolipase clones
A first screening of about 3 × 10^3 clones from the pBR322 pancreatic cDNA library was performed using as a probe the synthetic oligonucleotide mixture (MC1) described in the "Methods" section. The longest (W32A) of several hybridizing clones (Fig. 1) contained 305 nucleotides of an mRNA coding for a fragment (amino acids 60 to 108) of colipase previously determined by protein sequencing (9). To obtain the 5' and 3' ends of the mRNA, we used the insert of clone W32A to screen 10^5 clones of a λgt10 human cDNA library. Two clones (W5 and W97) were selected and sequenced (Fig. 1). They covered 455 and 375 nucleotides, respectively. The overlapping cDNA sequences of the pBR322 and λgt10 preprocolipase clones totaled 519 nucleotides. Clones with cDNA inserts extending farther towards the 5' end of the message could not be found. The length of the mature (polyadenylated) preprocolipase mRNA was estimated to be 680 nucleotides by Northern blot analysis (Fig. 2).

FIG. 2. Northern blot analysis of preprocolipase mRNA. Total pancreatic RNA was transferred onto nitrocellulose after electrophoretic separation on an agarose-methylmercury gel and hybridized to 32P-labeled insert of clone W32A (see Fig. 1). The position of size markers, run in parallel and stained with ethidium bromide, is indicated on the left.
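The design of the MC1 probe described under "Library screening" can be retraced with a short script. The following is a minimal sketch, not the authors' procedure: it back-translates YYKCPC into IUPAC degenerate codons, drops the wobble base of the last codon to obtain a 17-mer, and takes the reverse complement to get the antisense pool. The helper names (`back_translate`, `reverse_complement`, `degeneracy`, `matches`) are hypothetical, and the sense-strand cDNA segment is the stretch of codons 75-80 as read from Fig. 3.

```python
# IUPAC degenerate codons for the residues of the probe peptide only.
DEGENERATE_CODON = {"Y": "TAY", "K": "AAR", "C": "TGY", "P": "CCN"}

# Bases allowed by each IUPAC symbol used here.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "N": "ACGT"}

# Complement of each symbol (R and Y swap, N stays N).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G",
              "R": "Y", "Y": "R", "N": "N"}


def back_translate(peptide):
    """Return the degenerate coding sequence for a peptide."""
    return "".join(DEGENERATE_CODON[aa] for aa in peptide)


def reverse_complement(seq):
    """Reverse-complement a sequence written in the symbols above."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def degeneracy(seq):
    """Number of distinct oligonucleotides in the degenerate pool."""
    count = 1
    for base in seq:
        count *= len(IUPAC[base])
    return count


def matches(degenerate, concrete):
    """True if a concrete sequence is one member of the degenerate pool."""
    return len(degenerate) == len(concrete) and all(
        c in IUPAC[d] for d, c in zip(degenerate, concrete))


# Sense strand for YYKCPC, truncated to 17 nt (last wobble base dropped).
coding = back_translate("YYKCPC")[:17]
# Antisense 17-mer pool, written 5' to 3'.
probe = reverse_complement(coding)

# Codons 75-80 as read from Fig. 3, truncated to the same 17 nt.
cdna_sense = "TACTACAAGTGTCCCTGT"[:17]

print(probe)                 # CANGGRCAYTTRTARTA
print(degeneracy(probe))     # 64
print(matches(probe, reverse_complement(cdna_sense)))
```

Written with N = A/T/G/C, R = A/G and Y = T/C, the pool corresponds to the 17-mer mixture given in the Methods, and the antisense strand of the cloned cDNA is one of its 64 members.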

Analysis of preprocolipase mRNA sequence
Of the 519 nucleotides obtained by alignment of the cDNAs, 15 corresponded to the beginning of the poly(A) tail at the 3' end of the message. A single open reading frame started at nucleotide 3 and ended with a stop codon at nucleotide 338 (Fig. 3). The 164-nucleotide 3' nontranslated region contained an altered consensus polyadenylation signal (AAUUAAA) 17 nucleotides from the 3' end of the message. Only two nucleotides upstream from the initiation codon were present in the longest clone studied.

Primary sequence of human preprocolipase
The cDNA sequence obtained by alignment of the clones encoded a preprotein of 112 amino acids (Fig. 3). The sequence starting from Ala in position 18 and ending with Gly in position 108 was identical to the sequence reported by Sternby et al. (8,9). Hence, the mRNA does indeed encode colipase and the prepeptide comprises 17 amino acids. On the carboxy-terminal end of the protein, the tetrapeptide Arg-Ser-Lys-Gln followed the Gly residue ending the partial sequence available. Computer analysis of the procolipase sequence gave a molecular weight of 10,105 Da.

DISCUSSION

Synthetic oligonucleotides derived from a partial amino acid sequence of human pancreatic colipase allowed us to select clones encoding most of the preprocolipase sequence from a human pancreatic cDNA library. A single open reading frame was found, corresponding to a polypeptide of 112 amino acids. The available nucleotide sequence 5' from the first methionine was too short to exclude the presence of an upstream in-frame methionine. The nucleotide sequence surrounding the first methionine (CCATGG), however, was in complete agreement with the consensus sequence of translation initiation sites in eukaryotes (23). In addition, we have determined the mRNA sequence of rat preprocolipase (W. Renaud and J. C. Dagorn, unpublished results), in which 56 nucleotides were sequenced upstream from the initiation codon. The deduced rat prepeptide was 17 amino acids in length, as predicted for the human sequence. It is therefore very likely that the mRNA sequence reported here encodes the complete human preprocolipase amino acid sequence.

cc ATG GAG AAG ATC CTG ATC CTC CTG CTT GTC GCC CTC TCT GTG GCC TAT       50
    M   E   K   I   L   I   L   L   L   V   A   L   S   V   A   Y        16
   GCA GCT CCT GGC CCC CGG GGG ATC ATT ATC AAC CTG GAG AAC GGT GAG CTC   101
    A   A   P   G   P   R   G   I   I   I   N   L   E   N   G   E   L    33
   TGC ATG AAT AGT GCC CAG TGT AAG AGC AAT TGC TGC CAG CAT TCA AGT GCG   152
    C   M   N   S   A   Q   C   K   S   N   C   C   Q   H   S   S   A    50
   CTG GGC CTG GCC CGC TGC ACA TCC ATG GCC AGC GAG AAC AGC GAG TGC TCT   203
    L   G   L   A   R   C   T   S   M   A   S   E   N   S   E   C   S    67
   GTC AAG ACG CTC TAT GGG ATT TAC TAC AAG TGT CCC TGT GAG CGT GGC CTG   254
    V   K   T   L   Y   G   I   Y   Y   K   C   P   C   E   R   G   L    84
   ACC TGT GAG GGA GAC AAG ACC ATC GTG GGC TCC ATC ACC AAC ACC AAC TTT   305
    T   C   E   G   D   K   T   I   V   G   S   I   T   N   T   N   F   101
   GGC ATC TGC CAT GAC GCT GGA CGC TCC AAG CAG TGA gactgcccacccactcccac   361
    G   I   C   H   D   A   G   R   S   K   Q                            112
   acctagcccagaatgctgtaggccactaggcgcaggggcatctctcccctgctccagcgcatctcccg   429
   ggctggccacctccttgaccagcatatctgttttctgattgcgctcttcacaattaaaggcctcctgc   497
   aaaccttaaaaa   509

FIG. 3. Nucleotide sequence of the preprocolipase cDNA and deduced sequence of the encoded preprotein. Upper case letters indicate the amino acid coding region of the mRNA. The 5' and 3' noncoding regions are in lower case. The putative polyadenylation signal (AATTAAA) is underlined. The open reading frame starts at nucleotide 3 and ends with a stop codon at nucleotide 338. The propeptide (residues 18-22) is underscored. Numbers on the right refer to nucleotides (top) and amino acids (bottom).

Analysis of the prepeptide sequence revealed several features typically found in eukaryotic signal peptides (24). The first three residues conferred on the N-terminal end a net charge of +1 (counting +1 for Met-NH3+, Arg, or Lys and -1 for Asp or Glu). In addition, the C-terminal residue of the charged region (Lys) was positively charged, as observed in most prepeptides. Then, starting with Ile in position 4, a stretch of nine hydrophobic residues with large side chains (except Ala in position 11) provided the prepeptide with a hydrophobic core. Finally, the last residue was Ala, which is one of the most common residues at sites of prepeptide cleavage (25).

The deduced sequence of procolipase (95 residues) comprised the partial sequence already reported (9), without modifications. The tetrapeptide Arg-Ser-Lys-Gln in positions 109-112 completed the carboxy-terminal portion. The overall similarity of complete human procolipase with pig procolipase remains 75% (9). Similarities with horse A and B procolipases amount to 73 and 76%, respectively (7).
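The figures quoted for the open reading frame and the prepeptide can be rechecked directly from the coding sequence of Fig. 3. The sketch below is an illustration under stated assumptions, not the computer analysis used by the authors: the ninth codon is illegible in the available copy and is assumed to be CTT (the only leucine codon ending in TT), the boundaries at residues 17 and 22 are taken from the text rather than predicted, the residue masses are standard average values, and the hydrophobic alphabet `AVLIMFWC` and all function names are hypothetical choices made for this example.

```python
# Coding region as read from Fig. 3 (ATG through the TGA stop codon).
# Assumption: the ninth codon is illegible in the source copy; CTT is used
# because it is the only leucine codon ending in TT.
CDS = "".join("""
ATG GAG AAG ATC CTG ATC CTC CTG CTT GTC GCC CTC TCT GTG GCC TAT
GCA GCT CCT GGC CCC CGG GGG ATC ATT ATC AAC CTG GAG AAC GGT GAG CTC
TGC ATG AAT AGT GCC CAG TGT AAG AGC AAT TGC TGC CAG CAT TCA AGT GCG
CTG GGC CTG GCC CGC TGC ACA TCC ATG GCC AGC GAG AAC AGC GAG TGC TCT
GTC AAG ACG CTC TAT GGG ATT TAC TAC AAG TGT CCC TGT GAG CGT GGC CTG
ACC TGT GAG GGA GAC AAG ACC ATC GTG GGC TCC ATC ACC AAC ACC AAC TTT
GGC ATC TGC CAT GAC GCT GGA CGC TCC AAG CAG TGA
""".split())

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# Average residue masses in Da; one water is added per polypeptide chain.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153


def translate(cds):
    """Translate a coding sequence and drop the terminal stop."""
    protein = "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))
    return protein.rstrip("*")


def molecular_weight(seq):
    """Average mass of the unmodified, reduced polypeptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def net_charge(segment):
    """+1 for the free N-terminus, Arg, Lys; -1 for Asp, Glu."""
    return 1 + sum(aa in "RK" for aa in segment) - sum(aa in "DE" for aa in segment)


def longest_hydrophobic_run(seq, hydrophobic="AVLIMFWC"):
    """Return (1-based start, length) of the longest hydrophobic stretch."""
    best_start, best_len, run_start, run_len = 0, 0, 0, 0
    for pos, aa in enumerate(seq, start=1):
        if aa in hydrophobic:
            if run_len == 0:
                run_start = pos
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return best_start, best_len


preprotein = translate(CDS)
prepeptide, procolipase = preprotein[:17], preprotein[17:]  # split per the text
propeptide = procolipase[:5]                                # residues 18-22

print(len(preprotein), len(prepeptide), len(procolipase))
print(propeptide, preprotein[108:])
print(round(molecular_weight(procolipase)))
print(net_charge(prepeptide[:3]), longest_hydrophobic_run(prepeptide))
```

Under these assumptions the translation gives 112 residues, split into a 17-residue prepeptide and a 95-residue procolipase. The calculated mass should fall close to the reported 10,105 Da, with small differences depending on the mass table used and on whether disulfide bonds are counted.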

Human preprocolipase mRNA is the first colipase transcript whose sequence has been elucidated. The corresponding cDNAs will now be used in studies on human preprocolipase gene structure. The high degree of homology among colipases suggests that human cDNAs could also be used in heterologous screenings to clone preprocolipase mRNAs from other species, which is presently the easiest way to determine their protein sequence.

Acknowledgment: The authors are grateful to Dr. R. J. MacDonald and Helen Aronovich for the design and synthesis of the degenerate colipase oligonucleotide probe and for helpful advice. The assistance of Dr. J. L. Berge Lefranc during synthesis of other oligonucleotides is also acknowledged. We would like to thank G. Michel for excellent technical help. W.R. was supported by a grant from the Ministère de la Recherche et de la Technologie.

REFERENCES

1. Borgstrom B. Luminal digestion of fat. In: Go VLW, Gardner JD, Brooks FP, Lebenthal E, DiMagno EP, Scheele GA, eds. The exocrine pancreas: biology, pathobiology and diseases. New York: Raven Press, 1986:361-74.
2. Morgan RGH, Barrowman J, Borgstrom B. The effect of sodium taurodeoxycholate and pH on the gel filtration behaviour of rat pancreatic protein and lipase. Biochim Biophys Acta 1969;175:65-75.
3. Maylié MF, Charles M, Gache C, Desnuelle P. Isolation and partial identification of pancreatic colipase. Biochim Biophys Acta 1971;229:286-9.
4. Sternby B, Larsson A, Borgstrom B. Evolutionary studies on pancreatic colipase. Biochim Biophys Acta 1983;750:340-5.
5. Borgstrom B, Wieloch T, Erlanson-Albertsson C. Evidence for a pancreatic pro-colipase and its activation by trypsin. FEBS Lett 1979;108:407-10.
6. Erlanson C, Fernlund P, Borgstrom B. Purification and characterization of two proteins with co-lipase activity from porcine pancreas. Biochim Biophys Acta 1973;310:437-45.
7. Julien R, Rathelot J, Canioni P, Sarda L, Gregoire J, Rochat H. Horse pancreatic colipase isolation by a detergent method and amino-terminal sequence of the polypeptide chain. Biochimie 1978;60:103-7.
8. Sternby B, Borgstrom B. One-step purification of procolipase from human pancreatic juice by immobilized antibodies against human colipase. Biochim Biophys Acta 1984;786:109-12.
9. Sternby B, Engstrom A, Hellman H, Vihert AM, Sternby NH, Borgstrom B. The primary sequence of human colipase. Biochim Biophys Acta 1984;784:75-80.
10. Gaskin KJ, Durie PR, Lee L, Hill R, Forstner GG. Colipase and lipase secretion in childhood-onset pancreatic insufficiency. Delineation of patients with steatorrhea secondary to relative colipase deficiency. Gastroenterology 1984;86:1-7.
11. Ghishan FK, Moran JR, Durie PR, Green HN. Isolated congenital lipase-colipase deficiency. Gastroenterology 1984;86:1580-2.
12. Hildebrand H, Borgstrom B, Bekassy A, Erlanson-Albertsson C, Helin A. Isolated colipase deficiency in two brothers. Gut 1982;23:243-6.
13. Chirgwin JM, Przybyla A, MacDonald RJ, Rutter WJ. Isolation of biologically active ribonucleic acid from sources enriched in ribonuclease. Biochemistry 1979;18:5294-9.
14. Aviv H, Leder P. Purification of biologically active globin messenger RNA by affinity chromatography on oligothymidylic acid-cellulose. Proc Natl Acad Sci USA 1972;69:1408-12.
15. Land H, Grez M, Hansen H, Lindermaier W, Schuetz G. 5'-Terminal sequences of eucaryotic mRNA can be cloned with high efficiency. Nucl Acid Res 1981;9:2251-66.
16. Giorgi D, Bernard JP, Rouquier S, Iovanna J, Sarles H, Dagorn JC. Secretory pancreatic stone protein messenger RNA. Nucleotide sequence and expression in chronic calcifying pancreatitis. J Clin Invest 1989;84:101-6.
17. Maxam AM, Gilbert W. Sequencing end-labeled DNA with base-specific chemical cleavage. Methods Enzymol 1980;65:499-560.
18. Benton WD, Davis RW. Screening recombinant clones by hybridization to single plaques in situ. Science 1977;196:180-2.
19. Sanger F, Nicklen S, Coulson AR. DNA sequencing with chain terminating inhibitors. Proc Natl Acad Sci USA 1977;74:5463-7.
20. Messing J, Crea R, Seeberg PH. A system for shotgun DNA sequencing. Nucl Acid Res 1981;9:309-32.
21. Bailey JM, Davidson N. Methylmercury as a reversible denaturing agent for agarose gel electrophoresis. Anal Biochem 1976;70:75-85.
22. Thomas PS. Hybridization of denatured RNA and small DNA fragments to nitrocellulose. Proc Natl Acad Sci USA 1980;77:5801-5.
23. Kozak M. The scanning model for translation: an update. J Cell Biol 1989;108:229-41.
24. Carne T, Scheele G. The role of presecretory proteins in the secretory process. In: Cantin M, ed. The secretory process. Basel: Karger Press, 1983:73-101.
25. von Heijne G. Pattern of amino-acids near signal sequence cleavage sites. Eur J Biochem 1983;133:17-21.
