Insect Biochemistry and Molecular Biology 60 (2015) 47–58


Insect Biochemistry and Molecular Biology journal homepage: www.elsevier.com/locate/ibmb

Cysteine cathepsins as digestive enzymes in the spider Nephilengys cruentata

Felipe J. Fuzita a, b, Martijn W.H. Pinkse c, Peter D.E.M. Verhaert c, Adriana R. Lopes a, b, *

a Laboratory of Biochemistry and Biophysics, Instituto Butantan, São Paulo, Brazil
b Biotechnology Program, University of São Paulo, São Paulo, Brazil
c Laboratory of Analytical Biotechnology & Innovative Peptide Biology, Delft University of Technology, Delft, The Netherlands

Article info

Article history: Received 3 November 2014; received in revised form 16 March 2015; accepted 17 March 2015; available online 25 March 2015

Abstract

Cysteine cathepsins are widespread in living organisms, where they are associated with protein degradation in lysosomes, but in some groups of Arthropoda (Heteroptera, Coleoptera, Crustacea and Acari) these enzymes take part in the digestion of meal proteins. Although spiders combine extra-oral with intracellular digestion, the sporadic studies on this subject were mainly concerned with the analysis of the digestive fluid (DF). Thus, a more complete picture of the digestive process in spiders is still lacking in the literature. In this paper we describe the identification and characterization of cysteine cathepsins in the midgut diverticula (MD) and DF of the spider Nephilengys cruentata using enzymological assays. Furthermore, qualitative and quantitative data from transcriptomic and subsequent proteomic experiments were used together with biochemical assays to interpret the results. Five cathepsins L, one cathepsin F and one cathepsin B were identified by mass spectrometry, with cathepsins L1 (NcCTSL1) and L2 (NcCTSL2) as the most abundant enzymes. The native cysteine cathepsins presented acidic characteristics such as a pH optimum of 5.5, stability in the acidic pH range and zymogen conversion to the mature form after in vitro acidification. NcCTSL1 seems to be a lysosomal enzyme; its recombinant form displays the same acidic characteristics as the native enzymes and is inhibited by pepstatin. Evolutionarily, arachnid cathepsin L may have acquired different roles, but its use for digestion is a feature common to the taxa studied. A clearer picture of the digestive process in spiders can now be drawn, with trypsins and astacins acting extra-orally under alkaline conditions, whereas cysteine cathepsins act in an acidic environment, likely in the digestive vacuoles or lysosome-like vesicles. © 2015 Elsevier Ltd. All rights reserved.

Keywords: Spider; Acidic digestion; Cysteine cathepsin; Peptidase; Midgut diverticula

1. Introduction

Cysteine peptidases from the C1 family can be found in virtually all living beings, and it is likely that a peptidase from this family was present in the ancestor of all organisms (Rawlings and Salvesen, 2013). Cysteine cathepsins belong to the C1A family (clan CA) and are commonly associated with protein degradation in lysosomes. In arthropods, cathepsins L and B can play important roles in the digestive process. There are examples of digestive cathepsins L and B in all main Arthropoda taxa, such as Crustacea (Hu and Leung, 2007; Stephens et al., 2012), Hexapoda (Terra and Ferreira, 2012) and Arachnida (Franta et al., 2010).

* Corresponding author. Laboratory of Biochemistry and Biophysics, Instituto Butantan, Av. Vital Brazil 1500, 05503-000, São Paulo, Brazil. E-mail address: [email protected] (A.R. Lopes). http://dx.doi.org/10.1016/j.ibmb.2015.03.005 0965-1748/© 2015 Elsevier Ltd. All rights reserved.

The presence of these enzymes in arachnids has so far been shown by activity assays in ticks (Franta et al., 2010), mites (Carrillo et al., 2011) and scorpions (Fuzita et al., in press). Cathepsins L and B are important peptidases in the hemoglobinolytic pathway of Ixodes ricinus (Franta et al., 2010, 2011), and the former was also located in digestive vesicles of Rhipicephalus (Boophilus) microplus (Renard et al., 2002). In contrast to ticks, the digestive peptidases of spiders remain underexplored. The peptidase activities observed in the digestive fluid (DF) of Tegenaria atrica were classified as carboxypeptidase A, aryl aminopeptidase, glycylglycine dipeptidase, chymotrypsin and trypsin (Mommsen, 1978). Kavanagh and Tillinghast (1983) and Foradori et al. (2001) provided strong biochemical evidence for the presence of metallopeptidases in the DF of Argiope aurantia. This was confirmed by the amino-terminal sequences of two astacin-like metallopeptidases obtained from the same kind of sample (Foradori et al., 2006). The studies cited above were


mostly concerned with the characterization of the DF. The only study using the midgut as an enzyme source identified a collagenase-like activity in thirteen spider species from Australia (Atkinson and Wright, 1992). Recently, a genome followed by a proteome of the spider Stegodyphus mimosarum was published, with cathepsins L and B identified in the "whole body" of the animal, but the authors offered no further interpretation of this result (Sanggaard et al., 2014). Spiders perform extra-oral digestion (EOD) (Cohen, 1995) in combination with an intracellular phase of digestion (Ludwig and Alberti, 1988). Thus, it is also important to characterize the digestive process in the midgut. The predigested food is absorbed by the digestive cells in the midgut and midgut diverticula (MD) through pinocytosis, and the final digestion takes place inside the digestive vacuoles (Ludwig and Alberti, 1988). None of the studies cited above performed enzymatic assays combining the substrates, assay conditions and specific inhibitors needed to investigate the presence of cysteine cathepsins. Hence, in the present work, enzymatic assays were done under conditions appropriate for detecting cysteine cathepsin activity, using as enzyme sources not only the DF but also the midgut with its digestive diverticula. For the first time, cathepsins L, F and B were identified as digestive enzymes in the DF and MD of a spider. Moreover, a recombinant cathepsin L was heterologously expressed and characterized.

2. Materials and methods

2.1. Animals and sample obtaining

Adult Nephilengys cruentata females were collected at Instituto Butantan or at the University of São Paulo and kept in artificial terraria under natural photoregime and room temperature conditions, with water spraying 4 times per week. The animals were starved for at least one week and then fed with Gryllus sp.
as follows, in order to bring all spiders to the same physiological condition of the digestive process. After that, food was provided first at an interval of 11 days, and this interval was then decreased to 9, 7, 5, 3 and finally 1 day. Subsequently, the spiders were starved for 2 weeks and dissected, or fed again for 9 h prior to dissection. The animals were anesthetized in a CO2 chamber and dissected in cold isotonic saline solution (300 mM KCl, pH 7), and the opisthosomal midgut with its diverticula (MD) (Fig. 1) was isolated. The MD is intimately associated with an intermediate tissue that cannot be separated from the digestive epithelium (Ludwig and Alberti, 1988). Thus, the use of the term MD is a simplification of the tissue complexity, and these aspects are addressed in the discussion section. MDs were stored at −20 °C until use. Samples were homogenized in a Potter-Elvehjem homogenizer in cold deionized water to a final volume of 1 ml per MD. DF samples were collected by electrical or mechanical stimulation from animals fasted for two weeks or after 1, 3, 9, 25 and 48 h of feeding. Samples prepared for purification attempts contained 1 mM methyl methanethiosulfonate (MMTS), a reversible cysteine peptidase inhibitor (Smith et al., 1975).

2.2. Protein determination and enzymatic assays

The protein concentration was determined according to the method of Smith et al. (1985) using egg albumin as standard. The 4-methylcoumarin-7-amide (MCA) fluorescent substrates were purchased from Sigma. Stock solutions (1 mM) of Z-FR-MCA and Z-RR-MCA were prepared in dimethyl sulfoxide (DMSO) and diluted to a final concentration of 10 µM in the appropriate 0.1 M assay buffer, as listed below. All buffers contained 3 mM cysteine and

3 mM EDTA to avoid oxidation of the catalytic cysteine of the enzymes (Beynon and Bond, 2001; Rawlings and Salvesen, 2013). All assays were performed such that the measured activity was proportional to the protein concentration and the incubation time. No-enzyme and no-substrate controls were included. Activation of cysteine peptidase samples (crude and recombinant) was carried out in 0.1 M citrate-phosphate buffer containing 3 mM cysteine and 3 mM EDTA, over a range of pH values from 2.6 to 7.0, for 60 min or 10 min at 30 °C. Thereafter, samples were diluted 5 times in deionized water and activity was measured using a 1:100 ratio of activated enzyme to 10 µM Z-FR-MCA diluted in 0.1 M citrate-phosphate buffer (pH 5.0); this ratio guaranteed that the pH of the assay was always the same. Two controls were done: a) the sample was diluted 50 times in deionized water and incubated at 30 °C for 1 h, or b) the dilution was done immediately prior to the activity assay. The pH giving the highest hydrolysis rate was chosen for the time-dependence activation test. After incubation, enzyme activity was tested as described above. For standard activation, the crude homogenate samples were incubated at 30 °C in citrate-phosphate buffer pH 2.6 for 120 min, whereas recombinant samples were incubated for 60 min at pH 3 at the same temperature. Activity measurements were also used to evaluate the stability of the cysteine peptidase samples (crude and recombinant) over a pH range from 5.0 to 10.0 at 30 °C for 3 h. In all stability experiments, the buffer concentrations and enzyme dilution were adjusted to guarantee no pH changes in the activation, incubation and assay steps. Buffers used for incubation: citrate-phosphate, pH 5.0–7.0; Tris–HCl, pH 8.0–9.0; Gly-NaOH, pH 10.
In addition, the effect of substrate concentration on the activity of the partially purified cysteine peptidases or the activated recombinant NcCTSL1 was studied using at least 15 different Z-FR-MCA concentrations ranging from 1 to 150 µM. The Km and Vmax values (mean ± SD) were determined from a weighted linear regression using EnzFitter software (Biosoft). A similar experiment with recombinant NcCTSL1 was performed in the presence of 4 pepstatin concentrations (0.5, 1, 5 and 10 µM). The enzyme was incubated with pepstatin for 5 or 25 min at 30 °C prior to substrate addition.

2.3. Partial isolation of cysteine peptidases and inhibition assays

MD homogenate samples from Nephilengys cruentata were fractionated in 1.7 M ammonium sulfate for at least 16 h at 4 °C. The samples were centrifuged for 20 min at 16,100 × g at 4 °C. The supernatant was applied to a hydrophobic interaction column (HiTrap Butyl FF, GE) coupled to an ÄKTA-FPLC system (GE) that had been equilibrated in 50 mM phosphate buffer (pH 6) containing 1.7 M ammonium sulfate. Elution was performed using a 25 mL gradient of 1.7–0 M ammonium sulfate in 50 mM phosphate buffer (pH 6). Fractions of 1 mL were collected. The fractions were assayed in the presence and absence of the cathepsin B inhibitor CA-074, using the substrates Z-FR-MCA or Z-RR-MCA. The fractions that exhibited activity with Z-FR-MCA as a substrate were pooled, desalted (HiTrap desalting column, GE) and concentrated using a Vivaspin 6 membrane (GE). The samples were then applied to a cation-exchange column (Resource S, GE) that had been equilibrated in 50 mM sodium acetate buffer (pH 5.0). The protein was eluted using a 40 mL gradient of 0–0.6 M NaCl in the equilibrating buffer, and fractions of 0.5 mL were collected and assayed using Z-FR-MCA. Alternatively, a simpler partial purification was used for the inhibition assays. The homogenized samples were diluted in 50 mM citrate-phosphate buffer (pH 5.0).
This diluted sample was applied to a cation-exchange column (HiTrap S, GE) that had been


Fig. 1. General morphology of the digestive system of a spider and its location. A) Diagrammatic lateral view of the digestive system of a spider showing its components and location (modified from Foelix, 2010). B) N. cruentata opisthosomal MD. On the right, the ramifications of the diverticula can be seen after decompression.

equilibrated in the same buffer. The protein was eluted with a 25 mL gradient of 0–1.0 M NaCl in the equilibrating buffer. The fractions were assayed using 10 µM Z-FR-MCA in the presence and absence of the following peptidase inhibitors: 10 µM E-64 (cysteine peptidase), 10 µM pepstatin (aspartic peptidase) and 1 mM PMSF (serine peptidase). Since the chelating agent EDTA was used in the assay buffer and did not affect activity over Z-FR-MCA, other metallopeptidase inhibitors were not tested. In all inhibition assays of chromatographic fractions, a pre-incubation without substrate was done for 30 min at 30 °C, after which the substrates were added to a final concentration of 10 µM.

2.4. Sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE)

Samples of the cell lysates used for NcCTSL1 expression or from the purification attempt were diluted in a sample buffer containing

60 mM Tris–HCl buffer (pH 6.8), 2.5% SDS, 0.36 mM β-mercaptoethanol, 10% (v/v) glycerol and 0.005% (w/v) bromophenol blue. The samples were heated for 5 min at 95 °C in a water bath and then loaded onto a 12% (w/v) polyacrylamide gel slab containing 0.1% SDS (Laemmli, 1970). The gels were run at a constant voltage of 200 V at room temperature and then stained with Coomassie Blue (Colloidal Blue Staining Kit, Invitrogen).

2.5. Mass spectrometry procedures

The MD homogenates were submitted to three freeze-thaw cycles and then centrifuged for 20 min at 1000 × g. The supernatant (50 µg) was collected and separated by SDS-PAGE on a 10-well NuPAGE® Novex 4–12% Bis-Tris gel (Invitrogen, Bleiswijk, NL) for 30 min at a constant voltage of 200 V using MES-SDS as running buffer. Each gel lane was sliced into 32 equal pieces, which were separately treated and analyzed by LC-MS/MS in an LTQ-Orbitrap Velos


(Thermo Fisher) as previously described (Liebensteiner et al., 2013). The same approach was used for the analysis of chromatographic activity pools, but the gels were cut into 12 slices. In the case of DF samples, the total SDS-PAGE separation time was 5 min, the gel was cut into nine pieces and the liquid chromatography gradient from 0 to 40% was run over 220 min. In all cases, the peptides were extracted from the gel with 100% (v/v) acetonitrile, vacuum dried and resuspended in deionized water containing 5% (v/v) DMSO and 5% (v/v) acetonitrile. All raw data files were processed into peak lists using the software ReAdW 4.3.1 and then deconvoluted using the program MS-Deconv (Liu et al., 2010). The files generated by MS-Deconv were analyzed by MASCOT (Matrix Science). Subsequently, the MASCOT searches of all runs were loaded together into the software Scaffold 4 and statistically analyzed with X!Tandem (Muth et al., 2010). Positive protein identification required at least 2 sequenced peptides and a false discovery rate (FDR) of 0.1%. Label-free quantitative analysis was based on the normalized spectral abundance factor (NSAF) (Zybailov et al., 2006), calculated for each experiment individually using the software Scaffold 4 (Searle, 2010).
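To make the quantification step more concrete, the following is a minimal sketch of the NSAF as it is commonly defined: each protein's spectral count is divided by its length, and the result is normalized by the sum of these ratios over all proteins in the experiment. This is not the Scaffold 4 implementation, which may count spectra differently; the function name `nsaf`, the protein names, the spectral counts and the lengths are all hypothetical and serve only as an illustration.

```python
from typing import Dict, Tuple


def nsaf(proteins: Dict[str, Tuple[int, int]], as_percent: bool = False) -> Dict[str, float]:
    """Return the NSAF of each protein.

    `proteins` maps a protein name to (spectral_count, length_in_residues).
    NSAF_k = (SpC_k / L_k) / sum_i (SpC_i / L_i)
    """
    # Spectral abundance factor: spectral count corrected for protein length.
    saf = {name: count / length for name, (count, length) in proteins.items()}
    total = sum(saf.values())
    if total == 0:
        raise ValueError("at least one protein needs a non-zero spectral count")
    scale = 100.0 if as_percent else 1.0
    return {name: scale * value / total for name, value in saf.items()}


# Hypothetical values, for illustration only (not data from this study).
example = {
    "protein_A": (40, 400),  # 40 spectra, 400 residues
    "protein_B": (10, 200),
    "protein_C": (5, 100),
}
result = nsaf(example)
for name, value in result.items():
    print(name, round(value, 3))
```

Because the values are normalized within one experiment, they sum to 1 (or to 100 when `as_percent=True`), which is why NSAF values are compared between conditions rather than read as absolute amounts.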

2.6. Molecular cloning of digestive cathepsin L from MD

All nucleic acid manipulations were performed as previously described (Sambrook and Russell, 2001) or according to the manufacturer's protocol, unless otherwise specified. RNA was extracted from the MD using the TRIzol® reagent (Invitrogen). Complementary DNA (cDNA) was obtained using the SuperScript III First-Strand Synthesis System for RT-PCR® (Invitrogen). The RACE technique (rapid amplification of cDNA ends; Zhang and Frohman, 2000) was applied as follows. In the first step, to obtain the 3′ region, the cDNA was submitted to a polymerase chain reaction (PCR) using a degenerate primer A (forward) and primer B (reverse). Primer A, with the sequence 5′-GAMAVTGYGGWTCBTGYTGG-3′ (where M = A or C; Y = C or T; W = A or T; V = A, C or G; and B = C, G or T), was designed based on the GQCGSCW sequence that is present in the reactive site of cathepsin L-like cysteine peptidases from different organisms. Primer B is a previously described hybrid primer (QT) (Zhang and Frohman, 2000). The PCR product was sequenced as described below, and gene-specific reverse primers C (for NcCTSL1) and D (for NcCTSL2) were designed (primer C = 5′-GGTCGTTTAAACCAGAGGGTA-3′; primer D = 5′-CACATTTTAAACTAATGGGTA-3′). In a second step, to obtain the 5′ region and also the complete mRNA sequence, the cDNA was adenine-tailed using a terminal deoxynucleotidyl transferase (Fermentas). A new PCR was done, now using primer B as the forward primer and primers C or D, separately, as reverse primers. For sequencing of the cloned DNA, the PCR products were loaded onto a 1% agarose gel. The band was extracted using a GeneJET Gel Extraction Kit® (Fermentas) and inserted into the pGEM-T Easy vector® (Promega). Thermally competent XL1-Blue cells were transformed and selected by ampicillin resistance. Single colonies were selected and grown in LB Amp (Luria–Bertani broth containing ampicillin) at 37 °C overnight, and the plasmids were isolated (GeneJET Plasmid Miniprep Kit®, Fermentas) and sequenced using a BigDye Terminator Mix® (Applied Biosystems) in an ABI PRISM 3100 Genetic Analyzer (Applied Biosystems). The sequences obtained were submitted to a Basic Local Alignment Search Tool (BLAST) analysis using default parameters (available at www.ncbi.nih.gov/blast). Then, a multiple sequence alignment was constructed using ClustalW software (Thompson et al., 1994). Other sequences used for the mass spectrometry identification were obtained in a transcriptomic study that will be published elsewhere.

2.7. Construction of the expression vectors

After removal of the signal peptide (Petersen et al., 2011), the two cathepsin L-like cysteine peptidase clones obtained as described above (NcCTSL1 and NcCTSL2) were separately inserted into a pAE expression vector (Ramos et al., 2004). New primers were designed: the NcCTSL1 forward primer E contained a restriction site for the enzyme KpnI, whereas the reverse primer F contained a restriction site for HindIII. The NcCTSL2 forward primer G and reverse primer H contained restriction sites for the enzymes BamHI and BstBI, respectively. After PCR amplification of the two sequences, the products were purified by loading the samples onto a 1% agarose gel and extracting the DNA from the single band observed for each sequence. The PCR products (overnight digestion) and the pAE plasmid (3 h digestion) were double digested with the respective restriction enzymes at 37 °C. The plasmids pAE-NcCTSL1 and pAE-NcCTSL2 were obtained by inserting the digested PCR products into their respective linearized plasmids using T4 DNA ligase. All enzymes used were from Fermentas. Primers: E = 5′-GGACAACACAAACTAAAGACC-3′; F = 5′-GGTCGTTTAAACCAGAGGGTA-3′; G = 5′-GAAGGTCAGCACGCAAAGAAG-3′; H = 5′-GATTTGAACATTGGAAAGAGG-3′.
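Primer A in Section 2.6 is degenerate, so it stands for a pool of distinct oligonucleotides rather than a single sequence. The sketch below enumerates that pool from the degeneracy codes given in the text (M, Y, W, V and B). The function name `expand_degenerate` is hypothetical, and the code only illustrates the combinatorics; it says nothing about which variants actually anneal in the PCR.

```python
from itertools import product
from typing import List

# Degeneracy codes as defined in Section 2.6, plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC",   # A or C
    "Y": "CT",   # C or T
    "W": "AT",   # A or T
    "V": "ACG",  # A, C or G
    "B": "CGT",  # C, G or T
}


def expand_degenerate(primer: str) -> List[str]:
    """Return every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


PRIMER_A = "GAMAVTGYGGWTCBTGYTGG"
variants = expand_degenerate(PRIMER_A)
print(len(variants))  # 2 * 3 * 2 * 2 * 3 * 2 = 144 distinct sequences
```

The degeneracy of 144 comes from the six ambiguous positions (M, V, Y, W, B, Y) and indicates how diluted each individual sequence is within the primer mix.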

2.8. Heterologous expression and purification of NcCTSL1 and NcCTSL2

Escherichia coli competent cells (strain BL21 Star (DE3)pLysS) were separately transformed with the constructs pAE-NcCTSL1 and pAE-NcCTSL2, and with pAE alone as a control. The transformed cells were grown overnight at 37 °C in LB medium containing 100 µg/ml ampicillin and 34 µg/ml chloramphenicol (LBac). The cultures were then diluted 20 times in LBac to a final volume of 50 ml with 1% glucose and grown until the optical density at 600 nm reached 0.6. Thereafter, the cells were collected by centrifugation at 5000 × g and the medium was replaced by LBac without glucose. To induce expression, isopropyl-β-D-thiogalactoside (IPTG) was added to a final concentration of 1 mM and the culture was incubated for 15 h at 20 °C. The cells were centrifuged at 5000 × g for 30 min at 4 °C and the pellet was suspended in 1 ml of lysis buffer [10 mM Tris–HCl, pH 8.0; 100 mM NaCl; 20 mM imidazole; 1 mM phenylmethylsulfonyl fluoride (PMSF) and 10% (v/v) glycerol]. Cells were lysed by sonication (100 W) in an R1 cell disruptor (Unique) for 5 cycles of 1 min each, with 3 min of chilling in between. The lysate was centrifuged at 16,100 × g for 30 min at 4 °C and the supernatant (soluble fraction) was applied onto a Ni-NTA agarose column (Qiagen) previously equilibrated in the lysis buffer. The procedure followed the manufacturer's instructions, except that the buffers contained 10 mM Tris instead of 50 mM monobasic phosphate. After the washing steps, the recombinant proteins were eluted using elution buffer with 150 mM imidazole.

2.9. Phylogenetic analyses

Multiple sequence alignments of the nucleotide sequences were performed with the ClustalW algorithm (Thompson et al., 1994) with default parameters in the MEGA v6.0 interface (Tamura et al., 2013), using codons as anchors for the alignment. A maximum likelihood tree was constructed in MEGA v6.0.
The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained by applying the Neighbor-Joining method to a matrix of pairwise distances estimated using a JTT (Jones–Taylor–Thornton) model. A discrete Gamma distribution was used to model evolutionary rate differences among sites (5 categories; +G, parameter = 1.0346). Concatenated Bayesian analysis was done in BEAST v1.8.0


(Drummond et al., 2012), with data partitioning by codon position and assuming a lognormal distribution of evolutionary rates across branches in the topology (that is, a relaxed rather than a strict molecular clock), with its mean fixed to 1.0 and the standard deviation following an exponential (0.33) prior (program default). For each run, posterior probabilities of clades were obtained after discarding the burn-in, which was assessed by graphical analysis in Tracer v1.6 (Drummond et al., 2012).

3. Results

3.1. Identification of the cysteine cathepsins by mass spectrometry and molecular biology techniques

The MD post-nuclear supernatant and the DF were submitted to proteomic analyses, and the isolated MD RNA to transcriptomic experiments, as described in the methodology. Table 1 shows the physiological conditions under which cysteine cathepsins were identified by transcriptomics and quantitative proteomics. Cathepsins L1 and L2 (NcCTSL1 and NcCTSL2, respectively) are the two most abundant cysteine cathepsins in fasting and fed animals. They were identified in the MD at both the proteomic and transcriptomic levels, in fasting and fed spiders. The NSAF of the ten most abundant proteins ranges from 6.4 to 1.4, and NcCTSL1 and NcCTSL2 rank 66th and 96th, respectively, in a list of 1284 identified proteins used for quantification (data not shown). NcCTSL2 was also identified in the DF under fasting conditions and after 1, 9, 25, 30 and 48 h of feeding. The mRNAs of cathepsins L3 and L4 were observed in the MD of fasting and fed animals, and the protein was observed in both conditions (NcCTSL3) or only in fasting animals (NcCTSL4). The presence of cathepsin L8 and cathepsin B as proteins was confirmed by LC-MS/MS experiments, and their mRNA could be detected only in fed animals. Cathepsin B was also identified in the DF, but only after 25 h of feeding and in fasting spiders. The identified proteins in Table 1 had at least two peptides sequenced by mass spectrometry.
By reducing the stringency to one peptide, cathepsin L10 is also found in fasting and fed animals, whereas cathepsin L6 is found only in the starved ones (data not shown). Although NcCTSL2 and cathepsin B1 were identified in many different samples from the DF, in all these cases both enzymes


presented only small NSAF values (Table 1). For comparison, the NSAF of the ten most abundant proteins in the DF of a fasting spider ranges from 0.98 to 0.25, summing to 40% of all proteins included in the quantification analysis (data not shown). Moreover, no activity over either Z-FR-MCA or Z-RR-MCA could be detected in the DF. Two main hypotheses could explain these observations: 1) the enzymes are indeed secreted and are needed in the DF only in very small amounts, or 2) their presence in the DF is a consequence of the release of cell contents together with excreta after complete processing of the digestive vacuoles. All cathepsin L sequences listed in Table 1 possess the four residues of the active site (Gln 152, Cys 158, His 292 and Asn 308, papain numbering), the signal peptide and the propeptide. The cathepsin B1 sequence is complete but lacks the histidine residue of the active site. Since this part of the sequence was truncated, this is more likely an artifact of the contig assembly than a real loss of the catalytic histidine. Moreover, no peptide from this problematic region was identified by mass spectrometry (data not shown); such a peptide could have shown whether the loss is real. The cathepsin F sequence contains all catalytic residues but is incomplete at the N-terminal region, lacking the cystatin domain. Six more cathepsins L were identified in the transcriptome, plus one sequence (cathepsin L9) that differs from cathepsin L1 by only 8 nucleotides and 4 amino acids. We kept this sequence in the database for proteomic identification and included it in the phylogenetic analyses discussed further on, but this protein was not considered a different cathepsin L. Another cathepsin B and one cathepsin O were identified only in the transcriptomic experiments.

3.2. Characterization of the cysteine peptidases present in the MD of the spider N. cruentata

It is known from the literature that cysteine peptidases are synthesized as inactive zymogens that can be activated by incubation at acidic pH (Collins et al., 2004; Rozman et al., 1999; Turk et al., 1999). The hydrolysis rates in activated crude MD extracts from fed or fasting animals were higher over Z-FR-MCA than over Z-RR-MCA, with a Z-FR-MCA/Z-RR-MCA ratio of 16 and 43 for fed and fasting animals, respectively. Without activation the fed animals'

Table 1. Cysteine cathepsins identified in the MD (transcripts and proteins) and DF (proteins) from the spider Nephilengys cruentata under different physiological conditions.

MD (n = 3)
Enzyme | Protein, 9 h (NSAF)a | Protein, fasting (NSAF)a | mRNA, 9 h | mRNA, fasting | Signal peptide cleavage site
Cathepsin L1 | 0.5 ± 0.04 | 0.4 ± 0.3 | Yes | Yes | 21–22
Cathepsin L2 | 0.5 ± 0.5 | 0.3 ± 0.2 | Yes | Yes | 15–16
Cathepsin L3 | 0; 0; 0.03b | 0; 0; 0.02b | Yes | Yes | 16–17
Cathepsin L4 | N.I | 0.006 ± 0.003 | Yes | Yes | 19–20
Cathepsin L8 | 0.04; 0.5; 0.02b | 0.04 ± 0.03 | Yes | N.I | 14–15
Cathepsin B1 | 0.2 ± 0.1 | 0.14 ± 0.09 | Yes | N.I | 16–17
Cathepsin F | 0.04 ± 0.02 | 0.04 ± 0.03 | Yes | N.I | N.I

DF (NSAF)a
Enzyme | Fasting (n = 3) | 1 h (n = 1) | 3 h (n = 3) | 9 h (n = 4) | 25 h (n = 3) | 30 h (n = 1) | 48 h (n = 2)
Cathepsin L2 | 0.006 | 0.005 | 0 | 0.007 ± 0.02 | 0.002; 0.016; 0.003b | 0.005 | 0.008; 0b
Cathepsin B1 | 0.002; 0.02; 0.007b | 0 | 0.008 | 0 | 0.057 ± 0.003 | 0.013 | 0.003; 0.02b
Cathepsin F | 0.03 ± 0.01 | 0.004 | 0.006 ± 0.004 | 0.005 ± 0.002 | 0.006 ± 0.003 | 0.004 | 0.007 ± 0.002

The number of hours corresponds to the period after feeding started. N.I: not identified. n: number of distinct biological replicates. When n > 1, the values are the mean ± SD of the NSAF of different biological samples. Only cysteine cathepsins identified with at least two peptides are shown.
a NSAF: normalized spectral abundance factor.
b Due to large variation, the NSAF is shown for each individual experiment.


samples showed a specific activity over Z-FR-MCA of 3 ± 0.5 mU/mg, whereas for fasting spiders the activity observed was 0.04 ± 0.004 mU/mg, a 75-fold increase after 9 h of feeding. Z-RR-MCA hydrolysis was not observed in non-activated samples (Table 2). In order to confirm that Z-FR-MCA cleavage was due to the action of cysteine peptidases, the MD samples were submitted to cation-exchange chromatography and assayed in the presence of different peptidase inhibitors (Fig. 2). The cysteine peptidase inhibitor E-64 was the only one to cause 100% inhibition. PMSF, a serine peptidase inhibitor, did not affect the activity, and since EDTA is used in the assay medium, metallopeptidase activity over Z-FR-MCA can also be ruled out. Curiously, pepstatin, an aspartic peptidase inhibitor, caused 47% inhibition. Judging from the subsite specificity of human cathepsin D, Z-FR-MCA is not a suitable substrate for aspartic peptidases (Pimenta et al., 2001), and this inhibition was also observed and confirmed for the recombinant NcCTSL1, which leads us to believe that it is non-specific. In order to differentiate cathepsin B from cathepsin L activity, hydrophobic interaction chromatography (HIC) was performed after ammonium sulfate fractionation (Fig. 3). Activity is about 100 times higher with the substrate Z-FR-MCA than with Z-RR-MCA (Fig. 3A). The cathepsin B inhibitor CA-074 was tested in the active fractions using both substrates in separate assays. Fig. 3B shows that Z-RR-MCA hydrolysis is 88% inhibited in the main activity peak and only 52% in the other peaks. When Z-FR-MCA is used, both peaks are 59% inhibited (Fig. 3C). For further investigation, fractions 17–20 from the first peak (P1) and 26–28 from the last one (P2) were submitted to LC-MS/MS experiments. In P1, cathepsins L1, L2 and B1 were identified with exclusive spectrum counts of 11, 9 and 5, respectively (Table S1).
In P2, cathepsins L1 and L2 were detected with exclusive spectrum counts of 3 and 25, respectively (Table S1). The activity peaks P1 and P2 were separated and applied onto a cation-exchange column. However, it was not possible to purify the cysteine cathepsins from N. cruentata (data not shown). As stated before, zymogens of cysteine peptidases are activated when incubated at acidic pH. MD samples from fasting or fed

Table 2. Specific activities of cysteine cathepsins in the MD homogenate of the spider Nephilengys cruentata.

Specific activity (mU/mg)
Condition | Substrate | Non-activated | Activated | Activation ratio
Fed | Z-FR-MCA | 3 ± 0.5 | 57 ± 8.6 | 19
Fed | Z-RR-MCA | a | 3.5 ± 0.9 | -
Fed | Z-FR-MCA/Z-RR-MCA | a | 16.2 | -
Fasting | Z-FR-MCA | 0.04 ± 0.004 | 47 ± 3 | 1175
Fasting | Z-RR-MCA | a | 1.1 ± 0.1 | -
Fasting | Z-FR-MCA/Z-RR-MCA | a | 42.7 | -

Z-FR-MCA: fed/fasting = 75; TMARb = 1425

Hydrolysis rates of non-activated and activated samples were measured with 10 µM of the above substrates diluted in 0.1 M citrate-phosphate buffer, pH 5. Activated samples were diluted 10 times in citrate-phosphate buffer, pH 2.6, and incubated for 2 h at 30 °C. After a 5-fold dilution in deionized water, the activity was measured. All buffers contained 3 mM cysteine and 3 mM EDTA. The values are the mean and SD of at least 3 different biological samples.
a Activity was not observed.
b TMAR (theoretical maximum activation ratio) was calculated as the ratio between the activity of activated samples from fed animals and that of non-activated samples from fasting spiders.

animals were first activated in a pH range from 2.6 to 7 as described in the methodology. Higher activities were obtained after incubation at pHs 2.6 and 3.0 (Fig. 4A) showing that these pHs are more suitable to a time-course activation experiment resulting in an increase of activity of 19 times (Table 2). After choosing the pH 2.6 for activation, samples were submitted to a kinetic experiment which evidenced that maximum activity in vitro is reached after 2 h incubation at pH 2.6 in samples from fasting and fed animals (Fig. 4B) with a 19 fold-change of activity. Activated samples were also submitted to pH stability experiments in a pH range from 5.0 to 10.0. Activated samples kept approximately 100% activity when incubated for 3 h at pH 5.0. However remaining activity was only 43% at pH 6. Remaining activities in incubation at pH above 7 were equivalent or lower 11% (Fig. 5A). The effect of the pH on activated crude samples from the MD of animals in both conditions showed pretty similar profiles (Fig. 5B). Maximum activity was obtained at pH 5.5 and at pH 5 the activity is higher than 90%. Despite high activities can be observed at pH 7 (between 60 and 70%), the enzymes are stable only up to 30 min assay, completely lacking their activities after that (data not shown). At pH 8 the residual activity between 9 and 16% that can be observed is no longer stable after 15 min assay (not shown). 3.3. Heterologous expression of cathepsins L1 and L2 from N. cruentata MD The expression of the recombinant NcCTSL1and NcCTSL2 was well succeeded (Fig. 6A and B). All expressions were performed at 37, 30, 25 and 20  C (only shown at 20  C). For NcCTSL1 most part of the protein was aggregated in inclusion bodies. However, in the expression made at 20  C activity could be recovered from the soluble fraction. Despite NcCTSL2 could be obtained as a recombinant enzyme, even at 20  C none activity over Z-FR-MCA could be observed in the soluble fraction. 
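
The fold changes reported in Table 2 can be recomputed from the four Z-FR-MCA specific activities. The following is a minimal sketch for checking that arithmetic; the dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# Specific activities on Z-FR-MCA in mU/mg, copied from Table 2.
SPECIFIC_ACTIVITY = {
    ("fed", "non-activated"): 3.0,
    ("fed", "activated"): 57.0,
    ("fasting", "non-activated"): 0.04,
    ("fasting", "activated"): 47.0,
}


def table2_ratios(activity):
    """Return the fold changes that Table 2 derives from the four activities."""
    fed_raw = activity[("fed", "non-activated")]
    fed_act = activity[("fed", "activated")]
    fasting_raw = activity[("fasting", "non-activated")]
    fasting_act = activity[("fasting", "activated")]
    return {
        # Gain from in vitro acid activation within each feeding state.
        "activation ratio (fed)": fed_act / fed_raw,
        "activation ratio (fasting)": fasting_act / fasting_raw,
        # Gain from 9 h of feeding, without in vitro activation.
        "fed/fasting": fed_raw / fasting_raw,
        # TMAR: activated fed samples over non-activated fasting samples.
        "TMAR": fed_act / fasting_raw,
    }


if __name__ == "__main__":
    for name, value in table2_ratios(SPECIFIC_ACTIVITY).items():
        print(f"{name}: {value:.0f}-fold")
```

Note that the TMAR is the product of the fed/fasting ratio and the activation ratio of fed samples (75 × 19 = 1425), which is why it can be read as the total potential of the zymogen stock relative to the basal fasting activity.
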
The control containing the pAE vector without the insert did not show activity on the same substrate, showing that the observed hydrolysis of Z-FR-MCA by NcCTSL1 is due to the recombinant cathepsin L. NcCTSL1 was purified to apparent homogeneity using an affinity column, since the recombinant protein contains a His6 tag at the N-terminal region (Fig. 6C). Purified recombinant NcCTSL1, obtained as described in the methodology, was used for activation, pH stability and pH effect tests. The expressed NcCTSL1 did not show any activity prior to activation. Incubation at pH 2.6 and 3 was the most efficient for activation, whereas incubation at pH 4, 5 and 6 activated the samples to 18, 25 and 11% of the maximum activation, respectively (Fig. 4C). For the time-course activation, the samples were incubated at pH 3 for a total of 90 min at 30 °C. Maximal activation was observed after 60 min of incubation (Fig. 4D), in contrast to the 2 h needed for complete in vitro activation of the native cysteine cathepsins. Activation for 60 min at pH 3 (30 °C) was established as the standard activation procedure. Stability at distinct incubation pH values showed that the activated enzyme keeps 100% of its activity only at pH 5, completely losing activity at pH 6.0–10.0 (Fig. 5C). The zymogen pH stability experiments were followed by an activation step at pH 3. In contrast to the activated enzyme, the zymogen retains nearly 100% activity at pH 5.0, 6.0 and 7.0. Remaining activities at pH 8.0, 9.0 and 10.0 are 62, 42 and 30%, respectively (Fig. 5C). The effect of pH on the activity of the activated recombinant NcCTSL1 was tested in a pH range from 2.6 to 9.0. The pH optimum was observed at pH 5.5 (Fig. 5D). Kinetic parameters such as Km (4.2 ± 0.6 µM) and kcat (1.2 s⁻¹) were also determined for the recombinant NcCTSL1 using Z-FR-MCA as substrate. No hydrolysis of Z-RR-MCA was observed.
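
To put the reported kinetic parameters in context, the sketch below applies the standard Michaelis-Menten relation to the Km and kcat given above. The fractional saturation only requires that substrate and Km share the same unit; the catalytic efficiency additionally assumes that both concentrations are micromolar, which is an assumption made here and is flagged in the code. Function names are illustrative.

```python
def fraction_of_vmax(substrate, km):
    """Michaelis-Menten saturation v/Vmax = [S] / (Km + [S]); same unit for both."""
    return substrate / (km + substrate)


def catalytic_efficiency(kcat_per_s, km_molar):
    """kcat/Km in M^-1 s^-1, with Km given in mol/L."""
    return kcat_per_s / km_molar


KM = 4.2          # reported Km of recombinant NcCTSL1 for Z-FR-MCA
KCAT = 1.2        # reported kcat, in s^-1
SUBSTRATE = 10.0  # Z-FR-MCA concentration used in the assays, same unit as KM
MICRO = 1e-6      # ASSUMPTION: KM and SUBSTRATE are micromolar

if __name__ == "__main__":
    saturation = fraction_of_vmax(SUBSTRATE, KM)
    efficiency = catalytic_efficiency(KCAT, KM * MICRO)
    print(f"v/Vmax at the assay substrate concentration: {saturation:.2f}")
    print(f"kcat/Km: {efficiency:.2e} M^-1 s^-1")
```

Under these values the standard assay runs at roughly 70% of Vmax, so measured rates respond to changes in both Vmax and Km, which matters when interpreting inhibitor effects.
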
As previously verified, the cysteine cathepsins present in the MD of the spider N. cruentata seem to be inhibited by pepstatin (Fig. 2). To confirm this, pepstatin concentrations ranging from 1 nM to 20 µM were tested using the recombinant NcCTSL1 as enzyme source. Although the exact mechanism of inhibition could not be determined, some important observations can be made. Pepstatin starts to affect NcCTSL1 at 0.5 µM, with a remaining activity of 92%, and the enzyme is almost completely inhibited by 20 µM pepstatin (Fig. 7). Moreover, pepstatin inhibition is time-dependent; thus, the subsequent tests for Km estimation at different pepstatin concentrations were performed with 5 or 25 min of incubation prior to substrate addition. In both cases Vmax is affected by pepstatin, but a longer incubation period causes a larger reduction in this parameter (Table 3). The Km values increase with higher pepstatin concentrations, but with the longer incubation period this increase is smaller (Table 3). In conclusion, pepstatin affects both substrate binding and catalysis by NcCTSL1, which is indicative of mixed-type inhibition. However, the exact inhibition pattern was not clear from plots such as Lineweaver–Burk and Eadie–Hofstee and still needs further investigation. Another remark about this non-specific inhibition is that the pepstatin concentration usually recommended for endopeptidase activity screening is in the micromolar range (Beynon and Bond, 2001). For example, pepstatin inhibition observed with a generic substrate such as hemoglobin at acidic pH could be misinterpreted as aspartic peptidase activity if proper controls are not included. One should also test the effect of the cysteine peptidase inhibitor E-64 and of the presence of a reducing agent in the assay.

Fig. 2. Cation-exchange chromatography of N. cruentata MD samples on a HiTrap S column assayed in the absence and presence of peptidase inhibitors. The column was equilibrated in 0.1 M sodium acetate buffer pH 5.0. Elution was performed using a gradient of 0–100% NaCl (100% = 1 M) in the same buffer. Activity was measured using 10 µM Z-FR-MCA in citrate-phosphate buffer pH 5.0 containing 3 mM cysteine and EDTA. Traces: control, E-64, pepstatin and PMSF.

Fig. 3. Inhibitory assays with CA-074 using the substrates Z-FR-MCA and Z-RR-MCA after hydrophobic interaction chromatography. (A) Fractions from the hydrophobic interaction chromatography of N. cruentata MD assayed with 10 µM Z-FR-MCA and 10 µM Z-RR-MCA in 0.1 M citrate-phosphate buffer pH 5.0. (B) Activity assay with 10 µM Z-RR-MCA in the presence and absence of 10 µM CA-074. (C) Activity assay with 10 µM Z-FR-MCA in the presence and absence of 10 µM CA-074. All assay buffers contained 3 mM cysteine and 3 mM EDTA.

Fig. 4. Acidic activation of cysteine cathepsins from N. cruentata. (A) MD sample incubation at different pH values for 60 min at 30 °C. (B) Effect of time on the acidic activation of MD cysteine cathepsins incubated in 0.1 M citrate-phosphate buffer pH 2.6. Continuous and dashed lines represent fed and fasting spiders, respectively. (C) Recombinant NcCTSL1 incubation at different pH values for 10 min at 30 °C. (D) Effect of time on the acidic activation of NcCTSL1 incubated in 0.1 M citrate-phosphate buffer pH 3.0. Activity percentages are relative to the highest observed hydrolysis rate. In all experiments activity was measured using 10 µM Z-FR-MCA in citrate-phosphate buffer pH 5.0. All buffers contained 3.0 mM cysteine and 3.0 mM EDTA. Values are mean ± SD.

3.4. Phylogenetic analyses

A phylogenetic tree including all identified cysteine cathepsins from N. cruentata is shown in Fig. S1A. Because of bootstrap values lower than 50%, the relationships among most cathepsins L could not be resolved, with the exception of three sister groups formed by cathepsins L1 and L9, L10 and L11, and L5 and L8. This result indicates that cathepsins L are under selective pressure and/or that their divergence is too ancient, since it is not possible to trace their ancestry and infer the original ortholog(s). In contrast, the phylogenetic relationships among the cysteine cathepsin families could be resolved and were supported by bootstrap values. Cathepsin F appeared as a sister group of the cathepsin L-like sequences, and cathepsins O and B are, respectively, more distant. The evolutionary history of arachnid cathepsin L sequences was studied by constructing a phylogenetic tree using Bayesian inference. Again, a well-supported ancestry of the orthologous gene(s) could not be detected, as explained above. However, based on high posterior probabilities (>90%) and phylogenetic proximity, some functional and evolutionary aspects can be inferred from Fig. S1B. Although not well supported, clade A contains all sequences from Acariformes, with the exception of four proteins from Tetranychus urticae. Cathepsins L from Tyrophagus putrescentiae and Sarcoptes scabiei are more closely associated with each other than with those of T. urticae. In the latter species a large number of possible paralogs is observed. Group P is also not well supported at its base, but it is formed exclusively by Parasitiformes sequences, lacking only the Dermanyssus gallinae cathepsin L. Subgroups P1 and P2 are highly supported, and in them a gene duplication of Haemaphysalis longicornis cathepsins L A and B is clearly observed. Group B1 contains the intracellular
NcCTSL1, the digestive cathepsin L2 from the scorpion Tityus serrulatus and two proteins identified in the whole body of the spider Stegodyphus mimosarum. B2 is formed by sequences from the taxa Scorpiones, Opiliones and Orbiculariae, and includes NcCTSL2. Clade B3 is also related to enzymes involved in digestion. Our quantitative data from this work and submitted results on digestion in Tityus serrulatus indicate that those enzymes play an important role in the digestive process, probably intracellularly. Interestingly, no sequence from the velvet spider appears in group B3, leading to the conclusion that this ortholog was lost in this animal or has undergone substantial changes over time. Another presumed gene loss in Stegodyphus mimosarum can be seen in group B2, which includes NcCTSL2. B4 is formed by non-orb-weaver spiders, whereas clade B5 is composed of spiders, one Acariformes species (T. urticae) and Scorpiones.

4. Discussion

4.1. General aspects

The mechanism of digestion in predatory arachnids (with the exception of Opiliones and some groups of Acari), which combines EOD and intracellular food processing, can be considered unique among terrestrial arthropods and is likely an ancestral condition in this taxon (Cohen, 1995). With the exception of this work and that of Atkinson and Wright (1992), authors characterizing spider digestion through enzyme assays have focused on the DF, probably because this secretion is the first to come into contact with the prey, and also because of some limitations of MD studies that should be pointed out. The highly branched MD are intimately associated with the intermediate tissue (IT) (Ludwig and Alberti, 1988), which makes their isolation by dissection impossible. Thus, all the material used for biochemical assays and for proteomic and transcriptomic analyses contains both the digestive epithelium and the IT. Although the exact function of the IT has not been extensively studied, it is believed to have a storage function (Ludwig and Alberti, 1990). Another limitation is the complex and unknown system of vesicular trafficking in the digestive cells, which makes it impossible to know whether the enzyme composition of the digestive vacuoles is completely different from, similar to or the same as that of lysosomes, or even whether these enzymes are present in the intermediate tissue. As a first approach, we assumed that all identified enzymes, and mainly the cysteine cathepsins identified in the MD, are involved in food processing. With these aspects in mind, Fig. 8 must be interpreted as a first sketch of what actually happens inside the digestive cells. The heterologous expression of NcCTSL1 and NcCTSL2, as well as of other digestive enzyme sequences identified by the transcriptomic/proteomic analyses in the present study, will allow antibody preparation for immunohistochemistry experiments.

Fig. 5. Effect of pH on the activity of N. cruentata cysteine cathepsins. (A) Effect of incubation pH on activated MD samples from fasting and fed animals after 3 h of incubation at 30 °C at pH 5.0–10.0. (B) Effect of pH on the activity of activated MD samples from fasting and fed animals. (C) Effect of incubation pH on activated (continuous line) and non-activated* (dashed line) recombinant NcCTSL1 samples after 3 h of incubation at 30 °C at pH 5–10. *In a first step the zymogen was incubated without activation, and it was subsequently activated prior to the activity assays. (D) Effect of pH on the activity of activated recombinant NcCTSL1. Activity percentages are relative to the highest observed hydrolysis rate. Activities (A and C) were measured using 10 µM Z-FR-MCA in 0.1 M citrate-phosphate buffer pH 5.0. Buffers used (all containing 3 mM cysteine and 3 mM EDTA): pH 2.6–7.0, 0.1 M citrate-phosphate; pH 8.0–9.0, 0.1 M Tris-HCl; pH 10.0, 0.1 M Gly-NaOH. Values are mean ± SD.

Fig. 6. Heterologous expression of cathepsins L1 and L2. (A) SDS-PAGE protein profile of expression at 20 °C using BL21 Star(DE3)pLysS cells transformed with pAE-NcCTSL1. (B) SDS-PAGE protein profile of expression at 20 °C using BL21 Star(DE3)pLysS cells transformed with pAE-NcCTSL2. (C) SDS-PAGE after purification of recombinant NcCTSL1 by affinity chromatography. S, standard (kDa); lane 1, non-induced cells; lane 2, cells induced with 1 mM IPTG; lanes 3 and 6, supernatant from IPTG-induced cells after lysis; lanes 4 and 5, pellet from IPTG-induced cells after lysis; E, eluted fraction after purification; FT, flow-through after purification. The arrow indicates the location of the recombinant proteins.

4.2. The physiology of protein digestion in N. cruentata

The characterization of peptidases provides a clearer picture of the digestive process in spiders. The sporadic studies on proteolytic enzymes in spiders focused only on the alkaline hydrolases, such as astacins and trypsins, present in the DF (Foradori et al., 2001, 2006; Kavanagh and Tillinghast, 1983; Mommsen, 1978). The only study so far in which the entire MD was used as enzyme source aimed at measuring collagenolytic activity (Atkinson and Wright, 1992). However, those assays were performed at pH 7.2 for 6 h at 37 °C, and we observed that after 3 h of incubation at 30 °C at pH 7 the cysteine cathepsins from N. cruentata are no longer stable (Fig. 5A). Thus, a biochemical characterization of the acidic enzymes complements the information about the digestive process in spiders. Using a combination of molecular biology, enzymology and mass spectrometry approaches, this work reports, for the first time, that cysteine cathepsins are present at the activity, proteomic and transcriptomic levels in the MD and DF of a spider (Tables 1 and 2). Although the specificity of cathepsin L/F for substrates containing hydrophobic residues at P2, and of cathepsin B for substrates containing positively charged amino acids at P2, is not strict, the comparison of activities on these substrates allows a good inference of the most abundant cathepsin involved in the digestive process of N. cruentata. This analysis, together with the inhibition characterization, suggests that cathepsin L/F is the most abundant form of cysteine peptidase involved in spider digestion (Figs. 2 and 3). This was corroborated by proteomic data, mainly from the MD and from the chromatographically isolated activities on Z-FR-MCA, and by genomic and transcriptomic analyses of other arachnids (Sanggaard et al., 2014; Santamaria et al., 2012), the latter of which also pointed to a larger diversity of cathepsins L associated with the digestive process.

Fig. 7. Inhibition of recombinant NcCTSL1 by pepstatin. Activated recombinant NcCTSL1 was incubated for 5 min at 30 °C with pepstatin concentrations ranging from 1 nM to 20 µM. Activity was measured at pH 5.5 with 10 µM Z-FR-MCA.

Fig. 8. Schematic view of the location of cysteine cathepsins and other endopeptidases in the MD cells of the spider N. cruentata. F: partially digested food.

Table 3. Effect of pepstatin on the Vmax and Km of NcCTSL1.

| Pepstatin (µM) | Vmax (%) a, 5 min | Km (µM), 5 min | Vmax (%) a, 25 min | Km (µM), 25 min |
|---|---|---|---|---|
| 0.5 | 105 ± 8 | 4 ± 1 | 102 ± 13 | 6 ± 1 |
| 1 | 94 ± 10 | 9 ± 0.5 | 93 ± 9 | 7 ± 1 |
| 5 | 66 ± 9 | 10 ± 5 | 55 ± 8 | 22 ± 10 |
| 10 | 39 ± 2 | 12 ± 2 | 22 ± 7 | 21 ± 7 |

The 5 min and 25 min columns refer to the pre-incubation time. Values are mean ± SD of three different experiments.
a Vmax percentage is relative to a control experiment without pepstatin.

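
One way to read the pepstatin data in Table 3 is through the textbook mixed-inhibition rate law, v = Vmax·[S] / (α·Km + α'·[S]), for which the apparent Vmax is Vmax/α' and the apparent Km is α·Km/α'. The sketch below converts each table row into the pair (α, α'). It is only an illustration of that model, not the analysis used in this work: the inhibition pattern was not clearly resolved and the inhibition is time-dependent, so a steady-state model is an approximation. The control Km of 4.2 is taken from the kinetic parameters reported for recombinant NcCTSL1 and is assumed here to apply to these experiments.

```python
KM_CONTROL = 4.2  # ASSUMPTION: Km without pepstatin, same unit as the table's Km values

# Table 3: pre-incubation time -> {pepstatin: (Vmax as % of control, apparent Km)}
TABLE3 = {
    "5 min": {0.5: (105, 4), 1: (94, 9), 5: (66, 10), 10: (39, 12)},
    "25 min": {0.5: (102, 6), 1: (93, 7), 5: (55, 22), 10: (22, 21)},
}


def mixed_inhibition_factors(vmax_percent, km_app, km_control=KM_CONTROL):
    """Return (alpha, alpha_prime) for v = Vmax*S / (alpha*Km + alpha_prime*S)."""
    alpha_prime = 100.0 / vmax_percent          # Vmax_app = Vmax / alpha_prime
    alpha = alpha_prime * km_app / km_control   # Km_app = alpha * Km / alpha_prime
    return alpha, alpha_prime


if __name__ == "__main__":
    for time, rows in TABLE3.items():
        for pepstatin, (vmax_pct, km_app) in rows.items():
            alpha, alpha_prime = mixed_inhibition_factors(vmax_pct, km_app)
            print(f"{time}, pepstatin {pepstatin}: "
                  f"alpha = {alpha:.1f}, alpha' = {alpha_prime:.1f}")
```

Values of α' slightly below 1 at the lowest pepstatin concentration simply reflect Vmax percentages above 100%, which are within the reported standard deviations.
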
Since Z-FR-MCA can be hydrolyzed by cathepsins L, F and B, the NSAF values in Table 1 are the best estimate of the contribution of each cysteine cathepsin to the digestive process in terms of abundance. However, it is still not possible to predict the physiological role of each of these enzymes against the natural substrates in the diet. To explore these differences, we expressed the two most abundant cysteine peptidases. The recombinant and native cathepsins had an optimum pH of 5.5 and are not stable at pH 7 or above, with 100% stability only at pH 5 (Fig. 5). A difference was observed in stability at pH 6, since the native cathepsins still retained 45% of their activity after the incubation period, whereas the recombinant enzyme completely lost its hydrolytic capacity at this pH (Fig. 5A and C, respectively). It is curious that the stability of the native enzymes at pH 6 is low (about 45%) while 87% of the maximal activity is observed at this pH (Fig. 5A and B). The low stability at pH 6 is probably not an artifact of autolysis, since the same should then have occurred at pH 5, where hydrolysis rates are also high. Most likely the enzymes are indeed unstable at this pH for 3 h at 30 °C, and the high activities could be observed because the activity assay usually lasts only 1 h, in contrast to the 3 h of incubation. The same behavior (high activity and low stability during the assay) was observed at pH 7 and 8 for both recombinant and native enzymes. Unlike the activated recombinant enzyme, the recombinant zymogen shows a stability of approximately 100% up to pH 7 and still keeps 30% of its activity even at pH 10 (Fig. 5C). These results agree with the literature, where the zymogen of cathepsin L has been reported to be more stable at neutral and alkaline pH than the mature form (Nomura et al., 1996). The optimum pH values of the recombinant cathepsins L from Rhipicephalus microplus (Renard et al., 2000) and Tenebrio molitor pCAL1a (Cristofoletti et al., 2005) are 5.5 and 5.0, respectively. These cathepsins L are lysosomal enzymes and, like NcCTSL1 and the other native cysteine cathepsins from N. cruentata, are not stable and/or not active at neutral and alkaline pH. NcCTSL1 is likely a lysosomal enzyme because of its acidic characteristics, similar to those of other lysosomal cathepsins L, and because it was not found in the DF. Cathepsins L4 and L8 probably also act inside the lysosomes, because they were not identified in the DF.
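
The NSAF (normalized spectral abundance factor) used above as an abundance estimate is commonly defined as the spectral count of a protein divided by its length, normalized by the sum of that quotient over all proteins considered. The sketch below illustrates the calculation with the exclusive spectrum counts reported for peak P1; the protein lengths are hypothetical placeholders, and normalizing over only three proteins is a simplification for illustration, so the output does not reproduce the values in Table 1.

```python
def nsaf(spectral_counts, lengths):
    """Normalized spectral abundance factor: (SpC/L) / sum over proteins of (SpC/L)."""
    saf = {name: spectral_counts[name] / lengths[name] for name in spectral_counts}
    total = sum(saf.values())
    return {name: value / total for name, value in saf.items()}


# Exclusive spectrum counts reported in the text for peak P1.
P1_COUNTS = {"cathepsin L1": 11, "cathepsin L2": 9, "cathepsin B1": 5}
# HYPOTHETICAL protein lengths in residues; the text does not give them.
P1_LENGTHS = {"cathepsin L1": 330, "cathepsin L2": 335, "cathepsin B1": 340}

if __name__ == "__main__":
    for name, value in nsaf(P1_COUNTS, P1_LENGTHS).items():
        print(f"{name}: NSAF = {value:.2f}")
```

Dividing by length corrects for the fact that longer proteins yield more peptides, and therefore more spectra, at the same molar abundance.
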
In conclusion, our results show that both native and recombinant mature cysteine cathepsins have acidic characteristics with regard to pH stability and hydrolytic capacity and are likely lysosomal enzymes. This means that these enzymes, after in vivo activation, will necessarily act in a controlled pH range around 5, but inside lysosome-like vesicles (or any other subcellular compartment) their zymogens will be able to tolerate slightly alkaline pH prior to acidification. In contrast to the cysteine peptidases involved in insect digestion, for which identification of the zymogen is difficult, Arachnida species have a large quantity of zymogens in the digestive epithelium. The presence of zymogens in the MD of fed animals was confirmed by mass spectrometry, since fragments of the propeptide were also identified for cathepsins L1, L2 and L8 (data not shown). Numerous studies have reported that cysteine cathepsin zymogens can undergo acidic activation to the mature form (Cristofoletti et al., 2005; Jerala et al., 1998; Kramer et al., 2007; Rozman et al., 1999). In this work, we observed that the cysteine cathepsins present in the MD of the spider N. cruentata can be activated by incubation at acidic pH (Fig. 4A). Two hours of incubation at 30 °C and pH 2.6 resulted in full activation of the cysteine cathepsins (Fig. 4B). The theoretical maximum activation ratio (TMAR) was 1425, calculated from the activity of activated samples from fed animals and of non-activated samples from fasting spiders (Table 2). This means that, in theory, if totally activated, the zymogen stock present in the MD has the potential to generate a 1425-fold change in cysteine catheptic activity. Without acidic activation, fasting animals have a basal activity, which increases 75-fold after 9 h of feeding (Table 2). Thus, after this period, only a fraction of the cysteine cathepsin zymogens had been activated, with a theoretical potential for a further 19-fold increase in activity. In the tick Ixodes ricinus, cathepsins L and B are more active during the slow-feeding period between 4 and 6 days after attachment, and activity is not observed in the first 2 days (Franta et al., 2010). However, the authors did not present data on in vitro activation or on the presence of cathepsin zymogens. In the spider N. cruentata, the intracellular digestive process seems to resemble the one performed by ticks and mites. It has been reported, in a histological study, that the spider Coelotes terrestris keeps some unchanged intracellular digestive vacuoles for 2 days after feeding, although others had already been digested (Ludwig and Alberti, 1988). This shows that not all digestive vacuoles are digested at once; some are kept for later digestion, and the mechanism controlling this partial digestion process is still not understood. Thus, a reasonable explanation for the zymogen found in the MD of fed animals is the presence of both intact and active digestive vacuoles, containing the enzymes as zymogen and in the mature form, respectively. Previous works showed that spiders use serine peptidases (Mommsen, 1978) and metallopeptidases (Foradori et al., 2006) for extra-oral digestion. In our entire proteome dataset, at least 25 astacins and 8 trypsins were found in the DF, confirming these previous findings (Fuzita et al., unpublished). Based on all data from this work and the literature, protein digestion in spiders can be depicted as shown in Fig. 8: trypsins and astacins are released by the secretory cells of the MD to start the extra-oral liquefaction of the prey. After that, the partially digested meal is absorbed by pinocytosis and the final digestion takes place inside the cells, with cysteine peptidases acting in the acidic environment of lysosome-like vesicles. The fact that cysteine cathepsins were found at the protein and activity levels in the MD of fasting animals, together with the histological observations that pinocytosis can be observed in the digestive cells twenty minutes after feeding and that the secretory granules are resynthesized after one day (Ludwig and Alberti, 1988), indicates that the spider MD is already prepared for the next predation event.
Thus, within a short period after prey capture, the enzymes needed for EOD are released into the lumen and the endopeptidases necessary for intracellular digestion, which are already synthesized, start a new digestive cycle.

4.3. Evolutionary considerations

In the present work we identified, at the mRNA level, 11 different cathepsins L, 2 cathepsins B, one cathepsin O and one cathepsin F (Table S2). Five cathepsins L, one cathepsin B and one cathepsin F were identified by proteomics (Table 1). Two spider genomes accompanied by proteomic analyses were recently published (Sanggaard et al., 2014), allowing a better comparison of these enzymes in different spider groups. Table S2 presents a comparison of the mRNAs and proteins identified in the present work and by Sanggaard et al. (2014). Although those authors did not study the MD proteome separately, their analyses of other tissues, such as silk glands and the whole body, combined with a phylogenetic tree of arachnid cathepsin L sequences, allowed some inferences. Functionally, group B1 (Fig. S1B) is formed by abundant enzymes of digestive organs, such as NcCTSL1 and cathepsin L2 of the scorpion Tityus serrulatus (Fuzita et al., in press). NcCTSL1 is not secreted, since it was not identified in any of the 17 analyzed DF samples (Table 1). Thus, it is likely that group B1 is composed of intracellular enzymes involved in food processing, which would include two cathepsins L identified at the protein level in the whole body of Stegodyphus mimosarum. Moreover, in the latter spider four other mRNAs were identified that seem to derive from the same original ortholog of this group. In contrast, neither mRNAs nor proteins from the velvet spider appear in group B2, and cathepsin L10 of Tityus serrulatus was not identified by proteomics in a similar approach. Hence, it seems that the use of this ortholog in the MD is particular to N. cruentata (or, extending the analysis, it may be an acquisition of orb-weaver spiders). The sequences from the basal spider Acanthoscurria geniculata did not, in general, form well-supported groups with those of other arachnids.


Group B1 does not include sequences from this spider. It is curious that even sequences from scorpions and harvestmen group in B1, whereas those from the tarantula do not. The best candidates so far for digestive enzymes in this spider are cathepsins L3 and L4 (our nomenclature, Table S2), which were identified by proteomics only in the entire opisthosoma (Sanggaard et al., 2014) and showed a tendency to group with NcCTSL8 in Bayesian (Fig. S1B) and maximum likelihood trees (data not shown), although with low posterior probabilities and bootstrap values, respectively. Group B4 did not contain N. cruentata sequences, but it includes the tarantula cathepsin L identified in the opisthosoma and hemolymph and the velvet spider cysteine peptidase identified only in the whole body and not in the specific tissues studied. Two important features are observed when cathepsin L sequences from the main metazoan taxa are included in the analysis (Fuzita et al., in press). The first is the complete divergence of hard tick sequences from those of other arachnids, forming an almost exclusive group that includes only two harvestmen sequences. It is very likely that this large divergence is related to the particular selective pressures on blood-feeding arachnids. Second, in the more inclusive analysis, group B5 sequences appear in a well-supported clade that includes cathepsins L from different metazoans such as Insecta, Crustacea, Nematoda and Platyhelminthes, and also papain. This indicates an ancient origin of this ortholog, which is shared by different groups, and in all studied spiders the protein was identified by mass spectrometry. However, regarding function, more information is needed since, as can be seen in Table S2, these enzymes were identified in different organs of the three spiders. When the metazoan phylogeny of cathepsin L was constructed, the spider genome sequences were not yet available. However, given the high support of group B5 in Fig.
S1B, it is almost certain that these two sequences would also group together, as stated above. In conclusion, although many functional data still need to be generated for a better evaluation of cathepsin L evolution in arachnids, it seems that the same ortholog can be present with different functions in different taxa and that the number of paralogs may vary according to each clade. Furthermore, the high number of different cathepsin L copies is evidence of the importance of this enzyme in Arachnida.

Acknowledgments

This work was supported by the Brazilian research agencies FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo) (grant number 2005/02486-1) and CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico) (237706/2012-1), and by the Netherlands Proteomics Centre. We are indebted to Drs. C. Ferreira, W.R. Terra and M. Demasi for equipment support. Felipe Jun Fuzita is a graduate fellow from CAPES. A.R. Lopes is a research scientist at Instituto Butantan. Drs. Peter Verhaert and Martijn Pinkse are professors at TU Delft.

Appendix A. Supplementary data

Supplementary data related to this article can be found at http://dx.doi.org/10.1016/j.ibmb.2015.03.005.

References

Atkinson, R.K., Wright, L.G., 1992. The involvement of collagenase in the necrosis induced by the bites of some spiders. Comp. Biochem. Physiol. C-Pharmacol. Toxicol. Endocrinol. 102, 125–128.
Beynon, R., Bond, J.S., 2001. Proteolytic Enzymes, second ed. Oxford University Press Inc., New York, United States.
Carrillo, L., Martinez, M., Ramessar, K., Cambra, I., Castanera, P., Ortego, F., Diaz, I., 2011. Expression of a barley cystatin gene in maize enhances resistance against phytophagous mites by altering their cysteine-proteases. Plant Cell Rep. 30, 101–112.
Cohen, A.C., 1995. Extraoral digestion in predaceous terrestrial arthropoda. Annu. Rev. Entomol. 40, 85–103.
Collins, P.R., Stack, C.M., O'Neill, S.M., Doyle, S., Ryan, T., Brennan, G.P., Mousley, A., Stewart, M., Maule, A.G., Dalton, J.P., Donnelly, S., 2004. Cathepsin L1, the major protease involved in liver fluke (Fasciola hepatica) virulence - propeptide cleavage sites and autoactivation of the zymogen secreted from gastrodermal cells. J. Biol. Chem. 279, 17038–17046.
Cristofoletti, P.T., Ribeiro, A.F., Terra, W.R., 2005. The cathepsin L-like proteinases from the midgut of Tenebrio molitor larvae: sequence, properties, immunocytochemical localization and function. Insect Biochem. Mol. Biol. 35, 883–901.
Drummond, A.J., Suchard, M.A., Xie, D., Rambaut, A., 2012. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol. Biol. Evol. 29, 1969–1973.
Foelix, R., 2010. The Biology of Spiders, third ed. Oxford University Press, USA.
Foradori, M.J., Keil, L.M., Wells, R.E., Diem, M., Tillinghast, E.K., 2001. An examination of the potential role of spider digestive proteases as a causative factor in spider bite necrosis. Comp. Biochem. Physiol. C-Toxicol. Pharmacol. 130, 209–218.
Foradori, M.J., Tillinghast, E.K., Smith, J.S., Townley, M.A., Mooney, R.E., 2006. Astacin family metallopeptidases and serine peptidase inhibitors in spider digestive fluid. Comp. Biochem. Physiol. B-Biochem. Mol. Biol. 143, 257–268.
Franta, Z., Frantova, H., Konvickova, J., Horn, M., Sojka, D., Mares, M., Kopacek, P., 2010. Dynamics of digestive proteolytic system during blood feeding of the hard tick Ixodes ricinus. Parasites Vectors 3, 11.
Franta, Z., Sojka, D., Frantova, H., Dvorak, J., Horn, M., Srba, J., Talacko, P., Mares, M., Schneider, E., Craik, C.S., McKerrow, J.H., Caffrey, C.R., Kopacek, P., 2011. IrCL1 - the haemoglobinolytic cathepsin L of the hard tick, Ixodes ricinus. Int. J. Parasitol. 41, 1253–1262.
Fuzita, F.J., Pinkse, M.W.H., Patane, J.S.L., Juliano, M.A., Verhaert, P.D.E.M., Lopes, A.R. Biochemical, transcriptomic and proteomic analyses of digestion in the scorpion Tityus serrulatus: insights into function and evolution of digestion in an ancient arthropod. PLoS One (in press).
Hu, K.-J., Leung, P.-C., 2007. Food digestion by cathepsin L and digestion-related rapid cell differentiation in shrimp hepatopancreas. Comp. Biochem. Physiol. B-Biochem. Mol. Biol. 146.
Jerala, R., Zerovnik, E., Kidric, J., Turk, V., 1998. pH-induced conformational transitions of the propeptide of human cathepsin L - a role for a molten globule state in zymogen activation. J. Biol. Chem. 273, 11498–11504.
Kavanagh, E.J., Tillinghast, E.K., 1983. The alkaline proteases of Argiope. 2. Fractionation of protease activity and isolation of a silk fibroin digesting protease. Comp. Biochem. Physiol. B-Biochem. Mol. Biol. 74, 365–372.
Kramer, G., Paul, A., Kreusch, A., Schuler, S., Wiederanders, B., Schilling, K., 2007. Optimized folding and activation of recombinant procathepsin L and S produced in Escherichia coli. Protein Expr. Purif. 54, 147–156.
Laemmli, U.K., 1970. Cleavage of structural proteins during assembly of head of bacteriophage T4. Nature 227, 680.
Liebensteiner, M.G., Pinkse, M.W.H., Schaap, P.J., Stams, A.J.M., Lomans, B.P., 2013. Archaeal (per)chlorate reduction at high temperature: an interplay of biotic and abiotic reactions. Science 340, 85–87.
Liu, X.W., Inbar, Y., Dorrestein, P.C., Wynne, C., Edwards, N., Souda, P., Whitelegge, J.P., Bafna, V., Pevzner, P.A., 2010. Deconvolution and database search of complex tandem mass spectra of intact proteins. Mol. Cell. Proteomics 9, 2772–2782.
Ludwig, M., Alberti, G., 1988. Digestion in spiders - histology and fine-structure of the midgut gland of Coelotes terrestris (Agelenidae). J. Submicrosc. Cytol. Pathol. 20, 709–718.
Ludwig, M., Alberti, G., 1990. Peculiarities of arachnid midgut glands. Acta Zool. Fenn. 255–259.
Mommsen, T.P., 1978. Digestive enzymes of a spider (Tegenaria atrica Koch). 1. General remarks, digestion of proteins. Comp. Biochem. Physiol. A-Physiol. 60, 365–370.
Muth, T., Vaudel, M., Barsnes, H., Martens, L., Sickmann, A., 2010. XTandem Parser: an open-source library to parse and analyse X!Tandem MS/MS search results. Proteomics 10, 1522–1524.
Nomura, T., Fujishima, A., Fujisawa, Y., 1996. Characterization and crystallization of recombinant human cathepsin L. Biochem. Biophys. Res. Commun. 228, 792–796.
Petersen, T.N., Brunak, S., von Heijne, G., Nielsen, H., 2011. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat. Methods 8, 785–786.
Pimenta, D.C., Oliveira, A., Juliano, M.A., Juliano, L., 2001. Substrate specificity of human cathepsin D using internally quenched fluorescent peptides derived from reactive site loop of kallistatin. Biochim. Biophys. Acta-Protein Struct. Mol. Enzym. 1544, 113–122.
Ramos, C.R.R., Abreu, P.A.E., Nascimento, A., Ho, P.L., 2004. A high-copy T7 Escherichia coli expression vector for the production of recombinant proteins with a minimal N-terminal his-tagged fusion peptide. Braz. J. Med. Biol. Res. 37, 1103–1109.
Rawlings, N.D., Salvesen, G., 2013. Handbook of Proteolytic Enzymes, third ed. Elsevier Science Publishing Co Inc., United States.
Renard, G., Garcia, J.F., Cardoso, F.C., Richter, M.F., Sakanari, J.A., Ozaki, L.S., Termignoni, C., Masuda, A., 2000. Cloning and functional expression of a Boophilus microplus cathepsin L-like enzyme. Insect Biochem. Mol. Biol. 30.
Renard, G., Lara, F.A., de Cardoso, F.C., Miguens, F.C., Dansa-Petretski, M., Termignoni, C., Masuda, A., 2002. Expression and immunolocalization of a Boophilus microplus cathepsin L-like enzyme. Insect Mol. Biol. 11, 325–328.
Rozman, J., Stojan, J., Kuhelj, R., Turk, V., Turk, B., 1999. Autocatalytic processing of recombinant human procathepsin B is a bimolecular process. FEBS Lett. 459, 358–362.
Sambrook, J., Russell, D.W., 2001. Molecular Cloning: a Laboratory Manual, third ed. Cold Spring Harbor Laboratory Press, New York, United States.
Sanggaard, K.W., Bechsgaard, J.S., Fang, X.D., Duan, J.J., Dyrlund, T.F., Gupta, V., Jiang, X.T., Cheng, L., Fan, D.D., Feng, Y., Han, L.J., Huang, Z.Y., Wu, Z.Z., Liao, L., Settepani, V., Thogersen, I.B., Vanthournout, B., Wang, T., Zhu, Y.B., Funch, P., Enghild, J.J., Schauser, L., Andersen, S.U., Villesen, P., Schierup, M.H., Bilde, T., Wang, J., 2014. Spider genomes provide insight into composition and evolution of venom and silk. Nat. Commun. 5. Santamaria, M.E., Hernandez-Crespo, P., Ortego, F., Grbic, V., Grbic, M., Diaz, I., Martinez, M., 2012. Cysteine peptidases and their inhibitors in Tetranychus urticae: a comparative genomic approach. BMC Genomics 13. Searle, B.C., 2010. Scaffold: a bioinformatic tool for validating MS/MS-based proteomic studies. Proteomics 10, 1265e1269. Smith, D.J., Maggio, E.T., Kenyon, G.L., 1975. Simple alkanethiol groups for temporary blocking of sulfhydryl groups of enzymes. Biochemistry 14, 766e771. Smith, P.K., Krohn, R.I., Hermanson, G.T., Mallia, A.K., Gartner, F.H., Provenzano, M.D., Fujimoto, E.K., Goeke, N.M., Olson, B.J., Klenk, D.C., 1985. Measurement of protein using bicinchoninic acid. Anal. Biochem. 150, 76e85. Stephens, A., Rojo, L., Araujo-Bernal, S., Garcia-Carreno, F., Muhlia-Almazan, A., 2012. Cathepsin B from the white shrimp Litopenaeus vannamei: cDNA sequence analysis, tissues-specific expression and biological activity. Comp. Biochem. Physiol. B-Biochem. Mol. Biol. 161, 32e40. Tamura, K., Stecher, G., Peterson, D., Filipski, A., Kumar, S., 2013. MEGA6: molecular evolutionary genetics analysis Version 6.0. Mol. Biol. Evol. 30, 2725e2729. Terra, W.R., Ferreira, C., 2012. Biochemistry and molecular biology of digestion. In: Gilbert, L.I. (Ed.), Insect Molecular Biology and Biochemistry. Academic Press, London, pp. 355e418. Thompson, J.D., Higgins, D.G., Gibson, T.J., 1994. 
CLUSTAL-W - improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673e4680. Turk, B., Dolenc, I., Lenarcic, B., Krizaj, I., Turk, V., Bieth, J.G., Bjork, I., 1999. Acidic pH as a physiological regulator of human cathepsin L activity. Eur. J. Biochem. 259, 926e932. Zhang, Y., Frohman, M.A., 2000. Using rapid amplification of cDNA ends (RACE) to obtain full-length cDNAs. In: Nucleic Acid Protocols Handbook, pp. 267e288. Zybailov, B., Mosley, A.L., Sardiu, M.E., Coleman, M.K., Florens, L., Washburn, M.P., 2006. Statistical analysis of membrane proteome expression changes in Saccharomyces cerevisiae. J. Proteome Res. 5, 2339e2347.
