Journal of Microbiological Methods 98 (2014) 76–83

Contents lists available at ScienceDirect

Journal of Microbiological Methods journal homepage: www.elsevier.com/locate/jmicmeth

Extracellular protein biomarkers for the characterization of enterohemorrhagic and enteroaggregative Escherichia coli strains Rabih E. Jabbour a, Samir V. Deshpande b, Patrick E. McCubbin c, James D. Wright a, Mary Margaret Wade a, A. Peter Snyder a,⁎ a b c

Research and Technology Directorate, Edgewood Chemical Biological Center, US Army Aberdeen Proving Ground, MD 21010, United States Science and Technology Corporation Edgewood, MD 21040, United States Optimetrics, Inc., Abingdon, MD 21009, United States

a r t i c l e

i n f o

Article history: Received 10 September 2013 Received in revised form 18 December 2013 Accepted 20 December 2013 Available online 31 December 2013 Keywords: Electrospray Liquid chromatography–tandem mass spectrometry Proteomics Enterohemorrhagic E. coli Enteroaggregative E. coli Extracellular proteins

a b s t r a c t The extracellular proteins (ECPs) of enterohemorrhagic Escherichia coli (EHEC) can cause hemorrhagic colitis which may cause life threatening hemolytic-uremic syndrome, while that of enteroaggregative E. coli (EAEC) can clump to intestinal membranes. Liquid chromatography–electrospray ionization-tandem mass spectrometry based proteomics is used to evaluate a preliminary study on the extracellular and whole cell protein extracts associated with E. coli strain pathogenicity. Proteomics analysis, which is independent of genomic sequencing, of EAEC O104:H4 (unsequenced genome) identified a number of proteins. Proteomics of EHEC O104:H4, causative agent of the Germany outbreak, showed a closest match with E. coli E55989, in agreement with genomic studies. Dendrogram analysis separated EHEC O157:H7 and EHEC/EAEC O104:H4. ECP analysis compared to that of whole cell processing entails few steps and convenient experimental extraction procedures. Bacterial characterization results are promising in exploring the impact of environmental conditions on E. coli ECP biomarkers with a few relatively straightforward protein extraction steps. Published by Elsevier B.V.

1. Introduction The US Government has initiated efforts for the detection and identification of biological threats in the Defense Advanced Research Projects Agency programs that explore the “detect-to-protect” and “detect-to-treat” paradigms (National Research Council, 2005; Demirev et al., 2005). These initiatives cover areas of general health risk, bioterrorism utility, Homeland Security, agricultural monitoring, food safety, and biological warfare agents in battlefield situations (Demirev et al., 2005). Some of the health concerns include food contamination outbreaks that affect the military and civilian populations and also the transmission from abroad to the USA. An example of the latter is the fatal enteroaggregative Escherichia coli (EAEC) O104:H4 outbreak that occurred in Germany in 2011, which infected citizens from sixteen different nations including the USA (http://www.euro.who.int/en/whatwe-do/health-topics/emergencies/international-health-regulations/ news/news/2011/07/outbreaks-of-e.-coli-o104h4-infection-update30; Perna et al., 2001; European Food Safety Authority, 2011; Mellmann et al., 2011).

⁎ Corresponding author at: Building E3160, ECBC, Edgewood Area, Aberdeen Proving Ground, MD 21010, United States. E-mail address: [email protected] (A.P. Snyder). 0167-7012/$ – see front matter. Published by Elsevier B.V. http://dx.doi.org/10.1016/j.mimet.2013.12.017

The O104:H4 strain exhibits characteristics of enterohemorrhagic E. coli and enteroaggregative E. coli (EHEC and EAEC, respectively). EHEC strains can cause hemorrhagic colitis (bloody diarrhea or hemolytic uremic syndrome), and EAEC produces clumping of the bacteria on intestinal membranes. EHEC and EAEC cause diseases in humans by their presence in food and water matrices. Infection in host cells is by attaching and effacing mechanisms. These occur by the pathogen secreting various proteins that compromise the cytoskeleton of the host cell (Frankel et al., 1998). EHEC and EAEC pathogens show different responses to antibiotics, and their human pathogenicity is enhanced with an antibiotic regimen. Studies have reported the differences in the number and nature of the secreted proteins (secretome) between EHEC and EAEC (Deng et al., 2012). Development of protein/peptide identification techniques between EHEC and EAEC is desirable to provide effective medical countermeasures. The term secretome formally includes proteins and enzymes that are actively and deliberately released (secreted) from the cellular milieu and into the extracellular medium. However, it is routine to find proteins in the extracellular medium that normally are not secreted (Li et al., 2004; Nandakumar et al., 2006). In addition to active bacterial secretion, proteins may exit the cell from proteolytic cleavage, leaking, or shedding of membranes due to their relative localization (Meissner et al., 2013; Nandakumar et al., 2006). These mechanisms can occur without cellular lysis, thus the presence of intracellular-located proteins

R.E. Jabbour et al. / Journal of Microbiological Methods 98 (2014) 76–83

in the extracellular medium. This results in ‘contamination’ of the extracellular medium by the cytoplasmic or other normally non-secreted proteins following cell lysis and cell death (Zheng et al., 2013). These reports show that likelihood of finding non-secreted proteins is significant in a secretome study; therefore, we use the term extracellular proteins (ECPs) to include the true secretome and the other proteins that were not deliberately secreted from the bacterium. The present study sought to determine whether MS-based proteomics could be used to distinguish between EHEC and EAEC strains with respect to their extracellular and whole cell protein compositions. A contrast between the two types of proteins is that there are inherently fewer protein experimental extraction and separation steps from extracellular origin than that of the whole cell lysate. An advantage is that the ECP fraction facilitates the liquid chromatography–electrospray ionization-tandem mass spectrometry (LC–ESI-MS–MS) discovery of the low amounts and numbers of proteins. In addition, the use of MSbased proteomics is useful in characterizing and identifying biological agents without a known genome sequence (Jabbour et al., 2010a) as is the case with the EHEC/EAEC O104:H4 strain. 2. Methods 2.1. EHEC and EAEC strain preparation In the present study the pathogenic E. coli strains used were O157: H7 (EHEC) and O104:H4. The cultures were prepared by streaking cells from cryopreserved stocks onto tryptic soy broth (TSB) and incubated at 37 °C with shaking at 180 rpm overnight until a stationary growth phase was observed. After incubation, bacterial cells were harvested by centrifugation and the viable cell density was determined via serial dilution and plating onto tryptic soy agar (TSA). Three biological replicates for each strain were cultivated as mentioned above. 2.2. Isolation of the secreted proteins The harvested cells were pelleted by centrifugation at 2300 revolution centrifugal force (RCF) for 30 min and the supernatant was immediately separated into 30 mL aliquots. The supernatants were filtered

77

using 0.22 μm hollow fiber dialysis filters to ensure no large particulates or cellular debris was present in the samples. Pelleted and supernatant samples were frozen at −70 °C until further processing. 2.3. Processing of secreted and whole cell proteins The whole cell samples were lysed using the bead beating technique (30 s on, 10 s off for three min duration). The lysates were centrifuged at 14,100 ×g for 30 min to remove cellular debris and large particulates. The supernatant from the whole cell lysates and the filtered ECP samples were loaded separately onto PALL MW-3 kDa filter units (Ann Arbor, MI) and centrifuged at 14,100 ×g for 30 min. The effluents were discarded and the filter membranes were washed with 100 mM ammonium bicarbonate (ABC) and centrifuged for 20 min at 14,100 ×g. Proteins from the whole cell and secretome fractions were denatured by adding 8 M urea and 30 mg/mL dithiothreitol (DTT) to the filter and incubating for an hour at 40 °C. The tubes were then centrifuged at 14,100 ×g for 40 min and washed three times using 150 mL of 100 mM ABC solution. On the last wash, ABC was allowed to sit on the membrane for 20 min while shaking, followed by centrifugation at 14,100 ×g for 40 min. The filter units were then transferred to new receptor tubes and the proteins were digested with 5 μL trypsin in 240 μL of ABC solution + 5 μL acetonitrile (ACN). Proteins were digested overnight at 37 °C on an orbital shaker set to 90 rpm. Sixty microliters of 5% ACN/0.5% formic acid (FA) was added to each filter to quench the trypsin digestion followed by 2 min of vortexing. The tubes were centrifuged for 10 min at 14,100 ×g. An additional 60 mL of 5% ACN/0.5% FA mixture was added to the filter and centrifuged. The effluents were then analyzed using LC–ESI-MS–MS. 2.4. LC–MS/MS analysis of tryptic peptides The tryptic peptides were separated using a capillary Hypersil C18 column (300 Å, 5 μm, 0.1 mm i.d. × 100 mm) by using the Surveyor LC from ThermoFisher (San Jose, CA 95101). The elution was performed using a linear gradient from 98% A (0.1% FA in water) and 2% B (0.1% FA in ACN) to 60% B over 60 min. This is followed by 20 min of isocratic elution at a flow rate of 200 μL/min with a split flow mode in which

Fig. 1. Histogram representing the output of the matrix of degenerate and unique peptides identified for E. coli sakai from ABOID. Peptides were extracted at a 95% confidence level. ECOL, E. coli; SDYS, Shigella dysenteriae; SBOY, S. boydii; SFLE, S. flexneri; SSON, and S. sonnei.

78

R.E. Jabbour et al. / Journal of Microbiological Methods 98 (2014) 76–83

Table 1 Partial listing of extracellular proteins from replicate analyses of identified peptides from the E. coli O157:H7 strain. Accession numbera

Protein name

Highest hit

Process

Function

Component

AP002538.1 AP003849.1 NP_288384.1

Flagellar filament structural protein DNA-binding transcriptional dual regulator Flagellin

EC O157:H7 EC O157:H7 EC O157:H7

Ciliary or flagellar motility Binding Ciliary or flagellar motility

ND Transcription Structural molecule activity

Bacterial-type flagellum hook ND Bacterial-type flagellum filament

a

Accession numbers as of September, 2012. Proteins identified by strain-unique peptides.

between 0.8 and 1.2 μL/min are diverted into the ESI source. The resolved peptides were electrosprayed into a linear ion trap mass spectrometer (LTQ-XL, Thermo Scientific, San Jose, CA 95101). Product ion mass spectra were obtained in the data dependent acquisition mode that consisted of a survey scan over the m/z range of 400–2000 followed by seven scans on the most intense precursor ions activated for 30 ms by an excitation energy level of 35%. A dynamic exclusion was activated for 3 min after the first MS–MS spectrum acquisition for a given ion. Uninterpreted product ion mass spectra were searched against a microbial database with TurboSEQUEST (Bioworks 3.1, Thermo Scientific, San Jose, CA 95101) followed by application of an in-house proteomic algorithm for bacterial identification. 2.5. Protein database and database search engine A protein database was constructed in a FASTA format using the annotated bacterial proteome sequences derived from fully sequenced chromosomes of all available E. coli strains. There were 54 documented strains as of September 2012 with 13 EHEC and five EAEC strains. A PERL program (http://www.activestate.com/ActivePerl; accessed September 2013) was written to download these sequences automatically from the National Institutes of Health National Center for Biotechnology (NCBI) site (http://www.ncbi.nlm.nih.gov; accessed September 2013). Of the two E. coli strains investigated, O104:H4 does not have its genome sequenced. Each database protein sequence was supplemented with information about a source organism and a genomic position of the respective open reading frame (ORF) embedded into a header line. The database of the E. coli bacterial proteome was constructed by translating putative protein-coding genes and consists of a few million amino acid sequences of potential tryptic peptides obtained by the in-silico digestion of all proteins (allowing up to two missed cleavages). The experimental MS–MS spectral database of bacterial peptides was searched using the SEQUEST algorithm against a constructed proteome database of microorganisms. The SEQUEST thresholds for searching the product ion mass spectra of peptides were Xcorr, deltaCn, Sp, RSp, and deltaMpep. The top peptide hits generated by SEQUEST were filtered with a relative correlation score ΔCn N 0.1 and the filtered hits were accepted as peptide identifications when their correlation scores (Xcorr) were higher than the thresholds that allowed generating a desired FDR value (Peng et al., 2003). The peptide validation was performed using the PeptideProphet algorithm (Keller et al., 2002) to provide probabilities score on the peptide assignment to the MS/MS spectra. Peptide sequences with a probability score of 95% and higher were retained in the dataset and used to generate a binary matrix of sequence-to-bacterium (STB) assignments. The binary matrix assignment was populated by matching the peptides with corresponding

proteins in the database and assigning a score of 1. A score of 0 was assigned for a non-match. The column in the binary matrix represents the proteome of a given E. coli strain, and each row represents a tryptic peptide sequence from the LC–MS-MS analysis. A protein is identified as present when it is found in two or more of the three biological replicates that were analyzed for each E. coli strain. Analyzed samples were matched with the E. coli strains based on the number of unique peptides that remained after further filtering of degenerate peptides from the binary matrix. Verification of the classification and identification of candidate microorganisms was performed through hierarchical clustering analysis and taxonomic classification. The in-house developed software ABOID (Jabbour et al., 2010b; Deshpande et al., 2011) transformed the results of searching the product ion mass spectra of peptide ions against a custom protein database, which was downloaded from NCBI with the commercial software SEQUEST into a taxonomically meaningful and easy to interpret output. It calculated probabilities that peptide sequence assignment to a product ion mass spectrum was correct and used accepted spectrum-tosequence matches to generate an STB binary matrix of assignments. Validated peptide sequences, differentially present or absent in various strains (STB matrices), were visualized as assignment bitmaps and analyzed by ABOID that used phylogenetic relationships among E. coli strains as part of the decision tree process. The ABOID bacterial classification and identification algorithm uses assignments of organisms to taxonomic groups (phylogenetic classification) based on an organized scheme that begins at the phylum level and follows through classes, orders, families, and genus down to the strain level. ABOID was developed in-house using PERL, MATLAB, and Microsoft Visual Basic. 3. Results and discussion 3.1. ABOID algorithm output An example of an ABOID output for the LC–ESI-MS–MS analyses of bacterial protein digests using bioinformatics to process the peptide sequence information can be found in Figure 1 of Jabbour et al., 2010a. The top section lists the identified proteins and their corresponding bacterium match. The middle section presents the binary matrix for the STB search matching, and the total number of unique proteins identified for a given bacterium. The lower section of Figure 1 in Jabbour et al. (2010a) represents the bacterial identification histogram. Fig. 1 herein shows an output of ABOID in histogram form. This graph is generated by plotting the number of peptides which match that in the proteome of every microorganism in the database. The y-axis represents the number of peptides matched at 95% confidence level for all strains on the x-axis. Each line represents a bacterial strain, and the number

Table 2 Extracellular proteins from replicate analyses of identified peptides from the E. coli O104:H4 strain. Accession #

Protein name

Highest hit

Process

Function

Component

YP_003223560.1 YP_001463426.1 YP_002292692.1 YP_003229309.1 YP_541664.1 NP_286019.1

Secreted autotrans-porter serine protease Multidrug efflux system subunit MdtA Conserved hypothetical protein Putative DNA primase DNA-binding protein Hypotheti-cal protein

EC O103:H2 EC O139:H28 EC SE11 EC O26:H11 EC UTI89_C2667 EC O157:H7

Proteolysis Transport ND ND Nitrogen utilization Lipoprotein metabolic process

Serine-type endopeptidase activity Transporter activity ND ND DNA binding Lipase/hydrol-ase activities

Peptidase activity Plasma membrane ND ND ND Lipid particle

Accession numbers as of September, 2012. Proteins identified by strain-unique peptides.

R.E. Jabbour et al. / Journal of Microbiological Methods 98 (2014) 76–83

above each line indicates the number of experimental peptides whose sequences matched in the respective bacterial strain's proteome. These organisms span members of E. coli strains and Shigella species where ECOL, E. coli; SDYS, Shigella dysenteriae; SBOY, Shigella boydii; SFLE, Shigella flexneri; SSON, and Shigella sonnei. Seven of the 25 peptides are degenerate, because they can be found in more than one organism proteome. However, it is observed that E. coli sakai rises higher than the rest of the bacteria due to the remaining 18 unique peptides (Fig. 1). These 18 peptides distinguish the E. coli sakai strain from the seven degenerate peptides. The experimentally-determined E. coli sakai in the histogram matched the double-blind bacterial sample. 3.2. Determination of common proteins using the ECP fraction for EHEC and EAEC strains EHEC and EAEC strains O157:H7 and O104:H4, respectively, were analyzed by MS proteomics to determine the ECPs. Tables 1 and 2 list the ECPs that were consistently obtained from three analyses of the O157:H7 and O104:H4 strains, respectively. These analyses originated from three separate TSB preparations for each strain. The BLAST procedure was used for the peptides that identified the proteins in Tables 1 and 2. Protein matching was performed using the uniprotKB database (http://www.uniprot.org/help/uniprotKB). The uniprotKB is a nonredundant database that includes all sequenced microbes and provides biological ontologies, classifications and cross-references, cellular processes, and biochemical function for each protein. One or more peptides were used to identify a protein in the in-house database (Wang et al., 2010; Dworzanski et al., 2010). Further, one or more peptides were used to characterize one strain of E. coli (Dworzanski et al., 2006; Dworzanski et al., 2004; Dworzanski and Snyder, 2005). In Table 1, the three listed proteins have the highest hit/correlation with O157: H7 and two of the three proteins feature cellular functionality related

79

to the flagellum filament. The dominant flagellar functions are often observed with EHEC bacteria as the responsible pathogenic factors (Frankel et al., 1998). This agreement between the genomics and proteomics studies showed that the MS-proteomics approach may be used as an effective complement to genomic-based techniques. On the other hand, the data showed that the proteins identified were consistent regardless of the database used. For example, utilization of the in-house database that includes only E. coli strains resulted in the same match as that found in the uniprotKB database which includes all sequenced bacteria. Table 1 represents the output of the uniprotKB database analyses for the proteins identified in the ECP fraction of the O157:H7 strain of E. coli. The proteins were first identified using ABOID, and then the uniprotKB database determined the non-redundant matching and cellular functions and processes. The E. coli O104:H4 strain is not included in either database, because its genomic sequence has not been performed. The O157:H7 strain is the highest, consistent match in Table 1 and represents the closest (and correct) match between the studied strain (0157:H7) and that of bacterial strains in the uniprotKB database. All three matches consist of EHEC E. coli O157:H7 strain. A different set of circumstances exists for the experimental O104:H4 strain of E. coli. As mentioned, this organism is not present in any on-line database because its genome has not been sequenced. A proteome analysis is outlined in Table 2. Only one of the six matches was with the O157:H7 strain which indicates that O104:H4 is not closely related to EHEC strains. Also, the proteins consistently identified from replicate experimental O104:H4 analyses were diverse in cellular function compared to those of O157:H7. UniprotKB database examination of the cellular functions on the O104:H4 common proteins revealed the potential cellular functionality of the tryptic peptides. The uniprotKB cellular function tools utilize solid thick colored lines (see Fig. 2) to represent the different cellular functions for each active site in a given protein.

secreted auto transporter serine protease YP_003223560.1

Fig. 2. UniprotKB cellular function identification tool, InterProScan, for the autotransporter serine protease consistently identified in the ECP fractions of E. coli O104:H4 replicate experiments. The dotted oval represents the peptides identified from LC–MS/MS analyses. This region confers verotoxicity.

80

R.E. Jabbour et al. / Journal of Microbiological Methods 98 (2014) 76–83

For example, the tryptic peptides that correspond to the identified extracellular autotransporter serine protease were located in the region of the protein that constitutes the virulence function as shown in Fig. 2. The dotted oval represents the region of the identified peptides for the serine protease for the ECP fraction of the O104:H4 strain, and it is this region that confers verotoxicity. The relatively low number of proteins listed in Tables 1 and 2 is a portion of the 74 and 54 identified proteins, respectively, from their extracellular mediums. The majority of proteins were identified as intracellular vs. the few genuine secreted proteins from the uniprotKB database. These intracellular proteins likely were in the supernatant due to cell death such as from lyase autoproteolysis, cell lysis, and the other mechanisms as outlined in the Introduction. Xia et al. (2008) however, list 83 extracellular proteins from two E. coli strains of which

the majority are known to be of intracellular origin. The experimental method (Xia et al., 2008) included the usual protein extraction, purification, and isolation procedures. 3.3. Effect of cellular fraction on the differentiation of EHEC O157:H7 Whole cell and extracellular fractions from O157:H7 were analyzed by LC–MS-MS followed by ABOID data processing. The samples were correctly established as the O157:H7 strain but with more ambiguity using the whole cell fraction compared to the extracellular one. A nearest neighbor analysis, using the Euclidean distance single linkage approach, showed 100% similarity (linkage distance) between the analyzed ECPs (Fig. 3a) and nearest neighbors in the database. There was approximately 92% similarity (8% dissimilarity) for O157:H7 between

(a) Escherichia coli O157:H7 str. TW14359Escherichia coli O157:H7 str. SakaiEscherichia coli O157:H7 EDL933Escherichia coli O157:H7 str. EC4115Unknown_Sample Escherichia coli O55:H7 str. CB9615Escherichia coli- UTI89 Escherichia coli S88Shigella boydii Sb227Shigella boydii CDC 3083-94Escherichia coli- W3110 Escherichia coli SE11Escherichia coli str. K-12 substr. MG1655Escherichia coli E24377AEscherichia coli SMS-3-5Escherichia coli UMN0260.00

0.02

0.04

0.06

0.08

0.10

Linkage Distance

(b) Escherichia coli O157:H7 str. SakaiEscherichia coli O157:H7 EDL933Escherichia coli O157:H7 str. TW14359Escherichia coli O157:H7 str. EC4115Escherichia coli O55:H7 str. CB9615Unknown_Sample Escherichia coli O103:H2 str. 12009Escherichia coli O111:H- str. 11128Escherichia coli O26:H11 str. 11368Escherichia coli SE11Escherichia coli ATCC 8739Escherichia coli 55989Escherichia coli E24377AEscherichia coli IAI1Escherichia coli str. K-12 substr. MG1655Escherichia coli- W3110 0.00

0.02

0.04

0.06

0.08

0.10

0.12

0.14

Linkage Distance Fig. 3. Euclidean distance single linkage dendrograms for the nearest neighbor classification of EHEC E. coli O157:H7 from the (a) extracellular and (b) whole cell fractions.

R.E. Jabbour et al. / Journal of Microbiological Methods 98 (2014) 76–83

81

(a) Escherichia coli 55989Unknown_Sample Escherichia coli HSEscherichia coli IAI1Escherichia coli SE11Escherichia coli BW2952Escherichia coli K 12 substr DH10BEscherichia coli SMS 3 5Escherichia coli K 12 substr MG1655Escherichia coli O55 H7 CB9615Escherichia coli O111 H 11128Escherichia coli O26 H11 11368Escherichia coli DH1Escherichia coli K 12 substr W3110Escherichia coli SE15Escherichia coli UMN0260.00

0.01

0.02

0.03

0.04

0.05

0.06

0.07

0.08

0.09

Linkage Distance

(b) Escherichia coli 55989Unknown_Sample Escherichia coli O157:H7 str. TW14359Escherichia coli O157:H7 str. EC4115Escherichia coli O157:H7 str. SakaiEscherichia coli O157:H7 EDL933Escherichia coli HSEscherichia coli O55:H7 str. CB9615Escherichia coli SE11Escherichia coli- W3110 Escherichia coli str. K-12 substr. MG1655Escherichia coli BW2952Escherichia coli IAI1Escherichia coli O103:H2 str. 12009Escherichia coli O26:H11 str. 11368Escherichia coli O111:H- str. 111280.00

0.01

0.02

0.03

0.04

0.05

0.06

0.07

Linkage Distance Fig. 4. Euclidean distance single linkage dendrograms for the nearest neighbor classification of pathogenic E. coli O104:H4 from the (a) extracellular and (b) whole cell fractions.

the whole cell fraction (Fig. 3b) and database O157:H7. This similarity difference between whole cell and extracellular protein fractions could be attributed to a number of factors. It is noted that the similarity index used herein measures how close an organism's virtual protein matches that protein derived from the experimental peptides. The similarity index is not a reflection of overall protein sequence similarity. There is the possibility of more strain-unique peptides from the ECP fraction than that of the whole cell. Also the sheer abundance of proteins in the whole cell extract may hinder the ability of the LC–ESI-MS–MS system to distinguish the strain-unique peptides compared to that from the significantly lower abundant ECPs. The E. coli strain proteins in the whole cell fraction constitute a large number of relatively conserved proteins. Such types of protein may result in less differentiation than that of the ECPs, which does not include highly

expressed and conserved proteins. This difference in types of proteins from the two fractions is reflected in the taxonomic classification as shown in Fig. 3a, b. Further, ECP analysis is a simple and relatively rapid experimental method which is capable of enhancing characterization information with respect to the whole cell protein extract procedure. 3.4. Effect of cellular fraction on the differentiation of EHEC/EAEC O104:H4 Whole cell and extracellular protein fractions from E. coli O104:H4 were analyzed by MS proteomics and ABOID. This E. coli strain does not have its genome sequenced; hence, it is not resident in a public repository. The characterization of the sample was correctly established. Nearest neighbor analysis, using the Euclidean distance single linkage

82

R.E. Jabbour et al. / Journal of Microbiological Methods 98 (2014) 76–83

approach, showed that the identified set of proteins had a closest match with the database E. coli 55989 for both the extracellular (Fig. 4a, 97% similarity) and whole cell (Fig. 4b, 98% similarity) protein fractions. EAEC 55989 was originally isolated from the diarrheagenic stools of an HIV-positive adult suffering from persistent watery diarrhea in the Central African Republic in 2002. The EAEC strains form aggregates as their name suggests, and are an emerging cause of gastroenteritis (http://hamap.expasy.org/proteomes/ECO55.html). The proteomics LC–ESI-MS–MS classification of O104:H4 agrees with genomic sequencing of the E. coli that was the causative agent in the deadly outbreak in Germany in 2011 (Kupferschmidt, 2011). The genomic sequencing of O104:H4 showed 93% genomic similarity (Kupferschmidt, 2011) to EAEC 55989 and suggests that this strain is more of a hybrid clone between 55989 and ancestor O104:H4. This new strain from genomic classification showed to be distant from EHEC strains including O157:H7 which is a culprit in food contamination outbreaks (Mellmann et al., 2011). The genomic studies (93% similarity) provide a strong support for proteomic identification of the strains (97–98% similarity) and in the agreement of the phylogenetic classification for EAEC O104:H4. The utilization of proteomics-based discrimination and phylogenetic classification of the E. coli strains from their ECP fractions showed that this approach is an effective and reliable complementary approach to whole genome sequencing and optical genetic mapping techniques. Moreover, a recent study on the pathogenicity mechanism of O104:H4 showed that this E. coli strain displays verotoxicity to host cells which is a characteristic of EAEC strains (Al Safadi et al., 2012). Fig. 2 shows that the experimental peptides that identified the autotransporter serine protease are found in the verotoxicity region (dotted oval) of the protein. In addition, the E. coli O104:H4 whole cell protein extract analysis supports the utility of the ECP approach. Although the proteomics classification showed strain level classification, each strain did not show close relation to the others. This observation is important to support the findings reported in genomic studies that those strains are different in their protein expression (Kupferschmidt, 2011; Al Safadi et al., 2012). 4. Conclusions The utility of ECPs as biomarkers for characterization of EHEC and EAEC E. coli strains is useful, practical, and uses few and efficient experimental procedures. Differentiation among two EHEC and EAEC strains was improved using ECPs as biomarkers. In addition, very few experimental procedures are necessary in order to capture and detect ECP biomarkers. The ECPs provide a source of cellular variability that was not observed when compared to whole cell lysates. The genomic studies on the studied strains showed agreement in the MS proteomics-based classification of the non-database O104:H4 strain, i.e., determined using the MS-based proteomics approach. Such agreement needs to be further examined through a larger set of E. coli strains and under various environmental conditions to verify the effectiveness of the utilized approach. In addition, such studies once validated could increase the confidence in identifying microbes to the strain level at early stages of outbreaks using protein biomarkers so as to enhance medical countermeasures and diagnostics. MS-MS-based proteomics and bioinformatics were shown to have utility in the comparative proteomics study for the differentiation of EHEC strains. This resulted in different degrees of separation between the correctly determined database organism and the next nearest neighbor organism(s). Classification and identification of an organism that is not in the genome database and not using a genome sequencing technique are possible as with the case of O104:H4. Such properties will allow the utilization of this MS-based proteomics approach to infer taxonomic classification based on the depth of available genomic

sequencing information. The MS-proteomics approach for the ECPs has the potential to characterize emerging and unknown microbes and aid in genomic sequencing analyses. Acknowledgments The authors wish to thank the US Army Edgewood Chemical Biological Center for financial support of this work under the Research and Technology (R&T) Directorate 2013 In-House Laboratory Independent Research (ILIR) Basic Research portfolio and Dr. Augustus Way Fountain III for management of this program. Supporting information available: This material is available free of charge via the Internet at http://pubs.acs.org. References Al Safadi, R., Abu-Ali, G.S., Sloup, R.E., Rudrik, J.T., Waters, C.M., Eaton, K.A., Manning, S.D., 2012. Correlation between in vivo biofilm formation and virulence gene expression in Escherichia coli O104:H4. PLoS One 7 (7), e41628. http://dx.doi.org/10.1371/ journal.pone.0041628. Demirev, P.A., Feldman, A.B., Lin, J.S., 2005. Chemical and biological weapons: current concepts for future defenses. Johns Hopkins APL Tech. Digest 26, 321–333. Deng, W., Yu, H.B., de Hoog, C.L., Stoynov, N., Li, Y., Foster, L.J., Finlay, B.B., 2012. Quantitative proteomic analysis of type III secretome of enteropathogenic Escherichia coli reveals an expanded effector repertoire for attaching/effacing bacterial pathogens. Mol. Cell. Proteomics 11, 692–709. Deshpande, S.V., Jabbour, R.E., Snyder, A.P., Stanford, M., Wick, C.H., Zulich, A.W., 2011. A software for automated identification and phyloproteomics classification of tandem mass spectrometry data. J. Chromatogr. Separation Techniques S5, 001. http:// dx.doi.org/10.4172/2157-7064. Dworzanski, J.P., Snyder, A.P., 2005. Classification and identification of bacteria using mass spectrometry-based proteomics. Expert Rev. Proteomics 2, 863–878. Dworzanski, J.P., Snyder, A.P., Chen, R., Zhang, H., Wishart, D., Li, L., 2004. Identification of bacteria using tandem mass spectrometry combined with a proteome database and statistical scoring. Anal. Chem. 76, 2355–2366. Dworzanski, J.P., Deshpande, S.V., Chen, R., Jabbour, R.E., Snyder, A.P., Wick, C.H., Li, L., 2006. Mass spectrometry-based proteomics combined with bioinformatic tools for bacterial classification. J. Proteome Res. 5, 76–87. Dworzanski, J.P., Dickinson, D.N., Deshpande, S.V., Snyder, A.P., Eckenrode, B.A., 2010. Discrimination and phylogenetic classification of Bacillus anthracis–cereus–thuringiensis strains based on LC–MS/MS analysis of whole cell protein digests. Anal. Chem. 82, 145–155. European Food Safety Authority (EFSA), 2011. Joint EFSA/ECDC Technical Report. Available: http://www.efsa.europa.eu/en/supporting/pub/166e.htm. Frankel, G., Phillips, A.D., Rosenshine, I., Dougan, G., Kaper, J.B., Knutton, S., 1998. Enteropathogenic and enterohaemorrhagic Escherichia coli: more subversive elements. Mol. Microbiol. 30, 911–921. Jabbour, R.E., Deshpande, S.V., Wade, M.M., Stanford, M.F., Wick, C.H., Zulich, A.W., Skowronski, E.W., Snyder, A.P., 2010a. Double blind characterization with nongenome sequenced bacteria by mass spectrometry-based proteomics. Appl. Environ. Microbiol. 76, 3637–3644. Jabbour, R.E., Wade, M.M., Deshpande, S.V., Stanford, M.F., Wick, C.H., Zulich, A.L., Snyder, A.P., 2010b. Identification of Yersinia pestis and Escherichia coli strains by whole cell and outer membrane protein extracts with mass spectrometry-based proteomics. J. Proteome Res. 9, 3647–3655. Keller, A., Nesvizhskii, A.I., Kolker, I., Aebersold, R., 2002. Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Anal. Chem. 2002 (74), 5383–5392. Kupferschmidt, K., 2011. Scientists rush to study genome of lethal E. coli. Science 332 (6035), 1249–1250. Li, M., Rosenshine, I., Tung, S.L., Wang, X.H., Friedberg, D., Hew, C.L., Leung, K.Y., 2004. Comparative proteomic analysis of extracellular proteins of enterohemorrhagic and enteropathogenic Escherichia coli strains and their ihf and ler mutants. Appl. Environ. Microbiol. 70, 5274–5282. Meissner, F., Schelteme, R.A., Mollenkopf, H.-J., Mann, M., 2013. Direct proteomic quantification of the secretome of activated immune cells. Science 340 (6131), 475–478. Mellmann, A., Harmsen, D., Cummings, C.A., Zentz, E.B., Leopold, S.R., Rico, A., Prior, K., Szczepanowski, R., Ji, Y., Zhang, W., McLaughlin, S.F., Henkhaus, J.K., Leopold, B., Bielaszewska, M., Prager, R., Brzoska, P.M., Moore, R.L., Guenther, S., Rothberg, J.M., Karch, H., 2011. Prospective genomic characterization of the German enterohemorrhagic Escherichia coli O104:H4 outbreak by rapid next generation sequencing technology. PLoS ONE 6 (7), e22751. http://dx.doi.org/ 10.1371/journal.pone.0022751. Nandakumar, M.P., Cheung, A., Marten, M.R., 2006. Proteomic analysis of extracellular proteins from Escherichia coli W3110. J. Proteome Res. 5, 1155–1161. National Research Council, 2005. Sensor Systems for Biological Agent Attack. Natl. Acad. Press, Washington, DC (ISBN-10: 0-309-09576-X). Peng, J., Elias, J.E., Thoreen, C.C., Gygi, S.P., 2003. Evaluation of multidimensional chromatography coupled with tandem mass spectrometry (LC–MS/MS) for large-scale protein analysis: the yeast proteome. J. Proteome Res. 2, 43–50.

R.E. Jabbour et al. / Journal of Microbiological Methods 98 (2014) 76–83 Perna, N.T., Plunkett III, G., Burland, V., Mau, B., Glasner, J.D., Rose, D.J., Mayhew, G.F., Evans, P.S., Gregor, J., Kirkpatrick, H.A., Pósfai, G., Hackett, J., Klink, S., Boutin, A., Shao, Y., Miller, L., Grotbeck, E.J., Wayne Davis, N., Lim, A., Dimalanta, E.T., Potamousis, K.D., Apodaca, J., Anantharaman, T.S., Lin, J., Yen, G., Schwartz, D.C., Welch, R.A., Blattner, F.R., 2001. Genome sequence of enterohaemorrhagic Escherichia coli O157 H7. Nature 409, 529–533. Wang, N., Xu, M., Wang, P., Li, L., 2010. Development of mass spectrometry-based shotgun method for proteome analysis of 500 to 5000 cancer cells. Anal. Chem. 82, 2262–2272.

83

Xia, X.X., Han, M.J., Lee, S.Y., Yoo, J.S., 2008. Comparison of the extracellular proteomes of Escherichia coli B and K-12 strains during high cell density cultivation. Proteomics 8 (10), 2089–2103. Zheng, J., Ren, X., Wei, C., Yang, J., Hu, Y., Liu, L., Xu, X., Wang, J., Jin, Q., 2013. Analysis of the secretome and identification of novel constituents from culture filtrate of Bacillus Calmette-Guerin using high-resolution mass spectrometry. Mol. Cell. Proteomics 12, 2081–2095.

Extracellular protein biomarkers for the characterization of enterohemorrhagic and enteroaggregative Escherichia coli strains.

The extracellular proteins (ECPs) of enterohemorrhagic Escherichia coli (EHEC) can cause hemorrhagic colitis which may cause life threatening hemolyti...
847KB Sizes 0 Downloads 0 Views