Serum Proteomics in Multiple Sclerosis Disease Progression
Helen Tremlett, Darlene L.Y. Dai, Zsuzsanna Hollander, Anita Kapanen, Tariq Aziz, Janet E. Wilson-McManus, Scott J. Tebbutt, Christoph H. Borchers, Joel Oger, Gabriela V. Cohen Freue
PII: S1874-3919(15)00082-2
DOI: 10.1016/j.jprot.2015.02.018
Reference: JPROT 2070
To appear in:
Journal of Proteomics
Please cite this article as: Tremlett Helen, Dai Darlene L.Y., Hollander Zsuzsanna, Kapanen Anita, Aziz Tariq, Wilson-McManus Janet E., Tebbutt Scott J., Borchers Christoph H., Oger Joel, Cohen Freue Gabriela V., Serum Proteomics in Multiple Sclerosis Disease Progression, Journal of Proteomics (2015), doi: 10.1016/j.jprot.2015.02.018
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
ACCEPTED MANUSCRIPT Confidential
Page 1
17/11/2014
SERUM PROTEOMICS IN MULTIPLE SCLEROSIS DISEASE PROGRESSION
Helen Tremlett1, Darlene L.Y. Dai2,5, Zsuzsanna Hollander2, Anita Kapanen1, Tariq Aziz1, Janet E. Wilson-McManus2, Scott J. Tebbutt1,2,3, Christoph H. Borchers4, Joel Oger1, Gabriela V. Cohen Freue2,5
1Faculty of Medicine, Department of Medicine, Division of Neurology, University of British Columbia, Vancouver, BC V5Z 1M9; 2PROOF Centre of Excellence, Vancouver, BC V6Z 1Y6; 3Department of Medicine, Division of Respiratory Medicine; 4University of Victoria Genome BC Proteomics Centre, Victoria, BC V8Z 7X8; 5Department of Statistics, University of British Columbia, Vancouver, BC V6T 1Z4.
Keywords: Serum proteomics; iTRAQ mass spectrometry; multiple sclerosis; MS progression; biomarkers; classifier
Corresponding author:
Dr. Gabriela V. Cohen Freue
Department of Statistics, University of British Columbia, 3152 Earth and Science Building, 2207 Main Mall, Vancouver, BC V6T 1Z4, Canada
Phone: 604-822-3710
E-mail: [email protected]
Acknowledgements
We thank the University of Victoria Genome BC Proteomics Centre, which is supported by Genome Canada and Genome BC through Science and Technology Innovation Centre funding, for conducting the iTRAQ experiments. C.H.B. holds the Don and Eleanor Rix BC Leadership Chair in Biomedical and Environmental Proteomics.
We gratefully acknowledge the BC MS Clinic neurologists who contributed to the study via the BCMS database through patient examination and data collection (current members listed here):
UBC MS Clinic: A. Traboulsee, MD, FRCPC (UBC Hospital MS Clinic Director and Head of the UBC MS Programs); A-L. Sayao, MD, FRCPC; V. Devonshire, MD, FRCPC; S. Hashimoto, MD, FRCPC (UBC and Victoria MS Clinics); J. Hooge, MD, FRCPC (UBC and Prince George MS Clinic); L. Kastrukoff, MD, FRCPC (UBC and Prince George MS Clinic); J. Oger, MD, FRCPC
Kelowna MS Clinic: D. Adams, MD, FRCPC; D. Craig, MD, FRCPC; S. Meckling, MD, FRCPC
Prince George MS Clinic: L. Daly, MD, FRCPC
Victoria MS Clinic: O. Hrebicek, MD, FRCPC; D. Parton, MD, FRCPC; K. Atwell-Pope, MD, FRCPC.
We also thank Anna-Marie Bueno for her help in coordinating the data collection. The views expressed in this paper do not necessarily reflect the views of each individual acknowledged.
H.T. is funded by the Multiple Sclerosis Society of Canada (Don Paty Career Development Award), is a Michael Smith Foundation for Health Research Scholar, and is the Canada Research Chair for Neuroepidemiology and Multiple Sclerosis. G.C.F. is funded by the Canada Research Chair and Canada Foundation for Innovation in Statistical Genomics.
Conflict of Interest: None of the authors have any direct conflicts of interest in the research reported in this article.
H.T. has received: research support from the National Multiple Sclerosis Society, Canadian Institutes of Health Research, and UK MS Trust; speaker honoraria and/or travel expenses to attend conferences from the Consortium of MS Centres (2013), the MS Society of Canada, endMS Summer School (2012, 2014), the National MS Society (2012, 2014), Bayer Pharmaceutical (speaker, 2010, honoraria declined), Teva Pharmaceuticals (speaker, 2011), ECTRIMS (2011, 2012, 2013), UK MS Trust (2011), the Chesapeake Health Education Program, US Veterans Affairs (2012, honorarium declined), Novartis Canada (2012), Biogen Idec (2014, honorarium declined), American Academy of Neurologists (annual meeting speaker, 2013, 2014, honorarium declined). Unless otherwise stated, all speaker honoraria are either donated to an MS charity or to an unrestricted grant for use by her research group.
Abbreviations:
AUC, area under the receiver operating characteristic curve
CSF, cerebrospinal fluid
EN, Elastic Net
FDR, false discovery rate
iTRAQ, isobaric tagging for relative and absolute protein quantification
LIMMA, linear models for microarray analysis
LOOCV, leave-one-out cross-validation
MALDI, matrix-assisted laser desorption ionization
MRM-MS, Multiple Reaction Monitoring mass spectrometry
MS, multiple sclerosis
TOF, time-of-flight
SUMMARY
Multiple sclerosis (MS) is associated with chronic degeneration of the central nervous system and may cause permanent neurological problems and considerable disability. Its causes remain unclear, and its extensive phenotypic variability makes prognosis and treatment difficult. The identification of serum proteomic biomarkers of MS progression could further our understanding of the molecular mechanisms related to MS disease processes. In the current study, we used isobaric tagging for relative and absolute protein quantification (iTRAQ) methodology and advanced multivariate statistical analysis to quantify and identify potential serum biomarker proteins of MS progression. We identified a panel of 11 proteins and combined them into a classifier that best classified samples into the two disease groups. The estimated area under the receiver operating characteristic curve of this classifier was 0.88 (p-value=0.017), with 86% sensitivity and specificity. The identified proteins encompassed processes related to inflammation, opsonization, and complement activation. Results from this study are particularly valuable for designing a targeted Multiple Reaction Monitoring mass spectrometry (MRM-MS) assay to conduct an external validation in an independent and larger cohort of patients. Validated biomarkers may result in the development of a minimally-invasive tool to monitor MS progression and complement current clinical practices.
BIOLOGICAL SIGNIFICANCE
A hallmark of multiple sclerosis is the unpredictable disease course (progression). There are currently no clinically useful biomarkers of MS disease progression; most work has focused on the analysis of CSF, which requires an invasive procedure. Here, we explore the potential of proteomics to identify panels of serum biomarkers of disease progression in MS. By comparing the protein signatures of two well-defined, though challenging to obtain, MS phenotypic groups at the extremes of progression (benign and aggressive cases of MS), we identified proteins that encompass processes related to inflammation, opsonization, and complement activation. Findings require validation, but are an important step on the pathway to clinically useful biomarker discovery.
INTRODUCTION
Multiple sclerosis (MS) results in chronic degeneration of the central nervous system, causing considerable disability, and has no known cause or cure [1]. The extensive phenotypic variability and unpredictability of the course of disease in any given individual are considered hallmarks of MS, making therapeutic decisions, the prognosis, and even ‘life-planning’ difficult. At the molecular level, despite extensive work, the exact mechanisms implicated in the progression of MS remain unclear [2, 3]. The application of large-scale quantitative proteomics technologies in MS research has the potential to further our understanding of this disease [3, 4] and identify potentially useful biomarkers of disease progression. Ultimately, validated serum proteomic biomarkers of MS progression may complement and improve current patient care, for instance by helping identify patients in most need of aggressive interventions.
Proteomic analyses of biological samples of MS patients and experimental models have started to emerge in the last decade, although most have focused on the study of a limited number of preselected biomarkers [3]. With the ongoing growth of extremely sensitive high throughput proteomics technologies, an alternative approach of emerging interest is the application of mass spectrometric untargeted technologies to conduct a broad, discovery driven, and unbiased identification of disease biomarkers [3-6]. To date, most proteomics studies have focused on the analysis of cerebrospinal fluid (CSF) [5, 6]. While the CSF might be considered more tissue specific, sampling requires an uncomfortable, invasive lumbar puncture, with risks ranging from headache (common) to nerve damage and paraplegia (rare) [7]. Consequently, regular CSF sampling is unacceptable for most patients and their physicians. Although the serum represents an attractive alternative to CSF, both for relative convenience and the potential for biomarker discovery, it reflects the collective expression of all tissue and cell types [8, 9], lacking tissue specificity. Thus, to be successful, a serum study needs to be based on groups of well-phenotyped MS patients. To date, relatively few studies in MS have been conducted using serum [4, 10], especially in the context of disability progression [4-6, 11].
We set out to apply mass spectrometric untargeted technology to identify potential serum biomarkers (or ‘signatures’) of disease progression in MS by comparing two well-phenotyped clinical extremes of disease progression, ‘benign’ and ‘aggressive’ MS, in this proof-of-principle study.
MATERIALS AND METHODS
Study Population and Design
This was a retrospective analysis of prospectively collected data and biological samples. Patients were selected according to the following criteria: i) diagnosed with definite relapsing-onset MS by an MS neurologist (Poser [12] or McDonald [13] criteria); ii) fulfilled criteria for either benign or aggressive MS, defined as: minimal disability (Expanded Disability Status Scale Score, EDSS≤3) despite 20+ years of disease duration (benign MS) or severe disability (EDSS≥6) within 10 years of MS onset (aggressive MS); iii) availability of a banked serum sample, drawn prior to exposure to any ‘disease modifying’ drugs for MS.
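The phenotype definition in criterion ii) can be sketched as a simple rule. This is an illustrative simplification, not the authors' selection procedure: the function name and its two inputs (an EDSS score and the disease duration in years at the time that score was recorded) are assumptions, and criteria i) and iii) are not modelled.

```python
from typing import Optional


def classify_course(edss: float, years_since_onset: float) -> Optional[str]:
    """Apply the benign/aggressive thresholds quoted in the text.

    Hypothetical helper: assumes a single EDSS assessment paired with
    the disease duration (years from onset) at that assessment.
    """
    # Benign MS: minimal disability (EDSS <= 3) despite 20+ years of disease.
    if edss <= 3 and years_since_onset >= 20:
        return "benign"
    # Aggressive MS: severe disability (EDSS >= 6) within 10 years of onset.
    if edss >= 6 and years_since_onset <= 10:
        return "aggressive"
    # Everyone else falls between the two extremes and is not eligible.
    return None
```

Patients who meet neither threshold return `None`, which reflects the study's focus on the two extremes of progression rather than the full spectrum of disease courses.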
Demographic and clinical characteristics were derived from the British Columbia Multiple Sclerosis (BCMS) database [14-18], a large population-based longitudinal MS database established in 1980, linking the four original MS clinics in BC (Vancouver, Victoria, Kelowna and Prince George to the end of 2004). Biological samples were provided by some patients, typically those visiting the Vancouver site; these were stored within the Neuroimmunology Biobank at the University of British Columbia. By cross-referencing these two sources, we identified patients fulfilling study criteria.
In accordance with the institutional ethical approval, once the biological samples and clinical information were linked, all patient identifiers were removed, creating a fully anonymized cohort which could not be re-identified. Only the following patient characteristics were retained, with all time-related factors recorded in full years (no dates were retained): sex; age at symptom onset (years); age at sample draw; disease duration at sample draw; sample storage time; clinical course (group allocation ‘A’ or ‘B’); sample drawn before clinical course was confirmed (yes vs no). The clinical course (benign vs aggressive) determined the group allocation and was concealed from those involved with the sample analyses (proteomics) or statistical analyses, who were informed that groups A and B represented two different ‘types’ of MS patients (no specifics were given other than the basic demographics needed to balance the groups during the proteomic analysis, see below).
iTRAQ Study Design
Protein relative quantitations were obtained using iTRAQ-MALDI-TOF/TOF methodology, which allows simultaneous processing of eight samples per experimental run. Given the number of samples in this study, two independent iTRAQ runs were used to process all samples. To construct protein ratios that are comparable across both experimental runs, a reference sample was processed together with 7 patient samples in each iTRAQ run. The reference sample consisted of a pool of serum from the 14 individuals in this study and thus ensures the identification of most proteins in the analyzed samples. Each of the 2 iTRAQ batches had a balanced spread of patients, based on sex, sample storage time, anonymized grouping (‘A’ vs ‘B’), age, and disease duration (time from symptom onset) at sample collection. Patient and reference samples within each run were randomly labeled with the eight iTRAQ reagents.
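The run layout described above (seven patient samples plus the pooled reference, randomly assigned to the eight reagents, with ratios taken against the reference) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the sample identifiers, and the use of raw channel intensities are hypothetical, and the balancing of batches on demographics is not modelled. The eight reagent names are the standard 8-plex reporter masses.

```python
import random

# Standard reporter-ion labels of the 8-plex iTRAQ reagent set.
ITRAQ_8PLEX = ["113", "114", "115", "116", "117", "118", "119", "121"]


def assign_labels(patient_ids, seed):
    """Randomly assign the eight reagents to 7 patients plus the reference.

    Hypothetical helper; `seed` only makes the illustration reproducible.
    """
    if len(patient_ids) != 7:
        raise ValueError("each run holds 7 patient samples plus the reference")
    samples = list(patient_ids) + ["REFERENCE"]
    labels = ITRAQ_8PLEX[:]
    random.Random(seed).shuffle(labels)
    return dict(zip(samples, labels))


def ratios_to_reference(intensities, assignment):
    """Express each patient channel relative to the same run's reference.

    `intensities` maps reagent label -> reporter intensity for one protein.
    Dividing by the shared pooled reference is what makes the ratios
    comparable across the two runs.
    """
    reference = intensities[assignment["REFERENCE"]]
    return {
        sample: intensities[label] / reference
        for sample, label in assignment.items()
        if sample != "REFERENCE"
    }
```

Because the same pooled reference appears in both runs, a ratio of 1.0 means "equal to the pool" regardless of which run a patient sample was processed in.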
iTRAQ Data Acquisition
Blood samples were processed as previously described [19, 20]. Briefly, peripheral blood samples were drawn into vacuum tubes without any anticoagulation agents and allowed to clot for a minimum of 30 minutes at room temperature prior to processing. Serum was separated and stored at -80 °C until selected for analysis. Serum samples were depleted of the 14 most abundant plasma proteins (albumin, fibrinogen, transferrin, IgG, IgA, IgM, haptoglobin, α2-macroglobulin, α1-acid glycoprotein, α1-antitrypsin, apolipoprotein-AI, apolipoprotein-AII, complement C3 and apolipoprotein B) by immuno-affinity chromatography (Genway Biotech; San Diego, CA), digested with trypsin, and labeled with iTRAQ reagents according to the manufacturer’s protocol (Applied Biosystems; Foster City, CA). Labeled peptides were pooled, acidified to pH 2.5-3.0 with 6M phosphoric acid (ACP Chemicals Inc; Montreal, QC, Canada), and separated by 2D liquid chromatography.
iTRAQ labeled peptides were fractionated by strong cation exchange chromatography (SCX) using a 100 mm x 4.6 mm ID polysulphoethyl A column packed with 5 μm beads (300 Å pore size). A 120 min linear gradient was used to separate the peptides in the first dimension. The 20 to 30 fractions containing the highest concentration of peptides (based on their UV absorbance at 215 nm) were selected and their volumes were reduced to 150 µL in preparation for the subsequent nanoscale reversed-phase chromatography separation. Peptides were desalted on the LC system by loading each fraction onto a C18 PepMap guard column (300 µm ID x 5 mm, 5 µm particle size, 100 Å pore size, LC Packings, Amsterdam) and washing for 15 min at 50 µL/min with a mobile phase A', consisting of water:acetonitrile:TFA (98:2:0.1, v/v). The trapping column was then switched into the flow stream, which operated at 200 nL/min, and the peptide mixture was loaded onto a Magic C18 nano LC analytical column (15 cm, 5 µm particle size, 100 Å pore size; Michrom Bioresources Inc., Auburn CA, USA). Peptides were eluted with a 3-step linear gradient: 0-45 min with 5% to 15% B' (acetonitrile:water:TFA 98:2:0.1, v/v); 45-100 min with 15% to 40% B', and 100-105 min with 40% to 75% B'. The eluent was spotted directly onto AB Sciex's 384-spot targets for 30 seconds using a Probot microfraction collector (LC Packings, Amsterdam, Netherlands). The matrix solution (3 mg/mL α-cyano-4-hydroxycinnamic acid (Sigma-Aldrich, St Louis, MO USA) in 50% ACN, 0.1% TFA) was then added at 0.75 µL per spot. The samples were analyzed by a 4800 MALDI TOF/TOF mass spectrometer (Applied Biosystems), with acquisition times ranging from 35 to 40 hours.
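The 3-step linear elution gradient quoted above amounts to a piecewise-linear function of time. The sketch below only restates those breakpoints as code; the function name is hypothetical, and it assumes each step interpolates linearly between its stated start and end percentages of B'.

```python
# (time in minutes, % B') breakpoints taken from the gradient in the text.
GRADIENT = [(0.0, 5.0), (45.0, 15.0), (100.0, 40.0), (105.0, 75.0)]


def percent_b(t_min):
    """Return the % B' at time t_min by linear interpolation within a step."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-105 min gradient window")
    # Walk consecutive breakpoint pairs and interpolate inside the match.
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
```

For example, halfway through the second step (72.5 min) the mobile phase sits halfway between 15% and 40% B'.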
ProteinPilot™ software v4.0 with the integrated Paragon™ Search Algorithm and Pro Group™ Algorithm [21] (Applied Biosystems) was used to process the resulting data and to search against the UniProt database [22] to identify peptides and proteins. The precursor mass tolerance was set to 150 ppm and the iTRAQ fragment tolerance was set to 0.2 Da. Identification parameters were set for trypsin cleavages and cysteine alkylation by methyl methanethiosulfonate. Modifications, substitutions, and the number of missed cleavages allowed are not limited to a fixed value by the Paragon algorithm [21]. The detected protein threshold was set to Unused ProtScore > 0.70 (equivalent to 80.0% confidence). ProteinPilot™ was used to calculate protein ratios by the weighted geometric means of the unique peptides contributing to the identification of the protein, after performing a bias correction and using a background correction factor that reduces the ratio compression.
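A weighted geometric mean of peptide ratios, as named in the paragraph above, can be sketched as below. This is not ProteinPilot's implementation: the weights it uses and its bias and background corrections are not given in the text, so the `weights` argument here is a placeholder. The second helper assumes the relation confidence = 1 − 10^(−ProtScore), which is consistent with the text's pairing of 0.70 with 80.0% but is not stated in it.

```python
import math


def weighted_geometric_mean(ratios, weights):
    """Combine peptide-level ratios into one protein-level ratio.

    Computed in log space: exp(sum(w_i * ln r_i) / sum(w_i)).
    `weights` are hypothetical per-peptide weights.
    """
    if not ratios or len(ratios) != len(weights):
        raise ValueError("ratios and weights must be non-empty and equal length")
    if any(r <= 0 for r in ratios) or any(w < 0 for w in weights):
        raise ValueError("ratios must be positive and weights non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    log_mean = sum(w * math.log(r) for r, w in zip(ratios, weights)) / total
    return math.exp(log_mean)


def protscore_to_confidence(protscore):
    """Assumed mapping from ProtScore to confidence: 1 - 10**(-ProtScore)."""
    return 1.0 - 10.0 ** (-protscore)
```

Averaging in log space treats a two-fold increase and a two-fold decrease symmetrically, so peptide ratios of 2.0 and 0.5 with equal weights combine to 1.0 rather than the arithmetic mean of 1.25.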
Statistical analysis
The set of proteins detected in both iTRAQ runs was considered for further analysis. Although the association between sex and disease progression (benign vs aggressive) in our study was not significant (Fisher’s exact test, p-value = 0.19), our data were not sex-balanced, with only 3 male samples in the aggressive MS group. We therefore used a moderated t-test to prefilter any potential sex-associated proteins in this modest-sized cohort (robust LIMMA, with a conservative p-value