Review | For reprint orders, please contact [email protected]

Pre-analytical and analytical variability in absolute quantitative MRM-based plasma proteomic studies

Quantitative plasma proteomics, through the use of targeted MRM-MS and isotopically labeled standards, is emerging as a popular technique to address biological- and biomedical-centered queries. High precision and accuracy are essential in such measurements, particularly in protein biomarker research where translation to the clinic is sought. Standardized procedures and routine performance evaluation of all stages of the workflow (both pre-analytical and analytical) are therefore imperative to satisfy these requisites and enable high inter-laboratory reproducibility and transferability. In this review, we first discuss the pre-analytical and analytical variables that can affect the precision and accuracy of ‘absolute’ quantitative plasma proteomic measurements. Proposed strategies to limit such variability will then be highlighted and unmet needs for future exploration will be noted. Although there is no way to conduct a truly comprehensive review on this broad, rapidly changing topic, we have highlighted key aspects and included references to review articles on various sub-topics.

Over the past few decades, the focus in the field of MS-based proteomics has been directed toward methodological and technological advancements. The ongoing developments have included alterations in the MS design for heightened detection sensitivity [1–5], modified separation approaches for improved peak capacity and resolution [2], and novel extraction techniques for increased proteome coverage [2]. This analysis has been conducted on intact proteins (i.e., top-down [6–8]) as well as proteolytic fragments (i.e., bottom-up [9–11]), with the former being favored for characterizing post-translational modifications and identifying protein isoforms in simple protein mixtures, while the latter is preferred for sequencing and quantifying highly complex samples. Collectively, the improvements have increased the popularity of MS, and have led to it becoming an indispensable tool for examining biological or biomedical problems. While MS has solidified its position as a powerful technique for obtaining qualitative and quantitative information, proteomics research is rapidly evolving into a more quantitative discipline [12–14]. In general, protein quantitation can be divided into techniques providing relative expression levels (i.e., fold-changes [15,16]), examples of which are isotope-coded affinity tags [17], isobaric tags for relative and absolute quantitation (iTRAQ [18]), and tandem-mass-tag [19,20], or ‘absolute’ concentrations [15,16,21], which are based on comparisons

with known levels of IS, for example, with chemically synthesized isotopically labeled standards [20,22,23]. In relative quantitation, MS or MS/MS signals from healthy and diseased samples are differentially labeled (introduced metabolically, enzymatically or chemically) and compared relative to each other. In one application, for instance, the iTRAQ approach was used to quantitate the changes in the cerebrospinal fluid proteome of patients with neurodegenerative disorders (e.g., Alzheimer’s disease, Parkinson’s disease and dementia) relative to healthy controls [24]. In the latter type of quantitation, which is the focus of this review, the concentration (e.g., ng/ml of plasma) of a single biomarker or panel of biomarker proteins present in a sample is determined. It should be noted that this type of quantitation could also be considered relative since the results are ultimately compared with protein or peptide standards [25]. Nonetheless, for the purpose of this review, the use of IS for protein quantitation will be termed ‘absolute’ quantitation, as is usually done in proteomics literature. An increasingly popular method for absolute protein quantitation involves multiple reaction monitoring (MRM), also called selected reaction monitoring (SRM). This targeted approach has been used for the quantitation of disease-related protein biomarkers in human biofluids, such as plasma or urine, toward their verification and pre-clinical validation. Unlike untargeted or

10.4155/BIO.13.245 © 2013 Future Science Ltd

Bioanalysis (2013) 5(22), 2837–2856

Andrew J Percy, Carol E Parker & Christoph H Borchers* University of Victoria-Genome British Columbia Proteomics Centre, 3101–4464 Markham St, Victoria, BC V8Z 5N3, Canada *Author for correspondence: Tel.: +1 250 483 3221 Fax: +1 250 483 3238 E-mail: [email protected]

ISSN 1757-6180

2837


Key Terms Biomarker: Biological indicator of the pathophysiological status of an individual. This can be measured through the analysis of an analyte (protein or metabolite) in a biofluid, or alternatively through a recording or imaging test.

Specificity: Reflects the ability of the antibody, protease, sorbent and/or mass analyzer to produce a desired effect, be it targeted peptide capture, enzymatic cleavage, or other.

unbiased MS/MS measurements used in biomarker discovery experiments, MRM technology relies on the a priori knowledge of precursor/product ion m/z values for targeted peptide analysis. Mass selection of the known precursor-product ion pairs (i.e., transitions) is commonly performed in a triple quadrupole mass spectrometer, which provides the sensitivity, specificity and multiplexing capability required for verification or validation of a multitude of candidates in a rapid and inexpensive manner. It is the enhanced multiplexing capability, lower cost and rapid development time that make this technique an attractive alternative to ELISAs for verification and validation studies, even though ELISAs are usually more sensitive. The major impetus behind biomarker verification and validation is to ensure diagnostic accuracy before the clinical use of these biomarkers for screening or therapeutic intervention. The development of screening methods for non-communicable diseases has merit since non-communicable diseases (that encompass a variety of disorders, diseases and congenital anomalies) are a global epidemic [26], projected to become increasingly prevalent in the foreseeable future [27]. A preferred biofluid for these biomarker analyses is human plasma,

because it is relatively non-invasive to collect, inexpensive to sample and is a key indicator of an individual’s physiological health status due to the systemic circulation acting as a repository for many analytes. For example, elevated plasma levels of the inflammatory C-reactive protein are a diagnostic indicator of cardiovascular events [28,29], while cardiac myoglobin is released into plasma following a myocardial infarction [30]. An MRM-based absolute quantitative plasma proteomic experiment for biomarker verification or validation is designed to quantify specific proteins in large numbers of samples, so that true biomarkers can be determined for clinical use. It is usually performed through a targeted bottom-up approach, which typically involves:

- Denaturation with a chaotrope, reductant and/or alkylating agent;
- Tryptic digestion of the denatured protein(s);
- Reversed-phase separation of the tryptic peptides;
- MRM–MS detection of specific peptides;
- Data analysis of the characteristic ions (see Figure 1 for a general overview).
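The precursor/product ion pairs (transitions) monitored in the MRM–MS detection step can be represented as simple light/heavy records. This is only an illustrative sketch in Python; the peptide sequence and m/z values below are invented, not taken from a real assay:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Transition:
    peptide: str          # tryptic peptide sequence
    precursor_mz: float   # precursor m/z selected in Q1
    product_mz: float     # fragment m/z selected in Q3
    heavy: bool           # True for the 13C/15N-labeled internal standard

# Hypothetical light/heavy transition pair for one target peptide
transitions = [
    Transition("LSEPAELTDAVK", 636.8, 943.5, heavy=False),
    Transition("LSEPAELTDAVK", 640.8, 951.5, heavy=True),
]

def light_heavy_pairs(trans):
    """Match each unlabeled (light) transition to the labeled (heavy)
    standard monitored for the same peptide sequence."""
    light = {t.peptide: t for t in trans if not t.heavy}
    heavy = {t.peptide: t for t in trans if t.heavy}
    return {p: (light[p], heavy[p]) for p in light if p in heavy}

for pep, (lt, ht) in light_heavy_pairs(transitions).items():
    print(f"{pep}: light {lt.precursor_mz}->{lt.product_mz}, "
          f"heavy {ht.precursor_mz}->{ht.product_mz}")
```

In a real assay, several product ions per precursor are monitored and the light/heavy pairing is what makes the downstream ratio-based quantitation possible.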

Figure 1. General bottom-up LC/MRM-MS workflow with isotopically labeled standards (protein or peptide) for absolute plasma protein quantitation. The insets show the targeted MRM-MS detection of a tryptic peptide column effluent in the upper panel and a peptide standard curve with an extracted ion chromatogram of the labeled (blue) and unlabeled (red) peptide at one concentration level in the lower panel.
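The standard curve shown in the lower panel (relative response versus relative concentration) amounts to a least-squares line from which unknowns are back-calculated. A minimal sketch; the calibration points below are invented:

```python
def fit_line(x, y):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

# Hypothetical calibration levels: relative concentration of the light
# peptide vs the measured light/heavy response ratio
rel_conc = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
rel_resp = [0.01, 0.21, 0.40, 0.62, 0.79, 1.01]

slope, intercept = fit_line(rel_conc, rel_resp)

# Back-calculate the relative concentration of an unknown sample
# from its measured response ratio
unknown_resp = 0.50
unknown_conc = (unknown_resp - intercept) / slope
print(round(slope, 3), round(unknown_conc, 3))
```

In practice the fit is usually weighted (e.g., 1/x) and the curve's linear range and lower limit of quantitation are established during method validation.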


To compensate for sample loss, ionization suppression and variability in instrument performance, 13C/15N-labeled standards are employed. These stable isotope-labeled standards can be added at the beginning of the analytical workflow, in the case of proteins [31,32], or immediately following tryptic digestion, in the case of peptides [20,23,33,34]. While this general approach has demonstrated high specificity and throughput [35–39], additional sample processing steps (e.g., depletion [40–44], enrichment [45–47] and fractionation [48]) have been implemented for enhanced sensitivity. Nevertheless, for their ultimate application to patient samples in a biomedical or clinical setting, the method must be robust and accurate, while demonstrating high transferability between laboratories. Furthermore, the method must be able to be performed by nonexperts in LC–MS and have relatively high throughput with minimal sample handling to diminish processing time and maximize recovery. The principal variables that influence the precision and accuracy of an absolute quantitative proteomics experiment are pre-analytical and analytical in nature. The pre-analytical variables are sample-specific and reflect the differences in collection, processing, handling and storage procedures. Pre-analytical variability itself can be divided into two groups – variables that can be controlled by the researchers (e.g., choice of collection tubes) and those that can be controlled only by the appropriate selection of subjects. By contrast, the analytical variables impact the precision of the experimental workflow and include, but are not limited to, tryptic digestion, IS, chemical interference and the LC–MS/MS platform. Together, these variables introduce bias into the analysis that must be accounted for, and ultimately reduced for precise and accurate implementation in intra- and inter-laboratory analyses.
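The quantitation principle behind these labeled standards can be reduced to a peak-area ratio: assuming the light and heavy forms ionize identically, the endogenous concentration is the light/heavy area ratio times the spiked-in standard concentration. A minimal sketch with invented numbers:

```python
def endogenous_conc(area_light, area_heavy, spiked_std_ng_ml):
    """Endogenous peptide concentration (ng/ml) from the light/heavy
    peak-area ratio, assuming equal response of the two isotopologues."""
    if area_heavy <= 0:
        raise ValueError("internal-standard peak area must be positive")
    return (area_light / area_heavy) * spiked_std_ng_ml

# e.g., a light/heavy area ratio of 0.5 with 100 ng/ml of standard spiked in
print(endogenous_conc(5.0e4, 1.0e5, 100.0))  # -> 50.0
```

Because the heavy standard co-elutes with its light counterpart, losses and suppression downstream of the spiking point cancel in the ratio, which is exactly why the point of addition (protein versus peptide standard) matters.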
The quality of the results is particularly critical for the inter-laboratory verification or validation of candidate disease biomarkers. Here, we provide a review of the main pre-analytical and analytical variables affecting the precision and accuracy of absolute quantitative proteomic measurements in human plasma (arguably the most often studied human biofluid) and outline strategies to mitigate them for future applications. Additionally, we highlight the latest standardized methods or procedures proposed for QC assessment. The review also includes a commentary on suitable conditions for quantitative bottom-up proteomic analyses (from sample selection to detection).


Importance of study design

Biomarker research is multidisciplinary, existing at the interface of proteomics, statistics and epidemiology. In any comparative study, which includes biomarker discovery and validation, one would ideally have only a single variable (i.e., the disease) under study. In practice, achieving this goal with human subjects is extraordinarily challenging, with high potential for false-positives and -negatives due to improper study design, endogenous variation within the study population, variable sample collection procedures, or non-reproducible analytical techniques. The population studied must accurately represent the entire population with respect to the objectives of the study. Compared with inbred laboratory animals, there is a larger variation among humans. Statistical and ethical questions must be considered in this type of investigation – who is to be included? What racial distribution is suitable? What ages and which genders should be sampled? The ideal study design in a variable population would be to have the patients be their own controls, but this is only possible in certain situations involving, for example, a follow-up study after controlled dosing. However, this is rarely possible in current proteomics studies, particularly with respect to different diseases, which would require samples being collected prior to onset. The proteomics of ‘age’ as a variable in human studies would almost need to be done as a longitudinal study, which is difficult in part because proteomics is a new field and also because long-term sample storage would be involved, which would introduce additional variables. Certain diseases are known to correlate with age (age-related hearing loss, for example), and there have been many studies on the proteomics of lens crystallins with regard to the development of cataracts.
Amyotrophic lateral sclerosis and Alzheimer’s disease, which have been studied by proteomics (looking for mechanisms and biomarkers), certainly have an age component too. Again, ideally these would require looking at people’s plasma over time in a longitudinal study, which has not been done. For this type of study, we currently have to rely on animal models, but for human studies these factors must be incorporated. Ethnicity is another variable that must be considered in study design. Certain diseases are already known to correlate with ethnicity, such as diabetes in First Nations peoples, Tay-Sachs disease in Jews of Eastern European Ashkenazi descent, esophageal cancer in people of Asian descent


(where reduced functioning of the acetaldehyde dehydrogenase enzyme is a risk factor) and sickle-cell disease in persons of African descent. The prevalence of the latter disease is also related to geography, with regions that are prone to malaria (e.g., Africa and India) presenting higher rates of sickle-cell anemia than those that are malaria-free. Racial background has been used above as an example of a variable that must be considered in human studies. Good study design always involves matching all of the variables except for the one being tested, although this is difficult to achieve in human population studies. Some other variables (e.g., smoking) are included in Box 1 as ‘lifestyle variables’. Inclusion of population subgroups might increase the variability of the results, while exclusion might have ethical implications, as well as calling into question the applicability of the study’s results to the population as a whole. Study design must involve statisticians from the outset, particularly those who specialize

in population sampling. These statisticians are needed to determine the number of subjects and samples that best reflect the ‘disease’ and the ‘control’ populations. A contributing explanation for the lack of currently approved protein biomarkers is the small number of samples – often only four or eight individuals in the case of iTRAQ studies – analyzed in the earlier biomarker discovery studies. An adequate number of control samples must be collected for representative sampling [49]. We should not forget that what we analyze is called a ‘sample’ because it is meant to be an accurate sample of the population it represents. We usually talk about determining ‘differences’ between populations or treatment groups. In statistical terms, this is done by determining that the probability of them being the same (i.e., the null hypothesis) is low. The probability that they are the same is the familiar p-value. The variability of each measurement, the desired confidence level and the actual difference

Box 1. Sources of pre-analytical variability in human plasma/serum studies.

Technical variables
- Blood derivatives; plasma versus serum
  – Plasma: incubation time preceding separation of plasma from blood cells
  – Serum: clotting procedures used; clotting time prior to preparing serum
- Sample handling procedures
  – Plasma: nature of anticoagulant; adequate filling of the tubes during collection; platelet contamination and activation
  – Serum: peptides resulting from coagulation events; release of peptides from the blood clot
- Sample collection mode
  – Preparation of the blood collection site (avoiding hemolysis)
  – Site of venepuncture (median cubital vein)
  – Way of collecting samples: position of the patient (standing, lying, sitting); tourniquet application and needle bore size; venous occlusion time length
- Collection tubes
  – Spreading of substances released from collection tubes: silicones; polymeric surfactants (PVP, PEG); clot inhibitors or activators; polymeric gels in serum separator tubes; rubber stoppers and plastics covering tube walls
  – Adsorption of serum/plasma proteins to the tube walls: red-top tubes; serum separator tubes
- Temperature
  – Temperature for collection and transport practices within and between facilities
  – Short-term storage temperature (h/days)
  – Long-term storage temperature (months)
  – Number of freeze–thaw cycles
- Protease inhibitor addition
  – AEBSF (4-(2-aminoethyl)benzenesulfonyl fluoride); aprotinin*; bestatin*; E-64*; leupeptin*; pepstatin A*; PMSF; EDTA*; or mixtures*

Biological variables
- Immutable factors
  – Age; gender; ethnicity
  – Genetic polymorphisms
- Potentially modifiable factors
  – Lifestyle factors: diet; obesity; alcohol consumption; smoking; physical activity

Reprinted and modified with permission from Table 1 in [53].
*Extended from Table 1 in [53].


between the populations all affect the number of samples that need to be collected in order to be able to detect that difference. The larger the difference between the two populations, with respect to the variable being studied, the smaller the number of samples that need to be collected (and vice versa). Likewise, the ‘tighter’ the distribution of values of the variable being studied, the smaller the number of samples needed to determine a statistically significant difference between the diseased and normal populations. The determination of ‘n’, however, appears to require the answer before the study is performed. In practice, this determination, often termed a ‘power calculation’, can be made from either previous studies or preliminary results from the same study. Many tools for performing power calculations are available online. We recommend the simple online calculator developed by Schoenfeld [201] for definitions of the variables and to get a feel for the interplay between these variables. Pre-analytical and analytical variability can arise from a variety of sources, and can reduce the accuracy and precision of the measurements, which in turn can prevent the results of a study from being statistically significant. Some of these sources of variability are discussed below.

Pre-analytical variables

In quantitative plasma proteomics research, there are a large number of sample-related variables that can impact the validity of the quantitative results.
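As a back-of-the-envelope companion to the power calculation described above, the textbook normal-approximation formula for a two-sided, two-sample comparison of means can be coded in a few lines. This is only a sketch (the effect size and SD below are invented) and is no substitute for involving a statistician:

```python
import math
from statistics import NormalDist

def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to detect a mean difference
    `delta` given a per-measurement SD `sigma` (two-sided test,
    normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # ~1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)            # ~0.84 for 80% power
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)

# A difference of one SD needs ~16 subjects per group at 80% power;
# halving the detectable difference roughly quadruples the requirement.
print(n_per_group(delta=1.0, sigma=1.0))   # -> 16
print(n_per_group(delta=0.5, sigma=1.0))   # -> 63
```

The formula makes the interplay explicit: tighter measurement variability (smaller sigma, i.e., lower pre-analytical and analytical CVs) directly shrinks the number of subjects required.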
Of central importance is the selection of the biofluid (e.g., plasma or serum, as well as patient information including age, gender, medical history and any current diseases or ongoing treatments) and its collection protocols (e.g., tube type and material, anticoagulant, protease inhibitor, even the diameter of the needle used to draw the blood, which can affect the shear force during phlebotomy [50,51]), as well as the sample storage conditions (e.g., temperature, freeze–thaw cycles) (see Box 1 for the sources of variation, [51–54] for excellent reviews, and [45–49] for their impact on plasma protein biomarker research). Unfortunately, relatively little ‘basic’ work has been conducted on pre-analytical variability, although many authors agree that this is an important area of research and a contributing factor to its limited use in clinical chemistry laboratories. Gelfand and Omenn have attributed this to the difficulty in getting these types of studies funded, as the approach taken by funding agencies thus


far has been mostly disease-centric [51]. One notable exception has been the National Cancer Institute’s Clinical Proteomic Technologies and Clinical Proteomic Technologies for Cancer Initiatives [55–58], which have funded several of the reproducibility studies discussed in this review [40,41,59,60]. In addition, there have been several important studies on pre-analytical variability that have been performed recently, and a few of these are discussed below.

Sample selection

Although many researchers use plasma and serum almost interchangeably in referring to the specimen analyzed, their protein compositions/concentrations are actually quite different. As indicated in Box 1, there are a significant number of variables involved during the production of serum, which is generated from plasma through coagulation. The coagulation process involves the formation of a cross-linked fibrin clot, which causes associated proteins (e.g., fibrinogen, prothrombin, thrombin and coagulation factors) and aggregated platelets to be removed [49]. The protein concentrations are also dissimilar between these two sample types [49], which ultimately affects the accuracy of comparative quantitative analyses; for example, the concentration of VEGF has been found to be nearly tenfold higher in serum than in plasma [61]. Moreover, the coagulation process itself is inherently variable, with clotting in plastic giving different results than clotting in glass [62], and the choice of clot activators also has variable effects on the proteome. Although Lundblad recommended the use of serum over plasma in his 2005 study [49] and Lista et al. reported a lack of consensus over this fluid type [52], the current consensus within the proteomics community is to recommend the use of plasma [63–66].

Sample collection

Vacuum-drawn blood collection tubes containing either protease inhibitors (e.g., chymostatin, leupeptin and pepstatin) or exogenous anticoagulants (e.g., EDTA, sodium citrate and heparin) have been found to deactivate protease activity and prevent coagulation, which could alter sample composition. For example, thrombin has recently been demonstrated to cause damage to the plasma proteome by behaving in a similar manner to trypsin [67]. The effect of the additives on the plasma proteins has also been investigated previously [68]. Heparin, for example, is a polymer, which can create


problems in the downstream MS analysis. Collection tubes preloaded with a protease inhibitor cocktail and anticoagulants, however, have been found to produce reproducible plasma samples [69–71]. By contrast, no commercially available serum collection tubes have yet been developed which can reproducibly produce serum samples for proteomics studies. It is for this reason that the Human Proteome Organization has recommended using plasma instead of serum [63–65] and this is our rationale for conducting plasma-based analyses.

Sample stability & storage

Although the sampling problems listed in Box 1 appear almost insurmountable (especially in large-scale studies involving multiple collection sites), several recent studies have actually demonstrated that the plasma proteome is surprisingly stable. In a study of 200 high-to-moderate abundance plasma proteins [72], the

Liebler group found that the plasma proteome was ‘stable’ to delay in sample preparation (up to 1 week at 4°C), freeze–thaw cycles (up to 25), and hemolysis (Figure 2). Only a small percentage of the proteins revealed changes and these were mainly higher in abundance (e.g., albumin and serotransferrin). As an aside, they found good agreement between plasma and serum, except for fibrinogen. It should be noted, however, that this study was based on protein identifications where their presence or absence over time was determined from MS/MS spectral count data, not by absolute protein quantitation. Their stability results were similar to an MRM-based quantitative assay of 31 high-abundance plasma proteins that was conducted over two time-courses (weekly over 4 weeks and bimonthly for 19 weeks) [73]. These were verified by a separate non-MS, aptamer-based SomaLogic® (CO, USA) protein array study of biobanked samples (plasma preparation tube,

Figure 2. Stability measurements of EDTA-treated plasma proteins quantified by label-free, bottom-up LC–MS/MS analysis under various time points and conditions. (A) Two time points (24 h in upper and 1 week in lower) relative to the 0 time point. (B) Two freeze–thaw cycles (five in upper and 25 in lower) relative to a fresh sample. (C) Hemolyzed versus nonhemolyzed plasma. Reproduced with permission from [72] © American Society for Biochemistry and Molecular Biology (2012).
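The Spearman rank correlations quoted in the figure can be computed in a few lines. This is a tie-free sketch (Pearson correlation of the ranks); real spectral-count data contains ties, which would need averaged ranks, and the counts below are invented:

```python
import math

def spearman(x, y):
    """Spearman rank correlation for tie-free data: the Pearson
    correlation of the rank vectors."""
    def ranks(v):
        order = sorted(range(len(v)), key=v.__getitem__)
        r = [0.0] * len(v)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r

    rx, ry = ranks(x), ranks(y)
    n = len(x)
    m = (n - 1) / 2.0                      # mean rank
    num = sum((a - m) * (b - m) for a, b in zip(rx, ry))
    den = math.sqrt(sum((a - m) ** 2 for a in rx) *
                    sum((b - m) ** 2 for b in ry))
    return num / den

# Invented spectral counts: fresh vs stored aliquots of the same sample
fresh = [3, 10, 45, 120, 800, 2400]
stored = [4, 9, 50, 110, 750, 2600]
print(round(spearman(fresh, stored), 2))
```

Rank correlation is a natural choice for spectral counts because it tolerates the compressed dynamic range and non-linearity of count data better than a Pearson correlation on raw values.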


serum separation tube, and Red Top serum) [74], and a series of immunologic assays for measuring low-abundance cytokines, cytokine receptors, and activation markers in plasma over time and in variable storage conditions [75]. In contrast to the aforementioned stability measurements, an earlier study by Martino et al. (using surface-enhanced laser desorption ionization fingerprinting, followed by protein identification using in-gel digestion and LC–MS/MS) found diurnal variations in the concentrations of plasma proteins, including plasminogen, transthyretin and apolipoproteins – proteins involved in metabolic pathways and other processes that exhibit a circadian response [76]. The authors suggested that these protein fingerprints could be used to determine a ‘body time of day’ to time treatments for personalized medicine. Body time of day should therefore be considered as another pre-analytical variable in biomarker screening and diagnostic studies. To avoid this potentially confounding factor, Lundblad suggested collecting samples at the same time of day [49]. We refer the reader to the Robles and Mann review article for a detailed discussion of the effect of circadian cycles on proteomic analyses [77]. Nonetheless, we would still recommend the use of plasma (stored frozen until use) instead of serum for ‘new’ studies, although serum may be acceptable if it is the only blood-based biofluid available (e.g., biobanked samples), particularly in light of a recent long-term stability study [78]. In this study, Ito et al. used immuno-radiometric assays or ELISAs to demonstrate that stored sera were stable to degradation in the quantitation of several moderate-abundance proteins (e.g., insulin-like growth factors I and II, and insulin-like growth factor binding protein 3) after 9 years of -80°C storage.
The researcher must be aware, however, that the CVs are likely to be higher due to pre-analytical variability if serum is used. The temperature of plasma storage has also been recognized as a potential pre-analytical variable in plasma proteomics, although the shelf-life of plasma for proteomics studies has not yet been determined. In our laboratory, we store plasma for 12 months or less, and try to minimize repeated freeze–thaw cycles through the preparation of limited-use aliquots [66]. How that temperature is achieved may also be important. In a 2010 reproducibility study using 2D difference gel electrophoresis and in-gel digestion, the authors found that samples stored at -30°C agreed well with those stored at -80°C, with the exception of complement C3 precursor,


some of whose isoforms increased while others decreased [79], probably due to the known interrelatedness of the complement and coagulation pathways [80]. In a recent study by Murphy et al. [81], the authors found that transporting or storing samples in microcentrifuge tubes on dry ice can lead to acidification of the samples by 2.5 pH units, with resulting protein denaturation and/or aggregation. Dry ice was found to affect solutions in contact with CO2, and the authors found this effect could be prevented “by venting the headspace or allowing sample to sit in a -70°C freezer for 48 h before thawing”.
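Reproducibility across storage conditions such as these is conventionally summarized as the coefficient of variation (CV) of replicate measurements. A minimal sketch; the replicate concentrations below are invented:

```python
from statistics import mean, stdev

def cv_percent(values):
    """Coefficient of variation (%) of replicate measurements:
    sample standard deviation divided by the mean."""
    return 100.0 * stdev(values) / mean(values)

# Hypothetical ng/ml readings of one protein across four freeze-thaw cycles
replicates = [102.0, 98.5, 101.2, 99.8]
print(f"CV = {cv_percent(replicates):.1f}%")  # -> CV = 1.5%
```

Tracking such CVs per protein, per condition, is the simplest way to decide whether a given pre-analytical variable (storage time, temperature, freeze–thaw count) meaningfully degrades the assay.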

Summary of pre-analytical variability

There is a global consensus that the collection, processing, and storage of samples for proteomics need to be standardized, and that standard operating procedures (SOPs) need to be developed and rigorously followed to avoid introducing unwanted variability into the assay during the pre-analytical portion of the study. In fact, SOPs have been proposed for testing the effect of these pre-analytical variables on proteomics assays [82], and Gelfand and Omenn have published a list of pre-analytical variables that need to be controlled through SOPs (Box 2) [51], with some being incorporated into recently published SOPs [66,82,83]. On the importance of SOPs, Lundblad stated [49]: “It is difficult, if not impossible to eliminate these factors; the best that one can do is to very carefully document the conditions of blood processing. It is strongly recommended that SOPs are established for the process of obtaining a blood sample. It is only by doing this that one is able to assure reproducibility of samples and to allow some rationale comparison of data from various laboratories.” Although protein-based biomarkers are already being detected and quantified, tighter control of pre-analytical variables will reduce the artifacts that arise from changes in sample collection, which will allow the detection of smaller but significant differences between control and disease populations with high confidence.

Analytical variables

The precision and accuracy of absolute plasma protein quantitation relies heavily on the reproducibility of the sample preparation process (from denaturation through digestion), as well as the effectiveness of the mass spectrometer. Each step of the workflow introduces variables that impact the quality of the quantitative results (summarized in Box 3). The following sections discuss the variability surrounding sample pretreatment,


Box 2. Pre-analytical details that can be incorporated into a standard operating procedure (SOP) and recorded during acquisition and processing.

Collection details
- Date/time of draw
- Phlebotomist name
- Standard operating procedure (SOP)
  – Was phlebotomist trained on this SOP? (operator should certify yes or no)
  – Version of SOP being used
- Patient information
  – Time of last meal
  – Orientation (e.g., sitting, standing)
- Temperature of collection site
- Blood collection process
  – Needle/set used: type: straight; set (‘butterfly’); length of tubing in the set; needle gauge
  – Vendor; product name; number; lot number; expiration date
  – Number of sticks (e.g., was a second stick required to find a vein?)
  – Tourniquet used? (if so, part #, etc.); site disinfectant (part #, etc.)
- Blood collection tube(s) used
  – Vendor; product name; number; lot
- Total volume of blood collected
  – From the venepuncture event
  – In each tube
- Draw order of all tubes used
  – Was a ‘waste tube’ used to prime the blood collection device?
- Mixing of tube (to dissolve additive)
  – Number of inversions

Processing details
- Temperature of processing site
- Centrifuge
  – Operator/technician name: trained on this SOP?; version of SOP being used
  – Time at start of spin
  – Condition of sample between spin and aliquoting (e.g., kept at room temperature, 4°C, etc.)
  – Duration
  – Speed (g or RPM?)
  – Internal temperature
  – Make/model of centrifuge
  – Make/model of rotor
  – Date of last calibration
- Aliquoting
  – Operator/technician name: trained on this SOP?; version of SOP being used
  – Time of aliquoting
  – Volume of aliquot
  – Container (vendor, part #, etc.)
  – Condition of sample between spin and aliquoting (e.g., kept at room temperature, 4°C, etc.)
- Storage
  – Time of placement into storage container
  – Target temperature
  – Date of calibration of refrigerator/freezer
  – Method of freezing (e.g., flash freeze on dry ice or directly placed in freezer)
  – Rate of cooling (if known)
- Shipping
  – Temperature
  – Method of maintaining temperature (wet ice; dry ice; gel pack; etc.)
  – Date/time of pick-up
  – Shipping vendor information, company, tracking number, etc.

Table 16.2 reproduced from [51] with permission.


Bioanalysis (2013) 5(22)

future science group

tryptic digestion, IS, chemical interference and the LC–MS/MS platform.

„„Sample pretreatment: depletion & enrichment
Immunoaffinity depletion and enrichment are common pretreatment techniques used to simplify the complexity of the plasma proteome toward attaining deeper levels of protein quantitation. Such strategies can, however, increase the variability and affect sample recovery [40,84]. For instance, the recovery of prostate-specific antigen after immunoaffinity depletion on a ProteomeLab™ IgY-12 column (Beckman Coulter, CA, USA) was previously measured to be 30%, while myelin basic protein was lost altogether [40]. For sensitive and accurate quantitative proteomic measurements, steps to diminish the potential for variable performance and sample loss should be employed. It is for this reason that we have elected to perform our analyses on non-depleted and non-enriched plasma. To achieve the ultimate sensitivity, however, these sample pretreatment steps may need to be included, particularly with human plasma – the most complex


human-derived proteome sample known [30]. Experimentally, sample enrichment is typically performed at the peptide level (stable isotope standards and capture by anti-peptide antibodies [45,46,85–88,202] and immuno-MALDI [89–92]), while depletion of abundant proteins is usually conducted at the protein level (using antibody-based media). However, just as with the removal of fibrinogen and other clotting factors to generate serum, there is the possibility of unintentional removal of associated proteins when removing the targeted abundant proteins (e.g., α1-antichymotrypsin and α2-macroglobulin form a tight complex with prostate-specific antigen [43]). In addition, these pretreatment steps increase the assay cost and can diminish sample throughput [40,93,94], which is undesirable if the technique is to eventually have clinical utility. While we do not recommend the development of enrichment techniques for biomarker verification/validation studies, where a large number of candidates must be screened in a rapid and inexpensive manner, it is reasonable to institute this at the clinical stage for rapid and accurate quantitation of a small number of approved plasma protein biomarkers.

Box 3. Sources of analytical variability in human plasma/serum studies.
Analytical bias*
„„ Sample-processing procedures*
„„ Liquid-handling methods (manual or robotic)*
„„ Depletion of high-abundance proteins* (if used)
 – Which proteins to remove?
 – How many proteins to remove?
 – Source of medium (company, type of antibody [e.g., IgY])
 – Format (spin vs HPLC column)
„„ Enrichment of low-abundance proteins* (if used)
 – Polyclonal or monoclonal?
 – Peptide enrichment or protein enrichment?
„„ Digestion (bottom-up) or intact protein (top-down)
 – Selection of enzyme (if used)
 – Source of enzyme (if used)
 – Denaturant (urea, sodium deoxycholate)
 – Denaturant and digestion conditions: time; temperature; solvent; pH
„„ Prefractionation method
 – Gel: 1D or 2D?
 – SCX
 – HPLC
 – IEF
„„ Mass spectrometer type (ESI or MALDI)?
 – Triple quadrupole or Orbitrap?
 – Online LC system: flow rate; gradient; solvents
 – Acquisition settings: scan rate*; scan mode (MRM, MS/MS); collision energy
 – Environmental factors (temperature, humidity)
 – Quantitation method: with internal standards (SIS or not); label-free methods

Table 1 reproduced from [53] with permission, and extended. *: From Table 1 in [52].


www.future-science.com


„„Tryptic digestion
While a wide range of enzymes and chemicals, with varying specificities and efficiencies, are available, trypsin (EC 3.4.21.4) is the most widely used protease in bottom-up LC–MS proteomics. This preference stems from its high specificity in cleaving amide bonds at the C-terminal side of lysine (Lys) and arginine (Arg) residues under basic conditions [95,96], as well as its low cost, which is an important factor in large-scale projects. This proteolysis reaction generates peptides of practical size (14-mer on average) and favorable charge (+1 to +3, but commonly +2) for efficient ionization and gas-phase fragmentation in MS/MS analysis.

Numerous proteolytic digestion protocols exist, with some implementing specialized equipment (e.g., microwave [97] and microwave-assisted acid hydrolysis [98,99], ultrasonic probe [100], and hydraulic pressure-pulse activator [101,102]) or additional serine endoproteases (e.g., Lys-C [103]) for enhanced efficiency. The major differences center on the denaturant (i.e., chaotrope, surfactant or solvent) and buffer (e.g., ammonium bicarbonate or phosphate) used to unfold and lyse the protein at its pH optimum. As Proc et al. demonstrated in a large-scale quantitative comparison study [60], variability in the denaturation/digestion method can dramatically impact the efficiency and reproducibility of tryptic digestion, and therefore influence the precision and accuracy of the derived plasma protein concentrations. The authors’ findings revealed that sodium dodecyl sulfate or sodium deoxycholate (NaDOC), as opposed to the conventionally employed urea chaotrope [104], produced the highest yield of target peptides from 45 high-to-moderate abundance plasma proteins. Furthermore, the highest reproducibility was obtained with NaDOC. The reduced variability can be explained by the ease with which NaDOC is removed, which diminishes the potential for downstream interference from chemical artifacts stemming from the surfactant.
There is, however, one caveat – although NaDOC was found to produce the best results on average, no single method was found to be the best for all of the proteins studied. Thus, for some specific target proteins, a different denaturation/digestion protocol, such as one involving 9 M urea followed by ambient incubations with 5 mM TCEP and 10 mM iodoacetamide prior to overnight tryptic digestion, might be superior [60]. Based on the overall results from the Proc study,


and the fact that NaDOC can be easily removed prior to LC–MS analysis by centrifuging the acidified digest [105,106], we recommend the use of NaDOC in plasma proteomic analyses.

The grade and source of the trypsin also influence digestion efficiency and reproducibility, as demonstrated by Burkhart et al. in a comparative study of the qualitative and quantitative performance of six commercially available trypsins [107]. Their results indicated that sequencing-grade modified trypsin (Promega, WI, USA) performed best. Regardless of the trypsin grade, however, this protease was found to generate variable cleavages (be they missed, semi-tryptic or nonspecific), which can affect the digestion reproducibility over independent replicates. With regard to the nonspecific cleavages, although cleavage at Arg-Pro or Lys-Pro bonds is considered prohibited [108], violations have been observed [109]. At this time, only a few suppliers of protein-based standards for measuring tryptic digestion efficiency are available (e.g., the Protein Standard for Absolute Quantification from Promise Advanced Proteomics [203]). Digestion efficiency assessment is mandatory if correction factors are to be applied later toward accurate, absolute protein quantitation. It should be pointed out, however, that for precise, relative quantitative proteomic studies with labeled peptide standards, digestion efficiency is of less concern than reproducibility. For instance, a target peptide that is reproducibly detected with high precision can still be used as a surrogate to quantify its corresponding protein in relative quantitative MRM-MS analysis, even if the rest of the protein is digested with poor efficiency. This high precision in targeted protein quantitation has been demonstrated previously in our and other researchers’ quantitative plasma proteomic studies [35,36,38,44,110–113].
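The canonical cleavage rule discussed above (C-terminal to Lys/Arg, except before Pro) can be sketched as a simple in-silico digest. This is an illustrative simplification only – as noted, real digests also contain missed, semi-tryptic and nonspecific cleavages – and the sequence shown is hypothetical, not taken from any specific plasma protein:

```python
import re

def tryptic_peptides(sequence: str) -> list:
    """In-silico digest using the canonical trypsin rule: cleave
    C-terminal to Lys (K) or Arg (R), except when the next residue
    is Pro (P). Zero-width split: lookbehind for K/R, negative
    lookahead for P (requires Python 3.7+)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]

# Hypothetical example sequence
print(tryptic_peptides("MKWVTFISLLFLFSSAYSRGVFRRDAHK"))
# → ['MK', 'WVTFISLLFLFSSAYSR', 'GVFR', 'R', 'DAHK']
```

Note that the single-residue peptide ('R') and very short peptides ('MK') produced by adjacent cleavage sites are among the reasons surrogate peptide selection requires care in practice.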
A potential solution to the less-than-ideal nature of tryptic digestion is to use stable isotope-labeled standard (SIS) proteins in targeted protein quantitation [114–116], instead of the more commonly employed SIS peptides. Since the protein standards are added to the sample at the beginning of the analytical process, the digestion yield can be determined and a correction for protein losses that occur during sample preparation (e.g., immunoaffinity enrichment) can be applied. Despite their merits, these labeled protein standards are expensive and unlikely to contain all of the post-translational modifications that are found in the endogenous proteins (since the standards are often expressed in cell-free or E. coli

systems) and to be present in the native conformation. These limitations can affect the determination of the digestion efficiency, the normalization of the standard, and the accuracy of the natural (NAT)/synthetic ratio. Preparing these proteins in mammalian cells, however, may overcome some of these limitations.

An alternative solution for controlling the variability of tryptic digestion involves the use of external calibrators, such as those proposed by Agger et al. [117]. Through the quantitative analysis of two apolipoproteins, the authors demonstrated that single-point external calibration against a ‘standard’ plasma sample can effectively calibrate the NAT/SIS ratio of the LC/MRM-MS assay over multiple days. While the approach holds great promise in compensating for the less-than-ideal nature of tryptic digestion, it requires the protein concentrations of the calibrant to be determined by an ELISA using a matrix-matched sample collected under identical conditions. This could extend the method development time and increase the costs of the assay, which detract from its merits. However, if a reliable ELISA method exists, it can readily be used to accurately quantitate the protein in the standard sample, as well as in the patient samples.

The critical need for reproducibility in the digestion step is summarized in the last sentence of Hoofnagle’s perspective article [131]: “Clearly, future studies aimed at improving the inter-laboratory reproducibility of the proteolytic digestion of plasma and practical methods for monitoring that digestion are imperative.”

„„Automation
One way to reduce the analytical variability of the analysis is through the use of liquid-handling robotics, which are designed to reduce (as much as possible) human error and any variation in pretreatment between samples. Several robotic workstations are commonly used in proteomics laboratories – Thermo Fisher’s KingFisher™ Flex Magnetic Particle Processor, Leap Technologies’ CTC Analytics PAL® Sample Handler, the Tecan Freedom EVO® Automated Liquid Handler, Agilent’s Encore Multispan Liquid Handling System and Agilent’s Bravo Liquid Handling Platform, to name a few. Many of these robotic systems are modular and can perform automated digestion, as well as the manipulation of small sample sizes and the handling of affinity beads, should depletion or enrichment of the plasma samples be required. Many are designed to handle 96-well plates, allowing


high-throughput sample preparation, including aliquoting of the original sample, digestion, filtration and strong cation-exchange separation. If needed for other applications, some can perform automated spotting of MALDI targets. In our laboratory, we currently use a Tecan Freedom EVO robotic station for reproducible plasma dilution, reduction and alkylation, trypsin addition and SIS peptide addition.

„„IS
The inclusion of an isotopically labeled IS in absolute quantitative proteomic analyses is essential for achieving high accuracy and precision. These standards help with endogenous peak verification and the normalization of sample preparation- or instrument-related variability [118]. The use of isotopically labeled standards, however, is nothing new or specific to the ‘omics’ community, having been implemented previously in the quantitation of small biomolecules (e.g., drugs, metabolites and hormones) [119,120].

The most common standards are labeled with stable 13C and/or 15N isotopes at the C-terminal Lys or Arg residue. Their benefit is that, unlike 2H-labeled standards, they exhibit physicochemical properties identical to those of their unlabeled analogues. This results in the two peptide forms (SIS and NAT) behaving in a similar manner during reversed-phase chromatography, electrospray ionization and collision-induced dissociation. The only difference lies in the m/z values of their precursor ions and y-series product ions. The co-elution of signals helps distinguish between real and aberrant peaks, which helps guide peptide and transition selection toward maximum specificity and sensitivity.

Isotopically labeled peptides are the most commonly employed standards in absolute quantitative proteomic measurements. These chemically synthesized standards must be added following digestion due to the unpredictable prevalence of peptide degradation or chemical modification, which was previously found to occur only for SIS peptides during tryptic digestion [39]. An advantage of the post-digestion addition of SIS peptides is that it requires minimal sample volumes. However, based on their point of addition, they are unable to account for losses that occur prior to trypsin quenching. This unaccounted variability can instead be normalized with SIS proteins, which are spiked into the sample at the earliest stage of the workflow.
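Whichever standard type is used, the underlying calculation is the same: the endogenous concentration is inferred from the NAT/SIS peak-area ratio and the known amount of spiked standard. A minimal single-point sketch is shown below, with hypothetical peak areas; it assumes equimolar MS response of the labeled and unlabeled forms, whereas real assays typically rely on multi-point standard curves:

```python
def absolute_concentration(nat_area: float, sis_area: float,
                           sis_conc_fmol_per_ul: float) -> float:
    """Single-point estimate: the endogenous (NAT) concentration is the
    NAT/SIS peak-area ratio multiplied by the known spiked-in SIS
    concentration. Assumes the labeled and unlabeled peptides ionize
    and fragment identically (the premise of 13C/15N labeling)."""
    return (nat_area / sis_area) * sis_conc_fmol_per_ul

# Hypothetical peak areas from an MRM quantifier transition
print(absolute_concentration(nat_area=8.4e5, sis_area=2.1e5,
                             sis_conc_fmol_per_ul=50.0))  # → 200.0
```

The accuracy of this estimate depends directly on how well the spiked standard's concentration is characterized (e.g., by amino acid analysis), which is why the variability of the IS itself must not be discounted.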
In fact, the use of SIS proteins instead of SIS peptides was found to improve both precision and accuracy in the immuno-MRM-MS quantitation of plasma proteins [113]. Relative to synthetic peptides, however, the recombinant protein standards are limited by higher cost, more laborious preparation with the potential for increased technical challenges (e.g., solubility) and possible structural/conformational differences from their endogenous counterparts.

Although the targeted multiplexed quantitation of plasma proteins using protein [31,121] and peptide [20,31] standards has independently demonstrated strong accuracy and excellent precision, the variability associated with the IS itself must not be discounted. For absolute quantitation, the exact concentration and composition of the spiked standard must be accurately determined (e.g., through amino acid analysis and capillary zone electrophoresis) and applied during quantitative protein analysis.

„„Chemical interference
Chemical interference from nontarget co-eluting ions frequently occurs in the identification or quantitation of human plasma, and affects the reliability of the results. Contributing factors include the inherent complexity of plasma and the low resolution of the mass spectrometers (~1000–3000 on a triple quadrupole) employed in its analysis. The complexity is attributed not only to the wide concentration range of plasma (approximately 10,000 proteins spanning ten orders of magnitude), but also to the hundreds of thousands to millions of peptides derived from proteolytic digestion [30,122]. This substantially increases the potential for background interference and chemical noise in the mass transmission windows, despite the instrument being operated at unit mass resolution (full-width at half-maximum: 0.7 Da) and having several dimensions of analyte specificity (e.g., precursor ion m/z, product ion m/z, retention time and NAT/SIS ratio).

Interferences in the MRM ion channels can be reduced through sample prefractionation or extended chromatographic run times. Doing so, however, increases the assay cost while sacrificing throughput and multiplexing capability, which are considerable limitations in protein biomarker or clinical research studies. An alternative targeted proteomic option for interference reduction is MRM3 [123,124]. Although the MS detection is of secondary product ions, as opposed to the primary product ions used in a typical MRM experiment, this approach enables improved analyte specificity and sensitivity. Despite its merits, the approach can adversely


affect such assay attributes as precision and accuracy, while limiting the multiplexing capability due to long cycle times [125].

Apart from interference reduction, strategies have been developed to empirically [40,126] or theoretically [127] detect interferences in absolute quantitative plasma proteomic studies. Empirically, interferences are identified through the monitoring of SIS and NAT transitions (three transitions per peptide) under matrix-free and matrix-containing conditions [126]. By comparing the signals and response ratios between peptide forms and sample types, interferences can be identified and removed. This yields panels of interference-free MRM ion pairs (one quantifier and two qualifiers) that specifically represent the target peptides for monitoring in subsequent quantitative analyses. For further verification of the peptide identities and to help build specific peptide MRM transition lists, MS/MS data collected from initial discovery experiments can be used, as is done in the iMRM software [128]. To obviate the need for manual and subjective examination of the data, the automated detection of inaccurate and imprecise transitions (AuDIT) algorithm was recently developed [127]. It automatically evaluates relative peak-area ratios and flags those that are problematic (based on p-values and CVs). While its benefit to analysis throughput is recognized, it lacks the automated verification of retention times and the peak-shape comparisons between transition pairs and peptide forms that are imperative for accurate protein quantitation. Future algorithms should therefore incorporate this functionality for added confidence. Apart from reducing interferences through altered experimental design or detecting/removing interferences through inspection, the program ‘MRMCollider’ was recently developed to predict the masses of potential interferences in a given proteome [129].
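The ratio comparisons underlying both the empirical screen and AuDIT can be illustrated with a simplified check on the relative contribution of each transition in the NAT versus SIS form. All values below are hypothetical, and this is not the published AuDIT algorithm (which additionally applies t-test p-values and CV criteria across replicates):

```python
def flag_interference(nat_areas: dict, sis_areas: dict,
                      tolerance: float = 0.15) -> list:
    """Flag transitions whose fractional contribution to the total NAT
    signal deviates from the corresponding SIS fraction by more than
    `tolerance`. Because SIS and NAT forms co-elute and fragment
    identically, a deviating transition suggests a co-eluting
    interference in that NAT ion channel."""
    nat_total = sum(nat_areas.values())
    sis_total = sum(sis_areas.values())
    flagged = []
    for t in nat_areas:
        nat_frac = nat_areas[t] / nat_total
        sis_frac = sis_areas[t] / sis_total
        if abs(nat_frac - sis_frac) > tolerance:
            flagged.append(t)
    return flagged

# Hypothetical areas for three transitions of one peptide;
# y6 in the NAT form is inflated by a co-eluting interference
nat = {"y4": 5.0e4, "y6": 9.0e4, "y7": 6.0e4}
sis = {"y4": 4.0e4, "y6": 3.0e4, "y7": 4.8e4}
print(flag_interference(nat, sis))  # → ['y6']
```

In this example, y6 would be excluded, leaving y4 and y7 as the interference-free quantifier/qualifier pair.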
MRMCollider’s utility was recently demonstrated in the targeted quantitation of the yeast proteome [130], but it has not yet been applied to human protein panels.

„„LC–MS/MS platform
The specific LC–MS/MS platform used has an impact on the accuracy and reproducibility of the results, particularly with respect to the possibility of quantitating interferences. A paper from Hoofnagle pointed out the great potential of LC/MRM-MS for quantitative proteomics and the encouraging results from a ‘round-robin’ inter-laboratory comparative study [131]. However,

he still emphasizes the need for standard methods of digestion and a means of monitoring that digestion. The characteristics of various platforms are compared in a paper from the International Federation of Clinical Chemistry Working Group on Clinical Quantitative Mass Spectrometry Proteomics [132]. Clearly, lower resolution instruments have a higher potential for generating false positives in an MS-based assay, but we have recently demonstrated the importance of chromatographic resolution and chromatographic reproducibility as well. In a comparative study of two different LC systems interfaced to the same mass spectrometer (an Agilent 6490), we found that the standard-flow UHPLC system was less subject to interference than the more commonly used nano-flow system [36]. Although the nano-flow system per se was more sensitive, the standard-flow system used larger-bore UHPLC columns, resulting in increased robustness and increased loading capacity, which more than compensated for the decrease in absolute sensitivity of the standard-flow platform. In addition, the standard-flow UHPLC system gave narrow peak widths and highly reproducible retention times, which assisted in the development of highly multiplexed assays.

The ‘pseudo-MRM’ approach, using the very high sensitivity and very high mass accuracy of Thermo Fisher’s Q-Exactive


quadrupole Orbitrap instrument, also bears consideration [133]. Note that we refer to this approach as ‘pseudo-MRM’ because, unlike in standard triple-quadrupole-based MRM, there is no mass filtering involved in the second stage of mass separation; instead, full-scan product-ion mass spectra are collected from the selected precursor ions.

Standardization methods & their significance
With the rising popularity of MS-based quantitative proteomics, there is an increasing demand for evaluation or ‘accreditation’ of method and instrument performance. The need is well recognized [134], and has become a priority of the Human Proteome Organization-sponsored scientific initiatives [135–137]. The importance of QC assessment is illustrated by the failure of surface-enhanced laser desorption/ionization [138,139]. This technique was marketed and promoted without QC, and has since been largely discredited due to poor performance [140]. It is now being marketed again, this time by Bruker Daltonics and Bio-Rad [204], but on a higher resolution MALDI instrument and with better QC.

To evaluate the reproducibility of a targeted MRM absolute quantitative proteomic approach with SIS peptides, a large multisite study was

[Figure 3 appears here: %CV (y-axis, 0–1.0) plotted against the theoretical concentration per sample of HRP-SSD (x-axis, nine levels spanning 1.0–500 fmol/µl; 2.3–1,164.8 µg/ml), color coded by study (I, II, III) and site (19, 52, 54, 56, 65, 73, 86 and 95).]

Figure 3. Inter-laboratory variability in the MRM-based plasma protein quantitation of horseradish peroxidase (inferred from SSDLVALSGGHTFGK) for three different study designs conducted at nine different spiked-in concentrations and measured at eight different sites. The studies were designed to assess the effect of analytical variability (through sample preparation and LC/MRM-MS analysis) on quantitative performance. The %CV values corresponding to each site and each study are color coded, as indicated in the legend. Reproduced from [59].


conducted with a target panel of ten signature peptides (equating to seven high-to-moderate abundance plasma proteins) in three different study designs [59]. Despite the introduction of increasing levels of variability with each study, high intra- and inter-laboratory reproducibility (mean CVs
