Coronary Artery Disease

Analytical validation of novel cardiac biomarkers used in clinical trials Alan H.B. Wu, PhD, DABCC San Francisco, CA

Background Blood-based biomarkers such as cardiac troponin and the B-type natriuretic peptides are widely used in clinical practice for the diagnosis, rule out, and risk stratification of patients with acute coronary syndromes and heart failure. Because neither these nor any other laboratory test meets all clinical needs, many novel biomarkers are proposed and evaluated each year for possible implementation into clinical practice. Results of clinical trials are used as a means to validate their effectiveness and to obtain regulatory approval. Methods and results Novel biomarkers are discovered through a targeted approach using knowledge of the pathophysiology of the disease process and an untargeted approach in which proteins from tissues or blood of diseased patients are compared against those of healthy subjects or those with benign conditions. Once a candidate biomarker has been identified, it is important to understand where the protein is located and how it is released into blood. In designing trials, the requirements for Food and Drug Administration clearance and approval should be taken into consideration. There are preanalytical studies that should be considered, including the preservative used to collect samples and in vivo and in vitro analyte stability. If the analyte is not stable, a surrogate marker such as a stable “pro” molecule (precursor protein) may be preferred. Assay imprecision and bias, biological variation, and criteria for the establishment of a reference range are important analytical attributes. The need for harmonization and commutability and the correlation of results to other markers and clinical outcomes are important postanalytical attributes of novel biomarkers. Conclusions Inadequate attention to these variables when conducting clinical trials reduces the quality and value of the information contained in literature reports of novel serum/plasma-based biomarkers. (Am Heart J 2015;169:674-83.)

Biomarkers are widely used in routine clinical practice for diagnosis, monitoring, risk stratification, and therapeutic selection for cardiovascular diseases. The “ideal characteristics” of serum-based biomarkers have been reviewed. 1 For acute coronary syndrome, there has been a progression of biomarkers starting with aspartate aminotransferase, creatine kinase and its MB isoenzyme, and now cardiac troponin. For heart failure (HF), the natriuretic peptides (B-type natriuretic peptide [BNP] and N-terminal pro–B-type natriuretic peptide [NT-proBNP]) are widely used. These biomarkers have been successful because they have fulfilled most of the characteristics for an ideal marker. Both troponin and BNP indicate the presence of a pathophysiologic condition, that is, myocardial ischemia and injury, and hemodynamic stress, respectively, and not presence of a specific disease. Therefore, the optimum use of these tests is in conjunction with other data, especially clinical

From the Department of Laboratory Medicine, University of California, San Francisco, CA. Submitted November 21, 2014; accepted January 16, 2015. Reprint requests: Alan H.B. Wu, PhD, DABCC, San Francisco General Hospital, 1001 Potrero Ave, San Francisco, CA 94110. E-mail: [email protected] 0002-8703 © 2015 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.ahj.2015.01.016

presentation. When elevated, these markers raise the pretest probability of myocardial infarction (MI) and HF. There are several unmet clinical needs with regard to the utilization of existing biomarkers. For acute coronary syndrome, a marker that can diagnose MI early, for example, during the initial presentation of patients with chest pain to the emergency department, would better facilitate triaging. For HF, biomarkers that can be used for long-term disease monitoring and selection of optimum therapy have the promise of reducing the rate of readmission to the hospital for HF exacerbation. For these and other potential clinical applications, there is considerable research conducted in the discovery and clinical validation of novel biomarkers for cardiovascular diseases. This review is intended for researchers, clinicians, laboratorians, regulatory scientists, journal reviewers and editors, and manufacturers of diagnostic assays, who examine literature reports, plan validation studies, consider commercialization of laboratory tests, or implement new clinical testing services. The author is solely responsible for the drafting and editing of the manuscript and its final contents.

Biomarker discovery platforms There are 2 principal mechanisms to discover new serum-based biomarkers. 1 The “pathophysiologic”

American Heart Journal Volume 169, Number 5

Wu 675

approach involves knowledge of the relevant human physiology and disease processes, an active area of clinical research. Cardiac troponin is an important marker for myocardial ischemia because these proteins participate in muscle contraction and are found in high concentrations within the myocyte. B-type natriuretic peptide is a hormone that participates in the regulation of water and electrolytes, which are essential for the maintenance of the proper blood pressure. Understanding the pathophysiology is important for the development of new pharmacologic agents. It also gives an opportunity for the discovery of novel biomarkers that are produced, released, or cleared during the disease process. Most biomarkers that are in use or being evaluated as markers for cardiovascular diseases were discovered by the pathophysiologic approach. The cardiac troponin complex is a collection of 3 proteins that participate in muscle contraction. B-type natriuretic peptide is a hormone that counters the actions of renin and aldosterone. The discovery is how these markers are used in the diagnosis or prognosis of cardiovascular disease, not the existence of the protein or hormone itself in blood.

The “proteomic” and “metabolomic” approach involves an untargeted search for biomarkers. This science involves the use of techniques such as 2-dimensional electrophoresis and mass spectrometry that enable the simultaneous analysis of hundreds to thousands of proteins and metabolites. These techniques are used to compare the protein or metabolite signature of diseased patients with that of healthy individuals. Proteins found in 1 set of samples but not the other may prove to be potentially useful biomarkers. Metabolites are defined as low-molecular-weight products of enzymatic reactions. Novel metabolites may be present in blood of MI patients, due to the release of proteolytic enzymes that degrade circulating proteins. Sometimes, the identity of the substance is not known at the time of its discovery. Subsequent analysis is conducted to determine the identity of the protein and its physiologic or pathophysiologic role. Only a few cardiac biomarkers have been discovered by the proteomic approach. Soluble ST-2 was found to be released into the surrounding media from cultures of rat myocytes undergoing mechanical strain. 2 The role of ST-2 was delineated after the discovery of interleukin 33, the natural ligand of ST-2. 3 It is likely that more biomarkers will be discovered by untargeted approaches as the mass spectrometry technology for protein identification improves and the analytical sensitivity increases. Although there is some interest in measuring N-terminal and C-terminal peptides degraded from intact troponin, there is no biomarker discovered by metabolomics currently in routine clinical use.

Classification of novel biomarkers

Once biomarkers have been discovered, it is useful to classify each biomarker as an indicator of a clinical event (eg, acute myocardial infarction) or a disease (eg, HF). Cardiac troponin is an event marker, whereas C-reactive protein is a disease marker. The natriuretic peptides can be considered as both an event marker (eg, decompensated HF) and a disease marker (eg, chronic HF). Once the physiologic role of a candidate biomarker is known, it is also useful to classify biomarkers according to pathophysiologic groups, such as myocardial ischemia, necrosis, inflammation, neurohormonal activation, angiogenesis, plaque instability, atherosclerosis, and thrombosis. 4 Markers in each of these categories could provide complementary information using multimarker approaches. Two markers of the same category (eg, BNP and NT-proBNP) do not provide differential information. The selection of members of a multimarker panel must be demonstrated through clinical trials.

Relevance of clinical trials to Food and Drug Administration submission

Extensive validation studies are required before biomarkers can be put into clinical practice. The Food and Drug Administration (FDA) is responsible for the regulatory approval of clinical laboratory tests. Reagents, instruments, and systems that are intended for use in the diagnosis of disease, in order to cure, mitigate, treat, or prevent disease or its sequelae, are considered medical devices. Their clearance ensures that the devices are safe and the diagnostic claims made by the manufacturer are verified. Table I lists the different FDA classifications and the approval processes that biomarker tests undergo. The FDA examines data from analytical and clinical validation studies conducted on tests submitted for approval. Clinical trials are directed to address specific intended claims to be made by manufacturers of the test. The FDA requires a significant fraction of testing to be conducted prospectively. Retrospective biomarker data obtained from pharma-based clinical trials designed to determine efficacy of a therapeutic agent are usually not acceptable as part of an FDA submission for an in vitro diagnostic (IVD) test. The exception is markers designed to be a part of “companion diagnostics.” These are tests that are directly linked to specific therapeutics, that is, use of the drug is conditional on the result of a clinical laboratory test. For example, the drug trastuzumab is used in women with breast cancer whose tumors overexpress the HER-2/neu receptor. 5 Because of the medical consequences of a false-positive or false-negative result, this and other tests such as those for infectious diseases, oncology, and cardiovascular disease require either a premarket approval submission or a 510(k) with a clinical study that includes an adjudication of a patient's discharge diagnosis or outcome.

Clinical trial study design

There are 2 objectives for clinical trials of in vitro diagnostic devices: FDA clearance of a biomarker for which

American Heart Journal May 2015


Table I. FDA regulatory approval process for in vitro diagnostic devices

Types of devices
  IVD: devices that include laboratory instruments, reagents, assays not directly used on patients.
  Medical: devices used directly on patients, eg, blood collection needles.

Device classifications
  Class I: lowest potential for harm. Most laboratory tests fall into this category.
  Class II: special controls needed, including labeling requirements, performance standards, and postmarket surveillance.
  Class III: highest risk of harm to patients, for which general and special controls are insufficient to provide a reasonable assurance of safety and effectiveness.

FDA clearance/approval process
  510(k): FDA agrees that the new device is substantially equivalent to a legally marketed “predicate” device. Clinical data are not required for most 510(k) submissions. Successful submissions are given the label “FDA Cleared.”
  De novo 510(k): FDA mechanism to allow a manufacturer to submit for clearance under the 510(k) process in the absence of a predicate device for low- or moderate-risk devices. Successful submissions are given the label “FDA Cleared.”
  Premarket approval: a clinical assessment of the safety of class III devices. For IVD products, because the device is not used on the patient, the safety relates to the potential medical consequences of test results that are false negative and false positive. Successful submissions are given the label “FDA Approved.”

Investigational device exemption
  Requirement: allows an investigational device, deemed to have significant risk, to be used in a clinical study.
  Exemption: investigational device exemption filing not required for an investigational device deemed to not have significant risk.

clinical trial data are required and documentation as to the value of the test in clinical practice. Both objectives can be achieved with the same trial. If the principal objective is to obtain regulatory approval, most manufacturers seek the 510(k) route, that is, demonstration of “non-inferiority.” These trials are typically not randomized. However, it is important to select the appropriate control population, that is, patients who do not have the disease but who have biomarker results at or near the concentrations for the intended disease. Once the biomarker is cleared, a second trial can be designed to determine new disease indications. If the trial is designed to demonstrate improvements in clinical outcomes through decisions guided by the result of a clinical laboratory test, a randomized trial may be required. For example, the BATTLESCARRED Trial examined if HF outcomes can be improved with treatment decisions guided by NT-proBNP results. 6 For these trials, the use of FDA-cleared assays is optimal, as different results and conclusions can be produced with use of assays labeled as “research-use only” due to the lack of standardization and harmonization (see section below). The FDA highly encourages IVD manufacturers to submit data that produce additional claims of laboratory tests, especially if those

indications are routinely used in clinical practice (eg, risk stratification for BNP). However, the cost of conducting another trial discourages manufacturers from seeking additional indications unless they foresee a significant marketing advantage over their competitors.

The biochemistry of biomarkers Logically, the biomarker's tissue distribution, subcellular location, and pattern of release into blood should be understood before a clinical trial commences. In actual practice, these facts are usually not known to the fullest extent. For example, the release kinetics of cardiac troponin are still being investigated despite over 20 years of routine use of this biomarker. Recently, Starnberg et al 7 described the release kinetics of cardiac troponin from myofibrils as being dependent on the rate of organ perfusion and reperfusion. The biochemistry of the natriuretic peptides is complicated by the release of pro-BNP, the precursor molecule to both BNP and NT-proBNP, into the blood of HF patients. 8 An understanding of the biochemistry of a biomarker is important in targeting the appropriate form of the biomarker that appears in blood. For example, pregnancy-associated plasma protein A (PAPP-A) is a novel biomarker that participates in the progression of atherosclerotic disease. Lund et al 9 showed that measurement of the free form is superior to that of total PAPP-A, which is the sum of the free form and PAPP-A complexed to proeosinophil major basic protein. Measurement of total PAPP-A may include PAPP-A released in other conditions such as pregnancy and stroke. Therefore, the specificity of free PAPP-A as a cardiac marker is expected to be higher than that of total PAPP-A. Many proteins undergo posttranslational modifications after their production, including phosphorylation and glycosylation. 10,11 The analytical specificity of the antibodies toward these modified forms determines the clinical specificity of the corresponding laboratory test for the biomarker. Proteins also undergo degradation by proteolytic enzymes once released into the circulation. Detection of partially degraded forms affects both the specificity and analytical sensitivity of the marker.

Preanalytical variables Blood collection tubes and specimen processing The detection of circulating biomarkers begins with the collection of blood into the appropriate tubes. For many biomarkers, either serum or plasma can be used. Blood clots when it is collected into tubes containing no anticoagulant. After clot formation, the tubes are centrifuged, and the serum can be removed. Blood collected in tubes containing heparin, EDTA, or citrate yields plasma. Heparin is the most commonly used anticoagulant for cardiac biomarkers. However, some markers, such


as BNP, are only FDA cleared for use in plasma collected in EDTA. There are also some differences in troponin results between serum and plasma. 12 Most clinical trials collect blood with only 1 preservative. Therefore, an investigator often does not have a choice as to which tube is available for biomarker analysis. Individuals who are responsible for designing clinical studies that involve biomarker analysis should plan accordingly for the types of samples and anticoagulants needed for the study. For studies involving HF, NT-proBNP may be the analyte of choice, as testing can be conducted in serum or plasma (heparin or EDTA). B-type natriuretic peptide undergoes degradation in serum, and serum should not be used for its measurement. Shih et al 13 showed that myeloperoxidase is most stable when collected in EDTA. When heparin is used as the anticoagulant, myeloperoxidase leaks from leukocytes, causing an increase in its serum or plasma concentration. Heparin treatment may also increase the concentration of PAPP-A. 14 CD40 ligand is another novel biomarker for atherosclerosis. Higher concentrations of CD40 ligand are seen in serum than in plasma, and the increase is a function of storage time, as this marker is released from activated platelets in vitro. 15 In general, biomarkers of platelet function require careful attention to preanalytical issues to avoid inadvertent activation after blood collection due to shear stresses. 16 This makes the use of these types of markers more problematic. Insufficient filling of blood collection tubes is another source of preanalytical error. The final concentration of anticoagulants such as heparin, EDTA, and citrate is optimized with the assumption that the tube is completely filled with blood. If the tube is partially filled, the concentration of the anticoagulant increases proportionally. This will result in a higher degree of inaccuracy in the measurement of analytes that are influenced by the anticoagulant concentration. Blood collection tubes containing citrate in liquid form will also produce a greater than expected dilution of biomarkers if the tube is not completely filled to its intended volume. Improper mixing of tubes after blood collection can also produce “concentration gradients” with regard to anticoagulant level and can cause analytic error.
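The effect of underfilling described above can be illustrated with simple arithmetic. The following is a minimal sketch, not a validated correction: it assumes a hypothetical liquid citrate tube with 0.5 mL of additive intended for a 4.5 mL blood draw, and it treats the additive as diluting whole blood uniformly (hematocrit is ignored). The function names are illustrative only.

```python
def blood_fraction(blood_ml: float, additive_ml: float) -> float:
    """Fraction of the final tube volume that is blood (1.0 means no dilution)."""
    if blood_ml <= 0 or additive_ml < 0:
        raise ValueError("blood volume must be > 0 and additive volume >= 0")
    return blood_ml / (blood_ml + additive_ml)


def anticoagulant_excess(nominal_blood_ml: float, actual_blood_ml: float) -> float:
    """Ratio of the actual to the intended anticoagulant-to-blood proportion.

    The amount of additive in the tube is fixed, so the ratio rises as the
    volume of blood actually drawn falls.
    """
    if nominal_blood_ml <= 0 or actual_blood_ml <= 0:
        raise ValueError("volumes must be > 0")
    return nominal_blood_ml / actual_blood_ml


# Hypothetical tube: 0.5 mL liquid citrate, intended draw of 4.5 mL.
full_draw = blood_fraction(4.5, 0.5)         # properly filled tube
half_draw = blood_fraction(2.25, 0.5)        # tube filled to half the intended draw
excess = anticoagulant_excess(4.5, 2.25)     # relative anticoagulant excess
```

In this sketch the half-filled tube contains twice the intended proportion of anticoagulant and a smaller blood fraction than the full tube, which is the direction of error the text describes.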

Short-term in vivo and in vitro analyte stability A critical and often overlooked preanalytical variable for biomarker studies is the in vivo stability of the analyte. Some potentially useful cardiac markers have very short half-lives while circulating in blood and cannot reliably be measured. In vitro analyte instability does not necessarily preclude its value for clinical practice. Accurate results for activated clotting time can be achieved in clinical practice despite the instability of the sample, if testing is conducted immediately, for example, at patient bedside. 17 Long-term in vitro analyte stability An important issue for the success of biomarkers from clinical trials is the analyte stability during storage. Data obtained from clinical trials using stored specimens of an


Table II. “Surrogate” biomarkers for cardiovascular disease

Active hormone   Surrogate    Classification
Insulin          C-peptide    Cosecreted with insulin
BNP              NT-proBNP    Cosecreted with BNP
BNP              Pro-BNP      Precursor protein
Vasopressin      Copeptin     Cosecreted with vasopressin
ANP              NT-proANP    Cosecreted with ANP
Adm              MR-proAdm    Precursor protein

Abbreviations: ANP, atrial natriuretic peptide; NT, aminoterminus; MR, midregional; Adm, adrenomedullin.

unstable biomarker may not produce any meaningful results. In 1 study, BNP lost 30% of its concentration after 1 day of storage at −20°C compared to >90% recovery for NT-proBNP. 18 B-type natriuretic peptide is degraded by plasma proteases, and improved stability of BNP can be achieved with the application of inhibitors. 19 However, the addition of a protease inhibitor requires a special blood collection tube and is more difficult to include as part of clinical trials. Cardiac troponin undergoes degradation in serum of patients with acute MI due to the presence of proteolytic enzymes that are coreleased from cardiac tissue. 20 Despite this fact, most commercial manufacturers of troponin assays have demonstrated good sample stability under room temperature, refrigerated, and frozen conditions using first-generation assays 21 and the more recent high-sensitivity assays. 22,23 This is because commercial assays are constructed to give an equimolar response to each of the degradation products that are formed. 20 Many manufacturers use >2 antibodies to accomplish this detection. 24

Surrogate biomarkers When the analyte demonstrates in vivo or in vitro instability, the analysis of precursor proteins (eg, “pro” and “prepro” forms) or cosecreted peptides may be an appropriate alternative for routine practice. Table II lists some hormones where measurement of the surrogate may be more appropriate. 25-28 C-peptide, which is cosecreted with insulin when proinsulin is cleaved, is used to evaluate native insulin secretion in diabetics who have received insulin injections. 25 Pro-BNP is the precursor protein to BNP. NT-proBNP is an inactive metabolite of pro-BNP. Surrogate peptides and proteins are generally not physiologically active. Fortunately, activity is not a requirement for a cardiac biomarker, as the objective is to correlate the concentration of the marker to disease and not to its function in blood. This is different from coagulation markers, where the objective is to determine the in vivo functionality.

Analytical variables Assay imprecision and bias Analytical imprecision must be taken into account when evaluating clinical laboratory data. If the


uncertainty of the measurement is unknown, one cannot accurately determine if a particular result is abnormal or the result of variances within the measurement procedure itself. A commercial assay may not have the precision to meet a specific clinical need. Therefore, precision of the biomarker, when tested by the researchers themselves, should be documented for all clinical studies. There are 3 types of precision measurements. Intra-assay precision involves testing the analyte in multiple replicates at the same time; interassay precision is testing over different runs, typically over days; and total precision is the combination of both. Typically, the intra-assay coefficient of variation is lower than the interassay and total values by a few percent, and when only this attribute is listed, it may give an inflated indicator of assay performance. For biomarkers that are tested in the central laboratory, the intra-assay imprecision is not relevant unless the analyte is tested in duplicate, which is rarely done by clinical laboratories. An exception is when research tests are conducted using enzyme-linked immunosorbent assays, whereby duplicate measurements are conducted to improve assay precision. Such a practice should be stated in a report or manuscript. The intra-assay imprecision for tests conducted on point-of-care devices is less relevant as each run is a separate analysis procedure and results may be close to interassay precision. Unless otherwise noted, one can assume that samples were assayed in singlicate, and the interassay or day-to-day precision should be determined and quoted. Typically, 20 days is used to determine this measure. The imprecision of an assay is also dependent on the analyte concentration. A plot of percent coefficient of variation (%CV) (y-axis) versus concentration (x-axis) is “L” shaped, that is, the imprecision is highest at low concentrations and lowest at high concentrations (Figure 1). 29 Ideally, the precision results should be determined at multiple levels, with the concentration of the analyte tested stated for each. If only 1 precision statistic is given, the concentration at the decision limit should be listed. For protein biomarkers, the typical interrun precision is 10%. Some biomarkers, such as BNP, can have higher imprecision 30 but may be acceptable if the expected range of results between healthy and diseased groups is large. The imprecision of quantitative measures for other non-blood-based cardiac markers is potentially important, such as the ejection fraction by echocardiography or the QRS interval from the electrocardiogram. Imprecision and biological variation (see next section) of these quantitative measures require repeated measurements, which are typically not performed, and such data are rarely given. Analytical bias refers to the difference between a test result for an analytical method and the “true” result as determined by a reference method. Bias is an important attribute for the determination of the total allowable error for a test (typically, TE = |bias| + 1.96 * imprecision). For experimental biomarkers, reference methods are not

Figure 1. Typical imprecision versus analyte concentration curve for cardiac troponin. Used with permission from Clin Chem 2009;55:1433-1434. 29

available, and the true result is not known. Moreover, there is no standardization for routinely used cardiac biomarkers such as troponin or the natriuretic peptides, and the bias cannot be determined. A term often used interchangeably with total allowable error is total analytical error. In addition to imprecision and bias, the analytical error also includes random bias due to the presence of interferences that are present in some samples but not others. Ideally, total analytical error is determined by comparison of the experimental method against a reference method. Because a reference method is not usually available for many biomarkers, the use of patient comparisons against a predicate (FDA-cleared) assay is one means by which the FDA examines the random error of an assay (see “Correlation of results to other markers” section). 31 The total analytical error cannot be determined if a predicate assay is not available.
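The imprecision and total error calculations in this section can be sketched in a few lines. The example below is illustrative only: the result lists are invented, only 5 days of data are shown rather than the 20 days typically used, and the reference value needed for the bias calculation is an assumption, since for a novel biomarker it is usually not available.

```python
from statistics import mean, stdev


def percent_cv(results):
    """Percent coefficient of variation (sample SD / mean) of repeated results."""
    if len(results) < 2:
        raise ValueError("need at least 2 results")
    m = mean(results)
    if m == 0:
        raise ValueError("mean is zero; %CV is undefined")
    return 100.0 * stdev(results) / m


def percent_bias(results, reference_value):
    """Mean deviation from a reference value, as a percentage of that value."""
    return 100.0 * (mean(results) - reference_value) / reference_value


def total_error(bias_pct, cv_pct, z=1.96):
    """TE = |bias| + z * imprecision, with all terms expressed in percent."""
    return abs(bias_pct) + z * cv_pct


# Hypothetical day-to-day (interassay) results at two concentration levels.
low_pool = [9.0, 11.0, 10.5, 9.5, 10.0]
high_pool = [98.0, 102.0, 101.0, 99.0, 100.0]

# A two-point "precision profile": %CV at each concentration level.
profile = {"low": percent_cv(low_pool), "high": percent_cv(high_pool)}

# Total error at the low level, assuming a hypothetical reference value of 10.5.
te_low = total_error(percent_bias(low_pool, 10.5), profile["low"])
```

With these invented data the %CV is higher for the low pool than for the high pool, mirroring the “L”-shaped profile in Figure 1. Reporting the %CV together with the concentration at which it was measured keeps the statistic interpretable.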

Biological variation The biological variation of biomarkers refers to the changes in analyte results within an individual over time, termed the intraindividual percent coefficient of variation (%CVI), and between individuals, termed the interindividual percent coefficient of variation (%CVG). 32 Studies are conducted in healthy subjects and are used to set decision limits for diseased patients. These values, in conjunction with the analytical imprecision (CVA), are used to calculate the index of individuality, reference change values (RCVs), and the number of samples needed to establish the homeostatic set point. Reference ranges are appropriate for tests where the intraindividual variation is equal to or greater than the interindividual variation (high index of individuality). However, when the index is low (ie, CVI ≪ CVG), an individual's results span only a small part of the population-based reference range, and serial measurements should be used for interpretation of results. Cardiac troponin has a low index of individuality; therefore, serial measurements become important. 33 In this case, it is


appropriate to use an individual's own baseline, where available, for the optimum interpretation of results. The RCV is the percent change that is needed for a biomarker result to exceed the analytical and biological variation seen during health. Although the RCV is given as a percent change, recent studies using high-sensitivity troponin have suggested that an absolute change is superior to the RCV. 34 The number of samples needed to establish a homeostatic set point is calculated from biological variation studies and is used to establish a baseline level. Total, low-density lipoprotein, and high-density lipoprotein cholesterols have a low index. The number of samples needed to establish a baseline for these tests is 2 to 4. 35 The number of samples needed to establish the homeostatic set point for C-reactive protein is 38 because of its high biological variation, 36 limiting its value for primary cardiovascular disease risk assessment.
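The three quantities derived from biological variation data can be written out explicitly. The sketch below uses the formulas commonly given in the biological variation literature (they are not spelled out in this review, so treat them as an assumption here), and the CV values are invented for illustration rather than taken from any specific biomarker.

```python
import math


def index_of_individuality(cv_a, cv_i, cv_g):
    """II = sqrt(CVA^2 + CVI^2) / CVG; a low value favors serial testing."""
    return math.sqrt(cv_a**2 + cv_i**2) / cv_g


def reference_change_value(cv_a, cv_i, z=1.96):
    """RCV (%) = sqrt(2) * z * sqrt(CVA^2 + CVI^2) for a two-sided change."""
    return math.sqrt(2.0) * z * math.sqrt(cv_a**2 + cv_i**2)


def samples_for_set_point(cv_a, cv_i, closeness_pct, z=1.96):
    """Number of samples so the mean lies within closeness_pct of the set point.

    n = (z * sqrt(CVA^2 + CVI^2) / D)^2, rounded up to a whole sample.
    """
    n = (z * math.sqrt(cv_a**2 + cv_i**2) / closeness_pct) ** 2
    return math.ceil(n)


# Hypothetical values, all in percent.
CV_A, CV_I, CV_G = 3.0, 4.0, 50.0

ii = index_of_individuality(CV_A, CV_I, CV_G)
rcv = reference_change_value(CV_A, CV_I)
n_samples = samples_for_set_point(CV_A, CV_I, closeness_pct=5.0)
```

Because the number of samples grows with the square of the combined variation, an analyte with high intraindividual variation quickly requires an impractical number of baseline samples, which is the point made above for C-reactive protein.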

Reference range studies In many clinical trials of serum biomarkers, especially risk stratification studies, results are divided into “bins” based on biomarker concentrations, for example, tertiles, quartiles, quintiles, and others. Absent from many of these reports is the establishment or documentation of a reference range. In evaluating the biomarker's effectiveness or for an FDA submission, the reference range is essential documentation. The reference range is used to estimate the frequency of abnormal results within a population. If most of the bins fall within the reference range, the biomarker will be of less value because the number of subjects affected by the test will necessarily be small. In the Dallas Heart Study, cardiac troponin T was used as a marker in the general population using the standard- versus high-sensitivity assays. 37 Results for the standard assay were not broken down into bins because only 0.7% of the subjects had results above the assay's detection limit. When using the high-sensitivity assay, 25% of subjects had detectable results, so it was appropriate to divide the subjects into at least quintiles. The upper quintile was established at the 99th percentile cutoff limit of the reference range, which was necessary to put results from this trial into the proper context of routine testing. The establishment of a reference range is not always straightforward. For cardiac troponin, finding subjects that are free from preexisting asymptomatic cardiovascular disease is an important determinant in the reference range assessments. Collinson et al 38 demonstrated that the elimination of subjects with left ventricular dysfunction, hypertension, and vascular and renal dysfunction reduced the value of the 99th percentile for the population they tested. The number of subjects needed to determine the reference population is dependent on covariates. If there are differences in age, gender, or ethnicities for a biomarker, the minimum number of healthy subjects increases. The Clinical and Laboratory Standards Institute recommends a minimum of 120 subjects for each group to be


tested. 39 If age is broken down into decades, this greatly increases the sample size needed to characterize the reference population. Thus, when a reference range is given, the characterization of the reference population should be included. Recently, gender-specific reference ranges for cardiac troponin have been proposed 40 but have not yet been widely implemented. Although there are also differences in natriuretic peptide concentrations between men and women, no separate reference ranges have been proposed in routine clinical practice to date.
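The sample size arithmetic and the percentile cutoff discussed in this section can be sketched as follows. This is a minimal illustration under stated assumptions: the percentile uses a simple rank-based (nonparametric) estimate with linear interpolation, the partitioning scheme is hypothetical, and the only figure taken from the text is the minimum of 120 subjects per group.

```python
import math


def nonparametric_percentile(values, pct):
    """Rank-based percentile: rank = pct/100 * (n + 1), linearly interpolated."""
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("no data")
    rank = pct / 100.0 * (n + 1)
    if rank <= 1:
        return data[0]
    if rank >= n:
        return data[-1]
    lower = math.floor(rank)          # 1-based rank of the lower neighbor
    frac = rank - lower
    return data[lower - 1] + frac * (data[lower] - data[lower - 1])


def minimum_subjects(levels_per_covariate, per_group=120):
    """Minimum healthy subjects when every combination of covariates is a group."""
    return per_group * math.prod(levels_per_covariate)


# Hypothetical partitioning: 2 sexes x 5 age decades.
needed = minimum_subjects([2, 5])

# Hypothetical reference results (arbitrary units) for a single group.
reference_results = list(range(1, 200))
cutoff_99 = nonparametric_percentile(reference_results, 99)
```

Partitioning by sex and by 5 age decades already raises the minimum from 120 to 1200 healthy subjects in this sketch, which shows why the characterization of the reference population matters when a reference range is reported.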

Postanalytical variables Standardization and harmonization of assays “Standardization of assays” involves the development of a reference method and creation of an accepted reference material to calibrate the method. International bodies such as the World Health Organization, National Institute on Standards (United States), Institute for Reference Materials and Methods (European Commission), and International Federation of Clinical Chemistry are responsible for standardization of clinical laboratory tests. Standardization enables results from one test to be directly comparable to another. Examples of clinical tests that are standardized across the world include electrolytes, creatinine, cholesterol, and hemoglobin A1c. “Harmonization of assays” is the process of converting test results despite differences in standardization, so that they can be used interchangeably in clinical practice. 41 If 2 assays are harmonized, results from one can be converted to the other through the application of an equation derived from a regression analysis. An important part of harmonization is the “commutability,” which is a property of the reference materials chosen for a test. Figure 2 shows regression analysis of test results from 2 assays of the same analyte. In panel A, there is concordance of results from standards and samples, and these assays are commutable. Panel B shows concordance between samples, there are biases for samples, and these assays are not commutable and test results cannot be harmonized. Noncommutability is due to differences in the response of the assay toward the matrix used for the reference standard or calibrator. Standardization committees must conduct commutability studies in selection of the best matrix for a reference material. There may also be differences in test results between patients. This may be due to differences in assay response to in vitro constituents such as drugs or other atypically encountered proteins. 
These differences must be resolved through the selection of reagents used to produce the test kit. For tests that are well established, for example, troponin, this is unlikely to happen. 42 Although much work is devoted to harmonization and standardization, this work is conducted for tests that are in widespread clinical use and is not performed for novel biomarkers. As a result, manufacturers of diagnostic


Figure 2

Commutability of results for calibrators used in 2 assays. A, Concordance between clinical samples and reference materials. B, Nonconcordance between clinical samples and reference materials. Used with permission from Clin Chem 2011;57:1108-1117. 41

tests will use different reagents that will produce nonconcordant results. This is particularly problematic for immunoassays, as the performance of a test depends on the quality of the antibody and the location of the epitope against which it is raised. Even when the biomarker has been used for many years, for example, cardiac troponin I or BNP, standardization is not available. Thus, when clinical studies are reported using different methods, absolute results cannot be compared. Depending on how the assay is constructed, this can produce heterogeneity of conclusions.
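The regression-based conversion described above for harmonized assays can be illustrated with a minimal sketch. The paired results, the function names, and the 3-SD acceptance limit are all hypothetical and chosen only for illustration; ordinary least squares is used for brevity, although methods that allow for measurement error in both assays (such as Deming regression) are often preferred for method comparison. The sketch fits a line to clinical samples measured by both assays, converts an assay A result to the assay B scale, and checks whether a candidate reference material falls close to the patient-sample line, which is the behavior expected of a commutable material.

```python
from statistics import mean

# Hypothetical paired results for the same patient samples on two assays
# (arbitrary units; illustrative values only, not data from any study).
assay_a = [10.0, 20.0, 40.0, 80.0, 160.0]
assay_b = [26.0, 44.0, 86.0, 164.0, 326.0]


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def convert(result_a, slope, intercept):
    """Express an assay A result on the assay B scale."""
    return slope * result_a + intercept


def residual_sd(x, y, slope, intercept):
    """Scatter of the patient samples around the fitted line."""
    squared = [(yi - convert(xi, slope, intercept)) ** 2 for xi, yi in zip(x, y)]
    return (sum(squared) / (len(x) - 2)) ** 0.5


def looks_commutable(ref_a, ref_b, slope, intercept, sd, k=3.0):
    """True if a reference material lies within k SDs of the patient-sample line.

    The k = 3 limit is an assumed criterion for this sketch only.
    """
    return abs(ref_b - convert(ref_a, slope, intercept)) <= k * sd


slope, intercept = fit_line(assay_a, assay_b)
sd = residual_sd(assay_a, assay_b, slope, intercept)

print(f"B = {slope:.3f} * A + {intercept:.3f}")
print("A result of 50 on the B scale:", round(convert(50.0, slope, intercept), 1))
print("Material (50, 105) commutable:", looks_commutable(50.0, 105.0, slope, intercept, sd))
print("Material (50, 140) commutable:", looks_commutable(50.0, 140.0, slope, intercept, sd))
```

In this toy example, the first material behaves like the clinical samples (as in Figure 2, panel A), whereas the second shows a bias that the patient-sample regression cannot explain (as in panel B).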

Correlation of results to other markers In reviewing biomarker data from clinical trials, it is essential to examine how the novel biomarker compares to existing biomarkers. Many markers when used alone can demonstrate clinical value. However, when the results are combined with other biomarkers, especially those that are established in routine practice, the value of the novel biomarker may lose statistical significance. Study funding and the volume of sample remaining may limit the number of biomarkers that can be tested from any given cohort. At a minimum, the results of biomarkers used in routine clinical practice (ie, troponin, natriuretic peptides, and C-reactive protein where appropriate) should be included in the statistical analysis. In a multicenter study, these routine tests ideally should be performed in a central core laboratory using the same testing methodology. If this is not available or practical, results from the “local” laboratory may be sufficient. Given the lack of standardization for even the most widely used assays, it is important for biomarker publications to state the manufacturer of any commercial assays used and, where applicable, the specific generation of the assay. A novel biomarker may show superiority to an earlier generation of an assay, for example, cardiac troponin, but that diagnostic advantage may be lost when contemporary or next-generation versions of the predicate assay are used.

Correlation of results to discharge diagnoses and clinical outcomes Depending on the intended use of the cardiac biomarker, documentation of the clinical utility of novel biomarkers is required with regard to the discharge diagnoses and/or clinical outcomes. For diagnosis, this might include acute MI using the latest definitions or HF from the latest international guidelines. Risk stratification outcomes typically include cardiovascular death, all-cause death, MI, HF readmission, and need for revascularization (eg, percutaneous coronary intervention or bypass graft surgery). If the biomarker is to be submitted to the FDA, independent adjudication of the clinical diagnosis and outcome is required by qualified clinical experts who are blinded to the experimental laboratory data. 43 Typically, there are 2 adjudicators, with a third physician brought in to break a tie between the opinions of the first 2. Laboratory tests from the local clinical laboratory that are used as part of clinical care can be included in making the final discharge diagnoses. Cardiac troponin is the established biomarker for diagnosis of MI. Clinical trials involving next-generation troponin assays will necessarily produce clinically concordant results if there is analytical concordance between test results. Troponin assays that are analytically more sensitive, resulting in the use of lower cutoff concentrations, will produce clinical specificities that are reduced relative to adjudicated MI diagnoses. If, however, the clinical end point is “cardiac injury” and not MI, the clinical specificity of high-sensitivity troponin assays will be superior to that of the earlier, less sensitive troponin assays. Adherence to STARD guidelines Researchers and clinical trialists should be aware of the subtle nuances regarding clinical laboratory tests, when


Table III. STARD checklist for reporting of studies of diagnostic accuracy

Title/abstract/keywords
1. Identify the article as a study of diagnostic accuracy (recommend MeSH heading “sensitivity and specificity”).

Introduction
2. State the research questions or study aims, such as estimating diagnostic accuracy or comparing accuracy between tests or across participant groups.

Methods: Participants
3. Describe the study population: The inclusion and exclusion criteria, setting, and locations where the data were collected.
4. Describe participant recruitment: Was recruitment based on presenting symptoms, results from previous tests, or the fact that the participants had received the index tests or the reference standard?
5. Describe participant sampling: Was the study population a consecutive series of participants defined by the selection criteria in items 3 and 4? If not, specify how participants were further selected.
6. Describe data collection: Was data collection planned before the index test and reference standard were performed (prospective) or after (retrospective study)?

Methods: Test methods
7. Describe the reference standard and its rationale.
8. Describe technical specifications of material and methods involved including how and when measurements were taken, and/or cite references for index tests and reference standard.
9. Describe definition of and rationale for the units, cutoffs, and/or categories of the results of the index tests and the reference standard.
10. Describe the number, training, and expertise of the persons executing and reading the index tests and the reference standard.
11. Describe whether the readers of the index tests and reference standard were blind (masked) to the results of the other test and describe any other clinical information available to the reader.

Methods: Statistical methods
12. Describe methods for calculating or comparing measures of diagnostic accuracy and the statistical methods used to quantify uncertainty (eg, 95% CI).
13. Describe methods for calculating test reproducibility, if done.

Results: Participants
14. Report when study was done, including beginning and ending dates of recruitment.
15. Report clinical and demographic characteristics of the study population (eg, age, sex, spectrum of presenting symptoms, comorbidity, current treatments, recruitment centers).
16. Report the number of participants satisfying the criteria for inclusion that did or did not undergo the index tests and/or the reference standard; describe why participants failed to receive either test (a flow diagram is strongly recommended).

Results: Test results
17. Report time interval from the index tests to the reference standard, and any treatment administered between.
18. Report distribution of severity of disease (define criteria) in those with the target condition; other diagnoses in participants without the target condition.
19. Report a cross tabulation of the results of the index tests (including indeterminate and missing results) by the results of the reference standard; for continuous results, the distribution of the test results by the results of the reference standard.
20. Report any adverse events from performing the index tests or the reference standard.

Results: Estimates
21. Report estimates of diagnostic accuracy and measures of statistical uncertainty (eg, 95% CI).
22. Report how indeterminate results, missing responses, and outliers of the index tests were handled.
23. Report estimates of variability of diagnostic accuracy between subgroups of participants, readers or centers, if done.
24. Report estimates of test reproducibility, if done.

Discussion
25. Discuss the clinical applicability of the study findings.

planning and reporting results of biomarker studies conducted within clinical trials. To assist with these issues, a group of clinical scientists and journal editors has created a checklist called the “Standards for Reporting of Diagnostic Accuracy” (STARD). 44 This checklist contains 25 items that the STARD committee felt were minimal requirements for reporting on the diagnostic value of biomarkers (Table III). These items are broken down into 5 sections that discuss requirements for conducting and publishing results of a clinical trial, including the title/abstract/keywords, introduction, methods, results, and discussion. Individuals planning trials will find the checklist useful in complying with guidelines.
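Two of the STARD items in Table III, the cross tabulation of index test results against the reference standard and the reporting of accuracy estimates with 95% CIs, lend themselves to a short worked sketch. The concentrations, adjudicated diagnoses, cutoffs, and function names below are hypothetical, and the Wilson score interval is only one of several acceptable ways to express uncertainty; STARD does not prescribe a particular method. The example also illustrates the point made earlier that lowering the cutoff concentration tends to raise sensitivity at the expense of specificity against adjudicated MI.

```python
from math import sqrt

# Hypothetical index-test concentrations (ng/L) and adjudicated MI status.
concentrations = [3, 5, 8, 12, 15, 20, 30, 45, 60, 120]
adjudicated_mi = [False, False, False, False, False, True, False, True, True, True]


def cross_tab(values, disease, cutoff):
    """Return (TP, FP, FN, TN), calling a result >= cutoff positive."""
    tp = fp = fn = tn = 0
    for value, has_disease in zip(values, disease):
        positive = value >= cutoff
        if positive and has_disease:
            tp += 1
        elif positive:
            fp += 1
        elif has_disease:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 for roughly 95%)."""
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


def accuracy(values, disease, cutoff):
    """Sensitivity and specificity, each with a Wilson interval."""
    tp, fp, fn, tn = cross_tab(values, disease, cutoff)
    return {
        "sensitivity": tp / (tp + fn),
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity": tn / (tn + fp),
        "specificity_ci": wilson_interval(tn, tn + fp),
    }


# Compare a higher cutoff with a lower one (both values are assumptions).
for cutoff in (40, 14):
    print(cutoff, cross_tab(concentrations, adjudicated_mi, cutoff),
          accuracy(concentrations, adjudicated_mi, cutoff))
```

With so few subjects the intervals are very wide, which is itself a reminder of why the checklist asks for measures of statistical uncertainty alongside the point estimates.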

Conclusions The commercial success of cardiac troponin and the natriuretic peptides has stimulated research in novel biomarkers that have the potential to address unmet clinical needs. This review focused on validation studies needed for protein-based biomarkers (peptide hormones, precursor proteins, receptor ligands, and enzymes).


Table IV. Types of cardiac biomarkers

Analyte: Examples
Structural proteins: Troponin, myoglobin
Enzymes: Creatine kinase, myeloperoxidase
Receptor ligands: CD40L, sST2
Hormones: BNP, insulin
Precursor proteins: NT-proBNP, copeptin
Metabolites: Low-molecular-weight analytes, eg, citric acid pathways for myocardial ischemia 45
mRNA: Gene expression array, eg, T-lymphocyte signaling and inflammatory genes for cardiovascular disease risk 46
DNA: Germline mutations, eg, 9p21 for acute MI 47
Pharmacogenomics: CYP2C19 for clopidogrel, CYP2C9/VKORC1 for warfarin

Abbreviation: mRNA, messenger RNA.

These validations are also applicable to the other biomarkers listed in Table IV, including metabolites and nucleic acid-based molecular biomarkers (messenger RNA, DNA, and pharmacogenomic tests). Proteins are more fragile than nucleic acids or metabolites and require more attention with regard to analyte stability. Molecular markers usually do not require quantitation. DNA/RNA analysis is evolving and not standardized. Relative to protein analysis, the quality of the result is more dependent on the detection scheme used (eg, single mutation detection analysis vs sequencing).

References
1. Vasan RS. Biomarkers of cardiovascular disease. Molecular basis and practical considerations. Circulation 2006;113:2335-62.
2. Weinberg EO, Shimpo M, De Keulenaer GW, et al. Expression and regulation of ST2, an interleukin-1 receptor family member, in cardiomyocytes and myocardial infarction. Circulation 2002;106:2961-6.
3. Sanada S, Hakuno D, Higgins LJ, et al. IL-33 and ST2 comprise a critical biomechanically induced and cardioprotective signaling system. J Clin Invest 2007;117:1538-49.
4. Morrow DA, Braunwald E. Future of biomarkers in acute coronary syndromes: moving towards a multimarker strategy. Circulation 2003;108:250-2.
5. Jacobs TW, Gown AM, Yaziji H, et al. Specificity of HercepTest in determining HER-2/neu status of breast cancers using the United States Food and Drug Administration–approved scoring system. J Clin Oncol 1999;17:1983-7.
6. Lainchbury JG, Troughton RW, Strangman KM, et al. N-terminal pro-B-type natriuretic peptide–guided treatment for chronic heart failure: results from the BATTLESCARRED Trial. JACC 2010;55:53-60.
7. Starnberg K, Jeppsson A, Lindahl B, et al. Revision of the troponin T release mechanism from damaged human myocardium. Clin Chem 2014;60:1098-104.

8. Dries DL, Ky B, Wu AH, et al. Simultaneous assessment of unprocessed proBNP1-108 in addition to processed BNP32 improves identification of high-risk ambulatory patients with heart failure. Circ Heart Fail 2010;3:220-7.
9. Lund J, Wittfooth S, Qin QP, et al. Free vs. total pregnancy-associated plasma protein A (PAPP-A) as a predictor of 1-year outcome in patients presenting with non–ST-elevation acute coronary syndrome. Clin Chem 2010;56:1158-65.
10. Jaffe AS, Van Eyk JE. Degradation of cardiac troponins: implications for clinical practice. In: Morrow DA, ed. Cardiovascular biomarkers. Totowa, NJ: Humana Press; 2006. p. 261-74.
11. Luckenbill KN, Jaffe AS, Mair J, et al. Cross-reactivity of BNP, NT-proBNP and proBNP in commercial NT-proBNP assays: a substudy from the IFCC Committee for Standardization of Markers of Cardiac Damage. Clin Chem 2008;54:619-21.
12. Dorizzi RM, Caputo M, Ferrari A, et al. Comparison of serum and heparin-plasma samples in different generations of Dimension troponin I assay. Clin Chem 2003;49:2294-6.
13. Shih J, Datwyler SA, Hsu SC, et al. Effect of collection tube type and preanalytical handling on myeloperoxidase concentrations. Clin Chem 2008;54:1076-9.
14. Tertti R, Wittfooth S, Porela P, et al. Intravenous administration of low molecular weight and unfractionated heparin elicits a rapid increase in serum pregnancy-associated plasma protein A. Clin Chem 2009;55:1214-7.
15. Halldorsdottir AM, Stoker J, Porce-Sorbet R, et al. Soluble CD40 ligand measurement inaccuracies attributable to specimen type, processing time, and ELISA method. Clin Chem 2005;51:1054-7.
16. Lu Q, Hofferbert BV, Koo G, et al. In vitro shear stress-induced platelet activation: sensitivity of human and bovine blood. Artif Organs 2013;37:894-903.
17. Meybohm P, Zacharowski K, Weber CF. Point-of-care coagulation management in intensive care medicine. Crit Care 2013;17:218-27.
18. Mueller T, Gegenhuber A, Dieplinger B, et al. Long-term stability of endogenous B-type natriuretic peptide (BNP) and amino terminal proBNP (NT-proBNP) in frozen plasma samples. Clin Chem Lab Med 2004;42:942-4.
19. Belenky A, Smith A, Zhang B, et al. The effect of class-specific protease inhibitors on the stabilization of B-type natriuretic peptide in human plasma. Clin Chim Acta 2004;340:163-72.
20. Wu AHB, Feng YJ, Moore R, et al. Characterization of cardiac troponin subunit release into serum following acute myocardial infarction, and comparison of assays for troponin T and I. Clin Chem 1998;44:1198-208.
21. Apple FS, Murakami MM, Christenson RH, et al. Analytical performance of the i-STAT cardiac troponin I assay. Clin Chim Acta 2004;345:123-7.
22. Wu AHB, Shea E, Lu QT, et al. Short- and long-term cardiac troponin I analyte stability in plasma and serum from healthy volunteers by use of an ultrasensitive, single-molecule counting assay. Clin Chem 2009;55:2057-9.
23. Agarwal SK, Avery CL, Ballantyne CM, et al. Sources of variability in measurements of cardiac troponin T in a community-based sample: the Atherosclerosis Risk in Communities Study. Clin Chem 2011;57:891-7.
24. Apple FS, Collinson PO. Analytical characteristics of high-sensitivity cardiac troponin assays. Clin Chem 2012;58:54-61.
25. Kitabchi AE. Proinsulin and C-peptide: a review. Metabolism 1977;26:547-87.
26. Morgenthaler NG, Struck J, Alonso C, et al. Assay for the measurement of copeptin, a stable peptide derived from the precursor of vasopressin. Clin Chem 2006;52:112-9.


27. Morgenthaler NG, Struck J, Alonso C, et al. Measurement of midregional proadrenomedullin in plasma with an immunoluminometric assay. Clin Chem 2005;51:1823-9.
28. Morgenthaler NG, Struck J, Thomas B, et al. Immunoluminometric assay for the midregion of pro-atrial natriuretic peptide in human plasma. Clin Chem 2004;50:234-6.
29. Collinson PO, Clifford-Mobley O, Gaze D, et al. Assay imprecision and 99th percentile reference value of a high-sensitivity cardiac troponin I assay. Clin Chem 2009;55:1433-4.
30. Clerico A, Zaninotto M, Prontera C, et al. State of the art of BNP and NT-proBNP immunoassays: the CardioOrmoCheck study. Clin Chim Acta 2012;414:112-9.
31. Krouwer JS, Astles JR, Cooper WG, et al. Estimation of total analytical error for clinical laboratory methods. Guideline EP21-A. Wayne, PA: Clinical and Laboratory Standards Institute; 2003.
32. Fraser CG. Biological variation: from principles to practice. Washington, DC: AACC Press; 2001.
33. Wu AHB, Lu A, Todd J, et al. Short- and long-term biological variation in cardiac troponin I with a high-sensitivity assay: implications for clinical practice. Clin Chem 2009;55:52-8.
34. Keller T, Zeller T, Ojeda F, et al. Serial changes in highly sensitive troponin I assay and early diagnosis of myocardial infarction. JAMA 2011;306:2684-93.
35. Ricos C, Alvarez V, Cava F, et al. Current databases on biological variation: pros, cons and progress. Scand J Clin Lab Invest 1999;59:491-500.
36. Fraser CG. Test result variation and the quality of evidence-based clinical guidelines. Clin Chim Acta 2004;346:19-24.
37. De Lemos JA, Drazner MH, Omland T, et al. Association of troponin T detected with a highly sensitive assay and cardiac structure and mortality risk in the general population. JAMA 2010;304:2503-15.


38. Collinson PO, Heung YM, Gaze D, et al. Influence of population selection on the 99th percentile reference value for cardiac troponin assays. Clin Chem 2012;58:219-25.
39. Clinical and Laboratory Standards Institute. Defining, establishing, and verifying reference intervals in the clinical laboratory; approved guideline. CLSI document C28-A3. 3rd ed. Wayne, PA: Clinical and Laboratory Standards Institute; 2008.
40. Gore MO, Seliger SL, de Filippi CR, et al. Age- and sex-dependent upper reference limits for the high-sensitivity troponin T assay. J Am Coll Cardiol 2014;63:1441-8.
41. Miller WG, Myers GL, Gantzer ML, et al. Roadmap for harmonization of clinical laboratory measurement procedures. Clin Chem 2011;57:1108-17.
42. Apple FS. Counterpoint: standardization of cardiac troponin I assays will not occur in my lifetime. Clin Chem 2012;58:169-71.
43. Apple FS, Hollander J, Wu AHB, et al. Improving the 510(k) process for cardiac troponin assays: in search for common ground. Clin Chem 2014;60:1273-5.
44. Bossuyt PM, Reitsma JB, Bruns D, et al. The STARD statement for reporting studies of diagnostic accuracy: explanation and elaboration. Clin Chem 2003;49:7-18.
45. Sabatine MS, Liu E, Morrow DA, et al. Metabolomic identification of novel biomarkers of myocardial ischemia. Circulation 2005;112:3868-75.
46. Kim J, Ghasemzadeh N, Eapen DJ, et al. Gene expression profiles associated with acute myocardial infarction and risk of cardiovascular death. Genome Med 2014;6. [in press].
47. Fan M, Dandona S, McPherson R, et al. Two chromosome 9p21 haplotype blocks distinguish between coronary artery disease and myocardial infarction risk. Circ Cardiovasc Genet 2013;6:372-80.
