Article

Adverse event load, onset, and maximum grade: A novel method of reporting adverse events in cancer clinical trials

Clinical Trials 1–10. © The Author(s) 2020. Article reuse guidelines: sagepub.com/journals-permissions. DOI: 10.1177/1740774520959313. journals.sagepub.com/home/ctj

Guilherme S Lopes1 , Christophe Tournigand2, Curtis L Olswold1, Romain Cohen1,9, Emmanuelle Kempf2, Leonard Saltz3, Richard M Goldberg4, Herbert Hurwitz5, Charles Fuchs6,7, Aimery de Gramont8 and Qian Shi1

Abstract

Background: Current adverse event reporting practices do not document longitudinal characteristics of adverse events, and alternative methods are not easily interpretable and have not been employed by clinical trials. Introducing time parameters in the evaluation of safety that are comprehensive yet easily interpretable could allow for a better understanding of treatment quality. In this study, we developed and applied a novel adverse event reporting method based on longitudinal adverse event changes to aid in describing, summarizing, and presenting the adverse event profile. We termed it the "Adverse Event Load, Onset, and Maximum Grade" method.

Methods: We developed two adverse event summary metrics to complement the traditional maximum grade report. Onset time indicates the time period in which the maximum grade for a specific adverse event occurred and was defined as "early" (i.e. maximum grade happened for the first time before 6 weeks) or "late" (i.e. after the 6th week). Adverse event load indicates the overall severity of a specific adverse event over the entire treatment. Higher adverse event load indicates a worse overall experience. These metrics can be calculated for adverse events with different maximum grades, in treatments with planned changes (e.g. dosage changes), using data sets with different numbers of adverse event data points between treatments (e.g. treatments with longer cycle lengths may have fewer adverse event data points) and on data sets with different adverse event data availability (e.g. cycle basis and patient-outcome reports). We tested the utility of this method using individual patient data from two major backbone therapies ("Irinotecan" and "Oxaliplatin") from the N9741 trial available in the Fondation ARCAD database (fondationarcad.org). We investigated profiles of diarrhea, neutropenia/leukopenia, and nausea/vomiting.

Results: Our method provided additional information compared to traditional adverse event reports. For example, for nausea/vomiting, while patients in Irinotecan had a higher risk of experiencing maximum grade 3–4 (15.6% vs 7.6%, respectively; p < 0.001), patients in both groups experienced similar severity over time (adverse event load = 0.102 and 0.096, respectively; p = 0.26), suggesting that patients in Oxaliplatin experienced a lower-grade but more persistent nausea/vomiting. For neutropenia/leukopenia, more patients in Irinotecan experienced their maximum grade for the first

1 Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
2 Hospital Henri Mondor, Paris-Est Créteil University, Créteil, France
3 Memorial Sloan Kettering Cancer Center, New York, NY, USA
4 Mary Babb Randolph Cancer Center, West Virginia University Cancer Institute, Morgantown, WV, USA
5 Duke University Medical Center, Durham, NC, USA
6 Yale Cancer Center, New Haven, CT, USA
7 Dana-Farber Cancer Institute, Boston, MA, USA
8 Department of Medical Oncology, Franco-British Institute, Levallois-Perret, France
9 Sorbonne Université, Department of Medical Oncology, Saint-Antoine hospital, AP-HP, F-75012 Paris, France

Corresponding author: Guilherme S Lopes, Department of Health Sciences Research, Mayo Clinic, 200 First Street SW, Rochester, MN 55905, USA. Email: [email protected]


time early in the treatment compared to patients in Oxaliplatin (67.9% vs 41.7%; p < 0.001), regardless of maximum grade. Longitudinal information can help compare treatments or guide clinicians on choosing appropriate interventions for low-grade but persistent adverse events or early adverse event onset.

Conclusion: We developed an adverse event reporting method that provides clinically relevant information about treatment toxicity by adding two longitudinal adverse event metrics to the traditional maximum grade approach. Future research should establish clinical benchmarks for metrics included in this adverse event reporting method.

Keywords: Adverse events, safety, toxicity, longitudinal analysis, clinical trials, adverse event load, onset time

Introduction The major goal of cancer clinical trials is to evaluate the efficacy and safety of tested therapies. While efficacy criteria evolved with the science of clinical trials conduct, the evaluation of adverse events (AEs) through the National Cancer Institute’s Consensus Toxicity Criteria for Adverse Events criteria has remained relatively static. Safety reporting in clinical trials mainly focuses on the percentage of patients who experience AEs by grade at least once over the course of their treatment. However, there are important limitations to the current AE reporting method. In particular, the traditional maximum grade approach to reporting AEs captures acute but not chronic events occurring for multiple days.1 Long-lasting low-grade AEs can significantly impact quality of life and can be more troubling than one occurrence of a grade 3–4 AE.2 In addition, the traditional AE reporting method does not capture whether patients experienced grade 3–4 early on versus later in the treatment. This information can help clinical decision-making by anticipating the need for an intervention. Introducing time parameters in the evaluation of safety could allow for a better approach of the treatment quality and improve the comparison between treatment arms. Previous research has attempted to challenge traditional methods of AE reporting by presenting novel approaches that estimate weighted toxicity score,3 major safety risks,4 and AEs over time.5 These proposed approaches have some important limitations. First, most approaches do not capture toxicity changes over time. Second, one approach quantifies changes over time by conducting area under the curve (AUC) analyses, but the absolute AUC values are not easily interpretable by clinicians or comparable across studies or grading systems. 
Third, some methods presented in this previous research summarize AE data using special statistical tools and require considerable biostatistician efforts to program, such that reporting would have to be simplified and tailored based on the major needs of a given study or patient population. However, different safety reports lack efficiency and may not permit outputs to be compared or meta-analyzed. Fourth, the adoption of complex, lengthy longitudinal AE analyses

requires a substantial shift in how biostatisticians and clinicians report and interpret AEs, and therefore these analyses have not been consistently employed in past clinical trials and seem unlikely to be employed in future clinical trials. In this study, we address previous limitations in AE reporting methods by developing a comprehensive yet easily interpretable AE reporting method based on two AE summary metrics for individual patients to aid in describing, summarizing, and presenting the AE profile. We termed this method the "Adverse Event Load, Onset, and Maximum Grade" method.

Methods Development of AE summary metrics We developed two summary metrics to complement the maximum grade metric reported in clinical trials. The three metrics conjointly provide a comprehensive toxicity evaluation. First metric: maximum grade. Traditionally reported in clinical trials, the first AE metric is a patient’s maximum grade during the treatment. This is an ordinal variable (i.e. 0–5 as per the National Cancer Institute’s Consensus Toxicity Criteria for Adverse Events grading system6) that refers to the maximum grade experienced by a patient for a specific AE. At the individual patient level, this metric describes the worst grade experienced by a patient at any point in the treatment for a specific AE. At the treatment level, clinical trials generally report and compare summary statistics (frequencies and percentages) of grades 3–4 of specific AEs between treatments. In this study, we propose two AE summary metrics to complement this maximum grade approach. Second metric: onset time of maximum grade. The second metric indicates time of onset of the maximum grade for a specific AE. This is a categorical variable that reflects a pre-defined time period in which a patient experienced their maximum grade for the first time for a specific AE. For example, 12 weeks of treatment is important to show early tolerance as well as efficacy of


Figure 1. AEL calculation for an AE at a patient level using fictional patient data.

a new agent in advanced colorectal cancer. For individual patients, onset time of maximum grade could therefore be defined as "early onset" (i.e. patient's maximum grade happened for the first time within the first 12 weeks) or "late onset" (i.e. happened after the 12th week). This metric can also be organized into a different number of categories. It adds to traditional reports because it can detect early versus late onset of maximum grade among patients who experienced the same maximum grade for a specific AE. It can also provide insights on individual characteristics affecting toxicity (e.g. older patients may experience maximum grades earlier in the treatment).

Third metric: adverse event load. AE load is a metric that reflects the overall severity (or "load") of a specific AE experienced by a patient over the course of the entire treatment. AE load for a specific AE is calculated at an individual level and can be subsequently calculated at a treatment level for summary. AE load varies from 0 to 1, such that higher AE load for a specific AE indicates a worse overall experience (or a higher "load") of that AE for the patient (individual level) or for all patients of the treatment (treatment level). For example, a patient with AE load of 0.25 for nausea experienced 25% of the worst nausea possible across the entire treatment. Figure 1 describes the calculation of AE load for a specific AE at a patient level. The first step is to create a "timeline" of the AE (step 1). Next, one calculates the load of the AE by averaging its grades over all days of treatment (steps 2–4), then divides the patient's averaged grade for the AE by the highest grade preceding death in the grading system (i.e. 4 in the National Cancer Institute's Consensus Toxicity Criteria for Adverse Events grading system) (step 5). AE load can then be summarized for treatment-level analysis. AE load can be calculated for (and compared across) AEs with different worst grades and treatments with changes in the regimen (e.g. dosage or drug changes). It can also be calculated using data sets containing treatments with different numbers of AE data points (e.g. treatments with longer cycle lengths may have fewer AE data points) or on data sets with different AE data availability conditions (e.g. cycle basis and patient-outcome reports). Suggested rules to create AE timelines based on different AE data availability conditions are described in the Supplemental Material.

Summary. We introduced time parameters in the evaluation of toxicity by developing two AE summary metrics (i.e. onset time and AE load) that complement the traditional maximum grade metric and provide information about AE changes over time. Table 1 summarizes the characteristics of each AE metric. Because each metric captures a unique feature of toxicity, the conjoint interpretation of these metrics provides a comprehensive view of treatment toxicity. Figure 2 illustrates the utility of the conjoint interpretation of the metrics for toxicity evaluation. Figure 2(a) depicts two fictional patients who experienced identical maximum grade (i.e. 3) and identical onset of maximum grade (‘‘early’’), but different AE load (0.193 and 0.356). Figure 2(b) depicts two fictional patients who experienced identical maximum grade (i.e. 3), identical AE load (0.193), but different onset of maximum grade (‘‘early’’ and ‘‘late’’). Figure 2(c) depicts two fictional patients who experienced almost identical AE load (0.419), identical onset time (‘‘early’’), but different maximum grades (i.e. 4 and 2). In this study, we applied the three metrics to real


Table 1. Characteristics of AE metrics.

| Metric | Measurement level | Definition | Level of analysis: patient | Level of analysis: treatment (summary) |
| --- | --- | --- | --- | --- |
| Maximum grade^a | Ordinal | Maximum grade experienced in the treatment for a specific AE | Indicates the patient's highest grade for a specific AE | Indicates the number of patients who experienced each maximum grade for a specific AE |
| Onset time of maximum grade | Categorical | Pre-defined time period in the treatment that the maximum grade for a specific AE occurred | Indicates the time period in which the patient experienced his or her worst AE grade | Indicates the proportion of patients who experienced maximum AE grade within each pre-defined time period |
| AE load (AEL) | Continuous | Overall grade (or "load") of a specific AE over the course of the entire treatment | Describes how severe the AE was experienced by the patient over the entire treatment | Describes the average "load" of an AE experienced by all patients in the treatment |

AE: adverse event.
^a This metric is traditionally reported in oncology clinical trials.

Figure 2. Fictional scenarios comparing AE profiles at a patient level. Outlined numbers refer to the maximum grade in their first occurrence for each patient. Vertical gray line refers to onset cutoff point used to define ‘‘early’’ and ‘‘late’’ onset. AEL: adverse event load.


Table 2. Patient characteristics by treatment.

| Characteristic | Irinotecan (n = 270) | Oxaliplatin (n = 446) | p-value |
| --- | --- | --- | --- |
| Sex (%) | | | 0.201^a |
| Male | 97 (35.9) | 183 (41) | |
| Female | 173 (64.1) | 263 (59) | |
| Missing | 0 | 0 | |
| Age | | | 0.607^b |
| Mean | 59.62 | 60.09 | |
| SD | 11.25 | 11.74 | |
| Median | 61 | 60.5 | |
| Q1–Q3 | 53–68 | 52–69 | |
| Race (%) | | | 0.051^c |
| Caucasian | 243 (90) | 402 (90.1) | |
| Black | 24 (8.9) | 26 (5.8) | |
| Asian | 1 (0.4) | 13 (2.9) | |
| Other/unknown | 2 (0.7) | 5 (1.1) | |
| ECOG PS (%) | | | 0.715^a |
| 0 | 124 (45.9) | 195 (43.7) | |
| 1 | 95 (35.2) | 167 (37.4) | |
| 2 | 11 (4.1) | 22 (4.9) | |
| Missing | 40 | 62 | |
| Metastases (%) | | | 0.616^c |
| 0 | 2 (0.7) | 3 (0.7) | |
| 1 | 145 (53.7) | 222 (49.8) | |
| 2+ | 123 (45.6) | 221 (49.6) | |
| Missing | 0 | 0 | |
| Dosage changes (%) | | | 0.686^a |
| No | 95 (35.2) | 149 (33.4) | |
| Yes, planned | 175 (64.8) | 297 (66.6) | |
| Yes, unplanned | 0 (0) | 0 (0) | |
| Missing | 0 | 0 | |

SD: standard deviation; ECOG PS: Eastern Cooperative Oncology Group performance status. All patients had metastasis to the liver and to the lungs.
^a Pearson's chi-square test.
^b Wilcoxon rank sum test with continuity correction.
^c Pearson's chi-square test with simulated p-value based on 2000 replicates.

individual patient data to demonstrate the utility of the proposed AE reporting method.
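The three metrics can be sketched in a few lines of code for a single patient and a single AE. This is a minimal illustration rather than the authors' implementation: the function name `summarize_patient`, the representation of the timeline as one grade per treatment day, and the two fictional timelines are assumptions made for this example, and grade 5 (death) is not handled.

```python
MAX_NONFATAL_GRADE = 4  # highest grade preceding death in the grading system


def summarize_patient(daily_grades, onset_cutoff_day):
    """Return (maximum grade, onset label, AE load) for one AE in one patient.

    daily_grades[i] is the AE grade on treatment day i + 1 (the AE "timeline").
    """
    max_grade = max(daily_grades)
    if max_grade == 0:
        onset = None  # the AE never occurred, so there is no onset to classify
    else:
        first_day = daily_grades.index(max_grade) + 1  # first day at maximum grade
        onset = "early" if first_day < onset_cutoff_day else "late"
    average_grade = sum(daily_grades) / len(daily_grades)
    return max_grade, onset, average_grade / MAX_NONFATAL_GRADE


# Two fictional 60-day timelines with the same AE load but different maximum grades.
acute_high = [0] * 10 + [3] * 10 + [1] * 30 + [0] * 10
chronic_low = [1] * 60

print(summarize_patient(acute_high, onset_cutoff_day=42))   # (3, 'early', 0.25)
print(summarize_patient(chronic_low, onset_cutoff_day=42))  # (1, 'early', 0.25)
```

Both fictional patients have an AE load of 0.25, yet only the maximum grade separates a short grade 3 episode from a persistent grade 1 AE, which is why the metrics are meant to be read together.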

Application of AE summary metrics to real individual data Selection of real individual patient data. We applied the metrics to individual patient data from the North Central Cancer Treatment Group trial N9741 (NCT00003594), a randomized first-line phase III trial for patients with metastatic colorectal cancer available in the Fondation ARCAD database. The ARCAD database integrates individual patient-level data from a large collection of colorectal cancer clinical trials (fondationarcad.org). We selected the N9741 trial due to its high-quality data (e.g. low proportion of missing values, clear use of National Cancer Institute’s Consensus Toxicity Criteria for Adverse Events criteria, longitudinal AE data available). The N9741 trial includes three arms, which we categorized into two major treatment groups: ‘‘Irinotecan’’ (n = 276), which included the

Irinotecan, folinic acid, and fluorouracil treatment, and "Oxaliplatin" (n = 457), which included the treatments Oxaliplatin combined with 5-fluorouracil and folinic acid and Oxaliplatin combined with 5-fluorouracil. The purpose of applying this AE reporting method to real individual patient data is to demonstrate the utility of this method and not to provide a clinical benchmark of these metrics for specific treatments. Participants signed an institutional review board–approved, protocol-specific informed consent in accordance with federal and institutional guidelines.

Statistical analysis. Baseline patient characteristics were summarized using counts and percentages for nominal variables, and mean, standard deviation, median, and quartiles for real-valued variables. We investigated profiles of diarrhea, neutropenia/leukopenia, and nausea/vomiting because these are clinically relevant AEs in both Irinotecan- and Oxaliplatin-based treatments. We set the onset time threshold to 6 weeks because most severe AEs and treatment changes for metastatic colorectal cancer patients should have been addressed within 6 weeks. That is, for each AE, onset time was defined as "early" (i.e. maximum grade happened for the first time before the 42nd day, or 6 weeks) or "late" (i.e. happened on or after the 42nd day). For each AE, we compared groups in terms of occurrence of grade 3–4 and onset time of maximum grade using logistic regression, and in terms of AE load using linear regression. We included age, sex, number of metastases, and dosage changes as covariates in the regression models. A two-sided significance threshold of α = 0.05 was adopted for all analyses.
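As a rough illustration of how the patient-level metrics roll up to one treatment group, the sketch below computes unadjusted descriptive summaries only; the covariate-adjusted logistic and linear regressions described above are not reproduced. The function name `summarize_arm`, the toy timelines, and the choice to summarize onset time and AE load among patients with at least grade 1 are assumptions made for this example.

```python
def summarize_arm(timelines, cutoff_day=42, max_nonfatal_grade=4):
    """Unadjusted summary of one AE for one treatment group.

    timelines: one list of daily grades per patient (day 1 first).
    """
    # Assumption: onset time and AE load are summarized among patients with grade >= 1.
    affected = [t for t in timelines if max(t) >= 1]
    n_grade_3_4 = sum(1 for t in timelines if 3 <= max(t) <= 4)
    # "Early" means the maximum grade first occurred before the cutoff day.
    n_early = sum(1 for t in affected if t.index(max(t)) + 1 < cutoff_day)
    loads = [sum(t) / (len(t) * max_nonfatal_grade) for t in affected]
    return {
        "pct_grade_3_4": 100 * n_grade_3_4 / len(timelines),
        "pct_early_onset": 100 * n_early / len(affected) if affected else None,
        "mean_ae_load": sum(loads) / len(loads) if loads else None,
    }


# Four fictional patients followed for 60 days each.
toy_arm = [
    [0] * 60,                        # never experienced the AE
    [0] * 10 + [3] * 10 + [0] * 40,  # grade 3 from day 11 (early onset)
    [1] * 60,                        # persistent grade 1 from day 1 (early onset)
    [0] * 45 + [2] * 15,             # grade 2 from day 46 (late onset)
]
summary = summarize_arm(toy_arm)
print(summary)
```

In this toy group, one of four patients reaches grade 3–4, and two of the three affected patients have early onset.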

Results Patient characteristics Among the patients in the N9741 trial, 17 did not have AE data and were excluded from the analysis (n = 6 and n = 11 for Irinotecan and Oxaliplatin, respectively). Table 2 summarizes the characteristics of the patients included in our analyses (n = 716). The groups had similar proportions of patients in terms of sex, age, race, Eastern Cooperative Oncology Group performance status, number of metastases, and dosage changes. All patients had metastasis to the liver and to the lungs.

Nausea/vomiting

Table 3 summarizes differences between groups for each metric and each AE. The proportion of patients experiencing grade 3–4 nausea/vomiting was approximately twice as high in Irinotecan (15.6%) as in Oxaliplatin (7.6%; p < 0.001). However, patients in Irinotecan and Oxaliplatin experienced similar overall grades of nausea/vomiting over the course of the treatment (AE


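Table 3. AE profile of Irinotecan versus Oxaliplatin for different AEs.

| AE and metric | Irinotecan | Oxaliplatin | Adjusted p-value |
| --- | --- | --- | --- |
| Nausea/vomiting | | | |
| Total | n = 270 | n = 446 | |
| Maximum grade, n (%) | | | <0.001 |
| 1 | 86 (31.9) | 197 (44.2) | |
| 2 | 54 (20) | 97 (21.7) | |
| 3–4 | 42 (15.6) | 34 (7.6) | |
| Grade ≥1 | n = 182 | n = 328 | |
| AE load | | | 0.2614 |
| Mean | 0.102 | 0.096 | |
| SD | 0.09 | 0.07 | |
| Onset time, n (%) | | | 0.9644 |
| Early | 105 (57.7) | 190 (57.9) | |
| Late | 77 (42.3) | 138 (42.1) | |
| Neutropenia/leukopenia | | | |
| Total | n = 270 | n = 446 | |
| Maximum grade, n (%) | | | <0.001 |
| 1 | 24 (8.9) | 32 (7.2) | |
| 2 | 65 (24.1) | 96 (21.5) | |
| 3–4 | 123 (45.6) | 246 (55.2) | |
| Grade ≥1 | n = 212 | n = 374 | |
| AE load | | | 0.2471 |
| Mean | 0.173 | 0.187 | |
| SD | 0.14 | 0.11 | |
| Onset time, n (%) | | | <0.001 |
| Early | 144 (67.9) | 156 (41.7) | |
| Late | 68 (32.1) | 218 (58.3) | |
| Diarrhea | | | |
| Total | n = 270 | n = 446 | |
| Maximum grade, n (%) | | | <0.001 |
| 1 | 69 (25.6) | 155 (34.8) | |
| 2 | 48 (17.8) | 87 (19.5) | |
| 3–4 | 89 (33) | 66 (14.8) | |
| Grade ≥1 | n = 206 | n = 308 | |
| AE load | | | 0.0019 |
| Mean | 0.133 | 0.110 | |
| SD | 0.09 | 0.08 | |
| Onset time, n (%) | | | <0.001 |
| Early | 151 (73.3) | 170 (55.2) | |
| Late | 55 (26.7) | 138 (44.8) | |

AE: adverse event; SD: standard deviation. p-values adjusted for age, sex, number of metastases, and dosage changes. This table summarizes all metrics from our method.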

load = 0.102 and 0.096, respectively; p = 0.261). In addition, patients in Irinotecan and Oxaliplatin had similar risk of experiencing their maximum grade for the first time early in the treatment (57.7% and 57.9%, respectively; p = 0.964). Traditional AE report practices using our data would have documented only that a lower proportion of patients in Oxaliplatin experienced nausea/vomiting maximum grade 3–4 compared to Irinotecan. This novel AE reporting method adds to traditional AE reports by documenting that, despite patients in Irinotecan having a higher risk of experiencing grades 3–4, patients in both Irinotecan and Oxaliplatin experienced similar severity over time in nausea/vomiting (i.e. similar AE load), suggesting that patients in Oxaliplatin may have experienced a lower-grade but more persistent nausea/vomiting.

Neutropenia/leukopenia

More patients in Oxaliplatin experienced neutropenia/leukopenia maximum grade 3–4 compared to patients in Irinotecan (55.2% and 45.6%, respectively; p < 0.001). However, patients in Irinotecan and Oxaliplatin experienced similar overall grades of neutropenia/leukopenia over the course of the treatment (AE load = 0.173 and 0.187, respectively; p = 0.247). Traditional AE report practices would have documented only that more patients in Oxaliplatin experienced maximum grade 3–4. In this proposed AE reporting method, we also document that patients in Irinotecan experienced a lower-grade but more persistent neutropenia/leukopenia over the course of the treatment, as indicated by Irinotecan having similar AE load despite having a lower proportion of patients experiencing grades 3–4.

Among patients with maximum grade 2, patients in Oxaliplatin experienced higher AE load values over the course of the treatment (p = 0.014; Figure 3(a)). Traditional reports would have indicated that patients with maximum grade 2 in Irinotecan and Oxaliplatin experienced similar neutropenia/leukopenia. Our AE reporting method adds to traditional AE reports by documenting that neutropenia/leukopenia is more persistent in Oxaliplatin (compared to Irinotecan) among patients with maximum grade 2, a grade that is generally regarded as low but that can still affect quality of life. In addition, more patients in Irinotecan (67.9%) experienced their worst neutropenia/leukopenia for the first time early in the treatment (i.e. before the sixth week) compared to patients in Oxaliplatin (41.7%; p < 0.001), regardless of their maximum grade (Figure 3(b)). Onset time may provide insights about differences in the mechanism of action between treatments or about specific individual differences affecting the occurrence and severity of AEs within a treatment.

Diarrhea

More patients in Irinotecan experienced diarrhea maximum grade 3–4 compared to patients in Oxaliplatin (33% and 14.8%, respectively; p < 0.001). In addition, patients in Irinotecan experienced worse overall grades (i.e. higher AE load values) of diarrhea over the course of the treatment (AE load = 0.133) compared to patients in Oxaliplatin (AE load = 0.110; p = 0.002), and more patients in Irinotecan (73.3%) experienced their worst diarrhea for the first time early in the treatment compared to patients in Oxaliplatin (55.2%; p < 0.001). These results provide a more comprehensive

profile of diarrhea, revealing that diarrhea is worse in Irinotecan in terms of maximum grade, severity over time, and onset time across the entire treatment.

Figure 3. Onset time and AEL of neutropenia/leukopenia between groups by maximum grade: (a) neutropenia/leukopenia AEL by maximum grade and (b) neutropenia/leukopenia early onset by maximum grade. AEL: adverse event load. p-values adjusted for age, sex, number of metastases, and dosage changes. Outlined numbers at the bottom represent the sample size.

Discussion

In this study, we proposed a novel AE reporting method by developing two AE metrics that complement the traditional maximum grade summary. We also applied these metrics to real individual data to demonstrate that this reporting method adds to traditional AE reports by documenting important longitudinal aspects of toxicity. Here, we discuss benefits and challenges of implementing the proposed reporting method.

Onset time of maximum grade versus alternative onset time approaches

Onset time of maximum grade indicates when patients experienced their maximum grade over the course of the treatment. This information can help clinical decision-making by anticipating the need for an intervention or by providing support for maintenance strategies. Previous research has proposed different approaches to AE analysis that identify when the maximum grade occurred over the course of the treatment.5 ToxT analysis identifies time of onset by conducting time-to-event analysis using the Kaplan–Meier method, a methodology that is well understood by clinicians. However, we believe that our metric provides similar information about onset time while requiring substantially less computational effort and manuscript space. For example, one use of time-to-event analysis using the Kaplan–Meier method revealed that the onset of grade ≥2 diarrhea is earlier for patients given Irinotecan and Oxaliplatin than for those given Oxaliplatin combined with 5-fluorouracil and folinic acid: at 1 month, 26.2% of patients receiving Irinotecan and Oxaliplatin had grade 2 or worse diarrhea compared with 8.2% of those receiving Oxaliplatin combined with 5-fluorouracil and folinic acid.5 The use of our metric "onset time" provided similar information, yet is simpler to calculate, report, and interpret (i.e. it does not require the Kaplan–Meier method). For example, 67.9% of patients in Irinotecan experienced their worst neutropenia/leukopenia before the 6th week of treatment (compared to 41.7% of patients in Oxaliplatin), and this difference remained significant among patients with the same maximum grade.

AE load versus alternative severity over time approaches

AE load reflects the overall longitudinal severity (or "load") of a specific AE. This information can reveal long-lasting low-grade AEs and therefore can help guide clinicians on choosing appropriate interventions for low-grade but persistent AEs. Previous research has proposed techniques to capture chronic low-grade toxicity.5 For example, the ToxT approach uses AUC analysis. An application of AUC analysis revealed a higher mean diarrhea grade over time for Irinotecan and Oxaliplatin (AUC = 4.2; standard deviation (SD) = 5.2) compared to Oxaliplatin combined with 5-fluorouracil and folinic acid (AUC = 2.9; SD = 4.2; p < 0.001).5 However, because the absolute AUC values are not easily interpretable or comparable across studies, it is difficult to understand the clinical meaning of this difference or the actual severity of the AE experienced by patients in each treatment. In contrast, AE load varies from 0 to 1, with higher AE load indicating a worse overall experience of that AE over the course of the treatment. For example, one application of AE load revealed that patients in Irinotecan (AE load = 0.133) experienced worse overall grades of diarrhea over the course of the treatment compared to patients in Oxaliplatin (AE load = 0.110; p = 0.002). This means that patients in Irinotecan and Oxaliplatin experienced, respectively, 13.3% and 11% of the worst diarrhea possible (excluding death) over the entire treatment. Because of the normalized nature of AE load (i.e. it always varies from 0 to 1), the use of this metric allows for comparable interpretation across different types of AEs, number of cycles, agents, trials, and grading systems. Future research may establish clinical references of AE load values.
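The interpretability argument can be illustrated with a small sketch. The `grade_auc` function below is a deliberately simplified stand-in (a sum of one grade per day) and is not the ToxT implementation; both function names and the two fictional treatment courses are assumptions made for this example.

```python
def grade_auc(daily_grades):
    """Simplified area under the grade-time curve: one unit-width rectangle per day."""
    return sum(daily_grades)


def ae_load(daily_grades, max_nonfatal_grade=4):
    """Average daily grade divided by the highest grade preceding death."""
    return sum(daily_grades) / (len(daily_grades) * max_nonfatal_grade)


# The same constant grade 2 AE observed over a short and a long fictional course.
short_course = [2] * 30
long_course = [2] * 90

print(grade_auc(short_course), grade_auc(long_course))  # 60 180
print(ae_load(short_course), ae_load(long_course))      # 0.5 0.5
```

Under these assumptions the unnormalized area triples with the length of follow-up, while AE load stays at 0.5 because it is scaled by both the number of treatment days and the grading scale.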

The AE load calculation excludes death due to AE (i.e. grade 5) AE load is defined as the overall severity of an AE relative to the worst experience preceding death over the entire treatment. This definition excludes death because accurate AE load values require patients to be able to experience AE load = 1 while continuing the treatment. The inclusion of death (i.e. grade 5) would prevent AE load values from reaching 1 even if the patient died. In the National Cancer Institute’s Consensus Toxicity Criteria for Adverse Events grading system, specifically, the inclusion of death would cause AE load to not reach values above 0.8 unless death due to AE occurred. That is, the inclusion of death in the calculation of AE load would reduce the range of AE load while not providing benefits to its interpretation. Excluding death in the AE load calculation therefore ensures that AE load values accurately reflect the most severe AE over time to the extent that the patient can continue the treatment.
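The range argument above can be written out explicitly. The symbols are notation introduced here for illustration and do not appear in the article: g_t is the grade on treatment day t, T is the number of treatment days, and G is the divisor applied to the averaged grade.

```latex
\mathrm{AEL} = \frac{1}{G\,T} \sum_{t=1}^{T} g_t ,
\qquad g_t \in \{0, 1, 2, 3, 4\} \text{ while the patient remains on treatment}

G = 4:\quad 0 \le \mathrm{AEL} \le \frac{4T}{4T} = 1
\qquad\qquad
G = 5:\quad 0 \le \mathrm{AEL} \le \frac{4T}{5T} = 0.8
```

With G = 5, a patient who stays on treatment could never exceed 0.8, which is the reduced range described in the paragraph above.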

Evaluating toxicity using one single metric Although the conjoint interpretation of the three metrics provides a comprehensive view of toxicity, AE load is the preferred metric if toxicity is to be evaluated using a single metric. A single AE load value can reveal more about toxicity profile than the proportion of patients experiencing grades 3–4 or different onset times, rendering AE load as the most informative metric among all three metrics. This is because AE load is the only metric that alone can provide information about both time (acute vs chronic) and grade (low vs high) of toxicity profile. For example, maximum grade gives information about grade (low vs high), but it does not give information about time (acute vs chronic). High AE load values can only be experienced if there were consistent high grades over time (i.e. chronic high

grade AE). Moderate AE load values may indicate one of two scenarios and require the aid of the other metrics for a full interpretation of the toxicity profile. First, moderate AE load values can indicate an acute high-grade AE (the acute condition contributes to a decrease in AE load value, while occurrences of high grades increase AE load). Second, moderate AE load values can indicate a chronic low-grade AE (the chronic condition increases AE load, but low-grade occurrences decrease AE load). Finally, low AE load values can only indicate acute low-grade AE, that is, the occurrence of low-grade AE following (or followed by) absence of AE. In short, although AE load alone does not provide a complete view of toxicity, reporting AE load alone is more informative than reporting maximum grade or onset time alone.

Data imputation assumptions in treatments with different AE availability To create an AE timeline required for AE load calculation, the biostatistician needs to make assumptions to fill the ‘‘gaps’’ where there is no available AE data (see Supplemental Material). Therefore, the amount of available AE data in a treatment may affect the accuracy of AE load. In clinical trials recording AE data on a cycle basis, data imputation assumptions have a stronger effect in treatments with longer (vs shorter) cycle lengths because longer cycle lengths will have longer ‘‘gaps’’ between AE data points and therefore the AE timeline will have more assumed grades. The incorporation of patient-reported outcomes mitigates the influence of this assumption on the calculation of metrics because reporting of AE data is no longer limited to clinical visits. Nonetheless, future research may investigate the effect of AE data imputation assumptions on the accuracy of metrics when comparing treatments with different AE data availability.
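The dependence on cycle length can be made concrete with a sketch. The fill rule used here, in which the single grade reported for a cycle is assumed to hold on every day of that cycle, is a hypothetical rule chosen for illustration; the rules actually suggested by the authors are in the Supplemental Material and are not reproduced. The function name `timeline_from_cycles` is likewise an assumption.

```python
def timeline_from_cycles(cycle_grades, cycle_length_days):
    """Expand one reported grade per cycle into a daily AE timeline.

    Hypothetical fill rule: the reported grade is assumed for every day of its cycle.
    """
    timeline = []
    for grade in cycle_grades:
        timeline.extend([grade] * cycle_length_days)
    return timeline


reported = [0, 2, 1]  # one fictional AE grade per cycle

two_week_cycles = timeline_from_cycles(reported, cycle_length_days=14)
three_week_cycles = timeline_from_cycles(reported, cycle_length_days=21)

# Days filled in per reported data point: 14 versus 21.
print(len(two_week_cycles) // len(reported), len(three_week_cycles) // len(reported))
```

The same three reported grades cover 42 days in one case and 63 in the other, so each reported value stands in for more assumed days when cycles are longer.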

Calculating AE load for AEs with different worst grades AE load allows for comparison between AEs regardless of whether the AEs have different worst grades. To secure this feature, our recommendation is to divide the patient’s averaged AE grade by the highest grade below death in the grading system (i.e. 4 in the National Cancer Institute’s Consensus Toxicity Criteria for Adverse Events grading system). For example, the maximum grade for diarrhea is 5 and for dry mouth is 3 in the National Cancer Institute’s Consensus Toxicity Criteria for Adverse Events. The normalized nature of AE load allows for the comparison of diarrhea and dry mouth. In this example, AE load = 0.25 could be interpreted as 25% of the worst experience preceding death for diarrhea or dry mouth. Because the worst

grade for dry mouth is 3, AE load calculations for this AE would never reach above 0.75 (i.e. 3 divided by 4), thus reflecting accurately the National Cancer Institute's Consensus Toxicity Criteria for Adverse Events grading for dry mouth. In short, dividing the averaged grade by 4 for any AE allows this metric to have the same meaning across AEs.

Longitudinal metrics and changes in the regimen (e.g. dosage or drug changes) At a treatment level, AE load reflects the overall severity of an AE experienced by all patients receiving a specific treatment. We therefore recommend that the AE load calculation exclude grades occurring during time periods when the treatment was modified in an unplanned manner. For example, some patients may have the treatment’s backbone removed from their regimen due to excessive toxicity (e.g. Oxaliplatin removed from Oxaliplatin combined with 5-fluorouracil and folinic acid). If the backbone removal was specified by the trial protocol, then the time period in which the patient received the treatment without the backbone still reflects the treatment as per the protocol, and this time period should be included in the AE load calculation. However, if the backbone removal was unplanned and is considered a major treatment violation, then the AE experienced during this time period may not reflect the AE that a patient would experience if they were receiving the treatment as per the protocol. Because AE load reporting refers to an AE for a defined treatment, AEs experienced due to unplanned modifications of the treatment must be excluded. In short, the calculation of AE load should only take into account the AEs experienced during time periods in which patients received the treatment as per the protocol.
For the AE timeline, our recommendation is that if the observed change is not part of the regimen (e.g. dosage changed in an unplanned manner), then the AE timeline should omit the days of modified regimen. Removing these days creates ‘‘gaps’’ in the timeline, and AE grades in these gaps are treated as missing data in the calculation of AE load. The number of treatment days then reflects only the days on which the patient received the original treatment tested in the study, so that AE load accurately reflects the overall severity of an AE for that specific treatment. If the observed regimen modification is part of the treatment protocol, then the AE timeline should include the treatment days with modified regimen because the change was expected to occur.

Our reporting method versus alternative reporting methods Our proposed AE reporting method does not undermine the utility of alternative AE reporting methods. For example, the Toxicity Burden Score7 and the Weighted Toxicity Score3 summarize individual toxic effects into an overall score, such that grades of multiple AEs can be reflected in one single score. The ‘‘Adverse Event Load, Onset, and Maximum Grade’’ method does not replace these overall scores, but it adds to them by incorporating longitudinal characteristics of adverse events. This method is more parsimoniously reported than alternative longitudinal AE summary approaches, such that all results can be reported in just one table (Table 3) and one optional figure (Figure 3). In addition, there is a growing body of literature supporting the incorporation of patient-reported outcomes into cancer clinical trials.8 Patient-reported outcomes provide additional and valuable data to clinician-based National Cancer Institute’s Common Terminology Criteria for Adverse Events reports, notably for the duration of AEs.9,10 The development of electronic devices to collect patient-reported outcomes is improving the accuracy of toxicity data.11 These electronic devices lead to an accumulation of data that need to be summarized in an easy-to-interpret manner. The AE reporting method presented in this study can help describe, summarize, and present AE profiles based on patient-reported outcomes.

Limitations This study has limitations. First, the AE load summary gives the AE load of patients receiving very long treatments the same weight as the AE load of patients receiving only a few treatment days. Data points from patients receiving several more cycles than most patients in a treatment may therefore affect the accuracy of AE load. For this reason, we recommend the removal of outliers in terms of number of treatment days before summarizing AE load at a treatment level. Second, some patients who received a combination of chemotherapy drugs may have at one time stopped one drug and remained on treatment with the other drug alone for a period of ‘‘maintenance’’ time. We did not have information about (and therefore could not control for) patients who stopped taking one or more drugs during the treatment. However, this does not affect our ability to demonstrate the utility of the AE metrics using real individual data. Third, statistically significant differences that are not clinically relevant can always arise when comparing outcomes between treatments. Importantly, the addition of metrics in toxicity reports may increase the risk of detecting those differences. For this reason, the interpretation of toxicity differences between treatments must always take into account the clinical meaning of observed differences (e.g. effect sizes) beyond what is being revealed by the statistical results. Finally, the application of AE metrics to real individual patient data in this study is not appropriate for establishing clinical benchmarks because we conducted a retrospective analysis on data whose collection was not planned for the development of such benchmarks. We recommend that future clinical trials establish treatment benchmarks for onset time and AE load.
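The first limitation recommends removing outliers in number of treatment days before summarizing AE load, without prescribing a rule. The Python sketch below shows one possible way to do this, assuming an upper fence of Q3 + 1.5 × IQR on treatment days and a simple mean as the treatment-level summary; both choices, the function name, and the example patients are illustrative assumptions and are not the procedure used in this study.

```python
from statistics import mean, quantiles


def drop_duration_outliers(patients, k=1.5):
    """Keep patients whose treatment days do not exceed Q3 + k * IQR.

    `patients` is a list of (treatment_days, ae_load) pairs. The fence rule
    and k = 1.5 are illustrative assumptions.
    """
    days = [d for d, _ in patients]
    q1, _, q3 = quantiles(days, n=4)  # quartiles of treatment days
    upper_fence = q3 + k * (q3 - q1)
    return [(d, load) for d, load in patients if d <= upper_fence]


# Hypothetical patients: (treatment days, patient-level AE load).
patients = [(56, 0.10), (70, 0.20), (84, 0.15), (84, 0.30),
            (98, 0.25), (112, 0.20), (126, 0.35), (420, 0.05)]

kept = drop_duration_outliers(patients)
summary_all = mean([load for _, load in patients])
summary_kept = mean([load for _, load in kept])
print(len(patients) - len(kept), "patient(s) removed")
print(round(summary_all, 3), round(summary_kept, 3))
```

Other outlier definitions (for example, a percentile cutoff) would fit the same recommendation; the point of the sketch is only that the filtering happens before the treatment-level summary is computed.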

Summary In this study, we developed an AE reporting method that includes two longitudinal AE metrics in addition to the traditional maximum grade report to aid in describing, summarizing, and presenting AE profiles. This method provides clinically relevant longitudinal information about AEs and can be easily calculated, reported, and interpreted by biostatisticians and clinicians. We recommend the application of these metrics in future oncology clinical trials.

Author contributions G.S.L. is the Lead Biostatistician. C.T. is the Core Investigator. C.L.O. is the Biostatistician. R.C. is the Consultant. E.K. is the Consultant. L.S. is the Trial Contributor. R.M.G. is the Trial Contributor. H.H. is the Trial Contributor. C.F. is the Trial Contributor. A.d.G. is the Database Principal Investigator and the Steering Committee Member. Q.S. is the Core Investigator, the Database Principal Investigator, and the Steering Committee Member.

Declaration of conflicting interests The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD Guilherme S Lopes https://orcid.org/0000-0003-2923-2721

Supplemental material Supplemental material for this article is available online.

References
1. Thanarajasingam G, Hubbard JM, Sloan JA, et al. The imperative for a new approach to toxicity analysis in oncology clinical trials. J Natl Cancer Inst 2015; 107(10): djv216.
2. Henon C, Lissa D, Paoletti X, et al. Patient-reported tolerability of adverse events in phase 1 trials. ESMO Open 2017; 2(2): e000148.
3. Carbini M, Suárez-Fariñas M and Maki RG. A method to summarize toxicity in cancer randomized clinical trials. Clin Cancer Res 2018; 24(20): 4968–4975.
4. Trotti A, Pajak TF, Gwede CK, et al. TAME: development of a new method for summarising adverse events of cancer treatment by the Radiation Therapy Oncology Group. Lancet Oncol 2007; 8(7): 613–624.
5. Thanarajasingam G, Atherton PJ, Novotny PJ, et al. Longitudinal adverse event assessment in oncology clinical trials: the Toxicity over Time (ToxT) analysis of Alliance trials NCCTG N9741 and 979254. Lancet Oncol 2016; 17(5): 663–670.
6. U.S. Department of Health and Human Services. Common terminology criteria for adverse events (version 5.0). Bethesda, MD: National Institutes of Health, 2017.
7. Lee SM, Hershman DL, Martin P, et al. Toxicity burden score: a novel approach to summarize multiple toxic effects. Ann Oncol 2012; 23(2): 537–541.
8. Basch E, Abernethy AP, Mullins CD, et al. Recommendations for incorporating patient-reported outcomes into clinical comparative effectiveness research in adult oncology. J Clin Oncol 2012; 30(34): 4249–4255.
9. Galizia D, Milani A, Geuna E, et al. Self-evaluation of duration of adjuvant chemotherapy side effects in breast cancer patients: a prospective study. Cancer Med 2018; 7(9): 4339–4344.
10. Kluetz PG, Chingos DT, Basch EM, et al. Patient-reported outcomes in cancer clinical trials: measuring symptomatic adverse events with the National Cancer Institute’s Patient-Reported Outcomes Version of the Common Terminology Criteria for Adverse Events (PRO-CTCAE). Am Soc Clin Oncol Educ Book 2016; 35: 67–73.
11. LeBlanc TW and Abernethy AP. Patient-reported outcomes in cancer care—hearing the patient voice at greater volume. Nat Rev Clin Oncol 2017; 14(12): 763–772.
