Breast Cancer Research and Treatment 22: 193-196, 1992. © 1992 Kluwer Academic Publishers. Printed in the Netherlands.
Prognostic factors in clinical trials*
Steve Dahlberg and Ping-Yu Liu
Southwest Oncology Group Statistical Center, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
Key words: prognostic factors, study design, study analysis
Summary
Prognostic factors define the study population, help formulate the study objectives, and influence the treatment strategies. They must be accounted for in the study analysis to obtain valid estimates of the treatment differences and to evaluate results across studies. The causal relationship between a prognostic factor and the study endpoint can only be established through prospective randomized study designs. Potential factors discovered through retrospective analysis must be verified to establish their validity. Using such factors prematurely to select the patient population and treatment strategy for a new study will not establish the validity of the potentially important factor.
Introduction
Prognostic factors are often used to choose between treatment strategies. They play a fundamental role in the design, analysis, and interpretation of clinical trials. Recently, the timing of surgery during the menstrual cycle has been examined as a possible prognostic factor of survival for premenopausal women with operable breast cancer [1]. At this time, randomized prospective studies to address this question are not available. Researchers designing new studies for this patient population may or may not choose to view timing of surgery as a prognostic factor. This choice influences how future studies are conducted and interpreted. This paper discusses the role of prognostic factors in clinical trials.
Types of prognostic factors
In general, a prognostic factor is a patient characteristic that can be used to foretell the probable course or outcome of a disease. From a clinical point of view, a prognostic factor predicts an individual's outcome. However, predicting an individual's outcome is often difficult because of large variation in the response of different patients to therapy. Instead, statisticians consider groups of patients and attempt to predict the group's disease course. It is useful to divide potentially prognostic factors into two types: factors measured before the start of treatment, and those that can be known only after treatment begins. Factors known at the start of treatment include
Address for correspondence and offprints: Steve Dahlberg, Southwest Oncology Group Statistical Center, Fred Hutchinson Cancer Research Center, 1124 Columbia Street MP-557, Seattle, WA 98101, USA.
* Supported by NCI 5 PO1 CA53996-15 and NCI 5 U10 CA38926-08.
patient demographics such as age, gender, and race. They include disease characteristics such as stage of disease and histology. Also included are pretreatment patient characteristics such as performance status. Some treatment information is also known at this time. This includes the type of initial treatment, the planned total dose, and the planned duration of therapy. Factors known after the start of treatment include things like the amount and duration of treatment delivered, the number of hospital days, the patient's current performance status, and response to treatment. These factors are called time-dependent variables. Here, time refers to the time after start of treatment. With time-dependent variables, the relationship between cause and effect is not always clear. For example, if at the end of a study, one observes that the group of patients who received reduced doses had a poor response, the conclusion that the lower doses cause the poor response is not necessarily valid, unless the study was designed to answer this specific question. This is because the reasons for the dose reductions have not been considered. Perhaps these patients had clinical complications that precluded the use of full-dose therapy. Therefore, both the reduced dose and the poor response are results of some underlying condition not present for those patients who received full doses. By accepting the erroneous conclusion that lower doses cause poor response, all differences between patients with and without clinical complications are attributed only to the dose of therapy delivered. Furthermore, this leads to the conclusion that if all patients were given full doses, they would have responded identically to those who were actually given full doses. Depending on the reasons for dose reductions, this may be a completely false conclusion.
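The dose-reduction example above can be illustrated with a small simulation. This is only a sketch under invented assumptions: the complication rate (30%), the dose-reduction probabilities, and the response probabilities are hypothetical numbers chosen for illustration, and the function names are not from any study. Response is generated so that it depends only on the presence of complications, never on dose.

```python
import random


def simulate_patients(n=20000, seed=0):
    """Simulate (complication, reduced_dose, responded) for n patients.

    All probabilities below are hypothetical and chosen only to illustrate
    the argument; dose has no effect on response in this model.
    """
    rng = random.Random(seed)
    patients = []
    for _ in range(n):
        complication = rng.random() < 0.3
        # Complications make a dose reduction much more likely.
        reduced_dose = rng.random() < (0.8 if complication else 0.1)
        # Response depends on the complication only, not on the dose.
        responded = rng.random() < (0.2 if complication else 0.6)
        patients.append((complication, reduced_dose, responded))
    return patients


def response_rate(patients, reduced_dose, complication=None):
    """Response rate in one dose group, optionally within one complication group."""
    outcomes = [r for c, d, r in patients
                if d == reduced_dose and (complication is None or c == complication)]
    return sum(outcomes) / len(outcomes)


if __name__ == "__main__":
    patients = simulate_patients()
    print("full dose, all patients:   ", round(response_rate(patients, False), 3))
    print("reduced dose, all patients:", round(response_rate(patients, True), 3))
    for c in (False, True):
        print("complication =", c,
              "| full:", round(response_rate(patients, False, c), 3),
              "| reduced:", round(response_rate(patients, True, c), 3))
```

In this sketch the crude comparison shows a much lower response rate in the reduced-dose group, while the comparison within each complication group shows essentially none, because the underlying condition drives both the dose reduction and the response.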
Prognostic factors and study design
Established prognostic factors known at the start of treatment have a major role in an individual patient's treatment strategy as well as in the design of new clinical trials. Treatment questions in clinical trials are posed for specific patient populations which are defined by prognostic factors. Once the study question and patient population are defined, treatment strategies can then be proposed. There are a great many endpoints in clinical trials. Response, survival, relapse-free survival, remission duration, time to treatment failure, time to progression, and time to second primary are commonly used. It is important to realize that factors predictive for one study endpoint may not be prognostic for another. A prognostic factor may also be dependent on the treatment received. For example, factors predicting outcome for marrow transplantation might not be the same as for other regimens because of differences in that treatment regimen itself.
In randomized studies, prognostic factors are also used in stratification. Stratification is a way to ensure that nearly equal numbers of patients are assigned to each randomized arm for each potentially important factor. There is a practical limit to the number of stratification factors that can be used. The use of too many factors, or factors with too many levels, can be equivalent to using no factors at all. For example, suppose a study has five stratification factors, each with three levels. Under a common design, a randomized block design, treatment assignments are balanced within each combination of stratification factors. In this case there would be 3^5, or 243, such blocks. At the end of even a large study, most blocks would have very few patients and the results would be almost the same as if no stratification had been used. The issues concerning balancing the treatment allocation with respect to prognostic factors have been addressed extensively in the literature (see, for example, Pocock and Simon [2]).
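The arithmetic of the five-factor example can be checked with a short sketch. The assumptions here are not from the paper: a study of 500 patients, factor levels that are equally likely and independent of one another, and a cutoff of four patients per stratum used only as an illustrative threshold for "very few".

```python
import itertools
import random
from collections import Counter


def stratum_counts(n_patients, n_factors=5, n_levels=3, seed=0):
    """Count patients falling into each combination of stratification levels.

    Assumes (hypothetically) that every level of every factor is equally
    likely and that the factors are independent.
    """
    rng = random.Random(seed)
    # Enumerate all n_levels ** n_factors strata so empty ones are counted too.
    strata = itertools.product(range(n_levels), repeat=n_factors)
    counts = Counter({s: 0 for s in strata})
    for _ in range(n_patients):
        patient = tuple(rng.randrange(n_levels) for _ in range(n_factors))
        counts[patient] += 1
    return counts


if __name__ == "__main__":
    counts = stratum_counts(500)
    sparse = sum(1 for c in counts.values() if c < 4)
    print("number of strata:", len(counts))
    print("strata with fewer than 4 patients:", sparse)
```

With 500 patients spread over 243 strata, the average stratum holds about two patients, so balancing within strata can do little in practice.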
Prognostic factors and study analysis
An analysis of potential treatment differences should incorporate known prognostic factors. Using these factors in the treatment comparisons reduces the variation of the treatment comparison. Even if the trial is stratified by an important prognostic factor, this factor should still be included in the analysis of treatment differences [3-5]. One is not limited to the stratification factors in such an analysis. If histology, for example, is an important prognostic factor, it should be used in the analysis of treatment differences whether or not it was used to balance treatment arms or is statistically significant in the present data. Inclusion of prognostic factors as covariates ensures that the estimates of treatment effects are not confounded by those factors.
Prognostic factors also have an important role in evaluating the study's conclusion. Treatment results may differ by prognostic group. Study results must be interpreted in the light of known prognostic factors. A treatment may be effective only in a subset of patients. This is often the reason why seemingly similar studies have different conclusions. An initial question, if this occurs, should concern the distribution of patients with important prognostic factors.
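The claim that using prognostic factors reduces the variation of the treatment comparison can be illustrated with a simulation. This is a simplified sketch, not the analysis of any actual trial: it assumes a continuous outcome (real trial endpoints are often time-to-event), a single binary prognostic factor, and hypothetical effect sizes. The adjusted estimate is a size-weighted average of within-stratum differences, which stands in for a covariate-adjusted model.

```python
import random
import statistics


def one_trial(rng, n=200, treatment_effect=1.0, factor_effect=3.0):
    """Simulate one randomized trial as a list of (arm, factor, outcome)."""
    data = []
    for _ in range(n):
        arm = rng.randrange(2)      # simple randomization to two arms
        factor = rng.randrange(2)   # binary prognostic factor (hypothetical)
        outcome = treatment_effect * arm + factor_effect * factor + rng.gauss(0, 1)
        data.append((arm, factor, outcome))
    return data


def mean_difference(pairs):
    """Mean outcome in arm 1 minus mean outcome in arm 0."""
    treated = [y for a, y in pairs if a == 1]
    control = [y for a, y in pairs if a == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def unadjusted_estimate(data):
    return mean_difference([(a, y) for a, _, y in data])


def adjusted_estimate(data):
    """Average the within-stratum differences, weighted by stratum size."""
    weighted_sum = 0.0
    for level in (0, 1):
        stratum = [(a, y) for a, f, y in data if f == level]
        weighted_sum += len(stratum) * mean_difference(stratum)
    return weighted_sum / len(data)


if __name__ == "__main__":
    rng = random.Random(0)
    trials = [one_trial(rng) for _ in range(300)]
    print("SD of unadjusted estimates:",
          round(statistics.stdev([unadjusted_estimate(t) for t in trials]), 3))
    print("SD of adjusted estimates:  ",
          round(statistics.stdev([adjusted_estimate(t) for t in trials]), 3))
```

Both estimators center on the true treatment effect under randomization, but across repeated simulated trials the adjusted estimate varies less, because the variation explained by the prognostic factor is removed from the comparison.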
Identification of new prognostic factors
An objective of a study may be to identify new prognostic factors. Often this is done retrospectively. Identification of new prognostic factors is always difficult. An example illustrating this is an analysis reported by Efron and Gong in 1983 [6]. They studied 155 chronic hepatitis patients. The goal was to identify factors predicting response among 19 covariates. Covariates included age, gender, specific symptoms, laboratory test results, and histology. In a multivariate analysis, they found four significant factors using a p-value less than 0.05 as a criterion. Then, they generated 500 random sets of data, each with 155 observations, using a bootstrap procedure. One can think of this as creating a "new" data set by selecting patients at random from the original group. A given patient may be selected more than once in any particular data set. This is called sampling with replacement. For each of the 500 data sets, they repeated the original analysis selecting predictive factors. Of the four originally selected predictors, one was selected only 59% of the time among the 500 analyses. Another predictor was selected 48%, another 37%, and the last was selected only 35% of the time. They concluded that these results discourage confidence in the causal nature of the selected predictors. Very often different predictive factors are selected among different studies. In this example, different variables are selected even when the data have the same underlying structure.
Why does the retrospective method select predictors that may not be causally related to outcome or fail to identify true prognostic factors? Possible answers include: correlations may exist among potential prognostic factors; the "real", or perhaps most important, prognostic factor may be unknown or not measurable; or there may be interactions between a prognostic factor and other factors. For example, in Hodgkin's disease, stage III patients with B symptoms may have an outcome similar to stage IV patients while stage III patients with A symptoms may have an outcome more like stage II patients. In this case, stage and symptoms have an interaction. In exploratory analyses, such interactions are often unexpected, the true prognostic nature of the factors is masked, and the factors may therefore fail to be selected in retrospective analyses.
In some instances the appearance of a prognostic factor may be a pure chance occurrence particular to the specific data set. Such "newly discovered" prognostic factors must be verified through prospective studies.
However, if such a
factor is used to determine the treatment strategy, the study design will generally not allow for the validation of whether this factor is prognostic. In breast cancer, the timing of surgery with respect to the menstrual cycle is a potential prognostic factor. If this factor is considered to be an "established" prognostic factor, a study design might place restrictions on the timing of surgery to reduce overall variation. In this case, a statistical test of whether or not surgical timing is an important factor is not possible because it is confounded with treatment. Prognostic factors used in a study's eligibility criteria, or other less formal mechanisms used to select a more homogeneous group of patients such as physician bias in patient selection, or even the presence of competing studies, reduce the variation in the study endpoint, but have a cost in terms of interpretation of the study's results. The study results are not generalizable to the broader population. Because these factors cannot be validated by the study design, it is important to use only "established" predictive factors when choosing treatment or defining study populations. The choice of when to use a particular prognostic factor to determine treatment is dependent on whether or not it is important to validate the predictive power of the factor or whether this has already been "established".
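The bootstrap check of selection stability described earlier in this section (resample the patients with replacement, repeat the selection, and count how often each factor is chosen) can be sketched as follows. This is not the analysis of Efron and Gong: the data are simulated, only covariate 0 is given a real effect, and the selection rule is a simple one-covariate-at-a-time z test substituted for the multivariate analysis in the original report. All names and probabilities are hypothetical.

```python
import math
import random


def make_data(n=155, n_covariates=19, seed=1):
    """Simulated patients: (covariates, response). Only covariate 0 matters."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(n_covariates)]
        p = 0.85 if x[0] > 0 else 0.25   # hypothetical response probabilities
        rows.append((x, 1 if rng.random() < p else 0))
    return rows


def mean_and_variance(values):
    m = sum(values) / len(values)
    return m, sum((v - m) ** 2 for v in values) / (len(values) - 1)


def select_factors(rows, threshold=1.96):
    """Indices of covariates whose responder/non-responder z statistic is large."""
    selected = set()
    for j in range(len(rows[0][0])):
        responders = [x[j] for x, y in rows if y == 1]
        others = [x[j] for x, y in rows if y == 0]
        if len(responders) < 2 or len(others) < 2:
            continue
        m1, v1 = mean_and_variance(responders)
        m0, v0 = mean_and_variance(others)
        se = math.sqrt(v1 / len(responders) + v0 / len(others))
        if se > 0 and abs(m1 - m0) / se > threshold:
            selected.add(j)
    return selected


def bootstrap_selection_frequency(rows, n_boot=500, seed=2):
    """Fraction of bootstrap data sets in which each covariate is selected."""
    rng = random.Random(seed)
    counts = [0] * len(rows[0][0])
    for _ in range(n_boot):
        # Sampling with replacement: a patient may appear more than once.
        sample = [rng.choice(rows) for _ in rows]
        for j in select_factors(sample):
            counts[j] += 1
    return [c / n_boot for c in counts]


if __name__ == "__main__":
    rows = make_data()
    print("selected in original data:", sorted(select_factors(rows)))
    for j, f in enumerate(bootstrap_selection_frequency(rows)):
        print("covariate", j, "selected in", round(100 * f), "% of bootstrap sets")
```

A factor that is selected in the original data but in only a modest fraction of the bootstrap data sets is one whose apparent importance may be particular to that data set.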
Conclusion
In clinical trials the known prognostic factors define the study population, help formulate the study objectives, and influence the treatment strategies. They must be accounted for in the study analysis to obtain the correct estimate of the treatment effect and to evaluate results across studies. Care must be taken when interpreting the relationship between a time-dependent prognostic factor and the study endpoint, as the relationship between cause and effect is not always clear. Causal relationships can only be established through prospective, randomized study designs. Identification of new prognostic factors is difficult. Potential factors discovered through retrospective analyses must be verified to establish their validity. Using such factors prematurely to select patient populations and treatment strategies for a new study will not establish the validity of the potentially important factor.
References
1. Badwe RR, Gregory WM, Chaudary MA, Richards MA, Bentley AE, Rubens RD: Timing of surgery during menstrual cycle and survival of premenopausal women with operable breast cancer. Lancet 337:1261-1264, 1991.
2. Pocock SJ, Simon R: Sequential treatment assignment with balancing for prognostic factors in the controlled clinical trial. Biometrics 31:103-115, 1975.
3. Canner PL: Covariate adjustment of treatment effects in clinical trials. Controlled Clinical Trials 12:359-366, 1991.
4. Anderson GL: Mismodelling covariates in Cox regression. Unpublished Ph.D. thesis, University of Washington, 1989.
5. Beach ML, Meier P: Choosing covariates in the analysis of clinical trials. Controlled Clinical Trials 10:161S-175S, 1989.
6. Efron B, Gong G: A leisurely look at the bootstrap, the jackknife, and cross-validation. The American Statistician 37:36-48, 1983.