STATISTICS IN MEDICINE, VOL. 9, 73-86 (1990)

SELECTION OF PATIENTS FOR RANDOMIZED CONTROLLED TRIALS: IMPLICATIONS OF WIDE OR NARROW ELIGIBILITY CRITERIA

SALIM YUSUF, PETER HELD AND K. K. TEO
Clinical Trials Branch, Division of Epidemiology and Clinical Application, National Heart, Lung, and Blood Institute, Bethesda, MD 20892, U.S.A.

AND ELIZABETH RUDIN TORETSKY
Collaborative Studies Coordinating Center, Department of Biostatistics, University of North Carolina, Chapel Hill, NC, U.S.A.

SUMMARY This paper discusses the various philosophies that influence the selection of patients for entry into randomized controlled trials. Although a number of different and often competing issues have to be considered depending upon the trial, keeping entry criteria simple, wide and at times even flexible is usually preferable. Such a strategy can be a positive virtue by helping to attain the large numbers of patients that are usually needed to reliably detect the sorts of moderate benefits that are plausible, at a reasonable cost and by providing answers that are relevant to many different categories of patients with a particular condition.

INTRODUCTION The criteria for a good trial are fairly straightforward: ask an important question and answer it reliably. The importance of the question depends to a large extent on its clinical relevance. It is obvious that the more widely applicable the results of a clinical trial are, the more relevant and valuable those results are. While the above statements seem fairly self-evident, it has often been argued that the overall results from clinical trials studying a broad group of patients are relatively unimportant and that what is really needed are specific answers regarding the value of an intervention in certain ‘homogeneous’ subsets of patients. Emphasis on one or other of the above philosophies has profound implications for the selection of patients for a trial, which in turn influences the clinical and public health relevance of its results. First, we discuss the options for inclusion of patients in cardiovascular trials in the context of some common and proven treatments. Second, we discuss two important judgements (moderateness of treatment effect and commonness of quantitative but not of unanticipated qualitative interactions) which influence the various options for patient entry. We will review various treatments for cardiovascular diseases in order to highlight the above judgements. Third, we discuss the advantages and disadvantages of utilizing wide or narrow eligibility criteria. Fourth, we discuss the need for flexibility of entry measures in appropriate situations and their impact on study results, costs and efficiency.

0277-6715/90/010073-14$07.00 © 1990 by John Wiley & Sons, Ltd.

S. YUSUF ET AL.

Table I. Selection of patients for a trial

1. Drug clearly indicated based upon current knowledge.                  Exclude
2. Drug effect unknown:
   (a) but likely to be beneficial in certain patients                   Include*
   (b) and outcome uncertain in certain patients                         ?Include
   (c) but theoretical possibility of harm or little hope of benefit.    ??Include*
3. Special exclusions (for competing risk, non-compliance,
   inability to ascertain endpoint, and so on).                          Exclude
4. Drug contraindicated based upon current knowledge.                    Exclude

* Include such patients with explicit prior statement of the direction of a subgroup hypothesis.

OPTIONS FOR PATIENT INCLUSION OR EXCLUSION When a trial with a particular intervention is being designed there is usually a certain amount of information about the drug, for instance its pharmacology, value in relieving specific symptoms, side-effects, and categories of patients in whom treatment is considered to be contraindicated. Such information, which may or may not be based upon previous randomized clinical trials, may lead to a consensus of opinion as to certain categories of patients with a particular condition in whom treatment is considered to be clearly indicated or clearly contraindicated. The remaining categories of patients are those among whom the effects of treatment are on balance uncertain. Such patients might include (a) those in whom there are prior reasons to believe treatment might be preferentially beneficial; (b) those in whom there is real uncertainty; or (c) those in whom there is a possibility (but no proof or clear reason) that treatment might not be beneficial and may even be harmful (Table I). Most trials would generally include patients in category (a), some may additionally include patients in category (b), and a few would include patients in category (c). The approach that different scientists might take to inclusion criteria for a specific trial depends to a large extent on their differing backgrounds, experience and philosophy (especially in relation to the two fundamental judgements of moderateness of effect and probability of different types of interactions).

TWO FUNDAMENTAL JUDGEMENTS THAT INFLUENCE THE DESIGN OF CLINICAL TRIALS, IN PARTICULAR CRITERIA FOR PATIENT ELIGIBILITY

Moderateness of treatment effect When treatments usually have large effects or when a treatment has a moderate effect in a disease where an outcome is invariable, detecting benefit is usually relatively simple. The benefits of such treatments might be clearly evident by simple observation, data-base studies, epidemiologic studies or registries. While a randomized trial is likely to provide more convincing answers earlier, such trials are often not necessary to detect large differences. By contrast, when there is real uncertainty about whether or not a treatment works (especially in reducing the risk of a major outcome such as death, stroke or myocardial infarction), the magnitude of the effect is, in general, likely to be small or at best moderate (about 15, 20 or 25 per cent risk reduction). The validity of this judgement is supported by the experience with a number of different interventions in a common disease such as myocardial infarction (Table II).1 While moderate effects, at best, are likely on major endpoints (such as death) in common disease, larger effects might be seen on

SELECTION OF PATIENTS FOR RANDOMIZED CONTROLLED TRIALS

Table II. Summary of the effects of common treatments for myocardial infarction on mortality (adapted from Reference 1)

Drug
Short-term interventions in the acute phase
1. Streptokinase: Intra-coronary IV IV

115 mmHg) and demonstrated a significant reduction in strokes. Subsequent trials done over a 20-year period in those with moderate (diastolic BP > 105 mmHg) or mild (diastolic BP > 90 mmHg) hypertension showed about a similar degree of benefit. While each of these different categories of trials demonstrated a useful progression of knowledge, one wonders at the numbers of morbid events that might have been prevented if some of the early trials had randomized 10,000 or 20,000 hypertensive patients with a broad range of elevated blood pressure and concluded that the benefit was applicable to those with mild, moderate and severe hypertension a decade or two earlier. We note that the only trial to provide clear evidence for reduction in mortality (Hypertension Detection and Follow-up Program) had very broad entry criteria and probably included a substantially higher proportion of eligible hypertensives than any other trial.16

Table IV. Advantages of wide eligibility criteria for entering patients into randomized trials

1. Easier screening and recruitment. Large trials are more feasible and affordable.
2. Large study sizes reduce random error, providing more reliable overall results.
3. Wider applicability of results. Therefore greater clinical and public health impact.
4. Greater opportunity to test subgroup hypotheses:
   (a) By including patients with a broader range of values for key baseline variables such as cholesterol, blood pressure or time from onset of disease
   (b) By increasing study size.

Advantages of wide eligibility criteria for patient selection Wide eligibility criteria provide at least four obvious advantages (Table IV):

1. In general, wide eligibility criteria substantially simplify screening and increase recruitment. This in turn reduces the effort and cost per patient enrolled, thereby making large trials more affordable and feasible.
2. The larger the size of the trial, the smaller the random error. Consequently, reliable overall results are more likely to emerge.
3. The wider the types of patients enrolled, the greater the applicability of the trial and the greater its impact on public health. For example, a trial showing that a treatment reduced the risk of death by a quarter in one-fifth of the patients (for example, thrombolytic therapy in patients with ST elevation seen within 4 hours of pain) would have a substantially smaller clinical and public health impact than a trial that additionally shows a similar or even smaller degree of benefit in a much wider population (for example, thrombolytic therapy in those presenting with up to 24 hours of pain, irrespective of ECG or age) (Table V).
4. The wider the types of patient, especially with regard to key baseline variables, the greater the likelihood of detecting subgroup effects. For example, if one were seriously interested in testing whether thrombolytic therapy had a greater effect among patients presenting early compared with those presenting late, confining patient entry to those seen very early (say less than 4 or 6 hours) limits our ability to detect such an interaction compared with a trial with a much wider entry time window (say up to 24 hours).
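The link between trial size and random error in point 2 can be illustrated with a minimal sketch of the usual normal-approximation sample-size formula for comparing two proportions. The 10 per cent control-group event rate, the two-sided significance level of 0.05 and the 90 per cent power used here are illustrative assumptions, not figures taken from this paper, and the function name `patients_per_arm` is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def patients_per_arm(control_rate, risk_reduction, alpha=0.05, power=0.90):
    """Approximate patients needed in each arm to detect a proportional
    risk reduction (normal approximation for two proportions).

    All default values are illustrative assumptions.
    """
    p1 = control_rate                            # event rate, control arm
    p2 = control_rate * (1.0 - risk_reduction)   # event rate, treated arm
    p_bar = (p1 + p2) / 2.0
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * sqrt(2.0 * p_bar * (1.0 - p_bar))
        + z_beta * sqrt(p1 * (1.0 - p1) + p2 * (1.0 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


if __name__ == "__main__":
    # Moderate effects of the size discussed in the text (15, 20, 25 per cent).
    for rr in (0.15, 0.20, 0.25):
        n = patients_per_arm(0.10, rr)
        print(f"{rr:.0%} risk reduction: about {n} patients per arm")
```

Under these assumptions each of the three moderate risk reductions requires several thousand patients per arm, and the requirement rises steeply as the plausible effect shrinks, which is the practical reason wide criteria and easy recruitment matter.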

Situations where narrow eligibility criteria might be sensible Although wide eligibility criteria are generally preferable to narrow ones, in a few situations restricting the entry of specific types of patients might be sensible. For example, it may be considered unethical to randomize certain subgroups of patients because clear evidence may already exist that such patients benefit from treatment (for example, most patients with myocardial infarction seen within 6 hours of pain should be treated with a thrombolytic agent). Sometimes the treatment being tested may be expensive or toxic, or the evaluation of the endpoint (for example, invasive tests such as angiography) may carry some inherent risks. In such cases it would be judicious to include only high risk patients so that the potential benefit of the treatment is likely to outweigh the risks. If the study can identify in advance subjects who are unlikely to adhere to the drugs or who will be difficult to follow, excluding such patients can often increase the sensitivity of the trial. There are a number of situations, especially in trials of surrogate endpoints, where the endpoint of interest is not evaluable in certain types of patients: for example, the ECG (Holter, exercise test, and so on) is not interpretable in patients with bundle branch block, or the

Table V. Estimates of the proportion of patients with acute myocardial infarction eligible for entry into trials of thrombolytic therapy based upon entry and exclusion criteria for four recent large trials. Currently there is consensus that group 1 patients should be given thrombolytic therapy. Only a few patients in categories 2 to 5 were entered into various trials. The approximate size of benefit in those categories of patients excluded from several of the trials is also provided. (Some of the categories below are not mutually exclusive.)

Category of patient                                   Approximate      Approximate     Recent large trials which
                                                      proportion of    size of         included such patients
                                                      patients with    mortality
                                                      acute MI*        reduction†

1. ST elevation, < 6 h, age < 75 years                20-30%           30-40%          ISIS-2, GISSI, ASSET, AIMS
2. Other electrocardiographic abnormalities at entry  10-15%           10-20%          ISIS-2, GISSI, ASSET
3. Over 75 years                                      10-15%           20-25%          ISIS-2, a few patients in GISSI
4. Onset of symptoms between 6 and 12 h               20-30%           15-20%          ISIS-2, a few patients in GISSI
5. Onset of symptoms between 12 and 24 h              15-20%           15-20%          ISIS-2

ISIS-2: Second International Study of Infarct Survival. GISSI: Gruppo Italiano per lo Studio della Streptochinasi nell’Infarto Miocardico. ASSET: Anglo-Scandinavian Study of Early Thrombolysis. AIMS: APSAC (anisoylated plasminogen-streptokinase activator complex) in Myocardial Infarction Study.
* The approximate proportion of patients with acute MI is based upon a review of the logs maintained for GISSI and ASSET and the numbers of patients included in various categories of ISIS-2.
† The approximate size of benefit is based on an overview of the available data from a number of studies.

second measure (such as 5-year coronary angiogram) is not obtainable in patients with poor left ventricular function if they die before the end of the study (competing events). Such categories of patients are best excluded in such trials. However, even in these situations other criteria for eligibility can be widened in order to facilitate recruitment.
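The public health argument behind Table V can be sketched numerically: the number of deaths avoided per 1000 patients presenting with acute MI is roughly the proportion of patients in a category, times their baseline mortality, times the proportional mortality reduction. In the sketch below the proportions and reductions are midpoints of the ranges in Table V; the single 10 per cent baseline mortality applied to every category is purely an illustrative assumption (in practice it would differ, for instance by age), and the function name is hypothetical. Because the categories are not mutually exclusive, the figures are compared category by category and are not summed.

```python
# (proportion of acute MI patients, proportional mortality reduction),
# taken as midpoints of the ranges reported in Table V.
TABLE_V = {
    "1. ST elevation, < 6 h, age < 75": (0.25, 0.35),
    "2. Other ECG abnormalities": (0.125, 0.15),
    "3. Over 75 years": (0.125, 0.225),
    "4. Onset 6-12 h": (0.25, 0.175),
    "5. Onset 12-24 h": (0.175, 0.175),
}

# Assumption for illustration only: the same baseline mortality everywhere.
ASSUMED_BASELINE_MORTALITY = 0.10


def deaths_avoided_per_1000(proportion, risk_reduction,
                            baseline=ASSUMED_BASELINE_MORTALITY):
    """Deaths avoided per 1000 acute MI patients if one category is treated."""
    return 1000.0 * proportion * baseline * risk_reduction


# Per-category impact; categories overlap, so these are not added together.
impact = {
    name: deaths_avoided_per_1000(prop, rr)
    for name, (prop, rr) in TABLE_V.items()
}

if __name__ == "__main__":
    for name, value in impact.items():
        print(f"{name}: about {value:.1f} deaths avoided per 1000 patients")
```

Even with smaller proportional reductions, categories 2 to 5 each contribute a non-trivial number of avoidable deaths under these assumptions, which a trial restricted to category 1 could not have demonstrated.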

FLEXIBILITY IN ENTRY MEASURES If one accepts that generally wide eligibility criteria are preferable to narrow criteria and that only a few important and relevant covariates are worth collecting, one should consider the precision with which to measure these few variables. In earlier sections we have argued that in trials of mortality or major morbidity covariate adjustment is unlikely to alter the results of the trial. In addition to describing the population being studied, baseline characteristics are often used to include patients at high risk or to estimate treatment effects in subgroups defined by such characteristics. For example, in the Studies of Left Ventricular Dysfunction (SOLVD), it was decided to include patients with abnormal left ventricular function (ejection fraction, EF, < 0.35). A secondary aim was to examine the effects of treatment in those with low, intermediate and high EF (divided by tertiles). During the design of the study it became apparent that even within a clinical centre the EF was being measured using one of three techniques: radionuclide ventriculography, contrast ventriculography, and 2D echocardiography. In general, the first two techniques were considered to be reliable whereas the reliability of the echocardiogram was less certain. Several centres wished to retain the echocardiogram to measure entry EF as this was commonly used in their centre. Consequently, a large number of patients could be enrolled if this technique were used in their centre. These centres

Table VI. Flexibility of baseline measurement of ejection fraction in the Studies of Left Ventricular Dysfunction depending upon the question

Overall goal of study
Effect of enalapril, an angiotensin converting enzyme inhibitor, on mortality in patients with abnormal left ventricular function.

Purpose of EF measurement
1. Ejection fraction < 0.35 to identify patients at high risk and presence of left ventricular dysfunction.
2. For subgroup analysis to assess treatment effect in tertiles.

Question facing Steering Committee
Should we accept any of three methods done over a variable interval, or insist on EF being done by one method within a short interval prior to randomization?

Approach taken
Use all three methods for EF measured < 4 months before randomization as baseline measure if patient’s clinical condition had not changed.

Baseline results in the first 5000 patients
                                     N             Mean ± one SD
Method:
  Radionuclide ventriculography      3275 (66%)    26.2 ± 6.6
  2D echocardiography                 942 (19%)    26.3 ± 6.4
  Contrast ventriculography           735 (15%)    27.8 ± 5.9
Time before randomization:
  < 1 month                          2123 (42%)    26.3 ± 6.5
  1-2 months                         1598 (32%)    26.4 ± 6.5
  2-3 months                          770 (15%)    26.4 ± 6.5
  > 3 months                          534 (11%)    27.2 ± 6.5

EF was found to be a highly significant and powerful predictor of prognosis. Similar increments in risk were predicted by each technique and each time interval.

Substudy and its goal
To assess effect of enalapril on changes in right and left ventricular function at rest and exercise. Since baseline EF is part of the endpoint (that is, change in EF), and the study was being done in only a few specialized centres, baseline and subsequent measures were performed using standardized techniques and central data analyses.

provided information showing a high degree of correlation between the EF calculated from the 2D echocardiogram using the area-length method or Simpson’s rule and the other two techniques. In addition, there were data suggesting that in a group of patients with heart failure the mean EF remains constant over 6 to 12 months. The SOLVD Steering Committee therefore decided to accept EF measured within four months using any of the three methods as the baseline EF as long as the patient’s clinical condition had not changed subsequent to the measurement. The experience from the first 5000 patients is summarized in Table VI. It can be seen that, had EF been restricted to the radionuclide technique (the commonest) and to measurements within the first month, only about 25 to 30 per cent of patients would have been considered eligible. (The use of different techniques was distributed similarly between the four months.) The mean values and distribution of EF were similar with each of the three techniques and four time intervals. Moreover, EF was the most powerful predictor of prognosis and similar increments in risk were predicted by the same change

in EF measured by each technique and each time interval. Therefore, this relatively flexible approach to measuring EF substantially increased recruitment (at least three- to four-fold), while EF so measured remained a powerful predictor of prognosis. The similarities of the means, distributions and predictive slopes (beta coefficients) further allow the subdivision of patients into subgroups of low, intermediate and high EF. What if the EF means and the predictive slopes of EF versus mortality had been different? Even in such cases, as long as EF by all three techniques predicted outcome, we could have used a stratified approach to study treatment interaction with EF. Compared with measuring EF using only one technique exclusively for the study, the flexible approach saved about $3 million to $4 million and substantially increased the ease of patient recruitment. In contrast with the main trial of SOLVD, where there was considerable flexibility in measuring baseline EF, a different and more rigorous approach was adopted in a radionuclide substudy which aimed to study the effect of treatment on EF. Here EF was measured at randomization, at four months post-randomization and at one year. The study is being conducted in three centres with attention to standardization (radionuclide technique, view, amount of radioactive material, position of patient, and so on) and all analyses are being performed by a central laboratory. The greater emphasis being devoted to the measurement is in part because the EF itself is a key endpoint (the baseline EF is ‘part’ of the endpoint when one examines change in EF) and in part because some of the other measures (diastolic function and right ventricular EF) require special expertise and equipment. Even in this substudy we are collecting information to assess whether estimates of treatment effect based upon the central laboratory’s measures differ systematically from those derived using clinic measures.
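The recruitment arithmetic for the main SOLVD trial can be checked directly from the percentages in Table VI. The sketch below assumes, as the text notes, that the choice of technique was distributed similarly across the time intervals, so that the two shares can simply be multiplied; the dictionary keys and the function name `eligible_fraction` are hypothetical labels introduced for illustration.

```python
# Shares of the first 5000 patients, as reported in Table VI.
METHOD_SHARE = {"radionuclide": 0.66, "echo_2d": 0.19, "contrast": 0.15}
TIMING_SHARE = {
    "<1 month": 0.42,
    "1-2 months": 0.32,
    "2-3 months": 0.15,
    ">3 months": 0.11,
}


def eligible_fraction(methods, windows):
    """Fraction of enrolled patients who would still qualify if only the given
    methods and time windows were accepted.

    Assumes method and timing are independent (an approximation based on the
    remark that techniques were distributed similarly across the months).
    """
    method_part = sum(METHOD_SHARE[m] for m in methods)
    timing_part = sum(TIMING_SHARE[w] for w in windows)
    return method_part * timing_part


# Strict rule: radionuclide EF only, measured within one month.
strict = eligible_fraction(["radionuclide"], ["<1 month"])
# Relative gain in recruitment from the flexible rule actually adopted.
recruitment_gain = 1.0 / strict

if __name__ == "__main__":
    print(f"Eligible under strict rule: {strict:.0%}")
    print(f"Recruitment gain from flexible rule: {recruitment_gain:.1f}-fold")
```

The strict rule retains roughly 28 per cent of patients, in line with the 25 to 30 per cent quoted above, and its reciprocal lies between three and four, matching the stated three- to four-fold gain in recruitment.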
On theoretical grounds, for those measures readily analysed at the clinics, use of the central laboratory’s values would not necessarily be expected to increase the sensitivity to detect differences between the treatment and control groups.14

RECOMMENDATIONS AND CONCLUSIONS Broad patient eligibility, coupled with simple and flexible characterization of patients, is generally adequate for most trials evaluating the effects of treatments on major outcomes. Often such a strategy is a positive virtue by helping to attain, at a reasonable cost, the large numbers of patients that are usually needed to detect the sorts of moderate benefits that are plausible. Eligibility criteria for trials should follow the principle of randomizing all patients with a particular condition in whom the effects of treatment are uncertain (the ‘uncertainty principle’).

REFERENCES

1. Yusuf, S., Wittes, J. and Friedman, L. ‘Overview of results of randomized clinical trials in heart disease: I. Treatments following myocardial infarction’, Journal of the American Medical Association, 260, 2088-2093 (1988).
2. ISIS-2 Collaborative Group. ‘Randomized trial of IV streptokinase, oral aspirin, both or neither among 17,187 cases of suspected acute myocardial infarction’, Lancet, ii, 349-360 (1988).
3. Antiplatelet Trialists’ Collaboration. ‘Secondary prevention of vascular disease by prolonged antiplatelet treatment’, British Medical Journal, 296, 320-331 (1988).
4. Theroux, P., Ouimet, H., McCans, J., Latour, J. G., Joly, P., Levy, G., Pelletier, E., Juneau, M., Stasiak, J., deGuise, P., Pelletier, G. B., Rinzler, D. and Waters, D. D. ‘Aspirin, heparin, or both to treat acute unstable angina’, New England Journal of Medicine, 319, 1105-1111 (1988).
5. Cairns, J. A., Gent, M., Singer, J., Finnie, K. J., Froggatt, G. M., Holder, D. A., Jablonsky, G., Kostuk, W. J., Melendez, L. J., Myers, M. G., Sackett, D. L., Sealy, B. J. and Tanser, P. H. ‘Aspirin, sulfinpyrazone, or both in unstable angina’, New England Journal of Medicine, 313, 1369-1375 (1985).
6. Lewis, D. H., Davis, J. W., Archibald, D. G. et al. ‘Protective effects of aspirin against acute myocardial infarction and death in men with unstable angina’, New England Journal of Medicine, 309, 396-403 (1983).
7. Marder, V. and Sherry, S. ‘Thrombolytic therapy: current status’, New England Journal of Medicine, 318, 1512-1520 (1988).
8. Yusuf, S., Sleight, P., Held, P. and McMahon, S. ‘Routine medical management of myocardial infarction. Lessons from recent controlled trials’, Circulation (1989), in press.
9. Chadda, K., Goldstein, S., Byington, R. and Curb, J. D. ‘Effect of propranolol after acute myocardial infarction in patients with congestive heart failure’, Circulation, 73, 503-510 (1986).
10. Multicenter Post-infarction Research Group. ‘Risk stratification and survival after myocardial infarction’, New England Journal of Medicine, 309, 331-339 (1983).
11. Yusuf, S., Collins, R. and Peto, R. ‘Why do we need some large, simple randomized trials?’, Statistics in Medicine, 3, 409-420 (1984).
12. Peto, R., Pike, M. C., Armitage, P., Breslow, N. E., Cox, D. R., Howard, S. V., Mantel, N., McPherson, K., Peto, J. and Smith, P. G. ‘Design and analysis of randomized clinical trials requiring prolonged observation of each patient: I. Introduction and design’, British Journal of Cancer, 34, 585-612 (1976).
13. Aspirin Myocardial Infarction Research Group. ‘Randomized controlled trial of aspirin in persons recovered from myocardial infarction’, Journal of the American Medical Association, 243, 661-669 (1980).
14. Peto, R. ‘Monitoring cancer patients in clinical trials need not be precise’, in Symington, T., William, A. E. and McVie, J. G. (eds), Cancer: Assessment and Monitoring, 10th Pfizer International Symposium, Churchill Livingstone, Edinburgh, London and New York, 1980, 377-381.
15. Studies of Left Ventricular Dysfunction. Protocol, National Heart, Lung, and Blood Institute, 1985.
16. Hypertension Detection and Follow-up Program Cooperative Group. ‘The effect of treatment on mortality in “mild” hypertension. Results of the Hypertension Detection and Follow-up Program’, New England Journal of Medicine, 307, 976-980 (1982).

DISCUSSION

Dr. Neaton: I believe that although we might consider relaxing our methods of standardization and quality assurance in some situations, we should not do so generally. The examples discussed thus far have been trials with positive outcomes. For several of the trials, those positive outcomes were quite consistent with strong priors. If we were trying to understand trials that had outcomes quite contrary to our priors, we would be much more concerned about issues of standardization and quality assurance. I fear that some of the cost saving measures discussed may be self-defeating if they result in a loss of credibility of study findings.

Dr. Yusuf: I agree that in certain situations you do want more attention paid to quality control. For instance, if ejection fraction had been an endpoint in our trial, we might have taken a different view. Often, however, I think it is a question of not always fearing the worst. I think if there is a good scientific reason to impose certain quality control measures, then let’s do it. On the other hand, if the reason to implement some quality control measure is purely cosmetic, then we should not do it and should defend our position. In fact, unnecessary attempts at quality control may not only be wasteful but might interfere with a study by diverting efforts from those aspects of the trial that really do matter.

Dr. Detsky: Trials in chronic disease, particularly those in cardiovascular medicine, all seem to have a standard form no matter what the interventions are. The rate in the control group, no matter how long you follow up (six months, a year or three weeks), is always between 8 and 10 per cent. The event rate in the experimental group is always between 6 and 8 per cent. The risk reduction is always 20 to 40 per cent. We now have five or six interventions, for instance beta-blockers, aspirin and heparin. The next question is to design an efficient strategy for studying combinations of these therapies.

Dr. Yusuf: Since these trials were done independently, the only thing we can do now is use clinical judgement. We might state that there is no reason to think beta-blockers and aspirin post-MI interact in any negative way. The sensible thing might be to give both. One could do another analysis. For example, one could examine the beta-blocker trials and work out the effects among people who did and did not take aspirin. This raises the danger of subgroups. If there are two important treatments and we were uncertain about their value, I would do a factorial design. You might increase your sample size a little bit to protect against the possibility of a negative interaction.

Dr. Tognoni: GISSI-2 and ISIS-2 are examples of trials designed to test different hypotheses with factorial designs. One of the advantages of the ISIS structure is that you can observe the impact of therapy on the disease over time. For instance, our estimate of the baseline mortality of patients before thrombolysis (in 1983) was around 13 per cent. Now (1989) that all patients recruited receive thrombolysis, acute beta-blockade and aspirin, the overall rate is 9 per cent. If we tested a new treatment, the baseline rate would reflect the cumulative effect of all treatments.

Dr. Temple: If one considers another area of cardiovascular medicine, namely antiarrhythmic drugs, one might suspect that subgroup characteristics do matter. This is an area where, on the whole, trials have been depressingly unsuccessful. Many drugs are clearly active in suppressing abnormal beats but they don’t seem to affect survival. The current, though unproved, explanation is that the failure results from putting the wrong people into the trials. That reflects the comment earlier that if the results aren’t very good everyone will try to explain why by accusing people of studying too broad a population. A sensible approach would be to define and study subgroups of particular arrhythmia subtypes, particular states of the anatomy of the heart, or particular states of the function of the heart, because the drugs may behave differently in people with different functional, anatomical and electrical status. That is not proven, of course, although there is a fair amount of evidence that worsening of arrhythmias is related to underlying cardiac status.

Dr. Davis: The Hypertension Detection and Follow-up Program (HDFP), which took a very wide population, used very precise measures of blood pressure for inclusion criteria; in fact, they set the standard for measurement of blood pressure in cardiovascular disease in this country. Their entry criteria included two or three gates.

Dr. Yusuf: I am impressed with the HDFP not because of their precise measurements but because of their very wide window. Other hypertensive trials have also required elevated blood pressure on more than one visit, but they didn’t have the very wide window of the HDFP. However, there is little evidence that multiple measurements are truly needed in clinical trials with clinically relevant endpoints. Perhaps in future trials we should have only one blood pressure measurement.

Dr. Gent: I’m very impressed with the strategy in SOLVD that allowed three different methods of measuring ejection fraction. It turned out the results were very similar and you saved $5 million. Suppose, however, that the results had been very, very different. What effect would that have had on the credibility of the study?

Dr. Yusuf: We had data to support that our strategy in SOLVD was likely to succeed. Many papers had already shown high correlations among the different techniques. Moreover, we were confident in the quality of our centres. Even if there were some differences, our main purpose was to enroll people at high risk. Thus, I think we would have succeeded even if the correlation between different techniques had been only modest. Indeed, other trials have used functional class instead of EF as the entry criterion. That certainly would be no more precise than using EF measured by any of the three techniques. If, in fact, our approach had not worked, we could have stratified by technique. Although we were fairly non-stringent about measuring EF overall, in the subgroup that is part of the substudy with ejection fraction as an endpoint we are using a central laboratory and very careful quality control methods to measure EF. As far as subgroup hypotheses are concerned, I do not think SOLVD has the power to look at subgroups. The design of a trial is always a compromise. Some people always want to look at subgroups. Therefore, even though the power is inadequate we will analyse subgroups and calculate statistics of interaction. Very likely we won’t see clear results by subgroups.

Dr. Gent: You are proposing that there could be broad criteria for entry. Are these criteria agreed upon by some study group beforehand? If so, I find that very different philosophically from Rory’s proposal to let the local people do what they like.

Dr. Collins: Our study had more eligibility criteria than I mentioned. Our approach was adopting wide criteria instead of demanding, for example, that the patient must have some specific ST elevation or that they must have ejection fractions below a certain percentage. So we used very simple, easy to understand eligibility criteria to enroll subjects: anyone who in the view of the responsible physician was within 24 hours of onset of symptoms of acute MI and had no clear contraindication to treatment. We gave some guidelines to possible contraindications but they were suggestions only.

Dr. Tognoni: I am a bit surprised that so many of you object to the uncertainty principle. I think that broader inclusion criteria are mandatory in clinical trials in order to draw conclusions for larger populations. The earlier discussion concerning SOLVD focused on how to establish a cutoff point for ejection fraction. Why not simply declare that we are so uncertain about the biology that we must broaden our criteria for inclusion? Only then can we answer practical medical problems. The more we check assumptions concerning the disease, the more we find unexpected results.
Consider the surprising finding that streptokinase and t-PA are effective in prolonging life even if they are administered many hours after the infarction. If the thrombolytic trials had not entered late patients, we never would have learned that. We simply must accept the fact that clinical trials reveal the substantial uncertainty of our assumptions about biology. Therefore, we need wide entry criteria.

Dr. Collins: We should evaluate methodology in the same way that we evaluate treatment. We should use randomized trials to learn which methodology is best. For example, because of prejudices concerning the perceived superiority of placebo-controlled trials, we do not use randomized open-control trials even when we have good reason to believe that we could randomize a substantially greater number of patients in an open trial, which might increase substantially the ability of the trial to demonstrate benefit of a particular treatment. We might consider doing randomized pilot trials of open versus placebo-controlled designs to find out whether we could substantially increase recruitment and whether there would be any more likelihood of biases between the treatment groups with respect to concomitant treatment in an open study.

Dr. Terrin: We have strayed from the thrust of this workshop, which is on cost and efficiency. If there is a cost in a trial for features you can live without, then eliminate them and save your money.

Dr. Chalmers: In trials of acute myocardial infarction we have found, on average, significant differences between the treatment and the control group when the randomization is blinded and when it is not, but no suggestion of a difference when the treatment is blinded and when it is not. This is why I’m against open randomization approaches.

Dr. Gent: I’m concerned that Dr. Collins seemed to be saying that open studies may lead to a major advantage in patient recruitment. Recruiting many patients is pointless if it leads to bias in the subsequent course of treatment and in the evaluation.

Dr. Meier: A number of the approaches that we take are, I think, in the interest neither of the patient nor of efficiency in clinical trials. They are in the interest of self-protection. One tack that is perhaps less arguable is exclusions. We often exclude people with kidney damage or liver dysfunction, not because we suppose they will be excluded in practice, but because we are worried that if we include them in the study something bad will happen to someone. The drug may not be at fault, but we will be at fault. As a result, sometimes we study people whom we are comfortable studying rather than the group to whom we really do think the treatment is going to be generalized. That often applies to pregnant women. I am not clear that such caution is either ethical or efficient.

Dr. Chalmers: Dr. Meier didn’t mention the two most blatant examples of prejudices in selection of patients for clinical trials: age and sex. We recently looked at the age exclusions of 51 clinical trials of the prevention of recurrent acute myocardial infarction in patients discharged from the hospital after having had an infarction. Two-thirds of the trials excluded patients over the age of 65. The data from the National Hospital Survey, however, showed that over 50 per cent of the patients discharged alive were over the age of 65. Thus, the studies are confined to less than 50 per cent of the people who have the disease.

Dr. Detsky: The issue of inclusion criteria is relevant to a policy issue in Canada as it relates to age discrimination. The Coronary Bypass Studies all had an age limit. Recently Anderson and Lomas wrote a very provocative piece in a public policy journal noting that the majority of the growth in cardiac procedures in Ontario in the last decade has been in the elderly.¹ In the light of the fact that all the trials were in people under 65, they questioned whether government should pay for bypass surgery in the elderly. In terms of generalizability, it was a fair question. It was unfair in the sense that it is highly unlikely that another randomized trial of coronary bypass surgery will ever be done in anybody, no matter what. Thus, by excluding the elderly we will never know whether bypass surgery will be beneficial to them.

REFERENCE

1. Anderson, G. M. and Lomas, J. ‘Monitoring the diffusion of technology: coronary artery bypass surgery and Ontario’, American Journal of Public Health, 78, 251-254 (1988).
