Proposal for a core curriculum in medical biostatistics
Report of a Committee* of the American Statistical Association Subsection on Teaching of Statistics in Health Sciences

Preamble

This report was approved and accepted on August 27, 1974, at the annual meeting of the American Statistical Association Subsection on the Teaching of Statistics in the Health Sciences; and a recommendation was voted that the report be communicated to the medical community and to leaders in medical education. The committee that prepared the report had been appointed in May, 1972, by Dr. Edward N. Brandt, who was then Chairman of CEOMSI (Committee to Effect the Optimization of Medical-Statistical Interactions). CEOMSI, which was originally established in 1969 by the late Dr. Lyon Hyams and by Dr. Stanley Schor, subsequently became affiliated with the ASA Subsection. Our committee's charge was to develop guidelines and to define content for a core of biostatistical knowledge for medical students.

At its first meeting, the committee agreed unanimously that no one is really satisfied with current courses in medical biostatistics and that no one would be totally satisfied with whatever changes might be proposed. The committee also agreed unanimously on four other points that allowed productive progress: (1) most current courses have been greatly unbalanced, with most of the emphasis on statistical techniques and not enough attention to the design (or architecture) of the research that is subjected to statistical techniques; (2) it is possible to outline a "core" of architectural principles for medical research; (3) it is possible to outline a "core" of inherently statistical techniques; and (4) the combined "core" that is formed by these two outlines is neither final nor immutable and can be modified, at the discretion of individual instructors, in focus, sequence, and contents.

Our focus was deliberately set on the intellectual ingredients of the "core curriculum", without regard to challenges in pedagogic methods of instruction, or to the selection of suitable exercises and problems that can evoke additional interest and provide practical experience. The material contained in this proposal is necessary for any physician who plans to do research or to understand the general literature of his profession, but the information need not be learned exclusively in a formal course of biostatistics. Some of the material cited here, and certain omissions such as "vital statistics" and the role of computers in medicine, may be learned as part of other courses.

To produce this document, the committee engaged in various communications by letter and by telephone, and held three direct meetings between August, 1972, and December, 1973. The document represents a reasonable consensus rather than unanimous agreement, and we recognize that it cannot evoke universal assent. Nevertheless, we also recognize that if a "core" is to be constructed for medical biostatistics, someone must start somewhere.

* Names and institutions of the committee members are as follows: Philip G. Archer, Sc.D. (University of Colorado); Alvan R. Feinstein, M.D., Chairman (Yale University); Paul E. Leaverton, Ph.D. (University of Iowa); Richard B. McHugh, Ph.D. (University of Minnesota); Alfred A. Rimm, Ph.D. (Medical College of Wisconsin); Stanley Schor, Ph.D. (Temple University); Irene M. Trawinski, Ph.D. (University of Alabama in Birmingham); Sidney M. Stahl, Ph.D. (Purdue University); and Elbert B. Whorton, Ph.D. (University of Texas at Galveston).

Introduction to the core curriculum

Biostatistics contains three main classes of methodologic procedures that are useful in medical science: (1) Methods for designing and conducting research; (2) Methods for displaying and presenting the data; and (3) Methods for analyzing the data to determine associations, make comparisons, and draw conclusions. Since all of these techniques are used in the research reported in medical literature, one of the main purposes of teaching biostatistics in medical school is to ensure that the student is able to read, understand, and interpret scientific papers in medical journals. The ability to appraise the literature critically is a necessary and invaluable skill for every physician, regardless of specialty or other professional activities.

This document is prepared according to the general organization of biostatistical methods. Parts I and II are concerned with research design. Since the topics of Parts I and II have not always been included in the conventional teaching of biostatistics, the descriptions are much more detailed than those of the subsequent three parts, which are presented in more concise form. Part III is concerned with presentation of data; and Parts IV and V deal with methods of data analysis.

I. Types of medical studies. The purpose of this section is to note the diverse ways in which medical research is conducted. An awareness of the diversity and of the different designs that are used in the research is a prerequisite to selecting and evaluating appropriate statistical procedures.

A. Procedures for measurement and assessment. These are the activities that create and evaluate the "instruments" used to provide data. The instruments include such diverse entities as a spectrophotometer that measures serum calcium, a physician auscultating a heart, and a questionnaire that a patient fills out to indicate satisfaction with health care.
The methodologic research devoted to these instruments and their data is a precursor of any subsequent surveys or experiments. The scientific quality of the

Clinical Pharmacology and Therapeutics

data depends on processes (cited with such names as quality control) by which the reliability of the instruments is established and maintained. One form of evaluation is concerned with the instrument's inherent performance on a particular "specimen". (What result is obtained if the same lab does the test three times on the same specimen? What result is obtained when other labs test this same specimen? Does the patient really understand what was asked in Question No. 23? Would three different graders give the same rating to the patient's response?) A second type of evaluation is concerned with the external connotation of the data. (Does an IQ test really measure intelligence? How accurate is the rheumatoid factor test in identifying patients with and without rheumatoid arthritis?) This type of evaluation is discussed later under "Validity".

B. Descriptive surveys. These studies are commonly done to indicate or to find out what is happening. The results show the existence or distribution of selected phenomena, and no attempt is made to engage in comparative analyses. The studies may be intended, like a "Case Report", to call attention to an interesting finding or group of patients. More often, the work is done to provide precise data about phenomena that were previously not well described. The results often serve as a background or stimulus for subsequent analytic research. Examples: Surveys of serum cholesterol values in healthy men, ages 40-49; incidence of deaths due to coronary heart disease in the United States during 1970; distribution of doctors and medical costs in rural areas.

C. Analytic research. Most analytic studies are performed to answer a question involving a comparison. The ingredients of the design can be outlined as follows: What GROUP of individuals has received what MANEUVER (by self-selection or pre-planned assignment) to achieve what RESPONSE? In this concept, the word maneuver refers to the principal agent whose effects are to be noted.
The maneuver may be a suspected cause of disease, a mode of therapy, a socioeconomic condition, or some other agent; e.g., coronary artery disease may be studied to see

Volume 18 Number I

whether it is caused by cigarette smoking, prevented by a low-fat diet, or alleviated by vein-graft surgery. This outline of group, maneuver, and response will be used for the elements of design discussed in the sections that follow. In an experiment, the investigator chooses and assigns the maneuvers to which each person (or experimental substance) is exposed. In an analytic survey, the maneuvers under study were either self-selected by the recipient or assigned by someone else (usually a physician).

1. Experiments
a. Human
(1). Clinical trials: Planned tests of interventional therapeutic agents. The therapy may be remedial (to alter an existing entity, such as headache) or prophylactic (to prevent new occurrence of a disease or to prevent adverse complications of an established disease).
(2). Other: Studies performed to explain biologic or pharmaceutical mechanisms, to test health services delivery, to investigate observer variability, etc.
b. Laboratory: Therapeutic, explicative, or other research performed on animals or inanimate substances.
2. Analytic surveys
a. Etiologic: Associations of disease and suspected causal agents; e.g., occurrence of thrombophlebitis among users and non-users of oral contraceptive pills; occurrence of pill-taking among women with and without thrombophlebitis.
b. Therapeutic: Collected results of agents assigned ad hoc for prophylactic or remedial purposes.
c. Other: Comparisons of mortality rate at different temporal eras; detection of "risk factors" or "prognostic predictors".

II. Elements of design. Because statistics plays so major a role in the Analytic Research described in Section I-C, the ingredients of that type of research will be emphasized here. The comments pertain to either experiments or analytic surveys.

A. The maneuvers
1. Types of comparison
a. Agents: Choice of new treatment vs. no treatment, placebo, old treatment or other new treatment.



b. Dosage: High dose vs. low dose or other arrangements.
c. Timing: Concurrent observations via parallel or "cross-over" groups; "historical controls".
2. Sources of bias in conduct of maneuvers
a. Unequal performance: E.g., radical operation done by skillful surgeon compared against simple operation done by inexperienced surgeon.
b. Unequal compliance: E.g., Drugs A and B have same pharmaceutical effectiveness if taken by patients, but Drug A tastes terrible and is avoided, making it seem ineffective.
c. Unequal ancillary maneuvers: E.g., radical operation plus special care in recovery room compared against simple operation without use of recovery room.

B. The groups
1. Admission criteria: Specifications of the particular conditions of age, disease, etc. that allow people to be included in the group(s) under investigation; relation of these groups to the "target population".
2. Allocation: Formation of the groups as the maneuvers are assigned by self-selection, "clinical judgment", randomization, etc.
3. Stratification: Attempts to avoid bias and/or increase precision in the research by reducing the degree of heterogeneity in the compared groups. The comparison is carried out in matched pairs or in subgroups of "homogeneous" strata. The strata may be selected prognostically as groups with distinctly different degrees of "risk" for an identified outcome event.
4. Chronologic tactics
a. Longitudinal: Association of a "prospectively" observed antecedent condition and a subsequent effect (cohort groups).
b. Cross-sectional: Association of two or more entities noted concomitantly. (E.g.: Relation of serum cholesterol and hematocrit in a group of patients with anemia. Relation of income level and educational background in people who choose pre-paid medical care.)
c. "Retrospective": In a cross-sectional arrangement, the subsequent effect, observed in the "cases" and "controls", is associated with an antecedent condition. The sequence of a longitudinal cohort study is reversed in these "case-control" or "trohoc" groups.
5. Sources of bias in groups
a. Selection bias: In the individual's transfer from general population to "parent" (or "target") population to the groups actually under investigation.
(1). Formation of groups by random samples vs. convenience samples, volunteer samples, "pure-disease" samples, or compliance samples. Examples: Patients at a private clinic may have ethnic backgrounds, socio-economic states, or clinical severity that is different from patients in municipal hospitals; "volunteers" may differ psychologically from general population; exclusion of patients with co-morbid diseases may produce a "pure" group that does not properly represent the disease under investigation; people who can comply with a complex or difficult experimental protocol may differ from the non-compliers who are omitted from the study.
(2). Bias produced when maneuvers are self-selected or judgmentally assigned rather than randomly allocated.
b. Chronologic bias
(1). Effect of losses of original cohort from the "survivors" studied in cross-sectional and "retrospective" surveys.
(2). Effect of drop-outs or other losses from cohort studies.
(3). "Longitudinal cross-sections". These are cross-sections taken at different points in time, giving the illusion but not the substance of a cohort follow-up study. E.g., in cross-sectional studies, rheumatic heart disease was found to have a much higher prevalence in patients with recurrences of rheumatic fever than in patients with first attacks. The erroneous conclusion was that recurrent attacks regularly create rheumatic heart disease in patients who were initially free of it. In cohort studies, it was found that recurrent attacks were particularly likely to develop in patients who already had acquired rheumatic heart disease in their first attack.

C. The response
The term "Response" is used to encompass the variables that characterize the baseline state (before the maneuver), the subsequent state (afterward), and the transition between the two states.
1. Observations of variables: Procedures used to improve objectivity (e.g., "double-blind" techniques) and to assess accuracy, reproducibility, and reliability of data-acquisition techniques.
2. Sources of bias in measuring variables
a. Perception and observation: Prejudice when patient or doctor is aware of identity of compared treatments.
b. Validity: May be compromised by the "substitution game" in which an easily measured variable (e.g., size of pupil) is substituted for the variable that is more important but more difficult to measure (e.g., psychologic distress).
c. Detection: Inequalities in intensity of procedures used to detect target events (e.g., more frequent medical examinations in users of oral contraceptive pills than in women using mechanical devices).
3. Expression of observation
a. Choice of target variables to be used for indicating principal responses. The unit of expression: people, events, concentrations, etc.
b. Choice of scales and criteria for individual "measurements".
(1). Qualitative (categorical): Nominal and ordinal scales.
(2). Quantitative (dimensional): Discrete counts; continuous (measurement) data.
c. Choice of "dependent" and "independent" variables for analysis of data.
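The allocation and stratification items listed under "The groups" can be made concrete with a small sketch. The Python fragment below is only an illustration under stated assumptions, not a scheme prescribed by the committee: the patient identifiers, the two risk strata, the arm names, and the function `allocate_within_strata` are all hypothetical. It shuffles the members of each stratum at random and then alternates the arms, which is one simple form of balanced random allocation within strata.

```python
import random
from collections import Counter, defaultdict


def allocate_within_strata(patients, arms=("treatment", "control"), seed=1974):
    """Randomly allocate patients to arms, balancing the arms inside each stratum.

    `patients` is a list of (patient_id, stratum) pairs. The seed is fixed only
    so that the illustration is reproducible.
    """
    rng = random.Random(seed)

    # Group patient ids by prognostic stratum.
    strata = defaultdict(list)
    for patient_id, stratum in patients:
        strata[stratum].append(patient_id)

    allocation = {}
    for members in strata.values():
        # Random order within the stratum, then alternate the arms so that
        # each arm receives (nearly) the same number of patients per stratum.
        rng.shuffle(members)
        for position, patient_id in enumerate(members):
            allocation[patient_id] = arms[position % len(arms)]
    return allocation


# Hypothetical cohort: 12 patients, every third one in the "high risk" stratum.
patients = [
    (f"pt{i:02d}", "high risk" if i % 3 == 0 else "low risk") for i in range(12)
]
allocation = allocate_within_strata(patients)

# Tabulate how many patients of each stratum landed in each arm.
arm_counts = Counter((stratum, allocation[pid]) for pid, stratum in patients)
for key in sorted(arm_counts):
    print(key, arm_counts[key])
```

Because the assignment is decided by the shuffled order rather than by self-selection or "clinical judgment", the compared arms contain the same mix of high-risk and low-risk patients, which is the purpose that the outline ascribes to stratification.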

...

Note: The sections that follow contain the more conventional topics of statistical instruction. Not all of these topics need to be discussed in detail. Some topics should be covered briefly, mainly for purposes of definition or background. Other topics should be covered very briefly, mainly to be cited and brought to the student's attention as something that exists. In making these selections, the instructor should clearly distinguish between the pragmatic needs of a physician and the mathematical goals of a statistician.

III. Univariate descriptions of groups (data reduction)


A. Types of expressions
1. Counted data: Proportions (percentages) and rates; incidence rates in longitudinal studies; prevalence rates in cross-sectional or "case-control" data.
2. Measured data: Average (mean).
3. Ratios of rates: Relative risk ratio.
4. Life-table (actuarial) techniques.
5. Sensitivity, specificity, and validity (of a diagnostic test); false negative and false positive results.

B. Distribution of a group
1. Frequency tables and polygons; bar graphs and histograms.
2. Patterns of distribution: Gaussian (normal); skewness; other distributions.

C. Characteristics of a distribution
1. Measures of location
a. Central tendency: Mean, median, mode.
b. Percentiles.
2. Measures of dispersion
a. Standard deviation; variance; coefficient of variation.
b. Range and percentile-ranges.

IV. Probability and uncertainty

A. Terminology and concepts
1. Probability of an event.
2. Mutually exclusive, independent, and conditional events.
3. Parameters and estimators.
4. Random sampling, random numbers, and random allocation.
5. The central limit theorem and its consequences.

B. Applications of probability
1. P-values and areas of the Gaussian curve.
2. Sampling distributions and standard error.

V. Analysis and interpretation of data

A. Point and interval estimation

B. Bivariate associations
1. Contingency tables (2-way); scattergrams.
2. Coefficients of correlation: Pearson; Spearman; Kendall's tau.
3. Linear regression for two variables.

C. Tests of statistical significance
1. Basic concepts
a. The null hypothesis and alternatives.



b. The choice of α (Type I errors) and statistical significance.
c. The interpretation of "significance".
(1). Clinical vs. statistical significance.
(2). The role of sample size (determination of sample size).
(3). Type II (β) errors.
d. Parametric and non-parametric concepts.
e. Relation of types of data to tests.
f. Robustness and the violation of basic assumptions.
2. Specific tests
a. The t-test for means (paired and unpaired).
b. The chi-square test for proportions.
c. Fisher exact test and randomization (permutation) tests.
d. Significance of correlation coefficient.
e. Non-parametric tests: One-sample (paired) and 2-sample rank tests; sign test.
f. Multiple comparison tests.

D. Multiple groups and variables
1. Analysis of variance.
2. Multiple linear regression.
3. Discriminant function analysis.

Summary

This document contains guidelines and suggested content for a core of biostatistical knowledge for medical students. The material is divided into five main sections. The first two, concerned with research design, indicate the different forms of medical research and the elements of scientific design for analytic experiments and surveys. The third part is concerned with the various expressions used for presenting summaries and displays of data. The fourth part includes basic concepts and procedures in the statistical application of probability theory. The last part is devoted to statistical procedures for the analysis and interpretation of data. This core of knowledge, which need not be learned exclusively in a formal course of biostatistics, is necessary background for any physician who plans to do research or to understand the general literature of the medical profession.
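As a brief illustration of the expressions named in Part III-A (sensitivity, specificity, false negative and false positive results, and the relative risk ratio), the following Python sketch computes them from counts. All counts are invented for the example, and the function names `diagnostic_summary` and `relative_risk` are hypothetical; the sketch shows the arithmetic only and makes no claim about any particular test or disease.

```python
def diagnostic_summary(true_pos, false_pos, false_neg, true_neg):
    """Summarize a diagnostic test from the four cells of a 2 x 2 table."""
    diseased = true_pos + false_neg        # everyone who has the disease
    non_diseased = true_neg + false_pos    # everyone who does not
    return {
        # Proportion of diseased people whom the test correctly calls positive.
        "sensitivity": true_pos / diseased,
        # Proportion of non-diseased people whom the test correctly calls negative.
        "specificity": true_neg / non_diseased,
        "false_negative_rate": false_neg / diseased,
        "false_positive_rate": false_pos / non_diseased,
    }


def relative_risk(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Ratio of the incidence rate in the exposed group to that in the unexposed group."""
    rate_exposed = exposed_events / exposed_total
    rate_unexposed = unexposed_events / unexposed_total
    return rate_exposed / rate_unexposed


# Hypothetical counts: 100 diseased and 100 non-diseased people tested.
summary = diagnostic_summary(true_pos=80, false_pos=10, false_neg=20, true_neg=90)

# Hypothetical cohort: 20 events among 100 exposed, 10 events among 100 unexposed.
risk_ratio = relative_risk(20, 100, 10, 100)

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
print(f"relative risk: {risk_ratio:.2f}")
```

The relative risk here uses incidence rates, so it presumes longitudinal (cohort) data of the kind described in Part II; case-control data do not supply those rates directly.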
