INVITED REVIEW SERIES: MODERN STATISTICAL METHODS IN RESPIRATORY MEDICINE
SERIES EDITORS: RORY WOLFE AND MICHAEL ABRAMSON

Modern statistical methods in respiratory medicine

RORY WOLFE AND MICHAEL J ABRAMSON
School of Public Health and Preventive Medicine, Monash University, Melbourne, Victoria, Australia

ABSTRACT

Statistics sits right at the heart of scientific endeavour in respiratory medicine and many other disciplines. In this introductory article, some key epidemiological concepts such as representativeness, random sampling, association and causation, and confounding are reviewed. A brief introduction to basic statistics covering topics such as frequentist methods, confidence intervals, hypothesis testing, P values and Type II error is provided. Subsequent articles in this series will cover some modern statistical methods including regression models, analysis of repeated measures, causal diagrams, propensity scores, multiple imputation, accounting for measurement error, survival analysis, risk prediction, latent class analysis and meta-analysis.

Key words: association, causality, estimation, sampling, statistics.

Correspondence: Michael Abramson, Department of Epidemiology & Preventive Medicine, School of Public Health & Preventive Medicine, Monash University, The Alfred, Melbourne, Vic 3004, Australia. Email: [email protected]

The Authors: Professor Rory Wolfe, BSc, PhD, Professor of Biostatistics at the School of Public Health and Preventive Medicine, has broad research interests in biostatistics. Michael Abramson, MB, BS, PhD, FRACP is Professor of Clinical Epidemiology in the School of Public Health and Preventive Medicine, and a Specialist Physician in Allergy, Immunology and Respiratory Medicine at the Alfred Hospital in Melbourne. He has taught introductory biostatistics and uses statistical methods in his research.

Received 14 October 2013; accepted 24 October 2013.

© 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology

INTRODUCTION

Statistical methods have advanced rapidly over the past 30 years as computing power has increased dramatically and computer availability has become commonplace. Where in the past methods became established because of their computational simplicity, nowadays methods can be exceedingly complex yet accessible in an instant using an everyday computing device. Research studies have also become far more complex, and the typical size of studies has increased markedly.

Nearly 40 years ago, the key statistical methods reported in the British Medical Journal were the two-sample (Student's) t-test and the chi-square test.[1] Thirty years ago, nearly three-quarters of articles in the New England Journal of Medicine were understandable to a reader familiar only with descriptive statistics, the t-test and situations in which the chi-square test was used.[2] However, the landscape of statistical methods has changed radically. The initial wave of this change was a move to analyses involving regression models, and as an illustration of this shift, Respirology in its first volume in 1996 included papers reporting results from Cox proportional hazards regression models and logistic regression.

At the present time, the consumer of medical research needs familiarity with a broad and diverse array of statistical methods. In 2013, Respirology's original research articles involved widespread use of t-tests, linear and logistic regression, as well as more special-purpose analyses such as negative binomial regression, ordinal logistic regression, conditional logistic regression, Bland–Altman plots for analysis of agreement and receiver operating characteristic analysis. More broadly, the American Journal of Respiratory and Critical Care Medicine published articles in 2013 that used methods such as multiple regression, generalized estimating equations for repeated measures, multiple imputation for missing data, land-use regression for spatially arranged data, multilevel proportional hazards modelling, cluster analysis and latent transition analysis.

Statistics sits right at the heart of scientific endeavour. The modern statistician has important roles in the design of research studies that advance human knowledge in specific ways, and in guiding analyses of the quantitative data that arise from such studies.
Statistical methodology is a continually evolving field of knowledge in its own right and provides novel study design proposals and novel methods for data analysis. The discipline of statistics also provides theoretical evaluations of these newly proposed designs and methods in efforts to ensure that those shown to involve desirable statistical properties such as unbiasedness and efficiency achieve preferential uptake in the wider practice of scientific enquiry. Nevertheless, there is an element of the ad hoc to the uptake of statistical methods; hence guidance on the appropriate use of methods, on which methods are preferable and on more sophisticated options is part of a broader education role that the statistician plays.

The advances in statistical methodology that replace old ways with new, or introduce new paradigms, require continual refreshing of scientists' understanding of the appropriateness of the statistical methods that are at their disposal. So, it is our pleasure to introduce this series of articles on modern statistical methods that aims to educate, update and introduce scientists in the field of respiratory health to a range of methods that are to the fore of the modern statistical toolkit. Before introducing the topics that are contained in the articles of this series, we set the scene for their contents by describing some important underlying concepts for statistical methodology and briefly bringing simple methods of analysis to the reader's attention.

Respirology (2014) 19, 9–13 doi: 10.1111/resp.12223

STUDY DESIGN AND STUDY PARTICIPANTS

Statistical considerations begin at the point of designing a research study and considering who will form the study participants. A critically important concept in identifying which individuals to approach to participate in respiratory health studies is representativeness. The study participants should be representative of the broader population of individuals that interests us. If we want to know about respiratory health in the general population, then we should not recruit participants from a hospital respiratory clinic. If we want to know about a drug's effectiveness in primary care patients, then those are the individuals who need to be studied.

A useful statistical principle is the notion of random sampling. The best chance of a study having participants who are representative of the population is if the participants are selected at random from that population. In practice, it is not always easy to identify the exact population or to select from it in an entirely random way, and we are not able to select who participates from those invited to do so. Nevertheless, the principle guides us to invite people who have been chosen according to a random process from a population that is well defined. Selection completely at random requires the use of randomly generated numbers. Other ideas such as selecting every fourth name in a list of individuals are not sufficient and run the risk of introducing selection bias (which is defined as selection of participants from the population in a non-random way). There can be unavoidable selection bias if the study participants differ in some systematic way from individuals who were approached to participate but declined.

The process of selection of study participants needs to be embedded in a study design strategy. The strategy should map to the key scientific question that will be asked of the data collected in the study. Thus, randomized, controlled clinical trials are used to test the effectiveness of a new treatment in comparison with placebo or usual care. This design aims to have a group of study participants who receive the new treatment and a group who receive placebo or usual care that are comparable at the time of randomization in every other way. Cohort studies are long-term endeavours that are useful for prospective exploration of the development of disease. These studies in particular, but almost all research studies to some extent, can suffer from 'attrition' bias, which stems from the loss of study participants over time. Some people choose to cease their participation; for others, declining health can prevent them from attending follow-up study visits. The possibility of attrition bias arises if participants are lost to follow-up in a way that is not completely at random.

In the following sections, we will discuss the notion of representing biological relationships with mathematical models, enabling statistical analysis. Embedding these models in a statistical framework enables us to make inferences about disease in the population of interest, an inference that goes beyond simply reflecting on what we observed in the people who happened to participate in the research study. To achieve this, the models and the framework for their use can incorporate elements to deal with possible selection bias or attrition bias.
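As a minimal sketch of selection completely at random, the Python snippet below draws a simple random sample from a sampling frame using pseudo-random numbers, and contrasts it with the 'every fourth name' rule mentioned earlier in this section. The frame of 1000 identifiers, the sample size of 50 and the seed are hypothetical values chosen purely for illustration.

```python
import random


def simple_random_sample(frame, n, seed):
    """Draw n distinct units from the frame; every subset of size n is equally likely."""
    rng = random.Random(seed)  # seeded so that the selection can be reproduced and audited
    return rng.sample(frame, n)


def every_kth(frame, k):
    """Systematic selection (every k-th name), shown only for contrast."""
    return frame[k - 1::k]


# Hypothetical sampling frame: identifiers for a well-defined population
frame = [f"ID{i:04d}" for i in range(1, 1001)]

invited = simple_random_sample(frame, n=50, seed=2013)
print(len(invited), len(set(invited)))  # 50 people invited, none chosen twice
print(every_kth(frame, 4)[:3])          # fixed by list order: ID0004, ID0008, ID0012
```

The systematic rule depends entirely on how the list happens to be ordered, which is why it can introduce selection bias; the random draw does not.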

ASSOCIATION, CAUSATION AND STATISTICAL ANALYSIS

What is the purpose of a statistical analysis? A good analysis related to respiratory health of humans should articulate in quantitative terms a question about the course of the disease. For example:
• How are various factors such as age, sex and height associated with a respiratory health indicator such as forced expiratory volume in 1 s (FEV1)?
• Do different values of a modifiable factor such as dose of an inhaled steroid relate to improvements in asthma control?
• Is it possible to predict the future trajectory of respiratory health in individuals?
• Can causes of poorer respiratory health be disentangled from other factors that exhibit strong associations with poorer respiratory health, but which are not biological causes?

Statistical analysis relies upon the conversion of research questions into assumed or hypothesized biological relationships among interrelating factors and the articulation of these assumptions in a mathematical equation that is embedded in a statistical framework. This sets the scene for the statistical analysis of data collected in a research study. Of interest, but outside our scope, is discussion of competing statistical frameworks, frequentist and Bayesian, that set the scene in different ways.[3] One distinction between the two approaches is the potential in the Bayesian approach for explicit incorporation of prior knowledge or prior beliefs regarding an exposure-disease association of interest that exist when one commences collecting data within a new research study. We proceed with the more commonly encountered frequentist approach that, for better or worse, has evolved with a strong focus on the testing of 'null' hypotheses as an approach to statistical inference (which we discuss further in the next section).

Statistical analyses, on their own, are unable to distinguish between factors that are merely associated with respiratory disease and factors that are causally related. This subtle yet critical distinction between causation and association requires a broader framework of assumptions relating to causation. Such criteria were first clearly stated by one of the founders of medical statistics, Sir Austin Bradford Hill, in a speech to the Royal Society of Medicine in 1965.[4] As refined by epidemiologists since, the criteria that need to be satisfied before we can conclude that an environmental factor causes disease are:
• Strong study designs: particularly randomized, controlled trials (if ethical and feasible) or longitudinal (cohort) studies.
• Strong associations: typically relative risks of 2.5 or greater.
• Consistency: the same associations being demonstrated by different studies.
• Dose–response relationship: the greater the exposure, the greater the risk of disease.
• Temporality: the exposure must occur before the disease outcome. While this might sometimes seem obvious, it cannot be established in a case–control study.
• Biological plausibility: the association should not be inconsistent with biological knowledge from in vivo and in vitro experiments.

Another key element of a broader framework is the epidemiological concept of confounding. This concept says that when we are interested in a possibly causal relationship between individuals' exposure to a risk factor and a disease, we may need to take into account other factors, termed confounders, that are associated with poorer health and that also happen to be associated with the risk factor of interest.
As an example, consider a hypothetical report of a relationship between drinking coffee and the risk of lung cancer (Table 1). These data were collected in a case–control study that interviewed 100 patients with lung cancer and 100 controls selected from the same source population. The appropriate summary measure of association in a case–control study is the odds ratio, which can be calculated from the data in Table 1 as:

$$\text{Odds ratio} = \frac{68 \times 56}{32 \times 44} = 2.7 \qquad (1)$$

This is consistent with the conclusion that drinking coffee increases the risk of lung cancer almost threefold. It is possible that this association could have been found by chance (random error). It is also possibly the result of bias (systematic error) of one sort or another. For example, case–control studies are almost always affected by recall bias because the participants are typically asked retrospectively about their exposure (here, coffee drinking).
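The calculation in equation (1) can be sketched in a few lines of Python. The function and argument names are our own, introduced for illustration; the counts are those of Table 1.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio for a 2 x 2 case-control table: odds of exposure in cases over odds in controls."""
    odds_in_cases = exposed_cases / unexposed_cases
    odds_in_controls = exposed_controls / unexposed_controls
    return odds_in_cases / odds_in_controls


# Counts from Table 1 (coffee drinking is the exposure)
crude_or = odds_ratio(exposed_cases=68, unexposed_cases=32,
                      exposed_controls=44, unexposed_controls=56)
print(round(crude_or, 1))  # 2.7, matching equation (1)
```

Writing the ratio as odds among cases divided by odds among controls is algebraically the same as the cross-product form (68 × 56)/(32 × 44).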

Table 1 An observed relationship between lung cancer and self-reported coffee drinking in 200 people

              Lung cancer    Controls
+ Coffee           68            44
− Coffee           32            56
Total             100           100

Figure 1 Potential relationships between coffee drinking, cigarette smoking and lung cancer. [Diagram with nodes Coffee, Lung cancer and Cigarette smoking; one link is labelled '??'.]

However, we also need to consider the likelihood of confounding. Cigarette smoking is a well-recognized risk factor for lung cancer. Yet cigarette smoking is also associated with drinking coffee. These relationships are summarized in Figure 1. One way of exploring whether confounding exists is to undertake the analysis separately in groups of study participants defined by whether or not they reported cigarette smoking (Table 2). When this is done, it can be seen that there is no association between drinking coffee and lung cancer in either the smokers (odds ratio = 1) or the non-smokers (odds ratio = 1). So the apparent association was almost entirely due to confounding. The most recent meta-analysis suggests only a very modest association between drinking more than two cups of coffee per day and lung cancer (pooled relative risk = 1.14, 95% confidence interval 1.04–1.26).[5]
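The stratified analysis can be sketched as follows, using the counts of Table 2. The Mantel-Haenszel summary odds ratio at the end is our addition for illustration and is not part of the analysis described above; it is one standard way to pool stratum-specific odds ratios into a single adjusted estimate.

```python
def odds_ratio(a, b, c, d):
    """a, b: exposed cases and controls; c, d: unexposed cases and controls."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio over strata of (a, b, c, d) counts."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Table 2: (coffee cases, coffee controls, no-coffee cases, no-coffee controls)
strata = {"smokers": (64, 32, 16, 8), "non-smokers": (4, 12, 16, 48)}

for name, counts in strata.items():
    print(name, odds_ratio(*counts))  # 1.0 in each stratum

# Ignoring smoking (summing over strata) recovers Table 1 and the crude odds ratio
crude = tuple(sum(column) for column in zip(*strata.values()))
print(crude, round(odds_ratio(*crude), 1))  # (68, 44, 32, 56) 2.7

print(mantel_haenszel_or(list(strata.values())))  # adjusted for smoking: 1.0
```

The crude odds ratio of 2.7 and the smoking-adjusted value of 1 come from exactly the same 200 people; only the handling of the confounder differs.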

Table 2 The relationship between lung cancer and self-reported coffee drinking in smokers and non-smokers

Smokers
              Lung cancer    Controls
+ Coffee           64            32
− Coffee           16             8
Total              80            40
OR = 1

Non-smokers
              Lung cancer    Controls
+ Coffee            4            12
− Coffee           16            48
Total              20            60
OR = 1

OR, odds ratio.

METHODS OF STATISTICAL ANALYSIS: A BRIEF INTRODUCTORY TOUR

We now touch on some of the more practical aspects of undertaking statistical analysis of data collected in a research study. We encourage researchers and authors to present their data graphically whenever possible. A boxplot or histogram often summarizes data effectively, with summary statistics such as the mean and standard deviation, or the median and interquartile range, serving as succinct descriptors. The distinction between categorical variables (such as living/deceased and male/female) and continuous variables (such as forced expiratory volume in 1 s) is important and leads to different methods of analysis.

Basic frequentist statistical methods, such as the two-sample t-test for comparing the means of a continuous variable between two groups of study participants, are still perfectly valid and sometimes all that is necessary. The two-sample t-test involves the specification of a null hypothesis of no difference between group means in the population of interest and a competing hypothesis of some unspecified difference in these means in the population. With modern computing, it is now possible for researchers to use the frequentist approach to test other hypotheses, although to do so it is necessary to prespecify the magnitude of the association that is hypothesized, and this can be achieved more naturally in a Bayesian approach.

We encourage a focus on estimation of quantities of interest, for example an exposure-disease association as described by an odds ratio, and on the use of confidence intervals to characterize the uncertainty due to the random aspect of sample selection in this estimation process. The testing of specific hypotheses has been a dominant paradigm in the past, and this approach still provides some useful concepts, for example for determining sample size at the design stage of a study.

Testing hypotheses involves the calculation of a P-value. When testing a general null hypothesis statement that 'there is no association between risk factor and disease in the population', the P-value is the probability of observing an association such as the one exhibited in our study participants, or one that is even stronger, despite no such association existing in the broader population (this absence of association being our null hypothesis, not a fact). By convention, we reject the null hypothesis if the P-value is smaller than 0.05 (5%); this rule governs the probability of our making a 'type I error', that is, rejecting the null hypothesis despite its actual truth. There is now much less emphasis on P-values in medical journals;[3] for example, the journal Epidemiology strongly discourages their use. It is generally considered that more useful information is conveyed by a confidence interval.
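To make the ideas of estimation, confidence intervals and P-values concrete, the sketch below applies them to the crude odds ratio from Table 1. It assumes the usual large-sample (Wald) approximation on the log odds ratio scale, which is our choice for illustration and is not a method prescribed in this article; the function name is likewise our own.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_wald(a, b, c, d, level=0.95):
    """Odds ratio with a Wald confidence interval and two-sided P-value.

    a, b: exposed cases and controls; c, d: unexposed cases and controls.
    """
    log_or = log((a * d) / (b * c))
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)         # standard error of log(OR)
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    lower = exp(log_or - z_crit * se)
    upper = exp(log_or + z_crit * se)
    z = log_or / se                                   # test statistic under H0: OR = 1
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return exp(log_or), (lower, upper), p_value


estimate, (lower, upper), p = odds_ratio_wald(68, 44, 32, 56)
print(f"OR = {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}, P = {p:.4f}")
```

Under this approximation the interval runs from about 1.5 to about 4.8 and the P-value is well below 0.05. Note that this is the crude association that the stratified analysis attributed to confounding by smoking: a narrow interval or small P-value quantifies random error only and offers no protection against bias.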
By convention, one typically reports 95% confidence intervals, and these represent an interval within which we might reasonably expect the population value of an association to lie.[6]

When planning a research study, a part of the hypothesis testing framework that is of particular use is the concept of type II error: failing to find in our study participants evidence of a clinically important difference of specified magnitude that does exist in the population. The chance of making a type II error can be controlled by the choice of sample size for a study. The complement of the type II error probability is termed the power of the study, and studies typically are designed to have 80% or 90% power for the primary research question being investigated. We encourage researchers to consult a statistician when designing a study, so study design can be discussed, and sample size and power calculations can be performed.

It is not our intention to provide a detailed treatment of specific statistical tests or guidelines for their interpretation. We would refer readers to excellent medical statistical textbooks such as those by Altman,[7] or Kirkwood and Sterne,[8] textbooks aimed at biological scientists such as Zar,[9] to an existing series of statistics articles aimed at an introductory level[10] and a landmark series of statistical notes in the British Medical Journal, accessible at http://www.bmj.com/bmj-series/statistics-notes. There are some more specialized applied topics such as reference ranges, statistics in genetic research, statistics in epidemiological research, and statistics in public health and policy that lie beyond our scope.
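As a rough illustration of how sample size controls the type II error, the sketch below uses the standard normal-approximation formula for comparing two means with equal group sizes, n per group = 2(z_{1-α/2} + z_{power})² σ²/δ². The planning values (a 0.2 L difference in mean FEV1 and a standard deviation of 0.5 L) are hypothetical, and a real calculation should be made with a statistician and would usually refine this approximation.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate participants per group for a two-sided comparison of two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # controls the type I error rate
    z_power = NormalDist().inv_cdf(power)          # power = 1 - type II error rate
    return ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)


# Hypothetical planning values: difference of 0.2 L in mean FEV1, SD of 0.5 L
for power in (0.80, 0.90):
    print(power, n_per_group(delta=0.2, sd=0.5, power=power))  # 99, then 132
```

Raising the power from 80% to 90% increases the required sample size by about a third in this example, which is why the target power is worth settling at the design stage.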

THIS SERIES ON MODERN STATISTICAL METHODS

Now, we come to the papers in this series in Respirology. The series opens with an article that outlines statistical regression models.[11] The models included in the article (linear, logistic and ordinal logistic regression) are among the most commonly used. Many more models exist for increasingly specialized applications. The regression model framework pulls together many apparently disparate methods of analysis. For example, the two-sample t-test, analysis of variance and analysis of covariance can all be viewed as special cases of the linear regression model. Regression underpins many of the other articles in the series, so this article is central to the series, as regression is central to modern medical statistics.

An important extension of the linear regression model is to the situation of analysing a number of repeated measures from an individual study participant over time. The second article[12] outlines this extension by introducing the 'mixed model' for measures made on a continuous scale of measurement. Indeed, the series could have included articles on the extension of logistic regression to a mixed model formulation to analyse repeated measures on a binary scale or of ordinal logistic regression for repeated measures on an ordinal scale. However, those extensions bring additional complexity that is not present in the linear regression case, so we refer the interested reader to other sources.[13]
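The earlier remark that the two-sample t-test is a special case of linear regression can be checked numerically. In the sketch below, the FEV1 values are hypothetical and the helper functions are our own; regressing the outcome on a 0/1 group indicator gives a slope equal to the difference in group means and a t statistic equal to that of the pooled-variance t-test.

```python
from math import sqrt
from statistics import mean


def pooled_t(group0, group1):
    """Two-sample (Student's) t statistic with pooled variance."""
    n0, n1 = len(group0), len(group1)
    m0, m1 = mean(group0), mean(group1)
    ss = sum((v - m0) ** 2 for v in group0) + sum((v - m1) ** 2 for v in group1)
    pooled_var = ss / (n0 + n1 - 2)
    return (m1 - m0) / sqrt(pooled_var * (1 / n0 + 1 / n1))


def ols_slope_t(x, y):
    """Slope of the least-squares line y = a + b*x and the t statistic for b = 0."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = sqrt(rss / (n - 2) / sxx)  # standard error of the slope
    return slope, slope / se


# Hypothetical FEV1 values (litres) in two groups of participants
group0 = [3.1, 2.8, 3.4, 3.0, 2.9]
group1 = [2.6, 2.9, 2.4, 2.7, 2.5, 2.8]

x = [0] * len(group0) + [1] * len(group1)  # group indicator as the single covariate
y = group0 + group1
slope, t_regression = ols_slope_t(x, y)

print(slope, mean(group1) - mean(group0))      # slope equals the difference in means
print(t_regression, pooled_t(group0, group1))  # the two t statistics agree
```

Adding further covariates to the same regression is what turns this comparison into analysis of covariance.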


A framework for deducing causality can be articulated at the level of statistical analysis, subject to a range of key assumptions being made. This approach, specifically introducing the concept of causal diagrams, is included in the series because of the importance of this topic. The series will also discuss propensity scores, a method designed for drawing causal inferences from observational studies. The causal inference approach makes explicit assumptions concerning the relationships among variables on a causal pathway, an explicitness that is not well matched by more traditional approaches based on regression models.

Missing data are common in both observational and experimental studies and can be a source of bias if inferences are drawn from statistical analyses based on the observed data alone. Yet that approach to analysis is the default approach of almost every statistical software package. The series includes an article on multiple imputation that provides a brief overview of this approach to handling missing data and of its theory. The article also discusses when multiple imputation can be useful and illustrates its use in a population-based longitudinal cohort study to explore the association between current asthma and forced expiratory volume in 1 s.[14]

Another ubiquitous problem in real-world research studies is the inevitable measurement error that comes with quantifying difficult-to-measure phenomena such as physical activity, dietary habits, quality of sleep and level of pain. In statistical analysis, the default 'turn a blind eye' approach to this problem is to leave the errors introduced by imperfect measurement to accumulate as part of the 'residual error' term of a regression model. The series will describe a more principled approach for dealing with measurement error in an exposure of interest, as illustrated by an assessment of the role of diet in respiratory health.
Survival analysis has a long history in cancer research. The methods are being increasingly used in other areas of medical research, and there have been important recent advances in the methods that have received rapid uptake in the published literature. This series will provide an introduction to survival analysis and make connections with the more modern advances. An important application of survival analysis methods is to the prediction of disease incidence. The series also includes an article on risk prediction that outlines the regression models that can be used as a basis for prediction as well as the methodological care that is required in developing and testing prediction systems.

Classification problems have long exercised the minds of statisticians, but there has been significant growth in recent years in the methodology for tackling these problems. There has also been burgeoning interest in the application of these methods in respiratory medicine, for example in identifying different asthma subtypes. This series will include latent class analysis methods to address this important topic.

In the criteria for causality discussed earlier in this introduction, we pointed out the important role of replication of research findings across different research studies. If appropriate, we can combine results from a variety of research studies; the statistical method to achieve this is meta-analysis.

We hope that readers will find these articles useful.

Acknowledgements

The example of confounding is modified from the American Thoracic Society MECOR course (http://www.thoracic.org/globalhealth/mecor-courses), with the permission of Sonia Buist. We wish to thank our colleagues in the Victorian Centre for Biostatistics in Melbourne who have written these articles, Tom Kotsimbos, Tracy Glass and Christian Schindler for their assistance in reviewing the articles, and Katherine Lee, Julie Simpson and Elizabeth Williamson for helpful comments on a draft of this introduction to the series.

REFERENCES

1 Gore SM, Jones IG, Rytter EC. Misuse of statistical methods: critical assessment of articles in BMJ from January to March 1976. BMJ 1977; 1: 85–7.
2 Emerson JD, Colditz GA. Use of statistical analysis in the New England Journal of Medicine. N. Engl. J. Med. 1983; 309: 709–13.
3 Bland JM, Altman DG. Bayesians and frequentists. BMJ 1998; 317: 1151.
4 Hill AB. The environment and disease: association or causation? Proc. R. Soc. Med. 1965; 58: 295–300.
5 Tang N, Wu Y, Ma J et al. Coffee consumption and risk of lung cancer: a meta-analysis. Lung Cancer 2010; 67: 17–22.
6 Gardner MJ, Altman DG. Confidence intervals rather than P values: estimation rather than hypothesis testing. BMJ (Clin. Res. Ed.) 1986; 292: 746–50.
7 Altman DG. Practical Statistics for Medical Research, 2nd edn. Chapman & Hall/CRC, Boca Raton, 2006.
8 Kirkwood BR, Sterne JAC. Essential Medical Statistics. Blackwell Science, Malden, MA, 2003.
9 Zar JH. Biostatistical Analysis, 5th edn. Pearson Prentice-Hall, Upper Saddle River, NJ, 2010.
10 Carlin J, Doyle L. Statistics for clinicians; introduction. J. Paediatr. Child Health 2000; 36: 74–5.
11 Kasza J, Wolfe R. Interpretation of commonly-used statistical regression models. Respirology 2014; 19: 14–21.
12 De Livera AM, Zaloumis S, Simpson JA. Models for the analysis of repeated continuous outcome measures in clinical trials. Respirology 2014. In press.
13 Twisk JWR. Applied Longitudinal Data Analysis for Epidemiology: A Practical Guide, 2nd edn. Cambridge University Press, Cambridge, 2013.
14 Lee KJ, Simpson JA. An introduction to multiple imputation for dealing with missing data. Respirology 2014. In press.
