
Scandinavian Journal of Public Health, 2015; 43: 776–782

Review article

Statistical method use in public health research

James C. Karran, Erica E. M. Moodie & Michael P. Wallace Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Canada

Abstract

Aims: The content of public health research is often statistically complex. This review seeks to assess the breadth of statistical literacy required to understand this material, with a view to informing practitioners’ statistical training.

Methods: We review the statistical content of original research articles published in 2011 in four major public health journals. Categories of statistical methodologies are identified and their frequency of use recorded. The “usefulness” of each category, in terms of the extent to which understanding it increases accessibility to the literature, is assessed.

Results: A total of 482 articles were reviewed and 30 categories of methods identified. Along with descriptive statistics (467 articles), regression analyses were also common, with logistic regression (206 articles) more than twice as prevalent as linear regression (95 articles). More complex regression models for use with clustered data were also commonly encountered, appearing in 96 articles.

Conclusions: The public health literature features a wide variety of statistical methods, some of which are advanced. To ensure the literature remains accessible, training for public health practitioners should include statistical training that maximizes breadth as well as depth of understanding.

Key Words: Public health, statistics, education, preventive medicine

Correspondence: Erica E. M. Moodie, Department of Epidemiology, Biostatistics and Occupational Health, McGill University, 1020 Avenue des Pins Ouest, Montréal, QC H3A 1A2, Canada. E-mail: [email protected]
(Accepted 1 June 2015)
© 2015 the Nordic Societies of Public Health
DOI: 10.1177/1403494815592735

Downloaded from sjp.sagepub.com at FLORIDA INTERNATIONAL UNIV on November 15, 2015

Introduction

Researchers in most modern scientific disciplines make routine use of statistics and statistical methods. Those working in the field of public health are no exception, and as such fulfill a role that, along with expertise in their chosen area of specialization, requires a degree of statistical literacy. Whether public health practitioners are devising their own studies, conducting their own analyses, or working at a policy level, their statistical knowledge affects their ability to access and comprehend an ever-expanding literature, both in their own field and beyond.

As in many disciplines, some amount of statistical training is commonplace in the education of health researchers. Such courses may vary in content, but will often focus on a core set of “basic” statistical techniques. While graduate training in the related field of epidemiology typically includes courses in standard statistical methods such as contingency tables and regression, a brief survey of the syllabi of degrees in public health revealed less of an emphasis on quantitative courses. We note there has been some research into which statistical methods are most commonly used in epidemiology [1], but we are unaware of any parallel studies in public health research that could form a basis for the statistical education of researchers in that field.

In this article, we conduct an investigation into the statistical methods used in original public health research. Through review and analysis of four major public health journals, we assess the level of statistical literacy required to have a good understanding of published work. We identify the methods most commonly used, and examine the range and complexity of methods used in the field, with a view to identifying how statistical training for public health practitioners could become more targeted.

A secondary consideration is the rapid development and increasing affordability of good computing resources, the effect of which is twofold. First, the existence of more powerful machines has motivated both the development and use of analytical approaches that were previously impossible (or at least impractical). Second, the development of statistical packages such as SAS, SPSS, and Stata has made complex methods more accessible, allowing those in many fields (including public health) to perform even complex analyses (using available functions or macros) without the need for specialized code. Although familiarity with one software package or programming language often makes it easier to learn another, so that researchers need not be tied to a single product, we also investigate which software packages (if any) are employed.

Methods

We examine original research articles published in the year 2011 in four public health journals:

• American Journal of Public Health (AJPH),
• American Journal of Preventive Medicine (AJPM),
• Canadian Journal of Public Health (CJPH), and
• International Journal of Public Health, published by Springer (IJPH).

These journals are recognized as being among the top journals in public health for English-speaking researchers in North America and Europe. Summaries of extant research (such as meta-analyses), opinion pieces, reports, letters, and editorials were not examined. In addition, a small number of articles in CJPH were not available in English and were also excluded. A thorough review was undertaken of all other original research articles with a view to identifying which statistical methods, if any, were used. If the use of specific software packages was mentioned, this was also recorded.

Statistical methods were divided into categories initially based on those of Horton and Switzer [2]. As the review process progressed, these categories were modified and new ones added to accommodate the variety of new methods encountered. Necessarily, many categories contain numerous techniques, as it is both impractical and of little use to consider every unique method used separately. We therefore group together similar methods according to larger “umbrella terms,” many of which form the basis of a module in a statistics (or data analysis) course. As such, some categories are more specific in nature than others, but each represents a general “type” or “class” of methodologies. For example, we include a category for “data transformation”; we note that this is not in itself a method of analysis, but rather it is typically a step taken prior to an analysis. Nevertheless, we felt that this requires a specific level of knowledge to understand why a transformation is needed and, ideally, how estimates resulting from a final analysis must be interpreted in light of such a pre-analysis step. Multiple comparison procedures, meanwhile, are often a post-analysis step and represent the application (and understanding) of a broad principle of statistical inference. Ultimately, 30 categories were identified (along with “no statistics” and “other methods”), which are outlined in Table I, grouped together broadly by the complexity of the methods and the order in which they are typically encountered in statistical methods courses. Summary statistics of methods and software packages used are also reported.

After the data are collected, methods are ordered into a list that corresponds approximately to their frequency of use; this can be seen as a rough measure of their importance in terms of granting access to articles (in terms of understanding their use of statistics). At any point in this list, the addition of the next method listed (assuming all previous methods are known) yields the maximal return in terms of additional articles that an individual could understand (this metric being based on the “accumulation by article” ranking used by Emerson and Colditz [1]). Specifically, we initially suppose a hypothetical individual who does not understand any statistical methods. Next, we identify the single category of methods that, were they taught to this individual, would facilitate comprehension of the largest number of new articles. This process is then repeated (assuming the individual maintains an understanding of all previously taught methods), identifying the next most “useful” category as that which grants access to the largest number of articles in addition to those they already understand, and so on.
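The ranking procedure described above can be sketched as a greedy loop. This is a minimal illustration under stated assumptions, not the authors' code: the function name `accumulation_order`, the toy article data, and the alphabetical tie-break are all hypothetical, since the article does not say how ties were resolved. An article counts as accessible once every category it uses is known, and a category named "Other methods" is held back until the end.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def accumulation_order(
    articles: Dict[str, FrozenSet[str]],
    forced_last: str = "Other methods",
) -> List[Tuple[str, int]]:
    """Greedy 'accumulation by article' ordering (illustrative sketch).

    `articles` maps an article id to the set of method categories it uses.
    Returns (category, cumulative number of accessible articles) pairs.
    """
    all_categories: Set[str] = set().union(*articles.values())
    # Sorted so that ties are broken alphabetically (an assumption made here).
    remaining = sorted(all_categories - {forced_last})
    known: Set[str] = set()
    order: List[Tuple[str, int]] = []

    def accessible(known_categories: Set[str]) -> int:
        # An article is accessible when all of its categories are known.
        return sum(1 for used in articles.values() if used <= known_categories)

    while remaining:
        # Pick the category whose addition makes the most articles accessible.
        best = max(remaining, key=lambda c: accessible(known | {c}))
        known.add(best)
        remaining.remove(best)
        order.append((best, accessible(known)))

    if forced_last in all_categories:
        known.add(forced_last)
        order.append((forced_last, accessible(known)))
    return order


# Hypothetical toy data: six articles and the categories each one uses.
toy_articles = {
    "a1": frozenset(),  # no statistics at all
    "a2": frozenset({"Descriptive statistics"}),
    "a3": frozenset({"Descriptive statistics"}),
    "a4": frozenset({"Descriptive statistics", "Logistic regression"}),
    "a5": frozenset({"Descriptive statistics", "t-tests"}),
    "a6": frozenset({"Descriptive statistics", "Logistic regression", "Other methods"}),
}

order = accumulation_order(toy_articles)
for category, cumulative in order:
    print(f"{category}: {cumulative} accessible")
```

In the toy data, article `a1` is accessible before any category is learned, which mirrors the “no statistics” row of the ranking. Note that a category used in many articles can still rank low if those articles also need categories that are not yet known.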
We opted to force the “other methods” category to lie at the end of the list, since it contains a large collection of specialized and unrelated methods that would not reasonably be taught in any first or second course in statistics or data analysis.

Table I. Statistical method categories encountered in original research papers of the four public health journals studied.

Method category | Description of method/examples
No statistics | Papers containing no statistical methods.
Descriptive statistics | Descriptive statistics only (e.g. means, standard deviations and percentages).
t-tests | One-sample, two-sample, and matched pair t-tests.
Contingency table analyses | Methods associated with analysis of contingency table-type data, including chi-square tests, McNemar’s test (and Stuart–Maxwell test), and Fisher’s exact tests.
Simple non-parametric tests | Simple non-parametric comparisons including Kruskal–Wallis, Wilcoxon rank-sum, Kolmogorov–Smirnov.
Epidemiologic statistics | Including odds ratios, relative risk, incidence/prevalence, sensitivity and specificity, number needed to treat.
Pearson’s correlation | Classical product–moment correlation.
Analysis of variance | Analysis of variance, analysis of covariance, F-tests.
Transformation | Data transformation methods (such as log or Box-Cox).
Non-parametric correlation | Non-parametric measures of correlation such as Spearman’s rho.
Survival methods | Methods associated with survival analysis such as life tables, Kaplan–Meier curves, log-rank tests, and Cox regression.
Linear regression | Univariate and multivariate linear regression.
Logistic regression | Binary outcome regression using a logistic function.
Other regression models | Regression models not covered by one of the above groups, including Poisson, probit, and multinomial logistic regressions.
Clustered data models | Techniques associated with the analysis of hierarchical, nested, or clustered data (such as mixed effect modeling and generalized estimating equations).
Multiple comparisons | Methods accounting for multiple analyses of the same data set (such as Bonferroni techniques).
Adjustment and standardization | Modification of incidence and prevalence rates for analysis of diverse populations (e.g. standardized mortality rates).
Power analyses | A priori and post-hoc power analyses and sample size calculations.
Cost–benefit and cost-effectiveness analysis | Comparisons of cost against public health outcomes.
Sensitivity analysis | Assesses sensitivity of results to small changes in the analysis process.
Missing data methods | Methods that account for missing data, such as multiple imputation or expectation maximization.
Receiver-operating characteristic (ROC) | ROC curves and ROC curve analysis.
Resampling | Bootstrapping of sample statistics such as standard errors and confidence intervals.
Principal component and cluster analysis | Use of principal component analysis and cluster analysis methods.
Model assessment | Explicit evaluation of models, including discussion/analysis of quality of model fit (such as goodness of fit tests and R² values) and tests of model assumptions.
Model selection | Methods of model selection such as via forward and backward stepwise selection, Akaike’s information criterion and p-values.
Cronbach’s alpha | Use of Cronbach’s alpha to validate a scale or index.
Test for trend | Cochran–Armitage test for trend and generic test for trend.
Simulations | Monte-Carlo modeling and other custom simulation models.
Intraclass correlation, inter-rater agreement, test–retest reliability | Tests of agreement such as intraclass correlation coefficients and Cohen’s kappa.
Robustness methods | Methods that explicitly provide robustness against potentially mis-specified models or violated assumptions.
Other methods | Methods that do not fit in the above categories.

Results

A total of 482 articles were identified, across which a total of 1841 uses of method categories were recorded, an average of 3.8 per article. Fifteen articles (3.1%) did not make use of any statistical methods. Table II summarizes the number of articles and their use of statistical methods from each journal examined, where we note the CJPH has, on average, the lowest number of methods per article. This reflects the relatively high proportion of CJPH articles that were qualitative in nature rather than quantitative (15% compared to 7% and lower for the other journals we reviewed), with these articles typically using either no or solely descriptive statistics.

Table II. Statistical method use by journal.

Journal | Articles | Total categorized methods | Average methods per article
AJPH | 213 | 836 | 4
AJPM | 147 | 564 | 3.8
CJPH | 53 | 172 | 3.2
IJPH | 69 | 269 | 3.9
Overall | 482 | 1841 | 3.8
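The per-journal averages in Table II follow directly from the article and method counts. The short sketch below recomputes them to two decimal places from the published counts; the variable names are illustrative and not taken from the article.

```python
# Counts copied from Table II: journal -> (articles, total categorized methods)
table_ii = {
    "AJPH": (213, 836),
    "AJPM": (147, 564),
    "CJPH": (53, 172),
    "IJPH": (69, 269),
}

total_articles = sum(articles for articles, _ in table_ii.values())
total_methods = sum(methods for _, methods in table_ii.values())

# Average number of categorized methods per article, by journal
averages = {
    journal: methods / articles
    for journal, (articles, methods) in table_ii.items()
}

for journal, average in averages.items():
    print(f"{journal}: {average:.2f}")
print(f"Overall: {total_methods / total_articles:.2f}")

# Journal with the fewest methods per article
lowest = min(averages, key=averages.get)
print(f"Lowest average: {lowest}")
```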

Table III summarizes the statistical methods used based on the categories defined in Table I. By far the most common form of statistics used was descriptive statistics, featuring in 467 articles (96.9%). Logistic regression (42.7% of articles) and contingency table methods (34.4% of articles) were also commonplace. It is of some surprise to see the relatively complex category of clustered data models in such widespread use; appearing in 96 articles, they are more common than considerably more straightforward methods such as t-tests (80 articles) and linear regression (95 articles). Forty-one articles (8.5%) featured methods that were classified as “other” owing to their low usage. This “other” group accounts for 33 distinct methods, further highlighting the diversity of statistical methods in use.

In Table III, we use the term “accessible articles” to describe the number of articles whose entire statistical content is comprehensible assuming familiarity with that method category and all categories listed above it. For example, 15 articles are considered accessible assuming no statistical knowledge, while an understanding of descriptive statistics alone provides access to a further 48, yielding an “accessible articles” total of 63. Knowledge of descriptive statistics, logistic regression, and contingency table methods would grant access to 114 of the articles in our data set, or 23.7% of the total. Knowledge of the first 14 method categories listed (excluding “no statistics”) would elicit comprehension of half of the articles in our data set. Note that as the “other” category contains numerous dissimilar methods, these were treated as a special case and assumed unknown until all other method categories were understood.

Table III. Summary of statistical methods used across all journals.

Method category | Articles using these methods (%) | Accessible articles (%)
No statistics | 15 (3.1) | 15 (3.1)
Descriptive statistics | 467 (96.9) | 63 (13.1)
Logistic regression | 206 (42.7) | 85 (17.6)
Contingency tables | 166 (34.4) | 114 (23.7)
t-tests | 80 (16.6) | 126 (26.1)
Model selection | 50 (10.4) | 137 (28.4)
Linear regression | 95 (19.7) | 147 (30.5)
Clustered data models | 96 (19.9) | 160 (33.2)
Model assessment | 70 (14.5) | 174 (36.1)
Analysis of variance | 53 (11.0) | 186 (38.6)
Cronbach’s alpha | 36 (7.5) | 200 (41.5)
Survival methods | 26 (5.4) | 212 (44.0)
Other regression | 61 (12.7) | 224 (46.5)
Epidemiologic statistics | 47 (9.8) | 240 (49.8)
Intraclass correlation, inter-rater agreement, test–retest reliability | 35 (7.3) | 254 (52.7)
Adjustment and standardization | 28 (5.8) | 268 (55.6)
Simple non-parametric tests | 22 (4.6) | 283 (58.7)
Missing data methods | 32 (6.6) | 297 (61.6)
Multiple comparisons | 22 (4.6) | 309 (64.1)
Sensitivity analyses | 33 (6.8) | 320 (66.4)
Transformation | 36 (7.5) | 332 (68.9)
Pearson’s correlation | 23 (4.8) | 347 (72.0)
Robustness methods | 19 (3.9) | 363 (75.3)
Principal component and cluster analyses | 16 (3.3) | 378 (78.4)
Non-parametric correlation | 15 (3.1) | 391 (81.1)
Power analyses | 17 (3.5) | 405 (84.0)
Resampling | 14 (2.9) | 414 (85.9)
Cost–benefit and cost-effectiveness analysis | 13 (2.7) | 423 (87.8)
Test for trend | 9 (1.9) | 432 (89.6)
Simulations | 6 (1.2) | 438 (90.9)
Receiver-operating characteristic | 7 (1.5) | 441 (91.5)
Other | 41 (8.5) | 482 (100)

Table IV summarizes the use of software packages, with SAS, Stata, and SPSS accounting for more than 75% of software uses mentioned. The category “other” contains any statistical software that appeared in fewer than 1% of articles. One hundred and forty-three articles (29.7% of the 482 total) did not specifically mention which (if any) software had been employed. However, in most cases, it was clear that some computational packages had been used.

Table IV. Software packages used.

Software package | Article-software uses (% of total uses)
SAS | 123 (32.3)
Stata | 87 (22.8)
SPSS | 82 (21.5)
SUDAAN | 23 (6.0)
Excel | 12 (3.1)
R | 9 (2.4)
HLM | 6 (1.6)
Epi Info | 6 (1.6)
MPlus | 6 (1.6)
S-Plus | 4 (1.0)
Other | 23 (6.0)
Total | 381 (100.0)

Discussion

A large number and variety of statistical techniques are used in public health research, reflecting its firm foundations in epidemiology and biostatistics. This also demonstrates the importance of statistical literacy in the modern public health researcher. Fewer than 14% of articles in our data set were accessible to those with an understanding of only descriptive statistics, the most basic method category in our review. Moreover, these articles typically fell under the “qualitative” (rather than quantitative) heading, generally only using statistics to describe the demographics of those under study.

Beyond descriptive statistics, the single category of method that maximized access to additional articles was logistic regression, which, appearing in 206 articles, was also the second most prevalent technique in this survey. However, it is a testament to the number and diversity of statistical methods in use that knowledge of this method, appearing in 43% of articles, only increases statistical accessibility from 13.1% (for descriptive statistics only) to 17.6%. In general, understanding of a single additional method category would only grant access to around a further 10 articles. The plurality of techniques can further be demonstrated by the fact that understanding 14 (of 30 excluding “no statistics” and “other”) categories of methods is required to access 50% of papers, and 29 to access 90%. Furthermore, a total of 33 methods (more than the total number of categories we identified) were classified as “other” owing to their infrequent use.

While a considerable range of statistical methods are employed, there are also some surprises in their complexity. For example, logistic regression features more often than much simpler methods such as chi-square tests and t-tests. Indeed, logistic regression occurs more than twice as often as linear regression, typically the first regression method taught in the classroom. What we have termed “clustered data models” (typically dealing with hierarchical or clustered data) also feature surprisingly often. As these models appear in more articles than linear regression or t-tests, it seems practitioners are becoming accustomed to more complex regression methodologies. This finding is somewhat balanced, however, by the fact that the “simpler” method categories of linear regression and t-tests are ranked higher in terms of maximizing accessibility to articles.

While experience with a specific statistical software package is not necessary to understand a well-documented analysis, our figures for software use are nevertheless enlightening. SAS, Stata, and SPSS make up a “big three” of statistical analysis tools, with more than three quarters of reported software use being conducted in one of these environments.
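Several of the headline figures in this discussion can be re-derived from the published counts in Tables III and IV. The sketch below is a hedged illustration only: the helper `categories_needed` and the variable names are introduced here and do not come from the article. It counts how many method categories (not counting “no statistics”) are needed to reach a given share of articles, and checks the logistic-to-linear ratio and the combined share of SAS, Stata, and SPSS.

```python
TOTAL_ARTICLES = 482

# "Accessible articles" column of Table III, in table order.
# Index 0 is "No statistics"; the final entry is "Other".
cumulative_accessible = [
    15, 63, 85, 114, 126, 137, 147, 160, 174, 186, 200, 212, 224, 240, 254,
    268, 283, 297, 309, 320, 332, 347, 363, 378,
    391, 405, 414, 423,
    432, 438, 441, 482,
]


def categories_needed(share: float) -> int:
    """Smallest number of method categories (excluding 'No statistics')
    whose cumulative accessible-article count reaches `share` of all articles."""
    target = share * TOTAL_ARTICLES
    for n_categories, accessible in enumerate(cumulative_accessible):
        if accessible >= target:
            return n_categories
    raise ValueError("share must be at most 1.0")


half = categories_needed(0.5)
ninety = categories_needed(0.9)

# Article counts from Table III
logistic_to_linear = 206 / 95

# Software uses from Table IV
software_uses = {"SAS": 123, "Stata": 87, "SPSS": 82}
total_software_uses = 381
big_three_share = sum(software_uses.values()) / total_software_uses

print(f"Categories needed for 50% of articles: {half}")
print(f"Categories needed for 90% of articles: {ninety}")
print(f"Logistic / linear regression articles: {logistic_to_linear:.2f}")
print(f"SAS + Stata + SPSS share of software uses: {big_three_share:.1%}")
```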
Familiarity with at least one of these packages, then, would be of some use to the practicing public health researcher, particularly if they wish to recreate specific analyses or collaborate with others in the field.

Limitations

Our analysis is, of course, not without limitations. While we believe our sample is a representative assessment of the variety of statistical methods being used in the public health literature today, it is nevertheless a snapshot of four journals in one particular year. A later expansion of this study to cover research in, say, five or ten years’ time would provide valuable additional insight into any trends of how statistical method use is changing over time. We also acknowledge that there is a limit to how generalizable the findings of this paper are. “Public health” is a broad church, encompassing countless journals other than those we have examined here, with many having very different foci (such as public health policy or industrial relations).

A further consideration is the practicalities of statistical teaching, particularly with regard to the ordering of categories in Table III. It is worth bearing in mind that this was a data-driven exercise, with a view to maximizing a hypothetical public health practitioner’s access to articles as they learn one method category at a time. This inevitably leads to some ostensibly peculiar orderings, such as the importance of advanced regression techniques over analysis of variance methods, even though few would consider it reasonable to teach advanced methods without first having taught the more basic forms of regression. Method categories also vary in terms of the number and variety of techniques they contain. Nevertheless, it does serve to highlight areas where statistical teaching, particularly of public health practitioners, may require a shift in emphasis. We do not advocate starting with advanced methods that are more commonly encountered, as this would likely lead to poor learning outcomes. However, it may be useful for students who have completed a course in fundamentals of regression to then be exposed to a survey of more advanced methods, which would enable them to better understand the methods encountered in the literature and facilitate discussions with a statistician or data analyst who could provide the needed support to carry out the analysis.

It is important to note that we have performed a survey of the methods used, but have not attempted to assess whether such methods were the most appropriate for the research question posed.
The choice of statistical approach, and decisions about which variables to include in an analysis, must be driven by the research question, the type of data (e.g. continuous, ordinal, or binary outcome), and the study design. While basic courses in statistical methods can help a researcher avoid gross errors in methodological approach, many subtle points remain (e.g. should a risk or odds ratio be reported, and if so, does the study design permit estimation of the desired quantity?).

With the increasing complexity of methods in use, the importance of collaboration with statisticians should not be overlooked. Collaboration with statisticians who have expertise provided by theoretical knowledge and the experience of performing many analyses can ensure that complex methods are correctly applied, and that interpretation of the estimated parameters is correct. A statistician has sometimes been likened to a mechanic: not only can they drive the car (implement the method), they can open up the hood of the car to see what needs fixing should the ride not be smooth (sensitivity analyses, alternative approaches, etc.). The importance of communication in all research domains cannot be sufficiently stressed. Just as statisticians need to learn to explain to subject area specialists, such as public health researchers, the methods that they feel should be used, so too do public health researchers need a basic understanding of the principles underlying the statistical methods to be employed, and need to be able to communicate key aspects of the underlying research problem, for the two researchers to work together effectively.

While there has been some literature highlighting the importance of numeracy in medical decision making, the emphasis has primarily been on communication of risks and statistical concepts to physicians and patients [3–6], rather than public health practitioners and policy makers. A small number of authors have considered how to approach the teaching of statistics to graduate students in public health [7,8]. However, the topic of what content should be included is frequently overlooked. An exception to this is the nearly 20-year-old article by Simpson [8], who suggested key elements should include descriptive statistics and the concept of variability, p-values, and confidence intervals.

Finally, the role of subject area journals should not be overlooked. Encouraging authors to describe fully any complex methods used and discuss the implications of the analyses themselves would help the practicing researcher maintain and expand their knowledge of available methods.
One possibility would be to allow (or encourage) sections on statistical methods to be longer so that authors can provide a more thorough description of the methods used and how their results should be interpreted. This could be particularly useful when more complex methods are employed, both as an opportunity to expand on potentially difficult theoretical material and to discuss why less complicated or more standard approaches were deemed inappropriate. In addition, most public health journals accept review articles. In some cases, such articles are specifically methodological in content, such as the “Hints and Kinks” section of the IJPH. However, more often such reviews are in areas of public health. It would be both interesting and a valuable contribution to continuing education of practitioners and researchers for journals to publish (perhaps even solicit) methodological reviews.

Conclusions

Our results demonstrate that to have command of even a fraction of the public health literature, researchers need familiarity with a wide (and seemingly expanding) battery of statistical methods. Many of these methods are complex, requiring a high level of statistical expertise and understanding. Moreover, the sheer variety of methods we encountered in our review makes specific recommendations difficult (beyond perhaps an increased focus on logistic regression methodology). The importance of including training in statistical fundamentals such as linear and logistic regression, as well as all key concepts leading to these methods (t-tests, chi-square tests, hypothesis testing), in any public health program is not controversial. However, this survey suggests that some time should also be devoted to a survey course in more advanced methods to ensure good subject matter literacy and to ensure effective collaboration with statisticians.

A sensible method for statistical teaching in modern public health might therefore be one that emphasizes breadth of knowledge over depth: knowing a minimum required to be familiar with most of the extant methods rather than being expert in techniques that only grant access to some small percentage of the literature. However, a solid statistical foundation will remain essential not only for researchers conducting their own analyses but also those who must read and evaluate the literature in order to implement policy changes. This is a fundamental principle that we do not anticipate seeing change in the future.

Conflict of interests

The authors declare that there are no conflicts of interest.

Funding

Author Moodie is supported by a chercheur-boursier junior 2 career award from the fonds de recherche du Québec – Santé. This work is supported by a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada (NSERC) and an Operating Grant from the Canadian Institutes of Health Research (CIHR). The funding organizations had no role in designing the research question, conducting the review, the interpretation of the results, the writing of the report, or the decision to submit the report for publication.

References

[1] Emerson JD and Colditz GA. Use of statistical analysis in the New England Journal of Medicine. N Engl J Med 1983;309:709–713.
[2] Horton NJ and Switzer SS. Statistical methods in the Journal. N Engl J Med 2005;353:1977–1979.
[3] Gigerenzer G, Gaissmaier W, Kurz-Milcke E, et al. Helping doctors and patients make sense of health statistics. Psychol Sci Public Interest 2007;8:53–96.
[4] Moussa MAA. Developments in the instruction of biostatistics at the Kuwait University Health Science Centre in a decade. Teach Learn Med 2002;14:194–198.
[5] Reyna VF and Brainerd CJ. The importance of mathematics in health and human judgment: numeracy, risk communication, and medical decision making. Learn Indiv Diff 2007;17:147–159.
[6] Swift L, Miles S, Price GM, et al. Do doctors need statistics? Doctors’ use of and attitudes to probability and statistics. Stat Med 2009;28:1969–1981.
[7] Boyd Enders F and Diener-West M. Methods of learning in statistical education: a randomized trial of public health graduate students. Stat Educ Res J 2006;5:5–19.
[8] Simpson JM. Teaching statistics to non-specialists. Stat Med 1995;14:199–208.
