Sci Eng Ethics DOI 10.1007/s11948-014-9532-1 ORIGINAL PAPER

The (lack of) Impact of Retraction on Citation Networks Charisse R. Madlock-Brown • David Eichmann

Received: 12 November 2013 / Accepted: 13 March 2014 © Springer Science+Business Media Dordrecht 2014

Abstract Article retraction in research is rising, yet retracted articles continue to be cited at a disturbing rate. This paper presents an analysis of recent retraction patterns, with a unique emphasis on the role author self-cites play, to assist the scientific community in creating counter-strategies. This was accomplished by examining the following: (1) A categorization of retracted articles more complete than previously published work. (2) The relationship between citation counts and after-retraction self-cites from the authors of the work, and the distribution of self-cites across our retraction categories. (3) The distribution of retractions written by both the author and the editor across our retraction categories. (4) The trends for seven of our nine defined retraction categories over a 6-year period. (5) The average journal impact factor by category, and the relationship between impact factor, author self-cites, and overall citations. Our findings indicate that new reasons for retraction have emerged in recent years, and more editors are penning retractions. The rates of increase for retraction vary by category, and there is a statistically significant difference in average impact factor between many categories. 18 % of authors self-cite retracted work post retraction, with only 10 % of those authors also citing the retraction notice. Further, there is a positive correlation between self-cites and after-retraction citations.

C. R. Madlock-Brown (✉)
Health Informatics, University of Iowa, 125 West Washington St., Iowa City, IA 52242, USA
e-mail: [email protected]

D. Eichmann
School of Library and Information Science, University of Iowa, 125 West Washington St., Iowa City, IA 52242, USA
e-mail: [email protected]

D. Eichmann
Institute for Clinical and Translational Science, University of Iowa, 125 West Washington St., Iowa City, IA 52242, USA


Keywords Scientific misconduct • Publication ethics • Citation networks • Retractions

Introduction

Science is believed to be a self-correcting system. That so many retractions occur can be considered evidence of this: questions arise as scientists fail to reproduce published findings, for example. While retractions are precipitated by these attempts at validation and other means, retracted research frequently continues to be cited as if the record has not, in fact, been set straight (Unger and Couzin 2006; Campanario 2000). Several studies suggest that not only are these articles likely to be cited, but that the more citations they have before the retraction, the more they are likely to have afterward (Campanario 2000; Friedman 1990; Redman et al. 2008). To better understand this trend, we analyze the citation network (i.e., the interconnected network of articles where each link indicates a citation) of retracted work and explore various factors.

We explore in this paper the continued palliation of an error/fraud/mistake by continued citation of a retracted work. We analyzed the citation graph to determine how often authors of retracted papers continue to self-cite that work and to establish whether this phenomenon is correlated with higher citation counts after retraction and with journal impact factor. Self-citing authors have the opportunity to minimize the scope of the error, which may offset efforts to decrease citations to retracted work.

In 2009, the Committee on Publication Ethics created guidelines for retraction of published work, in which they stipulate that reasons for retraction should be clearly stated in retraction notices (Wager et al. 2009). Authors should be able to provide a more detailed representation of their work than editors. For that reason, we identified the top retraction categories written by author and editor. It is important to note who typically writes which type of retraction.

In addition to analyzing the network, we created a detailed classification of retracted articles. Though the concept of retraction classification is not new, we identified a need for a new category and explore retraction trends for all categories. We also identify the average impact factor of the journals in which the retracted work was published. An understanding of the precedents for retraction types can help journal editors make relevant decisions about questionable work.

Related Work

Previous research notes the steady rise in retractions (Cokol et al. 2008). Interestingly, errors are more likely to be identified when they are published in high impact (i.e., more visible) journals (Franzen et al. 2007), potentially because more of these articles are read and the scientific community attempts to verify their results. Redman et al. (2008) studied average impact by category and found comparable rates between errors and misconduct, implying that the average impact does not vary across categories.


Given that many retracted papers are published in high impact journals, it is not surprising that these articles are often still cited after retraction, and that such citations can continue to have troubling effects (Redman et al. 2008). As is noted by Redman et al., retracted papers are often cited after the articles are retracted, and the number of citations before retraction is positively correlated with the number of citations after retraction (r of 0.60) (Cokol et al. 2008). Furthermore, journals and authors may be reluctant to retract (Sox and Rennie 2006; Couzin and Unger 2006), and reluctant to do all they can to ensure that retractions are visible, as this can affect the reputation of their publication. Most troubling of all, according to Neale et al., is that less than 5 % of citations to retracted articles indicate any awareness of the retraction (Neale et al. 2009).

There are many reasons articles may be retracted, some more problematic than others. Wager and Williams classified over 700 papers retracted between 2005 and 2008 and confirmed the finding by Steen that most retractions are due to errors and over a fourth are due to misconduct (Steen 2011; Wager and Williams 2011). They also note that retractions for reasons of misconduct and error are both on the rise. Fang et al. categorized 2,047 retracted articles from the MEDLINE database (Fang et al. 2012). They found that misconduct (fraud, duplication or plagiarism) accounted for 67 % of retractions, which differs from earlier work. This discrepancy may be due to their inclusion of Office of Research Integrity (ORI) reports in their study in addition to retraction notices. As noted above, authors may be reticent, and Fang et al. found that research identified as fraud by the ORI was not always identified as fraud in the literature. This difference could alternatively derive from their categorization of retractions as fraud when fraud was only suspected.

Despite responses to these findings by the scientific community, such as the guidelines mentioned above, there are still relevant areas to explore to fully understand this phenomenon. We identify here several new aspects of the phenomenon of citations to retracted work. We are most particularly interested in examining the relationship between author self-cites and the number of citations after retraction. Our interest in this relationship is based on the findings of Broadus and of Simkin and Roychowdhury, who determined that the majority of citations are merely lifted from other papers and that the corresponding articles are not actually read. They accomplished this by analyzing mistakes in citations, such as spelling errors and date errors, that repeat in the literature (Broadus 1983; Simkin and Roychowdhury 2006). We infer from this that many citations to retracted work could be due not to citing authors finding the original article, but to their finding the article through papers that cite it positively. We further hypothesize that because of this tendency, positive self-cites will boost citation rates for retracted articles. Our analysis indicates that this behavior is leading to problematic downstream impact on the citation networks of retracted work.

Methods

Materials

Though retractions occur in all scientific fields, this work focuses on biomedical literature. Retraction notice coverage is more complete in the biomedical fields as
the information is available as part of MEDLINE. We identified all retractions published in journals from 2003 through 2010 using the 2012 Baseline of MEDLINE. We classified all retraction notices we were able to access through local library resources: 1,066 (96 %) of the 1,113 retracted articles for the time span noted above. We then searched the Web of Science citation database for citation data for a subset of 740 articles retracted in the above-mentioned time frame. We reduced the number of articles considered due to the time involved in manually downloading citation data. We identified post-retraction author self-cites by matching the first and last names of the authors of the retracted work against the authors of papers citing that work post-retraction. For papers identified as self-cites, we searched the Web of Science database to find out whether those papers also cited the retraction. Journal impact factor was obtained from the Journal Citation Reports database (JCR). It is calculated by dividing the number of citations received in a given year by the articles a journal published in the previous 2 years by the number of articles the journal published during those 2 years.

Categorization of Retractions

One member of the research team categorized retractions based on the reason given in the retraction notice. We started out with the categories identified by previous research. If the reason was unclear, the retraction was categorized as unknown. If the reason for retraction did not fit into a category, a new category was created after a discussion by both authors. If it was indicated that fraud was suspected, but not proven, we classified the retraction under error or inability to reproduce based on the information given. We grouped categories under either process or misconduct, to differentiate between what appears to be honest error and malfeasance. Retractions were categorized under duplication if the author self-plagiarized. Retractions were categorized under approval issues if the author did not have the permission of coauthors to use their name or work, or did not have approval from patients or their institutions to perform the experiment. Retractions were categorized under editorial mistake if the editor mistakenly published an article twice, or published an article that had been withdrawn. Retractions were categorized under other when the quality of the work was in question, the authors had a conflict of interest, the work was being resubmitted elsewhere, or the article was retracted because it was based on retracted work.

Features for Correlations

We studied the relationship between several features and the number of post-retraction citations, including:

• Self-cites after retraction (author cites retracted work)
• Average impact factor of the journals in which retracted work was published
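The self-cite identification step described under Materials (matching first and last names of authors on papers that cite the work after its retraction) could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Paper` record, its field names, and the sample data are hypothetical and are not taken from the Web of Science export format or from the study's own tooling.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Paper:
    """Hypothetical record; the field names are illustrative only."""
    title: str
    published: date
    authors: tuple  # (first name, last name) pairs


def name_key(first, last):
    """Normalize a name so case and stray whitespace do not block a match."""
    return (first.strip().lower(), last.strip().lower())


def is_post_retraction_self_cite(retracted, retraction_date, citing):
    """True if `citing` appeared after the retraction and shares an author."""
    if citing.published <= retraction_date:
        return False
    original = {name_key(f, l) for f, l in retracted.authors}
    return any(name_key(f, l) in original for f, l in citing.authors)


# Made-up example data, not drawn from the study.
retracted = Paper("Retracted study", date(2004, 3, 1),
                  (("Ada", "Smith"), ("Ben", "Jones")))
retraction_date = date(2006, 5, 1)
citing_papers = [
    Paper("Follow-up A", date(2005, 1, 1), (("Ada", "Smith"),)),   # before retraction
    Paper("Follow-up B", date(2007, 6, 1), (("ada", "SMITH "),)),  # shared author
    Paper("Independent C", date(2008, 2, 1), (("Cara", "Lee"),)),  # other authors
]
self_cites = [p.title for p in citing_papers
              if is_post_retraction_self_cite(retracted, retraction_date, p)]
print(self_cites)
```

Matching on names alone can conflate different people who share a name, so a sketch like this would likely need manual review or additional identifiers before its counts were trusted.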


Results

Retraction Categories

Table 1 presents our classification categories, the average journal impact factor of retractions in each, and their frequencies. The three categories with the highest average journal impact factors are not reproducible, fraud, and errors. As shown in Table 2, most retractions written by article authors involved errors or inability to reproduce. In contrast, editors wrote most (61 %) of the retractions involving misconduct. Ten percent of the articles we reviewed did not give a specific reason for the retraction.

Retraction Trends

Cokol et al. (2008) demonstrated that retractions, as a percentage of published articles, are on the rise. Steen noted that the rate of retractions due to both misconduct and scientific error is increasing (Steen 2011). We identified the trends for retractions due to misconduct, errors, non-reproducibility, plagiarism, duplication, and unknown reasons. Our work differs from Steen's in that we look at retractions as a percentage of published articles, and we provide trends for more specific categories. We see normalizing for the number of publications annually as a necessary component of our analysis, given the rate of increase in total annual publications. Fang et al. provided information on retraction rates as a percentage of articles published (and found that the rate of fraudulent papers is on the rise), but not for all the categories presented below. Our results are displayed in Fig. 1. Figure 2 shows the retraction rate by writer. In recent years, there has been a dramatic increase in the number of retractions written by editors.

Correlations Identification

As shown in Table 3, there is a positive correlation in our data between citations to retracted work before and after it is retracted, corroborating previous work (Wager and Williams 2011). We additionally observe a previously unidentified positive correlation between author self-cites and after-retraction citations. We also found that impact factor is positively correlated with after-retraction citations and with after-retraction author self-cites. Note that compared to previous analyses (0.6 in Redman et al.), our before/after citation correlation is far less substantial.

After Retraction Author Self-cites

Of the 740 articles for which we had citation data, 135 included post-retraction author self-cites. On average, self-cites accounted for 5 % of the total after-retraction citations. As shown in Table 4, 66 % of those retractions were due to errors or non-reproducibility, and 10 % were due to fraud. Only 10 % of the authors who cite their own retracted work also cite the retraction. There is a positive, though not very strong, correlation between authors citing their retracted article and the number of citations after the retraction.

Table 1 Reasons for retraction

Class        Reason              Avg. journal impact factor   Number of events   % of events
Misconduct   Duplication         3.23                         180                17
             Plagiarism          2.45                         169                16
             Fraud               8.32                         164                15
Process      Errors              6.64                         330                31
             Not reproducible    9.99                         137                13
             Unknown             3.47                         89                 8
             Approval issues     3.44                         72                 7
             Editorial mistake   1.59                         24                 2
             Other               2.35                         12                 1

Table 2 Top reasons by retractor

Reason             Author   Editor
Errors             166      105
Not reproducible   99       32
Fraud              36       116
Plagiarism         30       134
Duplication        36       141
Unknown            22       48
Approval issues    23       45
Other              6        6

Fig. 1 Since 2003 retractions for all reasons have increased, though at different rates; the y-axis shows the number of retractions as a percentage of articles published in MEDLINE
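The normalization behind Fig. 1, where retractions are expressed as a percentage of articles published in MEDLINE each year, amounts to a simple per-year ratio. A minimal sketch, assuming yearly counts are already available as dictionaries; the function name and the counts below are made up for illustration and are not the study's data:

```python
def retraction_rate_percent(retractions_by_year, articles_by_year):
    """Retractions as a percentage of published articles, per year."""
    rates = {}
    for year, total in sorted(articles_by_year.items()):
        if total <= 0:
            continue  # skip years with no publication count to avoid dividing by zero
        rates[year] = 100.0 * retractions_by_year.get(year, 0) / total
    return rates


# Made-up counts for one category.
articles = {2003: 500_000, 2004: 600_000}
retractions = {2003: 50, 2004: 90}
rates = retraction_rate_percent(retractions, articles)
for year, rate in rates.items():
    print(year, f"{rate:.3f} %")
```

Running the same function once per retraction category would give one trend line per category, which is how a figure of this kind is typically assembled.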


Fig. 2 The rates of retraction by the writer of the retraction

Table 3 Correlations

Feature 1                                Feature 2                    R      P value
Original author cites after retraction   Cites after retraction       0.20   P < 0.05
Journal impact factor                    Citations after retraction   0.24   P < 0.05
Journal impact factor                    After retraction self-cite   0.11   P < 0.05
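The R values in Table 3 are correlation coefficients between paired per-article features. The paper does not state which coefficient or software was used, so the following is only a sketch that assumes Pearson's r, together with the usual t statistic for testing r against zero (n - 2 degrees of freedom). The function names and the sample data are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    if n <= 2 or abs(r) >= 1:
        raise ValueError("need n > 2 and |r| < 1")
    return r * math.sqrt((n - 2) / (1 - r * r))


# Made-up per-article counts: self-cites and total citations after retraction.
self_cite_counts = [1, 2, 3, 4, 5]
citation_counts = [2, 4, 5, 4, 5]
r = pearson_r(self_cite_counts, citation_counts)
t = t_statistic(r, len(self_cite_counts))
print(f"r = {r:.3f}, t = {t:.3f}")
```

Turning the t statistic into a P value requires the t distribution, which is not in the Python standard library; a statistics package would be the usual choice for that final step.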

Table 4 Self-cites by reason

Reason for retraction   # of author self-cites   % of self-cites
Errors                  48                       36
Non-reproducibility     41                       30
Duplication             15                       11
Fraud                   14                       10
Unknown                 8                        6
Approval issues         6                        4
Plagiarism              5                        4
Other                   2                        1

Discussion

Retraction Categories

We found that 42 % of retractions were due to errors or an inability to reproduce. 48 % of the retractions reviewed were due to misconduct: fraud, duplication, and plagiarism. Previous work has not identified two categories that we enumerate here: retractions resulting from approval issues or editorial mistakes. The percentage of retractions attributable to approval issues, where the author did not have approval either to conduct the experiment or to publish the results, has risen in recent years (see Fig. 1).

The three categories with the highest average journal impact factor are errors, fraud, and not reproducible. We performed 2-tailed t tests between the journal impact factors of all pairs of categories. Each of these three top categories by impact factor has statistically significant differences with the remaining categories, with the exception of comparing errors with other, and not reproducible with other. This demonstrates that, by category, some retractions are more visible than others.

Our new category, approval issues, is most likely indicative of a larger trend. There are three primary reasons a retracted work would be put under this category: the authors did not have permission to list a co-author, did not have permission to use results from another's lab, or did not have permission to conduct their study. Having prominent researchers listed as co-authors on a paper can increase the likelihood of the paper being published in a high impact journal or published at all. Using results from another's lab or publishing results where the study has not been approved by the appropriate agency can tarnish the reputation of entire institutions, and inhibit collaboration as trust breaks down. That there is a precedent for retracting articles because of these misdeeds is important. With a more complete taxonomy of retracted work, editors can make more informed decisions when issues about published work in journals are raised.

Labeling work withdrawn because of an editorial mistake as retracted is questionable. If there is nothing wrong with the actual work, and no issue of unethical behavior, should a retraction be issued? A retraction has negative connotations and could hurt the reputation of authors who have not done anything wrong or made a mistake. An alternative classification for these mistakes could counteract this effect.

Of particular interest is the difference in frequency of the most common reasons given for retraction in notices written by editors versus those written by authors. Authors have a better understanding of the work in question and are in a better position to explain problems with it. When the reason for retraction is not clear, false assumptions may be made about the work. In such cases editors should be involved in ensuring that the retraction notice clearly states the reason. That editors are increasingly writing the majority of misconduct retractions is encouraging. Guidelines for retractions should be expanded to account for problems associated with retraction authorship.

Retractions attributable to misconduct, errors, and plagiarism are all on the rise. Given the variety of reasons for retractions, it becomes all the more important for readers to know why an article is retracted. When a paper retracted because of one of the top three causes is cited, there are numerous papers, not just the original, bolstering the idea that the research is valid. These categories have the highest average journal impact factor, and therefore the highest visibility. Our research shows that, with the exception of one category (“other”), the difference in journal impact factor between these and all others is statistically significant. Science is based on trust; therefore, given the nature of the majority of retractions, the research community needs to make informed attempts to control the impact of retracted papers on the scientific citation network. Retractions owing to plagiarism and duplication are more likely to come from journals with lower impact factors than the more common reasons for retraction.
Those journals may be less likely to screen for plagiarism. The problem could be more substantial in reality.

Post Retraction Author Self-cites

That many authors of retracted work continue to self-cite is very concerning, particularly given that many refrain from citing the retraction. Clearly, if authors need to refer to their retracted work, the retraction should additionally be cited. We found a positive correlation between self-cites of retracted work and the number of citations to the work after retraction. This correlation suggests that authors may be able to influence the way their retracted work is viewed by referring to it without citing the retraction, thereby maintaining the appearance of legitimacy. However, we cannot necessarily assume nefarious intent. Authors may self-cite while highlighting aspects of their work that remain valid. They may not consider self-citing without citing the retraction to be inappropriate. It is further possible that not all listed co-authors are made aware of the retraction, and that they cite the work in good faith.

The phenomenon of self-cites to retracted work is a serious problem. Most (66 %) of the self-cites involve retractions due to errors and non-reproducibility. Boosting the overall citation counts with self-cites increases the perceived legitimacy of the work and can lead to wasted hours of research and resources.

Recommendations

Despite efforts to make the biomedical community more aware of retractions by posting notices on search and publisher sites, retracted work continues to be cited. Editors can make use of CrossMark (www.crossref.org), a system designed to ensure researchers are using the most recent and reliable version of a document, to verify references. However, it covers only participating publishers, so it would not curb all citations to retracted work, including self-cites.

The final stage of the publishing process may prove to be the best time to double-check citation validity. Though this task may seem tedious, systems can be developed to automatically detect citations in digital format. Additionally, a retractions database similar to the Rutgers retraction database (http://retract.rutgers.edu/) (which is no longer updated) could be maintained so that researchers could quickly search the names of authors they cited for a history of retractions. Editors could also use this site before making the choice to publish work, to determine whether the author they are publishing is citing retracted work or has a history of publishing retracted work. While this may not eliminate all citations to retracted work, it would be an easy task that may have an impact.
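The automated check suggested above, verifying a manuscript's references against a list of known retractions before publication, could look something like the sketch below. It assumes that references and retractions are both available as DOI strings; the function names and the DOIs are placeholders invented for illustration, and no real retraction database or CrossMark interface is being queried.

```python
def normalize_doi(doi):
    """Lower-case a DOI and strip common prefixes so equivalent forms compare equal."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):].strip()
    return doi


def flag_retracted(reference_dois, retracted_dois):
    """Return the references whose DOI appears in the set of retracted DOIs."""
    retracted = {normalize_doi(d) for d in retracted_dois}
    return [d for d in reference_dois if normalize_doi(d) in retracted]


# Placeholder DOIs, not real articles.
references = [
    "doi:10.1000/Example.1",
    "10.1000/example.2",
    "https://doi.org/10.1000/example.3",
]
known_retractions = ["10.1000/example.1", "10.1000/EXAMPLE.3"]
flagged = flag_retracted(references, known_retractions)
print(flagged)
```

A production check would also need to handle references that lack a DOI, and could go one step further by confirming that each flagged reference is accompanied by a citation to its retraction notice.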

Conclusion/Future Work

The main contributions of our research are:

• Determination of the impact of self-cites of retracted work on the citation network.
• Exploration of the correlation between impact factor and after-retraction citation counts.
• Presentation of a classification of retracted articles with newly identified categories, and the trends for each category.
• Investigation of retraction trends by writer, comparing those by author with those by editor.

Controlling for the number of years in the before and after timeline could further refine our results. For instance, in our study we include articles that were retracted between 2003 and 2010. The articles retracted in 2003 have had more time to gain citations. Also, articles are retracted at different times after the original article is published, so the amount of time in which they could accumulate before-retraction citations varies. Furthermore, the citation rates in biomedical journals vary (Zitt et al. 2005), which further complicates the task of controlling for variability. With more data, we would be able to correct for these differences.

Acknowledgments This publication was supported in part by Grant Number UL1RR024979 from the National Center for Research Resources (NCRR), a part of the National Institutes of Health (NIH). Its contents are solely the responsibility of the authors and do not necessarily represent the views of the CTSA or NIH. We also thank Todd Papke for his comments on this work.

References

Broadus, R. N. (1983). An investigation of the validity of bibliographic citations. Journal of the American Society for Information Science, 34(2), 132–135. doi:10.1002/asi.4630340206.
Campanario, J. (2000). Fraud: Retracted articles are still being cited. Nature, 408(6810), 288.
Cokol, M., Ozbay, F., & Rodriguez-Esteban, R. (2008). Retraction rates are on the rise. EMBO Reports, 9(1), 2. doi:10.1038/sj.embor.7401143.
Couzin, J., & Unger, K. (2006). Cleaning up the paper trail. Science, 312(5770), 38–43. doi:10.1126/science.312.5770.38.
Fang, F. C., Steen, R. G., & Casadevall, A. (2012). Misconduct accounts for the majority of retracted scientific publications. Proceedings of the National Academy of Sciences, 109(42), 17028–17033. doi:10.1073/pnas.1212247109.
Franzen, M., Rodder, S., & Weingart, P. (2007). Fraud: Causes and culprits as perceived by science and the media. EMBO Reports, 8(1), 3–7. doi:10.1038/sj.embor.7400884.
Friedman, P. J. (1990). Correcting the literature following fraudulent publication. JAMA: The Journal of the American Medical Association, 263(10), 1416.
Neale, A. V., Dailey, R. K., & Abrams, J. (2009). Analysis of citations to biomedical articles affected by scientific misconduct. Science and Engineering Ethics, 16(2), 251–261. doi:10.1007/s11948-009-9151-4.
Redman, B. K., Yarandi, H. N., & Merz, J. F. (2008). Empirical developments in retraction. Journal of Medical Ethics, 34(11), 807–809. doi:10.1136/jme.2007.023069.
Simkin, M., & Roychowdhury, V. (2006). Do you sincerely want to be cited? Or: Read before you cite. Significance, 3(4), 179–181. doi:10.1111/j.1740-9713.2006.00202.x.
Sox, H. C., & Rennie, D. (2006). Research misconduct, retraction, and cleansing the medical literature: Lessons from the Poehlman case. Annals of Internal Medicine, 144(8), 609.
Steen, R. G. (2011). Retractions in the scientific literature: Is the incidence of research fraud increasing? Journal of Medical Ethics, 37(4), 249–253. doi:10.1136/jme.2010.040923.
Unger, K., & Couzin, J. (2006). Even retracted papers endure. Science, 312(5770), 40–41. doi:10.1126/science.312.5770.40.
Wager, E., Barbour, V., Yentis, S., Kleinert, S., et al. (2009). Retractions: Guidance from the Committee on Publication Ethics (COPE). Maturitas, 64(4), 201–203.
Wager, E., & Williams, P. (2011). Why and how do journals retract articles? An analysis of Medline retractions 1988–2008. Journal of Medical Ethics. doi:10.1136/jme.2010.040964. http://www.ncbi.nlm.nih.gov/pubmed/21486985.
Zitt, M., Ramanana-Rahary, S., & Bassecoulard, E. (2005). Relativity of citation performance and excellence measures: From cross-field to cross-scale effects of field-normalisation. Scientometrics, 63(2), 373–401. doi:10.1007/s11192-005-0218-y.
