Structure

Letter

AppCiter: A Web Application for Increasing Rates and Accuracy of Scientific Software Citation

Stephanie M. Socias,1,2 Andrew Morin,1,2 Michael A. Timony,1 and Piotr Sliz1,*

1Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115, USA
2Co-first author
*Correspondence: [email protected]
http://dx.doi.org/10.1016/j.str.2015.04.005

The AppCiter web application presented here was developed for structural biologists and provides current and exhaustive citation information for structural biology software programs. Users are guided through a categorized listing of programs to view, select, and export the citations most applicable to their work. Citation is the common metric of scientific utility and often a critical factor in decisions on further development of scientific software, yet identifying a correct set of citations is often a complex process.

The symbiotic relationship between computing and science can be seen across all scientific disciplines. From simulation to data management, computing facilitates an exploration beyond theory and experiment that leads to unprecedented knowledge discovery (Bell et al., 2009; de la Iglesia et al., 2013). Whereas such scientific advancements are recognized through the citation of resulting research publications, the computational tools and techniques that underpin such progress do not receive the same recognition (Ceguerra et al., 2013). Existing attribution metrics fail to properly recognize the nontraditional scholarly output of individual scientists as programmers and software creators (Hames, 2012; Morin et al., 2012). These scientist-programmers rely on proper citation to receive the recognition and benefits associated with useful tool creation and dissemination.

Citation is the common measure of scientific utility and is considered the currency of scientific achievement (Glänzel, 2008). Citation metrics are often used in decisions on career advancement, research funding, and strategy. Beyond issues of incentives and recognition, the reproducibility of computational results depends on the accurate citation of the programs employed in research; even the simple omission of a program version can lead to significantly different outcomes and results. It is therefore in the interest of scientific advancement to properly reward and incentivize the creation of computational tools through full and accurate citation.

Structural biology as a scientific discipline is especially reliant on computational tools. Some of the earliest uses of scientific computing were in elucidating the form and structures of biological molecules (Campbell, 2002). Many of these valuable computing tools are created by other practicing scientists, who have long suspected that rates of citation do not accurately reflect rates of use (Morin and Sliz, 2013; Hannay et al., 2009).

There are several potential reasons why proper citation of scientific software can be difficult to achieve. Proper citation of a scientist-created piece of software typically means referencing the original journal article that first described it. Although program developers often attempt to provide proper reference information in a window within the program, in program documentation, or in a text file included with the program distribution, manuscript authors frequently encounter difficulty locating and parsing accurate and up-to-date citation information. Even when accurate information is found, it can be difficult to correlate multiple citations with distinct program versions. Successive versions of a program often have different contributors, each due credit for their work. Additionally, program collections or suites can contain numerous individual programs created by various contributors, and accurately citing the collection does not constitute proper citation of an individual program contained within it.

The difficulties of proper citation have not gone unnoticed. The creators of the PHENIX software suite in particular are making greater attempts to help users locate accurate and applicable citation information by offering in-program guidance. PHENIX now provides users with a listing of citation information for all packages used for a given project. Outside of this single instance of in-program guidance, however, the difficulties of proper citation remain.

To address these issues, the SBGrid Software Consortium (Morin et al., 2013) has created AppCiter. AppCiter is a web application that generates accurate, up-to-date citations of software used in structural biology research. Built on a database of current and historical citation information for 295 structural biology programs and program suites, AppCiter serves both program developers and program users by bringing the latest citation data from developers directly to users. Hundreds of program developers were contacted to verify and update their program's citation information in the AppCiter database. These developers had direct input into which citations and annotations are displayed alongside their programs. Currently, the AppCiter database contains 770 citations for 249 of the 295 SBGrid-supported programs. These programs have anywhere from 1 to 41 citations each. Nearly half of these programs have 2 or more associated citations, and a quarter have 3 or more. AppCiter guides users through this large amount of information to ensure the selection of only the most applicable and appropriate citations.

AppCiter is accessible via any standard web browser at http://www.sbgrid.org/software/. In three simple steps, users can create a custom list of citations. Users navigate an extensive listing of categorically organized programs; view and select available citations for each program, subprogram, and program version; and export selected citations in various file formats (Figure S1). Export options currently include BibTeX and RIS file formats for importing into bibliographic and reference management programs, as well as a plain-text (.txt) output option in APA reference style for copying and pasting directly into documents. Files may be saved directly to the user's device or, optionally, emailed to a user-provided email address. Text descriptions of each program, along with notes and developer comments, are included to guide users in their citation selections.

Maintenance of the AppCiter database depends on close interaction with the structural biology community. To keep the information accurate and up to date, AppCiter provides a link alongside the existing citations where program developers may directly submit updated or corrected citation information, notes, or advisories. Developers also receive a biyearly email containing a snapshot of their program's current citation data within AppCiter. They may respond to that email with additional updates or corrections.

The AppCiter web tool is anticipated to help improve both the accuracy and rates of citation for software used in structural biology research by making it easier for users-cum-authors to obtain correct, current citation information from a centralized and comprehensive software database. Program developers will benefit from increased rates and accuracy of citation, whereas users are assured easy access to the most up-to-date citation information available. By helping to better align incentives with reward for the creators of scientific software, citation tools like AppCiter yield benefits for the institution of scientific research.

SUPPLEMENTAL INFORMATION

Supplemental Information includes one figure and can be found with this article online at http://dx.doi.org/10.1016/j.str.2015.04.005.
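As an illustration of the BibTeX and RIS export formats mentioned in the text, the following minimal Python sketch renders one citation record in both formats. It is not AppCiter's implementation: the `Citation` class, its field names, the citation key, and the two functions are hypothetical, and placing the article identifier in the pages field is a simplifying assumption. The example record uses only the details given in the reference list for Morin et al. (2013).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Citation:
    """One journal-article citation. Field names are illustrative only."""
    key: str            # hypothetical BibTeX citation key
    authors: List[str]  # each author as "Family, Initials"
    year: int
    journal: str
    volume: str
    pages: str          # page range or article identifier


def to_bibtex(c: Citation) -> str:
    """Render the citation as a BibTeX @article entry."""
    fields = {
        "author": " and ".join(c.authors),
        "year": str(c.year),
        "journal": c.journal,
        "volume": c.volume,
        "pages": c.pages,
    }
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@article{{{c.key},\n{body}\n}}"


def to_ris(c: Citation) -> str:
    """Render the citation as an RIS record (one tag per line, closed by ER)."""
    lines = ["TY  - JOUR"]
    lines += [f"AU  - {author}" for author in c.authors]
    lines += [
        f"PY  - {c.year}",
        f"JO  - {c.journal}",
        f"VL  - {c.volume}",
        f"SP  - {c.pages}",
        "ER  - ",
    ]
    return "\n".join(lines)


# Example record, taken from this article's reference list.
SBGRID_2013 = Citation(
    key="morin2013",
    authors=[
        "Morin, A.", "Eisenbraun, B.", "Key, J.", "Sanschagrin, P.C.",
        "Timony, M.A.", "Ottaviano, M.", "Sliz, P.",
    ],
    year=2013,
    journal="Elife",
    volume="2",
    pages="e01456",
)

if __name__ == "__main__":
    print(to_bibtex(SBGRID_2013))
    print()
    print(to_ris(SBGRID_2013))
```

Either output can be saved to a `.bib` or `.ris` file and imported into a reference manager. A real export would also carry the article title and the program version the citation applies to.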

ACKNOWLEDGMENTS

This work was supported by National Science Foundation grant 1448069.

REFERENCES

Bell, G., Hey, T., and Szalay, A. (2009). Science 323, 1297–1298.

Campbell, I.D. (2002). Nat. Rev. Mol. Cell Biol. 3, 377–381.

Ceguerra, A.V., Liddicoat, P.V., Ringer, S.P., Goscinski, W.J., and Androulakis, S. (2013). A tool for scientific provenance of data and software. In Proceedings of IEEE 16th International Conference on Computational Science and Engineering, pp. 561–565.

de la Iglesia, D., García-Remesal, M., de la Calle, G., Kulikowski, C., Sanz, F., and Maojo, V. (2013). Curr. Top. Med. Chem. 13, 526–575.

Glänzel, W. (2008). Collnet J. Scientometrics Inf. Manage. 2, 9–17.

Hames, I. (2012). Report on the International Workshop on Contributorship and Scholarly Attribution, May 16, 2012 (Harvard University and the Wellcome Trust).

Hannay, J.E., MacLeod, C., Singer, J., Langtangen, H.P., Pfahl, D., and Wilson, G. (2009). How do scientists develop and use scientific software? In Proceedings of the 2009 ICSE Workshop on Software Engineering for Computational Science and Engineering, pp. 1–8.

Morin, A., and Sliz, P. (2013). Biopolymers 99, 809–816.

Morin, A., Eisenbraun, B., Key, J., Sanschagrin, P.C., Timony, M.A., Ottaviano, M., and Sliz, P. (2013). Elife 2, e01456.

Morin, A., Urban, J., Adams, P.D., Foster, I., Sali, A., Baker, D., and Sliz, P. (2012). Science 336, 159–160.

Structure 23, 807–808, May 5, 2015 ©2015 Elsevier Ltd All rights reserved
