
Scientific Disintegrity as a Public Bad

Perspectives on Psychological Science 2015, Vol. 10(3) 361–379. © The Author(s) 2015. DOI: 10.1177/1745691615577865. pps.sagepub.com

Christoph Engel, Max Planck Institute for Research on Collective Goods

Corresponding author: Christoph Engel, Max Planck Institute for Research on Collective Goods, Kurt-Schumacher-Straße 10, D-53113 Bonn, Germany; e-mail: [email protected]

Abstract

In this article, I argue that scientific dishonesty essentially results from an incentive problem; I do so using a standard economic model—the public bad. Arguably, at least in the short run, most scientists would increase their personal utility by being sloppy with scientific standards. Yet, if they do, it becomes more difficult for all scientists to make their voice heard in society, to convince policy makers to assign public funds to academia, and to lead fulfilling academic lives. The nature of the ensuing governance problem (and appropriate policy intervention) hinges on the definition of scientists’ utility function. The policy problem is less grave if society attaches disproportionally more weight to severe or widespread violations and if individual scientists do not precisely know in advance when they will quit their academic lives. If most scientists internalize most scientific standards, then the problem is alleviated. However, internalization is immaterial if honorable scientists dislike that others advance their careers by violating those standards. Sanctions are helpful, even if relatively mild. However, it is important to also punish those who do not punish others for breaking the rules or, alternatively, to put some centralized mechanism for vigilance and enforcement into place.

Keywords: scientific misconduct, incentives, public bad, intervention

Thou shalt not cheat. Much of the lively debate on standards of scientific integrity has a biblical overtone. Scientists are reminded of the conditions for empirical evidence to be reliable, and they are exhorted to take them seriously. It certainly is helpful to explain why some practices may lead to questionable results. It is equally helpful to appeal to scientific honor. However, scientists are not angelic beings, free from personal interest. It is not farfetched to believe that scientific standards are not exclusively violated by accident. At least in the short run, it might be advantageous for the individual scientist to take scientific standards lightly. She might advance her career, win a grant, outperform her peers, write more articles, boost her self-esteem, or simply save effort. In this article, I contribute such an incentive perspective to the policy debate on scientific integrity. I show how the problem might result from a conflict between individual and social rationality (i.e., from a dilemma). This is a pure theory article, but it generates a series of testable hypotheses. Provided these hypotheses are supported by the data (or provided they are taken to be convincing without testing), recommendations for interventions follow.

Scientific Disintegrity in the Spotlight

In recent years, psychology has been rocked by scandals, including several instances of confirmed fraud (Levelt Committee, Noort Committee, & Drenth Committee, 2012, p. 25). These scandals have led to considerable acrimony, and much of the discussion has a moral undertone. In the long run, blaming individual scientists may not be the most promising approach to improve scientific standards. A more disinterested approach may be more effective. In a way, this conviction also guides the movement in which researchers call for more rigorous standards in data collection, data analysis, and reporting (Bakker, van Dijk, & Wicherts, 2012; Fiedler, 2011; Fuchs, Jenny, & Fiedler, 2012; John, Loewenstein, & Prelec, 2012; Kerr, 1998; Simmons, Nelson, & Simonsohn, 2011). Yet, implicitly, this movement is fairly optimistic.

Most researchers contributing to this methodology literature seem to believe that if scientists are made aware that certain scientific practices are questionable, most of them will behave more prudently. Research shows that threats of sanctions are not entirely pointless but that deterrence effects are certainly not guaranteed (not even for the death penalty; Nagin & Pepper, 2012). It therefore seems timely to treat scientific integrity as a governance problem (cf. Nosek, Spies, & Motyl, 2012, calling for changing incentives). Before one designs interventions, one should understand why and to what extent there actually is a governance problem. This is what I set out to do in the current article.

A Rational Choice Perspective on Scientific Disintegrity

Understanding requires theory. The theory on which I build in this article is microeconomics—a branch of rational choice theory. The analysis I present is, of course, a radical simplification for understanding scientific disintegrity, but it is thought-provoking nonetheless. To substantiate my claim, I rely on a translation of rational choice theory into a well-worn model. The foundation of this model will be familiar to many psychologists: the prisoner’s dilemma. In the most familiar version of the prisoner’s dilemma, two players choose whether to cooperate or defect. The example of scientific integrity, a “public good”—or disintegrity, a “public bad”—differs from the familiar version of the prisoner’s dilemma in two ways. First, the action space is continuous. Actors are free to cooperate a little or a lot. Second, rather than having only two players, to understand scientific disintegrity, one needs an N-person game (a canonical treatment is provided by Cornes & Sandler, 1996; a good overview of later developments is provided by Fehr & Schmidt, 2006). The backbone of the model is a negative externality. If one researcher is sloppy with scientific standards, she inflicts harm on other researchers and on the scientific community at large (Lacetera & Zirulia, 2011). This model is not confined to scientific fraud. Although fraud unfortunately occurs (for an overview, see Stroebe, Postmes, & Spears, 2012), the empirical evidence suggests that fraud is the least important part of the problem of disintegrity (see, e.g., the evidence by John et al., 2012). The true governance problem originates in misconduct of a much less obvious and dramatic—but still detrimental—character; the question is how it might possibly be contained. Before I embark on the theory, let me sketch the goal: I want to see why an individual scientist might be tempted to disregard rules of scientific integrity, despite the fact that violating these rules is not only bad for science at large but ultimately also for this scientist.

In more elaborate versions of the model, I also want to see how this prediction hinges on the assumptions I make, and I want to define which properties interventions must have if they are to get the problem under control. The basic building block of the model is the assumption that each scientist must rely on society putting generalized trust in the integrity of scientific work. Otherwise, society would stop financing universities and research institutions and would turn universities into mere schools of higher education. Yet, arguably, what a scientist who oversteps one of the rules of scientific conduct gains in the short run is larger than what she stands to lose in the long run by contributing to an erosion in the perception of scientific integrity. It is this conflict between individual and social rationality, and possible ways to overcome it, that is captured by the model.

The Baseline Dilemma

The purpose of (rational choice) modeling

The good thing about models is that they are precise. The bad thing about models is that they are precise. If the model is cleanly written, all the assumptions on which the model builds are made explicit; all the components are lined up, all channels of influence are defined, and the model makes clear predictions. Yet, there is a price to be paid for clarity. Everything that is not built into the model is not part of the argument. Models are not meant to be descriptions of the real-life phenomenon that the modeler is interested in. The modeler consciously abstracts from almost any intervening variable to make the effect that she is interested in visible. I do so in several steps. For expositional reasons, I start with a version of the model that most readers will consider to be a caricature: Using this model, I predict that all scientists break all rules of scientific conduct at all times (at least as long as they can get away with it). This is certainly not a faithful picture of academic reality. Yet, this caricature serves as a challenge. It forces me to specify the additional assumptions that are needed to generate more plausible predictions. Readers will be led to ask themselves which of these alternative ways of enriching the utility function of individual scientists—or of the definition of science as an opportunity structure—resonate with impressions they have gained from their professional environment. (For the sake of clarity, I use mathematical notation in the Appendix. However, no advanced mathematical skills are needed to follow the main text.) It is assumed in the model that individual scientists maximize some well-defined utility function. Do they? Most likely they do not in the strict, formal sense. In that sense, I do not offer a process model in this article.


Table 1.  Baseline Model: How Much You and All Others Suffer From a Dirty Flat

                               Total number of flatmates who clean
Condition                          4       3       2       1       0
Your enjoyment
  You clean                       20      10       0     −10       —
  You do not clean                 —      30      20      10       0
Total community enjoyment         80      60      40      20       0

Decomposition of total community enjoyment: 80 = 4 × 20; 60 = 3 × 10 + 1 × 30; 40 = 2 × 0 + 2 × 20; 20 = 1 × (−10) + 3 × 10; 0 = 4 × 0.

Note: The arrows indicate your personal problem. If you follow the arrow—that is, if you decide to litter (maximally)—you always gain 10 units of personal enjoyment, irrespective of how many other flatmates are well behaved. Total community enjoyment is the sum of individual enjoyments. If you are the only flatmate to spoil the flat, total enjoyment is given by 3 × 10 + 30 = 60. If you do not misbehave, the flat instead enjoys 4 × 20 = 80 units.

Yet, the claim also does not seem farfetched that the perceived problem of scientific policy is related to the fact that at least a nonnegligible fraction of scientists take into account how their scientific conduct affects their career and personal well-being.

Public good (and public bad) models

Many things in life can be modeled as public goods: the environment, knowledge, institutions, language, and peace. The production or preservation of such goods requires effort and often also the willingness to take risk. Yet, those who exert this effort, and those who bear this risk, have a hard time preventing others from benefiting from their efforts. If one firm invests in the abatement of emissions, everybody in that community enjoys cleaner air. Moreover, if I breathe fresh air, this does not prevent you from breathing that same air. Pure public goods are thus characterized by two properties: Nobody is excluded from using the good, and there is no rivalry in consumption. Pure public goods pose a thorny policy problem. If everybody maximizes individual payoff, there is a genuine social dilemma: Nobody contributes anything to the provision of the public good, for two combined reasons. If I do not contribute, but others do, I get a free lunch. If I contribute, whereas others do not, I am exploited. A pure public good thus constitutes a prisoner’s dilemma. Throughout the article, I use one numerical example. Let us first apply it to a straightforward public bad—that is, the mirror image of a public good in the negative domain. It is much more natural to interpret scientific disintegrity as a public bad, rather than scientific integrity as a public good to which each scientist contributes by not disregarding the rules of scientific conduct, which is why I work with this mirror image.

The Baseline Model (Student Flat Version)

Let us assume that four students share a flat. Each student enjoys living in the flat quite a bit and, if hard pressed to assign a number to this enjoyment, would say that she receives 20 units of utility. However, each student finds it tedious to clean the flat. If she lets the flat get a little less proper, she personally gains more than she loses. Let us assume that she gains twice as much as she loses—she gains 20 for not having to work, but she loses 10 because the flat is less clean. Therefore, if she does not clean, her utility is 20 (for living in the flat) + 20 (for not having to clean) − 10 (for the flat being partly dirty) = 30. Because all the flatmates prefer the flat to be tidy, all the others suffer as well (losing 10 each) if one of them starts using the flat without cleaning up the dirt she causes. The resulting incentives can be found in Table 1 for the extreme case in which each individual flatmate is either perfectly well-behaved or does no cleaning at all.1 Welfare—that is, the total enjoyment of all flatmates—can be found in the bottom row. If all flatmates clean the flat whenever they have made it dirty, all are best off. Each of them receives 20 units of enjoyment, and all of them together receive 80 units of enjoyment. Is this what a selfish flatmate would do? Unfortunately, the answer is no. The arrows characterize the reasoning of a selfish individual. Take the leftmost arrow: If all are perfectly well behaved, all flatmates receive 20 units of enjoyment. How does that compare with the situation in which all others still clean, but one individual stops doing it? As explained earlier, this individual receives 30 units of enjoyment. By “deviating to defection,” as game theorists would put it, she gains 10 units. This individual gain comes at a substantial social cost though: Three loyal flatmates lose 10 units of enjoyment each.

Consider the bottom row: Whenever a flatmate stops cleaning, the whole flat loses 20 units of enjoyment (the noncleaner gains 10 units, but the others lose a total of 30 units). Yet, a selfish flatmate does not care. Really? What if she is not the only selfish inhabitant of the flat? In this example, a second selfish tenant suffices to destroy the first one’s benefit from selfishness. If two people forget about cleaning, despite their personal gains from using the flat, overall each of them just comes out even. The additional personal gain of 10 units is eaten up by the additional loss of 10 units resulting from the carelessness of the other selfish flatmate.2 However, why should one be concerned? Is it not obvious that all inhabitants would foresee this outcome that none of them desire and just contribute to the cleaning of the flat in the first place? Unfortunately, the answer is again no, even if all of them have perfect foresight and if all correctly expect all others to be selfish.3 The reason for this gloomy prediction follows from the arrows. The arrows illustrate the choice problem even of a socially minded individual: Irrespective of how many other flatmates are prepared to clean, she is always better off using the flat without cleaning. Each flatmate therefore has an incentive to misbehave herself, if only to preempt being exploited by other selfish flatmates. The “tragedy of the commons” hits the flat (Olson, 1965).
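The arithmetic behind Table 1 is simple enough to be spelled out explicitly. The following sketch (Python; it is not part of the original article, and all parameter and function names are illustrative) computes a flatmate's payoff from the three ingredients of the example (an endowment of 20, a private gain of 20 from not cleaning, and a damage of 10 per non-cleaner felt by every flatmate) and reproduces the cells of Table 1.

```python
# Minimal sketch of the baseline public-bad (student flat) example of Table 1.
# Not from the article; parameter and function names are illustrative.
ENDOWMENT = 20      # enjoyment of living in the flat
PRIVATE_GAIN = 20   # saved effort if a flatmate does not clean
DAMAGE = 10         # loss that every flatmate suffers per non-cleaner
N = 4               # number of flatmates

def payoff(i_clean: bool, other_non_cleaners: int) -> int:
    """Payoff of one flatmate, given her own choice and the others' choices."""
    non_cleaners = other_non_cleaners + (0 if i_clean else 1)
    gain = 0 if i_clean else PRIVATE_GAIN
    return ENDOWMENT + gain - DAMAGE * non_cleaners

for cleaners in range(N, -1, -1):                 # total number who clean
    cells = {}
    if cleaners >= 1:                             # "you clean" is feasible
        cells["you clean"] = payoff(True, N - cleaners)
    if cleaners < N:                              # "you do not clean" is feasible
        cells["you do not clean"] = payoff(False, N - cleaners - 1)
    total = (cleaners * payoff(True, N - cleaners)
             + (N - cleaners) * payoff(False, N - cleaners - 1))
    print(cleaners, cells, "community total:", total)
```

Whatever the others do, not cleaning is worth 10 more to the individual, while the community total drops by 20 for every flatmate who stops cleaning, which is exactly the dilemma described in the text.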

Scientific disintegrity as a public bad

In which ways might this model, as applied to flatmates’ tidiness, help to understand scientific disintegrity? Let us replace flatmates with scientists and look at the incentives of individual scientists. For interpretation, the critical component of the model is the endowment. In the previous section, this was the enjoyment from a perfectly tidy flat. To make the model meaningful for scientific integrity, one should not interpret the analogue as money, human capital, zeal, or any other private good. All of these ingredients may well be indispensable for success in academia. However, using the model I focus on another necessary condition for academic success: generalized trust in academia. Scientists should of course be able to explain to laypeople what they have found out and why it matters. However, to a large extent the consumers of scientific findings have to take it for granted that these findings have been properly generated and that they are presented with the necessary qualifications and safeguards. It is this component that is captured by the model through the endowment. By the very fact of being a member of the scientific community, in each and every project a scientist engages the generalized trust that society puts into science.

From this, the interpretation of the selfish act4 naturally follows. It is the degree by which an individual scientist abuses the generalized trust in science. It is assumed in the model that scientific disintegrity is profitable for the individual scientist. You could think of the scientist advancing her career, her reputation in the scientific community, increasing the pride that she derives from publishing in good journals, or the self-esteem from knowing that she has been the first to gain an insight. It is further assumed in the model that scientific disintegrity comes at a private cost. The scientist does not enjoy putting general trust in science in peril. The scientist might have a bad conscience. She might have internalized scientific standards. She might also be afraid of negative effects on her career, reputation, pride, and self-esteem if—at some later point in time—her dishonesty becomes public. Yet, it is assumed in the model that this private cost is smaller than the private benefit. In this model, there is temptation to cheat. The scientist is, however, not myopic. She is fully aware of the disutility that her disintegrity imposes not only on herself but on all members of the scientific community. She only assesses this additional cost to be smaller than the private benefit minus the private cost. This feature of the model is easiest to motivate by uncertainty. As various cases have made obvious, even outright scientific fraud may go unnoticed for a long time. On first reading it may seem unconvincing that the damage to the scientist engaging in disintegrity should be modeled to be the same as the damage to any other scientist. Presumably, if the dishonesty is found out, the dishonest scientist suffers much more than her innocent peers. Yet, within the framework of the model, this additional potential drawback is already captured by the fact that her benefit from her action is an expected value. (In the example, dishonesty only gets her 20; arguably the benefit from dishonesty is much bigger—say 40—as long as it goes undetected, but it is 0 if detected. If the scientist expects detection with probability ½, the expected value of dishonesty is ½ × 40 + ½ × 0 = 20.) In the model, this is already netted out, and damage really only captures the effect on generalized trust, not on individual trust in this one scientist. Although all these elements of the interpretation have their plausibility, most scientists will have a hard time accepting the model prediction. If this model gets it right, there should be no such thing as scientific integrity. Provided all scientists maximize individual utility, they all should maximally abuse generalized trust in science. The tragedy of the commons should long ago have made science an unsustainable endeavor. Society should long ago have stopped putting any trust into scientific findings and using any funds to finance the enterprise.
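The expected-value netting in the parenthetical above can be checked directly; the detection probability of ½ and the undetected benefit of 40 are the illustrative numbers from the text, nothing more.

```python
# The expected-value argument from the example: the benefit of dishonesty is 40
# if it goes undetected and 0 if it is detected, with detection probability one half.
p_detect = 0.5
expected_benefit = (1 - p_detect) * 40 + p_detect * 0
print(expected_benefit)  # 20.0, the net benefit used in the baseline payoffs
```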


Table 2.  Damage to Trust in Science Is Uncertain

                                             Number of scientists who respect the rules of scientific conduct
Condition                                        4       3       2       1       0
Your utility
  You respect                                   20    12.5       5    −2.5       —
  You do not respect                             —    32.5      25    17.5      10
Total utility of the scientific community       80      70      60      50      40

Note: Payoffs are represented from an ex-ante perspective—that is, in expected values. Compared with the baseline model (see Table 1), in this model the personal gain from violating the rules of scientific conduct is held constant (at 20). If trust in science is affected, again as in the baseline, the negative effect on the scientist who misbehaves is half as severe as the personal gain from the violation. However, in this version of the model, damage to this member and all other members of the scientific community is no longer certain but only occurs with 75% probability. Hence, if a scientist is alone in violating the rules of scientific conduct, and does so maximally, she gains 20 + 20 − (1/2) × (3/4) × 20 = 32.5.

Obviously, there must be something wrong with this seemingly convincing model. Researchers studying other public goods and public bads have faced similar challenges. There is a whole literature, both theoretical and empirical, in which researchers try to understand in which ways the commons is usually not an outright tragedy but still a drama (Ostrom et al., 2002). In the same spirit, in the sections to follow, I gradually relax assumptions of the baseline model, demonstrate the ways that this changes predictions, and discuss the appropriateness of the changes for understanding scientific disintegrity.

Certain Versus Possible Damage to Trust in Science

Most scientists believe that outright scientific fraud is rare (see the evidence presented by Stroebe et al., 2012). Arguably, the core of the governance problem is not the “fabrication, falsification or unjustified replenishment of data, as well as the whole or partial fabrication of analysis results” (Levelt Committee et al., 2012, p. 17). Minor irregularities, and in particular the practices that increase the risk of false positives, are much more likely. Possibly, several of these practices are even innocent. However, there is a positive risk that the resulting data are not reliable, and many researchers would admit that they are aware of the risk. Society might react differently to intentional fraud versus small or inadvertent errors: Damage to trust in science is (almost) certain if outright fraud is reported, whereas damage is only a possibility if minor violations are discovered. In legal terms, this shifts the governance problem from intentionally causing harm (to the scientific community and to society at large) to negligence. Within the framework of the model, this qualification—that there is only a possibility that generalized trust in science will be reduced—is easy to capture.

One qualifies damage to trust in science by a probability less than 1. Social damage is no longer automatic but comes as a risk. In two respects, this is good news. Even if all scientists completely disregard the rules of scientific conduct, trust in science may remain intact; risks need not materialize.5 For planning purposes, it is logical to weight uncertain outcomes by their probabilities, that is, to work with expected values. If bad behavior does not always have bad consequences, in expectation the scientific community suffers less. Comparing the last rows of Tables 1 and 2, one can see that, in absolute terms, the scientific community is always better off if damage is only possible but not certain. If communal damage is certain and the dilemma completely materializes, trust in science completely disappears (the group’s payoff is 0); however, if any one violation of scientific integrity only affects trust in science with probability 75%, the expected group payoff is only cut in half (it is 40 instead of 80).6 Yet, if an erosion of trust in science is only a possibility, this aggravates the dilemma.7 If damage is certain, the scientist is at least partly disciplined by the negative effects of her own bad behavior on her own utility (in Table 1, the net gain from maximally bad behavior is only 10). However, if damage is only a possibility, her own bad behavior only cuts into her own payoff with some probability (in Table 2, the net expected gain from maximally bad behavior is 12.5). If a scientist exclusively maximizes her own utility, this comparison is all she cares about. Misbehaving becomes even more appealing.8
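The shift from Table 1 to Table 2 can be reproduced directly: the damage term is simply multiplied by the probability that trust in science is actually affected. The sketch below is illustrative only; the names are mine, and the 75% probability follows the numerical example in the text.

```python
# Sketch of the "possible damage" variant (Table 2): the damage to trust only
# materializes with probability P_DAMAGE, so payoffs are expected values.
ENDOWMENT, GAIN, DAMAGE, P_DAMAGE = 20, 20, 10, 0.75

def expected_payoff(i_violate: bool, other_violators: int) -> float:
    violators = other_violators + (1 if i_violate else 0)
    own_gain = GAIN if i_violate else 0
    return ENDOWMENT + own_gain - P_DAMAGE * DAMAGE * violators

print(expected_payoff(True, 0))   # 32.5: the lone violator of the Table 2 note
print(expected_payoff(False, 1))  # 12.5: a faithful scientist facing one violator
print(expected_payoff(True, 0) - ENDOWMENT)  # 12.5: net expected gain, versus 10 in Table 1
```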

Gravity of the Offense

In the first two models, each scientist can rationally choose to misbehave. A scientist who does so individually benefits even though generalized trust in science (like the tidiness of the flat) erodes.


Table 3.  Gravity of the Offense

                                             Number of scientists who respect the rules of scientific conduct
Condition                                        4       3       2       1       0
Your utility
  You respect                                   20    15.5      11     6.5       —
  You do not respect                             —    18.5      14     9.5       5
Total utility of the scientific community       80      65      50      35      20

Note: In Table 3, it is assumed that the effect of scientific misconduct on this and all other members of the scientific community grows disproportionally (quadratically) with the gravity of the individual offense. In this table, the borderline case (in integer units) is investigated in which it is just no longer profitable to increase the degree of scientific misconduct (which happens at 3 units). If an individual scientist engages in this degree of scientific misconduct, she gains 20 + 3 − (1/2) × 3² = 18.5.

In these models, each action, by each scientist, creates the same amount of damage. However, what if society were to care more about some types of violations than others? For example, making outsiders aware that some scientists have tested hypotheses using unspecified covariates is not likely to create the same degree of skepticism about scientific findings as learning that scientists have faked their data. Furthermore, what if society were to care about the number of people who had previously violated the scientific standards because of a worry about the overall level of misconduct and the erosion of scientific norms? In the next two models, I try to capture society’s sensitivity to the gravity of the offense. There are many ways to do so. Technically, one must introduce a nonlinearity into the model. Tables 3 and 4 demonstrate two possibilities. In both tables, I use a quadratic term.9

Society cares about the gravity of the individual violation

Let us first consider Table 3. The table represents a situation in which society disproportionally cares about the degree by which individual scientists violate the rules of scientific conduct. Suppose society assigns a weight of −1 to not reporting all measures that have been collected and a weight of −5 to not reporting a replication that has failed to support the claim made in the article. In the baseline model, the following two scientists cause the same amount of societal disdain: The first scientist has not reported some measures in five articles; the second scientist has hidden a failure to replicate in one article. With the quadratic term, however, total disdain remains 5 for the first scientist, but it is 25 for the second scientist, who committed the more grave offense. Of course, this remains a model about a public bad.

Consequently, these numbers measure by how much each member of the relevant scientific community suffers from one of them engaging in these questionable practices. The seemingly minor shift from a linear to a quadratic damage term has profound effects. Now each individual scientist is much more likely to damage herself if she misbehaves. For small violations of rules of scientific conduct, this will not be the case. However, more severe violations are against the scientist’s own self-interest. If the scientist maximally abuses trust in science, the damage to herself more than outweighs the immediate gains. Using this model, I thus predict that only small violations should occur.10 This is illustrated by Table 3. In this table, I focus on the degree of scientific misconduct that is just barely no longer individually profitable. As in the baseline, each scientist may choose a degree of misconduct between 0 (complete scientific integrity) and 20 (total disintegrity). Misconduct still translates into a private gain of the same degree. Half of the ensuing damage is still felt by the scientist herself as well as by any other member of the scientific community. Yet, society attaches higher weight to more serious violations. For the smallest violations, the difference does not matter (1² = 1). In this example, for a selfish scientist, violations of the second degree are neither profitable nor detrimental (she gains 2 but loses 2²/2 = 2). Yet, violations of the third degree are a bad idea, even if the scientist does not care about the repercussions on the scientific community (she gains 3 but loses 3²/2 = 4.5 and, hence, loses 1.5 units of utility). This is the example represented in Table 3. If the three remaining members of the scientific community faithfully obey the rules of scientific conduct, they all suffer 4.5 units from the first scientist breaking a rule of the third degree (and have no compensating gain from themselves breaking another rule of conduct).


Table 4.  The Erosion of Scientific Norms

                                             Number of scientists who respect the rules of scientific conduct
Condition                                        4       3       2       1       0
Your utility
  You respect                                   20    15.5       2   −17.5       —
  You do not respect                             —    18.5       5   −15.5     −49
Total utility of the scientific community       80      65      14     −64    −196

Note: In Table 4, it is assumed that the effect of misconduct grows disproportionally (quadratically) with the total amount of misconduct. In this table, the borderline case (in integer units) is investigated in which it is just no longer profitable to increase the degree of scientific misconduct (which happens at 3 units) for an individual scientist expecting all her peers to abide by the rules of scientific conduct. Take the case in which two scientists engage in third-degree scientific misconduct. Both gain 20 + 3 − (1/2) × (3 + 3)² = 5.

This difference in payoffs is preserved if this scientist thinks that she is not the only one who compares personal gain with personal damage. In the example, if she and one more scientist engage in misconduct of the third degree, she has a payoff of 14, whereas she expects 15.5 if she refrains from these activities herself. Even if she expects all others to misbehave, it still is in her personal best interest to be well-behaved. This leaves her with a payoff of 6.5, which is still more than 5. Consequently, and again, the individually rational decision does not depend on the behavior of other scientists.11 Introducing a nonlinearity into individual scientists’ payoff functions has a further noteworthy implication. The linear baseline model (see Table 1) leads to a complete conflict between individual and social rationality. Individually, each scientist is best off if she completely disregards the rules of scientific conduct. Socially, the community of scientists would wish all individual scientists to completely obey those rules. Consequently, in the baseline model there is no conflict between the social interest of the scientific community and that of society at large. Society at large wants no scientists to break any rule of conduct—and that is when the community of scientists has the highest joint utility (of 80 units in the example). The scientific community cherishes trust in science as much as society at large. If, however, society weighs disdain by the gravity of the offense, this drives a wedge between the collective interest of the scientific community and the interests of those who finance and support science. Although society wants the rules of scientific conduct to be faithfully obeyed, the scientific community in this scenario has incentives to agree to violations as long as the collective damage for all scientists is smaller than the individual gain by the single sloppy scientist. Note, however, that the larger the relevant community, the more reticent the scientific community is to welcome even small violations of the rules of scientific conduct.

If that community had 100 members, the desirable degree of scientific disintegrity would already be down to 1/100.
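Under the quadratic damage term of Table 3, a selfish scientist's net private return from her own violation of degree d is d − d²/2, so the individually tolerable degree of misconduct can be found by direct inspection. The sketch below is a minimal illustration of that calculation, not the article's formal appendix model; the names are mine.

```python
# Sketch of the "gravity of the offense" variant (Table 3): a violation of
# degree d costs every member of the community d**2 / 2, including the violator,
# so her net private return from her own misconduct is d - d**2 / 2.
def net_private_return(degree: float) -> float:
    return degree - degree**2 / 2

for d in range(5):
    print(d, net_private_return(d))
# 1 -> +0.5 (profitable), 2 -> 0.0 (break even), 3 -> -1.5 (self-damaging):
# only small violations are individually rational in this version of the model.
```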

Society cares about the erosion of scientific norms

In the situation represented by Table 3, society reacts to scandal. Trust in science suffers when individual instances of blatant violations of scientific integrity are reported. What if, instead, society is most concerned about the erosion of scientific norms? This attitude of society is captured by Table 4. In this setting, it is not the gravity of the individual offense that is critical. As in the baseline version of the model, five small violations with a weight of −1 are as important as one big violation with a weight of −5. However, society reacts disproportionately to the overall level of misconduct in the scientific community. Consequently, the individually profitable choice depends on the choices of other scientists. The most important implication is directly visible from comparing Table 3 with Table 4. Society no longer evaluates the honesty of individual scientists but the collective honesty of the scientific community. If there is at most one scientist misbehaving, the same prediction is made by the two models. However, if more than one scientist engages in questionable practices, the scientific community suffers disproportionately more. In the example, the utility that each individual scientist derives from her profession turns negative if more than two scientists misbehave.12 From that moment on, science would no longer be a sustainable endeavor. Even scientists who would be willing to obey the rules of scientific conduct themselves would have to leave science.

Let us again explain the point in the example. For expositional reasons, I continue to give the relevant scientific community a size of four members,13 and I make Tables 3 and 4 directly comparable by assuming that a scientist who misbehaves chooses misconduct of the third degree. If a scientist is sure that the other three members of the community will faithfully obey the rules of conduct, she is in the same situation as in Table 3. She gains 3 units from misbehaving. However, the whole scientific community now misbehaves by the same amount. This total amount of misbehavior in the community is taken to the power of two (this is the critical quadratic term) and is weighted by the factor ½ (which makes sure that the baseline is indeed a dilemma). Hence, all members of the scientific community lose (1/2) × 3² = 4.5. Yet, Tables 3 and 4 differ strongly as soon as more than one scientist misbehaves. In the example, if a scientist expects a single other scientist to violate the most minuscule rule of scientific conduct (to misbehave at least in the first degree), she is individually better off if she perfectly obeys all scientific standards. If she obeys, she earns 20 − (1/2) × 1² = 19.5, whereas she only earns 20 + 1 − (1/2) × 2² = 19 if she misbehaves in the first degree herself. The additional gain from misbehaving herself (1) is smaller than the additional loss (3/2). If society cares about the erosion of scientific standards, even selfish scientists deter each other.
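The mutual deterrence described in the last sentence can be seen by writing the Table 4 utility explicitly: each scientist's loss is half the square of the total misconduct in the community. The sketch below is a minimal illustration with the example's numbers, not the general appendix model; the names are mine.

```python
# Sketch of the "erosion of norms" variant (Table 4): everyone loses
# (total misconduct in the community)**2 / 2, regardless of who misbehaved.
def utility(own_misconduct: float, others_misconduct: float, endowment: float = 20) -> float:
    total = own_misconduct + others_misconduct
    return endowment + own_misconduct - total**2 / 2

# One peer misbehaves in the first degree; does joining in pay?
print(utility(0, 1))  # 19.5 if I stay honest
print(utility(1, 1))  # 19.0 if I also misbehave in the first degree
# The extra gain of 1 is smaller than the extra loss of 1.5, so the peer's
# misconduct deters me: selfish scientists hold each other in check.
```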

The Prospect of a Scientific Career

Thus far, another simplifying assumption is made by the model. It is a model of a one-shot game. It is as if the typical scientist wanted to publish a single article and then leave academia. For doctoral students who choose to leave academia, this sometimes happens. However, most articles are written by individuals who want to pursue a scientific career for a much longer period of time, most of them for life. For this section, I revert to the linear definition of the payoff function.14 I thus assume that, at each and every point in time, payoffs are as in the baseline game of Table 1. Dishonest scientists suffer from their own dishonesty. However, they do not suffer enough, making selfish scientists maximally dishonest. Intuitively, one would think that the shadow of the future is helpful. Yet, this is a model about rational actors. By assumption they only care about individual payoffs, and they each assume that all other members of the scientific community will do the same. Under these additional assumptions (of common knowledge of rationality, as it is usually put), the prediction from the original one-shot game is also the prediction for the repeated game. To the extent that there is a dilemma in the one-shot game, it persists in the repeated game (Rosenthal, 1981; Selten, 1978). To understand this surprising result, consider a repeated game with a known duration of two periods.

Of course, in the second period, it no longer makes sense to invest in trust. In this period all players will behave selfishly. However, each player perfectly anticipates this and knows that investment in trust does not pay in the first period either. If the game has more periods, the reasoning does not change. Each player knows that all other players will be selfish in the ultimate period. Each player further knows that all other players will anticipate this in the penultimate period and will be selfish in that period as well. Following this reasoning, the prospect of mutually beneficial integrity unravels, and all players are selfish right from the beginning. Yet, there is a critical assumption in this reasoning. In this example, everybody knows when the repeated game ends. Strictly speaking, this would presuppose that all science stops on one predefined day. This is clearly not a plausible assumption. Retirement age is not completely fixed. Many scientists even want to publish after retirement. Others get offers from outside academia and quit their careers earlier in life. Furthermore, others have to give up because they cannot find a decent job. A much more plausible assumption is therefore a repeated game in which it is uncertain for each participant at which point she will leave the game. Happily, this uncertainty turns out to be welcome news. Game theorists refer to the relevant finding as the “folk theorem” (Aumann & Shapley, 1994). If nobody knows with certainty at which point her interest in the game will come to an end, and if this fact is publicly known, any level of cooperation can be sustained as an equilibrium. Let me stress why this is such an important finding. It allows the scientific community to establish any degree of scientific integrity it deems desirable, including perfect integrity. Once the scientific community has settled on one such level, it is individually rational for all scientists to abide by this expectation. The chosen level of integrity is an equilibrium. This means that it is in the best interest of every scientist to play by these rules. How can that be? Given the payoff function, in the short run any scientist would of course do better by cheating. Her own articles would stand a better chance to be published. She might climb up the career ladder faster than her peers. Yet, in the long run there would be negative consequences for dishonest behavior. There are several ways that a society determined to sustain cooperation can discipline free riders. The most straightforward approach is everybody adopting a grim strategy. All scientists play by the standards that their community has agreed on. As soon as a single scientist cheats a single time, all scientists start cheating maximally and continue to do so forever. On first reading, this strategy may seem plain stupid. Why should one allow a single bad apple to spoil the entire barrel? Why would all others want to hurt themselves so severely?

Yet, all scientists shifting to scientific disintegrity is an action off the equilibrium path. Given that all scientists are perfectly rational, and all know that all others are, all they need is a credible threat. If all are committed to this “grim trigger” strategy, they know for sure that none of them will ever give in to temptation. The threat never needs to be executed. Yet, unfortunately, in a larger group things may be more complicated. It may not be necessary for disciplining single free riders that all honorable scientists start violating the rules of scientific conduct as soon as a single violation has been observed. However, if not everyone violates the rules, the hostile reaction to norm violations becomes itself a (second-order) public good (Heckathorn, 1989; Yamagishi, 1986). Honorable scientists may be tempted to let others do the dirty work while they reap the benefits from upholding high standards among themselves. This choice of actions essentially splits the community into a subgroup adhering to high standards and a subgroup sinking to bad manners. As with all public goods, if this reaction is anticipated, no honorable scientist wants to end up in the bad subgroup. Everybody anticipates this result. The threat of punishing free riders is no longer credible. The shadow of the future no longer helps sustain scientific integrity. This line of reasoning also leads to the solution. Individuals must not only be punished for contributing to the original public bad but must also be punished if they do not contribute to the second-order public good—that is, if they do not participate in punishing free riders (Aumann & Shapley, 1994; Rubinstein, 1979).15
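The folk-theorem logic can be illustrated with the baseline numbers by comparing staying honest forever with a one-time deviation followed by the grim punishment phase. The continuation probability delta below is an assumed illustrative parameter standing in for the uncertain career horizon; it is not a quantity from the article.

```python
# Sketch of the grim-trigger comparison in an indefinitely repeated baseline game.
# delta: probability of staying in academia for another period (illustrative assumption).
COOPERATE = 20   # per-period payoff when everybody respects the rules (Table 1)
DEVIATE = 30     # one-period payoff from being the only violator
PUNISH = 0       # per-period payoff once everybody defects forever

def value_if_honest(delta: float) -> float:
    return COOPERATE / (1 - delta)

def value_if_deviating(delta: float) -> float:
    return DEVIATE + delta * PUNISH / (1 - delta)

for delta in (0.2, 0.5, 0.9):
    print(delta, value_if_honest(delta) >= value_if_deviating(delta))
# False, True, True: once the expected career is long enough, the credible
# threat that cooperation collapses after a single violation sustains integrity.
```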

Targeted Sanctions

In the previous section, I introduced the notion of punishment. However, it was punishment in a very crude sense. If a single member of the community disregards the community’s standards, then every honorable member of the scientific community starts violating scientific integrity in reaction. From a governance perspective, this strategy for disciplining free riders has an obvious drawback: Those who want to maintain scientific integrity have to start misbehaving themselves. This limitation is, of course, due to the fact that the action set is so limited in the original game. In a society of two this limitation is not very important. However, in larger societies, it has a severe drawback. In the interest of punishing free riders, all honorable scientists are punished as well. As long as targeted sanctions are not possible, this is the only way out. The actual scientific community is not constrained that severely. If there are rumors that one scientist has violated the standards of integrity, other scientists investigate the matter. If they find evidence, they make it public (see, e.g., Simonsohn, 2012).

It becomes difficult for such scientists to publish their articles in good journals. If the offenses are more severe, they may even lose their jobs. In principle, such targeted sanctions are extremely helpful: If they are credible and severe enough, selfish scientists are perfectly deterred. Yet, going after one’s colleagues is not what every scientist likes to do. Because culprits have a lot to lose, they will likely try to strike back. Targeted sanctions make it unnecessary to attack other honorable scientists. However, the second-order public good problem still persists (see again Heckathorn, 1989; Yamagishi, 1986). For the threat to be credible, the scientific community also has to punish those who do not punish others who have violated the rules of scientific conduct. This unpleasant strategy explains why there is a strong case to be made for centralizing vigilance and enforcement of the rules of scientific conduct. Yet, from a governance perspective, this choice is not trivial. Giving some centralized agency power to enforce the standards of scientific conduct has a number of potential drawbacks. Such an agency inevitably has considerable power. All power can be abused—for instance, to curb developments of the discipline that those holding office find unattractive. Because there is a risk of abuse, sooner rather than later rudimentary forms of the rule of law will be adopted. This inevitably increases the number of false negatives—that is, of scientists who have actually violated the rules of conduct but can play the game such that they get away with it. Furthermore, the vigilance of any agency only goes so far. Disrespectful scientists stand a chance of going unsanctioned just because the agency never learns about the violation. For all those reasons, central enforcement is likely to deter less than credible decentralized enforcement. In principle, the agency could react by making sanctions more severe (see the seminal article by Becker, 1968). However, Draconian sanctions for small or midsized violations of the rules of scientific conduct may not seem appealing to many scientists, especially if one takes into account the possibility that sanctions are erroneously inflicted on innocent scientists.
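The trade-off between detection probability and severity sketched above is the classic deterrence condition associated with Becker (1968): a selfish scientist is deterred only if the expected sanction at least offsets the private gain. The numbers below are illustrative assumptions, not the article's calibration.

```python
# Minimal deterrence sketch in the spirit of Becker (1968): misconduct is deterred
# if the detection probability times the sanction is at least the private gain.
# All numbers are illustrative assumptions.
def deterred(gain: float, p_detect: float, sanction: float) -> bool:
    return p_detect * sanction >= gain

print(deterred(gain=20, p_detect=0.8, sanction=30))   # True: vigilant enforcement, mild sanction
print(deterred(gain=20, p_detect=0.1, sanction=30))   # False: an agency that rarely detects violations
print(deterred(gain=20, p_detect=0.1, sanction=250))  # True: ...unless it threatens Draconian sanctions
```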

Internalized Norms

Many (most?) scientists will object to the previous analysis: “But I would never do that! I do not sit down and calculate the benefit and the risk inherent in faking my data. It would not ever occur to me as a possibility.” With minor offenses, such automatic reticence may be less prevalent. However, it is certainly not unheard of.

Violating one’s own standards

There is a straightforward way of introducing internalized norms into the individual’s decision problem.


Table 5.  Internalized Norm, Without Let-Down Aversion

                                             Number of scientists who respect the rules of scientific conduct
Condition                                        4       3       2       1       0
Your utility
  You respect                                   20      10       0     −10       —
  You do not respect                             —      10       0     −10     −20
Total utility of the scientific community       80      40       0     −40     −80

Note: In Table 5, it is assumed that a scientist loses as much from scientific disintegrity in terms of morality cost as she gains in terms of additional profit (whereas the utility function is otherwise the same as in Table 1). That is, she gets 20 personal gain, loses that 20 as personal morality cost, and also loses 10 because of the damage to science.

The benefit from breaking a rule of scientific conduct is reduced by the disutility from violating one’s own internalized standards. If this disutility is strong enough, the scientist will obey those rules. For a numerical example, see Table 5. For her disintegrity, the scientist gets 20 personal gain, loses that same 20 as personal cost for violating her own moral standards, and also loses 10 because of the damage to trust in science. This intrinsic motive is particularly helpful if the extrinsic threat of punishment is incomplete, in particular because the risk of detection is small. Taken together, the extrinsic and the intrinsic motive may suffice to hold back a scientist who would otherwise be tempted to overstep a boundary (Engel, 2014).
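In payoff terms, a fully internalized norm means that the private gain from a violation is exactly offset by the morality cost, leaving only the self-inflicted damage to trust. The sketch below reproduces the payoffs behind Table 5; it is an illustration under those assumptions, and all names are mine.

```python
# Sketch of the internalized-norm variant (Table 5): the 20-unit gain from a
# violation is fully offset by a 20-unit morality cost, so only the damage term remains.
ENDOWMENT, GAIN, MORALITY_COST, DAMAGE = 20, 20, 20, 10

def utility(i_violate: bool, other_violators: int) -> float:
    violators = other_violators + (1 if i_violate else 0)
    own_net_gain = (GAIN - MORALITY_COST) if i_violate else 0
    return ENDOWMENT + own_net_gain - DAMAGE * violators

for others in range(4):
    print(others, utility(False, others), utility(True, others))
# Holding the others' behavior fixed, violating is always 10 units worse:
# the offsetting morality cost leaves only the damage the violator inflicts on herself.
```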

Heterogeneity

In all of the previous analyses, I have treated all scientists alike. I have implicitly assumed that all of them are tempted to disregard the rules of scientific integrity if only the benefit is large enough and if it is not outweighed by the cost. Especially when introducing internalized norms, this assumption is strong. Looking around, it seems obvious that not all scientists are equally attracted to all forms of scientific misconduct. Clearly, internalized norms vary. If I revisit the other specifications of the payoff or utility function, they essentially all allow for heterogeneity. The disutility from generalized trust in science eroding need not be the same for everybody. For scientists close to the end of their careers, success in academia may become less relevant, or, to the contrary, they may be afraid that their entire professional life is tainted if society respects scientists less. In some disciplines, opportunities for making a lot of money outside academia may be much more pronounced, making more scientists tempted to gamble with violations of standards. Sensitivity to punishment is also likely to differ. Last, but not least, the benefit from committing scientific disintegrity quite likely differs among scientists.

Let-down aversion

In the world of the model, heterogeneity affects results very little. If, for instance, there is heterogeneity with respect to disutility from breaking internalized norms, only those scientists who are sufficiently sensitive to norm violations respect scientific integrity, even if there is no external enforcement. However, all others behave as before. Intuitively, this formally correct result seems surprising. Should there not be a qualitative difference between a homogeneous and a heterogeneous scientific community? One way of capturing this intuition is by making the individual scientist’s utility depend on other scientists’ payoffs. This is done in one of the most widely cited models in behavioral economics (Fehr & Schmidt, 1999). Again, for simplicity, I only discuss the effects for a linear model, as in the baseline. In Fehr and Schmidt’s (1999) model, honorable scientists dislike being the victim.16 They suffer if they see others disregard the rules of scientific conduct and get away with it. To see why this matters, compare Table 5 with Table 6. In Table 5, a socially very desirable situation is defined. Payoffs are as in the baseline (see Table 1), but the scientist in question has completely internalized the norms of scientific conduct. Any additional payoff from scientific misconduct is completely neutralized by remorse. Whatever the remaining scientists do, this scientist does not gain from misbehaving. Consequently, in utility terms, this scientist is always better off keeping the standards of scientific conduct. In Table 6, it is still assumed that this scientist exhibits strong guilt aversion—the same way as in Table 5. Yet, additionally, this scientist is also averse to being let down by her peers. If all her peers break the norms of scientific conduct and therefore receive a higher payoff than the faithful scientist, she loses 1 unit of utility for each unit by which their payoffs differ. This lowers her utility from −10 to −30 units.


Table 6.  Internalized Norm, With Let-Down Aversion

                                             Number of scientists who respect the rules of scientific conduct
Condition                                        4       3       2       1       0
Your utility
  You respect                                   20     3.3   −13.3     −30       —
  You do not respect                             —      10       0     −10     −20
Total utility of the scientific community       80      20   −26.6     −60     −80

Note: In Table 6, the same assumption as in Table 5 is made regarding morality cost (that a scientist loses as much from scientific disintegrity in terms of morality cost as she gains in terms of additional profit). However, it is additionally assumed that a scientist dislikes when her efforts to maintain a high standard of conduct are exploited by other, less scrupulous scientists. Specifically, it is assumed in the model that an honorable scientist loses 1/3 × the foreign intensity of misconduct for every other member of the community who disregards the standard of conduct. Take a scientist who faithfully respects all rules of scientific conduct herself, the same way as two other members of the scientific community, whereas the final member of the community is perfectly selfish. Then the utility of the first scientist is given by 20 − (1/2) × 20 − (1/3) × 20 = 3.33. The second term captures the disutility from less trust in science, and the final term is the disutility from being let down.

If two of them get a higher payoff, the faithful scientist loses the payoff difference multiplied by the fraction of the remaining scientists who have been cheating (i.e., 2/3). Hence, her utility is reduced by (2/3) × 20 = 13.33. Likewise, if just one of them misbehaves, the payoff difference of 20 is multiplied by 1/3, which reduces her utility to 20 − (1/2) × 20 − (1/3) × 20 = 3.33. The former loss results from the effect on trust in science. The latter loss results from the inequity. In this version of the model, the composition of the scientific community matters. If there is no more than one cheater, the internalized norm is still strong enough (3.3 is still more than 0). However, with two or more cheaters, let-down aversion trumps guilt aversion (−13.3 is less than −10, and −30 is less than −20). If the faithful individual knows, learns, or expects that not many other scientists will be faithful as well, the dilemma is back. Despite the fact that a fraction of the scientists has internalized the norms of scientific conduct, they are not obeyed. Heterogeneity exacerbates the governance problem.
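The let-down term can be written out in the style of Fehr and Schmidt (1999): a faithful scientist loses 1/(N − 1) of each cheating peer's 20-unit advantage, on top of the damage to trust. The sketch below reproduces the "You respect" row of Table 6; it is illustrative, and the names are mine.

```python
# Sketch of the let-down-aversion variant (Table 6), in the spirit of
# Fehr & Schmidt (1999): a faithful scientist with a fully internalized norm
# additionally loses 1/(N - 1) of every cheating peer's 20-unit advantage.
N, ENDOWMENT, ADVANTAGE, DAMAGE = 4, 20, 20, 10

def faithful_utility(cheaters: int) -> float:
    trust_term = ENDOWMENT - DAMAGE * cheaters        # erosion of trust in science
    letdown_term = cheaters * ADVANTAGE / (N - 1)     # aversion to being the sucker
    return trust_term - letdown_term

for cheaters in range(4):
    print(cheaters, round(faithful_utility(cheaters), 2))
# 0 -> 20, 1 -> 3.33, 2 -> -13.33, 3 -> -30.0: the "You respect" row of Table 6.
```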

Which Way to Go?

Scientific disintegrity certainly does not have a single cause. It is quite likely that many scientists are genuinely honorable and have just not been made aware of the pitfalls inherent in some hitherto widespread practice. The fact that it is generally believed that outright scientific fraud is rare suggests that the picture of ruthless scientists who disregard standards of scientific practice whenever they can get away with it is too negative. Yet, it would also be naïve to think that scientific disintegrity has nothing to do with incentives.

The gloomy world of selfish scientists

In the spirit of rational choice theory, I try to explain scientific disintegrity from an individualistic perspective in this article. The main building block of the analysis is a payoff function in which every scientist is endowed by the scientific community with a portion of generalized trust in science. The scientific community can never completely prevent individual scientists from abusing this trust. In this article, I assume that, from a selfish perspective, such abuse makes sense. The individual scientist stands a greater chance to advance her career, to build herself a reputation, or to boost her self-esteem. Yet, such action comes at a cost. It can never be ruled out that violations of good scientific practice become known. In that case, this individual scientist does not suffer alone. Society becomes more skeptical about the reliability of scientific findings. By abusing trust, the scientist erodes trust. This is why it is meaningful to model scientific disintegrity as a contribution to a public bad. Although this general interpretation may seem convincing, the prediction derived from the baseline model is too radical. In this model, it would be expected that all scientists would maximally cheat and that trust in science would be completely nonexistent. For society, the only way to go would be deterrence. Individual scientists would need to have so much to lose from disregarding the rules of scientific conduct that scientific disintegrity becomes irrational. This is obviously not a fair description of science in practice. The main purpose of the article is therefore to discuss qualifications and extensions of the model that lead to more plausible predictions. These qualifications should be investigated empirically in subsequent studies and should be used to design more helpful interventions.

When scientific disintegrity hurts oneself

The need for intervention is considerably reduced if trust in science reacts disproportionally to more severe, or to more widespread, violations. Technically, this requires introducing a nonlinearity into the model. If big scandals count disproportionally more than an individual scientist frequently violating minor rules of scientific conduct, the situation improves. Gross violations stop because the (expected) damage to the scientist herself outweighs the individual benefit. If the erosion of trust grows disproportionally with the total intensity of disintegrity in the entire scientific community, the situation looks even brighter. Then, to a degree, scientists deter each other. Within the framework of the model, the reaction of society is exogenous. In practice, individual scientists, and even more so scientific organizations, might become proactive. They might alert the general public to problems of scientific integrity. The fact that scandals have been widely covered in the general press may also have shifted the “game of scientific integrity” more in this direction. The more this is the case, the more even strategically selfish scientists will desist from misconduct—provided heterogeneity is not such that, individually, they still benefit more than they are put at risk. A second strategy for creating more plausible predictions is adopting a career perspective. This modeling strategy can help if one further assumes that most scientists do not know in advance at which exact moment they will leave science. In such a supergame, integrity can be upheld. The scientific community would be able to maintain a situation with no or little disintegrity. Yet, this requires that all scientists adopt a viable strategy for sustaining integrity, such as grim trigger: Every scientist credibly threatens any perpetrator that they will all start breaking the rules of scientific conduct as soon as a single violation has been observed. If the scientific community wanted to push this strategy, it would have to invest in an intuitively unappealing second-order norm: Those who look the other way if they spot signals of misconduct are punished themselves for not becoming active. For many scientists, this may unpleasantly smack of denunciation. In the interest of preserving the trust of society in science, this strategy might endanger trust within science.

Vigilance is needed

Explicit sanctions targeted only at perpetrators are a much more viable solution, especially if vigilance and sanctioning are entrusted to some central authority, which could of course be an authority created and run by scientists themselves. However, this solution inevitably leads to more centralization of power in science. Many scientists will be concerned that such a body might abuse this power—for instance, to stall heterodox paradigms and methods. The internalization of scientific norms is attractive because it avoids this drawback. The current movement in psychology that alerts scientists to potential problems with the reliability of their findings—and that calls for very practical, easily implementable solutions—is therefore a good move also from a governance perspective. Yet, there are two concerns with internalization. First, if all the scientific community calls for is internalization, it is at the mercy of individual scientists, who are very likely heterogeneous. Not all will readily listen to the call of scientific duty. This problem becomes more severe for the remaining scientists the more sensitive society is to the overall erosion of scientific standards. Then, a few ruthless scientists may suffice to destroy trust in science despite the fact that most scientists faithfully respect the rules of scientific conduct. Moreover, scientists are likely to react to the unfairness of some colleagues getting away with misconduct and advancing their careers faster. If such events are no longer exceptional, even honorable scientists might stop following their internalized norms of scientific conduct just because they do not want to be the sucker. This explains why targeted explicit sanctions may eventually be hard to avoid. Yet, it may suffice to direct them at the small number of scientists who knowingly take the risk of severely violating the rules of scientific conduct.

Scientific integrity as a governance problem

The bottom line of the analysis will by no means surprise those who have studied the provision of other public goods. The resulting mix of institutional interventions with elements of self-governance has proven effective for many other public goods. It is supported by a rich experimental literature (Chaudhuri, 2011; Fehr & Gächter, 2000; Ledyard, 1995; Masclet, Noussair, Tucker, & Villeval, 2003; Page, Putterman, & Unel, 2005; Potters, Sefton, & Vesterlund, 2005; van Dijk, Sonnemans, & van Winden, 2002; Zelmer, 2003) and is corroborated in the field (Andersen, Bulte, Gneezy, & List, 2008; Anderson, Mellor, & Milyo, 2004; Ostrom et al., 2002). In this article, I do not claim to be developing new solutions. Rather, I aim to convince scientists that their seemingly idiosyncratic problem actually shares the main features of a governance problem that is well understood and for which viable institutional interventions have been found. Scientists can capitalize on this rich body of analysis and institutional design. Solving dilemma problems is not easy, but—as many other dilemmas show—there is hope.

Appendix

Eventually, the message is of course more important than technicalities. In the interest of making this message easier to understand, in the main article I work with numerical examples. In such an example, the normative solution can easily be calculated. The price I have to pay for accessibility is a lack of generality. In the main article, I only show that results hold for the chosen parameters. Skeptical readers need not believe these parameters to be adequate. For such readers, the Appendix also shows in which ways results hinge on the parameters. Because I expect that most readers are psychologists, and that most psychologists do not regularly work with game theoretic models, I do not just report the general solutions but also explain how they are derived. The structure of the Appendix directly mirrors the structure of the main article.

Baseline dilemma

The incentive structure of a pure public good is captured by everybody having the following payoff function:



$$\pi_i = e - c_i + \mu \sum_{k=1}^{N} c_k. \tag{A1}$$

Payoff πi of agent i consists of three components. The first component is an exogenous endowment e. The agent is free to make a contribution ci to a public project, the same way as any agent k of a community of size N (where k includes agent i). Everybody's contributions benefit every member of the community. The sum of contributions is multiplied by a factor µ. For this to be a dilemma, and hence a public good, two conditions must be fulfilled. First, µ < 1 ensures that individuals do not want to invest: Any dollar they keep earns them $1.00, whereas a dollar they invest into the common good only earns them $µ, which is a smaller profit. The second condition is Nµ > 1. If every member of the community invests $1.00, every member gains $Nµ, and the entire society even gains $N²µ. Everybody investing maximally is efficient, that is, in everybody's well-understood interest. Usually one further imposes 0 ≤ ci ≤ e: Individuals have no savings and cannot borrow. In principle, one could model the overall level of scientific integrity as a public good of the scientific community. Yet, because it makes interpretation more intuitive, I introduce one change into the model of Equation A1.

$$\pi_i = e - c_i + \alpha c_i - \mu \sum_{k=1}^{N} c_k. \tag{A2}$$



This payoff function constitutes a pure public bad, rather than a public good, provided α > 1 + µ and α − Nµ < 1. Let us first understand the incentives. The cost of investment into my private project is 1 (every unit I invest is subtracted from my endowment), but given α > 1, the benefit outweighs the cost. If investment only had this private cost, I would want to invest maximally. Yet, through the final term, investment also has a social cost, which I take into account when deciding on investment. However, if the condition α > 1 + µ is fulfilled, the sum of the private cost (i.e., 1) and my own share of the social cost (µ) is smaller than the private benefit from investment. I invest maximally; if I further impose ci ≤ e, I invest my entire endowment. Every other member of the community does the same, so that everybody ends up with α − Nµ per unit of endowment. Consequently, the game constitutes a dilemma provided that α − Nµ < 1. On that assumption, every member of the community would be best off if nobody invested anything into their private projects. This payoff function is so simple that one can derive these conditions by logical reasoning alone. Yet, to come closer to an appropriate understanding of scientific disintegrity, one will have to increase complexity step by step. With more complex utility functions, it is much easier to apply a simple mathematical procedure to find the normative solution. I use the very transparent baseline model to introduce this procedure. Payoff functions are assumed to be exogenous: Individuals must take them as given. However, they have a choice variable, which will always be their contribution ci. Each individual's payoff depends on the contribution choices of the remaining members of the community. Through the definition of the payoff function, one assumes that individuals maximize profit. Such individuals choose their best response to the choices of the remaining members of the community. Mathematically, they find this best response by taking the first derivative of the payoff function with respect to their decision variable. In the case of Equation A2, individuals thus calculate the following:



$$\frac{\partial \pi_i}{\partial c_i} = \alpha - 1 - \mu. \tag{A3}$$

For their choice to be optimal, this first derivative must be zero.A1 Now the decision variable ci does not feature in the first derivative. One therefore cannot solve for the decision variable to find the optimal choice. This property of the decision problem follows from the fact that the objective function (Equation A2) is linear in the decision variable. This does not, however, imply that the problem has no solution. Rather, the solution is defined by the opportunity structure. If the first derivative is positive, the agent chooses the maximum quantity that she is allowed to choose. This is where the additional condition ci ≤ e comes into play. Provided Equation A3 is positive, one predicts that agents will invest their entire endowments. Economic modelers usually refer to this as a corner solution. In equilibrium, best responses must match. In this game, this additional condition is straightforwardly met. Each individual has a dominant strategy: Irrespective of what other members of the community do, she is best off investing her entire endowment. Mathematically, this directly follows from Equation A3: The first-order condition depends neither on my own contribution ci nor on the contribution cj of any other member of the population. As expected, the public bad is a social dilemma. Everybody completely disregards the effects on others and just focuses on their individual projects. Using the same mathematical procedure, one can also find the socially optimal choice; in the world of rational choice models, a social problem is defined by the gap between individually and socially optimal choices. A hypothetical social planner would not maximize Equation A2 but would want to simultaneously maximize everybody's payoff. If one assumes that all members of the community are equal, the social planner's problem can be written as in Equation A4. The social planner chooses a contribution level ci for everybody. The critical difference between the individual problem (Equation A2) and the social problem (Equation A4) is of course in the final term. The social planner "internalizes" the externality of each individual's contribution on everybody else's payoff. Another, more formal, way of characterizing the difference is as follows: For the social planner, there is no difference between i and j.

$$\Pi = \sum_{k=1}^{N} \pi_k = Ne - Nc_i + N\alpha c_i - N^2 \mu c_i. \tag{A4}$$

Again taking the first derivative with respect to the decision variable ci, the social planner finds the socially optimal contribution level:



$$\frac{\partial \Pi}{\partial c_i} = -N + N\alpha - N^2 \mu = N(\alpha - 1 - N\mu). \tag{A5}$$

Because for the optimal choice the first derivative must be zero, one can divide by the size of the community N: The social planner seeks the optimal choice for a representative individual. The resulting formulation of Equation A5 directly compares with Equation A3. The problem is still linear in the decision variable, so that one gets a corner solution. However, given the conditions for the problem being a public bad, the social planner would impose that nobody contributes anything to their private projects. There would not be any social damage. Of course, Equations A3 and A5 are precisely the conditions that define the public bad.
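To see these corner solutions with concrete numbers, here is a minimal sketch in Python. The parameter values are purely illustrative (they are mine, not the article's) and are chosen only to satisfy the two conditions that make Equation A2 a public bad.

```python
# Minimal sketch of the baseline public bad (Equation A2).
# Parameter values are illustrative only; they are chosen to satisfy
# alpha > 1 + mu (cheating pays individually) and alpha - N*mu < 1
# (universal cheating leaves everyone worse off than universal integrity).

N = 4        # size of the scientific community
e = 1.0      # endowment of generalized trust
alpha = 1.5  # private gain per unit of disintegrity
mu = 0.2     # erosion of trust per unit of anybody's disintegrity

assert alpha > 1 + mu          # individual incentive to defect
assert alpha - N * mu < 1      # ...yet collectively self-defeating

def payoff(c_i, others):
    """Equation A2: e - c_i + alpha*c_i - mu * (sum of all contributions)."""
    return e - c_i + alpha * c_i - mu * (c_i + sum(others))

# Dominant strategy: whatever the others do, contributing e beats contributing 0.
for others in ([0.0] * (N - 1), [e] * (N - 1)):
    assert payoff(e, others) > payoff(0.0, others)

# Social comparison: everybody defecting vs. everybody abstaining.
all_defect = payoff(e, [e] * (N - 1))      # = e * (alpha - N*mu) < e
all_honest = payoff(0.0, [0.0] * (N - 1))  # = e
print(f"payoff if all defect:  {all_defect:.2f}")
print(f"payoff if all abstain: {all_honest:.2f}")
```

With these numbers, contributing the full endowment to one's private project is a dominant strategy, yet universal defection leaves every scientist with 0.7 instead of the endowment of 1.0 that universal integrity would preserve.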

Negligence

Within the framework of the model, negligence is easy to capture.

$$\pi_i = e - c_i + \alpha c_i - p\mu \sum_{k=1}^{N} c_k. \tag{A6}$$





All one has to introduce into the model is a probability p < 1. Social damage is no longer automatic but comes as a risk. The first derivative changes to −1 + α − pµ. Scientific disintegrity becomes even more attractive for the individual scientist.
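As a quick numerical check (the same illustrative α and µ as in the sketch above, plus an assumed probability p that the damage actually materializes), the marginal gain from misbehaving only grows once the social damage is merely probabilistic:

```python
# Marginal incentive to misbehave under Equation A2 vs. Equation A6.
# alpha and mu as in the baseline sketch; p is an assumed probability
# that the erosion of trust actually materializes.
alpha, mu, p = 1.5, 0.2, 0.5

certain_damage = alpha - 1 - mu        # first derivative under Equation A2
risky_damage = alpha - 1 - p * mu      # first derivative under Equation A6

print(round(certain_damage, 2), round(risky_damage, 2))  # 0.3 < 0.4
assert risky_damage > certain_damage   # negligence is even more tempting
```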

Gravity of the offense

There are several possibilities for introducing a nonlinearity into the payoff function. In Equation A7, each individual scientist's disintegrity matters disproportionately more the more this individual lacks scientific integrity. One researcher who fabricates or manipulates data, such as Diederik Stapel, is not the same as 100 psychologists collecting more observations until the result is significant.

$$\pi_i = e - c_i + \alpha c_i - \mu \sum_{k=1}^{N} c_k^2. \tag{A7}$$

Because the final term encompasses the decision variable ci and is quadratic, if one takes the partial derivative with respect to the decision variable, the first-order condition still contains ci.A2 One has an interior solution, as in Equation A8.A3 There is an individually optimal degree of contributing to the public bad that is not defined by the boundary condition 0 ≤ ci ≤ e. In the same way as in Equation A4, one finds the "efficient" degree of scientific misconduct ci**, that is, the level of misconduct that maximizes the joint payoff of the community of scientists. Unlike with the linear definition of the public bad from Equation A3, the "socially optimal" degree of misconduct is strictly larger than 0. This is why, with a nonlinear payoff function, the best interest of the community of scientists is no longer the same as the best interest of society at large. Yet, one should note the way that ci** depends on the size of the scientific community N. Because this size is in the denominator, the larger the scientific community, the smaller the gap between the interests of society at large and those of the scientific community.

$$c_i^* = \frac{\alpha - 1}{2\mu}, \qquad c_i^{**} = \frac{\alpha - 1}{2\mu N}. \tag{A8}$$

The function in Equation A9 specifies sensitivity to the gravity of the offense differently. If the scientific community in question takes scientific standards lightly, trust in science is affected disproportionately more. An occasional mistake by this or that scientist has little effect. One big mistake by a single scientist carries a lot of weight, as does pervasive sloppiness, even if individual mistakes are not grave. In this model, the critical quantity is the sum of deviations from best practices across all members of the scientific community.

$$\pi_i = e - c_i + \alpha c_i - \mu \left( \sum_{k=1}^{N} c_k \right)^{2}. \tag{A9}$$

Using the same procedure as before, one finds the degree of scientific misconduct that maximizes individual payoff and the degree that maximizes the joint payoff of the scientific community. Comparing Equation A10 with Equation A8, one can see that this second definition of sensitivity to the gravity of the offense is even more beneficial for society. Both terms are divided by the size of the scientific community N, which reduces the individually and the socially optimal degree of misconduct.

$$c_i^* = \frac{\alpha - 1}{2\mu N}, \qquad c_i^{**} = \frac{\alpha - 1}{2\mu N^2}. \tag{A10}$$
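The closed forms can be double-checked numerically. In the sketch below, the helper names, the grid-search approach, and the parameter values are mine and purely illustrative; α and µ are picked so that the interior optima lie strictly inside [0, e].

```python
# Numerical check of the interior solutions for the nonlinear public bads.
# Parameters are illustrative; alpha and mu are picked so that the optima
# fall strictly inside the admissible range [0, e].

N, e, alpha, mu = 4, 1.0, 1.3, 0.25
grid = [i / 1000 for i in range(1001)]  # candidate contribution levels in [0, e]

def individual_a7(c_i, c_other):
    # Equation A7: own payoff when the N-1 others all choose c_other.
    return e - c_i + alpha * c_i - mu * (c_i**2 + (N - 1) * c_other**2)

def joint_a7(c):
    # Joint payoff when everybody chooses c (the community's point of view).
    return N * (e - c + alpha * c) - mu * N * (N * c**2)

def individual_a9(c_i, c_other):
    # Equation A9: the erosion term is the squared sum of all contributions.
    return e - c_i + alpha * c_i - mu * (c_i + (N - 1) * c_other)**2

best_response_a7 = max(grid, key=lambda c: individual_a7(c, 0.0))
joint_best_a7 = max(grid, key=joint_a7)
print(best_response_a7, (alpha - 1) / (2 * mu))     # c_i*  of Equation A8: 0.6
print(joint_best_a7, (alpha - 1) / (2 * mu * N))    # c_i** of Equation A8: 0.15

# Symmetric equilibrium under Equation A9: a level that is its own best response.
def best_response_a9(c_other):
    return max(grid, key=lambda c: individual_a9(c, c_other))

equilibrium_a9 = min(grid, key=lambda c: abs(best_response_a9(c) - c))
print(equilibrium_a9, (alpha - 1) / (2 * mu * N))   # c_i* of Equation A10: 0.15
```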

The prospect of a scientific career

Formal models of a repeated game are technically considerably more demanding, which is why I refer interested readers to the canonical articles cited in the main article.
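That said, the grim-trigger logic described in the main text can at least be illustrated numerically. The following back-of-the-envelope sketch is mine, not part of the article's formal apparatus; it assumes the baseline payoffs of Equation A2, symmetric behavior, and a constant probability δ that a scientist stays in science for one more period.

```python
# Back-of-the-envelope check of the grim-trigger logic from the main text,
# under strong simplifying assumptions: baseline payoffs (Equation A2),
# symmetric behavior, and a constant probability delta of staying in science
# for one more period. All numbers are illustrative.

N, e, alpha, mu = 4, 1.0, 1.5, 0.2

def discounted(per_period, delta, first_period=None):
    # Expected stream: first period plus delta/(1-delta) times the continuation value.
    first = per_period if first_period is None else first_period
    return first + delta / (1 - delta) * per_period

def cooperation_sustainable(delta):
    # Stay honest forever vs. defect once and trigger permanent defection by all.
    honest = discounted(e, delta)
    one_shot_gain = e - e + alpha * e - mu * e          # defect while others are honest
    punished_forever = e - e + alpha * e - mu * N * e   # everybody defects afterwards
    deviate = discounted(punished_forever, delta, first_period=one_shot_gain)
    return honest >= deviate

for delta in (0.3, 0.6, 0.9):
    print(delta, cooperation_sustainable(delta))  # False, True, True with these numbers
```

With these illustrative numbers, a one-shot deviation pays only if the expected stay in science is short; from roughly δ = 0.5 upward, the threat of permanent mutual defection keeps everyone honest.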

Targeted sanctions

Targeted sanctions reduce the individual gain α from violating the rules of scientific conduct. The socially beneficial effect is to be expected even if a sanction of severity σ is only meted out with probability q < 1. This leads to the payoff function of Equation A11. A payoff-maximizing scientist now only misbehaves if α > 1 + µ + qσ. Provided the expected value of the sanction qσ is large enough, scientists are perfectly deterred.

$$\pi_i = e - c_i + (\alpha - q\sigma) c_i - \mu \sum_{k=1}^{N} c_k. \tag{A11}$$

Yet, the second-order public good discussed in the previous section is still an issue. In principle, one can rewrite Equation A11 in the form of Equation A12. In terms of contribution incentives, both formulations are equivalent provided $\sum_{j \neq i}^{N-1} s_{ji} = q\sigma$, that is, provided the punishments meted out by the N − 1 peers add up to the expected central sanction; sji is the punishment that agent i attracts from any other member of the scientific community for engaging in scientific disintegrity. The amount of punishment is assumed to be proportional to the contribution of agent i to her private project, that is, to the degree by which she oversteps the limits of what the scientific community deems acceptable. Decentralized punishment (Equation A12) can thus have the same deterrent effect as central punishment (Equation A11).

$$\pi_i = e - c_i + \alpha c_i - \mu \sum_{k=1}^{N} c_k - \sum_{j \neq i}^{N-1} s_{ij} c_j - \sum_{j \neq i}^{N-1} s_{ji} c_i. \tag{A12}$$

Yet, the equivalence presupposes that the remaining members of the scientific community are willing to actually punish those who violate the rules of scientific conduct. Realistically, punishment requires some effort; one must find out, verify, design a reaction, and actually implement it. Even more important, this is likely to be risky business. Science is not free from power relations, and a scientist who becomes known not to be the person of integrity she pretends to be risks losing a lot. This may make it attractive for her to stifle criticism by whatever means available. In Equation A12, this cost is expressed in the penultimate term. If Equation A12 captures payoffs correctly, the threat of decentralized punishment is not credible, at least not in the one-shot game. For the reasons developed in the previous section, in a repeated game with an uncertain end, this may be different. However, even if perpetrators can be exposed to targeted punishment, one needs the rather demanding mechanisms described in the main text of this article to stabilize scientific integrity. In particular, the scientific community must punish those otherwise loyal scientists who do not punish rule violators.
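The deterrence conditions behind Equations A11 and A12 are easy to check numerically. In the sketch below, the parameter values and the assumption that every peer imposes the same per-unit punishment sji are mine, purely for illustration:

```python
# Deterrence through targeted sanctions (Equations A11 and A12).
# alpha, mu as before; q is the detection probability, sigma the sanction,
# s_ji the (assumed identical) punishment each peer imposes per unit of disintegrity.

N, alpha, mu = 4, 1.5, 0.2
q, sigma = 0.5, 0.8

def misbehaves_central(q, sigma):
    # Equation A11: the marginal gain from disintegrity is alpha - 1 - mu - q*sigma.
    return alpha - 1 - mu - q * sigma > 0

def misbehaves_decentral(s_ji):
    # Equation A12: the N-1 peers together must replicate the expected sanction q*sigma.
    return alpha - 1 - mu - (N - 1) * s_ji > 0

print(misbehaves_central(q, sigma))               # False: 0.3 - 0.4 < 0, deterred
print(misbehaves_decentral(q * sigma / (N - 1)))  # False: same expected sanction, split over peers
print(misbehaves_central(q, 0.4))                 # True: 0.3 - 0.2 > 0, sanction too weak
```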

Internalized norms

There is a straightforward way of introducing internalized norms into the individual's decision problem. Comparing Equation A13 with Equation A2, the only difference is the additional term γ max{ci − ĉ, 0}. It is taken from the normativity model by Dufwenberg, Gächter, and Hennig-Schmidt (2011). I lose utility from knowing that my behavior ci deviates from the normative expectation ĉ—the more so the stronger the deviation (the utility loss grows in proportion to the distance of my behavior from the norm, with γ as the factor of proportionality). Of course, I only lose this utility if my behavior violates the norm, which is formally expressed by the max operator.A4

$$u_i = e - c_i + \alpha c_i - \gamma \max\{c_i - \hat{c}, 0\} - \mu \sum_{k=1}^{N} c_k. \tag{A13}$$



The internalized norm changes my problem. It is still possible that I engage in scientific disintegrity. However, I will only do so if my reluctance to violate norms—that is, γ—is not strong enough. Formally, the critical condition changes to α > 1 + µ + γ. This formulation of the effects of normativity invites a straightforward extension. For simplicity, I show this for the case of central enforcement. However, the same argument could also be made for decentralized enforcement.A5 As one can immediately see in Equation A14, internalized norms and external norm enforcement are strict substitutes. The more pronounced the reluctance to violate normative expectations, the smaller the need for external enforcement. The critical condition changes to α > 1 + µ + γ + qσ. As long as this condition is not fulfilled, the scientist does not engage in disintegrity. The more norms are internalized, the smaller the need for external vigilance and enforcement.

$$u_i = e - c_i + \alpha c_i - \gamma \max\{c_i - \hat{c}, 0\} - q\sigma \max\{c_i - \hat{c}, 0\} - \mu \sum_{k=1}^{N} c_k. \tag{A14}$$
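A minimal sketch of this critical condition (all values are illustrative and mine) makes the substitution between internalized norms and expected sanctions explicit:

```python
# Internalized norms and external enforcement as substitutes (Equations A13 and A14).
# A scientist misbehaves only if the private gain alpha exceeds 1 + mu + gamma + q*sigma.
# All parameter values are illustrative.

alpha, mu = 1.5, 0.2

def misbehaves(gamma, q=0.0, sigma=0.0):
    return alpha > 1 + mu + gamma + q * sigma

print(misbehaves(gamma=0.0))                    # True: nothing holds her back
print(misbehaves(gamma=0.4))                    # False: the internalized norm alone suffices
print(misbehaves(gamma=0.1))                    # True: weak norm, no enforcement
print(misbehaves(gamma=0.1, q=0.5, sigma=0.5))  # False: a mild expected sanction closes the gap
```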



Heterogeneity

Formally, heterogeneity would imply that one should work with γi rather than γ. However, if one revisits the other specifications of the payoff or utility function, they essentially all allow for heterogeneity. The disutility from the erosion of generalized trust in science need not be the same for everybody; this would change µ into µi. Sensitivity to punishment is also likely to differ, which would imply σi rather than σ. Last, but not least, the benefit from committing scientific disintegrity quite likely differs across scientists, so that one would have to work with αi rather than α. If heterogeneity only affects linear terms in the payoff or utility functions, results change very little. The respective general parameter in the critical conditions just has to be replaced with the individual-specific parameter. If, for instance, there is heterogeneity with respect to the disutility from breaking internalized norms, then the critical condition derived from Equation A13 changes to α > 1 + µ + γi. Only those scientists who are sufficiently sensitive to norm violations respect scientific integrity, even if there is no external enforcement. Heterogeneity has a much more profound effect if the individual scientist's utility depends on other scientists' payoffs. This is posited in one of the most widely cited models in behavioral economics (Fehr & Schmidt, 1999). Again, for simplicity, I only present the model for one technically particularly accessible case. Yet, along the same lines, this sensitivity to the payoff of others could also be introduced into the remaining models. I take payoff to be given by Equation A2; I thus assume the baseline definition of the public bad. Yet, I allow for disutility from breaking internalized norms, which I assume to be heterogeneous. Utility is thus given by Equation A13, with the proviso that γi ∈ [0, α]. There are scientists who have not internalized any norms. At the other extreme, there are scientists for whom the disutility from breaking an internalized norm of scientific conduct is so strong that it completely outweighs the benefit from norm violations. Such scientists do not engage in scientific disintegrity, irrespective of the negative repercussions on generalized trust in science. Obviously, in this world, for some scientists the individual utility from scientific disintegrity is positive. Because for them α > 1 + µ + γi, they will completely disregard the rules of scientific conduct. For those who are sufficiently averse to breaking their own normative standards, the opposite holds true. Such individuals never violate any rule of scientific conduct. Now, because of the character of scientific disintegrity as a public bad, the latter scientists still suffer. They are doing what they are expected to do from an ethical standpoint, and nonetheless trust in science erodes. It seems plausible to wonder whether this does not make a heroic assumption about the internalization of norms. Fehr and Schmidt (1999) captured this intuition in the following way:

$$u_i = \pi_i - \theta \frac{1}{N-1} \sum_{j \neq i}^{N-1} \max\{\pi_j - \pi_i, 0\}. \tag{A15}$$

I derive utility from my own payoff πi. However, I lose utility from any of my peers having a higher payoff than myself. This utility loss is more pronounced the more peers j have a higher payoff and the bigger the payoff differences are. How sensitive I am to relative, rather than absolute, payoff is measured by the factor θ. Now, in Equation A16, I introduce heterogeneity through the second term, which is a simplified version of Equation A13.A6 I thus assume that scientists differ with respect to the internalization of the norms of scientific conduct.

$$u_i = \pi_i - \gamma_i c_i - \theta \frac{1}{N-1} \sum_{j \neq i}^{N-1} \max\{\pi_j - \pi_i, 0\}. \tag{A16}$$
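To see how these two motives interact before turning to the discussion that follows, the sketch below (illustrative numbers; payoffs taken from Equation A2, as assumed in the text) computes Equation A16 for a norm-abiding scientist as more and more of her peers cheat:

```python
# Inequity aversion in a heterogeneous community (Equation A16).
# Payoffs come from Equation A2; gamma_i is i's norm sensitivity, theta the
# aversion to falling behind. Parameters and group composition are illustrative.

N, e, alpha, mu = 4, 1.0, 1.5, 0.2

def payoff(c_i, contributions):
    # Equation A2, where `contributions` is the list of all N choices including i's.
    return e - c_i + alpha * c_i - mu * sum(contributions)

def utility(i, contributions, gamma, theta):
    # Equation A16: own payoff, minus norm guilt, minus average disadvantage.
    pi = [payoff(c, contributions) for c in contributions]
    behind = sum(max(pj - pi[i], 0.0) for j, pj in enumerate(pi) if j != i)
    return pi[i] - gamma * contributions[i] - theta * behind / (N - 1)

gamma_honest, theta = 0.6, 1.0  # a scientist with a strongly internalized norm

for violators in range(N):      # how many of her three peers cheat maximally
    contributions = [0.0] + [e] * violators + [0.0] * (N - 1 - violators)
    print(violators, round(utility(0, contributions, gamma_honest, theta), 3))
# The honest scientist's utility falls as more peers get away with cheating,
# which is exactly the pressure that makes her internalized norm harder to sustain.
```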

In Equation A16, πi and πj are of course shorthand for the entire function (Equation A2). Yet, this succinct way of writing the utility function makes it easier to see the implications of agents caring about the choices of other agents.A7 One should remember that the game is a linear dilemma. Therefore, if one only has πi, scientists completely disregard scientific integrity. The more scientists care about the rules of scientific conduct (the larger γi), the more likely they are to set aside short-term gain and to abide by the rules of scientific conduct. However, unfortunately, the dilemma strikes back the more scientists of integrity do not want to be taken advantage of (θ). For if they abide by the rules of scientific conduct while other scientists break them, at least in the short run, the rule violators stand a better chance to advance their careers. The more of them that do so (the bigger n/(N − 1)), and the more they gain from breaking the rules (the larger πj − πi), the more difficult it becomes to maintain scientific integrity, even for scientists who would be quite willing to forgo short-term gains from breaking the rules of scientific conduct, if only they would not see others getting away with it.

Appendix Notes

A1. At the maximum of a function, the tangent is horizontal.
A2. The first derivative of ci² is 2ci.
A3. In Equations A8 and A10, and throughout the article, a single asterisk stands for the individually optimal choice, whereas double asterisks characterize the socially optimal choice.
A4. This rules out that I gain utility from going beyond what is normatively expected.
A5. To make the parallel easier to see, I deviate from Equation A12 in that I do not automatically assume the norm to be that I contribute nothing to my private project.
A6. Given that the payoff function is linear in the decision variable ci, Equations A12 and A14 are equivalent. If Equation A12 is appropriate, scientists either completely respect or completely disregard the rules of scientific conduct. The norm is that they respect these rules, which implies ĉ = 0.
A7. In the technical language of microeconomics, it is sufficient to check comparative statics.

Acknowledgments

This article has strongly benefited from encouragement and suggestions by Bobbie Spellman, Susann Fiedler, and Andreas Glöckner.

Declaration of Conflicting Interests

The author declared no conflicts of interest with respect to the authorship or the publication of this article.

Notes

1. In the Appendix, I show that—provided all flatmates maximize personal gains—these are the only two possible outcomes. There is a corner solution because their decision problem is linear in the decision variable—that is, their activity level.
2. These specific predictions of course hinge on the chosen parameters. For a general solution, see the Appendix.
3. In the technical jargon of game theory: The answer is no when assuming common knowledge of rationality.
4. In the formal language of a public good model, this act is a contribution to an individually profitable project.
5. However, even in a scientific community with only four members, this is a very unlikely event. If you assume that the effect of one scientist violating a rule and the effect of any other scientist violating this or another rule are independent, you may multiply the probabilities of ¼ that trust in science remains untouched. Hence, the probability that trust in science is completely upheld if all four scientists neglect standards is as small as (¼)⁴ = 1/256. In general, if you let the probability that trust remains untouched be p, the probability that no harm materializes if all N members of the scientific community disregard rules of conduct is as small as p^N and, therefore, quickly goes to 0 if you make the more realistic assumption that the scientific community is large.
6. Of course these concrete predictions hinge on the parameters of the example. For a general solution, see the Appendix.
7. This argument assumes risk neutrality. If a scientist who is tempted to break a rule of scientific conduct is risk seeking, the social problem worsens. If she is risk averse, she is constrained by the additional weight that she attaches to the possible negative effects on herself. Yet, note that her utility can never be as small as if she expects damage on herself with certainty. Moreover, in the concrete case, risk seeking is not implausible. It could, for instance, follow from discounting the negative effects of stiffer control by the public at some future point in time when this scientist expects to have tenure.
8. Note that, strictly speaking, the model prediction does not change. Any utility difference, however small, is exploited by a utility-maximizing agent. Yet, a rich experimental literature, starting as early as Rapoport and Chammah (1965), demonstrates that, empirically, cardinal differences matter.
9. Technically, a quadratic term is a convenient choice because it makes it easy to calculate an interior solution. However, for the story, this specific functional form is not critical. All one needs is disutility that grows disproportionally in the gravity of misconduct. Another option would, for instance, be the exponential function y = a^x.
10. A homogeneous group is still assumed in the current version of the model. Heterogeneity is described later in the article.
11. In game theoretic parlance, misbehaving is dominated.
12. Of course, this statement hinges on the chosen parameters. For the general solution, see the Appendix.
13. An analogous point can be made for arbitrarily large scientific communities (see the Appendix).
14. One could, of course, discuss the effects of repeating the game with some form of quadratic utility, that is, with society disproportionally reacting to more severe, or to more widespread, violations. However, such a discussion would become technically involved and would be difficult to correctly translate into words.
15. Because I do want to keep the exposition simple, let me mention a number of qualifications only in this footnote. Grim trigger is not the only possible strategy for sustaining cooperation. Less radical strategies, such as tit for tat, can also work. In a community of two, tit for tat is quite attractive. If A misbehaves in period t, she knows that B will misbehave in period t + 1. Provided A was well behaved in period t + 1, B will return to socially desirable behavior in t + 2. In a larger group, this might be translated into the following strategy: If one member has misbehaved in t, all others misbehave in t + 1, but they return to desirable behavior in t + 2 if the culprit was well behaved in t + 1. If individual scientists discount future gains, even more elegant optimal penal codes can be designed (Abreu, 1988). The current model is static in the sense that the process has no memory. Every new period, society gives science the same amount of generalized trust (the same "endowment"), irrespective of the degree of past misbehavior. This assumption is of course strong. In a dynamic model, one must specify in which ways past misbehavior, both by oneself and others, reduces the expected future benefit from one's own scientific activity. In principle, this makes it easier to sustain scientific standards, but results hinge on assumptions about population composition (who has which outside options, at which point in time?) and on discounting (how important are long-term benefits compared with short-term gains?).
16. The original model is also open to the possibility that individuals suffer from being ahead of, not from falling behind, others. If this is the case, solving the governance problem again becomes easier, and, consequently, less severe sanctions suffice. For details, see my previous work (Engel, 2014).

References

Abreu, D. (1988). On the theory of infinitely repeated games with discounting. Econometrica, 56, 383–396.
Andersen, S., Bulte, E., Gneezy, U., & List, J. A. (2008). Do women supply more public goods than men? Preliminary experimental evidence from matrilineal and patriarchal societies. American Economic Review: Papers and Proceedings, 98, 376–381.
Anderson, L. R., Mellor, J. M., & Milyo, J. (2004). Social capital and contributions in a public goods experiment. American Economic Review: Papers and Proceedings, 94, 373–376.
Aumann, R. J., & Shapley, L. S. (1994). Long term competition—A game theoretic analysis. In R. J. Aumann (Ed.), Collected papers I (pp. 395–409). Cambridge, MA: MIT Press.
Bakker, M., van Dijk, A., & Wicherts, J. M. (2012). The rules of the game called psychological science. Perspectives on Psychological Science, 7, 543–554.
Becker, G. S. (1968). Crime and punishment: An economic approach. Journal of Political Economy, 76, 169–217.
Chaudhuri, A. (2011). Sustaining cooperation in laboratory public goods experiments: A selective survey of the literature. Experimental Economics, 14, 47–83.
Cornes, R., & Sandler, T. (1996). The theory of externalities, public goods and club goods (2nd ed.). Cambridge, England: Cambridge University Press.

Dufwenberg, M., Gächter, S., & Hennig-Schmidt, H. (2011). The framing of games and the psychology of play. Games and Economic Behavior, 73, 459–478.
Engel, C. (2014). Social preferences can make imperfect sanctions work: Evidence from a public good experiment. Journal of Economic Behavior & Organization, 108, 343–353.
Fehr, E., & Gächter, S. (2000). Fairness and retaliation: The economics of reciprocity. Journal of Economic Perspectives, 14, 159–181.
Fehr, E., & Schmidt, K. M. (1999). A theory of fairness, competition, and cooperation. Quarterly Journal of Economics, 114, 817–868.
Fehr, E., & Schmidt, K. M. (2006). The economics of fairness, reciprocity and altruism—Experimental evidence and new theories. In S.-C. Kolm & J. M. Ythier (Eds.), Handbook on the economics of giving, reciprocity and altruism (Vol. 1, pp. 615–691). Amsterdam, The Netherlands: Elsevier.
Fiedler, K. (2011). Voodoo correlations are everywhere—Not only in neuroscience. Perspectives on Psychological Science, 6, 163–171.
Fuchs, H. M., Jenny, M., & Fiedler, S. (2012). Psychologists are open to change, yet wary of rules. Perspectives on Psychological Science, 7, 639–642.
Heckathorn, D. D. (1989). Collective action and the second-order free-rider problem. Rationality and Society, 1, 78–100.
John, L. K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23, 524–532.
Kerr, N. L. (1998). HARKing: Hypothesizing after the results are known. Personality and Social Psychology Review, 2, 196–217.
Lacetera, N., & Zirulia, L. (2011). The economics of scientific misconduct. Journal of Law, Economics, & Organization, 27, 568–603.
Ledyard, J. O. (1995). Public goods: A survey of experimental research. In J. H. Kagel & A. E. Roth (Eds.), The handbook of experimental economics (pp. 111–194). Princeton, NJ: Princeton University Press.
Levelt Committee, Noort Committee, & Drenth Committee. (2012). Flawed science: The fraudulent research practices of social psychologist Diederik Stapel. Retrieved from http://pubman.mpdl.mpg.de/pubman/item/escidoc:1569964/component/escidoc:1569967/Eindrapport_Commissie_Levelt.pdf
Masclet, D., Noussair, C., Tucker, S., & Villeval, M.-C. (2003). Monetary and non-monetary punishment in the voluntary contributions mechanism. American Economic Review, 93, 366–380.
Nagin, D. S., & Pepper, J. V. (2012). Deterrence and the death penalty. Washington, DC: National Academies Press.
Nosek, B. A., Spies, J. R., & Motyl, M. (2012). Scientific Utopia: II. Restructuring incentives and practices to promote truth over publishability. Perspectives on Psychological Science, 7, 615–631.
Olson, M. (1965). The logic of collective action: Public goods and the theory of groups. Cambridge, MA: Harvard University Press.

Ostrom, E., Dietz, T., Dolsak, N., Stern, P. C., Stonich, S., & Weber, E. U. (Eds.). (2002). The drama of the commons. Washington, DC: National Academies Press.
Page, T., Putterman, L., & Unel, B. (2005). Voluntary association in public goods experiments: Reciprocity, mimicry and efficiency. Economic Journal, 115, 1032–1053.
Potters, J., Sefton, M., & Vesterlund, L. (2005). After you—Endogenous sequencing in voluntary contribution games. Journal of Public Economics, 89, 1399–1419.
Rapoport, A., & Chammah, A. M. (1965). Prisoner's dilemma: A study in conflict and cooperation. Ann Arbor: University of Michigan Press.
Rosenthal, R. W. (1981). Games of perfect information, predatory pricing and the chain store paradox. Journal of Economic Theory, 25, 92–100.
Rubinstein, A. (1979). Equilibrium in supergames with the overtaking criterion. Journal of Economic Theory, 21, 1–9.
Selten, R. (1978). The chain store paradox. Theory and Decision, 9, 127–159.

Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22, 1359–1366.
Simonsohn, U. (2012). Just post it: The lesson from two cases of fabricated data detected by statistics alone. Psychological Science, 24, 1875–1888.
Stroebe, W., Postmes, T., & Spears, R. (2012). Scientific misconduct and the myth of self-correction in science. Perspectives on Psychological Science, 7, 670–688.
van Dijk, F., Sonnemans, J., & van Winden, F. (2002). Social ties in a public good experiment. Journal of Public Economics, 85, 275–299.
Yamagishi, T. (1986). The provision of a sanctioning system as a public good. Journal of Personality and Social Psychology, 51, 110–116.
Zelmer, J. (2003). Linear public goods: A meta-analysis. Experimental Economics, 6, 299–310.
