Behavioural Brain Research 280 (2015) 119–127


Review

Does reward unpredictability reflect risk?

Patrick Anselme ∗

Département de Psychologie, Cognition et Comportement, Université de Liège, 5 Boulevard du Rectorat (B 32), B4000 Liège, Belgium

Highlights

• In decision theories, risk is viewed as a synonym of unpredictable variability.
• Potential losses are assumed to result from a reduction in optimal gains.
• In doing so, current experiments test the effects of uncertainty, not of true risk.
• Risk only exists provided that one's own resources are limited and imperilled.
• New methodological approaches are discussed.

Article info

Article history:
Received 23 September 2014
Received in revised form 26 November 2014
Accepted 1 December 2014
Available online 8 December 2014

Keywords: Risk; Unpredictability; Opportunity cost; Limited resources; Dopamine

Abstract

Most decisions made in real-life situations are risky because they are associated with possible negative consequences. Current models of decision-making postulate that the occasional, unpredictable absence of reward that may result from free choice is a negative consequence interpreted as risk by organisms in laboratory situations. I argue that such a view is difficult to justify because, in most experimental paradigms, reward omission does not represent a cost for the decision-maker. Risk only exists when unpredictability may cause a potential loss of own limited resources, whether energetic, social, financial, and so on. Thus, the experimental methodologies used to test humans and non-humans relative to risk-taking seem to be limited to studying the effects of reward uncertainty in the absence of true decision cost. This may have important implications for the conclusions that can be drawn with respect to the neurobehavioural determinants of risk-taking in real-life situations.

© 2014 Elsevier B.V. All rights reserved.

Contents

1. Introduction 119
2. Risk and opportunity cost 120
3. An evolutionary argument 121
4. Opportunity without cost 122
   4.1. Animal studies 122
   4.2. Human studies 123
   4.3. Risk, uncertainty, and dopamine 124
5. Conclusion 125
References 125

∗ Corresponding author. Tel.: +32 87 35 53 48. E-mail address: [email protected]
http://dx.doi.org/10.1016/j.bbr.2014.12.003

1. Introduction

Early models of foraging behaviour ignored variability in reward distribution; they assumed that animals had complete knowledge of deterministic reward contingencies [25]. But the evidence that any action may have unpredictable outcomes has led to the consideration of variability as a crucial factor capable of influencing decision-making [95,99]. In real-life situations, an action or event whose consequences cannot be fully predicted is a potential source of risk. Because unpredictability is associated with many actions and events, some risk is commonly encountered by organisms, including humans. Thus, identifying the parameters that influence risk-taking in well-controlled conditions is of primary importance in order to understand animal and human behaviour—a topic that has been extensively studied in behavioural ecology [10,21,55] and cognitive neuroscience [54,66,68,94,102].

Traditionally, researchers use microeconomic models to explain how organisms make decisions. Good decisions are those that maximise gains and minimise losses relative to rewards (food, money, etc.), while bad decisions have the reverse effect. Microeconomic models rely on a distinction established by the economist Frank Knight between decisions under risk and decisions under uncertainty. Following Knight [63], risk denotes a known probability distribution of possible outcomes, while uncertainty (also referred to as ambiguity) refers to an unknown probability distribution of possible outcomes—two concepts operationally defined by means of specific experimental procedures [67,86]. In a Russian roulette game with a six-chamber revolver, the decision to pull the trigger is said to be made under risk if the player knows how many bullets the revolver contains, because with N bullets, the probability of dying is N/6 and that of staying alive is (6 − N)/6. In contrast, this decision is said to be made under uncertainty if the player does not know how many bullets are present (zero to six bullets can be in the revolver), because the consequences of pulling the trigger are ambiguous. In sum, the risk–uncertainty distinction is a convention whose objective is to separate what is measurable (called 'risk') from what is not measurable (called 'uncertainty'). In most laboratory studies, animal subjects and human participants are assumed to know the probabilities of reward distribution, so that unpredictable variability in outcomes is viewed as an equivalent of risk [59,60,80,84,89,95].

The issue raised in this paper is whether reward unpredictability can represent a plausible model of risk. It is argued that the standard definition of risk does not provide a plausible picture of real-life risk, essentially for one reason: the definition carries an overly minimalist view of decision cost. An option associated with a possible absence of gains is judged to be as risky as an option associated with a possible loss of one's own limited resources, whether energetic, social, financial, safety-related, and so on. In contrast, in real-life situations, animals and humans are said to take risks only when their own resources are at stake. In sum, known and unknown probabilities constitute two forms of uncertainty that say nothing about true risk. This view has direct implications for the plausibility of the experimental methodologies designed to study risk-taking and, thus, for the conclusions to be drawn from the collected data. I am not trying to convince psychologists, economists, and neuroscientists in general that they use an invalid term, but I think that distinguishing between resource-independent risk (in the economic sense) and resource-dependent risk (such as predation risk) may help clarify the conceptual foundations of neuroeconomics. Experimental strategies allowing the resource-independent and resource-dependent conceptions of risk to be empirically tested are provided.
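Knight's distinction can be stated compactly for the revolver example above. The uniform prior used in the uncertainty case below is purely an illustrative assumption introduced here; Knightian uncertainty means precisely that no such distribution is given.

\[
\text{Risk } (N \text{ known}):\quad P(\text{death}) = \frac{N}{6}, \qquad P(\text{survival}) = \frac{6 - N}{6}.
\]
\[
\text{Uncertainty } (N \text{ unknown}):\quad \text{if a uniform prior over } N \in \{0,\dots,6\} \text{ were imposed, } \mathbb{E}\big[P(\text{death})\big] = \sum_{n=0}^{6} \frac{1}{7}\cdot\frac{n}{6} = \frac{1}{2},
\]
but no such prior is actually available to the player, so the probability of dying is simply not measurable.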

2. Risk and opportunity cost

The 'unpredictability = risk' principle is traditionally used to guide the study of animal and human risk-taking behaviour [47,59,80]. Individuals are exposed to a free-choice task in which two options are presented. For example, one option provides reward for sure following a fixed ratio of responses, while the other option delivers reward following a variable ratio of responses [3,38,51,57,90]. Over the training sessions, the individual therefore learns that the former option is more reliable (no variance) than the second (unpredictable), for which the number of responses required to obtain a reward is unknown in advance. Another method consists of presenting one option for a small certain reward and the other option for a larger probabilistic reward [1,2,23,64,73,74,100,105]. This is a variant of the impulsive choice task, in which one alternative guarantees a small immediate reward and the other alternative a larger delayed reward. The use of variable delays or variable reward amounts within this task is also a common way to measure risk, although all trials are here systematically rewarded (for a review, see [59]). In all cases, individuals are said to take risks when they choose the option associated with variability, because occasional reward omission generates opportunity costs. Technically, opportunity cost means that, when an option is chosen, there are (a) a renunciation of the gains associated with the other options (e.g., deciding to perform grooming behaviour prevents the energy intake resulting from food consumption) and (b) a loss of the resources that were required for another option (e.g., the energy spent mating is no longer available for the search for food). The overall risk that an individual could achieve greater benefits with another option is the opportunity cost. Making decisions when the outcome is unpredictable may generate opportunity costs. We will see that those costs are negligible in most current experimental procedures used to test risk-taking (Section 4); a compact formalisation of these choice options and of opportunity cost is sketched at the end of this section.

But above all, it must be asked whether opportunity costs provide a satisfactory definition of risk. The concept of opportunity cost insists on the benefits associated with the other options (not chosen), but it does not tell us whether the option actually chosen is effective (e.g., in terms of survival or of monetary gains). An option A (not chosen) may be better than an option B (chosen), but this does not really matter if B has the expected/desired effects. I argue that opportunity costs are only a source of risk provided that they imperil (in part or in totality) an individual's own limited resources. What may be lost in a risky choice is not a mere opportunity but some of the decision-maker's own resources. The word 'resource' here denotes an organism's own reserves rather than the reward sources (food, shelter, etc.) that can be obtained by means of those reserves. Own resources may refer equally to the energetic reserves of a bird, the financial means of a person, the richness of an individual's social contacts, or even the strength of a motivation for a task. Note that hoarded food or earned money are own resources, while food and money in general are rewards. The potential loss of own resources represents risk in many real-life situations.

• Example 1: Predator inspection. Many prey species, from fish to mammals, are known to approach and/or follow (i.e. to inspect) their natural predators. In Thomson's gazelles (Gazella thomsoni), this behaviour allows them to acquire information about a potential threat and eventually avoid a subsequent attack [40]. Predator inspection is costly in terms of energy expenditure and is a source of potential injury or death for the prey; it has been shown to increase mortality in guppies, Poecilia reticulata [36]. Opportunity costs are associated with this activity because, during that time, the animal does not act to increase its energy budget or to produce more offspring [40]. However, it seems reasonable to think that the possibility that the prey is injured or killed is what makes this behaviour risky.

• Example 2: Toxin resistance. The consequences of inappropriate behaviour are often more severe for prey than for predators—what Dawkins [31] called the 'life-dinner' principle. But this ceases to be true when prey is dangerous. For instance, the garter snake (Thamnophis sirtalis) forages on newts, Taricha granulosa, whose skin produces tetrodotoxin (TTX), an extremely potent neurotoxin that acts as a Na+ channel blocker. The snake's ability to resist TTX is independent of its mass and of the number of exposures to this toxin; it is a biological adaptation whose physiological mechanisms are not yet known. However, there is a price to pay for consuming those newts: impaired locomotion during a period of time that varies according to the snake's resistance and the dose of TTX received [17,18]. Eating a newt may affect the survival of T. sirtalis in two distinct ways: a higher chance of being killed by a predator during immobilisation and an inability to regulate body temperature when the environment becomes colder. Because struggling against the toxic effects of TTX consumes resources that make survival uncertain, foraging is a risky activity for the garter snake.

• Example 3: Gambling. In humans, the actions that cause potential fluctuations in financial resources, such as gambling, are considered risky. Although it may seem unnecessary to specify why gambling is a source of risk, it is worth noting that this is only true if an individual's financial resources are imperilled. As a reminder, the standard definition suggests that risk is a mere consequence of the unpredictability of possible outcomes. In this view, the inability to predict how much money will be won or lost is what makes gambling risky. But it is quite easy to show that unpredictability remains insufficient to induce risk: a billionaire gambling for $10,000, or a man with average financial means who receives some money for free before going to a casino, cannot be said to act under risk, even though they are unable to predict the amount of their final gains, because these two gamblers have no—or almost no—own resources to lose. In both cases, the gamblers seem to be attracted by the 'excitement' that uncertainty may produce, not by risk (see Section 5.3). In the same vein, lottery games, for example the lotto and scratch cards, have a very low probability of gain and a low pay-out ratio, so that the chance of losing money when buying them is almost guaranteed [8]. However, apart from people with a poor socioeconomic background [52], the decision cost of buying lottery tickets remains very limited given the small amount of money gamblers have to spend. The absence of decision cost makes risk nonexistent for a majority of individuals, despite the variability inherent in the possible gains. Once again, decision cost (here, financial) and risk are intimately related.

Reward unpredictability and opportunity costs are not at the origin of potential problems in those situations; the origin of risk comes from the fact that the decision-makers' own limited resources are imperilled. With 'unlimited' resources, it would be theoretically possible to counter the effects of unpredictability, and consequently any risk. If a gambler has enough money to buy all the lottery tickets, he will be able to win the top prize with certainty, even though the outcome is subject to pure chance. Similarly, the long-term energy reserves that allow animals with a large body size, for example hippopotami and elephants, to stay alive during a prolonged absence of food may reduce foraging risk [10,75], and the ability of salamanders to regenerate a leg and of lizards to re-grow a lost tail following injury may reduce predation risk, at least in the short term [70]. In contrast, with limited resources, the number of trials necessary to suppress the effects of unpredictability is impossible to achieve and necessarily imperils the persistence of those resources. Actions have to be adjusted to encountered situations in order for organisms to avoid wasting too much time and energy, but also—in humans—to avoid losses with respect to friends, administrative rights, familial relationships, a professional position, and so on. Because organisms have only limited resources at their disposal, they have to consider unpredictability and take some risk in order to keep those resources at an 'acceptable' level.
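The choice procedures and the notion of opportunity cost described at the beginning of this section can be sketched formally as follows. The pellet numbers are illustrative placeholders rather than values taken from the cited studies.

\[
\mathbb{E}[\text{certain option}] = 1 \text{ pellet}, \qquad \mathbb{E}[\text{variable option}] = p\,m = 0.5 \times 2 = 1 \text{ pellet},
\]
\[
\operatorname{Var}[\text{certain option}] = 0, \qquad \operatorname{Var}[\text{variable option}] = p(1-p)\,m^{2} = 0.25 \times 4 = 1,
\]
so the two options differ only in variance, and the opportunity cost of choosing option \(a\) is
\[
\mathrm{OC}(a) = \max_{a' \neq a} \mathbb{E}[V(a')] - \mathbb{E}[V(a)],
\]
a quantity defined entirely over forgone gains, which never draws on the decision-maker's own reserves.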
In summary, risk results from the failure to predict how variability in the distribution of rewards will affect an individual's own limited resources. Risk-taking is a strategy used to thwart the negative, unavoidable effects of unpredictability—it is not unpredictability (or opportunity costs) per se.

3. An evolutionary argument

A modern argument for interpreting unpredictable variability as risk is that, in nature, the unpredictable distributions of external resources, such as food and water, necessarily imperil survival [10,95]. Indeed, if finding food requires much effort (because access to it is not guaranteed), an insufficient daily caloric intake may cause the individual's death. Thus, evolution might have shaped organisms to interpret unpredictable variability as risk, as the standard definition of risk invites us to think. On the assumption that this evolutionary argument is correct, organisms, including humans, tested in laboratories should react to unpredictable events as if they were risky, even though resource-related losses are not at stake. Bateson and Kacelnik ([10], p. 1139) note that '[. . .] natural selection might have driven the evolution of a mechanism for measuring the riskiness of foraging sources by measuring their variance. Such a mechanism would be functionally indistinguishable from a mechanism that detected true unpredictability in a world in which all variable food sources were also risky'.

Without denying the role that evolution may play in shaping animal behaviour and cognition, it must be noted that evolutionary arguments remain essentially hypothetical. In particular, the effects of learning relative to the conditions of reward delivery, in natural as well as laboratory situations, should not be neglected. Inherited representations can be changed through learning, even in elementary animals. For instance, when cockroach groups can freely choose between light and dark places, they prefer to shelter in the dark place in a majority (73%) of trials [50]. This preference is a biological adaptation exhibited by isolated individuals and amplified through social interactions. Interestingly, Halloy et al. [50] were able to reverse the sheltering preference of cockroaches after introducing small robots that initially mimicked their behaviour, that is, random exploration, preference for darkness, and influence from the presence of conspecifics. Once accepted within cockroach groups, the robots were programmed to prefer the light shelter. The robots were then able to induce a change in the collective shelter preference, as cockroaches chose the light shelter in 61% of the trials. This ability of animals as elementary as cockroaches to change their inherited preference as a result of learning indicates that acquiring new information can alter behavioural tendencies put in place by evolution. A fortiori, the same should be true with respect to bigger-brained animals, such as birds and mammals; it is therefore reasonable to hypothesise that they are able to distinguish risk from mere unpredictability depending on the type of situation they are experiencing.

According to Bateson and Kacelnik [10], demonstrating a difference in the effects of risk and unpredictability is impossible because they cannot be tested separately. I agree with their suggestion that there is no risk in the absence of unpredictability. But contrary to their presupposition, the reverse might not be true: unpredictability does not entail the occurrence of risk; for example, the uncertainty about which leaf will be the next to fall from a tree in autumn is unrelated to any risk. The defenders of the evolutionary argument presented earlier will say that a leaf falling from a tree is a neutral event, that is, that the argument only works relative to significant stimuli, such as food, predators, and sexual partners.
But this narrow connection between the unpredictability of significant stimuli and risk remains insufficient to suggest that they are processed in the same way. By analogy, the tasks traditionally designed to study attention in animals involve rewarding the animals in order to drive their attentional focus onto those tasks. That is to say that, for practical reasons, attention is not studied independently of an animal's motivation [71]. But nobody would suggest that attention and motivation are the same process, even though they closely interact [5,87]. This argument could also work with respect to risk and unpredictability. Several methodological suggestions are proposed further on in order to help determine whether or not animals are inclined to interpret unpredictability in terms of risk.

4. Opportunity without cost

The standard definition of risk does not deny a possible effect of risky choices on the reduction in an individual's own resources. But this definition assumes that risk is also present in situations in which such a reduction is impossible. That a resource-independent conception of risk can serve as a framework for experimentation is legitimate and should not be called into question. However, such a view becomes a potential problem as long as it is the only way concretely used to represent risk in animal and human experimentation [6]. In my opinion, only a comparison of resource-independent and resource-dependent conceptions of risk can allow us to determine whether or not they are equivalent. After all, even in resource-independent models, the processing of reward delay, reward magnitude, and reward probability might involve different brain mechanisms (e.g. [22,37]).

It is a widespread idea that reward loss acts like a punishment [77,88]. For example, wanted rewards are known to release dopamine in the nucleus accumbens, a mesolimbic brain region [12,13]. Accordingly, reward omission at the expected time of its delivery and punishment have similar depressing effects on the activity of dopamine neurons [88]. Also, in the successive negative contrast procedure, where the reward rate received at training is suddenly reduced, a decrease in dopamine efflux is recorded [46], although other studies report an ineffectiveness of dopaminergic agents in modulating contrast effects [43]. These results legitimate the modelling of risk in the absence of any objective danger. In this logic, animals exposed to partial reinforcement in Pavlovian autoshaping should experience punishments induced by occasionally nonrewarded trials. However, do nonrewarded trials really act like punishments in this context? First, a punishment is an event whose effect is to decrease the probability of a response [93]. An animal that receives a shock when approaching a food tray will end up avoiding it [78]. The suggestion that nonrewarded trials have punishing properties is incompatible with the evidence that partial reinforcement in autoshaping, where rewards are unpredictable, enhances the vigour of responses directed to the conditioned stimulus (CS) compared with continuous reinforcement, where rewards are fully predictable (e.g. [7,15,27,48,85]). In fact, there is evidence that nonrewards generate much less conditioned inhibition than usually believed [104]. Second, a punishment may result from the non-occurrence of a wanted, expected reward. When animals are not allowed to access an expected reward, wanting is replaced by frustration and dopamine levels are decreased [46] (for frustration, see [4]). Since rewards cannot be expected under partial (50%) reinforcement (because the expectation of reward is equivalent to that of nonreward), it is unlikely that occasional nonrewards are experienced as punishments. Besides, dopamine levels are known to increase in animal and human individuals trained with unpredictable rewards [39,92,103]. In contrast, if nonrewarded trials came with electric shocks as a really punishing event capable of altering an animal's motivation to perform in autoshaping, a decrease (rather than an increase) in behavioural performance would certainly occur.
Taken together, these results indicate that the postulate that resource-dependent and resource-independent sources of risk are equivalent should not be taken for granted. Now, I would like to go further in the analysis of how risk is typically modelled in current experimental work.

4.1. Animal studies

There are good reasons to think that most experimental methods used to study risk-taking in animals (see Section 2) do not capture resource-dependent risk because, in such methods, individuals have no resources of their own to potentially lose. These methods treat potential losses as an absence of optimal gains rather than as a reduction in own limited resources [6], and this raises important problems.

• First problem: In laboratory tasks, the search for reward is hardly demanding in terms of energy cost. In general, animals only have to press a lever or peck at a key within a confined environment. And with respect to the reward options available, zero—or very few—pellets represent the worst possible outcome. There is no modelling of the negative consequences potentially associated with reward seeking in natural environments, such as injury and predation risk.

• Second problem: The quantity of 45 mg pellets not received during the task is provided in the home cage after each training session, so that the total number of pellets received is the same for all animals (i.e. the total number of palatable rewards obtained is independent of the choices made). This would be no problem if animals were unable to learn that the missing food is made up for after the end of a session. But making such an assumption is hardly plausible. When rats have been trained to press a lever for food and then receive the same food for free, this may cause a loss of motivation to work for that food. Of course, animals will press a lever associated with the delivery of palatable pellets anyway, even if they are kept at 100% of their free-feeding bodyweight. But the guarantee that, in the end, they will obtain exactly the same quantity of pellets whatever they do may alter their responding during the training sessions.

• Third problem: The quantity of 45 mg pellets not received during the task does not determine the quantity of regular pellets (chow) obtained in the home cage after each training session, in order to allow the mean bodyweight of the animals to remain constant over the sessions (i.e. survival is not imperilled by bad choices). Even though animals are food deprived before the experiment begins, I suspect that they do not experience any risk as long as deprivation strength is held constant, that is, independent of the animal's outcomes within a task. Why? Because the animal's own resources are not at stake. Testing animals under food deprivation (i.e. when own resources are reduced) does not enhance the objective level of risk if they know that the food supply is stable and sufficient to keep them alive; risk can only exist provided that the quantity of food received depends on the consequences of their actions. In this case, and only in this case, food deprivation can amplify the objective level of risk. In other words, animals must have the opportunity to learn that the appetite strength they experience depends on their performance. Of course, such a criticism is only valid provided that performance-dependent changes in energy budget can affect subsequent foraging decisions. Currently, there is no evidence for or against this hypothesis, as all experimental manipulations maintain the energy budget constant over the training sessions, that is, assess risk-taking in a resource-independent way. The fact that animal choices are often insensitive to energy budget in different experimental situations [16] (for a review, see [59]) does not disconfirm the hypothesis if we accept that modifying the energy budget according to foraging successes and failures is a necessary condition for risk to occur. Even the work of Caraco et al. [21], one of the most convincing approaches to risk-taking in animals, holds bodyweight constant between the sessions in the negative budget condition.

These methodological problems (especially the last two) are typical of open economy settings, in which animals receive some food after each training session. Are closed economy settings, in which no extra food is provided, more appropriate procedures with respect to the study of risk-taking? In a closed economy, animals live in the operant chambers and receive their daily food amount during training (e.g. [11,44]), so that the bodyweight of an individual reflects its experimental behaviour [49,56]. In a sense, animals run more risk in a closed economy setting. However, it is important to note that animals trained under closed economy often receive larger food pellets and longer sessions than under open economy, leading to food satiation [81]. In this case, closed economy may be ineffective in generating resource-dependent risk. Regarding most paradigms of decision-making under risk, Paglieri et al. [76] note that 'every case of unsuccess is an "unlucky event" but not necessarily a "risk". Therefore, while the attraction for uncertain reward may resemble the features of a "gambling proneness", it is not necessarily fitting with the construct of "risk proneness"' (pp. 4–5; see also [6]).

The methodological precautions traditionally used regarding food supply, which consist of providing the same amounts of food to all animals over the training sessions, are rationally justified and even required for objective conclusions to be drawn. However, it is not certain that risk-taking is the variable studied with such experimental procedures. My opinion is that they are rather concerned with testing preference versus aversion for the uncertainty of reward delivery. Additional constraints on the task should be used for a realistic assessment of risk. Decreasing (vs. increasing) the amount of regular pellets received in the home cage when the sum of gains during the task is low (vs. high) might be one of them. In order to avoid bodyweight varying as a result of performance, it could be held constant by means of repeated intravenous injections of nutritive substances. This technique is known to prevent the occurrence of physiological deficits while avoiding the suppression of appetite [12], despite increased nutrient (e.g. glucose) levels in the blood. In fact, the oral consumption of food is a necessary step in the satiation process [24,62]; even introducing food directly into the stomach fails to satisfy appetite [101]. A simpler strategy might consist of punishing (e.g. with a mild electric shock) the lower outcomes of the variable option in order to act on motivational resources. Since the property of punishment is to suppress a particular action [93], punishment could be a good model of loss because, in many situations, bad decisions lead individuals to stop an action associated with negative consequences. Of course, there is no guarantee that the use of mild electric shock will have the same behavioural effects as a change in food supply according to performance, but risk in natural environments may also take different forms. The important point here is that these two strategies do not presuppose that variability is sufficient to produce risk and therefore represent risk in a more realistic way.

There is evidence that the presence of real negative consequences is able to change the behavioural strategies adopted by animals in a task. Simon et al. [91] allowed rats to press two levers. Pressing the first one gave access to one food pellet (small, safe reward option), while pressing the second lever gave access to three food pellets but was also accompanied by a possible footshock (large, risky reward option).
In the initial 18-trial block, the probability of shock was 0%, but it gradually increased to 25, 50, 75, and 100% in the subsequent blocks. Simon et al. [91] found that there was a shift in preference from the large, risky reward to the small, safe reward when the probability of shock increased. Preference for the small, safe reward was more pronounced when shock intensity increased. Reversing the block order did not alter preference within each block, as a higher probability of shock continued to increase the preference for the small, safe reward. In this experiment, unpredictability was not about reward (which was always delivered) but about shock, so that shock did not mean 'no reward'. This might be a relevant model for a risky situation in which some food is available but hard to access. We see that the presence of shocks, which may reduce motivational resources, influences decision-making in animals.

Let us take an example from the context of addiction. Drug-seeking reinstatement ('relapse'), which may occur in humans following a prolonged period of abstinence, is traditionally studied in rats conditioned to a cue that predicts the delivery of a drug. After learning the cue–drug association, rats are deprived of the drug during an extinction test, in which the drug is no longer obtained when they press a lever previously associated with its delivery, and their performance is then reassessed in the presence of the cue (e.g. [30]). However, as Cooper et al. [28] point out, this model ignores the negative consequences associated with drug seeking in humans, such as obtaining money to purchase a drug, the illegality of keeping the drug, and the anticipation of withdrawal symptoms. These negative consequences may come into conflict with the appetitive effects of drug seeking and therefore alter decision-making. In order to model the existence of such a conflict in rats, Cooper et al. [28] placed an electric barrier near the lever. Following 3 days of self-imposed abstinence from cocaine caused by the presence of the barrier, rats were tested in extinction, and relapse was then assessed in the presence of the electric barrier. Interestingly for our purpose, rats subjected to this conflict model did not react in the same way as rats tested with the traditional model. In particular, reinstatement of cocaine seeking was observed in a smaller number of animals, and a higher variance in responding occurred under conflict. Also, the conditioned cue increased lever-directed responses when it was presented after extinction. According to these authors, the differences in responses induced by the two models can be attributed to the importance these models attach to negative consequences for the animals. Although this study did not aim to assess risk-taking, it also invites us to think that resource-dependent and resource-independent models of risk might lead to different behavioural effects. For example, Pelloux et al. [79] showed that, in comparison with control rats, electric shock suppresses cocaine and sucrose seeking after a moderate training history, as well as after more extended training—except in a subgroup of cocaine rats that developed insensitivity to shock (see also [33,61]).

4.2. Human studies

Human studies often use money rather than food as a reward, but similar problems occur with respect to the experimental modelling of decision cost (e.g. [9,14,29,54,68,83,94,96,98]). As with animals, participants have to decide between one option associated with the certainty of a moderate gain (e.g. $3) and another option with a 50% probability of a more substantial gain ($6) or nothing ($0). Variants exist relative to the probabilities and money amounts involved, but the basic principle remains the same. Here also, the variable option is assumed to be more risky because there is a chance of receiving no—or very little—money, and participants are said to lose when they receive the lowest outcome. Yet there is nothing to lose in this sort of task; at worst, the actual gain on a particular trial is smaller than the top (and desired) gain.
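A toy simulation makes the 'nothing to lose' point concrete. It is only a sketch: the $3/$6 amounts mirror the example above, whereas the $3 stake in the 'own money' condition is a hypothetical parameter introduced here for contrast, not a feature of the cited tasks.

import random

def final_amount(n_trials=100, endowment=0.0, own_money=False, stake=3.0):
    """Cumulative earnings for a player who always picks the 50/50 option
    paying $6 or $0. In the typical laboratory task (own_money=False) a $0
    outcome merely means no gain; in a casino-like setting (own_money=True)
    the same outcome also costs the player the stake wagered."""
    total = endowment
    for _ in range(n_trials):
        win = random.random() < 0.5
        if own_money:
            total += (6.0 - stake) if win else -stake  # own resources at stake
        else:
            total += 6.0 if win else 0.0               # worst case: no gain
    return total

random.seed(0)
lab = [final_amount(own_money=False) for _ in range(1000)]
casino = [final_amount(own_money=True) for _ in range(1000)]
print("laboratory task, worst final amount:", min(lab))       # never below the endowment
print("own-money gambling, worst final amount:", min(casino))  # can end in the red

In the laboratory version, the distribution of final amounts is bounded below by the endowment, whatever the sequence of 'losses'; only in the own-money version can unpredictability erode the player's own resources.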
The accumulation of losses over the task's dozens of trials is simply subtracted from the sum of optimal gains or from the initial endowment received before the experiment begins, because most of these works involve the administration of a radioactive marker for fMRI or PET scanning. In general, participants are able to win $20–30, sometimes more, by the end of the experimental task. In terms of risk, is this situation comparable to that of a man with average financial means spending his own money at a casino? After all, receiving less money than desired in a psychology experiment is quite a different experience from having to mortgage one's house or being turned away by one's bank because of bad gambling decisions. Thus, it is unclear whether people will adopt the same cognitive strategies in these two situations.


For example, during fMRI, Lawson et al. [65] exposed participants to CSs that predicted a £1 win, a £1 loss, a shock, or a neutral outcome. Participants received £50 for their involvement in the experiment. The results showed that activation of the habenula was positively correlated with conditioned suppression in response to shock, but that there was no activation in response to win and loss CSs relative to neutral cues. While evolutionary constraints might conceivably be invoked to explain the performance of captive animals, this is less credible in human experiments, because people understand (receive explanations of) the potential consequences of their actions before the task begins. Thus, participants are fully aware that they will not lose their own money during the experiment, and perhaps also fully aware that they are likely to be 'richer' at the end of it. In the worst scenario, they will leave the experimental room with no monetary gain. Therefore, why would they interpret such a task as a risky activity?

In the literature on risk-taking, this contrast between the 'pseudo-losses' within psychology experiments and the true losses of real-life situations (e.g. in a casino) is noted by some authors but, strangely, is not reported as a potential problem: 'Decision theorists [. . .] view risk as increasing with variance in the probability distribution of possible outcomes, regardless of whether a potential loss is involved. For example, a prospect that offers a 50–50 chance of paying $100 or nothing is more risky than a prospect that offers $50 for sure—even though the "risky" prospect entails no possibility of losing money' ([45], p. 145). Joutsa et al. [58] have proposed what they consider to be a more realistic money gambling task, because participants are offered the opportunity to use real money with a commercial electronic slot machine. Although this may indeed improve the ecological validity of the task, the aforementioned problem remains: real money does not make gambling situations more risky than abstract money. As pointed out by these authors, '[e]ach participant was given 20€ as their starting bankroll (loaded in the program as the task started), and they were instructed that, if they should lose the initial amount, it would be automatically reloaded by the slot machine, and that they could keep the possible winnings without having to pay for the possible losses' ([58], p. 1994).

It is important to note that the arguments developed in this paper address risk objectivity (i.e. the degree of risk that a situation concretely represents for an individual), not risk subjectivity (i.e. the perception of a situation as safe or risky). In humans, experimental data indicate that these two components are processed by distinct brain areas. Indeed, contrary to objective risk, subjective risk depends on personality (i.e., the inclination to seek or avoid risk in general) as well as on context (e.g., lacking caution as a car driver while limiting expenses as a consumer). For example, Holper et al. [53] found increased hemodynamic signals (measured by functional near-infrared spectroscopy, fNIRS) in the lateral prefrontal cortex of risk-seeking individuals in response to (resource-independent) high-risk options relative to (resource-independent) low-risk options. In contrast, hemodynamic signals were decreased in that brain region in risk-averse individuals. The sensitivity of the lateral prefrontal cortex to risk attitude suggests that it codes subjective risk (see also [26,96,97]).
Holper et al. [53] also showed that both risk-seeking and risk-averse individuals exhibit larger skin conductance responses for high-risk options than for low-risk options, suggesting that electrodermal activity is insensitive to risk attitude and hence codes objective risk. According to these authors, electrodermal activity might reflect signals from brain regions such as the middle cingulate cortex and the posterior parts of the lateral orbitofrontal cortex (see also [19,26]). The evidence that objective and subjective risks are processed by distinct brain mechanisms may explain why, at least in humans, subjective perceptions may sometimes trump objective risk with respect to survival (e.g., in some religious orders, monks take a vow of poverty to earn a place in the afterlife). However, I argue that the arguments proposed here are not called into question. Most studies reported in this paper are not trying to assess subjective risk. Defining risk in terms of reward unpredictability when probabilities are known (without any further details) is assumed to reflect objective risk, because this parameter is the same for each individual. For example, the expression 'risk-averse' often means that a group of individuals avoids the uncertain option as objectively fixed in the task, irrespective of what this represents from each individual's point of view. Thus, the suggestion that resource-independent risk may be an inappropriate model of resource-dependent risk (as discussed in this paper) is unrelated to the distinction existing between objective risk and subjective risk.

4.3. Risk, uncertainty, and dopamine

One of the dominant theories of midbrain dopamine function holds that phasic dopamine signals code the prediction error of reward delivery when the time of reward occurrence differs from that of its expectation [88]. Phasic dopamine release is indeed recorded after the appearance of an unexpected reward, or that of a CS cueing reward, while a depression in dopamine activity is observed when reward is not delivered at the time predicted by the CS. In addition to phasic signals, dopamine is also known to show sustained activation during the time interval between CS onset and reward delivery. Sustained activation of dopamine neurons is most pronounced when the uncertainty of a two-outcome event is maximal (P = 50%) and gradually decreases as the event's uncertainty decreases—it becomes indistinguishable from baseline activity at P = 0% and at P = 100% [35,39]. Other studies have reported this propensity of dopamine signals to reflect reward uncertainty, although they did not aim to separate phasic and sustained activation [32,69,83]. There is evidence that dopamine levels in the nucleus accumbens affect the decisions of animals and human individuals to engage in risky activities, which may contribute to the development of pathological behaviours, such as drug addiction and problem gambling [39,41]. Sustained dopamine is sometimes interpreted as a risk signal in the brain [88]. The hypothesis that dopamine release—recorded using traditional experimental paradigms—is a risk signal indicates that it relies on the standard definition of risk: '[. . .] the position is consistent with standard financial decision theory' ([82], pp. 143–144). Without denying the provocative contribution of electrophysiological data, the conclusion that dopamine is a risk prediction signal may be premature. As in the case of behavioural studies, the fact that dopamine reflects reward uncertainty does not mean that we have to presuppose anything about risk. In most (if not all) reports, reward uncertainty is shown to boost dopamine efflux in situations where the individuals' own resources are not at stake (see also [57,92,103]). In contrast, there is now strong evidence that dopamine release depends on the attractiveness of a reward source [12,13]. While dopamine agonists boost reward seeking, dopamine antagonists have the reverse effect [12]. Also, training animals under satiety and testing them under hunger (an operation that raises dopamine levels) instantly enhances the behaviours directed to a CS, despite the fact that the learned predictive value of the CS has remained unchanged [13].
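For reference, the phasic and sustained signals discussed above are usually formalised as follows; this is the standard notation of reward-prediction-error models, not something specific to the present argument. The phasic signal is the temporal-difference error

\[
\delta_t = r_t + \gamma V(s_{t+1}) - V(s_t),
\]

positive for an unexpected reward and negative when reward is omitted at the expected time, while the sustained signal tracks the uncertainty of a two-outcome event, commonly quantified by the variance or entropy of a Bernoulli outcome,

\[
\operatorname{Var}(p) = p(1-p), \qquad H(p) = -p\log_{2}p - (1-p)\log_{2}(1-p),
\]

both of which are maximal at P = 50% and vanish at P = 0% and P = 100%; this is exactly the profile reported for sustained dopamine activation [35,39]. Nothing in these formulas refers to the decision-maker's own resources, which is the point at issue here.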
The neurobiology of sign- and goal-trackers provides additional evidence that dopamine controls the incentive property of CSs rather than their learned predictive value. Animals seem to have genetic predispositions that orient them towards sign-tracking (a preference to inspect the CS over the food dish) or goal-tracking (a preference to inspect the food dish over the CS) in autoshaping [72]. In particular, sign-trackers release more dopamine in their nucleus accumbens than goal-trackers during training [41,42]. In sign-trackers, there is a decrease in dopamine release (measured by means of fast-scan cyclic voltammetry) in response to repeated reward delivery and an increase in dopamine release in response to the CS, as if dopamine were a prediction error signal [42]. But this does not occur in goal-trackers [42]. To say that dopamine is a prediction error signal in sign-trackers is problematic, since sign- and goal-trackers are known to learn the predictive value of the CS to the same extent; the progression of response rates (time latency, number of contacts, etc.) to their respective CS (lever or dish) is comparable in both groups over training [72]. Of course, it is possible that prediction error works only for some forms of learning, but Pavlovian learning does not seem to be one of them [13]. For example, dopamine-deficient (DD) mice can learn Pavlovian associations as well as healthy mice, despite the lack of dopamine in their brain [20]. In fact, sign- and goal-trackers both learn the predictive value of the CS; they only differ with respect to the location (sign or goal) to which the conditioned response is directed [72]. The decision of sign-trackers to attribute incentive salience to the CS lever seems to be related to higher dopamine levels in their nucleus accumbens, while goal-trackers develop a more cognitive strategy that consists of using the CS as an informational stimulus, unattractive in itself [41,42]. At least, this pattern of results is observed in non-hungry and pharmacologically non-stimulated individuals. However, increased dopamine release following stimulation of μ-opioid receptors in the central nucleus of the amygdala generates comparable elevations of incentive salience attribution to the CS lever in sign-trackers and to the CS dish in goal-trackers [34]. These results provide a picture of dopamine's role that is radically different from that of prediction error and risk. Interestingly, we have shown for the first time that reward uncertainty produces more sign-trackers and fewer goal-trackers than reward certainty (M.J.F. Robinson, Anselme, Suchomel, and Berridge, under review). This suggests that animals exposed to reward uncertainty are motivationally aroused by the possibility of nonreward rather than taking risk.

It might be argued that none of the examples considered in this paper provides evidence that reward-omission and resource-reduction models describe the effects of qualitatively different brain processes, and hence that unpredictability and risk could simply be different quantitative effects stemming from a single brain mechanism. I recognise that this is a possibility. But solving this question requires pharmacological and/or neuroimaging studies that compare situations described by these two models; studies based only on the reward-omission model are insufficient to conclude anything about the neuronal correlates of resource-reduction risk. In fact, even if unpredictability and risk appeared to be similar in terms of neuronal mechanisms, the main issue discussed in this paper would persist: since the reduction in own limited resources seems to affect behavioural choices (e.g. [28,79,91]), can we really use reward-omission models in order to predict what should happen in situations involving a reduction in resources? In several experimental contexts (notably those involving delay and ratio schedules), animals show a preference for an unpredictable option over a predictable option [59,102]. Would this still be observed if reward omission were accompanied by punishing events such as mild electric shocks?

5. Conclusion

The goal of this paper was to suggest that many animal and human studies of risk-taking behaviour represent loss as an absence of optimal/desired gain and are, therefore, more likely to investigate the effects of uncertainty than those of true risk. Reward variability can only be associated with risk under some conditions, that is, when an individual's own resources are limited and imperilled. As shown, these conditions are almost never satisfied in experiments, jeopardising the conclusions that can be drawn with respect to the psychological and neuronal processes underlying real-life risk-taking behaviour. Nevertheless, providing realistic laboratory models of risk is possible and should be encouraged in order to determine how organisms process information in situations in which own resources are engaged, as opposed to comparable situations in which they are not.

References [1] Adriani W, Laviola G. Delay aversion but preference for large and rare rewards in two choice tasks: implications for the measurement of self-control parameters. BMC Neurosci 2006;7:52. [2] Adriani W, Boyer F, Leo D, Canese R, Podo F, Perrone-Capano C, et al. Social withdrawal and gambling-like profile after lentiviral manipulation of DAT expression in the rat accumbens. Internat J Neuropsychopharmacol 2010;13:1329–42. [3] Ahearn W, Hineline PN, David FD. Relative preference for various bivalued ratio schedules. Anim Learn Behav 1992;20:407–15. [4] Amsel A. Frustration theory. Cambridge: Cambridge University Press; 1992. [5] Anselme P. The uncertainty processing theory of motivation. Behav Brain Res 2010;208:291–310. [6] Anselme P. Loss in risk-taking: absence of optimal gain or reduction in one’s own resources? Behav Brain Res 2012;229:443–6. [7] Anselme P, Robinson MJF, Berridge KC. Reward uncertainty enhances incentive salience attribution as sign-tracking. Behav Brain Res 2013;238:53–61. [8] Ariyabuddhiphongs V. Lottery gambling: a review. J Gambling Stud 2011;27:15–33. [9] Balci F, Freestone D, Gallistel CR. Risk assessment in man and mouse. Proc Natl Acad Sci U S A 2009;106:2459–63. [10] Bateson M, Kacelnik A. Starlings’ preferences for predictable and unpredictable delays to food. Anim Behav 1997;53:1129–42. [11] Bauman R. An experimental analysis of the cost of food in a closed economy. J Exp Anal Behav 1991;56:33–50. [12] Berridge KC. Motivation concepts in behavioral neuroscience. Physiol Behav 2004;81:179–209. [13] Berridge KC. From prediction error to incentive salience: mesolimbic computation of reward motivation. Eur J Neurosci 2012;35:1124–43. [14] Bickel WK, Giordano LA, Badger GJ. Risk-sensitive foraging theory elucidates risky choices made by heroin addicts. Addiction 2004;99:855–61. [15] Boakes RA. Performance on learning to associate a stimulus with positive reinforcement. In: Davis H, Hurvitz HMB, editors. Operant Pavlovian interactions. Hillsdale, NJ: Erlbaum Associates; 1977. p. 67–97. [16] Brito e Abreu F, Kacelnik A. Energy budgets and risk-sensitive foraging in starlings. Behav Ecol 1998;10:338–45. [17] Brodie III ED. Recovery of garter snakes (Thamnophis sirtalis) from the effects of tetrodotoxin. J Herpetol 2002;36:95–8. [18] Brodie III ED, Brodie Jr ED. Tetrodotoxin resistance in garter snakes: an evolutionary response of predators to dangerous prey. Evolution 1990;44:651–9. [19] Burke CJ, Tobler PN. Reward skewness coding in the insula independent of probability and loss. J Neurophysiol 2011;106:2415–22. [20] Cannon CM, Palmiter RD. Reward without dopamine. J Neurosci 2003;23:10827–31. [21] Caraco T, Blanckenhorn WU, Gregory GM, Newman JA, Recer GM, Zwicker SM. Risk sensitivity: ambient temperature affects foraging choice. Anim Behav 1990;39:338–45. [22] Cardinal RN. Neural systems implicated in delayed and probabilistic reinforcement. Neural Networks 2006;19:1277–301. [23] Cardinal RN, Howes NJ. Effects of lesions of the nucleus accumbens core on choice between small certain rewards and large uncertain rewards in rats. BMC Neurosci 2005;6:37. [24] Castren H, Algers B, Jensen P. Occurrence of unsuccessful sucklings in newborn piglets in a semi-natural environment. Appl Anim Behav Sci 1989;23:61–73. [25] Charnov EL. Optimal foraging, marginal value theorem. Theoret Popul Biol 1976;9:129–36. [26] Christopoulos GI, Tobler PN, Bossaerts P, Dolan RJ, Schultz W. 
Neural correlates of value, risk, and risk aversion contributing to decision making under risk. J Neurosci 2009;29:12574–83. [27] Collins L, Young DB, Davies K, Pearce JM. The influence of partial reinforcement on serial autoshaping with pigeons. Q J Exp Psychol 1983;35B:275–90. [28] Cooper A, Barnea-Ygael N, Levy D, Shaham Y, Zangen A. A conflict rat model of cue-induced relapse to cocaine seeking. Psychopharmacology 2007;194:117–25. [29] Critchley HD, Mathias CJ, Dolan RJ. Neural activity in the human brain relating to uncertainty and arousal during anticipation. Neuron 2001;29:537–45. [30] Crombag HS, Shaham Y. Renewal of drug seeking by contextual cues after prolonged extinction in rats. Behav Neurosci 2002;116:169–73. [31] Dawkins R. The extended phenotype. Oxford: Oxford University Press; 1982. [32] de Lafuente V, Romo R. Dopamine neurons code subjective sensory experience and uncertainty of perceptual decisions. Proc Natl Acad Sci U S A 2011;108:19767–71. [33] Deroche-Gamonet V, Belin D, Piazza PV. Evidence for addiction-like behavior in the rat. Science 2004;305:1014–7.

