TICS-1384; No. of Pages 8

Opinion

Do humans make good decisions?

Christopher Summerfield and Konstantinos Tsetsos
Department of Experimental Psychology, University of Oxford, South Parks Road, OX1 3UD, Oxford, UK

Human performance on perceptual classification tasks approaches that of an ideal observer, but economic decisions are often inconsistent and intransitive, with preferences reversing according to the local context. We discuss the view that suboptimal choices may result from the efficient coding of decision-relevant information, a strategy that allows expected inputs to be processed with higher gain than unexpected inputs. Efficient coding leads to 'robust' decisions that depart from optimality but maximise the information transmitted by a limited-capacity system in a rapidly changing world. We review recent work showing that when perceptual environments are variable or volatile, perceptual decisions exhibit the same suboptimal context-dependence as economic choices, and we propose a general computational framework that accounts for findings across the two domains.

Good or bad decisions?

Consider a footballer deciding whether to angle a penalty shot into the left or right corner of the goal, a medical practitioner diagnosing a chest pain of mysterious origin, or a politician deliberating over whether or not to take the country to war. All of these decisions, however different in scope and seriousness, require information to be collected, evaluated, and combined before commitment is made to a course of action. To date, however, the biological and computational mechanisms by which we make decisions have been hard to pin down. One major stumbling block is that it remains unclear whether human choices are optimised to account for prior beliefs and uncertainty in the environment, or whether humans are fundamentally biased and irrational. In other words, do we make good decisions, or not?

Optimal perceptual integration

Although behavioural scientists continue to debate what might constitute a good decision (Box 1), optimal behaviour is usually limited only by the level of noise (or uncertainty) in the environment.
Corresponding author: Summerfield, C. ([email protected]).
Keywords: perceptual decision-making; neuroeconomics; optimality; information integration; gain control; efficient coding.
1364-6613/ © 2014 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.tics.2014.11.005

Consider a doctor diagnosing a patient who is experiencing vice-like chest pains. Optimal binary decisions are determined by the likelihood ratio, that is, the relative probability of the evidence (chest pains) given one hypothesis (incipient cardiac arrest) or another (gastric reflux). When decision-relevant information arises from

multiple sources, evidence must be combined to make the best decision. For binary choices, decisions are optimised by sequential summation of the log-likelihood ratio, expressing the relative likelihood that information was drawn from one category or the other [1,2]. An observer who views a sequence of probabilistic cues ('samples') before making one of two responses (e.g., the shapes in Figure 1A) will make the best decision by considering all samples equivalently, that is, in proportion to the evidence they convey. In other words, the subjective weight of evidence (w.o.e.) for each sample will depend linearly on the weight assigned by the experimenter. Well-trained monkeys and humans seem to weigh information optimally in this way (Figure 1B) [3,4]. When some cues are more trustworthy than others, the best decisions are made by adding up information weighted by its reliability. Again, observers seem to do just this – giving less credence to features or modalities that the experimenter has corrupted with noise, for example, when combining information across different senses [5–7]. Thus, the footballer alluded to in the opening sentence will most likely strike the ball with great precision towards a spot that is just out of the goalkeeper's reach – factoring in uncertainty due to the weather conditions and his or her own bodily fatigue.

Irrational economic decisions

By contrast, humans choosing among economic alternatives are often unduly swayed by irrelevant contextual factors, leading to inconsistent or intransitive decisions that fail to maximise potential reward [8]. For example, consumers will bid higher for a house whose initial price tag has been deliberately inflated, as if their valuation is 'anchored' by the range of possible market prices [9]. Similarly, human participants undervalued risky prospects when most offers consisted of low-value gambles but overvalued the same prospects when the majority of offers consisted of high-value gambles [10].
Anchoring by an irrelevant, low-value third alternative can also disrupt choices between two higher-valued prospects, leading to systematic reversals of preference [11]. For example, the probability that a hungry participant will choose a preferred snack item A over another item B (rather than vice versa) is often reduced in the presence of a yet more inferior option C, in particular when C resembles B in value [12] (Figure 1C). Irrational economic behaviour can arise if the brain computes and represents stimulus value relative to the context provided by other previously [13] or currently available options [14,15]. Changing the context of a decision by adding other alternatives (even if they are rapidly

Box 1. What makes a good decision?

Behavioural scientists have often disagreed about what constitutes a good decision. For example, in the absence of overt financial incentives, experimental psychologists usually define good decisions as those that elicit 'correct' feedback, given the predetermined structure of their task. In experiments where stimuli are noisy or outcomes are uncertain, a theoretical upper limit can be placed on performance by estimating how an 'ideal' agent would perform – one who is most likely to be right, given the levels of uncertainty in the stimulus. However, it is not always clear whether humans have the same motives, or are imbued with the same preconceptions, as an ideal observer [50]. Humans often make erroneous assumptions about the nature of the task they are performing: in a decision-making task, for example, if participants believe that the values of prospects are changeable when really they are stationary, they will learn 'superstitiously' from recent outcomes [51]. Moreover, although many human volunteers are strongly motivated to maximise feedback, others might instead try to minimise the time spent performing a boring or disagreeable task.

Behavioural economists argue that good decisions maximise expected utility over the short or long term [52]. Behavioural ecologists, however, emphasise that organisms need to maximise their fitness in order to survive and reproduce [53]. Often these definitions align, but sometimes they diverge. For example, temporal discount functions that overweigh short-term reward might deter an animal from an investment strategy that maximises long-term income; yet if current resources are insufficient to survive over the near term, steep temporal discounting may be the best insurance against an untimely end [54]. The subjective nature of utility has led theorists to define a series of axioms that are necessary (but not sufficient) for optimal decisions [55].
These rational axioms guarantee that an agent's preferences – as revealed by overt choices – are internally coherent and consistent with a stable, context-independent utility function [56]. Because human behaviour is at systematic odds with the rational axioms [14,44], psychologists have used non-normative frameworks to describe it [49,57]. These approaches, despite their descriptive adequacy, have failed to explain why choice processes are irrational. By contrast, the efficient coding hypothesis and analogous frameworks, which consider the computational costs the brain faces while making decisions, promise to offer a normatively motivated account of irrationalities in human choice.
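The 'ideal agent' ceiling described above can be made concrete for the simplest case: two equiprobable stimulus categories corrupted by Gaussian noise of known width. A minimal sketch, in which the category means and noise levels are illustrative values rather than parameters from any cited study:

```python
import math

def ideal_observer_accuracy(mu_a: float, mu_b: float, sigma: float) -> float:
    """Best achievable accuracy for two equiprobable Gaussian categories.

    The ideal observer places its criterion midway between the category
    means; accuracy is then Phi(d'/2), with d' = |mu_a - mu_b| / sigma.
    """
    d_prime = abs(mu_a - mu_b) / sigma
    # Standard normal CDF expressed via the error function
    return 0.5 * (1.0 + math.erf((d_prime / 2.0) / math.sqrt(2.0)))

# The ceiling on performance drops as stimulus noise grows
print(ideal_observer_accuracy(1.0, -1.0, sigma=1.0))
print(ideal_observer_accuracy(1.0, -1.0, sigma=4.0))
```

No real observer can exceed this bound; comparing human accuracy against it is what licenses calling behaviour 'suboptimal'.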

withdrawn, or so inferior as to be irrelevant) can alter the way that choice-relevant options A and B are valued, biasing the choice between them. For example, the pattern of data described in Figure 1C can be explained by a simple computational model in which the value of each alternative is normalised by the total value of all those available. As the value of C grows, the distributions representing noisy value estimates of A and B exhibit more overlap, increasing the probability that the inferior option B is mistakenly chosen over A (although in other circumstances, increasing the value of C may lead B to be selected less often, and rival models have been proposed to account for this alternative finding [16,17]). Indeed, single-cell recordings from macaques making choices among juice or food rewards suggest that the neural encoding of value in the parietal and orbitofrontal cortices is scaled by the context provided by the range of possible options [18]. When one measures the scaling factor (or 'gain') that quantifies how subjective value (e.g., drops of juice) is mapped onto neuronal activity (e.g., spike rates) in these brain regions, it is found to vary according to the range of values on offer. This ensures that average firing rates remain in a roughly constant range across blocks or even trials with variable offers [19–21].
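The divisive-normalisation account can be illustrated with a toy simulation. The values, noise level, and function below are illustrative assumptions, not the parameters of the model in [14,15]: each option's value is divided by the summed value of all options on offer, Gaussian noise is added, and the A-versus-B comparison is read out.

```python
import random

def p_choose_a(v_a: float, v_b: float, v_c: float,
               noise: float = 0.05, n_trials: int = 20000) -> float:
    """Probability of choosing A over B under divisive normalisation.

    Each option's subjective value is its raw value divided by the total
    value on offer, plus Gaussian noise; A wins when its noisy normalised
    value exceeds B's.
    """
    rng = random.Random(1)  # fixed seed for reproducibility
    total = v_a + v_b + v_c
    wins = 0
    for _ in range(n_trials):
        noisy_a = v_a / total + rng.gauss(0.0, noise)
        noisy_b = v_b / total + rng.gauss(0.0, noise)
        if noisy_a > noisy_b:
            wins += 1
    return wins / n_trials

# Raising the value of the irrelevant option C compresses the normalised
# gap between A and B, so the inferior option B wins more often.
print(p_choose_a(10.0, 8.0, v_c=0.0))   # wide normalised gap
print(p_choose_a(10.0, 8.0, v_c=20.0))  # compressed gap: more B choices
```

The raw values of A and B never change between the two calls; only the context option C does, yet choice consistency falls, which is the signature preference reversal in the text.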

Efficient coding

Why might the brain have evolved to compute value on a relative, rather than an absolute, scale? One compelling answer to this question, first put forward by Horace Barlow [22] and known as the 'efficient coding hypothesis', appeals to the limited capacity of neuronal information-processing systems and their consequent need for efficiency [23,24]. Efficient systems are those that minimise the level of redundancy in a neural code, for example, by transmitting a signal with a minimal number of spikes. Efficiency of neural coding is maximised when sensitivity is greatest to those sensory inputs or features that are most diagnostic for the decision at hand. In some situations, the most diagnostic features will be those that are most likely to occur, given the statistics of the natural environment or the local context. For example, an efficient system will become most sensitive to high-valued stimuli when they are abundant and to low-valued stimuli during times of scarcity [12,18]. Formally, this strategy maximises the information that a neuronal system can encode and transmit, thereby optimising processing demands to match the available resources [25].

By contrast, encoding absolute input values is an inefficient strategy. For example, if a hypothetical neuron were to linearly signal the value of goods with widely disparate economic worth (for example, a cappuccino and a holiday in Hawaii), then only a very limited portion of its dynamic firing range could be devoted to any given value of those alternatives. This would make it very hard for the neuron to signal consistently that the cappuccino was preferred over (say) a cup of tea. Normalisation of neural signals (for example, via lateral inhibition) is one operation that permits efficient information coding in sensory systems [26].
For example, the sensitivity of cells in the early visual system is adjusted over the diurnal cycle, ensuring that neurons with limited dynamic range can continue to encode information even as the strength of ambient illumination varies over many orders of magnitude [27]. When input signals are transformed in this way, choices can vary as a function of the context provided by recent stimulation, potentially leading to preference reversals and other deviations from optimal behaviour. The efficient coding hypothesis thus implies that economic choices are irrational because they are overly susceptible to variation in the local context. By the same token, it could be that perceptual classification judgments tend towards optimality because psychophysical experiments usually involve repeated choices made in a single, unvarying context. This prompts a prediction: perceptual judgments made in variable or volatile environments should deviate from optimality in ways that resemble economic decision-making.

Robust perceptual decision-making

In the task described in Figure 2A, humans are asked to categorise the average colour of eight elements (samples) as 'red' versus 'blue' (or the average shape as 'square' versus 'circle'). This paradigm is similar to the 'weather prediction' task shown in Figure 1A except that the samples arrive all at once and their predictive value is conveyed by a continuously varying visual feature (e.g., the degree of redness

[Figure 1 graphic: (A) task schematic with reward probabilities P(s|R) from 0.9 to 0.1; (B) subjective w.o.e. plotted against assigned weight, for Monkey J and for humans (n = 24); (C) value distributions for low- and high-value options, without and with normalisation.]

Figure 1. Optimal perceptual classification and irrational economic decisions. (A) Schematic depiction of the ‘weather prediction’ task used in [3]. On each trial, four shapes appeared in succession. Each shape was associated with a given probability of reward (red/blue colour bar), conditioned on an eye movement to one of two targets (red and blue dots). Reproduced, with permission, from [3]. In [4], a similar task was used but each shape was replaced on the screen by its successor. (B) Subjective weight of evidence (w.o.e.) associated with each shape for a monkey [3] and the average of 24 humans [4]. Adapted and reproduced, with permission, from [3,4]. In both cases, dots fall on a straight line, suggesting that each sample is weighed in proportion to its objective probability. (C) In [14], participants choose between a preferred snack (e.g., apple) and a dispreferred snack (e.g., orange); their (uncertain) value is represented by red and blue Gaussian distributions. Because neural value signals are normalised by the total outcome associated with all stimuli, the introduction of a yet more inferior option (e.g., carrot; black Gaussian) brings the value estimates of the preferred and dispreferred options closer together, increasing the probability that the dispreferred option (orange) will be chosen.
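The optimal policy for tasks like that of Figure 1A – sequentially summing log-likelihood ratios across samples – can be sketched in a few lines. The shape identities and probabilities below are illustrative assumptions, not the values used in [3,4]:

```python
import math

# Hypothetical P(shape | 'red' target is rewarded); illustrative values only.
P_SHAPE_GIVEN_RED = {"square": 0.8, "circle": 0.6, "star": 0.4, "cross": 0.2}

def log_likelihood_ratio(shape: str) -> float:
    """Evidence carried by one sample in favour of 'red' over 'blue'."""
    p = P_SHAPE_GIVEN_RED[shape]
    return math.log(p / (1.0 - p))

def decide(samples: list) -> str:
    """Optimal binary decision: sum the log-likelihood ratios, take the sign."""
    total = sum(log_likelihood_ratio(s) for s in samples)
    return "red" if total > 0 else "blue"

# 'square' (+log 4) and 'cross' (-log 4) cancel exactly; the two
# 'circle' samples tip the balance towards 'red'.
print(decide(["square", "circle", "cross", "circle"]))  # -> red
```

The linear weighting functions in Figure 1B are exactly what this policy predicts: each sample influences the decision in proportion to its assigned log-likelihood ratio.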

versus blueness; we refer to this as the ‘feature value’). An ideal observer should thus add up equally weighted feature values (which are proportional to the log-likelihood ratios) and compare them to an internal category boundary (in this case, red versus blue or square versus circle), thereby using the same policy as the experimenter to decide which response is correct. However, this is not what humans do [28,29]. The relative weight associated with each feature value can be calculated by using a logistic regression model to predict choices on the basis of the feature values present in each array. The resulting weights form an inverted u-shape over the space of possible features (Figure 2B), indicating that humans give more credence to inlying samples – those that lie close to the category boundary (e.g., purple) – than to outlying samples (e.g., extreme red or blue). Another way of visualising this effect is by plotting these weights multiplied by their corresponding feature values, thereby revealing the psychophysical ‘transfer’ function that transduces inputs into response probabilities. For an ideal observer, this function would be linear. Empirically, however, it is sigmoidal

(Figure 2C). One way of understanding this behaviour is that when feature information is variable or heterogeneous, humans integrate information in a 'robust' fashion, discounting outlying evidence, much as a statistician might wish to eliminate aberrant data points from an analysis. Interestingly, the effect remains after extensive training with fully informative feedback, and equivalent effects are obtained when observers average other features, such as shape, over multiple elements [28,29].

Efficient perceptual classification

From the viewpoint of the researcher, this robust averaging behaviour is suboptimal. However, because the distribution of features viewed over the course of the experiment is Gaussian, inlying features occur more frequently than outlying features. Thus, humans are exhibiting greatest sensitivity to those visual features that are most likely to occur – an efficient coding strategy. The sigmoidal transfer function ensures that for inlying features, a small change in feature information leads to a large change in the probability of one response over another. Interestingly,

[Figure 2 graphic: (A) trial sequence – fixation, stimulus array ('red or blue?'), response, feedback; (B) decision weight plotted against feature value; (C) subjective w.o.e. plotted against feature value; (D) decision weights for the blue/purple and purple/red boundary conditions, over feature values from –0.4 to 0.4.]

Figure 2. Robust averaging of variable feature information. (A) In [28], participants judged the average colour (shown) or shape of an array of eight items, receiving feedback after each response. (B) Decision weights (calculated via logistic regression) associated with item ranks (e.g., sorted from most red to most blue) have an inverted-U profile, indicating that outlying elements (furthest from the category boundary) carried less influence in decisions. (C) Decision weights associated with each portion of feature space, multiplied by that feature value, reveal the subjective weight of evidence (w.o.e.). Note that the shape is different to that in Figure 1B. (D) Decision weights (similar to B) for two experiments in which participants separately judged whether the array was more blue versus more purple (left curves: negative feature values) or more red versus more purple (right curves: positive feature values). Features that are outlying with respect to the relevant category boundary are downweighted. Light and dark grey lines show weights for items drawn from the two respective categories.
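The link between a sigmoidal transfer function and inverted-U decision weights can be illustrated directly: if each sample's feature value x is transduced through a logistic function before being averaged, the marginal influence of a sample is the slope of that function at x, which peaks at the category boundary and falls away for outliers. A minimal sketch, in which the gain parameter is an arbitrary illustrative value:

```python
import math

def transfer(x: float, gain: float = 4.0) -> float:
    """Sigmoidal transduction of a signed feature value (0 = category boundary)."""
    return 1.0 / (1.0 + math.exp(-gain * x))

def marginal_influence(x: float, gain: float = 4.0) -> float:
    """Slope of the transfer function at x: how much a small change in this
    sample would move the decision variable. Uses the analytic derivative
    of the logistic, gain * s * (1 - s)."""
    s = transfer(x, gain)
    return gain * s * (1.0 - s)

# Influence peaks at the boundary and falls off for outliers,
# reproducing the inverted-U weighting profile.
for x in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(f"{x:+.1f} -> {marginal_influence(x):.3f}")
```

For an ideal (linear) observer the influence profile would be flat; the curvature of the sigmoid is what discounts outlying evidence.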

this notion that inputs are transformed in this way suggests an explanation for classical 'magnet' effects that characterise categorical perception, whereby observers are more sensitive to information that lies close to a category boundary [30], and for natural biases in perception, such as heightened sensitivity to contours falling close to the cardinal axes of orientation [31]. Note that these effects arise because information is sampled or encoded in a biased fashion; it may still be read out via Bayesian principles [32].

Efficient coding is an appealing explanation for the downweighting of outliers during perceptual averaging, but an alternative culprit could be nonlinearities in feature space, such as hardwired boundaries in human perception of red and blue hues. One way to rule out this possibility is to systematically vary the range of features over which judgments are made, so that previously inlying elements (e.g., purple during blue/red discrimination) become outlying elements (e.g., during blue/purple or red/purple discrimination). Under this manipulation, those features that fall far from the category boundary are downweighted, irrespective of their physical properties (Figure 2D). In other words, feature values are evaluated differently according to the range of information available, as predicted by the efficient coding hypothesis. Computationally, this finding can be explained if the sigmoidal 'transfer' function linking inputs to outputs migrates across feature

space in such a way that its inflection point remains aligned with the modal feature in the visual environment [28]. In the neuroeconomics literature, a key question pertains to the timescale over which value signals are normalised [18,33]. Perceptual classification has typically been measured in stationary environments, but when category statistics change rapidly over time, participants depart from optimality and instead use a memory-based heuristic that updates category estimates to their last known value, suggesting that rapid updating is at play [34]. Can we, however, measure the timescale over which adaptive gain control occurs during perceptual decision-making?

Rapidly adapting gain control during decision-making

The results of one recent study imply that the gain of processing of decision information can adapt very rapidly – even within the timeframe of a single trial [35]. Participants viewed a stream of eight tilted gratings occurring at 4 Hz and were asked to judge whether, on average, the orientations fell closer to the cardinal (0° or 90°) or diagonal (45° or −45°) axes (Figure 3A). Firstly, this allowed the authors to estimate the impact of each sample position on choice – for example, to ask whether early samples (primacy bias) or late samples (recency bias) were weighted most heavily [36]. Secondly, the authors calculated how the impact of each sample varied according to whether it was

[Figure 3 graphic: (A) category-level averaging task – a stream of eight gratings, each shown for 250 ms, drawn from a cardinal or diagonal category; (B) decision weight plotted against sample position (1–8); (C) P(diagonal) plotted against feature value under low versus high gain.]

Figure 3. Adaptive gain control during sequential integration. (A) Cardinal–diagonal categorisation task. Participants viewed a sequence of eight tilted gratings occurring at 4 Hz. Each sample was associated with a decision update (DU) reflecting whether it was tilted at the cardinal axes (DU = −1), at the diagonal axes (DU = +1), or in between (−1 < DU < +1). Participants received positive feedback for correctly classifying the sum of the DUs as >0 or <0.
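The decision-update scheme in Figure 3A can be sketched as follows. The mapping from orientation to DU used here (−cos 4θ) is one convenient function that satisfies the constraints in the caption – cardinal tilts map to −1, diagonal tilts to +1, intermediate tilts in between – and is an illustrative assumption, not the exact mapping used in [35]:

```python
import math

def decision_update(theta_deg: float) -> float:
    """Map a grating orientation (degrees) to a decision update in [-1, +1].

    Cardinal tilts (0 or 90 deg) give DU = -1; diagonal tilts (+/-45 deg)
    give DU = +1; intermediate tilts fall in between. The -cos(4*theta)
    form is an illustrative choice.
    """
    return -math.cos(math.radians(4.0 * theta_deg))

def classify_stream(orientations: list) -> str:
    """Categorise a stream by the sign of the summed decision updates."""
    total = sum(decision_update(t) for t in orientations)
    return "diagonal" if total > 0 else "cardinal"

# A mostly near-diagonal stream of eight samples
print(classify_stream([40.0, 50.0, -38.0, 85.0, 44.0, -47.0, 10.0, 49.0]))
```

An ideal observer would weight every DU equally; the study's finding is that the effective gain applied to each DU adapts to the recent stream statistics within a single trial.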
