This article was downloaded by: [New York University] On: 10 May 2015, At: 15:42 Publisher: Routledge Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

The Quarterly Journal of Experimental Psychology Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/pqje20

Uncertainty and predictiveness determine attention to cues during human associative learning

Tom Beesley, Katherine P. Nguyen, Daniel Pearson & Mike E. Le Pelley

School of Psychology, University of New South Wales, Sydney, NSW, Australia

Published online: 02 Apr 2015.

To cite this article: Tom Beesley, Katherine P. Nguyen, Daniel Pearson & Mike E. Le Pelley (2015): Uncertainty and predictiveness determine attention to cues during human associative learning, The Quarterly Journal of Experimental Psychology, DOI: 10.1080/17470218.2015.1009919

To link to this article: http://dx.doi.org/10.1080/17470218.2015.1009919


THE QUARTERLY JOURNAL OF EXPERIMENTAL PSYCHOLOGY, 2015 http://dx.doi.org/10.1080/17470218.2015.1009919

Uncertainty and predictiveness determine attention to cues during human associative learning

Tom Beesley, Katherine P. Nguyen, Daniel Pearson, and Mike E. Le Pelley

School of Psychology, University of New South Wales, Sydney, NSW, Australia


(Received 10 July 2014; accepted 5 January 2015)

Prior research has suggested that attention is determined by exploiting what is known about the most valid predictors of outcomes and exploring those stimuli that are associated with the greatest degree of uncertainty about subsequent events. Previous studies of human contingency learning have revealed evidence for one or other of these processes, but differences in the designs and procedures of these studies make it difficult to pinpoint the crucial determinant of whether attentional exploitation or exploration will dominate. Here we present two studies in which we systematically manipulated both the predictiveness of cues and uncertainty regarding the outcomes with which they were associated. This allowed us to demonstrate, for the first time, evidence of both attentional exploration and exploitation within the same experiment. Moreover, while the effect of predictiveness persisted to influence the rate of novel learning about the same cues in a second stage, the effect of uncertainty did not. This suggests that attentional exploration is more sensitive to a change of context than is exploitation. The pattern of data is simulated with a hybrid attentional model.

Keywords: Associative learning; Attention; Associability; Uncertainty; Eye tracking.

Correspondence should be addressed to Tom Beesley, School of Psychology, University of New South Wales, Matthews Building, Kensington Campus, Sydney, NSW, Australia. E-mail: [email protected] The authors would like to thank Oren Griffiths for his comments on an early version of this manuscript. This work was supported by ARC Discovery Project [grant DP140103268]. This research formed the basis of an Honours dissertation by Katherine P. Nguyen at the University of New South Wales. © 2015 The Experimental Psychology Society

Imagine that you are trading on the stock exchange. Over the course of time you may learn that a certain piece of information, such as a company’s yearly revenue, is a reliable predictor of fluctuations in the share price of that company. It would be advantageous, therefore, to pay close attention to this reliable cue and to ignore other, less informative cues, in order to exploit this knowledge. But what happens when there is more uncertainty present in the environment? What if the share price cannot be fully predicted by yearly revenue alone? Here it may be more beneficial to devote attention to other cues that might potentially predict changes in the share price: It makes sense to explore the information provided by other cues, so that perhaps more accurate predictions can be made in the future.

This scenario highlights two characteristic processes that have become central to the debate regarding how attention interacts with associative learning. The first of these processes, which we term attentional exploitation, is exemplified by the model of Mackintosh (1975). According to this model, attention is shaped by differences in the extent to which the cues that are presented on a given trial predict the outcome that occurs on that trial, that is, by the “relative predictiveness” of the presented cues. Specifically, as learning



progresses, attention will increase to those cues that are the best available predictors of outcomes and will decrease to those cues that are less accurate predictors. Intuitively this seems a plausible account of how attention might interact with learning: Environments are full of irrelevant information, and the role of attention is to filter out this unwanted noise and focus cognitive resources so as to respond on the basis of the most predictive source of information available. The second attentional process highlighted by the share price scenario is one that we describe as attentional exploration and is reflected in the attentional model presented by Pearce and Hall (1980; hereafter the Pearce–Hall model). According to this account, attentional resources are preferentially allocated to those cues for which the consequences are currently unknown. Again, there is intuitive appeal in this model. If our aim is to understand the cue–outcome relationships that are present in the environment, then one could argue that it makes little sense to devote attentional resources to cues whose outcomes are already well understood. Instead we should devote processing resources to cues whose consequences are not yet known in order to identify new contingencies in the environment; that is, we should use attention to explore potentially useful new sources of information for predicting future events. What should be clear from the preceding description is that, prima facie, these two accounts contradict one another: The Mackintosh model suggests that attention should be allocated to those cues that are most predictive, while the Pearce–Hall model suggests that attention should be allocated away from such cues. Both theoretical accounts are supported by data from studies of animal conditioning (for reviews, see Le Pelley, 2004; Pearce & Mackintosh, 2010) and human associative learning (see Le Pelley, 2010a). 
In particular, studies in humans that use eye-tracking methods have provided a direct measure of “overt attention” to cues during associative learning tasks (e.g., Beesley & Le Pelley, 2011; Hogarth, Dickinson, Austin, Brown, & Duka, 2008; Kruschke, Kappenman, & Hetrick, 2005; Le Pelley, Beesley, & Griffiths, 2011, 2014; Le Pelley, Mitchell, & Johnson, 2013; Rehder & Hoffman, 2005; Wills, Lavric, Croft, & Hodgson, 2007). For example, in a study by Le Pelley et al. (2011), participants were presented with a compound of two discrete visual cues on each trial and were required to predict which of two different outcome events (different sounds) would occur. Crucially, one of the cues presented on each trial was perfectly predictive of which outcome event would occur, while the other cue was nonpredictive. Participants received immediate corrective feedback on their decisions, allowing them to learn the correct response for each cue compound. Analysis of eye tracking data showed that participants came to spend more time looking at the predictive cue than the nonpredictive cue in each compound. That is, in line with the Mackintosh model, overt attention increased to those cues that were most useful to participants in predicting the correct response and decreased to cues that were not useful in this regard (see also Le Pelley et al., 2013; Rehder & Hoffman, 2005). In other words, the pattern of overt attention was consistent with participants exploiting knowledge of the relationship between predictive cues and outcomes, rather than exploring the value of the nonpredictive cues (whose consequences were unclear).

However, not all studies using eye tracking have provided support for attentional exploitation in human learning. In particular, two experiments reported by Hogarth et al. (2008) instead found evidence of attentional exploration, consistent with the account offered by the Pearce–Hall model. Participants were presented with a compound of two discrete visual stimuli on each trial and had to rate their expectancy that a noise outcome would follow. One compound, AX, was always followed by the noise (AX+); one compound, BX, was followed by the noise on half of the trials (BX±); and a third compound, CX, was never followed by the noise (CX−). Hogarth et al. compared the duration on each trial that participants spent looking at the “unique” cue (A, B, or C) with the time spent looking at the “common” or “background” cue (X). Cues A and C were the most reliable predictors of whether or not the noise would occur, and hence the principle of attentional



exploitation embodied by the Mackintosh model anticipates that these cues would receive greater attention than the common cue X. In contrast, Cue B was nonpredictive regarding the occurrence of the noise, and so on this account should receive little attention. However, Hogarth et al. found essentially the opposite of this pattern. As compared to the common cue X, participants showed relatively little attention to the predictive cues A and C, but significantly greater attention to the nonpredictive cue B. Clearly these data are hard to reconcile with the account put forward by Mackintosh (1975). Instead, the results are in keeping with the idea of attentional exploration embodied by the Pearce–Hall model. That is, participants paid relatively little attention to cues (A and C) occurring on trials for which they could confidently predict the outcome (AX+ and CX−) and greater attention to a cue (B) that appeared on trials for which they were less certain of the outcome (BX±).

The procedures used by Le Pelley et al. (2011) and Hogarth et al. (2008) are very similar; both studies presented participants with compounds of two visual cues and asked them to predict the auditory outcome event that would follow; some of the cues predicted the outcome while others did not. However, the pattern of overt attention revealed in these two studies was very different, with Le Pelley et al. finding evidence for exploitation and Hogarth et al. finding evidence for exploration. What, then, is the critical difference that determines the attentional pattern that will be observed?

We believe that one likely candidate is uncertainty. In the study by Le Pelley et al. (2011), the outcome that occurred on every trial was perfectly predicted by the cues that were presented; that is, once participants had learned the various cue–outcome relationships, they could make a correct prediction on every trial, and hence there was zero uncertainty in this design.
Under these circumstances, when perfect performance is possible, there seems little value in exploring the cues for new sources of information; instead it makes sense to exploit the predictive relationships that are already known. In contrast, in the study by Hogarth et al. (2008), the outcome on BX trials was unpredictable: The noise occurred on a random half of BX trials and did not occur on the others. Under these circumstances of uncertainty, it makes sense to explore the cues for additional pieces of information that may allow more accurate predictions to be made in future. For example, perhaps participants might think that there is some subtle difference in the B cue that is presented on each BX trial that predicts whether or not the outcome will occur, and hence they will attend to this cue in order to try and identify this putative predictive feature. Of course in Hogarth et al.’s study there is no such predictive information on BX trials, but participants are not aware of this and so may continue searching under the belief that more accurate performance is possible.

The suggestion, then, is that there are two critical factors that determine the overall pattern of attention that will be observed. Firstly there is the relative predictiveness of a cue: the extent to which it is a better or worse predictor of subsequent events than are other cues. Secondly, there is uncertainty, which relates to the absolute accuracy with which the presented cues allow outcome predictions to be made. However, no prior study has systematically manipulated both predictiveness and uncertainty in the same experiment in a way that allows us to examine how they interact to determine attention. By manipulating both factors in a human contingency learning procedure, and measuring the resulting impact on overt attention (using eye tracking), we aimed to demonstrate, for the first time, evidence for both attentional exploitation and exploration within a single experiment.
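The opposing predictions of the two accounts can be illustrated with a toy sketch. The update rules below are deliberately simplified forms of the ideas attributed above to Mackintosh (1975) and to Pearce and Hall (1980); they are not the hybrid model reported in this article. The function names, the fixed step size `step`, and the weighting parameter `gamma` are assumptions introduced purely for illustration.

```python
def mackintosh_update(alpha, own_error, other_error, step=0.1):
    """Exploitation (simplified): attention rises for the cue that predicts
    the outcome better than its competitor, and falls for the worse one."""
    if abs(own_error) < abs(other_error):
        return min(1.0, alpha + step)
    if abs(own_error) > abs(other_error):
        return max(0.0, alpha - step)
    return alpha


def pearce_hall_update(alpha, outcome, summed_prediction, gamma=0.5):
    """Exploration (simplified): attention tracks how surprising the
    outcome was, blended with the previous attention value."""
    surprise = abs(outcome - summed_prediction)
    return gamma * surprise + (1.0 - gamma) * alpha


# Pearce-Hall style: a well-predicted outcome lowers attention,
# a surprising outcome raises it.
ph_predicted = pearce_hall_update(0.5, outcome=1.0, summed_prediction=1.0)
ph_surprising = pearce_hall_update(0.5, outcome=1.0, summed_prediction=0.0)

# Mackintosh style: the cue with the smaller prediction error gains attention.
mack_good = mackintosh_update(0.5, own_error=0.1, other_error=0.5)
mack_poor = mackintosh_update(0.5, own_error=0.5, other_error=0.1)

print(ph_predicted, ph_surprising, mack_good, mack_poor)
```

In this sketch the same starting attention value moves in opposite directions depending on which principle is applied, which is the apparent contradiction the experiments set out to examine.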

EXPERIMENT 1

Experiment 1 was based on the learned predictiveness design used by Le Pelley et al. (2011) and is shown in Table 1. Participants were presented with eight different compounds comprising two cues. Each cue compound was paired with one of two different outcomes. Cues A–D could be used to predict which outcome (o1 or o2) would occur on each trial, while cues W–Z were nonpredictive. The critical manipulation in this experiment was the uncertainty with which each outcome occurred.


Table 1. Design of Experiment 1

Compound    P(o1)    P(o2)
AW          1        0
AX          1        0
BW          0        1
BX          0        1
CY          .7       .3
CZ          .7       .3
DY          .3       .7
DZ          .3       .7

Note: P(o1) indicates the proportion of trials on which each compound was paired with Outcome o1, and P(o2) indicates the proportion of trials on which the same compound was paired with Outcome o2. Hence compounds AW, AX, BW, and BX were consistently followed by one particular outcome, while compounds CY, CZ, DY, and DZ were paired with both outcomes (with one outcome occurring more often than the other).

For compounds AW and AX, the outcome was always o1; for compounds BW and BX, the outcome was always o2. Hence for these compounds the outcome was fully predictable; that is, there was no uncertainty. In contrast, for compounds CY and CZ, Outcome o1 occurred on 70% of trials, and o2 occurred on the remaining 30%. Similarly, for DY and DZ, Outcome o2 occurred on 70% of trials, and o1 occurred on the remaining 30%. Hence while the outcome produced by these latter compounds could be predicted with above-chance accuracy, it was not fully predictable (i.e., there was uncertainty attached to the outcome for compounds CY, CZ, DY, and DZ).

To summarize, each compound contained a predictive cue (A, B, C, or D) that provided some information about which outcome would occur, and a nonpredictive cue (W, X, Y, or Z), which provided no information. We refer to this within-compound factor as predictiveness. At the same time, there was a between-compound difference in uncertainty: The outcomes following “certain” compounds (AW, AX, BW, and BX) had zero uncertainty, while the outcomes following “uncertain” compounds (CY, CZ, DY, and DZ) were, to an extent, uncertain. One important point to note here is that predictiveness and uncertainty are not orthogonal variables. For the uncertain compounds, the predictive cues in these compounds (C and D) were necessarily less predictive than their counterparts in the “certain” compounds (A and B). Crucially, though, there was a difference in the relative predictiveness of the two cues in each compound; A, B, C, and D were the most reliable available source of information when they appeared.
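The two factors can be made concrete by computing, from Table 1 alone, how much prediction error an ideal learner would still face. The sketch below is an illustration rather than an analysis from the article: it assumes the learner predicts the true probability of o1, codes the outcome as 1 (o1) or 0 (o2), and treats all compounds as equally frequent. The names `expected_abs_error` and `cue_probability` are hypothetical helpers.

```python
# P(o1) for each compound, copied from Table 1.
P_O1 = {"AW": 1.0, "AX": 1.0, "BW": 0.0, "BX": 0.0,
        "CY": 0.7, "CZ": 0.7, "DY": 0.3, "DZ": 0.3}


def expected_abs_error(p):
    """Expected |outcome - prediction| when the prediction equals p and
    the outcome is 1 with probability p, otherwise 0. Equals 2p(1 - p)."""
    return p * (1.0 - p) + (1.0 - p) * p


def cue_probability(cue):
    """P(o1) given a single cue, averaged over the compounds containing it
    (assumes every compound is presented equally often)."""
    values = [p for compound, p in P_O1.items() if cue in compound]
    return sum(values) / len(values)


# Between-compound factor: residual error for each compound (uncertainty).
compound_error = {c: expected_abs_error(p) for c, p in P_O1.items()}

# Within-compound factor: residual error for each cue taken on its own
# (a lower value means a more predictive cue).
cue_error = {cue: expected_abs_error(cue_probability(cue)) for cue in "ABCDWXYZ"}

print(compound_error)
print(cue_error)
```

Under these assumptions the residual error is 0 for certain compounds and 0.42 for uncertain ones, while the gap between nonpredictive and predictive cues is 0.5 in certain compounds but only about 0.08 in uncertain compounds, which illustrates why the two variables are not orthogonal in this design.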

Method

Participants

Forty-three students from the University of New South Wales participated in exchange for course credit. Participants could also receive a performance-related monetary bonus. For every correct response during training they gained $0.10 AUD, and for every incorrect response they lost $0.10 AUD. Instructions stated that, since training trials were two-alternative forced choice, this meant that participants needed to perform at an above-chance level in order to end the experiment with a positive total. Those ending with a zero or negative total received no bonus. The average bonus received was $2.80.

Apparatus and stimuli

Participants were tested individually in a quiet room with a standard desktop computer and a 58.4 cm widescreen eye tracking monitor (TX300, Tobii Technology, Danderyd, Sweden), which samples eye gaze at 300 Hz. Participants sat with their eyes approximately 55 cm from the monitor, using a chin rest to maintain a fixed position, and wore headphones. The eye tracker was calibrated using a five-point procedure at the start of the experiment. Stimulus presentation was controlled by MATLAB using the Psychophysics Toolbox extensions (Brainard, 1997; Kleiner, Brainard, & Pelli, 2007; Pelli, 1997). Responses were made using the mouse. Cues A to D and W to Z were represented by images of fictitious molecules, which varied in the number of atoms, their colour, and their configuration (Figure 1 shows two examples). Each was shown on a white background of 10.6 cm × 8 cm. Outcomes o1 and o2 were pictures of “mutant creatures”, presented on a white background of 4 cm × 5.3 cm. Assignment of images to specific cue and outcome roles in the design shown in Table 1 was randomized for each participant.

Figure 1. A screenshot of a sample training trial. See text for a description of the stimuli and the task. To view this figure in colour, please visit the online version of this Journal.

Procedure

Participants were instructed that they were to play the role of a scientist who had discovered a new set of chemicals that could create mutant creatures, and that their task on each trial was to predict which mutant would be created when a particular pair of chemicals was mixed with a “goo” substance. Participants were told that the experiment was designed to examine their learning and memory skills and were given details of the monetary reward for good performance.

Each trial began with the presentation of a fixation cross for 1 s in the horizontal centre of the screen and vertically in line with the centre of the two cues. Two chemicals then appeared (arranged horizontally at the top of the screen 16 cm apart), with pictures of two mutants (arranged vertically at the bottom of the screen), representing Outcomes o1 and o2; Figure 1 shows a screenshot. Participants clicked the picture of the mutant that they thought would be created when these chemicals were mixed with the goo and then clicked “OK” to confirm their decision. Immediate feedback was provided. “Correct!” appeared in green or “Incorrect” appeared in red in the centre of the screen, and a green outline appeared around the correct outcome. Incorrect responses were accompanied by a buzz played over headphones. The feedback, chemicals, and mutants remained on screen for 3 s before disappearing. The screen then cleared, and the fixation cross for the next trial was presented. Participants could take as long as they liked to respond, though trials with response times longer than 10 s were not analysed.

Participants experienced 20 blocks, with each of the eight trial types shown in Table 1 occurring once per block. Trial order in each block was randomized with the constraint that the same compound could not occur on consecutive trials (i.e., the last and first trials of adjacent blocks). The two cues in each compound were presented an equal number of times (10) in the left and right cue positions, and the order of these presentations was randomized across the experiment. For the “certain” compounds, the same outcome occurred each time that compound appeared: For compounds AW and AX, the correct response was always Outcome o1, and for BW and BX the correct response was always o2. The “uncertain” compounds were followed by the more common outcome (o1 for CY and CZ; o2 for DY and DZ; see Table 1) on 14 of their 20 appearances and by the less common outcome (o2 for CY and CZ; o1 for DY and DZ) on the remaining six appearances. These common and rare trials occurred in random order, with the constraint that three of the six rare trials occurred in the first half of the experiment and three in the second half. Before the first trial and after every 20 trials a small red dot was presented in the centre of the screen. Participants were asked to fixate this dot and click the centre with the mouse to continue. This allowed us to check and adjust for any drift in the eye-tracking recordings.

Eye gaze data preparation

The gaze data were adjusted to compensate for any drift that occurred in the recording process. These data were then analysed for the decision period from stimulus presentation to when participants clicked the OK button to register their response. For each trial the percentage of missing samples resulting from tracking errors (e.g., due to eye blinks) was calculated, and the data from the eye with the lowest proportion of missing samples were used for that trial. Missing data that spanned a gap of no more than 75 milliseconds were replaced by interpolating between the data immediately preceding and following the gap. Fixations were determined by a displacement method (Salvucci & Goldberg, 2000). The range of values of both the vertical and horizontal coordinates of the gaze data were analysed in 150-ms windows. If neither coordinate deviated beyond a range of 75 pixels, then the analysed window was deemed a fixation. Fixation length was determined by extending this window until a displacement of more than 75 pixels was recorded. Fixation position was determined by the mean horizontal and vertical pixel values across the fixation sample. This method produced, on average, 6.3 fixations per trial on cue stimuli during the decision period in the first epoch of training (see below), decreasing to an average of 4.3 fixations in the final epoch. Total fixation time spent on each cue stimulus during the decision period is hereafter used as the eye gaze dependent measure. Trials without any fixation on cues were not used in the eye gaze analysis. This led to an exclusion of more than 50% of data for five participants, and the data from these participants were not included in the eye-tracking analyses (mean trials excluded for the remaining 38 participants was 8%).
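The windowed fixation procedure described above can be sketched as follows. This is an illustrative reconstruction, not the authors' MATLAB code: the function name `detect_fixations`, the choice to slide the window forward by one sample when no fixation is found, and the assumption that missing samples have already been interpolated are all assumptions. The sample rate, window length, and pixel threshold are taken from the text.

```python
from statistics import mean


def _span(values):
    """Range (max minus min) of a sequence of coordinates."""
    return max(values) - min(values)


def detect_fixations(xs, ys, sample_rate_hz=300, window_ms=150, max_range_px=75):
    """Return fixations found in gaze coordinates xs, ys (pixels).

    A window of `window_ms` counts as a fixation if neither coordinate
    spans more than `max_range_px`; the fixation is then extended sample
    by sample until that range would be exceeded.
    """
    window = round(sample_rate_hz * window_ms / 1000)  # 45 samples at 300 Hz
    fixations = []
    start, n = 0, len(xs)
    while start + window <= n:
        end = start + window
        if _span(xs[start:end]) <= max_range_px and _span(ys[start:end]) <= max_range_px:
            # Grow the fixation while both coordinates stay within range.
            while (end < n
                   and _span(xs[start:end + 1]) <= max_range_px
                   and _span(ys[start:end + 1]) <= max_range_px):
                end += 1
            fixations.append({
                "start_sample": start,
                "end_sample": end,  # exclusive
                "x": mean(xs[start:end]),
                "y": mean(ys[start:end]),
                "duration_ms": (end - start) * 1000 / sample_rate_hz,
            })
            start = end
        else:
            start += 1  # assumed: slide forward by one sample
    return fixations


# Synthetic example: 200 ms at one location, then 200 ms at another.
xs = [100.0] * 60 + [500.0] * 60
ys = [200.0] * 120
fixations = detect_fixations(xs, ys)
print(fixations)
```

Total fixation time on a cue would then be obtained by summing `duration_ms` over the fixations whose mean position falls inside that cue's on-screen region, which is not modelled here.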

Results

In order to determine participants’ expectations for the occurrence of the two outcomes, we calculated the proportion of “probable” outcome responses during training, rather than raw accuracy. A probable outcome response reflects a choice of the outcome most likely to occur given the compound of cues (see Table 1). Figure 2a shows the proportion of probable outcome responses across the 20 blocks: Data have been combined into four epochs (with each epoch representing the averaged data from five consecutive blocks). It is clear from Figure 2a that, unsurprisingly, the proportion of probable outcome responses to certain compounds was higher than that for uncertain compounds. A repeated measures analysis of variance (ANOVA) with factors of uncertainty and epoch revealed a main effect of uncertainty, F(1, 42) = 23.69, ηp² = .36, p < .001, and epoch, F(3, 126) = 16.40, ηp² = .28, p < .001, and a significant interaction, F(3, 126) = 4.02, ηp² = .09, p = .009. While participants were on average less likely to select the more probable outcome for the uncertain compounds (compared to certain compounds), their responses were significantly above the guessing level of .5 in all but the first epoch: for Epoch 1, t(42) = 1.71, Cohen’s d = 0.26, p = .09; for each of Epochs 2 to 4, t(42) > 4.17, d > 0.64, p < .001.

Figure 2b shows the mean fixation time on predictive and nonpredictive cues for certain and uncertain compounds. Participants spent more time fixating cues in uncertain compounds than cues in certain compounds. An ANOVA with factors of predictiveness (predictive vs. nonpredictive), uncertainty (certain vs. uncertain) and epoch found no main effect of predictiveness, F(1, 37) = 1.40, p = .25, but did reveal a main effect of uncertainty, F(1, 37) = 14.61, ηp² = .28, p < .001, indicating that more overt attention was devoted to cues in uncertain compounds than those in certain compounds. The main effect of epoch was significant, F(3, 111) = 19.33, ηp² = .34, p < .001, and this interacted with uncertainty, F(3, 111) = 7.90, ηp² = .18, p < .001, indicating that overt attention decreased more rapidly to cues in certain compounds than to cues in uncertain compounds across training. The factor of predictiveness did not interact with either uncertainty, F(1, 37) = 1.33, p = .26, or epoch, F(3, 111) = 1.91, p = .13, and the three-way interaction was not significant, F < 1.

Figure 2. Data from Experiment 1 plotted across epochs of five blocks. Participants were trained with both certain and uncertain compounds. (a) Proportion of responses to the most probable outcome paired with each compound in Stage 1. (b) Average fixation time to predictive (P) and nonpredictive (NP) cues in Stage 1. (c) Average fixation times as a proportion of the response time. Error bars reflect standard error of the mean, following the removal of any between-subject variability, as described by Cousineau (2005).

The preceding analysis shows that participants spent longer looking at cues in uncertain compounds than at cues in certain compounds. However, interpretation of these data is complicated by the fact that response times were significantly longer on trials featuring uncertain compounds (mean, M = 4.1 s; standard error of the mean, SEM = 0.1 s) than on trials featuring certain compounds (M = 3.6 s, SEM = 0.1 s), t(37) = 3.98, d > 0.64, p < .001. Consequently we might expect longer total fixation times on trials with uncertain compounds purely as a result of the longer exposure period over which these fixations can occur. In order to establish whether the difference in fixation times was entirely accounted for by differences in response times, we normalized the eye gaze data by dividing the time spent fixating each cue on a trial by the time taken to respond on that trial. As can be seen in Figure 2c, when the fixation data are expressed as a proportion of the response time, the difference in fixation times to cues in uncertain and certain compounds remains. Importantly, an ANOVA conducted on these data revealed an identical pattern to the analysis of absolute fixation times: There was no main effect of predictiveness, F(1, 37) = 2.32, p = .14, but there was a significant main effect of uncertainty, F(1, 37) = 16.04, ηp² = .30, p < .001, and a significant main effect of epoch, F(3, 111) = 3.47, ηp² = .09, p = .019. There was a significant interaction between uncertainty and epoch, F(3, 111) = 5.69, ηp² = .13, p = .001, no interaction between uncertainty and predictiveness, F(1, 37) = 3.64, p = .064, no interaction between predictiveness and epoch, F(3, 111) = 2.347, p = .077, and no three-way interaction effect, F < 1.

In order to explore the significant interaction between uncertainty and epoch further, the data were averaged across the factor of cue-predictiveness, and the data from each condition of uncertainty were submitted to a one-way ANOVA on the factor of epoch. This revealed that the proportional dwell time on cues reduced across the course of the experiment for certain compounds, F(3, 111) = 7.81, ηp² = .17, p < .001, but maintained a consistent level for uncertain compounds, F < 1.

Planned comparisons on the proportional fixation time found a significant effect of predictiveness for certain compounds, with greater overt attention to predictive cues over nonpredictive cues, t(37) = 2.61, d = 0.42, p = .026 (Bonferroni corrected for multiple comparisons); the corresponding effect of predictiveness for uncertain compounds was not significant, t < 1.
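The effect sizes reported above can be cross-checked against their test statistics with two standard conversions. The sketch below assumes that partial eta squared was derived from F and its degrees of freedom, and that Cohen's d for the one-sample and paired t tests is t divided by the square root of the sample size; the article does not state which formulas were used, so these are assumptions, and the function names are hypothetical.

```python
import math


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def cohens_d_from_t(t_value, n):
    """Cohen's d for a one-sample or paired t test with n participants."""
    return t_value / math.sqrt(n)


# Main effect of uncertainty on responses: F(1, 42) = 23.69.
eta_uncertainty = partial_eta_squared(23.69, 1, 42)

# Uncertainty x epoch interaction on fixation time: F(3, 111) = 7.90.
eta_interaction = partial_eta_squared(7.90, 3, 111)

# Epoch 1 test against chance: t(42) = 1.71 with 43 participants.
d_epoch1 = cohens_d_from_t(1.71, 43)

# Planned comparison for certain compounds: t(37) = 2.61 with 38 participants.
d_planned = cohens_d_from_t(2.61, 38)

print(round(eta_uncertainty, 2), round(eta_interaction, 2),
      round(d_epoch1, 2), round(d_planned, 2))
```

Under these assumptions the recomputed values round to the figures given in the text (.36, .18, 0.26, and 0.42).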

Discussion In Experiment 1, participants experienced cues that were either predictive or nonpredictive of the outcomes with which they were paired. These cues were presented in compounds that also differed in their overall level of uncertainty; that is, certain compounds were always followed by one outcome (they contained a perfectly predictive cue) while uncertain compounds were paired with their respective outcomes in a probabilistic fashion (they contained a partially predictive cue). Participants’ decisions differed across these different compound types. Participants were consistent in choosing the outcome paired with the certain compounds. However, participants seemed to use a probability matching strategy with respect to the uncertain compounds; the probability of the more likely outcome was .7, and participants on average selected this outcome on 62% of trials in the final epoch of training. This is a suboptimal strategy: Since it is impossible for participants to anticipate the particular trials on which the alternative outcome will be presented, the optimal strategy for all compounds is to select the more probable outcome on every trial (a maximizing strategy) and to tolerate the unpredictable minority of trials

8

on which this response is incorrect. However, the response data suggest that few (if any) participants adopted such a strategy: Only five out of the 43 participants picked the more probable outcome on more than 80% of trials, and only one on more than 90% of trials. Thus, almost every participant adopted a strategy that was suboptimal to achieving the best performance and achieving the greatest financial reward. Of particular interest were the fixation time data, in which we observed marked differences in the overt attention that participants paid to the different cues. Overall, participants spent longer observing cues within uncertain compounds than within certain compounds. This difference was not simply a consequence of the difference in response times between trials with uncertain versus certain compounds, since it persisted even when fixation times were expressed as a proportion of response time on each trial. Thus the data show that participants spent a greater proportion of the trial attending to cues in uncertain compounds than to cues in certain compounds. The implication, then, is that the cues generally had a higher attentional priority when they belonged to uncertain compounds than when they belonged to certain compounds. This between-compound difference in eye gaze is consistent with the pattern of attentional exploration anticipated by the Pearce–Hall model, wherein more attention is paid to stimuli when the consequences of those stimuli are less certain. This exploratory pattern of attention is also in keeping with the probability matching strategy described above. The response data indicate that participants are unwilling to tolerate a probabilistic relationship between cues and outcomes in the task (since otherwise they would show evidence of maximizing). 
Instead the probability matching strategy implies that, under uncertainty, participants are trying to use other (spurious) pieces of information to predict which trials will be followed by the minority outcome. The increased overt attention to cues in uncertain compounds may well reflect this fruitless search for additional predictive information in the cues.
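The cost of probability matching relative to maximizing can be made concrete with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and it assumes that a participant's choice on each trial is independent of the outcome that actually occurs (which follows from the minority-outcome trials being unpredictable).

```python
def expected_accuracy(p_majority: float, p_choose_majority: float) -> float:
    """Expected proportion of correct responses.

    p_majority: probability that the more likely outcome occurs on a trial.
    p_choose_majority: probability that the participant selects that outcome.
    Assumes choices are independent of the outcome on each trial.
    """
    return (p_majority * p_choose_majority
            + (1 - p_majority) * (1 - p_choose_majority))


# Maximizing: always select the more probable outcome.
maximizing = expected_accuracy(0.7, 1.0)
# Strict probability matching: select it on 70% of trials.
matching = expected_accuracy(0.7, 0.7)
# Selection rate observed in the final epoch of Experiment 1 (62%).
observed = expected_accuracy(0.7, 0.62)

print(f"maximizing: {maximizing:.3f}")
print(f"matching:   {matching:.3f}")
print(f"observed:   {observed:.3f}")
```

Under these assumptions, any selection rate below 1.0 for the more probable outcome lowers expected accuracy, which is why matching is described as suboptimal.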


At the within-compound level, there was no overall difference in the overt attention paid to predictive and nonpredictive cues. However, we did observe greater attention to predictive over nonpredictive cues in the certain compounds. This latter finding replicates the bias in eye gaze observed by Le Pelley et al. (2011) using a similar design with certain outcomes. This exploitative pattern of greater attention to those cues that are more diagnostic of the correct response is consistent with the selective attentional processes described by Mackintosh (1975). Thus, taken together, the data of Experiment 1 provide preliminary support for the role of both uncertainty and predictiveness in determining the attentional resources devoted to stimuli during associative learning. More specifically, these findings suggest that attentional exploration (which increases when uncertainty is greater) operates at the level of stimulus compounds, promoting attention to cues that belong to compounds whose consequences are uncertain. In contrast, attentional exploitation (which increases as a function of predictiveness) operates at the level of individual cues, with greater attention to cues that more accurately predict the outcome than to those that are less accurate predictors.

EXPERIMENT 2

In Experiment 1, participants experienced both certain and uncertain compounds simultaneously in a within-subjects manipulation of uncertainty. It is not clear from Experiment 1, however, whether the effect of increased attention to uncertain compounds (which we term attentional exploration) was dependent upon the simultaneous training of a set of certain compounds. That is, it is possible that attention may be devoted to uncertain compounds due to the relatively weak association these compounds have with their respective outcomes in the task, in comparison with the certain compounds. If the relative difference in uncertainty is crucial to the observation of an increase in attention, then we would not expect to observe the same effect in a between-subjects design. Alternatively, if

attentional exploration is determined by the absolute level of uncertainty, then similar effects of uncertainty should be observed when a between-subjects manipulation is used. According to the Pearce and Hall (1980) model of attentional exploration, changes in attention are determined only by the associative strengths of the cues present on the current trial. The model does not assume that comparisons are made between cues occurring on separate trials. Therefore this model anticipates that it is the absolute level of certainty that is the critical determinant of the distribution of attention to cues and hence that the effect of uncertainty should also be observed in a between-subjects manipulation.

Experiment 2 also examined the effect of both predictiveness and uncertainty on the rate at which cues form associations with novel outcomes. Up to this point, we have considered only the influence that learning has on the pattern of overt attention paid to cues. However, attentional theories of learning such as the Mackintosh and Pearce–Hall models go further by specifying an interactive relationship between learning and attention. That is, such theories state not only that (a) learning about the predictiveness of stimuli influences the attention that is paid to those stimuli, but also that (b) these changes in attention influence the rate of subsequent associative learning about the stimuli, also known as the associability of those stimuli. In support of this latter suggestion, many studies in both humans and nonhuman animals have demonstrated an influence of prior learning about the predictiveness of cues on the subsequent associability of those cues (for reviews, see Le Pelley, 2004, 2010a; Pearce & Mackintosh, 2010).
In humans at least, the vast majority of these studies have provided evidence consistent with the exploitative relationship between learning and attention exemplified by the Mackintosh model, with predictive cues having higher associability than nonpredictive cues. Notably, and in line with the thesis advanced in this article, those studies that support the Mackintosh account have typically used training with perfectly predictable outcomes (e.g., Bonardi, Graham, Hall, & Mitchell, 2005; Kruschke, 1996; Le Pelley & McLaren, 2003; Le


Pelley et al., 2010; Le Pelley, Suret, & Beesley, 2009; Lochmann & Wills, 2003; Whitney & White, 1993; but see also Beesley & Le Pelley, 2010). To the best of our knowledge, only one study of associability in humans has provided evidence for the exploratory pattern anticipated by the Pearce–Hall model (Griffiths, Johnson, & Mitchell, 2011), with cues followed by less predictable (i.e., more uncertain) outcomes having higher associability than cues followed by perfectly predictable outcomes. Experiment 1 found evidence for both exploitative and exploratory processes in the context of overt attention (measured using eye gaze). Experiment 2 provided a more rigorous test of the attention–learning interaction at the heart of attentional theories of associative learning, by measuring both sides of this interaction. Thus Experiment 2 examined the influence of prior experience of predictiveness and uncertainty on both overt attention (by measuring eye gaze to the different cues) and associability. As in previous experiments (e.g., Le Pelley & McLaren, 2003), we compared the associability of cues by measuring the rate at which they formed associations with new outcomes in a second stage of training.

Method

Participants
Fifty-three students from the University of New South Wales participated for course credit. Allocation of participants to the certain condition (N = 24) or the uncertain condition (N = 29) was randomly determined. Experiment 2 used the same financial incentive as that in Experiment 1. The average reward earned was $2.00.

Design
Table 2 shows the design of Experiment 2. During Stage 1, participants experienced four compounds, each composed of a predictive and a nonpredictive cue. For participants in the certain condition, each compound was paired with the same outcome each time it was presented. For participants in the uncertain condition, each compound was paired with a particular outcome on two randomly selected presentations from each set of three consecutive presentations and was paired with the alternative outcome on the remaining presentation. This ensured that the “uncertain trials” occurred with fairly regular frequency for participants in this condition.
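The constraint on outcome assignment in the uncertain condition can be sketched as follows. This is an illustrative reconstruction rather than the experiment's actual code: the function name is hypothetical, and the total of 24 presentations per compound assumes that each compound appeared once in each of the 24 Stage 1 blocks.

```python
import random


def uncertain_schedule(n_presentations, majority, minority, rng):
    """Return the outcome for each presentation of one compound.

    Within every set of three consecutive presentations, two randomly
    chosen presentations receive the majority outcome and the remaining
    one receives the minority outcome.
    """
    if n_presentations % 3 != 0:
        raise ValueError("n_presentations must be a multiple of 3")
    schedule = []
    for _ in range(n_presentations // 3):
        triplet = [majority, majority, minority]
        rng.shuffle(triplet)  # random position of the minority outcome
        schedule.extend(triplet)
    return schedule


rng = random.Random(0)  # seeded only to make the example reproducible
schedule = uncertain_schedule(24, "o1", "o2", rng)
print(schedule)
```

Shuffling within triplets, rather than sampling each trial independently, fixes the proportion of majority outcomes at exactly two thirds and prevents long runs of minority-outcome trials.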

Table 2. Design of Experiment 2

                      Stage 1                      Stage 2
Condition   Compound  P(o1)  P(o2)      Compound  P(o3)  P(o4)      Cue test       Compound test
Certain     AW        1      0          AW        1      0          A → o3/o4?     AX → o3/o4?
            AX        1      0          BX        0      1          B → o3/o4?     BW → o3/o4?
            BW        0      1          CY        1      0          W → o3/o4?
            BX        0      1          DZ        0      1          X → o3/o4?
Uncertain   AW        .67    .33        AW        1      0          A → o3/o4?     AX → o3/o4?
            AX        .67    .33        BX        0      1          B → o3/o4?     BW → o3/o4?
            BW        .33    .67        CY        1      0          W → o3/o4?
            BX        .33    .67        DZ        0      1          X → o3/o4?

Note: With regard to Stage 1, P(o1) indicates the proportion of training trials on which each compound was paired with Outcome o1, and P(o2) indicates the proportion of trials on which the same compound was paired with Outcome o2. Hence in the certain condition, compounds AW, AX, BW, and BX were consistently followed by one particular outcome in Stage 1, while in the uncertain condition, these compounds were paired with both outcomes (though one outcome was more common than the other). With regard to Stage 2, P(o3) indicates the proportion of training trials on which each compound was paired with Outcome o3, and P(o4) indicates the proportion of trials on which the same compound was paired with Outcome o4. Hence in both conditions all compounds were consistently followed by one particular outcome in Stage 2. Stage 2 training and test phases were identical for participants in the certain and uncertain conditions.


The aim of Stage 2 was to assess the rate of new learning for cues that had been predictive or nonpredictive in Stage 1. Critical compounds AW and BX each contained a predictive cue and a nonpredictive cue from Stage 1; these compounds were paired with new outcomes o3 and o4, respectively. Importantly, for both groups of participants, there was no outcome uncertainty in Stage 2: AW was always paired with o3, and BX was always paired with o4. This meant that the Stage 2 cue– outcome contingencies were exactly the same in the two groups. Hence by comparing the extent to which participants in the two groups formed associations between these cues and the novel outcomes in Stage 2, we could examine the influence of differences in Stage 1 uncertainty and predictiveness on the associability of the cues. Moreover, during Stage 2 the two cues in each compound were both (objectively) equally predictive of the outcome with which they were paired. For example, Cue A was paired with Outcome o3 exactly the same number of times, and on exactly the same trials, as Cue W was paired with Outcome o3. Hence an unbiased observer would learn to an equal extent that A predicted o3 and that W predicted o3. By extension, any difference in the strength of the A–o3 and W–o3 associations would reflect a difference in the associability of these cues resulting from the difference in their predictiveness in Stage 1. Compounds CY and DZ, each comprising two novel cues, were also trained with Outcomes o3 and o4 in Stage 2. These filler compounds were used to ensure that the demands on working memory during Stage 2 were similar to those in Stage 1, and they are not discussed further. The strength of the associations between critical cues and the Stage 2 outcomes were assessed in two test phases. In the first of these—the cue test—participants rated which outcome (o3 or o4) was more likely to follow each cue individually. 
In the second test phase—the compound test—participants rated which outcome was more likely to follow compounds AX and BW. Within each of these compounds, one cue (A/W) had been paired with Outcome o3 in Stage 2, and the other (B/X) had been paired with Outcome o4. Consider compound

AX. If a stronger association formed between Cue A and Outcome o3 in Stage 2, compared to the association between Cue X and Outcome o4, then participants would rate compound AX as more likely to cause Outcome o3 than Outcome o4. Similarly, ratings for compound BW could be used to determine whether greater learning had occurred for Cue B or Cue W in Stage 2.

Apparatus and stimuli
The apparatus and stimuli were identical to those in Experiment 1, with the following exceptions: Two additional images of mutants were used for Outcomes o3 and o4 in Stage 2; four additional images were used for novel cues presented in Stage 2; chemicals were mixed with a red goo in Stage 2.

Procedure
The procedure for Stage 1 was identical to that for Experiment 1, with the exception that participants experienced only four compound stimuli per block, for 24 blocks. At the end of Stage 1, participants were informed that in the next stage, the same set of chemicals would now be mixed with a red goo, which produced two new types of mutant. Training trials in Stage 2 were the same as those in Stage 1, except (a) a flask of red goo, rather than green goo, appeared on each trial, and (b) two new mutant pictures, representing Outcomes o3 and o4, were presented. Stage 2 comprised six training blocks, with each of the four compounds (see Table 2) appearing once per block in random order, with the constraint that consecutive trials could not present the same compound. Prior to the test phases, participants were instructed that they would be required to make judgments using the knowledge they had acquired in Stage 2. On each trial of the cue test, a single chemical was displayed in the centre of the screen along with pictures of Mutants o3 and o4 on the left and right side of the screen, respectively, and the question “Which mutant do you think this chemical would create?”. Participants made their decision by clicking the appropriate button on a 10-point scale.
The leftmost of these buttons appeared next to the picture of Mutant o3, and


the rightmost next to Mutant o4; each of these endpoints was labelled “Very likely to create this mutant”. The label “Uncertain which mutant will be created” appeared under the buttons in the middle of this scale. Additional instructions were provided before the compound test trials, detailing that the participant was required to make ratings on the basis of the compound of cues. The form of compound test trials was similar to that for cue test trials, except that two cues were presented on each trial. The left/right arrangement of these two cues was randomly determined on each trial of this phase. No feedback was given during the test phases.

Data preparation
Eye gaze data were prepared using the procedure described for Experiment 1. Three participants in the certain condition and one participant in the uncertain condition were excluded from eye gaze analyses due to having more than 50% missing samples. Participants’ choices during the test phase were used to calculate causal judgment scores. For the cue tests with A and W (which were paired with Outcome o3 in Stage 2), choice of the leftmost button (corresponding to “Very likely to create mutant o3”) was scored as +5, and choice of the rightmost button (“Very likely to create mutant o4”) was scored as −5; intermediate buttons were scored as intermediate values (with no zero button). For the cue tests with B and X (which were paired with Outcome o4 in Stage 2), choice of the leftmost button was scored as −5, and choice of the rightmost button was scored as +5. Consequently, for the cue tests, a high causal judgment score corresponds to a “correct” choice that the cue was a strong cause of the outcome with which it had actually been paired in Stage 2, and a negative score corresponds to an “incorrect” choice that the cue was a cause of the outcome with which it had never been paired. Mean scores were then calculated for “previously predictive” cues (i.e., cues that were predictive in Stage 1: A and B) and “previously nonpredictive” cues (W and X).
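The cue-test scoring rule can be expressed as a small function. This is a sketch under stated assumptions: buttons are numbered 1 (leftmost, next to Mutant o3) to 10 (rightmost, next to Mutant o4), and the intermediate buttons are assumed to map onto the integer values +4 to +1 and −1 to −4, which the text implies but does not state explicitly. All names are hypothetical.

```python
# Outcome with which each critical cue was paired in Stage 2 (from Table 2).
STAGE2_OUTCOME = {"A": "o3", "W": "o3", "B": "o4", "X": "o4"}


def o3_signed_rating(button: int) -> int:
    """Map a button (1 = leftmost, 10 = rightmost) onto +5..+1, -1..-5.

    Positive values favour Mutant o3, negative values favour Mutant o4.
    There is no zero button. The integer spacing is an assumption.
    """
    if not 1 <= button <= 10:
        raise ValueError("button must be between 1 and 10")
    return 6 - button if button <= 5 else 5 - button


def cue_score(cue: str, button: int) -> int:
    """Causal judgment score: positive when the rating favours the outcome
    with which the cue was actually paired in Stage 2."""
    signed = o3_signed_rating(button)
    return signed if STAGE2_OUTCOME[cue] == "o3" else -signed


print(cue_score("A", 1))   # leftmost button for a cue paired with o3
print(cue_score("B", 1))   # leftmost button for a cue paired with o4
```

Sign-flipping the ratings for cues paired with o4 places all four cues on a common scale, so that scores can be averaged over the previously predictive and previously nonpredictive cues.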


For the compound test with AX, the leftmost (o3) button was scored as +5 and the rightmost (o4) button as –5; for the compound test with BW this was reversed. Hence for the compound tests, a high causal judgment score corresponds to choice of the outcome with which the previously predictive cue in the compound was paired in Stage 2 (since A was paired with o3, and B was paired with o4), while a negative score corresponds to choice of the outcome with which the previously nonpredictive cue was paired (since X was paired with o4 and W with o3).

Results

Stage 1
Figure 3a plots the proportion of probable outcome responses in Stage 1, for each epoch of 24 trials (with each epoch representing the averaged data from six consecutive blocks). As expected, participants in the certain condition were more consistent in choosing the probable outcome throughout this stage. This was confirmed by a mixed-model ANOVA with factors of epoch and uncertainty. This revealed a main effect of uncertainty, F(1, 51) = 84.08, ηp² = .62, p < .001, a main effect of epoch, F(3, 153) = 12.09, ηp² = .19, p < .001, and a significant interaction, F(3, 153) = 4.31, ηp² = .08, p = .006. In all but the first epoch, the proportion of probable responses in the uncertain condition was significantly above the chance level of .5: for Epoch 1, t(28) = 1.65, p = .11; for each of Epochs 2–4, t(28) > 3.45, d > 0.64, p < .005. Figure 3b shows fixation time (our measure of overt attention) on predictive and nonpredictive cues for both the certain and uncertain conditions during Stage 1. A mixed ANOVA with factors of uncertainty, predictiveness, and epoch revealed a main effect of uncertainty, F(1, 47) = 9.56, ηp² = .17, p = .003, reflecting greater overt attention to cues in the uncertain condition than in the certain condition. There was also a main effect of predictiveness, F(1, 47) = 11.02, ηp² = .19, p = .002, reflecting that at the within-compound level, more attention was allocated to predictive cues than nonpredictive cues. The main effect of epoch was significant, F(3, 141) = 14.98,


Figure 3. Data from Experiment 2. Participants were trained with either certain or uncertain compounds. (a) Proportion of responses to the most probable outcome paired with each compound in Stage 1, plotted across epochs of six blocks. (b) Average fixation time to predictive (P) and nonpredictive (NP) cues in Stage 1, plotted across epochs of six blocks. (c) Average fixation times in Stage 1 as a proportion of the response time. (d) Proportion of correct responses in Stage 2, plotted across epochs of two blocks. (e) Data displayed as columns show absolute fixation times in Stage 2 to cues that had been trained as either predictive or nonpredictive in Stage 1; symbols show the same data expressed as a proportion of the response time. (f) Average test scores for cues trained as predictive or nonpredictive in Stage 1; positive compound test scores reflect greater learning about predictive over nonpredictive cues (see text). Error bars reflect the standard error of the mean.

ηp² = .24, p < .001, and there was an interaction between epoch and uncertainty, F(3, 141) = 6.87, ηp² = .13, p < .001, with overt attention to cues decreasing more rapidly in the certain condition than in the uncertain condition. The interaction between predictiveness and epoch was significant, F(3, 141) = 2.95, ηp² = .06, p = .035, suggesting that the attentional bias towards predictive cues emerged over the course of Stage 1 training. No other interactions were significant, Fs ≤ 2.20, ps ≥ .15.

Figure 3c shows the fixation data expressed as a proportion of the response time on each trial. These data largely mirror the absolute fixation time data, in that there is a clear effect of uncertainty, with a greater proportion of time spent on uncertain cues, as well as an effect of predictiveness, with a greater proportion of time spent on predictive cues. This pattern was confirmed by ANOVA, which revealed main effects of uncertainty, F(1, 47) = 9.47, ηp² = .17, p = .003, and predictiveness, F(1, 48) = 14.18, ηp² = .23, p < .001,


but no main effect of epoch, F < 1. In contrast to the analysis of absolute fixation times, the interaction between uncertainty and predictiveness was significant for the proportional fixation time data, F(1, 47) = 4.60, ηp² = .09, p = .037, with a greater effect of predictiveness in the case of certain compounds than in the case of uncertain compounds. Collapsing across epochs, participants spent a greater proportion of time looking at predictive over nonpredictive cues in certain compounds, t(20) = 3.84, d = 0.84, p = .001, but demonstrated no reliable bias towards either cue in the uncertain compounds, t < 1. There was a significant interaction between epoch and predictiveness, F(3, 141) = 6.25, ηp² = .12, p = .001, with the size of the predictiveness effect increasing across epochs. The interaction between epoch and uncertainty was also significant, F(3, 141) = 6.49, ηp² = .12, p < .001. Specifically, proportional fixation time decreased across epochs for cues in certain compounds (simple effect of epoch for certain condition: F(3, 60) = 5.25, ηp² = .21, p = .003), but increased, albeit only at a trend level of significance, for cues in uncertain compounds (simple effect of epoch for uncertain condition: F(3, 81) = 2.52, ηp² = .09, p = .064). Finally, the Predictiveness × Uncertainty × Epoch interaction was not significant, F < 1.

Stage 2
Figure 3d shows the accuracy of responses across Stage 2 (for epochs of two training blocks) as a function of the level of uncertainty that participants experienced in Stage 1. These data were analysed by ANOVA with factors of epoch and prior-uncertainty, where the “prior” prefix highlights that the compounds differed in the uncertainty with which

they were previously associated in Stage 1 [since all compounds were (objectively) equally certain in Stage 2]. This analysis revealed a significant main effect of epoch, F(2, 102) = 15.10, ηp² = .23, p < .001. The main effect of prior-uncertainty approached significance, F(1, 51) = 3.76, ηp² = .069, p = .058. There was no evidence of an interaction between epoch and prior-uncertainty, F < 1. While the main effect of prior-uncertainty did not reach significance, it is worth noting that by the final epoch of Stage 2, participants in the certain condition were making more accurate responses than participants in the uncertain condition, t(51) = 2.49, d = 0.72, p = .049 (Bonferroni corrected for multiple comparisons across the three epochs). The left side of Figure 3e shows the mean absolute fixation time in Stage 2 for cues that were trained either as predictive or nonpredictive in Stage 1. These data have been collapsed across all Stage 2 trials, rather than combined into epochs as in Figure 3d, because eye gaze data are considerably more noisy than outcome selections. ANOVA with factors of prior-predictiveness and prior-uncertainty found a main effect of prior-predictiveness, F(1, 47) = 6.92, ηp² = .13, p = .011, with greater overt attention during Stage 2 to cues that had been predictive in Stage 1 than to cues that had been nonpredictive. There was no main effect of prior-uncertainty, F < 1, and no interaction, F(1, 47) = 2.53, p = .12. This pattern of analysis was mirrored in the proportional fixation time shown on the right side of Figure 3e: a main effect of prior-predictiveness, F(1, 47) = 7.26, ηp² = .13, p = .010, no effect of uncertainty, F(1, 47) = 1.32, ηp² = .03, p = .26, and no interaction, F(1, 47) = 1.68, ηp² = .04, p = .20.¹

¹ The analysis of Stage 2 gaze data presented in the main text averages over all blocks of Stage 2. This comprises six presentations of each compound, which is equivalent to each of the epochs in the analysis of gaze data from Stage 1 (i.e., each of the four epochs in Figures 3b and 3c also comprises six presentations of each compound). Eye gaze data are inherently rather noisy, and hence we are somewhat wary of drawing strong conclusions from analyses of smaller samples. Nevertheless, at the request of a reviewer we analysed gaze data from the first block of Stage 2 (which contains just one presentation of each of the two compounds). Three additional participants were removed who had missing data on these two trials (e.g., long RTs). The mean dwell times in the certain condition were 790 ms (SEM = 119 ms) and 695 ms (SEM = 70 ms), for the previously predictive and previously nonpredictive cues, respectively. The mean dwell times in the uncertain condition were 919 ms (SEM = 119 ms) and 937 ms (SEM = 150 ms), for the previously predictive and previously nonpredictive cues, respectively. A two-way ANOVA revealed no main effect of prior-predictiveness, F < 1, or uncertainty,


Test
Figure 3f shows causal judgment scores (see Data Preparation section) from the cue and compound tests. ANOVA using the scores from the cue tests, with factors of prior-predictiveness (previously predictive vs. previously nonpredictive) and prior-uncertainty, found no effect of prior-predictiveness, no effect of prior-uncertainty, and no interaction, Fs < 1.8, ps > .19. For the compound tests, while scores were numerically larger in the certain condition than in the uncertain condition, this did not reflect a statistically significant effect, t(51) = 1.79, p = .079. Notably, collapsing across the two certainty conditions, the mean score for compound tests was significantly greater than zero, t(52) = 2.61, d = 0.33, p = .012. As noted in the Data Preparation section, this means that (on average) participants viewed the compounds as stronger causes of the outcome with which the previously predictive cue had been paired in Stage 2 than of the outcome with which the previously nonpredictive cue had been paired. This suggests that there was stronger learning about previously predictive cues (A and B) than previously nonpredictive cues (W and X) in Stage 2. Considering the groups individually, scores for the compound test were significantly greater than zero (indicating this bias in learning) for the certain condition, t(23) = 2.66, d = 0.46, p = .014, but not for the uncertain condition, t < 1.

Discussion

In Stage 1 of Experiment 2, uncertainty was manipulated on a between-subjects basis. Despite this change, the eye gaze data in this stage were similar to those of Experiment 1, with more overt attention paid to cues trained in uncertain compounds (i.e., compounds that were paired with outcomes in a probabilistic fashion). Thus, these data support the view that the absolute level of (un)certainty modulates the attention paid to cue stimuli, with greater uncertainty producing greater attention: an exploratory pattern. At the within-compound level we observed an effect of predictiveness on the attention paid to cues. This effect reached significance only in the certain condition, with participants spending longer looking at predictive cues than nonpredictive cues overall. This exploitative pattern of attention replicates the finding of Experiment 1 and Le Pelley et al. (2011). No significant effect of predictiveness was observed in the uncertain condition. As in Experiment 1, all of these patterns persisted when fixation times were normalized in order to control for the effects of any differences in response time on the different trial types. Experiment 2 also examined the effect of uncertainty and predictiveness on the associability of cues. In Stage 2, participants experienced two compounds that they had been exposed to in Stage 1, each containing one predictive and one nonpredictive cue, and these compounds were paired with novel outcomes o3 and o4. Across epochs, there was a trend-level (p = .058) effect in the direction of participants in the certain condition learning the new cue–outcome relationships in Stage 2 more accurately than participants in the uncertain condition; in the final epoch of Stage 2 this advantage for participants in the certain condition was statistically significant. Thus, while participants were spending proportionally more time attending to cues in the uncertain condition at the end of Stage 1, learning about relationships pertaining to these same stimuli in Stage 2 was, if anything, attenuated relative to cues trained under certainty in Stage 1. We found no evidence of a carryover effect of participants’ previous experience of uncertainty on overt attention during Stage 2: The pattern of eye gaze to cues in Stage 2 was equivalent across the two conditions. However, at the within-compound level there was evidence for an influence of prior-predictiveness on attention.
Specifically, cues previously experienced as the best available predictors of outcomes were attended more in Stage 2 than cues previously experienced as nonpredictive. This replicates our prior finding that predictiveness-

F(1, 44) = 1.29, p = .26, and no interaction, F < 1. As noted above, however, the small sample of trials involved in this analysis means that these null effects should be treated with caution.


based biases in attention persist during subsequent learning episodes involving new cue–outcome relationships (Le Pelley et al., 2011). Notably, given that all cues were (statistically) equally predictive of the outcomes with which they were paired in Stage 2, this pattern of greater attention to previously predictive cues constitutes a bias in participants’ behaviour. The pattern of data from the final causal judgment tests also suggests that learning was influenced by the previous experience of predictiveness but not by the previous experience of uncertainty. In particular, when a compound test was used, participants were more inclined to select the outcome that had been paired with the predictive cue in Stage 2 over the outcome that had been paired with the nonpredictive cue. This implies that participants had learned more rapidly about the previously predictive cue in Stage 2. The numerical trends in the data from the cue tests also supported this suggestion, but did not reach statistical significance. This influence of prior learning about predictiveness on associability replicates many previous demonstrations of better learning about cues trained as predictive over those trained as nonpredictive (e.g., Le Pelley et al., 2011; Le Pelley & McLaren, 2003). The finding that uncertainty has an immediate impact on attention to compounds, but does not persist to increase attention to (or the associability of) cues during subsequent learning, is also supported by data from Trick, Hogarth, and Duka (2011). In a similar design to Hogarth et al. (2008; see introduction) participants were trained with three compounds that were paired in different relationships with an aversive noise: AX+, BX+, CX−. During this phase, Trick et al. (2011) found a larger bias in attention to Cue B over X than that for Cue A (there was no difference in the bias for B and C), suggesting evidence for an increase in exploratory attention brought about by the greater uncertainty on BX+ trials. 
Participants were then trained to make an instrumental response to avoid presentation of the aversive noise. In a subsequent training phase, participants experienced the same three compounds in the presence of the available avoidance response.


It was found that the previous bias in attention to Cue B over X did not transfer to this new phase, while the bias in attention to Cues A and C did transfer reliably. Thus, Trick et al.’s (2011) results concur with the data from Experiment 2 in suggesting that an increase in attention to cues, arising from conditions of uncertainty, reflects a transitory bias in attention, while the bias in selective attention to predictive cues appears to have a persistent effect (as seen primarily in the data from the certain condition). To summarize the findings of Experiment 2, increasing the uncertainty of outcome pairings led to much greater attention being paid to cues overall in Stage 1. However, the influence of greater uncertainty during Stage 1 did not persist to affect attention to cues, or the associability of those cues, in Stage 2. In contrast, cues experienced as predictive in Stage 1 were allocated more attention than those experienced as nonpredictive, and this bias persisted into Stage 2, where it was accompanied by a corresponding difference in the associability of the cues involved.

GENERAL DISCUSSION

Two experiments examined the impact of uncertainty and predictiveness on overt attention to cues in the context of contingency learning. In both experiments, as participants learnt about the most likely outcomes that followed each compound, the pattern of eye gaze to cues changed considerably. Interestingly, the proportion of time spent attending to cues on each trial was quite different depending on whether the compounds were consistently (certain condition) or probabilistically (uncertain condition) paired with the two outcomes. In particular, across the course of training, the proportion of time spent attending to cues in certain compounds decreased, while the proportion of time spent attending to cues in uncertain compounds was maintained at a relatively high level. The finding of greater attention to cues under uncertainty reflects an “exploratory” pattern of attention. The suggestion is that, when participants are unable to predict the outcome of a trial

THE QUARTERLY JOURNAL OF EXPERIMENTAL PSYCHOLOGY, 2015

Downloaded by [New York University] at 15:42 10 May 2015

EXPLORATORY AND EXPLOITATIVE ATTENTION

with sufficient certainty, they will prioritize exploration of the cues in an attempt to identify additional sources of information on which to base their predictions. That is, consistent with the Pearce–Hall model, exploratory attention is an increasing function of the magnitude of prediction error that is associated with a compound of stimuli during training. Thus early in training, before participants have learned the cue–outcome contingencies, and hence prediction errors are large in both conditions, attention to cues is generally high. In the certain condition, as participants learn to predict the correct outcome on each trial, attention to cues falls rapidly. However, in the uncertain condition, residual outcome uncertainty ensures that prediction error remains relatively high even after extensive training, and hence overt attention to cues will remain high as participants continue to search for useful information. In both experiments, uncertainty exerted its effect at a between-compound level; overt attention was greater to all cues that belonged to compounds whose outcomes were uncertain. However, at a within-compound level, the pattern of attention was determined by the relative predictiveness of cues, in a manner consistent with Mackintosh’s (1975) theory: Overt attention was greater to the cue in the compound that was the better predictor of the outcome on the current trial than to the nonpredictive cue. We label this an “exploitative” pattern of attention, in that participants allocate more attentional resources to those stimuli that they have previously learnt are most useful for making accurate predictions. Taken together, we suggest that these patterns in attention represent the functioning of two mechanisms—one exploratory and one exploitative. The exploratory process appears to direct attention to the cues preceding events that cannot yet be predicted with accuracy, in order to seek out potentially useful sources of predictive information. 
The exploitative process then compares the predictive validity of the available cues and biases attention towards those sources of information that are identified as being the most useful for making predictions. Experiment 2 also tested the influence that prior experience of uncertainty and predictiveness had on

the rate of future learning about cues (i.e., their associability), and the attention paid to them, in a second learning phase in which the cues were paired with new outcomes. There was no evidence that the greater overt attention to cues in the uncertain condition that was observed in Stage 1 carried over to increase the attention paid to, or the rate of learning about, these cues in Stage 2. In contrast, the influence of predictiveness did carry over from Stage 1. Data from Stage 2 and the test phase provided evidence of greater attention to, and faster learning about, cues that had previously been experienced as predictive than those previously experienced as nonpredictive. This suggests that predictiveness exerts a relatively general and persistent influence on attention to cues that acts to bias future learning towards those cues that have been experienced as useful predictors in the past. This relationship between predictiveness and associability could also explain the lower response accuracy in the uncertain condition, particularly at the end of Stage 2 (see Figure 3d). Averaging across all cues, predictiveness was lower for the uncertain condition than for the certain condition; hence mean associability would also be lower in the uncertain condition, and so new learning in Stage 2 would be slower. We note, however, that this analysis is confounded with a difference in the applied strategy or motivation of participants in the two conditions. For example, if, as we have previously suggested, participants in the uncertain condition have adopted a probability matching strategy in Stage 1, then these participants may well continue to respond on this basis in Stage 2, resulting in a decrement in performance. Alternatively, it may be the case that many participants in the uncertain condition are demotivated by the end of Stage 1 as a result of the more frustrating task they have faced, in which very high accuracy is impossible to achieve. 
Again, this may result in a slower rate of acquisition in Stage 2. The data from the test phase suggest a change in the associability of cues at the within-compound level, which is consistent with previous research. Several studies have demonstrated that the influence of prior differences in the predictiveness of cues carries over to influence the rate of novel


BEESLEY ET AL.

learning, with faster learning about more predictive cues (e.g., Beesley & Le Pelley, 2010; Bonardi et al., 2005; Le Pelley et al., 2011; Le Pelley & McLaren, 2003; Le Pelley et al., 2010). These studies demonstrate that this influence persists across relatively large changes in context, or the content of learning (in particular, see Bonardi et al., 2005; Le Pelley et al., 2010). In contrast, to the best of our knowledge only one article has reported an influence of uncertainty on learning in humans, with more rapid learning about a cue previously associated with greater uncertainty (Griffiths et al., 2011). Notably, this study did not involve a marked change of context between the critical training phases. The implication, then, is that a sufficiently pronounced change of context acts to “reset” the exploratory attentional process (a suggestion that receives support from studies of analogous processes in animal conditioning, e.g. Channell & Hall, 1983; Swartzentruber & Bouton, 1986). In terms of the current Experiment 2, introducing new outcomes in Stage 2 reintroduced maximal prediction error for all compounds. This may have induced participants to begin their exploratory search process for useful predictive information afresh. However, within each compound of cues, participants still sought to exploit those cues that were known to be the most valid sources of information and hence targeted cues that had previously been useful for making predictions. Future research in humans could directly test this suggestion that transfer of associability based on uncertainty is more sensitive to a change of context than is transfer based on predictiveness, by systematically manipulating the extent of the context change between the initial training stage and the subsequent transfer stage and observing the resulting pattern on the associability of (and overt attention to) cues. 
More generally, while Experiment 2 did not demonstrate an influence of uncertainty on the rate of learning about cues in Stage 2, we are not suggesting that uncertainty (and the exploratory attention that it promotes) never has an influence on learning. Indeed, we argued in the introduction that the role of exploratory attention is to drive a search for further information that can allow for


more accurate predictions to be made in the future. But of course, when an influence of uncertainty was observed in the current studies (in Experiment 1 and Stage 1 of Experiment 2) there was no other such information that would allow more accurate predictions to be made. That is, the search that was driven by exploratory attention was bound to be fruitless in these studies, while the bias resulting from exploitative attention was sufficient to learn about the necessary task contingencies throughout. However, if there were to be additional information that could be discovered by an exploratory process, then the likelihood of detecting—and learning about—this information might be increased under conditions of uncertainty. Future research should assess this possibility.

Simulations

We have argued that the current data suggest the operation of two attentional processes in contingency learning: an exploratory process based on uncertainty, and an exploitative process based on predictiveness. Here we present one version of a computational model that implements this suggestion. The equations used in the instantiation of this model are provided in the Appendix along with the specific parameters used for the presented simulation and the range of parameters that permit similar simulation results. Here it is sufficient to note that the model is derived from Le Pelley’s (2004, 2010b) “hybrid” model of associative learning. This model combines an exploratory attentional process (based on the Pearce–Hall model) with an exploitative mechanism (based on the Mackintosh model), such that overt attention is determined by the product of the exploitative and exploratory attentional parameters.

The data from Stage 2 of Experiment 2 showed an effect of prior predictiveness at the within-compound level, with greater attention to, and learning about, previously predictive over previously nonpredictive cues. However, at the between-compound level, the greater attention to compounds in the uncertain condition at the end of Stage 1 did not appear to afford any benefit to cue processing in Stage 2 (in terms of either attention or rate of


Figure 4. Simulations of Experiment 2 with a version of Le Pelley’s (2004) hybrid model. (a) The “difference in associative strength” (see text) for compounds trained under certain or uncertain conditions in Stage 1. (b) Simulated overt attention to cues in Stage 1. (c) Simulated overt attention to cues in Stage 2. (d) The difference in associative strength for predictive (P) and nonpredictive (NP) cues in Stage 2.

learning). Earlier, we argued that this may reflect the fact that uncertainty-based, exploratory attention is more sensitive to a sudden change in context than is predictiveness-based, exploitative attention. This idea is implemented in the model by resetting all values of exploratory attention to their starting levels when the context shift occurs between training phases. Thus, it is only the parameter reflecting exploitative attention that has a differential effect on the rate of associative learning for the different cues presented in the second phase. Figure 4a plots changes in associative strength over the course of Stage 1 training for the two conditions. Specifically, for each compound, we take the combined strength of associations to the more probable outcome (o1 for compounds AX and AY, o2 for compounds BX and BY; see Table 2) and subtract from it the strength of associations to the less probable outcome (o2 for compounds AX and AY; o1 for BX and BY). These values are then averaged across all compounds for each

condition. Figure 4a therefore plots the extent to which compounds were associated with the probable outcome more strongly than the rare outcome, which relates to participants’ likelihood of making a “probable outcome” response as shown in Figure 3a. The model anticipates that the strength of selective association with the probable outcome will increase over the course of training, and that this increase will be greater in the certain condition than in the uncertain condition; this mirrors the pattern observed empirically, shown in Figure 3a. Since nonpredictive cues were paired equally often with both Outcomes o1 and o2, values above 0 in Figure 4a are driven entirely by selective associative strengths developed by predictive cues (A and B). Selective associations develop rapidly in the certain condition because in this condition each predictive cue was only ever paired with one of the two outcomes. In contrast, in the uncertain condition, predictive cues were paired with one outcome on two out of every


three trials and were paired with the other outcome once. Hence, in the uncertain condition, selective associations with the more probable outcome will be slower to develop. Figure 4b shows simulated values of overt attention (Att) across the course of Stage 1. The pattern of simulated attention closely matches that in Figure 3b, in that there is a clear effect of uncertainty, with attention for certain compounds much lower than attention for uncertain compounds. There is also a clear effect of predictiveness in both types of compound, with higher attention for predictive cues than for nonpredictive cues. In the model simulations there is also an interaction between uncertainty and predictiveness, with the difference between predictive and nonpredictive cues greater in the case of certain compounds. A corresponding pattern was observed numerically in the proportional fixation time data for Stage 1 of Experiment 1 (Figure 2c) and reached statistical significance in Experiment 2 (Figure 3c). Figure 4c shows the simulated attention (Att) to all four types of cue, averaged across the six blocks of Stage 2. The pattern of simulated data reflects quite closely that observed empirically in Experiment 2 (Figure 3d). Overall, Att is greater for previously predictive cues than for previously nonpredictive cues. The model also predicts an interaction between predictiveness and uncertainty, in that the effect of predictiveness is larger in the case of the cues previously trained in certain compounds. It should be noted that whilst the empirical data also showed a similar pattern of interaction at a numerical level, this interaction was not statistically significant. Finally, Figure 4d shows the terminal associative strengths for the four types of cue at the end of Stage 2. 
This figure plots the strength of the association between each cue and the outcome (o3 or o4) with which it was paired in Stage 2 (note that, since each cue was paired with only one outcome in Stage 2, this gives the strength of the selective association with that outcome and hence is compatible with the dependent variable of Figure 4a). The data in Figure 4d thus provide the model’s prediction as to the causal ratings given at test for the individual cues. The model predicts that there will be


an effect of prior-predictiveness on learning in Stage 2, such that cues trained as predictive in Stage 1 will develop stronger selective associations than cues trained as nonpredictive. The model also accurately predicts that this effect will be stronger for cues trained in certain compounds than for those trained in uncertain compounds. The pattern of data therefore is a close match to the empirical test ratings for individual cues (Figure 3e). One discrepancy is the low simulated strength of selective association for nonpredictive cues trained in certain compounds. Recall that these stimuli were trained alongside the predictive cue from certain compounds in Stage 2. Therefore the model predicts a strong competition for associative strength between these two cues in Stage 2, driven by the extreme values of exploitative attention at the end of Stage 1 for the predictive cues (exploitpredictive tends to the upper limit of 1) and nonpredictive cues (exploitnonpredictive tends to the lower limit of .1); see Appendix, Equation 2. The model’s prediction for the compound test scores (Figure 3e) can be inferred by comparing the strength of selective associations for predictive and nonpredictive cues in each condition. The model accurately predicts a positive compound test score in each condition, since the associative strength of predictive cues is higher than that of nonpredictive cues, with this difference being greater in the case of the certain condition. While we have presented one model that can simulate the observed patterns of overt attention and associative learning in our experiments, it is likely that there are other models that would produce similar results. For example, George and Pearce (2012) have developed an alternative hybrid attentional model that is based on Pearce’s (1987, 1994, 2002) configural model. In this approach, input units (representing stimulus elements; in our experiments, these would correspond to the individual cues, A, B, C, etc.) 
are connected to configural units (corresponding to the trained compounds, AX, AY, etc.). The pairing of compounds with outcomes leads to the strengthening of associations between these configural units and outcome units. Within George and Pearce’s model, increases in exploitative attention are


reflected by an increase in the salience of the input units as a result of their connected configural unit accurately predicting the outcome on a given trial. If an active configural unit predicts an incorrect outcome, a decrease in salience will occur for input units connected to that configural unit. Consider the pair of trials AX → 1 and BX → 2. Here the input units for Cues A and B will increase in salience, since they are paired with configural units that predict the correct outcome on each trial. However, Cue X is paired with the configural units for both AX and BX, and as such the input unit X will suffer a net decrease in salience (see George & Pearce, 2012, for details). The model also features an exploratory attention component, implemented through an associability parameter that modulates the rate at which associations form between configural units and output units. As in our model, this exploratory component implements the spirit of the Pearce– Hall model, such that configurations that are followed by surprising events tend to maintain high levels of associability, while associability declines for configurations that are consistently followed by the same events. The framework of George and Pearce’s (2012) hybrid model is rather different from that of Le Pelley’s (2004, 2010b) model (on which the model presented in the Appendix is closely based). Nevertheless, these models generate similar predictions under many circumstances (for example, see George & Pearce, 2012). Consequently, we see no reason why this alternative model should not also be able to account for the results of the current experiments. Like our current model, it would need to incorporate two additional, and critical, assumptions: (a) that overt attention is determined by the product of exploratory and exploitative attentional processes, and (b) that exploratory attention is reset when a change in context is encountered. Assumption (b) would be straightforward to implement. 
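Assumptions (a) and (b) can be illustrated with a minimal Python sketch. This is not the authors' implementation, nor George and Pearce's (2012) model: the class name, the starting level of 1.0 for exploratory attention, and the end-of-Stage-1 values are all hypothetical, chosen only to show that once exploratory attention is reset, differences in overt attention between cues depend on exploitative attention alone.

```python
from dataclasses import dataclass

# Hypothetical starting level for exploratory attention (an assumption).
EXPLORE_START = 1.0


@dataclass
class CueAttention:
    exploit: float                  # predictiveness-based (Mackintosh-like)
    explore: float = EXPLORE_START  # uncertainty-based (Pearce-Hall-like)

    def overt_attention(self) -> float:
        # Assumption (a): overt attention is the product of the two processes.
        return self.exploit * self.explore

    def context_shift(self) -> None:
        # Assumption (b): a change of context resets exploratory attention only.
        self.explore = EXPLORE_START


# Illustrative end-of-Stage-1 values (made up for this sketch).
cues = {
    "certain_P": CueAttention(exploit=1.0, explore=0.1),
    "certain_NP": CueAttention(exploit=0.1, explore=0.1),
    "uncertain_P": CueAttention(exploit=0.8, explore=0.6),
    "uncertain_NP": CueAttention(exploit=0.3, explore=0.6),
}

# Overt attention at the end of Stage 1.
before = {name: cue.overt_attention() for name, cue in cues.items()}

# Context change between stages: exploratory attention is reset for every cue.
for cue in cues.values():
    cue.context_shift()

# Overt attention at the start of Stage 2.
after = {name: cue.overt_attention() for name, cue in cues.items()}
```

With these made-up values, `before` shows higher attention to cues from uncertain compounds, whereas `after` differs between cues only through `exploit`, so the advantage for previously predictive cues survives the reset while the uncertainty effect does not.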
However, given that the exploitative and exploratory processes act at different levels in George and Pearce’s model (the former through the salience of input units, the latter through the associability of configural units) assumption (a) might be harder

to implement in a psychologically plausible manner. We refer to our model, and that of George and Pearce (2012), as hybrid theories because they combine two distinct attentional processes: an exploitative process following the principle of the Mackintosh model, and an exploratory process following the principle of the Pearce–Hall model. An alternative view has recently been proposed by Esber and Haselgrove (2011), who have suggested that the seemingly opposing attentional effects that support the Mackintosh and Pearce–Hall models can be reconciled within a single-process account of attention. According to this model, the attention paid to a cue is determined by the sum of the associations between that cue and the outcomes with which it has previously been paired. Under certain sets of parameters, the model can therefore predict that a cue that has been paired with two outcomes—that is, a cue whose consequences are uncertain—will receive greater attentional processing than a cue that is paired with just one. According to Esber and Haselgrove’s (2011) model, when two cues are paired in compound they will compete for associative strength. In the case of the current experiments, then, predictive cues in Stage 1 will acquire associative strength to the outcome they are frequently paired with, while nonpredictive cues in Stage 1 will accrue little associative strength, since they will be overshadowed by the predictive cue. Given that associative strength determines attention in this model, it therefore correctly anticipates greater attention to predictive cues than nonpredictive cues at the within-compound level (i.e., a Mackintosh-like effect). However, in the case of the uncertain compounds, the probabilistic contingency will result in a degree of prediction error on inconsistent trials, permitting associative strength to accrue to both of the outcomes with which the cues are paired. 
Thus, with appropriate parameterization to determine how associative strengths are combined (see Esber & Haselgrove, 2011, for details), the model should, in principle, be able to account for greater attention to the two cues trained in uncertain compounds, while retaining an attentional advantage for predictive cues at the within-compound level.
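The dependence of this account on parameterization can be illustrated with a toy Python sketch. It is not Esber and Haselgrove's (2011) model: it trains a single cue in isolation (so it ignores cue competition), uses a simple error-correcting update, and assumes, purely for illustration, that a suitable parameter set is one in which associations grow faster (`beta_up`) than they decline (`beta_down`). All names and values are hypothetical.

```python
def train(outcome_sequence, n_outcomes=2, beta_up=0.3, beta_down=0.1):
    """Return the cue's associative strength to each outcome after training.

    outcome_sequence lists the index of the outcome that occurred on each trial.
    """
    v = [0.0] * n_outcomes
    for present in outcome_sequence:
        for k in range(n_outcomes):
            target = 1.0 if k == present else 0.0
            # Assumed asymmetry: strengthening uses beta_up, weakening beta_down.
            rate = beta_up if target > v[k] else beta_down
            v[k] += rate * (target - v[k])
    return v


def attention(v):
    # Attention is taken to be the sum of associations across outcomes.
    return sum(v)


# Certain cue: always followed by outcome 0.
certain_v = train([0] * 60)
# Uncertain cue: outcome 0 on two of every three trials, outcome 1 on the third.
uncertain_v = train([0, 0, 1] * 20)

# With equal rates the summed strengths of the two cues coincide.
certain_sym = train([0] * 60, beta_up=0.3, beta_down=0.3)
uncertain_sym = train([0, 0, 1] * 20, beta_up=0.3, beta_down=0.3)
```

Under the asymmetric rates, `attention(uncertain_v)` exceeds `attention(certain_v)`; under equal rates the two sums are the same, which is one way of seeing why the uncertainty effect holds only under certain sets of parameters.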


However, it is less easy to see how Esber and Haselgrove’s (2011) model could account for the data from Stage 2 of Experiment 2. Here, we found that the influence of predictiveness, but not uncertainty, persisted to influence attention to cues. Our own model explained this finding by assuming that the uncertainty-based exploratory attentional process was more sensitive to the change in context between training stages than was the predictiveness-based exploitative process. This dissociation of the influence of predictiveness and uncertainty seems harder to reconcile with Esber and Haselgrove’s model, since both of these influences emerge from a single attentional process. That said, the effects of predictiveness and uncertainty arise in this model through different processes, as noted above. An advantage for predictive cues occurs because predictive cues have stronger associations to particular outcomes than do nonpredictive cues. An advantage for uncertain cues arises because uncertain cues have associations to several different outcomes, whereas certain cues have associations to only one outcome. The uncertainty effect therefore requires associative strengths to be summed over multiple outcomes, whereas the predictiveness effect does not. If this summation process were, for some reason, particularly sensitive to a change of context (i.e., it is more difficult to activate or recall multiple associates of a stimulus following a sudden change in context) then it may be possible for this model to account for a greater disruption of the influence of uncertainty than of predictiveness in the transition to Stage 2 (M. Haselgrove, personal communication, December 19, 2014).

CONCLUSION

The current experiments suggest that both predictiveness and uncertainty have an influence on the attention paid to cues during human associative learning. In line with theories of attentional exploitation (e.g., Mackintosh, 1975), those cues that were the best predictors of outcomes were attended more than nonpredictive cues. Moreover, this influence of predictiveness persisted to affect


novel learning and attention to these cues in a subsequent learning phase. However, while conditions of uncertainty led to an increase in attentional exploration of compounds (consistent with the Pearce–Hall model), uncertainty did not increase the rate of learning about, or the level of attention to, cues in subsequent learning episodes (which is at odds with the Pearce–Hall model). We have suggested that there may be conditions under which this exploratory attentional process would yield beneficial outcomes for learning, such as when new information within previously nonpredictive cues becomes relevant at a later stage. It will be the task of future research to explore a wider range of conditions to determine whether there are indeed cases in which uncertainty has such an effect on learning.

REFERENCES

Beesley, T., & Le Pelley, M. E. (2010). The effect of predictive history on the learning of sub-sequence contingencies. Quarterly Journal of Experimental Psychology, 63, 108–35.
Beesley, T., & Le Pelley, M. E. (2011). The influence of blocking on overt attention and associability in human learning. Journal of Experimental Psychology: Animal Behavior Processes, 37, 114–20.
Bonardi, C., Graham, S., Hall, G., & Mitchell, C. J. (2005). Acquired distinctiveness and equivalence in human discrimination learning: evidence for an attentional process. Psychonomic Bulletin & Review, 12, 88–92.
Brainard, D. H. (1997). The psychophysics toolbox. Spatial Vision, 10, 433–436.
Channell, S., & Hall, G. (1983). Contextual effects in latent inhibition with an appetitive conditioning procedure. Animal Learning & Behavior, 11, 67–74.
Cousineau, D. (2005). Confidence intervals in within-subject designs: A simpler solution to Loftus and Masson’s method. Tutorials in Quantitative Methods for Psychology, 1, 2–5.
Esber, G. R., & Haselgrove, M. (2011). Reconciling the influence of predictiveness and uncertainty on stimulus salience: a model of attention in associative learning. Proceedings of The Royal Society, 278, 2553–61.
George, D. N., & Pearce, J. M. (2012). A configural theory of attention and associative learning. Learning & Behavior, 40, 241–54.
Griffiths, O., Johnson, A. M., & Mitchell, C. J. (2011). Negative transfer in human associative learning. Psychological Science, 22, 1198–204.
Hogarth, L., Dickinson, A., Austin, A., Brown, C., & Duka, T. (2008). Attention and expectation in human predictive learning: the role of uncertainty. Quarterly Journal of Experimental Psychology, 61, 1658–68.
Kleiner, M., Brainard, D. H., & Pelli, D. G. (2007). What’s new in Psychtoolbox-3? Perception, 36, ECVP Abstract Supplement.
Kruschke, J. K. (1996). Dimensional relevance shifts in category learning. Connection Science, 8, 225–248.
Kruschke, J. K., Kappenman, E. S., & Hetrick, W. P. (2005). Eye gaze and individual differences consistent with learned attention in associative blocking and highlighting. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31, 830–45.
Le Pelley, M. E. (2004). The role of associative history in models of associative learning: a selective review and a hybrid model. The Quarterly Journal of Experimental Psychology, 57, 193–243.
Le Pelley, M. E. (2010a). Attention and human associative learning. In C. J. Mitchell & M. E. Le Pelley (Eds.), Attention and Associative Learning: From Brain to Behaviour (pp. 187–215). Oxford: Oxford University Press.
Le Pelley, M. E. (2010b). The hybrid modeling approach to conditioning. In N. A. Schmajuk (Ed.), Computational Models of Conditioning (pp. 71–107). Cambridge: Cambridge University Press.
Le Pelley, M. E., Beesley, T., & Griffiths, O. (2011). Overt attention and predictiveness in human contingency learning. Journal of Experimental Psychology: Animal Behavior Processes, 37, 220–9.
Le Pelley, M. E., Beesley, T., & Griffiths, O. (2014). Relative salience versus relative validity: Cue salience influences blocking in human associative learning. Journal of Experimental Psychology: Animal Behavior Processes, 40, 116–132.
Le Pelley, M. E., & McLaren, I. P. L. (2003). Learned associability and associative change in human causal learning. The Quarterly Journal of Experimental Psychology, 56, 68–79.
Le Pelley, M. E., Mitchell, C. J., & Johnson, A. M. (2013). Outcome value influences attentional biases in human associative learning: dissociable effects of training and instruction. Journal of Experimental Psychology: Animal Behavior Processes, 39, 39–55.
Le Pelley, M. E., Reimers, S. J., Calvini, G., Spears, R., Beesley, T., & Murphy, R. A. (2010). Stereotype formation: biased by association. Journal of Experimental Psychology: General, 139, 138–61.
Le Pelley, M. E., Suret, M. B., & Beesley, T. (2009). Learned predictiveness effects in humans: a function of learning, performance, or both? Journal of Experimental Psychology: Animal Behavior Processes, 35, 312–27.
Lochmann, T., & Wills, A. J. (2003). Predictive history in an allergy prediction task. In F. Schmalhofer, R. M. Young & G. Katz (Eds.), Proceedings of EuroCogSci 03 (pp. 217–222). Mahwah, NJ: Lawrence Erlbaum Associates.
Mackintosh, N. J. (1975). A theory of attention: Variations in the associability of stimuli with reinforcement. Psychological Review, 82, 276–298.
Pearce, J. M. (1987). A model for stimulus generalization in Pavlovian conditioning. Psychological Review, 94, 61–73.
Pearce, J. M. (1994). Similarity and discrimination: A selective review and a connectionist model. Psychological Review, 101, 587–607.
Pearce, J. M. (2002). Evaluation and development of a connectionist theory of configural learning. Animal Learning & Behavior, 30, 73–95.
Pearce, J. M., & Hall, G. (1980). A model for Pavlovian learning: Variations in the effectiveness of conditioned but not of unconditioned stimuli. Psychological Review, 87, 532–552.
Pearce, J. M., Kaye, H., & Hall, G. (1982). Predictive accuracy and stimulus associability: Development of a model for Pavlovian learning. In M. L. Commons, R. J. Herrnstein, & A. R. Wagner (Eds.), Quantitative analyses of behavior (Vol. 3, pp. 241–256). Cambridge: Ballinger.
Pearce, J. M., & Mackintosh, N. J. (2010). Two theories of attention: A review and a possible integration. In C. J. Mitchell & M. E. Le Pelley (Eds.), Attention and associative learning: From brain to behavior (pp. 11–40). Oxford: Oxford University Press.
Pelli, D. G. (1997). The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision, 10, 437–442.
Rehder, B., & Hoffman, A. B. (2005). Eyetracking and selective attention in category learning. Cognitive Psychology, 51, 1–41.
Rescorla, R. A., & Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and non-reinforcement. In A. H. Black & W. F. Prokasy (Eds.), Classical conditioning II: Current research and theory (pp. 64–99). New York: Appleton-Century-Crofts.
Salvucci, D. D., & Goldberg, J. H. (2000). Identifying fixations and saccades in eye-tracking protocols. Proceedings of the Symposium on Eye Tracking Research & Applications - ETRA ’00, 71–78.
Swartzentruber, D., & Bouton, M. (1986). Contextual control of negative transfer produced by prior CS-US pairings. Learning and Motivation, 17, 366–385.
Trick, L., Hogarth, L., & Duka, T. (2011). Prediction and uncertainty in human Pavlovian to instrumental transfer. Journal of Experimental Psychology: Learning, Memory, and Cognition, 37, 757–65.
Whitney, L., & White, K. G. (1993). Dimensional shift and the transfer of attention. Quarterly Journal of Experimental Psychology, 46B, 225–252.
Wills, A. J., Lavric, A., Croft, G. S., & Hodgson, T. L. (2007). Predictive learning, prediction errors, and attention: evidence from event-related potentials and eye tracking. Journal of Cognitive Neuroscience, 19, 843–854.

APPENDIX

Formal description of the computational model outlined in the General Discussion

Exploratory attention

The current data are consistent with the Pearce–Hall model in suggesting that exploratory attention is an increasing function of the prediction error associated with a cue compound considered as a whole. Let $V_A$ represent the prediction of the outcome made by the presence of Cue A, and $V_W$ represent the prediction made by Cue W, such that when the compound AW is presented, the total outcome prediction is given by $\Sigma V = V_A + V_W$. Let λ represent the actual outcome that is paired with compound AW. Hence the magnitude of the prediction error on this trial is given by the absolute magnitude of the discrepancy between the observed outcome (λ) and the expected outcome (ΣV), that is, $|\lambda - \Sigma V|$. Following Pearce, Kaye, and Hall (1982), this prediction error is used to determine exploratory attention to each presented cue Q on trial n ($\mathrm{explore}_Q^{n}$) according to:

$$\mathrm{explore}_Q^{n} = \gamma\,|\lambda - \Sigma V| + (1 - \gamma) \times \mathrm{explore}_Q^{n-1} \tag{1}$$

where γ is a fixed parameter (between 0 and 1) determining the extent to which the updated value of $\mathrm{explore}_Q$ is determined by the prediction error on the current trial, $|\lambda - \Sigma V|$. Equation 1 ensures that $\mathrm{explore}_Q$ has the required property that exploratory attention decreases as prediction error decreases. Hence $\mathrm{explore}_Q$ will generally be lower if Q belongs to a compound that is followed by a highly predictable outcome than if it belongs to a compound that is followed by a highly uncertain outcome.

Exploitative attention

Our data suggest an exploitative process that is based on a comparison of the relative predictiveness of individual cues; our implementation uses a model based on Mackintosh (1975). Consider a trial with compound AW. The size of the individual prediction error for Cue A, $|\lambda - V_A|$, provides a measure of how accurately A predicts the outcome of the current trial; the smaller this prediction error, the more accurately A predicts the outcome. Similarly, $|\lambda - V_W|$ represents how accurately W predicts the same outcome. Hence comparing the size of these individual prediction errors reveals which cue is the more accurate predictor. More generally, on a trial n featuring cues Q and R, the exploitative attention to cue Q ($\mathrm{exploit}_Q^{n}$) is updated according to:

$$\mathrm{exploit}_Q^{n} = \mathrm{exploit}_Q^{n-1} + \mu \left( |\lambda - V_R| - |\lambda - V_Q| \right) \tag{2}$$

where μ is a fixed parameter (between 0 and 1) determining the rate of change in exploitative attention. Equation 2 has the required property that $\mathrm{exploit}_Q$ will increase if Q is a more accurate predictor of the outcome than is R, since in this case $|\lambda - V_Q| < |\lambda - V_R|$. Conversely, $\mathrm{exploit}_Q$ will decrease if Q is less predictive than R, since in this case $|\lambda - V_Q| > |\lambda - V_R|$. In the simulations presented in Figure 4, the starting values of explore and exploit were .8, and both were allowed to range between .1 and 1. This lower limit ensures that a cue is never entirely “frozen out” of the learning process, which would occur when either parameter reached 0. Psychologically it reflects the fact that all cues have at least some potential to capture attention and be learnt about, irrespective of their associative history.

Overt attention

Here we make the simple assumption that $\mathrm{explore}_Q$ and $\mathrm{exploit}_Q$ combine in multiplicative fashion to determine the overall attention to Q, $\mathrm{Att}_Q$. Thus:

$$\mathrm{Att}_Q^{n} = \mathrm{explore}_Q^{n} \times \mathrm{exploit}_Q^{n} \tag{3}$$

Learning

Following the influential model of Rescorla and Wagner (1972), after each trial the associative strengths of all presented stimuli are updated as a function of the overall prediction error experienced on that trial, $(\lambda - \Sigma V)$. Importantly, however, the rate of change of associative strength for a particular cue is also modulated by the attention paid to that cue. So for cue Q we have:

$$V_Q^{n} = V_Q^{n-1} + \mathrm{Att}_Q^{n-1} \times \beta \times \left( \lambda - \Sigma V^{n-1} \right) \tag{4}$$

where β is a fixed learning-rate parameter.
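The four update rules can be read together as a single per-trial step. The following is a minimal Python sketch of Equations 1 to 4 for a two-cue compound, not the authors' simulation code. The names `Cue`, `clip`, and `update_trial` are hypothetical, and the order of operations is an assumption: every prediction error is computed from the associative strengths held before the trial, and Equation 4 uses the attention values carried over from the previous trial. Coding the outcome as λ = 1 is also an assumption.

```python
from dataclasses import dataclass

LOWER, UPPER = 0.1, 1.0  # range allowed for explore and exploit in the appendix


def clip(x, lo=LOWER, hi=UPPER):
    """Keep an attention value inside the permitted range."""
    return max(lo, min(hi, x))


@dataclass
class Cue:
    v: float = 0.0        # associative strength V
    explore: float = 0.8  # exploratory attention (starting value .8)
    exploit: float = 0.8  # exploitative attention (starting value .8)

    @property
    def att(self):
        # Equation 3: overall attention is the product of the two components.
        return self.explore * self.exploit


def update_trial(q, r, lam, gamma=0.4, mu=0.05, beta=0.1):
    """Apply Equations 1-4 to compound QR on one trial; mutates q and r."""
    # Assumption: all errors use the pre-trial associative strengths.
    sum_v = q.v + r.v
    err_q, err_r = abs(lam - q.v), abs(lam - r.v)
    att_q, att_r = q.att, r.att  # Att from the previous trial, used in Eq. 4

    # Equation 1: exploratory attention tracks the compound prediction error.
    compound_err = abs(lam - sum_v)
    for cue in (q, r):
        cue.explore = clip(gamma * compound_err + (1 - gamma) * cue.explore)

    # Equation 2: exploitative attention shifts toward the better predictor.
    q.exploit = clip(q.exploit + mu * (err_r - err_q))
    r.exploit = clip(r.exploit + mu * (err_q - err_r))

    # Equation 4: attention-modulated learning from the shared error term.
    delta = beta * (lam - sum_v)
    q.v += att_q * delta
    r.v += att_r * delta


if __name__ == "__main__":
    a, w = Cue(v=0.5), Cue(v=0.0)
    update_trial(a, w, lam=1.0)
    print(a, w)
```

Capturing `att_q` and `att_r` before the attention updates is what makes the learning step depend on the previous trial's attention, matching the n − 1 superscript in Equation 4; a different ordering would be a different model.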

Parameters


Simulation results presented in Figure 4 reflect the average performance of the model across 200 simulated subjects, trained with the procedure used in Experiment 2, with the following parameter values (values in parentheses reflect the range of values that each parameter can take, with all other values held at the given starting values, in order to produce the observed ordinal patterns of simulated data shown in Figure 4): γ = .4 (.05–1.0); μ = .05 (.02–.20); β = .1 (.03–1.0); explore/exploit = .8 (.5–1.0). On the basis of the pattern suggested by the empirical data (as noted in the General Discussion), we assumed that the change in context between Stage 1 and Stage 2 resulted in all values of explore being reset to their starting values, while values of exploit persisted across the two phases.
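The context-change assumption can be made concrete with a short Python sketch. It is illustrative only: the dictionary layout and the names `PARAMS`, `START`, and `reset_for_stage2` are hypothetical. The appendix does not say what happens to associative strengths at the stage boundary, so the sketch simply leaves them untouched, and that choice is an assumption.

```python
# Parameter values reported for the Figure 4 simulations.
PARAMS = {"gamma": 0.4, "mu": 0.05, "beta": 0.1}
START = 0.8  # starting value of both explore and exploit


def reset_for_stage2(cues):
    """Return the cue states at the start of Stage 2.

    explore is reset to its starting value; exploit persists.
    Assumption: associative strength "v" is carried over unchanged.
    """
    return {name: {**state, "explore": START} for name, state in cues.items()}


if __name__ == "__main__":
    # Hypothetical end-of-Stage-1 state for one simulated subject.
    stage1 = {
        "A": {"v": 0.9, "explore": 0.2, "exploit": 1.0},
        "W": {"v": 0.1, "explore": 0.2, "exploit": 0.3},
    }
    print(reset_for_stage2(stage1))
```

The function returns new dictionaries rather than mutating its input, so the end-of-Stage-1 state stays available for comparison.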

