Research Article

A Perceptually Completed Whole Is Less Than the Sum of Its Parts

Psychological Science 2014, Vol. 25(6) 1206–1217 © The Author(s) 2014 Reprints and permissions: sagepub.com/journalsPermissions.nav DOI: 10.1177/0956797614530725 pss.sagepub.com

Jason M. Gold Indiana University

Abstract

How efficiently do people integrate the disconnected image fragments that fall on their eyes when they view partly occluded objects? In the present study, I used a psychophysical summation-at-threshold technique to address this question by measuring discrimination performance with both isolated and combined features of physically fragmented but perceptually complete objects. If visual completion promotes superior integration efficiency, performance with a visually completed object should exceed what would be expected from performance with the individual object parts shown in isolation. Contrary to this prediction, results showed that discrimination performance with both static and moving versions of physically fragmented but perceptually complete objects was significantly worse than would be expected from performance with their constituent parts. These results present a challenge for future theories of visual completion.

Keywords: visual completion, perceptual organization, information integration

Received 10/22/13; Revision accepted 3/14/14

A fundamental challenge faced by any real-world visual pattern-recognition system is the ability to accurately identify objects that are partially hidden from view (Pessoa, Thompson, & Noë, 1998). There is considerable evidence that the human visual system solves this problem by interpolating missing image fragments to construct unified object representations (Gold, Murray, Bennett, & Sekuler, 2000; Kellman, Yin, & Shipley, 1998; Ringach & Shapley, 1996; Sekuler & Palmer, 1992; von der Heydt, Peterhans, & Baumgartner, 1984). Traditional Gestalt views of perceptual grouping and organization (Koffka, 1935) maintain that the formation of such a representation brings with it certain emergent properties, such that a visually completed whole is perceived to be something different than just the sum of its individual parts. Although this view may offer a compelling description of phenomenological experience when partially occluded objects are encountered, it is less clear how these properties are manifested at the level of behavioral performance in object-recognition tasks.

Previous experiments using discrimination tasks have demonstrated that presenting features in configurations or contexts in which visual completion is thought to take place can improve reaction times and accuracy. For example, Sekuler and Palmer (1992) found that when observers were primed with a partly occluded version of an object, they were faster to make subsequent decisions about filled-in versions of the object than about incomplete versions matching the stimulus they had seen previously. Similarly, Pomerantz and colleagues (Eidels, Townsend, & Pomerantz, 2008; Pomerantz, 2003; Pomerantz & Portillo, 2011; Pomerantz & Pristach, 1989; Pomerantz, Sager, & Stoever, 1977) have found that reaction times for discriminating among sets of items can be significantly reduced when redundant features are added to the items to elicit the percept of perceptually complete figures (what they call the configural-superiority effect). Ringach and Shapley (1996) have found that orienting the elements of a Kanizsa figure so that observers no longer perceive illusory or occluded contours can result in dramatic decreases in discrimination accuracy.

Corresponding Author: Jason M. Gold, Indiana University, Psychological and Brain Sciences, 1101 East 10th St., Bloomington, IN 47405 E-mail: [email protected]

The results of such experiments suggest that an observer’s ability to integrate the disparate image fragments of a partly occluded object is aided by visual completion. Thus, one might expect observers’ performance with perceptually complete wholes to exceed what would simply be expected from their ability to use each image fragment shown in isolation.

One approach that has been used successfully to test this kind of prediction in other domains is the summation-at-threshold technique (Gold, Mundy, & Tjan, 2012; Graham, Robson, & Nachmias, 1978; Nandy & Tjan, 2008). This approach involves measuring observers’ contrast sensitivities (i.e., the reciprocal of their contrast thresholds) when they discriminate the isolated elements of a set of stimuli and using them to quantitatively predict what those observers’ sensitivities should be when they discriminate the combined versions of the stimuli. Under certain conditions, if observers simply use the information from each element in a constant fashion, regardless of whether those elements appear in isolation or in combination, it can be shown that the sum of the observers’ squared sensitivities to each of the individual elements should equal their squared sensitivity to the combined stimulus (referred to as optimal integration; Nandy & Tjan, 2008). If presenting the elements in combination allows observers to make better use of information than would be predicted by their performance with the individual elements, their squared sensitivity to the combination should exceed the sum of their squared sensitivities to the individual parts (superoptimal integration). Alternatively, if there is a significant cost to processing all of the elements in combination, observers’ squared sensitivity to the combination should fall below the sum of their squared sensitivities to the individual elements (suboptimal integration).
These predictions can be conveniently expressed by computing an integration index, Φ:

$$\Phi = \frac{S_{\mathrm{combined}}^{2}}{\sum_{i=1}^{n} S_{\mathrm{part}_i}^{2}} \tag{1}$$

Here, S denotes sensitivity (i.e., 1/contrast threshold), and n equals the number of individual parts that make up a combined stimulus. An integration index equal to 1 indicates optimal integration, greater than 1 indicates superoptimal integration, and less than 1 indicates suboptimal integration (Gold et al., 2012).

Given the above, I applied this summation-at-threshold technique to a series of tasks in which the stimuli were physically fragmented but perceptually complete. I reasoned that if the process of visual completion enhances an observer’s ability to use information carried by the individual elements of an object, then superoptimal integration should occur for perceptually complete but not for perceptually fragmented figures.
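To make Equation 1 concrete, the following Python sketch computes the integration index from a combined sensitivity and a list of part sensitivities. The function names `integration_index` and `classify_integration`, the tolerance `tol`, and the sensitivity values are illustrative assumptions and are not taken from the experiments reported here.

```python
import math


def integration_index(s_combined, s_parts):
    """Equation 1: squared combined sensitivity over the sum of squared part sensitivities."""
    if not s_parts:
        raise ValueError("at least one part sensitivity is required")
    return s_combined ** 2 / sum(s ** 2 for s in s_parts)


def classify_integration(phi, tol=0.05):
    """Label an index relative to the optimal value of 1 (tol is an arbitrary illustrative margin)."""
    if phi > 1 + tol:
        return "superoptimal"
    if phi < 1 - tol:
        return "suboptimal"
    return "optimal"


# Hypothetical sensitivities (1 / contrast threshold) for two isolated parts.
parts = [20.0, 20.0]

# Optimal integration: the squared combined sensitivity equals the sum of squares.
optimal_combined = math.sqrt(sum(s ** 2 for s in parts))
phi_optimal = integration_index(optimal_combined, parts)

# A combined sensitivity below that benchmark yields an index under 1.
phi_low = integration_index(25.0, parts)

print(classify_integration(phi_optimal), round(phi_optimal, 3))
print(classify_integration(phi_low), round(phi_low, 3))
```

In an actual analysis, a statistical test against 1 (as in the one-sample t tests reported in the Results) would take the place of the fixed tolerance used in this sketch.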

Method

Participants

Three volunteers between the ages of 19 and 42, as well as the author, participated in the study. All had normal or corrected-to-normal visual acuity, and all volunteers provided consent within a protocol approved by the Indiana University Institutional Review Board.

Stimuli and procedure

All 4 observers performed a series of discrimination tasks in which objects appeared either as perceptually complete or as disconnected fragments (Fig. 1). One of the requirements for making the prediction described in Equation 1 is that the isolated fragments be orthogonal to each other (i.e., their dot product must be equal to zero; Nandy & Tjan, 2008). Thus, I designed three different tasks whose features were spatially nonoverlapping and therefore met this requirement.

In the bent-bar task (Fig. 1a), two Pac-Man-like circles, each with a rectangular section missing, were displayed with their “mouths” either facing each other or facing in the same direction. Each Pac-Man was slightly rotated clockwise or counterclockwise. When the mouths of the Pac-Men were facing each other, this created the percept of an occluding bar that was slightly bent either to the left or right (complete stimuli; Fig. 1a). When the mouths of the Pac-Men were facing the same direction, no illusory bar was perceived, and both Pac-Men simply appeared to be oriented slightly to the left or right (fragmented stimuli; Fig. 1a). For each of these pairs of stimuli, the contrast of the images was varied across trials, and observers were asked to classify them as oriented toward either the left or the right, in order to obtain 71% correct contrast-discrimination thresholds. The stimuli on each trial were embedded in a random sample of Gaussian white-pixel noise to make the task more difficult (see Fig. S1 in the Supplemental Material available online for an illustration of Fig. 1a with added noise at near-threshold signal contrast).

Each observer’s ability to classify the individual Pac-Man elements that made up the stimuli (top only and bottom only; Fig. 1a) was also tested, in addition to the two main bent-bar conditions. Thus, there were three conditions tested for each kind of stimulus: combined, top only, and bottom only.
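The orthogonality requirement can be illustrated with a small Python sketch. It uses toy 4 × 4 contrast images (hypothetical values, not the actual stimuli) to show that spatially nonoverlapping features have a dot product of zero, and that in this case the energy of the combined image equals the sum of the energies of its parts. That additivity is presumably the property on which the prediction in Equation 1 rests. The helper names `dot` and `add` are assumptions made for this example.

```python
def dot(a, b):
    """Pixelwise dot product of two equally sized images stored as lists of rows."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def add(a, b):
    """Pixelwise sum of two equally sized images."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# Toy contrast images: the "top" feature occupies the upper rows and the
# "bottom" feature the lower rows, so no pixel is nonzero in both.
top = [
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
bottom = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.0, 0.5],
    [0.5, 0.0, 0.0, 0.5],
]
combined = add(top, bottom)

overlap = dot(top, bottom)                      # zero for nonoverlapping features
energy_parts = dot(top, top) + dot(bottom, bottom)
energy_combined = dot(combined, combined)

print(overlap, energy_parts, energy_combined)
```

If the two features shared any nonzero pixels, `overlap` would be nonzero and the combined energy would no longer equal the sum of the part energies.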
This same approach was applied to two other tasks that were based on the occluded-rotating-square tasks developed by Lorenceau and Shiffrar (1992) and Murray, Sekuler, and Bennett (2001). In the rotating-square task (Fig. 1b), a set of four white line segments rotated either clockwise or counterclockwise. When a set of four solid black squares appeared at the corners of the stimulus, the line segments appeared to unite into a single rotating square that was partially occluded by four black corner elements (complete condition; Fig. 1b). When these corner elements were painted the same color as the background (midgray), the square appeared as a set of four disconnected rotating fragments (fragmented condition; Fig. 1b). Much like in the bent-bar task, observers were asked to classify the stimulus as rotating either clockwise or counterclockwise when either all of the elements were present (combined) or just a single element was present (bottom only, top only, left only, right only).

The shrinking/expanding-square task (Fig. 1c) was similar to the rotating-square task, except that the line segments moved in toward the center of the figure or out toward the edge of the figure. In the presence of the four black corner elements, this created the percept of an occluded shrinking or expanding square (complete condition; Fig. 1c); in the absence of the corner elements, the stimuli appeared as a disconnected set of line segments moving either in or out from the center of the figure (fragmented condition; Fig. 1c; also see Videos S1–S4 in the Supplemental Material for dynamic versions of the rotating- and shrinking/expanding-square stimuli, with and without added noise).

[Figure 1 omitted. Panels: (a) Bent Bars (left vs. right), (b) Rotating Squares (clockwise vs. counterclockwise), (c) Shrinking/Expanding Squares (shrinking vs. expanding). Each panel shows complete and fragmented versions in the combined and single-element conditions.]

Fig. 1. Stimuli and conditions used in the (a) bent-bar, (b) rotating-square, and (c) shrinking/expanding-square tasks. All stimuli were presented in conditions in which they appeared perceptually complete and in which they appeared fragmented. In addition, the elements of each stimulus were presented both combined and in isolation. Each panel in (b) and (c) shows a single frame from the entire dynamic stimulus sequence that observers actually saw. See the text for details about each stimulus type. See Videos S1 and S2 in the Supplemental Material for dynamic versions of the rotating- and shrinking/expanding-square stimuli, respectively.

All stimuli were shown on a CRT display with MATLAB (The MathWorks, Natick, MA), using in-house software and the extensions provided by the Psychophysics Toolbox (Brainard, 1997; Pelli, 1997). The resolution of the display was 1,024 × 768 pixels, which subtended approximately 16.7° × 12.5° from a viewing distance of 130 cm. The frame rate of the display was 85 Hz. Luminance on the display ranged from 5.5 to 120.5 cd/m², with an average (background) luminance of 38 cd/m². The contrast noise added to the stimuli on each trial was white and Gaussian, and it had a standard deviation of 0.032 (noise spectral density = 2.66e-07).

The specific properties of each kind of stimulus used in the experiments were as follows.

Bent bars.
Each Pac-Man element of the bent-bar stimuli was 46 pixels (0.75° of visual angle) in diameter, with a “mouth” that was 15 pixels (0.25°) wide and 24 pixels (0.39°) deep. Depending on the condition, each element was rotated by ±5° from facing either straight upward or downward. In the combined conditions, the distance between the edges of the two elements was 45 pixels (0.74°). In the top- and bottom-only conditions, the elements would appear in the same locations that they appeared in the corresponding combined condition. In all conditions, the elements were embedded in a background of average luminance that was 147 pixels (2.40°) in height × 64 pixels (1.05°) in width. The background region was surrounded by a dark outline that was 2 pixels (0.03°) in width. The noise that was added to the stimulus covered the entire stimulus region. The stimulus duration was 43 frames (505 ms).

Rotating squares. Each rotating square was constructed by placing a 76 × 76 pixel (1.25° × 1.25°) outline of a rotating square within a 138 × 138 pixel (2.26° × 2.26°) region of average luminance. The corners of the rotating square were removed by replacing each corner of the 138 × 138 pixel region with a 52 × 52 pixel (0.85° × 0.85°) square that was either darker than the background (for the complete condition) or the same as the background (for the fragmented condition). This produced four separate rotating line segments that were 5 pixels (0.08°) thick and that varied between 32 (0.52°) and 35 (0.57°) pixels in length across the different angles of rotation. Depending on the location of the segment and the direction of rotation, each swept through a series of 21 angular rotations that covered a 10° range, which yielded a rotation rate of 40.5°/s. The stimulus duration was 21 frames (247 ms). As with the bent-bar stimuli, the line segments in the top-, bottom-, left-, and right-only conditions appeared in the same locations where they appeared in the combined conditions. A unique sample of noise was added to the entire stimulus region on each frame.

Shrinking/expanding squares. Each shrinking or expanding square was constructed by placing the outline of a square within a 216 × 216 pixel (3.53° × 3.53°) region of average luminance. The corners of the square were removed by replacing each corner of the 216 × 216 pixel region with a 96 × 96 pixel (1.57° × 1.57°) square that was either darker than the background (for the complete condition) or the same as the background (for the fragmented condition). This produced four separate line segments that were 5 pixels (0.08°) thick and 24 pixels (0.39°) in length. Depending on the location of the segment and whether the square was shrinking or expanding, each swept between a distance of 30 and 60 pixels (0.49° and 0.98°) from the center of the display (sampled every 2 pixels; 0.03°), which yielded a velocity of 165 pixels per second (2.70°/s). The stimulus duration was 31 frames (365 ms). As with rotating-square stimuli, the line segments in the top-, bottom-, left-, and right-only conditions appeared in the same locations where they appeared in the combined conditions. A unique sample of noise was added to the entire stimulus region on each frame.
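The unit conversions behind these stimulus specifications can be checked with a short Python sketch. The 85-Hz frame rate comes from the text. The degrees-per-pixel factor is an assumption derived here from the stated pairing of 46 pixels with 0.75°, and it treats visual angle as linear in pixels, which is only approximately true. The helper names are hypothetical.

```python
FRAME_RATE_HZ = 85.0          # display frame rate stated in the text
DEG_PER_PIXEL = 0.75 / 46.0   # assumption: derived from "46 pixels (0.75 deg)"


def frames_to_ms(n_frames, frame_rate_hz=FRAME_RATE_HZ):
    """Duration in milliseconds of n_frames at the given refresh rate."""
    return 1000.0 * n_frames / frame_rate_hz


def pixels_to_deg(n_pixels, deg_per_pixel=DEG_PER_PIXEL):
    """Approximate visual angle, treating degrees as linear in pixels (small-angle assumption)."""
    return n_pixels * deg_per_pixel


rotating_ms = frames_to_ms(21)                  # rotating-square duration
shrinking_ms = frames_to_ms(31)                 # shrinking/expanding-square duration
rotation_rate = 10.0 / (rotating_ms / 1000.0)   # degrees of rotation per second
square_region_deg = pixels_to_deg(216)          # width of the 216-pixel region

print(round(rotating_ms), round(shrinking_ms),
      round(rotation_rate, 1), round(square_region_deg, 2))
```

Small discrepancies from the printed values are to be expected, because the conversion factor used in the study is not stated and the printed values are rounded.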

Threshold and sensitivity measurement

Thresholds in each condition were measured by varying the root-mean-square (RMS) contrast of the stimuli across trials using a two-down, one-up adaptive-staircase procedure. For the bent-bar task, 200 trials were measured per observer in each condition (i.e., top only, bottom only, combined) and for each stimulus type (i.e., complete, fragmented). For the rotating and shrinking/expanding squares, 150 trials were measured in each condition and for each stimulus type (only 150 trials were measured in each condition because of the additional time it took to generate dynamic rather than static noise on each trial). Weibull psychometric functions were fit to the staircase data for each condition, and threshold was defined as the contrast that yielded 71% correct performance (chosen for its location at approximately the center of the psychometric function). The contrast of an isolated stimulus feature was defined as the contrast of the combined stimulus from which the feature was extracted (its nominal contrast). That is, the combined stimulus was first set to the specified level of contrast, and then individual features were extracted from this image according to the specific condition. Sensitivity was then computed as 1/contrast threshold.
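The measurement procedure can be sketched in Python as follows. This is a minimal simulation under stated assumptions, not the in-house MATLAB code used in the study: the two-alternative Weibull parameterization, the parameter values `alpha` and `beta`, the step size in log contrast, and the starting contrast are all hypothetical. A two-down, one-up rule settles where the probability of two consecutive correct responses is .5, that is, at about 71% correct, which matches the threshold criterion described above.

```python
import math
import random


def weibull_2afc(contrast, alpha, beta):
    """Proportion correct for a two-choice task: 0.5 at zero contrast, approaching 1.0."""
    return 1.0 - 0.5 * math.exp(-((contrast / alpha) ** beta))


def contrast_at(p_correct, alpha, beta):
    """Invert weibull_2afc to find the contrast that yields p_correct."""
    return alpha * (-math.log(2.0 * (1.0 - p_correct))) ** (1.0 / beta)


def two_down_one_up(n_trials, alpha, beta, start, step=0.05, seed=0):
    """Simulate a staircase: lower log contrast after two correct in a row, raise it after an error."""
    rng = random.Random(seed)
    log_c = math.log10(start)
    run = 0
    history = []
    for _ in range(n_trials):
        contrast = 10 ** log_c
        history.append(contrast)
        correct = rng.random() < weibull_2afc(contrast, alpha, beta)
        if correct:
            run += 1
            if run == 2:
                log_c -= step
                run = 0
        else:
            log_c += step
            run = 0
    return history


# Equilibrium of the two-down, one-up rule: p ** 2 = 0.5.
target = math.sqrt(0.5)

# Hypothetical observer; in the study, alpha and beta would come from the Weibull fit.
threshold = contrast_at(0.71, alpha=0.05, beta=2.0)
sensitivity = 1.0 / threshold
trials = two_down_one_up(200, alpha=0.05, beta=2.0, start=0.2)

print(round(target, 3), round(threshold, 4), round(sensitivity, 1), len(trials))
```

The simulated trial contrasts would then be fit with a psychometric function, and the fitted 71% point inverted to give the sensitivity that enters Equation 1.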

Experimental trial sequence

On each trial, a fixation point appeared at the center of the screen, followed by the stimulus display for the specified duration and a selection window. In the bent-bar task, high-contrast, noise-free versions of the two possible choices were shown (e.g., bottom feature rotated counterclockwise, bottom feature rotated clockwise), and corresponding response keys were displayed beneath the choices. For the rotating- and expanding-square tasks, text that described each possible response (i.e., “clockwise” vs. “counterclockwise” and “in” vs. “out,” respectively) was shown, along with a corresponding response key for each option. Observers were given unlimited time to indicate which option they perceived and were given auditory accuracy feedback. For each stimulus type in each task, trials for the combined and individual-feature conditions were randomly intermixed within a given session. A random task order was assigned to each observer.

Results and Discussion

Squared sensitivities and integration indices for each observer as well as the means across observers in each condition are plotted in Figures 2 and 3, respectively. Surprisingly, these data show that none of the observers exhibited superoptimal integration in any of the conditions in which the stimuli were perceived as being complete. Instead, one-sample t tests (two-tailed) showed that the mean integration efficiency for the complete stimuli was significantly less than the prediction of optimal integration for all three tasks (bent bars: t(3) = 5.76, p = .01; rotating squares: t(3) = 19.79, p < .001; shrinking/expanding squares: t(3) = 7.98, p < .005). Perhaps even more surprising is the fact that integration efficiency for the fragmented stimuli was generally higher than for the corresponding completed versions of each stimulus type. A 2 (completeness) × 3 (stimulus type) repeated measures analysis of variance confirmed that there was a significant effect of completeness, F(1, 3) = 14.49, p < .05, as well as an effect of stimulus type, F(2, 2) = 226.75, p < .01, with no significant interaction between these two factors, F(2, 2) = 0.98, p = .51.

[Figure 2 omitted. Panels: Complete and Fragmented Bent Bar (Combined, Top, Bottom); Complete and Fragmented Rotating Square and Complete and Fragmented Shrinking/Expanding Square (Combined, Top, Bottom, Left, Right). Vertical axis: Sensitivity² (1/Threshold²), logarithmic. Observers: GR, JG, MS, OC, and the mean.]

Fig. 2. Squared contrast sensitivity for individual observers and the mean across observers in each condition of the bent-bar, rotating-square, and shrinking/expanding-square tasks. The top row shows results for conditions under which the combined stimulus was perceived as being a complete object; the bottom row shows results for conditions under which the combined stimulus was perceived as a collection of fragments. Error bars on individual observer sensitivities correspond to ±1 SD and were estimated through bootstrap simulations (Efron & Tibshirani, 1993). Error bars for mean sensitivities correspond to ±1 SEM.

[Figure 3 omitted. Vertical axis: Integration Index (Φ), logarithmic, with markers for the best-feature model and for optimal integration. Observers: GR, JG, MS, OC, and the mean. Mean indices printed in the panels: bent bar, 0.63 (SEM = 0.05) complete and 0.9 (SEM = 0.2) fragmented; rotating square, 0.38 (SEM = 0.02) complete and 0.46 (SEM = 0.03) fragmented; shrinking/expanding square, 0.51 (SEM = 0.05) complete and 0.81 (SEM = 0.14) fragmented.]

Fig. 3. Integration index for individual observers and the mean across observers for the bent-bar, rotating-square, and shrinking/expanding-square tasks, separately for stimuli perceived as complete (top row) and stimuli perceived as fragmented (bottom row). Also shown is the integration index predicted by a best-feature model (see the text). Error bars on individual observer integration indices correspond to ±1 SD and were estimated through bootstrap simulations (Efron & Tibshirani, 1993). Error bars for mean integration indices across observers correspond to ±1 SEM.

So how might these surprising results be explained? One possibility is that the presence of externally added noise coupled with threshold signal contrast may have led observers to adopt a strategy in which they relied only on the single feature to which they were most sensitive. This “best-feature” strategy can be characterized as follows:

$$\Phi_{\mathrm{best\ feature}} = \frac{S_{\mathrm{combined}}^{2}}{\max_{i \in [1, n]} S_{\mathrm{part}_i}^{2}} \tag{2}$$

The predictions of such a best-feature model are plotted as small triangles on top of each panel in Figure 3. These data show that the best-feature model actually did a fairly good job at predicting integration efficiency when the combined stimuli were perceived as being complete. However, this was not the case for the fragmented stimuli, for which the best-feature model generally underpredicted integration efficiency. Although the possibility that observers were using a best-feature strategy when discriminating among the complete but not the fragmented stimuli cannot be ruled out, it does force one to draw the awkward conclusion that observers used only a single feature when engaging in visual completion yet relied on multiple features when a stimulus was perceived as being composed of disconnected fragments.

Additionally, there are several good reasons to believe that adding external noise and presenting signals at low contrast generally have little impact on observers’ strategies. First, previous experiments using reverse correlation have shown that observers engage in completion under conditions nearly identical to those used in the current experiment (Gold et al., 2000). Second, many experiments have demonstrated that observers’ contrast-energy thresholds for detecting, discriminating, and identifying a wide variety of visual patterns (i.e., gratings, objects, faces) are linearly related to the noise spectral density of an externally added noise (Pelli & Farrell, 1999). This pattern of results is exactly what one would predict if observers were employing a strategy that was independent of stimulus contrast (Pelli, 1981).

Another possible explanation for why such poor integration efficiency was found with perceptually complete stimuli is that the process of interpolation itself may have paradoxically counteracted any potential benefits gained from feature binding. The finding that visual completion can hinder performance under certain circumstances is not without precedent. In particular, decrements in performance have been reported when perceptual completion is placed into conflict with other cues, such as stereoscopic depth (Hou, Lu, Zhou, & Liu, 2006; Liu, Jacobs, & Basri, 1999). Given the strong behavioral (Gold et al., 2000; Keane, Lu, & Kellman, 2007; Ringach & Shapley, 1996; Sekuler & Palmer, 1992), physiological (Pillow & Rubin, 2002; von der Heydt et al., 1984), and theoretical (Grossberg & Mingolla, 1985; Kellman & Shipley, 1991) support for the idea that illusory and occluded contours are added into an observer’s internal representation of a stimulus, one would predict that interpolated edges should have corresponding direct effects on task performance. In the current set of experiments, all three tasks had random samples of visual noise added to each pixel in the display. This noise fell in regions of the stimulus that both carried information (the physically present edges) and carried no information (the regions between the physically present edges). As a result, if an observer were to compare a completed representation to such a stimulus, the regions of interpolation would serve only to contribute noise to the ultimate response of the visual system, which would result in a reduction in integration efficiency.

This possibility was tested by having 3 new observers (naive to the purpose of the study), as well as the author, perform a slight variant of the bent-bar discrimination task (Fig. 4a), in which a ring 3 pixels (0.05°) wide was drawn around the perimeter of each Pac-Man element. This gave the impression of an occluded rather than illusory right- or left-pointing bar when the top and bottom elements were shown in combination (e.g., Gold et al., 2000; Ringach & Shapley, 1996).
Because the interpolated edges were perceived as continuing behind rather than in front of an intervening background, this new stimulus allowed local noise to be added in just the circular regions where the two Pac-Man elements could appear without running the risk of interrupting the interpolation process. The contrast of the added noise was also increased (σ = 0.14, noise spectral density = 2.68e-04) to ensure that it would be limiting observers’ performance rather than any internally generated noise (Burgess, Wagner, Jennings, & Barlow, 1981; Pelli, 1981). All other aspects of the experiment were the same as in the previous bent-bar condition.

Given the above conditions, if observers’ use of noisy regions falling between the stimulus elements was responsible for their suboptimal integration efficiency in the original experiments, these effects should be ameliorated by restricting the presentation of noise to just the regions where the inducing elements appeared. Contrary to this prediction, results showed that integration efficiency was suboptimal for all 4 observers in the presence of local noise, as well as in the presence of global noise that covered the entire rectangular region within which the stimulus appeared (Figs. 4b and 4c). One-sample t tests (two-tailed) confirmed that the mean index in the presence of both kinds of noise was significantly less than the prediction of optimal integration (local noise: t(3) = 10.74, p < .002; global noise: t(3) = 27.65, p < .0001), and a correlated two-samples t test (two-tailed) showed that the difference in integration efficiency between local and global noise was not significant, t(3) = 2.49, p = .09. Thus, it appears that the suboptimal integration efficiency associated with visual completion stems from a source other than observers’ reliance on stimulus regions that correspond to perceptually interpolated (but physically uninformative) features.

Of course, it could also be that the summation-at-threshold technique suffers from some unforeseen methodological flaw. Although there is no particular reason to suspect this to be the case, only a handful of previous studies have used the technique (Gold et al., 2012; Graham et al., 1978; Nandy & Tjan, 2008), and none have reported evidence of superoptimal integration. To address this possible concern, I carried out an additional proof-of-concept experiment with a stimulus and task that should unquestionably lead to superoptimal integration. I used a vernier-acuity task, which has been traditionally employed to evaluate the resolving ability of the visual system (Westheimer, 1965). A vernier-acuity task requires observers to determine whether two line segments placed end to end are perfectly aligned or slightly misaligned relative to one another (Fig. 5a). In such a task, observers will undoubtedly rely on the relative rather than absolute positions of the two lines, because of the limiting effects of intrinsic spatial uncertainty in the human visual system (Tjan, Lestou, & Kourtzi, 2006; Zeevi & Mangoubi, 1984). However, this relational cue is not available when only one of the two line segments is presented, as was the case in the top-only and bottom-only isolated-line-segment conditions (Fig. 5a).¹ In these conditions, observers were asked to indicate whether the individual segment was shifted to the left or right relative only to the center of the display. Observers were expected to benefit much more from the combined presentation of the two line segments than would be predicted from their performance with the individual isolated line segments.

In fact, this is exactly what was found (Figs. 5b and 5c): Integration efficiency was superoptimal for all observers, with a mean integration index of 1.5. A one-sample t test (two-tailed) confirmed that the mean index was significantly greater than the prediction of optimal integration, t(3) = 3.32, p < .05. In fact, these data are consistent with recent results reported by Pomerantz and Portillo (2011) and Pomerantz and Cragin (in press), who found that colinearity of line segments in a manner akin to the vernier stimuli used here yielded stronger configural-superiority effects than Kanizsa-square stimuli defined by illusory contours.
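The two benchmarks shown in the figures, the integration index of Equation 1 and the best-feature ratio of Equation 2, can be compared on the same set of sensitivities with a short Python sketch. The second function transcribes Equation 2 as printed. The function names and the sensitivity values are hypothetical and serve only to illustrate the arithmetic.

```python
def integration_index(s_combined, s_parts):
    """Equation 1: squared combined sensitivity over the sum of squared part sensitivities."""
    return s_combined ** 2 / sum(s ** 2 for s in s_parts)


def best_feature_index(s_combined, s_parts):
    """Equation 2 as printed: squared combined sensitivity over the largest squared part sensitivity."""
    return s_combined ** 2 / max(s ** 2 for s in s_parts)


# Hypothetical observer whose combined sensitivity equals that of the best single part,
# as would be expected if only the most informative feature were used.
parts = [20.0, 10.0]
combined = max(parts)

phi = integration_index(combined, parts)
phi_best = best_feature_index(combined, parts)

print(round(phi, 3), round(phi_best, 3))
```

For such an observer the best-feature ratio is 1 while the integration index falls below 1, which is the qualitative pattern that a best-feature account would predict for the complete stimuli.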

Gold

1214

a

b

AB JG Mean 104

8

c

DB SB

Best-Feature Model Optimal Integration 10 9

Occluded Bent Bar

8 7 6

Local Noise

6

Top Only

4

5

Occluded Bent Bar Local Noise

Mean Index = 0.63 (SEM = 0.04)

4

103

3

Integration Index (Φ)

Bottom Only

Sensitivity2 (1/Threshold2)

2

8 6 4

2

102

8

2

19 8 7 6 5 4

6

3

4

2

2

101

Combined

Combined

104

8

Right

0.1

Bottom

AB

10 9

Occluded Bent Bar

4

5

JG

SB

Mean

Global Noise Mean Index = 0.51 (SEM = 0.02)

2

3

103

Integration Index (Φ)

Sensitivity2 (1/Threshold2)

4

DB

Occluded Bent Bar

8 7 6

Global Noise

6

Left

Top

8 6 4

2

102

8

2

19 8 7 6 5 4

6

3

4

2

2

101

Combined

Top

Bottom

0.1

AB

DB

JG

SB

Mean

Fig. 4.  Stimuli (a) and results (b, c) from the occluded-bent-bar task. Stimuli and conditions were the same as in the previous bent-bar task (see Fig. 1a), except that a thin ring was drawn around the perimeter of each circle. Squared contrast sensitivities in each condition (b) and integration indices (c) are shown for individual observers and averaged across observers, separately for trials on which there was local noise (top row) and global noise (bottom row). Also shown is the integration index predicted by a best-feature model (see the text). Error bars on individual sensitivities and indices correspond to ±1 SD and were estimated through bootstrap simulations (Efron & Tibshirani, 1993). Error bars for mean sensitivities and indices correspond to ±1 SEM.


[Figure 5. Panel a: vernier-acuity stimuli (top only, bottom only, combined; aligned and misaligned). Panel b: squared sensitivity (1/threshold²) in the combined, top, and bottom conditions for observers DL, GR, JG, MS, and the mean. Panel c: integration index (Φ) for each observer and the mean, with the best-feature-model and optimal-integration predictions. Mean index = 1.5 (SEM = 0.12).]

Fig. 5.  Stimuli (a) and results (b, c) from the vernier-acuity task. The two line segments of the stimulus were presented both aligned and misaligned, as well as in isolation. Squared contrast sensitivities in each condition (b) and integration indices (c) are shown for individual observers and averaged across observers. Also shown is the integration index predicted by a best-feature model (see the text). Error bars on individual sensitivities and indices correspond to ±1 SD and were estimated through bootstrap simulations (Efron & Tibshirani, 1993). Error bars for mean sensitivities and indices correspond to ±1 SEM.


Conclusions

Taken together, these results offer compelling new evidence that there are unanticipated costs associated with the process of visual completion. Admittedly, the tasks and stimuli used in the present experiments were artificially generated to allow quantitative measurement of the efficiency of feature integration with both perceptually complete and incomplete objects. However, the history of experimental psychology is filled with similar experiments and demonstrations in which the underlying mechanics of psychological phenomena have been elucidated using tasks and stimuli that exist only in the laboratory. Unlike in these relatively artificial laboratory conditions, object recognition in the natural world typically involves dealing with a tremendous degree of uncertainty about various properties of objects, such as their viewpoint, size, reflectance, and shape (Tjan et al., 2006). Because the true properties of an object are generally underspecified by the raw data that the visual system receives, strong assumptions must often be made to reliably recover these properties and accomplish tasks such as discriminating one object from another (Ramachandran, 1988). It is perhaps under these kinds of conditions that the benefits of visual completion may outweigh the costs, allowing visual completion to play a constructive role in promoting efficient information integration.

Author Contributions

J. M. Gold is the sole author of this article and is responsible for its content.

Acknowledgments

I thank Michael Simmons, the Indiana University Vision Lab, James Pomerantz, and two anonymous reviewers for helpful input and suggestions.

Declaration of Conflicting Interests

The author declared that he had no conflicts of interest with respect to his authorship or the publication of this article.

Funding

This research was supported by National Institutes of Health Grant R01-EY019265.

Supplemental Material

Additional supporting information may be found at http://pss.sagepub.com/content/by/supplemental-data

Note

1. Each line segment of the vernier stimulus was 2 pixels (0.03°) thick and 36 pixels (0.59°) in height. For the aligned stimulus, the top and bottom lines were abutting, such that they formed a single continuous line when shown in combination. For the misaligned stimulus, the bottom line was shifted to the right by 5 pixels (0.08°), and the top line was shifted to the left by 5 pixels. As in the previous tasks, the top- and bottom-only lines appeared in the same locations where they appeared in the corresponding combined condition. In all conditions, the elements were embedded in a background of average luminance that was 128 × 128 pixels (2.09° × 2.09°) in size. The background region was surrounded by a dark outline 2 pixels in width. Noise was added over the entire stimulus region. The stimulus duration was 43 frames (505 ms). There were 300 trials per observer in each condition.
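The pixel, degree, and frame values given in this note can be cross-checked with a short calculation. The sketch below assumes a constant degrees-per-pixel scale, taken from the 128-pixel (2.09°) background, and a constant frame duration, taken from the 43-frame (505 ms) stimulus; the variable and function names are hypothetical, and the implied refresh rate is an inference from those two figures rather than a value stated in the note.

```python
# Scale implied by the background: 128 pixels span 2.09 degrees of visual angle.
DEG_PER_PIXEL = 2.09 / 128

# Frame duration implied by the stimulus: 43 frames last 505 ms.
FRAME_MS = 505 / 43


def pixels_to_degrees(pixels):
    """Convert a size in pixels to degrees, assuming a constant scale across the display."""
    return pixels * DEG_PER_PIXEL


line_height_deg = pixels_to_degrees(36)     # segment height
line_thickness_deg = pixels_to_degrees(2)   # segment thickness
misalignment_deg = pixels_to_degrees(5)     # lateral shift of each segment
refresh_hz = 1000 / FRAME_MS                # implied display refresh rate

print(f"{line_height_deg:.2f} deg, {line_thickness_deg:.2f} deg, "
      f"{misalignment_deg:.2f} deg, {refresh_hz:.1f} Hz")
```

Under these assumptions the converted sizes round to the values given in the note, and the frame duration implies a display refresh rate of roughly 85 Hz.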

References

Brainard, D. H. (1997). The Psychophysics Toolbox. Spatial Vision, 10, 433–436.
Burgess, A. E., Wagner, R. F., Jennings, R. J., & Barlow, H. B. (1981). Efficiency of human visual signal discrimination. Science, 214, 93–94.
Efron, B., & Tibshirani, R. (1993). An introduction to the bootstrap. New York, NY: Chapman & Hall.
Eidels, A., Townsend, J. T., & Pomerantz, J. R. (2008). Where similarity beats redundancy: The importance of context, higher order similarity, and response assignment. Journal of Experimental Psychology: Human Perception and Performance, 34, 1441–1463.
Gold, J. M., Mundy, P. J., & Tjan, B. S. (2012). The perception of a face is no more than the sum of its parts. Psychological Science, 23, 427–434.
Gold, J. M., Murray, R. F., Bennett, P. J., & Sekuler, A. B. (2000). Deriving behavioural receptive fields for visually completed contours. Current Biology, 10, 663–666.
Graham, N., Robson, J. G., & Nachmias, J. (1978). Grating summation in fovea and periphery. Vision Research, 18, 815–825.
Grossberg, S., & Mingolla, E. (1985). Neural dynamics of form perception: Boundary completion, illusory figures, and neon color spreading. Psychological Review, 92, 173–211.
Hou, F., Lu, H., Zhou, Y., & Liu, Z. (2006). Amodal completion impairs stereo acuity discrimination. Vision Research, 46, 2061–2068.
Keane, B. P., Lu, H., & Kellman, P. J. (2007). Classification images reveal spatiotemporal contour interpolation. Vision Research, 47, 3460–3475.
Kellman, P. J., & Shipley, T. F. (1991). A theory of visual interpolation in object perception. Cognitive Psychology, 23, 141–221.
Kellman, P. J., Yin, C., & Shipley, T. F. (1998). A common mechanism for illusory and occluded object completion. Journal of Experimental Psychology: Human Perception and Performance, 24, 859–869.
Koffka, K. (1935). Principles of Gestalt psychology. New York, NY: Harcourt, Brace.
Liu, Z., Jacobs, D. W., & Basri, R. (1999). The role of convexity in perceptual completion: Beyond good continuation. Vision Research, 39, 4244–4257.
Lorenceau, J., & Shiffrar, M. (1992). The influence of terminators on motion integration across space. Vision Research, 32, 263–273.
Murray, R. F., Sekuler, A. B., & Bennett, P. J. (2001). Time course of visual completion revealed by a shape discrimination task. Psychonomic Bulletin & Review, 8, 713–720.
Nandy, A. S., & Tjan, B. S. (2008). Efficient integration across spatial frequencies for letter identification in foveal and peripheral vision. Journal of Vision, 8(13), Article 3. Retrieved from http://www.journalofvision.org/content/8/13/3
Pelli, D. G. (1981). Effects of visual noise (Unpublished doctoral dissertation). University of Cambridge, England.
Pelli, D. G. (1997). The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision, 10, 437–442.
Pelli, D. G., & Farrell, B. (1999). Why use noise? Journal of the Optical Society of America A, 16, 647–653.
Pessoa, L., Thompson, E., & Noe, A. (1998). Finding out about filling-in: A guide to perceptual completion for visual science and the philosophy of perception [Target article and commentaries]. Behavioral & Brain Sciences, 21, 723–802.
Pillow, J., & Rubin, N. (2002). Perceptual completion across the vertical meridian and the role of early visual cortex. Neuron, 33, 805–813.
Pomerantz, J. R. (2003). Wholes, holes, and basic features in vision. Trends in Cognitive Sciences, 7, 471–473.
Pomerantz, J. R., & Cragin, A. I. (in press). Emergent features and feature combination. In J. Wagemans (Ed.), Oxford handbook of perceptual organization. New York, NY: Oxford University Press.
Pomerantz, J. R., & Portillo, M. C. (2011). Grouping and emergent features in vision: Toward a theory of basic gestalts. Journal of Experimental Psychology: Human Perception and Performance, 37, 1331–1349.
Pomerantz, J. R., & Pristach, E. A. (1989). Emergent features, attention, and perceptual glue in visual form perception. Journal of Experimental Psychology: Human Perception and Performance, 15, 635–649.
Pomerantz, J. R., Sager, L. C., & Stoever, R. J. (1977). Perception of wholes and of their component parts: Some configural superiority effects. Journal of Experimental Psychology: Human Perception and Performance, 3, 422–435.
Ramachandran, V. S. (1988). Perception of shape from shading. Nature, 331, 163–166.
Ringach, D. L., & Shapley, R. (1996). Spatial and temporal properties of illusory contours and amodal boundary completion. Vision Research, 36, 3037–3050.
Sekuler, A. B., & Palmer, S. E. (1992). Perception of partly occluded objects: A microgenetic analysis. Journal of Experimental Psychology: General, 121, 95–111.
Tjan, B. S., Lestou, V., & Kourtzi, Z. (2006). Uncertainty and invariance in the human visual cortex. Journal of Neurophysiology, 96, 1556–1568.
von der Heydt, R., Peterhans, E., & Baumgartner, G. (1984). Illusory contours and cortical neuron responses. Science, 224, 1260–1262.
Westheimer, G. (1965). Visual acuity. Annual Review of Psychology, 16, 359–380.
Zeevi, Y. Y., & Mangoubi, S. S. (1984). Vernier acuity with noisy lines: Estimation of relative position uncertainty. Biological Cybernetics, 50, 371–376.
