Conclusions from the research

These results, to our knowledge, are the first documented evidence of the ability of pigeons to process such rapidly presented quasi-sequential visual information in a target search task. Despite the dynamically changing color values in the various RSVP conditions, the pigeons retained the capacity to locate the target region in these textured stimuli even at the fastest rates. Of most theoretical importance was performance in the 100 ms display-variable condition, where an entirely new display appeared every tenth of a second yet accuracy showed only a moderate drop in comparison to the static displays. A second important new finding was the superior localization accuracy observed whenever a display's dynamic changes were associated exclusively with the target -- in some cases actually raising performance above that recorded with the highly familiar static displays. As prefaced in the introduction, these results have several implications for our general understanding of how pigeons perceive, search, select, and process complex, hierarchically arranged visual information.

The temporal properties of texture processing. The first implication concerns the speed at which pigeons visually perceive and perceptually group a textured display's elements. We had previously estimated, using the indirect subtraction method described in the introduction, that pigeons could perceive textured differences within about a 150 ms time frame (Cook, 1992a). The present RSVP results experimentally verify this estimate, and lower its value closer to the range of human texture perception (Bergen & Julesz, 1983; Gurnsey & Browse, 1989). The best evidence for this claim is the birds' clear success at localizing targets in the various 100 ms conditions, most especially the six display-variable condition, where they experienced unique colors every 100 ms during the time most critical to their determination of the target's location. As such, those perceptual processes needed to detect and group similarly colored regional contrasts must have been sufficiently completed within this interval for any type of discrimination to have occurred. Had the birds required a longer interval to perceive these target/distractor contrasts, the rapid succession of colors would have both continuously masked the previous colors and left insufficient time to process the currently displayed pair, making discrimination impossible. Although performance was reduced in display-variable conditions in comparison to the static color displays, it is also clear that dimensionally-based target/distractor contrast information, useful for determining the target's location, was available to these birds at the highest rates of display presentation.

One might wonder whether at the highest rates the separate frames simply blurred together, fusing into a set of emergent, but stable, colors that in turn controlled performance. In contrast to the considerations just discussed, this line of argument suggests the birds' success in the current tests was due to a failure to resolve the individual frames of the displays. This does not appear to be what happened, however. First, studies of the pigeon's critical flicker fusion threshold verify this animal's capacity to have temporally resolved the frames of these displays even at the fastest rates (Graf, 1969; Hendricks, 1966; Henton, Ellingson, & Edwards, 1981). More directly from this experiment, the pigeons' above-chance performance in the target/distractor reversal condition, where the color of the two regions rapidly alternated back and forth, argues empirically against such an interpretation. If at the highest rates the different colors in each frame had temporally fused into new ones, then this specific condition should have resulted in a psychologically uniform stimulus. This was not the case. While performance in this potentially confusing condition was lower than when unique colors formed these regions across frames (i.e., the two and six display-variable conditions), even this moderate level of accuracy shows the birds' capacity to visually resolve the separate frames of the RSVP displays.

These findings have further implications for the time course of the kind of visual information that is extracted from complex hierarchical displays. Overall, these results are consistent with our previous hypothesis that pigeons learn and process texture-based oddity localization tasks by means of the display's relational global properties rather than its local absolute attributes (Cook, 1992a; 1993b; Cook et al., 1995). Again, performance in the 100 ms six-display condition is of most relevance. Because this condition's swiftly changing nature provided little opportunity to identify the absolute individual values of the colors presented on a trial, it would seem that success in this condition could have only been mediated by the relational target/distractor information. Recall that in this condition particular color values appear only briefly once or twice (a total presentation time of between 100 and 200 ms per color pair) within the 1000 ms normally taken to execute a localization response. Such presentation times are generally not sufficient in duration for birds to make absolute stimulus identifications. For instance, in element-compound matching-to-sample experiments, testing stimuli of comparable size to the individual elements of the current task, presentation durations of 100 ms or less are generally too short to support the absolute identification of a sample's properties (Cook, Riley, & Brown, 1992; Lamb & Riley, 1981). Further, when we tried to identify the specific individual colors of such dynamic displays we found it quite difficult, requiring many seconds of careful scrutiny to confidently name the colors. Given these considerations, the birds' success in performing with these ephemerally-valued, essentially "moment-unique," displays almost certainly must have been mediated by the global visual relations of the contrasting target embedded within the distractor region, and without specific reference to a display's absolute properties.

If the pigeons are rapidly extracting this type of higher-order visual relation from textured stimuli, it would follow that, when they are asked to respond to the presence or absence of a target contrast in a choice task, controlled brief presentations should be sufficient to support accurate choice responding. Using a texture-based "same/different" conditional choice discrimination, Cook & Wixted (in press) recently presented evidence, in the context of another issue, that 100 ms presentation times could support above-chance "same/different" discrimination. Together, the pattern of data from these converging lines of texture research suggests that certain kinds of relational information, especially as mediated by early grouping processes, involving the detection of a target (and possibly "objects" in general) and its likely location in visual space, is available to the birds prior to knowledge about other absolute properties, such as its color.

This differentiation obviously shares much in common with the proposed distinctions between the separate processing of "where" and "what" information in human visual search and perception research (Atkinson & Braddick, 1989; Green, 1991, 1992; Sagi & Julesz, 1985; Ungerleider & Mishkin, 1982; Zeki, 1993; see also Kirkpatrick-Steger & Wasserman, 1996). Because the target was stationary across successive frames in the present experiment and target localization was not required by Cook & Wixted's (in press) choice task, it cannot be determined from these specific data whether a single 100 ms textured flash is also sufficient for actually localizing the target's position in a textured stimulus. For instance, in the present setting, this kind of spatial information could have, and likely did, accumulate over frames within a trial. Nevertheless, while the exact time needed to localize a textured target's position in this information processing sequence is currently unknown, given the temporal properties of the birds' responding here, we speculate that this kind of information is present in the first few hundred milliseconds of a texture's presentation, before the birds know anything about what the target is.

The role of stimulus-driven processes in target localization. It has been proposed that human visual search and selection reflects the combination of two different processes: one consisting of a top-down, goal-driven form of directed attention and a second consisting of a bottom-up, stimulus-driven, automatic, interrupt-like process (e.g., Bravo & Nakayama, 1992; Yantis & Johnson, 1990). For instance, recent human research has suggested that spatially localized changes in luminance, relative novelty, and feature onsets and offsets can all automatically capture visual attention and guide it to the location of their occurrence (Johnston, Hawley, Plewe, Elliott & DeWitt, 1990; Theeuwes, 1995; Watson & Humphreys, 1995; Yantis & Johnson, 1990). As mentioned, one of our motivations for the present experiment was to begin exploring what role similar stimulus-driven mechanisms play in avian visual cognition.

Regarding this general issue, the six and two target-variable conditions are of most interest. At the higher rates of display change, the birds were far more accurate in these two conditions than in their corresponding distractor-variable conditions, and often better than with the static baseline displays. This facilitation indicates the presence of more information about the target's location in these types of dynamic displays than in the other conditions. What is the source of this facilitative information? One possibility that can be rejected is that with as many as six different targets appearing within a trial, there is a greater probability of a highly discriminable target appearing (even if very briefly) on these trials than in the static baseline condition. When performance is compared with the two and six distractor-variable conditions -- where the inherent discriminability of the colors is identical to that of their target-variable cousins -- the birds only showed facilitation when these color changes were spatially coincident with the target's location.

We propose this facilitation is better attributed to the hypothesis that the pigeons' processing or "attention" was automatically activated or attracted to the transient visual changes at the target's position, in a manner analogous to the mechanics of stimulus-driven "attentional capture" in humans. If this latter hypothesis is true, it suggests one reason why the distractor and display-variable conditions were generally more difficult for the birds. In these conditions, the output of such stimulus-driven processes would have competed with the birds' learned rule-driven search for the odd target, by consistently diverting processing to the changing values of the distractors in such displays.

Using a different type of search task, Shimp and Friedrich (1993) recently presented evidence that a briefly presented stimulus cue (a 50 ms white flash) also seemed to be automatically processed by pigeons, regardless of its relative validity to the subsequently presented target stimulus. Although their final explanation was more associative in nature, it similarly emphasized the general idea that search performance was a conjunction of "event-driven" and "knowledge-driven" processes. While many details clearly remain to be elucidated, these and other search results with pigeons (e.g., Blough, 1989) suggest a promising new direction for exploring the similarities and differences between human and nonhuman animals in the interaction of top-down and bottom-up processes in the selection and control of behavior by visual information.

Besides the new details about the structure and processes of avian visual cognition revealed by this experiment, especially about the timing and type of information extracted from complex texture displays and the contribution of stimulus-driven events to the control of search behavior, these data add to the growing evidence suggesting the functional similarity of human and avian early visual mechanisms (Cook, 1992b; Cook et al., 1996). They suggest both species have early perceptual mechanisms for rapidly computing information about the global organization and location of similarity-based visual contrasts. They also raise the possibility that these two species share functionally similar control mechanisms for automatically directing visual attention to locations of potential interest. Collectively, such similarities raise an interesting question. Do these similarities represent homologous visual processes derived from the common ancestry of birds and mammals more than 250 million years ago, or do they represent an example of convergent psychological evolution (Shepard, 1984), where species employing dissimilar visual architectures (Zeigler & Bischof, 1993) have evolved similar psychological solutions to the shared problems of rapidly and accurately processing the visual world? If it is the latter, it would suggest that the design of certain psychological mechanisms is closely tied to (constrained by) the structure of the physical world they are designed to process.