Avian Visual Cognition

V. A Synthetic Approach Using Natural Stimulus Classes

Complex Visual Classes

Research on pigeon categorization has produced ample evidence in favor of all three theories of categorization: exemplar, feature, and prototype theory. The most pessimistic conclusion might be that the artificial stimuli, or the artificial feature combinations, devised by the experimenter produced experimental artifacts in the pigeons' behavior. The birds may have simply learned "a set of nonsense stimuli associated with reward rather than relying on the type of classification they may use outside the test chamber" (Watanabe et al., 1993; p. 372). In their natural visual worlds, pigeons are faced with objects rather than with the simplistic stimuli used in psychophysical experiments (Fetterman, 1996). It is very likely that the current inability to specify the kind of features that pigeons extract from objects or complex scenes is the main reason for the success of associative theories. In order to provide evidence for either of the other two categorization theories, a precise knowledge of the information that enters the cognitive process of feature integration would be required. However, since the perceptual processes by which the patterns of excitation in sensory nerves are transformed into stable representations of objects and classes are unspecified, neither of these models has a chance of surviving.

Many of the above experiments have been interpreted in terms of human language concepts. The same cognitive revolution that earlier washed across the study of human learning also had a (delayed) impact on the study of animal behavior. In that era, comparative psychologists were tempted to employ a high-level semantic framework to describe complex classes appropriately. From the bird's point of view, however, even these semantic classes may have allowed a solution to the classification in terms of a few global dimensions of invariance, detected by surveying the entire scene. The basic stimulus aspects to which the pigeon is pre-adapted when performing complex categorization tasks still remain open to question. This lack of knowledge is not a sufficient reason for supposing "that pigeons are doing anything more complex than associating a large number of pictures and/or the features they contain with a reward, and then showing transfer to new pictures to the extent that they contain features previously associated with a reward" (Mackintosh, 2000, p. 125).

The Role of Texture and Two-Dimensional Shape in Images of Human Faces

It is a paradox that, despite the impressive progress physiologists and psychologists have made in understanding the pigeon's visual capacities, little progress has been made in understanding the role played by two of the most fundamental properties of any object in the environment: surface and space (from here on called "texture" and "shape"). Objects are described predominantly in terms of their morphology, or the shape and the geometry of their components, while their surface properties are often ignored. Only color and overall luminous flux have been considered in a post-hoc analysis (Lubow, 1974). Texture is mostly related to the surface properties of aerial photographs, landscapes or industrial materials. Furthermore, it has been reported that pigeons perceive textured displays much like humans, i.e. with the preattentive global perception of contrasting textural regions (Cook, 1992a; Cook, 1992b; Cook, 1993; Cook, Cavoto & Cavoto, 1995; Cook, Cavoto & Cavoto, 1996). We know of no attempt to investigate the role of texture in natural categorization tasks, only some vague speculations that texture might be a cue controlling some discriminations (e.g. in Lubow, 1974; Cook et al., 1990; Jitsumori & Ohkubo, 1996).

One reason for this may be the difficulty in finding the appropriate stimuli. On the one hand, the stimuli should be sufficiently complex to contain both shape and texture information relevant to and diagnostic for class membership. On the other hand, the stimuli should be sufficiently simple to enable the experimenter to control the amount of texture and shape information. In a series of recent experiments (Loidolt et al., 1997; Troje et al., 1998; Troje et al., 1999; Huber et al., submitted) we used human faces as stimuli and the concept "sex" to define class membership.

The stimuli were selected for three reasons:

1) "Male" and "female" are two categories that reflect natural stimulus variation, and evolved to be classified correctly. Although it is not easy to quantify the differences between them, male and female faces are sufficiently different to be easily discriminated by humans (96% correct; Bruce et al., 1993) and artificial neural networks (98% correct; Troje & Vetter, 1998). 

2) Pigeons are naive with respect to the task of classifying human faces according to sex, although they have undoubtedly experienced them. Training is thus completely under the control of the experimenter. 

3) Human faces provide complex variation in terms of both texture and shape.

In our initial experiment, we compared the classification performance of pigeons presented with different versions of the same set of stimuli. The stimuli could be distinguished according to their texture and shape information, and were derived from laser-scanned models of the faces of 100 men and 100 women (Troje & Buelthoff, 1996). The faces were free from any kind of accessories such as glasses or earrings. The men were carefully shaven and the hair on the head was digitally removed from the 3D-models. The 200 faces were randomly divided into two sets (A and B), each containing 50 male and 50 female faces. Group O was shown the original images (see a sample in Figure 23), while Groups T and S were shown images only after they had been subjected to a technique described in Vetter & Troje (1997), which involved separating the texture and shape components of each image. Group T was shown images generated by combining the original texture of each face with an average shape. This yielded an image set that varied with respect to texture but not shape (see a sample in Figure 24). Group S was shown images generated by combining the original shape of each face with an average texture, which yielded an image set that varied with respect to shape but not texture (see a sample in Figure 25). The 100 faces of Set A shown to Group O are simultaneously depicted in Figure 26. See also the homepage of N. F. Troje to learn more about the correspondence-based representations of faces.
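The construction of the three stimulus sets can be illustrated with a minimal sketch. It assumes that each face has already been separated into a shape vector and a texture vector; the dictionary keys `shape` and `texture`, the function names, and the choice to average over all faces passed in are hypothetical and are not taken from Vetter & Troje (1997). Rendering the recombined vectors back into images is not covered here.

```python
from statistics import fmean


def mean_vector(vectors):
    """Component-wise mean of a list of equally long vectors."""
    return [fmean(column) for column in zip(*vectors)]


def build_group_sets(faces):
    """Return (group_o, group_t, group_s) as lists of (shape, texture) pairs.

    faces: list of dicts with the hypothetical keys 'shape' and 'texture'.
    """
    avg_shape = mean_vector([f["shape"] for f in faces])
    avg_texture = mean_vector([f["texture"] for f in faces])
    # Group O: original shape and original texture.
    group_o = [(f["shape"], f["texture"]) for f in faces]
    # Group T: texture varies between faces, shape is held at the average.
    group_t = [(avg_shape, f["texture"]) for f in faces]
    # Group S: shape varies between faces, texture is held at the average.
    group_s = [(f["shape"], avg_texture) for f in faces]
    return group_o, group_t, group_s


# Two toy "faces" with two-dimensional shape and texture vectors.
toy_faces = [
    {"shape": [0.0, 2.0], "texture": [10.0, 30.0]},
    {"shape": [4.0, 6.0], "texture": [20.0, 50.0]},
]
toy_o, toy_t, toy_s = build_group_sets(toy_faces)
```

In this toy example every Group T item shares the same shape vector and every Group S item shares the same texture vector, which is the property that makes the two sets diagnostic for one information source only.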

The results of this experiment indicated that Groups O and T learned very quickly and accurately to discriminate faces, whereas Group S failed to do so (Figure 27). Comparing the overall performance at the end of training (i.e. the last 16 presentations of each image) revealed that all of the Group O (min r = 0.865; mean r = 0.951) and Group T (min r = 0.870; mean r = 0.913) subjects distinguished between the classes very well, while only three of the Group S subjects achieved r values of greater than 0.86 (mean of Group S: r = 0.713). These results suggest that pigeons are extraordinarily sensitive to texture differences, but that they find it very difficult to discriminate shapes.

The ability to generalize between stimuli is a widely used measure of whether an open-ended categorization capacity has been acquired. We tested the generalization ability of the pigeons by examining their spontaneous responses to novel images of human faces. The pigeons were presented with the 100 images from the previously unseen set of images. If the pigeons were using the information that generally divides male and female faces, then generalization to novel patterns should be easy. However, if pigeons were not using this information then generalization should be impaired. The results of this experiment indicated that the subjects assigned to Groups O and T generalized to novel faces. The 100 test patterns were divided into male and female at a level significantly beyond 0.001 for each bird (Mann-Whitney U-test), and response rates to the test patterns were nearly identical to response rates to the training patterns shown in these sessions (Figure 28). However, subjects assigned to Group S failed to show good transfer. Only those three birds that eventually mastered the original task transferred to the novel patterns, while the other five subjects failed to do so. The results of this transfer test support the conclusion that learning was not specific to the original training patterns; the 50 novel male and female faces lay within the pigeons' classification schemes. Even the three Group S subjects acquired a rule for separating male and female faces.

The Role of Low-Level Features in Images of Human Faces

The stimulus parameters that controlled the performance of successful subjects remain to be determined. Male and female faces differ both in average size and in average intensity. Female faces are generally smaller and brighter than male faces. Therefore, we computed the rank correlation between pecking rate to individual faces and either the average size or the average intensity of these images. The five parameters that describe the texture of images (energy, contrast, entropy, homogeneity, and Hurst) as well as the three components that describe their color (red, green, and blue) were also quantified. In order to exclude the partial correlation between pecking rate and sex that was not due to the parameter under investigation, we computed the correlation separately for male and female faces. The pecking rates of almost all Group O and Group T subjects correlated significantly with intensity (R>0.533; P<0.001), but not with any of the other texture parameters or size. Pigeons assigned to these groups appeared to use the intensity of faces as a cue to discriminate between male and female faces. However, in Group T there was one interesting exception. The pecking rate of the animal with the highest r value did not correlate with intensity. In Group S, we found no correlation between pecking rate and average intensity, although for the three birds that showed reasonable classification performance there was a weak correlation between pecking rate and size (R>0.320; P<0.05).
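The reason for computing the correlation separately within each sex can be made concrete with a small sketch. It assumes that the rank correlation is Spearman's coefficient, which the text does not state; the function names and the toy numbers are hypothetical. In the toy data, pecking depends only on sex, yet the pooled correlation with intensity is strong simply because the classes differ in intensity.

```python
def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def within_class_correlations(peck_rate, parameter, sex):
    """Rank correlation of pecking rate with a parameter, per sex label."""
    result = {}
    for label in sorted(set(sex)):
        idx = [i for i, s in enumerate(sex) if s == label]
        result[label] = spearman([peck_rate[i] for i in idx],
                                 [parameter[i] for i in idx])
    return result


# Toy data (invented): four female and four male faces.
sex = ["f", "f", "f", "f", "m", "m", "m", "m"]
intensity = [0.8, 0.9, 0.7, 0.85, 0.4, 0.5, 0.3, 0.45]
peck_rate = [9, 10, 11, 12, 30, 31, 32, 33]

pooled = spearman(peck_rate, intensity)
per_sex = within_class_correlations(peck_rate, intensity, sex)
```

Here `pooled` is strongly negative while both entries of `per_sex` are zero, which is the pattern expected when a parameter merely co-varies with sex and does not itself control pecking.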

Despite the high correlation between pecking rate and intensity, the performance of the successful Group O and Group T subjects cannot be explained in terms of the exclusive use of this stimulus parameter as a cue. Ranking the stimuli according to their average intensity reveals an r of 0.787, which is considerably smaller than the r values for response rates of Group O and T subjects. In contrast, while size is a much better cue for assessing the sex of a human face (r=0.924), only three of the eight Group S subjects were able to capitalize on it.
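Both the Mann-Whitney U-test used for the transfer data and the ranking of stimuli by a single parameter rest on pairwise comparisons between the two classes. The sketch below computes the U statistic and a normalized separation index U / (n1 * n2), which is 1.0 for perfect separation and 0.5 for chance. This section does not define its r statistic, so equating r with this index is an assumption; the function names and the toy values are hypothetical.

```python
def mann_whitney_u(positive, negative):
    """Count the (positive, negative) pairs in which the positive item
    has the higher value; ties contribute one half."""
    u = 0.0
    for p in positive:
        for n in negative:
            if p > n:
                u += 1.0
            elif p == n:
                u += 0.5
    return u


def separation_index(positive, negative):
    """U divided by the number of pairs: 1.0 = perfect, 0.5 = chance."""
    return mann_whitney_u(positive, negative) / (len(positive) * len(negative))


# Toy pecking rates (invented) for rewarded and unrewarded stimuli.
rewarded = [5, 7, 9]
unrewarded = [1, 6, 2]
u_value = mann_whitney_u(rewarded, unrewarded)      # 8 of 9 pairs ordered correctly
index = separation_index(rewarded, unrewarded)
```

The same two functions can be applied to a stimulus parameter instead of pecking rates, for example to the average intensity of each face, to ask how well that parameter alone would separate the classes.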

In a second test, the spontaneous responses of pigeons to images that were normalized with respect to their average intensity (Group T) or size (Group S) were measured. Group S subjects were presented with male faces from the test set that were the same average size as female faces, and female faces that were the same average size as male faces. The male faces were now smaller than the female faces, and if the Group S pigeons used size as a cue their pecking behavior should be reversed. The same logic was applied to the texture group for which intensity was normalized. Finally, as a control for the generality of our conclusion that texture information is much more readily used by pigeons to classify faces than shape information, we tested Group O using the two versions of the 100 test faces that were shown to Groups T and S during training.

The performance of Groups T and S in this test is shown in Figure 29. While the birds were able to discriminate the training stimuli, transfer to the test stimuli was impaired. For Group T, the difference between pecking rates in response to positive and negative test stimuli decreased to a level that was no longer significant. Thus, pecking in this group seemed to be strongly controlled by the brightness of the faces. For Group S, pecking behavior was slightly reversed with respect to training, indicating that size differences played an important role in the classification strategy of the successful subjects in this group. Finally, the question of whether texture was preferred over shape as a cue when both sources of information were available can be answered. Group O subjects showed a much weaker generalization decrement in the presence of test stimuli that differed with respect to texture than in the presence of test stimuli that differed with respect to shape (Figure 30).

The results of these experiments showed that the performance of successful birds was tightly controlled by differences in the average intensity of faces. However, average intensity could not have been the only cue that the birds were using because pecking behavior was not reversed during the transfer test of Group T, in which this parameter was exchanged between classes. In order to determine whether the birds would succeed on this task if average intensity was removed as a class discriminator, 15 subjects from the former Groups T and S were subjected to further training using texture-only stimuli that were normalized with respect to their overall intensity. Although this meant that intensity was factored out as a discrimination cue, the subjects mastered the task and performed at a level greater than chance by the end of training. Clearly, another stimulus property had taken the role of class predictor. If the pigeons had possessed memories of the faces that they experienced during the first training and testing phase, then they would not have had to relearn the classification during this phase of training.
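Removing average intensity as a cue amounts to giving every image the same mean pixel value. The sketch below does this by multiplicative scaling of a grayscale image stored as a list of rows; whether the original stimuli were scaled, shifted, or processed per color channel is not stated here, so the scaling rule, the function names and the toy image are assumptions. No clipping to a display range is applied.

```python
def mean_intensity(image):
    """Mean pixel value of a grayscale image given as a list of rows."""
    pixels = [value for row in image for value in row]
    return sum(pixels) / len(pixels)


def normalize_intensity(image, target):
    """Scale all pixels so that the image mean equals `target`."""
    current = mean_intensity(image)
    if current == 0:
        raise ValueError("cannot rescale an all-black image")
    scale = target / current
    return [[value * scale for value in row] for row in image]


# Toy 2 x 2 image (invented values) with a mean of 0.4.
toy_image = [[0.2, 0.4], [0.4, 0.6]]
equalized = normalize_intensity(toy_image, target=0.5)
```

Passing the mean intensity of the opposite class as `target`, rather than a common value, would give stimuli in which the parameter is exchanged between classes, as in the earlier transfer test.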

In order to provide quantitative support for the feature account, we investigated the pigeons' subjective separation of the feature space according to the experimenter-defined class rule using principal component analysis (PCA). We found significant correlations between pecking rate and some of the dimensions captured by PCA. There was a high correlation between pecking rate and the stimulus projection values of the second, and partly the first and the third, principal axes. Since it is impossible to determine what stimulus properties the pigeons extracted, we created synthetic faces that varied along these PC axes. Two opposite faces (each 6 standard deviation units away from the mean) from the first 20 principal components are depicted in Figure 31. From inspection of these images, we were able to tentatively conclude which aspects the pigeons were attending to. The first principal component picks up a difference in relative luminance between the upper and the lower half of the face, which is stronger in men than in women, presumably because of the shadow created by the beard. The second principal component includes a subtle difference in color between male and female faces; male faces are more red than female faces, while female faces are more blue and green than male faces. The third principal component is related to patterns of shading.
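The two steps described above, projecting stimuli onto a principal axis and synthesizing faces at a fixed number of standard deviations along that axis, can be sketched as follows. This is a minimal illustration that finds only the first axis by power iteration on the covariance matrix; further axes would require deflation or a full eigendecomposition. The function names and the toy data are hypothetical, and treating each face as a single flat vector is an assumption.

```python
import random


def column_means(rows):
    """Mean of each column of a list of equally long row vectors."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def covariance_matrix(rows, means):
    """Sample covariance matrix (divisor n - 1) of the row vectors."""
    n, d = len(rows), len(means)
    centered = [[v - m for v, m in zip(row, means)] for row in rows]
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_principal_axis(cov, iterations=500, seed=0):
    """Power iteration: returns (unit-length axis, variance along it)."""
    rng = random.Random(seed)
    d = len(cov)
    v = [rng.uniform(-1, 1) for _ in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    variance = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
    return v, variance


def projection(vector, means, axis):
    """Coordinate of a stimulus along the axis, relative to the mean."""
    return sum((x - m) * a for x, m, a in zip(vector, means, axis))


def synthetic_pair(means, axis, variance, k=6.0):
    """Two opposite synthetic stimuli, k standard deviations from the mean."""
    step = k * variance ** 0.5
    plus = [m + step * a for m, a in zip(means, axis)]
    minus = [m - step * a for m, a in zip(means, axis)]
    return plus, minus


# Toy stimuli (invented) that vary along a single direction.
stimuli = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
means = column_means(stimuli)
axis, variance = first_principal_axis(covariance_matrix(stimuli, means))
face_plus, face_minus = synthetic_pair(means, axis, variance, k=6.0)
```

The values returned by `projection` for each training stimulus are what would be correlated with pecking rate, while `synthetic_pair` yields the kind of opposite stimuli that can then be presented in test sessions.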

In order to determine whether the pigeons actually used these stimulus properties, or some combination of them, birds were subjected to test sessions involving the presentation of these PCA images. This revealed that their classification behavior was controlled by these feature parameters. In the case of the first two axes, animals perceived even small variations within +/- 1 standard deviation unit and responded in terms of a category decision (Figure 32). The strongest effect was found in the second axis representing color differences. This is not very surprising given the fact that pigeons have an extraordinary physiological capacity for the exploitation of color (Thompson, Palacios & Varela, 1992; Varela, Palacios & Goldsmith, 1993).

Although the above results cannot sufficiently disprove the possibility that item-specific details or higher-order stimulus aspects guided the pigeons' classification strategy, we were able to show that this was quite improbable. We measured the pigeons' spontaneous classification of interspersed test images that were derived from the original color images by substantially destroying the higher-order stimulus properties. For example, using a Gaussian filter we produced blurring and using a mosaic filter we produced block-portraits of the faces. Accurate responding was maintained across a large range of destruction in both Gaussian (Figure 33) and Mosaic tests (Figure 34).
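The two degradations can be illustrated with a short sketch for a single-channel image stored as a list of rows. The filter parameters used in the experiments are not given here, so `sigma` and `block` are hypothetical, as are the function names; a color image would be processed one channel at a time. The blur is a separable Gaussian with edge pixels repeated at the border, and the mosaic replaces each block by its mean.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian weights covering about three sigma per side."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the edges."""
    radius = len(kernel) // 2
    result = []
    for row in image:
        n = len(row)
        result.append([
            sum(w * row[min(max(x + k - radius, 0), n - 1)]
                for k, w in enumerate(kernel))
            for x in range(n)
        ])
    return result


def transpose(image):
    return [list(column) for column in zip(*image)]


def gaussian_blur(image, sigma):
    """Separable blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))


def mosaic(image, block):
    """Replace each block x block tile by the mean of its pixels."""
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for y0 in range(0, height, block):
        for x0 in range(0, width, block):
            cells = [(y, x)
                     for y in range(y0, min(y0 + block, height))
                     for x in range(x0, min(x0 + block, width))]
            mean = sum(image[y][x] for y, x in cells) / len(cells)
            for y, x in cells:
                result[y][x] = mean
    return result


# Toy 2 x 2 image (invented values).
toy = [[0.0, 2.0], [4.0, 6.0]]
blocky = mosaic(toy, block=2)          # every pixel becomes the tile mean
blurred = gaussian_blur(toy, sigma=1.0)
```

Increasing `sigma` or `block` removes progressively more fine detail while leaving coarse luminance and color distributions largely intact, which is why such manipulations bear on the question of whether higher-order stimulus properties are needed.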

In summary, these experiments were an exercise in animal visual categorization. Considered from a purely behavioral point of view, the present outcome would fit seamlessly into a list of experiments that provide evidence to suggest that pigeons form complex concepts (Herrnstein, 1985; Wasserman, 1995; Watanabe et al., 1993). When presented with the proper stimuli, pigeons learned quickly and generalized widely. Although pigeons have strong resources for learning specific exemplars, and display surprising cognitive capacities, neither categorization in terms of exemplar memorization nor in terms of abstract concept formation is plausible. Common to both of these theories is that they underestimate the pigeon's ability to instantaneously adopt a perceptual description of visual classes that are coextensive with natural categories. Cerella's (1979) oak leaf experiment and two experiments with blue jays, one on discriminating species of moths (Pietrewicz & Kamil, 1977) and one on patterns of leaf damage due to different species of caterpillars (Real et al., 1984), support this notion. These findings raise the question of whether animals might sort the complex objects of the natural environment, even the so-called higher-order concepts like "persons" and "fish", by fixing on some specific, single feature.

The surface properties of objects represent a feature domain that provides enough possible codes to reflect the actual distribution of reinforcement in the environment (Haralick, 1979; Pentland, 1984). Unfortunately, the surface properties of images have never been seriously considered as providing the appropriate descriptor of seemingly complex stimulus classes. In contrast, considerable effort has been made to construct artificial categories out of simple forms such as line drawings in order to control for feature content. In our own experiments we were able to show that surface properties are not only sufficiently informative for pigeons to easily classify a particular complex natural category, but are perhaps, at least for this species, superior to shape attributes.

Even if a class definition based on surface attributes remains obscure to the experimenter, pigeons may utilize this lower-order statistic inherent in pictures by the effortless, preattentive processes of perception (Marr, 1982). A sophisticated texture analyzing system might be of great value for viewpoint-independent object recognition, for the recognition of objects without concrete boundaries, and for the recognition of degraded or partially occluded objects (Julesz, 1981; Julesz & Kröse, 1988).

VI. Conclusions

Although there is now ample evidence to suggest that pigeons, and many other species of bird and non-human animal, navigate through their visual world by attending to only a small subset of relevant features and by learning to associatively integrate these features into a class rule, this is not sufficient reason to discredit exemplar memorization as another useful classification strategy. Furthermore, it is quite reasonable to assume that many visually gifted species are able to attend to the relations between different aspects of the same stimuli. However, whether pigeons are also able to utilize relations between two or more stimuli (e.g. one is brighter than the other) is still an open question. Recent evidence in its favor has come from experiments on same/different concept learning by Cook and coworkers as well as Wasserman and coworkers (but see their chapters in this volume). Nevertheless, it seems quite reasonable to assume that representations of stimuli at the level of relationships between two or more arrays are at the limit of the cognitive capacity of pigeons (see also Mackintosh, 2000) or a cognitive last resort. From this quite conservative estimation, and the fact that "concepts" cannot be sufficiently decoupled from linguistic competences (Chater & Heyes, 1994), a focal shift towards the interface between sensory analysis and associative integration in the classification mechanisms of pigeons seems warranted. If so, categorization research with pigeons will continue to constitute a fruitful endeavor in the field of comparative cognition into the next century.
