Avian Visual Cognition


Next Section: Feature Learning



"Generalization within classes and discrimination between classes - this is the essence of concepts"(Keller & Schoenfeld, 1950, p. 155). 

II. Exemplar View 

The universality and simplicity of generalization and discrimination have led many theorists to the most elementary level of categorization, i.e. behavioral analysis (e.g. Wasserman & Astley, 1994). According to behavioral analysis, which is based on the principles of conditioning, classes represent sets of stimuli with identical functions. These sets of stimuli are unified, not by a rule, a common feature, or an abstract representation, but by a common psychologically significant consequence (Skinner, 1935). In the terminology of human categorization theorists, exemplar models predict categorization in such a manner.

Exemplar models are the most parsimonious models of categorization in terms of the underlying associative mechanism (see Chase & Heinemann (2001) for more on exemplar models and an actual working example). Proponents of exemplar models assume that intact stimuli are stored in memory, and that classification or recognition is determined by the degree of similarity between a stimulus and the stored exemplars. Simple generalization effects explain correct classification of novel (previously unseen) instances of categories. The tendency to respond to instances to which an individual has already been exposed will generalize to a similar, novel stimulus when it is first presented. This means that only the item information is used for classification decisions, and that categorization relies on the comparison of a new stimulus with known exemplars of the category. Therefore, generalization gradients have to be anchored to individual stimuli, and the individual exemplars have to maintain their memorial integrity even if the number of experiences relevant to a category is increased dramatically.
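The core assumption described above, that a novel stimulus is classified by its similarity to stored exemplars, can be illustrated with a minimal sketch. It is not the model of Chase and Heinemann (2001) or of any other cited author. The binary feature coding of the stimuli, the city-block distance, the exponential similarity gradient, and the names `similarity`, `classify`, and `sensitivity` are all assumptions made for illustration.

```python
import math


def similarity(a, b, sensitivity=1.0):
    """Similarity decays exponentially with city-block distance (assumed gradient)."""
    distance = sum(abs(x - y) for x, y in zip(a, b))
    return math.exp(-sensitivity * distance)


def classify(stimulus, memory, sensitivity=1.0):
    """Return the label whose stored exemplars are, in sum, most similar to the stimulus."""
    evidence = {}
    for exemplar, label in memory:
        evidence[label] = evidence.get(label, 0.0) + similarity(
            stimulus, exemplar, sensitivity
        )
    return max(evidence, key=evidence.get)


# Hypothetical memory: every training stimulus is stored intact with its outcome.
memory = [
    ((1, 1, 0, 0), "positive"),
    ((1, 0, 0, 0), "positive"),
    ((0, 0, 1, 1), "negative"),
    ((0, 1, 1, 1), "negative"),
]

novel = (1, 1, 0, 1)  # never seen during training
print(classify(novel, memory))  # prints "positive"
```

In this sketch no category rule or prototype is stored anywhere. The response to the novel stimulus arises only from generalization gradients anchored to the individual exemplars.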

There are, however, a number of problems with the exemplar theory of categorization. Proponents of such theories cannot agree as to: a) how many exemplars can be stored or retrieved for comparison, and b) how similarity can be computed so as to ensure that responding generalizes only to those instances of the same category. The first problem is not only a matter of memory capacity, but also of feasibility. Categorization, according to exemplar-based processing, is determined by the number of exemplars to be remembered. "Therefore, any exemplar model which assumes that all experienced exemplars are remembered may not apply to pigeon categorization when birds are trained with large numbers of non-repeating stimuli" (Bhatt et al., 1988). This criticism of the exemplar model can be overcome by assuming that categorization is based on a small subset of the total number of stimuli, or that specific retrieval rules act to determine which patterns are most likely to be accessed. The same is also true of novel stimuli presented in a generalization test. The average distance model (Reed, 1972), for example, assumes that the classification process involves the computation of similarity distances between the new pattern and all known members of a category. According to this extreme variant of the exemplar model of categorization, the subject stores all experienced patterns in memory, and retrieves information when a new pattern has to be classified. Some exemplar models do, however, allow for more abstract representations (e.g., Medin & Schaffer's context theory, 1978).
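The logic of an average distance rule can be sketched as follows: the new pattern is compared with every stored member of each category, and it is assigned to the category with the smallest mean distance. The sketch is only an illustration of the idea. The city-block metric, the binary pattern coding, and all function names are assumptions, not details taken from Reed (1972).

```python
def city_block(a, b):
    """Number of positions in which two equally long patterns differ (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def average_distance(pattern, members):
    """Mean distance between a pattern and all stored members of one category."""
    return sum(city_block(pattern, m) for m in members) / len(members)


def classify_by_average_distance(pattern, categories):
    """Assign the pattern to the category with the smallest average distance."""
    return min(categories, key=lambda name: average_distance(pattern, categories[name]))


# Hypothetical stored patterns; every experienced exemplar is kept in memory.
categories = {
    "A": [(1, 1, 1, 0), (1, 1, 0, 0), (1, 0, 1, 0)],
    "B": [(0, 0, 0, 1), (0, 1, 0, 1), (0, 0, 1, 1)],
}

print(classify_by_average_distance((1, 1, 1, 1), categories))  # prints "A"
```

Because every stored member enters the computation, the cost of a single classification grows with the number of remembered exemplars, which is the feasibility problem raised above.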

One may question the relevance of artificial tasks for natural categorization. Learning about perceptual classes in nature by recognizing every instance may sound like an impossible feat, but it has to be taken seriously. Although the number of possible pictures of "a tree" or "not a tree" is infinite, the actual number used in any particular experiment is likely to be finite. A series of experiments reported by Vaughan and Greene (1984), for example, revealed that the pigeon has quite striking powers of exemplar learning and retention. In these frequently cited experiments, eighty pictures of outdoor scenes were arbitrarily divided into two categories, positive and negative, and shown to three pigeons daily in different sequences. The only way in which performance on this task could have improved would be if each pigeon memorized whether each exemplar was positive or negative, since there was neither a concept nor a feature rule that distinguished the two categories. Against the odds, the pigeons were not only able to sort the 80 pictures correctly (with a probability of discriminating by pure chance well beyond 10^-10), but they also learned to respond correctly to no fewer than 320 such pictures.
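The order of magnitude of such a chance probability can be checked with a back-of-the-envelope calculation. The sketch below assumes that each of the 80 pictures is guessed independently with a probability of 1/2. This is an illustrative assumption only, and not the statistic reported by Vaughan and Greene (1984).

```python
from fractions import Fraction

n_pictures = 80

# Assumed chance model: an independent 50/50 guess for every picture.
p_all_correct = Fraction(1, 2) ** n_pictures

print(float(p_all_correct))   # about 8.3e-25
print(p_all_correct < Fraction(1, 10**10))  # prints True
```

Under this assumption, sorting all 80 pictures correctly by guessing is many orders of magnitude less likely than 10^-10.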

Fersen and Delius (1989) subsequently obtained even more outstanding results. Their pigeons were capable of discriminating 100 different positive stimuli from a further 625 similar negative stimuli. Tests conducted with novel stimuli indicated that the birds had not only memorized the positive stimuli, but also the negative ones. Furthermore, the birds retained this information over a period of several months, supporting the view that persistent and capacious memories are not restricted to food-storing animals, but may reflect a fitness advantage accruing from extensive and thorough knowledge of the environment.

Given the remarkable capacity of pigeons for learning specific exemplars, some researchers may tend to judge it as a sign of dullness, rather than of intelligence. The ability to learn relational or abstract concepts is more likely to provide evidence of intelligence. Throughout this chapter, however, I will argue for the need to be cautious in this respect. Greene (1983) provides an illustrative example of the overestimation of the pigeon's spontaneous discriminative abilities. Greene trained pigeons to discriminate between slides on the basis of a "repetition" concept. However, those birds actually mastered the task by responding to minute differences between the two "copies" of the slides. In the following section, I will offer yet another example of exceptional exemplar recognition by pigeons that were originally trained on an abstract concept. In this example, the pigeons were required to discriminate between different "chess-board patterns" according to the presence or absence of vertical symmetry.

Learning By Rote Instead of Abstracting A Symmetry Concept

The data presented in this section are from a long-term study of symmetry recognition that involved several stages and variants (Huber et al., 1999). However, I will confine my report to a single experiment that provides ample evidence of the pigeons' ability to switch from learning about the experimenter's abstract class rule to learning about the more congenial instances. In this and all following experiments, we used members of the lively and robust "Strasser" strain of pigeon. A total of 16 subjects were trained to discriminate a set of 20 vertically symmetrical chessboard-patterns (assigned as "positives" for one group, and as "negatives" for the other group) from a set of 20 patterns without a symmetrical axis (with the reverse contingencies).

We used two different types of black-and-white chessboard patterns in order to test certain models of symmetry recognition. One type (the "COMPACT" pattern) consisted of a vertical axis with "whiskers" on both sides. In the case of the symmetrical pattern, the whiskers on one side of the vertical axis were a mirror image of the whiskers on the other side. In the case of the asymmetrical pattern, the whiskers were randomly distributed across both sides of the axis (Figure 3). The outer contours of all the figures were filled black. The exemplars of the second type of pattern (the "SCATTERED" pattern) were much more like chessboard patterns; i.e. the small elements were scattered across a 6x6 matrix (Figure 4). However, it would detract from the main point of this study if the differences between these two types of stimuli with respect to the assessment of symmetry were discussed more thoroughly. Only the striking difference between the learning curves of the two experimental groups warrants mentioning this difference.

Members of the stimulus classes appeared, one at a time, on a computer monitor close behind a transparent pecking-key. The pigeons were required to peck frequently when a positive stimulus appeared in order to obtain food, and to suppress pecking when a negative stimulus appeared in order to terminate a non-reinforced trial. This type of go/no-go procedure (see details in Figure 5 below) was first used in Herrnstein's laboratory (Herrnstein et al., 1976; Herrnstein, 1979; Vaughan & Greene, 1984) and has also been repeatedly employed in our own laboratory.

Figure 5: The go/no-go successive discrimination procedure

A standard training session, which is administered once a day, five times a week, consists of 40 trials (elementary training unit). At the beginning of a trial a stimulus is presented directly behind the pecking key. During the first 10 s of a trial, pecks emitted onto the response key are counted but have no scheduled consequences. Only these responses enter into the data analysis. Following this 10-s period, and a further 10-s variable interval (VI) with a range of 1-20 s, subjects enter the consequential phase of the experiment. If the stimulus is positive, the first response to occur within 2 s of the previous response results in reinforcement being made available for a 3-s duration. If a negative stimulus is presented, the trial is terminated if the pigeon withholds pecking for 8 s during the consequential phase. If, however, the bird pecks continuously during the presentation of the negative picture, following the timing-out of the VI schedule, the picture presentation remains active. After trial termination, an inter-trial-interval (ITI) of 3 s follows, during which time the monitor is dark. The sequence of positive and negative trials follows a pseudo-random schedule. No more than 4 positive or 3 negative trials can occur in a row and the first trial is always positive.
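The constraints on the trial sequence can be made concrete with a short sketch. It is not the software used in the laboratory. It assumes that a 40-trial session contains 20 positive and 20 negative trials, and it simply reshuffles the sequence until the stated constraints are met. The names `MAX_RUN`, `is_valid`, and `make_session` are hypothetical.

```python
import random

# Longest permitted run of each trial type, as stated in the procedure.
MAX_RUN = {"+": 4, "-": 3}


def is_valid(sequence):
    """Check that the first trial is positive and that no run is too long."""
    if sequence[0] != "+":
        return False
    run_length = 0
    previous = None
    for trial in sequence:
        run_length = run_length + 1 if trial == previous else 1
        if run_length > MAX_RUN[trial]:
            return False
        previous = trial
    return True


def make_session(n_positive=20, n_negative=20, rng=random):
    """Shuffle the trials until the sequence satisfies all constraints.

    The 20/20 split is an assumption; the text only states 40 trials per session.
    """
    trials = ["+"] * n_positive + ["-"] * n_negative
    while True:
        rng.shuffle(trials)
        if is_valid(trials):
            return list(trials)


session = make_session(rng=random.Random(1))
print("".join(session))
```

Reshuffling until a sequence passes keeps the sketch short and samples all admissible sequences with equal probability, at the cost of discarding many shuffles.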

If pigeons learn, this procedure results in a high rate of responding to patterns identified as positive, and a low or zero-rate of responding to patterns identified as negative. In trials in which pigeons are uncertain as to which group a stimulus belongs to, they peck at an intermediate rate. Learning speed and accuracy are measured using Herrnstein's rank-order statistic, rho.

Plotting the learning curves of the two groups in a single figure (Figure 6) reveals a surprising pattern of results. The investigation of symmetry conceptualization was overshadowed by a strong effect of stimulus type. Pigeons that were presented with the compact patterns experienced considerable difficulty when discriminating between symmetrical and asymmetrical stimuli. On the other hand, pigeons presented with the scattered figures showed substantial discriminative abilities. This result is surprising mainly with respect to the abstraction of the symmetry rule. If this rule is nothing more than a logical device, then no difference between the types of stimuli should have occurred. For example, humans informed about the underlying concept would have no difficulty in sorting the figures in both cases. If its acquisition, however, is bound to particular visual aspects of the figures, as many pattern recognition theories suggest (e.g. Osorio, 1996), then one may wonder whether we should continue to speak about abstract concepts at all. In any case, generalization tests involving the presentation of novel stimuli should allow us to determine whether the successful subjects learned the task solely by memorizing the specific training exemplars.

Such an endeavor requires the presentation of specifically selected test patterns, rather than the presentation of previously unseen instances of the training sets. In principle, the ability to capitalize on abstract relations, such as symmetry, can expand the class boundary by an unlimited amount, since boundaries will no longer be restricted by absolute class characteristics. Virtually any picture or object that is bilaterally symmetrical should be correctly classified when it is first encountered. Generalization then becomes a matter of inference, rather than of perceptual similarity. If, on the other hand, common features are involved, then transfer should occur along this dimension. Generalization will then become a matter of similarity. Finally, learning the training stimuli by rote in a photographic, integrative manner should severely impair generalization performance. Even minute changes in the idiosyncratic structure of memorized stimuli should result in a sharp trade-off. Generalization should then become a matter of indiscriminability.

In the type of transfer tests that we have chosen, we were able to determine the pigeons' categorization strategy from their unforced choice of novel stimuli, which were selected because of their similarity to the training patterns. In the first transfer test we simply used novel exemplars from homologous sets. Twenty novel compact figures and 20 novel scattered ones were interspersed in further training sessions (see stimuli in Figure 7). We then examined whether transfer was limited to the training set by exchanging the test stimuli, e.g. showing the COMPACT group the scattered figures and vice versa. A negative result here would imply that transfer was dependent on absolute class characteristics. Even if the pigeons did master the transfer test, it would not provide unequivocal evidence of the possession of a pure abstract symmetry concept. Therefore, a more precise assessment of whether performance was contaminated by remembered stimulus aspects was sought using a final test that involved modified training patterns (see stimuli in Figure 8). For each group, we modified five symmetrical training patterns in six steps, three of which generated asymmetric patterns. These steps corresponded to various amounts of change (either two, four, or six square elements were displaced, omitted, or added). Thus, transfer behavior could vary according to either a) degree of symmetry, or b) amount of change. If it were found that the birds were influenced by a), then this would provide evidence of symmetry conceptualization, whereas an effect of b) would indicate that transfer was the result of similarity to memorized training patterns.
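The two properties that this final test pits against each other can be stated precisely with a small sketch. It assumes that a pattern is coded as a matrix of 0s and 1s, and that the amount of change is counted as the number of cells that differ from the training pattern. The 4x4 patterns and the function names are hypothetical and are not the stimuli of Figure 8.

```python
def is_vertically_symmetric(grid):
    """True if every row reads the same from left to right as from right to left."""
    return all(row == row[::-1] for row in grid)


def amount_of_change(original, modified):
    """Number of cells in which the modified pattern differs from the original."""
    return sum(
        a != b
        for row_o, row_m in zip(original, modified)
        for a, b in zip(row_o, row_m)
    )


# Hypothetical symmetric training pattern (1 = black element, 0 = white).
training = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]

# Two elements added as a mirror pair: symmetry is preserved.
symmetric_variant = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 0, 0, 1],
]

# Two elements added on one side only: same amount of change, symmetry lost.
asymmetric_variant = [
    [1, 1, 0, 1],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 0, 0, 0],
]

for name, variant in [("symmetric", symmetric_variant), ("asymmetric", asymmetric_variant)]:
    print(name, is_vertically_symmetric(variant), amount_of_change(training, variant))
```

Both variants differ from the training pattern by the same amount but differ in symmetry, so a symmetry concept and a similarity-based strategy make opposite predictions about how they should be classified.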

The first two tests, which involved the presentation of test stimuli from the birds' own training class as well as from a foreign training class, revealed that only among the successful birds of Group SCATTER did performance generalize to novel instances. However, even this successful transfer was restricted to those instances belonging to the birds' own training class (SCATTER), but not to the foreign class (COMPACT). That transfer performance was severely bound to common absolute stimulus characteristics was shown more directly by "tacitly" presenting "ambiguous" figures, or stimuli that closely resembled (symmetric) training patterns but whose symmetry content had now changed (those three modifications of each of five originally symmetric figures that now became asymmetric; see Figure 8).

This forced choice test revealed that generalization was solely the result of similarity rather than the possession of a symmetry concept. Figure 9 shows that all test stimuli were judged as similar to symmetric training stimuli (little deviation from symmetrical training stimuli, yellow bars), but as dissimilar to asymmetric training stimuli (large deviation from symmetrical training stimuli, red bars), regardless of their own symmetry type. Thus, irrespective of whether a test stimulus was symmetric or asymmetric, it was classified according to its similarity to the symmetric training stimulus it was derived from. Transfer was a matter of similarity and not of symmetry. Pigeons generalized, if at all, very conservatively (i.e., to only a small range around the stored templates, much like a cross-correlation-based template-matching system, Cerella, 1990).

The difference in the ability of the two groups of pigeons to store their respective training instances remains to be explained. If pigeons are living "photo-cameras", with large films that store the pictures they are presented without any further decomposition processes occurring, then such huge learning differences should not have occurred. Clearly, if one is only interested in models of categorization or conceptualization, then one can omit further discussion of this finding, or leave it to theories of pattern recognition and memory (as was done predominantly in the human literature on template matching). Those interested in the basic mechanisms underlying categorization in animals, however, require a more rigorous investigation of how pictures are processed in the animal's brain. This means that we will have to enter into a discussion of how visual objects or scenes are represented in the brain, and what aspects of these objects or scenes are used for discrimination or categorization tasks.

    Exemplar-based categorization of birds versus mammals

    Very clear support for the exemplar view of visual categorization was found in an experiment by Cook et al. (1990) in which pigeons were required to learn a discrimination between line drawings depicting naturally occurring objects, namely birds and mammals. The primary evidence for this came from the uniform rates of discrimination acquisition among groups that were trained either with the S+ and S- stimuli unified by a category (the true categorization condition) or with reinforcement being uncorrelated with category membership (the pseudoconcept condition). Further evidence for learning each exemplar separately came from the significant facilitation of acquisition by using only 5 instances per category instead of 35 instances per category. This difference in learning speed can be explained by a lack of within-category generalization, an important facet of relational or featural learning. Finally, the fact that transfer to novel category instances was determined not by the typicality of the pictures ('good' or 'poor' exemplars as assessed by human raters) but by the specific nature of the exemplars used during training is also in line with the exemplar view of categorization.

    An important facet of this experiment is the successful specification of those aspects of the stimulus array that entered into the pigeons' memory. Using specific test variants of the pictures, the authors found that not the entire picture was stored, but only the animal figures themselves. From these figures, some features were not controlling the pigeons' classification behavior, like the 90 degree rotation or the reflection about the vertical axis. It seems as if the birds had selectively attended to specific aspects of the stimulus array and that they had decomposed the pictures into at least figure and ground. Such analytic processes, however, are perfectly in line with a more feature-based account of categorization and this apparent theoretical incompatibility points to a fundamental problem of perceptual learning: How do information-processing systems make determinations of similarity (Blough, 2001; Lea, 1984; Shepard, 1987)?

In principle, we may distinguish between item-specific and category-specific aspects of the stimuli. The first ones are needed to discriminate between instances of the same class; the second ones are needed to discriminate between instances of different classes. Differential within- versus between-class generalization is commonly considered as a key feature of visual categorization in animals (Wasserman & Astley, 1994). If no category-specific information is available as a kind of relational information about common properties of a category, then classification learning will be restricted to learning about each individual stimulus separately and distinctly. If it were available, learning to use it would significantly facilitate classification, both in terms of acquisition and transfer to novel instances. The facilitation of classification is due to intra-class generalization; that is, transfer to novel instances of the same class is mediated by generalization or item similarity to the previously learned exemplars. The classic procedure to disclose learning about category-specific features is the pseudoconcept task, which involves the arbitrary assignment of category (for example "fish", Herrnstein & de Villiers, 1980) and non-category exemplars ("non-fish" in this example) to positive and negative classes, respectively. Lea and Ryan (1990) called this the "perverse pseudoconcept task" in order to distinguish it from the alternative "random pseudoconcept task" (in which no 'concept' exists within the stimulus set).

A number of categorization experiments involving such pseudoconcept tests have successfully shown that pigeons learn about category-specific information in contrast or in addition to item-specific information (Herrnstein & de Villiers, 1980; Edwards & Honig, 1987; Pearce, 1988; Wasserman, Kiedinger & Bhatt, 1988). On the other hand, using a similar procedure, some authors have found differences in acquisition rate and concluded that the animals achieved a discrimination by learning only about the specific stimuli in the acquisition phase (Vaughan & Greene, 1983; Schrier, Angarella & Povar, 1984; Cook, Wright & Kendrick, 1990).

A very reasonable suggestion brought forward by Cook et al. (1990, also see sidebar) is that categorical discriminations may consist of two phases, a "stimulus learning" and a "concept" phase. The first would involve only learning about the exemplars by attending to the item-specific information that distinguishes them from all other stimuli. The second phase, in which category-specific information is extracted, may then follow. In cases in which it does not, as in our symmetry experiment above, true open-ended categorization as a means of minimizing memory requirements is not achieved. It is fair to say that absolute, data-driven, exemplar-based strategies of categorization are a plausible alternative to more relational and analytic processing mechanisms, especially from an engineering point of view (Cook et al., 1990). However, the relevance for pigeons and other animals in the wild remains to be specified (see an extensive discussion of this issue in Huber, 1995, 1999).
