In order to survive,
organisms must respond appropriately in an enormous number of stimulus
situations. They must learn which stimuli provide information relevant
to their goals, and use that information to attain these goals. How is
this accomplished? The purpose of this chapter is to show how a simple
exemplar model of memory and decision making can account for the acquisition
and use of information. We express the model as a computer program that
we refer to as the Natural Intelligence Model (NIM). This model has been
developed to account for the behavior of pigeons seeking food under controlled
conditions. However, we believe that, with modification and expansion,
an exemplar model such as NIM is applicable to the analysis of behavior
in more complex situations, as well as behavior motivated by incentives
other than food, and that it is applicable to other organisms as well as
pigeons.1 The model is a developing
entity that we make available here as an executable program. The program
may be used to compare discrimination of stimuli that vary in discriminability
(d') along one or two dimensions including outline drawings.
Importance of Discrimination Training
As an exemplar model
of memory, NIM assumes that individual events or instances, rather than
an abstraction of the common features of these instances, are represented
in memory. Whether or not a particular exemplar is stored in memory depends
on the consequences of the organism’s interaction with its environment;
that is, information stored in exemplar memory is selected as a
result of differential reinforcement. The term reinforcement
is used here in a sense similar to that proposed by Guthrie (1959) and
Estes (1950); that is, reinforcement is the co-occurrence of stimulation,
behavior, and the consequences of the behavior. Following this definition,
we define exemplars as representations of sensations induced by environmental
stimuli, the subject’s behaviors in the presence of these stimuli, and
the consequences of these behaviors. The term differential reinforcement
refers specifically to the differences in the consequences of the behavior
that occurred in the presence of specific stimuli.
Differential reinforcement is the outcome of the training procedure referred to as discrimination
training. There are two types of discrimination training procedures. In
the “go, no-go” procedure the response is reinforced in the presence of
one stimulus (the positive stimulus), but not in the presence of another
(the negative stimulus). This procedure provides a signal (stimulus) that
a behavior (response) will or will not be reinforced. For example, a pigeon
may be trained to peck at a disk (pecking key) only when it is illuminated.
The second type of discrimination training involves choice between two
or more alternatives. For example, more than one key may be illuminated
and the pigeon must decide which key to peck. If the correct key is pecked,
reinforcement follows; if the incorrect key is pecked both keys are darkened
and neither choice will provide reinforcement until the keys are illuminated
once again. In the experiments to be described here, we shall deal primarily
with situations involving choice between two alternatives.
Experiments (e.g., the often-cited experiment of Jenkins & Harrison, 1960) have
shown that discriminations established by differential reinforcement are
more precise than those obtained with non-differential procedures in which
a selected response is simply reinforced repeatedly in the presence of
the chosen stimulus. Differential reinforcement affects the detection
of the discriminative stimuli. It determines whether or not changes in
these stimuli will be noticed and remembered. It figures importantly in
all aspects of learning, from discriminations based on tonal frequency,
to pattern recognition and to categorization.
Characteristics of Exemplar Memory
The term “exemplar
memory” is most often used to account for the results of experiments in
which it is evident that specific details of the discriminative stimuli
are remembered. This is most apparent in experiments in which the stimuli
are complex, such as the pictures of natural scenes generally used in experiments
requiring sorting according to categories such as trees, people, fish, or
even paintings by artists whose work has a characteristic style (Watanabe,
1995). In such experiments, a set of pictures illustrative of the category,
and a set of pictures in which exemplars of this category are absent or
are exemplars of another category, are selected. These usually are quite
varied so that the defining characteristics of each set cannot be clearly
specified. Following training, in which differing responses to these two
sets of pictures are reinforced, additional pictures, representative of
these two sets, are presented. Almost invariably excellent generalization
to these novel instances is found. It appears that pigeons are able to
categorize the stimuli according to the experimenter’s definition of the
category in question. This has led to the suggestion that the pigeon bases
its choice on an “overarching principle,” “family resemblance,” or a “polymorphous
concept” (see Huber, 2001, and Urcuioli, 2001, for examples
of such experiments and a review of the literature). Whether or not categorization
is based on abstraction of a unifying concept is yet unresolved (see Young
& Wasserman, 2001). What is
of interest here is the finding that, in many cases, pictures may be categorized
on the basis of specific details that have no relation to the concept in question.
For example, Greene
(1983), in an attempt to determine whether pigeons can recognize that the
same scene was shown a second time, discovered that pigeons are able to
identify slides that differ only in very minor details. For this experiment
photographs of the same scene were taken by hand in rapid succession with
the intention of minimizing the availability of cues other than the intended
concept, “first presentations are positive, repetitions negative.” The
same slides were used for first and second presentations of the scene.
After the task was learned the second slide was presented first. This manipulation
should not have disrupted performance if the “repetition concept” had been
acquired. This was not the case. The disruption in performance indicated
that the pigeons solved the task largely by relying on subtle differences
in the appearance of the slides. Greene concluded that “pigeons are extremely
good at sorting out stimuli based on very small perceptual features and
on the association of these features with reinforcement.” (p. 216). By
arbitrarily designating the pictures “positive” (reinforced) or “negative”
(not reinforced) Greene and Vaughan (1983) also found that it is not necessary
to provide a unifying concept for pigeons to categorize pictures on the
basis of their relation to past reinforcement. In fact, Vaughan and Greene
(1984) showed that pigeons were able to discriminate between 160 pairs
of such slides and perform this discrimination when tested two years later.
Further evidence that pigeons remember details of the pictures comes from
experiments in which the background of the pictures controlled behavior
more than the intended unifying concept (Greene, 1983; Jitsumori, 1996).
In the experiments
described above, the go, no-go discrimination training procedure was used,
with rate of response serving as the measure of learning. Pigeons can be
trained to respond differentially to pictorial stimuli in a choice situation
as well. Typically, the trial is initiated by the appearance of the discriminative
stimulus. A peck at this stimulus illuminates the choice keys signaling
the opportunity to make a choice. In an experiment similar to that of Vaughan
and Greene (1984), Heinemann, Ionescu, Stevens and Neiderbach (unpublished)
trained pigeons to peck the left key in the presence of 320 slides and
the right key in the presence of another set of 320 slides. As in the Vaughan
and Greene experiment, individual slides were arbitrarily assigned as “correct”
for each of the keys. After substantial training, during which the number
of slides was gradually increased, errors fell towards a proportion of
.20. The training data are shown in Figure
A model of memory must take into account findings, such as those of Vaughan
and Greene (1984), which show that pigeons can remember a large number of
pictorial stimuli arbitrarily assigned to two categories (the limits are
yet to be established). How can these data be accounted for? A possible
solution to this problem is to reduce the details of the stimulus to a
set of defining features or analyzers (e.g., Jitsumori, 1993; also see the
chapter by Huber, 2001, for a description of feature models). In NIM we
do not take this approach but instead treat the remembered stimuli as isomorphic
with the original stimulation.
Before presenting the model we shall describe some results obtained in representative
experiments. In our earlier experiments we used stimuli that even well-practiced human
observers would have difficulty telling apart. To describe the discriminability
of such stimuli we used d', the measure of sensitivity that is used
in signal detection theory. If the sensations that are induced by a very
large number of stimulus presentations fall into a normal distribution,
then d' is the difference in standard deviation units between the
means of the sensation distributions induced by two different stimuli.
As a unit-free (dimensionless) quantity d' provides a means of describing
differences in sensitivity between and within sense modalities.
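As a rough illustration of this definition, the following Python sketch computes d' from two sensation means and a common standard deviation. It is not part of NIM. The function names are hypothetical, and the accuracy formula assumes the textbook equal-variance case with two equally likely stimuli and a single criterion placed midway between the means.

```python
from statistics import NormalDist


def d_prime(mean_1, mean_2, sd):
    """Separation of two sensation means in standard-deviation units."""
    return (mean_2 - mean_1) / sd


def unbiased_proportion_correct(dp):
    """Accuracy with one criterion midway between the two means.

    Assumes equal-variance normal sensation distributions and equally
    likely stimuli, in which case accuracy is Phi(d'/2).
    """
    return NormalDist().cdf(dp / 2)


if __name__ == "__main__":
    # Illustrative values only; these are not data from the experiments.
    for dp in (0.5, 1.0, 2.0):
        print(dp, round(unbiased_proportion_correct(dp), 3))
```

Under these assumptions a d' of zero gives chance performance, and accuracy rises smoothly toward 1.0 as the two distributions separate.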
In order to understand how information is organized and retrieved from memory,
we chose to work with a very simple situation. Our subjects were hungry
pigeons that were foraging for food in the sparse, controlled environment
of the operant chamber. We used artificial stimuli to minimize the effects
of previous experience with the stimuli encountered in the experimental
situation. These stimuli were lights or sounds that differed from each
other only in intensity, or they were outline drawings in the form of dot-matrix
patterns that were displayed on a computer monitor. To simplify the situation
further, during discrimination training, pecks on one of two illuminated
disks (keys) were followed by a period of access to grain (usually for
2 seconds). This procedure gives us a relatively direct measure of probability
of response, namely, the relative frequencies of occurrence of the alternative responses.
The birds are initially trained to peck an illuminated key regardless of its
position. Trials usually start with illumination of a centrally located
area (a disk or a more extended surface), the display key. A peck
on the display key produces one of the stimuli to be discriminated, and
illuminates the two choice keys, one to the left and one to the
right of the display key. A peck on the choice key designated “correct”
provides access to food. Each error is followed by repetition of the trial
until the pigeon pecks the correct key. Trials are separated by a short
period (usually 10 seconds) during which all keys are dark and the behavior
of the pigeon has no programmed consequences. Typical daily sessions consist
of 80 trials.
Figure 2 is an illustration of a pigeon in the type of apparatus
used in these experiments.
Acquisition of a Discrimination
Figure 3 shows acquisition curves for birds that were trained to discriminate
between two sounds that differed in intensity by 5, 3 or 1 dB. These curves
are typical of those of other birds that were trained under these conditions.
The size of the difference between the training stimuli (discriminability)
affects acquisition in three ways: (1) The smaller the difference the longer
the presolution period, the initial period of chance performance
seen here as the flat region early in training when the curves for correct
and incorrect choices intertwine. (2) The smaller the difference the slower
the discrimination develops. (3) The smaller the difference the smaller
the separation between curves for correct (solid lines) and incorrect (dotted
lines) responses after extensive training.
In categorization experiments, more than one stimulus is associated with each
response. The stimuli may be as simple as intensities of lights or sounds,
or as rich as the colored photographs used in concept experiments. In a
categorization experiment that used unidimensional stimuli, Heinemann and
Avin (1973, Experiment 1) trained pigeons to categorize sound intensities
as “soft” or “loud” by presenting, on each trial, one of 10 levels of white
noise which ranged from 60 to 96 dB re 0.0002 dyne/cm². Pecks
on one key, R1, were reinforced in the presence of five sound intensities
that were less than or equal to 83 dB and pecks on the other key, R2, were
reinforced in the presence of five intensities equal to or greater than
86 dB. If categorization were perfect, the proportion of R2 responses would
be zero for stimuli at or below 83 dB and 1.0 for stimuli at or above
86 dB. Figure 4 shows choice curves (proportion of R2 responses as a function
of three-day blocks of training) for four pigeons that were trained to
make this discrimination. At first (days 1-3), the proportion of responses
to each of the choice keys was virtually independent of sound intensity.
The flat curves show that birds were not processing the sound intensities,
that is, they were in the presolution period. As training continued, the
difference between the lower and upper asymptotes of the choice curves
increased, and the curves became steeper.
It has been suggested that “abstraction of a concept” is required to account
for the excellent generalization to new exemplars of categories used in
training. The need for that notion is less apparent if generalization involves
categorization of simple stimuli (e.g. lights or sounds that differ only
in intensity) than when the stimuli are photographs. In experiments with
simple stimuli typically only two values are used in training. Responding
to additional stimuli is examined in the absence of differential reinforcement.
Figure 5 shows the results of one such experiment. Heinemann, Avin, Sullivan
and Chase (1969) trained pigeons to discriminate between two levels of
white noise. The stimulus differences were 29 dB (top row), 6 dB (middle
row) and 2.3 dB (bottom row). After choice accuracy ceased to improve,
the birds were presented with 11 additional sound intensities. The results
of this generalization test are very similar to those obtained at the completion
of training in the categorization experiment of Heinemann and Avin (1973).
Although the 11 new intensities of sound were encountered for the first
time during the generalization test, the birds responded to the test stimuli
much as they did to the stimuli in the categorization experiment, that
is, they tended to divide the continuum of sound intensities into two categories,
“soft” and “loud.”
Effects of Increasing Stimulus Dimensionality
Chase and Heinemann (1972) and Heinemann and Chase (1970) also trained pigeons
with compounds made up of lights and sounds differing in intensity. They
found that fewer errors are made when values on both dimensions are correlated
with the correct key choice. Following are some examples of data from these experiments.
In the Chase and Heinemann (1972) experiment, R1 was correct in the presence
of a soft sound paired with a dim light; R2 was correct in the presence
of a loud sound paired with a bright light. For all birds the sound intensities
used in training differed by 35 dB. The light intensities differed by 0.6
log units for three birds and by 1.4 log units for three other birds.
After training, the pigeons were tested for generalization to eight sound intensities,
each of which was combined with eight light intensities, a total of 64
test stimuli. Figure 6 shows the proportion of R2 responses that were made
during the generalization tests. With the smaller light intensity difference
(top row) the proportion of R2 responses increases with both light and
sound intensity, evidence of control of response choice by both dimensions.
The sound gradients are virtually flat following training with the larger
light intensity difference (bottom row), although the sound intensity difference used
in training was the same for all birds. This “overshadowing” of the less
discriminable by the more discriminable stimulus in a compound was first
described by Pavlov (1927) in classical conditioning and has often been
attributed to limited attentional resources (e.g. Sutherland & Mackintosh,
1971). In Chase and Heinemann (1972), we show that it was the optimal use
of both dimensions that resulted in flatter gradients for sound in the
presence of a larger light intensity difference. This was not a failure of attention.
In the Heinemann and Chase (1970) experiment, the training stimuli were four
compounds: a soft sound paired with a dim or bright light and
a loud sound paired with a dim or bright light. For the three pigeons
whose generalization test results are shown in the top row of
Figure 7, R1 was correct only in the presence of the soft sound
accompanied by the dim light. For the three pigeons whose
results are shown in the bottom row, R2 was correct only in
the presence of the loud sound paired with the bright light. The shape
of the generalization surface shows that choices were based on the combination
of sensations associated with both dimensions.
We have done a number of experiments in which pigeons were required to categorize
outline figures that were shown as dot-matrix patterns on a computer monitor.
In one such experiment, Donis and Heinemann (1993) trained pigeons to discriminate
between two lines tilted 45 degrees from the vertical that were either shown alone or in a context provided by the addition of an identical element,
namely an L-shaped form. These stimuli are shown in Figure 8.
Stimuli 1 and 3 were
correct for the left key; stimuli 2 and 4 were correct for the right key.
Twelve birds served in these experiments (for eight birds the stimuli were white
on a black background and for four the stimuli were black on a white background).
For 11 of the 12 birds, the proportion correct was consistently higher
for the inclined lines alone (stimuli 1 and 2) than for these lines in
context (stimuli 3 and 4). The data for the birds trained with the figures
on the black background are shown in Figure
These results are in sharp contrast to those found for human observers in both
accuracy (Enns & Prinzmetal, 1984) and reaction time (Pomerantz, Sager
& Stoever, 1977). For example, the mean reaction time of observers
required to detect as quickly as possible which of the four quadrants in
an array (4 items in each quadrant) was different from the others was 641
ms for the lines in context and 1,480 ms for the oblique lines alone (Pomerantz
et al., Experiment 4). One possible explanation is that the lines in context
could be named (Stimulus 3 an “arrow” and Stimulus 4 a “triangle”). Enns
and Prinzmetal suggest that the context may provide a redundancy gain
due to the creation of a dimension such as “triangle-arrow” that is correlated
with the angular orientation of the single lines. Pigeons had to depend
on vision alone. In the discussion of NIM we shall show why, we think,
the addition of the uninformative context in the form of an L-shape increased
the number of errors made by the pigeons.
The Natural Intelligence Model (NIM)
Representation of Events in Memory
NIM treats remembered events as records. Each remembered event is represented
in memory by three components: the discriminative stimulus, the behavior,
and the outcome of the behavior. In developing the model we have concentrated
on the treatment of the discriminative stimuli with minimal attention to
the stimuli produced by the responses and the results of these responses.
Therefore, in the description of NIM that follows, we identify choice behaviors,
e.g., pecking on a choice key located to the left or right of the display
key, by the labels R1 and R2. Similarly, we refer to the outcomes of behavior
simply as “positive” or “negative.”
Figure 10. Sensations induced by stimuli that differ in value on a single
dimension are shown here as two overlapping distributions of sensations.
Following conventions adopted from signal detection theory, sensations are treated
as varying along a normal deviate (z-score) axis. Differences in the sensations
associated with the discriminative stimuli are represented by the distances
between the means, d', of the distributions of sensation associated
with each of the discriminative stimuli. The relation between our present
analysis and signal detection theory is discussed in Chase and Heinemann
(1991). In the simplest case, for example, for two stimuli that vary along
a single dimension, such as light intensity, the sensations associated
with these stimuli may be illustrated as shown in Figure 10.
Sensations arising from stimuli that differ in value on two dimensions may be visualized
either in three dimensional space or as contours showing equal probability
densities. In Figure 11 below, the bivariate distributions representing the two
light-sound compounds used as training stimuli in the Chase and Heinemann
(1972) experiment are shown in three dimensions. In Figure 12 below the bivariate
distributions representing the four light-sound compounds used as training
stimuli in the Heinemann and Chase (1970) experiment are shown by contours
of equal probability density.
Figure 11. Sensations produced by compound stimuli, such as those used
in the Chase and Heinemann (1972) experiment, are represented here by two
joint probability (bivariate) distributions. The marginal distributions
for the two light-sound compounds used in training are also shown. X3 and
X6 refer to the two light intensities and Z1 and Z8 to the two sound intensities.
The line that passes through this surface shows the optimal decision boundary
for the birds whose data are shown in Figure 6 (top row).
Figure 12. Sensations produced by compound stimuli, such as those used
in the Heinemann and Chase (1970) experiment are represented here by four
joint probability distributions. The four concentric circles are isodensity
contours for the four joint probability density distributions of the training
stimuli. The curve that passes through this surface shows the optimal decision
boundary for the birds whose data are shown in Figure 7 (top row).
The sensations associated with the points on dot-matrix figures, such as those
illustrated in Figure 8, are represented in the same way as compound stimuli.
Each point is treated as a bivariate distribution with its location defined
by its x,y coordinates in Euclidean space. For dot-matrix figures only
the centers of the bivariate distributions are shown (see Figure 13). The
isodensity contours of the bivariate distributions shown in Figure 12 were
omitted to avoid clutter.
We assume that, when a motivationally significant event occurs, the sensory
information that accompanies this event is stored as a record in exemplar
memory (XM). As illustrated in Figure 14 (below), each record shows the response
that was made, the sensation induced by the stimulus in the presence of
which this response was made, and the outcome of the event. The + in this
illustration indicates that food was obtained (the motivationally significant
event for a hungry pigeon).
On each trial a stimulus is presented. We refer to the sensations produced
by this stimulus as the current input. Given a specific current
input, the subject decides which key choice is most likely to be followed
by reinforcement (here access to grain). According to NIM, the decision
is made as follows: On each trial a few records are retrieved from randomly
determined locations in XM and placed in working memory. The number
of records in working memory appears to differ somewhat among individuals
(see Heinemann, 1983a) and, more significantly, among species (see Chase,
1989). Chase’s simulations have shown the number to be relatively small
(between 3 and 18).2
Figure 14. Diagram representing storage of information in exemplar memory (XM).
Figure 15. Five records of remembered events in working memory. Two
are records of the sensation when R1 was reinforced (green) and three are
records of the sensation when R2 was reinforced (blue).
We assume that the sensations induced by the stimuli represented by records
in working memory differ somewhat from the sensations originally experienced
(we describe such records as “noisy”). The distortions in the remembered
sensations may result from trial-to-trial differences in the experienced
sensation, as well as events that occur during storage and/or retrieval
from XM. It is assumed that these changes vary normally, with small deviations
common and large ones rare. Figure 15 shows five remembered sensations
that represent stimuli that vary along a single dimension. The two records,
shown in green, represent remembered sensations when R1 was reinforced;
the three records, shown in blue, represent the remembered sensations when
R2 was reinforced.
The choice of response (to make R1 or R2) is based on a comparison of the sensation
experienced, the current input (shown in Figure 15 by an arrow),
and the information provided by the records in working memory. Given only
this information, the response made is the one that is most likely to be
correct. The process of response selection amounts to obtaining the sum
of the heights (probability densities) of the R1 curves above the point
that represents the current input, doing the same for the R2 curves, and
making the response for which the sum is greater. We refer to the sum of
the densities as the decision quantity (DQ) for that response. In
Figure 15 the probability densities at the current input for the two R1 records
are .30 and .11 (sum = .41). For the three R2 records the probability densities
are .20, .14 and .08 (sum = .42). Because the sum is greater for R2, R2 is made.
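The response-selection rule for stimuli on a single dimension can be sketched in Python as follows. This is a minimal illustration, not the NIM program itself: the function names are hypothetical, each record is assumed to contribute a normal density with a standard deviation of one z-score unit, and ties are resolved arbitrarily because the text does not say how they are handled. The last lines reproduce the arithmetic of the Figure 15 example from the densities quoted in the text.

```python
from statistics import NormalDist


def decision_quantities(current_input, records, sd=1.0):
    """Sum, per response, the densities of the records at the current input.

    `records` is a list of (response_label, remembered_sensation) pairs.
    Each remembered sensation is treated as the mean of a normal curve
    (sd = 1 z-score unit is an assumption of this sketch).
    """
    dq = {}
    for response, remembered in records:
        density = NormalDist(remembered, sd).pdf(current_input)
        dq[response] = dq.get(response, 0.0) + density
    return dq


def choose_response(dq):
    """Return the response with the larger decision quantity.

    On an exact tie the first label wins; this is an arbitrary choice.
    """
    return max(dq, key=dq.get)


# Densities read from the Figure 15 example in the text.
FIGURE_15_DENSITIES = {"R1": [0.30, 0.11], "R2": [0.20, 0.14, 0.08]}
figure_15_dq = {r: sum(d) for r, d in FIGURE_15_DENSITIES.items()}

if __name__ == "__main__":
    print({r: round(v, 2) for r, v in figure_15_dq.items()})
    print(choose_response(figure_15_dq))
```

Note how close the two decision quantities are in this example: a slightly different sample of records could easily reverse the choice.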
Choice proportions are obtained over many trials. The records in working memory
differ from trial-to-trial. The remembered sensations cluster around the
sensations associated with the training stimuli. Therefore, small stimulus
differences used in training are represented in working memory by closely
spaced records with different response labels. In this case errors
will occur frequently (for the same current input the DQ for the correct
response will be higher on some trials and lower on others). Large stimulus
differences are represented in memory by separate clusters of records representing
each response; the separation between the clusters is dependent upon the
stimulus difference. When the stimulus difference is large the DQ for the
correct response tends to be consistently higher than for the incorrect
response. In such situations errors are rare.
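The dependence of error rate on stimulus difference can be explored with a small Monte Carlo sketch. It rests on several simplifying assumptions that are ours rather than the chapter's: both responses are equally likely to be correct, each record in working memory is equally likely to carry either label, remembered sensations and current inputs are drawn with a standard deviation of one z-score unit around the training means, and working memory holds 10 records. The function names are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, sd=1.0):
    """Height of a normal curve with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def simulate_accuracy(d_prime, n_records=10, n_trials=2000, seed=0):
    """Proportion of correct choices for two stimuli separated by d_prime."""
    rng = random.Random(seed)
    means = {"R1": 0.0, "R2": d_prime}
    n_correct = 0
    for _ in range(n_trials):
        correct = rng.choice(("R1", "R2"))
        current_input = rng.gauss(means[correct], 1.0)
        # Draw a fresh, noisy sample of records on every trial.
        dq = {"R1": 0.0, "R2": 0.0}
        for _ in range(n_records):
            label = rng.choice(("R1", "R2"))
            remembered = rng.gauss(means[label], 1.0)
            dq[label] += normal_pdf(current_input, remembered)
        # Ties go to R2; an arbitrary choice for this sketch.
        choice = "R1" if dq["R1"] > dq["R2"] else "R2"
        n_correct += (choice == correct)
    return n_correct / n_trials


if __name__ == "__main__":
    for dp in (0.5, 1.0, 2.0, 4.0):
        print(dp, simulate_accuracy(dp))
```

Running the sketch with several values of `d_prime` shows accuracy near chance for closely spaced stimuli and few errors for widely separated ones, in line with the verbal account above.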
For stimuli that vary along two dimensions, e.g., stimuli that vary in light
intensity and in sound intensity, or outline drawings expressed as dot-matrix
patterns, the DQ is computed as follows: For each point on the current input,
calculate the arithmetic mean of the probability densities contributed by the points
on the memory record. This yields as many means as there are points
on the current input. The DQ is the geometric mean of these mean densities.
In our simulations these events occur sequentially. However, implementation
of the model in real time (e.g., in the pigeon) probably occurs in parallel.
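The two-dimensional computation can be sketched for a single record as follows. This is an illustration under stated assumptions, not the authors' code: each record point is treated as a circular bivariate normal distribution with a standard deviation of one z-score unit on both axes, points are (x, y) tuples, and the function names are hypothetical. How the DQs of several records with the same response label are combined is not covered here.

```python
import math


def point_density(p, q, sd=1.0):
    """Density at input point p of a circular bivariate normal centered on q."""
    dx, dy = p[0] - q[0], p[1] - q[1]
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sd * sd)) / (2.0 * math.pi * sd * sd)


def record_dq(current_input, record, sd=1.0):
    """Decision quantity contributed by one record.

    For every point on the current input, take the arithmetic mean of the
    densities contributed by the record's points, then take the geometric
    mean of those per-point means.
    """
    means = [
        sum(point_density(p, q, sd) for q in record) / len(record)
        for p in current_input
    ]
    if min(means) == 0.0:
        # A density that underflows to zero makes the geometric mean zero.
        return 0.0
    return math.exp(sum(math.log(m) for m in means) / len(means))


if __name__ == "__main__":
    line = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]      # hypothetical input
    shifted = [(0.5, 0.0), (1.5, 1.0), (2.5, 2.0)]   # hypothetical record
    print(record_dq(line, shifted))
```

Because the geometric mean is a product in disguise, a single input point that lies far from every record point pulls the whole DQ toward zero, whereas an arithmetic mean would let well-matched points compensate.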
Figure 16. A frame from NIM showing a point (P) on a current input (purple)
being compared to a point (PR) on the record (white).
The decision process may be observed by using the NIM simulation program included
in this chapter. For illustrative purposes, stimuli similar to those used
by Donis and Heinemann (1993) are embedded in the program. Figure 16 is
a single frame from this program. In this frame, the current input corresponding
to the stimulus to be identified, an “arrow,” is represented by the pattern
of yellow squares. The remembered sensation (record) to which it is being
compared, a “line tilted to the left,” is superimposed upon the current
input. Blue circles represent the points on the record. Note that the positions
of the circles on the record deviate from that of a straight line. This
reflects the assumption that sensations are not remembered exactly (see
Figure 15). Only the mean of the bivariate distribution corresponding to
each record point is shown. In this frame a point on the current input
(shown in purple) is being compared to a point on the record (shown in
white). The shortest distance between these points in z-scores is displayed
as well as the probability density at the current input contributed by
the record point.
If the sample of records is uninformative, that is, none of the records is
“reasonably similar” to the stimulus to be identified, a new sample is
drawn. If repeated sampling fails to yield useful information, the pigeon
bases its choice on reinforcement probability alone. In the current form
of the program, an uninformative sample is one in which none of the points
on any record in the sample is within 5 z-scores of any point on the current
input. A sample is also uninformative if there is a point on the current
input that is more than 5 z-scores from every point on all records in the sample.
When a motivationally significant event occurs, a record of that event is placed
in a randomly determined location in XM. This record replaces the record
that previously occupied that location.
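The sampling, screening, and storage steps described above can be sketched as follows. The data layout and function names are hypothetical, and one detail is an assumption of the sketch: records are drawn with replacement, since the text says only that locations are randomly determined. The 5 z-score limit is the one stated in the text. Because the second condition (some input point is far from every record point) includes the first as a special case, a single check suffices. The fallback to reinforcement probability after repeated failed draws is omitted.

```python
import math
import random

LIMIT_Z = 5.0  # threshold stated in the text


def draw_sample(xm, k, rng=random):
    """Retrieve k records from randomly chosen locations in XM."""
    return [xm[rng.randrange(len(xm))] for _ in range(k)]


def is_uninformative(current_input, sample, limit=LIMIT_Z):
    """True if some input point is farther than `limit` from every record point."""
    record_points = [q for record in sample for q in record["points"]]
    for p in current_input:
        if not any(math.dist(p, q) <= limit for q in record_points):
            return True
    return False


def store_record(xm, record, rng=random):
    """Overwrite a randomly chosen location in XM with the new record."""
    xm[rng.randrange(len(xm))] = record


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical records: stimulus points, response made, and outcome.
    xm = [{"points": [(0.0, 0.0)], "response": "R1", "outcome": "+"} for _ in range(8)]
    store_record(xm, {"points": [(9.0, 9.0)], "response": "R2", "outcome": "+"}, rng)
    sample = draw_sample(xm, 3, rng)
    print(is_uninformative([(0.5, 0.5)], sample))
```

Random overwriting keeps the size of XM fixed, so older records disappear gradually as new reinforced events are stored.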
This is the basic form of NIM. Without further assumptions the model provides
a quantitative account of the choice behavior of pigeons in a wide range
of situations. In comparing simulations produced by the model to the data,
only two parameters are allowed to vary: the number of records in working
memory and the discriminability, the distance between the means of the
sensation distributions. The number of records in working memory seems
to depend upon the processing capacity of the organism. The distance between
the means of the sensation distribution is related to the organism’s sensitivity
to stimulus differences.
of the Model
Acquisition of a Discrimination
Early in training, especially if the discrimination is difficult, there is a
period, the presolution period, during which there is no evidence that
the selection of the response is controlled by the potential discriminative
stimuli (see Figures 3, 4 and 17). It appears that during this period the
pigeon discovers which of the many stimuli present when a given response
was reinforced is likely to be a good predictor of reinforcement in the
future. Though there is no evidence of learning during this period, the
presolution period is an important stage in acquisition. Information processed
during the presolution period makes possible selective storage of
information in exemplar memory.
We assume that the response that is selected is based on the most informative
records in working memory. Until the end of the presolution period, the
only information available when a decision is made is the proportion of
trials on which each choice was reinforced. After the presolution period,
the relatively uninformative records in XM are gradually replaced with
records that contain information about the discriminative stimuli. As the
number of informative records in working memory increases, fewer errors
are made. This is the result of better representation of the frequency
with which each response was reinforced and reduction in the variability
associated with remembered sensations (an effect equivalent to the decrease
in the standard error of the mean as the sample size is increased).
The rate of learning and asymptotic accuracy depend primarily upon discriminability.
Errors are inevitable if identical sensations are produced by stimuli that
require different responses. The proportion of trials on which this occurs
varies inversely with the size of the stimulus difference (see Figure
as well as the individual’s sensitivity to such stimulus differences.
Rate of learning is also affected by the capacity of exemplar memory. Simulations
of choice behavior in situations as
varied as probability learning, reversal
of a discrimination, abolition of a discrimination, and variations
in stimulus discriminability showed that the capacity of exemplar memory
of pigeons seeking food in the simplified environment of the operant chamber
is about 1200 records.4 The number
of records in working memory varied somewhat, but for most simulations the
data were well fit by a sample size of 10. The biggest difference among
birds was in their sensitivity to stimulus differences, in this case the
difference in the intensity of white noise (see Figure
Figure 18. Five records of remembered sensations (the green curves are
for R1 and the blue curves for R2). Hypothetical test stimuli are shown
as triangles (black for new stimuli, green and blue for stimuli used in
training) at different positions along the sensation axis.
According to NIM the same processes are operative in categorization and generalization
as during discrimination training. In discrimination training only two
stimuli are presented. In categorization and tests for generalization more
than two stimuli are presented (see Figure 18). Categorization tasks differ
from tests for generalization only in that in the former case the remembered
sensations are more varied. In categorization experiments that use unidimensional
stimuli, choice curves for R2 will rise from proportions of zero to 1.0.
No errors are made in categorizing stimuli near the extremes of the continuum
if the sensations induced by these stimuli do not have different response
labels. Choice curves obtained in generalization tests that follow training
with two similar stimuli (such as those for the birds shown in the bottom
row of Figure 5) tend to have asymptotes other than at zero and 1.0. According
to NIM, if the test stimuli are far from the training stimuli, the sample
of records in working memory may be uninformative on many trials, that
is, all probability densities at the current input for the test stimulus
are close to zero. Under these conditions the pigeon bases its choice on
response probability alone, that is, it “guesses.”
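The choice process described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the NIM program itself: each record in the working-memory sample is treated as a Gaussian density centered on a remembered sensation, the densities at the current sensation are summed per response label, and the response with the greatest summed density is chosen. When every density falls below a small floor the sample is treated as uninformative and the choice is a guess weighted by how often each response label occurs in the sample. The function names, the `sd` and `floor` parameters, and the use of label counts to stand in for response probability are all hypothetical choices made for the example.

```python
import math
import random

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def choose_response(sensation, sample, sd=1.0, floor=1e-6, rng=random):
    """Pick a response from a non-empty sample of (remembered_sensation, label)
    records. Returns (label, guessed), where guessed is True when the sample
    was uninformative at the current sensation."""
    density = {}
    counts = {}
    for remembered, label in sample:
        density[label] = density.get(label, 0.0) + normal_pdf(sensation, remembered, sd)
        counts[label] = counts.get(label, 0) + 1
    if max(density.values()) < floor:
        # All densities are near zero: fall back on response proportions alone.
        labels = list(counts)
        weights = [counts[label] for label in labels]
        return rng.choices(labels, weights=weights, k=1)[0], True
    # Otherwise choose the response with the greatest summed density.
    return max(density, key=density.get), False

# Five hypothetical records: three labeled R1, two labeled R2.
sample = [(-1.2, "R1"), (-0.8, "R1"), (-1.0, "R1"), (1.1, "R2"), (0.9, "R2")]
print(choose_response(-0.9, sample))   # sensation near the R1 records
print(choose_response(12.0, sample))   # sensation far from every record
```

With test sensations far from all remembered sensations, the guessing branch produces choice proportions that reflect the response labels in the sample rather than the stimulus, which is one way asymptotes other than zero and 1.0 could arise.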
of Increasing Stimulus Dimensionality
Our work with compounds has shown that, at least for light and sound intensities,
pigeons use both dimensions in response selection (see Figures 6 and
In Figure 19 two compounds such as those used in the Chase and Heinemann
(1972) experiment are shown. The d' difference between the stimuli
on each dimension can be thought of as the legs of a right triangle. The
distance between the means of the compound is the hypotenuse of this triangle.
The discriminability of a compound in which d' on
each dimension is equal is therefore greater than that of either element
by a factor of the square root of
2. Increasing the dimensionality of the stimuli thus increases d'
between stimuli that require different responses. This results in fewer
errors. We have shown in our signal detection analysis of discrimination learning
(Chase & Heinemann, 1972) that this increase in d' is reflected
in the decision strategy of pigeons trained with stimuli varying along
two dimensions. The birds use both dimensions when deciding which key to
peck. The weight given to each dimension depends upon the relative discriminability
of the stimuli on the two dimensions. Equal weight is given to both dimensions
when d' for the elements is equal — this was approximately true
for the birds whose data are shown in the top row of Figure
6. The relative
difference in discriminability determines the relative weight given to
each dimension — the flatter gradients for sound intensity shown in the
bottom row of Figure 6 reflect the greater weight given to the light intensity
difference, not inattention to sound. According to signal detection theory
decisions are based upon a criterion, a decision line or boundary, separating
the continuum of sensations into a region in which R1 is more probable
from that in which R2 is more probable. The response made is the one for
which the probability density is greatest, that is, the response that is
most likely to be correct. This is not unlike the decision rule used by
NIM (see Figure 15). Statistical decision theory provides a mathematical
description of the data; an exemplar model such as NIM shows how these
decisions may come about.5
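The right-triangle relation for compounds can be written out as a short worked sketch. It assumes independent, equal-variance Gaussian sensations on the two dimensions, in which case the compound d' is the Euclidean combination of the element d' values, and the weights of an ideal linear decision rule are proportional to the d' on each dimension. That proportionality is the textbook signal detection result under these assumptions and is used here only for illustration; the function names are hypothetical.

```python
import math

def compound_d_prime(d_light: float, d_sound: float) -> float:
    """Hypotenuse of the right triangle whose legs are the element d' values."""
    return math.hypot(d_light, d_sound)

def dimension_weights(d_light: float, d_sound: float):
    """Unit-length weight vector proportional to d' on each dimension."""
    length = math.hypot(d_light, d_sound)
    if length == 0.0:
        raise ValueError("weights are undefined when d' is zero on both dimensions")
    return d_light / length, d_sound / length

# Equal d' on both dimensions: the compound gains a factor of sqrt(2)
# and both dimensions receive equal weight.
print(compound_d_prime(1.0, 1.0), dimension_weights(1.0, 1.0))

# Unequal d': the more discriminable dimension receives the greater weight.
print(compound_d_prime(3.0, 1.0), dimension_weights(3.0, 1.0))
```

In the second case the less discriminable dimension still has a nonzero weight, consistent with the point that flatter gradients reflect lower weight rather than inattention.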
Our treatment of dot-matrix patterns follows from our treatment of compounds; that
is, discriminability is based on distances in Euclidean space. The decision rule
we use to compare two dot-matrix patterns may be followed step-by-step by using
a version of NIM that
may be downloaded here. (A more general version of this program is used to
simulate performance under other conditions, and thus to compare NIM’s
performance to available data and to test theoretical predictions.) This
demonstration program uses patterns similar to those used in the experiment of
Donis and Heinemann (1993) – see Figures 13 and
16, and may be used to simulate performance under conditions in which the
stimuli are remembered exactly (noiseless
memories), under conditions in which a single record of each of the stimuli is
used in the comparison process and under conditions in which records are
retrieved randomly from exemplar memory. The effects of varying the number of
records in the sample that was retrieved from exemplar memory, can also be
examined, as can the discriminability of the stimuli about which decisions are
made. Figure 20 shows the effect of varying discriminability and the number of records in
the sample, on acquisition curves for the patterns used by Donis and Heinemann
(1993). These simulated curves may be compared to the acquisition curves of the
real birds whose data are shown in Figure 9. The simulation shows that the
difference between the accuracy for the lines alone and the
lines in context is robust. In spite of large differences in discriminability
(SP) and in the number of records in the sample (5 or 10), the proportion of
correct responses in the presence of the lines alone is consistently greater
than that for the lines in context. This results primarily from the decreased
differences between the DQs in the presence of identical parts of the patterns
(the added L-shape). According to NIM, the greater the similarity in the DQs for
stimuli that require different responses, the greater the probability of an
incorrect response choice.
NIM has successfully simulated the results of other pattern recognition experiments
with pigeons as subjects. These include simulations of the confusion matrices
for the letters of the alphabet and random dot patterns obtained by Blough
(1985), simulations of the effects of distance of irrelevant contexts on
accuracy obtained by Donis, Heinemann, and Chase (1994) and the effects
on accuracy of distortion of patterns composed of elementary forms obtained
by Van Hamme, Wasserman, and Biederman (1992). Although these results are
encouraging, other interpretations of the data are possible (see, for example, Kirkpatrick’s
(2001) chapter in this volume for a discussion of the Van Hamme
et al. experiment). NIM was originally developed to account for discrimination
and generalization of diffuse stimuli. Our success in applying these same
principles to pattern recognition is encouraging; however, whether NIM
provides the most parsimonious and general model of pigeon visual cognition
awaits much further testing and refinement.
We have shown that a model such as NIM provides a quantitative account of
how exemplar learning may work. By expressing it in the form of a computer
program, we have been able to examine data in detail. In some cases this
has led to unexpected findings which the model was originally not designed
to deal with, e.g., Donis and Heinemann’s (1993) finding that a redundant
context makes it harder for pigeons to identify the orientation of oblique
lines. Apart from applying the model to gain a deeper understanding
of specific problems, we have found that NIM, as a quantitative model, is
extremely useful for identifying where the model itself needs refinement
or modification.
So far we have focused our analyses on the sensory effects produced by the
discriminative stimuli. While an exemplar model such as ours provides a
good account of choice behavior, we have not yet dealt with some of the most
interesting demonstrations of the cognitive abilities of birds that are
described in this cyberbook. Among these are, for example, emergent relationships
resulting from common responses or outcomes such as those described by
Urcuioli (2001). It would be interesting to see whether some of the findings
Urcuioli describes could be accounted for by an exemplar model if the responses
and outcome of the behavior were treated in more detail than we do here.
Can an exemplar model deal with categorization of stimuli as “same” or
“different” (Cook, Katz, & Cavoto, 1997; Young & Wasserman, 2001) if the relationship between the stimuli cannot be described in terms
of perceptual similarity? These are just a few of the challenging questions
raised in this book.
How to Download PC Demonstration
Program of NIM
Just click here
to download the program to your local machine. It will be saved as
"nimdemo.exe" on your PC (sorry, Mac users!). The
program can then be run off-line to do your simulations.
Ashby, F. G., & Waldron, E. M. (1999). On the nature of implicit categorization.
Psychonomic Bulletin & Review, 6, 363-378.
Blough, D. S. (1985). Discrimination of letters and random dot patterns
by pigeons and humans. Journal of Experimental Psychology: Animal Behavior
Processes, 11, 261-280.
Blough, D. S. (1996). Error factors in pigeon discrimination and delayed
matching. Journal of Experimental Psychology: Animal Behavior Processes,
Chase, S. (1983). Pigeons and the magical number seven. In M. L. Commons,
R. J. Herrnstein, & A. R. Wagner (Eds.), Quantitative analyses of behavior, Vol. 4: Discrimination
processes (pp. 37-57). Cambridge, MA: Ballinger.
Chase, S., & Heinemann, E. G. (1972). Choices based on redundant information:
An analysis of two-dimensional stimulus control. Journal of Experimental
Psychology, 92, 161-175.
Chase, S., & Heinemann, E. G. (1989). Effects of stimulus complexity
on identification and categorization. The International Journal of Comparative
Psychology, 3, 165-181.
Chase, S., & Heinemann, E. G. (1991). Memory limitations in human and
animal signal detection. In M. L. Commons, J. A. Nevin, & M. C. Davidson.
(Eds.) Signal detection: Mechanisms, models, and applications (pp. 121-138). Cambridge:
Cook, R. G., Katz, J. S., & Cavoto, B. R. (1997). Pigeon same-different concept
learning with multiple stimulus classes. Journal of Experimental Psychology: Animal
Behavior Processes, 23, 417-433.
Donis, F., & Heinemann, E. G. (1993). The object-line inferiority effect
in pigeons. Perception and Psychophysics, 53, 117-122.
Donis, R.J., Heinemann, E.G. & Chase, S. (1994). Context effects in visual
pattern recognition by pigeons. Perception and Psychophysics, 55,
Ebbinghaus, H. (1885). Über das Gedächtnis. Leipzig: Duncker
Enns, J. T., & Printzmetal, W. (1984). The role of redundancy in the
object-line effect. Perception and Psychophysics, 12, 278-286.
Estes, W. K. (1950). Towards a statistical theory of learning. Psychological
Review, 57, 94-107.
Guthrie, E. R. (1946). Psychological facts and psychological theory. Psychological
Bulletin, 43, 1-20.
Heinemann, E. G. (1983a). A memory model for decision processes in pigeons.
In M. L. Commons, R. J. Herrnstein, & A. R. Wagner (Eds.),
Quantitative analyses of behavior, Vol.4: Discrimination processes (pp. 3-21). Cambridge,
Heinemann, E. G. (1983b). The presolution period and detection of statistical
associations. In M. L. Commons, R. J. Herrnstein, & A. R. Wagner (Eds.), Quantitative
analyses of behavior, Vol.4: Discrimination processes (pp. 21-37). Cambridge,
Heinemann, E. G., & Avin, E., (1973). On the development of stimulus
control. Journal of the Experimental Analysis of Behavior, 20, 183-195.
Heinemann, E. G., Avin, E., Sullivan, M. A., & Chase, S. (1969). Analysis
of stimulus generalization with a psychophysical method. Journal of
Experimental Psychology, 80, 215-224.
Heinemann, E. G., & Chase, S. (1970). Conditional stimulus control. Journal
of Experimental Psychology, 84, 187-197.
Heinemann, E. G., & Chase, S. (1990). A quantitative model for pattern
recognition by animals and people. In M. L. Commons, R. J. Herrnstein,
S. M. Kosslyn, & D. B. Mumford (Eds.), Quantitative analyses of behavior,
Vol. 9: Computational and clinical Approaches to pattern recognition
and concept formation (pp. 109-126). Hillsdale, NJ: Erlbaum.
Hintzman, D. L. (1988). Judgments of frequency and recognition memory in
a multiple-trace memory model. Psychological Review, 95, 528-551.
Huber, L. (2001). Visual
categorization in pigeons. In R. G. Cook (Ed.),
Avian visual cognition [On-line]. Available: www.pigeon.psy.tufts.edu/avc/huber/
Jitsumori, M. (1993). Category discrimination of artificial polymorphous
stimuli based on feature learning. Journal of Experimental Psychology:
Animal Behavior Processes, 19, 224-254.
Jitsumori, M., & Ohkobo, O. (1996). Orientation discrimination and
categorization of photographs of Natural Objects by Pigeons. Behaviour
Processes, 38, 205-226.
Kirkpatrick, K. (2001). Object
perception. In R. G. Cook, (Ed.),
Avian visual cognition [On-line]. Available: www.pigeon.psy.tufts.edu/avc/kirkpatrick/
Nosofsky, R. M. (1986). Attention, similarity and identification-categorization
relationship. Journal of Experimental Psychology: General, 115,
Nosofsky, R. M. (1991). Tests of an exemplar model for relating perceptual
classification and recognition memory. Journal of Experimental Psychology:
Human Perception and Performance, 17, 3-27.
Pavlov, I. (1927). Conditioned reflexes
(G V. Anrep,
Trans.). Oxford University Press.
Pomerantz, J. R., Sager, L. C., & Stoever, R. J. (1977). Perception
of wholes and their component parts: Some configural superiority effects.
Journal of Experimental Psychology: Human Perception and Performance, 3, 422-435.
Sutherland, N. S., & Mackintosh, N. J. (1971). Mechanisms of animal discrimination
learning. New York, NY: Academic Press.
Urcuioli, P. (2001).
Categorization & acquired equivalence. In R. G. Cook (Ed.),
Avian visual cognition [On-line]. Available: www.pigeon.psy.tufts.edu/avc/urcuioli/
Van Hamme, L. J., Wasserman, E. A., & Biederman, I. (1992). Discrimination
of contour-deleted images by pigeons. Journal of Experimental Psychology:
Animal Behavior Processes, 18, 387-399.
Vaughan, W., Jr., & Greene, S. L. (1983). Acquisition of absolute discriminations
in pigeons. In M. L. Commons, R. J. Herrnstein, & A. R. Wagner (Eds.), Quantitative
analyses of behavior, Vol. 4: Discrimination processes (pp. 231-238). Cambridge,
Vaughan, W., Jr., & Greene, S. L. (1984). Pigeon visual memory capacity.
Journal of Experimental Psychology: Animal Behavior Processes, 10, 256-271.
Watanabe, S., Sakamoto, J., & Wakita, M. (1995). Pigeons’ discrimination
of paintings by Monet and Picasso. Journal of the Experimental Analysis
of Behavior, 63, 165-174.
Young, M. E. & Wasserman, E.A.
(2001). Stimulus control in complex arrays. In R. G. Cook (Ed.),
Avian visual cognition [On-line]. Available: www.pigeon.psy.tufts.edu/avc/young/
The research reported in this chapter was supported in
part by PSC/CUNY and NIMH grants to the two authors. The web page
and graphics were developed initially by Erich A. Heinemann.