Psychologists seem to know a chunk when they
see one. A definition, however, is hard to come by.
Neither the large literature on chunking by humans nor the more
modest literature on chunking by animals provides an operational
definition of this term. The definitional problem is compounded by the
uncritical use of chunking as an explanation of storage and retrieval of
information from short-term and long-term memory. The aim of this chapter is to introduce a distinction between
input and output chunks in analyses of short-term and long-term memory.
The main focus will be on output chunks because all of the evidence
for chunking in animals has been obtained from experiments involving
long-term memory. The
spontaneous temporal structure of inter-response times (IRTs) during
the execution of arbitrary sequences will be used to provide an objective
measure of chunk boundaries. Without
any requirement to do so, college students and monkeys pause on virtually
every trial while executing a list, mainly after responding to a few
items. Pauses were
approximately twice as long as other IRTs.
This suggests that subjects download list items as chunks during
pauses and that chunk size for order information is approximately 3 items.
Mazes and lists of nonsense syllables are two of the most familiar instruments of the
psychological laboratory (Small, 1900; Ebbinghaus, 1964). They owe their popularity to the
recognition that the ability to learn arbitrary sequences is a hallmark of advanced
intelligence and that experiments that measure individual responses provide no information
on serial competence (e.g., Lashley, 1951). The widespread use of nonsense syllables and
mazes also reflects the venerable assumption that the same associative principles that are
used to explain how a human adult memorizes a list of arbitrary items can be used to
explain how an experimentally naive rat learns a sequence of arbitrary responses, and vice versa.
The ability to learn arbitrary sequences is crucial for intelligent action,
both verbal and non-verbal. For more than a century, psychologists have
investigated the organization of such sequences in experiments on the
memorization of nonsense syllables (Ebbinghaus, 1964) and the mastery of
various types of mazes (Small, 1900). The results of both types of
experiment gave rise to the classic theory that serially organized behavior
can be represented as a linear sequence of associations.
Ebbinghaus explained list learning by reference to associations between successive
items and between a particular item and its list position. Hull offered a
similar explanation of maze learning by rats (Hull, 1952). Thus,
associative principles that were used to explain how a human adult
memorizes a list of arbitrary items were used to explain how an
experimentally naive rat learns a sequence of arbitrary responses, and
vice versa (Osgood, 1953; Underwood, 1957).
After dominating psychological thinking for more than half a century, the
validity of association theories of serially organized behavior was
questioned on a variety of theoretical and empirical grounds (Lashley,
1951; Chomsky, 1957). In his classic analysis of serially organized
behavior, Lashley rejected linear models because they could not explain
knowledge of relationships between non-adjacent items (for example,
between words before and after an embedded clause) and because
inter-response times between successive responses are often shorter than
the time that would be needed for feedback from one response to trigger
the next (for example, playing a sequence of notes on a musical
instrument). These and related arguments have been described in detail by
others and will not be elaborated in this chapter (e.g., Anderson &
Bower, 1974; Gardner, 1985). Instead our focus will be the concept of
chunking (Miller, 1956), one of the most influential but, as we shall see,
one of the most poorly understood concepts of modern cognitive psychology.
The significance of chunking derives from its ability to overcome objections
to linear models of serially organized behavior by augmenting
linear structures with hierarchical structures. Although the concept
of chunking was proposed to define the capacity of short-term memory, it
has also been used to characterize such diverse phenomena as long-term
memory, visual perception, and motor plans. The main purpose of this
chapter is twofold: to examine some problems that arise when the concept
of chunking is used uncritically and to distinguish between two basically
different types of chunks: input and output chunks.
Miller introduced the concept of chunking in his classic paper, "On
the magical number 7 ± 2" (1956). Miller argued that a chunk
was the basic unit for measuring the capacity of immediate memory [in
current terminology, short-term or working memory; see Baddeley (1992)
for a discussion of the taxonomy of different memory systems].
The idea was that subjects could retain a large number of discrete items
of information if they were encoded as chunks before they were transferred
to long-term memory. For example, the 12 digits
1-4-9-2-1-7-7-6-1-8-1-2 could be encoded as 3 historical dates. In
contrast to the enormous capacity of long-term memory (LTM), Miller
estimated the capacity of STM to be 7 ± 2 chunks and argued that the
amount of information that is retained in STM is independent of the amount
of information contained by each chunk.
The facilitatory effect of chunking on human memory has been confirmed by a
broad variety of experiments. Familiar examples include the
enhancement of recall on lists on which subjects can assign items to
verbally defined categories (Bousfield & Bousfield, 1966; Bower,
1972a) and on lists composed of temporally defined clusters of items
(Bower & Winzenz, 1969). On a conceptual level, chunking is
regarded as a basic cognitive process despite differences between
theorists as to the functional and anatomical boundaries of STM and LTM
(Atkinson & Shiffrin, 1971; Baddeley, 1981; Craik & Tulving, 1975;
Mishkin & Petri, 1984; Squire, 1986; Weiskrantz, 1970), the types of
evidence that have been used to distinguish STM and LTM (Bower, 1972b;
Estes, 1972; Johnson, 1972; Murdock, 1993) and the actual capacity of STM
(Broadbent, 1975; Crowder, 1976; Mandler & Dean, 1969; Wickelgren,
[...]).
Chunking has also been used to explain the performance of animals on a variety of
sequential tasks, e.g., detecting the temporal pattern of the amount of
reinforcement that can be earned from trial to trial (Capaldi et al.,
1986), learning a rule for responding to a row of response levers in a
particular sequence (Fountain & Annau, 1984), learning to execute a
simultaneous chain on the basis of qualitative similarities between list
items (Terrace, 1987) and organizing response sequences in a radial maze
on the basis of the different types of food reward used to bait each
arm (Dallal & Meck, 1990). Dallal and Meck's study is of
particular interest because it demonstrates chunking in the serially
organized behavior of an animal that is free to respond in any order it
chooses [cf. free recall studies performed on human subjects (Bousfield,
1953; Bousfield et al., 1964)].
Absent from the literature on chunking, both human and animal, is an
operational definition. As Miller noted perceptively at the end of his
article: "...we are not very definite about what constitutes a chunk
of information" (Miller, 1956). Forty years and dozens of
published articles later, psychologists appear to be just as indefinite
(Cowan, 2001). The few definitions that have been proposed are ad
hoc and apply only to items that can be encoded verbally. Consider
an instructive example (from a recent commentary on Miller's paper) in
which a chunk is defined as "...a pronounceable label that may be
cycled within short-term memory" (Shiffrin & Nosofsky, 1994).
The capacity of STM is then defined as "... the number of labels
pronounceable in 2 s".
Without an operational definition of chunking, it is difficult to decide whether
the serial tasks on which animals have been trained are homologous to
those used in experiments on human chunking. For example, the
literature on human chunking is based on the performance of verbal
subjects on tasks that require the retention of sequential information in
STM. The literature on animal chunking is based entirely on the
performance of non-verbal subjects who were trained on tasks that require
the retention of sequential information in LTM. This problem is
compounded by differences in the pre-experimental histories of human and
animal subjects. Human subjects are typically college students who
have had extensive experience at learning lists. Animal subjects typically
have no prior training on serial tasks.
Verbal subjects have obvious advantages over non-verbal subjects. Verbal
subjects can readily grasp spoken and written instructions and they can
describe the experience of learning a list. Less obvious are certain
disadvantages of using human subjects to investigate basic processes of
memory. Foremost is the implication that chunking
presupposes linguistic knowledge and that subjects learn verbal
associations while mastering a list, whether between adjacent and
non-adjacent items or between an item and its ordinal position (e.g.,
"first", "fourth", "middle",
"second-before-middle", etc). Reliance on verbal
subjects also obscures the influence of previously acquired expertise at
chunking. That expertise is the product of years of experience with spoken
and written language and the memorization of innumerable rote lists (e.g.,
the alphabet, sequences of numbers, the planets, etc.).
The experiments I will describe in this chapter are concerned with long-term
memory of sequences in animals and humans. In some instances, the
experimenter provides a basis for organizing subsets of the required
sequence during training. In others, the subjects organized the
sequences they were asked to learn into smaller temporally defined
subsets, without any requirement to do so. The spontaneous
temporal organization of sequences will be used as a criterion for
defining output chunks, as contrasted with input chunks, the type of chunk
originally proposed by Miller (1956).
Simultaneous Chaining Procedure
In most of the experiments I will
review, sequences were trained as simultaneous chains, sequences that
differ in many respects from those used in previous studies of serial
learning in animals. The simultaneous chaining paradigm was first used in
an experiment whose purpose was to question claims about the grammatical
competence of apes, in particular, the claim that sequences trained by
rote were sentences (Straub et al., 1979). To simulate the conditions used
in ape language experiments, Straub et al. trained pigeons to respond to
randomly configured arrays of four colors in a particular sequence: red→green→yellow→blue.
All of the colors were presented simultaneously, in a different
configuration on each trial. Random variation of the position of the
colors from trial to trial ensured that subjects could not learn the
sequence as a specific motor program (i.e., as a successive chain).
Correct responses to each item allowed a trial to continue. A correct
response to the last item of the sequence produced food reward. An
error terminated the trial (e.g., forward errors, such as B, ABD,
C and AD, and backward errors, such as ABA).
Repetitive responses to the same item had no
consequences (e.g., AAABBCCCCD).
If training began with trials on which all of the items appeared
simultaneously, subjects might stop responding because of the low
probability of guessing the correct sequence by chance. On a simultaneous
chain, subjects have to determine the identity of each item
by trial and error. The probability of guessing A
on the first trial of a 4-item list is 1/4. Because repetitive
responses to the same items are not considered errors, the probability of
guessing B following a correct response to A is 1/3. Given the
generous assumption that subjects are able to recall any of the items to
which they've responded previously, the probability of guessing the entire
sequence is p = 1/4 × 1/3 × 1/2 × 1 = 1/24 ≈ .04. To avoid the risk of
extinction at the start of training, each list was introduced by the
successive phase method. A new item was added to the end of a partial list
each time the subject satisfied an accuracy criterion. On a 4-item
list, the successive phases of training were A, AB, ABC and ABCD.
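The chance calculation above can be checked with a short sketch. It makes the same generous assumption as the text, namely that the subject never returns to an item it has already identified, so each correct choice is drawn from the items not yet chosen. The function name `chance_of_guessing` is introduced here for illustration only.

```python
from fractions import Fraction


def chance_of_guessing(n_items: int) -> Fraction:
    """Probability of producing an n-item simultaneous chain by guessing.

    Assumes perfect memory for items already responded to, so the
    k-th correct choice is made among the items not yet chosen.
    """
    probability = Fraction(1)
    for remaining in range(n_items, 0, -1):
        probability *= Fraction(1, remaining)
    return probability


print(chance_of_guessing(4))         # 1/24
print(float(chance_of_guessing(4)))  # about 0.04
```

Under the same assumption the probability shrinks factorially with list length, which is why the successive phase method adds only one new item at a time.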
Subjects of the Straub et al. experiment learned the 4-item list
of colors on which they were trained. That result suggested a
simpler interpretation of experiments purporting to demonstrate the
grammatical ability of apes. Why interpret the sequences of plastic
chips or lexigrams that apes were trained to produce as anything more
complicated than rotely learned sequences whose function was to obtain
some specific reward [e.g., Mary give Sarah apple
(Premack, 1976) or Please machine give apple
(Rumbaugh, 1977)]? As Terrace (1979) has noted, there is no evidence that
the first 3 symbols of these sequences had any meaning for the apes.
The sequences on which pigeons (and apes) were trained are also of interest
because they cannot be explained as successive chains, the kind of
sequence that has been used in previous experiments on serial learning
(Terrace, 1984). As illustrated in the following comparison, successive
and simultaneous chains are fundamentally different:
Successive Chain: SA: RA ----> SB: RB ----> SC: RC ----> SD: RD ----> SR
Simultaneous Chain: SA SB SC SD: RA ----> RB ----> RC ----> RD ----> SR
Unlike the successive chaining paradigm, a simultaneous chaining paradigm presents
all list items throughout each trial (e.g., the numbers on the face of a
telephone). In a successive chain, the subject encounters each cue
individually (e.g., the choice points in a maze). A second
difference is the variation of the physical configuration of list items
from trial to trial. This prevents subjects from using a particular
physical sequence of responses to produce the required list (for example,
when making a telephone call with a sequence of rotely learned movements
on a number pad). To execute a simultaneous chain correctly, the
subject has to respond to each item in a particular order, regardless of
its spatial position.
A third distinguishing feature of the simultaneous chaining paradigm is the
absence of differential feedback during the execution of a correct
sequence. Following a correct response to item n, no information is
provided as to the identity of item n+1. Consider, for example, the
consequences of responding to item B on the 4-item list, ABCD.
After responding to B, subjects are given no information that the next
response should be directed to C (as opposed to A or D).
Another important difference between successive and simultaneous chains is
the order in which individual responses are trained. On simultaneous
chains, the first response is trained first. On successive chains, the
last response must be trained first. The backward training of a
successive chain follows from theoretical analyses of the backward effect
of reinforcement for each response of a successive chain (Hull, 1932) and
the Law of Chaining (Skinner, 1938). The idea was that response
sequences can be broken down into a series of linked responses: "The
response of one reflex may constitute or produce the eliciting or
discriminative stimulus of another" (Skinner, 1938, p. 32).
Skinner illustrated the Law of Chaining with the method he used to train a
rat to press a bar to obtain food. First the rat is trained to
approach the food tray. That behavior consists of the 2-response
sequence: SD (tray): R1 (approach) ---->
SR (food): R2 (approach, seize food). Subsequently, the rat is shaped
to perform a longer sequence: SD (visual lever): R1 (approach, lifting) ---->
SD (tactual lever): R2 (pressing) ---->
SD (sound of magazine): R3 (approach tray) ---->
SR (food): R4 (seize food).
Simultaneous chains cannot be trained backwards because subjects have a strong bias for
making forward errors. Consider, for example, the backward phases of
training that would be needed to train a 4-item sequence: D, CD, BCD, ABCD.
During the initial phase of training, each response to D is followed by
reinforcement. When C & D are presented simultaneously, the subject
persists in responding to D (a forward error). Since each response
to D ends the trial without reinforcement, responding to D is eventually
extinguished. Attempts to overcome the bias for responding to D have been unsuccessful.
Training the sequence in a forward manner (A, AB, ABC, ABCD)
avoids this problem. After learning to respond to A, the subject is
trained to respond to displays of A and B in the sequence AB.
Since repetitive responses were not treated as errors, the only
consequence of additional responses to A is to prolong the trial.
Eventually, the subject responds to B and the trial ends with
reinforcement. The same process is repeated when C & D are introduced.
Learning by Pigeons
Straub & Terrace (1981) extended Straub et al.'s findings by evaluating
subjects' knowledge of associations between non-adjacent items.
First, Straub & Terrace (1981) showed that the number of sessions
needed to satisfy
the accuracy criterion increased progressively each time the list was
lengthened. The relevant
data are shown in Figure
1. Straub & Terrace then administered a subset test that consisted
of the six 2-item pairs that could be derived from a 4-item list (AB, AC,
AD, BC, BD & CD). For 5 of the 6 subsets, accuracy of responding to
the subsets exceeded the level predicted by chance and was uniformly high
across all subsets. The one exception was subset BC. The relevant data are
shown in Figure 2.
Performance on the subsets BC and BD is puzzling for different reasons.
Chance performance on the subset BC is puzzling because subjects completed
the required sequence (ABCD)
correctly on 75% of the trials during a criterial session that was
administered immediately prior to the subset test. Indeed, the conditional
probability of responding to C (given a response to B), was greater than
0.9. The contrast between the chance level of responding to BC
during the subset test and the high transitional probability of responding
correctly to BC during the 4-item phase of training, suggests that the
response to C was conditional upon the sequence AB.
The high level of accuracy to subset BD is equally puzzling as it provided
neither the opportunity to start the sequence at A nor the opportunity to
respond to C before D.
The latencies of the responses to the first and second item of each subset
provide important clues as to the manner
in which subjects represent the sequence. The latency of the first
response to subsets beginning with A was approximately half a second
faster than the latency of the first response to subsets beginning with
either B or C. The relevant data are shown in Figure
3. For all subsets, the latency to the second item was shorter than
the latency to the first item. These data suggest that pigeons adopted the
following rules when responding on the subset test:
Rules used by pigeons to solve subset tests
1.) respond first to item A.
2.) respond last to item D.
3.) respond to any other item by default.
These rules predict the latencies shown in Figure 3. The short latencies to the
first item of subsets that began with A (AB, AC, and AD) follow from the
subjects' extensive histories of responding to A. It should take
less time to apply rule 1 to subsets beginning with A than to apply rule 2
to subsets that don't begin with A. Application of the default rule
(3) predicts shorter latencies to the second item of each subset than to
the first item. Rules 1-3 predict accurate performance on all subsets
that contain an end item (AB, AC, AD, BD and CD). By contrast, they
provide no basis for selecting B or C.
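The predictions of the three rules can be worked through mechanically. The sketch below is only an illustration of the rules as stated in the text, not a model proposed in the original studies; the function name `predicted_first_response` is hypothetical. It returns the item a subject should choose first in a two-item subset, or `None` when the rules give no basis for ordering the pair.

```python
from itertools import combinations


def predicted_first_response(pair, first_item="A", last_item="D"):
    """Return the item the rules select first, or None if they give no basis."""
    if first_item in pair:
        # Rule 1: respond first to item A.
        return first_item
    if last_item in pair:
        # Rule 2: respond last to item D, so the other item goes first.
        return next(item for item in pair if item != last_item)
    # Rule 3 (respond to any other item by default) cannot order the pair.
    return None


for pair in combinations("ABCD", 2):
    print("".join(pair), predicted_first_response(pair))
```

Of the six subsets of a 4-item list, only BC is left undetermined, which matches the chance-level accuracy reported for that subset.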
The extensive training that pigeons need to master a 4-item list (more than 3
months) suggests that 4 items may approach the limit of their memory span.
For human subjects the classic remedy for overcoming limitations of memory
span is to reorganize unrelated list items into chunks (Miller, 1956).
The efficacy of that approach was evaluated with pigeons that were trained
to learn 5-item lists composed of colors and achromatic geometric forms
(Terrace, 1991; Terrace, 1987). To differentiate the types of items
used on each list, colors are represented by unprimed letters; achromatic
geometric forms, by primed letters. Two lists provided a basis for
chunking similar items: ABCD'E' and ABCDE'.
Control groups learned lists in which colors and forms were interspersed (AB'CD'E and ABC'DE)
or which consisted of 5 colors (ABCDE).
Lists that provided a basis for chunking were learned twice as rapidly as those
that could not be chunked. The relevant data are shown in Figure
4. These results are consistent with the hypothesis that
pigeons could organize the segregated lists into two chunks: [ABC] and [D'E'], or [ABCD] and [E'].
The number of sessions needed to master successive phases of training provides
another basis for differentiating the chunking and the control groups. The
relevant data are shown in Figure 4. The control groups needed
progressively more time to master a particular phase each time a new item
was added. That pattern of acquisition is consistent with the
results of the Straub and Terrace (1981) study in which pigeons were
trained to execute 4-item lists (cf. Figure
1). A different pattern was observed in the case of the two
chunking groups. The phase of training in which an achromatic item
was added to the colored items was completed more rapidly than the
previous phase (in which the subject was required to produce a list
containing one fewer item). The advantage of the two chunking groups
cannot be attributed to release from proactive inhibition (Keppel &
Underwood, 1962). List ABC'DE,
which provides an opportunity for the release of proactive inhibition at
the start of training on the ABC'D
phase, took as much time to master as lists AB'CD'E and ABCDE.
Further evidence that pigeons chunk similar items on clustered lists was provided
by analyses of the time it took each group to execute the 5-item list and
by their performance on subset tests (Terrace, 1991; Terrace &
Chen, 1991). As shown in Figure 5, the two "chunking"
groups executed their lists more rapidly than any of the control groups
(on average, 5.8 vs. 7.2 sec.). Curiously, these data appear to be the
only data in the animal and human literatures on serial learning which
show that chunked sequences are executed more rapidly than unchunked
sequences (Terrace & Chen, 1991a).
Figure 5 shows two components of the time needed to execute a simultaneous
chain: latency and dwell times. Latency
is the time that precedes the initial response to each item. Dwell
time is the interval between the first and the last response to each item.
If pigeons responded only once to each item, dwell time would be zero.
Since pigeons tend to make multiple responses to each item, dwell time is
typically longer than the latency of the first response to that item.
The data presented in Figure 5 show that dwell time varied considerably
within lists and between groups.
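The two temporal measures can be made concrete with a small sketch. It assumes a trial is recorded as a time-ordered list of (time in ms, item) responses, and it measures each latency from the last response to the preceding item (or from trial onset for the first item). That reading of "the time that precedes the initial response", the data layout, and the name `latency_and_dwell` are assumptions of this illustration rather than details given in the text.

```python
from itertools import groupby


def latency_and_dwell(events, trial_start=0):
    """Return (item, latency, dwell time) for each item responded to.

    events: time-ordered (time_ms, item) pairs; repeated responses to
    the same item are assumed to occur consecutively.
    """
    results = []
    previous_end = trial_start
    for item, run in groupby(events, key=lambda event: event[1]):
        times = [time for time, _ in run]
        latency = times[0] - previous_end  # time before the first response
        dwell = times[-1] - times[0]       # first to last response to the item
        results.append((item, latency, dwell))
        previous_end = times[-1]
    return results


# Hypothetical trial: two pecks to A, one to B, two to C.
trial = [(800, "A"), (1000, "A"), (1900, "B"), (2500, "C"), (2900, "C")]
print(latency_and_dwell(trial))
```

With a single response to an item (B in the example), dwell time is zero, as noted above.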
Figure 6 shows that dwell time increased at the item that preceded purported
chunk boundaries. For the
non-chunking groups, dwell time decreased gradually as the pigeon worked
its way through the sequence and showed no abrupt increases. Taken
together, the temporal data shown in Figures 5 and 6 support the
hypothesis that the pigeons chunk list items that are segregated into
qualitatively different segments. The latencies of the first
response to successive items were shorter for the two chunking groups than
they were for the two non-chunking groups. That factor resulted in
faster times for executing the entire list on the part of the two chunking
groups. Short latencies may reflect a relatively rapid search time for
locating qualitatively similar items (in this instance, chromatic
stimuli). The dramatically longer dwell times at the last item of
purported chunks suggest that the pigeons used that time to locate the
remaining (achromatic) items.
After subjects satisfied the accuracy criterion on a 5-item list, they were
given a two-item subset test. Rules similar to those derived for
4-item lists (rules 1-3) provided a basis for predicting performance on each
of the 10 types of subset that could be derived from a 5-item list.
Subjects from all groups would be expected to respond accurately to the 7
subsets that contain either a start or an end item (A, E and E').
However, different predictions follow for the chunking and the control
groups in the case of the 3 subsets that were composed of interior items.
If the list ABCD'E'
was parsed as two chunks, [ABC] and [D'E'],
and if those chunks were functionally equivalent to 3- and 2-item lists,
subjects should respond at a greater than chance level of accuracy to all
of the 3 "internal" subsets that can be generated from the
original list (BC, BD' & CD'). Similarly, if the list ABCDE'
was executed as the chunks [ABCD]
& [E'], subjects should respond at a greater than chance level of
accuracy to the subsets BD & CD, but not to the subset BC.
The 3 control groups would be expected to respond to subsets composed of
interior items at chance levels of accuracy (subsets BC, CD & BD
on list ABCDE,
subsets B'C, CD' & B'D' on list AB'CD'E,
and subsets BC', C'D & BD on list ABC'DE).
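These predictions can be enumerated for any proposed parse of a list. The sketch below assumes that an item can serve as a basis for ordering a subset if it begins or ends a chunk, which is one way of reading the claim that chunks are functionally equivalent to short lists. The function name `predicted_chance_subsets` is hypothetical.

```python
from itertools import combinations


def predicted_chance_subsets(chunks):
    """Two-item subsets expected to be at chance, given a parse into chunks.

    Assumption of this sketch: a subset can be ordered only if it contains
    an item that begins or ends a chunk.
    """
    items = [item for chunk in chunks for item in chunk]
    anchors = {chunk[0] for chunk in chunks} | {chunk[-1] for chunk in chunks}
    return [
        pair for pair in combinations(items, 2)
        if not (set(pair) & anchors)
    ]


print(predicted_chance_subsets([["A", "B", "C"], ["D'", "E'"]]))  # chunked list
print(predicted_chance_subsets([["A", "B", "C", "D"], ["E'"]]))   # chunked list
print(predicted_chance_subsets([["A", "B", "C", "D", "E"]]))      # control list
```

On this reading, no subset of ABCD'E' is predicted to be at chance, only BC is for ABCDE', and all three interior subsets are for the unchunked list ABCDE.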
These predictions were confirmed for each of the 50 subsets that were tested
after subjects satisfied the accuracy
criterion (5 lists x 10 subsets for each list). All groups responded
at high levels of accuracy to subsets that contained the first or the last
items (A, E or E'). The chunking groups responded at similar levels of
accuracy to subsets that contained purported chunk boundaries.
By contrast, accuracy on internal subsets that lacked a chunk boundary did
not exceed the level predicted by chance. The relevant data are shown in Figure 7.
What Is A Chunk?
The facilitatory effects of clustering similar items on a 5-item list appear to
be prima facie evidence of chunking by pigeons. However, that evidence
does not stand up to scrutiny when evaluated as a means of enhancing STM,
the defining characteristic of a chunk proposed by Miller. The basic
function of a chunk is to enhance STM. Yet studies of animal chunking have
relied exclusively on tasks that require long-term rather than short-term
memory. This is true not only of the experiments on simultaneous
chaining described in the previous section, but also of experiments in
which animals learned rules concerning the spatial organization of
different reinforcers (Dallal & Meck, 1990), the monotonicity of
changes in the relative magnitude of reinforcers (Capaldi et al., 1990;
Hulse, 1978), and the temporal and spatial patterns of reinforcers
(Fountain et al., 1984). In each instance, the same items were
repeated in the same sequence on each trial. That would rule out the
limited capacity of short-term memory as an explanation of the facilitory
effects of grouping particular sets of stimuli during training.
Instead, these effects appear to result from organizational processes that
occur during the retrieval of familiar information from LTM.
To distinguish between the organizational principles used to encode new
information in STM and to retrieve familiar information from LTM, I will
refer to the former as input chunking, and to the latter as output
chunking. Postulating a second type of chunking does, of course,
raise the same definitional questions that apply to the general concept of
chunking. In the case of output chunks, however, some recent experiments
on the execution of simultaneous chains by monkeys and college students
suggest that the temporal organization of a sequence can be used to
define output chunks. These experiments and their background are reviewed
in the next section.
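One way to turn temporal organization into an objective criterion is sketched below. It treats any inter-response time (IRT) that exceeds the median IRT of a trial by a fixed ratio as a pause, and starts a new output chunk after each pause. The median baseline, the default `ratio` of 1.5, and the name `split_into_output_chunks` are assumptions made for illustration; they are not the criterion used in the experiments.

```python
from statistics import median


def split_into_output_chunks(items, irts, ratio=1.5):
    """Split an executed list into chunks at unusually long IRTs.

    items: responses in the order produced.
    irts:  irts[i] is the time between items[i] and items[i + 1].
    ratio: hypothetical threshold, as a multiple of the median IRT.
    """
    if len(irts) != len(items) - 1:
        raise ValueError("need exactly one IRT between each pair of items")
    baseline = median(irts)
    chunks, current = [], [items[0]]
    for item, irt in zip(items[1:], irts):
        if irt > ratio * baseline:  # a pause marks a chunk boundary
            chunks.append(current)
            current = []
        current.append(item)
    chunks.append(current)
    return chunks


# Hypothetical 7-item trial with pauses after the 3rd and 6th items.
print(split_into_output_chunks(list("ABCDEFG"), [0.4, 0.4, 0.9, 0.4, 0.4, 0.9]))
```

A criterion of this kind depends only on response timing, so it can be applied in the same way to verbal and non-verbal subjects.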
Learning By Monkeys
D'Amato & Colombo (1988) used the simultaneous chaining paradigm to train
monkeys to produce arbitrary 5-item lists. Of minor interest was their
finding that monkeys acquired 5-item lists more rapidly than pigeons.
Of greater significance were the results of a 2-item subset test.
Unlike pigeons, monkeys responded accurately to all 10 of the subsets that
can be derived from a 5-item list of heterogeneous items. Of
particular significance is their ability to respond accurately to subsets
composed exclusively of items from the middle of a list (BC, CD and BD).
As shown earlier (in Figure 7), pigeons responded at chance levels of
accuracy to subsets drawn from lists on which items weren't clustered. The
accuracy of each species on 2-item subsets is shown in Figure 8.
Monkeys and pigeons also differed with respect to the latencies of their responses
to the first and second items of each subset. The top portion of Figure
9 shows the latency of responding to the first item of a two-item
subset. For monkeys, the latency of responding to the first item
increased monotonically with the position of that
item on the original list. For pigeons, the position of the first
item on the original list had no effect on latency. As can be seen
in the bottom portion of Figure 9, the latency of the monkeys' responses
to the second item also increased monotonically as a function of the
number of items on the original list that intervened between subset items.
For pigeons, the size of that interval had no effect. These data show
that, unlike pigeons, monkeys form a linear representation of a list.
Functions similar to those shown in Figure 9 have also been obtained from
rhesus monkeys, who were trained to produce 4- and 6-item lists (Ohshiba,
1997; Swartz et al., 1991b), and from 4-year-old children, who were
trained to produce a 5-item list (McGonigle & Chalmers, 1996).
Another important difference between the serial skills of monkeys and pigeons was
the ease of acquiring new lists. Pigeons showed no signs of
improvement on successive 3- or 4-item lists, each composed of novel items
(digitized color photographs of natural scenes). Monkeys trained to
learn successive 4- and 6-item lists of different photographs became
progressively more efficient at mastering each list (Chen et al., 1991;
Swartz et al., 1991a). Indeed, after mastering approximately a dozen
4-item lists by the successive phase method, monkeys were able to learn
new 4-item lists on which all items were displayed from the start of
training (Chen et al., 2000).
A recent experiment by Terrace (2001) showed that monkeys could learn
7-item lists on which all items were introduced
at the start of training. Experimentally naïve monkeys were first trained
on 3- and 4-item lists on which all items were presented from the start of
training. The monkeys were then trained in the same manner on four
7-item lists. As shown in Figure
10, the monkeys not only mastered each list but they did so with
progressively fewer trials on each new list. To place this achievement in
perspective, the reader should note that the probability of guessing
correctly the ordinal position of each item at the start of training on a
7-item list is 1/7! = .0002 (assuming no backward errors). Thus,
monkeys are not only capable of learning arbitrary lists as long as phone
numbers, but they also became progressively more adept at devising
trial and error strategies for determining the ordinal positions of each
item during the course of mastering successive lists.
Knowledge of ordinal position
The availability of list-sophisticated monkeys provided an opportunity to
evaluate their knowledge of the ordinal position of list items with a
"derived list" paradigm used previously with human subjects (Ebbinghaus,
1964; Ebenholtz, 1963). In Ebenholtz's experiment, two groups
of college students learned two 10-item lists of nonsense syllables.
All of the items of List 1 were novel. Half of the items of List 2
were drawn from List 1. The remaining items were new. Items
derived from List 1 occupied every other position on List 2. For
Group I, the original ordinal positions of the derived items were
maintained on List 2. For Group II, they were changed. This
arrangement ensured that the subjects of each group had to learn the same
number of new item-item associations while mastering their derived lists.
If a subject's knowledge of the original list was limited to item-item
associations, both derived lists should be equally difficult. This
was not the case. Group I mastered its derived list more rapidly
than Group II. Indeed, Group II required as many trials to learn its
derived list as a control group needed to learn a single list. The
positive transfer shown by Group I provides compelling evidence that
subjects acquired knowledge of the ordinal position of list items while
learning List 1.
This test of ordinal knowledge was adapted for two monkeys (Franklin and
Rutherford) who learned to produce
4-item lists on which all items were present from the start of training.
Four derived lists, each containing 4 items, were composed of items drawn
from four previously learned 4-item lists (Chen et al., 1997).
The composition of the original and the derived lists is shown in Figure
11. Each item's original ordinal position was maintained on two
of the derived lists. The original ordinal position of the items was
changed on the other two derived lists. All items on the derived lists
were equally familiar since each of the original lists was trained to the
same accuracy criterion. Also, because each list contained only one item
from each of the previously learned lists, all previously acquired
item-item associations were irrelevant on both the maintained and changed
lists. The maintained lists could be executed correctly from the start of training by
using each item's original ordinal position as a basis for ordering the
newly juxtaposed items. On changed lists, the correct sequence could
only be determined by trial and error. To the extent that the
monkeys acquired knowledge of each item's original ordinal position, the
maintained lists should be easier to acquire than the changed lists.
As shown in Figure 12, the lists on which each item's ordinal position was maintained
were acquired rapidly and with
virtually no errors. The derived lists on which each item's ordinal
position was changed were as difficult to learn as novel lists. The only
explanation of the dramatic difference between the amount of training
needed to learn maintained and changed lists is that the subjects were
able to retrieve the ordinal positions of items from previously learned
lists while learning the derived lists on which the ordinal
positions of list items were maintained.
Temporal organization of simultaneous chains
Thus far, our description of performance on simultaneous chains has focused on
accuracy of responding during list learning, subset tests and tests of
ordinal knowledge using derived lists. Another important aspect of
simultaneous chains is their temporal organization. Reaction times (RTs)
to the first item of a list and interresponse times (IRTs) between
subsequent items can provide information about how a subject plans a
sequence. Consider the following two strategies for planning a sequence. The first is to
search for item1, respond to it, search for item2, respond to it, and so
on, until the sequence is completed. Another strategy is to scan all of
the items (or some subset of the sequence), and to then devise a plan for
executing the entire sequence (or some subset thereof) before making the
first response. The application of these strategies would result in
different patterns of responding to successive items of the list. If subjects
adopted a "select-one-item-at-a-time" strategy, they would need
progressively less time to select each item. With a
"plan-the-sequence-first" strategy, the RT of the first response
should be long and the IRTs between the remaining responses should be
relatively short. Initial analyses of the mean RT and IRTs of two
monkeys (Bugs and Garbo) trained on four 6-item lists supported the
"plan-the-sequence-first" strategy. As shown in Figure
13, the mean RT to item1 was long and the mean IRTs to the subsequent
5 items were uniformly short.
A replication of the multiple-list experiment with human subjects led to the
unexpected discovery that the uniformly short IRT functions obtained from
Bugs and Garbo were artifacts of averaging and that pauses occurred on
most trials, albeit at different positions. The reliability with
which pauses occurred, both on correct and incorrect trials, suggests that
they could be used to define the boundaries of output chunks (Terrace et al.).
The procedure and the apparatus used to train human subjects were similar to
that used to train monkeys to produce 6-item lists. Human
subjects (N = 40) learned 4 eight-item lists composed of achromatic
nonsense geometric shapes (Terrace et al., 1996). One of those lists
and two of the hundreds of different configurations of the list items on
which subjects were trained are shown in Figure 14. After
a 3-item practice list, subjects were told to determine, by trial and
error, the correct order in which to respond to 8 items displayed on the
monitor. As expected, human subjects learned their lists much more rapidly
than monkeys. Details of the list-acquisition process for each
species can be found in Swartz et al. (1991) and Jaswal (1995).
The mean latency functions obtained from human subjects are shown in
Figure 15. As was true of monkeys, the mean latency of
responding to the first item (2-3 sec) was longer than the uniformly
shorter mean IRTs between responses to subsequent items (0.75-1.5 sec).
For both species, the long latency of the response to the first item
appears to reflect the time needed to orient to the array of list items
and to search for the initial items (Sternberg et al., 1982).
A molecular analysis of these data revealed that the uniformly short IRTs
shown in Figure 15 were artifacts of averaging IRTs, across subjects and
trials, and that, on most trials, one of the IRTs was significantly longer
than the others. The longer IRT could not be detected in the average
functions shown in Figure 15 because the location of the pause varied from
trial to trial.
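The averaging artifact described above can be illustrated with a toy data set. The sketch below is not the authors' analysis: the IRT values, the number of trials, and the rule that rotates the pause evenly through every position are all assumptions chosen to make the effect easy to see (the text reports that real pauses clustered after the first few responses).

```python
from statistics import mean

N_IRTS = 7        # an 8-item list yields 7 IRTs after the first response
BASE_IRT = 0.75   # hypothetical ordinary IRT, in seconds
PAUSE_IRT = 1.5   # hypothetical pause, twice the ordinary IRT

def make_trial(pause_position):
    """Return one trial's IRTs with a single pause at the given position."""
    irts = [BASE_IRT] * N_IRTS
    irts[pause_position] = PAUSE_IRT
    return irts

# Rotate the pause through every position so its location varies across trials.
trials = [make_trial(k % N_IRTS) for k in range(700)]

# Averaging across trials at each serial position hides the pause.
mean_by_position = [mean(t[i] for t in trials) for i in range(N_IRTS)]

# Within each trial the pause is unmistakable.
ratio_by_trial = [max(t) / min(t) for t in trials]

print([round(m, 3) for m in mean_by_position])  # seven identical values of 0.857
print(min(ratio_by_trial))                      # 2.0
```

Every trial contains an IRT twice as long as the rest, yet the position-by-position means are flat, which is the pattern the mean latency functions showed.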
An analysis of each subject's IRTs on each correct trial showed that pauses
occurred on virtually every trial, typically after one of the first few
responses. Pauses were approximately twice as long as other IRTs. Figure 16
shows relativized data for each of the four lists on which human subjects
were trained. For example, when the response to C had the longest
latency, the location marked X-1 refers to the latency of the response to
item B, while X refers to the latency of the response to C. X+1
refers to the latency of the response to item D, X+2 to the latency of the
response to item E, and so on. On trials on which the latency of the
response to D was longest (solid triangles), X-2 refers to the latency of
the response to B, X-1 refers to the latency of the response to C, X to
the latency of the response to D, X+1 to the
latency of the response to E, and so on. The values of each function
were determined by locating the longest IRT on each trial and then
calculating the relative magnitude of the IRTs at other positions.
By definition, the maximum value of each function is 1.0. Analogous
functions were obtained from a molecular analysis of Bugs' and Garbo's IRT
data. These are shown in Figures
17 (Bugs) and 18 (Garbo).
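The relativization procedure described above (locate the longest IRT on each trial, express every other IRT as a fraction of it, and index positions relative to X) can be sketched as follows. This is a minimal reconstruction from the prose, not the original analysis code; the function names and the sample IRTs are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def relativize(irts):
    """Return (x, {offset: relative IRT}) for one trial.

    x is the index of the longest IRT; offset 0 corresponds to X,
    -1 to X-1, +1 to X+1, and so on.
    """
    longest = max(irts)
    x = irts.index(longest)
    return x, {i - x: irt / longest for i, irt in enumerate(irts)}

def relativized_functions(trials):
    """Group trials by pause location and average the relative IRTs at each offset."""
    groups = defaultdict(lambda: defaultdict(list))
    for irts in trials:
        x, relative = relativize(irts)
        for offset, value in relative.items():
            groups[x][offset].append(value)
    return {
        x: {offset: mean(values) for offset, values in sorted(by_offset.items())}
        for x, by_offset in sorted(groups.items())
    }

# Two made-up trials: the pause falls at index 1 on the first and index 2 on the second.
trials = [
    [0.8, 1.6, 0.8, 0.8],
    [0.7, 0.7, 1.4, 0.7],
]
functions = relativized_functions(trials)
print(functions[1])  # value at offset 0 is 1.0, other offsets are 0.5
```

Grouping by pause location yields one function per location, each peaking at 1.0 at X by construction.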
The procedure used to train monkeys and human subjects to execute simultaneous
chains lacked any contingencies that favored short or long IRTs at any
point of the required sequence. Subjects had ample time to complete each
trial. Indeed, less than 0.5% of all trials were ended because a
subject failed to respond to the list items in the time allotted for each
trial (20 sec. for human subjects; 15 sec. for monkeys). The only
requirement for reinforcement was responding to list items in the correct
order, at any pace.
Other evidence of
Studies of human list learning seldom analyze IRTs. The few that have done so also
reported pauses during the execution of sequences. For example,
Thorpe and Rowland (1965) described an experiment in which subjects
learned to produce 9-item lists of numbers. Subjects paused
spontaneously, typically after every 3rd response, as they executed those
lists. Similar results were reported by Ryan (1969), Wilkes
(1975), Wilkes and Kennedy (1969), and Wilkes et al. (1972). As in
the experiments described above, there were no constraints on the temporal
pattern of responding on each trial.
Pauses were also observed in an experiment in which human subjects were
trained to reproduce 12-item lists on which the items were presented
successively (Brannon, 1996). Following the presentation of
the 12th item, all items were displayed simultaneously. The
subject's task was to respond to the items in the order in which they were
presented. Thirteen of the 14 subjects paused at least twice on
correctly completed trials, typically after sequences of 3 or 4 items.
Pauses play an important role in Johnson's (1972) model of chunking, which
characterizes chunks as "opaque containers". Johnson's model
predicts long IRTs and high transitional error probabilities (TEPs)
between chunks but short IRTs and low TEPs within chunks. Johnson
confirmed these predictions in a task in which subjects were asked to
recall lists composed of clusters of letters (Johnson, 1970).
The paradigm that Johnson used differed in two respects from those used to
train simultaneous chains. The lists on which Johnson's subjects
were trained were segregated into temporally defined chunks throughout
training. Accordingly, the pauses Johnson observed could reflect the
structure provided by the experimenter. By contrast, no temporal structure
was provided for either monkey or human subjects in the experiments on
simultaneous chaining. The second difference is that TEPs cannot be
calculated for performance on simultaneous chains because trials were
terminated after any error. The pauses shown in Figures 16-18 are
nevertheless consistent with Johnson's conceptual analysis of chunks.
As suggested by Johnson, pauses occur because subjects need more time to
retrieve a chunk of list items from LTM than they need to execute a
response to a particular item from a chunk that has been downloaded into
working memory. Each chunk is held in working memory until the subject
responds to the relevant items in the correct order. Subjects then
retrieve another chunk, respond to the items it contains, and so on, until
they complete the list.
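The retrieve-then-execute account just described can be expressed as a toy timing model. The sketch below is an illustration only: the chunk sizes and the two time parameters are hypothetical, and the model ignores the extra orienting and search time that the text attributes to the first response of a trial.

```python
def simulate_latencies(chunk_sizes, retrieve_time, execute_time):
    """Return the latency of each response when a list is produced chunk by chunk.

    The first response of each chunk carries the cost of retrieving that chunk
    from long-term memory; the remaining responses carry only the execution cost.
    """
    latencies = []
    for size in chunk_sizes:
        for position in range(size):
            cost = execute_time
            if position == 0:
                cost += retrieve_time  # pause while the next chunk is downloaded
            latencies.append(cost)
    return latencies

# An 8-item list produced as chunks of 3, 3 and 2 items (hypothetical values, in seconds).
print(simulate_latencies([3, 3, 2], retrieve_time=0.75, execute_time=0.75))
# [1.5, 0.75, 0.75, 1.5, 0.75, 0.75, 1.5, 0.75]
```

With equal retrieval and execution times, the latency at each chunk boundary is twice that of the responses within a chunk, which matches the reported ratio of pauses to other IRTs.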
The optimal size of
(1972) model of chunking shows how a chunk size of 3 requires fewer
associative and inhibitory connections between list items than do larger
chunks. Using a different set of premises, McGonigle and
Chalmers (1996) show how the "exponential explosion of combinatorial
possibilities" with increasing chunk size favors an output chunk size
of 3-4 items. Wickelgren (1964, 1967) and Broadbent (1975) have also
provided empirical evidence that the optimal size of input chunks is 3-4
items. An extensive review of experiments on the capacity of STM concluded
that both empirical and theoretical analyses of chunking yielded an
optimal chunk size of 4 items (Cowan, 2001).
The spontaneous occurrence of pauses during the execution of a simultaneous
chain is significant because it reflects a self-imposed organization of
list items. In contrast to experiments in which the temporal structure of
a sequence is constrained by the subject's verbal history (e.g.,
Bousfield, 1953) or by the experimenter (e.g., Bower and Winzenz, 1969;
Johnson, 1972), the simultaneous chaining paradigm provides none. Pauses
serve a useful function in that they reduce the load on working memory
during the execution of a list. Less attention is needed to prepare
a relatively short sequence that, on average consists of 3-4 items, than
important feature of pauses is the variability of their location from
trial to trial. Although pauses occurred mainly after the response
to the 3rd or 4th items, the distribution of pause locations was broader.
This variability of the location of output chunks is to be contrasted with
the lack of variability in the structure of input chunks. In experiments
on STM, list items vary from trial to trial and subjects encode items in a
fixed manner as they memorize them. During training with the
simultaneous chaining paradigm, the same items are presented on each trial
and the required sequence doesn't vary.
Input vs. Output chunks
Input chunks reflect the limitations of working memory during the encoding of
new information, how new information is stored in long-term memory, and
how it is retrieved during recall. Output chunks reflect the
organization of over-learned motor programs that are generated
"on-line" in working memory.
The literature on the organization of sequential behavior rarely distinguishes
between input and output chunks [see Estes (1972) & Schneider and
Detweiler (1987) for exceptions]. Miller (1956) made the case
for the importance of organizing new material into input chunks to
compensate for the limitations of working memory. Earlier, Lashley
(1951) noted the importance of hierarchically organized motor programs in
his analysis of the inability of linear models to account for the speed
with which skilled sequences were executed. Chomsky made an
analogous argument in his influential analyses of grammatical structure (Chomsky,
1957, 1965). The experiments on simultaneous chaining described in this
chapter indicate that language is not crucial for generating hierarchical
motor programs spontaneously.
The influence of Lashley's and Chomsky's views has created the widespread
impression that the problem of serially organized behavior has been solved.
Instead of asking about the origins of serially organized behavior,
cognitive scientists (mainly theoretical linguists and investigators of
artificial intelligence) take for granted the availability of ordered
sets, recursive strings and other formal primitives in their theories of
seriation. The result is a confusion between theories of the
internal representation of serial knowledge in STM and LTM
("competence") and theories of the execution of serially organized
behavior ("performance"). That confusion is avoided by the
distinction between input and output chunks.
The concept of chunking has had a powerful influence on modern studies of
cognition, in particular, analyses of serially organized behavior.
However, the uncritical application of this concept has led to the
neglect of operational definitions of a chunk. The few definitions of
chunks that have been proposed assume that chunks are organized verbally
[e.g., (Simon, 1974) and (Shiffrin & Nosofsky, 1994)].
Such definitions are of no value in analyses of chunking in
animals. The problem of chunking can be clarified by distinguishing between input and
output chunks. Input chunks,
which are of fixed length and which have zero variability, reflect the
limited capacity of working memory to encode new information.
Output chunks, whose length can vary, reflect attentional
limitations of working memory during the generation of rotely learned
sequences that are composed of familiar items.
The results of recent experiments on list learning by monkeys and
humans confirm the validity of this distinction.
Anderson, J. R., & Bower, G. H. (1974). Human associative memory. Washington,
DC: Hemisphere Publishing.
Atkinson, R. C., & Shiffrin, R. M. (1971). The control of short-term memory.
Scientific American, 225, 82-90.
Baddeley, A. (1981). The concept of working memory:
A view of its current state and probable future development.
Cognition, 10, 17-23.
Baddeley, A. (1986). Working memory. Oxford, U. K.: Clarendon Press.
Baddeley, A. (1992). Is working memory working? The fifteenth
Bartlett lecture. Quarterly Journal of Experimental Psychology, 44A,
Bousfield, A. K., & Bousfield, W. A. (1966). Measurement of clustering and of
sequential constancies in repeated free recall. Psychological Reports,
Bousfield, W. A. (1953). The occurrence of clustering in the recall of randomly
arranged associates. Journal of General Psychology, 49, 229-240.
Bousfield, W. A., Puff, C. R., & Cowan, T. M. (1964). The development of
constancies in sequential organization during repeated free recall.
Journal of Verbal Learning and Verbal Behavior, 3, 489-495.
Bower, G. H. (1972a). Perceptual groups as coding units in immediate memory.
Proceedings of the Psychonomic Society, 27(4), 217-219.
Bower, G. H. (1972b). A selective review of organizational factors in memory. In
E. Tulving & W. Donaldson (Eds.), Organization of Memory (pp.
93-137). New York, NY: Academic Press.
Bower, G. H., & Winzenz, D. (1969). Group structure coding and memory for
digit series. Journal of Experimental Psychology Monographs, 80,
Brannon, E. (1996). Measures of chunking. Unpublished Master's thesis. Columbia University,
Broadbent, D. E. (1975). The Magic Number Seven After Fifteen Years. In A. W.
Kennedy (Ed.), Studies in Long-Term Memory (pp. 3-18).
John Wiley & Sons.
Capaldi, E. J., Miller, D. J., Alptekin, S., & Barry, K. (1990).
Organized responding in instrumental learning: Chunks and superchunks.
Learning and Motivation, 21, 415-433.
Capaldi, E. J., Nawrocki, T. M., Miller, D. J., & Verry, D. R. (1986).
Grouping, chunking, memory, and learning. The Quarterly Journal of
Experimental Psychology, 38B, 53-80.
Chen, S., Swartz, K. B., & Terrace, H. S. (1991, June). Preliminary evidence for
the development of learning set for 4-item lists by rhesus monkeys.
Paper presented at the annual meeting of the Eastern
Psychological Association. Washington, DC.
Chen, S., Swartz, K. B., & Terrace, H. S. (1997). Knowledge of
the ordinal position of list items in rhesus monkeys. Psychological
Science, 8, 80-86.
Chen, S., Swartz, K., & Terrace, H. S. (2000). Serial learning by rhesus
monkeys: II. Learning
4-item lists by trial & error. Journal of Experimental Psychology: Animal Behavior Processes, 26, 274-285.
Chomsky, N. (1957). Syntactic structures. The Hague, NL: Mouton Publishers.
Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Cowan, N. (2001). The magical number 4 in short-term memory: A reconsideration of
mental storage capacity. Behavioral and Brain Sciences, 24, 87-185.
Craik, F. I. M., & Tulving, E. (1975). Depth of processing and the retention
of words in episodic memory. Journal of Experimental Psychology,
Crowder, R. G. (1976). Principles of learning and memory. Hillsdale, NJ: Lawrence
D'Amato, M. R., & Colombo, M. (1988). Representation of serial order in
monkeys (Cebus apella). Journal of Experimental Psychology, 14, 131-139.
Dallal, N. L., & Meck, W. H. (1990). Hierarchical structures: Chunking by
food type facilitates spatial memory. Journal of Experimental
Psychology: Animal Behavior Processes, 16(1), 69-84.
Ebbinghaus, H. (1964). Memory: A contribution to experimental psychology. New York, NY: Dover.
(Original work published in 1885).
Ebenholtz, S. M. (1963). Serial learning: position learning and sequential
associations. Journal of Experimental Psychology, 66, 353-362.
Estes, W. K. (1972). An associative basis for coding and organization in
memory. In A. W. Melton & E. Martin (Eds.), Coding processes in
human memory (pp. 161-190). Washington, DC: Winston.
Fountain, S. B., & Annau, Z. (1984). Chunking, sorting, and rule learning from
serial patterns of brain-stimulation reward by rats. Animal Learning
& Behavior, 12, 265-274.
Fountain, S. B., Henne, D. R., & Hulse, S. H. (1984). Phrasing cues and
hierarchical organization in serial learning in rats. Journal of
Experimental Psychology: Animal
Behavior Processes, 10, 30-39.
Gardner, H. (1985). The mind's new science: A history of the cognitive
revolution. New York, NY: Basic Books Inc.
Hull, C. L. (1932). The goal gradient hypothesis and maze learning. Psychological
Review, 39, 25-43.
Hull, C. L. (1952). A behavior system: An introduction to behavior theory
concerning the individual organism. New Haven, CT: Yale Univ. Press.
Hulse, S. H. (1978). Cognitive structure and serial pattern learning by
animals. In S. Hulse & H. Fowler & W. K. Honig (Eds.), Cognitive
processes in animal behavior (pp. 311-340). Hillsdale, NJ: Lawrence
Jaswal, V. (1995). Acquisition of 8-item simultaneous chain by human subjects.
Unpublished Manuscript, New York.
Johnson, N. F. (1970). Chunking and organization in the process of recall. In G.
H. Bower (Ed.), The psychology of learning and motivation: Vol. IV. New
York, NY: Academic Press.
Johnson, N. F. (1972). Organization and the concept of a memory code. In A. W.
Melton & E. Martin (Eds.), Coding processes in human memory (pp.
125-159). Washington, DC: Winston.
Keppel, G., & Underwood, B. J. (1962). Proactive inhibition in short-term
retention of single items. Journal of Verbal Learning and Verbal
Behavior, 1, 153-161.
Lashley, K. S. (1951). The problem of serial order in behavior. In L. A. Jeffress
(Ed.), Cerebral Mechanisms in Behavior (pp. 112-136). New York,
NY: John Wiley & Sons.
Mandler, G., & Dean, P. J. (1969). Seriation: Development of serial order in
free recall. Journal of Experimental Psychology, 81(2), 207-215.
McGonigle, B., & Chalmers, M. (1996). The ontology of order. In L. Smith (Ed.),
Critical readings on Piaget (pp. 279-311). New York, NY:
Miller, G. A. (1956). The magical number seven plus or minus two: Some limits on
our capacity for processing information. Psychological Review, 63(2),
Mishkin, M., & Petri, H. L. (1984). Memories & habits: Some implications
for the analysis of learning and retention. In L. R. Squire & N.
Butters (Eds.), Neuropsychology of memory (pp. 287-296). New York, NY:
Murdock, B. B. (1993). TODAM2: A model for the storage and retrieval of item,
associative, and serial-order information. Psychological Review, 100(2),
Ohshiba, N. (1997). Memorization of serial items by Japanese monkeys, a chimpanzee,
and humans. Japanese Psychological Research, 39, 236-252.
Osgood, C. E. (1953). Method and theory in experimental psychology. New York,
NY: Oxford University Press.
Premack, D. (1976). Intelligence in ape and man. Hillsdale, NJ: Lawrence Erlbaum.
Rumbaugh, D. M. (1977). Language learning by a chimpanzee: The Lana Project. New
York, NY: Academic Press.
Ryan, J. (1969). Grouping and short-term memory: different means and patterns
of grouping. Quarterly Journal of Experimental Psychology, 21, 137-147.
Schneider, W., & Detweiler, M. (1987). A connectionist / control architecture
for working memory. The Psychology of Learning and Motivation, 21.
Shiffrin, R. M., & Nosofsky, R. M. (1994). Seven plus or minus two:
A commentary on capacity limitations. Psychological Review,
Simon, H. A. (1974). How big is a chunk? Science, 183, 482-488.
Skinner, B. F. (1938). The behavior of organisms. New York, NY:
Small, W. S. (1900). An experimental study of the mental processes of the rat.
American Journal of Psychology, 11, 80-100.
Squire, L. R. (1986). Mechanisms of memory. Science, 232, 1612-1619.
Sternberg, S., Knoll, R. L., & Wright, C. E. (1982). Control of rapid
action sequences in speech and typewriting. Murray Hill, NJ: Bell Laboratories.
Straub, R. O., Seidenberg, M. S., Bever, T. G., & Terrace, H. S. (1979).
Serial learning in the pigeon. Journal of the Experimental Analysis of
Behavior, 32, 137-148.
Straub, R. O., & Terrace, H. S. (1981). Generalization of serial learning in
the pigeon. Animal Learning and Behavior, 9, 454-468.
Swartz, K. B., Chen, S., & Terrace, H. S. (1991, June). Acquisition of 6-item
lists by rhesus monkeys. Paper presented at the Eastern Psychological
Association annual meeting.
Swartz, K. B., Chen, S., & Terrace, H. S. (1991). Serial learning by rhesus
monkeys. I. Acquisition and retention of multiple four-item lists. Journal
of Experimental Psychology: Animal Behavior Processes, 17, 396-410.
Terrace, H. S. (1979). Is problem-solving language? A review of Premack's
Intelligence in ape and man. Journal of the Experimental Analysis of
Behavior, 31, 161-175.
Terrace, H. S. (1984). Simultaneous chaining: The problem it poses for
traditional chaining theory. In M. L. Commons & R. J. Herrnstein
& A. R. Wagner (Eds.), Quantitative analyses of behavior:
Discrimination processes (pp. 115-138). Cambridge, MA: Ballinger
Terrace, H. S. (1987). Chunking by a pigeon in a serial learning task. Nature,
Terrace, H. S. (1991). Chunking during serial learning by a pigeon: I. Basic
evidence. Journal of Experimental Psychology: Animal Behavior Processes,
Terrace, H. S. (2001). Serial expertise and the evolution of language. In A.
Wray, J. R. Hurford, & R. Newmeyer (Eds.), The transition to language.
Oxford, U. K.: Oxford University Press.
Terrace, H. S., & Chen, S. (1991a). Chunking during serial learning by a
pigeon: II. Integrity of a chunk on a new list. Journal of Experimental
Psychology: Animal Behavior Processes, 17(1), 94-106.
Terrace, H. S., & Chen, S. (1991b). Chunking during serial learning by a
pigeon: III. What are the necessary conditions for establishing a chunk?
Journal of Experimental Psychology: Animal Behavior Processes, 17(1),
Terrace, H. S., Jaswal, V., Brannon, E., & Chen, S. (1996). What is a chunk?
Ask a monkey. Abstracts of Psychonomic Society, 1, 35.
Terrace, H. S., Son, L., & Brannon, E. (2001). The development of serial
expertise by rhesus macaques. Nature, (under review).
Thorpe, C. E., & Rowland, G. E. (1965). The effect of "natural"
grouping of numerals on short-term memory. Human Factors, 7, 38-44.
Underwood, B. J. (1957). Interference and forgetting. Psychological Review, 64,
Weiskrantz, L. (1970). A long-term view of short-term memory in psychology. In G.
Horn & R. A. Hinde (Eds.), Short-term changes in neural activity and
Cambridge, U. K.: Cambridge U. Press.
Wickelgren, W. A. (1964). Size of rehearsal group and short-term memory. Journal of
Experimental Psychology, 68(4), 413-419.
Wickelgren, W. A. (1967). Rehearsal grouping and the hierarchical organization of
serial position cues in short-term memory. Quarterly Journal of
Experimental Psychology, 19, 97-102.
Wilkes, A. L. (1975). Encoding processes and pausing behaviour. In A. Wilkes
& A. Kennedy (Eds.), Studies in long-term memory (pp.
19-42). Oxford, U. K.: John Wiley
Wilkes, A. L., & Kennedy, R. A. (1969). Relationship between pausing and
retrieval latency in sentences of varying grammatical form. Journal of
Experimental Psychology, 79(2), 241-245.
Wilkes, A. L., Lloyd, P., & Simpson, I. (1972). Pause measures during
reading and recall in serial list learning. Quarterly Journal of
Experimental Psychology, 24, 48-54.
This research was supported by grants from the
National Institute of Mental Health (MH40462), the Whitehall
Foundation and NATO. Correspondence concerning this article should
be addressed to H. S. Terrace, Department of Psychology, 418
Schermerhorn Hall, Columbia University, New York, NY 10027; email:
Figures 1, 2, & 3 are adapted from:
Straub, R. O., & Terrace, H. S. (1981). Generalization of serial learning in
the pigeon. Animal Learning and Behavior, 9, 454-468.
Figure 4 is adapted from: Terrace,
H. S. (1987). Chunking by a pigeon in a serial learning task. Nature,
Figures 5, 6, & 7 are adapted from: Terrace,
H. S., & Chen, S. (1991a). Chunking during serial learning by a
pigeon: II. Integrity of a chunk on a new list. Journal of Experimental
Psychology: Animal Behavior Processes, 17(1), 94-106.
Figure 8 is adapted from: Chen,
S., Swartz, K. B., & Terrace, H. S. (1997). Knowledge of
the ordinal position of list items in rhesus monkeys. Psychological
Science, 8, 80-86.