Abstracts of 1995 Conference Papers


Melissa Bateson & Alex Kacelnik
University of Oxford, Department of Zoology

Rate Currencies and the Foraging Starling

Optimality models are built on the assumption that the payoffs of different decisions can be expressed in a common currency. In classical optimal foraging models the currency maximized is generally the long-term rate of energy intake (the ratio of expected food over expected time). This currency is chosen because it is argued to be the best surrogate for Darwinian fitness on the grounds that natural selection should favor animals that on average collect more energy per unit of time spent foraging. We present experimental evidence that foraging starlings (Sturnus vulgaris) do not maximize long-term rate of energy intake, but instead maximize the expected ratio of food over time. This strategy results in apparently sub-optimal behavior in environments in which the time associated with the finding and handling of prey items is variable. We discuss how this result could be explained in terms of (i) unconstrained optimality and (ii) optimality models into which known psychological constraints are introduced. Finally we discuss the evidence pertaining to these different explanations.
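The two currencies contrasted above differ only in where the averaging happens, which a small numerical sketch can make concrete. The options, amounts, and times below are hypothetical and assume equally likely outcomes; they are not the authors' experimental values.

```python
def long_term_rate(amounts, times):
    """Ratio of expected food over expected time: E[food] / E[time]."""
    return sum(amounts) / sum(times)


def expected_ratio(amounts, times):
    """Expected ratio of food over time: E[food / time]."""
    return sum(a / t for a, t in zip(amounts, times)) / len(amounts)


# Hypothetical options: one food unit per prey item, equally likely outcomes.
fixed = ([1, 1], [10.0, 10.0])     # always 10 s per item
variable = ([1, 1], [2.0, 18.0])   # 2 s or 18 s per item, same mean time

for name, (amounts, times) in (("fixed", fixed), ("variable", variable)):
    print(name, long_term_rate(amounts, times), expected_ratio(amounts, times))
```

Under these assumed numbers the two options have the same long-term rate, yet the variable option has the higher expected ratio, so a forager using the second currency would prefer variable delays.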


Hilary A. Broadbent - University of Oxford
York A. Maksik & Russell M. Church - Brown University

A Fractal Analysis of Random Interval Data

Rats tested in random-interval schedules with various rates of reinforcement responded in a complex pattern of multiple periodicities. These periodicities occurred at various time-scales, from milliseconds to seconds to minutes, within sessions, suggesting the presence of self-similarity of pattern between these various time scales. The degree of self-similarity was studied by computing the fractal dimension of the data. The results obtained suggest that fractal analysis can provide a useful means of summarizing data, and may also help in developing a generating model that can account for observed patterns of responding (molecular phenomena) as well as for overall changes in response rate (molar phenomena) caused by changes in the pattern or rate of reinforcement presentation.


Jose E. Burgos
University of Massachusetts, Amherst

The Evolution of Neural Networks in Pavlovian Environments

A genetic algorithm (GA) was used to simulate the evolution of artificial neural networks (ANNs) in Pavlovian environments. ANNs consisted of neural processing elements (NPEs), whose functioning was described by a biobehaviorally motivated neurocomputational (NC) model. The GA consisted of rules for developing ANNs and selecting them for reproduction. Eight ANNs were developed from each chromosome of a pseudorandom genotype founder population. The resulting 800 ANNs were divided into groups of 200 ANNs, and trained in a forward-delay procedure with an interstimulus interval (ISI) of either 2, 4, 8, or 16 time-steps. Also, ANNs were trained with different conditioned stimuli. During training, NPE activations and synaptic efficacies were modified according to the NC model. Individual fitness was determined by performance after conditioning. Results demonstrated genotypic, architectural, and behavioral convergence. Genotypic convergence was expressed as an increase in average population chromosomic overlap as a function of generation. Architectural convergence was expressed as an increase in ANN size as a function of ISI and generation. Behavioral convergence was expressed as an increase in average population fitness, acquisition speed, and maintenance as a function of generation.


Michael L. Commons
Department of Psychiatry, Harvard Medical School

Richard Herrnstein's Vision and the Society for Quantitative Analyses

As one of the founders and then President of the Society for Quantitative Analyses of Behavior, Richard Herrnstein turned part of his vision of biological and behavioral science into a set of open and serious cross-disciplinary discussions. I will review the logic of his research and writing program.


John W. Donahoe
University of Massachusetts, Amherst

Experimental Analytical Constraints on Quantitative Modeling

To meet the demands of scientific interpretation (in Skinner's sense), a quantitatively expressed theory must implement only those processes that have been identified through experimental analysis. So defined, quantitative theorizing is neither a substitute for experimental analysis nor a surrogate for intervening variables. Instead, quantitative theory is a means for precisely tracing the implications of fundamental processes, especially when multiple processes act simultaneously over extended periods of time. Two areas are examined in the light of this view of quantitative theorizing--the treatment of choice by the matching principle and the treatment of neural networks in cognitive science. The normative forms of theorizing in both areas do not meet the criteria of scientific interpretation, although proposals that conform to such criteria are possible.


Edmund Fantino & Hernan Savastano
University of California, San Diego

Delay-reduction Theory: Support for Still Another Counterintuitive Prediction

According to delay-reduction theory, the effectiveness of a stimulus as a conditioned reinforcer may be predicted most accurately by calculating the reduction in the length of time to primary reinforcement measured from the onset of the preceding stimulus. We discuss research that assesses counterintuitive predictions of delay-reduction theory both in experimental analogues to foraging behavior and in choice as measured by the concurrent-chains procedure. In an ongoing experiment we ask if choice is controlled by the relative value of two outcomes (for example by the ratio of the rates of reinforcement correlated with them) as many theories require, or by the difference between their correlated reinforcement rates as required by delay-reduction theory. The data support delay-reduction theory.
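As a rough illustration of the delay-reduction calculation, the sketch below uses the commonly cited form in which a stimulus's strength depends on T - t, where T is the average time to primary reinforcement from the onset of the preceding stimulus and t is the delay signalled by the stimulus itself. The function names and numerical values are assumptions for illustration, and T is held fixed across the two examples for simplicity.

```python
def delay_reduction(total_time, terminal_delay):
    """Proportional reduction in time to primary reinforcement, (T - t) / T."""
    return (total_time - terminal_delay) / total_time


def predicted_choice_proportion(total_time, delay_left, delay_right):
    """Predicted preference for the left alternative from the two delay reductions."""
    left = total_time - delay_left
    right = total_time - delay_right
    if left <= 0 or right <= 0:
        raise ValueError("this sketch only covers delays shorter than total_time")
    return left / (left + right)


# Two hypothetical pairs of delays with the same 1:2 ratio but different differences.
p_short = predicted_choice_proportion(60.0, 10.0, 20.0)
p_long = predicted_choice_proportion(60.0, 20.0, 40.0)
print(p_short, p_long)
```

An account based only on the ratio of the two delays would treat these pairs alike, whereas the difference-based calculation predicts stronger preference for the second pair.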


Adam King & C. R. Gallistel
University of California, Los Angeles

Laplace Transform Models of Anticipatory Feeding and Classical Conditioning

Modern conditioning protocols are examples of multivariate (many CSs) non-stationary (reinforcement contingencies change) time series. The conditioning process is computationally specialized for solving such problems. How can we conceptualize the neurobiological bases for this computational capacity? One way the CNS might tackle the problem--a way that uses the many different clocks known to be present in the CNS--is by temporal filtering, which is a well documented process for extracting temporal structure in sensory systems. The temporal filters we suggest are like receptive fields; their response is determined by the match between the time line for a stimulus and the filter function. The use of exponentially decaying sinusoidal filter functions captures the temporal structure of each time line, allows the computation of the relations between time lines, and permits the recognition of non-stationarities. Processing a time line with such a filter bank is equivalent to computing its Laplace transform. Thus, a principal purpose of this talk is to present an intuitive conception of the Laplace transform and its possible contribution to our understanding of the conditioning process.
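The equivalence between a bank of exponentially decaying sinusoidal filters and the Laplace transform can be sketched numerically. The code below is a minimal illustration rather than the authors' model: it approximates the transform of a discretized time line at s = sigma + i*omega with a Riemann sum, and the names, the grid of filter parameters, and the single-event time line are all assumptions.

```python
import cmath


def filter_response(time_line, dt, sigma, omega):
    """Riemann-sum approximation of the Laplace transform at s = sigma + i*omega."""
    s = complex(sigma, omega)
    return dt * sum(x * cmath.exp(-s * k * dt) for k, x in enumerate(time_line))


def filter_bank(time_line, dt, sigmas, omegas):
    """Responses of every (decay rate, frequency) filter to one time line."""
    return {(sg, om): filter_response(time_line, dt, sg, om)
            for sg in sigmas for om in omegas}


dt = 0.1
time_line = [0.0] * 100
time_line[30] = 1.0 / dt  # unit-area event at t = 3.0 (hypothetical stimulus onset)

bank = filter_bank(time_line, dt, sigmas=[0.1, 0.5], omegas=[0.0, 2.0])
for params, response in bank.items():
    print(params, response)
```

For a single unit-area event at time T the response of each filter is exp(-s*T), so the magnitude of a response reflects the decay rate and its phase reflects the event's timing relative to the filter's frequency.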


Stephen Grossberg
Boston University

Adaptively Timed Reinforcement and Recognition Learning

The concepts of declarative memory and procedural memory have been used to distinguish two basic types of learning. A neural network model suggests how such memory processes work together as recognition learning, reinforcement learning, and sensory-motor learning take place during adaptive behaviors. To coordinate these processes, the hippocampal formation and cerebellum each contain circuits that learn to adaptively time their outputs. Within the model, hippocampal timing helps to maintain attention on motivationally salient goal objects during variable task-related delays, and cerebellar timing controls the release of conditioned responses. This property is part of the model's description of how cognitive-emotional interactions focus attention on motivationally valued cues, and how this process breaks down due to hippocampal ablation. The model suggests that the hippocampal mechanisms that help to rapidly draw attention to salient cues could prematurely release motor commands were not the release of these commands adaptively timed by the cerebellum. The model hippocampal system modulates cortical recognition learning without actually encoding the representational information that the cortex encodes. Learning within the model hippocampal system controls adaptive timing and spatial orientation. Model properties hereby clarify how hippocampal ablations cause amnesic symptoms and difficulties with tasks which combine task delays, novelty detection, and attention towards goal objects amid distractions. When these model recognition, reinforcement, sensory-motor, and timing processes work together, they suggest how the brain can accomplish conditioning of multiple sensory events to delayed rewards, as during serial compound conditioning.


Gene M. Heyman
Harvard University

The matching law: history and implications

This talk will include a discussion of some of the historical and conceptual factors relevant to the formulation and subsequent development of the matching law. The idea of a general law of response strength had been discussed since at least the publication of "Behavior of Organisms" (Skinner, 1938). Several quantitative forms were tried, but these proved of limited generality until the publication of Herrnstein's matching law (1970). A key feature of Herrnstein's approach was that in operant experiments there were important, but uncontrolled sources of reinforcement. In the matching law formulation, "spontaneous" sources of reinforcement competed with those that the experimenter had arranged and were represented by a fitted constant. In contrast, Skinner and others emphasized that operant experiments were controlled environments. Uncontrolled factors, when recognized, were treated as unwanted intrusions. Subsequent experiments and theoretical discussion led to the claim that the matching law provided an empirically-based alternative to rational choice theory. This idea remains controversial, and some recent experiments relevant to this controversy will be reviewed.
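The role of the fitted constant can be seen in the single-schedule form of the matching law, usually written B = kR / (R + Ro). The sketch below assumes that form; the parameter values are hypothetical and the function name is chosen only for illustration.

```python
def response_rate(r, k, r_o):
    """Herrnstein's hyperbola: response rate as a function of arranged reinforcement.

    r   -- rate of reinforcement arranged by the experimenter
    k   -- asymptotic response rate (fitted)
    r_o -- reinforcement from uncontrolled, "spontaneous" sources (fitted)
    """
    return k * r / (r + r_o)


# Hypothetical parameters: k = 100 responses/min, r_o = 20 reinforcers/hr.
for r in (5.0, 20.0, 80.0, 320.0):
    print(r, response_rate(r, k=100.0, r_o=20.0))
```

When the arranged rate equals r_o the predicted response rate is half of k, which is one way to read the constant: it is the arranged reinforcement rate needed to offset the competing uncontrolled sources.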


Philip N. Hineline, Paul Neuman, and William Ahearn
Temple University

Temporal Extension In Aversive And Appetitive Domains

The present line of work began three decades ago, in some experiments concerned with analyzing reinforcement of the behavior that we call avoidance. Specifically, Herrnstein and Hineline found that under certain conditions, laboratory rats' responding was sensitive to its effects on overall rates of aversive events. Hineline also verified that short-term and long-term sensitivities are not mutually exclusive, for similar animals postponed shock in the short term, even in arrangements where shock frequency was held constant. Eventually we recognized that this illustrates a more general principle -- first a distinction between immediate vs. more remote influences on behavior, and later a reconceptualization that accommodates multiple scales of process, each with its own pattern of behavior and level of analysis. This also applies in the appetitive domain where, for example, foraging in a depleting patch can be conceptualized as repeated choices in which immediate consequences of remaining in a food patch contrast with the longer term consequences of switching patches. Progressive schedules of reinforcement mimic this characteristic, as in a procedure by Hodos and Trumbule (1967), who exposed chimpanzees to a concurrent-chains procedure that involved progressive ratios. Their subjects performed in a manner that indicated sensitivity to the overall mean rate of reinforcement rather than to immediate consequences. Hineline and Sodetz replicated this result with monkeys, in an experiment reported at a SQAB meeting several years ago. In subsequent cross-species replications, however, we have found that pigeons tend to abandon a progressive schedule later than arithmetic averaging would predict. Other participants in the SQAB symposium (Shull and Spear, and later, Mazur and Vaughan) had predicted this before we carried out the replication with pigeons. An averaging technique based on reciprocals of distances to several reinforcers, each simultaneously operative, has been shown to be more effective for predicting pigeons' choices in progressive schedules that escalate by constant increments. At present, we are examining this averaging technique in further experiments with pigeons, in which the progressive-ratio requirements escalate by constant proportions instead of by equal increments. It appears that the sums-of-reciprocals measure is applicable here as well, although the effect appears to be sensitive to levels of food deprivation.


Peter R. Killeen
Arizona State University

Ro Means Galileo

Ro offered something new to behavior analysts: a hypothetical construct sanctioned by an establishment behaviorist. It was a mathematical, not verbal, construct, part of the powerful mathematical model called the Matching Law. Experiments support the model and the construct, but not Herrnstein's verbal interpretation. In this presentation I draw a parallel between Herrnstein's liberation of behavioral theory with this construct, and Galileo's liberation of physical theory from its Aristotelian framework. I then suggest an alternative interpretation of Ro as the carrier of information about the drive level of the organism and the incentive value of the reinforcer.


Armando Machado
Indiana University

A dynamic model of temporal regulation

Operant behavior may be controlled by the temporal properties of environmental events. To account for some of the data on temporal regulation I developed a dynamic, mathematical model that has four basic assumptions: 1) A time marker (e.g., food) initiates a serial process of activation of different behavioral states. The dynamics of the states are described by a gamma density function; 2) the rate of transition across states is generally proportional to the overall rate of reinforcement - assumptions 1 and 2 are mathematically equivalent to Killeen's Behavioral Theory of Timing; 3) a linear operator describes the process of conditioning presumed to take place between the behavioral states and the operant response; 4) the momentary strength of the operant response is given by the dot product between the activation of the behavioral states and their conditioning strength. I illustrate the predictions of the model for Fixed-Interval schedules, the peak procedure, the bisection procedure, and for Catania's (1968) experiment. Finally, I discuss some of the issues raised when we attempt to integrate models of the learning process with models of timing.
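The four assumptions can be strung together in a small simulation. This is a minimal sketch under stated assumptions, not the author's implementation: state activation is taken to be proportional to a gamma density in elapsed time (the Poisson form below), the linear operator only strengthens associations at the moment of reinforcement (extinction is omitted), and the number of states, transition rate, and learning rate are hypothetical.

```python
import math


def state_activation(n, t, rate):
    """Activation of state n at time t since the time marker.

    Proportional to a gamma density in t (assumed form for this sketch).
    """
    return math.exp(-rate * t) * (rate * t) ** n / math.factorial(n)


def reinforce(weights, t, rate, beta=0.1):
    """Linear-operator update: each state moves toward 1 in proportion to its activation."""
    return [w + beta * state_activation(n, t, rate) * (1.0 - w)
            for n, w in enumerate(weights)]


def response_strength(weights, t, rate):
    """Dot product of state activations and their conditioning strengths."""
    return sum(w * state_activation(n, t, rate) for n, w in enumerate(weights))


# Hypothetical fixed-interval training: food follows the time marker by 10 time units.
rate = 1.0
weights = [0.0] * 20
for _ in range(50):
    weights = reinforce(weights, t=10.0, rate=rate)

for t in (1.0, 5.0, 10.0):
    print(t, response_strength(weights, t, rate))
```

With these assumed values, response strength is low just after the time marker and rises toward the usual time of food, which is the qualitative pattern expected on a fixed-interval schedule.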


Frances K. McSweeney and John M. Hinson
Washington State University

A mathematical model for within-session changes in responding

A quadratic equation and combinations of linear, hyperbolic, power, and exponential functions were fit to the bitonic within session changes in response rates that are often observed during conditioning procedures. The difference between an exponential decay function and an ascending hyperbolic function described most of the data well even when the free parameter of the hyperbolic function was held constant at 0.18. These results suggest that different mechanisms produce the ascending and descending part of the bitonic function. The hyperbolic increases in responding are relatively constant across experimental procedures. A waning of arousal may produce the descending exponential function.


John A. Nevin
University of New Hampshire

Presidential Address

In 1970, Herrnstein presented a quantitative law of effect that characterizes both choice and the sheer output of behavior. At the end of this well-known article, he stated that "The territory circumscribed is sizable, expandable, and susceptible to precise measurement." This talk will trace its expansion over the past 25 years.


John A. Nevin
University of New Hampshire

Open Discussion: Richard J. Herrnstein, 1930-1994

When Richard Herrnstein died in September 1994, the science of behavior lost one of its most outstanding contributors. Among Herrnstein's many contributions, the matching law for choice is responses, stimuli, and reinforcers. Moreover, he applied it to topics ranging from pigeons' performance on single schedules to human economic and addictive behavior. Herrnstein also demonstrated that pigeons could learn to distinguish between open-ended, natural categories. He thereby brought the study of discrimination and generalization in non-human animals into direct contact with the study of concept formation and utilization in humans, and opened the way for exploration of the factors that determine effective categories. Herrnstein was instrumental in founding the Society for Quantitative Analyses of Behavior, and his work on choice gave it a firm foundation. We will miss his creative thinking, his incisive criticism, and his friendship. The first portion of the 1995 SQAB meeting is dedicated to his memory.


John A. Nevin - University of New Hampshire
Anthony McLean - University of Canterbury

Resistance to Noncontingency

Ideally, procedures for evaluating resistance to change should disrupt responding without altering the stimulus-reinforcer relations that determine resistance to change. We trained four pigeons on multiple VI 4-min schedules of food reinforcement and then shifted them to multiple VI 1-min, VT 4-min schedules, with two subsequent replications. For all birds and replications, the transition from contingent to noncontingent food reduced responding relatively more, and more rapidly, in the component with the leaner schedule. This result is consistent with previous research using intercomponent food, prefeeding, or extinction as disruptors. We consider several ways to quantify relative resistance in these data.


Leslie A. Real
Indiana University, Bloomington

Foraging and Cognitive Models of Choice.

I will review my lab's current research on the cognitive aspects of floral choice in bumble bees under conditions of reward variability. We have designed experiments to distinguish the computational algorithms that may underlie energetic rate characterizations in establishing currencies of choice. Foraging bees may use either long-term rate averaging or short-term rate averaging as alternatives. In a series of experiments, we tested individual foraging bees for their preference over Allais lotteries with different associated short-term and long-term rates of energy gain. In all cases, bees used short-term rate as their currency for choice. State-preference analysis of choices over fixed probability sets also indicates that individual bees may show significant subjective probability bias in a manner consistent with short-term rate averaging. I will conclude with a discussion on the adaptive value of short-term rate averaging and some of our work on field tests on the adaptive nature of the computational rules employed by foragers exploiting natural floral landscapes.


David W. Stephens
University of Nebraska, School of Biological Sciences

Saltatory Search Behavior: A Variational Approach

Many searching animals use a pause-travel pattern of movement. There have been two explanations proposed for this saltatory (pause-travel) search behavior. Evans and O'Brien (1988) argue that saltatory search occurs because the ability to detect prey degrades with search speed. If movement and prey detection are completely incompatible, a forager must either sit-and-wait, or punctuate its movement with periodic pauses. An alternative explanation is an economic one based on movement costs: alternating between rest and movement may be cheaper. The calculus of variations is used to model both of these possibilities. The resulting models are compared to observed search behavior of bluejays, Cyanocitta cristata, trained to search for food pellets in the laboratory. Spectral analyses are used to determine the qualitative and quantitative properties of observed search behavior.


Geoffrey K. White
University of Otago, New Zealand

Quantitative Analysis Of Remembering

Remembering is jointly determined by its consequences and the temporally distant stimuli comprising the to-be-remembered event. The scaling function relating remembering to temporal distance is characterized by a monotonic decrement in discriminability across time. The effect of the reinforcers that maintain remembering increases with increasing temporal distance of the to-be-remembered event. A quantitative analysis is presented that describes the interacting effects of the change in discriminability across time and the reinforcer differential.



Poster Session


Lee G. Bloomquist
Steelcase Inc.

Modeling Interaction With an Environment That Reinforces Common Knowledge

Certain formulations that model data acquired about organizations can be expressed in terms of "invariances" over intervals of time with observable measure. But in Decision Theory, explanations have been derived from statements implying "cause and effect" over an interval of time with measure zero. And from the former, a particular kind of metric space enables maximum expression of invariances. So for the first kind of formulation, a falsifiable hypothesis can be posited that its particular kind of metric space, versus the alternative, would have been selected through evolution for mediating interaction with the environment. Within computer simulations that assume (a) this hypothesis, (b) isomorphisms from the first model, above, to Situation Theory (a mathematical theory of information), and (c) agents learning by operant conditioning, "common knowledge" (as described in Decision Theory) appears to emerge. This is a path toward linking quantitative analyses of behavior to economic analyses of human interaction.


Michael L. Commons, Eric A. Goodheart, Rebecca M. Young, Wilson R. Fong
Harvard University

How Stage of Reasoning Affects Strategies in a Modified Prisoner's Dilemma

The prisoner's dilemma and Hardin's tragedy of the Commons both present the subject with a conflict between two different costs (i.e., a complex set of reinforcement contingencies). These studies demonstrate that people's behavior does not maximize their total payoffs (molar maximizing) when there is a sufficient short-term cost. The current study examined a more extreme experimental analog of these situations. Here, the short-term payoffs were identical but one choice depleted the total amount obtainable. Only high stage-performing subjects maximized. Low stage-performing subjects tended to reject the problem or adopt unsuccessful matching strategies. Poor assessment of risk may be associated with unwillingness to take risk.


Robert G. Cook - Tufts University
John T. Wixted - Univ. of California, San Diego

Models of Avian Multidimensional Perception and Discrimination

The behavior of six pigeons performing a multidimensional same/different choice discrimination was examined using a signal detection framework. On each trial, the animals had to choose between two "choice" hoppers depending on whether a color, shape, or redundant target signal was present or not in a briefly presented textured stimulus. ROC curves were produced by variations in signal presentation probabilities across conditions (e.g., in some conditions, a same trial was more likely than a different trial). Quantitative analyses of these curves were used to evaluate different models about the detection, discrimination, and integration of color and shape information by pigeons.


Jennifer Higa
Duke University, Durham

Time Discrimination and Within-Session Changes in Interfood Interval Duration

Previous studies show that pigeons can track unexpected changes in the duration of events. For example, on certain periodic reinforcement schedules, estimation of the time to the next food delivery is rapid and based on the just-preceding interfood interval (IFI). Under other conditions, however, time discrimination seems to depend on earlier events, for example on the shortest IFI in a series. A diffusion-generalization (DG) model provides a framework for exploring and understanding the effects of recent and older events. Results from experiments show dynamics of time discrimination in rats, clarify the conditions leading to rapid timing, and show the generality of effects across species and experimental conditions.


William R. Hutchison
Behavior Systems Inc.

Operant Neural Network Model Can Learn Useful Verbal Repertoires

Computer "autonomous agent" models based on neural networks (Maes, 1994) have proven very effective in learning fairly complex operant sensorimotor behaviors, sometimes categorized as contingency-shaped behaviors. This poster and demonstration show how more sophisticated repertoires can be trained using behavioral analyses and training technology, with an appropriate mathematical model of operant learning. A variety of cases will be described in the poster, but the demonstration will focus on one capability that cognitively-oriented neural network theorists have asserted to be impossible for a simple (pure operant) neural network. In this demonstration, the author applied Skinner's theory of verbal behavior to train the system to memorize, understand and follow verbal advice. Specifically, when presented with various objects, the system correctly tacts ("labels") each one (e.g., "green square"), then says aloud the memorized rule that matches the situation ("If green square then move back"), then follows the rule's advice (it moves back). This computer simulation provides a powerful sufficiency proof for the radical behaviorist view, since the system is a pure operant system with no additional mechanisms of the kinds assumed necessary by cognitive theorists (e.g., storage and processing of propositions). The procedure involved training a naive network to emit motor responses in the presence of specific external stimuli, to tact the same and other stimuli in its environment, to follow self-mands, and to memorize intraverbal "rules." These component skills enabled the network to perform both contingency-shaped and rule-governed behaviors and to shift control of behavior from one "mode" to the other as a function of contingencies.


Randolph C. Grace
University of New Hampshire

Rate of Change of Preference as a Function of Terminal-Link Duration in Concurrent Chains: Some Predictions and Preliminary Data

It is well established that preference in concurrent chains increases if the absolute duration of terminal-link delays increases, with the ratio held constant. This phenomenon is called the "terminal-link effect" and is predicted by various accounts of choice, such as delay-reduction theory, incentive theory, and my recent contextual choice model. However, there are no data on the rate of change of preference with respect to terminal-link duration; for example, preference could be a negatively-accelerated, linear, or positively-accelerated function of terminal-link duration. Pigeons responded in a three-component concurrent-chains procedure in which, for each condition, terminal-link delays were in constant ratio, but average terminal-link duration increased across components by a factor of two (7.5 s, 15 s, 30 s). Comparing the ratio of log response ratios across the components (15 s / 7.5 s and 30 s / 15 s) gives an estimate of the sign of the second derivative of preference with respect to terminal-link duration. Predictions of delay-reduction theory and the contextual choice model are presented, together with some preliminary data.
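One generic way to check curvature from three unequally spaced points is the second divided difference, sketched below. This is a standard numerical estimator swapped in for illustration, not the ratio-of-ratios comparison described in the abstract, and the preference values are hypothetical.

```python
def second_divided_difference(x, y):
    """Second divided difference of three points; its sign indicates curvature."""
    (x1, x2, x3), (y1, y2, y3) = x, y
    slope_12 = (y2 - y1) / (x2 - x1)
    slope_23 = (y3 - y2) / (x3 - x2)
    return (slope_23 - slope_12) / (x3 - x1)


durations = (7.5, 15.0, 30.0)  # average terminal-link durations in seconds

# Hypothetical log response ratios for three possible shapes of the function.
linear = (0.15, 0.30, 0.60)
negatively_accelerated = (1.0, 1.4, 1.7)
positively_accelerated = (1.0, 2.0, 5.0)

for label, pref in (("linear", linear),
                    ("negatively accelerated", negatively_accelerated),
                    ("positively accelerated", positively_accelerated)):
    print(label, second_divided_difference(durations, pref))
```

A value near zero indicates a linear function, a negative value a negatively accelerated one, and a positive value a positively accelerated one.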


Benjamin C. Mauro and Charles F. Mace - University of Pennsylvania
Amy Boyajian - Children's Seashore House

A Quantitative Analysis of Reinforcer Quality and Behavioral Momentum

We conducted an experiment with four rats to examine the effect of reinforcer quality upon behavioral momentum. Baseline involved a multiple variable interval variable interval schedule with unequal reinforcer quality (sucrose versus citric acid solutions) across components. The concentrations of these solutions were parametrically varied across three baseline conditions (0.075%, 0.75%, and 1.5% w/v solutions). Reinforcer quality was scaled along a continuum with sweet and sour represented at its extremes. A quantitative analysis showed that the relative rate of responding across components matched the relative quality of reinforcers. Extinction tests were performed after each baseline. An analysis of the slope of the log proportions of baseline responding showed that behavioral mass (m) was usually greater with the sucrose (m1) than citric acid (m2) solution. No differences in log m1/m2 were observed across different concentrations of the sucrose and citric acid solutions (cf. Nevin, Mandell & Atak, 1983).


N. A. Schmajuk, E. T. Axelrad, & B. S. Zanutto
Duke University, Durham

A Neural Network Approach to Animal Communication

We apply a neural network model to the description of animal communication. The model, used previously to describe avoidance learning by a single animal (Schmajuk, 1994), has a classical conditioning block, where classical associations between warning stimuli (WS) and shock (US) are formed; and an operant conditioning block, where operant associations between WSs and responses (R) are formed. We use two of these models to simulate Miller's (1960) "cooperative avoidance" task, in which two monkeys have to communicate in order to avoid shock presentations. In the task, one of the monkeys is presented with the WS but cannot generate R to avoid the shock. A second monkey is presented with the first monkey's calls but not with the WS, and is able to generate R to avoid the shock for both monkeys. Computer simulated results capture many aspects of this task.





Date Updated : May 31, 1999