Listed below are the SQAB abstracts for presentations, poster presentations, and tutorials. In addition, we've listed the abstracts for the two EAB-invited presentations for ABA.
Trial Duration Determines Pacemaker Rate
Trial duration was varied both within and between sessions in a two-alternative free-operant psychophysical choice procedure. Pigeons were reinforced for discriminating between the first and second halves of a trial, and total session duration was kept constant across conditions. Two keys were illuminated either both red or both green at the beginning of a trial. Half the trials were long (red keys) and half short (green keys). Responding during the first half of a trial on the left key was reinforced according to a VI 30 s schedule, and responding during the second half of a trial on the right key was reinforced according to a VI 30 s schedule. Psychometric functions describing the temporal discrimination of the various durations superimposed when plotted as a function of relative duration. Contrary to predictions of the behavioral theory of timing, estimates of pacemaker period varied as a direct function of trial duration. The coefficients of variation were relatively constant across trial duration, consistent with Weber's law. The present experiment will be discussed with regard to implications for current theories of timing.
Responses and Time in the Quantitative Analysis of the Delay-of-Reinforcement Gradient
Delay-of-reinforcement gradients relate some index of response strength (e.g., rate) to the separation between a response and subsequent reinforcer. They have been expressed mathematically in terms of either time or amount of behavior, and typically take the form of either exponential decay or hyperbolic (reciprocal) functions. In a two-key pigeon chamber, m left (L) pecks followed by n right (R) pecks were reinforced according to random-interval schedules, so that separations between the last L peck and a reinforcer could be manipulated by varying n and/or by differentially reinforcing either high or low R rates. Whether the delay term should be time or number of responses can be assessed by arranging roughly equal L-to-reinforcer times but different values of N or different times but equal values of N; the data favor time rather than responses as the appropriate delay metric. The account explicitly considers relative contributions of experimental and quantitative analysis to resolving the form of the gradient (e.g., effects of delay on L rate must not be confounded with generalization of rate differentiation from R to L). It also has practical implications for stimulus-control procedures (e.g., correction procedures in which errors are likely to precede reinforced corrects may actually maintain the errors they had been designed to reduce, because errors then often share with corrects large portions of the effects of the reinforcer).
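As a rough illustration of the two gradient forms named in this abstract, the sketch below evaluates an exponential-decay and a hyperbolic function of delay. The parameter names a (strength at zero delay) and k (decay rate), and their default values, are assumptions made for illustration and are not taken from the study. The delay argument can stand for either time or number of responses.

```python
import math

def exponential_gradient(delay, a=1.0, k=0.5):
    """Response strength decaying exponentially with delay: a * exp(-k * delay)."""
    return a * math.exp(-k * delay)

def hyperbolic_gradient(delay, a=1.0, k=0.5):
    """Response strength decaying hyperbolically with delay: a / (1 + k * delay)."""
    return a / (1.0 + k * delay)

if __name__ == "__main__":
    # Compare the two (hypothetical) gradients at a few delays.
    for d in (0, 1, 2, 4, 8):
        print(d, round(exponential_gradient(d), 3), round(hyperbolic_gradient(d), 3))
```

With equal a and k, the hyperbolic form falls more slowly at long delays, which is one reason the two forms can be told apart only when a wide range of delays is sampled.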
Learning to Play the Producer-Scrounger Game
Social foraging theory has developed a number of quantitative game models to deal with foraging decisions under conditions of frequency-dependence. Although many games were originally formulated in terms of genetic evolution, they also apply to the more likely cases where individuals modify their behaviour from experience. When this happens, individuals should alter their use of alternatives until the group reaches equilibrium proportions, a situation of developmental stability that is analogous to genetic evolutionarily stable strategies. Group foraging gives rise to producer-scrounger games where the "producer" (P) tactic searches for its food and the "scrounger" (S) exploits the P's discoveries. PS foraging economics are characterized by strong, negative frequency-dependence of the payoffs to S. S does better than P when it is rare but worse when it is common. I present a PS foraging model and then test its predictions showing that individual spice finches (Lonchura punctulata) foraging in small flocks adjust their use of P and S until the group converges towards the point of developmental stability. I use computer simulations to explore whether the relative payoff sum learning rule could account for the results.
Experimental and Theoretical Analysis of Temporal Dynamics: A Quantitative Model
I propose to address rapid time discrimination during interval reinforcement schedules. The focus is on the dynamics. I begin by presenting results from experiments in which animals are exposed to sudden transitions in interval duration. The primary dependent measure is postreinforcement wait time duration. Of particular interest are the local, interacting effects of within-session shifts in interval duration on the wait time in upcoming intervals. I plan to provide a quantitative analysis of behavior under these transitional conditions and identify key features of rapid timing. I then present a dynamic model that describes several of the rapid-timing effects. Features and points of discussion include: 1) evidence for a fast-acting timing process, where variations in interval duration cause systematic changes in wait times after relatively few sessions of training; 2) an analysis of how wait time duration depends on the number, spacing, and direction of transition in interval duration; 3) comparison of rapid timing effects in pigeons and rats and different methods of programming intervals; and 4) suggestions on what the dynamics indicate about the nature of the basic timing mechanism and a model for the process.
A General Model of Motivation
A model is proposed that includes both an energy variable and a threshold variable. The former has multiple sources of energy (both internal and external) and multiple means of dissipating energy (performance of the behavior, performance of alternative behavior, and passage of time). The latter incorporates all the other factors that influence the occurrence of the behavior including, but not limited to, circadian clocks, stimulus factors, and inhibition from other systems. I consider the threshold to be a structural variable because it sets limits on the occurrence of a particular behavior, but does not contribute directly to the level of motivation for performing that behavior. Some results from studies on attack and other behavior in fish, dustbathing in chickens, and sleep in humans will be reviewed to illustrate how this model can be quantified.
Choice in a Flock of Pigeons: Testing the Ideal Free Distribution
The distribution of an individual's behavior among choice alternatives grounded in the matching law parallels the distribution of individuals in a group among alternatives grounded in ideal free distribution theory. Research on the matching law has revealed deviations called undermatching and overmatching, toward lesser and greater sensitivity than predicted by the original theory, depending on factors besides the amount of food, such as switching requirements and scheduling of food. Similar deviations might occur from the ideal free distribution. We studied a flock of 34 domestic pigeons (Columba livia) feeding at two sites and varied both the degree of competition at the sites and the separation between them. Deviations from the ideal free distribution analogous to undermatching occurred, but degree of undermatching depended on degree of competition and separation of sites.
A quantitative model for the effects of intensity and duration of light on the mammalian circadian pacemaker
The mammalian circadian pacemaker consists of ~10,000 neurons in the suprachiasmatic nuclei (SCN) of the hypothalamus. Many of these neurons, when isolated in vitro, show a circadian rhythm of action potential firing. Non-electrical coupling between these neurons yields an average oscillation that is remarkably precise and stable. Direct photic stimulation of the SCN comes via specialized retinal ganglion cells of the retinohypothalamic tract.
We model the self-sustained rhythm of the neural aggregate as a single van der Pol oscillator of low stiffness (i.e., a quasi-sinusoidal oscillator). The stiffness coefficient is estimated by experimental interventions which change rhythm amplitude, A. We analyze the van der Pol oscillator in the Lienard form as a pair of coupled first-order differential equations to emphasize the essential 2-dimensionality. The variables are designated x and x_c. The variable x is closely represented by the endogenous core body temperature.
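To make the Lienard form concrete, here is a minimal sketch of the textbook van der Pol oscillator written as two coupled first-order equations and integrated with a fourth-order Runge-Kutta step. It is not the authors' parameterization: the stiffness value mu=0.1, the dimensionless time scale (natural period near 2*pi rather than 24 h), the absence of any light input, and all function names are assumptions.

```python
import math

def lienard_step(x, xc, mu, dt):
    """One RK4 step of the Lienard-form van der Pol system:
       dx/dt  = xc + mu * (x - x**3 / 3)
       dxc/dt = -x
    """
    def f(x, xc):
        return xc + mu * (x - x ** 3 / 3.0), -x

    k1 = f(x, xc)
    k2 = f(x + 0.5 * dt * k1[0], xc + 0.5 * dt * k1[1])
    k3 = f(x + 0.5 * dt * k2[0], xc + 0.5 * dt * k2[1])
    k4 = f(x + dt * k3[0], xc + dt * k3[1])
    x_new = x + dt / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    xc_new = xc + dt / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return x_new, xc_new

def simulate(mu=0.1, x0=0.1, xc0=0.0, dt=0.01, t_end=400.0):
    """Return the trajectory of x from a small initial displacement."""
    x, xc = x0, xc0
    xs = []
    for _ in range(int(round(t_end / dt))):
        x, xc = lienard_step(x, xc, mu, dt)
        xs.append(x)
    return xs

def late_amplitude(xs, dt=0.01, window=4 * math.pi):
    """Largest |x| over the final window (about two cycles) of the run."""
    n = int(window / dt)
    return max(abs(v) for v in xs[-n:])

if __name__ == "__main__":
    print(round(late_amplitude(simulate()), 3))
```

With low stiffness the trajectory grows from a small displacement and settles onto a nearly sinusoidal limit cycle, which is the sense in which a low-stiffness oscillator is quasi-sinusoidal.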
In nocturnal rodents, light pulses, activity and melatonin have all been shown to shift pacemaker phase, but in humans, so far, only light has a demonstrated strong phase-shifting ability. Light has been modeled as having a direct effect principally on the x differential equation and only a weaker effect on the x_c equation. Data clearly show the effect of light to be much stronger at night than by day, so a significant circadian modulation of light drive is incorporated in the model. Early realizations of the model postulated that the pacemaker might respond to different light intensities, I, in much the same way as humans perceive intensity, so the drive to the pacemaker was taken to be proportional to I^(1/3). (Reducing I to 1/8 of its value reduces effectiveness to 1/2.) Recent data substantiate this.
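The cube-root relation between light intensity and drive can be checked with a few lines of arithmetic. The function name light_drive and the gain parameter are hypothetical; only the one-third power comes from the abstract, and the circadian modulation of the drive is not modeled here.

```python
def light_drive(intensity, gain=1.0):
    """Drive to the pacemaker, taken as proportional to intensity ** (1/3)."""
    if intensity < 0:
        raise ValueError("intensity must be non-negative")
    return gain * intensity ** (1.0 / 3.0)

if __name__ == "__main__":
    full = light_drive(1000.0)
    dimmed = light_drive(1000.0 / 8.0)
    # Ratio of drives when intensity is cut to one eighth.
    print(dimmed / full)
```

Because (1/8)^(1/3) = 1/2, cutting intensity to one eighth halves the drive, whatever the starting intensity or gain.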
The model matches well the data from controlled laboratory experiments. It has also been applied with success in industrial situations where workers are required to adapt to rotating shift schedules.
Monte-Carlo Simulation Studies of Undermatching
A collection of 545 concurrent-schedule data sets was used as a criterion against which to evaluate the results of five computational experiments on the nature and causes of undermatching. An interesting and unexpected feature of the collection of data sets was a modest correlation between the exponent of the best-fitting power function and the coefficient of variation (CV) of the reinforcement-rate ratios. In other words, larger values of the exponent tended to be associated with larger ranges of reinforcement-rate ratios. For all computational experiments, response-rate ratios were assumed to exhibit homoscedastic, gaussian error around the log transform of the power-function matching equation. In the first two experiments, the exponent of the matching equation was assumed to equal the median exponent from the collection of data sets, either with or without gaussian error added. Neither experiment produced a good replication of the collection of data sets, indicating that the data are not consistent with a single, characteristic exponent for the power-function matching equation. In the third computational experiment, a linear relationship between the exponent and the CV of the reinforcement-rate ratios was introduced. This experiment produced an excellent replication of the data sets, indicating that the data are consistent with a characteristic exponent that has the additional property of varying as a function of the CV of the reinforcement-rate ratios. The remaining experiments tested various asymptotic forms of this functional relationship, and produced some evidence that the exponent may approach unity as the CV of the reinforcement-rate ratios increases.
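A computational experiment of the kind described here can be sketched as follows: generate log response ratios from the power-function matching equation with homoscedastic Gaussian error, then refit the exponent by least squares. The exponent of 0.8, the error standard deviation, the set of reinforcement-rate ratios, and all function names are assumptions chosen for illustration; the abstract does not report these values.

```python
import math
import random

def simulate_data_set(log_reinf_ratios, exponent, log_bias=0.0, sd=0.1, rng=random):
    """log(B1/B2) = exponent * log(R1/R2) + log_bias + Gaussian error."""
    return [exponent * x + log_bias + rng.gauss(0.0, sd) for x in log_reinf_ratios]

def fit_power_function(log_x, log_y):
    """Ordinary least squares in log-log space; returns (exponent, log_bias)."""
    n = len(log_x)
    mx = sum(log_x) / n
    my = sum(log_y) / n
    sxx = sum((x - mx) ** 2 for x in log_x)
    sxy = sum((x - mx) * (y - my) for x, y in zip(log_x, log_y))
    slope = sxy / sxx
    return slope, my - slope * mx

def monte_carlo(n_sets=500, exponent=0.8, sd=0.1, seed=1):
    """Fitted exponents from n_sets simulated concurrent-schedule data sets."""
    rng = random.Random(seed)
    ratios = [1 / 8, 1 / 4, 1 / 2, 1, 2, 4, 8]  # hypothetical reinforcement-rate ratios
    log_x = [math.log10(r) for r in ratios]
    fitted = []
    for _ in range(n_sets):
        log_y = simulate_data_set(log_x, exponent, sd=sd, rng=rng)
        fitted.append(fit_power_function(log_x, log_y)[0])
    return fitted

if __name__ == "__main__":
    exps = monte_carlo()
    print(round(sum(exps) / len(exps), 3))
```

Comparing the distribution of fitted exponents against an empirical collection, and then letting the generating exponent depend on the spread of the reinforcement-rate ratios, would follow the same pattern.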
Some effects of ITI duration on changeovers in discrete-trial choice
Pigeons served as subjects in two experiments that investigated molar and molecular determinants of choice behavior. Experiment 1 employed a discrete-trial concurrent variable interval 1-min variable interval 3-min schedule, and Experiment 2 employed a discrete-trial concurrent variable interval 1.5-min variable interval 1.5-min schedule. In each experiment, the durations of the intertrial intervals were 0 s, 6 s, 22 s, and 120 s, and the schedules were independent and interdependent. Experiment 1 found that relative response rate decreased from .75 toward .50 with both independent and interdependent schedules as the duration of the intertrial interval increased, but to a greater extent with independent schedules. Obtained relative reinforcement rate remained at .75 with the interdependent schedules as intertrial interval duration increased, but decreased from .75 toward .50 with the independent schedules, resulting in an overall closer approximation to matching with independent schedules. Both experiments found that for any given intertrial interval, changeover probabilities were variable for individual birds but without systematic trend as run length increased. Moreover, no systematic trends were apparent between basic changeover functions of independent and interdependent schedules, or when run length was calculated to begin after each reinforcement, in addition to after a response on the other key. In general, the data did not support molar effects in choice as being reducible to molecular.
A Relation Between Preference and Resistance to Change
Variables that influence preference have similar effects on resistance to change. We describe a method for evaluating both preference and resistance to change within experimental conditions using pigeons as subjects. One half of each session employs concurrent chains to measure preference between the terminal-link schedules of food reinforcement, and the other half of the session arranges the same terminal links as components of a multiple schedule. After performance stabilizes, the relative resistance to change of the multiple-schedule component response rates is evaluated. We varied relative food rates or delays over several conditions, and found that relative resistance to change in the multiple-schedule components was a power function of preference in the initial links of the concurrent chains. The power-function form follows from theoretical considerations set forth by Staddon in 1978, but the parameters of the function may depend on the arrangement of the terminal-link schedules and the way in which relative resistance is measured.
A Blind Reader for the Cognitive Map
A local diffusion model (Staddon & Reid, 1990) can reproduce exponential and Gaussian stimulus-generalization gradients. We show that this model, together with simple reinforcement assumptions, can also simulate the main features of stimulus generalization, such as peak shift and the generalization gradients produced by various 3-stimulus procedures. A 2-dimensional diffusion model can reproduce many of the empirical properties of goal-directed spatial search, including area-restricted search, open-field foraging, barrier and detour problems, short cuts, maze learning, and spatial "insight." The model provides a simple associationistic "reader" for Tolman's cognitive map.
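The flavor of a local diffusion process can be conveyed with a generic nearest-neighbor update along a line of units. This is a sketch under stated assumptions, not the update rule of Staddon and Reid (1990): the diffusion rate, the reflecting ends, and the function name diffuse are all hypothetical.

```python
def diffuse(activation, rate=0.25, steps=1):
    """Nearest-neighbor diffusion along a 1-D line of units with reflecting ends.

    Each step moves a fraction of the activation difference between a unit
    and each of its neighbors; total activation is conserved.
    """
    a = list(activation)
    for _ in range(steps):
        nxt = a[:]
        for i in range(len(a)):
            left = a[i - 1] if i > 0 else a[i]
            right = a[i + 1] if i < len(a) - 1 else a[i]
            nxt[i] = a[i] + rate * ((left - a[i]) + (right - a[i]))
        a = nxt
    return a

if __name__ == "__main__":
    start = [0.0] * 21
    start[10] = 1.0  # activation placed on the trained stimulus value
    gradient = diffuse(start, rate=0.25, steps=10)
    print([round(v, 3) for v in gradient])
```

Starting from activation on a single unit, repeated steps spread it into a symmetric, bell-shaped gradient centered on that unit.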
A general motivational threshold model applied to free feeding in rats
We propose a depletion-repletion model of motivated behavior applied to free-feeding in rats that accurately simulates number of meals, total intake, and average distribution of feeding across time. In this model, motivation to eat is a combined function of three depletion-repletion processes with different temporal delays and extent of action. All processes increase in value with time since feeding, and decrease with food intake. Onset of feeding occurs when the total motivation level exceeds a probabilistic threshold for eating, and stops when the motivation level falls below a second probabilistic threshold for cessation of eating. We were able to improve on previous motivational models of feeding by distinguishing day and night thresholds, adding an ultradian rhythm (Widman & Timberlake, 1995), and introducing a circadian oscillation that rose toward dawn. The final model is quite general and similar to a model of sleep by Daan and his coworkers and a model of dust bathing by Hogan and van Boxel (1993).
Generalization, Probability, and the Problem of Induction
I approach, once again, Hume's venerable "Problem of Induction." For clarity, I confine myself to what may be the simplest case -- that in which an intelligent agent (whether human, animal, or machine) is confronted with a continuing sequence of binary outcomes and attempts to anticipate the next outcome. I focus, particularly, on successive outcomes, such as those on which the edifice of science has been erected, that are consistent with the operation of some deterministic law. A familiar example, instanced by C. S. Peirce, is that in which each stone that is released from a height is found to fall, in accordance with Newton's law of gravitation. After N such outcomes, what justifies our expectation that the next rock to be released will also fall?
I consider a sequence of probability models devised to capture increasingly plausible assumptions the inducing agent might make about the world that is generating the sequence of outcomes. My arguments, originally suggested to me by extensions of my own theory of generalization (Shepard, 1987), draw, first, on Bayes's theorem of inverse probabilities, then, on subsequently developed theories of algorithmic complexity. I tentatively conclude that Hume's sceptical conclusion is based on a particular model that is not itself justified, and suggest that more justifiable probability models support conclusions that are more consonant with common sense as well as with the practices of science.
Response comparison, duration comparison, and a psychophysical/ remembering model
Pigeons were trained on a duration comparison task in Experiment 1. For each trial, a red light of one duration was followed by a green light of a different duration. Then a choice was made whether red or green lasted longer. Duration pairs ranged from shorter (e.g., 0.5-s red and 2-s green) to longer (e.g., 90-s red and 22-s green). Discrimination performance was a function of relative duration differences. Sensitivity declined as duration pairs became longer. Bias to report red as longer shifted to a bias to report green as longer as duration pairs became longer. Experiment 2 compared performance on the duration comparison task and an equivalent response comparison task where the task was to report whether more pecks had been emitted to red or green. The pigeons were more sensitive on the response comparison task, and the change in sensitivity and bias differed as a function of increases in the duration/response requirement pairs. The data from both experiments were well fit by a model suggested to us by Staddon.
and
Alan Silberberg American University and Walter Reed Army Institute of Research
A replication of Belke's (1992) and Gibbon's (1995) transitivity-of-preference test
On some 1-minute trials, pigeons chose between variable-interval 20-second (blue key) and variable-interval 40-second (red key) schedules. On other trials, choice was between variable-interval 40-second (green key) and variable-interval 80-second (white key) schedules. Once choice ratios in both schedule pairs were stable, a probe session began. In the probe session, most trials were unchanged from the arrangement described above. However, with p=0.25, a choice trial was between red and green. During this 1-minute probe trial, no reinforcement was given. The next two sessions returned to the training regime and were followed by another probe session. A total of nine probe sessions were conducted, one every third session.
In training, choice ratios matched reinforcement ratios when measured in terms of time allocation, and somewhat undermatched when measured in terms of responses. During the first probe session, the probe-trial green/red choice ratio approximated 0.5 measured in terms of time or responses. Over successive probe sessions, this choice ratio increased until, by probe session 6, the green/red choice ratio approximated 0.8, the value predicted by Gibbon's recent Scalar Expectancy Account and the result obtained in studies by Gibbon and by Belke. While this result replicates their work, it was due not to the comparative value of these schedules assessed immediately after training. Instead, it was due to across-session differences in extinction during probe trials: Responding to the red key decreased more rapidly than responding to the green key. This result invites consideration of whether the results from Belke and Gibbon included choice ratios substantially influenced by the effects of extinction.
Implicit Responding in Choice-Making Behavior
Both Scalar Expectancy Theory (SET) and the Behavioral Theory of Timing (BeT) are based on pacemaker-counter systems and make specific predictions concerning the way in which time is perceived. The "clock speed" in SET is absolute, variability occurring during the information processing stage. The clock speed in BeT is proportional to the degree of arousal which is mediated by the rate of reinforcement in the situation. The following study explores whether the passage of time is relative, as BeT would predict, or absolute as a result of variability effects on information processing, as suggested by SET.
During training, human subjects responded on a fixed-time (FT) interval of 7.5 seconds by using a mouse to click a stimulus on the computer screen. During any one 7.5-second trial an additional stimulus would appear on either the left or the right, after 2.5 and 5 seconds, respectively. Testing was administered between subjects: the first group responded on an FT interval of 3.75 seconds (short testing phase) and the second on an FT of 15 seconds (long testing phase). The FT-3.75 group was presented with two stimuli, left and right, after 2.5 seconds, and the FT-15 group was presented with two stimuli, left and right, after 5 seconds. The ratio of training to testing was 10:1; that is, after every ten training trials subjects were tested once, until a criterion of 100:10 was reached.
Overall, subjects tended to respond relatively in both the short and long testing phases. SET would predict absolute responding in a setting as subtle as the aforementioned, whereas BeT suggests that relative responding would result because the clock speed was "set" during training and carried over into the infrequent testing conditions. Timing responses, such as counting or beating time, were never observed, suggesting that such responses, were they occurring, were subtle in nature.
Three-alternative choice: What makes alternatives irrelevant?
Some previous research has clearly shown that choice between a constant pair of concurrent-schedule alternatives is unaffected by the presence or value of a third alternative (indifference to irrelevant alternatives). Other research has equally clearly shown that choice between two constant alternatives becomes less extreme when the reinforcer value of a third alternative is increased. There are a number of differences between the procedures that have produced these radically different results. These are: The presence or absence of a travel time or blackout between alternatives; the use of a switching versus an N-key procedure; and, in switching procedures, the next component being probabilistic rather than fully predictable. The present experiment attempted to discover what procedural aspects of concurrent-schedules produced independence versus dependence on other alternatives by using a wide range of different procedures.
A Dynamic Theory of Operant Conditioning
No truly comprehensive model for free-operant and discrete-trial instrumental learning has been proposed. Most successful models in this area are relatively insensitive to historical properties of behavior and applicable to only a limited data set. In this study we explore new implications of a historical real-time theory of operant learning introduced by Dragoi (1996). The theory shows that interplay between simple short and long-term memory mechanisms is sufficient to explain a large number of operant phenomena. The theory is also consistent with well-known classical-conditioning and avoidance results, suggesting a unified framework for investigating both Pavlovian and instrumental conditioning effects. Our theory describes short and long-term effects of reinforcement and how these effects modulate the operant response, how novel events are detected and processed, and how their consequences also modulate the operant response. The critical features that make the present theory different from other mechanistic approaches to learning are: (1) It is hypothesized that the function of operant conditioning is to predict the association between responses and the reinforcement by defining the reinforcement expectancy as the aggregate prediction of stimulus-reinforcement and response-reinforcement associations. (2) Reinforcement expectancy controls the rate of increase of the operant response. (3) The response is controlled by behavioral inhibition and behavioral excitation units which integrate the mismatch between expected (long-term) and experienced (short-term) events.
The theory predicts the general features of several operant phenomena such as response selection, contingency effects, effects of delay, development of preference, contrast effects, reversal learning, resistance to extinction, spontaneous recovery, regression, within-session changes in response strength, as well as classical conditioning phenomena such as delay conditioning, extinction, discrimination acquisition, and overshadowing. We illustrate the operation of our theory by devising an experiment that tests our predictions about how reinforcement probability and amount of training jointly determine operant choice behavior to reveal new properties of operant learning. Our theoretical results offer a unitary set of fundamental principles, essential for understanding the mechanism of operant learning, that help resolve a long-standing debate about the fundamental variables controlling operant behavior.
The kinetics of matching: is there a role for inferred background reinforcement?
There is a growing interest in the dynamics of matching. One approach has been transfer-of-training tests (Belke, 1992; Mark & Gallistel, 1994; Gibbon, 1995). The findings have been surprising (see Gibbon, 1995). For example, novel VI 40 sec VI 40 sec probe schedules drawn from the pairs VI 20 sec VI 40 sec and VI 40 sec VI 80 sec failed to produce a 50:50 distribution of behavior. Instead subjects reliably showed a 4:1 preference ratio for a VI 40 sec schedule that had been paired with a VI 80 sec schedule (Belke, 1992; Gibbon, 1995). Mark and Gallistel and Gibbon have proposed molecular models of matching that successfully predict the 4:1 preference. These models, however, do not take into consideration background reinforcement, an inferred factor that has played a major theoretical role in some areas of matching law research (Herrnstein, 1970; Heyman & Monaghan, 1987). The experiment reported in this poster was motivated by a model of matching that includes background reinforcement.
The model is based on four assumptions: (1) The local probabilities of switching between schedules during training sum to a constant (Heyman, 1979). (2) The local rate of switching from schedule i to schedule j is described by the equation: vRj/(Ri + Rj + Re), where v is a constant, Ri and Rj are reinforcement rates, and Re is the rate of background reinforcement. (3) Rate of switching transfers from the training schedule to the new probe schedule (Mark & Gallistel, 1994). (4) The value of Re systematically changes as a function of number of probe trials, starting off at about 0.0 and then, with successive probes, growing quite large. These assumptions predict a shift in preference from about 3:1 to 4:1 in the VI 40 sec VI 40 sec example described above. In an experiment with pigeons, we observed shifts from about 1:1 to 4:1, over the course of 10 probe sessions. Thus, the data are consistent with several aspects of the model's predictions, but not all.
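The switching-rate equation in assumption (2) is easy to evaluate directly. The sketch below codes only that equation and then compares the rates of leaving the two VI 40 sec alternatives as Re grows. Expressing reinforcement rates in reinforcers per hour, setting v to 1, and the function names are assumptions; assumptions (1), (3), and (4) are not implemented, so the printed ratios should not be read as the model's predicted preference ratios.

```python
def switch_rate(r_from, r_to, r_background, v=1.0):
    """Local rate of switching from schedule i to schedule j:
       v * Rj / (Ri + Rj + Re)."""
    return v * r_to / (r_from + r_to + r_background)

def leaving_rate_ratio(r_background):
    """Rate of leaving the VI 40 s trained with VI 20 s, divided by the rate
    of leaving the VI 40 s trained with VI 80 s (rates in reinforcers/hour)."""
    vi20, vi40, vi80 = 180.0, 90.0, 45.0
    leave_first = switch_rate(vi40, vi20, r_background)   # VI 40 -> VI 20
    leave_second = switch_rate(vi40, vi80, r_background)  # VI 40 -> VI 80
    return leave_first / leave_second

if __name__ == "__main__":
    for re in (0.0, 100.0, 1000.0, 1e6):
        print(re, round(leaving_rate_ratio(re), 3))
```

As Re becomes large relative to the scheduled reinforcement rates, the denominators of the two switching rates converge and the ratio of leaving rates approaches the ratio of the Rj terms, here 180/45 = 4.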
The Aristotle model of operant behavior.
Aristotle, a computer-based, dynamic model of operant behavior, combines a neural-net-type perceptual procedure with a Reid and Staddon (1996) diffusion procedure. According to this model, behavior is controlled by discriminative stimuli (both external and behavior-based), schedules of reinforcement, and "reinforcement gradients," the latter in ways similar to that described by Reid and Staddon. The combination of control by discriminative stimuli, reinforcement contingencies, and reinforcement gradients allows us to model a wide range of behavioral results, including rates of responding under single and concurrent schedules of reinforcement, foraging, catching a thrown baseball, and Tic-Tac-Toe. We will demonstrate the operation of the model as well as results from previous simulations.
Shaping a Neural Network
A neural network simulation is used to generate a series of black and white images. No antecedent stimuli are supplied. The neural network is reinforced according to a percentile reinforcement schedule (p=.30, m=200) with respect to how close the emitted images are to a target image drawn by the user. Gradually, the distribution of images moves closer to the target until the neural network generates an exact copy of the target. The learning algorithm is based on In-Vitro Reinforcement (IVR), a mechanism whereby spontaneous burst rates of CA1 pyramidal cells of the hippocampus can be increased due to dopamine infusion immediately following a burst. Each unit in the neural net corresponds to one pyramidal cell. Each signal output from a unit corresponds to one burst. Independent thresholds for each unit determine the probability of output from that unit. Results indicate that an eight by eight image can be duplicated in anywhere from 25 thousand to 250 thousand cycles, equivalent to a range of approximately 20 minutes to 3 hours 20 minutes in real time. Modifying thresholds using a linear (truncated) learning rule generates a typical asymptotic curve in various aggregate measures of approach to the target. Using an asymptotic learning rule fails to produce convergence.
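One way to read the percentile schedule is that an emitted image is reinforced only when it is closer to the target than most of the recent images. The sketch below implements that reading: a response is reinforced when fewer than a fraction p of the last m responses were at least as close to the target. This interpretation of p and m, the class name, and the use of a scalar distance are assumptions; the abstract does not spell out the criterion, and the neural network itself is not modeled.

```python
from collections import deque

class PercentileSchedule:
    """Reinforce a response if it ranks among the closest fraction p
    of the most recent m responses (assumed reading of p and m)."""

    def __init__(self, p=0.30, m=200):
        self.p = p
        self.history = deque(maxlen=m)  # distances of the last m responses

    def evaluate(self, distance):
        """Return True if this response is reinforced, then record it."""
        if self.history:
            as_close = sum(1 for d in self.history if d <= distance)
            reinforce = as_close < self.p * len(self.history)
        else:
            reinforce = False  # nothing to compare against yet
        self.history.append(distance)
        return reinforce

if __name__ == "__main__":
    schedule = PercentileSchedule(p=0.30, m=10)
    for dist in [9.0, 8.0, 7.5, 8.5, 6.0, 7.0, 5.0]:
        print(dist, schedule.evaluate(dist))
```

Because the criterion is defined relative to recent behavior, it tightens automatically as the distribution of responses moves toward the target.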
The Discrimination of Stimulus Frequencies in Pigeons: Some Data and a Model
In a modified matching-to-sample task, pigeons were exposed to different frequencies of 3 stimuli during the sample phase and were then rewarded in the choice phase for choosing the least-frequent stimulus. The results show that the probability of an error increased when the correct stimulus was either one of the first or one of the last in the sample series. I propose a two-component model for the "memory trace" that can account for these recency and primacy effects. Consistent with Jost's law, the model also predicts that the primacy effect should increase with greater retention intervals.
Hick's Law in Pigeons
Hick's Law -- showing that reaction time (RT) increases as a logarithmic function of number of choice alternatives -- is often used to study human information processing. We have developed a method to test Hick's Law in pigeons using touch sensitive computer monitors. Five pigeons were presented with 1, 2, 4, or 8 possible target stimuli in a choice RT paradigm. We found that Hick's Law provided an excellent description of the data, accounting for 98% of average variance. We are now using this procedure to test predictions concerning pigeons' "speed of processing" as a function of omega-3 fatty acid deprivation. There is reason to suspect that such deprivation might result in a slowing of choices.
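A fit of this kind can be sketched as a least-squares regression of reaction time on the logarithm of the number of alternatives. The sketch uses the form RT = a + b * log2(N), which is an assumption (other formulations of Hick's Law use log2(N + 1)), and the reaction times in the usage example are made up for illustration, not data from the study.

```python
import math

def fit_hicks_law(n_alternatives, reaction_times):
    """Least-squares fit of RT = a + b * log2(N).

    Returns (a, b, r_squared), where r_squared is the proportion of
    variance in reaction time accounted for by the fitted line.
    """
    x = [math.log2(n) for n in n_alternatives]
    count = len(x)
    mx = sum(x) / count
    my = sum(reaction_times) / count
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, reaction_times))
    b = sxy / sxx
    a = my - b * mx
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, reaction_times))
    ss_tot = sum((yi - my) ** 2 for yi in reaction_times)
    return a, b, 1.0 - ss_res / ss_tot

if __name__ == "__main__":
    # Hypothetical mean reaction times (ms) for 1, 2, 4, and 8 targets.
    print(fit_hicks_law([1, 2, 4, 8], [310.0, 345.0, 405.0, 445.0]))
```

The slope b gives the increase in reaction time per doubling of the number of alternatives, which is the quantity a speed-of-processing comparison would focus on.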
A preliminary model for within-component responding during a three-component multiple schedule.
Pigeons pecked a key for mixed grain delivered by a multiple variable-interval (VI) VI VI schedule. During baseline conditions, the values of the VI schedules in each 30-s component were equal. In other conditions, the rate of reinforcement in the second component was either increased or decreased (Experiment 1) or the rate of reinforcement in the first and third components was systematically increased and/or decreased (Experiment 2). It was hypothesized that the within-component pattern of responding would change with changes in the rate of reinforcement during the preceding (i.e., local contrast) and the subsequent component (i.e., anticipatory contrast). Relatively large changes in rate of responding were usually observed early within a component. Changes in response rates were sometimes, but not always, observed in the latter portion of a component. A preliminary four-parameter model is put forward to account for the observed within-component changes in responding.
and
Alan Silberberg
American University and Walter Reed Army Institute of Research
Application of Killeen's (1994) quantitative account to ratio-schedule performances
Each of five pigeons responded for at least 15 sessions on a series of fixed and variable ratio reinforcement schedules valued 4, 8, 16, 32, 64, 128, and 256. Except for the limitation that no bird began on a ratio of 256, the order of ratio type (fixed or variable) and ratio size was random across the 14 schedules defining this study.
Consistent with the predictions of Killeen's (1994) quantitative account for these schedules, response rates on fixed schedules were higher than for comparably valued variable schedules when ratios were small, but not when they were large (>32). Also consistent with this account was the fact that the function relating response rate to ratio size was bitonic. However, when data were plotted in terms of running rates (responses/minute excluding post-reinforcement pause), differences emerged between data and theory. In particular, the bitonicity between response rate and ratio size predicted by Killeen's account and found in the prior analysis was replaced by monotone-decreasing functions that are approximately linear.
From Basics to Contemporary Paradigms: Timing.
From Basics to Contemporary Paradigms: Matching.
From Basics to Contemporary Paradigms: Neural Networks
Complex behavioral interactions and complex brain mechanisms can be described in terms of the non-linear dynamics of neural networks. The systems of differential equations that formalize the neural networks, although difficult to solve by hand, are easily solved by computers. Scientists can rapidly experiment with computer models and predict the behavior of animals in different experimental situations.
In this tutorial, we will describe how neural networks provide mechanistic descriptions of simple and complex behaviors. First, we will analyze networks that incorporate an error-correcting mechanism that modifies an internal model of the world when predicted events differ from observed events. Second, we will study networks that combine classical associations to infer new knowledge from previous experience. Third, we will examine a network that includes an attentional mechanism that increases the processing of those CSs associated with novelty. Fourth, we will inspect a neural network that incorporates stimulus configuration to predict complex environmental states. Fifth, we will discuss a neural network applied to operant conditioning and communication. Sixth, we will review a network that portrays maze learning and some problem-solving tasks.
Because neural networks rigorously describe learning and cognitive processes, provide the necessary tools to understand the intricate interactions among the numerous temporal and nontemporal parameters controlling behavior, and constitute a starting point for detailed neurophysiological analysis of behavior, they are likely to become a standard style of theorizing.
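The error-correcting mechanism mentioned first in the tutorial outline can be illustrated with a simple delta-rule update, in which associative weights change in proportion to the difference between the observed and predicted outcome. This is a stand-in sketch only: the abstract does not give the networks' equations (which it describes as systems of differential equations), and the function name `delta_rule_trial`, the stimulus names, and the learning rate are hypothetical.

```python
def delta_rule_trial(weights, present, outcome, learning_rate=0.2):
    """Run one trial of a discrete error-correcting (delta-rule) update.

    weights: dict mapping stimulus name -> associative weight (modified in place)
    present: stimuli present on this trial
    outcome: observed event magnitude (e.g., 1.0 if the US occurs, else 0.0)
    Returns the prediction error for the trial.
    """
    prediction = sum(weights[s] for s in present)   # internal model's prediction
    error = outcome - prediction                    # observed minus predicted
    for s in present:
        weights[s] += learning_rate * error         # correct only active stimuli
    return error

# Hypothetical acquisition: a tone is repeatedly paired with the outcome.
weights = {"tone": 0.0, "light": 0.0}
errors = [delta_rule_trial(weights, ["tone"], 1.0) for _ in range(50)]
```

As the prediction approaches the observed outcome, the error shrinks and learning slows, which is the defining property of an error-correcting rule.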
From Basics to Contemporary Paradigms: Dynamics.
* Videotapes of these presentations are expected to be available.
Bringing Stimulus Control into the Law of Effect: The Three-Term Contingency Revisited
This address will trace the development of our three successive behavioral models that try to describe, in quantitative terms, the action of three-term contingencies -- how antecedent stimuli, and consequential reinforcers and punishers, combine to control behavior. These models derive, in a general sense, from signal-detection approaches to measuring the effects of stimuli, and from matching approaches to measuring the effects of reinforcers. Applications of the most recent model to behavior controlled by multiple stimuli and multiple consequences, to memory, to delay of reinforcement, to real-life applications, and to historical effects on present behavior are discussed. The presentation aims to be more conceptual than quantitative.
What a Skinnerian Might Like about Signal Detection Theory
The ability to instantly compute the ratio of two Gaussian distributions in one's head is what some signal detection models of recognition memory implausibly ascribe to human subjects. This position is actually forced (by some data to be presented) so long as one regards the subject as a historyless information processor. If we instead assume that subjects actually have a learning history, then what appears to be the result of a rather complex and sophisticated likelihood ratio computation can be seen as a reflection of the consequences experienced by the subject for appropriate (or inappropriate) recognition judgments over the course of a lifetime.
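For reference, the likelihood ratio computation that such models ascribe to the subject can be written out explicitly. The sketch below assumes a standard Gaussian signal detection model of recognition memory, with one strength distribution for studied items (targets) and one for unstudied items (lures). The function names, parameter values, and decision criterion are hypothetical and are not taken from the presentation.

```python
import math

def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def likelihood_ratio(x, target_mean, lure_mean, target_sd=1.0, lure_sd=1.0):
    """Ratio of the target density to the lure density at memory strength x."""
    return gaussian_pdf(x, target_mean, target_sd) / gaussian_pdf(x, lure_mean, lure_sd)

def judge_old(x, target_mean, lure_mean, criterion=1.0):
    """Respond 'old' when the evidence favors the target distribution."""
    return likelihood_ratio(x, target_mean, lure_mean) > criterion

# Hypothetical equal-variance case: lures centered at 0, targets at 1.
lr_midpoint = likelihood_ratio(0.5, target_mean=1.0, lure_mean=0.0)
lr_at_target_mean = likelihood_ratio(1.0, target_mean=1.0, lure_mean=0.0)
```

In the equal-variance case the ratio equals 1 at the midpoint between the two means and grows as strength increases, so a criterion on the ratio is equivalent to a criterion on strength itself.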