Pigeons' choice between reliable (100%) and unreliable (50%) reinforcement was studied using a concurrent-chains procedure. Initial links were fixed-ratio 1 schedules, and terminal links were equal fixed-time schedules. The duration of the terminal links was varied across conditions. The terminal link on the reliable side always ended in food; the terminal link on the unreliable side ended with food 50% of the time and otherwise with blackout. Different stimuli present during the 50% terminal links signaled food or blackout outcomes under signaled conditions but were uncorrelated with outcomes under unsignaled conditions. In signaled conditions, most pigeons displayed a nearly exclusive preference for the 100% alternative when terminal links were short (5 or 10 s), but with terminal links of 30 s or longer, preference for the 100% alternative was sharply reduced (often to below .5). In unsignaled conditions, most pigeons showed extreme preference for the 100% alternative with either short (5 s) or longer (30 s) terminal links. Thus, pigeons' choice between reliable and unreliable reinforcement is influenced by both the signal conditions on the unreliable alternative and the duration of the terminal-link delay. With a long delay and signaled outcomes, many pigeons display a suboptimal tendency to choose the unreliable side.
Four pigeons were exposed to multiple schedules with concurrent variable interval (VI) components and then tested for preference transfer. Half of the pigeons were trained on a multiple concurrent VI 20-sec, VI 40-sec/concurrent VI 40-sec, VI 80-sec schedule. The remaining pigeons were trained on a multiple concurrent VI 80-sec, VI 40-sec/concurrent VI 40-sec, VI 20-sec schedule. After stability criteria for time and response proportions were simultaneously met, four preference transfer tests were conducted with the stimuli associated with the VI 40-sec schedules. During the transfer tests, each pigeon allocated a greater proportion of responses (M = 0.79) and time (M = 0.82) to the stimulus associated with the VI 40-sec schedule that was paired with the VI 80-sec schedule than to the VI 40-sec schedule stimulus paired with the VI 20-sec schedule. Absolute reinforcement rates on the two VI 40-sec schedules were approximately equal and unlikely to account for the observed preference. Nor was the preference consistent with the differences in local reinforcement rates associated with the two stimuli. Instead, the results were interpreted in terms of the differential value that stimuli acquire as a function of previous pairings with alternative schedules of reinforcement.
The present study investigated the effect of reinforcer duration on running and on responding reinforced by the opportunity to run. Eleven male Wistar rats responded on levers for the opportunity to run in a running wheel. Opportunities to run were programmed to occur on a tandem fixed-ratio 1 variable-interval 30-s reinforcement schedule. Reinforcer duration varied across conditions from 30 to 120 s. As reinforcer duration increased, the rates of running and lever pressing declined, and latency to lever press increased. The increase in latency to respond was consistent with findings that unconditioned inhibitory aftereffects of reinforcement increase with reinforcer magnitude. The decrease in local lever-pressing rates, however, was inconsistent with the view that response strength increases with the duration of the reinforcer. Response rate varied inversely, not directly, with reinforcer duration. Furthermore, within-session data challenge satiation, fatigue, and response deprivation as determinants of the observed changes in running and responding. In sum, the results point to the need for further research with nonappetitive forms of reinforcement.

Key words: reinforcer duration, wheel running, lever press, rats

A response-strength conception of reinforcement (de Villiers & Herrnstein, 1976; Herrnstein, 1970) implies that as magnitude of reinforcement increases, the rate of the reinforced response should increase. However, the findings of numerous attempts to demonstrate this relationship using appetitive forms of reinforcement have been equivocal. Previous research into the relationship between reinforcer magnitude and overall response rates in free-operant paradigms using a variety of simple schedules (e.g., fixed interval, fixed ratio, variable ratio, variable interval) with various types of reinforcers (e.g., pellets, grain, sucrose solution) has yielded a confusing array of findings. Overall response rates varied directly
Pigeons chose between 50% and 100% reinforcement on a discrete-trials concurrent-chains procedure with fixed-ratio 1 initial links and fixed-time terminal links. The 100% alternative always provided food after a terminal-link delay, whereas the 50% alternative provided food or blackout equally often after a delay. Additionally, the terminal-link stimuli on the 50% alternative were correlated with the outcomes in signaled, but not in unsignaled, conditions. The effects of intertrial-interval duration and length of the terminal-link delays on choice of the 50% alternative were investigated in four experiments. Preference for the 50% alternative varied with signal condition and duration of the terminal link leading to food, but not with duration of either intertrial interval or the terminal link leading to a blackout. The results are discussed in terms of conditioned-reinforcement effects, Mazur's hyperbolic-decay model, and delay reduction.
Pigeons' choices between a reliable alternative that always provided food after a delay (i.e., 100% reinforcement) and an unreliable one that provided food or blackout equally often after a delay (i.e., 50% reinforcement) were studied using a discrete-trials concurrent-chains procedure modified to prevent choice between alternatives following a blackout outcome. Initial links were fixed-ratio 1 schedules, and terminal links were fixed-time schedules. Stimuli presented during the terminal-link delays were correlated with the food and blackout outcomes. In Experiment 1, terminal-link durations were varied. With short terminal links (i.e., 10 s), 6 of 8 subjects showed strong preference for the 50% side. As terminal-link duration increased to 30 s, preference, regardless of direction, became less extreme. In Experiment 2, the side-key location of the 50% and 100% alternatives was reversed for 3 subjects. Preference for the 50% alternative reoccurred following the key reversal. When a 5-s separation was subsequently interposed between the initial and terminal links for both alternatives, all birds reversed to a preference for the 100% side. In general, the strong preference for the 50% side was qualitatively consistent with the expectation that the procedure enhanced the conditioned-reinforcement effectiveness of the food-associated terminal-link stimulus on the 50% side. Implications of the results for various accounts of choice of the 50% alternative are discussed.

Key words: choice, percentage reinforcement, signaled outcomes, conditioned reinforcement, delay reduction, delayed reinforcement, concurrent chains, key peck, pigeons

Previous investigations of choice using a percentage-reinforcement concurrent-chains procedure have produced results inconsistent with molar reinforcement maximization.
In the typical procedure, subjects choose between a reliable alternative that always produces food reinforcement after a fixed delay (i.e., 100% reinforcement) and an unreliable alternative that produces either food reinforcement or blackout with equal probability after a fixed delay (i.e., 50% reinforcement). Exclusive choice for 100% reinforcement minimizes the average interreinforcement interval, a result that is consistent with maximization of the rate of primary reinforcement. In contrast, choice of the 50% alternative increases the interreinforcement interval, a result that is inconsistent with reinforcement maximization. Several studies have found conditions under which the 50% reinforcement alternative is chosen. Kendall (1974, 1985) found that preference for the 50% alternative varied with signal condition. When terminal-link stimuli on the 50% alternative were correlated with outcomes (i.e., signaled), subjects preferred the 50% alternative. When stimuli were not correlated with the outcomes (i.e., unsignaled), subjects strongly preferred the 100% alternative. Dunn and Spetch (1990) showed that under signaled conditions with long terminal links (i.e., 50 s), preference for the 50% alternative varied inversely with...
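The maximization argument above can be made concrete with a small sketch. The delay and intertrial values below are arbitrary illustrations, not parameters taken from any of the studies described:

```python
def mean_interfood_interval(p_food, terminal_s, iti_s):
    """Average seconds per food delivery when every trial lasts
    terminal_s + iti_s and ends in food with probability p_food."""
    trial_s = terminal_s + iti_s
    return trial_s / p_food

# Arbitrary illustrative values: 30-s terminal link, 20-s intertrial interval.
print(mean_interfood_interval(1.0, 30, 20))  # 50.0 s on the 100% side
print(mean_interfood_interval(0.5, 30, 20))  # 100.0 s on the 50% side
```

Under these assumed values, exclusive choice of the 50% alternative doubles the average interreinforcement interval, which is why persistent preference for that side counts as suboptimal on a molar maximization account.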
Mice from replicate lines, selectively bred for high daily wheel-running rates, run more total revolutions and at higher average speeds than do mice from nonselected control lines. Based on this difference, it was assumed that selected mice would find the opportunity to run in a wheel a more efficacious consequence. To assess this assumption within an operant paradigm, mice must be trained to make a response that produces the opportunity to run as a consequence. In the present study, an autoshaping procedure was used to compare the acquisition of lever pressing reinforced by a brief (i.e., 90-s) opportunity to run between selected and control mice; then, using an operant procedure, the effect of the duration of the opportunity to run on lever pressing was assessed by varying reinforcer duration over values of 90 s, 30 min, and 90 s. The reinforcement schedule was a ratio schedule (FR 1 or VR 3). Results from the autoshaping phase showed that more control mice met a criterion of responses on 50% of trials. During the operant phase, when reinforcer duration was 90 s, almost all control, but few selected, mice completed a session of 20 reinforcers; however, when reinforcer duration was increased to 30 min, almost all selected and control mice completed a session of 20 reinforcers. Taken together, these results suggest that selective breeding based on wheel-running rates over 24 hr may have altered the motivational system in a way that reduces the reinforcing value of shorter running durations. The implications of this finding for these mice as a model for attention deficit hyperactivity disorder (ADHD) are discussed. It also is proposed that there may be an inherent trade-off in the motivational system for activities of short versus long duration.
Herrnstein's (1970) hyperbolic matching equation describes the relationship between response rate and reinforcement rate. It has two estimated parameters, k and Re. According to one interpretation, k measures motor performance and Re measures the efficacy of the reinforcer maintaining responding relative to background sources of reinforcement. Experiment 1 tested this interpretation of the Re parameter by observing the effect of adding and removing an additional source of reinforcement to the context. Using a within-session procedure, estimates of Re were obtained from the response–reinforcer relation over a series of seven variable-interval schedules. A second, concurrently available variable-interval schedule of reinforcement was added and then removed from the context. Results showed that when the alternative was added to the context, the value of Re increased by 107 reinforcers per hour; this approximated the 91 reinforcers per hour obtained from this schedule. Experiment 2 investigated the effects of signaling background reinforcement on k and Re. The signal decreased Re, but did not have a systematic effect on k. In general, the results supported Herrnstein's interpretation that in settings with one experimenter-controlled reinforcement source, Re indexes the strength of the reinforcer maintaining responding relative to uncontrolled background sources of reinforcement.

Key words: Herrnstein's hyperbola, matching law, background reinforcement, signaled reinforcement, lever press, rats

Herrnstein (1970) formulated an elementary matching law equation for the case in which there is only a single measured source of reinforcement and a single measured response rate. The form of that equation is

B1 = kR1 / (R1 + Re),  (1)

where B1 is response rate, R1 is reinforcement rate, and k and Re are fitted constants. The structural or curve-fitting definitions of the constants reveal the relationship between response rate and reinforcement rate implied by Equation 1.
In the numerator, k is an estimate of the response-rate asymptote. For instance, as reinforcement rate increases, response rate approaches but does not exceed k. Thus, k is measured in the same units as the measured behavior (e.g., responses per minute). In the denominator, Re is equal to the rate of reinforcement that maintains a one-half asymptotic response rate. For example, when R1 is equal to Re, response rate must be equal to k/2. Thus, Re is measured in the same units as the experimenter-controlled reinforcer (e.g., 0.10 mL sucrose servings per hour). On the basis of the matching law, Herrnstein (1970, 1974) provided em...

The authors gratefully acknowledge the helpful comments of the reviewers and members of the Behavioral and Decision Analysis research seminar in the Department of Psychology at Harvard in the preparation of this manuscript. Correspondence regarding this article should be sent to Terry W. Belke, Department of Psychology, Biological Sciences Building, University of Alberta, Edmonton, Alberta T6G 2E9, Canada (E-mail: tbelke@cyber.psych.ualberta.ca).
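The roles of k and Re described above can be checked numerically. The following Python sketch uses arbitrary illustrative parameter values, not estimates from the study:

```python
def herrnstein_rate(r1, k, re):
    """Predicted response rate from Herrnstein's hyperbola:
    B1 = k * R1 / (R1 + Re)."""
    return k * r1 / (r1 + re)

# Arbitrary illustrative values: k in responses/min, rates in reinforcers/hr.
k, re = 60.0, 40.0

# When R1 equals Re, the predicted rate is exactly half the asymptote k.
print(herrnstein_rate(40.0, k, re))    # 30.0, i.e., k/2

# As R1 grows large, the predicted rate approaches but never exceeds k.
print(herrnstein_rate(4000.0, k, re))  # approximately 59.4
```

The two printed values illustrate the curve-fitting definitions in the text: Re is the reinforcement rate that yields the half-asymptotic rate k/2, and k bounds response rate from above.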