This review critically analyzes experimental data relevant to the concept of conditioned reinforcement. The review has five sections. Section I is a discussion of the relationship between primary and conditioned reinforcement in terms of chains of stimuli and responses. Section II is a detailed analysis of the conditions in which the component stimuli in chained schedules of reinforcement will become conditioned reinforcers; this section also analyzes studies of token reinforcement, observing responses, switching responses, implicit chained schedules, and higher-order conditioning. Section III analyzes experiments in which potential conditioned reinforcers are used either to prolong responding or to generate responding during experimental extinction. This section discusses hypotheses that have been offered as alternatives to the concept of conditioned reinforcement and hypotheses concerning the necessary and sufficient conditions for establishing a conditioned reinforcer. Section IV discusses other variables that act when a conditioned reinforcer is being established or that act when an established conditioned reinforcer is used to develop or maintain behavior. Section V is a general discussion of conditioned reinforcement. The evidence indicates that the conditioned reinforcing effectiveness of a stimulus is directly related to the frequency of primary reinforcement occurring in its presence, but is independent of the response rate or response pattern occurring in its presence. Results from chained schedules comprised of several components indicate that a stimulus can be established as a conditioned reinforcer by pairing it with an already established conditioned reinforcer rather than a primary reinforcer; however, this type of higher-order conditioning has not been clearly demonstrated with respondent conditioning procedures. 
Although discriminative stimuli are usually conditioned reinforcers, the available evidence indicates that establishing a stimulus as a discriminative stimulus is not necessary or sufficient for establishing it as a conditioned reinforcer. Discriminative stimuli in chained schedules with several components are not always conditioned reinforcers; stimuli that are simply paired with reinforcers can become conditioned reinforcers. The hypotheses that have been offered as alternatives to the concept of conditioned reinforcement are too limited to integrate the data that exist. The concepts of conditioned reinforcement and chained schedule, however, can be used to integrate the data obtained with diverse techniques. Recent experiments have revealed several techniques for the development of effective conditioned reinforcers. These techniques provide a powerful tool for advancing understanding of conditioned reinforcement and for extending control over behavior.
On fixed-interval schedules of reinforcement, subjects are reinforced for the first response which occurs after a fixed time interval has elapsed. Responses occurring before the interval has elapsed are recorded, but have no specified consequences. Fixed-interval schedules produce characteristic response patterns. A period without responses (initial pause) occurs at the start of each interval, and is followed by accelerated responding which reaches a constant high rate that is maintained until reinforcement. The present paper reports the development of a mathematical index for describing characteristics of fixed-interval curves. Examples from behavioral and pharmacological studies will illustrate applications of this index.

DERIVATION OF THE INDEX OF CURVATURE

A cumulative-response record drawn to approximate the performance of a pigeon trained on a 10-minute fixed-interval schedule (FI 10) is presented in Fig. 1. If the response rate were constant throughout the interval, the cumulative-response record could be completely described by the straight line OY. However, the actual cumulative record departs from a straight line. Insofar as we can indicate the extent and the direction of this departure from a straight line, we indicate the curvature characteristic of the cumulative record. The extent to which the cumulative record departs from a straight line can be determined by comparing the area under the cumulative record with the area under the straight line. That is, the difference between the area of the triangle OXY and the area of the figure Oa'b'c'YX can be used to indicate the curvature of the cumulative record.
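The area comparison described above reduces to a closed form when the interval is divided into n equal sub-intervals and the cumulative response count at the end of each sub-interval is known: comparing the trapezoidal area under the record with the area of the triangle OXY gives I = [(n−1)·r_n − 2·(r_1 + … + r_{n−1})] / (n·r_n). The following is a minimal sketch of that computation; the function name and the example counts are illustrative, not from the original report.

```python
def index_of_curvature(cum_responses):
    """Index of curvature for a fixed-interval cumulative record.

    cum_responses: cumulative response counts at the end of each of n
    equal sub-intervals of the fixed interval (last entry = total).

    Comparing the area of the triangle OXY with the trapezoidal area
    under the cumulative record reduces to:
        I = [(n - 1) * r_n - 2 * (r_1 + ... + r_{n-1})] / (n * r_n)
    I = 0 for a constant response rate; I > 0 for positively
    accelerated responding (for n = 4 the maximum is +0.75, when all
    responses fall in the final quarter of the interval).
    """
    n = len(cum_responses)
    r_n = cum_responses[-1]
    if r_n == 0:
        raise ValueError("no responses in the interval")
    partial_sum = sum(cum_responses[:-1])
    return ((n - 1) * r_n - 2 * partial_sum) / (n * r_n)

# Constant rate across four quarters of the interval -> zero curvature
print(index_of_curvature([25, 50, 75, 100]))  # 0.0
# All responding in the final quarter -> maximum positive curvature
print(index_of_curvature([0, 0, 0, 100]))     # 0.75
```

The sign of the index thus distinguishes the positively accelerated fixed-interval pattern (positive values) from constant-rate responding (zero) without reference to absolute response rate.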
Pigeons responded under a schedule in which food was presented only after a fixed number of fixed-interval components were completed. Two such second-order schedules were studied: under one, 30 consecutive 2-min fixed-interval components were required; under the other, 15 consecutive 4-min fixed-interval components were required. Under both schedules, when a 0.7-sec stimulus light was presented at completion of each fixed interval, positively accelerated responding developed in each component. When no stimulus change occurred at completion of each fixed interval, relatively low and constant rates of responding prevailed in each component; a similar result was obtained when a 0.7-sec stimulus change occurred at completion of each fixed interval except the one which terminated with primary reinforcement. The 0.7-sec stimulus correlated with food delivery was an effective conditioned reinforcer in maintaining patterns of responding in fixed-interval components despite low average frequencies of food reinforcement.
Responding was maintained in two squirrel monkeys under several variations of a 10-min fixed-interval schedule of electric shock presentation. The monkeys were first trained under a 2-min variable-interval schedule of food presentation, and then under a concurrent schedule of food presentation and shock presentation. In one monkey, when shocks (12.6 ma) followed each response during the last minute of an 11-min cycle ending with a timeout period, responding was increased during the first 10 min and suppressed during the last minute of each cycle. When the shock schedule was eliminated, both the enhancement and suppression disappeared, and a steady rate of responding was maintained under the variable-interval schedule. When the food schedule was eliminated, the shock schedule maintained a characteristic fixed-interval pattern of responding during the first 10 min, but suppressed responding during the last minute of each cycle. The fixed-interval pattern of responding was maintained when the timeout period was eliminated and when only one shock could occur at the end of the cycle. In the second monkey, responding under the concurrent food and shock schedule was suppressed when responses produced shocks after 3 min. Under an 11-min cycle, responding continued to be maintained at increasing shock intensities. When the food schedule was eliminated, a fixed-interval pattern of responding was maintained under a 10-min schedule of shock presentation (12.6 ma). Whether response-produced electric shocks suppressed responding or maintained responding depended on the schedule of shock presentation.
Pigeons were required to complete three successive fixed-interval components to obtain food. When the same exteroceptive stimulus was correlated with the three components, responding was positively accelerated between food deliveries. When different exteroceptive stimuli were correlated with each component in a fixed sequence, prolonged pauses developed in the first component; low response rates developed in the second component; and responding was positively accelerated in the third component. When different exteroceptive stimuli were correlated with each component in a variable sequence, responding was positively accelerated in each component. Because the response and reinforcement contingencies were the same in all three procedures, the differences in performance must be due to the changes in the sequence of stimuli.

In a three-component chained fixed-interval schedule of reinforcement, the completion of a fixed-interval schedule in the presence of one stimulus produces a second stimulus; the completion of a fixed-interval schedule in the presence of the second stimulus produces a third stimulus; and the completion of a fixed-interval schedule in the presence of the third stimulus produces food. Of the various functions that a stimulus may have (Skinner, 1938), two are emphasized in chained schedules. First, the presence of a stimulus can control a specific rate and pattern of responding; this is the discriminative function of the stimulus. Of course, the rate of responding and the pattern of responding that occur in the presence of a discriminative stimulus are a function of the schedule of reinforcement that is in effect. Second, a stimulus can reinforce a specific rate and pattern of responding that preceded its appearance; this is the conditioned reinforcing function of the stimulus.
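The chain just described can be sketched as a simple state-transition map: completing the fixed-interval requirement in each stimulus produces the next stimulus, and completion in the last stimulus produces food. The stimulus names and the Python helper below are illustrative assumptions, not part of the original procedure.

```python
def chained_fi_transitions(stimuli=("red", "green", "blue")):
    """Outcome produced by completing the fixed-interval requirement
    in each component of a chained FI schedule.

    Each completion produces the next stimulus in the chain; completion
    in the presence of the final stimulus produces food. The stimulus
    names are hypothetical placeholders.
    """
    outcomes = {}
    for i, stim in enumerate(stimuli):
        outcomes[stim] = stimuli[i + 1] if i + 1 < len(stimuli) else "food"
    return outcomes

print(chained_fi_transitions())
# {'red': 'green', 'green': 'blue', 'blue': 'food'}
```

In this structure each stimulus change can serve both functions named above: it reinforces the responding that completed the preceding component, and it sets the occasion for responding in the next.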
In his analysis of temporal discriminations, Skinner (1938) described an experiment in which the response rates of rats were decreased by reinforcing only interresponse times (IRT's) which exceeded 15 seconds (p. 306). Wilson and Keller (1953) confirmed and extended this finding by demonstrating that the rate of responding is inversely related to the duration of the minimum required IRT. This type of schedule of reinforcement is referred to as the "differential reinforcement of low response rates" (DRL). Recent investigations indicate that DRL schedules engender temporal discriminations which can be analyzed by means of the relative-frequency distribution of IRT's or the distribution of response probabilities (Anger, 1956; Sidman, 1956).

On DRL schedules of reinforcement, each response starts the required delay interval. Responses which occur before the delay interval has elapsed not only are unreinforced but also postpone reinforcement by starting a new delay interval. To the extent that the animal can discriminate the delay interval, these two contingencies should eliminate responding during the delay interval. Wilson and Keller (1953) reported that their rats adapted to the DRL schedule by developing various chains of overt behavior which persisted between lever presses and which occupied enough time so that the lever presses following the chains were reinforced.

More recent investigations, which include detailed analyses of the temporal response patterns which develop on DRL, have consistently indicated that a large proportion of IRT's occur at about 0-3 seconds (Conrad, Sidman, & Herrnstein, 1958; Sidman, 1955; Sidman, 1956a; Sidman, 1956b). These short IRT's result from frequent "bursts" of responding, and they generate IRT relative-frequency distributions and probability distributions which are bi-modal.
One mode occurs in the vicinity of the minimum IRT which is required for reinforcement; the second mode, which is a result of these bursts, occurs at about 0-3 seconds. Sidman presented evidence indicating that the probability of a burst was high near the minimum IRT required for reinforcement. He suggested that "late in the delay period, a single lever press often fails to reset the animal's 'clock,' with the result that several quick responses are emitted" (Sidman, 1956a, p. 472).

A very precise control of the rate of responding can be developed by reinforcing only those IRT's which fall within a specified range (Ferster & Skinner, 1957, pp. 498-502); that is, a reinforced IRT must not only be longer than some minimum value (as in DRL) but also shorter than some maximum value. Thus, reinforcements are available for only a limited period of time. This type of schedule is referred to as DRL with a "limited hold" (DRL LH). For example, on DRL 20 LH 3, only responses which are emitted between 20 and 23 seconds after a preceding response will be reinforced; responses emitted less than 20 seconds or more than 23 seconds after a preceding response start the timing interval again.
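The DRL LH contingency amounts to a simple decision rule applied to each IRT. The sketch below encodes that rule; the function name is a hypothetical helper, and the default parameters correspond to the DRL 20 LH 3 example in the text.

```python
def drl_lh_consequence(irt, min_irt=20.0, limited_hold=3.0):
    """Consequence of a response on a DRL schedule with limited hold.

    irt: interresponse time in seconds (time since the previous response).
    A response is reinforced only if its IRT falls inside the window
    [min_irt, min_irt + limited_hold]. Any other response (too early, as
    in plain DRL, or too late, because the limited hold has lapsed)
    simply restarts the timing interval.
    Defaults illustrate DRL 20 LH 3 from the text.
    """
    if min_irt <= irt <= min_irt + limited_hold:
        return "reinforced"
    return "timer restarted"

print(drl_lh_consequence(21.0))  # reinforced
print(drl_lh_consequence(10.0))  # timer restarted (too early: DRL contingency)
print(drl_lh_consequence(25.0))  # timer restarted (limited hold lapsed)
```

Plain DRL is the special case in which the hold is unbounded: any IRT longer than the minimum is reinforced.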