WILLIAM L. PALYA and JOSEY Y. M. CHU
Jacksonville State University, Jacksonville, Alabama
Real-time changes in the interaction of a trial stimulus and the temporal properties of the interfood context were observed. An autoshaping procedure with pigeons partitioned the interfood interval with explicit stimuli correlated with the passage of time. Three experiments altered the duration, the probability, and the validity of the keylight trial stimulus and measured control over keypeck responding exerted by both the trial stimulus and the explicit temporal "clock" stimuli making up the interfood interval. In general, it was found that the trial stimuli (1) did not abolish the responding controlled by the explicit temporal stimuli of the context, (2) increased the rate of responding to the explicit temporal stimuli of the context immediately preceding the trial stimulus onset, and (3) suppressed the responding to the explicit temporal stimuli immediately following the trial stimulus onset. Most typically, responding increased across the latter portion of the interfood interval. These effects were robust across variations in the duration and probability of the trial stimulus. Variations in the validity with which the trial stimulus correctly signaled the upcoming trial outcome had a pronounced effect on the ability of the trial stimulus to condition antecedent temporal stimuli. The results were taken to indicate that a real-time conceptualization of the results of conditioning procedures is necessary.
If a fixed interfood interval is segmented throughout its entire duration into explicitly signaled time periods, the procedure establishes and maintains directed keypecks to the stimuli throughout the latter portion of the interval (Palya, 1985). The intuitive prediction had been that the behavior would have come under better stimulus control, in that the antecedent clock stimuli were clearly different from the one that was contiguous with food presentation. If behavior to the antecedent stimuli had diminished, that loss would have been attributed to inhibition of delay or to the principle of discrimination and least effort. Because responding was acquired and chronically maintained, delay of reinforcement or serial conditioning could be advanced to account for the obtained results. Unfortunately, processes such as least effort or serial conditioning can have no explanatory merit if they are invoked, after the fact, only in situations for which their predictions were correct.
Palya (1993) proposed a perspective within which chronically maintained behavior to other than the stimulus contiguous with the reinforcer could be seen. Palya's real-time model specifies the qualitative changes in the behavior which would be expected to stimuli correlated with various portions of an interfood interval following specified amounts of experience. He used a simple combination of the output predicted by the Rescorla and Wagner (1972) linear operator model and a modified version of the Gibbon and Balsam (1981) scalar expectancy model. He noted that there was nothing inherently incompatible in the behavior predicted by the linear operator and scalar expectancy models. Rather, those models could be seen as specifying output across orthogonal dimensions. The Rescorla-Wagner model is focused on behavior change with increasing experience, while the Gibbon and Balsam model deals with variations in behavior as a function of relative position within the interreinforcement interval.
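As a rough illustration of the "behavior change with increasing experience" dimension, the textbook single-stimulus form of the Rescorla-Wagner linear operator can be sketched as follows. This is not Palya's (1993) real-time model; the function name and the parameter values (`alpha_beta`, `lam`) are illustrative assumptions.

```python
def rescorla_wagner(n_trials, alpha_beta=0.2, lam=1.0, v0=0.0):
    """Return associative strength V after each reinforced trial.

    Textbook linear-operator update for a single stimulus:
        V <- V + alpha_beta * (lam - V)
    alpha_beta (combined learning-rate term) and lam (asymptote)
    are assumed values chosen only for illustration.
    """
    v = v0
    history = []
    for _ in range(n_trials):
        v += alpha_beta * (lam - v)  # error-correcting step toward lam
        history.append(v)
    return history


curve = rescorla_wagner(20)
```

The sketch captures only the trial-by-trial acquisition curve; position within the interreinforcement interval, the dimension addressed by the scalar expectancy model, is not represented.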
Even though the behavioral predictions of the Rescorla-Wagner and Gibbon-Balsam models can be conveniently integrated in this way, there remain substantial differences in (1) the machinery invoked by each view to generate those behavior changes, as well as in (2) the behavior predicted to occur to other than the trial stimulus. In spite of their substantial paradigmatic differences, however, both the linear operator and scalar expectancy models concur that the terminal behavior controlled by the stimulus contiguous with the reinforcer is the result of both trial stimulus and contextual factors. Therefore, it would appear that the most appropriate initial step toward the identification of a single conceptual framework and underlying machinery is the better description of how obtained terminal behavior varies as a function of the interaction of the trial stimulus and context.
Williams, Frame, and LoLordo (1992) have examined the interaction of a trial stimulus and its context. They compared conditioned freezing in rats that had been exposed to an explicit signal preceding a shock unconditioned stimulus (US) with rats that had not received the trial stimulus. They found that the trial stimulus did not overshadow conditioning to the context in which the rats were shocked. This finding is inconsistent with the predictions of any learning model that suggests that it is necessarily the case that stimuli "compete" for associative strength. Rather, their results support the view that all elements, both trial stimulus and context, can become conditioned (e.g., Gibbon & Balsam, 1981).
Time is an especially important and intrinsic element of the context. Williams et al. (1992) separately assessed the ability of the trial stimulus to overshadow the temporal context contiguous with the US presentation. They found that the explicit signal also failed to overshadow conditioning to the temporal context. Unfortunately, their design was intended to simply determine whether the trial stimulus overshadowed "the" temporal and stimulus factors of the context rather than to determine how that overshadowing effect changed as a function of the time between presentations of the US.
It is important to extend the findings of Williams et al., because time changes as the interreinforcement interval (IRI) elapses. There is not just one temporal stimulus with a fixed relationship with the reinforcer. Rather, there is a real-time gradient of temporal "stimuli" each with its own systematic relationship with food presentation. The important implication of this realization is that the precise details of the trial stimulus/context interaction would be expected to vary as a function of time in the IRI. It follows that an understanding of trial stimulus/context interaction would depend on understanding the real-time dynamics of that interaction, since any single relationship could represent the trial stimulus/context interaction at only one point in time or only, at best, as a molar average.
If the task is to document the dynamics in the trial stimulus/temporal context interaction, then a concurrent trial stimulus could be added to an interfood clock sign-tracking procedure using pigeons. Sign tracking in pigeons produces an easily measured, conditioned response which is directed at its controlling stimulus (Hearst & Jenkins, 1974). The target of the responding would designate which of the concurrently available stimuli was exerting the greater relative control over the terminal behavior, while the response rate would index the response strength at that point in the IRI.
GENERAL METHOD
Subjects
Eleven adult, experimentally naive pigeons obtained from a local supplier were used throughout the three experiments. They were housed under a 19:5-h light:dark cycle in individual cages with free access to water. All were maintained at approximately 80% of their ad-lib weights with pelletized laying mash.
Apparatus
Two experimental chambers were used. The interior of each was a 30-cm cube. An unfinished aluminum panel served as one wall of the chamber; the other sides were painted white. The stimulus panel had a feeder aperture 5 cm in diameter medially located 8 cm above the grid floor. Three response keys, 2 cm in diameter, were located 9 cm apart 19 cm above the grid floor. They required approximately 15 g (.15 N) to operate. The translucent Plexiglas key could be transilluminated by a stimulus projector containing color filters. The filters were the following "Rosco" theatrical gels: pink (34), red (26), orange (23), amber (20), yellow (12), green (91), turquoise (73), blue (68), and purple (58). A Lee color-correcting filter (218) was used to produce white. Two houselights were located on the stimulus panel 28 cm above the grid floor behind a white translucent deflector. The deflector extended across the width of the chamber and was mounted diagonally between the stimulus panel and the ceiling of the chamber. Ventilation was provided by an exhaust fan mounted on the outside of the chamber. A white-noise generator provided ambient masking noise in the running room. Stimulus events were controlled and keypecks were recorded by a computer system.
Procedure
All pigeons were magazine trained to a criterion of approaching and eating from the food magazine within 3 sec on three consecutive presentations. During magazine training, the keys were dark. Each session typically contained 20-30 food presentations, as determined by the bird's body weight that day.
Each of the following three experiments examined variations in behavior under concurrent exposure to trial stimuli and an interfood clock. The general paradigm is illustrated in Figure 1. This figure illustrates several interreinforcement intervals segmented into nine discriminable stimulus periods (on the center key), with concurrent trial stimuli (on the side keys) signaling whether the interval would terminate in a US or a time-out. Except where noted, one of the two trial types randomly occurred. If the left key was illuminated white, then a 3-sec time-out subsequently occurred. If the right key was illuminated white, then a 3-sec food presentation subsequently occurred. A common explicit temporal context was provided on the center key during the entire intertrial interval in either case. The explicit context was a fixed 54-sec clock, segmented into nine 6-sec time periods, each of which was designated by a different color on the center key. The colors associated with each consecutive 9th of the interval were pink, yellow, amber, orange, purple, red, blue, turquoise, and green. Following the elapse of the 54 sec, either the 3-sec food presentation or a 3-sec time-out occurred irrespective of behavior. The houselights were off during both food presentation and time-out.
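The schedule described above can be summarized in a short sketch. The segment duration, color order, and side-key assignments are taken from the text; the identifiers (`clock_color`, `side_key`, and the constants) are hypothetical names introduced only for illustration and do not describe the authors' control software.

```python
SEGMENT_SEC = 6  # each clock stimulus lasts 6 sec
CLOCK_COLORS = ["pink", "yellow", "amber", "orange", "purple",
                "red", "blue", "turquoise", "green"]
INTERVAL_SEC = SEGMENT_SEC * len(CLOCK_COLORS)  # fixed 54-sec clock


def clock_color(elapsed_sec):
    """Return the center-key color at a given time into the interval."""
    if not 0 <= elapsed_sec < INTERVAL_SEC:
        raise ValueError("elapsed_sec must lie within the 54-sec interval")
    return CLOCK_COLORS[int(elapsed_sec // SEGMENT_SEC)]


def side_key(outcome):
    """Return which side key is lit white for a given trial outcome."""
    # Right key precedes food; left key precedes time-out.
    return {"food": "right", "timeout": "left"}[outcome]
```

For example, `clock_color(6)` returns `"yellow"`, the second segment, and `side_key("food")` returns `"right"`.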
The present research examined the ability of a trial stimulus to modulate the control exerted by an explicit temporal context (the clock stimuli of a fixed interfood clock). The impact of three series of manipulations was studied: (1) The relative duration of the trial stimulus was parametrically varied to determine how its interaction with the stimuli of the explicit temporal context was modified by its duration or relative onset position in the interval. (2) The percentage of positive trials was systematically varied in an effort to alter the ability of the trial stimulus to overshadow control by the explicit temporal context. If the rate of responding to the clock stimuli is a simple function of the probability of reinforcement, then reducing the probability of reinforcement should reduce the response rate. On the other hand, if response rate to the clock stimuli is a function of the trial-to-cycle ratio, then rate should be an inverse function of reinforcement probability. The effect of either the probability of reinforcement or the cycle-to-trial (C/T) ratio on responding would be revealed by this manipulation. It should be noted that probability of reinforcement and C/T ratio were necessarily confounded. Finally, (3) the validity of the relationship between the trial stimuli and food presentation was parametrically varied to assess its effect on the interaction between the trial stimuli and the stimuli of the explicit temporal context.
EXPERIMENT 1
Method
Subjects. Five naive pigeons were used.
Procedure. This experiment varied the point of onset of a trial stimulus presented in an explicitly timed context. The procedures of this experiment were variants of the basic paradigm presented in the general procedure section. The white transillumination of the right or left side key was the prefood or pre-time-out stimulus, respectively. The center key was used to present the stimuli of the interfood clock. The onset of the side-key illumination occurred one segment earlier each phase. This experiment included an initial series of manipulations and a subsequent replication as noted below. Table 1 documents the temporal position of the onset of the trial stimulus for each phase and the number of sessions at that value for each bird.
Table 1
Temporal Onset of Trial Stimulus and Sessions per Phase
Initial study

Phase | Bird 22 Segment | Bird 22 Sessions | Bird 28 Segment | Bird 28 Sessions | Bird 37 Segment | Bird 37 Sessions |
---|---|---|---|---|---|---|
1 | 9 | 35 | 9 | 35 | 9 | 35 |
2* | 7 | 51 | 7 | 51 | 7 | 51 |
3 | 7 | 38 | 7 | 38 | 7 | 38 |
4 | 6 | 35 | 6 | 35 | 6 | 35 |
5 | 5 | 38 | 5 | 38 | 5 | 38 |
6 | 4 | 30 | 4 | 30 | 4 | 30 |
7 | 3 | 25 | | | | |
8 | 2 | 30 | | | | |
9 | 1 | 25 | | | | |

Replication

Phase | Bird 28 Segment | Bird 28 Sessions | Bird 37 Segment | Bird 37 Sessions | Bird 15 Segment | Bird 15 Sessions | Bird 16 Segment | Bird 16 Sessions |
---|---|---|---|---|---|---|---|---|
1 | | 30 | | 30 | | 30 | | 30 |
2 | 9 | 25 | 9 | 25 | 9 | 25 | 9 | 25 |
3 | 8 | 25 | 8 | 25 | 8 | 25 | 8 | 25 |
4 | 7 | 20 | 7 | 20 | 7 | 15 | 7 | 15 |
5 | 6 | 15 | 6 | 15 | 6 | 15 | 6 | 15 |
6 | 5 | 15 | 5 | 15 | 5 | 15 | 5 | 15 |
7 | 4 | 15 | 4 | 15 | 4 | 15 | 4 | 15 |
8 | | | | | | | 3 | 15 |
9 | | | | | | | 2 | 15 |
10 | | | | | | | 1 | 15 |

* Trial stimulus terminated after 6 sec in this phase.
In Phase 1 of the initial study, food availability was signaled during the ninth stimulus (i.e., the final stimulus). In Phase 2, the appropriate side key came on with the seventh element of the clock and went off with the onset of the eighth stimulus segment. In Phase 3, the onset time was the same except that the trial stimulus remained in effect until food presentation. With each successive phase (4-9), the onset of the appropriate side key occurred one segment earlier. Birds 28 and 37 were terminated following six phases.
The replication reused Birds 28 and 37, and added Birds 15 and 16. All were first exposed to a simple interfood clock which always terminated with food presentation. The side keys remained dark. The general methodology of the first study was then replicated. With each successive phase (2-10), the appropriate side key was illuminated one segment earlier. The key remained illuminated until the end of the trial in each phase. Birds 28, 37, and 15 were terminated following Phase 7.
Results and Discussion
The data for Experiment 1 are depicted in Figure 2 (the initial implementation) and Figure 3 (the replication). Because the results from both positive and negative trials were identical up to the onset of the trial stimulus, and responding was abolished following the onset of the negative trial stimulus, only the data for the positive trials are provided in these and the following figures. Each consecutive procedure for each bird is presented as a vertical column of frames. Within each frame of Figures 2 and 3, successive clock periods are represented across the x-axis, while rate in each of those periods is represented on the y-axis. The marker below each frame indicates the temporal location of the trial stimulus for that phase. Responding to the clock key (open bars) and to the concurrent trial stimulus key (superimposed shaded bars) is shown. As can be seen, each bird showed some responding to the concurrent positive trial stimulus during at least one phase. However, only Birds 22 and 16 exhibited any consistent responding to the trial stimulus. Note that because 1 bird frequently responded in excess of 4 pecks/sec, the scales were not constant across birds.
The behavior obtained with a simple interfood clock without trial stimuli can be seen in the top row of frames in Figure 3. This row shows that an interfood clock chronically maintains a successive increase in rate across the stimuli in the later part of the interval. In one case (Bird 28), there was a rate decrement in the very final portion of the interval. Both of these results are consistent with the findings of other studies (Dinsmoor, Lee, & Brown, 1986; Matthews & Lerer, 1987; Palya, 1985).
Surprisingly, the reliable prediction of the upcoming trial outcome provided by the trial stimuli failed to abolish the control exerted by the intermittently reinforced explicit temporal context. This was the case in spite of substantial variation in the relative position of the trial stimulus onset. It had been expected that the more reliable trial stimuli on the side keys would overshadow control by the stimuli of the explicit temporal context, resulting in responding which could be characterized as optimal or "least effort." However, the explicit temporal stimuli continued to control responding during both the period before the reliable trial stimuli and during the positive trial stimulus, with the partial exception of Bird 16 in the latter case. The chronic maintenance of responding to the explicit temporal stimuli occurred even though (1) only the last stimulus of the explicit clock was ever present at food presentation, (2) food followed the final clock stimulus on only half the trials, and (3) the concurrent trial stimulus was the most reliable signal that food would be presented.
The failure of the trial stimuli to abolish control by the explicit temporal context could not be accounted for by the suggestion that the pigeons failed to notice the trial stimuli or the suggestion that the trial stimulus had a very low salience. The trial stimuli clearly modulated the responding to the explicit temporal context. If a trial stimulus signaled that no food was forthcoming, responding to both keys was abolished in its presence. In addition, response rates to the explicit temporal stimuli preceding the onset of the trial stimuli were facilitated and rates to the temporal stimuli immediately following the positive trial stimulus were suppressed. This modulation of behavior to the explicit temporal stimuli by the positive trial stimulus can be seen by glancing down each column of Figures 2 and 3 and noting the alteration in the distribution of behavior associated with changes in the point of onset of the trial stimuli.
It is important to note that all stimuli but the final clock stimulus of the explicit temporal context signaled that no food was to be forthcoming in their presence, yet those explicit, putatively negative stimuli controlled substantial responding. The difference between the control exerted by a stimulus explicitly unpaired with food presentation in a to-be-reinforced sequence and its control in a to-be-nonreinforced sequence demonstrates the inappropriateness of models that invoke similar processes to account for between-trial and within-trial effects. Explicitly unpaired in the trial-to-trial sense is not equivalent to explicitly unpaired in the position-in-the-IRI sense.
To the degree that higher order conditioning was responsible for the rate increase to the stimuli preceding the trial stimulus, that higher order reinforcing effectiveness propagated better through the single trial stimulus than through a series of explicit clock stimuli. This effect can be seen by comparing the rates controlled by early clock stimuli when followed by a series of stimuli (Figure 3, top row) with the rates controlled by those same clock stimuli when followed by a single trial stimulus. Additionally, this result indicates that the reinforcing value of two stimuli, one inevitably followed by food and the other inevitably followed by time-out, was greater than the remainder of the interfood clock which was followed by either food presentation or time-out with equal probability. This effect is similar to that typically obtained with observing response procedures (see, e.g., Bower, McLean, & Meacham, 1966; Wyckoff, 1952). Not surprisingly, the reinforcing effect of the trial stimuli diminished with increasing delays to food presentation.
The onset of the positive stimulus was followed by a decrease in responding to the explicit temporal stimuli. The effect resembled that obtained with second-order fixed-interval schedules (Marr, 1979) or with fixed-interval schedules with an added stimulus (Farmer & Schoenfeld, 1966). As can be seen by comparing the data in the various rows of Figures 2 and 3, this decrease in responding following the onset of the trial stimulus was under the control of the onset of the positive trial stimulus itself rather than the absolute delay to food presentation. Two aspects of the data support this observation. First, a particular clock stimulus controlled low rates when it was immediately preceded by the onset of the positive trial stimulus, but controlled higher rates when the onset of the trial stimuli occurred either earlier or later in the interval. Second, the low rates to the clock stimuli immediately following the onset of the positive trial stimulus were in contrast to the higher rates controlled by the clock stimuli immediately preceding the onset of the trial stimulus on the same trial, even though those antecedent stimuli were temporally further from food presentation.
The rate decrease following the onset of the trial stimulus was not necessarily the simple result of keypecks being redirected from the explicit temporal stimuli to the trial stimulus. Only 2 birds showed appreciable responding to the trial stimulus, and even in those cases the loss of responding to the temporal stimuli exceeded the responses gained by the side key. This, of course, only indicated that the behavior controlled by the onset of the trial stimulus was not effective keypecking. For example, a bird might have turned around rather than peck the side key.
EXPERIMENT 2
Method
Subjects. Three naive pigeons were used.
Procedure. This experiment varied the percentage of positive trials. As a result, this manipulation (1) altered the probability that the fifth or middle explicit temporal stimulus could be followed by a reinforcer, as well as (2) altered the average time between food presentations. Either of these manipulations would be expected to alter the rate of responding to either or both the positive trial stimulus or the stimuli of the explicit temporal context. The probability of the positive trial stimulus's following the fifth clock stimulus was varied from 0.1 to 1.0. This resulted in a change in the C/T ratio from approximately 20 to 2. If the rate of responding to the clock stimuli just prior to the trial onset is a simple function of the probability of reinforcement, then lowering the probabilities of reinforcement should result in lower response rates. On the other hand, if response rate to the clock stimuli is a function of the C/T ratio, then rate should be an inverse function of reinforcement rate. The effect of either the probability of reinforcement or the cycle-to-trial ratio on responding would be revealed by this manipulation. It should be noted that probability of reinforcement and cycle-to-trial ratio were necessarily confounded.
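The confound between reinforcement probability and C/T ratio can be made concrete with a small arithmetic sketch. It rests on two assumptions not stated in this form in the text: the trial duration T is taken as 24 sec (a trial stimulus spanning the four clock segments after the fifth), and the cycle C is taken as the mean time between food presentations, 54 sec divided by the probability of a positive trial, ignoring the 3-sec outcome periods. The names `ct_ratio`, `INTERVAL_SEC`, and `TRIAL_SEC` are hypothetical.

```python
INTERVAL_SEC = 54.0  # fixed interfood clock duration
TRIAL_SEC = 24.0     # ASSUMED trial stimulus duration (segments 6-9)


def ct_ratio(p_positive):
    """Cycle-to-trial ratio for a given probability of a positive trial.

    Cycle is approximated as the mean time between food presentations
    (an assumption); outcome periods are ignored.
    """
    if not 0 < p_positive <= 1:
        raise ValueError("p_positive must be in (0, 1]")
    cycle_sec = INTERVAL_SEC / p_positive
    return cycle_sec / TRIAL_SEC
```

Under these assumptions the ratio is 2.25 at a probability of 1.0 and about 22.5 at 0.1, the same order as the approximate values of 2 and 20 given above. Because the ratio is proportional to the reciprocal of the probability, the two variables cannot be varied independently in this design.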
A 54-sec interfood interval was segmented into nine discriminable 6-sec periods, each of which was designated by a different color on the center key. Following the elapse of the ninth stimulus, a 3-sec food presentation or a 3-sec time-out occurred with the probability specified by the schedule for that phase. A stimulus consistently associated with the subsequent trial outcome was provided during the last half of the explicit temporal sequence by illuminating the right side key white on trials followed by food and illuminating the left side key white on trials followed by time-out. Detailed in Table 2 are the specific probabilities that the fifth clock stimulus was followed by the positive trial stimulus, the order of the presentation of those treatments, and the sessions per phase for each bird.
Table 2
Probability of Positive Trial Stimulus and Sessions per Phase
Phase | Probability of Positive Stimulus | Sessions |
---|---|---|
1* | 0 (100% food) | 25 |
2* | 0 (50% food) | 25 |
3 | 50% (baseline) | 35 |
4 | 25% | 20 |
5 | 10% | 25 |
6 | 1% | 30 |
7 | 50% (baseline) | 25 |
8 | 75% | 25 |
9 | 90% | 25 |
10 | 99% | 25 |
11 | 100% | 20 |
12 | 50% (baseline) | 20 |
* No trial stimulus was present in these phases |
Phases 1 and 2 were preliminary baseline procedures during which no trial stimuli on the side keys were present. Phase 1 was a simple interfood clock which inevitably resulted in food presentation. In Phase 2, only half the interfood clock sequences were followed by food presentation. In Phase 3, a trial stimulus consistently correlated with the trial outcome was presented following the fifth explicit temporal stimulus. As a result, half of the occurrences of the fifth clock stimulus were followed by the positive trial cue and half were followed by the negative trial cue. Phases 4, 5, and 6 reduced the presentation probability of the positive trial stimulus to 0.25, 0.10, and 0.01, respectively. Negative trials were delivered on the proportion of trials not containing the positive CS. Phase 7 reestablished the baseline conditions of Phase 3. In Phase 7, therefore, half the trials contained the positive trial stimulus. Phases 8, 9, 10, and 11 increased the probability that the fifth clock stimulus would be followed by the positive trial stimulus. The respective probabilities were 0.75, 0.90, 0.99, and 1.00. Phase 12 reestablished the baseline procedure of Phase 3; during this procedure, half of the trials again contained the predictor of food.
Results and Discussion
In general, the overall effects noted in the first experiment were replicated even under conditions that would be expected to substantially alter the relative impact of the trial stimulus in one direction or the other. The control exerted by the stimuli of an explicit temporal context persisted in spite of large variations in the probability of a concurrent reliable predictor of food presentation. The surprising strength of the temporal context in the face of a discrete trial stimulus was consistent with that observed by Williams et al. (1992).
As in Experiment 1, the trial stimuli increased the rate to the explicit temporal stimuli immediately preceding their onset and decreased responding to the explicit temporal stimuli immediately following their onset. These effects occurred at all but the very lowest probability of the positive trial stimulus, and in spite of substantial changes in the relative duration of the trial with respect to the interfood interval. The specific results of Experiment 2 indicated that the highest rates to the explicit temporal stimuli preceding the reliable signal of the trial outcome occurred when the positive concurrent stimulus occurred on approximately half the occasions or less, rather than when the antecedent explicit temporal stimuli were followed by a signal of a positive trial outcome most often.
Figure 4 presents the rates to each consecutive temporal stimulus on the positive trials under each probability of reinforcement for each bird. Each bird's data are presented as a vertical column of frames, with the results of higher probabilities of the trial stimulus in the successively lower frames (with the exception of the initial and terminal baseline phases). The probability of the positive trial stimulus is given to the right of each row of frames. A marker below the x-axis indicates the temporal position of the trial stimulus within each cycle of the explicit temporal context. Consecutive clock periods are provided across the x-axis, while response rates in each clock stimulus are provided on the y-axis. Responding to the clock key is indicated by open bars, while responding to the positive concurrent stimulus key is indicated by superimposed shaded bars. As can be seen, there was little consistent responding directed to the positive trial stimulus and what did occur was primarily at the onset.
The data from Phase 2, when no trial stimuli were provided and food followed half the trials, are depicted for comparison in the first row. The effects of alterations in the probability of the trial stimulus on the behavior controlled by the explicit temporal stimuli can be seen by glancing down each of the columns in Figure 4. The top row of frames indicates that, in the absence of the trial stimuli, very little responding occurred to the clock stimuli in the first half of the temporal sequence. The rate facilitation to the early portion of the explicit temporal clock increases and subsequently decreases with decreasing probabilities of the positive trial stimulus. The distribution of responding to the clock stimuli during the 50% baseline conditions was generally recovered with each reinstitution of the procedure.
To simplify the summarization of the effects of the procedures on the behavior maintained to the stimuli preceding the trial stimulus onset, Figure 5 presents a subset of those effects. Each frame presents the mean response rate to only the explicit temporal stimulus immediately preceding the onset of the trial stimulus for each of the procedures, and thereby depicts the overall trend in the treatments in a single frame. As can be seen by comparing the results shown in Figures 4 and 5, Figure 5 adequately summarizes the overall effects of the treatments. The probability of a positive trial stimulus is represented along the x-axis, while the y-axis indicates the rate of responding to the explicit temporal stimulus immediately preceding the trial stimulus. The figure shows that low-to-intermediate probabilities of the positive trial stimulus, rather than the highest probabilities, were most effective in generating and supporting responding to the antecedent explicit temporal stimuli. The effect of a positive trial stimulus was not symmetrical around a 50% presentation probability; rather, the lower probabilities were generally more effective than their positive counterparts.
The present results pose a problem for a simple probability-of-reinforcement interpretation for the obtained effect. That view seems inadequate for two reasons. First, high rates were supported by very infrequent presentations of the positive stimulus. To account for those high rates, it would be necessary to argue that even an occasional positive trial stimulus was sufficient to outweigh the neutral or punishing effect of mostly negative outcomes. Secondly, the change in response rate with changes in reinforcement probability did not exhibit the necessary relationship. Probability of reinforcement increases with increasing probability of the positive trial stimulus. The rate to the antecedent clock stimuli decreased with increasing frequency of the positive trial stimulus over most of its range.
It could be asserted that the higher order effects of primary reinforcement mediated by the trial stimuli were irrelevant, and that the present data were the result of the information provided by the trial stimuli (see, e.g., Wilton & Clements, 1971). The obtained functions are generally consistent with the amount of information transmitted by the onset of the trial stimuli (symmetrical around 50% presentation probability) or by the onset of the positive outcome only (skewed with a mode below a 50% presentation probability). While the functional relationship obtained in the present research could be labeled information, the explanatory application of similar conceptualizations in similar situations has failed to bear fruit in the past (Eckerman, 1973; Fantino, 1977).
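The two shapes mentioned above (symmetrical around 50%, or skewed with a mode below 50%) match two standard Shannon quantities, sketched below for a binary outcome. Treating these as the measures of "information transmitted" intended here is an assumption, and the function names are hypothetical.

```python
import math


def binary_entropy(p):
    """Average information (bits) in a two-outcome event with P(positive) = p.

    Symmetrical around p = 0.5, where it reaches its maximum of 1 bit.
    """
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def positive_outcome_term(p):
    """Contribution of the positive outcome alone, -p * log2(p).

    Skewed: its maximum lies at p = 1/e, below 0.5.
    """
    return 0.0 if p == 0.0 else -p * math.log2(p)
```

Evaluating both functions at the probabilities listed in Table 2 gives one curve that is symmetrical around 50% and one that peaks below 50%, which is the qualitative contrast drawn in the paragraph above.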
The present findings would seem to be most parsimoniously cast within a comparator perspective (e.g., Fantino, 1977; Gibbon & Balsam, 1981; Miller & Matzel, 1989). These views would point out that a reinforcer in the context of a long interreinforcement interval is more effective than one in the context of a short interreinforcement interval. A comparator position would expect an increase in rate with longer IRIs up to some relatively high limit, beyond which behavior would be poorly maintained as the result of the low probability of reinforcement.
EXPERIMENT 3
Method
Subjects. The 3 pigeons from Experiment 2 and 3 naive pigeons were used as subjects.
Procedure. This experiment manipulated the validity of the trial stimuli by altering the probability that a given trial outcome was correctly signaled by the trial stimuli. A 54-sec interfood interval was segmented into 9 discriminable 6-sec periods, each of which was designated by a different color on the center key. Following the elapse of the 9th stimulus, a 3-sec food presentation or a 3-sec time-out occurred with equal probability. A concurrent trial stimulus was provided during the last half of the explicit temporal sequence by white illumination of the right or left side key. The probability of food presentation given right-key illumination and of time-out given left-key illumination varied as specified by the schedule for that phase. The procedures, along with the number of sessions in each phase for each bird, are given in Table 3.
Table 3
Probability of Food Given Right Trial Stimulus and Sessions per Phase
Phase | Probability of Food Given Right Key (%) | Probability of Food Given Left Key (%) | Sessions |
---|---|---|---|
1* | 100 | 100 | 20 |
2 | 100 | 0 | 20 |
3 | 75 | 25 | 45 |
4 | 100 | 0 | 40 |
5 | 60 | 40 | 45 |
6 | 100 | 0 | 55 |
7 | 55 | 45 | 35 |
8 | 100 | 0 | 45 |
9 | 50 | 50 | 20 |
* Food presentation followed every trial in this phase.
Following a preliminary phase during which every trial was followed by food presentation, every alternate phase reestablished a baseline procedure. During the baseline procedure, the trial stimuli on the side keys reliably signaled the subsequent trial outcome. In those baseline phases, the right key consistently preceded food presentation and the left key always signaled time-out. During Phases 3, 5, 7, and 9, food presentation followed 75%, 60%, 55%, and 50% of the right-key illuminations, and 25%, 40%, 45%, and 50% of the left-key illuminations. Data from the final two baseline procedures of Experiment 2 were used for the initial two baselines of Experiment 3 for the 3 birds carried over from Experiment 2.
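The validity manipulation can be made concrete with a small calculation. The block below is an illustration only, not the control program used in the experiment. It assumes that the trial outcome is determined first (food with probability 0.5, as in Phases 2 through 9) and that the right key is then lit with probability `validity` on food trials and with probability 1 - `validity` on time-out trials. The names `key_contingencies`, `validity`, and `p_food` are hypothetical.

```python
def key_contingencies(validity, p_food=0.5):
    """Return (P(right key), P(food | right key), P(food | left key)).

    Assumption: the right key is lit with probability `validity` on food
    trials and with probability 1 - `validity` on time-out trials.
    """
    p_right_and_food = p_food * validity
    p_right_and_timeout = (1.0 - p_food) * (1.0 - validity)
    p_right = p_right_and_food + p_right_and_timeout
    p_left = 1.0 - p_right
    p_left_and_food = p_food * (1.0 - validity)
    p_food_given_right = p_right_and_food / p_right
    p_food_given_left = p_left_and_food / p_left
    return p_right, p_food_given_right, p_food_given_left


# Validities used in the baseline phases and in Phases 3, 5, 7, and 9.
for validity in (1.00, 0.75, 0.60, 0.55, 0.50):
    p_right, given_right, given_left = key_contingencies(validity)
    print(f"validity {validity:.2f}: P(right) = {p_right:.2f}, "
          f"P(food | right) = {given_right:.2f}, "
          f"P(food | left) = {given_left:.2f}")
```

Under these assumptions each side key is lit on half of the trials and the overall probability of food stays at 0.5 in every phase, so only the correlation between the side key and the trial outcome changes across phases.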
Results and Discussion
The specific finding of this experiment was that the response rate to the stimuli of the explicit temporal context immediately preceding the onset of an unreliable predictor of food presentation was lower than the rate to those same temporal stimuli when followed by a valid predictor. Additionally, in 5 of the 6 birds, the rate to those antecedent clock stimuli systematically decreased as a function of decreases in the correlation of the subsequent trial stimulus with the trial outcome. In other respects, the general distribution of responding to the explicit temporal stimuli of the context was similar to that seen in the previous two experiments.
The impact of the validity manipulation for each of the 6 birds under each of the procedures of this experiment is summarized in the six frames of Figure 6. As in Figure 5, only the behavior to the explicit temporal stimulus immediately preceding the trial stimuli is depicted. That measure captures the primary effect of the procedures; the general distribution of responding under this procedure was essentially the same as those previously presented, and a more complete presentation would require substantially more space. The various probabilities of food presentation given the right trial stimulus (the validity of the trial stimulus) for each phase are provided across the x-axis. The y-axis indicates the mean response rate controlled by the explicit temporal stimulus immediately preceding the onset of the trial stimulus under each of the procedures indicated across the x-axis. The baseline phases, during which the trial stimuli were valid, are designated by shaded bars; the data for treatment phases are designated by solid bars. The results of the preliminary baseline, during which all of the trials were followed by food presentation, are given in the open leftmost bar in each frame.
As can be seen in this figure, the rate to the antecedent explicit temporal stimulus under the baseline procedures with reliable trial stimuli (shaded bars) was consistently higher than that controlled by that same temporal stimulus when it was followed by unreliable predictors of the trial outcome (solid bars). These baseline rates were also higher than those that occurred when every trial was followed by food presentation (open bars). The baseline rates were relatively similar to each other in the birds with extensive experience with variations of the general procedure (left column), whereas the baseline rates exhibited more variability in the birds that were naive (right column).
The validity of the trial stimulus strongly influenced its ability to condition responding to the immediately preceding stimuli of the explicit temporal context. These results make it apparent that, with respect to the behavior controlled by a contextual stimulus, the probability of a subsequent primary reinforcer is far less important than how the separation is mediated. If the reinforcer is consistently signaled, then a substantial amount of responding can be expected to that stimulus. If, on the other hand, the subsequent signal is only partially correlated with the trial outcome, then responding to the antecedent temporal stimulus will be proportionally weaker.
These findings were compatible with a view suggesting that the ability of a positive trial stimulus to perform as a reinforcer is a function of the discrepancy between the correlation of that trial stimulus with the reinforcer and the correlation of the comparator (e.g., the explicit temporal context) with the reinforcer (Miller & Matzel, 1989). This view would have predicted that the trial stimulus that was perfectly correlated with an upcoming food presentation (1) would have been ineffective when the explicit temporal context also was perfectly correlated with the upcoming food presentation, but (2) would have been effective when the context did not perfectly predict food presentation (e.g., Phase 1 vs. Phase 2). The machinery underlying this perspective differs from that underlying a traditional information view in that it diminishes the importance of temporal priority and contends that the reinforcing effect of information is a function of the discrepancy between the information source and its background, or comparator, rather than the absolute value of that information.
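The discrepancy described above can be tabulated for the phases of this experiment. The following is a minimal sketch that assumes the discrepancy is a simple difference between the probability of food given the right (positive) trial stimulus and the probability of food given the explicit temporal context alone. Comparator accounts do not commit to this particular arithmetic, and the names `PHASES` and `discrepancy` are hypothetical.

```python
# (P(food | right key), P(food | clock context alone)) for selected phases.
# Right-key values come from Table 3. The context values assume food on every
# trial in Phase 1 and on half of the trials in all later phases.
PHASES = {
    1: (1.00, 1.00),  # preliminary phase: food followed every trial
    2: (1.00, 0.50),  # baseline: trial stimuli fully valid
    3: (0.75, 0.50),
    5: (0.60, 0.50),
    7: (0.55, 0.50),
    9: (0.50, 0.50),
}


def discrepancy(p_food_given_trial_stimulus, p_food_given_context):
    """Difference between trial-stimulus and context predictiveness."""
    return p_food_given_trial_stimulus - p_food_given_context


for phase, (p_stimulus, p_context) in PHASES.items():
    print(f"Phase {phase}: discrepancy = "
          f"{discrepancy(p_stimulus, p_context):.2f}")
```

On this reading the discrepancy is zero in Phase 1 and in Phase 9, largest in the baseline phases, and shrinks as validity falls from 75% to 55%, which parallels the ordering of rates to the antecedent clock stimulus reported for most birds.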
Both excitatory and suppressive effects of the positive trial stimulus on the control by the explicit temporal stimuli were observed in the present research. The specific nature of the effect depended on the temporal relationship of the clock stimulus to the onset of the trial stimulus, and the validity with which the trial stimulus signaled the subsequent outcome. There was an increase in the rate of responding to the explicit temporal stimuli preceding the onset of the trial stimuli. Following the onset of the negative trial stimulus, responding was abolished. Immediately following the onset of the positive trial stimulus, responding to the clock stimuli was suppressed. Most typically, responding increased across the latter portion of the interfood interval on positive trials. These general effects were robust across relatively large variations in the duration of the trial stimulus as well as large variations in the relative duration of the trial with respect to the interfood interval or in the probability of the positive concurrent trial stimulus. In contrast, relatively small variations in the validity with which the concurrent trial stimulus signaled the upcoming trial outcome had a pronounced effect on the ability of the trial stimulus to function as a reinforcer.
The magnitude of the effects obtained in the present research was consistent with a process governed by the discrepancy between the probability of food presentation given the positive trial stimulus and the probability of food given only the explicit temporal context. Comparator theories (e.g., Miller & Matzel, 1989) specify that a comparison of the predictiveness of the CS with the predictiveness of the context determines the strength of the behavior to the CS. The present results indicate that that discrepancy also correctly describes the ability of the stimulus to function as a reinforcer. Discrepancies of the appropriate magnitude between the strength of the trial stimulus and the strength of the context would be the result of any conditioning process that produced an effect proportional to the probability of reinforcement (e.g., Rescorla & Wagner, 1972). The only exceptions to this general rule were the conditions with extremely low reinforcement rates in Experiment 2. Those exceptions could be seen as the result of some additional process, the most likely being extinction. As few as one or two reinforcers occurred per session when the reinforcement probability was only 1%.
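The claim that a process such as that of Rescorla and Wagner (1972) yields strength proportional to the probability of reinforcement can be checked with a short sketch. The block below makes several simplifying assumptions that are not part of the original report: a single cue, the same learning rate on reinforced and nonreinforced trials, and the expected (mean) update per trial rather than a stochastic sequence of trials. The function name and parameter values are hypothetical.

```python
def expected_asymptote(p_reinforcement, learning_rate=0.1, lam=1.0, trials=500):
    """Mean associative strength after repeated partially reinforced trials.

    Each step applies the expected change in strength V:
    reinforced trials move V toward lam, nonreinforced trials toward 0.
    """
    strength = 0.0
    for _ in range(trials):
        reinforced = p_reinforcement * learning_rate * (lam - strength)
        nonreinforced = (1.0 - p_reinforcement) * learning_rate * (0.0 - strength)
        strength += reinforced + nonreinforced
    return strength


# Probabilities of food given the right key used in Experiment 3.
for p in (1.00, 0.75, 0.60, 0.55, 0.50):
    print(f"P(food) = {p:.2f}: expected strength = {expected_asymptote(p):.3f}")
```

With equal learning rates the expected strength settles at the reinforcement probability multiplied by `lam`, so differences between two such strengths are proportional to differences in reinforcement probability. Unequal rates for reinforced and nonreinforced trials would change the asymptote and are not modeled here.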
Presumably, to the degree that implicit temporal stimuli correlated with the passage of time are discriminable, they would be expected to control conditioned responses similar to those controlled by the explicit clock stimuli. Additionally, dynamics similar to those obtained in the present research would be expected in similar traditional Pavlovian procedures. While it could be argued that the behavior to implicit temporal stimuli across an interfood interval is completely different from that obtained with explicit stimuli, that argument, without empirical support, opens the door to simply postulating behavior or processes as required by theoretical necessity (Zeiler, 1977). Zeiler cogently pointed out that the behavior obtained with an explicit instantiation of a hypothetical process should be considered veridical. Subsequently, the burden of proof lies on any position suggesting that the hypothetical process is different from its empirical instantiation.
In addition to documenting real-time dynamics in the trial stimulus/temporal context interaction, the present study illustrated a basic weakness in the use of single indices to represent "the" interaction of trial stimuli and the context. The obtained real-time dynamics across the IRI are consistent with Palya's (1993) bipolar model. This view suggests that degree of contiguity is most appropriately defined in terms of position on a gradient extending from the stimulus with the most negative correlation with the reinforcer (Smin) to the stimulus with the most positive correlation with the reinforcer (Smax). Stimuli across the second half of the gradient would be seen as having increasing contiguity with the subsequent reinforcer. Stimuli in successively earlier portions of the first half of the gradient would be seen as having, in effect, an increasingly negative correlation with the reinforcer. By the specification of contiguity in this way, this position also provides a principled framework within which the extent and degree of higher order conditioning, conditioned reinforcement, and information could be specified.
The distribution of responding in time obtained with the present procedures is consistent with this perspective. With the present procedure, Smin, or the point in the interval with the most negative relationship with the reinforcer, occurred at the onset of the negative trial stimulus. The point with the most positive relationship with food presentation, or Smax, occurred at the end of the positive trial stimulus. Stimuli with a higher positive relationship with the reinforcer controlled higher response rates, while stimuli in the first half of the Smin-Smax gradient did not control the terminal behavior. The portion of the interval immediately following Smin is not illustrated in the figures because keypecking did not occur during the negative trial stimulus.
An appropriate conceptualization of the factors controlling the low rates at the onset of the positive trial stimulus, the portion of the CS furthest removed from the US, is also available. The effect is typically labeled "inhibition of delay." The poor explanatory power of a position based on the weakening effect of the relative or absolute delay to the reinforcer is revealed, however, by the high rates chronically maintained by the explicit temporal stimuli prior to the onset of the trial stimulus. These stimuli explicitly signal a longer delay to the reinforcer than the stimulus in effect at the onset of the trial stimulus.
The change in behavior across the duration of the trial stimulus is more understandable when it is noted that it exhibited essentially the same distribution as the responding found across an entire interfood interval. It may find its best explanation, therefore, in a similar bipolar process. The stimuli with the most negative correlation (in this case temporal) with the subsequent food presentation most strongly control a behavior other than the final terminal approach behavior.
REFERENCES
BOWER, G., McLEAN, & MEACHAM, J. (1966). Value of knowing when reinforcement is due. Journal of Comparative & Physiological Psychology, 62, 184-192.
DINSMOOR, J. A., LEE, D. M., & BROWN, M. M. (1986). Escape from serial stimuli leading to food. Journal of the Experimental Analysis of Behavior, 46, 259-279.
ECKERMAN, D. A. (1973). Uncertainty reduction and conditioned reinforcement. Psychological Record, 23, 39-48.
FANTINO, E. (1977). Conditioned reinforcement: Choice and information. In W. K. Honig & J. E. R. Staddon (Eds.), Handbook of operant behavior (pp. 313-339). Englewood Cliffs, NJ: Prentice-Hall.
FARMER, J., & SCHOENFELD, W. N. (1966). Varying temporal placement of an added stimulus in a fixed-interval schedule. Journal of the Experimental Analysis of Behavior, 9, 369-375.
GIBBON, J., & BALSAM, P. (1981). Spreading association in time. In C. M. Locurto, H. S. Terrace, & J. Gibbon (Eds.), Autoshaping and conditioning theory (pp. 219-253). New York: Academic Press.
HEARST, E., & JENKINS, H. M. (1974). Sign-tracking: The stimulus-reinforcer relation and directed action. Austin, TX: Psychonomic Society.
MARR, M. J. (1979). Second-order schedules and the generation of unitary response sequences. In M. D. Zeiler & P. Harzem (Eds.), Advances in analysis of behaviour: Vol. 1. Reinforcement and the organization of behavior (pp. 223-260). New York: Wiley.
MATTHEWS, T. J., & LERER, B. E. (1987). Behavior patterns in pigeons during autoshaping with an incremental conditioned stimulus. Animal Learning & Behavior, 15, 69-75.
MILLER, R. R., & MATZEL, L. D. (1989). Contingency and relative associative strength. In S. B. Klein & R. R. Mowrer (Eds.), Contemporary learning theories: Pavlovian conditioning and the status of traditional learning theory (pp. 61-84). Hillsdale, NJ: Erlbaum.
PALYA, W. L. (1985). Sign-tracking with an interfood clock. Journal of the Experimental Analysis of Behavior, 43, 321-330.
PALYA, W. L. (1993). Bipolar control in fixed interfood intervals. Journal of the Experimental Analysis of Behavior, 60, 345-359.
RESCORLA, R. A., & WAGNER, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In A. H. Black & W. F. Prokasy (Eds.), Classical conditioning: II. Current research and theory (pp. 64-99). New York: Appleton-Century-Crofts.
WILLIAMS, D. A., FRAME, K. A., & LOLORDO, V. M. (1992). Discrete signals for the unconditioned stimulus fail to overshadow contextual or temporal conditioning. Animal Learning & Behavior, 18, 41-55.
WILTON, R. N., & CLEMENTS, R. O. (1971). The role of information in the emission of observing responses: A test of two hypotheses. Journal of the Experimental Analysis of Behavior, 16, 161-166.
WYCKOFF, L. B., JR. (1952). The role of observing responses in discrimination learning: Part 1. Psychological Review, 59, 431-442.
ZEILER, M. D. (1977). Schedules of reinforcement: The controlling variables. In W. K. Honig & J. E. R. Staddon (Eds.), Handbook of operant behavior (pp. 201-232). Englewood Cliffs, NJ: Prentice-Hall.
This research was partially supported by a National Institutes of Health grant (1 R15 HD 25601-01) and a National Science Foundation grant (BNS 8808409) to W.L.P. Portions of this paper were presented at the annual meeting of the American Psychological Association, 1992. The authors gratefully acknowledge the contributions of Don Walter for data analysis and discussions, Janie Christian and Helen Bush for conducting the experiments, and Elizabeth Palya for contributions in all phases of this research. Correspondence and requests for reprints should be sent to William L. Palya, Department of Psychology, Jacksonville State University, Jacksonville, AL 36265 (e-mail: palya@sebac.jsu.edu).
Accepted by previous editor, Vincent M. LoLordo.
(Manuscript received January 27, 1994; revision accepted for publication April 15, 1995.)