Behavioural Processes, Volume 114, May 2015, Pages 72-77

The copyist model and the shaping view of reinforcement

https://doi.org/10.1016/j.beproc.2015.02.009

Highlights

  • Summarized a brief history of the strengthening and shaping views of reinforcement.

  • The copyist model belongs to the family of accounts based on the shaping view.

  • Evaluated how well the copyist model explains the VR–VI rate difference and the matching law.

  • Future work should be directed at a model that combines the strengths of the strengthening and shaping views.

Abstract

The strengthening view of reinforcement attributes behavior change to changes in response strength or in the value of the reinforcer. In contrast, the shaping view explains behavior change as the shaping of different response units through differential reinforcement. In this paper, we evaluate how well these two views explain: (1) the response-rate difference between variable-ratio and variable-interval schedules that provide the same reinforcement rate; and (2) the phenomenon of matching in choice. The copyist model (Tanno and Silberberg, 2012), a shaping-view account, provides accurate predictions of these phenomena without a strengthening mechanism; however, the model has limitations. It cannot explain the relations between behavior change and stimulus control, reinforcer amount, and reinforcer quality. These relations seem easily explained by a strengthening view. Future work should be directed at a model that combines the strengths of these two types of accounts.

Introduction

In Principles of Psychology, Keller and Schoenfeld (1950) noted that operant conditioning is “merely the strengthening of a reflex that already exists in the organism’s repertory” (p. 48). This observation suggests that they viewed operant conditioning as altering the strength of the relation between operants and their consequent reinforcers. For Skinner (1953), on the other hand, “reinforcement” may refer, at least in part, to the process of shaping operants. In Science and Human Behavior, he noted that “operant reinforcement resembles the natural selection of evolutionary theory. Just as genetic characteristics which arise as mutations are selected or discarded by their consequences, so novel forms of behavior are selected or discarded through reinforcement” (p. 430). Here, it seems, Skinner viewed reinforcement as a process of shaping rather than strengthening.

These two views of reinforcement, one based on strengthening and the other on shaping, differ in their implications. Although it is generally acknowledged that reinforcement has both of these effects (Morse, 1966), their generality and implications remain controversial (e.g., Shimp, 1976, Shimp, 2013). In this paper, we evaluate how well these two views explain: (1) the response-rate difference between variable-ratio (VR) and variable-interval (VI) schedules that provide the same reinforcement rate (Ferster and Skinner, 1957); and (2) the phenomenon of relative response rates matching relative reinforcer rates in concurrent schedules (Herrnstein, 1961). As regards these two phenomena, we believe the idea that reinforcement shapes behavior is predictively superior to the idea that reinforcement strengthens behavior.

Section snippets

The copyist model

Tanno and Silberberg’s (2012) copyist model belongs to the family of accounts based on the shaping view of reinforcement. The computational algorithm of the copyist model is shown in Fig. 1. While the algorithm is similar to interresponse time (IRT) reinforcement theory broadly conceived (Morse, 1966, Peele et al., 1984, Wearden and Clark, 1988), the copyist model differs from the IRT reinforcement theory in one important regard. In earlier IRT accounts such as Peele et al. (1984), the IRTs in
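
A minimal sketch of an IRT-copying process of this general kind is given below. It is illustrative only: the memory size, the number of IRTs copied after each reinforcer, the uniform seeding of memory, and the sampling and overwrite rules are assumptions made for exposition, not the parameter values or exact rules of Tanno and Silberberg’s (2012) model as shown in Fig. 1.

```python
import random

# Illustrative sketch of an IRT-copying ("copyist"-style) process.
# Assumptions (not the published specification): a fixed-size memory of
# interresponse times (IRTs), emission by random sampling from memory, and
# copying of the IRTs that immediately preceded reinforcement back into
# memory, overwriting randomly chosen entries.

MEMORY_SIZE = 25   # number of IRTs held in memory (assumed)
COPY_LENGTH = 5    # IRTs preceding a reinforcer that are copied back (assumed)


def simulate(schedule, n_reinforcers=500):
    """Run the IRT-copying process against a reinforcement schedule.

    `schedule(irt)` should return True if the response that ends this IRT
    is reinforced.
    """
    memory = [random.uniform(0.5, 5.0) for _ in range(MEMORY_SIZE)]  # seconds
    recent = []                      # IRTs emitted since the last reinforcer
    total_time, responses, earned = 0.0, 0, 0
    while earned < n_reinforcers:
        irt = random.choice(memory)  # emit a response by copying from memory
        total_time += irt
        responses += 1
        recent.append(irt)
        if schedule(irt):
            earned += 1
            # Differential reinforcement of IRT sequences: the IRTs that
            # preceded the reinforcer overwrite randomly chosen memory entries.
            for copied in recent[-COPY_LENGTH:]:
                memory[random.randrange(MEMORY_SIZE)] = copied
            recent = []
    return responses / total_time    # overall response rate (responses/s)
```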

The VR–VI rate difference

The VR–VI rate difference defines a phenomenon where the strengthening view and the shaping view provide clearly different accounts of behavior. In VR schedules, the number of responses required to deliver a reinforcer varies between successive reinforcers. The mean of these interreinforcement ratios (the number of responses between successive reinforcers) defines the schedule's value (e.g., VR 30). In a VI schedule, on the other hand, a reinforcer is delivered for the first response following
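
To make the contingency difference concrete, the two schedule rules can be sketched as follows, in a form compatible with the `simulate` sketch above. The constant reinforcement probability per response for VR and the exponentially distributed intervals for VI are standard idealizations assumed here for illustration, not the article’s procedures.

```python
import random


def make_vr(mean_ratio):
    """Idealized VR: each response is reinforced with constant probability
    1/mean_ratio, so reinforcement probability does not depend on the IRT."""
    p = 1.0 / mean_ratio
    return lambda irt: random.random() < p


def make_vi(mean_interval):
    """Idealized VI: a reinforcer sets up after an exponentially distributed
    interval and is delivered for the first response after setup, so the
    probability of reinforcement grows (boundedly) with IRT duration."""
    state = {"remaining": random.expovariate(1.0 / mean_interval)}

    def schedule(irt):
        state["remaining"] -= irt      # the interval timer runs in real time
        if state["remaining"] <= 0:
            state["remaining"] = random.expovariate(1.0 / mean_interval)
            return True
        return False

    return schedule


# Illustrative use with the simulate() sketch above:
# vr_rate = simulate(make_vr(30))      # VR 30
# vi_rate = simulate(make_vi(60.0))    # VI 60 s
```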

The matching law and concurrent-schedule performance

The matching law is another representative phenomenon where the strengthening view and the shaping view provide clearly different accounts of behavior. Herrnstein (1961) exposed pigeons to concurrent VI VI schedules and found that the relative response rate for one alternative equaled, or “matched,” the relative reinforcement rate for that alternative. This relation is expressed as:

$$\frac{B_1}{B_1+B_2}=\frac{R_1}{R_1+R_2}$$

where B and R denote responses and reinforcements, and subscripts distinguish the two
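
As a purely illustrative numerical case, not taken from the article: if the two alternatives provide $R_1 = 40$ and $R_2 = 20$ reinforcers per hour, matching predicts that about two-thirds of responses are allocated to the richer alternative:

$$\frac{B_1}{B_1+B_2}=\frac{40}{40+20}\approx 0.67$$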

Changeover delay and copyist model II

Skinner’s (1950) first published considerations of concurrent performances were in response to Tolman’s (1938) claim that the “determiners of behavior at a choice point” could be used as a surrogate for measuring strength. Skinner thought preference was a poor measure of strength and instead reflected the shaping effect on behavior of the differential reinforcement of switching (see also Skinner, 1986).

Given Skinner’s view, Herrnstein’s (1961) finding of matching in choice can be seen, at least

Limitations and implications

The copyist model has two major limitations. The first relates to stimulus control. The copyist model has no mechanism for reflecting the discriminative control of operant responses. For example, the model cannot explain schedule performance under fixed-interval and fixed-ratio schedules, in which timing and counting, respectively, play important roles. One possible solution is to define multiple memory sets and assign one to each discriminative stimulus, as McDowell’s (2013) selection by
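
One way to picture the multiple-memory-set idea, purely as an illustration (the stimulus labels and the one-memory-per-stimulus rule below are assumptions, not the detail of McDowell’s (2013) mechanism or of the authors’ proposal):

```python
import random

MEMORY_SIZE = 25  # assumed, as in the earlier sketch

# One IRT memory per discriminative stimulus; sampling and copying would then
# operate only on the memory tied to the stimulus currently in effect.
memories = {
    stimulus: [random.uniform(0.5, 5.0) for _ in range(MEMORY_SIZE)]
    for stimulus in ("red_key", "green_key")   # hypothetical stimuli
}


def emit_irt(current_stimulus):
    """Copy an IRT from the memory associated with the stimulus in effect."""
    return random.choice(memories[current_stimulus])
```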

Acknowledgements

This work was supported by JSPS KAKENHI #00237309 and #26995075. The authors thank Dr. Kyoichi Hiraoka for his suggestions on the stay/switch idea of the copyist model.

References

  • Baum, W.M. (1973). The correlation-based law of effect. J. Exp. Anal. Behav.

  • Baum, W.M. (1974). On two types of deviation from the matching law: bias and undermatching. J. Exp. Anal. Behav.

  • Baum, W.M. (1981). Optimization and the matching law as accounts of instrumental behavior. J. Exp. Anal. Behav.

  • Baum, W.M. (1993). Performances on ratio and interval schedules of reinforcement: data and theory. J. Exp. Anal. Behav.

  • Baum, W.M., et al. (1969). Choice as time allocation. J. Exp. Anal. Behav.

  • Catania, A. (1963). Concurrent performances: a baseline for the study of reinforcement magnitude. J. Exp. Anal. Behav.

  • Cleaveland, J.M. (1999). Interresponse-time sensitivity during discrete-trial and free-operant concurrent variable-interval schedules. J. Exp. Anal. Behav.

  • Cole, M.R. (1994). Response-rate differences in variable-interval and variable-ratio schedules: an old problem revisited. J. Exp. Anal. Behav.

  • Ferster, C.B., & Skinner, B.F. (1957). Schedules of Reinforcement.

  • Herrnstein, R.J. (1961). Relative and absolute strength of response as a function of frequency of reinforcement. J. Exp. Anal. Behav.

  • Herrnstein, R.J., et al. (1979). Is matching compatible with reinforcement maximization on concurrent variable interval, variable ratio? J. Exp. Anal. Behav.

  • Keller, F.S., & Schoenfeld, W.N. (1950). Principles of Psychology.

  • MacDonall, J.S. (1999). A local model of concurrent performance. J. Exp. Anal. Behav.

  • MacDonall, J.S. (2009). The stay/switch model of concurrent choice. J. Exp. Anal. Behav.

  • Matthews, B.A., et al. (1977). Uninstructed human responding: sensitivity to ratio and interval contingencies. J. Exp. Anal. Behav.

  • McDowell, J.J. (2013). A quantitative evolutionary theory of adaptive behavior dynamics. Psychol. Rev.

  • Miller, H.L. (1976). Matching-based hedonic scaling in the pigeon. J. Exp. Anal. Behav.

  • Morse, W.H. (1966). Intermittent reinforcement.

  • Niv, Y. (2007). The Effects of Motivation on Habitual Instrumental Behavior. Unpublished doctoral dissertation.

  • Niv, Y., et al. (2007). Tonic dopamine: opportunity costs and the control of response vigor. Psychopharmacology (Berl.).