Operant Conditioning
A Brief Survey of Operant Behavior
It has long been known that
behavior is affected by its consequences. We reward and punish
people so they will behave in different ways. A more specific
effect of a consequence was first studied experimentally by
Edward L. Thorndike in a well-known experiment. A cat enclosed
in a box struggled to escape and eventually moved the latch,
which opened the door. When repeatedly enclosed in a box,
the cat gradually ceased to do things that had proved ineffective
("errors") and eventually developed the successful
response very quickly.
In operant conditioning, behavior is also affected
by its consequences, but the process is not trial-and-error
learning. It can best be explained with an example. A hungry
rat is placed in a semi-soundproof box. For several days an
automatic dispenser occasionally delivers bits of food into
a tray. The rat soon goes to the tray immediately upon hearing
the sound of the dispenser. A small horizontal section of
a lever protruding from the wall has been resting in its lowest
position, but it is now raised slightly so that when the rat
touches it, it moves downward. In doing so it closes an electric
circuit and operates the food dispenser. Immediately
after eating the delivered food the rat begins to press the
lever fairly rapidly. The behavior has been strengthened or
reinforced by a single consequence. The rat was not
"trying" to do anything when it first touched the
lever and it did not learn from "errors."
To a hungry rat, food is a natural reinforcer,
but the reinforcer in this example
is the sound of the food dispenser, which was conditioned
as a reinforcer when it was repeatedly
followed by the delivery of food before the lever was pressed.
In fact, the sound of that one operation of the dispenser
would have had an observable effect even though no food was
delivered on that occasion; when food no longer follows pressing
the lever, the rat eventually stops pressing. The behavior
is said to have been extinguished.
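The strengthening and extinction described above can be sketched as a toy simulation. This is only an illustration under stated assumptions: the linear update rule, the function names, and the `gain` and `decay` values are hypothetical and do not come from the experiment. The gain is set high so that a single reinforced press produces a large change, as in the example, while extinction proceeds gradually.

```python
def update_strength(strength, reinforced, gain=0.8, decay=0.1):
    """Return the new response strength (0 to 1) after one lever press.

    The update rule and the gain/decay values are illustrative
    assumptions, not measured quantities.
    """
    if reinforced:
        # A reinforced press moves strength a fraction of the way toward 1.
        return strength + gain * (1.0 - strength)
    # An unreinforced press shrinks strength by a fixed fraction.
    return strength * (1.0 - decay)


def run_session(strength, presses, reinforced):
    """Apply the same consequence to a run of lever presses."""
    history = []
    for _ in range(presses):
        strength = update_strength(strength, reinforced)
        history.append(strength)
    return history


# Acquisition: every press operates the dispenser.
acquisition = run_session(0.05, presses=20, reinforced=True)
# Extinction: pressing no longer produces the consequence.
extinction = run_session(acquisition[-1], presses=50, reinforced=False)

print(f"after first reinforced press: {acquisition[0]:.2f}")
print(f"after acquisition: {acquisition[-1]:.3f}")
print(f"after extinction:  {extinction[-1]:.3f}")
```

In this sketch the strength jumps after the first reinforced press and then falls steadily once reinforcement stops, which mirrors the qualitative pattern in the text and nothing more.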
A number of studies in the Berkeley
laboratory of Edward Tolman tested
whether reinforcement alone could explain what animals learn. Rats were allowed to explore
a maze in which there were three routes of different lengths
between the starting position and the goal. The rats' behavior
when the maze was blocked implied that they must have some
sort of mental map of the maze. The rats preferred the routes
according to their shortness, so when the maze was blocked
at point A, stopping them using the shortest route, they chose
the second shortest route. When the maze was blocked at point
B, however, the rats did not retrace their steps and use route
2, but rather chose route 3. The rats must have recognized
that block B would stop them from using route 2 by using some
memory of the layout of the maze. Tolman's group also showed that unexpected changes in the quality
of reward could weaken learning even though the animal was
still rewarded.
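The route choices in the maze study can be restated as a small sketch: a rat with a map of the maze picks the shortest route that is still open. The route lengths below are hypothetical (only their ordering matters), and the assumption that block B closes routes 1 and 2 together is inferred from the description of the rats' behavior.

```python
# Hypothetical route lengths; only the ordering (1 < 2 < 3) matters.
ROUTE_LENGTHS = {"route 1": 10, "route 2": 15, "route 3": 25}

# Routes closed by each block. Block A is assumed to close only the
# shortest route; block B is assumed to close routes 1 and 2 together.
BLOCKED_BY = {
    None: set(),
    "A": {"route 1"},
    "B": {"route 1", "route 2"},
}


def choose_route(block):
    """Return the shortest route not closed by the given block."""
    open_routes = [r for r in ROUTE_LENGTHS if r not in BLOCKED_BY[block]]
    return min(open_routes, key=ROUTE_LENGTHS.get)


for block in (None, "A", "B"):
    print(block, "->", choose_route(block))
```

A rat that simply fell back to its next most practiced response would try route 2 after meeting block B; choosing route 3 directly is what suggests it was using knowledge of the layout.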

In 1938 Burrhus
Frederic Skinner published the most influential work on animal behavior of the century,
"The Behavior of Organisms." Skinner provided
a technology that allowed sequences of behavior produced over
a long time to be studied objectively. His Skinner box was
a great improvement on earlier individual learning trials.
Skinner developed the basic concept of operant conditioning. Operant conditioning forms an association between
a behavior and a consequence. It is also called response-stimulus or RS conditioning because
it forms an association between the animal's response (behavior)
and the stimulus that follows (consequence).
The theory of B.F. Skinner
is based upon the idea that learning is a function of change
in overt behavior. Changes in behavior are the result of an
individual's response to events (stimuli) that occur in the
environment. A response, such as defining a word, hitting a ball,
or solving a math problem, produces a consequence.
Principles:
1. Behavior that is positively reinforced will reoccur; intermittent reinforcement is especially effective.
2. Information should be presented in small amounts so that responses can be reinforced (called "shaping").
3. Reinforcements will generalize across similar stimuli, producing secondary conditioning.
Reinforcement is the key
element in Skinner's S-R theory. A
reinforcer is anything that strengthens the desired response.
Positive
reinforcement includes verbal praise, a good grade
or a feeling of increased accomplishment or satisfaction.
The theory also covers negative reinforcement -- any stimulus that results in the increased
frequency of a response when it is withdrawn (different from
aversive stimuli -- punishment -- which result in reduced
responses). Skinner explained drive (motivation) in terms
of deprivation and reinforcement schedules.
Reinforcers may be positive or
negative. A positive reinforcer reinforces
when it is presented; a negative
reinforcer reinforces when it is withdrawn. Negative reinforcement
is not punishment. Reinforcers always strengthen behavior; that is what "reinforced"
means. Punishment is used to suppress behavior. It consists of removing a positive reinforcer or presenting
a negative one. It often
seems to operate by conditioning negative
reinforcers. The punished
person henceforth acts in ways which reduce the threat of
punishment and which are incompatible with, and hence take
the place of, the behavior punished.
Four Possible Consequences
Consequences have to be immediate,
or clearly linked to the behavior. With
verbal humans, we can explain the connection between the consequence
and the behavior, even if they are separated in time. For example, you might tell a friend that you'll
buy dinner for them since they helped you move, or a parent
might explain that the child can't go to summer camp because
of her bad grades. With very young children, humans who don't have
verbal skills, and animals, you can't explain the connection
between the consequence and the behavior. For
the animal, the consequence has to be immediate.
Applying these terms to the
Four Possible Consequences, you get:
Something Good can start or be presented, so behavior increases = Positive Reinforcement (R+)
Something Good can end or be taken away, so behavior decreases = Negative Punishment (P-)
Something Bad can start or be presented, so behavior decreases = Positive Punishment (P+)
Something Bad can end or be taken away, so behavior increases = Negative Reinforcement (R-)
Or:
| | Reinforcement (behavior increases) | Punishment (behavior decreases) |
| Positive (something added) | Positive Reinforcement: something added increases behavior | Positive Punishment: something added decreases behavior |
| Negative (something removed) | Negative Reinforcement: something removed increases behavior | Negative Punishment: something removed decreases behavior |
Technical Terms
The technical term
for "start or be presented" is positive, since it's something that's added to the environment.
The technical term for "end
or be taken away" is negative,
since it's something that's subtracted
from the environment.
Anything that increases a behavior - makes it occur more frequently, makes
it stronger, or makes it more likely to occur - is a reinforcer. Often, a person
will perceive "starting Something Good" or "ending
Something Bad" as something worth pursuing,
and they will repeat the behaviors that seem to cause these
consequences. These consequences will increase the behaviors
that lead to them. These are consequences the animal will
work to attain, so they strengthen the behavior.
Anything that decreases a behavior - makes it occur less frequently, makes
it weaker, or makes it less likely to occur - is a punisher. Often, a person will perceive "ending
Something Good" or "starting
Something Bad" as something worth avoiding, and they
will not repeat the behaviors that seem to cause these consequences.
These consequences will decrease the behaviors that lead to
them.
These definitions are based
on the actual effect of a consequence on the behavior in question: it must
weaken or strengthen the behavior to be defined
as a punishment or a reinforcement. Pleasures
meant as rewards that do not strengthen a behavior are indulgences,
not reinforcement; aversives meant
to weaken a behavior that do not weaken it are abuse, not
punishment.
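Because these definitions depend on the observed effect and not on the intent, they can be written as a small classification rule. The sketch below is illustrative only: the function name, its parameters, and the use of a before and after response rate as the measure of effect are assumptions made for the example.

```python
def classify_consequence(stimulus_change, rate_before, rate_after):
    """Label a consequence by what was done and what the behavior did.

    stimulus_change: "added" or "removed" (positive or negative).
    rate_before, rate_after: how often the behavior occurred before and
    after the consequence (a hypothetical measure of effect).
    Returns None when the behavior did not change, since such an event
    is neither a reinforcer nor a punisher.
    """
    if stimulus_change not in ("added", "removed"):
        raise ValueError("stimulus_change must be 'added' or 'removed'")
    sign = "positive" if stimulus_change == "added" else "negative"
    if rate_after > rate_before:
        return f"{sign} reinforcement"
    if rate_after < rate_before:
        return f"{sign} punishment"
    return None


# A treat is added and the behavior becomes more frequent.
print(classify_consequence("added", 2, 6))
# An unpleasant stimulus is removed and the behavior becomes more frequent.
print(classify_consequence("removed", 2, 6))
# A treat is added but the behavior does not change.
print(classify_consequence("added", 2, 2))
```

Note that the rule never asks whether the stimulus was meant to be pleasant or unpleasant; only the direction of the change in behavior decides the label.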
Skinner's approach emphasized
the function of behavior, employing a deterministic theory
in which there is no free will. He stressed that we must apply
the principles of learning to each organism individually.
In his novel Walden Two, Skinner described a
utopian community that is behaviorally engineered, based on
principles of operant conditioning; a benevolent government
rewards positive, socially appropriate behavior, and all is
well.
According to Skinner, the
motivations that Freud called the drives of the id
are better understood as biological reinforcers
of the environment. The part of the psyche that Freud called
the superego (conscience) is better understood
as the contingencies that society creates and imposes to control
the selfish (individualistic) nature of the individual. For
Skinner, personality traits such as extroversion are just
groups of behavior that have been reinforced.
Behaviorist approaches such
as operant conditioning are important because they forced
personality theorists to become more empirically minded, and
many untestable Freudian assumptions
were discarded.