Now, let’s take a look at another scenario.
These are some classic experiments
done by our friend B.F. Skinner.
He created something called the Skinner Box,
and basically it was a box that housed a rodent,
so either a rat or a mouse,
and it was given a task to do.
In there was a little lever,
and the lever, if depressed,
would potentially deliver a reward,
and that was usually some food.
Now, the poor little rat was
sitting on an electrical grid
and that was connected
to a shock generator.
So yes, you put those two together
and we could also shock our friend,
and so we had sort of a negative
reinforcer there as well.
And there was a light
and a speaker,
so we could do a lot of different
conditioning, with the light saying,
“When the light goes on, pull the lever
and get a treat, or get a shock,” and so on.
So there are a lot of different scenarios
that you could set up.
So Operant Conditioning uses reinforcement
and punishment to influence behavior
and cause Associative Learning.
So now this is how we’re
able to showcase
how the animal was able to learn something
using either reinforcement or punishment.
Let’s walk through
some of the aspects here.
So Reinforcement: this will
increase the likelihood
that the behavior will be repeated, so the animal
is going to repeat it
because it’s happy.
So what are we using?
The example usually is food, right?
So positive reinforcement means
“I’m adding something.”
Typically in this scenario
it would be a food pellet.
The rat is happy when it does the task:
it was positively reinforced, and we’re adding
the food pellet to the mix.
Now the alternative is negative reinforcement,
and here we’re usually removing something.
So if it does the job correctly, I will no longer
shock Mister Rat, and it’s saying,
“Thank God that I’m not getting
that zap anymore,”
and it’s going to do a good
job as well and repeat that behavior
’cause it doesn’t want to get shocked, right?
So positive reinforcement:
adding a food pellet.
Negative reinforcement: removing that
negative stimulus, the shock, okay?
Now, we can have
a primary reinforcer
and this is usually something
that’s making them feel quite good;
it’s innately satisfying or desirable.
The examples again are food and pain avoidance,
and there’s a direct relationship.
Now, you can have a secondary
or conditioned reinforcer,
and this is us now pairing
or creating a relationship
so a neutral stimulus is paired
with the primary reinforcer.
So back to our Pavlov:
it’s the same thing we did
with the unconditioned stimulus
and the conditioned stimulus, so the food and the bell.
Here we’re sort of talking about
roughly the same thing
so a primary reinforcer
would be food, okay,
so you did a good job,
here’s your food
but you could make this
an indirect relationship.
So say, for example,
I’m thinking of my daughter right now.
We’re trying to get her
to not wet the bed anymore,
so we’ve created a little pee pee chart,
and I know that we’ve all had those pee pee charts;
some of us are still using the pee pee chart.
So you have this chart, and the idea of the pee pee chart
is if you don’t pee the bed
you get a little sticker or a star, right?
And what I do for my daughter is
I say, “If you get enough stars,
you are going to get a trip to Disney, okay?”
So now what I’ve done is
I’ve paired the stars on the pee pee chart
to a primary reinforcer, which is the trip.
So my daughter is now
trying to earn stars. Normally a star
is a neutral stimulus;
it’s not really doing anything,
but in this case, because I paired it
with the primary reinforcer,
it has now become
a secondary reinforcer.
She gets enough stars
and she goes to Disney
and we get to see our girls Anna and Elsa, right? Okay.
Now, the actual process of Operant Conditioning
is when we rely on this reinforcement
to create this learning and to create this change.
And you can have different types of reinforcement:
you can have something that’s very scheduled
or you can have something
that’s continuous, right?
So you could do something
where you reward every single time,
or you can do something
that’s a little intermittent.
Now back to the pee pee chart:
if every time she got a star
I had to take her to Disney,
I’d be bankrupt and I would probably
want to strangle Anna and Elsa.
So instead what we do is say,
“Okay, we’ll have
some type of intermittent
reward: you have to get
a full chart of stars before I deliver the reinforcement, okay?”
So we’ll look at a couple of
examples in just a sec.
So continuous reinforcement results
in more rapid acquisition and more rapid extinction.
So what we’re saying here is if
I give you a reward every single time,
you will very, very quickly
acquire this relationship
and you will learn that conditioning
very, very quickly,
because you pull a lever, you get a little pellet;
you pull a lever, you get a little pellet. Very quickly
you get that relationship, okay?
Now, if you look at intermittent reinforcement,
the results are slower but you have greater persistence,
meaning that it will actually last a lot longer
and the animal will continue the behavior even more.
Compared to the first scenario,
if every time Mister Rat
is pulling that lever it gets
a food pellet every single time,
then all of a sudden, even if
just two times in a row
it pulls the lever and doesn’t get food,
it’s going to stop.
It’s going to say,
“Wait a second,
every day for the last nine months
I pulled that lever and I got my pellet,
and I’ve pulled it twice now and I’m
getting nothing. I’m out, okay?”
Now, versus the intermittent
reinforcement: the rat’s pulling the lever
and maybe every few times it gets a pellet and it eats.
It pulls the lever, it eats,
and it’ll do that over and over,
but because it’s not continuous it will take
a lot for the rat to start giving up.
It will say, “Wait a second, let me just keep trying
’cause I know it doesn’t come every time,”
and it persists and it persists
and it persists.
Now the most successful way to do it
is actually a combo of both.
So you start with continuous,
meaning a pellet every time,
and then you switch over
and now it’s getting
a pellet every few times.
And the combination actually gives you the best success
and the longest-lasting conditioned response.
Okay, so what are some of these
intermittent reinforcement schedules?
How could we break that down if
we’re not going to go with continuous?
One is the Fixed-ratio schedule.
So here we’re giving reinforcement
after a set number
of instances of the behavior.
I know a priori, as the experimenter myself
or whoever is doing that actual conditioning,
that every four lever pushes
I will give them a reward;
every four, they’ll get a reward.
Now as that’s the case, you’ll notice
it almost becomes like continuous reinforcement
’cause the animal or the individual
very quickly learns four lever pushes equals one pellet.
So whether you’re doing it
one to one or four to one,
it’s still easy to understand
that relationship, okay?
Now the Variable-ratio schedule is where
you get reinforcement
after an unpredictable number
of instances of the behavior.
The perfect human example
would be gambling.
How many times do you go and you put
your money in the slot and you pull the crank
and you’re waiting for the cherry, cherry, cherry, but you always
keep getting cherry, bell, bell, right?
But you keep going,
and all it takes is that one time
and you hear that “ding, ding, ding, ding”
and you get a shower
of all these coins
and you’re the happiest person
on the planet and you’ve won.
And you’re getting the rewards,
you get the positive reinforcement.
Now, are you going to go back and do it again?
Of course you are, right?
And you’re having a great time
and you think you’re going to win
every time, right? But you don’t.
Now that unpredictability keeps you
coming back for more and more.
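To make the two ratio schedules concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the names `fixed_ratio` and `variable_ratio` are made up for this example, and drawing the unpredictable count uniformly from 1 to 2 * mean_n - 1 is just one simple way to get an average of mean_n responses per reward.

```python
import random


def fixed_ratio(n):
    """Build a schedule that rewards every n-th response."""
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0  # start counting toward the next reward
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Build a schedule that rewards after an unpredictable number of responses.

    Assumption: the required count is redrawn uniformly from
    1..(2 * mean_n - 1) after every reward, so it averages mean_n.
    """
    remaining = rng.randint(1, 2 * mean_n - 1)

    def respond():
        nonlocal remaining
        remaining -= 1
        if remaining == 0:
            remaining = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return respond


# The rat on a fixed-ratio 4 schedule: every fourth lever press pays off.
fr4 = fixed_ratio(4)
fr_rewards = [fr4() for _ in range(12)]

# The slot machine: similar average payoff, but no way to tell which pull wins.
vr4 = variable_ratio(4, random.Random(0))
vr_rewards = [vr4() for _ in range(1000)]
```

On the fixed-ratio schedule the rewarded presses land at perfectly regular positions, which is why the relationship is so easy to learn. On the variable-ratio schedule the gap between wins keeps changing, and that is the part that keeps the gambler pulling.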
Then there’s the fixed-interval schedule.
Now what we’re doing is we provide reinforcement
after a set period of time, okay?
So that’s different from the fixed-ratio schedule,
where we were talking about number of instances.
Now we’re talking about time. So now we’re saying
I’m going to let them hit the lever
or I’m going to let them pull
that crank of the gambling machine,
but every five minutes,
they’ll get a reward, okay?
And the variable-interval schedule is the same idea:
I’m going to give them a reward
and it’s going to be over a period of time,
but that time is not going to be
five minutes, it’s just going to be inconsistent:
five minutes this time, then maybe
I’ll do it again after two minutes,
maybe I’ll do it again
at ten minutes.
That time is inconsistent. So if we put that
in summary, we can say the ratio schedules
are based on number of instances
and the interval schedules
are based on time, okay? So you
should be able to tell the difference.
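The interval schedules can be sketched the same way in Python. This is a rough illustration, not a definitive model: `IntervalSchedule` and its parameters are hypothetical names, time is simulated in seconds rather than read from a clock, and the sketch assumes the usual lab convention that the reward goes to the first response made after the waiting period has elapsed.

```python
import random


class IntervalSchedule:
    """Reward the first response made once a waiting period has elapsed."""

    def __init__(self, next_wait):
        self.next_wait = next_wait        # callable returning seconds to wait
        self.available_at = next_wait()   # time at which a reward becomes available

    def respond(self, now):
        """Return True if a response at time `now` (in seconds) is rewarded."""
        if now >= self.available_at:
            self.available_at = now + self.next_wait()
            return True
        return False


# Fixed interval: a reward becomes available every five minutes (300 s).
fixed = IntervalSchedule(lambda: 300)
press_times = [0, 100, 299, 300, 301, 599, 600, 900]
fixed_results = [fixed.respond(t) for t in press_times]

# Variable interval: the wait is two, five, or ten minutes, picked at random.
rng = random.Random(1)
variable = IntervalSchedule(lambda: rng.choice([120, 300, 600]))
variable_reward_times = [t for t in range(0, 3601, 10) if variable.respond(t)]
```

Notice that on either interval schedule, pressing the lever more often does not bring the reward any sooner. Only the clock matters, which is the contrast with the ratio schedules.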
Now you can see all these together
are ways to condition a response.
Whether it’s Classical, Pavlovian Conditioning
or we have this Operant Conditioning,
at the end of the day
you’re pairing different types of stimulus
and reward and reinforcement in
order to teach a learned, conditioned behavior.
So when considering operant
conditioning, we understand
that not every behavior is actually learned
by simply providing reinforcement.
So sometimes we actually need to work
through the process. So a step-wise approach
can actually be used to achieve a final
behavior by reinforcing intermediate behaviors.
So let’s take this example
that we have here.
So the child pulls himself up,
the small little baby, and you’re the parent,
and as the child comes up,
the end goal of what you want
is for them to learn how to walk. So we can’t reinforce them
right there on that first task
and get right to walking; we actually support
and reinforce the intermediate steps.
So when the child pulls themselves up
you stand there and you kind of say,
“Good job, we’re so happy.
You’re amazing, you’re a genius,”
and the baby falls
and bangs her face.
And then they try it again
and they get better,
and they get better, then
they stand up while using some support,
and again you’re there
to help them and reinforce,
and then they do that with no support,
and eventually they get to walking.
So the reinforcement along
the way is what actually allowed you
to get to the end goal of walking. So in that case,
in terms of operant conditioning,
the reinforcement again went with
the intermediate steps, with the intermediate behaviors.
Now let’s take a look
at the flip side: Punishment.
Now punishment is the process by which a behavior
is followed by a consequence
that decreases the likelihood
that that behavior will be repeated.
So in plain English, I think we all know what punishment is,
we’ve all been punished before;
it simply means that you will not
do that behavior as much
because of what has happened
because of the punishment.
Now reinforcement increases the behavior
while punishment decreases it.
And there are two types
I’m going to introduce here.
The first is Positive Punishment,
and this is where you deliver or pair
a negative stimulus with the behavior,
versus Negative Punishment,
where you’re removing
a reinforcing stimulus
after the behavior has occurred.
So an example
of positive punishment might be:
because of what has happened,
I’m going to have you drop down right now
and give me 100 push-ups.
So that’s a punishment
and it’s being paired with you doing the behavior
because it’s a result of the behavior.
Now, negative reinforcement
might be the Skinner box
example of the rat sitting
in a box and being shocked,
and after the behavior is complete
the shock is removed.
So in one case we’re adding something
and in the other case we’re removing something.
So reinforcement, we know, has
a more long-lasting effect than punishment
and is better at causing behavioral change,
which is why a lot of times you hear,
in terms of child development,
it’s a little bit better to be more positive
with the child and reinforce
as opposed to punish.
And we see that even with
rehabilitation and punishment,
so prisons versus rehabilitation:
we know that when you reinforce
we see a better improvement
and behavioral change versus punishment.
So it’s kind of summarized here
in this schematic.
We have on the top whether you’re adding
a stimulus or you’re removing a stimulus,
and then whether we’re increasing the behavior
or decreasing the behavior.
We’re looking at positive reinforcement
and negative reinforcement
as two vehicles with which to increase behavior,
and we have positive punishment
and negative punishment
as things that decrease behavior.
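If it helps to see that two-by-two grid written out, here is a small Python sketch of it as a lookup table. The function name `classify_consequence` and the labels "add", "remove", "increase", and "decrease" are made up for this example; the four terms themselves are the ones from the schematic.

```python
def classify_consequence(stimulus_change, behavior_effect):
    """Name the operant conditioning term for one cell of the 2x2 grid.

    stimulus_change: "add" or "remove" (what happens to the stimulus)
    behavior_effect: "increase" or "decrease" (what happens to the behavior)
    """
    grid = {
        ("add", "increase"): "positive reinforcement",
        ("remove", "increase"): "negative reinforcement",
        ("add", "decrease"): "positive punishment",
        ("remove", "decrease"): "negative punishment",
    }
    return grid[(stimulus_change, behavior_effect)]


# Food pellet added, lever pressing goes up.
pellet = classify_consequence("add", "increase")

# Shock removed, lever pressing goes up.
shock_off = classify_consequence("remove", "increase")

# Push-ups added, the unwanted behavior goes down.
push_ups = classify_consequence("add", "decrease")
```

The point of the table is that "positive" and "negative" only describe whether a stimulus is added or removed, while "reinforcement" and "punishment" describe which way the behavior moves.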
So let’s take a look at Escape
and Avoidance Learning.
So these are two types of Operant Learning,
so we’re learning something here
but we’re using two different
means by which to do it.
In Escape, we learn how to
get away from an aversive stimulus
by engaging in a particular behavior.
So say your mom says,
“You need to eat this
cut-up fruit, it’s healthy for you,”
and the child does not want this fruit,
and what does the child do?
The child could just shut up and take it,
but that’s typically not what happens;
instead the child has a meltdown,
“I don’t want these grapes,”
starts throwing the grapes, starts freaking out,
and so the mother eventually says,
“Fine, enough with the grapes.
Don’t worry about the grapes,”
and the child just realizes, “Hey, big
mama wanted me to eat these grapes,
I didn’t want to eat the grapes.
I had a meltdown
and I got out of eating the grapes.”
And so now they have escaped having to
eat these grapes;
they’ve gotten out of eating these grapes
by engaging in that behavior of a meltdown.
Now in avoidance,
we perform a behavior to ensure
the aversive stimulus is
not even encountered.
So say, for example,
there’s something going on at school,
so in this image we have a child
who’s not feeling well, right?
So what the child has realized is,
“Well hey, I go to school
and there’s something there that
I don’t want to do, like write an exam.
Well, why don’t I just avoid that
whole interaction of having to
do an exam by just
saying that I’m sick?”
So we’ve completely avoided having to
deal with that interaction of writing an exam,
and so you have an indirect measure
or indirect act or behavior to avoid that.
So faking that you’re sick gets you out of
having to go into school
and even engage or interact
with that possible aversive stimulus.
So with escape versus avoidance, the end result
is you’re not having to engage
with that stimulus
in the first place.
One is hands-on and direct,
by freaking out,
and the other one is indirect, saying
“I’m sick” and not having to engage.