Operant Conditioning – Associative Learning (PSY, BIO)

by Tarry Ahuja, MD

    Transcript

    00:00 Now, let’s take a look at another scenario. These are some classic experiments done by our friend B.F. Skinner. He created something called the Skinner box: basically a box that housed a rodent, whether a rat or a mouse, which was given a task to do. In there was a little lever, and the lever, if depressed, would potentially deliver a reward, usually some food.

    00:28 Now, a poor little rat was sitting on an electrical grid and that was connected to a shock generator.

    00:36 So yes, you put those two together and we could also shock our friend, so we had sort of a negative reinforcer there as well.

    00:44 And there is a light and a speaker, so we could do a lot of different conditioning, with the light signaling, “When the light goes on, pull the lever and get a treat, or get a shock,” and so on.

    00:56 So a lot of different scenarios that you could set up.

    00:58 So operant conditioning uses reinforcement and punishment to influence behavior and cause associative learning.

    01:05 So now this is how we’re able to showcase how the animal was able to learn something using either reinforcement or punishment.

    01:15 Let’s walk through some of the aspects here.

    01:17 So reinforcement increases the likelihood that the behavior will be repeated; the animal is going to repeat it because it’s happy.

    01:24 So what are we using for reinforcement? The example usually is food, right.

    01:28 So positive reinforcement means “I’m adding something.” Typically in this scenario that would be a food pellet: the rat does something correctly, it’s happy, it was positively reinforced, and we’re adding the food pellet to the mix.

    01:43 Now the alternative is negative reinforcement, and here we’re removing something.

    01:48 So if it does the job correctly, I will no longer shock Mister Rat, and it’s saying, “Thank God that I’m not getting that zap anymore,” and it’s going to do a good job as well and repeat that behavior ’cause it doesn’t want to get shocked, right.

    02:02 So positive reinforcement: adding a food pellet.

    02:06 Negative reinforcement: removing that negative stimulus, the shock, okay.

    02:11 Now, we can have a primary reinforcer, and this is usually something that’s making them feel quite good; it’s innately satisfying or desirable.

    02:21 The examples again are food and pain avoidance, and there's a direct relationship.

    02:26 Now, you can have a secondary or conditioned reinforcer, and this is us now pairing or creating a relationship, so a neutral stimulus is paired with the primary reinforcer.

    02:37 So back to our Pavlov dog experiments: it’s the same thing we did with the unconditioned stimulus and the conditioned stimulus, so the food and the bell.

    02:44 Here we’re talking about roughly the same thing. A primary reinforcer would be food, okay: you did a good job, here’s your food. But you could make this an indirect relationship.

    02:55 So say, for example, I’m thinking of my daughter right now. We’re trying to get her to not wet the bed anymore, so we’ve created a little pee pee chart. I know that we’ve all had those pee pee charts; some of us are still using the pee pee chart.

    03:09 So you have this chart, and the idea of the pee pee chart is if you don’t pee the bed you get a little sticker or a star, right. And what I do for my daughter is I say, “If you get enough stars, you are going to get a trip to Disney, okay.” So now what I've done is I've paired the stars on the pee pee chart to a primary reinforcer, which is the trip.

    03:33 Normally a star is a neutral stimulus; it’s not really doing anything. But in this case, because I paired it with the primary reinforcer, it has now become a secondary reinforcer.

    03:46 She gets enough stars and she goes to Disney and we get to see our girls Anna and Elsa, right, okay.

    03:52 Now, the actual process of operant conditioning is when we rely on this reinforcement to create this learning and to create this change. And you can have different types of reinforcement: you can have something that’s very scheduled or you can have something that’s continuous, right.

    04:12 So you could do something where the reward comes every single time, or you can do something that’s a little intermittent.

    04:17 Now back to the pee pee chart example: if every time she got a star I had to take her to Disney, I’d be bankrupt and I would probably want to strangle Anna and Elsa. So instead what we do is say, “Okay, we’ll have some type of intermittent reward: you have to get a full chart of stars before I deliver the reinforcement,” okay.

    04:38 So we’ll look at a couple of examples in just a sec.

    04:40 So continuous reinforcement results in more rapid acquisition, and also more rapid extinction. What we’re saying here is if I give you a reward every single time, you will very, very quickly acquire this relationship and learn that conditioning: you pull a lever, you get a little pellet; you pull a lever, you get a little pellet. Very quickly you get that relationship, okay.

    04:58 Now, if you look at intermittent reinforcement, the results are slower but you have greater persistence, meaning that it will actually last a lot longer and the behavior will continue even more.

    05:07 Compare that to the first scenario: if every time Mister Rat pulls that lever it gets a food pellet, every single time, then all of a sudden, even if it’s just two times in a row that it pulls the lever and doesn’t get food, it’s going to stop.

    05:20 He's going to say, “Wait a second, every day for the last nine months I pulled that lever and I got my pellet, and I’ve pulled it twice now and I’m getting nothing. I’m out,” okay. Now versus the intermittent reinforcement: the rat’s pulling the lever and maybe every few times he gets a pellet and he eats.

    05:38 Pulls the lever, he eats, and he’ll do that over and over, but because it’s not continuous it will take a lot for the rat to start giving up.

    05:46 It will say, “Wait a second, let me just keep trying ’cause I know it doesn’t come every time,” and it persists and it persists and it persists.

    05:53 Now the most successful way to do it is actually a combo of both.

    05:57 So you start with continuous, meaning a pellet every time, and then you switch over to intermittent, and now it’s getting a pellet every few times.

    06:05 And the combination actually gives you the best success and the longest-lasting conditioned response.
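
    The combined approach just described can be sketched in a few lines of Python. This is only an illustration: the function name `combined_schedule` and its parameters are hypothetical, and the intermittent phase is assumed to be a simple "every n-th response" rule, which the lecture does not specify.

```python
from typing import List


def combined_schedule(n_responses: int, continuous_phase: int, ratio: int) -> List[bool]:
    """Return, for each response, whether it earns a pellet.

    Assumption: the first `continuous_phase` responses are all rewarded
    (continuous reinforcement); after that only every `ratio`-th response
    is rewarded (a simple stand-in for intermittent reinforcement).
    """
    rewards = []
    for i in range(1, n_responses + 1):
        if i <= continuous_phase:
            rewards.append(True)  # continuous phase: pellet every time
        else:
            # intermittent phase: pellet on every `ratio`-th response
            rewards.append((i - continuous_phase) % ratio == 0)
    return rewards


schedule = combined_schedule(n_responses=10, continuous_phase=4, ratio=3)
print(schedule)
# [True, True, True, True, False, False, True, False, False, True]
```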

    06:14 Okay, so what are some of these intermittent reinforcement schedules?

    06:17 How could we break that down if we’re not going to go with continuous? One is the fixed-ratio schedule. Here we’re giving a reinforcement after a set number of instances of the behavior.

    06:27 I know a priori: the experimenter, myself or whoever is doing that actual conditioning, will know that every four lever pushes I will give them a reward; every four, they’ll get a reward.

    06:37 Now as that’s the case, you’ll notice it almost becomes like continuous reinforcement ’cause the animal or the individual very quickly learns four lever presses equals one pellet.

    06:46 So whether you’re doing it one to one or four to one, it’s still easy to understand that relationship, okay.
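
    As a minimal sketch of the fixed-ratio rule (every fourth lever press earns a pellet), the counter below may help. The class name `FixedRatioSchedule` and its method are made up for illustration and are not part of the lecture.

```python
class FixedRatioSchedule:
    """Reward every `ratio`-th response (hypothetical helper for illustration)."""

    def __init__(self, ratio: int) -> None:
        if ratio < 1:
            raise ValueError("ratio must be at least 1")
        self.ratio = ratio
        self.count = 0  # responses since the last reward

    def respond(self) -> bool:
        """Record one lever press; return True if this press earns a pellet."""
        self.count += 1
        if self.count == self.ratio:
            self.count = 0
            return True
        return False


fr4 = FixedRatioSchedule(ratio=4)
presses = [fr4.respond() for _ in range(8)]
print(presses)
# [False, False, False, True, False, False, False, True]
```

    With `ratio=1` the same class rewards every press, which is the "one to one" continuous case mentioned above.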

    06:52 Now the Variable-ratio is where you get reinforcement after an unpredictable number of instances of behavior.

    06:58 A perfect human example would be gambling.

    07:00 How many times do you go and put your money in the slot and pull the crank, and you’re waiting for cherry, cherry, cherry, but you always keep getting cherry, bell, bell, right. But you keep going, and all it takes is that one time: you hear that “ding, ding, ding, ding” and you get a shower of all these coins and you’re the happiest person on the planet and you won.

    07:17 And you’re getting the rewards, you get the positive reinforcement.

    07:20 Now, are you going to go back and do it again? Of course you are, right.

    07:23 You’re gambling and you’re having a great time and you think you’re going to win every time, right, but you don’t.

    07:28 Now that unpredictability keeps you coming back for more and more.
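
    The unpredictability of a variable-ratio schedule can be sketched as follows. This is an illustration under stated assumptions: the class name `VariableRatioSchedule` is hypothetical, and the number of responses required for each reward is assumed to be drawn uniformly from 1 to `2 * mean_ratio - 1`, which is only one of many ways to get an unpredictable requirement with a given average.

```python
import random
from typing import Optional


class VariableRatioSchedule:
    """Reward after an unpredictable number of responses (illustrative only)."""

    def __init__(self, mean_ratio: int, seed: Optional[int] = None) -> None:
        if mean_ratio < 1:
            raise ValueError("mean_ratio must be at least 1")
        self.mean_ratio = mean_ratio
        self.rng = random.Random(seed)
        self.count = 0                 # responses since the last reward
        self.required = self._draw()   # responses needed for the next reward

    def _draw(self) -> int:
        # Assumption: uniform on 1 .. 2*mean_ratio - 1, so the average
        # requirement equals mean_ratio.
        return self.rng.randint(1, 2 * self.mean_ratio - 1)

    def respond(self) -> bool:
        """Record one response; return True if it pays out."""
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


slot_machine = VariableRatioSchedule(mean_ratio=5, seed=0)
pulls = [slot_machine.respond() for _ in range(1000)]
print("payouts in 1000 pulls:", sum(pulls))
```

    The player cannot tell from the last payout when the next one will come, which is the property the slot-machine example relies on.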

    07:32 Then there are fixed-interval schedules. Now what we’re doing is providing reinforcement after a set period of time, okay.

    07:40 So that’s different from the fixed-ratio schedule, where we were talking about number of instances.

    07:46 Now we’re talking about time. So now we’re saying I’m going to let him hit the lever, or I’m going to let them pull that crank of the gambling machine, but every five minutes they’ll get a reward, okay.

    07:55 And it’s the same idea with the variable-interval schedule: I’m going to give them a reward over a period of time, but that time is not always going to be five minutes, it’s going to be inconsistent: five minutes this time, then maybe I’ll do it again after two minutes, maybe I’ll do it again at ten minutes.

    08:10 That time is inconsistent. So if we sum that up, we can say the ratio schedules are based on number of instances and the interval schedules are based on time, okay, so you should be able to tell the difference.
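
    The two interval schedules can be sketched with one small class. This is an illustration, not the lecture's own formulation: the name `IntervalSchedule` and the `jitter` parameter are hypothetical, time is passed in as a plain number (minutes here), and it is assumed that the first response made after the waiting time has elapsed is the one that gets rewarded.

```python
import random
from typing import Optional


class IntervalSchedule:
    """Reward the first response made once the current waiting time has passed.

    jitter == 0 gives a fixed-interval schedule; jitter > 0 draws each wait
    uniformly from interval - jitter .. interval + jitter (variable-interval).
    """

    def __init__(self, interval: float, jitter: float = 0.0,
                 seed: Optional[int] = None) -> None:
        if interval <= 0 or not 0 <= jitter < interval:
            raise ValueError("need interval > 0 and 0 <= jitter < interval")
        self.interval = interval
        self.jitter = jitter
        self.rng = random.Random(seed)
        # Time at which the next reward becomes available.
        self.next_available = self._draw_wait()

    def _draw_wait(self) -> float:
        return self.interval + self.rng.uniform(-self.jitter, self.jitter)

    def respond(self, now: float) -> bool:
        """Record a response at time `now`; return True if it is rewarded."""
        if now >= self.next_available:
            self.next_available = now + self._draw_wait()
            return True
        return False


# One lever press per minute for 20 minutes on a fixed 5-minute interval.
fixed = IntervalSchedule(interval=5)
fixed_rewards = [t for t in range(0, 21) if fixed.respond(t)]
print(fixed_rewards)  # [5, 10, 15, 20]

# Same pressing pattern for an hour, but the wait varies between 2 and 8 minutes.
variable = IntervalSchedule(interval=5, jitter=3, seed=1)
variable_rewards = [t for t in range(0, 61) if variable.respond(t)]
print(variable_rewards)
```

    Note that extra presses inside the waiting period earn nothing here, which is what separates an interval schedule from a ratio schedule.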

    08:21 Now you can see how all of these link up as ways to condition a response. Whether it’s classical, Pavlovian conditioning or this operant conditioning, at the end of the day you’re pairing different types of stimulus, reward and reinforcement in order to teach a learned, conditioned behavior.

    08:38 So when considering operant conditioning, we understand that not every behavior is actually learned by simply providing reinforcement. Sometimes we actually need to work through the process. So a step-wise approach can be used to achieve a final behavior by reinforcing intermediate behaviors.

    08:54 So let’s take the example that we have here.

    08:56 So the child, the small little baby, pulls himself up, and you’re the parent, and the end goal is for the child to learn how to walk. We can't reinforce them right there on that first task and get right to walking; we actually support and reinforce the intermediate steps.

    09:15 So when the child pulls themselves up you stand there and you kind of say, “Good job, we’re so happy. You’re amazing, you’re a genius,” and the baby falls and bangs her face.

    09:23 And then they try it again and they get better, and they get better, then they stand up while using some support, and again you’re there to help them and reinforce, and then they do that with no support, and then they get to walking.

    09:35 So the reinforcement along the way is what actually allowed you to get to the end goal of walking. So in that case, in terms of operant conditioning, the reinforcement again went to the intermediate steps, the intermediate behaviors.

    09:48 Now let’s take a look at the flip side of Punishment.

    09:52 Now punishment is the process by which a behavior is followed by a consequence that decreases the likelihood that that behavior will be repeated.

    09:58 So in plain English, I think we all know what punishment is; we’ve all been punished before. It simply means that you will not do that behavior as much because of what has happened, because of the punishment.

    10:08 Now reinforcement increases the behavior while punishment decreases it.

    10:13 And there are two types I’m going to introduce here.

    10:15 The first is positive punishment, and this is where you deliver or pair a negative stimulus with the behavior, versus negative punishment, where you’re removing a reinforcing stimulus after the behavior has occurred.

    10:29 So an example of positive punishment might be: because of what has happened, I’m going to have you drop down right now and give me 100 push-ups. That’s a punishment, and it’s being paired with the behavior because it’s the result of the behavior.

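    10:43 Now, negative punishment means taking away something the subject wants after the behavior, for example withholding the food pellets. The Skinner box case of the rat sitting in the box being shocked, with the shock removed once the behavior is complete, is different: there the behavior increases, so that is negative reinforcement.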

    10:59 So in one case we’re adding something and in the other case we’re removing something.

    11:03 So reinforcement, we know, has a more long-lasting effect than punishment and is better at causing behavioral change, which is why a lot of times you hear, in terms of child development, that it’s a little bit better to be more positive with the child and reinforce as opposed to punish.

    11:19 We see that even with rehabilitation and punishment, so prisons versus rehabilitation: we know that when you reinforce, we see better improvement and behavioral change than with punishment.

    11:31 So it’s kind of summarized here in this schematic. We have on the top whether you’re adding a stimulus or removing a stimulus, and then whether we’re increasing or decreasing the behavior. We’re looking at positive reinforcement and negative reinforcement as two vehicles with which to increase behavior, and we have positive punishment and negative punishment as things that decrease behavior.
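
    The two-by-two schematic can be written out as a small lookup. The function name `classify_consequence` and its string arguments are hypothetical labels chosen for this sketch; the examples in the comments are the ones used in the lecture.

```python
def classify_consequence(stimulus_change: str, behavior_change: str) -> str:
    """Name the operant-conditioning quadrant.

    stimulus_change: "add" or "remove" (what happens to the stimulus)
    behavior_change: "increase" or "decrease" (effect on the behavior)
    """
    if stimulus_change not in ("add", "remove"):
        raise ValueError("stimulus_change must be 'add' or 'remove'")
    if behavior_change not in ("increase", "decrease"):
        raise ValueError("behavior_change must be 'increase' or 'decrease'")
    # "Positive" means something is added; "negative" means something is removed.
    sign = "positive" if stimulus_change == "add" else "negative"
    # Reinforcement increases the behavior; punishment decreases it.
    kind = "reinforcement" if behavior_change == "increase" else "punishment"
    return f"{sign} {kind}"


print(classify_consequence("add", "increase"))     # positive reinforcement (food pellet)
print(classify_consequence("remove", "increase"))  # negative reinforcement (shock stops)
print(classify_consequence("add", "decrease"))     # positive punishment (push-ups)
print(classify_consequence("remove", "decrease"))  # negative punishment (reward taken away)
```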

    11:52 So let’s take a look at Escape and Avoidance Learning.

    11:54 So these are two types of operant learning, so we’re learning something here but we’re using two different means by which to do it.

    12:00 In escape, we learn how to get away from an aversive stimulus that is already present by engaging in a particular behavior.

    12:05 So say your mom says, “You need to eat this cut-up fruit, it’s healthy for you,” and the child does not want this fruit. What does the child do? The child could just shut up and take it, but that’s typically not what happens. Instead the child has a meltdown, “I don’t want these grapes,” starts throwing the grapes, starts freaking out, and so the mother eventually says, “Fine, enough with the grapes. Don’t worry about the grapes,” and the child just realizes, “Hey, big mama wants me to eat these grapes, I don’t want to eat the grapes. I had a meltdown and I've got out of eating the grapes.”

    12:37 And so now they have escaped having to eat these grapes by engaging in that behavior of a meltdown.

    12:48 Now in avoidance, we perform a behavior to ensure the aversive stimulus is not even encountered.

    12:52 So say for example, there's something going on at school so in this image we have a child who’s not feeling well, right.

    13:01 So what the child has realized is, “Well hey, I go to school and there's something there that I don’t want to do, like write an exam. Why don’t I just avoid that whole interaction of having to do an exam by saying that I’m sick?” So we’ve completely avoided having to deal with that interaction of writing an exam. You have an indirect act or behavior to avoid it: faking being sick gets you out of having to go into school and even engage or interact with that possible aversive stimulus.

    13:36 So with escape versus avoidance, the end result is that you don’t have to engage with that stimulus.

    13:43 One is hands-on and direct, by freaking out, and the other one is indirect, by saying “I’m sick” and not having to engage at all.


    About the Lecture

    The lecture Operant Conditioning – Associative Learning (PSY, BIO) by Tarry Ahuja, MD is from the course Attitude and Behavior Change.


    Included Quiz Questions

    1. B.F. Skinner.
    2. Ivan Pavlov.
    3. Rescorla–Wagner.
    4. Edward Thorndike.
    5. Joseph LeDoux.
    1. It uses rewards and punishments to cause associative learning.
    2. It uses laws and sanctions to influence behavior.
    3. It conditions a neutral stimulus to an innate response.
    4. It relies on repeated exposure to a stimulus to cause a decreasing response.
    5. It relies on repeated exposure to a stimulus to cause an increasing response.
    1. Increased likelihood of repetition of a behavior.
    2. Removing a negative stimulus increases the likelihood of repetition of a behavior.
    3. Removing a negative stimulus decreases the likelihood of repetition of a behavior.
    4. It is based on frequency of stimulus.
    5. Adding a positive stimulus decreases the likelihood of repetition of a behavior.
    1. Avoiding pain.
    2. Bell.
    3. Tokens.
    4. Gold stars.
    5. Academic grades.
    1. Combination reinforcement.
    2. Continuous reinforcement.
    3. Fixed ratio reinforcement.
    4. Variable ratio reinforcement.
    5. Variable interval reinforcement.
    1. Variable interval schedule.
    2. Variable ratio schedule.
    3. Combination reinforcement.
    4. Continuous reinforcement.
    5. Fixed interval schedule.

    Author of lecture Operant Conditioning – Associative Learning (PSY, BIO)

     Tarry Ahuja, MD


