Welcome back for Lecture 14, where we’re gonna describe some special probability models.
Let’s start by motivating what we’re gonna do with an example.
Suppose you enter a free throw shooting contest
and the idea is to make as many free throws as you can in a given number of attempts.
The next part of the contest is to start from 0 and then shoot until you make the first free throw.
Once you make one, you’re done. Your score in this part of the contest depends on how many shots you took.
Let’s look at what Bernoulli Trials are because this concept is going to be central to the rest of this lecture.
They’re just like coin tosses.
A Bernoulli Trial is a situation in which there are only two possible outcomes.
We generically term these possible outcomes success and failure.
On each trial, the probability of success which we’ll call P is the same.
In the free throw shooting example, each free throw is a Bernoulli Trial because you either make it or you don’t.
Success would be making the free throw; failure would be a miss.
In the rest of this section, Bernoulli Trials are going to be assumed to be carried out independently.
Making the first free-throw does not change the probability that you would make the second one, for instance.
Suppose the probability that you make any given free throw is 0.6,
we’re gonna let Y be a random variable corresponding to the number of free throws you take until you make the first one.
Then we’re gonna build a probability model based on it.
We’re gonna assume that we stop shooting after we make the first free throw.
Let’s first describe the probabilities of the events.
What’s the probability that you make the first free throw?
Well, this corresponds to the probability that Y = 1 and this probability is 0.6.
What is the probability that you make your first one on the second shot?
This means that you missed the first which happens with probability 0.4.
Then you made the second which happens with probability 0.6.
This corresponds to the probability that Y = 2.
Since the trials are independent, that is, since your free throws are independent,
the probability that you miss the first and make the second
is the probability that you miss the first times the probability that you make the second.
So you get 0.4 times 0.6, or 0.24.
Let’s look at the 10% Rule and when we can assume that Bernoulli Trials are independent.
Often, the assumption of independence among Bernoulli Trials is reasonable.
The Free Throw Example is one of these.
Flipping coins is another example.
In theory, these activities can be repeated infinitely many times.
Each trial does not remove one from the population.
This is not always true.
Suppose maybe you’re looking for a specific toy in a cereal box and you know that 20% of the boxes have that toy.
Each time you try to get it, you remove a box from the population.
This changes the probability of getting the box with a toy in it
next time since one has already been removed from the population.
With small populations especially, this is a huge problem.
Instead, we go by the 10% Rule.
Bernoulli trials have to be independent,
but if they’re not, it’s still okay to proceed as long as we randomly sample less than 10% of the population.
Let’s look at a special probability model, the Geometric Model which basically models waiting for success.
What it does is it models the number of Bernoulli Trials we take until we get the first success.
Let’s let P be the probability of success and let X be a random variable corresponding to the trial
on which the first success occurs.
If X = K, then there were K minus 1 failures before the first success.
Therefore, since the Bernoulli Trials are independent,
the probability that X = K is the probability of failure on the first K minus 1,
so 1 minus P to the K minus one power times the probability of success on the last one.
We get that the probability that X = K is P times 1 minus P to the K minus 1 power, for K = 1, 2, etc., all the way up.
For the Geometric Model, the expected value of that random variable is 1 over P
which is 1 over the probability of success
and the standard deviation of a geometric random variable is the square root of 1 minus P over P-squared.
Let’s use the Free Throw Example and try some of these out.
In the Free Throw Example, we started to look at how many free throws were taken until the first one was made.
Earlier we used Y for the number of free throws; let’s just call it X now to correspond with the notation of the model.
Then the probability that X = K is 0.6 times 0.4 to the power of K minus 1, for K = 1, 2, 3, 4, etc.
The probability of making your first free throw on your 6th attempt
is the probability that X = 6 or 0.6 times 0.4 to the fifth or 0.006144.
The number of trials until your first free throw is made
is a geometric random variable with a success probability of P = 0.6.
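If you want to verify these geometric quantities yourself, here’s a quick Python sketch (the helper name `geom_pmf` is just for illustration):

```python
import math

# Geometric model: P(X = k) = p * (1 - p)**(k - 1), for k = 1, 2, ...
def geom_pmf(k, p):
    return p * (1 - p) ** (k - 1)

p = 0.6  # free throw success probability

print(geom_pmf(6, p))               # P(first make on 6th shot) = 0.006144
print(1 / p)                        # expected number of shots ≈ 1.67
print(math.sqrt((1 - p) / p ** 2))  # standard deviation ≈ 1.054
```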
Now, let’s look at the Binomial Model. Now, we’re counting successes.
Sometimes we have a set number of independent Bernoulli Trials which we’ll call N.
What we’re interested in is how many successes we get in those N trials.
For example, we take 5 free throws and we wanna know what the probability is that we make 4 of them.
If Y is the random variable corresponding to the number of free throws made,
then this is an example of a random variable that follows what we call the Binomial Model.
We need to count the number of ways to get a given number of successes.
If we have some number n of Bernoulli Trials each with success probability p,
we wanna know the probability of getting y successes.
The first thing we need to know is how many different ways can we get y successes.
For example, getting 2 heads in 3 flips of a fair coin.
We can get the following arrangements:
we’re gonna have heads, heads, tails; heads, tails, heads; tails, heads, heads.
There are 3 ways to get 2 successes or 2 heads in 3 flips of a coin.
In general, we have this formula for combinations which is counting without writing out all the possibilities.
In general, there are what’s called n choose y ways to get y successes in n trials.
Let’s think about what n choose y means.
Formally, n choose y is n factorial divided by y factorial times n minus y factorial.
That’s what the exclamation point means.
When we talk about factorials, n factorial is n times n minus 1 times n minus 2, all the way down to 2 times 1.
We’re taking n and then multiplying it by the number one less than it,
the number 2 less than it, and all the numbers down to 1.
For example, 4 factorial is 4 times 3 times 2 times 1 or 24.
We can use this to find a number of ways to get two heads in three flips of the coin.
We would have 3 choose 2 which would be 3 factorial divided by 2 factorial times 3 minus 2 factorial
which is 6 divided by 2 or 3.
That matches the number we got by writing out all of the possible arrangements.
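If you’d rather not write out arrangements or grind through factorials by hand, Python’s standard library does the counting for you; a minimal sketch:

```python
import math

print(math.comb(3, 2))    # ways to get 2 heads in 3 flips: 3
print(math.factorial(4))  # 4! = 4 * 3 * 2 * 1 = 24
print(math.comb(10, 8))   # ways to get 8 successes in 10 trials: 45
```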
Let’s look at the Binomial Model and how we use it to find probabilities.
Consider any particular outcome in which, out of n trials, we get y successes.
Since the trials are independent, the probability of this particular outcome
is p to the y times 1 minus p to the n minus y.
Since y successes can happen in n choose y ways,
the probability that Y takes the value little y is n choose little y, times p to the little y,
times 1 minus p to the n minus little y, for little y equals 0, 1, 2, all the way up to the number of trials.
For the Binomial Model, what we have is the expected value of a binomial random variable
is the number of trials times the success probability and the standard deviation of a binomial random variable
is the square root of the number of trials times the success probability times 1 minus the success probability.
In summary, the Binomial Model describes the number of successes
that occur in a given number of independent Bernoulli Trials
where each trial has the same probability of success.
For example, back to the free throws.
Suppose you’re the last one to shoot in a free throw shooting contest, you get 10 shots.
The current leader made 8.
We wanna know: what is the probability that you tie the current leader?
Then, what is the probability that you beat him?
In other words, what’s the probability that you make 9 or 10?
The probability of a tie corresponds to the probability that Y = 8,
which is just 10 choose 8 times 0.6 to the 8th power times 0.4 to the 2nd power, or 0.1209.
Probability that you win is the probability that y = 9 or 10.
Remember that 9 and 10 can’t happen at the same time.
The probability that either one of those happens is just the sum of the probabilities.
We have the probability that y= 9 plus the probability that y = 10.
We get 10 choose 9 times 0.6 to the 9 times 0.4 to the 1,
plus 10 choose 10 times 0.6 to the 10 times 0.4 to the 0.
We get 0.0403 plus 0.0060.
We get 0.0463 as the probability that you beat the current leader.
The expected number of free throws that you’ll make is the number of trials, 10,
times the success probability, 0.6, or 6.
The standard deviation of the number of free throws that you’ll make
is the square root of the number of trials times the success probability
times the failure probability or the square root of 10 times 0.6 times 0.4 or 1.5492.
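All the numbers in this contest example can be checked with a short Python sketch (the helper name `binom_pmf` is just for illustration):

```python
import math

# Binomial model: P(Y = y) = C(n, y) * p**y * (1 - p)**(n - y)
def binom_pmf(y, n, p):
    return math.comb(n, y) * p ** y * (1 - p) ** (n - y)

n, p = 10, 0.6

tie = binom_pmf(8, n, p)                        # P(tie) ≈ 0.1209
win = binom_pmf(9, n, p) + binom_pmf(10, n, p)  # P(win) ≈ 0.046
mean = n * p                                    # expected makes: 6.0
sd = math.sqrt(n * p * (1 - p))                 # standard deviation ≈ 1.5492
```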
What happens when we have a large number of trials?
We saw in the previous example that, if we want the probability of a win,
we have to add a couple of probabilities.
Suppose in the Free Throw Example, the contest consists of you shooting 500 free throws
and you want the probability that you make at least 400 of them.
Then you have Y as a binomial random variable but now there are 500 trials with success probability 0.6.
The probability that Y is at least 400 is probability that Y is equal to 400
plus the probability that Y is equal to 401 plus all of them up to the probability that Y equals 500.
That’s a lot of work.
You have to find 101 binomial probabilities using the formula, the complicated formula under the Binomial Model.
It’s gonna take you a long time to do.
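If you do have a computer handy, the brute-force sum is at least mechanical; a sketch of that calculation (reusing the same hypothetical `binom_pmf` helper as before):

```python
import math

def binom_pmf(y, n, p):
    return math.comb(n, y) * p ** y * (1 - p) ** (n - y)

# P(Y >= 400) when n = 500, p = 0.6: sum the 101 binomial probabilities
prob = sum(binom_pmf(k, 500, 0.6) for k in range(400, 501))
print(prob)  # astronomically small: 400 is roughly 9 standard deviations above the mean of 300
```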
Fortunately, we have a way around this using the normal model
in some situations that it makes our lives so much easier.
Let’s look at the Normal Approximation to the binomial distribution and when it can be used.
Suppose Y is a binomial random variable with N trials and P as the success probability
then recall that the expected value of Y is N times P
and the standard deviation of Y is the square root of N times P times 1 minus P.
We can use these as the mean and the standard deviation of the normal distribution.
When can we do this?
Basically, whenever the Success/Failure Condition is satisfied.
The Binomial Model can be approximated with the normal distribution
if we expect at least 10 successes and at least 10 failures.
In other words, if N times P is at least 10 and N times 1 minus P is at least 10.
This is the Success/Failure Condition. How do we use the Normal Approximation?
Well, if the Success/Failure Condition is satisfied,
then the random variable Y can be modeled as a normal random variable
with mean NP and standard deviation square root of N times P times 1 minus P.
Let’s revisit the Free Throw Question.
Suppose that we take 50 free throws and we want the probability of making at least 40 of them.
In other words, we want the probability that Y is at least 40.
If we were to calculate this using the Binomial Model, we would get the probability of 0.0022.
Let’s try this using the Normal Approximation.
All right, so if we’re going to use the Normal Approximation,
then the first thing we need to do is to check the Success/Failure Condition.
We have N equals 50, P equals 0.6, so N times P is 50 times 0.6 or 30 which is bigger than 10, so we’re good there.
N times 1 minus P is 50 times 1 minus 0.6 which is 20 which is also bigger than 10, so we’re good there.
The Success/Failure Condition is satisfied,
so Y is approximately normal with mean NP = 30 and standard deviation,
square root of N times P times one minus P or 3.4641.
Using the Normal Approximation then, the probability that Y is bigger than or equal to 40
is equal to the probability that a Z random variable or normal zero-one random variable
is at least 40 minus 30 divided by 3.4641 or the probability that the normal zero-one random variable
is at least 2.89 or 1 minus the probability that the normal zero-one random variable
is less than or equal to 2.89 which gives us a probability of 0.0019.
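This whole calculation, including the Success/Failure check, can be sketched in Python using the standard library’s error function to get the normal CDF:

```python
import math

# Standard normal CDF via the error function
def norm_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, p = 50, 0.6
assert n * p >= 10 and n * (1 - p) >= 10  # Success/Failure Condition

mu = n * p                          # mean: 30
sigma = math.sqrt(n * p * (1 - p))  # standard deviation ≈ 3.4641
z = (40 - mu) / sigma               # ≈ 2.89
print(1 - norm_cdf(z))              # P(Y >= 40) ≈ 0.0019
```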
We see that the Normal Approximation gives a value that’s pretty close to what comes from the Binomial Model.
This approximation improves with more trials and when the success probability is close to 0.5.
There is an issue with the Normal Approximation,
however, especially with describing discrete random variables with Continuous Models in general.
The one big issue with the Normal Approximation is that we can’t use it to find the probability of a particular number of successes.
For instance, we can’t use it to determine the probability that we make exactly 40 free throws.
This happens because continuous random variables can take any value in a particular range,
so the probability that one lands on any single exact value is zero.
The approximation can only be used to find the probability of getting a number of successes in a particular range.
Now, let’s look at Statistical Significance.
Is what we saw strange? Suppose you make all 10 free throws.
This is more than what you would expect but how unexpected is it really?
Well, you believe that your free throw percentage is 60%.
If this is actually the case, then the probability that you would make all 10
is only 10 choose 10 times 0.6 to the 10 times 0.4 to the zero or 0.006.
Making all 10 is very rare if, in fact, your free throw percentage is actually 60%.
This means that this result is extremely unlikely to happen by chance.
This gives evidence that your free throw percentage may, in fact, be higher than 60%.
This result is termed to be statistically significant because it’s not reasonable to assume that it just happened by chance.
We can’t assume that it just happened by chance because it’s so rare if the free throw percentage is actually 60%.
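That headline number is a one-line check in Python:

```python
# If your true free throw percentage is really 60%,
# the chance of making all 10 is 10 choose 10 * 0.6**10 * 0.4**0
p_all_ten = 0.6 ** 10
print(round(p_all_ten, 4))  # ≈ 0.006
```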
What can go wrong?
A lot of issues can come about with the use of probability models.
Let’s look at some of the pitfalls we want to avoid.
If you’re going to use a Geometric or a Binomial Model, make sure you have Bernoulli Trials.
Results coming out of such models are meaningless if you’re not using Bernoulli Trials.
Don’t use the Normal Approximation with small numbers of trials.
It won’t work very well especially if the success probability is far away from 0.5.
Finally, do not confuse Geometric and Binomial Models.
Geometric Models count the number of trials until one success occurs.
The Binomial Model counts the number of successes in a given number of independent Bernoulli Trials.
What we’ve done here is we’ve described what probability models are and we’ve talked about a couple of them for discrete random variables.
We talked about the Geometric Model. We talked about the Binomial Model.
We described how to approximate the Binomial Model with the normal distribution.
Then we closed with the discussion of Statistical Significance and issues that can arise in the use of probability models.
Congratulations! You’ve made it through the first Statistics course and now you know a lot more about Linear Regression and Probability.
I hope that you’ve enjoyed the course. I hope that you’ve learned a lot.
I hope that you’ll consider continuing to build your knowledge of Statistics
by taking the second course which deals with Statistical Inference and Data Analysis.