# Type I and Type II Errors – Data

by Raywat Deonandan, PhD

Transcript

00:01 I set that, but historically we set it somewhere useful: at about 0.05. To explore this a bit further, let's work through an analogy. The analogy is a courtroom. I love courtroom analogies because they are always dramatic; courtrooms are all about drama and tension.

00:13 So imagine a man is on trial for murder, and the jury is going to find him either guilty or not guilty. They'll never find him innocent, and that's kind of important statistically: not guilty is the best he can hope for. Now, regardless of the jury's verdict, he is either guilty or innocent in real life, but the jury can only find him guilty or not guilty. So the jury can make one of two types of error. Let's think about this as a statistical test now. What is the null hypothesis that the jury is going to assume? It's that the defendant is not guilty; that's the default state. We always assume the defendant is not guilty coming into the process. The jury is now going to assess the evidence to determine whether or not to reject this assumption that he is not guilty, so they're looking for evidence for the alternative hypothesis, which is that he is guilty. So there are four possible outcomes here. Think about it as a grid: if the man truly is innocent and the jury finds him not guilty, that was the right decision.

01:20 So the jury fails to reject the null hypothesis. The null hypothesis says the man is not guilty, they fail to reject it, and the man is innocent; that's great. But if the man actually is guilty and they found him not guilty, they have made a mistake: they have incorrectly failed to reject the null hypothesis. We call this a type II error. On the other hand, let's say the jury does reject the null hypothesis: they find him guilty, but he really is innocent. That's a type I error. Or they reject the null hypothesis, i.e. they find him guilty, and he is guilty; that's a right decision. So again, out of our four possibilities, two are right decisions and two are wrong decisions. The two wrong decisions were: he truly is innocent and they found him guilty, or he truly is guilty and they found him not guilty. We call these type I and type II errors in statistics. Now think about this: which one are you more offended by, a guilty man going free or an innocent man going to prison? Many societies are more offended by an innocent man going to prison, which is a type I error, so they try to reduce the possibility of type I error as much as possible. We do the same thing in statistics: we try to reduce the chance of type I error as much as possible. So the conviction of an innocent man is like incorrectly rejecting the null hypothesis. In the case of epidemiological tests, maybe we're doing a clinical trial: the drug doesn't really work but we think it did, so we have committed a type I error. On the other hand, acquitting a guilty person is like incorrectly failing to reject the null. In medical research, that's like a drug that actually does work, but we didn't find that it worked; we thought it doesn't work, so we don't release it to market. That's a type II error.
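The four-cell grid described above can be written out as a small sketch. The function name `classify_outcome` and its two boolean arguments are illustrative assumptions, not something from the lecture; `null_is_true` stands for "the defendant really is innocent" and `reject_null` stands for "the jury finds him guilty".

```python
def classify_outcome(null_is_true: bool, reject_null: bool) -> str:
    """Label one cell of the truth-versus-decision grid."""
    if reject_null:
        # Jury convicts: wrong only if the defendant is actually innocent.
        return "type I error" if null_is_true else "correct decision"
    # Jury acquits: wrong only if the defendant is actually guilty.
    return "correct decision" if null_is_true else "type II error"


# Enumerate the four possible outcomes from the courtroom analogy.
for null_is_true in (True, False):
    for reject_null in (False, True):
        truth = "innocent" if null_is_true else "guilty"
        verdict = "guilty" if reject_null else "not guilty"
        print(f"truly {truth:8} | verdict {verdict:10} -> "
              f"{classify_outcome(null_is_true, reject_null)}")
```

Two of the four printed rows are correct decisions and two are errors, matching the count in the transcript.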

03:16 So type I and type II error are kind of in balance: for a given amount of evidence, as we improve one, the other gets worse, and vice versa. So we have to decide which one we care about most and, as I mentioned, most societies in a criminal courtroom environment choose to reduce the chance of type I error the most. That means conceivably several criminals go free, but it also means that it's not very likely that an innocent man goes to jail.
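To make the balance concrete, here is a minimal sketch that computes the type II error rate (beta) for several type I error limits (alpha). It assumes a one-sided z-test with known unit variance, and the effect size of 0.5 and sample size of 20 are arbitrary illustrative values; none of these specifics come from the lecture.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error_rate(alpha: float, effect_size: float, n: int) -> float:
    """Beta for a one-sided z-test with known unit variance (an assumed setup)."""
    std_normal = NormalDist()
    # Reject the null when the z statistic exceeds this cutoff.
    critical_z = std_normal.inv_cdf(1 - alpha)
    # Under the alternative, the z statistic is centred on effect_size * sqrt(n),
    # so beta is the probability that it still falls below the cutoff.
    return std_normal.cdf(critical_z - effect_size * sqrt(n))


# Tightening alpha with the sample size held fixed pushes beta up.
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error_rate(alpha, effect_size=0.5, n=20)
    print(f"alpha = {alpha:.2f} -> beta = {beta:.3f}")
```

In this setup the trade-off only holds while the sample size stays fixed: calling `type_ii_error_rate` with a larger `n` lowers beta without changing alpha.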

03:41 So we do the same thing in medical research. We make it as unlikely as possible that a drug that doesn't work makes it to market, and we set our type I and type II error limits accordingly. We tend to set our type I error limit at around 5% or 1%; those are the two most common numbers. If you look up any medical study in the journals, you will probably find they have done the same thing. 5% is the most common. We call the type I error limit alpha. We're looking for a p-value that's less than alpha: if our p-value is less than alpha, we say that we can reject the null hypothesis and find statistical significance.
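The decision rule "reject when the p-value is less than alpha" can be checked with a small simulation. This is a sketch under stated assumptions: the data are drawn from a standard normal distribution so that the null hypothesis (mean 0) is true in every trial, the test is a two-sided z-test with known variance, and the helper names, trial count, sample size, and seed are all illustrative choices.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

ALPHA = 0.05  # the conventional type I error limit mentioned in the lecture


def two_sided_p_value(sample, null_mean=0.0, sigma=1.0):
    """P-value of a two-sided z-test with known sigma (an assumed setup)."""
    z = (fmean(sample) - null_mean) / (sigma / sqrt(len(sample)))
    return 2 * (1 - NormalDist().cdf(abs(z)))


def is_significant(p_value, alpha=ALPHA):
    """Reject the null hypothesis only when p is strictly below alpha."""
    return p_value < alpha


def simulated_type_i_rate(trials=5000, n=30, seed=0):
    """Share of trials that reject the null even though the null is true."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if is_significant(two_sided_p_value(sample)):
            false_positives += 1
    return false_positives / trials


print(f"share of true nulls rejected: {simulated_type_i_rate():.3f}")
```

Because every simulated rejection here is a false positive, the printed share should land close to alpha, which is what it means to set the type I error limit at 5%.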

04:22 So what have we learned in this lecture? A variable can be both a concept and an operation.

04:28 It has two faces. When you do statistics or mathematics on a variable, never forget that it represents a philosophical construct; we sometimes lose sight of this. Different types of measurements allow different types of mathematics. Make sure you know which type of measurement your variables correspond to, so you know what kind of math to do. Frequency distributions can be expressed as a histogram. We also learned about the normal curve, which is a special kind of histogram, and the central limit theorem, which describes the way in which the sample averages of almost any human characteristic can eventually be expressed as a normal curve. From the normal curve, we've learned about type I and type II errors, and that's how we define statistical significance. So what have we learned? We've learned about the two faces of a variable,

The lecture Type I and Type II Errors – Data by Raywat Deonandan, PhD is from the course Data.

### Included Quiz Questions

1. Rate of type I error
2. Type II error
3. Type I error
4. Rate of the type II error
5. Random error
