Misclassification Bias

by Raywat Deonandan, PhD


    00:00 One of the most important forms of information bias is misclassification bias, which arises when the records themselves are wrong. Many epidemiological investigations rely on medical records: we look back to see how diagnoses were made, or we consult government registries to estimate the prevalence of various diseases, and sometimes those records are simply wrong. Misclassification occurs when some people who have the disease are labelled as not having it, or vice versa. As an example, let's say we're trying to compute the prevalence of menopause in some population, but some records say that someone is menopausal based on her age alone, while others base it on whether her menses have ceased. Two different definitions of the same condition are being used, and that inconsistency is a misclassification problem. A more famous example is the arrival of a more accurate diagnosis for AIDS, which relies on testing for HIV. Before that test was available, we used a clinical definition of AIDS, something called the Bangui definition: a checklist of symptoms was used to determine whether a patient was likely suffering from AIDS. When the HIV test came into play, the earlier definitions of AIDS no longer applied. If you look back through the historical prevalence data, the prevalence apparently shifts dramatically around the mid-1980s. That wasn't because something dramatic happened to the disease itself; it was just that the classification of the disease had changed. That's an information bias, a misclassification bias.
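
To make the effect of a changed case definition concrete, here is a minimal sketch in Python. It assumes each definition can be summarised by a sensitivity and a specificity, and every number in it is hypothetical, chosen only for illustration; none comes from the lecture or from real AIDS surveillance data. The function name `apparent_prevalence` is likewise an invented helper.

```python
def apparent_prevalence(true_prevalence, sensitivity, specificity):
    """Expected fraction of a population labelled positive by an imperfect case definition."""
    true_positives = sensitivity * true_prevalence
    false_positives = (1.0 - specificity) * (1.0 - true_prevalence)
    return true_positives + false_positives


# Hypothetical values: the true prevalence is held fixed, only the definition changes.
true_prev = 0.10
checklist = apparent_prevalence(true_prev, sensitivity=0.60, specificity=0.90)
lab_test = apparent_prevalence(true_prev, sensitivity=0.99, specificity=0.99)

print(f"symptom checklist: {checklist:.3f}")  # 0.150
print(f"laboratory test:   {lab_test:.3f}")   # 0.108
```

Under these assumed values the recorded prevalence moves from 15% to about 11% even though the underlying disease frequency never changed, which is the kind of artificial shift described above.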

    01:48 Misclassification bias can be either differential or non-differential. One has to do with directionality and the other with a lack of directionality. Let's work through some examples and see if you can tell what I'm talking about. Let's say we have a study that is attempting to measure whether mothers of malformed babies had more infections during pregnancy than did mothers of normal babies. To conduct this research, we find some mothers who had malformed babies and some who had normal babies, and we ask them about their experiences around the time they gave birth in the hospital, including whether they had any infections around that time. Women with malformed babies tend to have more problematic pregnancies in general and therefore more doctor contact, so they are more likely to remember infections than those without malformed babies. Their better recollection of infection gives us an artificial sense of the relationship between infection around the time of birth and the likelihood of having a malformed baby. So in differential misclassification bias, errors tend to be in one direction. Let's look at another example. We are using a blood pressure cuff to take measurements of both adults and children. Let's say we're measuring the association between blood pressure and intelligence, or any outcome you care about; it doesn't really matter, because it's the measurement we care about here. As you probably know, children have smaller arms than adults, so one blood pressure cuff will behave differently in the two populations: amongst the children it's going to underestimate blood pressure, while amongst the adults it will probably measure it appropriately. This will give us a sense that the children have lower blood pressures than the adults do. That's differential bias: one group's measurements are systematically off from the other group's, in one direction only.
    Now for non-differential misclassification bias, the errors can be random, or at the very least not consistently in one direction or another. Let's take the same blood pressure example, but instead of using the cuff on children and adults, we're using it on two groups of adults, and the cuff is broken. We're going to get random data for both groups, and it doesn't matter which group's readings go in which direction. That is going to bias our results towards the null hypothesis, so we are artificially less likely to find an effect. The thing about misclassification bias is that it's inherent in the data collection methodology and probably avoidable.
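
The contrast between the two kinds of misclassification can be sketched with expected counts in a 2x2 table. This is an illustration under stated assumptions, not an analysis from the lecture: the counts, sensitivities, and specificities are all hypothetical, and `misclassify` and `odds_ratio` are helper names invented here. Cases stand in for mothers of malformed babies, controls for mothers of normal babies, and "exposed" means an infection during pregnancy.

```python
def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected observed (exposed, unexposed) counts after imperfect exposure reporting."""
    obs_exposed = sensitivity * exposed + (1.0 - specificity) * unexposed
    obs_unexposed = (1.0 - sensitivity) * exposed + specificity * unexposed
    return obs_exposed, obs_unexposed


def odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp):
    """Odds ratio for a 2x2 table of exposure by case/control status."""
    return (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)


# Hypothetical true counts: (exposed, unexposed) among cases and among controls.
true_cases = (200, 800)
true_ctrls = (100, 900)
true_or = odds_ratio(*true_cases, *true_ctrls)

# Non-differential: both groups report exposure with the same error rates.
nondiff_or = odds_ratio(
    *misclassify(*true_cases, sensitivity=0.7, specificity=0.9),
    *misclassify(*true_ctrls, sensitivity=0.7, specificity=0.9),
)

# Differential: cases recall exposure much better than controls do.
diff_or = odds_ratio(
    *misclassify(*true_cases, sensitivity=0.95, specificity=1.0),
    *misclassify(*true_ctrls, sensitivity=0.50, specificity=1.0),
)

print(f"true OR:             {true_or:.2f}")
print(f"non-differential OR: {nondiff_or:.2f}")
print(f"differential OR:     {diff_or:.2f}")
```

With these assumed values, the non-differential odds ratio falls between 1 and the true value, that is, it is pulled towards the null, while the differential recall pattern inflates the association. Differential error can also push an estimate the other way, depending on which group is misreported and how.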

    04:29 A really common kind of information bias is recall bias, which we encounter a lot in surveys. Recall bias is when subjects have selective memory of events: they recall things differently. It arises when the response to a study question is influenced by the respondent's memory as well as by their actual opinion, and our memories are notoriously unreliable. Here is an example. In 1995 there was a very famous criminal trial, maybe you remember it: the O.J. Simpson murder trial. It was all over the news and everybody watched it. Let's say you're doing a study in 2005, a random survey asking people whether they thought O.J. was actually guilty, because he was declared not guilty in the trial, and whether they thought the trial was fair. It's entirely possible that people who thought he was truly guilty were more likely to remember the details, because they were more angered by the result than those who thought he was innocent. Their recall is being affected by their opinions about the outcome, and that's recall bias.

    05:36 Similar to recall bias is interviewer bias. Interviewer bias is when the researcher elicits the responses that he or she wants while conducting the interview. For example, you can use your body language, the tone of your voice, or even leaning forward and making a face like this when someone says something interesting. People are susceptible to cues from other individuals, which is why it's important to have well-trained interviewers who know to be objective and not to give away their desires when interviewing people for this sort of research.

    About the Lecture

    The lecture Misclassification Bias by Raywat Deonandan, PhD is from the course Statistical Biases.

    Included Quiz Questions

    1. It skews the results in one direction.
    2. The records for only a certain subset of the sample are recorded incorrectly.
    3. It leads to overestimating the relationship between exposure and outcome.
    4. It leads to underestimating the relationship between exposure and outcome.
    5. The effect of the bias favors the null hypothesis.
    1. Rejecting the null hypothesis when it is true is a likely consequence of non-differential misclassification bias.
    2. Each group or category of variable has the same probability of being misclassified for all study subjects.
    3. The bias is inherent in the data collection methodology.
    4. The bias does not differ between study groups.
    5. If the data is collected correctly then it is avoidable.
    1. Retrospective survey study
    2. Prospective survey study
    3. Case-control study
    4. Randomized control trial
    5. Cohort study
    1. Avoiding questions that violate social norms on survey questionnaires
    2. Randomly assigning subjects to different interviewers
    3. Using a structured process to record observations/responses
    4. Using a computer to communicate with the subject with the interviewer in another room
    5. Training interviewers to use neutral body language and tone of voice
