Dichotomous Variables – Statistics Basics

Name: Dichotomous Variables – Statistics Basics
Uploaded: 2017-06-22
Duration: 6 min 28 s
Description: The lecture Dichotomous Variables – Statistics Basics by Raywat Deonandan, PhD is from the course Statistics: Basics.

by Raywat Deonandan, PhD

Transcript

00:00 they are discrete, they are categorical. Now, when we have a categorical variable with two levels, we call that a dichotomous variable. For example, male versus female has two levels, so it is a dichotomous variable. So is employed versus unemployed, has a disease versus doesn't have a disease, or guilty versus innocent in a courtroom. For an emergency therapy outcome, someone is either alive or deceased; you're never something in between. Why is this important? It is important because a lot of the computations that we will be doing in future lectures depend upon whether or not the exposure or outcome variables are dichotomous.

00:38 Those computations use what we call two-by-two contingency tables, and we'll learn about them in greater detail in future lectures. We can also take a continuous variable and turn it into a dichotomous variable; we call that process dichotomization. For example, let's say I have the ages of six individuals in a study: 23, 17, 14, 35, 68 and 15. I can decide to categorize them into two groups, those who are under 18 and those who are 18 or over. Well, why would I want to do that? It may seem that I'm losing information by going from a continuous realm to a categorical realm. And it's true, I am losing information, so it's typically not advisable to do that. For example, I can compute a mean age or a median age; I can't do that with my age groups anymore. If I have an individual who is 23, I know that that person is 18 or over, but if I only know that someone is 18 or over, I don't know that he is 23. I've lost the ability to extract some nuance when I go to a dichotomous realm. So why would we want to do that? Again, it depends on the context. Can you imagine a scenario in which it would be useful to dichotomize at age 18? You probably can, because 18 is an important age in a variety of places; it's sometimes the age at which you can drink or vote. So maybe I care about whether the individuals in my study are of voting age or not, in which case I would cut them off at 18, and that has meaning. Again, context breeds information.
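The dichotomization described here can be sketched in a few lines of Python. This is a minimal illustration, not part of the lecture: the function name `dichotomize` and the `CUTOFF` constant are assumptions made for the example, while the six ages and the cut point of 18 come from the transcript.

```python
from statistics import mean, median

# The six ages from the lecture example.
ages = [23, 17, 14, 35, 68, 15]
CUTOFF = 18  # "under 18" versus "18 or over"


def dichotomize(values, cutoff):
    """Return True for each value at or above the cutoff, False otherwise."""
    return [value >= cutoff for value in values]


is_adult = dichotomize(ages, CUTOFF)

# With the continuous ages we can still compute a mean and a median.
print("mean age:", mean(ages))
print("median age:", median(ages))

# After dichotomizing, only the group membership is left.
print("18 or over:", sum(is_adult))
print("under 18:", len(is_adult) - sum(is_adult))
```

Once only `is_adult` is kept, the mean and median age can no longer be recovered, which is the loss of information the lecture warns about.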

02:11 It's possible to create categorical variables with more than just two levels from my continuous set. It doesn't have to be dichotomous. For example, from that same set of six individuals I can create three categories: those who are 25 or under, those who are 26 to 50, and those who are over 50. You may notice that in surveys you have participated in, sometimes they ask for your age group. That's what they're doing; they're artificially creating a categorical variable out of a continuous variable. So it's important to think about where these numbers might come from and how to manipulate them in order to learn some wisdom about a larger set of individuals. That brings us to what we call sampling. When we take a sample from a population, what we're doing is trying to get a representative set of individuals, upon which we can perform certain statistical tests that allow us to infer information about that population. Inference is the key word here, because there are two kinds of statistics: descriptive and inferential. With descriptive statistics I'm just describing the people that I have in front of me, six people with six different ages for example.
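The three-level grouping can be sketched the same way. The helper name `age_group` and the group labels are hypothetical, and the sketch assumes a 25-year-old is placed in the youngest group.

```python
from collections import Counter

# The six ages from the lecture example.
ages = [23, 17, 14, 35, 68, 15]


def age_group(age):
    """Map a single age onto one of three illustrative categories."""
    if age <= 25:
        return "25 or under"
    if age <= 50:
        return "26 to 50"
    return "over 50"


groups = [age_group(age) for age in ages]
counts = Counter(groups)

# Describing the six people in front of us is descriptive statistics.
for label in ("25 or under", "26 to 50", "over 50"):
    print(label, counts[label])
```

Counting how many people fall into each group describes only these six individuals; it does not yet say anything about a larger population.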

03:22 With inferential statistics, I'm using that information to learn something about the larger population at hand. So where does this sample come from? It comes from a larger population, sometimes called a reference population. We extract a sample from that larger population, we manipulate that sample with statistics and we learn something about the larger population.

03:44 It's important that that sample be representative. Imagine if we selected a portion of that larger population that was atypical, that did not have the characteristics one typically sees in that larger population. I might draw faulty conclusions about that greater population because I chose poorly. My sample must be representative.
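A small simulation can make the point about representativeness concrete. Everything in it is invented for illustration: the population of 10,000 ages, the sample size of 100, and the choice of "the 100 oldest people" as the atypical sample are all assumptions, not data from the lecture.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical reference population: 10,000 adult ages (invented data).
population = [rng.randint(18, 90) for _ in range(10_000)]

# Representative sample: every individual has the same chance of selection.
random_sample = rng.sample(population, 100)

# Atypical sample: only the 100 oldest individuals in the population.
atypical_sample = sorted(population)[-100:]

population_mean = mean(population)
random_sample_mean = mean(random_sample)
atypical_sample_mean = mean(atypical_sample)

print("population mean age:     ", round(population_mean, 1))
print("random sample mean age:  ", round(random_sample_mean, 1))
print("atypical sample mean age:", round(atypical_sample_mean, 1))
```

Under these assumptions the random sample should land close to the population mean, while the atypical sample overstates it badly, which is the kind of faulty conclusion a poorly chosen sample invites.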

04:07 So let's say we want to do a study in the USA, we're going to measure the prevalence rate of perceived back pain and we're going to do it via telephone survey, which is a very common way to conduct health sciences investigations with populations of this size.

04:23 I'm going to have to take a sample of the American population, ask them about their back pain, and then draw conclusions about the overall American population. It's not feasible to ask all 300 million citizens of the USA about their back pain. I can't afford it and I haven't got their phone numbers, so I have to use a sample. So I'm trying to generalize to all the adults in the USA. Who can I access via telephone survey? Well, only those who have telephones, obviously. How can I access them? Well, I'm going to buy a block of listed numbers from the phone company; this is typically how it is done. And who's in my study? Well, pretty much those who answer the phone and agree to participate. Alright. So all of the adults in the USA, that's the reference population. Who is the accessible population? Well, those with telephones. And what is the sampling frame? Those individuals whose numbers I've purchased. Now, from that frame I'm going to select a bunch to ask to participate, and those who agree are my sample. Now think about this: the sample is where I do my statistics, but I'm drawing conclusions about the reference population. So we end up with a bunch of people with phones and listed numbers who agree to be interviewed, and we are going to conclude from their results some wisdom about the entire American population.
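The narrowing from reference population to sample can be sketched as a chain of filters. The `Adult` record, its field names, and the five example people are hypothetical. The sketch also simplifies one step: it asks everyone in the sampling frame to participate, whereas the lecture selects only some of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adult:
    name: str
    has_phone: bool
    number_listed: bool       # number is in the purchased block
    answers_and_agrees: bool  # picks up and agrees to participate


# Reference population: everyone we want to generalize to (invented data).
reference_population = [
    Adult("A", True, True, True),
    Adult("B", True, True, False),
    Adult("C", True, False, True),
    Adult("D", False, False, True),
    Adult("E", True, True, True),
]

# Accessible population: those who have a telephone.
accessible_population = [p for p in reference_population if p.has_phone]

# Sampling frame: those whose numbers were purchased.
sampling_frame = [p for p in accessible_population if p.number_listed]

# Sample: those who answer and agree to participate.
sample = [p for p in sampling_frame if p.answers_and_agrees]

for label, group in [
    ("reference population", reference_population),
    ("accessible population", accessible_population),
    ("sampling frame", sampling_frame),
    ("sample", sample),
]:
    print(f"{label}: {[p.name for p in group]}")
```

Each filter drops people for a reason unrelated to back pain, yet any of those reasons could be associated with it, which is why the final sample may differ from the reference population.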

05:51 Ask yourself, is this a rational way to go? There is going to be some bias here. The kinds of people who still have a landline are not typical of most people, the kinds of people who are home when you call them are not typical of most people, and the kinds of people who agree to participate in this kind of study probably aren't typical of most people. They are a particular kind of American phone owner, yet their responses are going to allow us to generalize to the greater American population. That's a kind of bias, and we're going to talk about biases in more depth in a later lecture.

Included Quiz Questions

What is meant by the term “dichotomization” in epidemiology?

Conversion of a continuous variable into two groups
Dividing a continuous variable into multiple groups
Sampling a small group of the population to represent a larger sample of the population
The categorization of variables into dependent and independent
"All or none" thinking in dialectical behavioral therapy

Which of the following is a dichotomous variable?

Results of a screening test
Incidence of a disease
Education level
Household income
Number of siblings

A study is developed to examine associations between weight and nutritional status in a sample of American children. Which of the following is the “reference population” for this study?

All children in the USA
Children with obesity in the USA
The children whose data is included in the study
Nutritional status of children in the USA
Vaccination status of children in the USA

A study is developed to examine the effect of social settings on the acceptance of HIV testing. Which of the following represents a “sampling frame” for this study?

List of patients at a medical office
The respondents who fill out the survey
All the people who have HIV in a population
All the people who refused to be tested for HIV in the population
All the people who have been tested for HIV in the population
