Statistical Tests and Data Representation

One of the main objectives of research and medical studies is to learn which associations or outcomes are not a product of chance. Depending on the study's design and the data it provides, a hypothesis can be rejected or not rejected, allowing a determination about the correlation between variables. Statistical tests are tools used by researchers to obtain information and meaning from pools of variable data. These tests come in several forms, including, for example, the chi-square and Fisher exact tests, and are chosen depending on the needs of the investigators and the characteristics of the variables being analyzed. Study results can be considered statistically significant based on calculated p-values and predetermined levels of significance (known as the α-level). Confidence intervals are another way to express the significance of a statistical result without using a p-value.

Last updated: Aug 9, 2022

Editorial responsibility: Stanley Oiseth, Lindsay Jones, Evelin Maza

Introduction

Hypothesis testing is used to assess the plausibility of a hypothesis by analyzing study data. 

For example, a company creates a new Drug X that is intended to treat hypertension. The company wants to know whether Drug X does in fact work to lower BP, so they need to do hypothesis testing.

Steps for testing a hypothesis:

  1. Formulate the hypothesis.
  2. Choose which statistical test you are going to use. 
  3. Set the significance level. 
  4. Calculate the test statistics from your data using the appropriate/chosen test.
  5. Conclusions:
    • A decision is made to reject or not reject the null hypothesis from step 1.
    • This decision is based on the predetermined levels of significance from step 3.
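The five steps above can be sketched in Python. This is only an illustration under stated assumptions: the BP reductions are invented numbers, `permutation_test` is a hypothetical helper name, and a permutation test on the difference in means is used as the statistical test simply because it needs nothing beyond the standard library.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns the observed difference and an estimated p-value.
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Under H0 the group labels are interchangeable, so reshuffle them.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimate strictly above 0.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value

# Step 1: H0 = mean BP reduction is the same with Drug X and placebo.
# Step 2: chosen test = permutation test on the difference in means.
# Step 3: significance level, fixed before looking at the data.
alpha = 0.05

# Step 4: hypothetical reductions in systolic BP (mm Hg), invented for illustration.
drug_x = [12, 9, 15, 11, 8, 14, 10, 13, 9, 12]
placebo = [3, 5, 1, 4, 6, 2, 5, 3, 4, 2]
observed_diff, p_value = permutation_test(drug_x, placebo)

# Step 5: reject H0 only if p < alpha; otherwise fail to reject it.
reject_h0 = p_value < alpha
print(f"difference = {observed_diff:.1f} mm Hg, p = {p_value:.4f}, reject H0: {reject_h0}")
```

Any other test suited to the data could take the place of the permutation test in step 2; the order of the steps, in particular fixing α before computing the test statistic, is the point of the sketch.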

Formulating a Hypothesis

A hypothesis is a preliminary answer to a research question (i.e., a “guess” about what the results will be). There are 2 types of hypotheses: the null hypothesis and the alternative hypothesis.

Null hypothesis

  • The null hypothesis (H0) states that there is no difference between the populations being studied (or put another way, there is no relationship between the variables being tested).
  • Written as a formula, H0: µ1 = µ2, where µ represents the means (or average measurements) of groups 1 and 2, respectively
  • Example: Drug X was created to lower BP. An experiment is designed to test whether Drug X actually lowers BP. Drug X is given to 1 group, while a 2nd group gets a placebo. The null hypothesis would state that Drug X has no effect on BP and that both groups will have the same average BP at the end of the study period.

Alternative hypothesis

  • The alternative hypothesis (H1) states that there is a difference between the populations being studied.
  • Written as a formula, H1: µ1 ≠ µ2  
  • Example: In the experiment described above, the alternative hypothesis is that Drug X lowers BP, and that patients in the study group getting Drug X will have lower BP than patients in the placebo group at the end of the study period.
  • H1 is a statement that researchers think is true.

What is the study really testing?

  • Hypothesis testing on samples can never verify a hypothesis with certainty; it can only indicate how likely the observed data would be if a hypothesis were true.
  • A research study involving hypotheses will either reject or fail to reject the null hypothesis.

Examples

Example 1: rejecting the null hypothesis

In the example above, if the findings of the trial show that Drug X does in fact significantly lower BP (that is, there is sufficient statistical evidence to support it), then the null hypothesis (postulating that there is no difference between the groups) is rejected with a given probability. Note that these findings cannot confirm the alternative hypothesis, but only support it with a given probability, determined by the sampling distribution in the population tested.

Example 2: failing to reject the null hypothesis

In the example above, if the findings of the trial show that Drug X did not significantly lower BP, then the study failed to reject the null hypothesis. Again, note that the findings cannot confirm the null hypothesis; they only show that the data did not provide sufficient evidence against it, given the sampling distribution in the population tested.

Types of errors and power

  • Type I error: 
    • The null hypothesis is true, but is rejected.
    • The chance of committing a type I error is represented as α. 
  • Type II error: 
    • The null hypothesis is false, but is not rejected.
    • The chance of committing a type II error is represented as β.
  • Power: 
    • The probability that a test will correctly reject a false null hypothesis
    • Power = 1 – β 
    • Power depends on:
      • Sample size (e.g., larger sample size → ↑ power) 
      • Size of expected effect (e.g., larger expected effect → ↑ power)
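
The dependence of power on sample size and effect size can be illustrated with a rough calculation. The sketch below assumes a two-sided, two-sample z-test with equal group sizes and a known common standard deviation; the function name `approx_power` and the BP figures are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    Assumes equal group sizes, a known common standard deviation, and
    normally distributed group means. The negligible chance of rejecting
    H0 in the wrong direction is ignored.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    standard_error = sd * sqrt(2 / n_per_group)
    return std_normal.cdf(abs(effect) / standard_error - z_crit)

# Hypothetical scenario: true BP difference of 5 mm Hg, SD of 10 mm Hg.
for n in (20, 50, 100):
    power = approx_power(effect=5, sd=10, n_per_group=n)
    print(f"n = {n:3d} per group: power = {power:.2f}, beta = {1 - power:.2f}")

# A larger expected effect also raises power at a fixed sample size.
print(f"effect = 10 mm Hg, n = 20: power = {approx_power(10, 10, 20):.2f}")
```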

Types of errors

Image by Lecturio.

Determining Statistical Significance

Statistical significance is the idea that an observed result is highly unlikely to have been produced simply by chance. To determine statistical significance, you need to set an α-value and calculate a p-value.

P-values

A graph can be created in which possible study results are plotted on the x-axis and the probability of observing each result is plotted on the y-axis. The area under the curve at and beyond the observed result represents the p-value. 

  • The p-value is the probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true.
    • In other words, the p-value is the probability that you would get this result if there was no relationship between the variables and the results occurred simply by chance. 
    • Like all probabilities, the p-value is between 0 and 1.
  • Higher p-values (larger areas under the curve):
    • Indicate that the observed data are consistent with the null hypothesis
    • Suggest that the data provide little evidence of a relationship between your variables
    • Example: In the example above, a p-value of 0.6 would mean the study provides little evidence that Drug X is associated with lower BP.
  • Lower p-values (smaller areas under the curve):
    • Indicate that the observed data would be unlikely if the null hypothesis were true
    • Suggest that an observed correlation between your variables is unlikely to be due simply to chance and that a true relationship likely exists
    • Example: In the example above, a p-value of 0.02 suggests that Drug X is associated with lower BP.
  • If the p-value is lower than your predetermined level of significance (α-level), you can reject the null hypothesis, because there likely is a real relationship between your variables.
  • The lower the p-value, the more confident you can be that the relationship between your variables is real (and not due to chance).
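
As a minimal sketch of how a p-value corresponds to a tail area, the snippet below converts a test statistic into a two-sided p-value and compares it with α. It assumes the statistic follows a standard normal distribution when the null hypothesis is true (a z-test); the function name `two_sided_p_value` and the example z-values are hypothetical.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Probability of a standard normal value at least as extreme as z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05
for z in (0.5, 1.96, 2.5):
    p = two_sided_p_value(z)
    decision = "reject H0" if p < alpha else "fail to reject H0"
    print(f"z = {z:.2f}: p = {p:.3f} -> {decision}")
```

The further the observed statistic lies from 0, the smaller the remaining tail area and therefore the smaller the p-value.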

Mnemonic:

“If the p is low, the null (hypothesis) must go.”

Graphical representation of the p-value and α-levels:
Note, in this example, that the observed p-value is less than the predetermined level of statistical significance (in this case, α = 0.05, corresponding to a 95% confidence level). This means that the null hypothesis should be rejected because the observed result would be very unlikely if the null hypothesis (that no relationship exists between variables) were true.

Image by Lecturio.

α-level

  • The α-level is a threshold for the p-value that represents an arbitrarily determined “significance level.”
  • The α-level should be chosen prior to conducting a study.
  • By convention, the α-level is typically set at 0.05 or 0.01.
  • The α-level is the risk you are willing to take of making a wrong decision, in which you incorrectly reject the null hypothesis (when it is in fact true).
  • Example: 
    • An α-level of 0.05 means you will conclude that a relationship between your variables exists if the p-value is < 0.05.
    • This means you are willing to accept up to a 5% chance of committing a type I error.
  • In the Drug X BP example, if the p-value was 0.03, then you would conclude that:
    • Drug X is associated with lower BP → this is a rejection of the null hypothesis
    • If the null hypothesis were in fact true (Drug X is not actually associated with lower BP), a result at least this extreme would be expected only about 3% of the time; by rejecting the null hypothesis, you accept the risk of having committed a type I error.
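
The meaning of α as the accepted type I error rate can be checked by simulation. The sketch below is an illustration only: it repeatedly simulates a trial in which the null hypothesis is true (both groups are drawn from the same distribution) and counts how often a two-sided z-test rejects it. The function name, the standard deviation of 10 mm Hg, and the group size are assumptions made for the example.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def simulate_type_i_error_rate(alpha=0.05, n_per_group=30, n_trials=2000, seed=1):
    """Fraction of simulated trials that reject H0 when H0 is actually true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    sd = 10.0  # assumed known standard deviation of BP change (mm Hg)
    standard_error = sd * sqrt(2 / n_per_group)
    rejections = 0
    for _ in range(n_trials):
        # Both groups come from the same distribution, so H0 is true.
        drug = [rng.gauss(0, sd) for _ in range(n_per_group)]
        placebo = [rng.gauss(0, sd) for _ in range(n_per_group)]
        z = (mean(drug) - mean(placebo)) / standard_error
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_trials

rate = simulate_type_i_error_rate()
print(f"False-positive rate at alpha = 0.05: {rate:.3f}")
```

Under these assumptions, the fraction of false rejections should land close to the chosen α, which is what "accepting a 5% chance of a type I error" means in practice.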

Confidence intervals

  • A CI is a range of values, calculated from the sample data, that is likely to contain the true value in the population. 
  • The confidence level for CIs is the probability that the CI contains the true result
    • Most commonly, a 95% confidence level is used (though the confidence level often ranges from 90% to 99%) 
    • A 95% CI is a range of values that is 95% certain to contain the true mean of the population.
    • Like the α-level, the CI confidence level is chosen prior to testing the data.
    • The higher the confidence needed, the larger the interval will be.
  • Example: Researchers want to determine the average height in a population of 1000 men. Heights are measured in a random sample of 50 of these men. 
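
A rough sketch of the height example follows. The heights are simulated rather than measured, the helper name `mean_confidence_interval` is hypothetical, and the interval uses the normal approximation (mean ± z × standard error), which is reasonable for a sample of 50 but ignores refinements such as the t-distribution or a finite-population correction.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for a population mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * stdev(sample) / sqrt(len(sample))
    center = mean(sample)
    return center - margin, center + margin

# Simulated heights (cm) for 50 men; invented values, not study data.
rng = random.Random(42)
heights = [rng.gauss(175, 7) for _ in range(50)]

for level in (0.90, 0.95, 0.99):
    low, high = mean_confidence_interval(heights, level)
    print(f"{level:.0%} CI: {low:.1f} to {high:.1f} cm (width {high - low:.1f})")
```

The output shows the interval widening as the required confidence level increases.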

A 90% confidence interval on a standard normal curve

Image by Lecturio.

Pitfalls in hypothesis testing

  • Do not base your hypothesis on what you see in the data.
  • Do not make your H0 what you want to show to be true.
  • Check the conditions.
  • Do not accept the H0; instead, fail to reject it.
  • Do not confuse practical significance and statistical significance (e.g., with a large enough sample size, you may find that Drug X lowers systolic BP by 2 mm Hg. Even if this is statistically significant, is this clinically significant for your patient?)
  • If you fail to reject the H0, do not assume that a larger sample size will lead to rejection.
  • Be sure to think about whether it is reasonable to assume that events are independent.
  • Do not interpret p-values as the probability that the H0 is true.
  • Even a test carried out perfectly can be wrong.

Statistical Tests

Choosing the right test

Your choice of test is based on:

  • The types of variables you are testing (both your test “exposure” and your “outcome”)
    • Quantitative: continuous (age, weight, height) versus discrete (number of patients)
    • Categorical: ordinal (rankings; e.g., grades, clothing size), nominal (groups with names; e.g., marital status), or binary (data with only a “yes/no” answer; e.g., alive or dead)
  • Whether or not your data meet certain criteria known as assumptions; common assumptions include:
    • Data points are all independent of one another.
    • Variance within a single group is similar among all groups.
    • Data follow a normal distribution (bell curve). 

The reasonability of the model should always be questioned. If the model is wrong, so is everything else.

Be careful of variables that are not truly independent.

Graphical representations of continuous and categorical data

Image by Lecturio. License: CC BY-NC-SA 4.0

Types of tests

The 3 primary categories of statistical tests are:

  1. Regression tests: assess how changes in predictor variables are associated with changes in an outcome (often used to model presumed cause-and-effect relationships)
  2. Comparison tests: compare the means of different groups (require quantitative outcome data)
  3. Correlation tests: look for associations between different variables

Table: Types of statistical tests

| Test name | What the test is testing | Types of variables/data | Example |
|---|---|---|---|
| Regression tests | | | |
| Simple linear regression | How a change in the predictor/input variable affects the outcome variable | Predictor: continuous; outcome: continuous | How does weight (predictor) affect life expectancy (outcome)? |
| Multiple linear regression | How changes in the combinations of ≥ 2 predictor variables can predict changes in the outcome | Predictor: continuous; outcome: continuous | How do weight and socioeconomic status (predictors) affect life expectancy (outcome)? |
| Logistic regression | How ≥ 1 predictor variables can affect a binary outcome | Predictor: continuous; outcome: binary | What is the effect of weight (predictor) on survival (binary outcome: dead or alive)? |
| Comparison tests | | | |
| Paired t-test | Compares the means of 2 groups from the same population | Predictor: categorical; outcome: quantitative | Compare the weights of infants (outcome) before and after feeding (predictor). |
| Independent t-test | Compares the means of 2 groups from different populations | Predictor: categorical; outcome: quantitative | What is the difference in average height (outcome) between 2 different basketball teams (predictor)? |
| Analysis of variance (ANOVA) | Compares the means from > 2 groups | Predictor: categorical; outcome: quantitative | What is the difference in blood glucose levels (outcome) 1, 2, and 3 hours after a meal (predictors)? |
| Correlation tests | | | |
| Chi-square test | Tests the strength of association between 2 categorical variables with a larger sample size | Variable 1: categorical; variable 2: categorical | Compare whether acceptance into medical school (variable 1) is more likely if the applicant was born in the United Kingdom (variable 2). |
| Fisher’s exact test | Tests the strength of association between 2 categorical variables with a smaller sample size | Variable 1: categorical; variable 2: categorical | Same as chi-square, but with smaller sample sizes |
| Pearson r test | Tests the strength of association between 2 continuous variables | Variable 1: continuous; variable 2: continuous | Compare how plasma HbA1c level (variable 1) is related to plasma triglyceride levels (variable 2) in diabetic patients. |
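
As one illustration of the tests listed in the table, the Pearson correlation coefficient for two continuous variables can be computed directly from its definition. The paired HbA1c and triglyceride values below are invented for the example, and `pearson_r` is a hypothetical helper name; the sketch computes only the coefficient r, not its significance.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations from the two means.
    cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_sq_x = sum((a - mean_x) ** 2 for a in x)
    sum_sq_y = sum((b - mean_y) ** 2 for b in y)
    return cross_products / sqrt(sum_sq_x * sum_sq_y)

# Hypothetical paired measurements, invented for illustration only.
hba1c = [6.1, 6.8, 7.2, 7.9, 8.4, 9.0]           # %
triglycerides = [130, 150, 145, 180, 200, 210]   # mg/dL
print(f"r = {pearson_r(hba1c, triglycerides):.2f}")
```

Values of r close to +1 or −1 indicate a strong linear association, and values near 0 indicate little linear association.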

Chi-square test (χ²)

Chi-square tests are commonly used to analyze categorical data and determine whether 2 categorical variables are related.

  • What chi-square tests can assess: 
    • Whether or not a statistically significant association is present between 2 variables
    • Analyzed data: typically “counted” categorical data, meaning you have a number of named categories, and your data points are the counted values for each category.
    • Best suited to large samples; for small samples (e.g., an expected count below 5 in any cell), Fisher’s exact test is preferred
  • What chi-square tests cannot assess:

In order to perform a chi-square test, 2 pieces of information are needed: the degrees of freedom (number of categories minus 1), and the α-level (which is chosen by the researcher and usually set at 0.05). In addition, the data should be organized in a table. 

Example: If you wanted to see whether jugglers were more likely to be born during a particular season, the data could be recorded in the following table:

| Category (i): season of birth | Observed frequency of jugglers in each birth season |
| --- | --- |
| Spring | 66 |
| Summer | 82 |
| Fall | 74 |
| Winter | 78 |

Total number of jugglers in the sample: 300

To begin, the expected frequencies for each cell in the table above need to be determined using the equation:

$$ Expected\ frequency = np_{0i} $$

where n = the sample size and p0i = the hypothesized proportion in each category i

In the above example, n = 300 and p0i is ¼, so the expected cell frequency is 300 * 0.25 = 75 in each cell.

The test statistic is then calculated by the standard chi-square formula:

$$ \chi ^{2} = \sum _{all\ cells} \frac{(observed-expected)^{2}}{expected} $$

where χ² is the test statistic being calculated. For each “cell” or category, the expected frequency is subtracted from the observed frequency; this value is squared and then divided by the expected frequency. After this number is calculated for each category, the numbers are added together.

Example χ² calculation: Using the example above, the expected frequency in each cell is 75, so the χ² test statistic can be calculated as follows:

| Category (i): season of birth | Observed frequency of jugglers with each birth season | (Observed − expected)² / expected |
| --- | --- | --- |
| Spring | 66 | (66 − 75)² / 75 = 1.08 |
| Summer | 82 | (82 − 75)² / 75 = 0.653 |
| Fall | 74 | (74 − 75)² / 75 = 0.013 |
| Winter | 78 | (78 − 75)² / 75 = 0.12 |

χ² = 1.08 + 0.653 + 0.013 + 0.12 = 1.866
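The arithmetic above can be reproduced in a few lines. The following is a minimal Python sketch that uses only the juggler counts from the example; the variable names are illustrative.

```python
# Observed counts of jugglers by season of birth (from the example table).
observed = {"Spring": 66, "Summer": 82, "Fall": 74, "Winter": 78}

n = sum(observed.values())        # total sample size
expected = n / len(observed)      # equal proportions under the null hypothesis

# Contribution of each category: (observed - expected)^2 / expected
contributions = {
    season: (count - expected) ** 2 / expected
    for season, count in observed.items()
}
chi_square = sum(contributions.values())

for season, value in contributions.items():
    print(f"{season}: {value:.3f}")
print(f"chi-square = {chi_square:.3f}")
```

Summing the unrounded terms gives 1.867 rather than 1.866; the small difference comes from rounding each term before adding.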

Determining whether or not the test statistic is statistically significant:

To determine whether this test statistic is statistically significant, the chi-square table is used to obtain the chi-square critical number. 

  • The table has degrees of freedom (number of categories minus 1) on the y-axis and the α-level on the x-axis.
  • Using the degrees of freedom and α-level from the study, you find the critical number on the chart (see example chart below).
  • The critical number is used to determine statistical significance by comparing it to the test statistic. 
    • If the test statistic > critical value:
      • The observed frequencies are far away from expected frequencies
      • Reject the null hypothesis in favor of the alternative hypothesis based on this α-level.  
    • If the test statistic < critical value:
      • The observed frequencies were close to the expected frequencies
      • Do not reject the null hypothesis based on this α-level.

Example of the critical value table for the χ² test:
On the y-axis, V represents the degrees of freedom (i.e., the number of categories being studied minus 1); significance levels (α-levels) are shown along the x-axis. The corresponding critical values are found in the table and then compared to the calculated test statistic.

Image by Lecturio. License: CC BY-NC-SA 4.0

Example χ² test: Are jugglers more likely to be born in a particular season at a 0.05 significance level?

  • There are 4 different seasons, so there are 3 degrees of freedom.
  • α-level = 0.05
  • Using the table above, the critical number is 7.81
  • Therefore, we will reject our null hypothesis if the test statistic is > 7.81.
Calculations assuming the expected frequency in each cell is 75

| Category (i): season of birth | Observed frequency of jugglers with each birth season | (Observed − expected)² / expected |
| --- | --- | --- |
| Spring | 66 | (66 − 75)² / 75 = 1.08 |
| Summer | 82 | (82 − 75)² / 75 = 0.653 |
| Fall | 74 | (74 − 75)² / 75 = 0.013 |
| Winter | 78 | (78 − 75)² / 75 = 0.12 |

χ² = 1.08 + 0.653 + 0.013 + 0.12 = 1.866

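Since 1.866 is < 7.81 (our critical value), we fail to reject the null hypothesis. Failing to reject is not the same as accepting the null hypothesis: the data simply do not provide evidence, at the 0.05 significance level, that season of birth is associated with juggling.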
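The decision can also be checked by computing the p-value directly instead of reading a critical value from a table. The sketch below uses only the Python standard library and relies on a closed-form expression for the upper tail of the chi-square distribution that holds for 3 degrees of freedom only; the function name `chi2_sf_df3` is hypothetical, and statistical software would normally be used for other degrees of freedom.

```python
import math

def chi2_sf_df3(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

alpha = 0.05
test_statistic = 1.866    # from the worked example
critical_value = 7.81     # from the chi-square table (3 degrees of freedom, alpha = 0.05)

p_value = chi2_sf_df3(test_statistic)
reject_null = test_statistic > critical_value

print(f"p-value = {p_value:.2f}")
print("reject the null hypothesis" if reject_null else "fail to reject the null hypothesis")
```

The p-value is about 0.60, well above 0.05, which agrees with the comparison against the critical value.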

Common pitfalls:

  • Do not use chi-square unless the data are counted.
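  • Beware of very large sample sizes: the degrees of freedom depend on the number of categories, not the sample size, so even trivial differences can become statistically significant.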

Fisher’s exact test

Similar to the χ² test, Fisher’s exact test is a statistical test used to determine whether there are nonrandom associations between 2 categorical variables.

  • Used to analyze data found in contingency tables and determine the deviation of data from the null hypothesis (i.e., the p-value)
    • For example: comparing 2 possible “exposures” (smoking versus not smoking) with 2 possible outcomes (develops lung cancer versus healthy)
    • Contingency tables may have > 2 “exposures” or > 2 outcomes
  • More accurate for small data sets
  • Fisher’s test gives exact p-values based on the table.
  • Complicated formula to calculate the test statistic, so typically calculated with software.

A 2 × 2 contingency table is set up like this:

|  | Y | Z | Row total |
| --- | --- | --- | --- |
| W | A | B | A + B |
| X | C | D | C + D |
| Column total | A + C | B + D | A + B + C + D (= n) |

The probability, p, of observing this particular table (given fixed row and column totals) is calculated using the following formula:

$$ p = \frac{\binom{a+b}{a}\binom{c+d}{c}}{\binom{n}{a+c}} = \frac{\binom{a+b}{b}\binom{c+d}{d}}{\binom{n}{b+d}} = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{a!\,b!\,c!\,d!\,n!} $$

where a, b, c, and d are the counts in cells A, B, C, and D of a basic 2 × 2 contingency table, and n = the total of A + B + C + D. The p-value of the test is the sum of these probabilities over the observed table and all possible tables that are at least as extreme.
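Because the formula is tedious by hand, a short sketch may help show how software arrives at the result. The Python below is a minimal illustration, not a reference implementation: the function names and the cell counts are hypothetical, and the 2-sided p-value follows one common convention (summing the probabilities of all tables that are no more likely than the observed one).

```python
from math import comb

def table_probability(a, b, c, d):
    """Hypergeometric probability of one 2 x 2 table with fixed row and column totals."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)

def fisher_exact_two_sided(a, b, c, d):
    """Sum the probabilities of every possible table that is no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    p_observed = table_probability(a, b, c, d)
    p_value = 0.0
    # x runs over every value the top-left cell can take with these margins.
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        p = table_probability(x, row1 - x, col1 - x, row2 - (col1 - x))
        if p <= p_observed * (1 + 1e-9):   # small tolerance for floating-point ties
            p_value += p
    return p_value

# Hypothetical counts: A = 3, B = 1, C = 1, D = 3
print(f"probability of this table = {table_probability(3, 1, 1, 3):.4f}")
print(f"2-sided p-value = {fisher_exact_two_sided(3, 1, 1, 3):.4f}")
```

For these hypothetical counts, the observed table has probability 16/70 and the 2-sided p-value is 34/70, so the association would not be statistically significant at the 0.05 level.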

Graphical Representation of Data

Purpose

Before any calculations are made, data should be presented in a simple graphical format (e.g., bar graph, scatter plot, histogram).

  • The characteristics of the distribution of data will indicate the statistical tools that will be needed for analysis. 
  • Graphs are the 1st step in data analysis, allowing for the immediate visualization of distributions and patterns, which will determine the next steps of statistical analysis. 
  • Outliers can be an indication of mathematical or experimental errors.
  • There are many ways to graphically represent data.
  • After calculations are completed, visual presentation can assist the reader in conceptualizing the results.

Displaying a relationship between variables

Contingency tables:

  • Tables showing the relative frequencies of different combinations of variables
  • Example: Comparing the results of a screening test (positive or negative) with whether or not people actually have a disease. (Note: This specific type of contingency table can be used to calculate the sensitivity and specificity of a screening test.)

Contingency table identifying false positives (b) and false negatives (c)

Image by Lecturio. License: CC BY-NC-SA 4.0
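As an illustration of how sensitivity and specificity follow from such a table, here is a minimal Python sketch. It assumes the conventional cell layout in which a = true positives and d = true negatives (the caption names only b and c), and the counts are hypothetical.

```python
def sensitivity_specificity(a, b, c, d):
    """a = true positives, b = false positives, c = false negatives, d = true negatives."""
    sensitivity = a / (a + c)   # proportion of people with the disease who test positive
    specificity = d / (b + d)   # proportion of people without the disease who test negative
    return sensitivity, specificity

# Hypothetical screening results for 1,000 people
sens, spec = sensitivity_specificity(a=90, b=30, c=10, d=870)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```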

Scatter diagrams or dispersion diagrams:

  • A method commonly used to display the relationship between 2 numerical variables or 1 numerical variable and 1 categorical variable
  • The dots represent the values of individual data points.
  • Allows for calculation of a “best fit line” representing the data as a whole
  • Allows for easy visualization of the entire data set
  • Example: scatter diagram showing the relationship between 2 numerical variables

Example of a scatter diagram

Image: “ Scatterplot” by Qwertyus. License: CC0 1.0

Box plots:

  • Show the spread and center of the data set
  • Visually express a 5-number summary:
    1. The minimum value is shown at the end of the left whisker.
    2. The first quartile (Q1) is at the far left of the box.
    3. The median is shown as the line inside the box.
    4. The third quartile (Q3) is at the far right of the box.
    5. The maximum value is shown at the end of the right whisker.
  • Typically used when comparing medians and distributions between 2 or more populations
  • Example: The following box plot compares the incubation periods of different coronaviruses: the novel coronavirus (nCoV), SARS, and Middle East respiratory syndrome (MERS).

Example of a box plot

Image: “Box-and-whisker-plots” by Jantien A. Backer, Don Klinkenberg, Jacco Wallinga. License: CC BY 4.0
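The 5-number summary behind a box plot can be computed with Python's standard library. The sketch below uses hypothetical incubation times; note that several conventions for calculating quartiles exist, so other software may report slightly different values for Q1 and Q3.

```python
import statistics

incubation_days = [2, 3, 4, 5, 6, 7, 8, 9, 10]   # hypothetical values

# quantiles(..., n=4) returns the 3 cut points that split the data into quarters.
q1, median, q3 = statistics.quantiles(incubation_days, n=4, method="inclusive")

five_number_summary = (min(incubation_days), q1, median, q3, max(incubation_days))
print(five_number_summary)
```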

Kaplan-Meier survival curves:

  • A type of statistical analysis used to analyze time-to-event data (typically, survival data)
  • Commonly used in medical studies showing how a particular treatment can affect/prolong survival
  • The line represents the proportion of patients surviving (or who have not yet reached a certain end point) at a given point in time.
  • Example: The survival curve below shows how 2 different gene signatures affect survival. The study begins at time point 0, with 100% of the 2 groups surviving. Each drop-off in the line represents people dying in each group, decreasing the percentage of people who remain living. After 3 years, approximately 50% of people with the gene A signature are still alive, compared with only 5% of those with the gene B signature.

Example of a Kaplan-Meier plot

Image: “An example of a Kaplan Meier plot” by Rw251. License: CC0 1.0
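To show where the steps in such a curve come from, the following Python sketch implements the product-limit (Kaplan-Meier) estimate for a single group. The function name and the follow-up data are hypothetical, and the sketch omits confidence intervals and group comparisons.

```python
def kaplan_meier(times, events):
    """Return [(time, estimated survival probability)] at each time an event occurred.

    times  : follow-up time for each patient
    events : 1 if the event (e.g., death) was observed, 0 if the patient was censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)   # deaths plus censored at time t
        if deaths:
            # Multiply by the fraction of at-risk patients who survive this time point.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical follow-up times (years) for 6 patients; 0 marks a censored patient
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"year {t}: {s:.3f}")
```

Censored patients do not produce a drop in the curve, but they reduce the number at risk for later time points, which is why later deaths cause larger drops.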

Presentation of numerical variables

Tables (a frequency table is 1 example):

  • The simplest form of presenting data
  • Data are displayed in columns and rows.

Histograms:

  • Good for demonstrating the results of continuous data, such as:
    • Weights
    • Heights
    • Lengths of time
  • Similar to, but not the same as, bar graphs (which display categorical data) 
  • A histogram divides the continuous data into intervals or ranges.
  • The height of each bar represents the number of data points that fall into that range.
  • Because histograms represent continuous data, they are drawn with no gaps between bars.
  • Example: A histogram showing how many people lost or gained weight over a 2-week study period. In this example, 1 person lost between 2.5 and 3 pounds, 27 people gained between 0 and 0.5 pounds, and 6 people gained between 1 and 1.5 pounds.

Example of a histogram

Image: “Example of a histogram” by Jkv. License: Public Domain
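The binning step behind a histogram can be sketched in a few lines of Python. The weight changes below are hypothetical and are not the data from the figure; the 0.5-pound bin width mirrors the intervals described in the example.

```python
import math
from collections import Counter

weight_changes = [-2.7, -0.4, 0.1, 0.2, 0.3, 0.4, 0.6, 1.1, 1.2, 1.4]  # hypothetical, in pounds
bin_width = 0.5

# Assign each value to the interval [k * width, (k + 1) * width) that contains it.
bin_counts = Counter(math.floor(x / bin_width) for x in weight_changes)

# Print a simple text histogram: one "#" per data point in each interval.
for k in sorted(bin_counts):
    low, high = k * bin_width, (k + 1) * bin_width
    print(f"{low:+.1f} to {high:+.1f} lb: {'#' * bin_counts[k]}")
```

Intervals that contain no data points are simply omitted from this text output; a plotted histogram would show them as bars of height 0.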

Frequency polygon charts:


Frequency polygon chart for salaries of 31 NFL teams

Image: “Example of a frequency polygon chart” by JLW87. License: Public Domain

Presentation of categorical variables

Frequency tables, bar graphs, and pie charts are 3 of the most common ways to present categorical data.

Frequency tables:

  • Display numbers and/or percentages for each value of a variable
  • Example: Pull up to 100 different stoplights and record whether the light was red, yellow, or green upon your arrival.
Table: Example of a frequency table

| Stoplight color | Frequency |
| --- | --- |
| Red | 65 |
| Yellow | 5 |
| Green | 30 |
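A frequency table like this one is just a tally of raw observations. As a minimal Python sketch, the list of 100 observations below is rebuilt from the counts in the stoplight example, since the individual observations are not given.

```python
from collections import Counter

# Rebuild the 100 stoplight observations from the counts in the example.
observations = ["Red"] * 65 + ["Yellow"] * 5 + ["Green"] * 30

counts = Counter(observations)
total = sum(counts.values())
percentages = {color: 100 * k / total for color, k in counts.items()}

for color, k in counts.items():
    print(f"{color}: {k} ({percentages[color]:.0f}%)")
```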

Bar graph:

  • The length of each bar indicates the number or frequency of that variable in the data set; bars can be plotted vertically or horizontally
  • Example: A bar graph showing the breakdown of race/ethnicity in Texas in 2015.

Example of a bar graph

Image: “Bar Chart of Race & Ethnicity in Texas” by Datawheel. License: CC0 1.0

Pie charts:

  • Demonstrate the relative proportions of the different categories of a categorical variable
  • Example: The following pie chart shows the results of the European Parliament election in 2004, with each color representing a different political party and the percentage of votes they received.

Example of a pie chart

Image: “A pie chart for the example data” by Liftarn. License: Public Domain

