
Image: “statistics” by kulinetto. License: CC0 1.0

## Introduction

Paired data is defined as two sets of data that follow a one-to-one relationship between their values. Each data set has the same number of data points. Each data point in one data set is related to one, and only one, data point in the other data set.

An example of paired data is the blood pressure of patients before doing yoga (ie, readings before the intervention) and the blood pressure of the same patients after doing yoga (ie, readings after the intervention). When the paired data are numeric, the most widely used technique for testing the impact of an intervention is the paired sample t-test.

## Paired Sample T-test

### Using Formulas

A physician develops a new drug to treat hypertension and anxiety. The physician assesses the level of hypertension and anxiety in 30 patients before giving them the drug and assigns each patient a score from 0–10, with 0 indicating no hypertension and anxiety and 10 indicating the highest level of hypertension and anxiety (ie, the patient is at risk of suicide).

Then, to test the impact of the new drug that the physician has developed, they give the drug to the same 30 patients and then test the level of the patients’ hypertension and anxiety (see table below).

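| No. | Patient | Hypertension and anxiety score (before) | Hypertension and anxiety score (after) |
|---|---|---|---|
| 1 | Alex | 8 | 9 |
| 2 | Bob | 7 | 8 |
| 3 | Cathy | 8 | 9 |
| 4 | Drake | 8 | 7 |
| 5 | Emily | 5 | 4 |
| 6 | Frank | 5 | 3 |
| 7 | George | 8 | 4 |
| 8 | Henry | 7 | 3 |
| 9 | Iris | 9 | 8 |
| 10 | Jack | 9 | 7 |
| 11 | Kate | 7 | 4 |
| 12 | Louis | 7 | 6 |
| 13 | Mathew | 6 | 7 |
| 14 | Nick | 4 | 5 |
| 15 | John | 2 | 2 |
| 16 | Paul | 8 | 8 |
| 17 | Ross | 8 | 7 |
| 18 | Rachel | 7 | 6 |
| 19 | Sam | 9 | 8 |
| 20 | Clark | 8 | 4 |
| 21 | David | 7 | 1 |
| 22 | Mark | 7 | 2 |
| 23 | Walker | 6 | 4 |
| 24 | Adam | 7 | 5 |
| 25 | Huss | 8 | 5 |
| 26 | Jones | 7 | 4 |
| 27 | Angelina | 5 | 3 |
| 28 | Brad | 8 | 7 |
| 29 | Tom | 8 | 6 |
| 30 | Dakota | 7 | 5 |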

It is difficult to draw clear conclusions about the impact of the new drug just by looking at the paired data above; thus, we need to conduct the paired sample t-test. The 4 steps of the paired sample t-test are outlined below.

Step 1: State the hypothesis

Null hypothesis (µd = 0): There is no statistically significant difference between the hypertension and anxiety scores of patients before and after the intervention (ie, the new drug).

Alternative hypothesis (µd ≠ 0): There is a statistically significant difference between the hypertension and anxiety scores of patients before and after the intervention (ie, the new drug).

Step 2: Compute the test statistic

Mean of the differences = –1.633 and standard deviation of the differences = 1.829. These values are computed as follows:

| No. | Patient | Difference (After – Before) |
|---|---|---|
| 1 | Alex | 1 |
| 2 | Bob | 1 |
| 3 | Cathy | 1 |
| 4 | Drake | -1 |
| 5 | Emily | -1 |
| 6 | Frank | -2 |
| 7 | George | -4 |
| 8 | Henry | -4 |
| 9 | Iris | -1 |
| 10 | Jack | -2 |
| 11 | Kate | -3 |
| 12 | Louis | -1 |
| 13 | Mathew | 1 |
| 14 | Nick | 1 |
| 15 | John | 0 |
| 16 | Paul | 0 |
| 17 | Ross | -1 |
| 18 | Rachel | -1 |
| 19 | Sam | -1 |
| 20 | Clark | -4 |
| 21 | David | -6 |
| 22 | Mark | -5 |
| 23 | Walker | -2 |
| 24 | Adam | -2 |
| 25 | Huss | -3 |
| 26 | Jones | -3 |
| 27 | Angelina | -2 |
| 28 | Brad | -1 |
| 29 | Tom | -2 |
| 30 | Dakota | -2 |

Mean of the differences:

$$\bar{d} = \frac{\sum d_i}{n} = \frac{-49}{30} = -1.633$$

Standard deviation of the differences (sample standard deviation, dividing by n – 1):

$$s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}} = \sqrt{\frac{96.967}{29}} = 1.829$$

The t-statistic is thus equal to –1.633 / (1.829 / √30) = –4.89.

Step 3: Identify the critical values/rejection region

With a sample size of n = 30, the test has n – 1 = 29 degrees of freedom. At a 5% significance level (two-tailed), the critical value is 2.045; thus, the rejection region is t < –2.045 or t > +2.045.

Step 4: Make a conclusion

The t-statistic in this example is –4.89. This is less than –2.045 and therefore we can reject the null hypothesis at the 5% significance level. Because the mean difference is negative (scores fell after treatment), we can conclude that the new drug has been effective in reducing the hypertension and anxiety of the patients.
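The four steps above can be retraced with a short script. This is a minimal sketch using only the Python standard library; the variable names are illustrative, and the critical value 2.045 is hardcoded from a t table (29 degrees of freedom, two-tailed 5%) rather than computed.

```python
import math
import statistics

# Scores copied from the table above (patients 1-30, in order).
before = [8, 7, 8, 8, 5, 5, 8, 7, 9, 9, 7, 7, 6, 4, 2,
          8, 8, 7, 9, 8, 7, 7, 6, 7, 8, 7, 5, 8, 8, 7]
after = [9, 8, 9, 7, 4, 3, 4, 3, 8, 7, 4, 6, 7, 5, 2,
         8, 7, 6, 8, 4, 1, 2, 4, 5, 5, 4, 3, 7, 6, 5]

# Step 2: differences (After - Before), their mean and sample SD.
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)  # sample SD: divides by n - 1
t_stat = mean_diff / (sd_diff / math.sqrt(n))

# Step 3: two-tailed 5% critical value for n - 1 = 29 degrees of freedom.
# Assumption: taken from a t table, not computed here.
T_CRIT = 2.045

# Step 4: reject the null hypothesis if |t| exceeds the critical value.
reject_null = abs(t_stat) > T_CRIT

print(f"mean difference   = {mean_diff:.3f}")
print(f"SD of differences = {sd_diff:.3f}")
print(f"t = {t_stat:.2f}, reject H0: {reject_null}")
```

Note that `statistics.stdev` uses the n – 1 denominator, which is what the paired t-test requires; `statistics.pstdev` (dividing by n) would give a slightly smaller value.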

### Using the Statistical Package for Social Sciences

This example can also be solved using the Statistical Package for Social Sciences (SPSS) as follows: Go to ‘Analyze’, then ‘Compare Means’, then ‘Paired-Samples T Test’. The SPSS output for the above example is outlined in the tables below.

T-test

Paired Sample Statistics

| Pair 1 | Mean | N | Standard Deviation | Standard Error Mean |
|---|---|---|---|---|
| Hypertension and anxiety after using the new drug | 5.3667 | 30 | 2.18905 | 0.39966 |
| Hypertension and anxiety score before the new drug | 7.0000 | 30 | 1.55364 | 0.28365 |

Paired Sample Correlations

| Pair 1 | N | Correlation | Sig |
|---|---|---|---|
| Hypertension and anxiety after and before using the new drug | 30 | 0.568 | 0.001 |

Paired Sample Test

The p-value, which SPSS reports as 0.000 (ie, < 0.001 after rounding), is less than 0.01, indicating that we can reject the null hypothesis even at the 1% significance level.
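The descriptive rows of the SPSS output can be cross-checked by hand. The sketch below, again using only the standard library, recomputes the mean, sample standard deviation, and standard error of each set of scores, along with the Pearson correlation between them; the helper names `describe` and `pearson_r` are illustrative, not SPSS functions.

```python
import math
import statistics

# Scores for patients 1-30, copied from the worked example.
before = [8, 7, 8, 8, 5, 5, 8, 7, 9, 9, 7, 7, 6, 4, 2,
          8, 8, 7, 9, 8, 7, 7, 6, 7, 8, 7, 5, 8, 8, 7]
after = [9, 8, 9, 7, 4, 3, 4, 3, 8, 7, 4, 6, 7, 5, 2,
         8, 7, 6, 8, 4, 1, 2, 4, 5, 5, 4, 3, 7, 6, 5]


def describe(scores):
    """Return (mean, sample SD, standard error of the mean)."""
    sd = statistics.stdev(scores)
    return statistics.mean(scores), sd, sd / math.sqrt(len(scores))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


after_stats = describe(after)
before_stats = describe(before)
r = pearson_r(after, before)

print("after :  mean=%.4f  sd=%.5f  se=%.5f" % after_stats)
print("before:  mean=%.4f  sd=%.5f  se=%.5f" % before_stats)
print(f"correlation = {r:.3f}")
```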

### Assumptions

As a parametric method (a technique that estimates unknown parameters), the paired t-test makes several assumptions. Although t-tests are quite robust, it is important to assess the degree of deviation from these assumptions in order to evaluate the quality of the results.

In a paired sample t-test, the observations are defined as the differences between 2 sets of values, and each assumption refers to these differences, not the original data values. The paired t-test has 4 key assumptions:

1. The response variable must be continuous (interval/ratio), not categorical.
2. The observations must be independent of one another.
3. The response variable should be approximately normally distributed.
4. The response variable should not contain any outliers.
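As one way to screen the differences for extreme values (assumption 4), the sketch below applies the 1.5 × IQR rule. This rule is a common heuristic brought in for illustration, not something the worked example prescribes, and different quartile conventions can shift the fences slightly.

```python
import statistics

# Differences (After - Before) for patients 1-30, from the worked example.
diffs = [1, 1, 1, -1, -1, -2, -4, -4, -1, -2, -3, -1, 1, 1, 0,
         0, -1, -1, -1, -4, -6, -5, -2, -2, -3, -3, -2, -1, -2, -2]

# Quartiles of the differences (Python's default "exclusive" method).
q1, _median, q3 = statistics.quantiles(diffs, n=4)
iqr = q3 - q1

# Values more than 1.5 * IQR beyond the quartiles are flagged for review.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
flagged = [d for d in diffs if d < lower_fence or d > upper_fence]

print(f"Q1 = {q1}, Q3 = {q3}, fences = ({lower_fence}, {upper_fence})")
print(f"flagged differences: {flagged}")
```

A flagged value is a prompt to inspect that pair, not an automatic reason to discard it.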

## McNemar’s Test

### Using SPSS

A therapist has developed a new technique to treat people with substance use disorders. The therapist tests 31 patients before using the new technique, and categorizes them into those with substance use disorders and those without this disorder. To test the impact of the new technique, the therapist uses it on the same 31 patients, using the 2 categories developed. The paired data are shown in the table below.

| Pair | Before | After |
|---|---|---|
| 1 | Without Disorder | With Substance Use Disorder |
| 2 | With Substance Use Disorder | Without Disorder |
| 3 | With Substance Use Disorder | Without Disorder |
| 4 | With Substance Use Disorder | Without Disorder |
| 5 | With Substance Use Disorder | Without Disorder |
| 6 | With Substance Use Disorder | With Substance Use Disorder |
| 7 | With Substance Use Disorder | With Substance Use Disorder |
| 8 | With Substance Use Disorder | Without Disorder |
| 9 | With Substance Use Disorder | Without Disorder |
| 10 | Without Disorder | With Substance Use Disorder |
| 11 | Without Disorder | Without Disorder |
| 12 | Without Disorder | Without Disorder |
| 13 | With Substance Use Disorder | Without Disorder |
| 14 | With Substance Use Disorder | Without Disorder |
| 15 | With Substance Use Disorder | With Substance Use Disorder |
| 16 | With Substance Use Disorder | With Substance Use Disorder |
| 17 | With Substance Use Disorder | Without Disorder |
| 18 | With Substance Use Disorder | Without Disorder |
| 19 | Without Disorder | With Substance Use Disorder |
| 20 | With Substance Use Disorder | Without Disorder |
| 21 | With Substance Use Disorder | Without Disorder |
| 22 | With Substance Use Disorder | Without Disorder |
| 23 | With Substance Use Disorder | Without Disorder |
| 24 | With Substance Use Disorder | Without Disorder |
| 25 | With Substance Use Disorder | With Substance Use Disorder |
| 26 | With Substance Use Disorder | Without Disorder |
| 27 | Without Disorder | Without Disorder |
| 28 | Without Disorder | With Substance Use Disorder |
| 29 | With Substance Use Disorder | Without Disorder |
| 30 | With Substance Use Disorder | Without Disorder |
| 31 | With Substance Use Disorder | Without Disorder |

It is difficult to draw any conclusions about the impact of the new technique just by looking at the paired data above. Because the outcome is categorical rather than numeric, McNemar’s test is the appropriate test. The test is conducted using SPSS, and the output is shown in the tables below.

Descriptive Statistics

| | N | Mean | Standard Deviation | Minimum | Maximum |
|---|---|---|---|---|---|
| Before | 31 | 0.77 | 0.425 | 0 | 1 |
| After | 31 | 0.29 | 0.461 | 0 | 1 |

Wilcoxon Signed-Rank Test

| Before to After | N | Mean Rank | Sum of Ranks |
|---|---|---|---|
| Negative Ranks | 19 (a) | 12.00 | 228.00 |
| Positive Ranks | 4 (b) | 12.00 | 48.00 |
| Ties | 8 (c) | | |
| Total | 31 | | |

- (a) After < Before
- (b) After > Before
- (c) After = Before

Test Statistics

| | After to Before |
|---|---|
| Z | –3.128 (b) |
| Asymp Sig (2-tailed) | 0.002 |

- (a) Wilcoxon signed-rank test
- (b) Based on positive ranks

Crosstabs: Before and After

| Before | After: With substance use disorder | After: Without substance use disorder |
|---|---|---|
| With substance use disorder | 5 | 19 |
| Without substance use disorder | 4 | 3 |

Test Statistics

| | Before to After |
|---|---|
| N | 31 |
| Exact Sig (2-tailed) | 0.003 (b) |

- (a) McNemar’s test
- (b) Binomial distribution used

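The p-value of 0.003 is < 0.05 and therefore we can reject the null hypothesis. Because 19 patients changed from having a substance use disorder to not having one, while only 4 changed in the opposite direction, it is possible to conclude that the new technique was effective in treating the disorder.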
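The crosstab and the exact significance value can be reproduced from the raw pairs. The sketch below tallies the 31 pairs and computes a two-sided exact p-value from the binomial distribution, which is the approach the SPSS footnote names; the labels `W` and `O` and the function name `exact_mcnemar_p` are illustrative.

```python
from collections import Counter
from math import comb

W, O = "with", "without"  # shorthand for the 2 categories

# (before, after) status for pairs 1-31, copied from the paired data table.
pairs = [
    (O, W), (W, O), (W, O), (W, O), (W, O), (W, W), (W, W), (W, O),
    (W, O), (O, W), (O, O), (O, O), (W, O), (W, O), (W, W), (W, W),
    (W, O), (W, O), (O, W), (W, O), (W, O), (W, O), (W, O), (W, O),
    (W, W), (W, O), (O, O), (O, W), (W, O), (W, O), (W, O),
]

# Tally the 2 x 2 crosstab; b and c are the discordant cells.
table = Counter(pairs)
a = table[(W, W)]  # with disorder before and after
b = table[(W, O)]  # with before, without after
c = table[(O, W)]  # without before, with after
d = table[(O, O)]  # without disorder before and after


def exact_mcnemar_p(b, c):
    """Two-sided exact p-value: under H0, b ~ Binomial(b + c, 0.5)."""
    n = b + c
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


p_exact = exact_mcnemar_p(b, c)
print(f"a={a}, b={b}, c={c}, d={d}")
print(f"exact two-sided p = {p_exact:.3f}")
```

Only the discordant cells (b and c) enter the test; the 8 pairs whose status did not change carry no information about the direction of change.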

### Using Formulas

The steps of the McNemar’s test using formulas are outlined below.

Step 1: State the hypothesis

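Null hypothesis: The new technique has no statistically significant impact (ie, the proportion of patients with a substance use disorder is the same before and after the intervention).

Alternative hypothesis: The new technique has a statistically significant impact (ie, the proportion of patients with a substance use disorder differs before and after the intervention).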

Step 2: Compute the test statistic

To run a McNemar test, your data must be placed in a 2 × 2 contingency table, with the cell frequencies equal to the number of pairs, as outlined in the table below:

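| | Test 2 positive | Test 2 negative | Row total |
|---|---|---|---|
| Test 1 positive | a | b | a + b |
| Test 1 negative | c | d | c + d |
| Column total | a + c | b + d | n |

The test statistic uses only the discordant cells b and c:

$$\chi^2 = \frac{(b - c)^2}{b + c}$$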

The cells for this example are as follows:

| Before | After: With substance use disorder | After: Without substance use disorder |
|---|---|---|
| With substance use disorder | 5 | 19 |
| Without substance use disorder | 4 | 3 |

Thus, test statistic = (19 – 4)² / (19 + 4) = 225 / 23 = 9.78

Step 3: Identify the critical values/rejection region

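The sample consists of 31 pairs, 23 of which are discordant. The McNemar test statistic follows a chi-square distribution with 1 degree of freedom; at a 5% significance level, the critical value is 3.84, so the rejection region is a test statistic > 3.84.

Step 4: Make a conclusion

The test statistic in the example is 9.78. This is > 3.84 and thus we can reject the null hypothesis at the 5% significance level. Finally, we conclude that the new technique has been effective.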
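The chi-square version of the calculation can be sketched in a few lines. The critical value 3.841 is hardcoded from a chi-square table (1 degree of freedom, 5% level), and the p-value uses the identity that a chi-square variable with 1 degree of freedom is the square of a standard normal variable, which the standard library's `math.erfc` can evaluate.

```python
import math

# Discordant cells from the crosstab in the worked example.
b, c = 19, 4

# Step 2: McNemar chi-square statistic.
chi2 = (b - c) ** 2 / (b + c)

# Step 3: 5% critical value for chi-square with 1 degree of freedom.
# Assumption: taken from a chi-square table, not computed here.
CHI2_CRIT = 3.841

# For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

# Step 4: reject the null hypothesis if the statistic exceeds the cutoff.
reject_null = chi2 > CHI2_CRIT

print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}, reject H0: {reject_null}")
```

The square root of the statistic is about 3.128, the same magnitude as the Z value in the SPSS Wilcoxon output for these data.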

McNemar’s test is a non-parametric test for paired nominal data. It is used when you are interested in finding a change in proportion for the paired data. For example, this test could be used to analyze retrospective case-control studies, where each case is matched with a control. It could also be used to analyze a trial in which 2 treatments are given to matched pairs. The method is simple, quick, and easy to perform. It also enables an appropriate confirmatory data analysis for situations dealing with paired dichotomous responses to surveys or experiments.

This test is sometimes referred to as McNemar’s chi-square test because the test statistic has a chi-square distribution. McNemar’s test is used to determine whether there are differences on a dichotomous response variable between 2 related groups. It can be considered similar to the paired sample t-test, but for a dichotomous rather than a continuous dependent variable.

However, unlike the paired sample t-test, it can be conceptualized as testing 2 different properties of a repeated-measure dichotomous variable. McNemar’s test is used to analyze pretest–posttest study designs, as well as matched pairs and case-control studies. If you have more than 2 repeated measurements, you could use Cochran’s Q test.

One limitation of McNemar’s test is that the chi-square version is designed for use with large samples. It assumes that the number of discordant pairs (ie, b + c) is ≥ 10; hence, the use of an exact binomial test is recommended if the number of discordant pairs is < 10.
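The rule of thumb above can be expressed as a small helper that switches between the two versions of the test. This is a sketch under the stated threshold of 10 discordant pairs; the function name `mcnemar_p_value` and its `min_discordant` parameter are hypothetical, and no continuity correction is applied.

```python
import math
from math import comb


def mcnemar_p_value(b, c, min_discordant=10):
    """Return (method, two-sided p-value) for discordant counts b and c.

    Uses the exact binomial test when b + c < min_discordant and the
    chi-square approximation (1 degree of freedom) otherwise.
    """
    n = b + c
    if n == 0:
        return "no discordant pairs", 1.0
    if n < min_discordant:
        tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
        return "exact binomial", min(1.0, 2 * tail)
    chi2 = (b - c) ** 2 / n
    return "chi-square", math.erfc(math.sqrt(chi2 / 2))


# Worked example: 23 discordant pairs, so the chi-square version applies.
method_large, p_large = mcnemar_p_value(19, 4)

# Hypothetical small study: only 7 discordant pairs, so the exact test is used.
method_small, p_small = mcnemar_p_value(6, 1)

print(method_large, round(p_large, 4))
print(method_small, round(p_small, 4))
```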

### Assumptions

1. There must be 1 categorical (dichotomous) dependent variable with 2 categories and 1 independent variable with 2 related groups.
2. The 2 categories of the response variable must be mutually exclusive. This means that a participant cannot appear in more than 1 category at the same time point.
3. The sample must be random.