Paired data are data recorded for the same unit of analysis (e.g., a person or an animal) at two or more points in time. Paired data often arise when a researcher tests the impact of an intervention or conducts an experiment. The techniques used to analyze paired data, and so to test the impact of an intervention, are the paired sample t-test and McNemar’s test.

## Introduction
An example of paired data is the blood pressure of patients before yoga (i.e., readings before the intervention) and the blood pressure of the same patients after yoga (i.e., readings after the intervention). When the outcome is numeric, the most widely used technique for analyzing paired data is the paired sample t-test. It is explained below with the help of a medical example.

## Paired sample t-test

### Using formulas

Suppose that a doctor has developed a new drug to treat hypertension and anxiety. The doctor assessed the level of hypertension and anxiety of 30 patients before giving them the drug and assigned each patient a score from 0 to 10, with 0 indicating no hypertension and anxiety and 10 indicating the highest level of hypertension and anxiety (i.e., the patient is at risk of suicide).

Then, in order to test the impact of the new drug, the doctor gave it to the same 30 patients and assessed their level of hypertension and anxiety again. The hypertension and anxiety scores before and after taking the new drug are shown as follows:

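| # | Patient | Score before | Score after |
|---|---------|--------------|-------------|
| 1 | Alex | 8 | 9 |
| 2 | Bob | 7 | 8 |
| 3 | Cathay | 8 | 9 |
| 4 | Drake | 8 | 7 |
| 5 | Emily | 5 | 4 |
| 6 | Frank | 5 | 3 |
| 7 | George | 8 | 4 |
| 8 | Hood | 7 | 3 |
| 9 | Iris | 9 | 8 |
| 10 | Jack | 9 | 7 |
| 11 | Kate | 7 | 4 |
| 12 | Luis | 7 | 6 |
| 13 | Mathew | 6 | 7 |
| 14 | Nick | 4 | 5 |
| 15 | John | 2 | 2 |
| 16 | Paul | 8 | 8 |
| 17 | Ross | 8 | 7 |
| 18 | Rachel | 7 | 6 |
| 19 | Smith | 9 | 8 |
| 20 | Clark | 8 | 4 |
| 21 | David | 7 | 1 |
| 22 | Mark | 7 | 2 |
| 23 | Walker | 6 | 4 |
| 24 | Adam | 7 | 5 |
| 25 | Huss | 8 | 5 |
| 26 | Jones | 7 | 4 |
| 27 | Angelina | 5 | 3 |
| 28 | Brad | 8 | 7 |
| 29 | Tom | 8 | 6 |
| 30 | Dakota | 7 | 5 |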

It is difficult to draw any conclusions about the impact of the new drug just by looking at the paired data above; thus, we need to conduct the paired sample t-test. The steps of the paired sample t-test are as follows:

Step 1: State the hypothesis of the paired test

H0 (null hypothesis): µd = 0. The mean difference between the patients’ hypertension and anxiety scores before and after the intervention (i.e., the new drug) is zero.

HA (alternative hypothesis): µd ≠ 0. The mean difference between the patients’ hypertension and anxiety scores before and after the intervention (i.e., the new drug) is not zero.

Step 2: Compute the test statistic

Mean of the differences = -1.633; standard deviation of the differences = 1.829. These are computed as follows:

| # | Patient | Difference = After – Before |
|---|---------|-----------------------------|
| 1 | Alex | 1 |
| 2 | Bob | 1 |
| 3 | Cathay | 1 |
| 4 | Drake | -1 |
| 5 | Emily | -1 |
| 6 | Frank | -2 |
| 7 | George | -4 |
| 8 | Hood | -4 |
| 9 | Iris | -1 |
| 10 | Jack | -2 |
| 11 | Kate | -3 |
| 12 | Luis | -1 |
| 13 | Mathew | 1 |
| 14 | Nick | 1 |
| 15 | John | 0 |
| 16 | Paul | 0 |
| 17 | Ross | -1 |
| 18 | Rachel | -1 |
| 19 | Smith | -1 |
| 20 | Clark | -4 |
| 21 | David | -6 |
| 22 | Mark | -5 |
| 23 | Walker | -2 |
| 24 | Adam | -2 |
| 25 | Huss | -3 |
| 26 | Jones | -3 |
| 27 | Angelina | -2 |
| 28 | Brad | -1 |
| 29 | Tom | -2 |
| 30 | Dakota | -2 |

Mean of the differences: $\bar{d} = \sum d_i / n = -49 / 30 = -1.633$

Standard deviation of the differences: $s_d = \sqrt{\sum (d_i - \bar{d})^2 / (n - 1)} = \sqrt{96.967 / 29} = 1.829$

The t-statistic is thus equal to $t = \bar{d} / (s_d / \sqrt{n}) = -1.633 / (1.829 / \sqrt{30}) = -4.89$.

Step 3: Identify the critical values / rejection region

The sample size is n = 30, which gives n – 1 = 29 degrees of freedom, and we choose a 5% significance level; thus, the rejection region is a t statistic of less than -2.045 or more than +2.045.

Step 4: Make a conclusion

The t statistic in the example is -4.89. This is less than -2.045 and thus we can reject H0 at the 5% significance level. Finally, we conclude that the new drug has been effective in reducing the patients’ hypertension and anxiety scores.
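The hand calculation in Steps 2 to 4 can be sketched in Python using only the standard library. This is an illustrative sketch rather than part of the original worked example: the variable names are made up for the illustration, and the critical value 2.045 is the tabulated two-tailed 5% value for 29 degrees of freedom, which the code assumes rather than derives. The results should agree with the SPSS output in the next subsection (mean difference -1.6333, t = -4.892).

```python
import math
import statistics

# Scores for the 30 patients, in the order listed in the data table.
before = [8, 7, 8, 8, 5, 5, 8, 7, 9, 9, 7, 7, 6, 4, 2,
          8, 8, 7, 9, 8, 7, 7, 6, 7, 8, 7, 5, 8, 8, 7]
after = [9, 8, 9, 7, 4, 3, 4, 3, 8, 7, 4, 6, 7, 5, 2,
         8, 7, 6, 8, 4, 1, 2, 4, 5, 5, 4, 3, 7, 6, 5]

# Step 2: differences (after - before), their mean and sample SD.
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)  # sample SD, n - 1 in the denominator
t_stat = mean_diff / (sd_diff / math.sqrt(n))

# Step 3: tabulated two-tailed 5% critical value for df = 29
# (an assumed table lookup, not computed here).
T_CRITICAL = 2.045

# Step 4: reject H0 when |t| exceeds the critical value.
reject_h0 = abs(t_stat) > T_CRITICAL

print(f"mean difference = {mean_diff:.4f}")
print(f"SD of differences = {sd_diff:.4f}")
print(f"t = {t_stat:.3f}, df = {n - 1}, reject H0: {reject_h0}")
```

Note that `statistics.stdev` divides by n – 1, which is the convention that matches the standard deviation SPSS reports for the paired differences.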

### Using SPSS

The above example can also be solved using the Statistical Package for the Social Sciences (SPSS) as follows:

Go to ‘Analyze’, then ‘Compare Means’, then ‘Paired-Samples T Test’. The SPSS output for the above example is as follows:

T-test

Paired Sample Statistics

| | | Mean | N | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|---|
| Pair 1 | Hypertension and anxiety after using the new drug | 5.3667 | 30 | 2.18905 | .39966 |
| | Hypertension and anxiety score before the new drug | 7.0000 | 30 | 1.55364 | .28365 |

Paired Sample Correlations

| | | N | Correlation | Sig. |
|---|---|---|---|---|
| Pair 1 | Hypertension and anxiety after using the new drug and hypertension and anxiety score before the new drug | 30 | .568 | .001 |

Paired Sample Test

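| | Mean | Std. Deviation | Std. Error Mean | 95% CI of the Difference: Lower | 95% CI of the Difference: Upper | t | df | Sig. (2-tailed) |
|---|---|---|---|---|---|---|---|---|
| Pair 1: Hypertension and anxiety after using the new drug – hypertension and anxiety score before the new drug | -1.6333 | 1.82857 | .33385 | -2.316 | -0.950 | -4.892 | 29 | .000 |

The first five value columns (mean through upper confidence limit) are the paired differences.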

SPSS displays the p-value rounded to three decimal places, so the reported value of .000 means p < 0.001. This is less than 0.01, which indicates that we can reject the null hypothesis even at the 1% significance level.
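The standard error and the 95% confidence interval in the output can be checked from the summary figures alone. The sketch below is an illustration under stated assumptions: the variable names are invented for the example, and the critical value 2.045 is the tabulated two-tailed 5% value for 29 degrees of freedom rather than something the code computes.

```python
import math

# Summary figures taken from the "Paired Sample Test" output.
mean_diff = -1.6333
sd_diff = 1.82857
n = 30

# Tabulated two-tailed 5% critical value of t for df = 29
# (an assumed table lookup, not computed here).
T_CRITICAL = 2.045

std_error = sd_diff / math.sqrt(n)   # standard error of the mean difference
margin = T_CRITICAL * std_error      # half-width of the interval
ci_lower = mean_diff - margin
ci_upper = mean_diff + margin

print(f"standard error = {std_error:.5f}")
print(f"95% CI = ({ci_lower:.3f}, {ci_upper:.3f})")
```

Because the whole interval lies below zero, it is consistent with rejecting the null hypothesis of no mean difference at the 5% level.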

### Assumptions

The assumptions of the paired sample t-test are as follows:

As a parametric method (a technique that estimates unknown parameters), the paired t-test makes several assumptions. Although t-tests are quite robust, it is good practice to assess the degree of deviation from these assumptions in order to evaluate the quality of the results.

In a paired sample t-test, the observations are defined as the differences between two sets of values, and each assumption refers to these differences, not the original data values. The test has four main assumptions:

1. The response variable must be continuous (interval/ratio), not categorical.
2. The observations are independent of one another.
3. The response variable should be approximately normally distributed.
4. The response variable must not contain any extreme values (outliers).
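One informal way to screen the differences for extreme values is the 1.5 × IQR box-plot rule. This rule is a common convention and is an assumption of this sketch, not something the worked example prescribes; the variable names are likewise illustrative. The check is applied to the differences from the drug example, since the assumptions refer to the differences rather than to the raw scores.

```python
import statistics

# Differences (after - before) for the 30 patients in the drug example.
diffs = [1, 1, 1, -1, -1, -2, -4, -4, -1, -2, -3, -1, 1, 1, 0,
         0, -1, -1, -1, -4, -6, -5, -2, -2, -3, -3, -2, -1, -2, -2]

# Quartiles of the differences (statistics.quantiles, default method).
q1, _median, q3 = statistics.quantiles(diffs, n=4)
iqr = q3 - q1

# Box-plot fences: values beyond 1.5 * IQR from the quartiles are flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [d for d in diffs if d < lower_fence or d > upper_fence]

print(f"Q1 = {q1}, Q3 = {q3}, fences = ({lower_fence}, {upper_fence})")
print(f"flagged differences: {outliers}")
```

A screen like this only addresses the fourth assumption; approximate normality of the differences would usually be judged separately, for example with a histogram or a normal probability plot.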

## McNemar’s test

### Using SPSS

Suppose that a therapist has developed a new technique that is used to treat drug addicts. The therapist tested 30 patients before using the new technique on them and categorized them into ‘drug addicts’ and ‘non-drug addicts’. Then, in order to test the impact of the new technique that he has developed, he used the technique on the same 30 patients and then again categorized them into ‘drug addicts’ and ‘non-drug addicts’. The paired data for this example is shown as follows:

It is difficult to draw any conclusions about the impact of the new technique just by looking at the paired data; thus, we need to conduct McNemar’s test. The test is conducted using SPSS and the output is shown as follows:

Descriptive Statistics

| | N | Mean | Std. Deviation | Minimum | Maximum |
|---|---|---|---|---|---|
| Before | 30 | .77 | .425 | 0 | 1 |
| After | 30 | .29 | .461 | 0 | 1 |

Wilcoxon Signed Ranks Test

| | | N | Mean Rank | Sum of Ranks |
|---|---|---|---|---|
| Before – After | Negative Ranks | 19a | 12.00 | 228.00 |
| | Positive Ranks | 4b | 12.00 | 48.00 |
| | Ties | 8c | | |
| | Total | 31 | | |

a. After < Before

b. After > Before

c. After = Before

Test statistics

| | After – Before |
|---|---|
| Z | -3.128b |
| Asymp. Sig. (2-tailed) | .002 |

a. Wilcoxon Signed Ranks Test

b. Based on positive ranks

McNemar’s test

Crosstabs – Before and After

Test statistics

| | Before – After |
|---|---|
| N | 31 |
| Exact Sig. (2-tailed) | .003b |

a. McNemar’s Test

b. Binomial distribution used

The p-value of 0.003 is less than 0.05 and thus we reject the null hypothesis. Finally, we conclude that the new technique was effective in turning patients from addicts to non-addicts.
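The footnote "Binomial distribution used" indicates that the reported significance comes from an exact test on the discordant pairs. A minimal sketch of that calculation is given below, assuming the 19 and 4 discordant pairs shown in the ranks table of the output; the helper name `exact_mcnemar_p` is hypothetical and not an SPSS or library function.

```python
from math import comb

def exact_mcnemar_p(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant counts.

    Under H0 each discordant pair is equally likely to fall in either
    discordant cell, so the count in one cell follows a
    Binomial(b + c, 0.5) distribution.
    """
    n = b + c
    k = min(b, c)
    # Probability of a split at least as lopsided as the one observed,
    # in one direction; doubled for a two-sided test and capped at 1.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

p_value = exact_mcnemar_p(19, 4)
print(f"exact two-sided p = {p_value:.3f}")
```

Only the discordant pairs enter the calculation; pairs whose classification did not change carry no information about the direction of change.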

### Using formulas

The steps of the McNemar’s test using formulas are as follows:

Step 1: State the hypothesis of the McNemar’s test

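H0 (null hypothesis): the proportion of patients classified as drug addicts is the same before and after the new technique, i.e., the new technique has no impact.

HA (alternative hypothesis): the proportion of patients classified as drug addicts differs before and after the new technique, i.e., the new technique has an impact.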

Step 2: Compute the test statistic

To run McNemar’s test, the data must be placed in a 2×2 contingency table whose cell frequencies add up to the number of pairs, n, as follows:

| | Test 2 positive | Test 2 negative | Row total |
|---|---|---|---|
| Test 1 positive | a | b | a + b |
| Test 1 negative | c | d | c + d |
| Column total | a + c | b + d | n |

Test statistic: $\chi^2 = (b - c)^2 / (b + c)$
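As a rough illustration of how the four cells might be tallied from raw paired outcomes, the sketch below counts a, b, c and d for a small set of made-up results. The six pairs and the helper name `tabulate_pairs` are hypothetical and are not the data from the addiction example.

```python
# Hypothetical paired outcomes for six subjects (True = positive result).
test1 = [True, True, False, True, False, False]
test2 = [True, False, False, False, True, False]

def tabulate_pairs(first, second):
    """Count the four cells (a, b, c, d) of the 2x2 table of paired outcomes."""
    a = b = c = d = 0
    for x, y in zip(first, second):
        if x and y:
            a += 1      # positive on both tests
        elif x and not y:
            b += 1      # positive on test 1, negative on test 2
        elif not x and y:
            c += 1      # negative on test 1, positive on test 2
        else:
            d += 1      # negative on both tests
    return a, b, c, d

a, b, c, d = tabulate_pairs(test1, test2)
print(f"a={a}, b={b}, c={c}, d={d}, n={a + b + c + d}")
```

Each subject contributes exactly one pair, so the four counts always sum to the number of pairs, and only b and c (the discordant cells) feed into the test statistic.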

In this example, the two discordant cells contain 4 and 19 pairs.

Thus, test statistic = $(4 - 19)^2 / (4 + 19) = 9.78$

Step 3: Identify the critical values / rejection region

The test statistic follows a chi-square distribution with 1 degree of freedom, and we choose a 5% significance level; thus, the rejection region is a test statistic of more than 3.84.

Step 4: Make a conclusion

The test statistic in the example is 9.78. This is more than 3.84 and thus we can reject H0 at the 5% significance level. Finally, we conclude that the new technique has been effective in treating addiction.
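The formula calculation can be sketched in Python as follows. This is an illustrative sketch: the helper name `mcnemar_chi_square` is hypothetical, the statistic is compared against the chi-square distribution with 1 degree of freedom (tabulated 5% critical value 3.84, assumed rather than computed), and the p-value uses the identity that the upper tail of that distribution equals erfc(√(x / 2)).

```python
import math

def mcnemar_chi_square(b: int, c: int):
    """Return the McNemar statistic (b - c)^2 / (b + c) and its p-value.

    The p-value is the upper tail of the chi-square distribution with
    1 degree of freedom, which equals erfc(sqrt(x / 2)).
    """
    statistic = (b - c) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Tabulated 5% critical value of chi-square with 1 degree of freedom
# (an assumed table lookup, not computed here).
CHI2_CRITICAL = 3.84

statistic, p_value = mcnemar_chi_square(4, 19)
print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
print(f"reject H0 at 5%: {statistic > CHI2_CRITICAL}")
```

The statistic is symmetric in b and c, so swapping the two discordant counts gives the same result.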

McNemar’s test is a non-parametric test for paired nominal data. It is used when you are interested in finding a change in proportion for paired data. For example, you could use this test to analyze retrospective case-control studies, where each case is matched with a control. It could also be used to analyze an experiment in which two treatments are given to matched pairs. The method is simple, quick and easy to perform. It also enables an appropriate confirmatory data analysis for situations dealing with paired dichotomous responses to surveys or experiments.

This test is sometimes referred to as McNemar’s chi-square test because the test statistic has a chi-square distribution. McNemar’s test is used to determine whether there are differences on a dichotomous response variable between two related groups. It can be considered similar to the paired sample t-test, but for a dichotomous rather than a continuous dependent variable.

However, unlike the paired sample t-test, it can be conceptualized as testing two different properties of a repeated-measures dichotomous variable. McNemar’s test is used to analyze pretest-posttest designs and is also commonly used in analyzing matched pairs and case-control studies. If you have more than two repeated measurements, you could use Cochran’s Q test.

A limitation of McNemar’s test is that it was designed for use with large samples. It assumes that the number of discordant pairs (b + c) is at least 10; if there are fewer than 10 discordant pairs, an exact binomial test is recommended instead.

### Assumptions

1. There must be one categorical response variable with two categories (i.e., a dichotomous variable) and one independent variable with two related groups.
2. The two categories of your response variable must be mutually exclusive. This means that participants cannot appear in more than one category.
3. Your sample must be a random sample.