Introduction
Paired data is defined as two sets of data that follow a one-to-one relationship between their values. Each data set has the same number of data points. Each data point in one data set is related to one, and only one, data point in the other data set.
An example of paired data is the blood pressure of patients before doing yoga (ie, readings before the intervention) and the blood pressure of the same patients after doing yoga (ie, readings after the intervention). The most widely used technique for testing the impact of an intervention (or for analyzing paired data) when the data are numeric is the paired sample t-test.
Paired Sample T-test
Using Formulas
A physician develops a new drug to treat hypertension and anxiety. The physician assesses the level of hypertension and anxiety in 30 patients before giving them the drug, and gives each patient a score from 0–10, with 0 indicating no hypertension and anxiety and 10 indicating the highest level of hypertension and anxiety (ie, the patient is at risk of suicide).
Then, to test the impact of the new drug, the physician gives it to the same 30 patients and scores their hypertension and anxiety again (see table below).
Hypertension and Anxiety Score
No.  Patient  Before  After
1  Alex  8  9
2  Bob  7  8
3  Cathy  8  9
4  Drake  8  7
5  Emily  5  4
6  Frank  5  3
7  George  8  4
8  Henry  7  3
9  Iris  9  8
10  Jack  9  7
11  Kate  7  4
12  Louis  7  6
13  Mathew  6  7
14  Nick  4  5
15  John  2  2
16  Paul  8  8
17  Ross  8  7
18  Rachel  7  6
19  Sam  9  8
20  Clark  8  4
21  David  7  1
22  Mark  7  2
23  Walker  6  4
24  Adam  7  5
25  Huss  8  5
26  Jones  7  4
27  Angelina  5  3
28  Brad  8  7
29  Tom  8  6
30  Dakota  7  5
It is difficult to draw clear conclusions about the impact of the new drug just by looking at the paired data above; thus, we need to conduct the paired sample t-test. The 4 steps of the paired sample t-test are outlined below.
Step 1: State the hypothesis
Null hypothesis (µd = 0): There is no difference between the hypertension and anxiety scores of patients before and after the intervention (ie, the new drug).
Alternative hypothesis (µd ≠ 0): There is a difference between the hypertension and anxiety scores of patients before and after the intervention (ie, the new drug).
Step 2: Compute the test statistic
Mean of the difference = –1.633 and standard deviation of the difference = 1.829. These values are computed as follows:
No.  Patient  Hypertension and Anxiety Score (Difference = After – Before)
1  Alex  1
2  Bob  1
3  Cathy  1
4  Drake  –1
5  Emily  –1
6  Frank  –2
7  George  –4
8  Henry  –4
9  Iris  –1
10  Jack  –2
11  Kate  –3
12  Louis  –1
13  Mathew  1
14  Nick  1
15  John  0
16  Paul  0
17  Ross  –1
18  Rachel  –1
19  Sam  –1
20  Clark  –4
21  David  –6
22  Mark  –5
23  Walker  –2
24  Adam  –2
25  Huss  –3
26  Jones  –3
27  Angelina  –2
28  Brad  –1
29  Tom  –2
30  Dakota  –2
Mean of difference = \bar{d} = (sum of the above differences) / 30 = –49 / 30 = –1.633
Standard deviation of difference = s_d = \sqrt{\sum_{i=1}^{30} (d_i - \bar{d})^2 / (30 - 1)} = 1.829
The t-statistic is thus equal to \bar{d} / (s_d / \sqrt{n}) = –1.633 / (1.829 / √30) = –4.89.
Step 3: Identify the critical values/rejection region
N = sample size of 30, giving N – 1 = 29 degrees of freedom, with a 5% significance level (two-tailed); thus, the rejection region is t < –2.04 or t > +2.04.
Step 4: Make a conclusion
The t-statistic in this example is –4.89. This is less than –2.04 and therefore we can reject the null hypothesis at the 5% significance level. Because the mean difference is negative (scores fell after the drug), we can conclude that the new drug has been effective in reducing the patients' hypertension and anxiety.
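The four steps above can be retraced in a few lines of Python. This is a minimal sketch using only the standard library: it recomputes the differences, their mean and sample standard deviation, and the t-statistic from the table of scores. The variable names are illustrative, and the critical value of about 2.045 (29 degrees of freedom, two-tailed 5%) is an assumption read from a standard t table rather than computed.

```python
import math
import statistics

# Scores from the table of 30 patients, in patient order (No. 1-30).
before = [8, 7, 8, 8, 5, 5, 8, 7, 9, 9, 7, 7, 6, 4, 2,
          8, 8, 7, 9, 8, 7, 7, 6, 7, 8, 7, 5, 8, 8, 7]
after = [9, 8, 9, 7, 4, 3, 4, 3, 8, 7, 4, 6, 7, 5, 2,
         8, 7, 6, 8, 4, 1, 2, 4, 5, 5, 4, 3, 7, 6, 5]

# Difference = After - Before for each patient.
diffs = [a - b for a, b in zip(after, before)]

n = len(diffs)
mean_d = statistics.mean(diffs)           # mean of the differences
sd_d = statistics.stdev(diffs)            # sample SD (n - 1 in the denominator)
t_stat = mean_d / (sd_d / math.sqrt(n))   # paired-sample t-statistic

# Assumed: two-tailed 5% critical value for n - 1 = 29 degrees of freedom,
# taken from a t table (not computed here).
T_CRIT = 2.045
reject_null = abs(t_stat) > T_CRIT

print(f"mean difference = {mean_d:.3f}")   # -1.633
print(f"sd of difference = {sd_d:.3f}")    # 1.829
print(f"t = {t_stat:.2f}")                 # -4.89
print(f"reject null at 5%: {reject_null}")
```

If SciPy is available, `scipy.stats.ttest_rel(after, before)` should give the same t-statistic together with a p-value.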
Using the Statistical Package for Social Sciences
This example can also be solved using the Statistical Package for Social Sciences (SPSS) as follows: Go to ‘Analyze’, then ‘Compare Means’, then ‘Paired Sample t-test’. The SPSS output for the above example is shown in the tables below.
T-test
Paired Sample Statistics
Pair  Variable  Mean  N  Standard Deviation  Standard Error Mean
Pair 1  Hypertension and anxiety after using the new drug  5.3667  30  2.18905  0.39966
Pair 1  Hypertension and anxiety score before the new drug  7.0000  30  1.55364  0.28365
Paired Sample Correlations
Pair  Variables  N  Correlation  Sig
Pair 1  Hypertension and anxiety after and before using the new drug  30  0.568  0.001
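The descriptive statistics and the correlation reported by SPSS can be cross-checked by hand. The sketch below is one way to do so with the Python standard library; `describe` and `pearson_r` are hypothetical helper names introduced here for illustration, not SPSS functions.

```python
import math
import statistics

# Scores from the table of 30 patients, in patient order (No. 1-30).
before = [8, 7, 8, 8, 5, 5, 8, 7, 9, 9, 7, 7, 6, 4, 2,
          8, 8, 7, 9, 8, 7, 7, 6, 7, 8, 7, 5, 8, 8, 7]
after = [9, 8, 9, 7, 4, 3, 4, 3, 8, 7, 4, 6, 7, 5, 2,
         8, 7, 6, 8, 4, 1, 2, 4, 5, 5, 4, 3, 7, 6, 5]


def describe(values):
    """Return (mean, sample SD, standard error of the mean)."""
    sd = statistics.stdev(values)
    return statistics.mean(values), sd, sd / math.sqrt(len(values))


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


after_mean, after_sd, after_se = describe(after)
before_mean, before_sd, before_se = describe(before)
r = pearson_r(after, before)

print(f"after:  mean={after_mean:.4f} sd={after_sd:.5f} se={after_se:.5f}")
print(f"before: mean={before_mean:.4f} sd={before_sd:.5f} se={before_se:.5f}")
print(f"correlation = {r:.3f}")  # 0.568
```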
Paired Sample Test
The p-value of 0.000 (ie, < 0.001) is less than 0.01, indicating that we can reject the null hypothesis even at the 1% significance level.
Assumptions
As a parametric method (a technique that estimates unknown parameters), the paired t-test makes several assumptions. Although t-tests are quite robust, it is important to assess the degree of deviation from these assumptions in order to evaluate the quality of the results.
In a paired sample t-test, the observations are defined as the differences between 2 sets of values, and each assumption refers to these differences, not the original data values. The paired t-test has 4 key assumptions:
 The dependent variable must be continuous (interval/ratio).
 The observations must be independent of one another.
 The dependent variable should be approximately normally distributed.
 The dependent variable should not contain any outliers.
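As a rough illustration of the last assumption, the differences from the drug example can be screened for extreme values. The sketch below uses the common 1.5 × IQR rule of thumb, which is an assumption made here and not a method named in this article; the result also depends on the quartile convention (Python's default "exclusive" method is used).

```python
import statistics

# Differences (After - Before) for the 30 patients in the drug example.
diffs = [1, 1, 1, -1, -1, -2, -4, -4, -1, -2, -3, -1, 1, 1, 0,
         0, -1, -1, -1, -4, -6, -5, -2, -2, -3, -3, -2, -1, -2, -2]

# Quartiles using the default ("exclusive") method of statistics.quantiles.
q1, median, q3 = statistics.quantiles(diffs, n=4)
iqr = q3 - q1

# Rule-of-thumb fences: values beyond 1.5 * IQR from the quartiles are flagged.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
extreme = [d for d in diffs if d < low_fence or d > high_fence]

print(f"Q1={q1}, Q3={q3}, IQR={iqr}")
print(f"fences: [{low_fence}, {high_fence}]")
print(f"flagged values: {extreme}")
```

A histogram or normal Q-Q plot of the same differences would be a natural companion check for the normality assumption.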
McNemar’s Test
Using SPSS
A therapist has developed a new technique to treat people with substance use disorders. The therapist tests 31 patients before using the new technique, and categorizes them into those with substance use disorders and those without this disorder. To test the impact of the new technique, the therapist uses it on the same 31 patients, using the 2 categories developed. The paired data are shown in the table below.
Pairs  Before  After
1  Without Disorder  With Substance Use Disorder
2  With Substance Use Disorder  Without Disorder
3  With Substance Use Disorder  Without Disorder
4  With Substance Use Disorder  Without Disorder
5  With Substance Use Disorder  Without Disorder
6  With Substance Use Disorder  With Substance Use Disorder
7  With Substance Use Disorder  With Substance Use Disorder
8  With Substance Use Disorder  Without Disorder
9  With Substance Use Disorder  Without Disorder
10  Without Disorder  With Substance Use Disorder
11  Without Disorder  Without Disorder
12  Without Disorder  Without Disorder
13  With Substance Use Disorder  Without Disorder
14  With Substance Use Disorder  Without Disorder
15  With Substance Use Disorder  With Substance Use Disorder
16  With Substance Use Disorder  With Substance Use Disorder
17  With Substance Use Disorder  Without Disorder
18  With Substance Use Disorder  Without Disorder
19  Without Disorder  With Substance Use Disorder
20  With Substance Use Disorder  Without Disorder
21  With Substance Use Disorder  Without Disorder
22  With Substance Use Disorder  Without Disorder
23  With Substance Use Disorder  Without Disorder
24  With Substance Use Disorder  Without Disorder
25  With Substance Use Disorder  With Substance Use Disorder
26  With Substance Use Disorder  Without Disorder
27  Without Disorder  Without Disorder
28  Without Disorder  With Substance Use Disorder
29  With Substance Use Disorder  Without Disorder
30  With Substance Use Disorder  Without Disorder
31  With Substance Use Disorder  Without Disorder
It is difficult to draw any conclusions about the impact of the new technique just by looking at the paired data above. It is therefore important to conduct McNemar’s test. The test is conducted using SPSS; the output is shown in the tables below.
Descriptive Statistics
Variable  N  Mean  Standard Deviation  Minimum  Maximum
Before  31  0.77  0.425  0  1
After  31  0.29  0.461  0  1
Wilcoxon Signed-Rank Test
Comparison  Rank type  N  Mean Rank  Sum of Ranks
Before to After  Negative Ranks  19^{a}  12.00  228.00
Before to After  Positive Ranks  4^{b}  12.00  48.00
Before to After  Ties  8^{c}
Before to After  Total  31
 a. After < Before
 b. After > Before
 c. After = Before
Test Statistics
Statistic  After to Before
Z  –3.128^{b}
Asymp Sig (2-tailed)  0.002
 a. Wilcoxon signed-rank test
 b. Based on positive ranks
Crosstabs: Before and After
Before  After: With substance use disorder  After: Without substance use disorder
With substance use disorder  5  19
Without substance use disorder  4  3
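The cross-tabulation can be reproduced by tallying the 31 before/after pairs directly. The following is a minimal sketch; the short labels `W` and `N` and the cell names `a`, `b`, `c`, `d` are illustrative choices, with `b` and `c` denoting the pairs whose status changed.

```python
from collections import Counter

# Hypothetical short labels for the two categories.
W = "with substance use disorder"
N = "without disorder"

# (before, after) status for pairs 1-31, copied from the paired-data table.
pairs = [
    (N, W), (W, N), (W, N), (W, N), (W, N), (W, W), (W, W), (W, N),
    (W, N), (N, W), (N, N), (N, N), (W, N), (W, N), (W, W), (W, W),
    (W, N), (W, N), (N, W), (W, N), (W, N), (W, N), (W, N), (W, N),
    (W, W), (W, N), (N, N), (N, W), (W, N), (W, N), (W, N),
]

counts = Counter(pairs)
a = counts[(W, W)]  # with disorder before and after
b = counts[(W, N)]  # with disorder before, without after
c = counts[(N, W)]  # without disorder before, with after
d = counts[(N, N)]  # without disorder before and after

print(f"before with:    {a:>3} {b:>3}")
print(f"before without: {c:>3} {d:>3}")
```

Only the off-diagonal cells `b` and `c` (the discordant pairs) enter McNemar's test.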
Test Statistics
Statistic  Before to After
N  31
Exact Sig (2-tailed)  0.003^{b}
 a. McNemar’s test
 b. Binomial distribution used
The p-value of 0.003 is < 0.05 and therefore we can reject the null hypothesis. It is possible to conclude that the new technique was effective in moving patients from having a substance use disorder to not having one.
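The footnote "Binomial distribution used" indicates that the reported significance is an exact one. A sketch of that calculation, assuming the usual exact form of McNemar's test: under the null hypothesis each of the 23 discordant pairs is equally likely to change in either direction, so the smaller discordant count follows a Binomial(23, 0.5) distribution. The variable names are illustrative.

```python
from math import comb

# Discordant pairs from the cross-tabulation.
b = 19  # with disorder before, without after
c = 4   # without disorder before, with after

n_discordant = b + c
k = min(b, c)

# One-sided tail: P(X <= k) for X ~ Binomial(n_discordant, 0.5).
tail = sum(comb(n_discordant, i) for i in range(k + 1)) / 2 ** n_discordant

# Two-sided exact p-value (capped at 1).
p_exact = min(1.0, 2 * tail)

print(f"exact two-sided p = {p_exact:.4f}")  # 0.0026, shown as 0.003 by SPSS
```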
Using Formulas
The steps of McNemar’s test using formulas are outlined below.
Step 1: State the hypothesis
Null hypothesis (µd = 0): The new technique has no impact.
Alternative hypothesis (µd ≠ 0): The new technique has an impact.
Step 2: Compute the test statistic
To run a McNemar test, your data must be placed in a 2 × 2 contingency table, with the cell frequencies equal to the number of pairs, as outlined in the table below:
Test result  Test 2 positive  Test 2 negative  Row total
Test 1 positive  a  b  a + b
Test 1 negative  c  d  c + d
Column total  a + c  b + d  n
Test statistic: \chi^2 = (b - c)^2 / (b + c).
The cell counts for this example are as follows:
Before  After: With substance use disorder  After: Without substance use disorder
With substance use disorder  5 (a)  19 (b)
Without substance use disorder  4 (c)  3 (d)
Thus, test statistic = (19 – 4)^2 / (19 + 4) = 225 / 23 = 9.78
Step 3: Identify the critical values/rejection region
N = sample size of 31 pairs, with a 5% significance level. The test statistic follows a chi-square distribution with 1 degree of freedom; thus, the rejection region is > 3.84.
Step 4: Make a conclusion
The test statistic in the example is 9.78. This is > 3.84 and thus we can reject the null hypothesis (H0) at the 5% significance level. Finally, we conclude that the new technique has been effective.
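The formula-based steps can be sketched in Python as follows, assuming a chi-square reference distribution with 1 degree of freedom. The 5% critical value of about 3.84 is taken from a standard chi-square table, and the p-value uses the closed form available for 1 degree of freedom; the variable names are illustrative.

```python
import math

# Discordant pairs from the 2 x 2 table.
b = 19  # with disorder before, without after
c = 4   # without disorder before, with after

# McNemar test statistic (no continuity correction).
chi2_stat = (b - c) ** 2 / (b + c)

# Assumed: 5% critical value of the chi-square distribution with 1 degree
# of freedom, read from a table.
CHI2_CRIT = 3.841
reject_null = chi2_stat > CHI2_CRIT

# For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2_stat / 2))

print(f"chi-square = {chi2_stat:.2f}")  # 9.78
print(f"reject null at 5%: {reject_null}")
print(f"approximate p-value = {p_value:.4f}")
```

The asymptotic p-value differs slightly from the exact binomial value reported by SPSS, which is expected with only 23 discordant pairs.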
McNemar’s test is a nonparametric test for paired nominal data. It is used when you are interested in finding a change in proportion for paired data. For example, the test could be used to analyze retrospective case-control studies, where each treatment is paired with a control. It could also be used to analyze an experiment in which 2 treatments are given to matched pairs. The method is simple, quick, and easy to perform. It also enables an appropriate confirmatory data analysis for situations dealing with paired dichotomous responses to surveys or experiments.
This test is sometimes referred to as McNemar’s chi-square test because the test statistic has a chi-square distribution. McNemar’s test is used to determine whether there are differences on a dichotomous dependent variable between 2 related groups. It can be considered similar to the paired-samples t-test, but for a dichotomous rather than a continuous dependent variable.
However, unlike the paired-samples t-test, it can be conceptualized as testing 2 different properties of a repeated-measures dichotomous variable. McNemar’s test is used to analyze pretest–posttest designs, as well as matched pairs and case-control studies. If you have more than 2 repeated measurements, you could use Cochran’s Q test.
One limitation of McNemar’s test is that it is intended for use with large samples. It also assumes that the number of discordant pairs (ie, b + c) is ≥ 10; hence, the use of an exact binomial test is recommended if the number of discordant pairs is < 10.
Assumptions
 There must be 1 categorical dependent variable with 2 categories (ie, a dichotomous variable) and 1 independent variable with 2 related groups.
 The 2 categories of the dependent variable must be mutually exclusive. This means that a participant cannot appear in more than 1 category at the same time.
 The sample must be random.