Paired data are data recorded for the same unit of analysis (e.g., a person or an animal) at different points in time, typically before and after an intervention. Paired data often arise when a researcher tests the impact of an intervention or conducts an experiment. The techniques used to analyze paired data (and thus to test the impact of an intervention) are the paired sample t-test and McNemar’s test.

Image: “statistics” by kulinetto. License: CC0 1.0


Introduction

An example of paired data is the blood pressure of patients before yoga (i.e., readings before the intervention) and the blood pressure of the same patients after yoga (i.e., readings after the intervention). When the outcome is numeric, the most widely used technique for analyzing paired data is the paired sample t-test. It is explained below with the help of a medical example.

Paired sample t-test

Using formulas

Suppose that a doctor has developed a new drug to treat hypertension and anxiety. The doctor assessed the level of hypertension and anxiety of 30 patients before giving them the drug and assigned each patient a score from 0 to 10, with 0 indicating no hypertension and anxiety and 10 indicating the highest level of hypertension and anxiety (i.e., the patient is at risk of suicide).

Then, in order to test the impact of the new drug that he has developed, he gave the drug to the same 30 patients and then tested the level of their hypertension and anxiety. The hypertension and anxiety scores before and after consuming the new drug are shown as follows:

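Hypertension – Anxiety Score

| # | Patient | Before | After |
|---|---------|--------|-------|
| 1 | Alex | 8 | 9 |
| 2 | Bob | 7 | 8 |
| 3 | Cathay | 8 | 9 |
| 4 | Drake | 8 | 7 |
| 5 | Emily | 5 | 4 |
| 6 | Frank | 5 | 3 |
| 7 | George | 8 | 4 |
| 8 | Hood | 7 | 3 |
| 9 | Iris | 9 | 8 |
| 10 | Jack | 9 | 7 |
| 11 | Kate | 7 | 4 |
| 12 | Luis | 7 | 6 |
| 13 | Mathew | 6 | 7 |
| 14 | Nick | 4 | 5 |
| 15 | John | 2 | 2 |
| 16 | Paul | 8 | 8 |
| 17 | Ross | 8 | 7 |
| 18 | Rachel | 7 | 6 |
| 19 | Smith | 9 | 8 |
| 20 | Clark | 8 | 4 |
| 21 | David | 7 | 1 |
| 22 | Mark | 7 | 2 |
| 23 | Walker | 6 | 4 |
| 24 | Adam | 7 | 5 |
| 25 | Huss | 8 | 5 |
| 26 | Jones | 7 | 4 |
| 27 | Angelina | 5 | 3 |
| 28 | Brad | 8 | 7 |
| 29 | Tom | 8 | 6 |
| 30 | Dakota | 7 | 5 |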

It is difficult to draw any conclusions about the impact of the new drug just by looking at the paired data above; thus, we need to conduct the paired sample t-test. The steps of the paired sample t-test are as follows:

Step 1: State the hypothesis of the paired test

Ho: Null Hypothesis: µd = 0: The mean difference between the hypertension and anxiety scores of patients before and after the intervention (i.e., the new drug) is zero.

HA: Alternative Hypothesis: µd ≠ 0: The mean difference between the hypertension and anxiety scores of patients before and after the intervention (i.e., the new drug) is not zero.

Step 2: Compute the test statistic


Mean of the difference = -1.633; Standard deviation of the difference = 1.829. These are computed as follows:

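Hypertension – Anxiety Score

| # | Patient | Difference = After – Before |
|---|---------|-----------------------------|
| 1 | Alex | 1 |
| 2 | Bob | 1 |
| 3 | Cathay | 1 |
| 4 | Drake | -1 |
| 5 | Emily | -1 |
| 6 | Frank | -2 |
| 7 | George | -4 |
| 8 | Hood | -4 |
| 9 | Iris | -1 |
| 10 | Jack | -2 |
| 11 | Kate | -3 |
| 12 | Luis | -1 |
| 13 | Mathew | 1 |
| 14 | Nick | 1 |
| 15 | John | 0 |
| 16 | Paul | 0 |
| 17 | Ross | -1 |
| 18 | Rachel | -1 |
| 19 | Smith | -1 |
| 20 | Clark | -4 |
| 21 | David | -6 |
| 22 | Mark | -5 |
| 23 | Walker | -2 |
| 24 | Adam | -2 |
| 25 | Huss | -3 |
| 26 | Jones | -3 |
| 27 | Angelina | -2 |
| 28 | Brad | -1 |
| 29 | Tom | -2 |
| 30 | Dakota | -2 |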

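Mean of the differences (sum of the above differences divided by 30):

$$\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i = \frac{-49}{30} = -1.633$$

Standard deviation of the differences (the sample standard deviation, which divides by n − 1 = 29):

$$s_d = \sqrt{\frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n - 1}} = \sqrt{\frac{96.967}{29}} = 1.829$$

The t-statistic is thus equal to:

$$t = \frac{\bar{d}}{s_d / \sqrt{n}} = \frac{-1.633}{1.829 / \sqrt{30}} = -4.89$$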
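The calculation in Step 2 can be reproduced in a few lines of Python. The following is a minimal sketch using only the standard library; the function name `paired_t` and the variable names are illustrative choices, not part of the original example. Note that `statistics.stdev` divides by n − 1, which is the convention SPSS uses.

```python
import math
import statistics

# Before/after scores for the 30 patients, in table order (Alex ... Dakota).
before = [8, 7, 8, 8, 5, 5, 8, 7, 9, 9, 7, 7, 6, 4, 2,
          8, 8, 7, 9, 8, 7, 7, 6, 7, 8, 7, 5, 8, 8, 7]
after = [9, 8, 9, 7, 4, 3, 4, 3, 8, 7, 4, 6, 7, 5, 2,
         8, 7, 6, 8, 4, 1, 2, 4, 5, 5, 4, 3, 7, 6, 5]


def paired_t(before, after):
    """Return (mean difference, sample SD of differences, t statistic)."""
    diffs = [a - b for a, b in zip(after, before)]  # After - Before
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)  # sample SD: divides by n - 1
    t = d_bar / (s_d / math.sqrt(n))
    return d_bar, s_d, t


d_bar, s_d, t_stat = paired_t(before, after)
print(f"mean diff = {d_bar:.3f}, sd = {s_d:.3f}, t = {t_stat:.2f}")
```

Run as written, this reports a mean difference of about -1.633, a standard deviation of about 1.829, and a t-statistic of about -4.89, in line with the SPSS output shown later in this article.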

Step 3: Identify the critical values / rejection region

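The sample size is n = 30, so the test has n − 1 = 29 degrees of freedom. We choose a 5% significance level (two-tailed); thus, the critical value of t is about 2.04 and the rejection region is t less than -2.04 or more than +2.04.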

Step 4: Make a conclusion

The t-statistic in the example is -4.89. This is less than -2.04, and thus we can reject Ho at the 5% significance level. Because the mean difference (After – Before) is negative, we conclude that the new drug has been effective in reducing the hypertension and anxiety of the patients.
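The same decision can be read from a 95% confidence interval for the mean difference: if the interval excludes 0, Ho is rejected at the 5% level. The sketch below is illustrative only. The critical value 2.045 is an assumed t-table value for 29 degrees of freedom (the article rounds it to 2.04), and the variable names are hypothetical.

```python
import math

n = 30
d_bar = -1.6333   # mean of the differences from Step 2
s_d = 1.8286      # sample standard deviation of the differences
t_crit = 2.045    # assumption: t-table value for df = 29, two-tailed 5%

se = s_d / math.sqrt(n)        # standard error of the mean difference
lower = d_bar - t_crit * se
upper = d_bar + t_crit * se
reject_h0 = not (lower <= 0 <= upper)  # interval excludes 0 -> reject Ho

print(f"95% CI: ({lower:.3f}, {upper:.3f}); reject Ho: {reject_h0}")
```

The interval runs from roughly -2.32 to -0.95, so it lies entirely below zero, which agrees with the conclusion of Step 4.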

Using SPSS

The above example can also be solved using the Statistical Package for Social Sciences (SPSS) as follows:

Go to ‘Analyze’, then ‘Compare Means’, then ‘Paired-Samples T Test’. The SPSS output for the above example is as follows:

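T-test

Paired Sample Statistics

| Pair 1 | Mean | N | Std. Deviation | Std. Error Mean |
|--------|------|---|----------------|-----------------|
| Hypertension and anxiety after using the new drug | 5.3667 | 30 | 2.18905 | .39966 |
| Hypertension and anxiety score before the new drug | 7.0000 | 30 | 1.55364 | .28365 |

Paired Sample Correlations

| Pair 1 | N | Correlation | Sig. |
|--------|---|-------------|------|
| Hypertension and anxiety after using the new drug and hypertension and anxiety score before the new drug | 30 | .568 | .001 |

Paired Sample Test

| Pair 1 | Mean | Std. Deviation | Std. Error Mean | 95% Confidence Interval of the Difference: Lower | 95% Confidence Interval of the Difference: Upper | t | df | Sig. (2-tailed) |
|--------|------|----------------|-----------------|-------|-------|---|----|-----------------|
| Hypertension and anxiety after using the new drug – hypertension and anxiety score before the new drug | -1.6333 | 1.82857 | .33385 | -2.316 | -0.950 | -4.892 | 29 | .000 |

The Mean, Std. Deviation, Std. Error Mean, and confidence interval columns are the ‘Paired Differences’.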

SPSS reports the p-value as .000, which means that it is smaller than 0.0005 (it is not exactly zero). It is therefore less than 0.01, and this indicates that we can reject the null hypothesis even at the 1% significance level.
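To see where such a p-value comes from without statistical software, the two-tailed probability can be approximated by numerically integrating the density of Student’s t distribution. The sketch below is one possible approach under stated assumptions: it uses Simpson’s rule with an arbitrarily chosen number of steps, and the function names `t_pdf` and `two_sided_p` are hypothetical. It is not how SPSS computes the value internally.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=20000):
    """Two-tailed p-value using Simpson's rule on [0, |t|] (steps must be even)."""
    a, b = 0.0, abs(t)
    h = (b - a) / steps
    total = t_pdf(a, df) + t_pdf(b, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(a + i * h, df)
    central_half = total * h / 3          # P(0 < T < |t|)
    return max(0.0, 2 * (0.5 - central_half))


p = two_sided_p(-4.892, df=29)
print(f"p = {p:.6f}")
```

For t = -4.892 with 29 degrees of freedom the result is far below 0.001, which is why SPSS displays it as .000.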

Assumptions

The assumptions of the paired sample t-test are as follows:

As a parametric method (a technique that estimates unknown parameters), the paired t-test makes several assumptions. Although t-tests are quite robust, it is good practice to assess the degree of deviation from these assumptions in order to evaluate the quality of the results.

In a paired sample t-test, the observations are defined as the differences between two sets of values, and every assumption refers to these differences, not to the original data values. The paired t-test has four main assumptions:

  1. The response variable must be continuous (interval/ratio), not categorical.
  2. The observations are independent of one another.
  3. The response variable should be approximately normally distributed.
  4. The response variable must not contain any outliers.
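As an illustration of checking the last assumption on the differences from the drug example, the sketch below flags values that fall outside 1.5 times the interquartile range. This rule is only one common convention and is an assumption here, not something the article prescribes; the function name `iqr_outliers` is hypothetical.

```python
import statistics

# Differences (After - Before) for the 30 patients, as listed in Step 2.
diffs = [1, 1, 1, -1, -1, -2, -4, -4, -1, -2, -3, -1, 1, 1, 0,
         0, -1, -1, -1, -4, -6, -5, -2, -2, -3, -3, -2, -1, -2, -2]


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


outliers = iqr_outliers(diffs)
print("outliers:", outliers)
```

With this rule no difference is flagged, although a histogram or normal probability plot of the differences would give a fuller picture of the normality assumption.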

McNemar’s test

Using SPSS

Suppose that a therapist has developed a new technique to treat drug addicts. The therapist assessed 31 patients before using the new technique on them and categorized them as ‘drug addicts’ or ‘non-drug addicts’. Then, in order to test the impact of the new technique, he used the technique on the same 31 patients and again categorized them as ‘drug addicts’ or ‘non-drug addicts’. The paired data for this example are shown as follows:

| Before | After |
|--------|-------|
| Non-drug addict | Drug addict |
| Drug addict | Non-drug addict |
| Drug addict | Non-drug addict |
| Drug addict | Non-drug addict |
| Drug addict | Non-drug addict |
| Drug addict | Drug addict |
| Drug addict | Drug addict |
| Drug addict | Non-drug addict |
| Drug addict | Non-drug addict |
| Non-drug addict | Drug addict |
| Non-drug addict | Non-drug addict |
| Non-drug addict | Non-drug addict |
| Drug addict | Non-drug addict |
| Drug addict | Non-drug addict |
| Drug addict | Drug addict |
| Drug addict | Drug addict |
| Drug addict | Non-drug addict |
| Drug addict | Non-drug addict |
| Non-drug addict | Drug addict |
| Drug addict | Non-drug addict |
| Drug addict | Non-drug addict |
| Drug addict | Non-drug addict |
| Drug addict | Non-drug addict |
| Drug addict | Non-drug addict |
| Drug addict | Drug addict |
| Drug addict | Non-drug addict |
| Non-drug addict | Non-drug addict |
| Non-drug addict | Drug addict |
| Drug addict | Non-drug addict |
| Drug addict | Non-drug addict |
| Drug addict | Non-drug addict |

It is difficult to draw any conclusions about the impact of the new technique just by looking at the paired data above; thus, we need to conduct McNemar’s test. The test is conducted using SPSS and the output is shown as follows:

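Descriptive Statistics

| | N | Mean | Std. Deviation | Minimum | Maximum |
|--------|----|------|----------------|---------|---------|
| Before | 31 | .77 | .425 | 0 | 1 |
| After | 31 | .29 | .461 | 0 | 1 |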

Wilcoxon Signed Ranks Test

| After – Before | N | Mean Rank | Sum of Ranks |
|----------------|---|-----------|--------------|
| Negative Ranks | 19 (a) | 12.00 | 228.00 |
| Positive Ranks | 4 (b) | 12.00 | 48.00 |
| Ties | 8 (c) | | |
| Total | 31 | | |

  a. After < Before
  b. After > Before
  c. After = Before

Test statistics (a)

| | After – Before |
|---|----------------|
| Z | -3.128 (b) |
| Asymp. Sig. (2-tailed) | .002 |

  a. Wilcoxon Signed Ranks Test
  b. Based on positive ranks
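McNemar’s test

Crosstabs – Before and After (rows: Before; columns: After)

| Before \ After | Non-drug addict | Drug addict |
|----------------|-----------------|-------------|
| Non-drug addict | 3 | 4 |
| Drug addict | 19 | 5 |

Test statistics (a)

| | Before – After |
|---|----------------|
| N | 31 |
| Exact Sig. (2-tailed) | .003 (b) |

  a. McNemar’s Test
  b. Binomial distribution used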

The p-value of 0.003 is less than 0.05 and thus we reject the null hypothesis. Finally, we conclude that the new technique was effective in turning patients from addicts to non-addicts.
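The exact significance that SPSS reports is based on the binomial distribution: under the null hypothesis, each patient who changed status is equally likely to have changed in either direction. A minimal sketch of that calculation follows, assuming the usual two-sided convention of doubling the smaller tail; the function name `mcnemar_exact_p` is hypothetical.

```python
from math import comb


def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar p-value.

    b and c are the two discordant cell counts. Under Ho the smaller
    count follows a Binomial(b + c, 0.5) distribution.
    """
    n = b + c
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n  # P(X <= k)
    return min(1.0, 2 * tail)


# 19 patients changed from addict to non-addict, 4 changed the other way.
p_exact = mcnemar_exact_p(b=19, c=4)
print(f"exact p = {p_exact:.4f}")
```

For the counts in the crosstab (19 and 4) this gives about 0.0026, which rounds to the .003 shown in the SPSS output.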

Using formulas

The steps of the McNemar’s test using formulas are as follows:

Step 1: State the hypothesis of the McNemar’s test

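Ho: Null Hypothesis: The new technique has no impact, i.e., the proportion of drug addicts is the same before and after the technique is used (a patient is equally likely to change from addict to non-addict as from non-addict to addict).

HA: Alternative Hypothesis: The new technique has an impact, i.e., the proportion of drug addicts before the technique differs from the proportion after it.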

Step 2: Compute the test statistic

To run McNemar’s test, the data must be placed in a 2×2 contingency table whose cell frequencies add up to the number of pairs, as follows:

| | Test 2 positive | Test 2 negative | Row total |
|-----------------|-----------------|-----------------|-----------|
| Test 1 positive | a | b | a + b |
| Test 1 negative | c | d | c + d |
| Column total | a + c | b + d | n |

The test statistic uses only the discordant cells b and c:

$$\chi^2 = \frac{(b - c)^2}{b + c}$$

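The cells for this example are as follows (rows: Before; columns: After):

| Before \ After | Drug addict | Non-drug addict |
|----------------|-------------|-----------------|
| Drug addict | 5 | 19 |
| Non-drug addict | 4 | 3 |

Thus, with b = 19 and c = 4, the test statistic is:

$$\chi^2 = \frac{(19 - 4)^2}{19 + 4} = \frac{225}{23} = 9.78$$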
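The 2×2 cell counts and the test statistic can also be obtained directly from the paired labels. The sketch below encodes the before and after columns of the paired data as strings (D for drug addict, N for non-drug addict, in the order listed in the data table); this encoding and the variable names are illustrative assumptions.

```python
from collections import Counter

# Paired labels in table order: D = drug addict, N = non-drug addict.
before = "NDDDDDDDDNNNDDDDDDNDDDDDDDNNDDD"
after = "DNNNNDDNNDNNNNDDNNDNNNNNDNNDNNN"

counts = Counter(zip(before, after))
a = counts[("D", "D")]  # addict before and after
b = counts[("D", "N")]  # addict before, non-addict after
c = counts[("N", "D")]  # non-addict before, addict after
d = counts[("N", "N")]  # non-addict before and after

# McNemar's statistic uses only the discordant cells b and c.
chi_square = (b - c) ** 2 / (b + c)
print(f"a={a}, b={b}, c={c}, d={d}, chi-square={chi_square:.2f}")
```

The counts come out as a = 5, b = 19, c = 4, and d = 3, giving a statistic of about 9.78.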

Step 3: Identify the critical values / rejection region

The test statistic follows a chi-square distribution with 1 degree of freedom. We choose a 5% significance level; thus, the critical value is 3.84 and the rejection region is any value of the test statistic greater than 3.84.

Step 4: Make a conclusion

The test statistic in the example is 9.78. This is more than 3.84, and thus we can reject Ho at the 5% significance level. Finally, we conclude that the new technique has been effective in treating addiction.
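McNemar’s statistic is usually referred to a chi-square distribution with 1 degree of freedom, for which the 5% critical value is about 3.84. For 1 degree of freedom the upper-tail probability has a closed form through the complementary error function, so a p-value can be sketched with the standard library alone. The function name `chi2_sf_1df` is hypothetical.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 df.

    If Z is standard normal, Z**2 is chi-square with 1 df, so
    P(Z**2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2))


chi_square = (19 - 4) ** 2 / (19 + 4)   # statistic from Step 2
p_value = chi2_sf_1df(chi_square)
print(f"chi-square = {chi_square:.2f}, p = {p_value:.4f}")
```

The resulting p-value is well below 0.05 and is of the same order as the exact binomial significance in the SPSS output; the two differ slightly because the chi-square form is a large-sample approximation.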

McNemar’s test is a non-parametric test for paired nominal data. It is used when you are interested in finding a change in proportion for paired data. For example, you could use this test to analyze retrospective case-control studies, where every case is matched with a control. It could also be used to analyze an experiment in which two treatments are given to matched pairs. The method is simple, quick, and easy to perform. It also enables an appropriate confirmatory data analysis for situations dealing with paired dichotomous responses to surveys or experiments.

This test is sometimes referred to as McNemar’s chi-square test because the test statistic has a chi-square distribution. McNemar’s test is used to determine whether there are differences on a dichotomous dependent variable between two related groups. It can be considered similar to the paired sample t-test, but for a dichotomous rather than a continuous dependent variable.

However, unlike the paired sample t-test, it can be conceptualized as testing two different properties of a repeated-measures dichotomous variable. McNemar’s test is used to analyze pretest-posttest designs and is also commonly used in analyzing matched pairs and case-control studies. If you have more than two repeated measurements, you could use Cochran’s Q test.

A limitation of McNemar’s test is that it was designed for use with large samples. The chi-square approximation assumes that the number of discordant pairs (b + c) is at least 10; if there are fewer than 10 discordant pairs, an exact binomial test is recommended.

Assumptions

  1. There must be one categorical dependent variable with two categories (i.e., a dichotomous variable) and one categorical independent variable with two related groups.
  2. The two groups in your response variable must be mutually exclusive. This means that participants cannot appear in more than one group.
  3. Your sample must be a random sample.