
Statistical Inference for Paired Data

Image: “statistics” by kulinetto. License: CC0 1.0


Introduction

Paired data is defined as two sets of data that follow a one-to-one relationship between their values. Each data set has the same number of data points. Each data point in one data set is related to one, and only one, data point in the other data set.

An example of paired data is the blood pressure of patients before doing yoga (ie, readings before the intervention) and the blood pressure of the same patients after doing yoga (ie, readings after the intervention). The most widely used technique for testing the impact of an intervention on paired numeric data is the paired sample t-test.

Paired Sample T-test

Using Formulas

A physician develops a new drug that is used to treat hypertension and anxiety. The physician assesses the level of hypertension and anxiety in 30 patients before giving them the drug, and gives each patient a score from 0–10, with 0 indicating no hypertension and anxiety and 10 indicating the highest level of hypertension and anxiety (ie, the patient is at risk of suicide).

Then, to test the impact of the new drug that the physician has developed, they give the drug to the same 30 patients and then test the level of the patients’ hypertension and anxiety (see table below).

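| No. | Patient | Hypertension and Anxiety Score: Before | Hypertension and Anxiety Score: After |
|---|---|---|---|
| 1 | Alex | 8 | 9 |
| 2 | Bob | 7 | 8 |
| 3 | Cathy | 8 | 9 |
| 4 | Drake | 8 | 7 |
| 5 | Emily | 5 | 4 |
| 6 | Frank | 5 | 3 |
| 7 | George | 8 | 4 |
| 8 | Henry | 7 | 3 |
| 9 | Iris | 9 | 8 |
| 10 | Jack | 9 | 7 |
| 11 | Kate | 7 | 4 |
| 12 | Louis | 7 | 6 |
| 13 | Mathew | 6 | 7 |
| 14 | Nick | 4 | 5 |
| 15 | John | 2 | 2 |
| 16 | Paul | 8 | 8 |
| 17 | Ross | 8 | 7 |
| 18 | Rachel | 7 | 6 |
| 19 | Sam | 9 | 8 |
| 20 | Clark | 8 | 4 |
| 21 | David | 7 | 1 |
| 22 | Mark | 7 | 2 |
| 23 | Walker | 6 | 4 |
| 24 | Adam | 7 | 5 |
| 25 | Huss | 8 | 5 |
| 26 | Jones | 7 | 4 |
| 27 | Angelina | 5 | 3 |
| 28 | Brad | 8 | 7 |
| 29 | Tom | 8 | 6 |
| 30 | Dakota | 7 | 5 |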

It is difficult to draw clear conclusions about the impact of the new drug by just seeing the paired data above; thus, we need to conduct the paired sample t-test. The 4 steps of the paired sample t-test are outlined below.

Step 1: State the hypothesis

Null hypothesis (µd = 0): There is no difference between the mean hypertension and anxiety scores of patients before and after the intervention (ie, the new drug).

Alternative hypothesis (µd ≠ 0): There is a difference between the mean hypertension and anxiety scores of patients before and after the intervention (ie, the new drug).

Step 2: Compute the test statistic

Mean of the difference = –1.633 and standard deviation of the difference = 1.829. These values are computed as follows:

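| No. | Patient | Hypertension and Anxiety Score (Difference = After – Before) |
|---|---|---|
| 1 | Alex | 1 |
| 2 | Bob | 1 |
| 3 | Cathy | 1 |
| 4 | Drake | -1 |
| 5 | Emily | -1 |
| 6 | Frank | -2 |
| 7 | George | -4 |
| 8 | Henry | -4 |
| 9 | Iris | -1 |
| 10 | Jack | -2 |
| 11 | Kate | -3 |
| 12 | Louis | -1 |
| 13 | Mathew | 1 |
| 14 | Nick | 1 |
| 15 | John | 0 |
| 16 | Paul | 0 |
| 17 | Ross | -1 |
| 18 | Rachel | -1 |
| 19 | Sam | -1 |
| 20 | Clark | -4 |
| 21 | David | -6 |
| 22 | Mark | -5 |
| 23 | Walker | -2 |
| 24 | Adam | -2 |
| 25 | Huss | -3 |
| 26 | Jones | -3 |
| 27 | Angelina | -2 |
| 28 | Brad | -1 |
| 29 | Tom | -2 |
| 30 | Dakota | -2 |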

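Mean of the differences (sum of the differences above divided by 30):

$$\bar{d} = \frac{\sum_{i=1}^{30} d_i}{30} = \frac{-49}{30} = -1.633$$

Standard deviation of the differences (sample standard deviation, with n – 1 = 29 in the denominator):

$$s_d = \sqrt{\frac{\sum_{i=1}^{30} (d_i - \bar{d})^2}{30 - 1}} = 1.829$$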

The t-statistic is thus equal to –1.633 / (1.829 / √30) = –4.89.
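The computation in Step 2 can be reproduced with a short Python sketch. It assumes the differences are entered in table order and uses the sample standard deviation (n – 1 denominator); the function and variable names are illustrative only.

```python
import math
import statistics

# Differences (After - Before) for the 30 patients, in table order.
differences = [1, 1, 1, -1, -1, -2, -4, -4, -1, -2,
               -3, -1, 1, 1, 0, 0, -1, -1, -1, -4,
               -6, -5, -2, -2, -3, -3, -2, -1, -2, -2]

def paired_t_statistic(diffs):
    """Return (mean, sample standard deviation, t) for paired differences."""
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)          # sample SD, n - 1 denominator
    t = mean_d / (sd_d / math.sqrt(n))      # mean divided by its standard error
    return mean_d, sd_d, t

mean_d, sd_d, t_stat = paired_t_statistic(differences)
print(f"mean = {mean_d:.3f}, sd = {sd_d:.3f}, t = {t_stat:.2f}")
# mean = -1.633, sd = 1.829, t = -4.89
```

The negative sign reflects that scores were, on average, lower after the drug than before.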

Step 3: Identify the critical values/rejection region

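With a sample size of n = 30, the test has n – 1 = 29 degrees of freedom. At a 5% significance level (two-tailed), the critical value is approximately 2.04; thus, the rejection region is t < –2.04 or t > +2.04.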

Step 4: Make a conclusion

The t-statistic in this example is –4.89. This is less than –2.04, and therefore we can reject the null hypothesis at the 5% significance level. Because the mean difference is negative (scores are lower after the drug), we can conclude that the new drug has been effective in reducing the patients' hypertension and anxiety.

Using the Statistical Package for Social Sciences

This example can also be solved using the Statistical Package for Social Sciences (SPSS) as follows: Go to ‘Analyze’, then ‘Compare Means’, then ‘Paired-Samples T Test’. The SPSS output for the above example is outlined in the tables below.

T-test

Paired Sample Statistics

| Pair 1 | Mean | N | Standard Deviation | Standard Error Mean |
|---|---|---|---|---|
| Hypertension and anxiety score after using the new drug | 5.3667 | 30 | 2.18905 | 0.39966 |
| Hypertension and anxiety score before the new drug | 7.0000 | 30 | 1.55364 | 0.28365 |

Paired Sample Correlations

| Pair 1 | N | Correlation | Sig |
|---|---|---|---|
| Hypertension and anxiety after and before using the new drug | 30 | 0.568 | 0.001 |

Paired Sample Test

The p-value, which SPSS reports as 0.000 (ie, p < 0.001), is less than 0.01, indicating that we can reject the null hypothesis even at the 1% significance level.
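As a rough cross-check of that p-value without SPSS, the sketch below integrates the Student's t density numerically using only the Python standard library. The inputs are the two SPSS means and the standard deviation of the differences (1.829) quoted in the text; the helper names and the choice of Simpson's rule are assumptions made for illustration, not the method SPSS uses.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p_value(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total      # P(-|t| < T < |t|)
    return 1 - central_area

n = 30
mean_diff = 5.3667 - 7.0000                 # after minus before, from the SPSS means
sd_diff = 1.829                             # SD of the differences, from the text
t_stat = mean_diff / (sd_diff / math.sqrt(n))
p_value = two_sided_p_value(t_stat, df=n - 1)
print(f"t = {t_stat:.2f}, p < 0.001: {p_value < 0.001}")
```

A statistics library would normally be used for this in practice; the point here is only that the area in both tails beyond the observed t is far smaller than 0.01.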

Assumptions

As a parametric method (a technique that estimates unknown parameters), the paired t-test makes several assumptions. Although the test is fairly robust, it is important to assess the degree of deviation from these assumptions in order to judge the quality of the results.

In a paired sample t-test, the observations are defined as the differences between 2 sets of values, and each assumption refers to these differences, not to the original data values. The paired t-test has 4 key assumptions:

  1. The response variable must be continuous (interval/ratio), not categorical.
  2. The observations are independent of one another.
  3. The response variable needs to be approximately normally distributed.
  4. The response variable must not contain any extreme values (outliers).
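One way to screen the differences for extreme values is the common 1.5 × IQR rule, sketched below in Python. The rule, the multiplier, and the function name are assumptions chosen for illustration; the source does not prescribe a specific outlier check, and normality would need a separate check (for example, a Q-Q plot).

```python
import statistics

# Differences (After - Before) for the 30 patients in the drug example.
differences = [1, 1, 1, -1, -1, -2, -4, -4, -1, -2,
               -3, -1, 1, 1, 0, 0, -1, -1, -1, -4,
               -6, -5, -2, -2, -3, -3, -2, -1, -2, -2]

def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

outliers = iqr_outliers(differences)
print("Flagged differences:", outliers)
```

Under this rule, none of the 30 differences is flagged, so the extreme-value assumption does not look problematic for this data set.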

McNemar’s Test

Using SPSS

A therapist has developed a new technique to treat people with substance use disorders. The therapist tests 31 patients before using the new technique, and categorizes them into those with substance use disorders and those without this disorder. To test the impact of the new technique, the therapist uses it on the same 31 patients, using the 2 categories developed. The paired data are shown in the table below.

| Pairs | Before | After |
|---|---|---|
| 1 | Without Disorder | With Substance Use Disorder |
| 2 | With Substance Use Disorder | Without Disorder |
| 3 | With Substance Use Disorder | Without Disorder |
| 4 | With Substance Use Disorder | Without Disorder |
| 5 | With Substance Use Disorder | Without Disorder |
| 6 | With Substance Use Disorder | With Substance Use Disorder |
| 7 | With Substance Use Disorder | With Substance Use Disorder |
| 8 | With Substance Use Disorder | Without Disorder |
| 9 | With Substance Use Disorder | Without Disorder |
| 10 | Without Disorder | With Substance Use Disorder |
| 11 | Without Disorder | Without Disorder |
| 12 | Without Disorder | Without Disorder |
| 13 | With Substance Use Disorder | Without Disorder |
| 14 | With Substance Use Disorder | Without Disorder |
| 15 | With Substance Use Disorder | With Substance Use Disorder |
| 16 | With Substance Use Disorder | With Substance Use Disorder |
| 17 | With Substance Use Disorder | Without Disorder |
| 18 | With Substance Use Disorder | Without Disorder |
| 19 | Without Disorder | With Substance Use Disorder |
| 20 | With Substance Use Disorder | Without Disorder |
| 21 | With Substance Use Disorder | Without Disorder |
| 22 | With Substance Use Disorder | Without Disorder |
| 23 | With Substance Use Disorder | Without Disorder |
| 24 | With Substance Use Disorder | Without Disorder |
| 25 | With Substance Use Disorder | With Substance Use Disorder |
| 26 | With Substance Use Disorder | Without Disorder |
| 27 | Without Disorder | Without Disorder |
| 28 | Without Disorder | With Substance Use Disorder |
| 29 | With Substance Use Disorder | Without Disorder |
| 30 | With Substance Use Disorder | Without Disorder |
| 31 | With Substance Use Disorder | Without Disorder |

It is difficult to draw any conclusions about the impact of the new technique by just seeing the paired data above. It is important, therefore, to conduct the McNemar’s test. The test is conducted using SPSS. The output is shown in the tables below.

Descriptive Statistics

| | N | Mean | Standard Deviation | Minimum | Maximum |
|---|---|---|---|---|---|
| Before | 31 | 0.77 | 0.425 | 0 | 1 |
| After | 31 | 0.29 | 0.461 | 0 | 1 |

Wilcoxon Signed-Rank Test

| Before to After | N | Mean Rank | Sum of Ranks |
|---|---|---|---|
| Negative Ranks | 19a | 12.00 | 228.00 |
| Positive Ranks | 4b | 12.00 | 48.00 |
| Ties | 8c | | |
| Total | 31 | | |

  a. After < Before
  b. After > Before
  c. After = Before

Test Statistics

| | After to Before |
|---|---|
| Z | –3.128b |
| Asymp Sig (2-tailed) | 0.002 |

  a. Wilcoxon signed-rank test
  b. Based on positive ranks

Crosstabs: Before and After

| Before (rows) / After (columns) | With substance use disorder | Without substance use disorder |
|---|---|---|
| With substance use disorder | 5 | 19 |
| Without substance use disorder | 4 | 3 |

Test Statistics

| | Before to After |
|---|---|
| N | 31 |
| Exact Sig (2-tailed) | 0.003b |

  a. McNemar’s test
  b. Binomial distribution used

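The p-value of 0.003 is < 0.05, and therefore we can reject the null hypothesis. Because far more patients moved from having the disorder to not having it (19) than in the opposite direction (4), it is possible to conclude that the new technique was effective in reducing the number of patients with substance use disorder.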
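The footnote "Binomial distribution used" indicates an exact test on the discordant pairs. A minimal Python sketch of that calculation follows, assuming the usual exact form of McNemar's test (the discordant pairs are treated as a Binomial(b + c, 0.5) sample under the null hypothesis); the function name is illustrative.

```python
from math import comb

def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar p-value from the two discordant counts."""
    n = b + c                               # number of discordant pairs
    k = min(b, c)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# 19 patients changed from "with" to "without"; 4 changed the other way.
p_exact = mcnemar_exact_p(19, 4)
print(f"{p_exact:.3f}")   # 0.003
```

Only the 23 discordant pairs enter the calculation; the 8 patients whose status did not change carry no information about the direction of change.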

Using Formulas

The steps of the McNemar’s test using formulas are outlined below.

Step 1: State the hypothesis

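Null hypothesis (H0): The new technique has no impact (ie, the proportion of patients with substance use disorder is the same before and after the technique).

Alternative hypothesis (H1): The new technique has an impact (ie, the proportion of patients with substance use disorder differs before and after the technique).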

Step 2: Compute the test statistic

To run a McNemar test, your data must be placed in a 2 × 2 contingency table, with the cell frequencies equal to the number of pairs, as outlined in the table below:

| | Test 2 positive | Test 2 negative | Total |
|---|---|---|---|
| Test 1 positive | a | b | a + b |
| Test 1 negative | c | d | c + d |
| Column total | a + c | b + d | n |

Test statistic:

$$\chi^2 = \frac{(b - c)^2}{b + c}$$

The cell counts for this example are as follows:

| Before (rows) / After (columns) | With substance use disorder | Without substance use disorder |
|---|---|---|
| With substance use disorder | 5 | 19 |
| Without substance use disorder | 4 | 3 |

Thus, test statistic = $(19 - 4)^2 / (19 + 4) = 225 / 23 = 9.78$
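The cell counts and the statistic can be derived directly from the 31 pairs. The Python sketch below assumes a 1/0 coding (1 = with substance use disorder, 0 = without disorder), which matches the means shown in the descriptive statistics; the variable names are illustrative.

```python
from collections import Counter

# Status of the 31 patients in table order: 1 = with disorder, 0 = without.
before = [0, 1, 1, 1, 1, 1, 1, 1, 1, 0,
          0, 0, 1, 1, 1, 1, 1, 1, 0, 1,
          1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1]
after = [1, 0, 0, 0, 0, 1, 1, 0, 0, 1,
         0, 0, 0, 0, 1, 1, 0, 0, 1, 0,
         0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0]

counts = Counter(zip(before, after))
a = counts[(1, 1)]   # with disorder before and after
b = counts[(1, 0)]   # with before, without after
c = counts[(0, 1)]   # without before, with after
d = counts[(0, 0)]   # without disorder before and after

# McNemar statistic uses only the discordant cells b and c.
statistic = (b - c) ** 2 / (b + c)
print(a, b, c, d, round(statistic, 2))   # 5 19 4 3 9.78
```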

Step 3: Identify the critical values/rejection region

The McNemar test statistic follows a chi-square distribution with 1 degree of freedom. At a 5% significance level, the critical value is 3.84; thus, the rejection region is a test statistic > 3.84.

Step 4: Make a conclusion

The test statistic in the example is 9.78. This is > 3.84, and thus we can reject H0 at the 5% significance level. Finally, we conclude that the new treatment has been effective.
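McNemar's statistic is conventionally referred to a chi-square distribution with 1 degree of freedom, for which the 5% critical value is 3.84. The Python sketch below applies that decision rule and also computes the corresponding p-value, using the identity that a chi-square variable with 1 degree of freedom is the square of a standard normal variable; the helper name is illustrative.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    # P(Z^2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for standard normal Z.
    return math.erfc(math.sqrt(x / 2))

statistic = (19 - 4) ** 2 / (19 + 4)    # McNemar statistic from the example
critical_value = 3.84                   # chi-square, 1 df, 5% level (table value)

reject_null = statistic > critical_value
p_value = chi2_sf_1df(statistic)
print(f"statistic = {statistic:.2f}, reject H0: {reject_null}, p = {p_value:.4f}")
```

The asymptotic p-value comes out close to, but not identical with, the exact binomial value of 0.003 reported by SPSS, which is expected because the chi-square form is an approximation.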

This test is a non-parametric test for paired nominal data. It is used when you are interested in finding a change in proportion for the paired data. For example, this test could be used to analyze retrospective case-control studies, where every case is matched with a control. It could also be used to analyze a trial in which 2 treatments are given to matched pairs. The method is simple, quick, and easy to perform. It also enables an appropriate confirmatory data analysis for situations dealing with paired dichotomous responses to surveys or experiments.

This test is sometimes referred to as McNemar’s chi-square test because the test statistic has a chi-square distribution. McNemar’s test is used to determine whether there are differences on a dichotomous response variable between 2 related groups. It can be considered similar to the paired sample t-test, but for a dichotomous rather than a continuous dependent variable.

However, unlike the paired sample t-test, it can be conceptualized as testing 2 different properties of a repeated-measures dichotomous variable. McNemar’s test is used to analyze pretest–posttest designs and to analyze matched pairs and case-control studies. If you have more than 2 repeated measurements, you could use Cochran’s Q test.

One limitation of the McNemar test is that it is designed to be used with large samples. It also assumes that the number of discordant pairs (ie, b + c) is ≥ 10; hence, the use of an exact binomial test is recommended if the number of discordant pairs is < 10.

Assumptions

  1. There must be 1 categorical (dichotomous) response variable with 2 categories and 1 independent variable with 2 related groups.
  2. The 2 groups in the response variable must be mutually exclusive. This means that participants cannot appear in more than 1 group.
  3. The sample must be random.