
## Estimation of Proportions Using Intervals

When several samples are drawn from the same population, the **proportion of successes can be computed for each sample**. Because every sample gives a slightly different proportion, there are several different answers on record. This variation is why sampling distributions are studied. If we ask which of these sample proportions is the correct value of the population proportion, the answer is that most likely none of them is exactly correct.

Interval estimation of a proportion **describes an entire population using a range of reasonable values for the population proportion**. The interval is built so that, at a stated level of confidence, it captures the true value of the population parameter. Such an interval is known as a confidence interval. Instead of reporting a single estimate, it reports a range of plausible values for the population proportion.

**Example**: Suppose we want to estimate the proportion of undergraduates who permanently reside in southern Mexico. We take a random sample of 25 undergraduates and compute the sample proportion. Suppose 9 of the 25 students chosen are permanent residents of southern Mexico.

The sample proportion comes out as follows.

$$\hat{p} = \frac{9}{25} = 0.36$$

Because of the "luck of the draw", the true population proportion is almost certainly not exactly 0.36. To find an interval around the sample proportion that captures the population proportion with at least 95 % confidence, we need the standard deviation of the sample proportion:

$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$$

where $p$ is the population proportion and $n$ is the sample size.
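As a rough illustration, the sample proportion and its estimated standard deviation for the 9-out-of-25 example can be computed as below. The function name `proportion_se` is hypothetical, and the sketch assumes the sample proportion is plugged in for the unknown population proportion.

```python
import math

def proportion_se(successes, n):
    """Return (sample proportion, estimated SD of the sample proportion).

    Assumption: the unknown population proportion is replaced by the
    sample proportion inside the square root.
    """
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, se

# Values taken from the example: 9 of 25 sampled students.
p_hat, se = proportion_se(9, 25)
print(f"p_hat = {p_hat:.2f}, estimated SD = {se:.3f}")
```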

## Creation of Confidence Interval

There are **four basic steps involved in constructing a confidence interval** given as follows:

**The first step** involves **identification of the sample statistic**. The sample statistic (for example the sample mean or the sample proportion) is used to estimate the population parameter.

**The second step** involves **selection of a confidence level**. The confidence level describes the uncertainty of the sampling method. Normally a 90 %, 95 % or 99 % confidence level is used.

**The third step** is to **calculate the margin of error**, using the following equation:

**Margin of error = Critical value * Standard deviation of statistic**

**The fourth** and last step is to **specify the confidence interval**. The range of the interval is defined by the sample statistic and the margin of error, and its uncertainty is expressed by the confidence level:

**Confidence interval = sample statistic ± Margin of error**
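The four steps can be sketched as one small function, taking the interval as the sample statistic minus and plus the margin of error. The function name `confidence_interval` and the numbers in the usage line are made up for illustration; they do not come from any example in this text.

```python
def confidence_interval(statistic, critical_value, standard_error):
    """Build an interval from the three ingredients named in the steps.

    statistic       -- step 1: the sample statistic
    critical_value  -- step 2: implied by the chosen confidence level
    standard_error  -- standard deviation of the statistic
    """
    # Step 3: margin of error = critical value * standard deviation of statistic
    margin = critical_value * standard_error
    # Step 4: interval = statistic minus/plus the margin of error
    return statistic - margin, statistic + margin

# Hypothetical inputs, chosen only to show the call.
lower, upper = confidence_interval(statistic=0.5, critical_value=2.0, standard_error=0.1)
print(lower, upper)
```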

For a proportion, the confidence interval is calculated from the sampling distribution of the sample proportion $\hat{p}$.

**Example**: Suppose we select 100 people and ask whether they believe in the process of evolution. We do not know the population proportion of people who believe in evolution. If 47 of the 100 people say that they believe in evolution, the sample proportion is $\hat{p} = 0.47$.

To find the confidence interval for the population proportion, we need the sampling distribution of $\hat{p}$.

The problem is that the standard deviation of this sampling distribution depends on the population proportion $p$, which is unknown. We therefore substitute $\hat{p}$ for $p$ in the same standard deviation formula:

$$\sigma_{\hat{p}} \approx \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = \sqrt{\frac{0.47 \times 0.53}{100}} = \sqrt{0.002491} = 0.0499$$
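The arithmetic for the evolution example can be checked with a short sketch. It goes one step beyond the calculation above by also forming a 95 % interval, assuming the critical value 1.96 that this text quotes for 95 % confidence; the variable names are illustrative.

```python
import math

n = 100                # people surveyed
p_hat = 47 / n         # sample proportion from the example

# Estimated standard deviation of the sample proportion,
# with p_hat substituted for the unknown population proportion.
se = math.sqrt(p_hat * (1 - p_hat) / n)

z_star = 1.96          # assumed 95 % critical value
margin = z_star * se
lower, upper = p_hat - margin, p_hat + margin

print(f"SE = {se:.4f}, margin = {margin:.4f}")
print(f"95% interval: ({lower:.3f}, {upper:.3f})")
```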

## Constructing the Interval

In order to construct an interval we need to **follow the four steps** specified above.

**Example**: Suppose we need to estimate the average weight of an adult male in Baltimore, Maryland. We draw a random sample of 1,000 men from a total population of 1,000,000 men. Suppose the average weight in the sample comes out to 180 pounds and the standard deviation of the sample comes out to 30 pounds.

We are required to find the confidence interval. We will calculate it using the four steps mentioned above.

- The sample statistic is the sample mean, 180 pounds.
- The confidence level selected for this case is 95 %.
- The margin of error is calculated using the standard error of the mean as follows:

Standard error (SE) = s / sqrt (n) = 30 / sqrt (1000)

= 0.95

The critical value is the factor used to compute the margin of error. Here it is a t score (t*), found as follows:

Alpha (α): α = 1 – (confidence level / 100) = 0.05

Critical probability (p*): p* = 1 – α/2 = 1 – 0.05/2 = 0.975

Degrees of freedom (df): df = n – 1 = 1000 – 1 = 999

The critical value is the t statistic with 999 degrees of freedom and a cumulative probability of 0.975. Using a t distribution calculator, the critical value comes out to 1.96.

Margin of error is computed using the above given equation

**Margin of error = Critical value * Standard deviation of statistic**

= 1.96 * 0.95

= 1.86

The confidence interval is specified by **sample statistic ± margin of error**, and its uncertainty is denoted by the confidence level. The 95 % confidence interval is therefore 180 ± 1.86, that is, from 178.14 to 181.86 pounds.
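The whole weight calculation can be reproduced with a few lines. This is a sketch that takes the critical value 1.96 from the text rather than computing it, since the Python standard library has no t distribution; the variable names are illustrative.

```python
import math

sample_mean = 180.0      # pounds
sample_sd = 30.0         # pounds
n = 1000
confidence_level = 95    # percent

# Standard error of the mean.
se = sample_sd / math.sqrt(n)

# Quantities used to look up the critical value.
alpha = 1 - confidence_level / 100
critical_probability = 1 - alpha / 2
df = n - 1

# Assumption: critical value copied from the text, not computed here.
t_star = 1.96

margin = t_star * se
lower, upper = sample_mean - margin, sample_mean + margin
print(f"SE = {se:.2f}, margin = {margin:.2f}")
print(f"interval: {lower:.2f} to {upper:.2f} pounds")
```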

## Meaning of Confidence

The confidence level is the **probability that the interval-building method produces an interval containing the target quantity**. A 95 % confidence interval does not mean there is a 95 % chance that one particular computed interval contains the population proportion $p$. The population proportion is a fixed quantity, so it is either inside a given interval or outside of it. A 95 % confidence level means that about 95 % of all samples of this size drawn from the population produce intervals that capture the true proportion.

It can also be stated like this: "We are 95 % confident that the true proportion lies in the interval computed from our sample." The uncertainty is about whether our sample is one of the 95 % of samples whose intervals capture the true proportion or one of the 5 % whose intervals miss it.
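The "95 % of samples" reading can be illustrated with a small simulation. This sketch assumes a hypothetical population with a true proportion of 0.47 and samples of 100 people; it draws many samples, builds an interval from each, and counts how often the interval captures the true proportion. The function name `coverage`, the seed, and the number of trials are all arbitrary choices.

```python
import math
import random

def coverage(true_p, n, trials, z_star=1.96, seed=0):
    """Fraction of simulated samples whose interval contains true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n from a population with proportion true_p.
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
        # Check whether this sample's interval captures the fixed true value.
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / trials

print(coverage(true_p=0.47, n=100, trials=2000))
```

The printed fraction should land close to 0.95, although it will not match exactly because the interval formula is an approximation and the simulation is finite.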

## Margin of Error

It represents the **amount of sampling error in a survey result**. The margin of error determines the width of the confidence interval: the larger the margin of error, the wider the interval. For a fixed sample size, a wider interval gives higher confidence that it contains the true figure, while a smaller margin of error means the estimate is more precise.

**Margin of error is half of the width of the interval**. It is the extent of the interval on each side of the sample proportion. For a fixed sample size, a higher level of confidence requires a larger margin of error. **Confidence and precision therefore trade off against each other**: a higher level of confidence means less precision.
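To see how confidence and margin of error move together, the sketch below computes the margin of error at three confidence levels for one fixed sample. It assumes a sample proportion of 0.47 from 100 people and uses the standard library's `statistics.NormalDist` for the critical value; the function name `margin_of_error` is hypothetical.

```python
import math
from statistics import NormalDist

def margin_of_error(p_hat, n, confidence):
    """Margin of error for a proportion at the given confidence (0 to 1)."""
    # Critical value: the z that leaves (1 - confidence) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * math.sqrt(p_hat * (1 - p_hat) / n)

for level in (0.90, 0.95, 0.99):
    print(level, round(margin_of_error(0.47, 100, level), 4))
```

With the sample held fixed, each step up in confidence widens the interval, which is the loss of precision described above.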

**Example**: A common example of margin of error is a survey in which people are divided into two groups: one group prefers product A, whereas the other prefers product B. A single global margin of error is reported for the survey, and it applies to the percentages of the whole population that prefer product A and product B.

When the statistic is a percentage, the maximum margin of error is the radius of the confidence interval for a reported percentage of 50 %. **Here the margin of error is an absolute quantity, equal to the confidence interval radius for the statistic**. For example, if the reported value is 50 % and the confidence interval radius is 5 percentage points, the margin of error is 5 percentage points.
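A quick way to see why 50 % gives the maximum margin of error is to evaluate the margin at several reported percentages for one sample size. The sample size of 384 below is a hypothetical value, picked because it gives a margin close to 5 percentage points at 50 %; the 95 % critical value 1.96 is assumed.

```python
import math

def margin_at(p, n, z_star=1.96):
    """Margin of error for a reported proportion p with sample size n."""
    return z_star * math.sqrt(p * (1 - p) / n)

n = 384  # hypothetical sample size
margins = {p: margin_at(p, n) for p in (0.1, 0.3, 0.5, 0.7, 0.9)}

# The reported proportion with the largest margin of error.
worst_p = max(margins, key=margins.get)

for p, m in margins.items():
    print(f"p = {p:.1f}: margin = {m:.4f}")
print("largest margin at p =", worst_p)
```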

## Critical Values

It is the **value that separates the central region of the sampling distribution, which is used to build the interval, from the tail (rejection) regions**. Under the standard normal model, the critical value is the Z value that bounds the central region. This critical value is used when the sampling distribution is normal or close to normal.

Z-scores are used when the population standard deviation is known or the sample size is large. When the standard deviation is unknown and the sample is small, a t score from the t distribution is used instead. Critical values are also used in hypothesis testing, where right-tail and left-tail values are calculated.

The critical value z* is used when the sampling distribution is close to normal. To change the confidence level, we change the number of standard errors by which the interval extends away from the sample proportion. This number of standard errors is referred to as the critical value. There are **three main types of critical values: t scores, chi-square values and z scores**. The critical value z* is found using the Z-table.

The Z-table gives z* = 1.96 for a 95 % confidence interval and z* = 1.645 for a 90 % confidence interval.
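Instead of a printed Z-table, the same critical values can be obtained from the inverse of the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Critical value z* for a two-sided interval at the given confidence (0 to 1)."""
    tail = (1 - confidence) / 2          # area left in each tail
    return NormalDist().inv_cdf(1 - tail)

for level in (0.90, 0.95):
    print(level, round(z_critical(level), 3))
```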

### Using the Normal distribution

The critical value can be **calculated using the normal distribution**. For a 95 % confidence interval, the critical values are the two cutoffs that enclose the central 95 % of the distribution. Of the remaining 5 %, 2.5 % lies in each tail. Since P(Z ≥ z*) = 0.025, we have P(0 ≤ Z ≤ z*) = 0.4750, so we look up the probability 0.4750 in the Z-table. The value corresponding to 0.4750 is z* = 1.96. The cutoffs for a 95 % confidence interval are therefore −1.96 and 1.96.

The graph of the 90 % confidence interval has cutoffs at z* = 1.645: the critical values are −1.645 and 1.645, one on each side.
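The areas used in these table lookups can be checked in the forward direction, going from a cutoff to a probability. This is a small sketch using the standard normal CDF from `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area between the centre and the 95 % cutoff, as tabulated in a Z-table.
area_centre_to_cutoff = z.cdf(1.96) - z.cdf(0)

# Area in the upper tail beyond the 95 % cutoff.
upper_tail_95 = 1 - z.cdf(1.96)

# Central area between the two 90 % cutoffs.
central_90 = z.cdf(1.645) - z.cdf(-1.645)

print(round(area_centre_to_cutoff, 4), round(upper_tail_95, 4), round(central_90, 4))
```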

## Trade-off Between Confidence and Precision

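The safest way to raise the confidence level without losing precision is to **choose a larger sample size**. A larger sample reduces the margin of error at any given confidence level. The margin of error is calculated using the same formula as before, i.e.,

$$ME = z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

Solving for the sample size gives

$$n = \frac{(z^*)^2 \, \hat{p}(1-\hat{p})}{ME^2}$$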

Suppose we want the sample size needed to obtain a margin of error of 0.03 for a 95 % confidence interval for the proportion of the population who believe in evolution. If no sample has been taken yet, $\hat{p}$ is unknown, so the most cautious value $\hat{p} = 0.5$ is used. The calculation is as follows:

Z* = 1.96

ME = 0.03

$$n = \frac{(z^*)^2 \, \hat{p}(1-\hat{p})}{ME^2} = \frac{(1.96)^2 (0.5)(0.5)}{(0.03)^2}$$

n = 1,067.11

A fraction of a person cannot be sampled, so the result is rounded up to the next whole number, i.e., a sample of 1,068 people.
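The sample size calculation can be sketched as a small function. It assumes the cautious guess of 0.5 for the proportion when no sample has been taken, which is the value that reproduces 1,067.11; the function name `required_sample_size` is hypothetical.

```python
import math

def required_sample_size(margin, z_star=1.96, p_guess=0.5):
    """Return (exact n, n rounded up) for a target margin of error.

    Assumption: p_guess = 0.5 is used when no sample proportion is available.
    """
    n_exact = z_star ** 2 * p_guess * (1 - p_guess) / margin ** 2
    # Always round up: rounding down would give a margin larger than requested.
    return n_exact, math.ceil(n_exact)

n_exact, n_needed = required_sample_size(margin=0.03)
print(f"exact n = {n_exact:.2f}, sample size needed = {n_needed}")
```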

## Caution

The following points should be considered carefully when working with confidence intervals:

- The population proportion is a fixed quantity; it does not vary.
- Intervals from different samples do not match each other. Each sample gives a different proportion and therefore a different interval.
- It is not possible to be certain about a parameter; one can only be confident to a specific extent.
- The whole process is about estimating the population proportion, not the sample proportion.
- Do not claim to know more about the results than the interval actually tells.
- The whole interval is treated equally. Values near the center of the interval should not be described as more plausible than values near the edges.
- Researchers should beware of a margin of error that is too large to be useful. An interval stating that the proportion lies between 10 % and 90 % is not useful.
- The possibility of biased sampling should be taken into account.
- Check that the trials are independent.