## Estimation of Proportions Using Interval

Taking several samples from the same population **helps with computing the proportion of successes in each sample**. Each sample generally gives a slightly different proportion, so there are several different answers to record. The sampling distribution describes how these sample proportions vary from sample to sample. None of them is guaranteed to equal the true population proportion exactly; each one is only an estimate.

The estimate of proportions using intervals **draws a conclusion about an entire population, using a range of reasonable values for the population proportion**. The interval is constructed so that, at a chosen level of confidence, it captures the true value of the population parameter. This interval is known as the confidence interval. Instead of a single value, it provides a range of plausible values for the population proportion.

Confidence intervals give more details than point estimates. A point estimate has limited usefulness since it does not reveal the uncertainty associated with the estimate, i.e., it is unclear how far this sample mean might be from the population mean.

**Example**: Suppose we want to measure the proportion of undergraduates who are permanently residing in southern Mexico. We take a random sample of 25 undergraduates and compute the sample proportion. Suppose it comes out that 9 out of the 25 students chosen reside permanently in southern Mexico. The sample proportion comes out as follows:

ƥ = 9/25

ƥ = 0.36

Due to the “luck of the draw” factor, we cannot be sure that the true population proportion is exactly 0.36; another sample would probably give a slightly different value. To build an interval that captures the population proportion with 95% confidence, we first need the standard deviation of the sample proportion (its standard error), which is estimated with the following formula:

SE(ƥ) = √(ƥ(1-ƥ)/n) = √(0.36(1-0.36)/25) = 0.096
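The arithmetic for this example can be checked with a short Python sketch. The variable names are illustrative and not part of the original text; the formula is the usual estimate of the standard deviation of a sample proportion, √(ƥ(1-ƥ)/n).

```python
import math

successes = 9   # students in the sample who reside in southern Mexico
n = 25          # sample size

p_hat = successes / n                            # sample proportion
std_error = math.sqrt(p_hat * (1 - p_hat) / n)   # estimated standard deviation of p_hat

print(round(p_hat, 2), round(std_error, 3))
```

With only 25 students the standard error is close to 0.1, which is why a single value of 0.36 says little about the population on its own.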

## Creation of a Confidence Interval

There are **four basic steps involved in constructing a confidence interval**:

**The first step** involves **identifying a sample statistic**. The sample statistic (e.g., sample mean, sample proportion) will be used to estimate the population parameter.

**The second step** involves **selecting a confidence level**. The confidence level describes the uncertainty of the sampling method. Normally, a 90%, 95%, or 99% confidence level is used.

**The third step** is **calculating the margin of error**. The margin of error is calculated using the following equation:

**Margin of error = Critical value * Standard deviation of a statistic**

**The fourth** and last step is **specifying the confidence interval**. The range of the confidence interval is defined by the following equation, and its uncertainty is expressed by the confidence level:

**Confidence interval = Sample statistic ± Margin of error**

A confidence interval for a proportion is calculated from the sampling distribution of the sample proportion ƥ.

**Example**: Suppose we have selected 100 people and asked them whether they believe in evolution. We currently don’t know the population proportion of people who believe in evolution. If 47 out of 100 people say that they believe in evolution, then we know that our sample proportion is ƥ = 0.47. In order to find the confidence interval for the population proportion, we need the sampling distribution of ƥ.

Here, the problem is that the sampling distribution depends on the population proportion p, which is unknown. We therefore use the sample proportion ƥ in its place and estimate the standard deviation of ƥ with the same formula:

SE(ƥ) = √(ƥ(1-ƥ)/n)

= √(0.47(1-0.47)/100)

= 0.0499
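The four steps can be sketched for this example as a small Python function. This is a minimal sketch under the normal approximation; the function name `proportion_confidence_interval` is hypothetical, and the critical value is taken from the standard library's `statistics.NormalDist` rather than a printed Z-table.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence=0.95):
    """Return (lower, upper) for a population proportion (normal approximation)."""
    # Step 1: identify the sample statistic
    p_hat = successes / n
    # Step 2: turn the chosen confidence level into a critical value z*
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Step 3: margin of error = critical value * standard deviation of the statistic
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    # Step 4: interval = sample statistic plus or minus the margin of error
    return p_hat - margin, p_hat + margin

lower, upper = proportion_confidence_interval(47, 100)
print(round(lower, 3), round(upper, 3))
```

For 47 believers out of 100, the 95% interval runs from roughly 0.37 to 0.57, so the sample is compatible with a population proportion on either side of one half.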

## Constructing the Interval

In order to construct an interval, we need to **follow the 4 steps** specified above.

**Example**: Suppose we need to estimate the average weight of an adult male in Baltimore, Maryland. We draw a random sample of 1,000 men from a total population of 1,000,000 men. The average weight in the sample comes out at 180 pounds, and the standard deviation of the sample (s) is 30 pounds.

We need to find the confidence interval. We can calculate it using the four steps mentioned above.

- We have identified the mean of 180 pounds as the sample statistic.
- The confidence level selected for this case is 95%.
- The margin of error can be calculated using the standard error of the mean as follows:

Standard error (SE) = s / sqrt (n) = 30 / sqrt (1000)

= 0.95

The critical value is a factor which is used to compute the margin of error here. The t score (t*) is calculated as follows:

Alpha (α): α = 1 – (confidence level / 100) = 0.05

Critical probability (p*): p* = 1 – α/2 = 1 – 0.05/2 = 0.975

Degrees of freedom (df): df = n – 1 = 1000 – 1 = 999

The critical value is the t statistic with 999 degrees of freedom and a cumulative probability of 0.975. Using a t distribution calculator, the critical value comes out to approximately 1.96.

The margin of error is computed using the above-given equation.

**Margin of error = Critical value * Standard deviation of a statistic**

= 1.96 * 0.95

= 1.86

The confidence interval is specified by the **sample statistic ± margin of error**, and the uncertainty is denoted by the confidence level. With the 95% confidence level chosen above, the confidence interval is 180 ± 1.86 pounds, i.e., from 178.14 to 181.86 pounds.
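The weight example can be reproduced with a few lines of Python. This is a sketch only: the critical value 1.96 is copied from the text rather than computed, because the Python standard library has no t distribution, and the variable names are illustrative.

```python
import math

sample_mean = 180.0   # pounds
s = 30.0              # standard deviation used in the example, in pounds
n = 1000              # sample size
t_star = 1.96         # critical value quoted in the text for df = 999

std_error = s / math.sqrt(n)      # standard error of the mean
margin = t_star * std_error       # margin of error
lower = sample_mean - margin
upper = sample_mean + margin

print(round(std_error, 2), round(margin, 2), round(lower, 2), round(upper, 2))
```

The interval extends the same distance below and above the sample mean, which is why it is written with a plus-or-minus sign rather than a plus sign alone.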

## Meaning of Confidence

The confidence level describes **how often the interval-building method captures the target quantity**. When we refer to a 95% confidence interval, it does not mean there is a 95% chance that this particular interval contains the population proportion p. The population proportion is a fixed quantity, so it is either inside a given interval or outside it. A 95% confidence level means that about 95% of all possible samples of this size produce an interval that contains the true proportion.

It can also be stated like this: “We are 95% confident that the true proportion lies in our interval”. Here, the uncertainty is whether the selected sample is one of the 95% of samples whose intervals capture the true proportion or one of the 5% whose intervals miss it.
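This long-run reading of confidence can be illustrated with a small simulation. The sketch below assumes a made-up population proportion of 0.5, samples of 100, and the normal-approximation interval; the function name `coverage` and all parameter values are hypothetical choices for illustration.

```python
import math
import random

def coverage(true_p=0.5, n=100, trials=2000, z_star=1.96, seed=0):
    """Fraction of simulated samples whose interval contains true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and count the successes
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
        # The true proportion never changes; only the interval does
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / trials

print(coverage())
```

The printed fraction should land near 0.95, though not exactly on it: the normal approximation is imperfect and the simulation itself is random.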

## Margin of Error

The **margin of error** is the range of values below and above the sample statistic in a confidence interval. It represents the **amount of sampling error in a survey result**. The margin of error and the width of the confidence interval are directly proportional to each other. A larger margin of error means less assurance that the reported result is close to the true figure.

The **margin of error is half the width of the interval**. It refers to the extent on each side of the sample proportion. For a higher level of confidence from the same sample, a larger margin of error is required. **Confidence and precision are inversely related to each other**: a higher level of confidence indicates less precision.

**Example**: A common example of the margin of error is a survey in which people are divided into two groups: one prefers product A, and the other prefers product B. The margin of error describes how far the sample percentages may be from the percentages of the whole population who like product A or B.

When a statistic is a percentage, the maximum margin of error is the radius of the confidence interval for a reported percentage of 50%. **Here, the margin of error is an absolute quantity, equal to the confidence interval radius for the statistic**. For example, if the reported value is 50% and the confidence interval radius is 5 percentage points, the margin of error is 5 percentage points.
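The reason 50% gives the maximum margin of error can be seen numerically: the product p(1-p) under the square root is largest at p = 0.5. The sketch below assumes a hypothetical survey of 400 people and a 95% critical value of 1.96; neither number comes from the text.

```python
import math

def margin_of_error(p, n, z_star=1.96):
    """Margin of error for a reported proportion p from a sample of size n."""
    return z_star * math.sqrt(p * (1 - p) / n)

n = 400  # hypothetical survey size
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p, round(margin_of_error(p, n), 4))
```

Under these assumptions the margin of error peaks at about 5 percentage points for a reported 50% and shrinks symmetrically on either side.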

## Critical Values

The **critical value** is a factor used to compute the margin of error. It is a point on the test distribution that is compared to the test statistic to determine whether to reject the null hypothesis.

It is the **value that separates the acceptance region from the rejection region, i.e., the values included in the interval from those excluded from it**. Under the standard normal model, the critical value is the z value that bounds the central region. A z critical value is used when the sampling distribution is normal or close to normal. If the absolute value of the test statistic is greater than the critical value, you can declare statistical significance and reject the null hypothesis.

Changing the confidence level means changing the number of standard errors by which the interval extends away from the sample proportion. This number of standard errors is referred to as the critical value. There are **three main types of critical values: the t score, the chi-square value, and the z score**. The critical value z* is found using the Z-table.

The Z-table gives z* = 1.96 for a 95% confidence interval and z* = 1.645 for a 90% confidence interval.

### Using the Normal distribution

The critical value can be **calculated using the normal distribution**. The critical values for a 95% confidence interval are the two values between which the central 95% of the distribution lies. Of the remaining 5%, 2.5% lies in each tail. Since P(Z ≥ z*) = 0.025, we have P(0 ≤ Z ≤ z*) = 0.4750, so we look up the probability 0.4750 in the Z-table. The corresponding value is z* = 1.96. Here, the cut-offs of the critical values for a 95% confidence interval are -1.96 and 1.96.

The graph of 90% of the confidence interval is shown below:

For the 90% confidence interval, 5% lies in each tail and the critical value is z* = 1.645, so the cut-offs are -1.645 and 1.645 on either side.
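Instead of reading a printed Z-table, the same cut-offs can be obtained from the inverse cumulative distribution function of the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Positive cut-off z* that leaves (1 - confidence) / 2 in each tail."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))
```

The negative cut-off is simply the mirror image of the value returned, because the standard normal distribution is symmetric about zero.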

## The trade-off between Confidence and Precision

There is a close relationship between confidence and precision, but it is important to note that these terms are not direct complements of each other. The safest way to raise the confidence level without widening the interval is to **choose a larger sample size**, which improves the precision level as well. High levels of confidence can be achieved with wider intervals, while narrower or more precise intervals permit less confidence; thus, for a fixed sample size, there is a trade-off between precision and confidence.

Most importantly, any statement of precision without a corresponding confidence level is termed incomplete and impossible to interpret.
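The trade-off can be made concrete by tabulating the margin of error for a few confidence levels and sample sizes. This is a sketch under the normal approximation with an assumed sample proportion of 0.5; the function name `margin` and the chosen sample sizes are hypothetical.

```python
import math
from statistics import NormalDist

def margin(confidence, n, p_hat=0.5):
    """Margin of error for a proportion at a given confidence level and sample size."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * math.sqrt(p_hat * (1 - p_hat) / n)

for n in (100, 400):
    for confidence in (0.90, 0.95, 0.99):
        print(n, confidence, round(margin(confidence, n), 4))
```

Reading down the output for one sample size shows the interval widening as confidence rises; comparing the two sample sizes shows that quadrupling n halves the margin at every confidence level.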

The margin of error is calculated using the same formula, i.e., ME = z* × √(ƥ(1-ƥ)/n). To plan a study, we need a sample large enough to keep the margin of error at or below the desired value. The product ƥ(1-ƥ) is largest when ƥ = 0.50, so taking the sample proportion ƥ = 0.50 gives a safe sample size, calculated this way:

n = ((z* × 0.5)/ME)^2

Suppose we want the sample size needed to obtain a margin of error of 0.03 for a 95% confidence interval for the proportion of people who believe in evolution. If we assume no sample has been taken yet, the calculations are shown as follows:

Z* = 1.96

ME = 0.03

n = ((1.96 × 0.5)/0.03)^2

n = 1,067.11

The sample size must be a whole number, and rounding down would leave the margin of error slightly above 0.03, so we round up to the next whole number, i.e., 1,068 people sampled from the whole population.
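The sample-size calculation can be written as a short Python function. This is a minimal sketch: the name `required_sample_size` is hypothetical, and the default p = 0.5 is the cautious choice used when no sample has been taken yet.

```python
import math

def required_sample_size(margin, z_star=1.96, p=0.5):
    """Smallest whole sample size whose margin of error does not exceed `margin`."""
    n_exact = (z_star * math.sqrt(p * (1 - p)) / margin) ** 2
    return math.ceil(n_exact)   # always round up, never to the nearest value

print(required_sample_size(0.03))
```

Using `math.ceil` rather than ordinary rounding matters here: a result such as 1,067.11 must become 1,068, because 1,067 respondents would fall just short of the target margin.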

**Absolute precision** is often applied when estimating quantities such as population proportions, which exist in the form of percentages.

The standard error has the same physical units as the estimator; hence, absolute precision is expressed in the same physical units as the estimation target.

**Relative precision** is always unit-free and expressed as a percentage.

## Caution

The following points should be considered carefully when working with confidence intervals:

- The population proportion is a fixed quantity; it does not vary.
- Different samples produce different intervals, so the intervals from separate samples will not match each other.
- It is impossible to be certain about a parameter; one can only be confident to a specific extent.
- The complete process revolves around estimating the population proportion instead of the sample proportion.
- Do not claim to know more about the results than the interval actually tells.
- Treat the whole interval equally. The values near the center of the interval are not more plausible than the values near the edges.
- Beware of a margin of error that is too large to be useful. An interval that runs from 10% to 90% is not very useful.
- The possibility of biased sampling should be taken into account.
- Check that the trials can reasonably be considered independent.