Introduction
Inferences about a population mean are made from a sample mean when the population mean itself is unknown. There are three ways to make inferences about the population mean:
 Point estimation
 Interval estimation
 Hypothesis testing
Each of these is described below.
Point Estimation
Point estimation involves making inferences about the population mean using a single number, such as the sample mean or the mean of the sampling distribution. This single value is calculated from sample data.
Sample mean
A sample is a small portion intended to show what the whole looks like. The simplest way to make an inference about the population mean is to draw a small sample from the population, compute its mean and use it as an estimator of the population mean.
Example: Suppose you want to estimate the average age of people who suffer from cancer. It would be very costly and time-consuming to contact every person on the planet who is suffering from cancer; thus, a sample mean would work. You can contact 30 cancer patients, compute the mean of their ages and use it as an estimate of the population mean. This is cost-effective and less time-consuming, but it is not very accurate.
Sampling distribution
This is a more sophisticated method than calculating a single sample mean. It involves drawing several samples from the population (as the name ‘sampling distribution’ suggests), computing the mean of each sample and then computing the mean of those sample means. The key to this method is that the samples must be drawn from the population at random. Provided there is no bias in sample selection, this gives a more accurate point estimate of the population mean than a single sample does.
Example: Suppose we draw 5 random samples from the population of cancer patients, compute the mean of each sample and finally compute the mean of the five means. This is shown as follows:
          Sample 1  Sample 2  Sample 3  Sample 4  Sample 5
          10        80        40        80        10
          10        80        40        80        10
          20        10        40        10        20
          30        10        40        10        30
          10        10        30        10        20
          80        0         30        10        80
          0         20        40        10        0
          10        30        40        10        20
          50        70        0         70        50
Mean      24        34        33        32        27
Mean of sampling distribution = (24 + 34 + 33 + 32 + 27) / 5 = 150 / 5 = 30
The mean of the sampling distribution is 30; thus, our point estimate of the average age of cancer patients is 30 years.
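The arithmetic in this example can be reproduced with a short Python sketch. The variable names are illustrative, and the per-sample means are rounded to whole years, as they are in the table.

```python
from statistics import mean

# Ages from the table, one list per sample (each list is one column).
samples = [
    [10, 10, 20, 30, 10, 80, 0, 10, 50],   # Sample 1
    [80, 80, 10, 10, 10, 0, 20, 30, 70],   # Sample 2
    [40, 40, 40, 40, 30, 30, 40, 40, 0],   # Sample 3
    [80, 80, 10, 10, 10, 10, 10, 10, 70],  # Sample 4
    [10, 10, 20, 30, 20, 80, 0, 20, 50],   # Sample 5
]

# Mean of each sample, rounded to the nearest whole year.
rounded_means = [round(mean(s)) for s in samples]

# Point estimate: the mean of the five sample means.
mean_of_means = mean(rounded_means)

print(rounded_means)
print(mean_of_means)
```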
Interval Estimation
Interval estimation involves making inferences about the population mean using a range of values, such as a confidence interval. A confidence interval is a range of likely values of the parameter, stated with a specified level of confidence. For instance, a 95% confidence level means that if we were to take 100 different samples and calculate a 95% confidence interval from each, we would expect about 95 of the 100 intervals to contain the true mean. The confidence interval under various scenarios is calculated as follows:
Confidence interval when population standard deviation is known
When the population standard deviation is known, we calculate the confidence interval as follows:
Confidence interval = x̄ ± z * (σ / √ n)
In this formula, x̄ (x bar) is the sample mean, σ is the population standard deviation, n is the sample size, and z is 1.645 for 90% confidence, 1.96 (often rounded to 2) for 95% confidence, and 2.576 for 99% confidence.
An example to explain this is as follows. Suppose that the sample mean is 1.8, the population standard deviation is 0.5 and the sample size is 36. The confidence interval would be:
 At 90% confidence: (1.8 – 1.645 * (0.5 / √ 36), 1.8 + 1.645 * (0.5 / √ 36)) → (1.6629, 1.9371). That is, we can be 90% confident that the population mean lies in the range 1.6629 to 1.9371.
 At 95% confidence, with z rounded to 2: (1.8 – 2 * (0.5 / √ 36), 1.8 + 2 * (0.5 / √ 36)) → (1.6333, 1.9667). That is, we can be 95% confident that the population mean lies in the range 1.6333 to 1.9667.
 At 99% confidence: (1.8 – 2.576 * (0.5 / √ 36), 1.8 + 2.576 * (0.5 / √ 36)) → (1.5853, 2.0147). That is, we can be 99% confident that the population mean lies in the range 1.5853 to 2.0147.
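As a rough check of these figures, the sketch below computes the same intervals in Python. The function name z_interval is illustrative and not part of the original text. It obtains z from the standard library's NormalDist, so the 95% interval is based on z ≈ 1.96 rather than the rounded value 2 and comes out slightly narrower than the one shown.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

for level in (0.90, 0.95, 0.99):
    low, high = z_interval(1.8, 0.5, 36, level)
    print(f"{level:.0%}: ({low:.4f}, {high:.4f})")
```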
Confidence interval when population standard deviation is unknown
When the population standard deviation is unknown, we calculate the confidence interval using the sample standard deviation as follows:
Confidence interval = x̄ ± t * (s / √ n)
In this formula, x̄ (x bar) is the sample mean, s is the sample standard deviation, n is the sample size, and t is the critical value from the t-distribution, which depends on the degrees of freedom (i.e., n – 1).
An example to explain this is as follows. Suppose that the sample mean is 1.8, the sample standard deviation is 1.3 and the sample size is 36. The t values used below (1.697, 2.042 and 2.750) are those for 30 degrees of freedom; with n = 36 there are 35 degrees of freedom, for which the critical values are slightly smaller (about 1.690, 2.030 and 2.724), so the intervals shown are slightly wider than the exact ones. The confidence interval would be:
 For 90% confidence, the interval would be: (1.8 – 1.697 * (1.3 / √ 36), 1.8 + 1.697 * (1.3 / √ 36)) → (1.4323, 2.1677). That is, we can be 90% confident that the population mean lies in the range 1.4323 to 2.1677.
 For 95% confidence, the interval would be: (1.8 – 2.042 * (1.3 / √ 36), 1.8 + 2.042 * (1.3 / √ 36)) → (1.3576, 2.2424). That is, we can be 95% confident that the population mean lies in the range 1.3576 to 2.2424.
 For 99% confidence, the interval would be: (1.8 – 2.750 * (1.3 / √ 36), 1.8 + 2.750 * (1.3 / √ 36)) → (1.2042, 2.3958). That is, we can be 99% confident that the population mean lies in the range 1.2042 to 2.3958.
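The same calculation can be sketched in Python. The standard library has no t-distribution, so this sketch assumes the critical value has been looked up in a t table and is passed in; t_interval is an illustrative name, and the t values are the ones used in the example.

```python
from math import sqrt

def t_interval(x_bar, s, n, t_crit):
    """Confidence interval for the mean when sigma is unknown.

    t_crit is assumed to come from a t table for n - 1 degrees of freedom.
    """
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin

# Critical values as used in the worked example.
t_values = {"90%": 1.697, "95%": 2.042, "99%": 2.750}

for level, t_crit in t_values.items():
    low, high = t_interval(1.8, 1.3, 36, t_crit)
    print(f"{level}: ({low:.4f}, {high:.4f})")
```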
Limitations
The confidence intervals above are not reliable when the sample size is small (i.e., less than 30) and the population is not normally distributed.
Hypothesis Testing
Hypothesis testing involves making inferences about the population mean by statistically testing a claim about it, such as that the population mean is equal to x, less than x, or greater than x. Because the true values of population parameters are hard to know, sample statistics that estimate those parameters are the basis for developing and testing a hypothesis.
Hypothesis testing µ ≠ x
The first type of hypothesis test related to the population mean checks whether the population mean differs from a certain value x. This is also referred to as the two-tailed test. The steps for conducting such a hypothesis test are as follows:
Step 1: Stating the hypothesis
H0: Null Hypothesis: µ = 20: The population mean is equal to 20.
HA: Alternative Hypothesis: µ ≠ 20: The population mean is not equal to 20.
Step 2: Computing the test statistic
The test statistic is calculated as follows:
z = (x̄ – µ0) / (σ / √ n)
In this formula, x̄ (x bar) is the sample mean, µ0 is the hypothesized mean, σ is the population standard deviation, and n is the sample size. Suppose the sample mean is 24, the population standard deviation is 9.14 and the sample size is 64. The z-statistic would then be (24 – 20) / (9.14 / √ 64) = 4 / (9.14 / 8) = 3.5.
Step 3: Determining the critical values/rejection region
There are various rejection regions depending upon the level of significance (α).
Two-tailed test
α         Reject H0 if the z statistic is:
0.20      Less than –1.282 OR more than +1.282
0.10      Less than –1.645 OR more than +1.645
0.05      Less than –1.96 OR more than +1.96
0.010     Less than –2.576 OR more than +2.576
0.001     Less than –3.291 OR more than +3.291
0.0001    Less than –3.891 OR more than +3.891
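The two-tailed critical values can be recomputed from the standard normal distribution as a check on a printed table. This is a minimal sketch using Python's statistics.NormalDist; the function name is illustrative. For a two-tailed test, α is split equally between the two tails, so the positive critical value is the (1 – α/2) quantile and the negative one is its mirror image.

```python
from statistics import NormalDist

def two_tailed_critical(alpha):
    """Positive critical z value for a two-tailed test at level alpha."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.20, 0.10, 0.05, 0.01, 0.001, 0.0001):
    z_crit = two_tailed_critical(alpha)
    print(f"alpha = {alpha}: reject H0 if z < -{z_crit:.3f} or z > +{z_crit:.3f}")
```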
Step 4: Make a conclusion
The z statistic in the example is +3.5. This is higher than +3.291, and thus we can reject H0 at the 0.1% significance level (α = 0.001). It does not exceed the critical value for α = 0.0001, however, so we cannot reject H0 at the 0.01% level. Finally, we conclude that we have enough evidence to reject the claim that the population mean is equal to 20.
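An equivalent way to reach this conclusion is through a p-value, which the steps above do not use; the sketch below is offered only as a cross-check. It recomputes the z statistic from the example and finds the two-tailed p-value with the standard library's NormalDist. The function and variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(x_bar, mu0, sigma, n):
    """z = (x_bar - mu0) / (sigma / sqrt(n))."""
    return (x_bar - mu0) / (sigma / sqrt(n))

z = z_statistic(24, 20, 9.14, 64)

# Two-tailed p-value: probability of a z at least this extreme in either tail.
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.3f}, two-tailed p-value = {p_two_tailed:.6f}")
for alpha in (0.05, 0.001, 0.0001):
    decision = "reject H0" if p_two_tailed < alpha else "fail to reject H0"
    print(f"alpha = {alpha}: {decision}")
```

The p-value falls between 0.0001 and 0.001, which matches rejecting H0 at the 0.1% level but not at the 0.01% level.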
Hypothesis testing µ > x
The second type of hypothesis test related to the population mean checks whether the population mean is greater than a certain value x. This is also referred to as the upper-tailed test. The steps for conducting such a hypothesis test are as follows:
Step 1: Stating the hypothesis
H0: Null Hypothesis: µ = 19: The population mean is equal to 19.
HA: Alternative Hypothesis: µ > 19: The population mean is greater than 19.
Step 2: Computing the test statistic
The test statistic is calculated in the same way:
z = (x̄ – µ0) / (σ / √ n)
In this formula, x̄ (x bar) is again the sample mean, µ0 is again the hypothesized mean, σ is again the population standard deviation, and n is again the sample size. Suppose that the sample mean is 16, the population standard deviation is 9.14 and the sample size is 36. The z-statistic would then be (16 – 19) / (9.14 / √ 36) = –3 / (9.14 / 6) = –1.96937.
Step 3: Determining the critical values/rejection region
There are various rejection regions depending upon the level of significance (α), but, this time, they are different from the two-tailed test, because the whole of α lies in the upper tail.
Upper-tailed test
α         Reject H0 if the z statistic is:
0.20      More than +0.842
0.10      More than +1.282
0.05      More than +1.645
0.010     More than +2.326
0.001     More than +3.090
0.0001    More than +3.719
Step 4: Make a conclusion
The z statistic in the example is –1.96937. Because it is negative, it is lower than every critical value for the upper-tailed test, and thus we fail to reject H0 even at the 20% significance level. Finally, we conclude that we do not have enough evidence to support the claim that the population mean is greater than 19. Failing to reject H0 does not prove that the mean is equal to, or less than, 19.
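The upper-tailed decision can also be cross-checked with a p-value, which here is the area under the standard normal curve to the right of z. This is a sketch under the assumption that σ = 9.14, the value that appears in the worked computation; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values from the worked example; sigma = 9.14 is the figure used in the computation.
x_bar, mu0, sigma, n = 16, 19, 9.14, 36

z = (x_bar - mu0) / (sigma / sqrt(n))

# Upper-tailed p-value: probability of a z at least this large.
p_upper = 1 - NormalDist().cdf(z)

print(f"z = {z:.5f}, upper-tailed p-value = {p_upper:.4f}")
for alpha in (0.20, 0.10, 0.05):
    decision = "reject H0" if p_upper < alpha else "fail to reject H0"
    print(f"alpha = {alpha}: {decision}")
```

Because the sample mean lies below the hypothesized mean, the p-value is well above 0.5, so H0 is not rejected at any of the usual significance levels.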
Hypothesis testing µ < x
The third type of hypothesis test related to the population mean checks whether the population mean is less than a certain value x. This is also referred to as the lower-tailed test. The steps for conducting such a hypothesis test are as follows:
Step 1: Stating the hypothesis
H0: Null Hypothesis: µ = 40: The population mean is equal to 40.
HA: Alternative Hypothesis: µ < 40: The population mean is less than 40.
Step 2: Computing the test statistic
The test statistic is calculated in the same way:
z = (x̄ – µ0) / (σ / √ n)
In this formula, x̄ (x bar) is again the sample mean, µ0 is again the hypothesized mean, σ is again the population standard deviation, and n is again the sample size. Suppose that the sample mean is 46, the population standard deviation is 35 and the sample size is 100. The z-statistic would then be (46 – 40) / (35 / √ 100) = 6 / 3.5 = +1.714.
Step 3: Determining the critical values/rejection region
There are various rejection regions depending upon the level of significance (α), but they differ from those of the two-tailed test and of the upper-tailed test, because the whole of α lies in the lower tail:
Lower-tailed test
α         Reject H0 if the z statistic is:
0.20      Less than –0.842
0.10      Less than –1.282
0.05      Less than –1.645
0.010     Less than –2.326
0.001     Less than –3.090
0.0001    Less than –3.719
Step 4: Make a conclusion
The z statistic in the example is +1.714. Because it is positive, it is higher than every critical value for the lower-tailed test, and thus we fail to reject H0 even at the 20% significance level. Finally, we conclude that we do not have enough evidence to support the claim that the population mean is less than 40. Failing to reject H0 does not prove that the mean is equal to, or more than, 40.
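The lower-tailed comparison can be sketched in Python by computing each critical value directly instead of reading it from a table. For a lower-tailed test, the critical value is the α quantile of the standard normal distribution, which is negative for any α below 0.5. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# z statistic from the worked example.
z = (46 - 40) / (35 / sqrt(100))

alphas = (0.20, 0.10, 0.05, 0.01, 0.001, 0.0001)
decisions = []
for alpha in alphas:
    # Lower-tailed critical value: the alpha quantile of the standard normal.
    critical = NormalDist().inv_cdf(alpha)
    decision = "reject H0" if z < critical else "fail to reject H0"
    decisions.append(decision)
    print(f"alpha = {alpha}: critical z = {critical:.3f}, {decision}")
```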
When the population standard deviation is unknown
We use the sample standard deviation and the t-distribution (instead of the z-distribution) for hypothesis testing when the population standard deviation is unknown. The formula for the test statistic when using the sample standard deviation is as follows:
t = (x̄ – µ0) / (s / √ n)
In this formula, x̄ (x bar) is again the sample mean, µ0 is the hypothesized mean, and n is the sample size. The only change is ‘s’, the sample standard deviation, which replaces σ. Moreover, the critical values are taken from the t-distribution with n – 1 degrees of freedom.
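As a minimal sketch of this calculation, the Python below computes a t statistic from raw data. The data values and the hypothesized mean of 20 are hypothetical, chosen only for illustration. statistics.stdev returns the sample standard deviation (it divides by n – 1), which is the ‘s’ in the formula.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample, for illustration only.
data = [22, 25, 19, 24, 21, 26, 23, 20]
mu0 = 20  # hypothesized population mean (assumed for this sketch)

n = len(data)
x_bar = mean(data)
s = stdev(data)  # sample standard deviation, n - 1 in the denominator

t = (x_bar - mu0) / (s / sqrt(n))
df = n - 1  # degrees of freedom used to look up the t critical value

print(f"x_bar = {x_bar}, s = {s:.4f}, t = {t:.4f}, df = {df}")
```

The resulting t would then be compared with the critical value for n – 1 degrees of freedom from a t table, in the same way the z statistic is compared with the z critical values.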
The size of the sample, whether the population is normally distributed, and whether the population variance is known are the factors to consider when deciding whether to use the t-distribution or the z-distribution.