### Chapter 6. Parameter Estimation and Confidence Intervals: Estimation

### Confidence Interval for the Population Proportion

A confidence interval for the population proportion #\pi# is a range of values, based on sample data, which are highly plausible candidates for the true value of the population proportion.

To construct a confidence interval for the population proportion #\pi#, we will need to make use of the *sampling distribution of the sample proportion*.

Remember that the sample proportion #\hat{p}# (approximately) follows the #N\bigg(\pi, \sqrt{\cfrac{\pi \cdot (1-\pi)}{n}}\,\bigg)# distribution if both of the following conditions are satisfied:

- There are at least 10 *positive* cases: #n\cdot \pi \geq 10#
- There are at least 10 *negative* cases: #n\cdot (1-\pi) \geq 10#

The problem, however, is that since the value of #\pi# is unknown, we cannot use it to check the conditions for normality.

The solution is to use the sample proportion #\hat{p}# as an estimate for the population proportion #\pi# and check the conditions for normality using #\hat{p}# instead.

Likewise, without knowing #\pi#, we cannot compute the *standard error of the proportion* #\sigma_{\hat{p}}#. Instead, we will use the *estimated standard error of the proportion* #s_{\hat{p}}# in the calculation of a confidence interval for the population proportion #\pi#:

\[s_{\hat{p}} =\sqrt{\cfrac{\hat{p}\cdot (1-\hat{p})}{n}}\]
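The two steps above (checking the conditions with #\hat{p}# and computing #s_{\hat{p}}#) can be sketched in a few lines of Python. This is only an illustration: the function name `estimated_standard_error` and the counts used in the call are assumptions made for the example, not part of the course material.

```python
from math import sqrt


def estimated_standard_error(x: int, n: int) -> float:
    """Return s_p-hat = sqrt(p-hat * (1 - p-hat) / n) for x positive cases out of n."""
    p_hat = x / n
    # Normality conditions checked with p-hat in place of the unknown pi.
    # n * p_hat equals x and n * (1 - p_hat) equals n - x, so integer counts are compared.
    if x < 10 or n - x < 10:
        raise ValueError("need at least 10 positive and 10 negative cases")
    return sqrt(p_hat * (1 - p_hat) / n)


# Made-up counts for illustration: 40 positive cases in a sample of 100.
se = estimated_standard_error(40, 100)
print(round(se, 4))
```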

The width of a confidence interval is determined by the *margin of error*.


Margin of Error

The **margin of error** #(ME)# of a confidence interval for the population proportion #\pi# is the distance from the center of the interval, #\hat{p}#, to either the lower bound #L# or the upper bound #U#.

To calculate the margin of error of a confidence interval for the population proportion #\pi#, use the following formula:

\[\begin{array}{rcccl}ME &=& z^* \cdot s_{\hat{p}} &=& z^* \cdot \sqrt{\cfrac{\hat{p}\cdot(1-\hat{p})}{n}}\end{array}\]

Here, #z^*# is the **critical value** of the *Standard Normal Distribution* such that #\mathbb{P}(-z^* \leq Z \leq z^*) = \cfrac{C}{100}#.

Calculating z* with Statistical Software

Let #C# be the *confidence level* in #\%#.

To calculate the *critical value* #z^*# in Excel, make use of the function **NORM.INV()**:

\[=\text{NORM.INV}((100+C)/200, 0, 1)\]

To calculate the *critical value* #z^*# in R, make use of the function **qnorm()**:

\[\text{qnorm}(p=(100+C)/200, mean=0, sd=1,lower.tail = \text{TRUE})\]
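For readers working outside Excel or R, the same quantile lookup can be sketched in Python with the standard library's `statistics.NormalDist`. The helper name `critical_value` is an assumption made for this example; the argument #(100+C)/200# is the same cumulative probability passed to the two functions above.

```python
from statistics import NormalDist


def critical_value(confidence_pct: float) -> float:
    """Return z* such that P(-z* <= Z <= z*) = C/100 for a standard normal Z."""
    # (100 + C) / 200 is the cumulative probability to the left of z*.
    return NormalDist(mu=0, sigma=1).inv_cdf((100 + confidence_pct) / 200)


z_95 = critical_value(95)
print(round(z_95, 3))  # 1.96
```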

Factors that Influence the Margin of Error

The margin of error of a confidence interval for the population proportion #\pi# is dependent on #3# factors: the confidence level, the sample proportion, and the sample size.

- As the confidence level increases, the margin of error increases and the confidence interval becomes wider.
- As the sample proportion approaches a value of #0.5# (from either side), the margin of error increases and the confidence interval becomes wider.
- As the sample size increases, the margin of error decreases and the confidence interval becomes narrower.
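The three effects listed above can be checked numerically. The sketch below changes one factor at a time from a baseline; all input values are made up for illustration, and `margin_of_error` is a hypothetical helper that simply applies the formula #ME = z^* \cdot s_{\hat{p}}#.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p_hat: float, n: int, confidence_pct: float) -> float:
    """ME = z* * sqrt(p-hat * (1 - p-hat) / n)."""
    z_star = NormalDist().inv_cdf((100 + confidence_pct) / 200)
    return z_star * sqrt(p_hat * (1 - p_hat) / n)


# Baseline (illustrative values only): p-hat = 0.30, n = 200, C = 95.
base = margin_of_error(0.30, 200, 95)
higher_confidence = margin_of_error(0.30, 200, 99)  # only C raised
closer_to_half = margin_of_error(0.50, 200, 95)     # only p-hat moved to 0.5
larger_sample = margin_of_error(0.30, 800, 95)      # only n quadrupled

for label, value in [
    ("baseline", base),
    ("higher confidence", higher_confidence),
    ("p-hat closer to 0.5", closer_to_half),
    ("larger sample", larger_sample),
]:
    print(f"{label}: {value:.4f}")
```

Because #n# appears under a square root, quadrupling the sample size halves the margin of error.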

He randomly selects a sample of #150# from this population and finds that #X=27# of them are vegetarian/vegan.

Calculate the *margin of error* of the #99\%# confidence interval for the population proportion #\pi#. Round your answer to #3# decimal places.

#ME=0.081#

The *margin of error* can be calculated with either Excel or R. Both solutions are shown below.

The *margin of error* of a confidence interval for the population proportion #\pi# is calculated with the following formula:

\[ME=z^* \cdot s_{\hat{p}}\]

Calculate the *sample proportion* #\hat{p}#:

\[\hat{p}=\cfrac{X}{n}=\cfrac{27}{150}=0.18\]

Investigate whether the *sampling distribution of the sample proportion* may be considered approximately normal:

- #n\cdot \hat{p} = 150 \cdot 0.18 = 27 \geq 10#
- #n\cdot (1 -\hat{p}) = 150 \cdot (1-0.18) = 123 \geq 10#

Since both conditions are satisfied, the *sampling distribution of the sample proportion* is approximately normally distributed with parameters #\mu_{\hat{p}}=\pi# and #\sigma_{\hat{p}}=\sqrt{\cfrac{\pi \cdot (1 - \pi)}{n}}#.

However, because the population proportion #\pi# is unknown, the *standard error of the proportion* #\sigma_{\hat{p}}# cannot be calculated.

Instead, we will use the sample proportion #\hat{p}# to calculate the *estimated standard error of the proportion* #s_{\hat{p}}#:

\[s_{\hat{p}}=\sqrt{\cfrac{\hat{p} \cdot (1 - \hat{p})}{n}} = \sqrt{\cfrac{0.18 \cdot (1 -0.18)}{150}} = 0.03137\]

For a given *confidence level* #C#, the *critical value* #z^*# of the standard normal distribution is the value such that #\mathbb{P}(-z^* \leq Z \leq z^*)=\cfrac{C}{100}#.

To calculate this critical value #z^*# in Excel, make use of the following function:

NORM.INV(probability, mean, standard_dev)

- probability: A probability corresponding to the normal distribution.
- mean: The mean of the distribution.
- standard_dev: The standard deviation of the distribution.

Here, we have #C=99#. Thus, to calculate #z^*# such that #\mathbb{P}(-z^* \leq Z \leq z^*)=0.99#, run the following command:

\[\begin{array}{c}

=\text{NORM.INV}((100+C)/200, 0, 1)\\

\downarrow\\

=\text{NORM.INV}(199/200, 0, 1)

\end{array}\]

This gives:

\[z^* = 2.57583\]

With this information, the *margin of error* can be calculated:

\[ME=z^* \cdot s_{\hat{p}} = 2.57583 \cdot 0.03137 = 0.081\]

For a given *confidence level* #C#, the *critical value* #z^*# of the standard normal distribution is the value such that #\mathbb{P}(-z^* \leq Z \leq z^*)=\cfrac{C}{100}#.

To calculate this critical value #z^*# in R, make use of the following function:

qnorm(p, mean, sd, lower.tail)

- p: A probability corresponding to the normal distribution.
- mean: The mean of the distribution.
- sd: The standard deviation of the distribution.
- lower.tail: If TRUE (default), probabilities are #\mathbb{P}(X \leq x)#; otherwise, #\mathbb{P}(X \gt x)#.

Here, we have #C=99#. Thus, to calculate #z^*# such that #\mathbb{P}(-z^* \leq Z \leq z^*)=0.99#, run the following command:

\[\begin{array}{c}

\text{qnorm}(p = (100+C)/200, mean = 0, sd = 1, lower.tail = \text{TRUE})\\

\downarrow\\

\text{qnorm}(p =199/200, mean = 0, sd = 1, lower.tail = \text{TRUE})

\end{array}\]

This gives:

\[z^* = 2.57583\]

With this information, the *margin of error* can be calculated:

\[ME=z^* \cdot s_{\hat{p}} = 2.57583 \cdot 0.03137 = 0.081\]
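As a cross-check, the whole calculation for this example can be reproduced in a short Python script. This is a sketch using only the standard library; the variable names are chosen for the example, and the numbers #X=27#, #n=150# and #C=99# are the ones given in the exercise.

```python
from math import sqrt
from statistics import NormalDist

x, n, confidence_pct = 27, 150, 99

p_hat = x / n                                   # sample proportion
s_p_hat = sqrt(p_hat * (1 - p_hat) / n)         # estimated standard error
z_star = NormalDist().inv_cdf((100 + confidence_pct) / 200)  # critical value
me = z_star * s_p_hat                           # margin of error

print(round(p_hat, 2), round(s_p_hat, 5), round(z_star, 5), round(me, 3))
```

The rounded values agree with the intermediate results in the solution: #\hat{p}=0.18#, #s_{\hat{p}}=0.03137#, #z^*=2.57583# and #ME=0.081#.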


General Formula for a Confidence Interval for the Population Proportion

Assuming the *sampling distribution of the sample proportion* is (approximately) normal, the general formula for computing a #C\%\,CI# for the population proportion #\pi#, based on a random sample of size #n#, is:

\[CI_{\pi}=\bigg(\hat{p} - z^*\cdot \sqrt{\cfrac{\hat{p}\cdot(1-\hat{p})}{n}},\,\,\,\, \hat{p} + z^*\cdot \sqrt{\cfrac{\hat{p}\cdot(1-\hat{p})}{n}} \bigg)\]
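Written in terms of the margin of error defined earlier, the same interval reads as follows (a restatement of the formula above, using the symbols #L#, #U# and #ME# from the definition of the margin of error):

\[CI_{\pi} = (L,\,\,U) = \big(\hat{p} - ME,\,\,\,\, \hat{p} + ME\big), \qquad U - L = 2\cdot ME\]

In other words, the interval is centered at #\hat{p}# and its total width is twice the margin of error.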

Of these, #X=666# cultures showed some resistance to penicillin.

Construct a #99\%# confidence interval for the proportion of strep cultures among all Florida patients that are penicillin-resistant. Round your answers to #3# decimal places.

The *confidence interval* can be computed with either Excel or R. Both solutions are shown below.

Calculate the *sample proportion* #\hat{p}#:

\[\hat{p}=\cfrac{X}{n}=\cfrac{666}{1665}=0.4000\]

Investigate whether the *sampling distribution of the sample proportion* may be considered approximately normal:

- #n\cdot \hat{p} = 1665 \cdot 0.4000 = 666 \geq 10#
- #n\cdot (1 -\hat{p}) = 1665 \cdot (1-0.4000) = 999 \geq 10#

Since both conditions are satisfied, the *sampling distribution of the sample proportion* is approximately normal.

Assuming the *sampling distribution of the sample proportion* is (approximately) normal, the general formula for computing a #C\%\,CI# for the population proportion #\pi#, based on a random sample of size #n#, is:

\[CI_{\pi}=\bigg(\hat{p} - z^*\cdot \sqrt{\cfrac{\hat{p}\cdot(1-\hat{p})}{n}},\,\,\,\, \hat{p} + z^*\cdot \sqrt{\cfrac{\hat{p}\cdot(1-\hat{p})}{n}} \bigg)\]

For a given *confidence level* #C#, the *critical value* #z^*# of the standard normal distribution is the value such that #\mathbb{P}(-z^* \leq Z \leq z^*)=\cfrac{C}{100}#.

To calculate this critical value #z^*# in Excel, make use of the following function:

NORM.INV(probability, mean, standard_dev)

- probability: A probability corresponding to the normal distribution.
- mean: The mean of the distribution.
- standard_dev: The standard deviation of the distribution.

Here, we have #C=99#. Thus, to calculate #z^*# such that #\mathbb{P}(-z^* \leq Z \leq z^*)=0.99#, run the following command:

\[\begin{array}{c}

=\text{NORM.INV}((100+C)/200, 0, 1)\\

\downarrow\\

=\text{NORM.INV}(199/200, 0, 1)

\end{array}\]

This gives:

\[z^* = 2.5758\]

Calculate the lower bound #L# of the confidence interval:

\[L = \hat{p} - z^*\cdot \sqrt{\cfrac{\hat{p}\cdot(1-\hat{p})}{n}} = 0.4000 - 2.5758 \cdot \sqrt{\cfrac{0.4000 \cdot (1-0.4000)}{1665}} = 0.369\]

Calculate the upper bound #U# of the confidence interval:

\[U = \hat{p} + z^*\cdot \sqrt{\cfrac{\hat{p}\cdot(1-\hat{p})}{n}} = 0.4000 + 2.5758 \cdot \sqrt{\cfrac{0.4000 \cdot (1-0.4000)}{1665}} = 0.431\]

Thus, the #99\%# confidence interval for the population proportion #\pi# is:

\[CI_{\pi,\,99\%}=(0.369,\,\,\, 0.431)\]

For a given *confidence level* #C#, the *critical value* #z^*# of the standard normal distribution is the value such that #\mathbb{P}(-z^* \leq Z \leq z^*)=\cfrac{C}{100}#.

To calculate this critical value #z^*# in R, make use of the following function:

qnorm(p, mean, sd, lower.tail)

- p: A probability corresponding to the normal distribution.
- mean: The mean of the distribution.
- sd: The standard deviation of the distribution.
- lower.tail: If TRUE (default), probabilities are #\mathbb{P}(X \leq x)#; otherwise, #\mathbb{P}(X \gt x)#.

Here, we have #C=99#. Thus, to calculate #z^*# such that #\mathbb{P}(-z^* \leq Z \leq z^*)=0.99#, run the following command:

\[\begin{array}{c}

\text{qnorm}(p = (100+C)/200, mean = 0, sd = 1, lower.tail = \text{TRUE})\\

\downarrow\\

\text{qnorm}(p =199/200, mean = 0, sd = 1, lower.tail = \text{TRUE})

\end{array}\]

This gives:

\[z^* = 2.5758\]

Calculate the lower bound #L# of the confidence interval:

\[L = \hat{p} - z^*\cdot \sqrt{\cfrac{\hat{p}\cdot(1-\hat{p})}{n}} = 0.4000 - 2.5758 \cdot \sqrt{\cfrac{0.4000 \cdot (1-0.4000)}{1665}} = 0.369\]

Calculate the upper bound #U# of the confidence interval:

\[U = \hat{p} + z^*\cdot \sqrt{\cfrac{\hat{p}\cdot(1-\hat{p})}{n}} = 0.4000 + 2.5758 \cdot \sqrt{\cfrac{0.4000 \cdot (1-0.4000)}{1665}} = 0.431\]

Thus, the #99\%# confidence interval for the population proportion #\pi# is:

\[CI_{\pi,\,99\%}=(0.369,\,\,\, 0.431)\]
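The full procedure (condition check, critical value, bounds) can also be wrapped in a single function. The following Python sketch is one possible way to do this; `proportion_ci` is a hypothetical helper written for this example, applied to the exercise data #X=666#, #n=1665# and #C=99#.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(x: int, n: int, confidence_pct: float) -> tuple[float, float]:
    """Return (L, U) of the C% confidence interval for a population proportion."""
    # Normality conditions: at least 10 positive and 10 negative cases.
    if x < 10 or n - x < 10:
        raise ValueError("normal approximation not justified for these counts")
    p_hat = x / n
    z_star = NormalDist().inv_cdf((100 + confidence_pct) / 200)
    me = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - me, p_hat + me


lower, upper = proportion_ci(666, 1665, 99)
print(round(lower, 3), round(upper, 3))  # 0.369 0.431
```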


Controlling the Margin of Error

Suppose you would like the *margin of error* for a #C\%# confidence interval for the population proportion #\pi# to be no larger than #k#.

Then the minimum sample size required is

\[n=0.25 \cdot \Big(\cfrac{z^*}{k}\Big)^2,\]

rounded up to the next whole number.
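The factor #0.25# is not explained in the text above. A short derivation, starting from the margin-of-error formula and assuming the requirement #ME \leq k#, shows where it comes from:

\[\begin{array}{rcl}
z^* \cdot \sqrt{\cfrac{\hat{p}\cdot(1-\hat{p})}{n}} \leq k &\Longleftrightarrow& n \geq \hat{p}\cdot(1-\hat{p}) \cdot \Big(\cfrac{z^*}{k}\Big)^2
\end{array}\]

Since #\hat{p}# is not known before the sample is drawn, the product #\hat{p}\cdot(1-\hat{p})# is replaced by its largest possible value, #0.5 \cdot 0.5 = 0.25#, which is reached at #\hat{p}=0.5#. A sample size of at least #0.25 \cdot \Big(\cfrac{z^*}{k}\Big)^2# therefore guarantees #ME \leq k# whatever the sample proportion turns out to be.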

If the researcher wants the *margin of error* of the #94\%# confidence interval for the population proportion #\pi# to be no larger than #0.03#, what is the minimum sample size she needs to achieve this goal?

#n \geq 983#

The *minimum sample size* can be calculated with either Excel or R. Both solutions are shown below.

For a given *confidence level* #C#, the *critical value* #z^*# of the standard normal distribution is the value such that #\mathbb{P}(-z^* \leq Z \leq z^*)=\cfrac{C}{100}#.

To calculate this critical value #z^*# in Excel, make use of the following function:

NORM.INV(probability, mean, standard_dev)

- probability: A probability corresponding to the normal distribution.
- mean: The mean of the distribution.
- standard_dev: The standard deviation of the distribution.

Here, we have #C=94#. Thus, to calculate #z^*# such that #\mathbb{P}(-z^* \leq Z \leq z^*)=0.94#, run the following command:

\[\begin{array}{c}

=\text{NORM.INV}((100+C)/200, 0, 1)\\

\downarrow\\

=\text{NORM.INV}(194/200, 0, 1)

\end{array}\]

This gives:

\[z^* = 1.8808\]

With this information, the *minimum sample size* can be calculated:

\[n=0.25 \cdot \Big(\cfrac{z^*}{k}\Big)^2=0.25 \cdot \Big(\cfrac{1.8808}{0.03}\Big)^2=982.607\]

Rounding this value up gives #n=983#.

Thus, for the *margin of error* to be no larger than #0.03#, you need a sample size of at least #983#.

For a given *confidence level* #C#, the *critical value* #z^*# of the standard normal distribution is the value such that #\mathbb{P}(-z^* \leq Z \leq z^*)=\cfrac{C}{100}#.

To calculate this critical value #z^*# in R, make use of the following function:

qnorm(p, mean, sd, lower.tail)

- p: A probability corresponding to the normal distribution.
- mean: The mean of the distribution.
- sd: The standard deviation of the distribution.
- lower.tail: If TRUE (default), probabilities are #\mathbb{P}(X \leq x)#; otherwise, #\mathbb{P}(X \gt x)#.

Here, we have #C=94#. Thus, to calculate #z^*# such that #\mathbb{P}(-z^* \leq Z \leq z^*)=0.94#, run the following command:

\[\begin{array}{c}

\text{qnorm}(p = (100+C)/200, mean = 0, sd = 1, lower.tail = \text{TRUE})\\

\downarrow\\

\text{qnorm}(p =194/200, mean = 0, sd = 1, lower.tail = \text{TRUE})

\end{array}\]

This gives:

\[z^* = 1.8808\]

With this information, the *minimum sample size* can be calculated:

\[n=0.25 \cdot \Big(\cfrac{z^*}{k}\Big)^2=0.25 \cdot \Big(\cfrac{1.8808}{0.03}\Big)^2=982.607\]

Rounding this value up gives #n=983#.

Thus, for the *margin of error* to be no larger than #0.03#, you need a sample size of at least #983#.
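The sample-size calculation, including the rounding-up step, can be sketched in Python as well. `min_sample_size` is a hypothetical helper name; it applies the formula #n=0.25 \cdot (z^*/k)^2# with the unrounded critical value and then rounds up.

```python
from math import ceil, sqrt
from statistics import NormalDist


def min_sample_size(max_margin: float, confidence_pct: float) -> int:
    """Smallest n with ME <= max_margin in the worst case p-hat = 0.5."""
    z_star = NormalDist().inv_cdf((100 + confidence_pct) / 200)
    return ceil(0.25 * (z_star / max_margin) ** 2)


n_min = min_sample_size(0.03, 94)
print(n_min)  # 983

# Worst-case margin of error at n_min, as a sanity check on the rounding.
z_star = NormalDist().inv_cdf((100 + 94) / 200)
worst_case_me = z_star * sqrt(0.25 / n_min)
print(round(worst_case_me, 4))
```

Rounding down to #982# would leave the worst-case margin of error slightly above #0.03#, which is why the result is always rounded up.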
