[A, SfS] Chapter 6: Hypothesis Testing: 6.2: Test for Population Mean
Hypothesis Test for a Population Mean
Hypothesis Test for a Population Mean
In this section, we will look at how to test whether the mean of a quantitative variable measured on a population differs from some benchmark value.
Suppose we have a continuous variable #X# whose mean value when measured on a specified population is denoted #\mu#.
Research Question and Hypotheses
The research question of a hypothesis test for a population mean is whether or not #\mu# differs from some benchmark value #\mu_0#.
Depending on the direction of the test, a hypothesis test for a population mean has one of the following pairs of hypotheses:
Twotailed  Lefttailed  Righttailed 



Test Statistic and Null Distribution
Suppose a random sample of size #n# is selected from the population and #X# is measured on the sample, from which we obtain the sample mean #\bar{x}# and the sample standard deviation #s#.
As with confidence intervals, there are several conditions that guide the procedure we follow:
1) The distribution of #X# on the population is a normal distribution;
2) We know the value of the variance #\sigma^2# of #X# on the population;
3) The sample size #n# is large.
Test statistic  Null distribution  Use when 
\[Z=\cfrac{\bar{X}  \mu_0}{\sigma/\sqrt{n}}\]  \[N(0,1)\] 

\[Z = \cfrac{\bar{X}  \mu_0}{s/\sqrt{n}}\]  \[N(0,1)\] 

\[T = \cfrac{\bar{X}  \mu_0}{s/\sqrt{n}}\]  \[t_{n1}\] 

Calculating the Pvalue
The calculation of the Pvalue of a hypothesis test for a population mean #\mu# depends on which form of #H_1# is being considered, and which test statistic is being used. We present in the below tables the Pvalue calculation, including the #\mathrm{R}# commands.
Test statistic: #Z \sim N(0,1)#; Computed value based on sample data: #z#.
#\begin{array}{lllll}
\phantom{0}\text{Direction}&\phantom{0000}H_0&\phantom{0000}H_1&\phantom{000}p\text{value}&\phantom{0000000}\text{R Command}\\
\hline
\text{Twotailed}&H_0:\mu = \mu_0&H_1:\mu \neq \mu_0&2\cdot \mathbb{P}(Z\geq z)&2 \text{ * }\text{pnorm}(\text{abs}(z),0,1, \text{low=FALSE})\\
\text{Lefttailed}&H_0:\mu \geq \mu_0&H_1:\mu \lt \mu_0&\mathbb{P}(Z\leq z)&\text{pnorm}(z,0,1, \text{low=TRUE})\\
\text{Righttailed}&H_0:\mu \leq \mu_0&H_1:\mu \gt \mu_0&\mathbb{P}(Z\geq z)&\text{pnorm}(z,0,1, \text{low=FALSE})\\
\end{array}#
#H_1: \mu \neq \mu_0#  #H_1: \mu \lt \mu_0#  #H_1: \mu \gt \mu_0# 
Test statistic: #T \sim t_{n1}#; Computed value based on sample data: #t# with sample size #n#.
#\begin{array}{lllll}
\phantom{0}\text{Direction}&\phantom{0000}H_0&\phantom{0000}H_1&\phantom{000}p\text{value}&\phantom{0000000}\text{R Command}\\
\hline
\text{Twotailed}&H_0:\mu = \mu_0&H_1:\mu \neq \mu_0&2\cdot \mathbb{P}(T\geq T)&2 \text{ * }\text{pt}(\text{abs}(t),n\text{  }1,\text{low=FALSE})\\
\text{Lefttailed}&H_0:\mu \geq \mu_0&H_1:\mu \lt \mu_0&\mathbb{P}(T\leq T)&\text{pt}(t,n\text{  }1, \text{low=TRUE})\\
\text{Righttailed}&H_0:\mu \leq \mu_0&H_1:\mu \gt \mu_0&\mathbb{P}(T\geq t)&\text{pt}(t,n\text{  }1, \text{low=FALSE})\\
\end{array}#
#H_1: \mu \neq \mu_0#  #H_1: \mu \lt \mu_0#  #H_1: \mu \gt \mu_0# 
If the Pvalue is larger than the significance level #\alpha#, then the evidence against #H_0# is not convincing. Otherwise, we would reject #H_0# and conclude that there is sufficient evidence to support #H_1#.
Note that in #\mathrm{R}#, #\mathtt{abs(z)}# or #\mathtt{abs(t)}# compute the absolute value of the test statistic. You don’t need to use the #\mathtt{abs()}# command in practice, as long as you enter the absolute value of #z# or #t# yourself.
Suppose it is hypothesized that the mean pregnancy length for expectant mothers in a certain region is less than #270# days. Thus, we have the following hypotheses:
\[H_0: \mu \geq 270 \\
H_1: \mu < 270\]
Assume pregnancy lengths for expectant mothers in the region are normally distributed, and set #\alpha = 0.05#.
From a random sample of #n = 25# expectant mothers from the region, the sample mean pregnancy length is #\bar{x} = 264# and the sample variance is #s^2 = 150.80#.
The test statistic is: \[t = \cfrac{264  270}{\sqrt{150.80/25}} \approx 2.443\]
The sample size is small, so the corresponding Pvalue is found using the #t_{24}# distribution: \[P(T \leq 2.443) \approx 0.011\]
using the #\mathrm{R}# command:
pt(2.443, 24, low = TRUE)
Since this Pvalue is less than #\alpha = 0.05#, we conclude in favor of #H_1#, i.e., that the mean pregnancy length for expectant mothers in that region is less than #270# days.
If instead, we had #n = 50# expectant mothers in the sample, then whether or not pregnancy lengths are normallydistributed in the region, we can use the normal distribution to compute the Pvalue, on the basis of the Central Limit Theorem.
Suppose the sample mean and sample variance are the same as above.
Then in this case: \[z = \cfrac{264  270}{\sqrt{150.80/50}} \approx 3.455\]
and the approximate Pvalue is: \[P(Z \leq 3.455) \approx 0.000275\]
using the #\mathrm{R}# command:
pnorm(3.455, 0, 1, low = TRUE)
Since the Pvalue is less than #\alpha = 0.05#, we conclude in favor of #H_1#, i.e., that the mean pregnancy length for expectant mothers in that region is less than #270# days.
Or visit omptest.org if jou are taking an OMPT exam.