Hypothesis Test for a Difference in Two Population Means
In this section, we will look at how to test whether the difference between the means of a continuous variable measured on two distinct populations differs from some benchmark value.
Suppose there are two distinct, independent populations, which we arbitrarily label 1 and 2, and the continuous variable #X# is measured on random samples of sizes #n_1# and #n_2# from the two populations, respectively.
For population 1 the mean and variance of #X# are #\mu_1# and #\sigma_1^2#, respectively. For population 2 the mean and variance of #X# are #\mu_2# and #\sigma_2^2#, respectively.
The mean of the sample from population 1 is #\bar{x}_1#, with sample variance #s^2_1#. The mean of the sample from population 2 is #\bar{x}_2#, with sample variance #s^2_2#.
Research Question and Hypotheses
The research question of a hypothesis test for a difference in two population means is whether or not the difference between #\mu_1# and #\mu_2# differs from some value #\Delta_0#. Usually #\Delta_0 = 0#, so we will use #0# in this course.
Depending on the expected direction of the difference, a hypothesis test for a difference in two population means has one of the following pairs of hypotheses:
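Two-tailed | Left-tailed | Right-tailed |
\[H_0: \mu_1 - \mu_2 = 0\] | \[H_0: \mu_1 - \mu_2 \geq 0\] | \[H_0: \mu_1 - \mu_2 \leq 0\] |
\[H_1: \mu_1 - \mu_2 \neq 0\] | \[H_1: \mu_1 - \mu_2 < 0\] | \[H_1: \mu_1 - \mu_2 > 0\] |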
Test Statistic and Null Distribution
As with confidence intervals in this setting, there are four conditions to consider that determine how to proceed with the hypothesis test:
1) The distribution of #X# in each population is normal.
2) The variances #\sigma_1^2# and #\sigma_2^2# of #X# on the two populations are both known.
3) The population variances are equal, i.e., #\sigma_1^2 = \sigma_2^2 = \sigma^2#.
4) Both sample sizes #n_1# and #n_2# are large.
Test statistic | Null distribution | Use when |
\[Z = \cfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\cfrac{\sigma_1^2}{n_1}+\cfrac{\sigma_2^2}{n_2}}}\] | \[N(0,1)\] | Condition 2 holds (variances known, not assumed equal), and condition 1 or condition 4 holds |
\[Z= \cfrac{\bar{X}_1 - \bar{X}_2}{\sigma \sqrt{\cfrac{1}{n_1}+\cfrac{1}{n_2}}}\] | \[N(0,1)\] | Conditions 2 and 3 hold (variances known and equal), and condition 1 or condition 4 holds |
\[Z = \cfrac{\bar{X}_1-\bar{X}_2}{\sqrt{\cfrac{s_1^2}{n_1}+\cfrac{s_2^2}{n_2}}}\] | \[N(0,1)\] | Condition 4 holds (large samples); variances unknown and not assumed equal |
\[Z=\cfrac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\cfrac{1}{n_1}+\cfrac{1}{n_2}}}\\\phantom{0}\\ \text{where} \\\phantom{0}\\ s_p = \sqrt{\cfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}\] | \[N(0,1)\] | Conditions 3 and 4 hold (large samples, equal variances); variances unknown |
\[T = \cfrac{\bar{X}_1-\bar{X}_2}{\sqrt{\cfrac{s_1^2}{n_1}+\cfrac{s_2^2}{n_2}}}\] | \[t_\nu\\\phantom{0}\\ \text{where} \\\phantom{0}\\ \nu = \cfrac{\Big(\cfrac{s_1^2}{n_1} + \cfrac{s_2^2}{n_2} \Big)^2}{\cfrac{(s_1^2/n_1)^2}{n_1 - 1} + \cfrac{(s_2^2/n_2)^2}{n_2 - 1}}\] | Condition 1 holds (normal populations); variances unknown and not assumed equal; samples small |
\[T = \cfrac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\cfrac{1}{n_1}+\cfrac{1}{n_2}}}\\\phantom{0}\\ \text{where} \\\phantom{0}\\ s_p = \sqrt{\cfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}\] | \[t_\nu\\\phantom{0}\\ \text{where} \\\phantom{0}\\ \nu = n_1 + n_2 - 2\] | Conditions 1 and 3 hold (normal populations, equal variances); variances unknown; samples small |
These last two settings, in which the Student's t-distribution is used, are the most typical in scientific research. When a scientist talks about doing a "t-test", one of these two settings is usually what is meant.
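As a minimal sketch of the two t-based settings, the following #\mathrm{R}# function computes the test statistic and degrees of freedom from summary statistics. The function name `two_sample_t` and the argument `equal_var` are hypothetical names chosen for this illustration, and the sketch assumes #\Delta_0 = 0# as in the rest of this section.

```r
# Hypothetical helper (not part of base R): two-sample t statistic and
# degrees of freedom from summary statistics, assuming Delta_0 = 0.
two_sample_t <- function(xbar1, xbar2, s1, s2, n1, n2, equal_var = FALSE) {
  if (equal_var) {
    # Equal variances assumed: pooled standard deviation, nu = n1 + n2 - 2
    sp <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2))
    se <- sp * sqrt(1 / n1 + 1 / n2)
    nu <- n1 + n2 - 2
  } else {
    # Variances not assumed equal: separate variance estimates and the
    # degrees-of-freedom formula from the table
    v1 <- s1^2 / n1
    v2 <- s2^2 / n2
    se <- sqrt(v1 + v2)
    nu <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  }
  list(statistic = (xbar1 - xbar2) / se, df = nu)
}

# Summary statistics taken from the tablet dissolution example in this section
welch  <- two_sample_t(255.8, 270.7, 8.216, 11.903, 6, 8)
pooled <- two_sample_t(255.8, 270.7, 8.216, 11.903, 6, 8, equal_var = TRUE)
```

Here `welch$statistic` and `welch$df` correspond to the unequal-variance row of the table, and `pooled$df` equals #n_1 + n_2 - 2#.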
Calculating Degrees of Freedom Using R
As mentioned previously, you can add the following user-defined function to your #\mathrm{R}# workspace if you need to compute the degrees of freedom #\nu# for a two-sample t-test when the population variances are not assumed equal:
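# Degrees of freedom for a two-sample t-test with unequal variances
DF <- function(sd1, sd2, n1, n2) {
  v1 <- sd1^2 / n1  # estimated variance of the first sample mean
  v2 <- sd2^2 / n2  # estimated variance of the second sample mean
  (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
}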
For example, suppose #s_1 = 1.7#, #s_2 = 1.9#, #n_1 = 12# and #n_2 = 13#. Then
> DF(1.7,1.9,12,13)
would return the degrees of freedom #\nu \approx 22.98#.
Once data are collected and the appropriate test statistic is computed based on the four considerations, the P-value is computed using the same methods discussed for a Hypothesis Test for a Population Mean. You will have to check which form of the research hypothesis is being tested, and which distribution corresponds to the test statistic.
Given a significance level #\alpha#, if the P-value is larger than #\alpha# then do not reject the null hypothesis. Otherwise, you can conclude that the difference between the two sample means is statistically significant at significance level #\alpha#, and reject #H_0# in favor of #H_1#.
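The P-value calculation and decision rule can be sketched in #\mathrm{R}# as follows. The function name `p_value` is a hypothetical name for this illustration; the labels `"two.sided"`, `"less"` and `"greater"` are borrowed from the `alternative` argument of `t.test`. The sketch relies on `pt` accepting `df = Inf`, in which case it coincides with the standard normal distribution, so one function covers both null distributions.

```r
# Hypothetical helper: P-value for a given test statistic.
# df = Inf (the default) corresponds to the N(0,1) null distribution;
# a finite df corresponds to the t distribution with that many degrees of freedom.
p_value <- function(stat, alternative = c("two.sided", "less", "greater"), df = Inf) {
  alternative <- match.arg(alternative)
  switch(alternative,
    less      = pt(stat, df),                                  # left-tailed
    greater   = pt(stat, df, lower.tail = FALSE),              # right-tailed
    two.sided = 2 * pt(abs(stat), df, lower.tail = FALSE)      # two-tailed
  )
}

alpha <- 0.01
p <- p_value(-2.98, alternative = "less")  # left-tailed test with a Z statistic
reject_H0 <- p <= alpha                    # do not reject when p > alpha
```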
Example:
A study published in 2005 investigated the effectiveness of giving blood plasma containing complement component C4A to pediatric cardiopulmonary bypass patients.
Of #58# patients receiving C4A-rich plasma, the average length of hospital stay was #9.1# days and the standard deviation was #2.9# days. Of #59# patients receiving C4A-free plasma, the average length of hospital stay was #10.9# days and the standard deviation was #3.6# days.
Can we conclude, at significance level #0.01#, that the mean hospital stay for all patients receiving C4A-rich plasma is shorter than the mean hospital stay for all patients receiving C4A-free plasma?
Solution:
We are testing #H_0: \mu_1 - \mu_2 \geq 0# against #H_1 : \mu_1 - \mu_2 < 0#, where #\mu_1# is the mean hospital stay for all patients receiving C4A-rich plasma, and #\mu_2# is the mean hospital stay for all patients receiving C4A-free plasma.
We don't have any mention of the distributions of hospital stay duration in the two populations, but the sample sizes are both large. The population variances are not given, and we are not told that these variances are equal.
So the test statistic is \[z = \cfrac{9.1 - 10.9}{\sqrt{\cfrac{2.9^2}{58} + \cfrac{3.6^2}{59}}} \approx -2.98\]
The P-value is computed in #\mathrm{R}# using
> pnorm(-2.98, 0, 1, lower.tail = TRUE)
to be #0.0014#, which is smaller than the significance level of #0.01#.
Therefore we reject #H_0# and conclude that the mean hospital stay for all patients receiving C4A-rich plasma is indeed shorter than the mean hospital stay for all patients receiving C4A-free plasma.
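For readers who want to reproduce this calculation, the following #\mathrm{R}# sketch recomputes the test statistic and P-value from the summary statistics given in the example. The variable names are chosen for this illustration only. Because the statistic is not rounded to two decimals before calling `pnorm`, the P-value may differ slightly from the hand calculation in later decimal places.

```r
# Summary statistics from the example (group 1: C4A-rich, group 2: C4A-free)
xbar1 <- 9.1;  s1 <- 2.9; n1 <- 58
xbar2 <- 10.9; s2 <- 3.6; n2 <- 59

# Large samples, variances unknown and not assumed equal: Z statistic
z <- (xbar1 - xbar2) / sqrt(s1^2 / n1 + s2^2 / n2)

# Left-tailed test: P-value is the area to the left of z under N(0,1)
p <- pnorm(z, lower.tail = TRUE)

alpha <- 0.01
reject_H0 <- p <= alpha
```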
Example:
In a study of the relationship of the shape of a tablet to its dissolution time, #6# disc-shaped ibuprofen tablets and #8# oval-shaped ibuprofen tablets were dissolved in water.
The dissolution times, in seconds, were as follows:
Disc: #269.0,249.3,255.2,252.7,247.0,261.6#
Oval: #268.8,260.0,273.5,253.9,278.5,289.4,261.6,280.2#
Assume that dissolution times are normally distributed among all disc-shaped tablets with mean #\mu_1#, and that dissolution times are normally distributed among all oval-shaped tablets with mean #\mu_2#.
Can we conclude, at significance level #0.05#, that the mean dissolution times differ between the two shapes?
Solution:
We test #H_0: \mu_1 = \mu_2# against #H_1: \mu_1 \neq \mu_2#.
We assume that dissolution times are normally distributed in both populations. The population variances are unknown and are not assumed equal, and the sample sizes are small.
The sample mean and sample standard deviation for the #6# disc-shaped tablets are #255.8# and #8.216#, respectively. The sample mean and sample standard deviation for the #8# oval-shaped tablets are #270.7# and #11.903#, respectively.
The test statistic is \[t = \cfrac{255.8 - 270.7}{\sqrt{\cfrac{8.216^2}{6} + \cfrac{11.903^2}{8}}} \approx -2.769\] with degrees of freedom \[\nu = \cfrac{\Big(\cfrac{8.216^2}{6} + \cfrac{11.903^2}{8} \Big)^2}{\cfrac{(8.216^2/6)^2}{6 - 1} + \cfrac{(11.903^2/8)^2}{8 - 1}} \approx 11.96\] Because the test is two-tailed, the P-value is twice the tail area beyond #|t| = 2.769#, computed in #\mathrm{R}# using
> 2*pt(2.769, 11.96, lower.tail = FALSE)
to be #0.017#, which is smaller than the significance level #0.05#. Therefore we reject #H_0# and conclude that the mean dissolution times differ between the two shapes.
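As a cross-check, the same test can be run on the raw data with the built-in `t.test` function, which by default does not assume equal variances. This is a sketch using the dissolution times listed above; the variable names `disc`, `oval` and `res` are chosen for this illustration. The output should be close to the hand calculation, with small differences because the hand calculation uses rounded sample means and standard deviations.

```r
# Dissolution times in seconds, copied from the example
disc <- c(269.0, 249.3, 255.2, 252.7, 247.0, 261.6)
oval <- c(268.8, 260.0, 273.5, 253.9, 278.5, 289.4, 261.6, 280.2)

# Two-tailed two-sample t-test, variances not assumed equal
res <- t.test(disc, oval, alternative = "two.sided", var.equal = FALSE)

res$statistic  # t statistic (disc mean minus oval mean, so it is negative)
res$parameter  # degrees of freedom nu
res$p.value    # two-tailed P-value

reject_H0 <- res$p.value <= 0.05
```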