Hypothesis Tests about the Difference of Means
of Two Independent Arbitrary Populations

Consider two independent populations X and Y having unknown means µx and µy respectively. We wish to test, with level of significance a, the following three null hypotheses about the difference in means µx - µy:

 1. Ho: µx - µy <= M 2. Ho: µx - µy >= M 3. Ho: µx - µy = M

Our decision to accept or reject each hypothesis is based on the difference in sample means Xbar - Ybar from large independent random samples (of sizes n >= 30 and m >= 30 respectively) drawn from each population.

For this scenario, each population could be "large" or "small" relative to the respective sample size. If Population X is of a small finite size N, then the sample deviation Sx is adjusted by multiplying it by the finite population correction factor Sqrt[ (N - n) / (N - 1) ] . The sample deviation Sy is adjusted similarly if the second population is a small finite size.
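As a minimal sketch of this adjustment in Python (the function names fpc and adjusted_dev are illustrative and not part of any program described here), the correction factor can be computed and applied only when a finite population size is supplied. The numbers used are those of Exercise 1 below: a sample of 36 from a finite control group of 150 with Sy = 6.9497.

```python
import math

def fpc(pop_size, sample_size):
    """Finite population correction factor Sqrt[(N - n) / (N - 1)]."""
    return math.sqrt((pop_size - sample_size) / (pop_size - 1))

def adjusted_dev(sample_dev, sample_size, pop_size=None):
    """Sample deviation, multiplied by the correction factor only when
    a finite population size is given (None means a large population)."""
    if pop_size is None:
        return sample_dev
    return sample_dev * fpc(pop_size, sample_size)

# Sample of 36 from a finite population of 150 (values from Exercise 1).
print(round(fpc(150, 36), 4))
print(round(adjusted_dev(6.9497, 36, 150), 4))
```

The factor is always less than 1 when n > 1, so the correction shrinks the deviation and therefore enlarges the magnitude of the test statistic.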

We define the test statistic by x = (Xbar - Ybar - M) / Sqrt[ (Sx)^2 / n + (Sy)^2 / m ]. For large sample sizes n and m, the test statistic follows an approximate standard normal distribution Z ~ N(0,1). Therefore, we can compute the left and right tail probability values created by the test statistic: P(Z <= x) and P(Z >= x).
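The computation just described can be sketched in Python as follows. The helper names z_statistic and tail_values are assumptions made for illustration, and the standard normal probabilities come from the standard library's statistics.NormalDist. The inputs are the summary statistics of Exercise 2, part (a), with M = 0.

```python
import math
from statistics import NormalDist

def z_statistic(xbar, ybar, sx, sy, n, m, M):
    """x = (Xbar - Ybar - M) / Sqrt[Sx^2 / n + Sy^2 / m]."""
    return (xbar - ybar - M) / math.sqrt(sx**2 / n + sy**2 / m)

def tail_values(x):
    """Return (P(Z <= x), P(Z >= x)) for a standard normal Z."""
    left = NormalDist().cdf(x)
    return left, 1.0 - left

# Summary statistics from Exercise 2, part (a), with M = 0.
x = z_statistic(3.3575, 3.094736842, 0.3715093953, 0.629327126, 40, 38, 0)
left, right = tail_values(x)
print(round(x, 4), round(left, 4), round(right, 4))
```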

We reject the first hypothesis µx - µy <= M when Xbar - Ybar is too large, which means the right tail value will be too small: P(Z >= x) < a. This test is equivalent to the test Ho: µx - µy = M with the one-sided alternative Ha: µx - µy > M.

Likewise, we reject the second hypothesis µx - µy >= M if Xbar - Ybar is too small, which means the left tail value will be too small: P(Z <= x) < a. This test is equivalent to the test Ho: µx - µy = M with the one-sided alternative Ha: µx - µy < M.

If P(Z >= x) < a / 2 or P(Z <= x) < a / 2, then we reject the third hypothesis µx - µy = M. The p-value for this two-sided test is always twice the smaller of the two tail values.
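The three rejection rules can be collected into one small Python function. This is only a sketch: the name decisions and the dictionary keys are hypothetical. Comparing twice the smaller tail value with a is the same as comparing the smaller tail value with a / 2.

```python
def decisions(left_tail, right_tail, alpha):
    """Apply the three rejection rules to the tail values of the test statistic."""
    p_two_sided = 2 * min(left_tail, right_tail)
    return {
        "reject_1": right_tail < alpha,    # Ho: mu_x - mu_y <= M
        "reject_2": left_tail < alpha,     # Ho: mu_x - mu_y >= M
        "reject_3": p_two_sided < alpha,   # Ho: mu_x - mu_y = M
        "p_two_sided": p_two_sided,
    }

# Right tail 0.0623 at the 0.10 level (the values found in Solution 1).
result = decisions(1 - 0.0623, 0.0623, 0.10)
print(result)
```

With these inputs only the first hypothesis is rejected, since the two-sided p-value of 0.1246 exceeds 0.10.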

Using the Z2MNTEST Program

The Z2MNTEST program can be used to perform these hypothesis tests. To execute the program, we first enter 1, 2, 3, or 4 to designate the types of populations under study: (1) two large populations; (2) the first large and the second finite; (3) the first finite and the second large; or (4) two finite populations. If a population is finite, then we enter its population size. Next, we enter the value of the difference M to be tested, the sample sizes, the sample means, the sample deviations, and the desired level of significance. The program displays the conclusion for each test, the test statistic, and the left and right tail values.
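For readers without the calculator program, the same steps can be approximated in Python. The function below is a hypothetical analog written for illustration, not the source of Z2MNTEST; its name, parameters, and return keys are assumptions. The four population types are expressed through the optional arguments pop_x and pop_y, where None stands for a large population. The sample call uses the inputs of the example that follows.

```python
import math
from statistics import NormalDist

def z2mntest(M, n, xbar, sx, m, ybar, sy, alpha, pop_x=None, pop_y=None):
    """Hypothetical Python analog of the test described above.

    pop_x / pop_y are finite population sizes, or None for a large
    population (type 1: both None; type 4: both given)."""
    # Apply the finite population correction where a size is given.
    if pop_x is not None:
        sx = sx * math.sqrt((pop_x - n) / (pop_x - 1))
    if pop_y is not None:
        sy = sy * math.sqrt((pop_y - m) / (pop_y - 1))

    x = (xbar - ybar - M) / math.sqrt(sx**2 / n + sy**2 / m)
    left = NormalDist().cdf(x)   # P(Z <= x)
    right = 1.0 - left           # P(Z >= x)

    return {
        "statistic": x,
        "left_tail": left,
        "right_tail": right,
        "reject_1": right < alpha,                   # Ho: difference <= M
        "reject_2": left < alpha,                    # Ho: difference >= M
        "reject_3": 2 * min(left, right) < alpha,    # Ho: difference = M
    }

# Inputs of the starting-income example: two large populations.
out = z2mntest(3000, 650, 38560, 4350, 700, 34920, 3870, 0.05)
print(round(out["statistic"], 4), round(out["right_tail"], 4))
print(out["reject_1"], out["reject_2"], out["reject_3"])
```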

Example. A random survey of 650 workers with a college degree yields an average starting income of Xbar = \$38,560 with a standard deviation of Sx = \$4350. An independent random survey of 700 workers without a college degree yields an average starting income of Ybar = \$34,920 with a standard deviation of Sy = \$3870. At the 0.05 level of significance, test the following three hypotheses about the true difference in average starting incomes µx - µy among all workers with and without college degrees:

 1. Ho: µx - µy <= 3000 2. Ho: µx - µy >= 3000 3. Ho: µx - µy = 3000

Solution. After calling up the Z2MNTEST program, first enter 1 to designate two large populations (presumably all workers nationwide). Next, enter 3000 for TEST DIFFERENCE, 650 for X SAMPLE SIZE, 38560 for XBAR, 4350 for X SAMPLE DEV., 700 for Y SAMPLE SIZE, 34920 for YBAR, 3870 for Y SAMPLE DEV., and .05 for LEVEL OF SIG.

We obtain a right tail value of 0.0022. Since this value is less than both a = 0.05 and a / 2 = 0.025, we reject the first and third hypotheses.

Since we have not rejected the second hypothesis but have rejected the third, we can conclude that µx - µy > 3000. Thus, those with a degree should average more than \$3000 more per year in starting income than those without a degree. Note that Xbar - Ybar = 3640. If µx - µy <= 3000, then there would be at most a 0.22% chance of Xbar - Ybar being as large as \$3640 with samples of these sizes.

Exercises

1. Consider the following data on the percentages of body fat from two random groups of men. Assume that the first set is a sample from a control group of 150 men aged 20-29, while the second group is simply a random sample of all men aged 30-39.

Percentages of Body Fat from Men Aged 20-29
 12.6 6.9 24.6 10.9 27.8 20.6 19 12.8 5.1 12 7.5 8.5 16.1 19 15.3 14.2 4.6 4.7 9.4 6.5 13.4 9.9 10.8 14.4 19 28.6 6.1 24.5 9.9 19.1 10.6 16.5 20.5 17.2 30.1 10.5

Percentages of Body Fat from Men Aged 30-39
 20.5 28.1 17.6 8.4 12.8 21.4 16.8 24.6 16.5 20.8 22.4 8.5 6.4 22 16.8 25.8 15.2 4.1 21.7 16.5 22.4 10.1 14.8 13.4 28.8 20 1.9 20.5 15.7 12.3 21.4 26.5 22 10.4 34.7 20.2

We wish to study the difference in means µx - µy between the percentage of body fat of men in their 30's (population X) and that of the control group of men in their 20's (population Y). Test the following three hypotheses at the 0.10 level of significance.

 1. Ho: µx - µy <= 1 2. Ho: µx - µy >= 1 3. Ho: µx - µy = 1

2. Consider the following random samples of high school GPAs from sophomores in college:

Random Collection of Female High School GPAs
 3.25 3.25 3 3 4 3.6 3 3.25 3.4 3.6 3.75 3.7 3 3.25 3.5 3.8 3 2.8 4 3.25 2.75 3.1 3.75 3.5 3.4 3.75 3.25 3.3 2.7 3.1 3.85 3.05 4 2.65 3.8 3.5 3.45 3 3.6 3.4

Random Collection of Male High School GPAs
 3.75 3 2.3 2.9 3 4 2.1 3.5 2.1 2.5 4 3.75 3.75 3 3.4 4 2.4 2.5 2.9 2.7 3.75 4 2.5 2.5 3.75 2.2 3.7 4 2.8 2.3 3.65 2.4 3.3 3.5 2.6 3.4 2.7 3

We wish to test whether the average GPA for one sex is higher than the other, or whether they are equal. Letting µx be the true average GPA among females and letting µy be the true average GPA among males, test the following three hypotheses at the 0.03 level of significance under the following circumstances:

(a) the samples are to represent all students nationwide.
(b) the samples are to represent only the 1254 female sophomores and the 982 male sophomores at that university.

 1. Ho: µx <= µy 2. Ho: µx >= µy 3. Ho: µx = µy

Solutions

1. We note that the X population consists of all men aged 30-39, while the Y population consists of the finite control group of 150 men aged 20-29.

Now we enter the data into lists and compute the basic statistics. We see that for the X population, n = 36, Xbar = 17.83333, and Sx = 7.19337. For the Y population, m = 36, Ybar = 14.42222, and Sy = 6.9497.
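The same summary statistics can be obtained without a calculator. The sketch below uses Python's standard statistics module, where stdev computes the sample deviation with divisor n - 1; the list names men_30s and men_20s are chosen here for illustration.

```python
from statistics import mean, stdev

# Body fat percentages, copied from the exercise.
men_30s = [20.5, 28.1, 17.6, 8.4, 12.8, 21.4, 16.8, 24.6, 16.5, 20.8, 22.4, 8.5,
           6.4, 22, 16.8, 25.8, 15.2, 4.1, 21.7, 16.5, 22.4, 10.1, 14.8, 13.4,
           28.8, 20, 1.9, 20.5, 15.7, 12.3, 21.4, 26.5, 22, 10.4, 34.7, 20.2]
men_20s = [12.6, 6.9, 24.6, 10.9, 27.8, 20.6, 19, 12.8, 5.1, 12, 7.5, 8.5,
           16.1, 19, 15.3, 14.2, 4.6, 4.7, 9.4, 6.5, 13.4, 9.9, 10.8, 14.4,
           19, 28.6, 6.1, 24.5, 9.9, 19.1, 10.6, 16.5, 20.5, 17.2, 30.1, 10.5]

# Sample size, sample mean, and sample deviation for each group.
for label, data in (("X (aged 30-39)", men_30s), ("Y (aged 20-29)", men_20s)):
    print(label, len(data), round(mean(data), 5), round(stdev(data), 5))
```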

Next, we bring up the Z2MNTEST program and enter 2 to designate a large population versus a finite population, then we enter the second population size of 150. After entering the statistics when prompted, along with .10 for LEVEL OF SIG., we receive a right tail value of 0.0623 from a test statistic of 1.536, and we reject the first hypothesis.

Hence, we can conclude that µx - µy >= 1. (Since we have not rejected the third hypothesis, we say greater than or equal to 1.) Thus, men in their 30's average at least 1 percentage point more body fat than the control group of 150 men in their 20's.

We note that Xbar - Ybar = 3.41111. If µx - µy <= 1, then there would be at most a 6.23% chance of Xbar - Ybar being as large as 3.411, which is why we have rejected µx - µy <= 1 at the a = 0.10 level of significance. However, the p-value for the two-sided test required of the third hypothesis is 0.1246, which is larger than a; thus, we have not rejected the third hypothesis.
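As a cross-check of these figures, the calculation can be repeated in a few lines of Python from the summary statistics above. This is a sketch under the stated setup (only the Y deviation receives the finite population correction, with N = 150); the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, xbar, sx = 36, 17.83333, 7.19337   # men aged 30-39 (large population)
m, ybar, sy = 36, 14.42222, 6.9497    # control group, finite with N = 150
M, alpha, N_y = 1, 0.10, 150

# Correct only the deviation of the finite population.
sy_adj = sy * math.sqrt((N_y - m) / (N_y - 1))

x = (xbar - ybar - M) / math.sqrt(sx**2 / n + sy_adj**2 / m)
right = 1.0 - NormalDist().cdf(x)           # P(Z >= x)
p_two_sided = 2 * min(right, 1.0 - right)   # twice the smaller tail value

print(round(x, 3), round(right, 4), round(p_two_sided, 4))
print(right < alpha, p_two_sided < alpha)   # reject first? reject third?
```

The unrounded two-sided p-value may differ from 0.1246 in the last digit, because 0.1246 is twice the rounded tail value 0.0623.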

2. These hypothesis tests are equivalent to tests about the difference µx - µy with a test difference of M = 0.

First, enter the data into lists and compute the sample statistics. For the females we obtain n = 40, Xbar = 3.3575, and Sx = 0.3715093953. For the males we obtain m = 38, Ybar = 3.094736842, and Sy = 0.629327126.

Now execute the Z2MNTEST program with 0 for TEST DIFFERENCE. For part (a) with two large populations, we receive a right tail value of 0.0128 from a test statistic of 2.2309, and we reject the first and third hypotheses.

Hence, we conclude that µx - µy > 0; that is, µx > µy. Thus, the average GPA of females is greater than the average GPA of males.

We note that Xbar - Ybar = 0.26276. If µx - µy <= 0, then there would be at most a 1.28% chance of Xbar - Ybar being as large as it is with samples of these sizes.

For part (b), we reexecute the program by first entering 4 for two finite populations, then entering the population sizes of 1254 and 982. We now obtain a right tail value of 0.0115 from a test statistic of 2.272267381, and we come to the same conclusion.
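Both parts can be compared side by side with a short Python sketch. The function right_tail_test and its parameters are hypothetical names introduced here; passing pop_x and pop_y applies the finite population correction, and omitting them treats the populations as large.

```python
import math
from statistics import NormalDist

def right_tail_test(n, xbar, sx, m, ybar, sy, M, pop_x=None, pop_y=None):
    """Return (test statistic, right tail value), correcting each sample
    deviation only when its finite population size is given."""
    if pop_x is not None:
        sx = sx * math.sqrt((pop_x - n) / (pop_x - 1))
    if pop_y is not None:
        sy = sy * math.sqrt((pop_y - m) / (pop_y - 1))
    x = (xbar - ybar - M) / math.sqrt(sx**2 / n + sy**2 / m)
    return x, 1.0 - NormalDist().cdf(x)

# Summary statistics for the female (X) and male (Y) GPA samples.
stats = dict(n=40, xbar=3.3575, sx=0.3715093953,
             m=38, ybar=3.094736842, sy=0.629327126, M=0)

xa, ra = right_tail_test(**stats)                          # part (a)
xb, rb = right_tail_test(**stats, pop_x=1254, pop_y=982)   # part (b)

alpha = 0.03
for part, x, r in (("a", xa, ra), ("b", xb, rb)):
    # First hypothesis uses the right tail; third uses twice that value.
    print(part, round(x, 4), round(r, 4), r < alpha, 2 * r < alpha)
```

Because the samples are small relative to both populations, the correction changes the test statistic only slightly and the conclusions are the same in both parts.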