Between Two Independent Arbitrary Populations

Consider two independent populations having unknown means µx and µy respectively. We wish to construct a confidence interval for the difference in means µx - µy. We first estimate this difference with the difference in sample means Xbar - Ybar from independent random samples, of sizes n and m respectively, conducted on each population. Then the confidence interval is of the form (Xbar - Ybar) +/- e, where e is an appropriate margin of error.

For this scenario, each population could be "large" or "small" relative to the respective sample size. If Population 1 is of a small finite size N, then the sample deviation Sx is adjusted by multiplying it by the finite population correction factor Sqrt[ (N - n) / (N - 1) ] . The sample deviation Sy is adjusted similarly, using the second population's size and m, if the second population is of a small finite size.
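As a rough illustration of the correction just described, the following Python sketch scales a sample deviation by the finite population correction factor. The function name `fpc_adjusted_dev` is hypothetical and is not part of the calculator program; the sample call uses the county survey figures from Exercise 3 (Sx = 2.4, n = 810, N = 4160).

```python
import math


def fpc_adjusted_dev(s, n, pop_size):
    """Scale the sample deviation s by Sqrt[(N - n) / (N - 1)].

    s        -- sample deviation
    n        -- sample size
    pop_size -- finite population size N (hypothetical parameter name)
    """
    if pop_size < 2 or not 1 <= n <= pop_size:
        raise ValueError("need 1 <= n <= N and N >= 2")
    return s * math.sqrt((pop_size - n) / (pop_size - 1))


# County survey from Exercise 3: the adjusted deviation is a little smaller than 2.4.
print(fpc_adjusted_dev(2.4, 810, 4160))
```

When the sample is a tiny fraction of the population, the factor is close to 1 and the adjustment is negligible, which is why it is skipped for "large" populations.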

Once again, if the desired level of confidence is r, then we define the z-score z to be the value such that P(-z <= Z <= z) = r, where Z ~ N(0,1). If we let Sx and Sy denote the sample deviations of the respective random samples, then the margin of error is given by e = z Sqrt[ (Sx)^2 / n + (Sy)^2 / m ] .
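The z-score and margin of error defined above can be sketched in Python as follows. This is only an illustration using the standard library's `statistics.NormalDist` (Python 3.8 or later), not the calculator program itself; the function names `z_score` and `margin_of_error` are assumptions made for this sketch.

```python
from math import sqrt
from statistics import NormalDist


def z_score(r):
    """Return z such that P(-z <= Z <= z) = r for Z ~ N(0,1)."""
    # The central area r leaves (1 - r) / 2 in each tail,
    # so z is the (1 + r) / 2 quantile of the standard normal.
    return NormalDist().inv_cdf((1 + r) / 2)


def margin_of_error(sx, n, sy, m, r):
    """Return e = z Sqrt[ Sx^2 / n + Sy^2 / m ] at confidence level r."""
    return z_score(r) * sqrt(sx**2 / n + sy**2 / m)


print(z_score(0.95))  # close to the familiar 1.96
```

If a population is finite, the corrected deviation would be passed in place of Sx or Sy.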

The confidence interval is given by (Xbar - Ybar) +/- z Sqrt[ (Sx)^2 / n + (Sy)^2 / m ] .

Usually, samples of sizes n >= 30 and m >= 30 are considered sufficient to use the analysis above. In practice, however, much larger sample sizes are needed to obtain a reasonably small margin of error.

To execute the program, we first enter **1**, **2**, **3**, or **4** to designate the types of populations we have under study: (1) two large populations; (2) the first large and the second finite; (3) the first finite and the second large; or (4) two finite populations. If a population is finite, then we enter its population size. Next, enter the size of the first sample, followed by Xbar and Sx, then enter the size of the second sample, Ybar, and Sy. Lastly, we enter the desired level of confidence (in decimal). The program displays the difference Xbar - Ybar, the margin of error, and the confidence interval.
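For readers without the calculator at hand, the input sequence above can be mirrored in a short Python sketch. This is not the source of the **ZDIFMNCI** program; the function `zdifmnci` and its parameter names (`pop_size_x`, `pop_size_y`) are hypothetical stand-ins that follow the same order of inputs and return the same three results.

```python
from math import sqrt
from statistics import NormalDist


def zdifmnci(pop_type, n, xbar, sx, m, ybar, sy, r,
             pop_size_x=None, pop_size_y=None):
    """Return (difference, margin of error, (low, high)).

    pop_type: 1 = both large, 2 = first large / second finite,
              3 = first finite / second large, 4 = both finite.
    """
    if pop_type not in (1, 2, 3, 4):
        raise ValueError("pop_type must be 1, 2, 3, or 4")
    if pop_type in (3, 4):  # first population is finite
        sx = sx * sqrt((pop_size_x - n) / (pop_size_x - 1))
    if pop_type in (2, 4):  # second population is finite
        sy = sy * sqrt((pop_size_y - m) / (pop_size_y - 1))
    z = NormalDist().inv_cdf((1 + r) / 2)
    diff = xbar - ybar
    e = z * sqrt(sx**2 / n + sy**2 / m)
    return diff, e, (diff - e, diff + e)


# Two large populations, with the figures from Example 1.
print(zdifmnci(1, 650, 38560, 4350, 700, 34920, 3870, 0.95))
```

A population size is needed only for the types that declare that population finite, matching the prompts described above.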

**Example 1.** A national random survey of 650 workers with a college degree yields an average income of Xbar = $38,560 with a standard deviation of Sx = $4350. An independent random survey of 700 workers without a college degree yields an average income of Ybar = $34,920 with a standard deviation of Sy = $3870. Find a 95% confidence interval for the true difference in average incomes.

*Solution.* After calling up the **ZDIFMNCI** program, enter **1** to designate two "large" populations. Next, enter **650** for **X SAMPLE SIZE**, **38560** for **XBAR**, **4350** for **X SAMPLE DEV.**, **700** for **Y SAMPLE SIZE**, **34920** for **YBAR**, **3870** for **Y SAMPLE DEV.**, and **.95** for **CONF. LEVEL**. We see that (µx - µy) ~ 3640 +/- 440.4781. In other words, workers with a degree average somewhere from $3199.52 to $4080.48 more in income.
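The program's output for Example 1 can be checked by hand. The following worked computation is a sketch with rounded intermediate values, taking z ≈ 1.95996 for 95% confidence:

```latex
\begin{aligned}
e &= z\sqrt{\frac{S_x^2}{n} + \frac{S_y^2}{m}}
   \approx 1.95996\,\sqrt{\frac{4350^2}{650} + \frac{3870^2}{700}} \\
  &\approx 1.95996\,\sqrt{29111.54 + 21395.57}
   \approx 1.95996 \times 224.738 \approx 440.48, \\
(\bar{X} - \bar{Y}) \pm e &\approx 3640 \pm 440.48
   = [\,3199.52,\ 4080.48\,].
\end{aligned}
```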

If we have two data sets with an equal number of measurements, then we can enter the data into the **STAT Edit** screen (**LIST EDIT** on the TI-86, **APPS 6** on the TI-89) in order to compute the statistics Xbar, Sx, Ybar, and Sy with the **2-Var Stats** command (**TwoVar** on the TI-86 and TI-89). We then can access the statistics to enter these values into the program as follows:

On the TI-83: For **X SAMPLE SIZE**, press **VARS**, press **5**, press **1**, press **ENTER**. For **XBAR**, press **VARS**, press **5**, press **2**, press **ENTER**. For **X SAMPLE DEV.**, press **VARS**, press **5**, press **3**, press **ENTER**. For **Y SAMPLE SIZE**, press **VARS**, press **5**, press **1**, press **ENTER**. For **YBAR**, press **VARS**, press **5**, press **5**, press **ENTER**. For **Y SAMPLE DEV.**, press **VARS**, press **5**, press **6**, press **ENTER**.

On the TI-86: For **X SAMPLE SIZE**, type **2nd ALPHA 9** to obtain **n**, press **ENTER**. For **XBAR**, press **STAT** (i.e., **2nd +**), press **F5**, then press **F1**, press **ENTER**. For **X SAMPLE DEV.**, press **STAT**, press **F5**, press **F3**, press **ENTER**. For **Y SAMPLE SIZE**, type **2nd ALPHA 9** for **n**, press **ENTER**. For **YBAR**, press **STAT**, press **F5**, then press **F4**, press **ENTER**. For **Y SAMPLE DEV.**, press **STAT**, press **F5**, press **MORE**, press **F1**, press **ENTER**.

On the TI-89: For **X SAMPLE SIZE**, type and enter **nStat**. For **XBAR**, press **CHAR** (i.e., **2nd +**), press **2**, then scroll down to **Xbar** (item **A**), press **ENTER**. For **X SAMPLE DEV.**, type and enter **Sx**. For **Y SAMPLE SIZE**, type and enter **nStat**. For

1. We wish to see if there is any apparent difference in high school grade point average between girls and boys who choose to go to college. The data below is a random collection of high school GPAs from a group of sophomores at a random university. Find a 90% confidence interval for the difference between average female and average male grade point average in the following cases:

(a) the samples are to represent all students nationwide.

(b) the samples are to represent only the 1254 female sophomores and the 982 male sophomores at that university.

2. (Data sets of different sizes). Suppose we obtain the following additional random GPAs to add to the above data:

Add this new data and find a new 90% confidence interval for the difference between average female and average male grade point average in the same two cases as in Exercise 1.

3. A survey of 810 married men in a county that has 4160 married men found that the mean age of first marriage was 25.2 years with a sample deviation of 2.4 years. A national survey of 850 women found that the mean age of first marriage was 23.3 years with a sample deviation of 2.1 years. Find a 95% confidence interval for the difference in average age at first marriage between men in this county and women nationwide.

1. First, enter the data into the **STAT Edit** screen (**LIST EDIT** on the TI-86, **APPS 6** on the TI-89), then use the **2-Var Stats** command (**TwoVar** on the TI-86 and TI-89) to compute the statistics. We see for these random samples of size 30 that Xbar ~ 3.333 and Sx ~ 0.35727 for the girls and Ybar ~ 3.1017 and Sy ~ 0.673448 for the boys.

Next, call up the **ZDIFMNCI** program and either enter these statistics directly or access the non-rounded values as explained above under Using Data Sets of a Common Size.

(a) For two large populations, we obtain a 90% confidence interval for µx - µy of [0.0027, 0.4606]. That is, based on this data we may say that, nationally, the average high school GPA of females is greater than that of males by as little as 0.0027 or by as much as 0.4606.

(b) For the two finite populations of sizes 1254 and 982 respectively, we obtain a 90% confidence interval of [0.006, 0.4574]. So just for sophomores at this university, the average high school GPA of females is greater than that of males by as little as 0.006 or by as much as 0.4574.

2. We first add the 10 additional girl GPAs and the 8 additional boy GPAs in the appropriate columns in the list editor. Since the data sets no longer have the same size, we cannot use the **2-Var Stats** command to compute the sample means and sample deviations. So compute them separately with the **1-Var Stats** command (**OneVar** on the TI-86 and TI-89).

For the girls, the sample size is 40, the sample mean is 3.3575, and the sample deviation is 0.37151. For the boys, the sample size is 38, the sample mean is 3.094736842, and the sample deviation is 0.629327126.

(a) After entering these values in the **ZDIFMNCI** program for two large populations, we obtain a 90% confidence interval of [0.069, 0.4565].

(b) Upon entering these values in the **ZDIFMNCI** program for populations of sizes 1254 and 982 respectively, we obtain a 90% confidence interval of [0.0726, 0.453].
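Both parts of this solution can be cross-checked from the summary statistics alone. The Python sketch below is an independent check rather than the calculator program; the helper `interval` and its `pop_x` / `pop_y` parameters are hypothetical names introduced here, and the confidence level is fixed at 90%.

```python
from math import sqrt
from statistics import NormalDist

# z-score for a two-sided 90% confidence level
Z90 = NormalDist().inv_cdf(0.95)


def interval(n, xbar, sx, m, ybar, sy, pop_x=None, pop_y=None):
    """90% interval for mu_x - mu_y; pop_x / pop_y are finite population sizes, if any."""
    if pop_x is not None:
        sx *= sqrt((pop_x - n) / (pop_x - 1))  # finite population correction
    if pop_y is not None:
        sy *= sqrt((pop_y - m) / (pop_y - 1))
    diff = xbar - ybar
    e = Z90 * sqrt(sx**2 / n + sy**2 / m)
    return diff - e, diff + e


# (sample size, sample mean, sample deviation) from the 1-Var Stats results
girls = (40, 3.3575, 0.37151)
boys = (38, 3.094736842, 0.629327126)

print(interval(*girls, *boys))                          # case (a)
print(interval(*girls, *boys, pop_x=1254, pop_y=982))   # case (b)
```

The printed endpoints should agree with the intervals reported in (a) and (b) after rounding to four decimal places.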

3. Bring up the **ZDIFMNCI** program and enter **3** for a finite population against a large population, then enter **4160** for the first population size. Next, enter the summary statistics for each sample. We obtain a 95% confidence interval of [1.6952, 2.1048]. Thus, men in this select county average from 1.6952 years older to 2.1048 years older at the age of first marriage compared to women nationwide.
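As a check on the program's output for Exercise 3, the computation can be carried out by hand. In this sketch with rounded intermediate values, the correction is applied only to the county sample, and z ≈ 1.95996 for 95% confidence:

```latex
\begin{aligned}
S_x' &= S_x\sqrt{\frac{N - n}{N - 1}}
      = 2.4\,\sqrt{\frac{4160 - 810}{4159}} \approx 2.4 \times 0.8975 \approx 2.1540, \\
e &= z\sqrt{\frac{(S_x')^2}{n} + \frac{S_y^2}{m}}
   \approx 1.95996\,\sqrt{\frac{2.1540^2}{810} + \frac{2.1^2}{850}}
   \approx 1.95996 \times 0.10448 \approx 0.2048, \\
(\bar{X} - \bar{Y}) \pm e &\approx 1.9 \pm 0.2048 = [\,1.6952,\ 2.1048\,].
\end{aligned}
```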
