Suppose we are studying two populations and we wish to measure the proportions p1 and p2 of those having a certain designation. Then the difference between the proportions p1 - p2 is a special case of the difference between means. To estimate this difference, we draw independent random samples from each population to first estimate p1 and p2 individually. Then, p1 ~ m1 / n1, where m1 is the number of affirmative responses and n1 is the number surveyed in the first population. Similarly, p2 ~ m2 / n2, where m2 is the number of affirmative responses and n2 is the number surveyed in the second population. Then, (p1 - p2) ~ (m1 / n1 - m2 / n2) +/- e, where e is an appropriate margin of error.

For this scenario, each population could be "large" or "small" relative to the respective sample size. If Population 1 is "large", then the sample deviation is given by S1 = Sqrt[ (m1 / n1) (1 - m1 / n1) ]. If Population 1 has a smaller finite size N (usually meaning that the sample size is more than 5% of N), then the sample deviation is S1 = Sqrt[ (m1 / n1) (1 - m1 / n1) ] Sqrt[ (N - n1) / (N - 1) ]. The sample deviation S2 for the second population is defined similarly.

If our desired level of confidence is r and z is the z-score such that P(-z <= Z <= z) = r, where Z ~ N(0,1), then the margin of error is given by e = z*Sqrt[ (S1)^2 / n1 + (S2)^2 / n2 ]. Thus, (p1 - p2) ~ (m1 / n1 - m2 / n2) +/- z*Sqrt[ (S1)^2 / n1 + (S2)^2 / n2 ].

To use the analysis above, we generally require samples of sizes n1 >= 30 and n2 >= 30. However, to obtain reasonably small margins of error, we would usually need much larger sample sizes.
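The formulas above can be sketched in Python as a rough cross-check of the calculator's output. This is not the source of the **DIFPCI** program; the function names `sample_deviation` and `diff_prop_ci` and the optional arguments `N1` and `N2` are hypothetical, chosen only for illustration. Passing `None` for a population size is assumed here to mean the population is "large".

```python
from math import sqrt
from statistics import NormalDist


def sample_deviation(m, n, N=None):
    """Sample deviation S for one population; N=None means 'large' (assumption)."""
    p_hat = m / n
    s = sqrt(p_hat * (1 - p_hat))
    if N is not None:
        # Finite population correction factor Sqrt[(N - n) / (N - 1)] from the text.
        s *= sqrt((N - n) / (N - 1))
    return s


def diff_prop_ci(m1, n1, m2, n2, conf, N1=None, N2=None):
    """Return (difference, margin of error, lower bound, upper bound)."""
    # z satisfies P(-z <= Z <= z) = conf for Z ~ N(0, 1).
    z = NormalDist().inv_cdf((1 + conf) / 2)
    s1 = sample_deviation(m1, n1, N1)
    s2 = sample_deviation(m2, n2, N2)
    e = z * sqrt(s1**2 / n1 + s2**2 / n2)
    diff = m1 / n1 - m2 / n2
    return diff, e, diff - e, diff + e
```

The counts `m1` and `m2` are allowed to be non-integers in this sketch, so a product such as a reported percentage times the sample size can be passed directly.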

To execute the program, we first enter **1**, **2**, **3**, or **4** to designate the types of populations we have under study: (1) two large populations; (2) the first large and the second finite; (3) the first finite and the second large; or (4) two finite populations. If a population is finite, then we enter its population size. Next, enter the number of affirmative responses and the sample sizes for each population and the desired level of confidence. The program displays the difference in sample proportions m1 / n1 - m2 / n2, the margin of error, and the confidence interval.

**Example.** The results of a poll commissioned by the Center on Addiction and Substance Abuse at Columbia University found that 1340 out of 2000 adults and 304 out of 400 youths interviewed believed that popular culture encourages drug use. Find a 95% confidence interval for the true difference in proportions between adults and youths with this belief at that time.

*Solution.* We shall assume that these were large nationwide populations of adults and youths under study. Thus after calling up the **DIFPCI** program, first enter **1** to specify that we have two large populations. Next, enter **1340** for **1ST NO. OF YES**, enter **2000** for **1ST SAMPLE SIZE**, enter **304** for **2ND NO. OF YES**, enter **400** for **2ND SAMPLE SIZE**, and enter **.95** for **CONF. LEVEL**.

We find that p1 - p2 ~ -0.09 +/- 0.0467, or that -0.1367 <= p1 - p2 <= -0.0433. Equivalently, 0.0433 <= p2 - p1 <= 0.1367. So apparently, a greater percentage of youths had this belief at the time of the study. The percentage of youths having this belief was plausibly between 4.33 and 13.67 percentage points higher than the percentage of adults having this belief.
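For readers without the calculator at hand, the arithmetic behind this result can be reproduced step by step. The following is a minimal sketch for the two-large-populations case only; the variable names are illustrative and not taken from the program.

```python
from math import sqrt
from statistics import NormalDist

m1, n1 = 1340, 2000   # adults: affirmative responses, sample size
m2, n2 = 304, 400     # youths: affirmative responses, sample size
conf = 0.95

p1_hat = m1 / n1      # 0.67
p2_hat = m2 / n2      # 0.76

# z-score with P(-z <= Z <= z) = conf for a standard normal Z.
z = NormalDist().inv_cdf((1 + conf) / 2)

# Both populations are treated as large, so no finite population correction.
se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
margin = z * se
diff = p1_hat - p2_hat

print(f"{diff:.4f} +/- {margin:.4f}")
# -0.0900 +/- 0.0467
print(f"{diff - margin:.4f} <= p1 - p2 <= {diff + margin:.4f}")
# -0.1367 <= p1 - p2 <= -0.0433
```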

1. On July 13, 1995, USA TODAY reported the results of a USA TODAY/CNN Gallup Poll. Out of 801 adults surveyed nationally, 68% felt that the Republicans' work in Congress was "politics as usual." However, out of 326 adults surveyed in California, only 62% felt this way. Find a 90% confidence interval for the difference between proportions nationally and in California.

2. On June 25, 1995, The Associated Press reported the results of a national survey conducted by the Center for Social and Religious Research at the Hartford Seminary. The study was on the divorce rate of a group of 5000 Protestant clergywomen and 5000 Protestant clergymen. It was found that 25% out of 2458 clergywomen responding had been divorced at least once and 20% out of 2086 clergymen responding had been divorced at least once. Find a 99% confidence interval for the true difference in divorce rates among the two targeted groups of 5000 clergywomen and 5000 clergymen.

3. Suppose we know that 1032 out of 4544 respondents from a targeted population of 10,000 clergy had been divorced. Then an independent national survey (which may include some of these clergy) found that 2080 out of 8000 adults had been divorced at least once. Find a 95% confidence interval for the true difference in divorce rates among these 10,000 Protestant clergy and the general adult population.

1. In the **DIFPCI** program, first enter **1** to designate two large populations. Since the exact numbers of affirmative responses are not given, we can enter **.68*801** for **1ST NO. OF YES** and **801** for **1ST SAMPLE SIZE**, then enter **.62*326** for **2ND NO. OF YES** and **326** for **2ND SAMPLE SIZE**. Finally, enter **.9** for **CONF. LEVEL**.

We find that p1 - p2 ~ 0.06 +/- 0.0519, or that 0.0081 <= p1 - p2 <= 0.1119. That is, this feeling was apparently stronger nationally, by between 0.81 and 11.19 percentage points.

2. If we limit the two populations to the two groups of 5000, rather than all possible Protestant clergy, then we have two small finite populations of known size. We then are studying the proportions of these two groups rather than all possible clergy. Thus in the **DIFPCI** program, first enter **4** to designate two finite populations, then enter **5000** for both of their population sizes.

Again, we do not have the exact number of "Yes" responses; thus, in the program, enter **.25*2458** for **1ST NO. OF YES** and **2458** for **1ST SAMPLE SIZE**, then enter **.2*2086** for **2ND NO. OF YES** and **2086** for **2ND SAMPLE SIZE**. Finally enter **.99** for **CONF. LEVEL**.

We find that 0.0265 <= p1 - p2 <= 0.0735. That is, among these two groups of 5000 clergywomen and 5000 clergymen, the proportion of clergywomen who have been divorced is from 2.65 percentage points higher to 7.35 percentage points higher than the proportion of clergymen who have been divorced.
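To see how much the finite population correction matters here, the margin of error can be computed with and without it. The sketch below assumes the formulas given earlier in this section; the helper name `margin_of_error` and its arguments are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p1, n1, p2, n2, conf, N1=None, N2=None):
    """Margin of error e; a population size of None means no correction (assumption)."""
    z = NormalDist().inv_cdf((1 + conf) / 2)
    var1 = p1 * (1 - p1)
    var2 = p2 * (1 - p2)
    if N1 is not None:
        var1 *= (N1 - n1) / (N1 - 1)   # squared correction factor for population 1
    if N2 is not None:
        var2 *= (N2 - n2) / (N2 - 1)   # squared correction factor for population 2
    return z * sqrt(var1 / n1 + var2 / n2)


# Clergywomen: 25% of 2458; clergymen: 20% of 2086; each group has 5000 members.
corrected = margin_of_error(0.25, 2458, 0.20, 2086, 0.99, N1=5000, N2=5000)
uncorrected = margin_of_error(0.25, 2458, 0.20, 2086, 0.99)

print(f"with correction:    {corrected:.4f}")
print(f"without correction: {uncorrected:.4f}")
```

Because each sample covers a large share of its group of 5000, the corrected margin (about 0.0235, matching the interval above) is noticeably smaller than the uncorrected one (roughly 0.032).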

3. Our first population is still small and finite; but now our second population is "large" and can be considered infinite. Thus in the **DIFPCI** program, first enter **3** to designate that the first population is finite and the second is large, then enter **10000** for the first population size.

Next, enter **1032** for **1ST NO. OF YES** and **4544** for **1ST SAMPLE SIZE**, then enter **2080** for **2ND NO. OF YES** and **8000** for **2ND SAMPLE SIZE**. Finally enter **.95** for **CONF. LEVEL**.

We find that -0.0461 <= p1 - p2 <= -0.0197, or equivalently 0.0197 <= p2 - p1 <= 0.0461. Thus, the general public has a higher divorce rate by 1.97 to 4.61 percentage points.

**Note**: The built-in **2-PropZInt** command on the TI-83 cannot be used without the exact numbers of "Yes" responses. These values must be integers. Also, this built-in function cannot take into account the possible finite population size correction factor for the sample deviations.
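The integer requirement matters for the first exercise, where the counts were entered as products of a percentage and a sample size. The short sketch below shows that those products are not whole numbers. Rounding them to the nearest integer is an assumption made here for illustration, not a step recommended in this section, and it slightly changes the sample proportions.

```python
# Counts implied by the reported percentages in the first exercise.
raw1 = 0.68 * 801    # about 544.68, not an integer
raw2 = 0.62 * 326    # about 202.12, not an integer

# Hypothetical workaround: round to the nearest whole response.
x1 = round(raw1)
x2 = round(raw2)

# Proportions implied by the rounded counts differ slightly from 0.68 and 0.62.
p1_rounded = x1 / 801
p2_rounded = x2 / 326
shift = (p1_rounded - p2_rounded) - (0.68 - 0.62)

print(x1, x2)
print(f"{p1_rounded:.4f} {p2_rounded:.4f}")
print(f"shift in difference: {shift:.4f}")
```

In this case the rounding moves the difference in sample proportions by less than 0.001, which is small compared with the margin of error of 0.0519.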
