Hypothesis Tests about the Difference of Proportions

 DIFPTEST.83p DIFPTEST.86p difptest.89p

Suppose we are considering two independent populations W1 and W2, and we wish to study the difference p1 - p2 between the proportions of those in each population who have a certain designation. We wish to conduct the following three hypothesis tests:

 1. Ho: p1 - p2 <= P
 2. Ho: p1 - p2 >= P
 3. Ho: p1 - p2 = P

We base our decision for each test on the difference (m1 / n1) - (m2 / n2) between sample proportions from independent random samples (of sizes n1 >= 30 and n2 >= 30 respectively) from W1 and W2.

For this scenario, each population could be "large" or "small" relative to the respective sample size. If W1 is "large," then the sample deviation is given by

S1 = Sqrt[ (m1 / n1)*(1 - m1 / n1) ].

If W1 is of a smaller finite size N (so that the sample size is more than 5% of the population), then the sample deviation is

S1 = Sqrt[ (m1 / n1)* (1 - m1 / n1) ] * Sqrt[ (N - n1) / (N - 1) ].

The sample deviation S2 for the second population is defined similarly.
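The two cases for the sample deviation can be sketched in Python as follows. This is only an illustration: the function name sample_deviation and the numbers 40, 100, and 500 are assumptions made for this sketch and do not come from the DIFPTEST program.

```python
import math

def sample_deviation(m, n, N=None):
    """Sample deviation for m affirmative responses out of n sampled.

    N is the population size when the population is finite;
    leave it as None for a "large" population.
    """
    p_hat = m / n
    s = math.sqrt(p_hat * (1 - p_hat))
    if N is not None:
        # Finite population correction factor
        s *= math.sqrt((N - n) / (N - 1))
    return s

# Hypothetical sample: 40 affirmative responses out of 100
print(sample_deviation(40, 100))          # "large" population
print(sample_deviation(40, 100, N=500))   # finite population of size 500
```

The correction factor is less than 1 whenever n > 1, so the finite-population deviation is always the smaller of the two.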

We define the test statistic by

x = (m1 / n1 - m2 / n2 - P) / Sqrt[ (S1)^2 / n1 + (S2)^2 / n2 ],

which follows an approximate standard normal distribution Z for large sample sizes. We then compute the left and right tail probability values created by the test statistic, P(Z <= x) and P(Z >= x), to compare with the level of significance a.
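As a rough sketch, the test statistic and the two tail values could be computed in Python as shown here. The names normal_cdf and statistic_and_tails, and the sample counts in the usage lines, are hypothetical. The standard normal probabilities are obtained from math.erfc, and the "large" population deviations are used unless S1 and S2 are supplied.

```python
import math

def normal_cdf(z):
    """P(Z <= z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

def statistic_and_tails(m1, n1, m2, n2, P, s1=None, s2=None):
    """Return (x, left tail, right tail) for the test difference P."""
    p1_hat, p2_hat = m1 / n1, m2 / n2
    # Default to the "large" population sample deviations
    if s1 is None:
        s1 = math.sqrt(p1_hat * (1 - p1_hat))
    if s2 is None:
        s2 = math.sqrt(p2_hat * (1 - p2_hat))
    x = (p1_hat - p2_hat - P) / math.sqrt(s1**2 / n1 + s2**2 / n2)
    left = normal_cdf(x)     # P(Z <= x)
    right = normal_cdf(-x)   # P(Z >= x), by symmetry of Z
    return x, left, right

# Hypothetical samples: 60 of 100 versus 45 of 100, test difference 0
x, left, right = statistic_and_tails(60, 100, 45, 100, 0.0)
print(f"x = {x:.4f}, left tail = {left:.4f}, right tail = {right:.4f}")
```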

We reject the first hypothesis p1 - p2 <= P when the difference in sample proportions (m1 / n1) - (m2 / n2) is too large, which means the right tail value will be too small: P(Z >= x) < a. This test is equivalent to the test Ho: p1 - p2 = P with a one-sided alternative Ha: p1 - p2 > P.

Likewise, we reject the second hypothesis p1 - p2 >= P if (m1 / n1) - (m2 / n2) is too small, which means the left tail value will be too small: P(Z <= x) < a. This test is equivalent to the test Ho: p1 - p2 = P with a one-sided alternative Ha: p1 - p2 < P.

If P(Z >= x) < a / 2 or P(Z <= x) < a / 2, then we reject the third hypothesis p1 - p2 = P. For this two-sided test, the p-value is always twice the smaller of the two tail values.
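The three rejection rules can be collected into one small Python sketch. The function names and the tail values 0.97 and 0.03 are hypothetical; they are chosen so that a one-sided test rejects while the two-sided test does not.

```python
def decisions(left_tail, right_tail, a):
    """Report which of the three null hypotheses are rejected at level a."""
    return {
        "Ho: p1 - p2 <= P": right_tail < a,
        "Ho: p1 - p2 >= P": left_tail < a,
        "Ho: p1 - p2 = P": min(left_tail, right_tail) < a / 2,
    }

def two_sided_p_value(left_tail, right_tail):
    """The two-sided p-value is twice the smaller tail value."""
    return 2 * min(left_tail, right_tail)

# Hypothetical tail values at the 0.05 level of significance
result = decisions(left_tail=0.97, right_tail=0.03, a=0.05)
for hypothesis, rejected in result.items():
    print(hypothesis, "-> reject" if rejected else "-> do not reject")
print("two-sided p-value:", two_sided_p_value(0.97, 0.03))
```

Here the right tail value 0.03 is below a = 0.05 but not below a / 2 = 0.025, so only the first hypothesis is rejected.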

Using the DIFPTEST Program

The DIFPTEST program can be used to perform these hypothesis tests. To execute the program, first enter 1, 2, 3, or 4 to designate the types of populations under study: (1) two large populations; (2) the first large and the second finite; (3) the first finite and the second large; or (4) two finite populations. If a population is finite, then enter its population size. Next, enter the value of the proportion difference P to be tested, the numbers of affirmative responses, the two sample sizes, and the level of significance. The program displays the conclusions for the three tests along with the test statistic and tail probability values.

Example. A poll commissioned by the Center on Addiction and Substance Abuse at Columbia University found that 304 out of 400 youths and 1340 of 2000 adults believed that popular culture encourages drug use. If p1 and p2 denote the true proportions among youths and adults respectively, test the following three hypotheses at the 0.05 level of significance.

 1. Ho: p1 - p2 <= 0.05
 2. Ho: p1 - p2 >= 0.05
 3. Ho: p1 - p2 = 0.05

Solution. We assume that the populations under study were nationwide; so after calling up the DIFPTEST program, enter 1 to designate two "large" populations. Next, enter .05 for TEST DIFFERENCE, 304 for 1ST NO. OF YES, 400 for 1ST SAMPLE SIZE, 1340 for 2ND NO. OF YES, 2000 for 2ND SAMPLE SIZE, and .05 for LEVEL OF SIG.

We receive a right tail value of 0.0464 from a test statistic of 1.6805091. Since 0.0464 < 0.05, we reject the first hypothesis. Hence, based on this data, we can conclude that p1 - p2 > 0.05.

We note that the difference in sample proportions is 304 / 400 - 1340 / 2000 = 0.09. If p1 - p2 <= 0.05, then there would be at most a 4.64% chance of this difference in sample proportions being as large as 0.09 with samples of these sizes, which is why we have rejected the first hypothesis.

For the two-sided test Ho: p1 - p2 = 0.05, the p-value is 2*.0464 = 0.0928, which is not below the level of significance; thus we do not have significant evidence, at the 0.05 level, to reject this hypothesis. Indeed, if p1 - p2 = 0.05 were true, there would still be a 9.28% chance of the difference in sample proportions being as far from 0.05 (in either direction) as 0.09 is.
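The figures in this example can be checked outside the calculator with a short Python sketch. It is not the DIFPTEST program itself; it simply applies the "large" population formulas to the poll data, with variable names chosen for this illustration.

```python
import math

# Data from the example: youths (sample 1) and adults (sample 2)
m1, n1 = 304, 400
m2, n2 = 1340, 2000
P, a = 0.05, 0.05   # test difference and level of significance

p1_hat, p2_hat = m1 / n1, m2 / n2
# Both populations are treated as "large", so no finite correction is applied
se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
x = (p1_hat - p2_hat - P) / se

right_tail = 0.5 * math.erfc(x / math.sqrt(2))   # P(Z >= x)
left_tail = 1 - right_tail                       # P(Z <= x)
p_value = 2 * min(left_tail, right_tail)         # two-sided p-value

print("test statistic:", round(x, 7))
print("right tail:", round(right_tail, 4))
print("two-sided p-value:", round(p_value, 4))
print("reject Ho: p1 - p2 <= 0.05:", right_tail < a)
print("reject Ho: p1 - p2 = 0.05:", p_value < a)
```

Because the text doubles the rounded tail value 0.0464, the unrounded two-sided p-value printed here may differ from 0.0928 in the last digit.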

Exercises

1. On July 13, 1995, USA TODAY reported the results of a USA TODAY/CNN Gallup Poll. Out of 326 adults surveyed in California, 47% stated that they favored President Clinton over Bob Dole for the 1996 presidential election. Nationally, 48% out of 801 adults favored Clinton.

If p1 and p2 denote the true proportions among those in California and among those nationally that favored Clinton at that time, test the following three hypotheses at the 0.05 level of significance.

 1. Ho: p1 <= p2
 2. Ho: p1 >= p2
 3. Ho: p1 = p2

2. The Center for Social and Religious Research at the Hartford Seminary has studied the divorce rate of Protestant clergy. A survey on a targeted group of 5000 women and 5000 men found that 25% out of 2458 clergywomen responding had been divorced at least once as had 20% out of the 2086 clergymen who responded.

If p1 and p2 denote the true proportions among these two targeted groups of 5000 (rather than among all possible clergywomen and clergymen), test the following three hypotheses at the 0.04 level of significance.

 1. Ho: p1 - p2 <= 0.03
 2. Ho: p1 - p2 >= 0.03
 3. Ho: p1 - p2 = 0.03

3. In a Newsweek Poll conducted on Jan. 28-29, 1993, 394 out of 774 adults surveyed nationwide approved of President Clinton's performance. Suppose however that a similar poll was conducted only in Hope, Arkansas where the adult population was approximately 7300. Suppose that 450 out of 600 approved of the President's performance in that poll.

State and conduct a test on whether or not the President's approval rate in Hope, AR was at least 30 percentage points higher than his approval rate nationwide at that time.

4. A study was undertaken to see if there is a statistically significant difference in cancers or heart diseases among patients who regularly took beta carotene pills. The results appearing in the Journal of the National Cancer Institute found that after four years there were 378 cancers among 19,939 women in the beta carotene group. There were 369 cancers out of 19,937 women in the placebo group. State and conduct a hypothesis test to see if taking beta carotene leads to a difference in cancers among women.

Solutions

1. These tests are equivalent to tests about the difference p1 - p2 with a test difference value of 0. In the DIFPTEST program, we first enter 1 to designate the two "large" populations, then enter 0 for TEST DIFFERENCE. Next, since the exact numbers of affirmative responses are not given, enter .47*326 for 1ST NO. OF YES and 326 for 1ST SAMPLE SIZE. Then enter .48*801 for 2ND NO. OF YES, 801 for 2ND SAMPLE SIZE, and .05 for LEVEL OF SIG.

We receive a left tail value of 0.3802 and we do not reject any of the hypotheses. We note that the difference in sample proportions is -0.01. But if p1 - p2 = 0, then there would still be a 38.02% chance of the difference in sample proportions being as small as -0.01. This chance is too high to reject the hypotheses.

Based on this data, then, we have no significant evidence that p1 differs from p2. That is, at that time, the percentage that favored Clinton was statistically the same in California as it was nationally.

2. In the DIFPTEST program, first enter 4 to designate two finite populations, then enter both population sizes as 5000. Next, enter .03 for TEST DIFFERENCE, .25*2458 for 1ST NO. OF YES, 2458 for 1ST SAMPLE SIZE, .2*2086 for 2ND NO. OF YES, 2086 for 2ND SAMPLE SIZE, and .04 for LEVEL OF SIG.

We receive a right tail value of 0.0143 and we reject the first and third hypotheses.

Since we have rejected the first and third hypotheses but not the second, we conclude that p1 - p2 > 0.03. Thus among these two populations of 5000 clergy, the clergywomen have a divorce rate that is more than 3 percentage points higher than that of the clergymen.

Indeed, if p1 - p2 <= 0.03 were true, then there would be at most a 1.43% chance of the difference in sample proportions being as large as 0.05 with samples of these sizes from these two finite populations.
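To see the effect of the finite population correction in this exercise, the following Python sketch computes the test statistic both with and without it. The helper name variance_term is an assumption made for this illustration; the sample proportions 0.25 and 0.20 are used directly, which matches entering .25*2458 and .2*2086 as the numbers of affirmative responses.

```python
import math

def variance_term(p_hat, n, N=None):
    """(S^2) / n, with the finite population correction when N is given."""
    v = p_hat * (1 - p_hat) / n
    if N is not None:
        v *= (N - n) / (N - 1)
    return v

def right_tail(x):
    """P(Z >= x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2))

n1, n2, N = 2458, 2086, 5000    # sample sizes and common population size
p1_hat, p2_hat = 0.25, 0.20     # reported sample proportions
P, a = 0.03, 0.04               # test difference and level of significance

diff = p1_hat - p2_hat - P
x_finite = diff / math.sqrt(variance_term(p1_hat, n1, N) + variance_term(p2_hat, n2, N))
x_large = diff / math.sqrt(variance_term(p1_hat, n1) + variance_term(p2_hat, n2))

print("finite populations: x =", round(x_finite, 4),
      "right tail =", round(right_tail(x_finite), 4))
print("without correction: x =", round(x_large, 4),
      "right tail =", round(right_tail(x_large), 4))
print("reject first hypothesis:", right_tail(x_finite) < a)
print("reject third hypothesis:", right_tail(x_finite) < a / 2)
```

Since each sample is a large fraction of its population of 5000, the correction shrinks the denominator and so gives a larger test statistic than the uncorrected formula would.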

3. Let p1 be the true approval proportion in Hope, and let p2 be the national approval proportion. The sample proportions are 450 / 600 = 0.75 and 394 / 774 = 0.509, which gives a difference of about 0.241.

We shall test Ho: p1 - p2 >= 0.30, with a one-sided alternative Ha: p1 - p2 < 0.30, using a 0.05 level of significance.

In the DIFPTEST program, first enter 3 to designate a finite population versus a large population, then enter 7300 for the 1ST POP SIZE. Next, enter .30 for TEST DIFFERENCE, 450 for 1ST NO. OF YES, 600 for 1ST SAMPLE SIZE, 394 for 2ND NO. OF YES, 774 for 2ND SAMPLE SIZE, and .05 for LEVEL OF SIG.

We receive a left tail value (p-value) of 0.0084 and we reject the null hypothesis. We conclude that the President's approval rating in Hope was less than 30 percentage points higher than the national approval rating.

If p1 - p2 >= 0.30, then there would only be a 0.84% chance of obtaining a difference in sample proportions as low as 0.241 with samples of these sizes.

4. Let p1 be the true proportion of cancer among all women using beta carotene, and let p2 be the proportion of cancer among women not using beta carotene. We shall test Ho: p1 - p2 = 0, with a two-sided alternative at the 0.05 level of significance. We also shall assume "large" populations in each group.

In the DIFPTEST program, enter 1 to designate two large populations, then enter 0 for TEST DIFFERENCE. Next, enter 378 for 1ST NO. OF YES, 19939 for 1ST SAMPLE SIZE, 369 for 2ND NO. OF YES, 19937 for 2ND SAMPLE SIZE, and .05 for LEVEL OF SIG.

We receive a right-tail value of 0.3703, which gives a p-value for the two-sided alternative of 2*.3703 = 0.7406. We can conclude that there is not a statistically significant difference in the cancer rates.

We note that the difference in sample proportions is 378 / 19939 - 369 / 19937 = 0.00044952. If p1 were equal to p2, then there would still be a 74.06% chance of obtaining a difference in sample proportions as far away from 0 as 0.00044952 even with samples of these sizes.