Project 3: Can We Use ANOVA?

This project studies the conditions that allow us to use analysis of variance (ANOVA) to determine whether a group of populations has a common mean. We begin with several data sets having no known underlying conditions and proceed with the study.



Random Data from Several Populations

The following charts list the estimated city miles per gallon obtained for samples of 1993 models of cars as reported by Consumer Reports: The 1993 Cars - Annual Auto Issue (April 1993).


Population 1: Compact Cars

Model and City MPG
Audi 90 -- 20
Chevy Cavalier -- 25
Chevy Corsica -- 25
Chrysler LeBaron -- 20
Dodge Spirit -- 22
Ford Tempo -- 22
Honda Accord -- 24
Mazda 626 -- 26
Mercedes-Benz 190E -- 20
Nissan Altima -- 24
Olds Achieva -- 24
Pontiac Sunbird -- 23
Saab 900 -- 20
Subaru Legacy -- 23
Volkswagen Passat -- 21
Volvo 240 -- 21


Population 2: Midsize Cars

Model and City MPG
Acura Legend -- 18
Audi 100 -- 19
BMW 535i -- 22
Buick Century -- 22
Buick Riviera -- 19
Cadillac Seville -- 16
Chevy Lumina -- 21
Dodge Dynasty -- 21
Ford Taurus -- 21
Hyundai Sonata -- 20
Infiniti Q45 -- 17
Lexus ES300 -- 18
Lexus SC300 -- 18
Lincoln Continental -- 17
Mercedes-Benz 300E -- 19
Mercury Cougar -- 19
Mitsubishi Diamante -- 18
Nissan Maxima -- 21
Olds Cutlass Ciera -- 23
Pontiac Grand Prix -- 19
Toyota Camry -- 22
Volvo 850 -- 20


Population 3: Large Cars and Vans

Large Cars

Model and City MPG
Buick LeSabre -- 19
Buick Roadmaster -- 16
Cadillac Deville -- 16
Chevy Caprice -- 17
Chrysler Concorde -- 20
Chrysler Imperial -- 20
Eagle Vision -- 20
Ford Crown Victoria -- 18
Lincoln TownCar -- 18
Olds Eighty-Eight -- 19
Pontiac Bonneville -- 19

Vans

Model and City MPG
Chevy Astro -- 15
Chevy Lumina APV -- 18
Dodge Caravan -- 17
Ford Aerostar -- 15
Mazda MPV -- 18
Nissan Quest -- 17
Olds Silhouette -- 18
Toyota Previa -- 18
Volkswagen Eurovan -- 17



Testing for Normality

Each population actually consists of all possible models in its class, not just those listed; the listed cars are samples. Before we can use ANOVA, we must verify or accept the hypothesis that the populations are normally distributed. Use the TESTNORM program with 8 partitions and a 0.05 level of significance to test whether we can accept the claim that each of these three populations is normally distributed.

If normality is rejected for one or more of the populations, then we cannot use ANOVA. In this case, proceed to the last section below.
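
The internals of TESTNORM are not described here. As a rough sketch, assuming it performs a chi-square goodness-of-fit test in which the "partitions" are bins of equal probability under the fitted normal curve, the test statistic for the compact cars could be computed as follows. The function name chi_square_normality and the hardcoded table value 11.070 (upper 5% point of chi-square with 8 - 1 - 2 = 5 degrees of freedom) are illustrative choices, not part of the project's programs.

```python
from statistics import NormalDist, mean, stdev

# City MPG for the compact cars sample (Population 1).
compact = [20, 25, 25, 20, 22, 22, 24, 26, 20, 24, 24, 23, 20, 23, 21, 21]

def chi_square_normality(sample, partitions=8):
    """Chi-square goodness-of-fit statistic against a fitted normal curve.

    Assumption: the bins are equal-probability intervals of the normal
    distribution whose mean and standard deviation are estimated from
    the sample.
    """
    n = len(sample)
    fitted = NormalDist(mean(sample), stdev(sample))
    # Interior cut points; each of the bins has probability 1/partitions.
    cuts = [fitted.inv_cdf(k / partitions) for k in range(1, partitions)]
    observed = [0] * partitions
    for x in sample:
        index = sum(1 for c in cuts if x > c)  # number of cuts below x
        observed[index] += 1
    expected = n / partitions
    statistic = sum((o - expected) ** 2 / expected for o in observed)
    return statistic, observed

# Table value: chi-square, 5 degrees of freedom, 0.05 level of significance.
CRITICAL_5PCT_DF5 = 11.070

stat, counts = chi_square_normality(compact)
decision = "reject normality" if stat > CRITICAL_5PCT_DF5 else "do not reject normality"
print(counts, round(stat, 3), decision)
```

With samples this small the expected count per bin is low, so the chi-square approximation is crude; the sketch only illustrates the mechanics of the test.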



Testing for Common Variance

Suppose we have accepted the hypothesis that each population is normally distributed. ANOVA also requires that all of the populations have the same variance. We must therefore test various pairs of populations in order to accept or reject the hypothesis that they have a common variance. To do so, we can use the RATFTEST program.

If X and Y represent one pair of the populations above, then we can denote the unknown variances by VarX and VarY respectively. We wish to test the hypothesis H0: VarX / VarY = 1. If we accept the claim, then we can assume that X and Y have the same variance. Then we test another population against one of these two. We continue until we either accept that all populations have the same variance or find a pair for which we reject the hypothesis.

Compute the sample variance for each population above, and then test the hypothesis that they all have a common variance. If the hypothesis is rejected, then ANOVA cannot be used; thus, proceed to the last section.
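
As an illustration of the computation, and not a description of how RATFTEST works internally, the sketch below finds the sample variance of each population and forms the ratio of two sample variances with the larger one on top. The helper name variance_ratio is hypothetical. The resulting ratio would still need to be compared with a critical value of the F distribution for the stated degrees of freedom, taken from a table or from the RATFTEST output.

```python
from statistics import variance

# City MPG samples for the three populations listed above.
compact = [20, 25, 25, 20, 22, 22, 24, 26, 20, 24, 24, 23, 20, 23, 21, 21]
midsize = [18, 19, 22, 22, 19, 16, 21, 21, 21, 20, 17,
           18, 18, 17, 19, 19, 18, 21, 23, 19, 22, 20]
large_and_vans = [19, 16, 16, 17, 20, 20, 20, 18, 18, 19, 19,   # large cars
                  15, 18, 17, 15, 18, 17, 18, 18, 17]           # vans

def variance_ratio(x, y):
    """Return (F, numerator df, denominator df) for H0: VarX / VarY = 1.

    The larger sample variance is placed in the numerator so that F >= 1.
    """
    var_x, var_y = variance(x), variance(y)
    if var_x >= var_y:
        return var_x / var_y, len(x) - 1, len(y) - 1
    return var_y / var_x, len(y) - 1, len(x) - 1

for name, data in [("compact", compact), ("midsize", midsize),
                   ("large and vans", large_and_vans)]:
    print(name, "sample variance:", round(variance(data), 4))

f_ratio, df_num, df_den = variance_ratio(compact, large_and_vans)
print("F =", round(f_ratio, 3), "with df", (df_num, df_den))
```

Starting with the pair whose sample variances differ the most is a reasonable shortcut: if that pair is accepted, the closer pairs are unlikely to be rejected.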



ANOVA

Now suppose the conditions of normality and common variance have both been accepted. Choose an appropriate level of significance and use ANOVA to test whether the populations have a common mean. If you accept this hypothesis, then you can state your conclusions.
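
For reference, the one-way ANOVA statistic can be sketched directly from its definition: the between-group mean square divided by the within-group mean square. The function name one_way_anova is an illustrative choice; the code computes only the F statistic and its degrees of freedom, and the comparison with a critical value at your chosen level of significance is left to a table or to your ANOVA program.

```python
from statistics import mean

# City MPG samples for the three populations listed above.
compact = [20, 25, 25, 20, 22, 22, 24, 26, 20, 24, 24, 23, 20, 23, 21, 21]
midsize = [18, 19, 22, 22, 19, 16, 21, 21, 21, 20, 17,
           18, 18, 17, 19, 19, 18, 21, 23, 19, 22, 20]
large_and_vans = [19, 16, 16, 17, 20, 20, 20, 18, 18, 19, 19,   # large cars
                  15, 18, 17, 15, 18, 17, 18, 18, 17]           # vans

def one_way_anova(groups):
    """Return (F, df between, df within) for H0: all group means are equal."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = mean(g)
        # Variation of the group mean about the grand mean.
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        # Variation of the observations about their own group mean.
        ss_within += sum((x - group_mean) ** 2 for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

f_stat, df_b, df_w = one_way_anova([compact, midsize, large_and_vans])
print("F =", round(f_stat, 3), "with df", (df_b, df_w))
```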



If ANOVA Yields Rejection

Suppose that you reject the hypothesis that all means are equal. Then at least one pair of populations has different means. To get an idea of which means appear to be different, you can simply look at the values of the individual sample means. Find the populations whose sample means are farthest apart. Then use the T2MNTEST program to test the hypothesis that the difference in means is equal to 0. If you reject the hypothesis, then you have found a pair with different means. Find all such pairs.
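
The follow-up comparison can be sketched as a two-sample t statistic. The version below assumes a pooled variance, which is consistent with common variance having been accepted earlier; whether T2MNTEST pools the variances is not stated here, so treat that as an assumption. The helper name pooled_t is hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# The two samples whose means are farthest apart among the three populations.
compact = [20, 25, 25, 20, 22, 22, 24, 26, 20, 24, 24, 23, 20, 23, 21, 21]
large_and_vans = [19, 16, 16, 17, 20, 20, 20, 18, 18, 19, 19,   # large cars
                  15, 18, 17, 15, 18, 17, 18, 18, 17]           # vans

def pooled_t(x, y):
    """Return (t, df) for H0: mean of X - mean of Y = 0, pooled variance."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    # Weighted average of the two sample variances.
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    standard_error = sqrt(pooled_var * (1 / nx + 1 / ny))
    return (mean(x) - mean(y)) / standard_error, df

t_stat, df = pooled_t(compact, large_and_vans)
print("t =", round(t_stat, 3), "with", df, "degrees of freedom")
```

Keep in mind that running several pairwise tests at the same level of significance raises the overall chance of a false rejection.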



If Normality or Common Variance Is Rejected

In this case the required conditions for ANOVA have not been met, but we can still test whether the populations all have the same distribution. We can do so with the non-parametric Kruskal-Wallis test using the KRUSKAL program.

If the Kruskal-Wallis test rejects the hypothesis, then we should use the test again on the populations two at a time to determine which pairs have different distributions.

Thus if ANOVA cannot be used, then use the Kruskal-Wallis test to determine if the populations have the same distribution, or to find which pairs of populations have different distributions.
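
As a sketch of what the Kruskal-Wallis statistic measures, the code below pools the samples, assigns average ranks to tied values, and computes H with the usual correction for ties. It is an illustration under those assumptions and not a description of the KRUSKAL program; the function name kruskal_wallis is hypothetical. H is compared with a chi-square critical value with (number of groups - 1) degrees of freedom, for example 5.991 for three groups at the 0.05 level.

```python
# City MPG samples for the three populations listed above.
compact = [20, 25, 25, 20, 22, 22, 24, 26, 20, 24, 24, 23, 20, 23, 21, 21]
midsize = [18, 19, 22, 22, 19, 16, 21, 21, 21, 20, 17,
           18, 18, 17, 19, 19, 18, 21, 23, 19, 22, 20]
large_and_vans = [19, 16, 16, 17, 20, 20, 20, 18, 18, 19, 19,   # large cars
                  15, 18, 17, 15, 18, 17, 18, 18, 17]           # vans

def kruskal_wallis(groups):
    """Return (H, df), with H corrected for ties.

    Assumes the pooled data contain at least two distinct values.
    """
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)
    rank_of = {}      # value -> average rank of that value in the pooled data
    tie_term = 0      # sum of t**3 - t over groups of tied values
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        ties = j - i
        rank_of[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        tie_term += ties ** 3 - ties
        i = j
    rank_sum_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    correction = 1 - tie_term / (n ** 3 - n)
    return h / correction, len(groups) - 1

h_stat, df = kruskal_wallis([compact, midsize, large_and_vans])
print("H =", round(h_stat, 3), "with", df, "degrees of freedom")
```

To compare populations two at a time, call the same function with a list containing just that pair of samples.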



Additional Data


Small Cars

Model and City MPG
Acura Integra -- 25
Dodge Colt -- 29
Dodge Shadow -- 23
Eagle Summit -- 29
Ford Festiva -- 31
Ford Escort -- 23
Geo Metro -- 46
Honda Civic -- 42
Hyundai Elantra -- 22
Hyundai Excel -- 29
Mazda 323 -- 29
Mazda Protege -- 28
Mitsubishi Mirage -- 29
Nissan Sentra -- 29
Pontiac LeMans -- 31
Saturn SL -- 28
Subaru Justy -- 33
Suzuki Swift -- 39
Toyota Tercel -- 32
Volkswagen Fox -- 25


Sporty Cars

Model and City MPG
Chevy Camaro -- 19
Chevy Corvette -- 17
Dodge Stealth -- 18
Ford Mustang -- 22
Ford Probe -- 24
Geo Storm -- 30
Honda Prelude -- 24
Hyundai Scoupe -- 26
Mazda RX-7 -- 17
Mercury Capri -- 23
Plymouth Laser -- 23
Pontiac Firebird -- 19
Toyota Celica -- 25
Volkswagen Corrado -- 18



Part 2

Now let Population 1 = Compact & Sporty; Population 2 = Midsize & Vans; and Population 3 = Large & Small.

Rework the project with these three populations.


