### Chapter 9: Categorical Association: Chi-Square Goodness of Fit Test

### Chi-Square Goodness of Fit Test: Purpose, Hypotheses, and Assumptions

Chi-Square Test for Goodness of Fit: Purpose and Hypotheses

The **Chi-Square Goodness of Fit Test** uses the sample data of a *categorical variable* to test hypotheses about the *proportions* of a population distribution.

Specifically, the test determines how well the observed sample proportions *fit* the population proportions predicted by the null hypothesis.

The null hypothesis of a *Goodness of Fit Test* makes a prediction about the proportion (or percentage) of the population in each of the measurement categories. Although it is possible to choose any hypothesized proportions for the null hypothesis, the null hypothesis generally falls into one of two categories:

**No preference**

A null hypothesis of *no preference* is used to determine whether there are any preferences among the measurement categories, or whether the proportions differ from one category to the next.

In these cases, #H_0# predicts that the population is divided equally across all categories.

For example, a null hypothesis stating that there is *no preference* among #4# of the most popular ice cream flavors would predict the following population distribution:

| | Chocolate | Vanilla | Strawberry | Banana |
|---|---|---|---|---|
| #H_0:# | #1/4# | #1/4# | #1/4# | #1/4# |
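As a minimal sketch of how a *no preference* null hypothesis translates into numbers, the Python snippet below assigns an equal proportion to each of the four flavors and converts those proportions into expected frequencies. The function name `no_preference_proportions` and the sample size `n = 120` are assumptions made purely for illustration; they do not come from the example above.

```python
from fractions import Fraction

def no_preference_proportions(categories):
    """Return the equal proportions that a no-preference H_0 assigns to each category."""
    k = len(categories)
    return {category: Fraction(1, k) for category in categories}

flavors = ["Chocolate", "Vanilla", "Strawberry", "Banana"]
h0 = no_preference_proportions(flavors)

# Hypothetical sample size, chosen only for illustration.
n = 120

# Expected frequency per category = hypothesized proportion * sample size.
expected = {flavor: p * n for flavor, p in h0.items()}

print(h0["Chocolate"])     # 1/4
print(expected["Banana"])  # 30
```

Exact fractions are used here so that the proportions add up to exactly #1#, which avoids rounding noise when the number of categories does not divide evenly.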

**No difference from a known population**

Alternatively, we might want to determine whether the unknown proportions for one population significantly differ from the proportions for another population of which the distribution is already known.

A null hypothesis of *no difference* predicts that the proportions for the unknown population are identical to the proportions for the known population.

For example, suppose it is known that #22\%# of students at the University of Amsterdam prefer morning classes, #65\%# of students prefer to have their classes in the afternoon, and the remaining #13\%# prefer evening classes. A professor at the Erasmus University in Rotterdam wonders whether these same proportions hold for her own students.

Here, the null hypothesis would state that the distribution of students at the Erasmus University is identical to the distribution of students at the University of Amsterdam:

| | Preference for morning classes | Preference for afternoon classes | Preference for evening classes |
|---|---|---|---|
| #H_0:# | #.22# | #.65# | #.13# |
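The same idea can be sketched for a *no difference* null hypothesis: the hypothesized proportions are taken from the known population and multiplied by the sample size to obtain the frequencies expected under #H_0#. In the snippet below, the helper `expected_frequencies` and the sample size `n = 200` are hypothetical; only the three proportions come from the example.

```python
import math

# Proportions of the known population (University of Amsterdam), as given in the example.
h0_proportions = {"morning": 0.22, "afternoon": 0.65, "evening": 0.13}

def expected_frequencies(proportions, n):
    """Expected count per category under H_0: hypothesized proportion * sample size."""
    if not math.isclose(sum(proportions.values()), 1.0):
        raise ValueError("hypothesized proportions must sum to 1")
    return {category: p * n for category, p in proportions.items()}

# Hypothetical sample size for the professor's own students (not from the text).
n = 200
expected = expected_frequencies(h0_proportions, n)

for category, count in expected.items():
    print(f"{category}: {count:.1f}")
```

The check that the proportions sum to #1# guards against a null hypothesis that leaves part of the population unaccounted for.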

Since the null hypothesis #H_0# of a *Goodness of Fit Test* makes an exact prediction about the distribution for the population, the alternative hypothesis #H_a# simply predicts that the population has a different distribution from the one predicted by the null hypothesis.

Assumptions of the Chi-Square Goodness of Fit Test

The following assumptions are required to hold in order for a *Chi-Square Goodness of Fit Test* to produce valid results:

1. The variable being studied is **categorical** (qualitative) in nature.
2. The measurement categories are **mutually exclusive**, which means that each observation can be classified into one and only one category.
3. **Random sampling** is used to draw the sample.
4. All categories should have an expected frequency of at least #1#.
5. The majority of categories #(\geq 80\%)# should have an expected frequency of at least #5#.

If either assumption #4# or #5# is not satisfied, you must combine some categories.
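The two expected-frequency assumptions can be checked mechanically. The sketch below is one possible way to do so; the function name `check_expected_frequencies` and the example frequencies are hypothetical and serve only to illustrate the rule.

```python
def check_expected_frequencies(expected):
    """Check the two expected-frequency assumptions (assumptions 4 and 5).

    Returns a pair of booleans:
    (all expected frequencies >= 1, at least 80% of expected frequencies >= 5).
    """
    counts = list(expected)
    all_at_least_1 = all(e >= 1 for e in counts)
    share_at_least_5 = sum(e >= 5 for e in counts) / len(counts)
    return all_at_least_1, share_at_least_5 >= 0.80

# Hypothetical expected frequencies for five categories.
print(check_expected_frequencies([12.0, 8.5, 6.0, 5.0, 3.5]))  # (True, True)
print(check_expected_frequencies([12.0, 8.5, 4.0, 3.0, 0.5]))  # (False, False)
```

In the second case both checks fail, so categories would have to be combined and the expected frequencies recomputed before running the test.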
