Other tail of chi-squared distribution (goodness of fit)

You should have noticed that X^2 is always non-negative, so X^2 being far from zero occurs in only one tail (unlike a z-score from the normal distribution, which can be far from zero in either direction). Because there are several dimensions, there is no notion of being above or below the expected value. But the tail of the chi-squared distribution near zero is also useful: it can be used to address whether data agree "too well" with the hypothesis, e.g., whether the experiment was truly random. [For example, if you flipped a coin 1000 times, you would not expect to get exactly 500 heads; this would correspond to a z-score "too near" zero.]

In the multinomial context, one would not expect to get 166 each of red and green M&M's and 167 each of brown, tan, orange, and yellow if one randomly chose 1000 from a vat with equal frequencies of the six colors. The expected value for each color is 1000/6 = 166.67, so the value of X^2 for this selection would be .008, which is very close to zero, reflecting that the observed values are very close to the expected values. The probability of such a small X^2 if the selection were truly random is very small.
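The arithmetic behind the .008 can be checked with a short Python sketch. The function name chi_squared_statistic is a hypothetical helper chosen for illustration; it simply sums (observed - expected)^2 / expected over the six colors.

```python
def chi_squared_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# 166 each of red and green, 167 each of brown, tan, orange, and yellow
observed = [166, 166, 167, 167, 167, 167]
# Equal frequencies: 1000 candies spread over six colors
expected = [1000 / 6] * 6

x2 = chi_squared_statistic(observed, expected)
print(round(x2, 3))  # 0.008
```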

Being "too close" to the expected values can be quantified with the area to the left of the observed X^2. A subscript in the chi-squared table gives the area to the right of the tabulated value, so a subscript such as .99 indicates that the area to the left is 1 - .99 = .01. We identify a result as "too close" to the expected values if the observed X^2 lies within the left-hand tail whose area equals the specified level of significance.

Example: Suppose one found 19 B, 16 T, 17 G, 17 O, 16 Y, and 15 R candies in a bag which was allegedly randomly filled from a source with equal frequencies of all colors. The expected values would be 16.67 of each color, since there are 100 candies total and six colors. X^2 is readily calculated, and is equal to .560. There are five degrees of freedom. In the 5 df row of the chi-squared table, the .99 entry is .554 and the .975 entry is .831. Since .554 < .560 < .831, the observed values are "too close" to the expected values at the .025 (= 1-.975) significance level, but not at the .01 (= 1-.99) significance level. "Too close" in this case would indicate that the candies were selected to balance the colors, rather than at random.
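The table lookup in this example can be cross-checked by computing the left-tail area directly. The Python sketch below is one way to do this, not the method used in these notes: the helper chi2_left_tail is a hypothetical name, and it evaluates the chi-squared cumulative distribution with the standard power series for the regularized lower incomplete gamma function, which is adequate for the small X^2 values of interest here.

```python
import math

def chi2_left_tail(x2, df, max_terms=200):
    """Area to the left of x2 under the chi-squared density with df degrees
    of freedom, via the series for the regularized lower incomplete gamma
    function P(a, t) with a = df/2 and t = x2/2 (an illustrative helper)."""
    if x2 <= 0:
        return 0.0
    a, t = df / 2.0, x2 / 2.0
    term = 1.0
    total = 1.0
    for n in range(1, max_terms):
        term *= t / (a + n)          # t^n / ((a+1)(a+2)...(a+n))
        total += term
        if term < 1e-16 * total:     # series has converged
            break
    # Prefactor t^a * e^(-t) / Gamma(a + 1), computed in log space
    return math.exp(a * math.log(t) - t - math.lgamma(a + 1.0)) * total

observed = [19, 16, 17, 17, 16, 15]          # B, T, G, O, Y, R
expected = sum(observed) / len(observed)     # 100 / 6 = 16.67 per color
x2 = sum((o - expected) ** 2 / expected for o in observed)
left_area = chi2_left_tail(x2, df=len(observed) - 1)

print(round(x2, 3))            # 0.56
print(0.01 < left_area < 0.025)  # True
```

The left-tail area comes out slightly above .01, which agrees with the table: the result is "too close" at the .025 level but not at the .01 level.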


3) If a businessman employs 12 Blacks, 7 Hispanics, and 31 Caucasians in a community which is 20% Black, 15% Hispanic, and 65% Caucasian, would you accuse him of using quotas? At what significance level?