Chapter 11: Goodness-of-Fit and Contingency Tables

The chi-square goodness-of-fit test evaluates whether observed sample frequencies match the frequencies expected under a hypothesized distribution, which makes it a tool for checking distributional assumptions. The chi-square test for independence uses a contingency table to examine whether two categorical variables are associated, comparing each cell's observed frequency with the frequency expected if the variables were independent. A related procedure, the chi-square test for homogeneity, compares the distribution of a categorical variable across different populations to determine whether the proportions are the same in every group. All chi-square tests compute a test statistic that measures the discrepancy between observed and expected frequencies, and the statistic is compared against a critical value from the chi-square distribution. Valid conclusions depend on the test's assumptions, particularly the minimum expected frequency requirement (commonly stated as an expected count of at least 5 in each cell).

The chapter then transitions to analysis of variance (ANOVA), a parametric technique for testing whether the means of three or more populations differ. ANOVA decomposes total variation into a between-group component, representing differences among group means, and a within-group component, representing variation within each group. The ratio of these two components, each divided by its degrees of freedom, is the F-statistic, which follows the F-distribution under the null hypothesis of equal means. The method assumes independent observations, normally distributed populations, and equal variances across groups. When ANOVA yields a statistically significant result, post-hoc tests such as Tukey's method or Scheffé's test identify which specific pairs of groups differ while controlling for multiple comparisons.
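The statistics summarized above can be sketched in plain Python. This is a minimal illustration, not the chapter's own code: the function names and all data values are made up for the example, and only the test statistics and degrees of freedom are computed (a p-value would need a chi-square or F table, or a statistics library).

```python
from statistics import mean


def chi_square_statistic(observed, expected):
    """Return the sum over categories of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def expected_under_independence(table):
    """Expected cell counts: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def independence_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a two-way table."""
    expected = expected_under_independence(table)
    observed_flat = [o for row in table for o in row]
    expected_flat = [e for row in expected for e in row]
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_square_statistic(observed_flat, expected_flat), df


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up data, for illustration only.
die_rolls = [8, 12, 9, 11, 10, 10]   # observed counts for faces 1..6
fair_die = [10] * 6                  # expected counts for a fair die, 60 rolls
gof = chi_square_statistic(die_rolls, fair_die)

table = [[10, 20], [20, 10]]         # hypothetical 2x2 contingency table
indep, indep_df = independence_statistic(table)

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(groups)

print(f"goodness-of-fit: chi2 = {gof:.3f}, df = {len(die_rolls) - 1}")
print(f"independence:    chi2 = {indep:.3f}, df = {indep_df}")
print(f"one-way ANOVA:   F = {f_stat:.3f}, df = ({df_b}, {df_w})")
```

With these numbers the goodness-of-fit statistic is about 1.0, well below the 0.05 critical value of 11.070 for 5 degrees of freedom, so the fair-die hypothesis is not rejected. The contingency-table statistic is about 6.667, which exceeds the 0.05 critical value of 3.841 for 1 degree of freedom. Every expected count in both examples is at least 5, so the minimum expected frequency rule of thumb is satisfied.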
Knowing when to apply each technique is essential statistical literacy for researchers and data analysts. In particular, a single ANOVA avoids the inflated Type I error rate that results from running many separate t-tests on the same groups.
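The Type I error inflation from repeated t-tests can be made concrete with a short calculation. The sketch below assumes the pairwise tests are independent, which is only an approximation when the tests share the same groups, and the function names are illustrative. Under that assumption, the chance of at least one false positive across m tests at level alpha is 1 - (1 - alpha)^m.

```python
from math import comb


def pairwise_comparisons(k):
    """Number of distinct pairs among k groups."""
    return comb(k, 2)


def familywise_error_rate(alpha, m):
    """Probability of at least one Type I error in m independent tests."""
    return 1 - (1 - alpha) ** m


alpha = 0.05
for k in (3, 4, 5):
    m = pairwise_comparisons(k)
    rate = familywise_error_rate(alpha, m)
    print(f"{k} groups -> {m} t-tests -> familywise error rate ~ {rate:.3f}")
```

For three groups the three pairwise tests already push the approximate rate to about 0.143, and for five groups (ten tests) it is about 0.401, compared with 0.05 for a single ANOVA F-test.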