StatsCalculators.com

Chi-Square Goodness of Fit Test

The Chi-Square Goodness of Fit Test Calculator helps you determine whether your observed data follows an expected distribution pattern. This statistical test compares observed frequencies with expected frequencies to assess if deviations are due to chance or indicate a significant difference. It's commonly used in research to analyze categorical data, such as testing if dice rolls are fair, if genetic traits follow Mendelian ratios, or if customer preferences match expected market distributions.

Definition

The chi-square goodness of fit test determines whether sample data are consistent with a hypothesized probability distribution. It compares the observed frequency in each category with the frequency expected under that distribution and tests whether the differences are larger than chance alone would explain.

Formula

Test Statistic:

$$\chi^2 = \sum_{i=1}^k \frac{(O_i - E_i)^2}{E_i}$$

Where:

  • $O_i$ = observed frequency for category $i$
  • $E_i$ = expected frequency for category $i$
  • $k$ = number of categories
  • $df = k - 1$ (degrees of freedom)
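
The formula above can be sketched in a few lines of plain Python. This is a minimal illustration, not the calculator's own implementation: the function name `chi_square_contributions` is hypothetical, and the counts are the die-roll frequencies from the Practical Example on this page.

```python
def chi_square_contributions(observed, expected):
    """Return the per-category terms (O_i - E_i)^2 / E_i."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


# Die-roll counts from the Practical Example (60 rolls, fair die expected)
observed = [10, 8, 12, 10, 15, 5]
expected = [10] * 6

contributions = chi_square_contributions(observed, expected)
statistic = sum(contributions)   # chi-square test statistic
df = len(observed) - 1           # degrees of freedom, k - 1

print([round(c, 3) for c in contributions])
print(f"chi-square = {statistic:.3f}, df = {df}")
```

Keeping the per-category terms separate before summing makes it easy to see which categories drive the statistic.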

Key Assumptions

Random Sample: Data must be randomly sampled
Independence: Observations must be independent
Sample Size: Expected frequency should be ≥ 5 for each category
Categorical Data: Data must be categorical
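
The sample-size assumption is the one that can be checked mechanically before running the test. A minimal sketch, assuming a hypothetical helper named `check_expected_counts` and the conventional threshold of 5 stated above:

```python
def check_expected_counts(expected, minimum=5):
    """Return (index, value) pairs for categories whose expected count is below `minimum`."""
    return [(i, e) for i, e in enumerate(expected) if e < minimum]


# Example: 60 rolls of a die with six equally likely faces
n = 60
probabilities = [1 / 6] * 6
expected = [n * p for p in probabilities]

too_small = check_expected_counts(expected)
if too_small:
    print("Expected counts below 5 in categories:", too_small)
else:
    print("All expected counts are at least 5")
```

If any category is flagged, a common remedy is to merge sparse categories or use an exact test instead.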

Practical Example

Step 1: State the Data

Die roll frequencies from 60 rolls:

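| Face | Observed (O) | Expected (E) | (O-E)²/E |
|------|--------------|--------------|----------|
| 1    | 10           | 10           | 0.000    |
| 2    | 8            | 10           | 0.400    |
| 3    | 12           | 10           | 0.400    |
| 4    | 10           | 10           | 0.000    |
| 5    | 15           | 10           | 2.500    |
| 6    | 5            | 10           | 2.500    |

Step 2: State Hypotheses

  • $H_0$: The die is fair (equal probabilities)
  • $H_a$: The die is not fair
  • $\alpha = 0.05$

Step 3: Calculate Test Statistic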

Chi-square statistic:

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = 5.800$$
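
Written out term by term with the observed counts and the expected count of 10 per face, the sum is:

```latex
\begin{aligned}
\chi^2 &= \frac{(10-10)^2}{10} + \frac{(8-10)^2}{10} + \frac{(12-10)^2}{10}
        + \frac{(10-10)^2}{10} + \frac{(15-10)^2}{10} + \frac{(5-10)^2}{10} \\
       &= 0 + 0.4 + 0.4 + 0 + 2.5 + 2.5 \\
       &= 5.8
\end{aligned}
```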

Degrees of freedom = $6 - 1 = 5$

Step 4: Determine Critical Value

At $\alpha = 0.05$ with $df = 5$, the critical value is:

$$\chi^2_5 = 11.070$$

Step 5: Calculate P-value

Using the chi-square distribution with $df = 5$:

$$p\text{-value} = 0.326$$

Step 6: Draw Conclusion

Since $\chi^2 = 5.800 < \chi^2_5 = 11.070$ and the $p$-value $= 0.326 > 0.05$, we fail to reject $H_0$. There is insufficient evidence to conclude that the die is unfair.
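
The p-value and the decision in Steps 5 and 6 can be reproduced without any third-party library. The sketch below is an illustration only: `chi2_sf` is a hypothetical helper that evaluates the upper-tail probability through a series expansion of the regularized incomplete gamma function. In practice a library routine such as `scipy.stats.chi2.sf` would normally be used instead.

```python
import math


def chi2_sf(x, df, tol=1e-12, max_terms=10_000):
    """Upper-tail probability P(X >= x) for a chi-square variable with `df` degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series for the lower incomplete gamma function:
    # gamma(a, z) = z^a * e^(-z) * sum_{n>=0} z^n / (a (a+1) ... (a+n))
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= z / (a + n)
        total += term
        if term < tol * total:
            break
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return 1.0 - lower


statistic, df, alpha = 5.8, 5, 0.05
p_value = chi2_sf(statistic, df)

print(f"p-value: {p_value:.3f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The series converges quickly for moderate values of the statistic; for very large values a library implementation is the safer choice.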

Effect Size

Cramér's V for the goodness of fit test:

$$V = \sqrt{\frac{\chi^2}{n(k-1)}}$$

Where:

  • $\chi^2$ = chi-square statistic
  • $n$ = total sample size
  • $k$ = number of categories

For our example:

$$V = \sqrt{\frac{5.800}{60(6-1)}} = 0.139$$

Interpretation guidelines:

  • Small effect: $V \approx 0.10$
  • Medium effect: $V \approx 0.30$
  • Large effect: $V \approx 0.50$

With $V = 0.139$, the effect is small: it sits just above the 0.10 guideline and well below 0.30, so the deviations from the expected frequencies are modest in practical terms.
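
The effect-size formula translates directly into code. A minimal sketch, assuming a hypothetical function name `cramers_v_gof` and the values from the die example:

```python
import math


def cramers_v_gof(chi_square, n, k):
    """Cramér's V for a goodness of fit test: sqrt(chi^2 / (n * (k - 1)))."""
    if n <= 0 or k < 2:
        raise ValueError("need n > 0 and at least two categories")
    return math.sqrt(chi_square / (n * (k - 1)))


# Values from the die example: chi-square = 5.8, 60 rolls, 6 faces
v = cramers_v_gof(5.8, n=60, k=6)
print(f"Cramér's V = {v:.3f}")
```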

Code Examples

R

```r
# Chi-Square Goodness of Fit Test
# Observed frequencies
observed <- c(10, 8, 12, 10, 15, 5)

# Perform chi-square test
result <- chisq.test(
  observed,
  p = rep(1/6, 6)  # Equal probabilities for each face
)

print(result)
```

Python

```python
# Chi-Square Goodness of Fit Test
from scipy.stats import chi2, chisquare

# Observed frequencies
observed = [10, 8, 12, 10, 15, 5]

# Expected frequencies (equal probabilities)
n = sum(observed)  # total observations
p = 1 / 6  # probability for each face
expected = [n * p] * len(observed)

# Perform chi-square test
stat, pvalue = chisquare(observed, f_exp=expected)

print(f'Chi-square statistic: {stat:.4f}')
print(f'p-value: {pvalue:.4f}')

# Degrees of freedom
df = len(observed) - 1

# Critical value: upper 5% point of the chi-square distribution
critical_value = chi2.ppf(0.95, df)
print(f'Critical value (α=0.05): {critical_value:.4f}')
```

Alternative Tests

Consider these alternatives:

  • G-test: A likelihood-ratio alternative to the chi-square test for categorical data
  • Exact Multinomial Test: For small samples, where expected frequencies fall below 5
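
For comparison, the G-test statistic is $G = 2 \sum O_i \ln(O_i / E_i)$, which is referred to the same chi-square distribution with $k - 1$ degrees of freedom. The sketch below is illustrative only; the function name `g_statistic` is hypothetical, and the data are the die-roll counts from the example.

```python
import math


def g_statistic(observed, expected):
    """G = 2 * sum(O_i * ln(O_i / E_i)); categories with O_i = 0 contribute 0."""
    return 2.0 * sum(
        o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
    )


observed = [10, 8, 12, 10, 15, 5]
expected = [10] * 6

g = g_statistic(observed, expected)
print(f"G = {g:.3f}")
```

For these counts G comes out close to the chi-square statistic of 5.8, which is typical when the expected counts are not small.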
