
Chi-Square Test
A chi-square (χ2) statistic measures how observed data (or model results) compare with the values expected under a hypothesis. The data used in calculating a
chi-square statistic must be random, raw, mutually exclusive, drawn from independent variables, and drawn from a large enough sample.
That is, the chi-square (χ2) tests are certain types of statistical hypothesis tests that are valid to perform when the test statistic is chi-squared distributed under the null hypothesis.
In the standard applications of this test, the observations are classified into mutually exclusive classes. If the so-called null hypothesis is true, the test statistic computed from the observations follows a χ2 distribution.
The purpose of the test is to evaluate how likely the observed frequencies would be if the null hypothesis were true. Test statistics that follow a χ2 distribution occur when the observations are independent and normally distributed, assumptions that are often justified by the central limit theorem. There are also χ2 tests of the null hypothesis that a pair of random variables is independent, based on observations of the pairs.
In many of these tests the χ2 distribution is only an approximation: the sampling distribution of the test statistic (if the null hypothesis is true) approximates a chi-squared distribution more and more closely as the sample size increases.
Types of chi-square (χ2) tests:
There are two types of chi-square tests. Both use the chi-square statistic and distribution for different purposes:
1. A chi-square test for independence compares two variables in a contingency table to see if they are related. In a more general sense, it tests whether the distributions of categorical variables differ from one another.
2. A chi-square goodness-of-fit test determines whether sample data match a population with a known or assumed distribution.
A very small chi-square test statistic means that your observed data fit your expected data extremely well. In a goodness-of-fit test, the sample matches the assumed population; in a test for independence, the data are consistent with no relationship between the variables.
A very large chi-square test statistic means that the data do not fit the expected values well. In a test for independence, this is evidence that the variables are related.
Assumptions:
Like many inference procedures, chi-square tests have underlying assumptions that should be met for the results of the calculations to be trustworthy. They include:
1. The data in the cells should be frequencies, or counts of cases, rather than percentages or some other transformation of the data.
2. The levels (or categories) of the variables are mutually exclusive.
That is, a particular subject fits into one and only one level of each of the variables.
3. Each subject may contribute data to one and only one cell in the χ2. If, for example, the same subjects are tested over time such that the comparisons are of the same subjects at Time 1, Time 2, Time 3, etc., then χ2 may not be used.
4. The study groups must be independent. This means that a different test must be used if the two groups are related. For example, a different test must be used if the researcher's data consist of paired samples, such as in studies in which a parent is paired with his or her child.
5. There are 2 variables, and both are measured as categories, usually at the nominal level. However, data may be ordinal data. Interval or ratio data that have been collapsed into ordinal categories may also be used. While Chi-square has no rule about limiting the number of cells (by limiting the number of categories for each variable), a very large number of cells (over 20) can make it difficult to meet assumption #6 below, and to interpret the meaning of the results.
6. The expected count should be 5 or more in at least 80% of the cells, and no cell should have an expected count of less than one (3). This assumption is most likely to be met if the sample size equals at least the number of cells multiplied by 5. Essentially, this assumption specifies the number of cases (sample size) needed to use the χ2 for any number of cells in that χ2.
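The expected-count rule in assumption 6 can be checked mechanically once the expected counts are known. The following is a minimal sketch; the function name `expected_counts_ok`, its parameter names, and the two tables of expected counts are hypothetical and chosen only for illustration.

```python
def expected_counts_ok(expected, min_count=5, min_share=0.8, floor=1):
    """Check the rule of thumb from assumption 6 on a table of expected counts.

    expected  : list of rows, each a list of expected cell counts
    min_count : count that most cells should reach (5 in the text)
    min_share : share of cells that must reach min_count (80% in the text)
    floor     : smallest expected count allowed in any cell (1 in the text)
    """
    cells = [value for row in expected for value in row]
    share_large_enough = sum(value >= min_count for value in cells) / len(cells)
    return share_large_enough >= min_share and min(cells) >= floor


# Hypothetical expected counts for two 2 x 3 tables.
print(expected_counts_ok([[20, 36, 24], [30, 54, 36]]))  # every cell is at least 5
print(expected_counts_ok([[0.5, 6, 7], [8, 9, 10]]))     # one cell is below 1
```

The check works on expected counts, not observed counts, so it can only be run after the expected frequencies have been calculated.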
Hypotheses:
Null Hypothesis (H0): There is “no change” or “no difference” in situation.
Alternative Hypothesis (H1): There is a “change” or “difference” in situation.

I. Test of Independence
When considering student sex and course choice, a χ2 test for independence could be used. To do this test, the researcher would collect data on the two chosen variables (sex and courses picked) and then compare the frequencies at which male and female students select among the offered classes using the χ2 statistic and a χ2 statistical table.
If there is no relationship between sex and course selection (that is, if they are independent), then male and female students should select each offered course at approximately the same rate; equivalently, the proportion of male and female students in any selected course should be approximately equal to the proportion of male and female students in the sample.
A χ2 test for independence can tell us how likely it is that random chance can explain any observed difference between the actual frequencies in the data and these theoretical expectations.
Problem
Imagine you have surveyed 200 individuals to determine if there is a significant association between gender and preference for a particular smartphone brand.
Solution
Step 1: Formulate Hypotheses
Null Hypothesis (H0): There is no association between gender and smartphone brand preference.
Alternative Hypothesis (Ha): There is a significant association between gender and smartphone brand preference.
Step 2: Collect Data
Gender | iPhone | Samsung | Other |
Male | 30 | 40 | 10 |
Female | 20 | 50 | 50 |
Step 3: Set Significance Level
Choose a significance level (commonly 0.05) to determine if the observed association is statistically significant.
Step 4: Create a Contingency Table
Sum the rows and columns and create a contingency table:
Gender | iPhone | Samsung | Other | Row Total |
Male | 30 | 40 | 10 | 80 |
Female | 20 | 50 | 50 | 120 |
Column Total | 50 | 90 | 60 | 200 |
Step 5: Calculate Expected Frequencies
Calculate the expected frequency for each cell using the formula:
Expected Frequency = (Row Total × Column Total) / Grand Total
Calculation for the cell in the first row, first column (iPhone, Male):
Expected Frequency = (80 × 50) / 200
Expected Frequency = 20
Repeat this calculation for each cell in the table.
In the following table, the calculated expected frequencies are given in parentheses in each cell.
Gender | iPhone | Samsung | Other | Row Total |
Male | 30 (20) | 40 (36) | 10 (24) | 80 |
Female | 20 (30) | 50 (54) | 50 (36) | 120 |
Column Total | 50 | 90 | 60 | 200 |
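The expected frequencies in the table can be reproduced with a few lines of Python. This is a sketch of the arithmetic only; the variable names are illustrative, and the observed counts are the ones from the survey table.

```python
# Observed counts from the survey (rows: Male, Female; columns: iPhone, Samsung, Other).
observed = [
    [30, 40, 10],
    [20, 50, 50],
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
grand_total = sum(row_totals)

# Expected count of a cell = row total * column total / grand total.
expected = [
    [row_total * column_total / grand_total for column_total in column_totals]
    for row_total in row_totals
]

for row in expected:
    print(row)
```

The printed rows match the parenthesised values: 20, 36, 24 for Male and 30, 54, 36 for Female.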
Step 6: Calculate Chi-Square (χ2) Statistic
χ2 = Σ (O − E)² / E
where O is the observed frequency, E is the expected frequency, and the sum runs over all cells.
For the given example, calculate the contribution of each cell and sum the contributions to get the chi-square value. In the table below, the value in parentheses is the contribution of each cell.
Gender | iPhone | Samsung | Other | Row Total |
Male | (30 − 20)²/20 (5) | (40 − 36)²/36 (0.44) | (10 − 24)²/24 (8.17) | 80 |
Female | (20 − 30)²/30 (3.33) | (50 − 54)²/54 (0.30) | (50 − 36)²/36 (5.44) | 120 |
Column Total | 50 | 90 | 60 | 200 |
χ2 = 5 + 0.44 + 8.17 + 3.33 + 0.30 + 5.44
χ2 ≈ 22.69 (using the unrounded contributions)
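The cell-by-cell calculation can be sketched in Python as follows. The variable names are illustrative; the observed and expected counts are those of the example. Because the contributions are kept at full precision rather than rounded cell by cell, the total comes out at about 22.69.

```python
observed = [[30, 40, 10], [20, 50, 50]]
expected = [[20, 36, 24], [30, 54, 36]]

# Contribution of each cell: (O - E)^2 / E.
contributions = [
    [(o - e) ** 2 / e for o, e in zip(observed_row, expected_row)]
    for observed_row, expected_row in zip(observed, expected)
]

# The chi-square statistic is the sum of all cell contributions.
chi_square = sum(sum(row) for row in contributions)

for row in contributions:
    print([round(value, 2) for value in row])
print(round(chi_square, 2))
```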
Step 7: Determine Degrees of Freedom
Degrees of freedom (df) is calculated as df = (r−1) × (c−1),
Where, r is the number of rows and c is the number of columns.
For our example,
df = (2−1) × (3−1) = 2.
Step 8: Find Critical Value or P-value
Using a chi-square distribution table or statistical software, find the critical value corresponding to the degrees of freedom and the chosen significance level, or the p-value corresponding to the degrees of freedom and the calculated statistic.
For df = 2 at a significance level of 0.05, the critical chi-square value is approximately 5.99.
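When no table or statistics package is at hand, the case df = 2 can be handled with the standard library alone, because for exactly two degrees of freedom the upper-tail probability of the chi-square distribution has the closed form exp(−x/2). The sketch below relies on that special case; it does not apply to other degrees of freedom, and the helper names `chi2_upper_tail_df2` and `chi2_critical_value_df2` are hypothetical.

```python
import math


def chi2_upper_tail_df2(x):
    """P(X >= x) for a chi-square variable with exactly 2 degrees of freedom."""
    return math.exp(-x / 2)


def chi2_critical_value_df2(alpha):
    """Value whose upper-tail probability equals alpha (2 degrees of freedom only)."""
    return -2 * math.log(alpha)


# Recompute the statistic from the observed table so the sketch is self-contained.
observed = [[30, 40, 10], [20, 50, 50]]
row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row_total in enumerate(row_totals):
    for j, column_total in enumerate(column_totals):
        expected = row_total * column_total / grand_total
        chi_square += (observed[i][j] - expected) ** 2 / expected

alpha = 0.05
critical_value = chi2_critical_value_df2(alpha)
p_value = chi2_upper_tail_df2(chi_square)

print(round(critical_value, 2))
print(p_value)
```

The critical value printed here is about 5.99, and the p-value is far below 0.05.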
Step 9: Make a Decision
Compare the calculated chi-square value with the critical value or use the p-value to determine whether to reject the null hypothesis.
If the calculated chi-square value is greater than the critical value or the p-value is less than the significance level, reject the null hypothesis.
In our example, the calculated chi-square value (approximately 22.69) is greater than the critical value (5.99), hence we reject the null hypothesis and conclude that there is a statistically significant association between gender and smartphone brand preference.

II. Goodness-of-Fit Test
χ2 provides a way to test how well a sample of data matches the (known or assumed) characteristics of the larger population that the sample is intended to represent. This is known as goodness of fit. If the sample data do not fit the expected properties of the population that we are interested in, then we would not want to use this sample to draw conclusions about the larger population.
Problem
Imagine you are studying the wing colour variation in a population of 120 Monarch butterflies in a local meadow to determine if the wing colour follows the ratio 3:1:2 for orange, black, and white variations, respectively.
Solution
Step 1: Formulate Hypotheses
Null Hypothesis (H0): The proportion of Monarch butterflies with each colour variant follows the 3:1:2 ratio (50%, about 16.7%, and about 33.3%).
Alternative Hypothesis (Ha): The proportion of Monarch butterflies with each colour variant deviates significantly from the suspected ratio.
Step 2: Collect Data
Color | Observed Frequency |
Orange | 60 |
Black | 25 |
White | 35 |
Step 3: Set Significance Level
Choose a significance level (commonly 0.05) to determine if the observed deviation from the expected ratio is statistically significant.
Step 4: Calculate Expected Frequencies
Based on the 3:1:2 ratio, the expected frequency of each category is its expected proportion multiplied by the sample size, using the following formula.
Expected Frequency = expected proportion × 120
Thus, for orange, the expected frequency will be (3/6) × 120 = 60, for black the expected frequency will be (1/6) × 120 = 20, and for white the expected frequency will be (2/6) × 120 = 40.
The following table gives the calculated expected frequencies.
Color | Observed Frequency | Expected Frequency |
Orange | 60 | 60 |
Black | 25 | 20 |
White | 35 | 40 |
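Converting a ratio into expected frequencies is a short calculation: each part of the ratio is divided by the sum of the parts and multiplied by the sample size. A minimal sketch, with illustrative variable names:

```python
ratio = {"Orange": 3, "Black": 1, "White": 2}
sample_size = 120

total_parts = sum(ratio.values())  # 3 + 1 + 2 = 6

# Expected frequency = (part / total parts) * sample size.
expected = {
    colour: sample_size * parts / total_parts
    for colour, parts in ratio.items()
}

print(expected)
```

The proportions behind these counts are one-half, one-sixth, and one-third, which always sum to 1, so the expected frequencies sum to the sample size.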
Step 6: Calculate Chi-Square (χ2) Statistic
χ2 = Σ (O − E)² / E
where O is the observed frequency, E is the expected frequency, and the sum runs over all categories.
For the given example, calculate the contribution of each category and sum the contributions to get the chi-square value. In the table below, the value in parentheses is the contribution of each category.
Color | Chi-Square Value |
Orange | (60 − 60)²/60 (0) |
Black | (25 − 20)²/20 (1.25) |
White | (35 − 40)²/40 (0.625) |
χ2 = 0 + 1.25 + 0.625
χ2 = 1.875
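The same sum can be written per category in Python, which makes it easy to see which colour contributes most to the statistic. The dictionary layout is an illustrative choice; the counts are those of the example.

```python
observed = {"Orange": 60, "Black": 25, "White": 35}
expected = {"Orange": 60, "Black": 20, "White": 40}

# Contribution of each category: (O - E)^2 / E.
contributions = {
    colour: (observed[colour] - expected[colour]) ** 2 / expected[colour]
    for colour in observed
}
chi_square = sum(contributions.values())

for colour, value in contributions.items():
    print(colour, value)
print("chi-square:", chi_square)
```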
Step 7: Determine Degrees of Freedom
Degrees of freedom (df) is calculated as df = n − 1,
Where, n is the number of categories.
For our example,
df = 3 – 1 = 2.
Step 8: Find the Critical Value or P-value
Using a chi-square distribution table or statistical software, find the critical value corresponding to the degrees of freedom and the chosen significance level, and the p-value corresponding to the degrees of freedom and the calculated statistic.
For df = 2 at a significance level of 0.05, the critical chi-square value is approximately 5.99, and the p-value for χ2 = 1.875 is approximately 0.392.
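As a rough cross-check on the table-based p-value, the null hypothesis can also be simulated directly: draw many samples of 120 butterflies under the 3:1:2 ratio and count how often the simulated statistic is at least as large as the observed one. This Monte Carlo approach is not part of the procedure described here; it is a sketch only, and the seed, the number of simulations, and all names in the code are arbitrary assumptions.

```python
import random


def chi_square_stat(observed_counts, expected_counts):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed_counts, expected_counts))


colours = ["orange", "black", "white"]
ratio = [3, 1, 2]
observed = [60, 25, 35]

n = sum(observed)
expected = [n * part / sum(ratio) for part in ratio]
observed_stat = chi_square_stat(observed, expected)

rng = random.Random(0)   # fixed seed so the run is repeatable (arbitrary choice)
n_simulations = 5000     # arbitrary; more simulations give a steadier estimate

at_least_as_extreme = 0
for _ in range(n_simulations):
    # Draw one sample of n butterflies under the null hypothesis.
    sample = rng.choices(colours, weights=ratio, k=n)
    counts = [sample.count(colour) for colour in colours]
    # Small tolerance guards against floating-point ties with the observed value.
    if chi_square_stat(counts, expected) >= observed_stat - 1e-9:
        at_least_as_extreme += 1

p_estimate = at_least_as_extreme / n_simulations
print(observed_stat, p_estimate)
```

The estimate varies with the seed and the number of simulations, so it should land near the chi-square p-value rather than match it exactly.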
Step 9: Make a Decision
Compare the calculated chi-square value with the critical value or use the p-value to determine whether to reject the null hypothesis.
If the calculated chi-square value is greater than the critical value or the p-value is less than the significance level, reject the null hypothesis.
In our example, the calculated chi-square value (1.875) is less than the critical value (5.99) and the p-value (approximately 0.392) is greater than the significance level, hence we fail to reject the null hypothesis: at the 5% significance level, there is not enough evidence to conclude that the proportions of Monarch butterflies with each colour variant deviate from the suspected 3:1:2 ratio.