Chi-squared distributions are among the most important distributions in statistics. In probability theory and statistics, the chi-squared distribution (also chi-square or χ²-distribution) with k degrees of freedom is the distribution of a sum of the squares of k independent standard normal random variables.
The chi-square distribution is a special case of the gamma distribution and is one of the most widely used probability distributions in inferential statistics.
If N1, …, Nk are independent, standard normal random variables, then the sum of their squares,

Q = N1² + N2² + ⋯ + Nk²,

is distributed according to the chi-squared distribution with k degrees of freedom. This is usually denoted as

Q ~ χ²(k) or Q ~ χ²_k.
The chi-squared distribution has one parameter: k, a positive integer that specifies the number of degrees of freedom. There are, of course, infinitely many possible values for the degrees of freedom.
The probability density function of the χ² distribution with k degrees of freedom is given by

f(x; k) = x^(k/2 − 1) e^(−x/2) / (2^(k/2) Γ(k/2)) for x > 0,

and f(x; k) = 0 otherwise, where Γ denotes the gamma function.
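As an illustrative sketch (my own, not from the original text), the chi-squared density with k degrees of freedom can be evaluated directly with Python's standard library; the function name chi2_pdf is an assumption of this example.

```python
import math

def chi2_pdf(x, k):
    """Density of the chi-squared distribution with k degrees of freedom.

    f(x; k) = x^(k/2 - 1) * exp(-x/2) / (2^(k/2) * Gamma(k/2)) for x > 0.
    """
    if x <= 0:
        return 0.0
    return x ** (k / 2 - 1) * math.exp(-x / 2) / (2 ** (k / 2) * math.gamma(k / 2))

# For k = 2 the density reduces to the exponential density (1/2) * exp(-x/2).
print(chi2_pdf(2.0, 2))  # equals 0.5 * exp(-1)
```

Note that Γ(k/2) is available as `math.gamma`, so no external library is needed.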
The chi-squared distribution is seldom applied to the direct modeling of natural phenomena. It is used so extensively in hypothesis testing primarily because of its relationship to the normal distribution.
For these hypothesis tests, as the sample size, n, increases, the sampling distribution of the test statistic approaches the normal distribution, thanks to the central limit theorem. Because the test statistic is asymptotically normally distributed, provided the sample size is sufficiently large, the distribution used for hypothesis testing may be approximated by a normal distribution. The simplest chi-squared distribution is the square of a standard normal distribution. So wherever a normal distribution could be used for a hypothesis test, a chi-squared distribution could be used.
A chi-squared distribution constructed by squaring a single standard normal distribution is said to have 1 degree of freedom. Thus, as the sample size for a hypothesis test increases, the distribution of the test statistic approaches a normal distribution, and the distribution of the square of the test statistic approaches a chi-squared distribution. Just as extreme values of the normal distribution have low probability (and give small p-values), extreme values of the chi-squared distribution have low probability.
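To make the last point concrete, here is a small sketch (my own illustration, not from the original text) of the equivalence between a two-sided normal test and a chi-squared test with 1 degree of freedom: P(|Z| > z) = P(Z² > z²). Both tail probabilities can be written with the complementary error function from the standard library.

```python
import math

def normal_two_sided_p(z):
    """P(|Z| > z) for a standard normal Z."""
    return math.erfc(abs(z) / math.sqrt(2))

def chi2_sf_1df(x):
    """P(X > x) for X ~ chi-squared with 1 degree of freedom.

    Since X has the law of Z^2, P(X > x) = P(|Z| > sqrt(x)).
    """
    return math.erfc(math.sqrt(x / 2))

z = 1.96
print(normal_two_sided_p(z))  # about 0.05
print(chi2_sf_1df(z ** 2))    # the same value
```

The two p-values agree exactly, which is why either distribution can serve as the reference distribution for the same test.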
Finally, it’s also interesting to notice how, as the following theorems illustrate, the moment generating function, mean, and variance of the chi-square distribution are just straightforward extensions of those for the gamma distribution.
| Theorem. Let X be a chi-square random variable with r degrees of freedom. Then, the moment generating function of X is:
M(t) = 1 / (1 − 2t)^(r/2) for t < ½. |
Proof. The moment generating function of a gamma random variable is: M(t) = 1 / (1 − θt)^α for t < 1/θ.
The proof is therefore straightforward by substituting 2 in for θ and r/2 in for α.
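As a quick sanity check (my own sketch, not part of the original proof), E[e^(tX)] can be estimated by simulating X as a sum of r squared standard normals and compared with 1 / (1 − 2t)^(r/2).

```python
import math
import random

random.seed(0)

r, t = 3, 0.2   # degrees of freedom and an evaluation point with t < 1/2
n = 200_000     # number of simulated chi-squared draws

# Simulate X ~ chi-squared(r) as a sum of r squared standard normals,
# then average e^(tX) to estimate the moment generating function at t.
est = sum(
    math.exp(t * sum(random.gauss(0, 1) ** 2 for _ in range(r)))
    for _ in range(n)
) / n

exact = 1 / (1 - 2 * t) ** (r / 2)
print(est, exact)  # the two values should be close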
| Theorem. Let X be a chi-square random variable with r degrees of freedom. Then, the mean of X is:
μ = E(X) = r. That is, the mean of X is the number of degrees of freedom. |
Proof. The mean of a gamma random variable is: μ = E(X) = αθ
The proof is again straightforward by substituting 2 in for θ and r/2 in for α.
| Theorem. Let X be a chi-square random variable with r degrees of freedom. Then, the variance of X is:
σ² = Var(X) = 2r. That is, the variance of X is twice the number of degrees of freedom. |
Proof. The variance of a gamma random variable is: σ² = Var(X) = αθ²
The proof is again straightforward by substituting 2 in for θ and r/2 in for α.
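The two theorems above can also be checked empirically with a short simulation (my own sketch, not from the original text): draws from a chi-squared distribution with r degrees of freedom should have sample mean near r and sample variance near 2r.

```python
import random

random.seed(1)

r = 4        # degrees of freedom
n = 200_000  # number of simulated draws

# Each chi-squared(r) draw is a sum of r squared standard normals.
draws = [sum(random.gauss(0, 1) ** 2 for _ in range(r)) for _ in range(n)]

mean = sum(draws) / n
var = sum((x - mean) ** 2 for x in draws) / n

print(mean)  # close to r = 4
print(var)   # close to 2r = 8
```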
One last interesting thing to notice is that Lancaster showed the connections among the binomial, normal, and chi-squared distributions, as follows.
We’ve already seen how both De Moivre and Laplace established that a binomial distribution could be approximated by a normal distribution. Specifically, they showed the asymptotic normality of the random variable

χ = (m − Np) / √(Npq),

where m is the observed number of successes in N trials, the probability of success is p, and q = 1 − p.
Squaring both sides of the equation gives

χ² = (m − Np)² / (Npq).
Using N = Np + N(1 − p), N = m + (N − m), and q = 1 − p, this equation simplifies to

χ² = (m − Np)² / (Np) + ((N − m) − Nq)² / (Nq).
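This simplification can be verified numerically (a sketch of my own; the values of N, p, and m below are invented for the example): the squared De Moivre–Laplace variable equals the two-term sum over the success and failure cells.

```python
N, p = 100, 0.3
q = 1 - p
m = 37  # hypothetical observed number of successes

# Squared De Moivre–Laplace variable: (m - Np)^2 / (Npq)
lhs = (m - N * p) ** 2 / (N * p * q)

# Two-cell form: one term for successes, one for failures
rhs = (m - N * p) ** 2 / (N * p) + ((N - m) - N * q) ** 2 / (N * q)

print(lhs, rhs)  # the two expressions agree
```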
The expression is of the form that Pearson would generalize to the form:

χ² = Σ_{i=1}^{n} (O_i − E_i)² / E_i

where

χ² = Pearson’s cumulative test statistic, which asymptotically approaches a χ² distribution;
O_i = the number of observations of type i;
E_i = Np_i = the expected (theoretical) frequency of type i, asserted by the null hypothesis that the fraction of type i in the population is p_i;
n = the number of cells in the table.
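Pearson’s statistic is simple to compute directly. As a minimal sketch (my own; the coin-flip counts below are invented for the example):

```python
def pearson_chi2(observed, expected):
    """Pearson's cumulative test statistic: sum of (O_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical example: 100 flips of a coin assumed fair under the null.
observed = [44, 56]  # observed heads and tails
expected = [50, 50]  # N * p_i under the null hypothesis
stat = pearson_chi2(observed, expected)
print(stat)  # (44-50)^2/50 + (56-50)^2/50 = 1.44
```

With two cells there is 1 degree of freedom, so this statistic would be compared against the χ² distribution with 1 degree of freedom.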
