In statistics we often estimate parameters and then, using some hypothesis test, decide whether or not to reject a null hypothesis. In these tests we almost always rely on the known distribution of a so-called test statistic. Let us start by looking at an example of such a test. We first assume that our sample follows a normal distribution with a known variance, and we want to test whether the mean of this normal distribution is equal to $0$. Hence, we conduct the following test:
$$H_0: \mu = 0 \qquad \text{against} \qquad H_1: \mu \neq 0$$
In this case we have a well-known test statistic and can easily obtain either the p-value or the critical region for a given significance level; a minimal sketch of such a test is shown below. The assumption of normality is particularly common in classical statistical tests, but it cannot always be made. For example, suppose we have a sample and would like to know which distribution it was drawn from. How can we test anything about such a sample? For that we can use the Kolmogorov–Smirnov test.
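As a quick illustration, here is a minimal sketch of the one-sample z-test described above in Python. The sample, the known standard deviation and the significance level are of course just placeholder choices.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical sample, assumed to come from a normal distribution
# with known standard deviation sigma.
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.2, scale=1.0, size=50)

sigma = 1.0   # known standard deviation (assumed)
alpha = 0.05  # significance level
n = len(sample)

# Test statistic of the one-sample z-test for H0: mu = 0.
z = np.sqrt(n) * sample.mean() / sigma

# Two-sided p-value and critical value.
p_value = 2 * (1 - norm.cdf(abs(z)))
critical_value = norm.ppf(1 - alpha / 2)

print(f"z = {z:.3f}, p-value = {p_value:.3f}")
print("Reject H0" if abs(z) > critical_value else "Do not reject H0")
```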
The empirical CDF of a sample
To get started with the Kolmogorov–Smirnov test, we first need some tools. The empirical cumulative distribution function, or ECDF, is the main tool that we will need. Given a sample $X_1, \dots, X_n$, the ECDF is defined as
$$\hat{F}_n(t) = \frac{\text{the number of elements} \leq t}{n} = \frac{1}{n} \sum_{i = 1}^n \mathbf{1}_{X_i \leq t}$$
Here $\mathbf{1}_A$ denotes the indicator function of the event $A$. Notice that the ECDF is a step function, and that it looks very similar to the CDF of a distribution: by definition, for a random variable $Y$ the CDF is given by $$F_Y(y) = P(Y \leq y)$$
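As a small sketch, with an arbitrary made-up sample, the ECDF from the definition above could be computed in Python as follows.

```python
import numpy as np

def ecdf(sample, t):
    """Empirical CDF: fraction of sample points less than or equal to t."""
    sample = np.asarray(sample)
    return np.mean(sample <= t)

# Hypothetical sample, purely for illustration.
sample = [1.2, -0.4, 0.3, 2.1, 0.0]

for t in [-1.0, 0.0, 0.5, 3.0]:
    print(f"F_hat({t}) = {ecdf(sample, t)}")
```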
The most important properties of the ECDF are its asymptotic ones. By the Glivenko–Cantelli theorem, the ECDF converges almost surely (even uniformly in $t$) to the actual CDF of the distribution that the sample was drawn from. This is exactly the property that the Kolmogorov–Smirnov test makes use of.
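The following sketch illustrates this convergence for a standard normal sample: as the sample size grows, the deviation between the ECDF and the true CDF shrinks. The sample sizes are of course arbitrary.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)

for n in [10, 100, 1000, 10000]:
    sample = np.sort(rng.standard_normal(n))
    # ECDF evaluated at the sorted sample points.
    ecdf_values = np.arange(1, n + 1) / n
    # Largest deviation (at the sample points) between the ECDF
    # and the true standard normal CDF.
    deviation = np.max(np.abs(ecdf_values - norm.cdf(sample)))
    print(f"n = {n:>6}: max deviation = {deviation:.4f}")
```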
The test
Now that we have the tools needed for the Kolmogorov–Smirnov test, we can define it. We will perform the following test:
$$H_0: F = F_0 \qquad \text{against} \qquad H_1: F \neq F_0$$
Here $F$ is the true cumulative distribution function and $F_0$ is the cumulative distribution function that we want to test for a match. This means that we first need to make an assumption about what the true distribution is, so that we can then quantify how strongly the sample contradicts the claim that $F_0$ is the true cumulative distribution function. As with any test, we use a test statistic for this. The test statistic of the Kolmogorov–Smirnov test, for a given cumulative distribution function $F_0(x)$, is as follows:
$$D_n = \sup_x |\hat{F}_n(x) - F_0(x)|$$
Here $\sup_x$ denotes the supremum of this distance over all possible values of $x$, and, most importantly, $\hat{F}_n(x)$ is the empirical CDF of the sample. Under the null hypothesis this test statistic converges to zero as the sample size grows: with an infinitely large sample the empirical cumulative distribution function coincides with the true cumulative distribution function.
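For a continuous $F_0$, the supremum is attained just before or at one of the jumps of the ECDF, i.e. at the sample points, so $D_n$ can be computed exactly. A small sketch, with the standard normal CDF as an example choice of $F_0$:

```python
import numpy as np
from scipy.stats import norm

def ks_statistic(sample, cdf):
    """Compute D_n = sup_x |F_n(x) - F_0(x)| for a continuous F_0."""
    x = np.sort(np.asarray(sample))
    n = len(x)
    f0 = cdf(x)
    # The ECDF jumps from (i-1)/n to i/n at the i-th order statistic,
    # so it suffices to check the deviations at these points.
    d_plus = np.max(np.arange(1, n + 1) / n - f0)
    d_minus = np.max(f0 - np.arange(0, n) / n)
    return max(d_plus, d_minus)

rng = np.random.default_rng(1)
sample = rng.standard_normal(30)   # hypothetical sample
print(ks_statistic(sample, norm.cdf))
```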
Just like in any other test in statistics, we start by specifying a significance level $\alpha$. Then, using the distribution of the Kolmogorov–Smirnov test statistic under the null hypothesis, we can find a critical value, and we reject the null hypothesis if the observed value of the test statistic is larger than this critical value. To not make things too complex, I will not go into the distribution of the test statistic under the null hypothesis; if you are interested in an in-depth explanation, this Wikipedia page does a very good job of explaining it.
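In practice one rarely derives the critical value by hand: scipy, for instance, implements the test and reports both the statistic and the p-value. A minimal sketch, where the sample, the hypothesized distribution and $\alpha$ are just example choices:

```python
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(7)
sample = rng.standard_normal(200)   # hypothetical sample
alpha = 0.05

# Test H0: the sample comes from N(0, 1) (a fully specified distribution).
result = kstest(sample, "norm", args=(0, 1))

print(f"D_n = {result.statistic:.4f}, p-value = {result.pvalue:.4f}")
if result.pvalue < alpha:
    print("Reject H0: the data are unlikely to come from N(0, 1).")
else:
    print("Do not reject H0.")
```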
The limitations of the test
One of the most important limitations of the Kolmogorov–Smirnov test is that it is only applicable to samples assumed to have a continuous distribution. Hence, if the sample is drawn from, for example, a binomial distribution, we cannot use this test to find out whether that is truly the case.
The second limitation is that the test is very sensitive at the centre of the distribution and much less sensitive in the tails. What does this mean? If we have a sample from, for example, a normal distribution, most of the observations are likely to lie near the mean, i.e. the centre of the distribution, and very few (if any) fall in the tails. As a result, the test is much more accurate near the centre of the distribution than in the tails. To fully paint the picture, we will often need a very large sample (how large depends, of course, on the underlying true distribution) to make sure we have enough data about the tails of the distribution.
The last limitation, which may well be the hardest to work around, is that the test needs a full specification of the distribution we want to test for. It is not enough to specify, for example, that we want to test whether a given sample stems from a beta distribution; we also need to specify all of its parameters beforehand. In fact, the test is no longer valid if the parameters of the distribution are estimated from the same sample using other statistical methods. Hence, if the parameters have to be estimated, one typically has to resort to running a lot of simulations, as sketched below.
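One common simulation-based workaround is a parametric bootstrap (in the spirit of the Lilliefors test for the normal case). The sketch below assumes we test for a normal distribution whose mean and standard deviation are estimated from the sample; the number of simulations is arbitrary.

```python
import numpy as np
from scipy.stats import kstest

def ks_with_estimated_params(sample, n_sim=2000, seed=0):
    """Parametric bootstrap p-value for the KS test when the normal
    parameters (mean, std) are estimated from the sample itself."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample)
    n = len(sample)

    # Observed statistic, using parameters estimated from the sample.
    mu_hat, sigma_hat = sample.mean(), sample.std(ddof=1)
    d_obs = kstest(sample, "norm", args=(mu_hat, sigma_hat)).statistic

    # Simulate samples from the fitted distribution, re-estimate the
    # parameters each time and recompute the statistic.
    d_sim = np.empty(n_sim)
    for i in range(n_sim):
        sim = rng.normal(mu_hat, sigma_hat, size=n)
        m, s = sim.mean(), sim.std(ddof=1)
        d_sim[i] = kstest(sim, "norm", args=(m, s)).statistic

    # p-value: fraction of simulated statistics at least as extreme.
    return d_obs, np.mean(d_sim >= d_obs)

rng = np.random.default_rng(3)
sample = rng.normal(1.5, 2.0, size=100)   # hypothetical sample
d_obs, p_value = ks_with_estimated_params(sample)
print(f"D_n = {d_obs:.4f}, bootstrap p-value = {p_value:.3f}")
```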
Conclusion
In conclusion, the Kolmogorov–Smirnov test is used to compare a sample to a reference distribution. That is, we hypothesize that the sample follows a specific distribution, and we then investigate this hypothesis using the Kolmogorov–Smirnov test. In essence, the test measures the deviation of the empirical CDF (which is constructed from the sample) from the theoretical CDF of the distribution assumed under the null hypothesis. Even though the test has its limitations, it can still be very useful in statistics.