One of the most well-known statistical tests for analyzing differences between the means of given groups is the ANOVA (analysis of variance) test. While ANOVA is a great tool, it assumes that the data in question follows a normal distribution. What if your data doesn’t follow a normal distribution, or your sample size is too small to assess normality? That’s where the Kruskal-Wallis test comes in.
The Kruskal-Wallis test can be thought of as the nonparametric equivalent to ANOVA. This test determines if independent groups have the same mean on ranks; instead of using the data values themselves, a rank is assigned to each data point and those ranks are used to determine if the data in each group originates from the same distribution. Essentially this test determines if the groups have the same median.
As mentioned above, Kruskal-Wallis is a nonparametric test, meaning it makes no assumptions about the data’s parameters such as its mean, variance, etc. Because it does not rely on such parameters, it does not require the data to follow any particular distribution; this is why Kruskal-Wallis does not assume normally distributed data.
Kruskal-Wallis is typically used with three or more independent groups, but can be used with just two, and each group should have a sample size of 5 or more. To perform a Kruskal-Wallis test, we use the ranks of the data to calculate the test statistic, H, given by
\[H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)\]
where N is the total sample size, k is the number of groups we are comparing, \(R_i\) is the sum of ranks for group i, and \(n_i\) is the sample size of group i.
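As a rough illustration of this formula, the sketch below computes H directly from the pooled ranks. The function name kruskal_h and the toy numbers are made up for illustration, and the sketch assumes there are no tied values (scipy.stats.kruskal additionally applies a correction for ties).

```python
import numpy as np
from scipy.stats import rankdata


def kruskal_h(*groups):
    """Compute the Kruskal-Wallis H statistic (hypothetical helper, no tie correction)."""
    sizes = [len(g) for g in groups]          # n_i for each group
    N = sum(sizes)                            # total sample size
    ranks = rankdata(np.concatenate(groups))  # ranks of the pooled data
    # Split the pooled ranks back into their groups and sum each group (R_i)
    split_points = np.cumsum(sizes)[:-1]
    rank_sums = [r.sum() for r in np.split(ranks, split_points)]
    total = sum(R**2 / n for R, n in zip(rank_sums, sizes))
    return 12 / (N * (N + 1)) * total - 3 * (N + 1)


# Made-up toy values, used only to exercise the function
h = kruskal_h([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0])
print(round(h, 4))
```

Here the rank sums are 6, 15, and 24, so the function is evaluating the formula with N = 9 and three groups of size 3.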
We then compare H to a critical cutoff point determined by the chi-square distribution (chi-square is used because it is a good approximation of the distribution of H, especially if each group’s sample size is >= 5). If the H statistic is significant (H is larger than the cutoff) we reject the null hypothesis. If the H statistic is not significant (H is smaller than the cutoff) we fail to reject the null hypothesis. In this test the null hypothesis is that the medians of each group are the same, meaning that all groups come from the same distribution. The alternative hypothesis is that at least one of the groups has a different median, meaning at least one comes from a different distribution than the others.
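The decision rule just described can be sketched in a few lines. This is only an illustration: the helper names chi_square_cutoff and reject_null are hypothetical, the default significance level of 0.05 is an assumption, and the cutoff uses k - 1 degrees of freedom, where k is the number of groups.

```python
from scipy.stats import chi2


def chi_square_cutoff(k, alpha=0.05):
    """Critical chi-square value for k groups (k - 1 degrees of freedom)."""
    return chi2.ppf(1 - alpha, df=k - 1)


def reject_null(H, k, alpha=0.05):
    """Return True when H exceeds the cutoff, i.e. the result is significant."""
    return H > chi_square_cutoff(k, alpha)


# Cutoff for three groups at the 0.05 level
print(round(chi_square_cutoff(3), 2))
```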
Assumptions
 Ordinal Variables - the variable in question should be ordinal or continuous, i.e., have some kind of hierarchy to them
 Independence - each group should be independent from the others
 Sample size - each group must have a sample size of 5 or more. With a sample size in this range, the chi-square distribution approximates the distribution of the H statistic well.
How-To and Example (by hand)
The step-by-step process to calculate the H statistic is as follows:
Step 1: State your hypotheses
 Null Hypothesis: the medians (mean on ranks) are equal across the samples
 Alternative Hypothesis: at least one median is different
Step 2: Prepare and rank your data
 Arrange data from all groups together in one list in ascending order
 Give a rank to each of the data entries
Step 3: Sum the ranks for each group
Step 4: Calculate the test statistic, H
Step 5: Compare it to the critical cutoff, determined by the critical chi-square value
Step 6: Interpret your results
As an example, we will use data on antibody production after receiving a vaccine. A hospital administered three different vaccines to 6 individuals each and measured the antibody presence in their blood after a chosen time period. The data is as follows:
Vaccine  Antibodies (μg/ml) 

A  1232 
A  751 
A  339 
A  848 
A  447 
A  542 
–  – 
B  302 
B  57 
B  521 
B  278 
B  176 
B  201 
–  – 
C  839 
C  342 
C  473 
C  1128 
C  242 
C  475 
We want to determine how the three vaccines perform compared to each other. This can be quantified by determining if each vaccine causes the recipients to produce the same amount of antibodies. Essentially we are looking to determine if the antibody data for each vaccine originates from the same distribution. We have relatively small sample sizes, so we cannot reliably determine if the data is normally distributed; we therefore use the Kruskal-Wallis test.
Step 1:
Null Hypothesis \(H_0 =\) the vaccines cause the same amount of antibodies to be produced (all three groups originate from the same distribution and have the same median)
Alternative Hypothesis \(H_A =\) At least one of the vaccines causes a different amount of antibodies to be produced (at least one group originates from a different distribution and has a different median)
Step 2:
Here we organize our data into ascending order then give each a rank.
Vaccine  Antibodies (μg/ml)  Rank 

B  57  1 
B  176  2 
B  201  3 
C  242  4 
B  278  5 
B  302  6 
A  339  7 
C  342  8 
A  447  9 
C  473  10 
C  475  11 
B  521  12 
A  542  13 
A  751  14 
C  839  15 
A  848  16 
C  1128  17 
A  1232  18 
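The sorting and ranking in this step could be reproduced with NumPy along the following lines. This is a minimal sketch with illustrative variable names; it relies on the fact that this data set has no tied values, so each rank is just the position in the sorted list.

```python
import numpy as np

# Antibody measurements from the example, labeled by vaccine
vaccine = np.array(["A"] * 6 + ["B"] * 6 + ["C"] * 6)
antibodies = np.array([1232, 751, 339, 848, 447, 542,
                       302, 57, 521, 278, 176, 201,
                       839, 342, 473, 1128, 242, 475])

# Sort all 18 values in ascending order, carrying the group labels along
order = np.argsort(antibodies)
sorted_vaccine = vaccine[order]
sorted_antibodies = antibodies[order]

# With no tied values, the rank is the 1-based position in the sorted list
ranks = np.arange(1, len(antibodies) + 1)

for v, a, r in zip(sorted_vaccine, sorted_antibodies, ranks):
    print(v, a, r)
```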
Step 3:
Now we put our data back into their original groups and sum the ranks for each group.
Vaccine  Antibodies (μg/ml)  Rank 

A  1232  18 
A  751  14 
A  339  7 
A  848  16 
A  447  9 
A  542  13 
–  –  – 
B  302  6 
B  57  1 
B  521  12 
B  278  5 
B  176  2 
B  201  3 
–  –  – 
C  839  15 
C  342  8 
C  473  10 
C  1128  17 
C  242  4 
C  475  11 
Here, the sum of ranks for vaccine A is 77, the sum of ranks for vaccine B is 29, and the sum of ranks for vaccine C is 65.
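These rank sums can be double-checked with a short script. The sketch below uses scipy.stats.rankdata on the pooled values and then adds up the ranks per group; the dictionary layout and variable names are illustrative choices, not part of the original analysis.

```python
from scipy.stats import rankdata

groups = {
    "A": [1232, 751, 339, 848, 447, 542],
    "B": [302, 57, 521, 278, 176, 201],
    "C": [839, 342, 473, 1128, 242, 475],
}

# Pool the values, remembering which group each one came from
labels = [name for name, values in groups.items() for _ in values]
pooled = [x for values in groups.values() for x in values]
ranks = rankdata(pooled)

# Add up the ranks belonging to each group
# (converting to int is safe here only because there are no ties)
rank_sums = {name: 0 for name in groups}
for label, rank in zip(labels, ranks):
    rank_sums[label] += int(rank)

print(rank_sums)
```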
Step 4:
Now we are ready to calculate our test statistic H: \(H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)\). For our data,
\[N = 18\]
\[k = 3\]
\[R_i = 77, 29, 65\]
\[n_i = 6, 6, 6 \]
Plugging these in we get:
\[H = \frac{12}{18(18+1)} \left[\frac{77^2}{6} + \frac{29^2}{6} + \frac{65^2}{6}\right] - 3(18+1)\]
Working out the math gives us a test statistic of \[H = 7.29824\]
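For readers who want to verify the arithmetic, the same substitution can be written out in plain Python. The variable names are illustrative; the numbers are the rank sums and sample sizes from Step 3.

```python
N = 18            # total sample size
n = [6, 6, 6]     # sample size of each group
R = [77, 29, 65]  # sum of ranks for each group

# Sum of R_i^2 / n_i over the three groups
sum_term = sum(r**2 / n_i for r, n_i in zip(R, n))

# H = 12 / (N (N + 1)) * sum_term - 3 (N + 1)
H = 12 / (N * (N + 1)) * sum_term - 3 * (N + 1)
print(round(H, 4))
```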
Step 5:
Next we compare this H statistic to the critical cutoff: the corresponding chi-square value. We can determine the chi-square value by referencing a chi-square probabilities table.
We find the degrees of freedom by subtracting 1 from \(k\):
\[df = k - 1 = 3 - 1 = 2\]
Using this value and a significance level of 0.05 we find
\[\chi^2(2) = 5.99\]
The comparison between H = 7.29824 and \(\chi^2(2)\) = 5.99 gives
\[H > \chi^2(2).\]
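An equivalent way to make this comparison is to ask how much of the chi-square distribution lies above H. The sketch below assumes the chi-square approximation with 2 degrees of freedom and uses scipy.stats.chi2.sf for that upper-tail probability; comparing it to 0.05 leads to the same decision as comparing H to the cutoff.

```python
from scipy.stats import chi2

H = 7.29824   # test statistic from Step 4
df = 2        # k - 1 with k = 3 groups
alpha = 0.05  # chosen significance level

# Upper-tail probability of the chi-square distribution beyond H
p_value = chi2.sf(H, df)

print(round(p_value, 3), p_value < alpha)
```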
Step 6:
Finally we interpret our results. Since H is larger than the critical cutoff \(\chi^2(2)\), we reject the null hypothesis: the medians are not the same across all three groups, so at least one group has a different median than the others. This means that the three vaccines do not all perform equally; at least one vaccine causes its recipients to produce a different amount of antibodies than the others.
It’s important to note that Kruskal-Wallis can only tell us that at least one of the groups originates from a different distribution. It cannot tell us which group or groups differ.
How-To and Example (with Python)
The Python scipy.stats module has a function called kruskal(). Basically this function carries out the above calculation for us. This function takes two or more array-like objects as arguments and returns the H statistic and the p-value. Like most statistical software, the kruskal() function computes approximate p-values that are based on the chi-squared distribution. To refresh our memories, the p-value in this case is the probability of seeing differences between the groups at least as large as those we observed if the null hypothesis is true. If we have a small p-value, say less than 0.05, we have evidence against the null. Small p-values with Kruskal-Wallis lead us to reject the null hypothesis and say that at least one of our groups likely originates from a different distribution than the others.
Here we will use the same example data and use kruskal() to carry out the test. We enter the data into three separate arrays, one array for each group (in this case, each vaccine), because kruskal() treats each array argument as a separate group when computing the H statistic and the p-value.
from scipy import stats
import numpy as np
# Store the data from each vaccine (the group for this example) into its own array
d1 = np.array([1232, 751, 339, 848, 447, 542])
d2 = np.array([302, 57, 521, 278, 176, 201])
d3 = np.array([839, 342, 473, 1128, 242, 475])
# Conduct the Kruskal-Wallis test; the result holds both the H statistic and the p-value
result = stats.kruskal(d1, d2, d3)
print(result)
KruskalResult(statistic=7.298245614035082, pvalue=0.02601393801711558)
Here we see that the p-value is ~0.026, which is less than the cutoff of 0.05, so we reject the null hypothesis: the medians are not the same across all three groups, so at least one group has a different median than the others. This means that the vaccines do not perform equally well because the resulting antibody production is not the same for each vaccine. We draw the same conclusion as we did above when we performed the calculation ourselves!
Again we emphasize that the Kruskal-Wallis test can only tell us that at least one of the vaccines performs differently than the others. It cannot tell us which vaccine or vaccines differ. To determine which vaccine performs differently, we would need to conduct a post hoc test.
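The article does not say which post hoc test to use. As one possible sketch, and purely as an assumption on our part, the groups could be compared pairwise with Mann-Whitney U tests (scipy.stats.mannwhitneyu) using a Bonferroni-adjusted significance level; Dunn's test is another commonly used option. The variable names below are illustrative.

```python
from itertools import combinations
from scipy.stats import mannwhitneyu

groups = {
    "A": [1232, 751, 339, 848, 447, 542],
    "B": [302, 57, 521, 278, 176, 201],
    "C": [839, 342, 473, 1128, 242, 475],
}

# All pairs of vaccines: (A, B), (A, C), (B, C)
pairs = list(combinations(groups, 2))

# Bonferroni adjustment: divide the overall level by the number of comparisons
adjusted_alpha = 0.05 / len(pairs)

p_values = {}
for a, b in pairs:
    result = mannwhitneyu(groups[a], groups[b], alternative="two-sided")
    p_values[(a, b)] = result.pvalue
    print(a, b, round(result.pvalue, 4), result.pvalue < adjusted_alpha)
```

A pair whose p-value falls below the adjusted level would be flagged as differing; with only six observations per group, such comparisons have limited power.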
Summary

Kruskal-Wallis tests if groups originate from the same distribution by determining if the groups have the same median

Kruskal-Wallis is a nonparametric test, meaning it does not assume normally distributed data

The test statistic is the H statistic given by \(H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)\)

Compare the H statistic to the critical cutoff given by the \(\chi^2\) distribution (with df = k - 1 and chosen probability)
 H > \(\chi^2\) –> reject the null hypothesis
 H < \(\chi^2\) –> fail to reject the null hypothesis

Use the Python scipy.stats function kruskal() to compute this quickly
 p < 0.05 : reject the null hypothesis
 p >= 0.05 : fail to reject the null hypothesis

Reject the null hypothesis: at least one group has a different median, so we have evidence that at least one group originates from a different distribution

Fail to reject the null hypothesis: we cannot reject the possibility that all groups originate from the same distribution

Kruskal-Wallis can only tell us if the groups originate from the same distribution. If we reject the null hypothesis, we can only conclude that one or more of the groups has a different median (comes from a different distribution). The test cannot tell us which groups originate from a different distribution.
Samantha Lomuscio
StatLab Associate
University of Virginia Library
December 07, 2021
For questions or clarifications regarding this article, contact statlab@virginia.edu.