Kruskal-Wallis Test: Purpose, Scope, Assumptions, Examples, Python Implementation
The Kruskal-Wallis test is a non-parametric method for evaluating whether samples come from the same distribution. It is used to compare more than two independent (unrelated) samples. One-way analysis of variance (ANOVA) is its parametric equivalent.
1.1 What would be a good Business Use case?
Let’s measure the impact of a campaign rolled out by a pharma company for a newly launched drug, with 1,550 targets and 500 holdouts. We looked at the prescription-behavior distribution and found it non-normal (skewed) but similarly shaped for each group (targets and holdouts). We cannot perform ANOVA; hence we apply a non-parametric test, Kruskal-Wallis.
Since Kruskal Wallis is a non-parametric test, there is no assumption that the data is normally distributed (unlike ANOVA).
- The null hypothesis is that the populations from which the samples originate have the same median.
- The Kruskal-Wallis test is most commonly used when there is one attribute (grouping) variable and one measurement variable, and the measurement variable does not meet the assumptions of ANOVA (normality and homoscedasticity).
- Like most non-parametric tests, it is performed on ranked data, so the measurement observations are converted to their ranks using the overall data set: the smallest or the lowest value gets a rank of 1, the next smallest gets a rank of 2, the following a rank of 3, and so on. In the case of a tie, an average rank is considered.
- The loss of information in substituting ranks for the original values makes this a less powerful test than ANOVA, so ANOVA should be used if the data meet the assumptions.
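The ranking scheme described above (smallest value gets rank 1, tied values share the average of the ranks they would otherwise occupy) is what `scipy.stats.rankdata` implements by default; a quick illustration:

```python
from scipy.stats import rankdata

# The two 5s tie for ranks 2 and 3, so each receives (2 + 3) / 2 = 2.5.
values = [7, 2, 5, 5, 9]
print(rankdata(values))  # [4.  1.  2.5 2.5 5. ]
```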
The Kruskal-Wallis test’s null hypothesis is sometimes stated as the group medians being equal. However, this is only accurate if each group’s distribution is assumed to have the same shape. When the distributions differ in shape, the Kruskal-Wallis test can reject the null hypothesis even though the medians are the same.
Groups of different sizes can be examined using the Kruskal-Wallis statistic. The Kruskal-Wallis test, unlike the comparable one-way analysis of variance, does not assume a normal distribution because it is a non-parametric procedure. The test does, however, presume that each group’s distribution is identically shaped and scaled, except for any variations in medians.
Kruskal-Wallis can be used to analyze whether the test and control groups performed differently. When the data is skewed (non-normal), the test will tell whether the two groups differ, without establishing any causation; it will not suggest the reason for the difference in behavior.
4.1 How Does the Test Work?
Kruskal-Wallis works by ranking all observations, starting from 1 (the smallest value). The ranking is done over all data points, irrespective of the group to which they belong. Tied values receive the average of the ranks they would have received had they not been tied.
Once all the observations have been assigned a rank based on the analysis variable (the number of prescriptions written), they are divided back into groups based on their target/holdout status. Each group’s mean rank is then calculated and compared.
The target group is expected to have a higher mean rank than the holdouts, since the initiative or promotional effort is rolled out to this group. With a significant p-value, the target group is performing better than the holdouts. The caveat is that the target group’s average rank can be inflated by outliers, i.e., a few doctors writing many more scripts than others. Hence we always look at the group medians alongside the p-value obtained from Kruskal-Wallis to validate or refute our hypothesis.
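This comparison can be sketched in a few lines: pool the data, rank it, then split the ranks back by group and average them. The prescription counts below are made up for illustration only.

```python
import numpy as np
from scipy.stats import rankdata

# Hypothetical prescription counts (illustrative only, not real data)
target = [12, 30, 7, 45, 9, 6, 25]
holdout = [5, 8, 4, 10, 6]

pooled = np.concatenate([target, holdout])
ranks = rankdata(pooled)  # ranks over the combined sample, ties averaged

target_ranks = ranks[: len(target)]
holdout_ranks = ranks[len(target):]

print("target  mean rank:", target_ranks.mean(), "median:", np.median(target))
print("holdout mean rank:", holdout_ranks.mean(), "median:", np.median(holdout))
```

Looking at the mean rank and the median together, as the paragraph above suggests, guards against a handful of heavy prescribers driving the result.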
Let Ni (i = 1, 2, …, g) represent the sample size of each of the g groups (here, the number of doctors in each group). Let ri be the sum of the ranks for group i, and ri’ the average rank of group i. The Kruskal-Wallis test statistic is then calculated as follows:
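The formula itself appears to have been lost in republication. In the notation above, with N = N1 + … + Ng the total number of observations, the standard form of the statistic is:

```latex
H = \frac{12}{N(N+1)} \sum_{i=1}^{g} \frac{r_i^2}{N_i} - 3(N+1)
```

Equivalently, H = 12/(N(N+1)) × Σ Ni (ri’ − (N+1)/2)², since the average of all N ranks is (N+1)/2. When ties are present, H is divided by the correction factor 1 − Σ(t³ − t)/(N³ − N), where t is the number of observations in each group of tied values.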
When the null hypothesis of identical populations is true, this statistic approximately follows a chi-square distribution with g − 1 degrees of freedom. The null hypothesis of equal population medians is rejected if the test statistic exceeds the critical chi-squared value. For the approximation to be accurate, each Ni should be at least 5 (i.e., at least five observations per group).
Using a chi-squared distribution table, we can find the critical chi-squared value at g − 1 degrees of freedom and the desired significance level. Alternatively, we can examine the p-value to judge the significance of the results.
4.2 Run the H Test by Hand
Let’s assume that a pharma company wants to understand whether three doctor segments have different patient volumes (Stephanie Glen, n.d.). E.g.:
Key Opinion Leaders/KOL (Patient Volume in a Month): 23, 42, 55, 66, 78
Specialists/SPE (Patient Volume in a Month): 45, 56, 60, 70, 72
General Practitioners/GPs (Patient Volume in a Month): 18, 30, 34, 41, 44
4.2.1 Arrange the data in ascending order after combining them into one set
18 23 30 34 41 42 44 45 55 56 60 66 70 72 78
4.2.2 Rank the sorted data points. Use average in case of ties
Values: 18 23 30 34 41 42 44 45 55 56 60 66 70 72 78
Rank:    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
4.2.3 Calculate the sum of ranks for each group: KOL = 2 + 6 + 9 + 12 + 15 = 44; SPE = 8 + 10 + 11 + 13 + 14 = 56; GP = 1 + 3 + 4 + 5 + 7 = 20
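The original figure with the rank sums did not survive republication; they can be recomputed directly from the data:

```python
from scipy.stats import rankdata

kol = [23, 42, 55, 66, 78]
spe = [45, 56, 60, 70, 72]
gp = [18, 30, 34, 41, 44]

pooled = kol + spe + gp
ranks = rankdata(pooled)  # rank 1 = smallest; this data has no ties

r_kol = ranks[:5].sum()    # 44.0
r_spe = ranks[5:10].sum()  # 56.0
r_gp = ranks[10:].sum()    # 20.0
print(r_kol, r_spe, r_gp)
```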
4.2.4 Calculate the H statistic using the Kruskal-Wallis formula and the rank sums from 4.2.3: H = 12/(15 × 16) × (44^2/5 + 56^2/5 + 20^2/5) − 3 × 16 = 54.72 − 48 = 6.72
4.2.5 Identify the critical chi-square value for g − 1 degrees of freedom at α = 0.05, which for our problem (3 − 1 = 2 degrees of freedom) is 5.99, from a chi-square distribution table.
4.2.6 Compare H value from 4.2.4 to the critical value from 4.2.5
The null hypothesis, which states that the median patient volume is equal across the three groups, is rejected if the H statistic exceeds the critical chi-square value. Since 5.99 (the critical value) < 6.72 (our H statistic), we can reject the null hypothesis.
Had the H statistic not exceeded the critical value, there would be insufficient evidence to conclude that the medians are unequal.
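The hand calculation above can be cross-checked with scipy, which reproduces both the H statistic and the corresponding p-value:

```python
from scipy.stats import kruskal

kol = [23, 42, 55, 66, 78]
spe = [45, 56, 60, 70, 72]
gp = [18, 30, 34, 41, 44]

stat, p = kruskal(kol, spe, gp)
print(round(stat, 2), round(p, 4))  # 6.72 0.0347
```

The p-value of about 0.035 is below 0.05, agreeing with the table-based rejection of the null hypothesis.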
The Kruskal-Wallis H-test tests the null hypothesis that the population medians of all groups are equal. It is a non-parametric alternative to ANOVA. The test works on two or more independent samples, which may have different sizes. Note that rejecting the null hypothesis does not reveal how the groups differ; post hoc pairwise comparisons between the groups are necessary to identify which groups differ.
import numpy as np
from scipy import stats

x = [1, 3, 5, 8, 9, 12, 17]
y = [2, 6, 6, 8, 10, 15, 20, 22]

print(stats.kruskal(x, y))
# KruskalResult(statistic=0.7560483870967752, pvalue=0.3845680059797648)
print(np.median(x), np.median(y))
# 8.0 9.0
The output generated by Python is shown above. Note that although there is a marked difference between the means of the two samples, the test finds no significant difference once ranks (and the medians) are taken into account: the p-value (≈0.38) is much greater than 0.05.
The Kruskal-Wallis test is instrumental when dealing with strongly skewed samples. It can be used widely for test-versus-control comparisons during a campaign rollout, or for A/B testing. This covers most industry use cases, since individual behavior varies widely, whether for customers in retail or doctors in pharma: looking at basket size or patient volume, a few customers buy far more than the rest, and a few doctors see far more patients. For such skewed distributions, the Kruskal-Wallis test is a sound way to check whether the groups behave similarly.
Stephanie Glen. “Kruskal Wallis H Test: Definition, Examples, Assumptions, SPSS” From StatisticsHowTo.com: Elementary Statistics for the rest of us! https://www.statisticshowto.com/probability-and-statistics/statistics-definitions/kruskal-wallis/
Republished from “Kruskal Wallis Test for Beginners,” Towards Data Science: https://towardsdatascience.com/kruskal-wallis-test-for-beginners-4fe9b0333b31