Source: http://www.r-bloggers.com/wilcoxon-mann-whitney-rank-sum-test-or-test-u/
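This test compares the means of two independent sample groups without assuming that the populations follow a Gaussian distribution; it is also known as the Mann-Whitney U-test.
Suppose you want to check whether two football teams score the same number of goals on average over a year. Below are the goals each team scored in 6 games during the year: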
Team A: 6, 8, 2, 4, 4, 5
Team B: 7, 10, 4, 3, 5, 6
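The Wilcoxon-Mann-Whitney test (or Wilcoxon rank sum test, or Mann-Whitney U-test) is used to compare the means of two groups that do not follow a normal distribution: it is a nonparametric test. It is the counterpart of the t-test for independent samples.
Let's see how to solve this problem in R: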
    a = c(6, 8, 2, 4, 4, 5)
    b = c(7, 10, 4, 3, 5, 6)
    wilcox.test(a, b, correct=FALSE)

        Wilcoxon rank sum test

    data: a and b
    W = 14, p-value = 0.5174
    alternative hypothesis: true location shift is not equal to 0
If you instead run wilcox.test(b, a, correct = FALSE), with the two samples swapped, the p-value is logically the same:
    a = c(6, 8, 2, 4, 4, 5)
    b = c(7, 10, 4, 3, 5, 6)
    wilcox.test(b, a, correct=FALSE)

        Wilcoxon rank sum test

    data: b and a
    W = 22, p-value = 0.5174
    alternative hypothesis: true location shift is not equal to 0
The value W is computed as follows:
    sum.rank.a = sum(rank(c(a,b))[1:6])   #sum of ranks assigned to the group a
    W = sum.rank.a - (length(a)*(length(a)+1)) / 2
    W
    [1] 14
    sum.rank.b = sum(rank(c(a,b))[7:12])  #sum of ranks assigned to the group b
    W = sum.rank.b - (length(b)*(length(b)+1)) / 2
    W
    [1] 22
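As a quick check on this calculation, the two W values always add up to length(a) * length(b). Swapping the arguments therefore mirrors W around its midpoint, which is why the two-sided p-value does not change. A minimal sketch follows; the helper name w_stat is hypothetical and not part of R:

```r
a <- c(6, 8, 2, 4, 4, 5)
b <- c(7, 10, 4, 3, 5, 6)

# Hypothetical helper: W for the first sample, computed from the joint ranks
w_stat <- function(x, y) {
  r <- rank(c(x, y))  # tied values receive the average of their ranks
  sum(r[seq_along(x)]) - length(x) * (length(x) + 1) / 2
}

w_a <- w_stat(a, b)  # 14
w_b <- w_stat(b, a)  # 22
w_a + w_b == length(a) * length(b)  # TRUE: 14 + 22 = 36
```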
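Finally, we compare against the interval tabulated in the Wilcoxon table for independent samples. For two groups of 6 samples each, the tabulated interval is (26, 52), while the interval for our samples is: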
    sum(rank(c(a,b))[1:6])   #sum of ranks assigned to the group a
    [1] 35
    sum(rank(c(a,b))[7:12])  #sum of ranks assigned to the group b
    [1] 43
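Because the computed interval (35, 43) lies within the tabulated interval (26, 52), we conclude that the null hypothesis H0 of equal means is accepted (more precisely, it is not rejected).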
=========================================================
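In wilcox.test, paired=FALSE (the default) selects the rank sum test. When correct is left at its default of TRUE, the continuity correction is applied to the normal approximation, so the p-value differs from the one obtained above: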
    wilcox.test(a, b, paired=FALSE)

        Wilcoxon rank sum test with continuity correction

    data: a and b
    W = 14, p-value = 0.5711
    alternative hypothesis: true location shift is not equal to 0
    #------------------------------------------
    wilcox.test(b, a, paired=FALSE)

        Wilcoxon rank sum test with continuity correction

    data: b and a
    W = 22, p-value = 0.5711
    alternative hypothesis: true location shift is not equal to 0
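The gap between the two p-values reported in this post (0.5174 and 0.5711) comes from the continuity correction applied to the normal approximation of W. The sketch below reproduces both values by hand, assuming the usual tie-corrected variance of W; the function name approx_p is hypothetical and only for illustration.

```r
a <- c(6, 8, 2, 4, 4, 5)
b <- c(7, 10, 4, 3, 5, 6)

# Hypothetical helper: two-sided p-value from the normal approximation to W
approx_p <- function(x, y, correct = TRUE) {
  nx <- length(x); ny <- length(y); n <- nx + ny
  r <- rank(c(x, y))
  w <- sum(r[seq_along(x)]) - nx * (nx + 1) / 2
  ties <- table(r)  # sizes of the groups of tied values
  sigma <- sqrt(nx * ny / 12 *
                ((n + 1) - sum(ties^3 - ties) / (n * (n - 1))))
  z <- w - nx * ny / 2                 # distance of W from its mean
  if (correct) z <- z - sign(z) * 0.5  # continuity correction
  2 * pnorm(-abs(z / sigma))
}

round(approx_p(a, b, correct = FALSE), 4)  # 0.5174
round(approx_p(a, b, correct = TRUE), 4)   # 0.5711
```

With tied values in the data, R cannot compute an exact p-value and uses this approximation, so the only difference between the two calls is the 0.5 shift.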
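The two W values obtained here (14 and 22) correspond to the lower tail and the upper tail; the acceptance interval can be obtained with the following function calls: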
    qwilcox(0.025, length(a), length(b), lower.tail=TRUE)
    [1] 6
    qwilcox(0.025, length(a), length(b), lower.tail=FALSE)
    [1] 30
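To make the comparison explicit, the sketch below checks whether each W falls between the two quantiles. It treats [6, 30] as the acceptance interval, as the text does. Note that qwilcox describes the distribution of W for data without ties, so for these samples (which contain ties) the bounds should be read as approximate. The variable names are illustrative.

```r
a <- c(6, 8, 2, 4, 4, 5)
b <- c(7, 10, 4, 3, 5, 6)

lower <- qwilcox(0.025, length(a), length(b))                      # 6
upper <- qwilcox(0.025, length(a), length(b), lower.tail = FALSE)  # 30

w <- c(a_vs_b = 14, b_vs_a = 22)  # W values reported by wilcox.test above
inside <- w >= lower & w <= upper
inside  # both TRUE, so H0 is not rejected at the 5% level
```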