Wilcoxon-Mann-Whitney rank sum test (or U test)

Source: http://www.r-bloggers.com/wilcoxon-mann-whitney-rank-sum-test-or-test-u/

 

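This test compares the central tendency of two independent samples without assuming that the population is Gaussian; it is also known as the Mann-Whitney U-test.
Suppose you want to check whether two football teams score the same number of goals on average over a year. Below are the goals each team scored in 6 matches of that year: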

Team A: 6, 8, 2, 4, 4, 5
Team B: 7, 10, 4, 3, 5, 6

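The Wilcoxon-Mann-Whitney test (also called the Wilcoxon rank sum test or the Mann-Whitney U-test) compares two groups whose data do not follow a normal distribution: it is a nonparametric test. It is the counterpart of the t-test for independent samples.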

Let us see how to solve this problem in R:

a = c(6, 8, 2, 4, 4, 5)
b = c(7, 10, 4, 3, 5, 6)

wilcox.test(a,b, correct=FALSE)

Wilcoxon rank sum test

data: a and b
W = 14, p-value = 0.5174
alternative hypothesis: true location shift is not equal to 0


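The p-value is greater than 0.05, so we do not reject the null hypothesis H0 that the two groups have the same location. If you run wilcox.test(b, a, correct = FALSE), the p-value is, as you would expect, the same: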

a = c(6, 8, 2, 4, 4, 5)
b = c(7, 10, 4, 3, 5, 6)

wilcox.test(b,a, correct=FALSE)

Wilcoxon rank sum test

data: b and a
W = 22, p-value = 0.5174
alternative hypothesis: true location shift is not equal to 0
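
Because the samples contain tied values, wilcox.test cannot use the exact distribution here and falls back to a normal approximation. The sketch below reproduces that calculation by hand, without continuity correction, to show why swapping a and b leaves the p-value unchanged: only the sign of z flips. The variable names (n_a, tie_sizes, p_value, and so on) are illustrative and not part of the original post.

```r
a <- c(6, 8, 2, 4, 4, 5)
b <- c(7, 10, 4, 3, 5, 6)
n_a <- length(a)
n_b <- length(b)
n <- n_a + n_b

# Midranks of the pooled sample (tied values share their average rank)
r <- rank(c(a, b))

# W statistic for the first sample
W <- sum(r[seq_len(n_a)]) - n_a * (n_a + 1) / 2

# Mean and tie-corrected standard deviation of W under H0
mu <- n_a * n_b / 2
tie_sizes <- as.numeric(table(r))
sigma <- sqrt(n_a * n_b / 12 *
              ((n + 1) - sum(tie_sizes^3 - tie_sizes) / (n * (n - 1))))

# Two-sided p-value from the normal approximation (no continuity correction)
z <- (W - mu) / sigma
p_value <- 2 * pnorm(-abs(z))
round(p_value, 4)
```

The rounded result should agree with the 0.5174 printed by wilcox.test(a, b, correct=FALSE) above. Computing W for b instead gives a z of the same size and opposite sign, and therefore the same two-sided p-value.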


The value W is computed as follows:

sum.rank.a = sum(rank(c(a,b))[1:6]) #sum of ranks assigned to the group a
W = sum.rank.a - (length(a)*(length(a)+1)) / 2
W
[1] 14

sum.rank.b = sum(rank(c(a,b))[7:12]) #sum of ranks assigned to the group b
W = sum.rank.b - (length(b)*(length(b)+1)) / 2
W
[1] 22
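
As a cross-check on the rank-based formula, W can also be obtained by counting pairs: each pair (x, y) with x from one sample and y from the other contributes 1 if x > y and 0.5 if the two are tied. The following is a minimal sketch of that equivalent view; W_a and W_b are names chosen here for illustration.

```r
a <- c(6, 8, 2, 4, 4, 5)
b <- c(7, 10, 4, 3, 5, 6)

# Pairs where a exceeds b count 1, tied pairs count 0.5
W_a <- sum(outer(a, b, ">")) + 0.5 * sum(outer(a, b, "=="))

# The same count with the roles of the samples swapped
W_b <- sum(outer(b, a, ">")) + 0.5 * sum(outer(b, a, "=="))

c(W_a = W_a, W_b = W_b)

# Every pair is counted exactly once across the two statistics
W_a + W_b == length(a) * length(b)
```

This gives the same 14 and 22 as the rank sums, and it makes explicit that the two W values always add up to length(a) * length(b), which is 36 here.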


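Finally, we compare our result with the interval tabulated for the Wilcoxon test on independent samples. For two groups of 6 observations each, the tabulated interval is (26, 52), while the rank sums of our samples are: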

sum(rank(c(a,b))[1:6]) #sum of ranks assigned to the group a
[1] 35
sum(rank(c(a,b))[7:12]) #sum of ranks assigned to the group b
[1] 43


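Because the computed rank sums (35, 43) lie inside the tabulated interval (26, 52), we do not reject the null hypothesis H0: the data show no significant difference between the two groups.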

=========================================================

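When calling wilcox.test, the rank sum test is the one applied when paired is FALSE (the default); paired = TRUE would run the Wilcoxon signed rank test instead. The calls below use the rank sum test: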

wilcox.test(a,b, paired=FALSE)

    Wilcoxon rank sum test with continuity correction

data:  a and b 
W = 14, p-value = 0.5711
alternative hypothesis: true location shift is not equal to 0 

#------------------------------------------

wilcox.test(b,a, paired=FALSE)

    Wilcoxon rank sum test with continuity correction

data:  b and a 
W = 22, p-value = 0.5711
alternative hypothesis: true location shift is not equal to 0 

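The two W values obtained here are the lower-tail and upper-tail values. The acceptance interval can be obtained with the following function: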

qwilcox(0.025, length(a), length(b), lower.tail=TRUE)
[1] 6

qwilcox(0.025, length(a), length(b), lower.tail=FALSE)
[1] 30

The W values computed earlier, (14, 22), lie within [6, 30], so we do not reject the null hypothesis: there is no significant difference between a and b.
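
The comparison above can be written as a short check. This is only a sketch: qwilcox gives quantiles of the exact null distribution, which assumes no ties, so with the tied values in these samples the bounds are an approximation. The names lower, upper and inside are illustrative.

```r
a <- c(6, 8, 2, 4, 4, 5)
b <- c(7, 10, 4, 3, 5, 6)
n_a <- length(a)
n_b <- length(b)

# W for sample a, from the pooled midranks
r <- rank(c(a, b))
W <- sum(r[seq_len(n_a)]) - n_a * (n_a + 1) / 2

# Bounds of the two-sided 5% acceptance interval on the W scale
lower <- qwilcox(0.025, n_a, n_b)
upper <- qwilcox(0.025, n_a, n_b, lower.tail = FALSE)

# TRUE means H0 is not rejected at the 5% level
inside <- W >= lower && W <= upper
inside

# The same bounds expressed as rank sums
c(lower, upper) + n_a * (n_a + 1) / 2
```

Adding length(a)*(length(a)+1)/2 = 21 moves the bounds to the rank sum scale, giving 27 and 51. These sit just inside the tabulated (26, 52) quoted earlier, which is consistent with reading that table's limits as the first values at which H0 would be rejected.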

