What, if anything, does it mean when V is 0 in a Wilcoxon signed-rank test in R?

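Problem description

I am comparing pre-test and post-test confidence scores. Usually when I run this test, the V value is some number greater than 0. I then look at the reported p-value to decide whether the result is significant.

This is the kind of output I am used to getting: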

pretest = c (3,4,4,2,2,4,2,2,5,3,3,3,3,1,3,3,2,2,3,2,3,3,5,4,3,4,2,2,4,2,1,4,3)
posttest = c(4,5,4,5,4,5,3,6,5,6,4,2,5,2,4,5,3,3,5,4,5,5,5,5,4,5,4,5,5,4,3,6,5)

wilcox.test (pretest, posttest, paired = TRUE, exact = FALSE)


Wilcoxon signed rank test with continuity correction

data:  pretest and posttest
V = 7.5, p-value = 2.461e-06
alternative hypothesis: true location shift is not equal to 0

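Current problem

My background is in education rather than statistics, so my knowledge is definitely shaky whenever I run into something I have not seen before.

Today I ran into a new situation: my V came back as 0. See the following code: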

pretest = c (2,3,3,4,4,1,2,3,2,4,4,2,2,5,3,3,2,2,3,1,3,3,3,4,2,2,3,2,3,3,4,4,2,3,4,2,3,2,5,2,1,3,2)

posttest = c (5,5,5,5,4,4,4,6,6,5,5,3,6,5,6,4,3,2,6,2,5,4,4,5,3,3,4,2,4,3,5,5,6,3,4,2,6,5,5,3,4,6,5)

wilcox.test (pretest, posttest, paired = TRUE, exact = FALSE)


Wilcoxon signed rank test with continuity correction

data:  pretest and posttest
V = 0, p-value = 2.309e-07
alternative hypothesis: true location shift is not equal to 0

(1) Why is V returned as 0? (2) Does this affect how I should interpret the p-value? (3) What does the value of V mean, and what is it related to?

Tags: r, statistics

Solution


As alluded to by @42-, I think your question has to do with a misunderstanding of what the V value denotes in a Wilcoxon signed-rank test.

To recap: the test statistic in a paired Wilcoxon signed-rank test (the V value) is the sum of the ranks of the absolute pairwise differences |x - y|, taken over the pairs with x - y > 0. Pairs with x - y = 0 are dropped before ranking.
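To make the definition concrete, here is a minimal sketch that recomputes V by hand and compares it with what wilcox.test reports. The helper signed_rank_v and the toy vectors x_toy and y_toy are made up for illustration; they are not part of base R or of the question's data.

```r
# Hypothetical helper (not part of base R): recompute V from the definition.
signed_rank_v <- function(x, y) {
  d <- x - y
  d <- d[d != 0]        # pairs with no difference are dropped
  r <- rank(abs(d))     # rank the absolute differences (ties get average ranks)
  sum(r[d > 0])         # add up the ranks that belong to positive differences
}

# Toy data, chosen only for illustration.
x_toy <- c(1, 5, 3, 8)
y_toy <- c(4, 3, 3, 2)

v_manual  <- signed_rank_v(x_toy, y_toy)
v_builtin <- unname(
  wilcox.test(x_toy, y_toy, paired = TRUE, exact = FALSE)$statistic
)

v_manual
v_builtin
```

Here the differences are -3, 2, 0 and 6. The zero is dropped, the remaining absolute differences get ranks 2, 1 and 3, and the two positive differences contribute 1 + 3 = 4, so both values should be 4.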

Let's create some sample data to understand how V can be zero.

We draw samples from two normal distributions with different means.

set.seed(2018)
x <- rnorm(10, mean = 0)
y <- rnorm(10, mean = 5)

We now perform a paired Wilcoxon signed-rank test:

wilcox.test(x, y, paired = TRUE, exact = FALSE)
#
#   Wilcoxon signed rank test with continuity correction
#
#data:  x and y
#V = 0, p-value = 0.005922
#alternative hypothesis: true location shift is not equal to 0

We first note that the p-value is small, leading us to reject the null hypothesis that the paired differences x - y are distributed symmetrically around 0. Since the samples x and y come from two normal distributions with very different means (mean = 0 vs. mean = 5), this is hardly surprising. Furthermore, we note that the test statistic V = 0. Given the definition of the test statistic, this means that there are no pairs with x > y; we can confirm that this is indeed the case:

any(x > y)
#[1] FALSE
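The same check can be run on the second pair of vectors from the question. The sketch below copies those vectors as posted; the names d, n_nonzero, n_positive, v_max and v_obs are chosen here for illustration only.

```r
# Second data set from the question, copied as posted.
pretest  <- c(2,3,3,4,4,1,2,3,2,4,4,2,2,5,3,3,2,2,3,1,3,3,3,4,2,2,3,2,3,3,
              4,4,2,3,4,2,3,2,5,2,1,3,2)
posttest <- c(5,5,5,5,4,4,4,6,6,5,5,3,6,5,6,4,3,2,6,2,5,4,4,5,3,3,4,2,4,3,
              5,5,6,3,4,2,6,5,5,3,4,6,5)

d <- pretest - posttest

n_nonzero  <- sum(d != 0)   # pairs that actually enter the test
n_positive <- sum(d > 0)    # pairs where the pretest score is the higher one
v_max <- n_nonzero * (n_nonzero + 1) / 2   # largest value V could take

res   <- wilcox.test(pretest, posttest, paired = TRUE, exact = FALSE)
v_obs <- unname(res$statistic)

c(n_nonzero = n_nonzero, n_positive = n_positive, v_max = v_max, v_obs = v_obs)
```

No respondent scored higher on the pretest than on the posttest, so no ranks are summed and V is 0. Because V can only range from 0 to v_max, a value of 0 is simply the most extreme outcome in one direction. The p-value is read in the usual way.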

For good measure, we can visualise the distribution of both samples

library(ggplot2)
# Overlay histograms of the two samples, coloured by sample label
ggplot(data.frame(
    val = c(x, y),
    smpl = c(rep("x", length(x)), rep("y", length(y))))) +
    geom_histogram(aes(val, fill = smpl), bins = 30, position = "identity")
