http://www.john-uebersax.com/stat/tetra.htm 提到可用两种方法做四分格相关系数的显著性检验,这里用到的是第一种方法,也就是利用标准差的检验.
<The Tetrachoric Correlation and its Asymptotic Standard Error>一文讲到四分格相关系数的标准差为s:
tetrachoric.se <- function(a, b, c, d) { n <- a + b + c + d z1 <- qnorm((a + c)/n) z2 <- qnorm((a + b)/n) r <- psych::tetrachoric(matrix(c(a, b, c, d), 2, 2))$rho f1 <- pnorm((z1 - r * z2)/sqrt(1 - r * r)) - 0.5 f2 <- pnorm((z2 - r * z1)/sqrt(1 - r * r)) - 0.5 t <- (a + d) * (b + c)/4 + (a + c) * (b + d) * f2 * f2 + (a + b) * (c + d) * f1 * f1 + 2 * (a * d - b * c) * f1 * f2 + (c * d - b * a) * f2 + (b * d - a * c) * f1 s <- n^(-3/2) * sqrt(t)/ mnormt::dmnorm(c(z1, z2), varcov = matrix(c(1, r, r, 1), 2, 2)) c(n, z1, z2, r, f1, f2, t, s) }
但是给出的假定相关系数为0时的标准差表达式有误,具体为下列代码中的s1,修正为s2:
tetrachoric.se0 <- function(a, b, c, d) { n <- a + b + c + d z1 <- qnorm((a + c)/n) z2 <- qnorm((a + b)/n) r <- 0 f1 <- pnorm((z1 - r * z2)/sqrt(1 - r * r)) - 0.5 f2 <- pnorm((z2 - r * z1)/sqrt(1 - r * r)) - 0.5 t <- (a + d) * (b + c)/4 + (a + c) * (b + d) * f2 * f2 + (a + b) * (c + d) * f1 * f1 + 2 * (a * d - b * c) * f1 * f2 + (c * d - b * a) * f2 + (b * d - a * c) * f1 s <- n^(-3/2) * sqrt(t)/ mnormt::dmnorm(c(z1, z2), varcov = matrix(c(1, r, r, 1), 2, 2)) s1 <- n^(-5/2) * sqrt((a + b) * (a + c) * (b + d) * (b + c))/ mnormt::dmnorm(c(z1, z2), varcov = matrix(c(1, 0, 0, 1), 2, 2)) s2 <- n^(-5/2) * sqrt((a + b) * (a + c) * (b + d) * (d + c))/ mnormt::dmnorm(c(z1, z2), varcov = matrix(c(1, 0, 0, 1), 2, 2)) c(n, z1, z2, r, f1, f2, t, s, s1, s2) }
其中s为直接将相关系数设为0之后求得的标准差, s1为文章提供的表达式求得(估计印刷错误),s2为修正之后的值,与s的值相同.
有了相关系数为0的标准差之后,就可以检验相关系数的显著性了,p值为:
2 * pnorm(abs(r)/s2, lower.tail = F)
写一个完整的求四分格相关系数显著性的R语言代码:
tetrachoric.sig <- function(a, b, c, d) { n <- a + b + c + d z1 <- qnorm((a + c)/n) z2 <- qnorm((a + b)/n) r <- psych::tetrachoric(matrix(c(a, b, c, d), 2, 2))$rho se <- n^(-5/2) * sqrt((a + b) * (a + c) * (b + d) * (d + c))/ mnormt::dmnorm(c(z1, z2), varcov = matrix(c(1, 0, 0, 1), 2, 2)) p <- 2 * pnorm(abs(r)/se, lower.tail = F) list(rho = r, p.value = p) } tetrachoric.sig <- function(m) { n <- sum(m) z1 <- qnorm(sum(m[1, ])/n) z2 <- qnorm(sum(m[, 1])/n) r <- psych::tetrachoric(m)$rho se <- n^(-5/2) * sqrt(prod(rowSums(m), colSums(m)))/ mnormt::dmnorm(c(z1, z2), varcov = matrix(c(1, 0, 0, 1), 2, 2)) p <- 2 * pnorm(abs(r)/se, lower.tail = F) list(rho = r, p.value = p) }