Chi-square tests with the ggstatsplot package in R

library(ggstatsplot)
library(ggplot2)
library(dplyr)
data("diamonds")

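# Keep the diamonds whose color is J, H, or F and whose clarity is SI2, VS1, or IF, and save them as diamonds2.
# %in% tests set membership; `color == c('J', 'H', 'F')` would recycle the vector position by position
# and silently drop most matching rows. The output shown below came from the `==` version,
# so a rerun with %in% will report larger group sizes (n).
diamonds2 <- diamonds %>% 
  filter(color %in% c('J', 'H', 'F'), clarity %in% c('SI2', 'VS1', 'IF'))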
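
The difference between `==` and `%in%` when matching a column against several values can be seen on a toy vector. This is a minimal base-R sketch; the vector `color` below is made up for illustration and is not the diamonds data.

```r
# Hypothetical toy data, not taken from diamonds
color  <- c("J", "J", "J", "H", "H", "H")
wanted <- c("J", "H", "F")

# `==` recycles `wanted` along `color` and compares position by position:
# J==J, J==H, J==F, H==J, H==H, H==F
eq_match <- color == wanted
print(eq_match)   # TRUE FALSE FALSE FALSE TRUE FALSE

# `%in%` asks, for each element, whether it appears anywhere in `wanted`
in_match <- color %in% wanted
print(in_match)   # TRUE TRUE TRUE TRUE TRUE TRUE

kept_eq <- sum(eq_match)  # 2 elements kept
kept_in <- sum(in_match)  # 6 elements kept
```

Because no warning is raised when the longer length is a multiple of the shorter one, the `==` form can lose rows without any visible sign.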

ggbarstats(diamonds2, color, clarity, palette = 'Set2')
# Statistical output:
Note: 95% CI for effect size estimate was computed with 100 bootstrap samples.
Note: Results from one-sample proportion tests for each level of the variable
clarity testing for equal proportions of the variable color.

# A tibble: 3 x 9
  condition N          F      H      J      `Chi-squared`    df `p-value` significance
                                         
1 SI2       (n = 1208) 45.20% 41.72% 13.08%         225.      2         0 ***         
2 VS1       (n = 966)  46.38% 38.20% 15.42%         149.      2         0 ***         
3 IF        (n = 251)  53.39% 39.44% 7.17%           84.6     2         0 ***   
[Figure: bar chart from ggbarstats, color proportions within each clarity level]
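- As the figure shows, the chi-square statistic is 15.01 with p = 0.005, below the significance level of 0.05, so diamond color and clarity can be considered not independent, i.e. they are associated.
- Within each clarity level, the counts of diamonds of different colors differ significantly (each bar is topped with three stars, "***"). The chi-square statistics are 225, 149, and 84.6, all above 5.99, the critical value of the chi-square distribution with 2 degrees of freedom at α = 0.05, so p < 0.05 in every case.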
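
The numbers in the two points above can be cross-checked with base R's `chisq.test()`. The sketch below assumes the cell counts can be back-calculated from the printed percentages and group sizes (for example, 45.20% of 1208 is about 546); these counts are a reconstruction, not values printed by ggstatsplot.

```r
# Counts reconstructed from the percentages and n in the table above (an assumption)
counts <- matrix(
  c(546, 504, 158,   # SI2
    448, 369, 149,   # VS1
    134,  99,  18),  # IF
  nrow = 3, byrow = TRUE,
  dimnames = list(clarity = c("SI2", "VS1", "IF"), color = c("F", "H", "J"))
)

# Pearson chi-square test of independence between clarity and color
indep <- chisq.test(counts)
indep$statistic   # about 15.0
indep$parameter   # df = (3 - 1) * (3 - 1) = 4
indep$p.value     # about 0.005

# One-sample (goodness-of-fit) test per clarity level: equal proportions of F, H, J
gof_stat <- apply(counts, 1, function(x) unname(chisq.test(x)$statistic))
round(gof_stat, 1)   # about 225.2, 149.1, 84.6

# Critical value of the chi-square distribution for df = 2 at alpha = 0.05
crit <- qchisq(0.95, df = 2)   # about 5.99
```
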
ggpiestats(diamonds2, color, clarity, palette = 'Set3')
# Statistical output:
Note: 95% CI for effect size estimate was computed with 100 bootstrap samples.

Note: Results from one-sample proportion tests for each level of the variable
clarity testing for equal proportions of the variable color.

# A tibble: 3 x 9
  condition N          F      H      J      `Chi-squared`    df `p-value` significance
                                         
1 SI2       (n = 1208) 45.20% 41.72% 13.08%         225.      2         0 ***         
2 VS1       (n = 966)  46.38% 38.20% 15.42%         149.      2         0 ***         
3 IF        (n = 251)  53.39% 39.44% 7.17%           84.6     2         0 ***         
[Figure: pie charts from ggpiestats, one pie per clarity level]
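- The statistics in this figure are identical to those of the bar chart above; only the chart type changes from bars to pies.
- These plots make it quick and convenient to visualize statistical results. Besides the chi-square statistic and p-value, they show the distribution within each group. For example, within clarity SI2, color J accounts for 13% of the diamonds and color F has the largest share at 45%. Within VS1 and IF, color F is also the largest, at 46% and 53% respectively.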
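
The within-group percentages quoted above are row proportions of the clarity-by-color contingency table. As a rough sketch in base R, assuming counts back-calculated from the printed percentages and group sizes (a reconstruction, not values output by ggstatsplot):

```r
# Counts reconstructed from the printed percentages and n (an assumption)
counts <- matrix(
  c(546, 504, 158,   # SI2
    448, 369, 149,   # VS1
    134,  99,  18),  # IF
  nrow = 3, byrow = TRUE,
  dimnames = list(clarity = c("SI2", "VS1", "IF"), color = c("F", "H", "J"))
)

# Row-wise proportions: share of each color within each clarity level, in percent
pct <- round(100 * prop.table(counts, margin = 1), 2)
print(pct)

# Color with the largest share in each clarity level
top_color <- colnames(counts)[apply(counts, 1, which.max)]
print(top_color)
```
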
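# diamonds2[diamonds2$cut != 'Very Good', ] drops the rows whose cut is Very Good.
# simulate.p.value = TRUE requests a simulated (Monte Carlo) p-value, as in stats::chisq.test, rather than an adjusted one;
# it is used here because, within cut == 'Fair', some cells have zero counts for colors J and H.
grouped_ggpiestats(diamonds2[diamonds2$cut != 'Very Good', ], color, clarity, grouping.var = cut, simulate.p.value = TRUE)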
# Statistical output (one table per cut level):
Note: 95% CI for effect size estimate was computed with 100 bootstrap samples.

Note: Results from one-sample proportion tests for each level of the variable
clarity testing for equal proportions of the variable color.

# A tibble: 3 x 9
  condition N     F     H     J     `Chi-squared`    df `p-value` significance
                                 
1 SI2       (n =~ 47.7~ 41.7~ 10.4~          16.1     2     0     ***         
2 VS1       (n =~ 42.8~ 35.7~ 21.4~           2       2     0.368 ns          
3 IF        (n =~ 100.~ NA    NA              6       2     0.05  ns          
Note: 95% CI for effect size estimate was computed with 100 bootstrap samples.

Note: Results from one-sample proportion tests for each level of the variable
clarity testing for equal proportions of the variable color.

# A tibble: 3 x 9
  condition N     F     H     J     `Chi-squared`    df `p-value` significance
                                 
1 SI2       (n =~ 49.6~ 35.7~ 14.6~         25.6      2     0     ***         
2 VS1       (n =~ 48.1~ 31.3~ 20.4~          9.71     2     0.008 **          
3 IF        (n =~ 69.2~ 15.3~ 15.3~          7.54     2     0.023 *           
Note: 95% CI for effect size estimate was computed with 100 bootstrap samples.

Note: Results from one-sample proportion tests for each level of the variable
clarity testing for equal proportions of the variable color.

# A tibble: 3 x 9
  condition N     F     H     J     `Chi-squared`    df `p-value` significance
                                 
1 SI2       (n =~ 44.5~ 42.0~ 13.3~         71.7      2     0     ***         
2 VS1       (n =~ 41.5~ 41.5~ 16.8~         29.6      2     0     ***         
3 IF        (n =~ 40.0~ 48.0~ 12.0~          5.36     2     0.069 ns          
Note: 95% CI for effect size estimate was computed with 100 bootstrap samples.

Note: Results from one-sample proportion tests for each level of the variable
clarity testing for equal proportions of the variable color.

# A tibble: 3 x 9
  condition N     F     H     J     `Chi-squared`    df `p-value` significance
                                 
1 SI2       (n =~ 45.4~ 44.6~ 9.91%          84.7     2         0 ***         
2 VS1       (n =~ 49.0~ 38.5~ 12.5~          84.7     2         0 ***         
3 IF        (n =~ 52.5~ 42.3~ 5.08%          66.3     2         0 ***  
[Figure: grouped pie charts from grouped_ggpiestats, one panel per cut level]
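
What `simulate.p.value = TRUE` does can be illustrated with base R's `chisq.test()`, where the argument requests a Monte Carlo p-value. This is only a sketch: the counts `c(3, 0, 0)` are an assumption, chosen because they reproduce the chi-square value of 6 printed for the IF row of the first table, and how ggstatsplot forwards the argument internally is not shown here.

```r
set.seed(1)   # simulated p-values vary from run to run without a seed

# Hypothetical counts for colors F, H, J: every diamond is F (zero counts for H and J)
x <- c(F = 3, H = 0, J = 0)

# Asymptotic test: the expected count is 1 per cell, so R warns that
# the chi-squared approximation may be incorrect
asym <- suppressWarnings(chisq.test(x))
asym$statistic   # 6
asym$p.value     # about 0.0498

# Monte Carlo p-value from B = 2000 samples simulated under equal proportions
sim <- chisq.test(x, simulate.p.value = TRUE, B = 2000)
sim$p.value      # near 1/9 (about 0.11), the exact chance that all 3 fall in one category
```

With counts this small the asymptotic and simulated p-values disagree noticeably, which is why a simulated p-value is the safer choice when cells are empty.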
