Perfect Scatter Plots with Correlation and Marginal Histograms


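Scatter plots are commonly used to show the relationship between two variables. This article first shows how to draw scatter plots in R and how to use functions from the ggpubr package to add correlation coefficients and significance levels. It then covers coloring points by group and adding an ellipse around each group. Finally, it shows how to draw bubble charts and how to add marginal plots (histograms, density plots, or box plots).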



Loading the data

library(ggpubr)
# Load data
data("mtcars")
df <- mtcars
# Convert cyl as a grouping variable
df$cyl <- as.factor(df$cyl)
# Inspect the data
head(df[, c("wt", "mpg", "cyl", "qsec")])

##                     wt  mpg cyl qsec
## Mazda RX4         2.62 21.0   6 16.5
## Mazda RX4 Wag     2.88 21.0   6 17.0
## Datsun 710        2.32 22.8   4 18.6
## Hornet 4 Drive    3.21 21.4   6 19.4
## Hornet Sportabout 3.44 18.7   8 17.0
## Valiant           3.46 18.1   6 20.2
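
Since cyl is used as the grouping variable in the sections that follow, it can help to check how many cars fall into each group. The lines below are a small base R sketch; the name group_sizes is only illustrative.

```r
# Rebuild the data frame used throughout this article
data("mtcars")
df <- mtcars
df$cyl <- as.factor(df$cyl)

# Count the cars in each cylinder group
group_sizes <- table(df$cyl)
group_sizes

# The factor levels are the distinct cylinder counts: "4" "6" "8"
levels(df$cyl)
```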

Basic scatter plot

ggscatter(df, x = "wt", y = "mpg",
          add = "reg.line",                                 # Add regression line
          conf.int = TRUE,                                  # Add confidence interval
          add.params = list(color = "blue",
                            fill = "lightgray")
          )+
  stat_cor(method = "pearson", label.x = 3, label.y = 30)  # Add correlation coefficient
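
As a cross-check on the label drawn by stat_cor(), the same Pearson coefficient and p-value can be computed with base R's cor.test(). This is a minimal sketch; the names ct, r and p_value are only illustrative.

```r
# Pearson correlation test between weight (wt) and fuel economy (mpg)
ct <- cor.test(mtcars$wt, mtcars$mpg, method = "pearson")

r <- unname(ct$estimate)   # correlation coefficient
p_value <- ct$p.value      # significance level of the test

round(r, 2)                # about -0.87: heavier cars get fewer miles per gallon
signif(p_value, 2)
```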

The point shape can be changed with the shape argument:

ggscatter(df, x = "wt", y = "mpg",
          shape = 18)

To see the other available point shapes, run:

show_point_shapes()

Coloring points by group

ggscatter(df, x = "wt", y = "mpg",
          add = "reg.line",                         # Add regression line
          conf.int = TRUE,                          # Add confidence interval
          color = "cyl", palette = "jco",           # Color by groups "cyl"
          shape = "cyl"                             # Change point shape by groups "cyl"
          )+
  stat_cor(aes(color = cyl), label.x = 3)           # Add correlation coefficient

# Extend the regression lines: fullrange = TRUE
# Add marginal rugs: rug = TRUE
ggscatter(df, x = "wt", y = "mpg",
          add = "reg.line",                         # Add regression line
          color = "cyl", palette = "jco",           # Color by groups "cyl"
          shape = "cyl",                            # Change point shape by groups "cyl"
          fullrange = TRUE,                         # Extending the regression line
          rug = TRUE                                # Add marginal rug
          )+
  stat_cor(aes(color = cyl), label.x = 3)           # Add correlation coefficient
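
With aes(color = cyl), stat_cor() reports one coefficient per group. The same per-group Pearson coefficients can be reproduced in base R, as in this short sketch (r_by_cyl is an illustrative name):

```r
df <- mtcars
df$cyl <- as.factor(df$cyl)

# Split the data by cylinder count, then correlate wt with mpg inside each group
r_by_cyl <- sapply(split(df, df$cyl), function(g) cor(g$wt, g$mpg))

# One coefficient per level of cyl
round(r_by_cyl, 2)
```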

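Adding ellipses around groups

Key arguments:

  • ellipse = TRUE: draw an ellipse around each group.
  • ellipse.level: the size of the concentration ellipse, expressed as a normal probability. The default is 0.95.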
  • ellipse.type: the ellipse type. Allowed values are "convex", "confidence", or any type supported by ggplot2::stat_ellipse, that is, one of c("t", "norm", "euclid"). In ggscatter() the default is "confidence".

ggscatter(df, x = "wt", y = "mpg",
          color = "cyl", palette = "jco",
          shape = "cyl",
          ellipse = TRUE)

# Change the ellipse type to 'convex'
ggscatter(df, x = "wt", y = "mpg",
          color = "cyl", palette = "jco",
          shape = "cyl",
          ellipse = TRUE, ellipse.type = "convex")

# Add group mean points and star plots
ggscatter(df, x = "wt", y = "mpg",
          color = "cyl", palette = "jco",
          shape = "cyl",
          ellipse = TRUE, 
          mean.point = TRUE,
          star.plot = TRUE)
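
The ellipse.level and ellipse.type arguments listed above can be combined with the same call. The sketch below draws normal-theory ellipses at the 80% level; the value 0.8 and the name p_ellipse are arbitrary choices for illustration.

```r
library(ggpubr)

df <- mtcars
df$cyl <- as.factor(df$cyl)

# Smaller ellipses: 80% normal concentration ellipses around each cyl group
p_ellipse <- ggscatter(df, x = "wt", y = "mpg",
                       color = "cyl", palette = "jco",
                       shape = "cyl",
                       ellipse = TRUE,
                       ellipse.type = "norm",
                       ellipse.level = 0.8)
p_ellipse
```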

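Adding point labels

Key arguments:

  • label: the name of the column that contains the point labels.
  • font.label: a list that can combine the following elements: the font size in points (e.g. 14), the style (e.g. "plain", "bold", "italic", "bold.italic"), and the color (e.g. "red"). For example, font.label = list(size = 14, face = "bold", color = "red").
  • label.select: a character vector naming the labels to show.
  • repel = TRUE: keep labels from overlapping.

# Use row names as point labels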
df$name <- rownames(df)
ggscatter(df, x = "wt", y = "mpg",
   color = "cyl", palette = "jco",
   label = "name", repel = TRUE)

# Show labels only for selected points
ggscatter(df, x = "wt", y = "mpg",
   color = "cyl", palette = "jco",
   label = "name", repel = TRUE,
   label.select = c("Toyota Corolla", "Merc 280", "Duster 360"))
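
The font.label argument described in the list above is not used in these examples. A possible use, following the list form given there, is sketched below; the size, face and color values and the name p_labels are arbitrary.

```r
library(ggpubr)

df <- mtcars
df$cyl <- as.factor(df$cyl)
df$name <- rownames(df)

# Label three selected cars in bold red, 10-point text
p_labels <- ggscatter(df, x = "wt", y = "mpg",
                      color = "cyl", palette = "jco",
                      label = "name", repel = TRUE,
                      label.select = c("Toyota Corolla", "Merc 280", "Duster 360"),
                      font.label = list(size = 10, face = "bold", color = "red"))
p_labels
```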

# Show labels that meet a criterion
ggscatter(df, x = "wt", y = "mpg",
   color = "cyl", palette = "jco",
   label = "name", repel = TRUE,
   label.select = list(criteria = "`x` > 4 & `y` < 15"))
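
In the criteria string, `x` and `y` refer to the plotted variables, here wt and mpg. To see which points that criterion picks out, the same condition can be applied in base R. This is only a check on the data, not part of the ggpubr call, and selected is an illustrative name.

```r
df <- mtcars
df$name <- rownames(df)

# Same condition as the criteria string, with x = wt and y = mpg
selected <- df$name[df$wt > 4 & df$mpg < 15]

# "Cadillac Fleetwood" "Lincoln Continental" "Chrysler Imperial"
selected
```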

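Bubble chart

In a bubble chart, the point size is controlled by a continuous variable (here "qsec"). The alpha argument controls the transparency of the colors and takes a value between 0 and 1.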

ggscatter(df, x = "wt", y = "mpg",
   color = "cyl", palette = "jco",
   size = "qsec", alpha = 0.5)+
  scale_size(range = c(0.5, 15))    # Adjust the range of points size
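
For readers who want to see the mapping spelled out, a roughly comparable bubble chart can be written in plain ggplot2. This is a sketch of the same idea, not what ggscatter() does internally, and it uses ggplot2's default theme and colors rather than the "jco" palette.

```r
library(ggplot2)

df <- mtcars
df$cyl <- as.factor(df$cyl)

# Map qsec to point size and cyl to color; alpha makes overlapping bubbles visible
p_bubble <- ggplot(df, aes(x = wt, y = mpg, color = cyl, size = qsec)) +
  geom_point(alpha = 0.5) +
  scale_size(range = c(0.5, 15))   # smallest and largest point sizes
p_bubble
```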

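Coloring by a continuous variable

Here the points are colored by the value of a continuous variable (here "mpg"). By default a blue gradient is used; it can be changed with the gradient_color() function.

# Color by a continuous variable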
p <- ggscatter(df, x = "wt", y = "mpg",
               color = "mpg")
p
# Change the gradient colors
p + gradient_color(c("blue", "white", "red"))
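
A similar gradient can also be set with ggplot2's own scale_color_gradientn(), which may be useful when more control over the scale is needed. The sketch below assumes the same three colors; p_grad is an illustrative name.

```r
library(ggpubr)
library(ggplot2)

df <- mtcars

# Scatter plot colored by the continuous variable mpg
p <- ggscatter(df, x = "wt", y = "mpg", color = "mpg")

# Blue-white-red gradient defined directly with a ggplot2 scale
p_grad <- p + scale_color_gradientn(colours = c("blue", "white", "red"))
p_grad
```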

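Adding marginal plots

The ggMarginal() function in the ggExtra package can add marginal histograms, density plots, or box plots to a scatter plot.
First, install the ggExtra package: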

install.packages("ggExtra")

Then draw the scatter plot and add the marginal plots:

# Add marginal density plots
library("ggExtra")
p <- ggscatter(iris, x = "Sepal.Length", y = "Sepal.Width",
               color = "Species", palette = "jco",
               size = 3, alpha = 0.6)
ggMarginal(p, type = "density")
# Change the marginal plot type
ggMarginal(p, type = "boxplot")
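
The marginal histogram mentioned in the title uses the same call with type = "histogram". A minimal sketch, with the result stored under the illustrative name mh:

```r
library(ggpubr)
library(ggExtra)

p <- ggscatter(iris, x = "Sepal.Length", y = "Sepal.Width",
               color = "Species", palette = "jco",
               size = 3, alpha = 0.6)

# Marginal histograms on both axes
mh <- ggMarginal(p, type = "histogram", margins = "both")
mh
```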

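One limitation of ggMarginal() as called above is that the marginal plots are not split by group, even though the scatter plot is. One workaround is to build the marginal plots separately and arrange them with the cowplot package.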

# Scatter plot colored by groups ("Species")
sp <- ggscatter(iris, x = "Sepal.Length", y = "Sepal.Width",
                color = "Species", palette = "jco",
                size = 3, alpha = 0.6)+
  border()                                         
# Marginal density plot of x (top panel) and y (right panel)
xplot <- ggdensity(iris, "Sepal.Length", fill = "Species",
                   palette = "jco")
yplot <- ggdensity(iris, "Sepal.Width", fill = "Species", 
                   palette = "jco")+
  rotate()
# Cleaning the plots
sp <- sp + rremove("legend")
yplot <- yplot + clean_theme() + rremove("legend")
xplot <- xplot + clean_theme() + rremove("legend")
# Arranging the plot using cowplot
library(cowplot)
plot_grid(xplot, NULL, sp, yplot, ncol = 2, align = "hv", 
          rel_widths = c(2, 1), rel_heights = c(1, 2))

Adding marginal box plots:

# Scatter plot colored by groups ("Species")
sp <- ggscatter(iris, x = "Sepal.Length", y = "Sepal.Width",
                color = "Species", palette = "jco",
                size = 3, alpha = 0.6, ggtheme = theme_bw())             
# Marginal boxplot of x (top panel) and y (right panel)
xplot <- ggboxplot(iris, x = "Species", y = "Sepal.Length", 
                   color = "Species", fill = "Species", palette = "jco",
                   alpha = 0.5, ggtheme = theme_bw())+
  rotate()
yplot <- ggboxplot(iris, x = "Species", y = "Sepal.Width",
                   color = "Species", fill = "Species", palette = "jco",
                   alpha = 0.5, ggtheme = theme_bw())
# Cleaning the plots
sp <- sp + rremove("legend")
yplot <- yplot + clean_theme() + rremove("legend")
xplot <- xplot + clean_theme() + rremove("legend")
# Arranging the plot using cowplot
library(cowplot)
plot_grid(xplot, NULL, sp, yplot, ncol = 2, align = "hv", 
          rel_widths = c(2, 1), rel_heights = c(1, 2))

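One drawback of the plots above is the extra space between the main plot and the marginal plots, which does not look good. One solution is as follows: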

library(cowplot) 
# Main plot
pmain <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species))+
  geom_point()+
  ggpubr::color_palette("jco")
# Marginal density plot along the x axis
xdens <- axis_canvas(pmain, axis = "x")+
  geom_density(data = iris, aes(x = Sepal.Length, fill = Species),
              alpha = 0.7, size = 0.2)+
  ggpubr::fill_palette("jco")
# Marginal density plot along the y axis
# coord_flip = TRUE must be set when the canvas is used with coord_flip()
ydens <- axis_canvas(pmain, axis = "y", coord_flip = TRUE)+
  geom_density(data = iris, aes(x = Sepal.Width, fill = Species),
                alpha = 0.7, size = 0.2)+
  coord_flip()+
  ggpubr::fill_palette("jco")
p1 <- insert_xaxis_grob(pmain, xdens, grid::unit(.2, "null"), position = "top")
p2<- insert_yaxis_grob(p1, ydens, grid::unit(.2, "null"), position = "right")
ggdraw(p2)

Reference

  • Perfect Scatter Plots with Correlation and Marginal Histograms
