Drawing scatter plots with the ggscatter function
Load the required R package
library(ggpubr)
Basic usage:
ggscatter(data, x, y, combine = FALSE, merge = FALSE, color = "black",
fill = "lightgray", palette = NULL, shape = 19, size = 2,
point = TRUE, rug = FALSE, title = NULL, xlab = NULL, ylab = NULL,
facet.by = NULL, panel.labs = NULL, short.panel.labs = TRUE,
add = c("none", "reg.line", "loess"), add.params = list(),
conf.int = FALSE, conf.int.level = 0.95, fullrange = FALSE,
ellipse = FALSE, ellipse.level = 0.95, ellipse.type = "norm",
ellipse.alpha = 0.1, mean.point = FALSE,
mean.point.size = ifelse(is.numeric(size), 2 * size, size),
star.plot = FALSE, star.plot.lty = 1, star.plot.lwd = NULL,
label = NULL, font.label = c(12, "plain"), font.family = "",
label.select = NULL, repel = FALSE, label.rectangle = FALSE,
cor.coef = FALSE, cor.coeff.args = list(), cor.method = "pearson",
cor.coef.coord = c(NULL, NULL), cor.coef.size = 4, ggp = NULL,
show.legend.text = NA, ggtheme = theme_pubr(), ...)
Common arguments:
data # a data frame
x, y #x and y variables for drawing.
combine #logical value. Default is FALSE. Used only when y is a vector containing multiple variables to plot. If TRUE, create a multi-panel plot by combining the plot of y variables.
merge #logical or character value. Default is FALSE. Used only when y is a vector containing multiple variables to plot. If TRUE, merge multiple y variables in the same plotting area. Allowed values include also "asis" (TRUE) and "flip". If merge = "flip", then y variables are used as x tick labels and the x variable is used as grouping variable.
color, fill #point colors.
palette #the color palette to be used for coloring or filling by groups. Allowed values include "grey" for grey color palettes; brewer palettes e.g. "RdBu", "Blues", ...; or custom color palette e.g. c("blue", "red"); and scientific journal palettes from ggsci R package, e.g.: "npg", "aaas", "lancet", "jco", "ucscgb", "uchicago", "simpsons" and "rickandmorty".
shape #point shape. See show_point_shapes.
size #numeric value (e.g.: size = 1). Changes the size of points and outlines.
point #logical value. If TRUE, show points.
rug #logical value. If TRUE, add a marginal rug.
title #plot main title.
xlab #character vector specifying x axis labels. Use xlab = FALSE to hide xlab.
ylab #character vector specifying y axis labels. Use ylab = FALSE to hide ylab.
facet.by #character vector, of length 1 or 2, specifying grouping variables for faceting the plot into multiple panels. Should be in the data.
panel.labs #a list of one or two character vectors to modify facet panel labels. For example, panel.labs = list(sex = c("Male", "Female")) specifies the labels for the "sex" variable. For two grouping variables, you can use for example panel.labs = list(sex = c("Male", "Female"), rx = c("Obs", "Lev", "Lev2") ).
short.panel.labs #whether to shorten facet panel labels. Logical value. Default is TRUE. If TRUE, create short labels for panels by omitting variable names; in other words panels will be labelled only by variable grouping levels.
add #add a fitted line. Allowed values are one of "none", "reg.line" (for adding linear regression line) or "loess" (for adding local regression fitting).
add.params #parameters (color, size, linetype) for the argument 'add'; e.g.: add.params = list(color = "red").
conf.int #logical value. If TRUE, adds a confidence interval around the fitted line.
conf.int.level #level controlling the confidence region. Default is 0.95 (95%). Used only when add != "none" and conf.int = TRUE.
fullrange #should the fit span the full range of the plot, or just the data. Used only when add != "none".
ellipse #logical value. If TRUE, draws ellipses around groups of points.
ellipse.level #the size of the concentration ellipse in normal probability.
ellipse.type #Character specifying frame type. Possible values are 'convex', 'confidence' or types supported by stat_ellipse including one of c("t", "norm", "euclid").
ellipse.alpha #Alpha for ellipse specifying the transparency level of fill color. Use alpha = 0 for no fill color.
mean.point #logical value. If TRUE, group mean points are added to the plot.
mean.point.size #numeric value specifying the size of mean points.
star.plot #logical value. If TRUE, a star plot is generated.
star.plot.lty, star.plot.lwd #line type and line width (size) for star plot, respectively.
label #the name of the column containing point labels. Can be also a character vector with length = nrow(data).
font.label #a vector of length 3 indicating respectively the size (e.g.: 14), the style (e.g.: "plain", "bold", "italic", "bold.italic") and the color (e.g.: "red") of point labels. For example font.label = c(14, "bold", "red"). To specify only the size and the style, use font.label = c(14, "plain").
font.family #character vector specifying font family.
label.select #character vector specifying some labels to show.
repel #a logical value, whether to use ggrepel to avoid overplotting text labels or not.
label.rectangle #logical value. If TRUE, add rectangle underneath the text, making it easier to read.
cor.coef #logical value. If TRUE, correlation coefficient with the p-value will be added to the plot.
cor.coeff.args #a list of arguments to pass to the function stat_cor for customizing the displayed correlation coefficients. For example: cor.coeff.args = list(method = "pearson", label.x.npc = "right", label.y.npc = "top").
cor.method #method for computing correlation coefficient. Allowed values are one of "pearson", "kendall", or "spearman".
cor.coef.coord #numeric vector, of length 2, specifying the x and y coordinates of the correlation coefficient. Default values are NULL.
cor.coef.size #correlation coefficient text font size.
ggp #a ggplot. If not NULL, points are added to an existing plot.
show.legend.text #logical. Should text be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes.
ggtheme #function, ggplot2 theme name. Default value is theme_pubr(). Allowed values include ggplot2 official themes: theme_gray(), theme_bw(), theme_minimal(), theme_classic(), theme_void(), ....
... #other arguments to be passed to geom_point and ggpar.
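The faceting arguments described above (facet.by, short.panel.labs) are not used in the examples of this post. The following is a minimal sketch of how they might be combined, assuming a ggpubr version whose ggscatter accepts facet.by as in the usage shown earlier. The object name p_facet and the "jco" palette are illustrative choices.

```r
library(ggpubr)

data("mtcars")
df <- mtcars
df$cyl <- as.factor(df$cyl)

# One panel per level of cyl. With short.panel.labs = FALSE the panel
# strips keep the variable name instead of showing only the level.
p_facet <- ggscatter(df, x = "wt", y = "mpg",
                     color = "cyl", palette = "jco",
                     facet.by = "cyl",
                     short.panel.labs = FALSE)
p_facet
```

Passing a character vector of length 2 to facet.by should give a grid of panels over two grouping variables, as described for that argument.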
Examples:
# Load data
data("mtcars")
df <- mtcars
df$cyl <- as.factor(df$cyl)
head(df)
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
## Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
## Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
## Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
## Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
## Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
# Basic plot
p1 <- ggscatter(df, x = "wt", y = "mpg",
color = "red")
p1
p2 <- ggscatter(df, x = "wt", y = "mpg",
color = "black", shape = 21, size = 3, # Points color, shape and size
add = "reg.line", # Add regression line
add.params = list(color = "blue", fill = "lightgray"), # Customize reg. line
conf.int = TRUE, # Add confidence interval
cor.coef = TRUE, # Add correlation coefficient. see ?stat_cor
cor.coeff.args = list(method = "pearson", label.x = 3, label.sep = "\n")
)
p2
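To see which numbers a plot like p2 summarises, the fit and the correlation can be computed directly in base R. This is only a sketch for checking the values by hand, on the assumption that add = "reg.line" corresponds to an ordinary least-squares line and that cor.coef reports a Pearson test. It is not a description of ggpubr's internals.

```r
data("mtcars")
df <- mtcars

# Ordinary least-squares fit of mpg on wt (the kind of model a linear
# regression line represents)
fit <- lm(mpg ~ wt, data = df)
coef(fit)                      # intercept and slope

# Pearson correlation between wt and mpg, with its p-value
ct <- cor.test(df$wt, df$mpg, method = "pearson")
r <- unname(ct$estimate)
p_value <- ct$p.value
round(r, 2)                    # -0.87
```

A negative slope and a correlation close to -1 agree with the downward trend visible in the scatter plot.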
# loess method: local regression fitting
p3 <- ggscatter(df, x = "wt", y = "mpg",
add = "loess", conf.int = TRUE,
cor.coef = TRUE, # Add correlation coefficient. see ?stat_cor
cor.coeff.args = list(method = "spearman", label.x = 3, label.sep = "\n")
)
p3
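The Spearman coefficient requested in p3 is the Pearson correlation of the ranks, which can be verified in base R. The loess() call below is an assumed stand-in for the smoother that add = "loess" draws, using the default settings of stats::loess. The settings actually used by the plot may differ.

```r
data("mtcars")
df <- mtcars

# Spearman's rho computed directly and via ranks
rho <- cor(df$wt, df$mpg, method = "spearman")
rho_from_ranks <- cor(rank(df$wt), rank(df$mpg))
all.equal(rho, rho_from_ranks)   # TRUE

# Local regression of mpg on wt with default settings
lo <- loess(mpg ~ wt, data = df)
fitted_mpg <- predict(lo)        # one fitted value per observation
head(fitted_mpg)
```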
# Control point size by continuous variable values ("qsec")
p4 <- ggscatter(df, x = "wt", y = "mpg",
color = "#00AFBB", size = "qsec")
p4
# Change colors
# Use custom color palette
p5 <- ggscatter(df, x = "wt", y = "mpg", color = "cyl", size = "qsec",
palette = c("#00AFBB", "#E7B800", "#FC4E07") )
p5
# Add marginal rug
p6 <- ggscatter(df, x = "wt", y = "mpg", color = "cyl", rug=TRUE,
palette = c("#00AFBB", "#E7B800", "#FC4E07") )
p6
# Add group ellipses and mean points
# Add stars
p7 <- ggscatter(df, x = "wt", y = "mpg",
color = "cyl", shape = "cyl",
palette = c("#00AFBB", "#E7B800", "#FC4E07"),
ellipse = TRUE)
p7
p8 <- ggscatter(df, x = "wt", y = "mpg",
color = "cyl", shape = "cyl",
palette = c("#00AFBB", "#E7B800", "#FC4E07"),
ellipse = TRUE, ellipse.type = "convex",
mean.point = TRUE
)
p8
p9 <- ggscatter(df, x = "wt", y = "mpg",
color = "cyl", shape = "cyl",
palette = c("#00AFBB", "#E7B800", "#FC4E07"),
ellipse = TRUE, ellipse.type = 'confidence',
mean.point = TRUE,
star.plot = TRUE)
p9
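The mean points in p8 and p9 mark the centre of each cyl group. A short base R sketch of those centres follows, assuming they are the arithmetic means of wt and mpg within each group. The name centres is illustrative.

```r
data("mtcars")
df <- mtcars
df$cyl <- as.factor(df$cyl)

# Mean wt and mpg within each cyl group: one (x, y) centre per group
centres <- aggregate(cbind(wt, mpg) ~ cyl, data = df, FUN = mean)
centres
```

With star.plot = TRUE, each point is joined to the centre of its own group by a line segment, so these three coordinates are where the segments of each star meet.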
# Textual annotation
df$name <- rownames(df)
p10 <- ggscatter(df, x = "wt", y = "mpg",
color = "cyl", palette = c("#00AFBB", "#E7B800", "#FC4E07"),
label = "name")
p10
p11 <- ggscatter(df, x = "wt", y = "mpg",
color = "cyl", palette = c("#00AFBB", "#E7B800", "#FC4E07"),
label = "name", repel = TRUE)
p11
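When labelling every point makes the plot crowded, the label.select and font.label arguments described above can restrict and style the labels. This is a minimal sketch. The two car names and the font settings are arbitrary choices for illustration, and p_sel is a hypothetical object name.

```r
library(ggpubr)

data("mtcars")
df <- mtcars
df$cyl <- as.factor(df$cyl)
df$name <- rownames(df)

# Show labels for two selected cars only, in bold 10-point text
p_sel <- ggscatter(df, x = "wt", y = "mpg",
                   color = "cyl", palette = c("#00AFBB", "#E7B800", "#FC4E07"),
                   label = "name", repel = TRUE,
                   label.select = c("Toyota Corolla", "Fiat 128"),
                   font.label = c(10, "bold"))
p_sel
```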
Reference:
https://www.rdocumentation.org/packages/ggpubr/versions/0.1.4/topics/ggscatter
sessionInfo()
## R version 3.5.1 (2018-07-02)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: OS X El Capitan 10.11.3
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
##
## locale:
## [1] zh_CN.UTF-8/zh_CN.UTF-8/zh_CN.UTF-8/C/zh_CN.UTF-8/zh_CN.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] bindrcpp_0.2.2 ggpubr_0.1.7.999 magrittr_1.5 ggplot2_3.0.0
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.12.18 rstudioapi_0.7 bindr_0.1.1 knitr_1.20
## [5] tidyselect_0.2.4 munsell_0.5.0 colorspace_1.3-2 R6_2.2.2
## [9] rlang_0.2.2 stringr_1.3.1 plyr_1.8.4 dplyr_0.7.6
## [13] tools_3.5.1 grid_3.5.1 gtable_0.2.0 withr_2.1.2
## [17] htmltools_0.3.6 assertthat_0.2.0 yaml_2.2.0 lazyeval_0.2.1
## [21] rprojroot_1.3-2 digest_0.6.16 tibble_1.4.2 crayon_1.3.4
## [25] purrr_0.2.5 ggrepel_0.8.0 glue_1.3.0 evaluate_0.11
## [29] rmarkdown_1.10 labeling_0.3 stringi_1.2.4 compiler_3.5.1
## [33] pillar_1.3.0 scales_1.0.0 backports_1.1.2 pkgconfig_2.0.2