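An earlier post shared this link: https://simplystatistics.org/2019/08/28/you-can-replicate-almost-any-plot-with-ggplot2/
It contains five worked examples of plotting with ggplot2 in R, with both data and code provided, which makes it excellent learning material. The code is fairly long, though, and beginners may find it hard to follow. I hope to break all of it down in later posts. Today's post walks through the code for the bar chart.
First, a small tip.
By default, ggplot2 draws axis ticks pointing outward: downward on the x-axis and to the left on the y-axis. How do you make them point upward and to the right instead, that is, into the plot? Someone has asked me this before.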
library(ggplot2)
library(ggstar)
ggplot() +
  geom_star(aes(x = 1, y = 1),
            size = 100,
            starshape = 16,
            fill = "red") +
  theme_bw() +
  theme(axis.ticks.length.x = unit(-1, 'cm'),   # negative length: x ticks point up
        plot.margin = unit(c(1, 1, 2, 1), 'cm'),
        axis.text.x = element_text(vjust = -20),
        axis.title.x = element_text(vjust = -20),
        axis.ticks.length.y = unit(-1, 'cm'),   # negative length: y ticks point right
        axis.text.y = element_text(margin = margin(0, 1.2, 0, 0, 'cm')))
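The key is axis.ticks.length.x = unit(-1,'cm'): setting the tick length to a negative value is enough to flip the ticks inward.
One problem remains. The x-axis text and title can be moved up and down with the vjust argument, but the y-axis text could not be moved left or right with hjust, and I don't know why. That is why the code above shifts the y-axis text with margin instead.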
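As a smaller illustration of the same idea, the sketch below flips the ticks inward on an ordinary scatter plot. It assumes the built-in mtcars data instead of ggstar, and the tick length and margin values are arbitrary choices rather than values from the original post. Here the labels on both axes are moved with margin, which sidesteps the vjust/hjust question.

```r
library(ggplot2)

# Illustrative only: mtcars and all numeric values here are assumptions.
p_ticks <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  theme_bw() +
  theme(
    # A negative length draws the ticks inside the panel.
    axis.ticks.length = unit(-0.25, "cm"),
    # Push the labels away from the axis line with margins.
    axis.text.x = element_text(margin = margin(t = 0.4, unit = "cm")),
    axis.text.y = element_text(margin = margin(r = 0.4, unit = "cm"))
  )

p_ticks
```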
Now let's reproduce the bar chart mentioned at the start.
First the data, which comes from the dslabs R package.
Install it with install.packages("dslabs").
Load the dataset:
library(dslabs)
data("nyc_regents_scores")
Add a total column to the dataset by summing every column except the first across each row:
library(dplyr)
nyc_regents_scores %>% head()
nyc_regents_scores$total <- rowSums(nyc_regents_scores[,-1], na.rm=TRUE)
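To see what the rowSums() call does, here is a minimal sketch on a made-up data frame. The column names test_a and test_b and all values are hypothetical and only mimic the shape of the real data (a score column followed by count columns).

```r
# Hypothetical toy data: one score column plus two per-test count columns.
toy <- data.frame(
  score  = c(60, 65, NA),
  test_a = c(10, NA, 3),
  test_b = c(5, 7, 1)
)

# Drop the first column (score), then sum each row.
# na.rm = TRUE skips missing counts instead of returning NA.
toy$total <- rowSums(toy[, -1], na.rm = TRUE)

# The totals are 15, 7 and 4.
toy
```

Without na.rm = TRUE, the second row would get a total of NA because test_a is missing there.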
Filter the dataset:
drop every row whose score is missing.
new_df <- nyc_regents_scores %>%
  filter(!is.na(score))
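The filtering step can be checked on a small example. The sketch below uses a hypothetical toy data frame and compares the dplyr call with a base R equivalent; both should keep the same rows.

```r
library(dplyr)

# Hypothetical toy data with one missing score.
toy <- data.frame(score = c(60, 65, NA), total = c(15, 7, 4))

# dplyr version, as in the post: keep rows whose score is not NA.
kept_dplyr <- toy %>% filter(!is.na(score))

# Base R equivalent using logical indexing.
kept_base <- toy[!is.na(toy$score), ]

# Two of the three rows survive in both versions.
nrow(kept_dplyr)
nrow(kept_base)
```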
The most basic bar chart:
new_df %>%
  ggplot(aes(score, total)) +
  geom_bar(stat = "identity",
           color = "black",
           fill = "#C4843C")
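ggplot2 also provides geom_col(), which is shorthand for geom_bar(stat = "identity"). The sketch below shows it on a hypothetical toy data frame; swapping it into the post's code should give the same bars, though the original post uses geom_bar().

```r
library(ggplot2)

# Hypothetical toy data standing in for new_df.
toy <- data.frame(score = c(60, 65, 70), total = c(15, 7, 4))

# geom_col() uses the y values as bar heights directly.
p_col <- ggplot(toy, aes(score, total)) +
  geom_col(color = "black", fill = "#C4843C")

p_col
```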
Add a shaded background over a chosen region:
new_df %>%
  ggplot(aes(score, total)) +
  annotate("rect", xmin = 65,
           xmax = 99,
           ymin = 0,
           ymax = 35000,
           alpha = .5) +
  geom_bar(stat = "identity",
           color = "black",
           fill = "#C4843C")
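Note that the rectangle is added before the bars: ggplot2 draws layers in the order they are added, so the shading ends up behind the bars. The sketch below repeats this on hypothetical toy data and, as a variation that is not in the original post, uses ymax = Inf so the band reaches the top of the panel without hard-coding a height.

```r
library(ggplot2)

# Hypothetical toy data standing in for new_df.
toy <- data.frame(score = c(50, 60, 70, 80, 90), total = c(3, 9, 6, 4, 1))

p_band <- ggplot(toy, aes(score, total)) +
  # Layer 1: shaded band, drawn first so it sits behind the bars.
  # ymax = Inf extends the band to the upper edge of the panel.
  annotate("rect", xmin = 65, xmax = 99, ymin = 0, ymax = Inf, alpha = .5) +
  # Layer 2: the bars, drawn on top of the band.
  geom_col(color = "black", fill = "#C4843C")

p_band
```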
Add text annotations:
new_df %>%
  ggplot(aes(score, total)) +
  annotate("rect", xmin = 65,
           xmax = 99,
           ymin = 0,
           ymax = 35000,
           alpha = .5) +
  geom_bar(stat = "identity",
           color = "black",
           fill = "#C4843C") +
  annotate("text",
           x = 66,
           y = 28000,
           label = "MINIMUM\nREGENTS DIPLOMA\nSCORE IS 65",
           hjust = 0,
           size = 3) +
  annotate("text",
           x = 0,
           y = 12000,
           label = "2010 Regents scores on\nthe five most common tests",
           hjust = 0,
           size = 3)
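Both annotations use hjust = 0, which left-aligns the text so that it starts at the given x position. The sketch below, with made-up coordinates, places three labels at the same x to show how hjust changes the alignment relative to the dashed reference line.

```r
library(ggplot2)

# All coordinates here are arbitrary and only serve the illustration.
p_just <- ggplot() +
  geom_vline(xintercept = 5, linetype = "dashed") +
  # hjust = 0: text starts at x = 5 and runs to the right.
  annotate("text", x = 5, y = 3, label = "hjust = 0", hjust = 0) +
  # hjust = 0.5: text is centered on x = 5.
  annotate("text", x = 5, y = 2, label = "hjust = 0.5", hjust = 0.5) +
  # hjust = 1: text ends at x = 5.
  annotate("text", x = 5, y = 1, label = "hjust = 1", hjust = 1) +
  xlim(0, 10) +
  ylim(0, 4)

p_just
```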
Adjust the axis breaks and position, and add the title and axis labels:
new_df %>%
  ggplot(aes(score, total)) +
  annotate("rect", xmin = 65,
           xmax = 99,
           ymin = 0,
           ymax = 35000,
           alpha = .5) +
  geom_bar(stat = "identity",
           color = "black",
           fill = "#C4843C") +
  annotate("text",
           x = 66,
           y = 28000,
           label = "MINIMUM\nREGENTS DIPLOMA\nSCORE IS 65",
           hjust = 0,
           size = 3) +
  annotate("text",
           x = 0,
           y = 12000,
           label = "2010 Regents scores on\nthe five most common tests",
           hjust = 0,
           size = 3) +
  scale_x_continuous(breaks = seq(5, 95, 5),
                     limits = c(0, 99)) +
  scale_y_continuous(position = "right") +
  ggtitle("Scraping By") +
  xlab("") + ylab("Number of tests")
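The title and axis labels can also be set in a single labs() call. The sketch below shows this on hypothetical toy data; it is an alternative to the ggtitle() + xlab() + ylab() chain, not what the original post uses. One small difference: x = NULL removes the x-axis title entirely, while xlab("") keeps an empty title.

```r
library(ggplot2)

# Hypothetical toy data standing in for new_df.
toy <- data.frame(score = c(50, 60, 70, 80, 90), total = c(3, 9, 6, 4, 1))

p_labs <- ggplot(toy, aes(score, total)) +
  geom_col(color = "black", fill = "#C4843C") +
  # Breaks every 5 points, x-axis restricted to 0..99.
  scale_x_continuous(breaks = seq(5, 95, 5), limits = c(0, 99)) +
  # Move the y-axis to the right-hand side of the panel.
  scale_y_continuous(position = "right") +
  # One call for title and both axis labels.
  labs(title = "Scraping By", x = NULL, y = "Number of tests")

p_labs
```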
Finally, set the theme:
new_df %>%
  ggplot(aes(score, total)) +
  annotate("rect", xmin = 65,
           xmax = 99,
           ymin = 0,
           ymax = 35000,
           alpha = .5) +
  geom_bar(stat = "identity",
           color = "black",
           fill = "#C4843C") +
  annotate("text",
           x = 66,
           y = 28000,
           label = "MINIMUM\nREGENTS DIPLOMA\nSCORE IS 65",
           hjust = 0,
           size = 3) +
  annotate("text",
           x = 0,
           y = 12000,
           label = "2010 Regents scores on\nthe five most common tests",
           hjust = 0,
           size = 3) +
  scale_x_continuous(breaks = seq(5, 95, 5),
                     limits = c(0, 99)) +
  scale_y_continuous(position = "right") +
  ggtitle("Scraping By") +
  xlab("") +
  ylab("Number of tests") +
  theme_minimal() +
  theme(panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        axis.ticks.length = unit(-0.2, "cm"),
        plot.title = element_text(face = "bold"))
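Because the theme settings are just an object, they can be stored once and reused across plots. The sketch below is one way to do that; the name theme_regents and the toy data are hypothetical, and the original post applies the settings inline instead.

```r
library(ggplot2)

# Hypothetical name: bundle the post's theme settings into one object.
theme_regents <- theme_minimal() +
  theme(panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        axis.ticks.length = unit(-0.2, "cm"),
        plot.title = element_text(face = "bold"))

# Hypothetical toy data standing in for new_df.
toy <- data.frame(score = c(50, 60, 70, 80, 90), total = c(3, 9, 6, 4, 1))

# Add the stored theme like any other plot component.
p_theme <- ggplot(toy, aes(score, total)) +
  geom_col(color = "black", fill = "#C4843C") +
  ggtitle("Scraping By") +
  theme_regents

p_theme
```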
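Feel free to follow my WeChat public account, 小明的数据分析笔记本.
It mainly shares: 1. simple examples of data analysis and data visualization in R and Python; 2. reading notes on transcriptomics, genomics, and population genetics papers related to horticultural plants; 3. introductory bioinformatics learning materials and my own study notes.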