Learning to plot from Science: an R ggplot2 violin plot showing the copy number of NLR genes

The paper is:

De novo assembly, annotation, and comparative analysis of 26 diverse maize genomes


Part of the data and code is public. Download link: https://zenodo.org/record/4781590#.YSB40Hzivic

Local PDF of the paper: 玉米Science.pdf

Local PDF of the supplement: abg5289_Hufford_SM.pdf

In today's post we reproduce Figure S16 from the paper's supplementary materials.


I could not find NLR-violin-col.csv, the dataset used in the code provided with the paper. The dataset the paper actually provides is NLR-violin4.csv.

Part of the dataset is shown below.

[Screenshot: part of NLR-violin4.csv]

First, read in the dataset

# The file is read without a header row, so the columns are named V1, V2, V3
violin2 <- read.table('NLR-violin4.csv', sep = ',', header = FALSE)
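
Since the real file has to be downloaded first, here is a minimal sketch of the same read.table call on a made-up string, just to show what the resulting data frame looks like. The values in csv_text are invented for illustration and are not taken from NLR-violin4.csv.

```r
# Toy stand-in for NLR-violin4.csv (values invented for illustration only)
csv_text <- "12,1,1
15,1,2
30,2,3
28,2,1
110,3,4
125,3,5"

# Same arguments as in the post, but reading from a string instead of a file
toy <- read.table(text = csv_text, sep = ",", header = FALSE)

str(toy)      # shows the automatically assigned column names and their types
head(toy, 3)  # first three rows
```

With header = FALSE, read.table names the columns V1, V2 and V3, which is why the rest of the post refers to the data by those names.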

Convert the second and third columns to factors

violin2$V2 <- as.factor(violin2$V2)
violin2$V3 <- as.factor(violin2$V3)
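
The order of the factor levels matters later, because ggplot2 uses it for the order of the groups on the x-axis and for matching unnamed colours and labels. A small sketch on toy data (the level names "a", "b", "c" are hypothetical) of how to inspect and, if needed, reorder the levels:

```r
# Toy data; the group names are hypothetical
toy <- data.frame(V1 = c(12, 30, 110, 15),
                  V2 = c("b", "c", "a", "b"))

# as.factor() sorts the levels of a character column alphabetically
toy$V2 <- as.factor(toy$V2)
default_levels <- levels(toy$V2)   # "a" "b" "c"

# factor() with an explicit levels argument sets a custom order
toy$V2 <- factor(toy$V2, levels = c("c", "a", "b"))
custom_levels <- levels(toy$V2)    # "c" "a" "b"
```

Running levels(violin2$V2) and levels(violin2$V3) on the real data is a quick way to see which order the later labels and colours will follow.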

The most basic violin plot

library(ggplot2)

plot3 <- ggplot(violin2, aes(x=V2, y=V1)) +
  geom_violin()

plot3

On top of this, add a point that marks the mean

plot3 + 
  stat_summary(fun=mean, geom="point", shape=23, size=2)
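
stat_summary(fun = mean, ...) draws one point per x group at the mean of the y values in that group. As a sanity check, the same per-group means can be computed in base R; the sketch below uses made-up numbers, not the paper's data.

```r
# Toy data (invented values): two groups with two observations each
toy <- data.frame(V1 = c(10, 20, 30, 50),
                  V2 = factor(c("a", "a", "b", "b")))

# Mean of V1 within each level of V2, i.e. where the diamond points would sit
group_means <- tapply(toy$V1, toy$V2, mean)
group_means   # a = 15, b = 40
```

On the real data the equivalent call would be tapply(violin2$V1, violin2$V2, mean).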

Then add jittered scatter points on top of that

plot3 + 
  stat_summary(fun=mean, geom="point", shape=23, size=2)+ 
  geom_jitter(position = position_jitter(height = .3, 
                                         width = .3), 
              aes(colour = factor(V3)),
              size=0.5) 
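
One thing to keep in mind: height = .3 in position_jitter() also moves the points up and down, so the plotted positions no longer sit exactly on the copy-number values. The paper's code uses it as shown above; the variant below is only my own sketch, on invented toy data, of jittering horizontally only and fixing the random seed so the plot is reproducible.

```r
library(ggplot2)

# Toy data (invented values); three observations per group
toy <- data.frame(
  V1 = c(10, 12, 14, 30, 33, 36, 100, 120, 140),
  V2 = factor(rep(c("a", "b", "c"), each = 3)),
  V3 = factor(rep(1:3, times = 3))
)

p_jitter <- ggplot(toy, aes(x = V2, y = V1)) +
  geom_violin() +
  # height = 0 keeps y values exact; seed makes the jitter repeatable
  geom_jitter(aes(colour = V3),
              position = position_jitter(width = 0.3, height = 0, seed = 1),
              size = 0.5)

# Inspect the data actually drawn by the second layer (the jittered points)
jittered <- layer_data(p_jitter, 2)
y_unchanged <- all(sort(jittered$y) == sort(toy$V1))
```

Here y_unchanged is TRUE because only the x positions are perturbed.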

Set custom colours for the scatter points

plot3 + 
  stat_summary(fun=mean, geom="point", shape=23, size=2)+ 
  geom_jitter(position = position_jitter(height = .3, 
                                         width = .3), 
              aes(colour = factor(V3)),
              size=0.5)  +
  scale_colour_manual(name="colour", 
                      values=c("goldenrod1",
                               "royalblue",
                               "gray47",
                               "orchid",
                               "orangered",
                               "limegreen",
                               "brown", 
                               "darkgreen"))
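
An unnamed values vector is matched to the factor levels by position, so the colours depend on the level order of V3. A possibly safer alternative is a named vector, which scale_colour_manual() matches by level name. The sketch below uses hypothetical level names ("x", "y", "z") and invented data; the real levels of V3 would have to be read from levels(violin2$V3).

```r
library(ggplot2)

# Toy data; the level names "x", "y", "z" are hypothetical
toy <- data.frame(V1 = c(10, 30, 100),
                  V2 = factor(c("a", "b", "c")),
                  V3 = factor(c("x", "y", "z")))

# Named colours: the order in this vector does not matter
group_cols <- c(z = "gray47", x = "goldenrod1", y = "royalblue")

p_cols <- ggplot(toy, aes(x = V2, y = V1, colour = V3)) +
  geom_point() +
  scale_colour_manual(name = "colour", values = group_cols)

# Colours assigned to each plotted point
drawn <- layer_data(p_cols, 1)
```

In drawn, the point in the first x group (V3 = "x") gets goldenrod1 even though that colour is listed second in group_cols.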

Finally, change the x-axis and y-axis titles and set a theme

plot3 + 
  stat_summary(fun=mean, geom="point", shape=23, size=2)+ 
  geom_jitter(position = position_jitter(height = .3, 
                                         width = .3), 
              aes(colour = factor(V3)),
              size=0.5)  +
  scale_colour_manual(name="colour", 
                      values=c("goldenrod1",
                               "royalblue",
                               "gray47",
                               "orchid",
                               "orangered",
                               "limegreen",
                               "brown", 
                               "darkgreen"))+
  labs(x="NLR prediction", y="Copy Number of NLRs") +
  theme_minimal() 

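The code provided with the paper ends here, and the result still differs slightly from the final figure in the supplement. Next I add some code to make it look more like the supplementary figure.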

Change the x-axis tick labels and set them in italics

plot3 + 
  stat_summary(fun=mean, geom="point", shape=23, size=2)+ 
  geom_jitter(position = position_jitter(height = .3, 
                                         width = .3), 
              aes(colour = factor(V3)),
              size=0.5)  +
  scale_colour_manual(name="colour", 
                      values=c("goldenrod1",
                               "royalblue",
                               "gray47",
                               "orchid",
                               "orangered",
                               "limegreen",
                               "brown", 
                               "darkgreen"))+
  labs(x="NLR prediction", y="Copy Number of NLRs") +
  theme_minimal() +
  scale_x_discrete(labels=c("A. thaliana",
                            "B. distachyon",
                            "Z. mays"))+
  theme(axis.text.x = element_text(face="italic"))
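
As with the colours, an unnamed labels vector is applied in level order, so a species name could end up on the wrong violin if the levels of V2 are not in the order assumed. A sketch with a named labels vector follows; the level names "at", "bd" and "zm" and the data values are hypothetical and would need to be replaced by the real levels of violin2$V2.

```r
library(ggplot2)

# Toy data; level names and values are invented for illustration
toy <- data.frame(
  V1 = c(10, 12, 14, 30, 33, 36, 100, 120, 140),
  V2 = factor(rep(c("at", "bd", "zm"), each = 3))
)

# Named labels are matched to the levels by name, not by position
species_labels <- c(zm = "Z. mays",
                    at = "A. thaliana",
                    bd = "B. distachyon")

p_labels <- ggplot(toy, aes(x = V2, y = V1)) +
  geom_violin() +
  scale_x_discrete(labels = species_labels) +
  theme_minimal() +
  theme(axis.text.x = element_text(face = "italic"))

p_labels
```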

Change the text labels in the legend

plot3 + 
  stat_summary(fun=mean, geom="point", shape=23, size=2)+ 
  geom_jitter(position = position_jitter(height = .3, 
                                         width = .3), 
              aes(colour = factor(V3)),
              size=0.5)  +
  scale_colour_manual(name=NULL, 
                      values=c("goldenrod1",
                               "royalblue",
                               "gray47",
                               "orchid",
                               "orangered",
                               "limegreen",
                               "brown", 
                               "darkgreen"),
                      labels=c("Stiff stalk",
                               "Nonstiff stalk",
                               "Mixed",
                               "Popcorn",
                               "Sweetcorn",
                               "Tropical",
                               "Accessions",
                               "Accessions"))+
  labs(x="NLR prediction", y="Copy Number of NLRs") +
  theme_minimal() +
  scale_x_discrete(labels=c("A. thaliana",
                            "B. distachyon",
                            "Z. mays"))+
  theme(axis.text.x = element_text(face="italic"))

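This is as far as I can get with code alone. To reproduce the legend of the original figure exactly, for now the only option is to edit the image with another tool.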
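
If the figure is going to be touched up in an external editor, saving it in a vector format keeps the text and legend keys editable. A minimal sketch using ggsave() on a toy plot; the file name, size and data are my own assumptions, not from the paper.

```r
library(ggplot2)

# Toy plot standing in for the final figure (invented data)
toy <- data.frame(
  V1 = c(10, 12, 14, 30, 33, 36, 100, 120, 140),
  V2 = factor(rep(c("a", "b", "c"), each = 3))
)
p_final <- ggplot(toy, aes(x = V2, y = V1)) +
  geom_violin() +
  theme_minimal()

# Write a PDF to a temporary directory; width and height are in inches
out_file <- file.path(tempdir(), "figure_s16_draft.pdf")
ggsave(out_file, plot = p_final, width = 6, height = 4)

file.exists(out_file)
```

The resulting PDF can then be opened in a vector graphics editor to rearrange the legend by hand.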


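The example data and code for today's post can be obtained by sending the message 20211009 to the WeChat public account 小明的数据分析笔记本.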
