Learning to plot from Cell: a scatter plot with a fitted line and text labels in R with ggplot2

Paper

https://www.sciencedirect.com/science/article/pii/S0092867421008916#da0010

Ancient and modern genomes unravel the evolutionary history of the rhinoceros family

Local copy of the paper: 1-s2.0-S0092867421008916-main.pdf

Download link for the data and code

https://github.com/liushanlin/rhinoceros-comparative-genome

In today's post we reproduce Figure 5 from the paper.

The data come from Table S4 of the paper.

Load the required R packages

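library(readxl)     # read_excel() for the .xlsx supplementary table
library(tidyverse)  # dplyr verbs and the %>% pipe (also attaches ggplot2)
library(ggplot2)    # plotting
library(ggrepel)    # geom_text_repel() for non-overlapping labels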

Reshape the data into the format needed for plotting


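# Read the table, skipping the first row of the sheet, keep the columns
# needed for the plot, and give them short names
df <- read_excel("mmc4.xlsx",
                 skip = 1) %>%
  select(2, 5, 9, 10, 11) %>%
  rename(V2 = `Common name`,
         V1 = `conservation status`,
         V3 = `# Missense`,
         V4 = `# LoF mutation`,
         V7 = `# Silent`) %>%
  # Per-row ratios: missense / silent and LoF / silent
  mutate(V5 = V3 / V7,
         V6 = V4 / V7) %>%
  select(V1, V2, V5, V6) %>%
  # Average the ratios within each status / species combination
  group_by(V1, V2) %>%
  summarise(V5 = mean(V5),
            V6 = mean(V6),
            .groups = "drop") %>%
  # V3 is reused here as a two-level group: A = lower concern, B = everything else.
  # Note: a "Data deficient" status is not matched below and would fall into B.
  mutate(V3 = case_when(
    V1 %in% c("Least Concern", "Least concern", "Near Threatened") ~ "A",
    TRUE ~ "B"
  ))
head(df)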
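
The grouping step above can be checked on a small toy table before running it on the real file. The sketch below is only an illustration: the rows, the column names (`status`, `species`, `missense`, `lof`, `silent`, `group`) and the values are all invented, and lower-casing the status before matching is an assumption about how one might make the comparison tolerant of the mixed capitalisation seen in the original condition. It also assumes that "Data deficient" belongs in group A, as the legend labels in the plotting code suggest.

```r
library(dplyr)

# Hypothetical toy rows standing in for Table S4 (all values are invented)
toy <- data.frame(
  status   = c("Least Concern", "Least concern", "Near Threatened",
               "Data Deficient", "Vulnerable", "Critically Endangered"),
  species  = c("sp1", "sp1", "sp2", "sp3", "sp4", "sp5"),
  missense = c(10, 20, 30, 40, 50, 60),
  lof      = c(1, 2, 3, 4, 5, 6),
  silent   = c(20, 20, 60, 80, 50, 120)
)

# Assumption: statuses in group A, written in lower case for matching
low_concern <- c("least concern", "data deficient", "near threatened")

toy_summary <- toy %>%
  mutate(status_lc = tolower(status),        # normalise capitalisation
         ms_ratio  = missense / silent,      # missense / silent per row
         lof_ratio = lof / silent) %>%       # LoF / silent per row
  group_by(status_lc, species) %>%
  summarise(ms_ratio  = mean(ms_ratio),
            lof_ratio = mean(lof_ratio),
            .groups = "drop") %>%
  mutate(group = if_else(status_lc %in% low_concern, "A", "B"))

print(toy_summary)
table(toy_summary$group)   # how many species end up in each group
```

Because the status is lower-cased first, the two spellings of "Least Concern" for `sp1` collapse into one row, which the original `group_by(V1, V2)` would keep apart.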

Plotting code

# Open a PDF device; the plot printed before dev.off() is written to output.pdf
pdf(file = "output.pdf",
    width = 10,
    height = 8,
    family = "serif")
plota <- ggplot(data = df, aes(x = V5, y = V6)) +
  geom_point(aes(color = V3, shape = V3), size = 4) +
  # Species labels that repel each other to avoid overlap
  geom_text_repel(aes(label = V2), size = 4) +
  scale_x_continuous(name = "mean rate of Missense / Silent") +
  scale_y_continuous(name = "mean value of LoF mutation rate") +
  # Linear fit across all points, without a confidence band
  # (linewidth needs ggplot2 >= 3.4.0; older versions use size instead)
  geom_smooth(method = "lm",
              formula = y ~ x,
              color = "black",
              linewidth = 1, se = FALSE) +
  # Identical labels in both scales so the shape and color legends merge
  scale_shape_manual(values = c(15, 19),
                     labels = c("Least concern, Data deficient, Near threatened",
                                "Vulnerable, Endangered, Critically endangered")) +
  scale_color_manual(values = c("#999999", "#D55E00"),
                     labels = c("Least concern, Data deficient, Near threatened",
                                "Vulnerable, Endangered, Critically endangered")) +
  # R^2 and P value are hard-coded here, not computed from df
  annotate("text", x = 0.6, y = 0.03,
           label = "atop(italic(R) ^ 2 == 0.61, 'P value = 9.43e-5')",
           parse = TRUE, size = 6) +
  theme(panel.background = element_blank(),
        panel.grid = element_blank(),
        axis.line = element_line(),
        axis.text = element_text(size = 12),
        axis.title = element_text(size = 12),
        # Position inside the panel; ggplot2 >= 3.5.0 prefers legend.position.inside
        legend.position = c(0.3, 0.9),
        legend.title = element_blank())
print(plota)
dev.off()
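
The R² and P value in the annotation are typed in as fixed numbers. One way to avoid hard-coding them is to fit the same `y ~ x` linear model with `lm()` and build the label string from its summary. The sketch below uses an invented toy data frame in place of `df`, and the names `toy`, `fit`, `r2`, `p_value` and `label` are illustrative. It is not confirmed that the values in the paper were obtained from this simple regression, so numbers computed this way on the real data may not match the figure.

```r
# Hypothetical toy data standing in for df (V5 = x axis, V6 = y axis); values invented
toy <- data.frame(
  V5 = c(0.40, 0.45, 0.50, 0.55, 0.60, 0.65),
  V6 = c(0.010, 0.014, 0.013, 0.019, 0.021, 0.024)
)

# Same model form that geom_smooth(method = "lm", formula = y ~ x) fits
fit <- lm(V6 ~ V5, data = toy)
fit_summary <- summary(fit)

r2      <- fit_summary$r.squared                    # coefficient of determination
p_value <- coef(fit_summary)["V5", "Pr(>|t|)"]      # p-value of the slope

# Plotmath string in the same atop() layout as the hard-coded label
label <- sprintf("atop(italic(R)^2 == %.2f, 'P value = %.2e')", r2, p_value)
print(label)
```

The resulting string can be passed as `label = label` to `annotate("text", ..., parse = TRUE)` in place of the literal text.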

The final figure is written to output.pdf.

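You are welcome to follow my WeChat official account, 小明的数据分析笔记本.

The account mainly shares: 1. simple examples of data analysis and data visualization in R and Python; 2. reading notes on transcriptomics, genomics and population genetics literature for horticultural plants; 3. introductory bioinformatics learning material and my own study notes.

The example data and code are available on request: leave a comment to add me on WeChat.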
