FigDraw 9. Venn Diagrams (Vennplot) for SCI Paper Figures

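This installment is about the Venn diagram. The plot is simple, but it turns up in papers all the time, so let's look at how to draw the kind of Venn diagram seen in CNS-level articles.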


Preface

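A Venn diagram shows the mathematical or logical relationships between different groups of things (sets). It is especially suited to conveying the approximate relationships between sets (or classes), and it is often used to help derive, or follow the derivation of, rules about set (or class) operations. The number of sets is usually between 2 and 7.

The goal here is to draw publication-style Venn diagrams, from two sets up to higher-dimensional cases.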

1. Package installation

if (!require(VennDiagram)) install.packages("VennDiagram")
if (!require(venn)) install.packages("venn")
if (!require(UpSetR)) install.packages("UpSetR")

library(VennDiagram)
library(venn)
library(UpSetR)

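2. Reading the data

Two input formats are supported, described below:

  1. The usual Venn-diagram format: the first row holds the group names. It is required, and the names appear in the plot. Each column is one group.

  2. A quantification matrix: each row is a gene and each column is a sample. Row names and column names are both required, and the values are the measured quantities.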
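
For the second format, the matrix first has to be turned into one set of genes per sample before any Venn function can use it. A minimal sketch, assuming a gene counts as "present" in a sample when its value exceeds a cutoff; the cutoff of 0, the toy matrix, and the helper name matrix_to_sets are illustrative assumptions, not part of the original workflow:

```r
# Toy quantification matrix: rows are genes, columns are samples (made-up values)
quant <- matrix(
  c(5, 0, 2,
    0, 3, 1,
    0, 0, 4),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("gene1", "gene2", "gene3"), c("s1", "s2", "s3"))
)

# Hypothetical helper: one character vector of "present" genes per sample
matrix_to_sets <- function(m, cutoff = 0) {
  present <- m > cutoff  # logical matrix with the same dimnames as m
  sets <- lapply(seq_len(ncol(m)), function(j) rownames(m)[present[, j]])
  names(sets) <- colnames(m)
  sets
}

sets <- matrix_to_sets(quant)
str(sets)
```

The resulting named list has the same shape as the list built from the first format, so it could be passed on in the same way. The right cutoff depends on the data and is left open here.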


We use the first format here:

dat <- read.table("flower.txt", header = TRUE, sep = "\t")
head(dat)
##         c1       c2       c3       c4       c5       c6       c7       c8
## 1 gene1193 gene1253 gene1236 gene1325 gene1246 gene1414 gene1259 gene1249
## 2 gene1194 gene1254 gene1241 gene1327 gene1247 gene1416 gene1260 gene1250
## 3 gene1195 gene1255 gene1243 gene1328 gene1248 gene1417 gene1261 gene1251
## 4 gene1197 gene1256 gene1244 gene1329 gene1249 gene1421 gene1262 gene1253
## 5 gene1199 gene1257 gene1246 gene1330 gene1250 gene1422 gene1263 gene1256
## 6 gene1202 gene1259 gene1247 gene1331 gene1251 gene1425 gene1265 gene1258
dim(dat)
## [1] 1662    8
venn_list <- as.list(dat[, -8])  # drop the 8th column, keeping seven sets
# Inspect the intersections and export the results
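
The comment above mentions inspecting and exporting the intersections, but no code for that step is shown. A minimal sketch of one way to do it in base R follows. The toy data frame stands in for dat, and the assumption that shorter columns are padded with empty strings or NA is mine; the output file is a temporary one chosen for illustration:

```r
# Toy stand-in for `dat`: columns of unequal length padded with "" or NA (assumed layout)
toy <- data.frame(
  c1 = c("g1", "g2", "g3"),
  c2 = c("g2", "g3", ""),
  c3 = c("g3", "g4", NA),
  stringsAsFactors = FALSE
)

# Drop padding and duplicates so that each list element is a clean set
sets <- lapply(as.list(toy), function(x) {
  x <- as.character(x)
  unique(x[!is.na(x) & nzchar(x)])
})

# Genes shared by every set
common <- Reduce(intersect, sets)
print(common)

# Export the shared genes; a temporary file stands in for a real output path
out_file <- tempfile(fileext = ".txt")
write.table(data.frame(gene = common), file = out_file,
            sep = "\t", quote = FALSE, row.names = FALSE)
```

Cleaning the padding first matters because an empty string would otherwise be counted as a gene in every shorter column.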

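3. Drawing multi-set Venn diagrams

Both venn and VennDiagram are classic, very capable packages for drawing Venn diagrams. venn handles 2 to 7 sets and VennDiagram handles 2 to 5, and each has its own look. With more sets than that, this kind of plot is no longer suitable. The examples below are particularly well suited to showing the sizes of differentially expressed gene sets after multi-group comparisons in transcriptome studies; anyone planning that kind of analysis can follow them to get attractive figures.

Note: with venn, a single function, venn(), is used throughout. With VennDiagram, the draw.* functions used here change with the number of sets, and the intersection sizes have to be counted beforehand. That is manageable for 2 or 3 sets, but with more sets counting by hand becomes tedious and is better done with a small looping script. More than 7 sets call for a different representation, which the UpSetR package provides.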
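
The "small looping script" for counting intersections could look like the sketch below. It is only an illustration in base R: the helper name intersection_sizes and the toy sets are made up, and the result names simply mimic the area1 / n12 / n123 argument names that appear in the draw.* calls in this post.

```r
# Hypothetical helper: size of every 1-way, 2-way, ..., k-way intersection
# of a named list of sets. Result names mimic the draw.*.venn arguments.
intersection_sizes <- function(sets) {
  sizes <- integer(0)
  for (k in seq_along(sets)) {
    # every combination of k set indices
    for (combo in combn(length(sets), k, simplify = FALSE)) {
      key <- if (k == 1) {
        paste0("area", combo)
      } else {
        paste0("n", paste(combo, collapse = ""))
      }
      sizes[key] <- length(Reduce(intersect, lapply(sets[combo], unique)))
    }
  }
  sizes
}

# Toy sets (made-up gene IDs)
toy_sets <- list(
  a = c("g1", "g2", "g3", "g4"),
  b = c("g2", "g3", "g5"),
  c = c("g3", "g4", "g5", "g6")
)

sizes <- intersection_sizes(toy_sets)
print(sizes)
```

Because the names line up with the function arguments, the vector could in principle be handed to a three-, four-, or five-set draw.* function through do.call(), for example do.call(draw.triple.venn, c(as.list(sizes), list(category = names(toy_sets)))). The two-set function names its overlap argument cross.area rather than n12, so that case would need renaming.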

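2. Two-set Venn diagram

The two-set Venn diagram is the most common one and the easiest to understand and draw. We again use both packages:

A. venn {venn}

The venn function takes a list as input, so the data frame has to be converted to a list first, which as.list() does directly: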

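venn2List <- as.list(dat[, 1:2])

cross <- venn(venn2List,
     zcolor = "style", # zone colors: "style" uses the package's preset colors, "bw" leaves zones unfilled; custom colors also work
     opacity = 0.3,    # transparency of the zone colors
     box = FALSE,      # whether to draw a box around the diagram
     ilcs = 0.5,       # text size of the intersection counts
     sncs = 1          # text size of the set names
)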

B. draw.pairwise.venn {VennDiagram}

Creates a Venn diagram with two sets. An Euler diagram is drawn instead when the data meet certain conditions.

cross
##       c1 c2 counts
##        0  0      0
## c2     0  1    756
## c1     1  0    777
## c1:c2  1  1    885
venn.plot <- draw.pairwise.venn(
    area1 = sum(cross$counts[cross$c1 == 1]),
    area2 = sum(cross$counts[cross$c2 == 1]),
    cross.area = sum(cross$counts[cross$c1 == 1 & cross$c2 == 1]),
    category = colnames(dat[, 1:2]),
    fill = c("blue", "red"), lty = "blank", cex = 2, cat.cex = 2,
    cat.pos = c(285, 105), cat.dist = 0.09,
    cat.just = list(c(-1, -1), c(1, 1)),
    ext.pos = 30, ext.dist = -0.05, ext.length = 0.85,
    ext.line.lwd = 2, ext.line.lty = "dashed")
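
One point that is easy to miss: draw.pairwise.venn wants the total size of each set, not the "only in this set" counts that venn() reports per zone. A small worked check of that arithmetic, using the counts printed above (the data frame cross2 is rebuilt by hand here purely for illustration):

```r
# Counts table for the two-set case, values copied from the printed output
cross2 <- data.frame(
  c1     = c(0, 0, 1, 1),
  c2     = c(0, 1, 0, 1),
  counts = c(0, 756, 777, 885)
)

area1      <- sum(cross2$counts[cross2$c1 == 1])                   # 777 + 885
area2      <- sum(cross2$counts[cross2$c2 == 1])                   # 756 + 885
cross_area <- sum(cross2$counts[cross2$c1 == 1 & cross2$c2 == 1])  # 885

print(c(area1 = area1, area2 = area2, cross.area = cross_area))
```

So the first set holds 1662 genes in total, the second 1641, and 885 are shared.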

3. Three-set Venn diagram

A. venn {venn}

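venn3List <- as.list(dat[, 1:3])

cross <- venn(venn3List,
     zcolor = "style", # zone colors: "style" uses the package's preset colors, "bw" leaves zones unfilled; custom colors also work
     opacity = 0.3,    # transparency of the zone colors
     box = FALSE,      # whether to draw a box around the diagram
     ilcs = 0.5,       # text size of the intersection counts
     sncs = 1          # text size of the set names
)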

B. draw.triple.venn {VennDiagram}

Creates a Venn diagram with three sets. An Euler diagram is drawn instead when the data meet certain conditions.

cross
##          c1 c2 c3 counts
##           0  0  0      0
## c3        0  0  1    477
## c2        0  1  0    596
## c2:c3     0  1  1    160
## c1        1  0  0    581
## c1:c3     1  0  1    196
## c1:c2     1  1  0    255
## c1:c2:c3  1  1  1    630
venn.plot <- draw.triple.venn(
    area1 = sum(cross$counts[cross$c1 == 1]),
    area2 = sum(cross$counts[cross$c2 == 1]),
    area3 = sum(cross$counts[cross$c3 == 1]),
    # pairwise totals include the triple overlap, so filter on the membership columns
    n12 = sum(cross$counts[cross$c1 == 1 & cross$c2 == 1]),
    n23 = sum(cross$counts[cross$c2 == 1 & cross$c3 == 1]),
    n13 = sum(cross$counts[cross$c1 == 1 & cross$c3 == 1]),
    n123 = sum(cross$counts[cross$c1 == 1 & cross$c2 == 1 & cross$c3 == 1]),
    category = colnames(dat[, 1:3]),
    fill = c("blue", "red", "green"), lty = "blank", cex = 2, cat.cex = 2,
    cat.col = c("blue", "red", "green"))

4. Four-set Venn diagram

A. venn {venn}

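venn4List <- as.list(dat[, 1:4])
cross <- venn(venn4List,
     zcolor = "style", # zone colors: "style" uses the package's preset colors, "bw" leaves zones unfilled; custom colors also work
     opacity = 0.3,    # transparency of the zone colors
     box = FALSE,      # whether to draw a box around the diagram
     ilcs = 0.5,       # text size of the intersection counts
     sncs = 1          # text size of the set names
)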

B. draw.quad.venn {VennDiagram}

Creates a Venn diagram with four sets.

cross
##             c1 c2 c3 c4 counts
##              0  0  0  0      0
## c4           0  0  0  1    368
## c3           0  0  1  0    388
## c3:c4        0  0  1  1     89
## c2           0  1  0  0    507
## c2:c4        0  1  0  1     89
## c2:c3        0  1  1  0    104
## c2:c3:c4     0  1  1  1     56
## c1           1  0  0  0    494
## c1:c4        1  0  0  1     87
## c1:c3        1  0  1  0    114
## c1:c3:c4     1  0  1  1     82
## c1:c2        1  1  0  0    159
## c1:c2:c4     1  1  0  1     96
## c1:c2:c3     1  1  1  0    179
## c1:c2:c3:c4  1  1  1  1    451
# Pairwise and triple totals must include every higher-order overlap, so filter on
# the membership columns instead of matching row names and adding fixed numbers.
venn.plot <- draw.quad.venn(
    area1 = sum(cross$counts[cross$c1 == 1]),
    area2 = sum(cross$counts[cross$c2 == 1]),
    area3 = sum(cross$counts[cross$c3 == 1]),
    area4 = sum(cross$counts[cross$c4 == 1]),
    n12 = sum(cross$counts[cross$c1 == 1 & cross$c2 == 1]),
    n13 = sum(cross$counts[cross$c1 == 1 & cross$c3 == 1]),
    n14 = sum(cross$counts[cross$c1 == 1 & cross$c4 == 1]),
    n23 = sum(cross$counts[cross$c2 == 1 & cross$c3 == 1]),
    n24 = sum(cross$counts[cross$c2 == 1 & cross$c4 == 1]),
    n34 = sum(cross$counts[cross$c3 == 1 & cross$c4 == 1]),
    n123 = sum(cross$counts[cross$c1 == 1 & cross$c2 == 1 & cross$c3 == 1]),
    n124 = sum(cross$counts[cross$c1 == 1 & cross$c2 == 1 & cross$c4 == 1]),
    n134 = sum(cross$counts[cross$c1 == 1 & cross$c3 == 1 & cross$c4 == 1]),
    n234 = sum(cross$counts[cross$c2 == 1 & cross$c3 == 1 & cross$c4 == 1]),
    n1234 = sum(cross$counts[cross$c1 == 1 & cross$c2 == 1 & cross$c3 == 1 & cross$c4 == 1]),
    category = colnames(dat[, 1:4]),
    fill = c("orange", "red", "green", "blue"), lty = "dashed", cex = 2, cat.cex = 2,
    cat.col = c("orange", "red", "green", "blue"))
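
With four sets it is easy to miss a zone when adding up a pairwise total by hand, since each pair is spread over four rows of the counts table. A minimal sketch of a helper that sums over the membership columns instead; the name n_all and the hand-built cross4 table are illustrative assumptions, with the counts copied from the printed output above:

```r
# Four-set counts table, values copied from the printed output
cross4 <- data.frame(
  c1 = c(0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1),
  c2 = c(0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1),
  c3 = c(0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1),
  c4 = c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1),
  counts = c(0, 368, 388, 89, 507, 89, 104, 56,
             494, 87, 114, 82, 159, 96, 179, 451)
)

# Hypothetical helper: total number of elements that belong to all of `members`
n_all <- function(tab, members) {
  keep <- Reduce(`&`, lapply(members, function(s) tab[[s]] == 1))
  sum(tab$counts[keep])
}

n14 <- n_all(cross4, c("c1", "c4"))  # 87 + 82 + 96 + 451
n24 <- n_all(cross4, c("c2", "c4"))  # 89 + 56 + 96 + 451
print(c(n14 = n14, n24 = n24))
```

The same helper also gives the set sizes (n_all(cross4, "c1")) and the higher-order overlaps, so every argument of draw.quad.venn could be derived from one function.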

5. Five-set Venn diagram

A. venn {venn}

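venn5List <- as.list(dat[, 1:5])
cross <- venn(venn5List,
     zcolor = "style", # zone colors: "style" uses the package's preset colors, "bw" leaves zones unfilled; custom colors also work
     opacity = 0.3,    # transparency of the zone colors
     box = FALSE,      # whether to draw a box around the diagram
     ilcs = 0.5,       # text size of the intersection counts
     sncs = 1          # text size of the set names
)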

B. draw.quintuple.venn {VennDiagram}

Creates a Venn diagram with five sets.

# Note: the counts below are fixed example values, not computed from `dat`;
# only the category names are taken from the data.
venn.plot <- draw.quintuple.venn(area1 = 301, area2 = 321, area3 = 311, area4 = 321,
    area5 = 301, n12 = 188, n13 = 191, n14 = 184, n15 = 177, n23 = 194, n24 = 197,
    n25 = 190, n34 = 190, n35 = 173, n45 = 186, n123 = 112, n124 = 108, n125 = 108,
    n134 = 111, n135 = 104, n145 = 104, n234 = 111, n235 = 107, n245 = 110, n345 = 100,
    n1234 = 61, n1235 = 60, n1245 = 59, n1345 = 58, n2345 = 57, n12345 = 31, category = colnames(dat[,
        1:5]), fill = c("dodgerblue", "goldenrod1", "darkorange1", "seagreen3", "orchid3"),
    cat.col = c("dodgerblue", "goldenrod1", "darkorange1", "seagreen3", "orchid3"),
    cat.cex = 2, margin = 0.05, cex = c(1.5, 1.5, 1.5, 1.5, 1.5, 1, 0.8, 1, 0.8,
        1, 0.8, 1, 0.8, 1, 0.8, 1, 0.55, 1, 0.55, 1, 0.55, 1, 0.55, 1, 0.55, 1, 1,
        1, 1, 1, 1.5), ind = TRUE)

6. Six-set Venn diagram

venn {venn}

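venn6List <- as.list(dat[, 1:6])
venn(venn6List,
     zcolor = "style", # zone colors: "style" uses the package's preset colors, "bw" leaves zones unfilled; custom colors also work
     opacity = 0.3,    # transparency of the zone colors
     box = FALSE,      # whether to draw a box around the diagram
     ilcs = 0.5,       # text size of the intersection counts
     sncs = 1          # text size of the set names
)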

7. Seven-set Venn diagram

venn {venn}

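venn7List <- as.list(dat[, 1:7])
cross <- venn(venn7List,
     zcolor = "style", # zone colors: "style" uses the package's preset colors, "bw" leaves zones unfilled; custom colors also work
     opacity = 0.3,    # transparency of the zone colors
     box = FALSE,      # whether to draw a box around the diagram
     ilcs = 0.5,       # text size of the intersection counts
     sncs = 1          # text size of the set names
)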

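8. More than seven sets

With more than seven sets, drawing overlapping closed curves is essentially no longer feasible. Instead, the upset function in the UpSetR package can present the intersections as a matrix of dots with accompanying bar charts. The example uses the movies dataset that ships with UpSetR. We start with 8 sets: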

require(ggplot2)
require(plyr)
require(gridExtra)
require(grid)
movies <- read.csv(system.file("extdata", "movies.csv", package = "UpSetR"), header = TRUE,
    sep = ";")
upset(movies, nsets = 8, nintersects = 30, mb.ratio = c(0.5, 0.5), order.by = c("freq",
    "degree"), decreasing = c(TRUE, FALSE))

To draw 9 sets:

upset(movies, nsets = 9, nintersects = 30, mb.ratio = c(0.5, 0.5), order.by = c("freq",
    "degree"), decreasing = c(TRUE, FALSE))

To draw 10 sets:

upset(movies, nsets = 10, nintersects = 30, mb.ratio = c(0.5, 0.5), order.by = c("freq",
    "degree"), decreasing = c(TRUE, FALSE))
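
The upset() examples use the movies table, which is already a 0/1 membership table, whereas the gene data earlier in this post is a list of sets. A minimal base R sketch of the conversion between the two shapes follows; the helper name sets_to_membership and the toy sets are illustrative assumptions. UpSetR also ships its own converter, fromList(), which serves the same purpose.

```r
# Hypothetical helper: turn a named list of sets into a 0/1 membership table,
# one row per element and one column per set
sets_to_membership <- function(sets) {
  universe <- unique(unlist(sets, use.names = FALSE))
  m <- vapply(sets, function(s) as.integer(universe %in% s),
              integer(length(universe)))
  data.frame(m, row.names = universe, check.names = FALSE)
}

# Toy sets (made-up gene IDs)
toy_sets <- list(
  a = c("g1", "g2", "g3", "g4"),
  b = c("g2", "g3", "g5"),
  c = c("g3", "g4", "g5", "g6")
)

membership <- sets_to_membership(toy_sets)
print(membership)
```

A table of this shape is the kind of input upset() works with, so the gene sets from the earlier sections could be plotted the same way as the movies data.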

Seen together, the plots make for a rather striking set of figures.

