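This post is about Venn diagrams. The figure is simple, but it shows up in a great many papers, so today we look at how to draw the kind of Venn diagram seen in CNS-level articles.
A Venn diagram shows the mathematical or logical relationships between different groups of things (sets). It is especially suited to showing the rough relationships between sets (or classes), and it is often used to help derive, or follow the derivation of, rules about set (or class) operations. The number of sets is usually between 2 and 7.
We want to produce the Venn diagram below, as well as versions with more sets.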
if (!require(VennDiagram)) install.packages("VennDiagram")
if (!require(venn)) install.packages("venn")
if (!require(UpSetR)) install.packages("UpSetR")
library(VennDiagram)
library(venn)
library(UpSetR)
This module supports two data formats, described in detail below:
The first format we use looks like this:
dat <- read.table("flower.txt", header = T, sep = "\t")
head(dat)
## c1 c2 c3 c4 c5 c6 c7 c8
## 1 gene1193 gene1253 gene1236 gene1325 gene1246 gene1414 gene1259 gene1249
## 2 gene1194 gene1254 gene1241 gene1327 gene1247 gene1416 gene1260 gene1250
## 3 gene1195 gene1255 gene1243 gene1328 gene1248 gene1417 gene1261 gene1251
## 4 gene1197 gene1256 gene1244 gene1329 gene1249 gene1421 gene1262 gene1253
## 5 gene1199 gene1257 gene1246 gene1330 gene1250 gene1422 gene1263 gene1256
## 6 gene1202 gene1259 gene1247 gene1331 gene1251 gene1425 gene1265 gene1258
dim(dat)
## [1] 1662 8
venn_list = as.list(dat[, -8])
# Inspect the intersection details and export the results
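The comment above mentions inspecting and exporting the intersections, so here is a minimal base R sketch of one way to do that. The toy sets, the object names and the output file name are all made up for illustration; with the real data you would pass `venn_list` instead.

```r
# Toy stand-in for venn_list; the gene IDs are invented for illustration
toy_sets <- list(
  c1 = c("gene1", "gene2", "gene3", "gene4"),
  c2 = c("gene3", "gene4", "gene5"),
  c3 = c("gene4", "gene5", "gene6")
)

# Elements present in every set
shared_all <- Reduce(intersect, toy_sets)

# Elements found only in c1 (in c1 but in none of the other sets)
only_c1 <- setdiff(toy_sets$c1, unlist(toy_sets[c("c2", "c3")]))

# Long-format membership table: one row per (gene, set) pair
membership <- stack(toy_sets)
names(membership) <- c("gene", "set")

# Export the shared genes; the file name is a placeholder
out_file <- file.path(tempdir(), "shared_all.txt")
write.table(data.frame(gene = shared_all), out_file,
            sep = "\t", quote = FALSE, row.names = FALSE)
```

The same pattern (`Reduce(intersect, ...)` for "in all of these", `setdiff()` for "in this one only") extends to any region of the diagram.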
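Here we use two packages, venn and VennDiagram, both of which are classic and very capable tools for drawing Venn diagrams. venn can draw diagrams for 2 to 7 sets and VennDiagram for 2 to 5 sets, and each has its own style. With too many sets a Venn diagram is no longer a good fit. The examples below are particularly well suited to showing the differentially expressed gene sets from a multi-group transcriptome comparison, so if that is the analysis you need, follow these examples and the figures will look good.
Note: venn uses a single function, venn(), for every number of sets, whereas VennDiagram needs a different function for each number of sets and expects you to count the intersections yourself. That is manageable for 2 or 3 sets, but with more sets counting by hand becomes tedious and is better done with a small script that loops over the combinations. With more than 7 sets a different representation is needed, which the UpSetR package provides.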
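As a rough sketch of the "small script that loops over the combinations" mentioned above, the base R function below counts the elements shared by every combination of sets. The function name `count_overlaps` and the toy sets are hypothetical. The result names (`area1`, `n12`, `n123`, ...) are chosen to match the argument names of the VennDiagram `draw.*.venn` functions for three or more sets.

```r
# Count the elements shared by every combination of sets.
# Single sets are named area1, area2, ...; overlaps are named n12, n13, n123, ...
count_overlaps <- function(sets) {
  idx <- seq_along(sets)
  out <- list()
  for (k in idx) {
    for (combo in combn(idx, k, simplify = FALSE)) {
      shared <- Reduce(intersect, sets[combo])
      key <- if (k == 1) {
        paste0("area", combo)
      } else {
        paste0("n", paste(combo, collapse = ""))
      }
      out[[key]] <- length(unique(shared))
    }
  }
  out
}

# Toy sets; the gene IDs are invented for illustration
toy_sets <- list(
  c1 = c("gene1", "gene2", "gene3", "gene4"),
  c2 = c("gene3", "gene4", "gene5"),
  c3 = c("gene4", "gene5", "gene6")
)
counts <- count_overlaps(toy_sets)
str(counts)
```

For three sets the list can be handed over in one call, for example `do.call(draw.triple.venn, c(counts, list(category = names(toy_sets))))`. For two sets, note that draw.pairwise.venn names the overlap `cross.area` rather than `n12`.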
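The two-set Venn diagram is the most common and also the easiest to understand and draw. As before, we use both packages to draw it:
The venn function takes a list as input, so the data frame has to be converted to a list first, which as.list() does directly: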
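venn2List <- as.list(dat[, 1:2])
cross <- venn(venn2List,
  zcolor = "style", # fill colors: "style" is the default palette, "bw" is no color, or supply your own
  opacity = 0.3,    # fill transparency
  box = FALSE,      # whether to draw a frame around the plot
  ilcs = 0.5,       # font size of the counts
  sncs = 1          # font size of the set names
)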
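One thing to watch with as.list(): it keeps whatever is in each column. If the columns in your own file have different lengths and are padded with empty strings or NA (an assumption, since the layout of flower.txt is not described here), the padding would be counted as a set element. The helper below, with the hypothetical name `to_sets`, is a minimal sketch of cleaning each column before plotting.

```r
# Toy data frame standing in for dat: padded with "" and NA, plus one duplicate
toy <- data.frame(
  c1 = c("gene1", "gene2", "gene2", "gene3"),
  c2 = c("gene2", "gene4", "", NA),
  stringsAsFactors = FALSE
)

# Turn each column into a clean set: drop NA and empty strings, then deduplicate
to_sets <- function(df) {
  lapply(as.list(df), function(x) {
    x <- as.character(x)
    unique(x[!is.na(x) & nzchar(x)])
  })
}

toy_sets <- to_sets(toy)
lengths(toy_sets)
```

The cleaned list can then be passed to venn() in place of the plain as.list() result.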
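Next, create a two-set Venn diagram with VennDiagram, which draws an Euler diagram instead when the data meet certain conditions. The counts it needs come from cross: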
cross
## c1 c2 counts
## 0 0 0
## c2 0 1 756
## c1 1 0 777
## c1:c2 1 1 885
venn.plot <- draw.pairwise.venn(area1 = sum(cross[cross$c1 == 1, ]$counts), area2 = sum(cross[cross$c2 ==
1, ]$counts), cross.area = sum(cross[which(grepl("c1:c2", rownames(cross)) ==
TRUE), ]$counts), category = colnames(dat[, 1:2]), fill = c("blue", "red"), lty = "blank",
cex = 2, cat.cex = 2, cat.pos = c(285, 105), cat.dist = 0.09, cat.just = list(c(-1,
-1), c(1, 1)), ext.pos = 30, ext.dist = -0.05, ext.length = 0.85, ext.line.lwd = 2,
ext.line.lty = "dashed")
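venn3List <- as.list(dat[, 1:3])
cross <- venn(venn3List,
  zcolor = "style", # fill colors: "style" is the default palette, "bw" is no color, or supply your own
  opacity = 0.3,    # fill transparency
  box = FALSE,      # whether to draw a frame around the plot
  ilcs = 0.5,       # font size of the counts
  sncs = 1          # font size of the set names
)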
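Next, create a three-set Venn diagram with VennDiagram, which again draws an Euler diagram when the data meet certain conditions. The counts come from cross: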
cross
## c1 c2 c3 counts
## 0 0 0 0
## c3 0 0 1 477
## c2 0 1 0 596
## c2:c3 0 1 1 160
## c1 1 0 0 581
## c1:c3 1 0 1 196
## c1:c2 1 1 0 255
## c1:c2:c3 1 1 1 630
venn.plot <- draw.triple.venn(area1 = sum(cross[cross$c1 == 1, ]$counts), area2 = sum(cross[cross$c2 ==
1, ]$counts), area3 = sum(cross[cross$c3 == 1, ]$counts), n12 = sum(cross[which(grepl("c1:c2",
rownames(cross)) == TRUE), ]$counts), n23 = sum(cross[which(grepl("c2:c3", rownames(cross)) ==
TRUE), ]$counts), n13 = sum(cross[which(grepl("c1:c3", rownames(cross)) == TRUE),
]$counts) + cross["c1:c2:c3", ]$counts, n123 = sum(cross[which(grepl("c1:c2:c3",
rownames(cross)) == TRUE), ]$counts), category = colnames(dat[, 1:3]), fill = c("blue",
"red", "green"), lty = "blank", cex = 2, cat.cex = 2, cat.col = c("blue", "red",
"green"))
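venn4List <- as.list(dat[, 1:4])
cross <- venn(venn4List,
  zcolor = "style", # fill colors: "style" is the default palette, "bw" is no color, or supply your own
  opacity = 0.3,    # fill transparency
  box = FALSE,      # whether to draw a frame around the plot
  ilcs = 0.5,       # font size of the counts
  sncs = 1          # font size of the set names
)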
Create a Venn diagram with four sets.
cross
## c1 c2 c3 c4 counts
## 0 0 0 0 0
## c4 0 0 0 1 368
## c3 0 0 1 0 388
## c3:c4 0 0 1 1 89
## c2 0 1 0 0 507
## c2:c4 0 1 0 1 89
## c2:c3 0 1 1 0 104
## c2:c3:c4 0 1 1 1 56
## c1 1 0 0 0 494
## c1:c4 1 0 0 1 87
## c1:c3 1 0 1 0 114
## c1:c3:c4 1 0 1 1 82
## c1:c2 1 1 0 0 159
## c1:c2:c4 1 1 0 1 96
## c1:c2:c3 1 1 1 0 179
## c1:c2:c3:c4 1 1 1 1 451
# Count the elements shared by the named sets by summing every region of `cross`
# that lies inside all of them. Matching row names with grepl() is error-prone:
# searching for "c1:c4" misses "c1:c3:c4", and "c2:c4" misses "c2:c3:c4".
n_common <- function(...) {
  sets <- c(...)
  sum(cross$counts[rowSums(cross[, sets, drop = FALSE] == 1) == length(sets)])
}
venn.plot <- draw.quad.venn(
  area1 = n_common("c1"), area2 = n_common("c2"),
  area3 = n_common("c3"), area4 = n_common("c4"),
  n12 = n_common("c1", "c2"), n13 = n_common("c1", "c3"), n14 = n_common("c1", "c4"),
  n23 = n_common("c2", "c3"), n24 = n_common("c2", "c4"), n34 = n_common("c3", "c4"),
  n123 = n_common("c1", "c2", "c3"), n124 = n_common("c1", "c2", "c4"),
  n134 = n_common("c1", "c3", "c4"), n234 = n_common("c2", "c3", "c4"),
  n1234 = n_common("c1", "c2", "c3", "c4"),
  category = colnames(dat[, 1:4]),
  fill = c("orange", "red", "green", "blue"), lty = "dashed",
  cex = 2, cat.cex = 2, cat.col = c("orange", "red", "green", "blue")
)
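venn5List <- as.list(dat[, 1:5])
cross <- venn(venn5List,
  zcolor = "style", # fill colors: "style" is the default palette, "bw" is no color, or supply your own
  opacity = 0.3,    # fill transparency
  box = FALSE,      # whether to draw a frame around the plot
  ilcs = 0.5,       # font size of the counts
  sncs = 1          # font size of the set names
)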
Create a Venn diagram with five sets.
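# Note: the counts below are hard-coded example values, not computed from dat; only the category names come from dat
venn.plot <- draw.quintuple.venn(area1 = 301, area2 = 321, area3 = 311, area4 = 321,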
area5 = 301, n12 = 188, n13 = 191, n14 = 184, n15 = 177, n23 = 194, n24 = 197,
n25 = 190, n34 = 190, n35 = 173, n45 = 186, n123 = 112, n124 = 108, n125 = 108,
n134 = 111, n135 = 104, n145 = 104, n234 = 111, n235 = 107, n245 = 110, n345 = 100,
n1234 = 61, n1235 = 60, n1245 = 59, n1345 = 58, n2345 = 57, n12345 = 31, category = colnames(dat[,
1:5]), fill = c("dodgerblue", "goldenrod1", "darkorange1", "seagreen3", "orchid3"),
cat.col = c("dodgerblue", "goldenrod1", "darkorange1", "seagreen3", "orchid3"),
cat.cex = 2, margin = 0.05, cex = c(1.5, 1.5, 1.5, 1.5, 1.5, 1, 0.8, 1, 0.8,
1, 0.8, 1, 0.8, 1, 0.8, 1, 0.55, 1, 0.55, 1, 0.55, 1, 0.55, 1, 0.55, 1, 1,
1, 1, 1, 1.5), ind = TRUE)
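Typing 31 counts by hand is easy to get wrong. As a possible alternative, VennDiagram also provides `venn.diagram()`, which takes the list of sets and does the counting itself. The sketch below uses randomly generated toy sets (not the gene data from this post) and skips the plot when the package is not installed.

```r
# Toy sets drawn at random from invented gene IDs, for illustration only
set.seed(1)
universe <- paste0("gene", 1:200)
toy_sets <- lapply(setNames(1:5, paste0("c", 1:5)),
                   function(i) sample(universe, 80))

if (requireNamespace("VennDiagram", quietly = TRUE)) {
  # filename = NULL returns the grid objects instead of writing an image file
  venn.plot <- VennDiagram::venn.diagram(
    x = toy_sets,
    filename = NULL,
    fill = c("dodgerblue", "goldenrod1", "darkorange1", "seagreen3", "orchid3"),
    cat.cex = 2,
    margin = 0.05
  )
  grid::grid.newpage()
  grid::grid.draw(venn.plot)
}
```

With the real data, `venn5List` would take the place of `toy_sets`.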
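venn6List <- as.list(dat[, 1:6])
venn(venn6List,
  zcolor = "style", # fill colors: "style" is the default palette, "bw" is no color, or supply your own
  opacity = 0.3,    # fill transparency
  box = FALSE,      # whether to draw a frame around the plot
  ilcs = 0.5,       # font size of the counts
  sncs = 1          # font size of the set names
)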
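venn7List <- as.list(dat[, 1:7])
cross <- venn(venn7List,
  zcolor = "style", # fill colors: "style" is the default palette, "bw" is no color, or supply your own
  opacity = 0.3,    # fill transparency
  box = FALSE,      # whether to draw a frame around the plot
  ilcs = 0.5,       # font size of the counts
  sncs = 1          # font size of the set names
)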
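8. Venn diagrams with more than 7 sets
With more than 7 sets, drawing the diagram with closed curves is essentially no longer feasible. Instead, the upset function in the UpSetR package can present the intersections in a matrix-style layout. We start with a plot of 8 sets: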
require(ggplot2)
require(plyr)
require(gridExtra)
require(grid)
movies <- read.csv(system.file("extdata", "movies.csv", package = "UpSetR"), header = TRUE,
sep = ";")
upset(movies, nsets = 8, nintersects = 30, mb.ratio = c(0.5, 0.5), order.by = c("freq",
"degree"), decreasing = c(TRUE, FALSE))
A plot of 9 sets:
upset(movies, nsets = 9, nintersects = 30, mb.ratio = c(0.5, 0.5), order.by = c("freq",
"degree"), decreasing = c(TRUE, FALSE))
A plot of 10 sets:
upset(movies, nsets = 10, nintersects = 30, mb.ratio = c(0.5, 0.5), order.by = c("freq",
"degree"), decreasing = c(TRUE, FALSE))
Lex et al. 2014. UpSet: Visualization of Intersecting Sets. IEEE Transactions on Visualization and Computer Graphics (Proceedings of InfoVis 2014), vol 20, pp. 1983-1992.
Lex and Gehlenborg. 2014. Points of view: Sets and intersections. Nature Methods 11, 779.
Ruskey, F. and M. Weston. 2005. Venn diagrams. Electronic Journal of Combinatorics, Dynamic Survey DS5.
Mamakani, K., Myrvold W. and F. Ruskey. 2011. Generating all Simple Convexly-drawable Polar Symmetric 6-Venn Diagrams. International Workshop on Combinatorial Algorithms, Victoria. LNCS, 7056, 275-286.