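Preface
To visualize how datasets overlap, the first thing that usually comes to mind is a Venn diagram.
A Venn diagram is a relational chart that uses overlapping shapes to show the intersections between datasets.
Below, we briefly walk through how to draw Venn diagrams in R.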
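Venn diagrams
Many packages can draw Venn diagrams, for example the venn() function in gplots, the vennDiagram() function in limma, and the venneuler() function in venneuler.
However, the figures these packages produce do not look very good, so we use the more mature VennDiagram package instead.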
Installation and loading
install.packages(c("VennDiagram", "RColorBrewer"))
library(VennDiagram)
library(RColorBrewer)
Introduction
VennDiagram can draw at most 5 sets, and there is a dedicated drawing function for each number of sets:
draw.single.venn
draw.single.venn(
area = 50,
category = "First",
fill = "#abc123"
)
draw.pairwise.venn
draw.pairwise.venn(
area1 = 50,
area2 = 100,
cross.area = 50,
category = c("First", "second"),
fill = brewer.pal(5, "Spectral")[2:3],
cat.pos = c(0, 0)
)
draw.triple.venn
draw.triple.venn(
area1 = 200,
area2 = 50,
area3 = 70,
n12 = 50,
n13 = 70,
n23 = 10,
n123 = 10,
category = c("First", "second", "third"),
fill = brewer.pal(5, "Set1")[1:3],
cat.pos = c(0, 0, 0)
)
draw.quad.venn
draw.quintuple.venn
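I won't draw the four- and five-set versions here because they need too many arguments; the idea is the same as above.
These functions require you to explicitly specify the size of each set and the number of elements in every overlap, which is tedious.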
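To make that bookkeeping concrete, here is a minimal sketch of how the counts for draw.triple.venn could be derived from real sets with base R. The three letter sets are hypothetical and are not from the original post; the drawing step is guarded so the counting part runs even if VennDiagram is not installed.

```r
# Hypothetical example sets (an assumption for illustration only)
set1 <- c("a", "b", "c", "e")
set2 <- c("c", "e", "f", "g")
set3 <- c("b", "e", "g", "h")

# Set sizes
area1 <- length(set1)
area2 <- length(set2)
area3 <- length(set3)

# Pairwise and three-way overlap sizes
n12  <- length(intersect(set1, set2))
n13  <- length(intersect(set1, set3))
n23  <- length(intersect(set2, set3))
n123 <- length(Reduce(intersect, list(set1, set2, set3)))

# Draw only if the package is available
if (requireNamespace("VennDiagram", quietly = TRUE)) {
  library(VennDiagram)
  grid.newpage()
  draw.triple.venn(
    area1 = area1, area2 = area2, area3 = area3,
    n12 = n12, n13 = n13, n23 = n23, n123 = n123,
    category = c("First", "second", "third")
  )
}
```

Seven numbers for three sets is manageable, but the count grows quickly with four or five sets, which is why passing the sets themselves is more convenient.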
Instead, we can use the venn.diagram function and pass the sets as a list to its x argument:
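# genes is not defined in the original post; this placeholder vector is an assumption
genes <- paste0("gene", 1:1000)
venn.diagram(
x = list(
A = sample(genes, 100),
B = sample(genes, 80),
C = sample(genes, 128)
),
filename = "~/Downloads/gene_set.tiff",
fill = brewer.pal(3, "Set1")
)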
This way there is no need to compute the intersections by hand or pass a large number of arguments.
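Because sample() draws at random, the numbers in the diagram change on every run. If you want to check the figure against hand-computed overlaps, one option is to fix the seed and build the list once. This is only a sketch: the genes vector and the seed value are placeholders, not part of the original post.

```r
# Placeholder gene universe (an assumption; substitute your own identifiers)
genes <- paste0("gene", 1:1000)

# Fix the random seed so the sampled sets are the same on every run
set.seed(42)
sets <- list(
  A = sample(genes, 100),
  B = sample(genes, 80),
  C = sample(genes, 128)
)

# Overlap sizes computed by hand, for comparison with the plotted numbers
overlaps <- c(
  AB  = length(intersect(sets$A, sets$B)),
  AC  = length(intersect(sets$A, sets$C)),
  BC  = length(intersect(sets$B, sets$C)),
  ABC = length(Reduce(intersect, sets))
)
print(overlaps)
```

The same sets list can then be passed as x to venn.diagram, so the plot and the printed counts describe the same data.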
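Note: here we specified an output file. If you don't want to save to a file and only want to view the figure in RStudio, set filename = NULL and draw the returned object: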
venn_plot <- venn.diagram(
x = list(
A = sample(genes, 100),
B = sample(genes, 80),
C = sample(genes, 128)
),
filename = NULL,
fill = brewer.pal(3, "Set1")
)
# start a fresh page so repeated draws don't pile up on each other
grid.newpage()
grid.draw(venn_plot)
Two sets work the same way:
venn_plot <- venn.diagram(
x = list(
A = sample(genes, 100),
B = sample(genes, 80)
),
filename = NULL,
fill = brewer.pal(7, "Set1")[1:2]
)
grid.newpage()
grid.draw(venn_plot)
Five sets:
venn_plot <- venn.diagram(
x = list(
A = sample(genes, 100),
B = sample(genes, 80),
C = sample(genes, 128),
D = sample(genes, 45),
E = sample(genes, 92)
),
filename = NULL,
fill = brewer.pal(7, "Set1")[1:5]
)
grid.newpage()
grid.draw(venn_plot)
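Don't go beyond that: with more sets it becomes impossible to tell which is which, and the overlaps of these five sets are already hard to read.
Now that we know how to draw the diagram, what remains is adjusting its appearance.
The venn.diagram function has a great many parameters.
For example, to show each overlap as both a percentage and a raw count: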
venn_plot <- venn.diagram(
x = list(
A = sample(genes, 100),
B = sample(genes, 80),
C = sample(genes, 128)
),
filename = NULL,
fill = c("#fb8072", "#80b1d3", "#fdb462"),
main = "example",
sub = "gene",
force.unique = TRUE,
print.mode = c("percent", "raw")
)
grid.newpage()
grid.draw(venn_plot)
Hide the circle outlines:
venn_plot <- venn.diagram(
x = list(
A = sample(genes, 100),
B = sample(genes, 80),
C = sample(genes, 128)
),
filename = NULL,
fill = brewer.pal(7, "Set1")[1:3],
print.mode = c("percent", "raw"),
lty = "blank"
)
grid.newpage()
grid.draw(venn_plot)
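A few other appearance parameters can be combined in the same way. The sketch below adjusts fill transparency (alpha), outline color and width (col, lwd), the size of the count labels (cex), and the size and color of the set names (cat.cex, cat.col). The integer sets and the specific values are arbitrary choices for illustration, not recommendations from the original post.

```r
# Hypothetical sets chosen so that every region of the diagram is non-empty
sets <- list(
  A = 1:60,
  B = 41:100,
  C = c(21:50, 91:120)
)
fill_cols <- c("#fb8072", "#80b1d3", "#fdb462")

# Draw only if the package is available
if (requireNamespace("VennDiagram", quietly = TRUE)) {
  library(VennDiagram)
  venn_plot <- venn.diagram(
    x = sets,
    filename = NULL,
    fill = fill_cols,
    alpha = 0.5,        # fill transparency
    col = "grey30",     # outline color
    lwd = 1,            # outline width
    cex = 1.2,          # size of the count labels
    cat.cex = 1.4,      # size of the set names
    cat.col = fill_cols # color the set names to match the fills
  )
  grid.newpage()
  grid.draw(venn_plot)
}
```

See ?venn.diagram for the full list of parameters.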