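R plotting packages series:
- R plotting packages 01: patchwork, an excellent package for combining plots
- R plotting packages 02: heatmaps with pheatmap
- R plotting packages 03: volcano plots with EnhancedVolcano
- R plotting packages 04: GOplot, visualizing enrichment analysis results
- R plotting packages 05: drawing Venn diagrams with ggvenn and VennDiagram
- R plotting packages 06: gene expression correlation plots with corrplot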
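Venn diagrams are the usual way to show intersections between datasets. Their readable range is limited, though: with more than five sets the diagram becomes cluttered. In that case the UpSetR package is a better fit.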
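UpSetR accepts three types of input:
- A table: a data frame in which rows are elements and columns hold the set memberships (0/1) plus any extra attributes.
- A named list of element names, one vector per set, converted with fromList.
- A vector describing set intersections in the format introduced by the venneuler package, converted with fromExpression.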
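The second and third input types can be sketched as follows. The set names, gene IDs and counts are made up for illustration; only fromList, fromExpression and upset come from UpSetR.

```r
library(UpSetR)

# Hypothetical sets: each list entry is one set, given as a vector of element names.
gene_sets <- list(
  A = c("g1", "g2", "g3"),
  B = c("g2", "g3", "g4"),
  C = c("g3", "g5")
)
# fromList() converts the named list into a 0/1 membership data frame
# (one row per distinct element, one column per set).
membership <- fromList(gene_sets)

# Hypothetical intersection sizes in the venneuler-style notation:
# "A&B" = 3 means three elements belong to A and B only.
counts <- c(A = 2, B = 1, C = 2, "A&B" = 3, "A&C" = 1)
# fromExpression() expands the counts into the same 0/1 data frame layout.
expanded <- fromExpression(counts)

# Either data frame can then be passed to upset().
upset(membership, order.by = "freq")
upset(expanded, order.by = "freq")
```

Both converters return an ordinary data frame, so the result can be inspected with head() before plotting.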
1. Basic usage
install.packages("UpSetR")  # the package name must be quoted
library(UpSetR)
require(ggplot2); require(plyr); require(gridExtra); require(grid)
movies <- read.csv( system.file("extdata", "movies.csv", package = "UpSetR"), header=TRUE, sep=";" )
head(movies)
# Name ReleaseDate Action Adventure Children Comedy Crime Documentary
# 1 Toy Story (1995) 1995 0 0 1 1 0 0
# 2 Jumanji (1995) 1995 0 1 1 0 0 0
# 3 Grumpier Old Men (1995) 1995 0 0 0 1 0 0
# 4 Waiting to Exhale (1995) 1995 0 0 0 1 0 0
# 5 Father of the Bride Part II (1995) 1995 0 0 0 1 0 0
# 6 Heat (1995) 1995 1 0 0 0 1 0
# Drama Fantasy Noir Horror Musical Mystery Romance SciFi Thriller War Western AvgRating Watches
# 1 0 0 0 0 0 0 0 0 0 0 0 4.15 2077
# 2 0 1 0 0 0 0 0 0 0 0 0 3.20 701
# 3 0 0 0 0 0 0 1 0 0 0 0 3.02 478
# 4 1 0 0 0 0 0 0 0 0 0 0 2.73 170
# 5 0 0 0 0 0 0 0 0 0 0 0 3.01 296
# 6 0 0 0 0 0 0 0 0 1 0 0 3.88 940
upset(movies, nsets = 7, nintersects = 30, mb.ratio = c(0.5, 0.5),
order.by = c("freq", "degree"), decreasing = c(TRUE,FALSE))
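Main parameters:
- nsets: the maximum number of sets to show (the rows of the dot matrix in the resulting plot). The movies data have 17 genre columns, too many to show at once.
- nintersects: the number of intersections to show (the columns of the resulting plot).
- mb.ratio: the height ratio between the bar plot and the dot matrix.
- order.by: how the intersections are ordered; here by freq first, then by degree.
- decreasing: the sort direction for each order.by variable; here freq is descending and degree ascending.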
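To make freq and degree concrete, here is a minimal base R sketch on a made-up membership table. It only illustrates the two quantities and one way to sort by them; it is not UpSetR's internal code, and the table and variable names are hypothetical.

```r
# Hypothetical membership table: rows are elements, columns are sets (1 = member).
membership <- data.frame(
  A = c(1, 1, 1, 0, 0, 1),
  B = c(1, 1, 0, 1, 0, 1),
  C = c(1, 0, 0, 0, 1, 0)
)

# Each distinct 0/1 row pattern corresponds to one exclusive intersection.
pattern <- apply(membership, 1, paste, collapse = "")

# freq: how many elements fall into each intersection.
intersections <- as.data.frame(table(pattern), stringsAsFactors = FALSE)

# degree: how many sets take part in the intersection (number of 1s in the pattern).
intersections$degree <- nchar(gsub("0", "", intersections$pattern))

# Sort by freq descending, then degree ascending.
intersections <- intersections[order(-intersections$Freq, intersections$degree), ]
print(intersections)
```

The pattern "110" (elements in A and B but not C) comes first because it is the only intersection holding two elements.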
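2. Fine-tuning the plot
We can also highlight subsets in the plot, here the Drama and Action intersection and the films released between 1970 and 1980.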
# Custom query: TRUE for rows whose ReleaseDate lies strictly between min and max
between <- function(row, min, max){
  (row["ReleaseDate"] < max) & (row["ReleaseDate"] > min)
}
upset(movies, sets = c("Drama", "Comedy", "Action", "Thriller", "Western", "Documentary"),
queries = list(list(query = intersects, params = list("Drama", "Action")),
list(query = between, params = list(1970, 1980), color = "red", active = TRUE)))
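A custom query function such as between receives one row at a time and returns TRUE or FALSE. The sketch below evaluates it row by row on a made-up data frame with apply, purely to show what the function computes; how UpSetR calls it internally may differ.

```r
# Hypothetical toy data with the same column names as the movies table.
toy <- data.frame(
  ReleaseDate = c(1965, 1975, 1980),
  AvgRating   = c(3.1, 4.0, 2.5)
)

# Same query as in the text: strictly between min and max.
between <- function(row, min, max) {
  (row["ReleaseDate"] < max) & (row["ReleaseDate"] > min)
}

# Evaluate the query once per row; extra arguments play the role of params.
hits <- apply(toy, 1, between, min = 1970, max = 1980)
print(toy[hits, ])
```

Because the comparison is strict, a film released in 1980 is not selected by params = list(1970, 1980).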
Use the attribute.plots parameter to add attribute plots:
upset(movies,
      attribute.plots = list(gridrows = 60,
                             plots = list(list(plot = scatter_plot, x = "ReleaseDate", y = "AvgRating"),
                                          list(plot = scatter_plot, x = "ReleaseDate", y = "Watches"),
                                          list(plot = scatter_plot, x = "Watches", y = "AvgRating"),
                                          list(plot = histogram, x = "ReleaseDate")),
                             ncols = 2))
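# attribute.plots takes a list describing the plots to draw. The plot functions can be
# the built-in ones or your own, as long as their parameters are set correctly.
# attributeplots is not defined in the original; the definition below is an assumption
# that reuses the built-in plot functions shown above.
attributeplots <- list(gridrows = 60,
                       plots = list(list(plot = scatter_plot, x = "ReleaseDate", y = "AvgRating"),
                                    list(plot = histogram, x = "ReleaseDate")),
                       ncols = 2)
upset(movies, attribute.plots = attributeplots,
queries = list(list(query = between, params = list(1920, 1940)),
list(query = intersects, params = list("Drama"), color= "red"),
list(query = elements, params = list("ReleaseDate", 1990, 1991, 1992))),
main.bar.color = "yellow")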
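A self-written plot function for attribute.plots might look like the sketch below. It assumes, following the UpSetR vignette, that the function is called with the data and the x and y column names, and that the data carry a color column holding each point's query color. The function name my_scatter is hypothetical.

```r
library(ggplot2)

# Hypothetical custom attribute plot.
# Assumption: the data frame passed in has a "color" column, which
# scale_color_identity() uses as the literal point color.
my_scatter <- function(mydata, x, y) {
  ggplot(mydata, aes(x = .data[[x]], y = .data[[y]], colour = .data[["color"]])) +
    geom_point() +
    scale_color_identity() +
    theme(plot.margin = unit(c(0, 0, 0, 0), "cm"))
}

# Possible usage with the movies data loaded earlier (not run here):
# upset(movies, attribute.plots = list(gridrows = 60,
#       plots = list(list(plot = my_scatter, x = "ReleaseDate", y = "AvgRating",
#                         queries = FALSE)),
#       ncols = 1))
```

The function only has to return a ggplot object, so any geom can replace geom_point.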
References: https://www.jianshu.com/p/324aae3d5ea4
https://mp.weixin.qq.com/s/DSyaje-nFb8o--kuzmTvaA