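Preface
The charts introduced so far were almost all drawn in a Cartesian coordinate system.
Today we look at several charts drawn in polar coordinates.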
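Nightingale rose chart
A Nightingale rose chart is a bar chart from the Cartesian coordinate system converted to polar coordinates.
Each bar is therefore stretched into a sector, and a stacked bar chart becomes a stacked sector chart. It suits comparing values of similar magnitude when the x-axis is a periodic variable.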
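To make the conversion concrete, the sketch below draws the same bar chart twice, once on Cartesian axes and once with coord_polar() added. It uses the mpg dataset bundled with ggplot2; the object names class_counts, p_bar and p_rose are illustrative and do not appear elsewhere in this post.

```r
library(ggplot2)
library(dplyr)

# Count cars per class in the mpg data that ships with ggplot2
class_counts <- count(mpg, class)

# Ordinary bar chart on Cartesian axes
p_bar <- ggplot(class_counts, aes(x = class, y = n, fill = class)) +
  geom_col()

# Same layers; only the coordinate system changes, so bars become sectors
p_rose <- p_bar + coord_polar(theta = "x")

print(p_bar)
print(p_rose)
```

Nothing about the data or the geom differs between the two plots, which is why any bar-chart variant (stacked, grouped) can be carried over to polar coordinates in the same way.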
Examples
Single series
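library(tidyverse)  # loads ggplot2 and dplyr; the mpg data ships with ggplot2

count(mpg, class) %>%
  ggplot(aes(x = class, y = n)) +
  geom_col(aes(fill = class)) +
  # print each count just inside the outer edge of its sector
  geom_text(aes(y = n - 3, label = n), colour = "white") +
  # map x to the angle so that every bar becomes a sector
  coord_polar(theta = "x", start = 0) +
  theme(
    panel.background = element_blank(),
    panel.grid.major = element_line(colour = "grey80", size = .25),
    # one rotation angle per class (mpg has 7 classes)
    axis.text.x = element_text(size = 13, colour = "black", angle = seq(-20, -340, length.out = 7)),
    axis.ticks.y = element_blank(),
    axis.text.y = element_blank(),
    axis.title = element_blank(),
    legend.position = "none"
  )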
Stacked
count(mpg, class, drv) %>%
  ggplot(aes(x = class, y = n)) +
  geom_col(aes(fill = drv)) +
  # stack the labels the same way as the bars and centre each one in its segment
  geom_text(aes(label = n, group = drv), position = position_stack(vjust = 0.5), colour = "white") +
  coord_polar(theta = "x", start = 0) +
  theme(
    panel.background = element_blank(),
    panel.grid.major = element_line(colour = "grey80", size = .25),
    axis.text.x = element_text(size = 13, colour = "black", angle = seq(-20, -340, length.out = 7)),
    axis.ticks.y = element_blank(),
    axis.text.y = element_blank(),
    axis.title = element_blank(),
    legend.position = "none"
  )
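Radial bar chart
A radial bar chart is also known as a circular bar chart or star chart.
We downloaded a 2015 colorectal cancer dataset with 29 samples from the cBioPortal website, then extracted the mutated genes and the sample information.
https://github.com/dxsbiocc/learn/blob/main/data/mutation/data_mutations_mskcc.txt
We keep the genes with a mutation count greater than 3 and draw a single-group radial bar chart as follows.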
df <- read_delim("~/Downloads/coad_caseccc_2015/data_mutations_mskcc.txt", delim = "\t")
select(df, Tumor_Sample_Barcode, Hugo_Symbol) %>%
  # count mutation records per gene (the piped data is already the first argument)
  count(Hugo_Symbol) %>%
  filter(n > 3) %>%
  arrange(n) %>%
  ggplot(aes(Hugo_Symbol, n, fill = Hugo_Symbol)) +
  geom_col() +
  geom_text(aes(y = n - 2, label = n), colour = "white") +
  coord_polar(start = 0) +
  # a negative lower limit leaves a hole in the centre of the circle
  ylim(c(-10, 35)) +
  theme(
    panel.background = element_blank(),
    panel.grid.major = element_line(colour = "grey80", size = .25),
    # length.out must equal the number of genes kept by the filter (27 here)
    axis.text.x = element_text(size = 9, colour = "black", angle = seq(-10, -350, length.out = 27)),
    axis.ticks.y = element_blank(),
    axis.text.y = element_blank(),
    axis.title = element_blank(),
    legend.position = "none"
  )
To order the genes by mutation count, only the ggplot() line needs to change:
ggplot(aes(factor(Hugo_Symbol, levels = Hugo_Symbol), n, fill = Hugo_Symbol)) +
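This works because ggplot2 places a discrete x-axis in factor-level order, while a plain character column falls back to alphabetical order. A small sketch on made-up counts shows the difference; the gene names and numbers in toy are invented for illustration and are not taken from the cBioPortal file.

```r
library(dplyr)

# Toy counts, invented for illustration only
toy <- tibble(
  Hugo_Symbol = c("TP53", "APC", "KRAS"),
  n = c(12, 20, 7)
) %>%
  arrange(n)

# Default factor levels are alphabetical, regardless of row order
alphabetical_gene <- factor(toy$Hugo_Symbol)
levels(alphabetical_gene)

# Passing the already-sorted column as `levels` keeps the sorted order
ordered_gene <- factor(toy$Hugo_Symbol, levels = toy$Hugo_Symbol)
levels(ordered_gene)
```

The trick relies on each gene appearing exactly once after count(); duplicated values in `levels` would make factor() fail.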
What if we want to plot data with several groups?
The real dataset does not give a presentable result, so I built a gene mutation dataset by hand.
# Number of empty bars used as a gap after each group
empty_bar <- 2
# Custom mutation types
mut_type <- c("Ins", "Del", "Mismatch", "Silent")
# Build the data
data <- tibble(
  gene = paste0("Gene ", seq(1, 60)),
  group = c(rep("Ins", 10), rep("Mismatch", 30), rep("Del", 14), rep("Silent", 6)),
  value = sample(seq(10, 100), 60, replace = TRUE)
) %>%
  # Append NA rows, which are drawn as blank bars between the groups
  add_row(
    gene = rep(NA_character_, empty_bar * length(mut_type)),
    group = rep(mut_type, empty_bar),
    value = NA_integer_
  ) %>%
  mutate(group = factor(group, levels = mut_type)) %>%
  # Sort so that bars of the same group are drawn together
  arrange(group)
# Unique id used as the x-axis; bars are drawn in this order
data$id <- seq_len(nrow(data))
# Rotation angle for each label
angle <- 90 - 360 * (data$id - 0.5) / nrow(data)
# Inner-ring annotation: one arc and one label per group
base_anno <- group_by(data, group) %>%
  summarise(start = min(id), end = max(id) - empty_bar) %>%
  mutate(mid = (start + end) / 2)

ggplot(data, aes(id, value, fill = group)) +
  geom_col(position = position_dodge2()) +
  geom_text(aes(y = value + 18, label = gene), size = 2.5, alpha = 0.6,
            # flip labels on the left half so they are not upside down
            angle = ifelse(angle < -90, angle + 180, angle)) +
  # Inner-ring annotation
  geom_segment(data = base_anno, aes(x = start, y = -5, xend = end, yend = -5),
               colour = "grey40") +
  geom_text(data = base_anno, aes(x = mid, y = -18, label = group),
            # hand-tuned angles, one per group
            angle = c(-26, -100, -50, 26), colour = "grey40") +
  coord_polar() +
  ylim(-100, 120) +
  theme(
    panel.background = element_blank(),
    axis.text = element_blank(),
    axis.ticks = element_blank(),
    axis.title = element_blank(),
    panel.grid = element_blank()
  )
Simple, isn't it? Well, it is actually a bit involved, but still easy to follow.
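The label angle is the part most worth unpacking. Bar i of N sits roughly a fraction (i - 0.5) / N of the way round the circle, measured clockwise from 12 o'clock, so 90 - 360 * (i - 0.5) / N is the text angle that points outward along the radius. Labels on the left half would then be upside down, so they are rotated by a further 180 degrees. A minimal sketch of that logic follows; label_angle() is a hypothetical helper name that is not part of the original code.

```r
# Hypothetical helper: text angle for bar `id` out of `n_bars` bars
label_angle <- function(id, n_bars) {
  # outward-pointing angle, counter-clockwise from the horizontal
  angle <- 90 - 360 * (id - 0.5) / n_bars
  # flip text that would otherwise be drawn upside down
  ifelse(angle < -90, angle + 180, angle)
}

# With four bars, the two bars on the left half are flipped
four_bar_angles <- label_angle(1:4, n_bars = 4)
four_bar_angles
```

With four bars the raw angles are 45, -45, -135 and -225; the last two fall below -90 and are flipped to 45 and -45.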
Radial heatmap
For the radial heatmap we use Bitcoin price data from 2015 to 2018.
https://github.com/dxsbiocc/learn/blob/main/data/bit_data.csv
bit_data <- read_csv("~/Downloads/bit_data.csv")
group_by(bit_data, year, month) %>%
  # mean of the High price for each month
  summarise(value = mean(High), .groups = "drop") %>%
  ggplot(aes(factor(month), year, fill = value)) +
  geom_tile(width = 1, colour = "white") +
  coord_polar() +
  # starting the y-axis before 2015 leaves a hole in the centre
  ylim(c(2010, 2020)) +
  scale_fill_gradientn(colours = rainbow(10)) +
  theme(
    panel.background = element_blank(),
    panel.grid.major = element_line(colour = "grey80", size = .25),
    axis.text.x = element_text(size = 9, colour = "black", angle = seq(-10, -350, length.out = 12)),
    axis.ticks.y = element_blank(),
    axis.text.y = element_blank(),
    axis.title = element_blank()
  )
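From the inner ring to the outer ring the years run from 2015 to 2018. Each ring has 12 segments, one per month, and the colour encodes the price.
We can also separate the years into distinct rings and add a random perturbation to the ring height, standing in for unknown factors.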
group_by(bit_data, year, month) %>%
  summarise(value = mean(High), .groups = "drop") %>%
  mutate(
    # each month spans one unit of angle
    xmin = month,
    xmax = month + 1,
    # each year gets its own band, 10 units apart
    ymin = (year - 2015) * 10 + 1,
    # random height between 1 and 5 for every tile
    ymax = ymin + sample(1:5, n(), replace = TRUE)
  ) %>%
  ggplot(aes(fill = value)) +
  geom_rect(aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax)) +
  # put the month name at the centre of each segment
  scale_x_continuous(breaks = seq(1.5, 12.5, 1), labels = month.name) +
  scale_fill_gradientn(colours = rainbow(10)) +
  coord_polar() +
  ylim(c(-5, 40)) +
  theme(
    panel.background = element_blank(),
    panel.grid.major = element_line(colour = "grey80", size = .25),
    axis.text.x = element_text(size = 9, colour = "black", angle = seq(-10, -350, length.out = 12)),
    axis.ticks.y = element_blank(),
    axis.text.y = element_blank(),
    axis.title = element_blank()
  )
Ha, the chart looks rather different again.
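The separation between the rings comes from the ymin arithmetic: every year is shifted by 10 units, while a tile is at most 5 units tall. The sketch below tabulates the resulting bands for the four years; the names rings and ymax_largest are illustrative only.

```r
library(dplyr)

# Each year gets a band 10 units apart; 2015 is the base year
rings <- tibble(year = 2015:2018) %>%
  mutate(
    ymin = (year - 2015) * 10 + 1,
    # tallest tile possible, since sample(1:5, ...) adds at most 5
    ymax_largest = ymin + 5
  )
rings
```

The bands start at 1, 11, 21 and 31 and never reach higher than 36, so consecutive rings stay at least 5 units apart and everything fits inside ylim(c(-5, 40)).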
Code:
https://github.com/dxsbiocc/learn/blob/main/R/plot/polar_bar.R