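What to do when ggplot2 cannot break the y-axis

The requirement

While I was teaching plotting last night, a student asked how to draw a figure like the one below, which caught my interest: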



The value of the first group is far larger than those of the other groups, so the axis needs a break.

Implementation in R

Data and original plot

df <- data.frame(a = c(1,2,3,500), b = c('a1', 'a2','a3', 'a4'))
library(ggplot2)
ggplot(df) + 
  aes(x = b, y = a,fill = b) +
  geom_col() +
  theme_bw()+
  coord_flip()

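Method 1: plotrix

Search results point to this package. It is not part of the ggplot ecosystem, but it does handle axis breaks. It works for both vertical and horizontal bars and can also draw boxplots with a broken axis. Grouped bars are possible too, see: https://stackoverflow.com/questions/24202245/grouped-barplot-with-cut-y-axis

library(plotrix)
# vertical bars; the axis skips the range from 5 to 495
gap.barplot(df$a, gap = c(5, 495))
# the same plot with horizontal bars
gap.barplot(df$a, gap = c(5, 495), horiz = TRUE)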
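The idea behind a gapped axis can be sketched in base R. This is only an illustration of the principle, not plotrix's actual code: squash_gap is a hypothetical helper introduced here, and the tick positions are chosen by hand for this toy data. Values above the gap are shifted down by the width of the gap, and the axis is then labeled with the original values.

```r
# Hypothetical helper (not part of plotrix): remove the interval
# (gap[1], gap[2]) from the axis by shifting every value at or above
# gap[2] down by the width of the gap.
squash_gap <- function(y, gap) {
  stopifnot(length(gap) == 2, gap[1] < gap[2])
  ifelse(y >= gap[2], y - (gap[2] - gap[1]), y)
}

df <- data.frame(a = c(1, 2, 3, 500), b = c("a1", "a2", "a3", "a4"))
gap <- c(5, 495)
squashed <- squash_gap(df$a, gap)

# Draw the squashed values, then label the ticks with the original values.
ticks_orig <- c(0, 2, 4, 496, 498, 500)
barplot(squashed, names.arg = df$b, axes = FALSE, ylim = c(0, 11))
axis(2, at = squash_gap(ticks_orig, gap), labels = ticks_orig, las = 1)
```

The sketch draws no break marker on the axis, so the reader gets no visual hint of the gap. That is one reason to prefer gap.barplot for real figures.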

Method 2: zooming in with ggplot2

ggplot2 does not support broken axes, and searching around turned up no good solution. An answer on Stack Overflow explains:

As noted elsewhere, this isn't something that ggplot2 will handle well, since broken axes are generally considered questionable.

That said, the experts there also offer alternative solutions.

library(ggforce)
ggplot(df) + 
  aes(x = b, y = a,fill = b) +
  geom_col() +
  facet_zoom(ylim = c(0, 10))


Reference code from: https://stackoverflow.com/questions/7194688/using-ggplot2-can-i-insert-a-break-in-the-axis
This is probably the best alternative. While I was at it, I tried adding error bars:

library(ggplot2)
library(ggforce)
library(Rmisc)
tgc <- ToothGrowth
# inflate the dose == 2 group so that it dwarfs the others
tgc$len[tgc$dose == 2] <- tgc$len[tgc$dose == 2] * 25
tgc2 <- summarySE(tgc, measurevar = "len", groupvars = c("supp", "dose"))
tgc2$dose <- factor(tgc2$dose)

p <- ggplot(tgc2, aes(x = dose, y = len, fill = supp)) +
  geom_bar(position = position_dodge(), stat = "identity") +
  geom_errorbar(aes(ymin = len - ci, ymax = len + ci),
                width = .2,                    # Width of the error bars
                position = position_dodge(.9))
p
# zoom in on the small values
p +
  facet_zoom(ylim = c(0, 25))


Reference: http://www.cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2)/
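To make clear what the ci column used for the error bars holds, here is a minimal base R sketch of the same kind of summary. It assumes the usual definitions (standard error = sd / sqrt(N), and the 95% interval half-width = se times the t quantile with N - 1 degrees of freedom); it is not the Rmisc source, and the column names are simply chosen to mirror the ones used in the plot code.

```r
# Per-group N, mean and sd of len, computed with base R only.
tg <- ToothGrowth
stats <- aggregate(len ~ supp + dose, data = tg,
                   FUN = function(x) c(N = length(x), mean = mean(x), sd = sd(x)))
# aggregate() returns the three statistics as one matrix column;
# flatten it into ordinary columns and rename them.
stats <- do.call(data.frame, stats)
names(stats) <- c("supp", "dose", "N", "len", "sd")

# Standard error and half-width of the 95% confidence interval (assumed definitions).
stats$se <- stats$sd / sqrt(stats$N)
stats$ci <- stats$se * qt(0.975, df = stats$N - 1)
stats
```

The error bars then span len - ci to len + ci for each combination of supp and dose.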

Method 3: two plots side by side

library(ggplot2)
library(dplyr)      # needed for %>%, mutate() and filter() below
library(patchwork)
# full range
g1 <- ggplot(df) +
  aes(x = b, y = a, fill = b) +
  geom_col() +
  coord_flip()
# small values only; ylim() drops the bar above 10, hence the warning
g2 <- ggplot(df) +
  aes(x = b, y = a, fill = b) +
  geom_col() +
  coord_flip() +
  ylim(NA, 10)
g1 + g2
#> Warning: Removed 1 rows containing missing values (position_stack).

# the same idea with facets instead of patchwork
ggplot() +
  aes(x = b, y = a, fill = b) +
  geom_col(data = df %>% mutate(subset = "all")) +
  geom_col(data = df %>% filter(a <= 10) %>% mutate(subset = "small")) +
  coord_flip() +
  facet_wrap(~ subset, scales = "free_x")
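The warning in the patchwork version comes from ylim(), which sets scale limits: values outside them are discarded before drawing. A possible alternative, sketched here on the same toy data, is to pass the limits to coord_flip() instead. Coordinate limits only zoom the view, so the long bar stays in the data and is clipped at the panel edge. The name g2_zoom is introduced here for illustration.

```r
library(ggplot2)

df <- data.frame(a = c(1, 2, 3, 500), b = c("a1", "a2", "a3", "a4"))

# Zoom on the value axis through the coordinate system rather than the scale:
# no rows are removed, the bar for a4 simply runs off the panel.
g2_zoom <- ggplot(df) +
  aes(x = b, y = a, fill = b) +
  geom_col() +
  coord_flip(ylim = c(0, 10))
g2_zoom
```

Whether a clipped bar or a missing bar is less misleading depends on the figure. The clipped version at least shows that a4 exceeds the displayed range.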

The bar chart with error bars from Method 2 can be shown as two panels in the same way:

library(dplyr)
library(ggplot2)
ggplot() +
  aes(x = dose, y = len, fill = supp) +
  geom_bar(data = tgc2 %>% mutate(subset = "all"),
           stat = "identity", position = "dodge") +
  geom_bar(data = tgc2 %>% filter(len <= 25) %>% mutate(subset = "small"),
           stat = "identity", position = "dodge") +
  geom_errorbar(data = tgc2 %>% mutate(subset = "all"),
                aes(ymin = len - ci, ymax = len + ci),
                width = .2,                    # Width of the error bars
                position = position_dodge(.9)) +
  # tag the filtered data with subset = "small" so that these error bars
  # appear only in the "small" panel
  geom_errorbar(data = tgc2 %>% filter(len <= 25) %>% mutate(subset = "small"),
                aes(ymin = len - ci, ymax = len + ci),
                width = .2,                    # Width of the error bars
                position = position_dodge(.9)) +
  coord_flip() +
  facet_wrap(~ subset, scales = "free_x")

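For a ggplot2 point plot, it really can be done

I had assumed ggplot2 must be able to do this. The only code I could find that truly achieves it is the following, which combines faceting with relabeled axis ticks. It is fairly involved, but it does work:
Code from: https://www.j4s8.de/post/2018-01-15-broken-axis-with-ggplot2/
I drew a plot with data modeled on the author's.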

rm(list = ls())
### 1. Base plot ----
data.sum = data.frame(name = c(letters[1:8],"desert","desert"),
                  value = c(sample(1:75,8),505,689),
                  sens =  rep(c("A","B"),times=5))
base.plot <- function(data) {
  p <- ggplot(data, aes(x=value, y=name, col=sens))
  p <- p + theme_bw()
  p <- p + theme(legend.position="bottom")
  p <- p + geom_point(size=2.5, position=position_jitter(w=0, h=0.15), alpha=0.8)
  p <- p + scale_color_brewer(palette="Set1", guide=guide_legend(ncol=6, title=NULL))
  p <- p + xlab("") + ylab("")
  return(p)
}
p <- base.plot(data.sum)
### 2. Split ----
data.sum$mask = 0
data.sum$mask[data.sum$name == "desert"] = 1
max.value <- max(data.sum$value)
max.value.other <- max(data.sum$value[data.sum$name != "desert"])
min.value.desert <- min(data.sum$value[data.sum$name == "desert"])
# bring both groups onto the same scale
# scale factor: minimum of the outlier group / maximum of the other groups
scale <- floor(min.value.desert / max.value.other) - 1
data.sum$value[data.sum$mask == 1] = data.sum$value[data.sum$mask == 1] / scale
# set up the ticks
step <- 10
low.end <- max(data.sum$value[data.sum$name != "desert"]) # maximum of the ordinary groups
up.start <- ceiling(max(data.sum$value[data.sum$name != "desert"])) # that maximum, rounded up
breaks <- seq(0, max(data.sum$value), step)
labels <- seq(0, low.end+step, step)
labels <- append(labels, 
                 scale * seq(from=ceiling((up.start + step) / step) * step, 
                             length.out=length(breaks) - length(labels),
                             by=step))

# Plot
base.plot(data.sum) +
  facet_grid(. ~ mask, scales="free", space="free")+
  scale_x_continuous(breaks=breaks, labels=labels, expand=c(0.075,0))+ 
  theme(strip.background = element_blank(), strip.text.x = element_blank())

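Pulling this off is hard, so the original author must be an expert. The code is complex, and adapting it to other data takes some coding skill. Moreover, this approach only works for point plots; I tried boxplots and bar plots and could not get them to work.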
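When adapting the approach to other data, the relabeling step is the part most worth isolating. The sketch below is not the original author's code: broken_labels is a hypothetical helper, and the numbers (ordinary values up to 75, outliers divided by 5, breaks every 10) are illustrative only, since the example data above is random. Breaks above a cutoff belong to the rescaled group, so they are labeled with the value multiplied back by the scale factor.

```r
# Hypothetical helper (not from the original post): breaks above `cutoff`
# fall in the rescaled panel, so label them with the original magnitude
# (break * scale); all other breaks are labeled with their own value.
broken_labels <- function(breaks, cutoff, scale) {
  ifelse(breaks > cutoff, breaks * scale, breaks)
}

# Illustrative numbers only: ordinary values up to 75, outliers divided by 5.
breaks <- seq(0, 130, by = 10)
labels <- broken_labels(breaks, cutoff = 80, scale = 5)
data.frame(breaks, labels)
```

The resulting vectors could then be passed as scale_x_continuous(breaks = breaks, labels = labels), in the same way as in the plot call above.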
