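What is a dual-axis bar-and-line chart
What is a dual-axis bar-and-line chart? It is a single figure that combines a bar chart with a line chart so that it conveys more information.
The two series often have different value ranges: the bars might show counts while the line shows cumulative percentage. If both are drawn against one axis, the line, whose range is [0, 1], hugs the x-axis and becomes unreadable.
The fix is to give the chart two y-axes so that the bars and the line each use their own scale: Give back to Caesar what is Caesar's and to God what is God's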
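What data is needed
The data has three columns: one grouping column and two numeric columns.
In this example the two numeric columns hold the count per group and the cumulative percentage.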
data <- data.frame(group = c("<10", "10-15", "15-20", "20-25", "25-30", ">30"),
                   count = c(70, 15, 8, 4, 2, 1),
                   percent = c(0.70, 0.85, 0.93, 0.97, 0.99, 1.00))
> data
  group count percent
1   <10    70    0.70
2 10-15    15    0.85
3 15-20     8    0.93
4 20-25     4    0.97
5 25-30     2    0.99
6   >30     1    1.00
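The percent column above is the running total of count divided by the grand total. As a minimal sketch (the names counts and cum_percent are illustrative and do not appear in the original post), it can be derived instead of typed by hand:

```r
# Counts per group, same values as the count column above
counts <- c(70, 15, 8, 4, 2, 1)

# Cumulative share of the total: running sum divided by the grand total
cum_percent <- cumsum(counts) / sum(counts)

print(cum_percent)
```

Deriving the column this way keeps it consistent with count if the counts change later.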
Drawing the chart
1. Add the bar chart
Add a bar chart for the count column:
library(ggplot2)

data <- data.frame(group = c("<10", "10-15", "15-20", "20-25", "25-30", ">30"),
                   count = c(70, 15, 8, 4, 2, 1),
                   percent = c(0.70, 0.85, 0.93, 0.97, 0.99, 1.00))
# Convert the group column to a factor so the x-axis keeps the row order
data[["group"]] <- factor(data[["group"]], levels = as.character(data[["group"]]))
ggplot(data) +
  geom_bar(aes(x = group, y = count), stat = "identity", fill = '#168aad')
2. Add the line chart
Add a line chart for the percent column:
library(ggplot2)

data <- data.frame(group = c("<10", "10-15", "15-20", "20-25", "25-30", ">30"),
                   count = c(70, 15, 8, 4, 2, 1),
                   percent = c(0.70, 0.85, 0.93, 0.97, 0.99, 1.00))
# Convert the group column to a factor so the x-axis keeps the row order
data[["group"]] <- factor(data[["group"]], levels = as.character(data[["group"]]))
ggplot(data) +
  geom_bar(aes(x = group, y = count), stat = "identity", fill = '#168aad') +
  geom_line(aes(x = group, y = percent), size = 1, color = '#800080') +
  geom_point(aes(x = group, y = percent), size = 3, shape = 19, color = '#800080')
Running this produces the following message, and the line is not drawn:
geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?
A Stack Overflow answer explains the fix:
For line graphs, the data points must be grouped so that it knows which points to connect. In this case, it is simple -- all points should be connected, so group=1. When more variables are used and multiple lines are drawn, the grouping for lines is usually done by variable.
library(ggplot2)

data <- data.frame(group = c("<10", "10-15", "15-20", "20-25", "25-30", ">30"),
                   count = c(70, 15, 8, 4, 2, 1),
                   percent = c(0.70, 0.85, 0.93, 0.97, 0.99, 1.00))
# Convert the group column to a factor so the x-axis keeps the row order
data[["group"]] <- factor(data[["group"]], levels = as.character(data[["group"]]))
ggplot(data) +
  geom_bar(aes(x = group, y = count), stat = "identity", fill = '#168aad') +
  # group = 1 puts every point in one group so they are joined by a single line
  geom_line(aes(x = group, y = percent, group = 1), size = 1, color = '#800080') +
  geom_point(aes(x = group, y = percent, group = 1), size = 3, shape = 19, color = '#800080')
Because count and percent have different value ranges, the line is now drawn but stays pressed against the x-axis:
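3. Rescale the line chart
To let count and percent each keep their own value range while sharing one figure, one of them has to be projected onto the range of the other. This is a simple rescaling.
Here percent is projected onto count. The result is stored in a new column, percent_transform, and the secondary axis labels are then rewritten so that they still read in the original percent units: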
data <- data.frame(group = c("<10", "10-15", "15-20", "20-25", "25-30", ">30"),
                   count = c(70, 15, 8, 4, 2, 1),
                   percent = c(0.70, 0.85, 0.93, 0.97, 0.99, 1.00))
data[["group"]] <- factor(data[["group"]], levels = as.character(data[["group"]]))
# Stretch percent so that its maximum lines up with the maximum of count
data[["percent_transform"]] <- data[["percent"]] / max(data[["percent"]]) * max(data[["count"]])
> data
  group count percent percent_transform
1   <10    70    0.70              49.0
2 10-15    15    0.85              59.5
3 15-20     8    0.93              65.1
4 20-25     4    0.97              67.9
5 25-30     2    0.99              69.3
6   >30     1    1.00              70.0
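The projection is easiest to trust when it comes paired with its inverse, since the secondary axis has to undo exactly what was done to the data. The sketch below is only an illustration: to_primary and to_secondary are hypothetical helper names, not part of ggplot2 or the original post.

```r
# Hypothetical helpers (illustrative names, not from any package):
# to_primary maps the line series onto the bar axis,
# to_secondary maps bar-axis positions back to the line's own units.
to_primary <- function(y, y_max, primary_max) y / y_max * primary_max
to_secondary <- function(p, y_max, primary_max) p / primary_max * y_max

count   <- c(70, 15, 8, 4, 2, 1)
percent <- c(0.70, 0.85, 0.93, 0.97, 0.99, 1.00)

# Forward: the same arithmetic as the percent_transform column
scaled <- to_primary(percent, max(percent), max(count))

# Backward: should reproduce the original percent values
recovered <- to_secondary(scaled, max(percent), max(count))

print(scaled)
print(all.equal(recovered, percent))
```

If the round trip does not return the original values, the axis labels and the plotted line will disagree.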
Draw the line chart from the projected column percent_transform:
library(ggplot2)

data <- data.frame(group = c("<10", "10-15", "15-20", "20-25", "25-30", ">30"),
                   count = c(70, 15, 8, 4, 2, 1),
                   percent = c(0.70, 0.85, 0.93, 0.97, 0.99, 1.00))
data[["group"]] <- factor(data[["group"]], levels = as.character(data[["group"]]))
data[["percent_transform"]] <- data[["percent"]] / max(data[["percent"]]) * max(data[["count"]])
ggplot(data) +
  geom_bar(aes(x = group, y = count), stat = "identity", fill = '#168aad') +
  geom_line(aes(x = group, y = percent_transform, group = 1), size = 1, color = '#800080') +
  geom_point(aes(x = group, y = percent_transform, group = 1), size = 3, shape = 19, color = '#800080') +
  scale_y_continuous(limits = c(0, max(data[["count"]])),
                     breaks = seq(0, ceiling(max(data[["count"]]) / 10) * 10, 5),
                     # Secondary axis: invert the projection, then express as 0-100 percent
                     sec.axis = sec_axis(~ . / max(data[["count"]]) * max(data[["percent"]]) * 100,
                                         name = "percent(%)",
                                         breaks = seq(0, 100, 10)))
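One way to sanity-check a secondary axis is to work out by hand where each percent label belongs in primary-axis units. Assuming max(percent) is 1, as it is for a cumulative percentage, a label of p% belongs at p / 100 * max(count), which is the same arithmetic that produced percent_transform. A minimal base R sketch (pct_labels and primary_pos are illustrative names):

```r
count <- c(70, 15, 8, 4, 2, 1)

# Percent labels intended for the secondary axis
pct_labels <- seq(0, 100, 10)

# Where each label should sit, measured on the primary (count) axis
primary_pos <- pct_labels / 100 * max(count)

print(data.frame(label = pct_labels, primary_pos = primary_pos))
```

The plotted point for the last group (percent = 1.00, percent_transform = 70) should sit level with the 100 label, and the first group (percent = 0.70, percent_transform = 49) level with the 70 label.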
4. Style adjustments
Set the color of the x/y axis lines and of the ticks:
library(ggplot2)

data <- data.frame(group = c("<10", "10-15", "15-20", "20-25", "25-30", ">30"),
                   count = c(70, 15, 8, 4, 2, 1),
                   percent = c(0.70, 0.85, 0.93, 0.97, 0.99, 1.00))
data[["group"]] <- factor(data[["group"]], levels = as.character(data[["group"]]))
data[["percent_transform"]] <- data[["percent"]] / max(data[["percent"]]) * max(data[["count"]])
ggplot(data) +
  geom_bar(aes(x = group, y = count), stat = "identity", fill = '#168aad') +
  geom_line(aes(x = group, y = percent_transform, group = 1), size = 1, color = '#800080') +
  geom_point(aes(x = group, y = percent_transform, group = 1), size = 3, shape = 19, color = '#800080') +
  scale_y_continuous(limits = c(0, max(data[["count"]])),
                     breaks = seq(0, ceiling(max(data[["count"]]) / 10) * 10, 5),
                     # Secondary axis: invert the projection, then express as 0-100 percent
                     sec.axis = sec_axis(~ . / max(data[["count"]]) * max(data[["percent"]]) * 100,
                                         name = "percent(%)",
                                         breaks = seq(0, 100, 10))) +
  # Axis lines and ticks in dark blue
  theme(axis.line.x = element_line(linetype = 1, color = "darkblue", size = 1),
        axis.line.y = element_line(linetype = 1, color = "darkblue", size = 1),
        axis.ticks.x = element_line(color = "darkblue", size = 1),
        axis.ticks.y = element_line(color = "darkblue", size = 1),
        axis.ticks.length = unit(.4, "lines"))
Switch the theme and fine-tune the styling:
library(ggplot2)

data <- data.frame(group = c("<10", "10-15", "15-20", "20-25", "25-30", ">30"),
                   count = c(70, 15, 8, 4, 2, 1),
                   percent = c(0.70, 0.85, 0.93, 0.97, 0.99, 1.00))
data[["group"]] <- factor(data[["group"]], levels = as.character(data[["group"]]))
data[["percent_transform"]] <- data[["percent"]] / max(data[["percent"]]) * max(data[["count"]])
ggplot(data) +
  geom_bar(aes(x = group, y = count), stat = "identity", fill = '#168aad') +
  geom_line(aes(x = group, y = percent_transform, group = 1), size = 1, color = '#800080') +
  geom_point(aes(x = group, y = percent_transform, group = 1), size = 3, shape = 19, color = '#800080') +
  scale_y_continuous(limits = c(0, max(data[["count"]])),
                     breaks = seq(0, ceiling(max(data[["count"]]) / 10) * 10, 5),
                     # Secondary axis: invert the projection, then express as 0-100 percent
                     sec.axis = sec_axis(~ . / max(data[["count"]]) * max(data[["percent"]]) * 100,
                                         name = "percent(%)",
                                         breaks = seq(0, 100, 10))) +
  theme_minimal() +
  # Remove all grid lines
  theme(panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        panel.grid.major.y = element_blank(),
        panel.grid.minor.y = element_blank()) +
  # Axis lines and ticks in dark blue
  theme(axis.line.x = element_line(linetype = 1, color = "darkblue", size = 1),
        axis.line.y = element_line(linetype = 1, color = "darkblue", size = 1),
        axis.ticks.x = element_line(color = "darkblue", size = 1),
        axis.ticks.y = element_line(color = "darkblue", size = 1),
        axis.ticks.length = unit(.4, "lines")) +
  # Center the title
  theme(plot.title = element_text(hjust = 0.5)) +
  labs(title = "CV% distribution", x = "group", y = "count")