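2. Line Charts
Many of the figures broke when this post was imported. If you need them, please see the original: https://mp.weixin.qq.com/s/AGZJtQkB-JvfBsX8XNDDpA
This series covers drawing basic and advanced graphics in R. Video lessons will be posted gradually on my Bilibili channel 【木舟笔记】; your support is much appreciated!
A line chart is typically used to visualise how one continuous variable depends on another, with the independent variable on the x axis and the dependent variable on the y axis. The x variable is usually continuous, but it can also be an ordered discrete variable.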
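- 2.1 Drawing a simple line chart
- 2.2 Adding data markers to a line chart
- 2.3 Drawing multiple lines
- 2.4 Changing the appearance of lines
- 2.5 Changing the appearance of data markers
- 2.6 Drawing an area chart
- 2.7 Drawing a stacked area chart
- 2.8 Drawing a proportional (100%) stacked area chart
- 2.9 Adding a confidence region
- References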
2.1 Drawing a simple line chart
library(ggplot2)
ggplot(BOD, aes(x = Time, y = demand)) + geom_line()
BOD
##   Time demand
## 1    1    8.3
## 2    2   10.3
## 3    3   19.0
## 4    4   16.0
## 5    5   15.6
## 6    7   19.8
BOD1 <- BOD # Make a copy of the data
BOD1$Time <- factor(BOD1$Time) # Convert Time to a factor
ggplot(BOD1, aes(x = Time, y = demand, group = 1)) + geom_line()
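The BOD data set has no data point for Time = 6, so when Time is converted to a factor it has no level 6. Factor levels are spaced evenly along the x axis, which is why 5 and 7 end up next to each other. With a factor on the x axis, group = 1 is needed to tell ggplot that all the points belong to a single line.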
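A quick way to confirm this is to inspect the factor levels directly. The sketch below uses only base R and the built-in BOD data set; the variable name time_levels is just for illustration.

```r
# BOD ships with base R (datasets package)
time_levels <- levels(factor(BOD$Time))
print(time_levels)

# Check whether "6" is one of the levels of the factor
print("6" %in% time_levels)
```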
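You can set the y-axis range with ylim(), or extend it to include a given value by calling expand_limits() with a single argument.
# Both of the following give the same result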
ggplot(BOD, aes(x = Time, y = demand)) + geom_line() + ylim(0, max(BOD$demand))
ggplot(BOD, aes(x = Time, y = demand)) + geom_line() + expand_limits(y = 0)
2.2 Adding data markers to a line chart
ggplot(BOD, aes(x = Time, y = demand)) + geom_line() + geom_point()
library(gcookbook)
# worldpop is not sampled at a constant interval: the more recent the period, the more frequent the estimates.
ggplot(worldpop, aes(x = Year, y = Population)) + geom_line() + geom_point()
# The same is true when the y axis is on a log scale
ggplot(worldpop, aes(x = Year, y = Population)) + geom_line() + geom_point() + scale_y_log10()
2.3 Drawing multiple lines
# Load plyr so that ddply() can be used to build the example data set
library(plyr)
# Summarise the ToothGrowth data set
tg <- ddply(ToothGrowth, c("supp", "dose"), summarise, length=mean(len))
# Map supp to colour
ggplot(tg, aes(x=dose, y=length, colour=supp)) + geom_line()
# Map supp to linetype
ggplot(tg, aes(x=dose, y=length, linetype=supp)) + geom_line()
# When x is a factor, the grouping variable must be given explicitly
ggplot(tg, aes(x=factor(dose), y=length, colour=supp, group=supp)) + geom_line()
# Without group=supp, ggplot() cannot tell which points belong together: no lines are drawn and it reports that each group has only one observation
ggplot(tg, aes(x=factor(dose), y=length, colour=supp)) + geom_line()
# Incorrect grouping can also produce a jagged, sawtooth line
ggplot(tg, aes(x=dose, y=length)) + geom_line()
ggplot(tg, aes(x=dose, y=length, shape=supp)) + geom_line() + geom_point(size=4) # larger points
ggplot(tg, aes(x=dose, y=length, fill=supp)) + geom_line() + geom_point(size=4, shape=21) # points with a fill colour
# The markers overlap, so shift the points and their connecting lines sideways
ggplot(tg, aes(x=dose, y=length, shape=supp)) +
  geom_line(position=position_dodge(0.2)) + # shift the lines left/right by 0.2
  geom_point(position=position_dodge(0.2), size=4) # shift the points left/right by 0.2
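For readers who do not have plyr installed, the same per-group means can be computed with base R. This is a sketch rather than the author's method: aggregate() is swapped in for ddply(), the name tg_base is illustrative, and the rename step is only there so the columns match the tg data frame used in this section.

```r
# Mean tooth length for each supp/dose combination, using base R only
tg_base <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)

# Rename the summary column so it matches the tg data frame
names(tg_base)[names(tg_base) == "len"] <- "length"

print(tg_base)
```

tg_base can then be passed to ggplot() in place of tg.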
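2.4 Changing the appearance of lines
The linetype, size and colour arguments set the line type, line width and colour of a line, respectively. (ggplot2 3.4.0 and later use linewidth for line width, and size is deprecated for lines there.)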
ggplot(BOD, aes(x = Time, y = demand)) + geom_line(linetype = "dashed", size = 1, colour = "blue")
library(plyr)
tg <- ddply(ToothGrowth, c("supp", "dose"), summarise, length = mean(len))
ggplot(tg, aes(x = dose, y = length, colour = supp)) + geom_line() + scale_colour_brewer(palette = "Set1")
# Properties set outside aes() apply to every line
ggplot(tg, aes(x = dose, y = length, group = supp)) + geom_line(colour = "darkgreen", size = 1.5)
# supp is mapped to colour, so it is used as the grouping variable automatically
ggplot(tg, aes(x = dose, y = length, colour = supp)) + geom_line(linetype = "dashed") + geom_point(shape = 22, size = 3, fill = "white")
2.5 Changing the appearance of data markers
# Set the point size, outline colour and fill in geom_point()
ggplot(BOD, aes(x = Time, y = demand)) + geom_line() + geom_point(size = 4, shape = 22, colour = "darkred", fill = "pink")
ggplot(BOD, aes(x = Time, y = demand)) + geom_line() + geom_point(size = 4,shape = 21, fill = "white")
pd <- position_dodge(0.2)
ggplot(tg, aes(x = dose, y = length, fill = supp)) + geom_line(position = pd) + geom_point(shape = 21, size = 3, position = pd) + scale_fill_manual(values = c("black","white"))
2.6 Drawing an area chart
Use geom_area() to draw an area chart.
# Convert the sunspot.year time series to a data frame for this example
sunspotyear <- data.frame(Year = as.numeric(time(sunspot.year)), Sunspots = as.numeric(sunspot.year))
ggplot(sunspotyear, aes(x = Year, y = Sunspots)) + geom_area()
# Set the outline colour, fill and transparency
ggplot(sunspotyear, aes(x = Year, y = Sunspots)) + geom_area(colour = "black", fill = "blue", alpha = 0.2)
# To drop the line along the bottom edge, leave colour unset and draw the outline with geom_line()
ggplot(sunspotyear, aes(x = Year, y = Sunspots)) + geom_area(fill = "blue", alpha = 0.2) + geom_line()
2.7 Drawing a stacked area chart
library(gcookbook)
ggplot(uspopage, aes(x = Year, y = Thousands, fill = AgeGroup)) + geom_area()
head(uspopage)
##   Year AgeGroup Thousands
## 1 1900       <5      9181
## 2 1900     5-14     16966
## 3 1900    15-24     14951
## 4 1900    25-34     12161
## 5 1900    35-44      9273
## 6 1900    45-54      6437
# Reversing breaks flips the order of the legend entries (it does not change the stacking itself)
# Also set the transparency, outline colour and line size
ggplot(uspopage, aes(x = Year, y = Thousands, fill = AgeGroup)) + geom_area(colour = "black", size = 0.2, alpha = 0.4) + scale_fill_brewer(palette = "Blues", breaks = rev(levels(uspopage$AgeGroup)))
# Older ggplot2 releases reversed the stacking order with the order = desc(AgeGroup) aesthetic.
# That aesthetic was deprecated in ggplot2 2.0.0, so position_stack(reverse = TRUE) is used here instead.
ggplot(uspopage, aes(x = Year, y = Thousands, fill = AgeGroup)) + geom_area(position = position_stack(reverse = TRUE), colour = "black", size = 0.2, alpha = 0.4) + scale_fill_brewer(palette = "Blues")
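Stacking follows the level order of the fill factor, so another way to change the order is to reverse the factor levels before plotting. The sketch below shows only that base R step on a small made-up factor; age_group and its values are hypothetical stand-ins for uspopage$AgeGroup.

```r
# A made-up factor standing in for an age-group column
age_group <- factor(c("<5", "5-14", "15-24", "<5", "5-14", "15-24"),
                    levels = c("<5", "5-14", "15-24"))

# Reverse the level order; the data values themselves are unchanged
age_group_rev <- factor(age_group, levels = rev(levels(age_group)))

print(levels(age_group))
print(levels(age_group_rev))
```

Applied to a real data frame, the reversed factor would replace the original column before the call to ggplot().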
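# Remove the outline around each area and draw only the top edges with geom_line()
# The same reversed stacking is applied to both layers so that the lines follow the areas
ggplot(uspopage, aes(x = Year, y = Thousands, fill = AgeGroup)) + geom_area(position = position_stack(reverse = TRUE), colour = NA, alpha = 0.4) + scale_fill_brewer(palette = "Blues") + geom_line(position = position_stack(reverse = TRUE), size = 0.2)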
2.8 Drawing a proportional (100%) stacked area chart
# First compute each age group's share of the yearly total (ddply() comes from plyr)
uspopage_prop <- ddply(uspopage, "Year", transform, Percent = Thousands/sum(Thousands) * 100)
ggplot(uspopage_prop, aes(x = Year, y = Percent, fill = AgeGroup)) + geom_area(colour = "black", size = 0.2, alpha = 0.4) + scale_fill_brewer(palette = "Blues", breaks = rev(levels(uspopage$AgeGroup)))
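The percentage step is simply each value divided by the total for its year. The sketch below reproduces that calculation with base R's ave() on a tiny made-up data frame; pop and its numbers are hypothetical and are not taken from uspopage.

```r
# Made-up data: two years, three groups each
pop <- data.frame(
  Year      = c(1900, 1900, 1900, 1901, 1901, 1901),
  AgeGroup  = c("a", "b", "c", "a", "b", "c"),
  Thousands = c(10, 30, 60, 20, 20, 40)
)

# Divide each row by the total for its year, then scale to percent
pop$Percent <- pop$Thousands / ave(pop$Thousands, pop$Year, FUN = sum) * 100

print(pop)

# Sum of the percentages within each year
print(tapply(pop$Percent, pop$Year, sum))
```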
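2.9 Adding a confidence region
Use geom_ribbon() and map one variable to ymin and another to ymax.
In the climate data set, Anomaly10y is the 10-year running average of each year's temperature anomaly relative to the 1950-1980 mean, and Unc10y is its 95% confidence interval.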
library(gcookbook)
# Take a subset of the climate data
clim <- subset(climate, Source == "Berkeley", select = c("Year", "Anomaly10y", "Unc10y"))
head(clim)
##   Year Anomaly10y Unc10y
## 1 1800     -0.435  0.505
## 2 1801     -0.453  0.493
## 3 1802     -0.460  0.486
## 4 1803     -0.493  0.489
## 5 1804     -0.536  0.483
## 6 1805     -0.541  0.475
# Draw the confidence region as a shaded band
# Note the layer order: the ribbon is added before the line, so the line is drawn on top of it
ggplot(clim, aes(x = Year, y = Anomaly10y)) + geom_ribbon(aes(ymin = Anomaly10y - Unc10y, ymax = Anomaly10y + Unc10y), alpha = 0.2) + geom_line()
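For readers without gcookbook, the same ribbon-then-line layering can be tried on made-up data. This sketch assumes ggplot2 is installed; the data frame toy and its columns centre and half_width are hypothetical and only stand in for Anomaly10y and Unc10y.

```r
library(ggplot2)

# Made-up series with a constant uncertainty half-width
toy <- data.frame(
  x          = 1:10,
  centre     = c(1.0, 1.4, 1.3, 1.9, 2.2, 2.0, 2.6, 3.1, 2.9, 3.4),
  half_width = 0.5
)

# Ribbon first, line second, so the line is drawn on top of the band
p <- ggplot(toy, aes(x = x, y = centre)) +
  geom_ribbon(aes(ymin = centre - half_width, ymax = centre + half_width),
              alpha = 0.2) +
  geom_line()

print(p)
```

Swapping the two layers would draw the band over the line, which dims the line wherever the band is not fully transparent.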
# Show the upper and lower bounds of the confidence region as dotted lines
ggplot(clim, aes(x = Year, y = Anomaly10y)) + geom_line(aes(y = Anomaly10y - Unc10y), colour = "grey50", linetype = "dotted") + geom_line(aes(y = Anomaly10y + Unc10y), colour = "grey50", linetype = "dotted") + geom_line()
References
- R Graphics Cookbook, 2nd edition.