How to draw a Sankey diagram

References:
https://mp.weixin.qq.com/s/dgxgi_3PdjW5g-fOPlAe4Q?
https://www.jianshu.com/p/148ca7c7af59

Method: the R package ggalluvial

######################################################## 
#-------------------------------------------------------
# Topic: Sankey diagram practice
# Author:
# Date:Wed Mar 04 12:04:32 2020
# Mail:
#-------------------------------------------------------
########################################################


#-------------------------------------------------------
# 
# Chapter 1: A simple example (wide format)
# 
#-------------------------------------------------------
# Load the package (this also attaches ggplot2)
library(ggalluvial)

# Convert the built-in Titanic table to a data frame (wide format)
titanic_wide <- data.frame(Titanic)

# Inspect the data layout
head(titanic_wide)
#>   Class    Sex   Age Survived Freq
#> 1   1st   Male Child       No    0
#> 2   2nd   Male Child       No    0
#> 3   3rd   Male Child       No   35
#> 4  Crew   Male Child       No    0
#> 5   1st Female Child       No    0
#> 6   2nd Female Child       No    0

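# Plot the flows between class, sex and age, colored by survival
# Note: recent ggalluvial releases take the band height from y (older ones used weight)
# and label strata via after_stat(stratum) (older ones used label.strata = TRUE)
ggplot(data = titanic_wide,
       aes(axis1 = Class, axis2 = Sex, axis3 = Age,
           y = Freq)) +
  scale_x_discrete(limits = c("Class", "Sex", "Age"), expand = c(.1, .05)) +
  geom_alluvium(aes(fill = Survived), width = 0, knot.pos = 0, reverse = FALSE) +
  # reverse = FALSE is repeated below so that strata and labels line up with the bands
  geom_stratum(width = 1/4, reverse = FALSE) +
  geom_text(stat = "stratum", aes(label = after_stat(stratum)), reverse = FALSE) +
  theme_minimal() +
  ggtitle("passengers on the maiden voyage of the Titanic",
          "stratified by demographics and survival")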
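# Notes:
# 1. The plot has two parts: the strata (the stacked boxes at each of the three axes shown in the plot) and the alluvia (the bands that show how observations flow between axes).
# 2. For wide data, map each categorical column to axis1, axis2, ... in aes(); the numeric column mapped to y (weight in older ggalluvial releases) sets the height of each band.
# 3. geom_alluvium draws the alluvia. It inherits the height mapping from ggplot(); fill sets the band color;
#    width is the width of the strata that the bands attach to; knot.pos controls how curved the bands are (0 gives straight bands; the default was 1/6 in the version used here);
#    reverse controls whether the strata at each axis are stacked in the reverse order of the variable values (default TRUE, which matches the legend order).
# 4. geom_stratum draws the strata; width sets the width of the boxes.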
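#
# A minimal sketch of notes 3 and 4, assuming a ggalluvial release that takes band
# heights from the y aesthetic. The object names base_plot, p_straight and p_curved
# are illustrative only; comparing the two plots shows what knot.pos and width change.
library(ggalluvial)
titanic_wide <- data.frame(Titanic)
base_plot <- ggplot(titanic_wide,
                    aes(axis1 = Class, axis2 = Sex, axis3 = Age, y = Freq)) +
  scale_x_discrete(limits = c("Class", "Sex", "Age"))
# Straight bands attached to narrow strata
p_straight <- base_plot +
  geom_alluvium(aes(fill = Survived), width = 1/8, knot.pos = 0) +
  geom_stratum(width = 1/8)
# Curved bands attached to wider strata
p_curved <- base_plot +
  geom_alluvium(aes(fill = Survived), width = 1/3, knot.pos = 1/3) +
  geom_stratum(width = 1/3)
print(p_straight)
print(p_curved)
# Using the same width in geom_alluvium and geom_stratum keeps the bands flush with the boxes.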
#-------------------------------------------------------
# 
# Chapter 2: Long-format practice
# 
#-------------------------------------------------------
# Long format: to_lodes_form() (called to_lodes() in older releases) stacks several axis columns
# and adds alluvium and stratum columns. The axis name goes into the column named by key.
titanic_long <- to_lodes_form(data.frame(Titanic),
                              key = "Demographic",
                              axes = 1:3)
head(titanic_long)
# y = Freq sets the band height (older releases used weight = Freq)
ggplot(data = titanic_long,
       aes(x = Demographic, stratum = stratum, alluvium = alluvium,
           y = Freq, label = stratum)) +
  geom_alluvium(aes(fill = Survived)) +
  geom_stratum() + geom_text(stat = "stratum") +
  theme_minimal() +
  ggtitle("passengers on the maiden voyage of the Titanic",
          "stratified by demographics and survival")
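# Notes:
# 1. to_lodes_form works like melt and similar reshaping functions, but it also adds an alluvium column, an id for the original row that lets one flow be tracked across axes. The names of the stacked columns end up in the newly defined key column, and their values in stratum.
# 2. Long data must have alluvium and stratum columns: stratum gives the category at each axis, y (weight in older releases) gives the band height, and x is the key column that holds the original column names.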
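#
# A small sketch of what the reshaping does, using a made-up two-row data frame.
# The names toy, toy_long and the key name "Axis" are illustrative and not part of
# the original example; to_lodes_form is assumed to be available (ggalluvial >= 0.9).
library(ggalluvial)
toy <- data.frame(A = c("a1", "a2"), B = c("b1", "b1"), Freq = c(3, 5),
                  stringsAsFactors = TRUE)
toy_long <- to_lodes_form(toy, key = "Axis", axes = 1:2)
# Each original row becomes one row per axis; alluvium records which row it came from,
# Axis holds the original column name (A or B) and stratum holds the value.
print(toy_long)
print(names(toy_long))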
